Data-driven modeling in Python
When we have data about some process, we often want to use it to build understanding or to make predictions about new conditions. We can do that by building models from the data. Sometimes we know enough about the system to build a physics-based or phenomenological model. Often, though, we don't know enough about our systems to do that. That is when we turn to a data-driven modeling approach, known more popularly today as machine learning.
This book introduces the tools needed to develop data-driven models through interpolation, linear and nonlinear regression, symbolic regression, and more. Many of these methods are implemented in scikit-learn, a Python library designed for data-driven modeling and machine learning. We also introduce some symbolic regression libraries, and XGBoost for gradient-boosted decision tree modeling.
Updated July 17, 2024 for Python 3.11
The content is available as a PDF and as IPython notebooks.