The Python ecosystem is growing and may become the dominant platform for applied machine learning.
The primary rationale for adopting Python for time series forecasting is that it is a general-purpose programming language that you can use both for R&D and in production.
In this post, you will discover the Python ecosystem for time series forecasting.
After reading this post, you will know:
The three standard Python libraries that are critical for time series forecasting.
How to install and set up the Python and SciPy environment for development.
How to confirm your environment is working correctly and ready for time series forecasting.

Let’s get started.

Python Environment for Time Series Forecasting
Photo by Joao Trindade, some rights reserved.

Why Python?
Python is a general-purpose interpreted programming language (unlike R or MATLAB).
It is easy to learn and use primarily because the language focuses on readability.
It is a popular language in general, consistently appearing in the top 10 programming languages in Stack Overflow surveys (for example, the 2015 survey results).
Python is a dynamic language that is well suited to interactive development and quick prototyping, yet it has the power to support the development of large applications.
Python is also widely used for machine learning and data science because of the excellent library support. It has quickly become one of the dominant platforms for machine learning and data science practitioners and is in greater demand than even the R platform by employers (see the graph below).

Python Machine Learning Jobs vs R Machine Learning Jobs
That Python is a general-purpose language is a simple and very important consideration.
It means that you can perform your research and development (figuring out what models to use) in the same programming language that you use in operations, greatly simplifying the transition from development to operations.

Python Libraries for Time Series
SciPy is an ecosystem of Python libraries for mathematics, science, and engineering. It is an add-on to Python that you will need for time series forecasting.
Two SciPy libraries provide a foundation for most others; they are NumPy for providing efficient array operations and Matplotlib for plotting data.
There are three higher-level SciPy libraries that provide the key features for time series forecasting in Python.
They are pandas, statsmodels, and scikit-learn for data handling, time series modeling, and machine learning respectively.
Let’s take a closer look at each in turn.

Library: pandas
The pandas library provides high-performance tools for loading and handling data in Python.
It is built upon and requires the SciPy ecosystem. It primarily uses NumPy arrays under the covers but provides convenient, easy-to-use data structures such as DataFrame and Series for representing data.
pandas places a special focus on support for time series data.
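
As a rough illustration of this time series support, the sketch below builds a small Series with a date-time index, fills a gap, creates a lagged copy, and down-samples it. The dates and values are made up for the example and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical daily observations; the values are invented for illustration.
index = pd.date_range(start="2020-01-01", periods=6, freq="D")
series = pd.Series([10.0, 12.0, None, 15.0, 14.0, 16.0], index=index)

# Fill the missing observation by carrying the previous value forward.
filled = series.ffill()

# Lag the series by one step: each row now holds the previous day's value.
lag1 = filled.shift(1)

# Down-sample from daily data to 2-day bins, aggregating with the mean.
downsampled = filled.resample("2D").mean()

print(filled)
print(lag1)
print(downsampled)
```

Forward filling is only one of several ways to handle a gap; interpolation or dropping the row may suit other datasets better.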
Key features relevant for time series forecasting in pandas include:
The Series object for representing a univariate time series.
Explicit handling of date-time indexes in data and date-time ranges.
Transforms such as shifting, lagging, and filling.
Resampling methods such as up-sampling, down-sampling, and aggregation.

Library: statsmodels
The statsmodels library provides tools for statistical modeling.
It is built upon and requires the SciPy ecosystem and supports data in the form of NumPy arrays and pandas Series objects.
It provides a suite of statistical tests and modeling methods, as well as tools dedicated to time series analysis that can also be used for forecasting.
Key features of statsmodels relevant to time series forecasting include:
Statistical tests for stationarity such as the Augmented Dickey-Fuller unit root test.
Time series analysis plots such as autocorrelation function (ACF) and partial autocorrelation function (PACF).
Linear time series models such as autoregression (AR), moving average (MA), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA).

Library: scikit-learn
The scikit-learn library is how you can develop and practice machine learning in Python.
It is built upon and requires the SciPy ecosystem. The name “scikit” suggests that it is a SciPy plug-in or toolkit. You can review a full list of available SciKits.
The focus of the library is machine learning algorithms for classification, regression, clustering, and more. It also provides tools for related tasks such as evaluating models, tuning parameters, and pre-processing data.
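
As one illustration of model evaluation on ordered data, the sketch below uses scikit-learn's `TimeSeriesSplit` to produce train/test splits in which the test observations always come after the training observations. The ten-row array is a placeholder for real data and `n_splits=3` is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Placeholder data: ten observations in time order.
X = np.arange(10).reshape(-1, 1)

# Each successive split trains on a longer prefix of the series and
# tests on the observations that immediately follow it.
splitter = TimeSeriesSplit(n_splits=3)
splits = list(splitter.split(X))

for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Unlike a shuffled k-fold split, this scheme never trains on observations that occur after the ones being predicted.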
Key features relevant for time series forecasting in scikit-learn include:
The suite of data preparation tools, such as scaling and imputing data.
The suite of machine learning algorithms that could be used to model data and make predictions.
The resampling methods for estimating the performance of a model on unseen data, specifically the TimeSeriesSplit.

Python Ecosystem Installation
This section will provide you with general advice for setting up your Python environment for time series forecasting.
We will cover:
Automatic installation with Anaconda.
Manual installation with your platform’s package management.
Confirmation of the installed environment.

If you already have a functioning Python environment, skip to the confirmation step to check whether your software libraries are up to date.
Let’s dive in.

1. Automatic Installation
If you are not confident at insta