TIME SERIES FORECASTING

Forecasting Through Economic Uncertainty — Multivariable Time Series Analysis with VAR and Prophet Python Module

Forecast Growth or Loss of Net Sales

Taufik Azri

--

Many businesses have been negatively affected by the Covid-19 pandemic. Photo by Anastasiia Chepinska on Unsplash.

INTRODUCTION

Forecasting is one of the most profound areas of economic, business, and financial analysis, yet it barely receives the same attention as other niches in data analytics, partly because of its tedious, complex, and complicated procedures. Open-sourced automation remains a rarity, if not non-existent. But nowhere is forecasting more important than now: the current Covid-19 pandemic (ongoing as this article was written) casts uncertainty over many businesses and retailers. Sales growth and sustenance during this time can only be described as uncertain. Are we heading into a recession or a temporary depression? Nothing is ever certain, but we can plan and prepare, and for most businesses that entails estimating future demand or future sales. This calls for proper time series forecasting, not only to predict the future outcome, but also to estimate the deviation from the current trajectory, be it growth or loss. This article does so by demonstrating a tutorial on time series forecasting using the statsmodels Vector Autoregressive (VAR) model and Facebook's flagship open-sourced forecasting tool, Prophet.

OBJECTIVE

The purpose of this article is to demonstrate time series forecasting of sales amounts using the two methods mentioned above. I write it from the perspective of corporate data scientists, business analysts, and managers who are interested in measuring potential growth, the possibility of loss, or the range of uncertainty during this difficult time. Readers should be able to apply these examples within their own domains.

For the purpose of this tutorial, I created a mock dataset of the Net Sales of a hypothetical food and beverage company in Germany, accrued monthly (why Germany? That was a random choice). The tutorial requires some competency in Python, the language of choice of data scientists and engineers, but readers with different skill sets, or who are simply curious, can still follow the theoretical reasoning, intuition, and calculation behind each procedure. The code is available on GitHub, and I encourage you to study it in tandem as you read.

FROM INTUITION TO HYPOTHESIS

A lot of time series articles and guides provide forecasting examples that are limited to a certain academic or scientific niche, for instance cyclical activity, such as weather, or random walks, such as the stock market. Another popular example is the macroeconomic forecast, such as GDP. However, many analysts work in a typical retail or consumer business, either in e-commerce or in traditional retail, where key metrics such as net sales and demand are crucial for growth and sustenance. I believe this example helps fill that gap within the analytics domain.

As mentioned, I will forecast the sales amount for a hypothetical food and beverage manufacturing company in Germany. I will do so from April to December 2020, revealing the extent of growth or loss under the current economic uncertainty. I make certain assumptions: the company has a complex supply chain network, with materials sourced from both local and overseas suppliers, and a diverse channel of customers, both domestic and global. Consequently, the company is sensitive to customer sentiment, retail economics, industrial production, and supply chain flow. In other words, its complexity and sensitivity span sourcing, manufacturing, and distribution, both domestically and internationally.

Annual macroeconomic indicators from IMF and OECD.

The table above shows macroeconomic data for Germany gathered from two external sources. I extracted the Gross Domestic Product (GDP), inflation, and unemployment rate from the IMF World Economic and Financial Surveys[1], and the trade data from the OECD[2]. My hypothesis is that we can estimate the company's performance from the surrounding economic situation. As a preliminary forecast, I experiment with GDP, import and export volume, unemployment rate, and inflation rate; these are common macroeconomic indicators that together portray a comprehensive picture of a country's economic condition. We could consider other economic metrics as well, such as the consumer price index, or financial indicators, such as the EUR/USD exchange rate, but for the purpose of this tutorial I limit the factors to those mentioned above.

A key benefit of using data from the OECD and IMF is that they have already forecasted these macroeconomic indicators until 2021. We can use those figures as regressors when we predict Net Sales from April to December 2020 with the Prophet module. We will also create a scenario forecast in which we assume these economic indicators decline (as speculated in the news) and measure the new prediction of net sales corresponding to this shift in macroeconomic indicators.

We should note that these are lagging economic indicators, meaning they are reported in response to the economy and derived from collected financial data. They may not align exactly in time with the sales data, but since the sales data is accrued monthly, we assume both are reported with a similar monthly lag and therefore move closely together.

WHY VAR MODEL AND PROPHET FORECASTING MODULE?

“It is difficult to predict, especially the future.”

This quote is often attributed to the physicist Niels Bohr[3]. Indeed, predicting the future is not easy, but I have chosen to examine this dataset using two of the most reliable methods, Vector Autoregression (VAR) from the statsmodels module and the Generalized Additive Model (GAM) behind Facebook's Prophet module, because both are suitable for a multivariable regression problem. There are other proven models, such as ARIMA and SARIMA, but they work best on univariate data. Ours, however, is multivariable, and I insist on remaining multivariate because we know intuitively that the predicted variable (Net Sales) relies heavily on economic conditions. Utilizing a multivariable model like VAR therefore stands a better chance of generating higher accuracy.

While there are various time series models, ranging from easy-to-use Exponential Smoothing to black-box Keras-backed LSTM neural networks, I found many of them lacking in the ability to incorporate external factors such as customized seasonality. Not to mention that a neural network model is inherently a black box, meaning it is difficult to examine and tune its components across iterations. Flovik has written a great blog post about how not to use a certain model for time series forecasting[4].

In this article, the forecasting guide does not stop at VAR. I will proceed with further analysis and a scenario forecast using the Prophet package, which, as you will see, is remarkably user-friendly. However, because it is a high-level tool, I maintain that the VAR modeling process remains necessary to unearth the core principles of time series analysis and forecasting.

There is a caveat, however. The key difference between these two approaches is that VAR is a multivariable regression, meaning it forecasts two or more series together, while Prophet remains a univariate forecast with the ability to add other factors as additional regressors. Since the goal is to estimate only Net Sales, both tools are appropriate.

KEY CONSIDERATION — TIME SERIES COMPONENTS

A) Frequency

What defines time series data? The answer is simple: if your data has a time stamp, then it is a time series. Its analysis requires a different approach than, say, a classification or regression dataset. Time is a dimension that can appear in many frequencies, such as hourly (e.g., temperature), daily (e.g., the closing price of a stock portfolio), monthly (e.g., accounts receivable), or even quarterly (e.g., a market sentiment survey). Many effective models use daily or hourly data for a reason: such data is often abundant, and because it is cyclical in nature (e.g., a retail store's cycle often runs from 8am to 10pm), analyzing these series often yields consistent results.

The mock Net Sales, January 2012 until April 2020.

However, not every time series is collected daily. A lot of sales data is available only weekly or monthly. For most companies, it takes a certain period for the amounts to be aggregated and recorded into the ERP system. If you work as an analyst at a corporate desk, chances are you will only see the data after a week or a month.

While our data is only available monthly, we are fortunate to have it spanning January 2012 to March 2020, about 100 observations, ending roughly one month before the Covid-19 pandemic reached its global peak. The economic data, however, is only collected annually. How can we fit these datasets together with the sales amount? We could aggregate the monthly totals into annual figures, but that would dramatically reduce the number of observations. Instead, I propose changing the frequency of the economic data from annual to monthly through interpolation, a type of estimation between two data points.

Three GDP tables — original, changed frequency, and interpolated.

Interpolation is common in economics, where data is often collected at a low sampling frequency or, in some cases, contains missing values between observations[5][6]. Here, I used Pandas' resampling method to change the GDP's frequency from yearly to monthly, and then used linear interpolation to fill in the missing values.
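As a rough sketch of that step (the column name and values here are illustrative, not the actual data):

import pandas as pd

# hypothetical annual GDP series
gdp_annual = pd.DataFrame({'GDP': [3.4, 3.5, 3.7]},
                          index=pd.to_datetime(['2017-01-01', '2018-01-01', '2019-01-01']))

# upsample from yearly to month-start frequency, then fill the gaps linearly
gdp_monthly = gdp_annual.resample('MS').asfreq().interpolate(method='linear')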

B) Seasonality and Trend

Net Sales, plotted.

You may not notice it at first, but Net Sales shows an upward trajectory from 2012 until 2020. In time series analysis, this trajectory is called the trend: an upward or downward movement of the values in the series sustained over a considerable amount of time.

Seasonality, on the other hand, refers to specific periods in the cycle with repeated variability. Think of the holiday season, when shoppers increase purchases of festive items, or seasonal climate change (e.g., between summer and fall), when the amount of rain often increases drastically. We need to identify the onset of these seasonalities so that we do not mistake a periodic increase or decrease in value for a structural change in the series as a whole.

Trend and Seasonality of Net Sales as revealed by Prophet.

Prophet provides a convenient function to plot these components and view the trend and seasonality. It calculates them within the model, so you can identify them without any manual calculation. Of course, you can specify custom seasonality too (see Prophet).

C) Domain Knowledge — optional, but strongly recommended.

Net Sales, with arrows marking questionable time points.

Domain knowledge about the company and the industry increases the model's accuracy. A time series model is event-agnostic, meaning it has no knowledge that can explain certain peaks and troughs in the series. In this example, the points marked with arrows are questionable: did a merger or acquisition occur in the past? Was there a major event that can explain a large increase or decrease in sales? Prophet seamlessly identifies these changepoints, where it believes a significant change occurred that is neither seasonal nor cyclical, ensuring these events do not distort the magnitude of the calculated forecast parameters. As with seasonality, you can also specify custom changepoints if you know exactly where they occur (see Prophet).

VAR: SOME THEORETICAL CONTEXT

The Vector Autoregression (VAR) method is a suitable tool for analyzing multiple time series together (i.e., regressors). VAR builds on the assumption that several regressors move or correlate with each other, which necessitates a technique that analyzes their movements jointly. Specifically, the VAR model takes a linear combination of the past values of a series and the past values of the other time series (factors or variables) in the equation. Below is a typical autoregressive time-series model:
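In symbols, an autoregressive model of order p takes the form:

Y(t) = α + β1·Y(t-1) + β2·Y(t-2) + … + βp·Y(t-p) + ε(t)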

where, for a time series Y valued at time t, the series is the sum of an intercept α and its past values Y(t-1), Y(t-2), with coefficients β1, β2, all the way up to period p, better known as the lag. The lag refers to the number of past periods used as predictors in the equation. It could span the past six hours of hourly data or the past three months of monthly customer data.

The model expands into a multivariable linear equation when it considers two or more time series. Consider two time series Y1 and Y2 with lag order 1 at time t:
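With two series, each equation includes the lagged values of both (the two subscripts on β index the equation and the series it multiplies):

Y1(t) = α1 + β11·Y1(t-1) + β12·Y2(t-1) + ε1(t)
Y2(t) = α2 + β21·Y1(t-1) + β22·Y2(t-1) + ε2(t)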

We can expand the lag from order 1 to order 2:
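Each equation now also picks up the values from two periods back (the extra subscript on β indexes the lag):

Y1(t) = α1 + β11,1·Y1(t-1) + β12,1·Y2(t-1) + β11,2·Y1(t-2) + β12,2·Y2(t-2) + ε1(t)
Y2(t) = α2 + β21,1·Y1(t-1) + β22,1·Y2(t-1) + β21,2·Y1(t-2) + β22,2·Y2(t-2) + ε2(t)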

The equation grows larger as you add more time series (factors or variables) and increase the lag, and hence necessitates a computerized statistical system. The objective is to estimate the parameters α and β, which the statsmodels package does using the Ordinary Least Squares (OLS) method.

WHERE DO WE BEGIN?

A VAR forecasting process flow.

Where do we start? And what are the steps to produce a forecast for the next 12 months? The research literature offers numerous time series procedures, each with steps befitting a specific case. There is no one-model-fits-all device, but I believe I have outlined a generally acceptable set of steps that will yield a robust forecast model.

The flowchart above shows the process flow from data ingestion to hypothesis testing and metric validation. It aims to relieve the confusion over the tests and steps necessary before we can fit the data to the model. While these procedures are not definitive, they cover the principal checks, namely causality, correlation, stationarity, and residual error, any of which, if neglected, could degrade the performance of the model.

VAR MODEL

Variable Relationship

Net Sales (y henceforth), with only GDP, Inflation, and Net Trade chosen for forecasting.

To recap, we will predict monthly Net Sales, using the economic indicators from the OECD and IMF as additional factors. From the eight macroeconomic and trade indicators, I have narrowed the selection down to three series: GDP, inflation (average consumer prices), and Net Trade. The reason is that, after running the Granger Causality test and the Johansen co-integration test, I found that only these three variables exhibit a meaningful linear relationship with Net Sales.

Visualization of Granger Causality. Regenerated from Liu and Bahadori, “A Survey on Granger Causality, A Computational View”.

Recall that earlier in this article I stated the initial hypothesis that these variables influence each other. The Granger Causality test is a statistical test that measures whether one time series is useful in forecasting another. Specifically, it identifies whether a variable's lags are reflected in another time series. The figure above shows an example of time series X's lag being exhibited in series Y.

We can use statsmodels to test causality among our variables (see the code on GitHub). Since the test can only examine two variables at a time, we can use a looping function that runs through all variable pairs and displays the results in a matrix, as below:
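A minimal sketch of such a loop, assuming tsdf is the DataFrame holding y and the candidate indicators (the full version, with labelled rows and columns, is on GitHub):

from statsmodels.tsa.stattools import grangercausalitytests
import pandas as pd

def granger_causality_matrix(df, maxlag=12, test='ssr_chi2test'):
    # rows: the series being predicted; columns: the candidate predictor
    matrix = pd.DataFrame(index=df.columns, columns=df.columns, dtype=float)
    for response in df.columns:
        for predictor in df.columns:
            if response == predictor:
                matrix.loc[response, predictor] = 1.0
                continue
            # does `predictor` help forecast `response`?
            result = grangercausalitytests(df[[response, predictor]], maxlag=maxlag, verbose=False)
            # keep the smallest p-value across the tested lags
            matrix.loc[response, predictor] = min(result[lag][0][test][1] for lag in result)
    return matrix

granger_causality_matrix(tsdf)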

Granger Causality Test result. Net Sales is denoted as y, since that is the variable to be predicted.

The null hypothesis is that the coefficients corresponding to past values of the second time series are zero, i.e., that the second series does not help forecast the first. The table shows the p-values: if a p-value is less than the significance level (0.05 by default), we reject the null hypothesis and conclude there is evidence that one series is useful in forecasting the other. From the table we can see that GDP and inflation have p-values below 0.05 with respect to y, meaning we reject the hypothesis that they carry no predictive information, so we treat them as potentially useful. Net Trade does not appear to move with y, but since it is correlated with the other indicators, we will retain it in the model.

Note that the test does not imply "causality" in the everyday sense; it does not commit the common fallacy that correlation implies causation, but rather detects temporally related movements between the two series. We use the variables in the sense that X helps forecast Y, not that X causes Y, and vice versa.

Another useful test for measuring relationships is the Johansen co-integration test, or simply the Johansen test. Often used to evaluate portfolios in trading, the Johansen test analyzes co-integration between two or more variables. Co-integration here connotes a statistically significant long-run relationship between two or more variables; in other words, over the long run, the averages of the variables move closely together.

Table displaying Johansen Co-integrating test statistics.

The test can measure results using either the trace statistic (the sum of the elements on the main diagonal of a matrix) or the eigenvalues (the scalars of the eigenvectors). Do not worry about the linear algebra; you only need to understand the result. The null hypothesis is that there are no co-integrating equations, while the alternative hypothesis is that there is at least one co-integrating linear combination among the time series.

For y (Net Sales), at the 95% critical value (p-value of 0.05), we reject the null hypothesis and accept that it forms a linear combination with at least one of the variables in the test. Note that since we only care about predicting y, we do not need the other variables to show a statistically significant result.
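A sketch of the test with statsmodels (tsdf, det_order, and k_ar_diff here are assumptions; adjust them to your data):

from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Johansen co-integration test with a linear deterministic trend and one lagged difference
result = coint_johansen(tsdf, det_order=0, k_ar_diff=1)

# lr1 holds the trace statistics for the null of at most r co-integrating
# relations; cvt holds the 90/95/99 percent critical values
for r, (trace_stat, crit_95) in enumerate(zip(result.lr1, result.cvt[:, 1])):
    print(f"r <= {r}: trace = {trace_stat:.2f}, 95% critical value = {crit_95:.2f}, "
          f"reject H0: {trace_stat > crit_95}")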

Stationarity

Actual (non-stationary) and second-order differenced (stationary) Net Sales.

A major criterion in time series analysis is that the series itself is stationary, meaning that its key statistical properties do not change over time. Consider the two sales charts above: one shows the original series, the other shows the series after differencing twice (second-order differencing), where each value is the change in the month-over-month change. Notice how the trend disappears in the second chart, while the peaks and troughs remain. A time series must be stationary, otherwise the model will not be able to forecast it accurately.

Moving on — Training and Testing

We split the data into a training and a test set. As a rule of thumb, we take 90% as the training set and leave the remainder as the test set. We could also conduct multi-fold cross-validation, but we will leave that for the discussion of Prophet.
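As a sketch, assuming tsdf is the combined monthly DataFrame (nobs is reused further down when we forecast the test period):

# hold out roughly the last 10% of observations as the test set
nobs = int(len(tsdf) * 0.1)
df_train, df_test = tsdf[:-nobs], tsdf[-nobs:]
print(df_train.shape, df_test.shape)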

# 1st order differencing
df_differenced = df_train.diff().dropna()
# 2nd order differencing
df_differenced = df_differenced.diff().dropna()

I stay with second-order differencing in this example, as it proved adequate: after applying differencing twice, the Augmented Dickey-Fuller (ADF) test shows that the variables are stationary.
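A quick sketch of that check, where the null hypothesis of the ADF test is that a series has a unit root (i.e., is non-stationary):

from statsmodels.tsa.stattools import adfuller

for col in df_differenced.columns:
    # a p-value below 0.05 rejects the unit-root null, so we treat the series as stationary
    pvalue = adfuller(df_differenced[col].dropna())[1]
    print(col, ':', round(pvalue, 4))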

A further benefit of the transformation is that it reduces autocorrelation, the series' dependence on lagged versions of itself, whereby lagged values are linearly related to the current value. If a series is autocorrelated, the analysis can be misleading because the data contains hidden trends and seasonalities that could explain the next values, yet the model fails to identify them.

ACF and PACF Plot before differencing.
ACF and PACF Plot after second-order differencing.

To observe the contrast between non-stationary and stationary data, we can draw the autocorrelation (ACF) and partial autocorrelation (PACF) plots for both the original and the differenced datasets. The x-axis shows the lag, while the y-axis shows the correlation, which lies between -1 and 1. The blue cone represents the 95% confidence band for the correlation at each lag. The autocorrelation measures the linear relationship between lagged values of a time series, y(t) and y(t-1). The partial autocorrelation shows the linear dependence after removing the effect of the intermediate lags: at lag k, it only measures the linear relationship between y(t) and y(t-k), removing the effects of lags 1 to k-1.

The plots above show the ACF and PACF of the data before and after differencing. Before differencing, the correlations are large, approaching 1 and -1, meaning autocorrelation is strong across the lags. After differencing, the correlation drops substantially, hovering between -0.25 and 0.25. Similarly, the partial autocorrelation fluctuates far less after differencing, with only minimal movement around zero.
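The plots themselves come from statsmodels; for example, for the Net Sales column y:

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, axes = plt.subplots(2, 2, figsize=(12, 6))
plot_acf(tsdf['y'], ax=axes[0, 0], title='ACF before differencing')
plot_pacf(tsdf['y'], ax=axes[0, 1], title='PACF before differencing')
plot_acf(df_differenced['y'], ax=axes[1, 0], title='ACF after differencing')
plot_pacf(df_differenced['y'], ax=axes[1, 1], title='PACF after differencing')
plt.tight_layout()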

model = VAR(df_differenced)
x = model.select_order(maxlags=12)
x.summary()
AIC, BIC, FPE, and HQIC error metrics for each lag (maximum 12).

We are at the final step before we can run the model: choosing the appropriate number of lags for the time series. Statsmodels can assist us in choosing the lag that minimizes information criteria such as the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion), which estimate out-of-sample prediction error. For a good model the AIC is low, but how low it can be is subjective; the key is a relatively low AIC with a minimal gap between AIC and BIC.

Lag: as a recap, the lag is the number of past periods used to forecast the next time step. If the lag is 4, the model uses the previous 4 periods (in this case, four months) to predict the value of y in the next month.

model_fitted = model.fit(4)

Once we have fit the data to the model, we should check whether the residuals are serially correlated. A residual is the difference between the fitted and the actual value (yhat - y). The assumption is that the residuals are not correlated; if they are, there is information left in the time series that the model has not captured. Another assumption is that the residuals have zero mean; if not, the forecast is biased. We can also check whether the residuals have constant variance and, as a whole, are normally distributed.

We can use the Durbin-Watson test to investigate whether any serial correlation exists within the residuals. The test statistic takes a value between 0 and 4. As a rule of thumb, a value of 2 means there is no autocorrelation, values below 2 indicate positive autocorrelation, and values above 2 indicate negative autocorrelation.

from statsmodels.stats.stattools import durbin_watson

# check each variable's residuals for serial correlation
out = durbin_watson(model_fitted.resid)
for col, val in zip(tsdf.columns, out):
    print(col, ':', round(val, 2))

Now that we are satisfied with the fit on the training set, we call the forecast method to compute future values, feeding it the last four time periods of the training data (since we set the lag at 4).

# Get the lag order
lag_order = model_fitted.k_ar
print(lag_order)

# Separate the input data for forecasting:
# the goal is to forecast based on the last 4 observations (since the lag is 4)
forecast_input = df_differenced.values[-lag_order:]

# Forecast: we feed in the last four values and ask the model
# to predict the next nobs values (the length of the test set)
fc = model_fitted.forecast(y=forecast_input, steps=nobs)

# organize the output into a clear DataFrame layout, adding the '_f' suffix
# to each column to indicate the forecasted values
df_forecast = pd.DataFrame(fc, index=tsdf.index[-nobs:], columns=tsdf.columns + '_f')

We need a few more steps before we can visualize and display the forecast. Because we applied differencing before fitting, we must transform the data back to its actual scale: take the cumulative sum of the predicted values (row by row) and add it to the last observed value of the training set.

# get a copy of the forecast
df_fc = df_forecast.copy()
# get the column names from the original dataframe
columns = df_train.columns

# Roll back from the 1st order differencing:
# take the cumulative sum (from the top row to the bottom) of each
# forecasted column and add it to the last original value of the training
# set (since we deducted each row from the previous one).
# The new forecasted column gets the suffix '_forecast'.
for col in columns:
    df_fc[str(col) + '_forecast'] = df_train[col].iloc[-1] + df_fc[str(col) + '_f'].cumsum()

# If you performed second-order differencing (as we did), first roll back
# to the first difference using the change between the last two rows of
# df_train, then roll back to the original scale.
for col in columns:
    df_fc[str(col) + '_first_differenced'] = (df_train[col].iloc[-1]
                                              - df_train[col].iloc[-2]) + df_fc[str(col) + '_f'].cumsum()
    df_fc[str(col) + '_forecast'] = df_train[col].iloc[-1] + df_fc[str(col) + '_first_differenced'].cumsum()

df_results = df_fc

The table above shows the forecasts for all four variables, based on a lag of 4. For most forecasting models, we can evaluate the result by calculating the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). However, MAE and RMSE are scale-dependent; since the series is denominated in millions, the raw differences between yhat and y are immense. A better way to evaluate the forecast is the Mean Absolute Percentage Error (MAPE), which is scale-independent. Here, the MAPE sits around a 10% error, which is not too bad. We can call model_fitted.plot_forecast(10) to forecast ten steps ahead, but what is more interesting is to plot the predicted against the actual values of the test set.
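A minimal sketch of these metrics on the test set, assuming df_results holds the y_forecast column produced above and df_test the actual values:

import numpy as np

actual = df_test['y'].values
predicted = df_results['y_forecast'].values

mae = np.mean(np.abs(predicted - actual))
rmse = np.sqrt(np.mean((predicted - actual) ** 2))
mape = np.mean(np.abs((predicted - actual) / actual)) * 100  # scale-independent, in percent
print(f"MAE: {mae:,.0f}  RMSE: {rmse:,.0f}  MAPE: {mape:.1f}%")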

Predicted vs. Actual on the Test Set.

The result is decent, if not precise. It could certainly be better if the predicted values fit the actual test values more closely, but chasing that fit may invite overfitting. As long as the prediction follows a trajectory similar to the test set, the model is on a good track. We could tune the model further to improve the result, or we can move on to the Prophet module with the same variables we selected here and forecast the next nine months, through December 2020.

FACEBOOK OPEN-SOURCED PROPHET MODULE

Prophet is robust yet simple and accessible. It provides an easy, uncomplicated, and straightforward high-level forecasting method, built on the principle that forecasting should be accessible to analysts without the archetypal procedures we just walked through. Personally, I prefer to proceed with this module, but only after conducting the typical forecasting tests and validations. Those tedious steps may appear daunting, but they serve a purpose: they expose the context of the series and let us inspect each component and tune it as necessary. Prophet, on the other hand, does not explicitly indicate missing components or elements that require adjustment. Hence you can see why the VAR model opened this article.

In essence, Prophet builds on a Generalized Additive Model (GAM), which adds three components together:
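In the notation of the Prophet paper[7]:

y(t) = g(t) + s(t) + h(t) + εt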

“Here g(t) is the trend function which models non-periodic changes in the value of the time series, s(t) represents periodic changes (e.g., weekly and yearly seasonality), and h(t) represents the effects of holidays which occur on potentially irregular schedules over one or more days. The error term εt represents any idiosyncratic changes which are not accommodated by the model; later we will make the parametric assumption that εt is normally distributed.”[7]

Note that GAM accounts for trend and seasonality, which typically occur in retail, economic, and financial data. It also accommodates holidays, which usually affect the variance of the series, though we can exclude them if we believe they have no significant impact. The module simplifies several of the critical steps in the VAR workflow, largely the tests for causality, stationarity, and autocorrelation on the series and the residuals.

The Model

The data, with date renamed as ds and Net Sales as y.
from fbprophet import Prophet  # in newer releases: from prophet import Prophet

m = Prophet(seasonality_mode='multiplicative',
            growth='linear',
            weekly_seasonality=False,
            yearly_seasonality=True,
            mcmc_samples=1000,
            changepoint_prior_scale=0.001)

Prophet’s high-level functionality necessitates standardized column names. We need to rename the date column as ds, and the predicted value as y. The remaining regressor column names can be intact. We can instantiate the model quickly using the code above and set these parameters as arguments:

seasonality_mode='multiplicative': We set the seasonality to multiplicative, as opposed to additive, because the seasonal swings grow with the trend; an additive estimate would be less accurate. Note that Prophet also allows you to customize seasonality and add holidays to the model.

growth='linear': The default growth model is linear, but you can also choose a logistic growth model, commonly used for population growth, economic innovation cycles, and pandemic models such as the current Covid-19.

weekly_seasonality=False and yearly_seasonality=True: By default, Prophet fits weekly seasonality using dummy variables and yearly seasonality using a Fourier series. Since our data frequency is monthly, it makes little sense to look for weekly seasonality.

mcmc_samples=1000: By default, Prophet only estimates uncertainty in the trend, i.e., whether the series grows upwards or downwards. To obtain uncertainty in the seasonality as well, Prophet must run full Bayesian sampling; here we estimate the posterior distribution with 1000 MCMC samples (the Hamiltonian Monte Carlo sampling is performed by PyStan).

changepoint_prior_scale=0.001: Prophet defines changepoints as abrupt changes in the trajectory. We can specify our own changepoints if we know of periods with sudden change (e.g., winning a large tender), but by default Prophet detects them automatically. This parameter defaults to 0.05; we decrease it to make the trend less flexible and therefore less prone to overfitting.

m.add_regressor('Gross domestic product, current prices', prior_scale=0.5, mode='multiplicative')
m.add_regressor('Inflation, average consumer prices', prior_scale=0.5, mode='multiplicative')
m.add_regressor('Net Trade', prior_scale=0.5, mode='multiplicative')

Before fitting the model, we add the other variables we explored in the VAR model as additional regressors. Note that the regressors must have known values both in the past and in the future: if we want to forecast 12 months ahead, we must know the GDP for the upcoming 12 months. In this case, I use the forecasted GDP from the OECD.

m.fit(alldf)
We create a future 'slot' for the predicted values.
We fill this slot with the regressor (macroeconomic variable) data.

We conveniently call the fit method to fit the data. Since the Net Sales data is available only up to March, we create nine future DataFrame rows to be filled with the forecast through December. After that, we append the extra regressor values for those nine months to this future DataFrame. Prophet will use the values of these regressors as the basis for forecasting from April to December.
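A sketch of those two steps (the regressor column list and the masterdf2 name are assumptions carried over from the code on GitHub, where masterdf2 is a monthly table of the regressors with a ds column):

regressor_cols = ['ds', 'Gross domestic product, current prices',
                  'Inflation, average consumer prices', 'Net Trade']

# nine month-start rows beyond March 2020, i.e. April through December
future = m.make_future_dataframe(periods=9, freq='MS')
# attach the (interpolated and forecasted) regressor values to every row
future = future.merge(masterdf2[regressor_cols], how='left', on='ds')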

Prediction

We can conveniently use the predict method to predict the future value of y, stored as yhat. The forecasted values start in April, where y is absent.
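In code (the figure size is arbitrary; the fig, ax pair feeds the plot call below):

import matplotlib.pyplot as plt

# yhat, with yhat_lower and yhat_upper bounds, is produced for the history and the nine future months
forecast = m.predict(future)

# prepare an axis for the plot call below
fig, ax = plt.subplots(figsize=(12, 6))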

m.plot(forecast, ax=ax)

We can call the plot method to draw the forecasted line and observe the future trajectory of Net Sales. The black dots represent the actual data, the blue line the fitted prediction, and the blue shade the uncertainty interval, which defaults to 80%.

We can also inspect the components of the forecast to extract the trend, which confirms our observation that the series increases over time and that the model captures it well. More importantly, we can see the yearly seasonality and benchmark it against our domain knowledge. The model shows that peak sales occur around February and drop around December. Perhaps this is true: consumers spend less as they go on winter holiday and bulk up their purchases again as the year starts, but we would need a domain expert to confirm this hypothesis. Another important aspect revealed by the component plot is that the regressors (macroeconomic variables) also move linearly with Net Sales, which indicates a good fit.
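The component plots come from a single call:

# trend, yearly seasonality, and the extra-regressor component in one figure
fig = m.plot_components(forecast)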

Diagnostics: Cross-Validation and Performance Metrics

Cross-validation of the model is crucial for measuring the forecast error. By splitting the data into training and testing sets over rolling periods, we can estimate the forecast errors more accurately. We pass the cross-validated predictions into the performance_metrics function, which calculates the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) for each forecast horizon. While MSE, RMSE, and MAE are scale-dependent (the Net Sales are in millions), MAPE is scale-independent, so we judge the performance by that metric instead. The smallest MAPE is about 13%, which is not bad. We can also plot MAPE across horizons, as below, which shows that the error is fairly consistent over time.
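A sketch of this diagnostics step; the window sizes are assumptions that seem reasonable for roughly eight years of monthly data:

from fbprophet.diagnostics import cross_validation, performance_metrics
from fbprophet.plot import plot_cross_validation_metric
# (in newer releases the package is named prophet instead of fbprophet)

# train on an initial five-year window, then repeatedly forecast a one-year horizon
df_cv = cross_validation(m, initial='1825 days', period='180 days', horizon='365 days')
df_p = performance_metrics(df_cv)
print(df_p[['horizon', 'rmse', 'mae', 'mape']].head())

# MAPE across forecast horizons (the plot shown below)
fig = plot_cross_validation_metric(df_cv, metric='mape')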

MAPE metric on the cross-validated forecast.

Scenario Analysis

Finally, we arrive at the target: estimating the effect of a decline in GDP and the other economic indicators on sales growth. To do so, we create a new future DataFrame called future2, in which we supply the regressor values reduced by 5% from their published forecasts. If we believe the economy will be worse, we can drop these values even further. Then, using the same fitted model, we predict the future values on future2.

masterdf2 = masterdf.reset_index().rename(columns={'Date': 'ds'})
masterdf2['ds'] = pd.to_datetime(masterdf2['ds'])
future2 = future.merge(masterdf2.loc[:, modified_array], how='left', on='ds')

# drop the estimated macroeconomic forecasts by 5 percent
# on the last nine rows, April to December
future2.iloc[-9:, 1:] = future2.iloc[-9:, 1:].apply(lambda x: x * 0.95)

scenario_forecast = m.predict(future2)
scenario_result = pd.concat([scenario_forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']],
                             alldf[['y']]], axis=1)

We can sum the total Net Sales for the whole of 2019 and the predicted Net Sales for 2020. Under this scenario we see sales growth of about 12%.
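Roughly, the comparison looks like this (a sketch that sums actual 2019 sales against the 2020 scenario prediction, using the column names from scenario_result above):

# totals per calendar year: actual 2019 vs. predicted 2020 under the scenario
total_2019 = scenario_result.loc[scenario_result['ds'].dt.year == 2019, 'y'].sum()
total_2020 = scenario_result.loc[scenario_result['ds'].dt.year == 2020, 'yhat'].sum()
print(f"Scenario growth, 2020 vs 2019: {(total_2020 / total_2019 - 1) * 100:.1f}%")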

Compare that to the baseline forecast earlier, where growth stands at 17%. We can therefore postulate that, with the drop in economic indicators, our sales growth should be revised from 17% to about 13%, a drop of roughly four percentage points. This result concludes our analysis of the Net Sales growth of this hypothetical food and beverage company under the Covid-19 uncertainty.

FINAL WORDS

If we could predict the economy, what would we do differently? Photo by Markus Spiske on Unsplash.

Time series forecasting relies on a set of robust procedures that bridge the gap between intuition and numerical results. We believe business performance depends heavily on economic and financial conditions, but the current uncertainty calls for an analysis that measures the range of possible growth or loss, from which business plans follow. Starting from intuition, we need to test and validate that intuition through a series of hypothesis tests rather than heedlessly applying a model; that is why I demonstrated a step-by-step analysis with the VAR model before introducing Prophet, in the hope that analysts and readers gain a deeper understanding of forecasting.

FOOTNOTES

[1] “World Economic and Financial Surveys,” International Monetary Fund, accessed May 1, 2020, https://www.imf.org/external/pubs/ft/weo/2020/01/weodata/index.aspx.

[2] “Trade in Goods and Service Forecast,” OECD, accessed May 1, 2020, https://data.oecd.org/trade/trade-in-goods-and-services-forecast.htm.

[3] “The Perils of Prediction, June 2nd,” The Economist, accessed May 15, 2020, https://www.economist.com/letters-to-the-editor-the-inbox/2007/07/15/the-perils-of-prediction-june-2nd.

[4] Vegard Flovik. "How (not) to use Machine Learning for Time Series Forecasting: Avoiding The Pitfalls," Medium, accessed May 1, 2020, https://towardsdatascience.com/how-not-to-use-machine-learning-for-time-series-forecasting-avoiding-the-pitfalls-19f9d7adf424.

[5] Hashem Dezhbakhsh and Daniel Levy. “Periodic Properties of Interpolated Time Series,” Economics Letters 44, (1994): 221–228.

[6] Lepot, Mathieu et al., "Interpolation in Time Series: An Introductive Overview of Existing Methods, Their Performance Criteria and Uncertainty Assessment," Water 9, 796 (2017).

[7] Taylor, Sean J. and Letham, Benjamin. "Forecasting at Scale," PeerJ Preprints 5, e3190v2 (2017).

REFERENCES

Bahadori, Mohamad and Liu, Yan. “An examination of practical granger causality inference.” SDM, (2013).

Bahadori, Mohamad and Liu, Yan. “Granger causality analysis in irregular time series.” SDM, (2012).

Chambers, John, Mullick, Satinder, and Smith, Donald. "How to Choose the Right Forecasting Technique," Harvard Business Review, accessed May 12, 2020, https://hbr.org/1971/07/how-to-choose-the-right-forecasting-technique.

Hyndman, Rob and Athanasopoulos, George. Forecasting: Principles and Practice, 2nd ed. (Australia: OTexts, 2018). Accessed May 13, 2020, OTexts.com/fpp2.

Merwe, Ruan van der. “Implementing Facebook Prophet Efficiently.” Medium, accessed May 2, 2020, https://towardsdatascience.com/implementing-facebook-prophet-efficiently-c241305405a3.

Navrátil, Miroslav and Kolková, Andrea. “Decomposition and Forecasting Time Series in Business Economy Using Prophet Forecasting Model.” Central European Business Review 8, no. 4 (2019): 26–39.

Palachy, Shay. “Inferring Causality in Time Series Data.” Medium, accessed May 2, 2020, https://towardsdatascience.com/inferring-causality-in-time-series-data-b8b75fe52c46.

Prabhakaran, Selva. "Vector Autoregression (VAR) — Comprehensive Guide with Examples in Python." Machine Learning Plus, accessed May 1, 2020, https://www.machinelearningplus.com/time-series/vector-autoregression-examples-python/.


Taufik Azri

Data Scientist with interests in applicable solutions in retail and consumer industry.