| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Time Series with SARIMAX & Prophet
This notebook contains my notes from the tutorial source linked below. I have personalized the code in many places to match my own way of doing things, and I have added my own explanations of things for my own study.
| Source Author | Source Repository |
Evan Marie online: | EvanMarie.com | EvanMarie@Proton.me | LinkedIn | GitHub | Hugging Face | Mastodon | Jovian.ai | TikTok | CodeWars | Discord ⇨ ✨ EvanMarie ✨#6114 |


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Time Series Components
- trend shows whether the series is consistently decreasing (downward trend), constant (no trend) or increasing (upward trend) over time
- seasonality describes the periodic signal in your time series
- noise or residual is the unexplained variance and volatility of the time series
- Python’s statsmodels library: seasonal_decompose
- Seasonality: a pattern that occurs in a fixed and known period
- Cyclicality: a pattern that does not have a fixed or known period
- Stationarity: when the data's statistical properties do not change over time
- With algorithms like SARIMAX, it is important to identify this property, because they depend on it
- with linear regression, it is assumed that observations are independent of each other
- in a time series, observations are time dependent
- by making the time series stationary it is possible to apply regression techniques to time dependent variables
- non-stationary time series can be made stationary
Stationary Time Series Criteria
- the variance in the seasonality component is constant
- the amplitude of the signal does not change much over time
- autocorrelation is constant
- the relationship of each value in the time series and its neighbors stays the same

- analyzing components is a common way to check for stationarity
- Augmented Dickey-Fuller test (ADF)
- Kwiatkowski-Phillips-Schmidt-Shin test (KPSS)
- these are part of the Python statsmodel library


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Additive Models and Multiplicative Models
Time series trend, seasonal, and residual components can occur in either an additive or multiplicative way.
Additive Models:
- the components are added linearly
- changes over time are consistent in the amount they change
$$Y(t) = trend + seasonality + residual$$
- linear trend is a straight line
- linear seasonality has the same frequency and amplitude (the width and height of cycles)
Multiplicative Models:
- components are multiplied with one another
$$Y(t) = trend * seasonality * residual$$
- nonlinear, i.e. quadratic or exponential
- changes increase or decrease over time
- trend is a curved, non-linear line
- non-linear seasonality which varies in frequency and amplitude
Decomposition Models
- a main objective of decomposition is to estimate seasonal effects
- these can be used to create seasonally adjusted values
- additive models are useful when seasonal variation is fairly constant over time
- multiplicative models are useful when seasonal variation increases over time.


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Google Trends Data
- dataset reflecting the Google searches for the word 'diet'
- date range is 2016-03-27 to 2021-03-21
Seasonal Pattern
- searches for 'diet' decrease rapidly at the end of each year
- at the beginning of each year, they spike
- seasonality occurs in a fixed and known period
- no consistent increase or decrease in the trend, suggesting a non-linear trend
Breaking down components:
- because the trend is non-linear, the model parameter is set to multiplicative (by default, it is additive)
- period can be specified depending on the time series
- because the data is given in weeks, the period is set to the number of weeks in a year
- frequency and amplitude of seasonality remain constant, suggesting linearity
statsmodels.tsa.seasonal.seasonal_decompose()
model = 'additive'
- trend is still non-linear
- time series follows no consistent up or down slope
- thus no positive or negative trend (up or down)
- additive model fits the data better
- Additive & Multiplicative Decomposition
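The decomposition described above can be reproduced with seasonal_decompose. A minimal sketch, assuming the weekly search data is loaded into a hypothetical DataFrame diet_df with a 'diet' column (period=52 approximates one year of weekly observations):

```python
from statsmodels.tsa.seasonal import seasonal_decompose

# weekly observations -> roughly 52 periods per yearly cycle
decomp_add = seasonal_decompose(diet_df['diet'], model='additive', period=52)
decomp_mul = seasonal_decompose(diet_df['diet'], model='multiplicative', period=52)

# each result exposes .trend, .seasonal, and .resid, plus a convenience plot
decomp_add.plot()
decomp_mul.plot()
```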


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Stationarity Tests
statsmodels documentation on ADF and KPSS tests

Augmented Dickey-Fuller test (ADF)

- the ADF stationarity test can allow data to pass as stationary that may not actually be stationary
- it is best to also apply the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) to check for true stationarity
- it is also important to observe a time series' plot
ADF Test:
- tests for the presence of unit root in the series
- helps determine if series is stationary


In statistics, an augmented Dickey–Fuller test (ADF) tests the null hypothesis that a unit root is present in a time series sample. The alternative hypothesis is different depending on which version of the test is used, but is usually stationarity or trend-stationarity. It is an augmented version of the Dickey–Fuller test for a larger and more complicated set of time series models.

The augmented Dickey–Fuller (ADF) statistic, used in the test, is a negative number. The more negative it is, the stronger the rejection of the hypothesis that there is a unit root at some level of confidence.(Source)



Unit Root
In probability theory and statistics, a unit root is a feature of some stochastic processes (such as random walks) that can cause problems in statistical inference involving time series models. A linear stochastic process has a unit root if 1 is a root of the process's characteristic equation. Such a process is non-stationary but does not always have a trend.(Source)
- Null Hypothesis: The series has a unit root, meaning it is non-stationary. It has some time dependent structure.
- Alternate Hypothesis: The series has no unit root, meaning it is stationary. It does not have time-dependent structure.
- if the null hypothesis is not rejected, this is possible evidence that the series is non-stationary
- a p-value below the threshold (1%, 5%, etc.) suggests rejecting the null hypothesis, i.e. the series is stationary
- a p-value above the threshold suggests failing to reject the null hypothesis, i.e. the series is non-stationary
KPSS Test
- null and alternate hypothesis are opposite those of ADF
- Null Hypothesis - the time series is trend stationary
- Alternate Hypothesis - the series has a unit root and is not stationary
- a p-value below a threshold suggests rejection of the null hypothesis and non-stationarity
- a p-value above the threshold suggests failure to reject null hypothesis, i.e. stationary
# statsmodels for the two tests
statsmodels.tsa.stattools.adfuller() - Augmented Dickey Fuller Test
statsmodels.tsa.stattools.kpss() - Kwiatkowski-Phillips-Schmidt-Shin test
The following outcomes are possible:
- both tests deem the series not stationary, therefore the series is not stationary
- both tests deem the series IS stationary, therefore the series is stationary
- KPSS indicates stationarity and ADF indicates non-stationarity, meaning that the series is trend stationary. In this case, the trend must be removed to make the series strictly stationary, and the de-trended series should be checked for stationarity again
- KPSS indicates non-stationarity, and ADF indicates stationarity, meaning the series is difference stationary. In this case, differencing must be used to make the series stationary, and the differenced series should be checked for stationarity
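The adf_kpss() helper referenced later in the notebook is not shown here; a minimal sketch of such a helper (assuming a 0.05 threshold and the statsmodels test functions listed above) might look like this:

```python
from statsmodels.tsa.stattools import adfuller, kpss

def adf_kpss(series, threshold=0.05):
    """Run the ADF and KPSS tests and report what each suggests about stationarity."""
    adf_p = adfuller(series.dropna())[1]                              # low p -> reject unit root -> stationary
    kpss_p = kpss(series.dropna(), regression='c', nlags='auto')[1]   # low p -> reject trend-stationarity -> non-stationary
    print(f"ADF  p-value: {adf_p:.4f} -> {'stationary' if adf_p < threshold else 'non-stationary'}")
    print(f"KPSS p-value: {kpss_p:.4f} -> {'non-stationary' if kpss_p < threshold else 'stationary'}")
```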
ADF Results
KPSS Results
Results:
ADF -> at the 0.05 threshold, the p-value is below the threshold, so the null hypothesis is rejected; the series is stationary
KPSS -> the evidence suggests rejecting the null hypothesis in favor of the alternate hypothesis, suggesting non-stationarity

- these results fall into the last category of the list above, so differencing must be applied to achieve stationarity
- the series must then be tested again
Summary:
- the trend is non-linear and multiplicative, neither increasing nor decreasing consistently
- seasonality is influenced at the end and beginning of each year
- seasonality is linear and does not vary in frequency or amplitude
- additive residuals are lower than multiplicative
- ADF says stationary, while KPSS says non-stationary, so differencing must be applied
Conclusion:
- since the seasonality is linear and the additive residuals are smaller, it is reasonable to choose the additive model as the more appropriate one


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Making a Time Series Stationary
- statistical models often require a time series be stationary in order to make effective and precise predictions, for example the ARIMA model
DIFFERENCING:
- subtract the previous value from each value in the time series

Other transformations are possible as well, e.g. taking the log or the square root of the time series

pd.DataFrame.diff() - used to make the differenced and stationary time series
# time series is now stationary
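A minimal sketch of the differencing step, reusing the hypothetical diet_df and the adf_kpss() helper sketched above:

```python
# first-order differencing: subtract the previous value from each value
diet_diff = diet_df.diff().dropna()

# re-check stationarity on the differenced series
adf_kpss(diet_diff['diet'])
```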


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Global Temperature Dataset
- Dataset source
- includes global monthly average temperature (°C) anomalies from 1880 to 2016
- GISS surface temperature analysis, GISTEMP
- global component of Climate at a Glance, GCAG
- plots above suggest a positive trend
- the time series is therefore non-stationary
# decomposing the time series, using period = 12, 1 year
# increasing frequency/period to make seasonality possible to observe, increasing from 1 year to 20 years

Conclusions:
- trend is positive, suggesting non-stationarity
- amplitude and frequency do not change, suggesting use of an additive model
ADF and KPSS Tests
# both tests show the gistemp time series as non-stationary, which confirms what is seen in the plots above.
Zooming in to view the plots for just the time period 2014 - 2016, the last two years of the data

Observations:
- positive trend
- seasonality present with lower values in July and higher in March
# How many times must the data be differenced to achieve stationarity?
# gistemp and gcag time series become stationary after differencing once (d=1).
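One way to find d is to difference repeatedly until the ADF test rejects the unit root. A rough sketch, assuming the monthly anomalies live in a hypothetical temps['gistemp'] series:

```python
from statsmodels.tsa.stattools import adfuller

series, d = temps['gistemp'], 0
while adfuller(series.dropna())[1] >= 0.05:   # keep differencing while ADF still sees a unit root
    series = series.diff()
    d += 1
print(f"series becomes stationary after differencing {d} time(s)")
```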
Original Time Series Plotted
Differenced Time Series Plotted


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Forecasting with ARIMA
- Using SARIMAX but not setting any of the seasonality orders
# function to reduce memory usage, (Source, Numer.AI)
(S)ARIMA(X) models
ARIMA
- AutoRegressive Integrated Moving Average
- a combination of two models: autoregressive (AR), which uses lagged values to forecast, and moving average (MA), which uses lagged residual errors to forecast
- the "integrated" part refers to the differencing applied to make the series stationary
- uses dependencies between data values and error values from past data to make predictions
- takes three parameters
* p - number of autoregressive lags
* d - number of times differencing is applied to make the data stationary
* q - number of moving average lags
- it is also possible to apply transformations before using the ARIMA model
- However, if differencing and other transformations are applied before the model, they must be reverse transformed to access the forecast of the original values
- it is important to difference the data ONLY until it is stationary and no further
- to know the value for d, the number of times to run differencing, use the ADF and KPSS tests, the adf_kpss() function above
ARIMAX
- an extended version of the ARIMA model which incorporates exogenous inputs
- the series is modeled using other independent variables in addition to its own past values
- example: when modeling the waiting time in an emergency room, the number of nurses available during a shift could be considered an exogenous variable, since it may impact the waiting time; if this is indeed the case, changing the number of nurses can affect the waiting times
SARIMA
- this model should be used when there is seasonality
- ARIMA ignores seasonality
- SARIMA includes additional parameters to work with seasonality: P, D, Q, and S
* P - seasonal autoregressive order
* D - seasonal differencing order
* Q - seasonal moving average order
* S - length of the seasonal cycle
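In statsmodels, all of these variants are covered by the SARIMAX class; a sketch with illustrative orders (y is a placeholder for whichever series is being modeled):

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# ARIMA(p, d, q) is simply SARIMAX without a seasonal order
arima = SARIMAX(y, order=(1, 1, 1))

# SARIMA adds seasonal_order = (P, D, Q, S), here a weekly cycle (S=7)
sarima = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 7))

results = sarima.fit(disp=False)
print(results.summary())
```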


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

The Walmart Dataset
- comes from a Store Item Demand Forecasting Challenge on Kaggle
- 5 years of store-item sales data split into train and test csv files
- competition objective: forecast 3 months of sales for 50 different items at 10 different stores using the data
Training Data Overview
Testing Data Overview
Unique Items by Store
# both datasets have 10 stores each with 50 unique items. Target column is 'sales'
# date columns contain string data and must be converted to datetime
# goal - predict sales of items in all stores from 01-01-2018 to 03-31-2018, i.e., 3 months.
# sales amounts by store
# items sold by the two stores with highest sales
# datetime feature engineering
# boxplot shows the level of outliers for each weekday
# highest sales are in June and July
# observing seasonality across the years
# seasonality repeats year after year
# volume differs year to year
# higher sales on weekends
# July is the month with the most sales.


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

The Box-Jenkins Method
- Helps in choosing parameters that will lead to a good model.

(Source)
The original model uses an iterative three-stage modeling approach:

1. Model identification and model selection: making sure that the variables are stationary, identifying seasonality in the dependent series (seasonally differencing it if necessary), and using plots of the autocorrelation (ACF) and partial autocorrelation (PACF) functions of the dependent time series to decide which (if any) autoregressive or moving average component should be used in the model.

2. Parameter estimation using computation algorithms to arrive at coefficients that best fit the selected ARIMA model. The most common methods use maximum likelihood estimation or non-linear least-squares estimation.

3. Statistical model checking by testing whether the estimated model conforms to the specifications of a stationary univariate process. In particular, the residuals should be independent of each other and constant in mean and variance over time. (Plotting the mean and variance of residuals over time and performing a Ljung–Box test or plotting autocorrelation and partial autocorrelation of the residuals are helpful to identify misspecification.) If the estimation is inadequate, we have to return to step one and attempt to build a better model.


Image Source and Author

- the Walmart dataset contains 500 time series, which are paired stores and items sold (10 stores, 50 items)
- each of the time series will need a forecast model applied to it in order to forecast sales for all stores
Box-Jenkins Step One: Identify the Model
- Identifying the characteristics of a time series in order to choose the appropriate model

Questions to ask:
1. Is the time series stationary?
2. If it is not stationary, which transformation is best to make it stationary?
3. Is the time series seasonal?
4. If seasonal, what is the periodicity of its seasonality?
5. Which orders should be used? (p for the autoregressive part, q for the moving average part)
- 500 time series, each being a store-item pair
- a forecast model must be applied to all 500 time series pairs
- the following is the method applied to one time series, the pair: store no. 2 and item 1
# period = 365 - yearly cycle
Observations:
- the upward trend in sales volume means the series is non-stationary
- the seasonal component does not increase greatly over time (i.e. it is not multiplicative), so the model is additive
- seasonality component with lower sales at the beginning of the year and higher sales in the middle of summer
- differencing to make stationary
- use adf_kpss to find out how many times to apply differencing
# applying stationarity tests
# time series is stationary with d = 1
Plotting ACF and PACF:
- autocorrelation and partial autocorrelation plots give clues for the best ARIMA parameter values
- also shows if differencing is needed or if too much has already been applied

Autocorrelation Function (ACF)
- the plot of the autocorrelation of a time series by lag
- also known as correlogram
- includes direct and indirect dependence information
- describes how well a present value of a time series is related to past values
- bars represent ACF values at increasing lags
- shaded area represents the confidence interval, default 95%
- if bars lie within the shaded region, they are statistically insignificant
Partial Autocorrelation Function (PACF)
- describes only the direct relationship between an observation and its lag
- finds correlation of the residuals rather than finding correlations of present values with lagged values
- Autocorrelation and Partial Autocorrelation
- Identifying ARIMA parameters
- the time series should be made stationary before creating these plots
- if ACF values start high and trail off very slowly, it is a sign of non-stationarity and the need for differencing.
- if autocorrelation at lag-1 is very negative, it is a sign of too much differencing.
Plotting correlation and autocorrelation
- ACF shows period correlation patterns
- find the periodicity by looking for a significant peak at a lag greater than 1
- the peak above is at 7, so the seasonal component repeats every 7 steps, weekly
# no clear trailing off in either plot
- observed seasonal behavior by weeks
- suggest the use of a model with seasonal parameters, i.e. SARIMA
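The ACF/PACF plots above come from statsmodels' plotting helpers; a sketch for the differenced store-2 / item-1 series (sales_diff is a placeholder name for that differenced series):

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, axes = plt.subplots(2, 1, figsize=(10, 6))
plot_acf(sales_diff, lags=30, ax=axes[0])    # direct + indirect correlations by lag
plot_pacf(sales_diff, lags=30, ax=axes[1])   # direct correlations only
plt.show()
```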
Box-Jenkins Step Two: Estimate Coefficients (p, q)
- although SARIMA is the better choice, ARIMA is applied first for comparison (i.e. SARIMAX with no seasonality settings? Not sure, because the source author uses SARIMAX first as well.)
- this will show some of the advantages of choosing the appropriate model
- to choose proper parameters, there is some trial and error
- will use the ARIMA model with different values
- choosing best values based on metrics like AIC and BIC
AIC - Akaike Information Criterion
- tells how good a model is
- lower value means a better model
- penalizes models with many parameters
- i.e. if order is too high compared to the data, there will be a high score
- this indicates where work should be done to avoid overfitting to the training data
BIC - Bayesian Information Criterion
- similar to AIC in that lower value is better
- penalizes additional model orders more than AIC does
- consequently BIC will sometimes suggest a simpler model
- these statistics can be obtained after fitting a model
- there is usually some agreement between the two metrics
- if there is no agreement, it is best to choose a smaller AIC for a predictive model
SARIMAX() - model and attributes information
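The order search itself is roughly a double loop over candidate p and q values, collecting AIC and BIC for each fit. A sketch, with y_train as a placeholder for the training portion of the series:

```python
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rows = []
for p in range(6):
    for q in range(6):
        try:
            res = SARIMAX(y_train, order=(p, 1, q)).fit(disp=False)
            rows.append({'p': p, 'q': q, 'AIC': res.aic, 'BIC': res.bic})
        except Exception:
            continue  # skip orders that fail to converge

order_search = pd.DataFrame(rows).sort_values('AIC')
print(order_search.head())
```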
# AIC and BIC agree that p = 4 and q = 5 are the best parameter values
Box-Jenkins Step Three: Model Evaluation
- evaluating the accuracy of the model before choosing it as the best
- focusing on residuals for evaluation
- residuals are the difference between the model's one-step-ahead predictions and the real values of the time series
Mean Absolute Error (MAE)
- calculating the MAE of the residuals
- this will show on average how far off the predictions are from the true values
- the MAE is around 5 sales per day
- the average sale for item 1 in store 2 is 28 sales per day
- for an ideal model, the residuals should be uncorrelated, white Gaussian noise centered on zero
- in the following section, this will be evaluated as well
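A sketch of the residual MAE calculation, assuming results is the fitted model from above:

```python
import numpy as np

# one-step-ahead prediction errors are available as the residuals of the fit
mae = np.mean(np.abs(results.resid))
print(f"MAE of residuals: {mae:.2f} sales per day")
```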


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Diagnostic Summary Statistics
- Analyzing the residual test statistics from the results summary
- Evaluating Prob(Q) by applying the Ljung-Box Test with the Null Hypothesis: there are no correlations in the residuals
- Evaluating Prob(JB) by applying the Jarque-Bera Test with the Null Hypothesis: residuals are normally distributed
- Prob(Q) = 0.65, which is greater than 0.05
- this means do not reject the null hypothesis that the residuals are uncorrelated
- therefore the residuals are not correlated
- Prob(JB) = 0.02, which is less than 0.05
- this means reject the null hypothesis that the residuals are normally distributed
- i.e. the residuals are not normally distributed
Plot Diagnostics
- the following are four diagnostic plots using plot_diagnostics() that help in deciding whether a model is a good fit
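The four plots come from a single call on the fitted results object, for example:

```python
import matplotlib.pyplot as plt

# standardized residuals, histogram + KDE, Q-Q plot, and correlogram
results.plot_diagnostics(figsize=(12, 8))
plt.show()
```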
Conclusions:
- an ideal model will have residuals resembling uncorrelated white Gaussian noise centered on zero
- this concept will be evaluated using the plots above
Standardized Residual:
- there are no obvious patterns in the residuals
- this suggests a good model

Histogram and KDE Estimate:
- shows the measured distribution of the residuals
- green line shows the KDE curve, a smoothed version of the histogram
- a reference normal distribution, N(0,1), is also drawn
- for a good model, the KDE curve will be close to the N(0,1) line

Correlogram:
- 95% of correlations for lag greater than one should not be significant (inside the blue area)
- indicates a good model

Normal Q-Q:
- most of the data points occur on the line
- this indicates a normal distribution of the residuals
The conclusion is that these metrics all point to this being a good model.
- if the residuals are not normally distributed, increasing d can help fix this
- if the residuals are correlated, increasing p and/or q can help fix this


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Accounting for Seasonality
- for this, P, Q, D, and S should also be set
- earlier the ACF plot determined the seasonal period was 7 days, or 1 week
- remember to make a time series stationary before creating the ACF plot
- seasonal data might require seasonal differencing
- seasonal differencing subtracts the time series value from one previous cycle
- if the time series shows a trend, the normal difference is taken
- if there is a strong seasonal cycle, the seasonal difference will also be taken
- earlier, d = 1, and a general rule is that d + D should not exceed 2
- first find the two orders of differencing, d and D, and make the series stationary
- non-seasonal orders, such as p and q, can still be found by plotting the ACF and PACF of the differenced time series
- to find the seasonal orders, P and Q, plot the ACF and PACF of the differenced time series at multiples of the seasonal step
# Take the first and seasonal differences (7 days, 1 week) and drop NaNs
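That step is essentially two chained calls to diff(); a sketch, with sales standing in for the store-2 / item-1 series:

```python
# first (non-seasonal) difference, then the weekly seasonal difference, then drop the NaNs created
sales_diff = sales.diff().diff(7).dropna()
```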
- The non-seasonal ACF and PACF plots above show a moving average model pattern with q = 1
- for the seasonal parameters, the lags parameter takes a list of lags instead of a maximum
- follow with plotting ACF and PACF for these specific lags only.
- the seasonal ACF and PACF plots look like a moving average = 1 model, i.e., Q=1
- combining both of these results: SARIMA(order = (0,1,6), seasonal_order = (0,1,1,7))
- further progress in choosing model parameters can be made from investigating the AIC and BIC as done above
- however, there are many more parameters now that the seasonal ones are added
- instead Automated Model Selection is a good alternative
- Prob(Q) = 0.97, which is greater than 0.05
- do not reject the null hypothesis that residuals are uncorrelated
- residuals are not correlated
- Prob(JB) = 0.13, which is greater than 0.05
- do not reject the null hypothesis that the residuals are normally distributed
- the residuals are normally distributed
Standardized Residual:
- no obvious patterns in the residuals
- suggests a good model

Histogram and KDE Estimate:
- the two curves line up
- suggests a good model

Correlogram:
- 95% of correlations for lag greater than one should not be significant (inside the shaded area)
- these results also suggest a good model

Normal Q-Q:
- most of the data points should lie on the line, aside from outliers
- this indicates a normal distribution of the residuals


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Automated Model Selection
- pmdarima allows the model order search to be automated
- using the information from the Box-Jenkins identification step to predefine some of the orders before fitting
- this can speed up the process of choosing model orders, but needs to be done carefully
- automation can make mistakes due to imperfect data, which can affect the test scores in non-predictable ways
- the only required parameter in auto_arima is data
- however, using knowledge to specify other parameters can help find the best model
- seasonal = True # is the time series seasonal
- m = 7 # the seasonal period - one week
- d = 1 # non-seasonal difference order
- D = 1 # seasonal difference order
- max_p = 6 # max value of p to test
- max_q = 6 # max value of q to test
- max_P = 6 # max value of P to test
- max_Q = 6 # max value of Q to test
- information_criterion = 'aic' # used to select the best model
- trace = True # prints the information_criterion for each model it fits
- error_action = 'ignore' # ignore orders that don't work
- stepwise = True # apply an intelligent order search
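A sketch of the call with the parameters listed above (y_train is again a placeholder for the training series):

```python
import pmdarima as pm

auto_model = pm.auto_arima(
    y_train,
    seasonal=True, m=7,                 # weekly seasonal cycle
    d=1, D=1,                           # differencing orders found earlier
    max_p=6, max_q=6, max_P=6, max_Q=6,
    information_criterion='aic',
    trace=True,                         # print the AIC of each candidate model
    error_action='ignore',
    stepwise=True,                      # intelligent (stepwise) order search
)
print(auto_model.summary())
```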
# Best model: ARIMA(6,1,1)(6,1,0)[7]
- Prob(Q) = 1.00, which is greater than 0.05
- do not reject the null hypothesis that residuals are uncorrelated
- residuals are not correlated
- Prob(JB) = 0.98, which is greater than 0.05
- do not reject the null hypothesis that the residuals are normally distributed
- the residuals are normally distributed
Standardized Residual:
- no obvious patterns in the residuals
- suggests a good model

Histogram and KDE Estimate:
- shows the measured distribution of the residuals
- KDE (green line) shows a smoothed version of the histogram
- the two curves line up, but are slightly off

Correlogram:
- 95% of correlations for lag greater than one should not be significant (inside the shaded area)
- these results also suggest a good model

Normal Q-Q:
- most of the data points should lie on the line
- this indicates a normal distribution of the residuals
- comparing the two SARIMAX models, the first model's KDE curve is closer to a normal distribution than the second's
- the first model also achieved a lower MAE score than the second
- 4.55% vs 4.78% MAE scores
- conclusion: the first model is the more accurate


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Forecasting: SARIMA vs ARIMA
- working with both the ARIMA and the better of the two SARIMA models
- confirming that SARIMA is the better model
- working with the last 90 days of data as the validation set
Comparison Metrics:
- Mean Absolute Error and Mean Absolute Percentage Error will be used to compare models
- MAE & MAPE: evaluating forecasting models
- MAE is clear and easy to comprehend
- MAPE is unit-free and therefore useful for comparing forecast performance between datasets
Comparing Model Result Summaries
Model Predictions and Metrics
# sklearn.metrics.mean_absolute_error()
# sklearn.metrics.mean_absolute_percentage_error()
# results.get_prediction()
# predictions.predicted_mean
# creating the predictions and predicted mean for each model
# getting the MAE and MAPE for each model's predictions
# combining model metric results into a dataframe
# model metric results: MAE and MAPE
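A sketch of how those metrics can be produced for one model, assuming results is a fitted SARIMAX results object and validation is a hypothetical dataframe holding the last 90 days of actual sales:

```python
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error

pred = results.get_prediction(start=validation.index[0], end=validation.index[-1])
y_hat = pred.predicted_mean

mae = mean_absolute_error(validation['sales'], y_hat)
mape = mean_absolute_percentage_error(validation['sales'], y_hat)
print(f"MAE: {mae:.2f}   MAPE: {mape:.2%}")
```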
- based on these metrics, the second SARIMAX model performed the best and has the lowest loss metrics
# plotting the predictions of all three models
# actual values vs ARIMA predictions
# actual values versus first SARIMAX predictions
# actual values versus second (optimized) SARIMAX predictions
- of the three models, the optimized, automated selection SARIMAX model clearly follows the actual values more closely
Forecasting into the Future
# results.get_forecast()
# getting forecast predictions into the future with ARIMA and optimized SARIMAX
- the optimized SARIMAX model follows the trajectory of the actual values much more than the ARIMA model, which ignores the seasonal information
Saving the optimized SARIMAX model
Loading the saved model
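Saving and reloading can be done with the results object's built-in pickling helpers; a minimal sketch (the filename is illustrative):

```python
import statsmodels.api as sm

results.save('sarimax_optimized.pkl')       # persist the fitted results
loaded = sm.load('sarimax_optimized.pkl')   # reload later for forecasting
print(loaded.summary())
```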


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Section Conclusions
- the optimized model could be further improved by using a true SARIMAX model and adding exogenous data such as holidays, etc.
- Python Holidays Library


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Forecasting with Facebook Prophet Model
- Using Prophet on the Walmart Forecasting Dataset
- Facebook Prophet on GitHub
- Facebook / Meta Data Science Research
- Prophet works with univariate (one variable) time series forecasting data based on the additive model
- it supports trends, seasonality, and holidays
- works best with data that has strong seasonal effects and several seasons' worth of historical data
- Prophet is robust to missing data and shifts in the trend, and it handles outliers well
- Prophet is easy to use
- automatically finds a good set of hyperparameters for the model
# df.resample()
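A sketch of the aggregation step, assuming a hypothetical train_df with a datetime 'date' column and a 'sales' column:

```python
# total sales per day across all stores and items
daily_sales = (train_df
               .set_index('date')['sales']
               .resample('D')
               .sum()
               .reset_index())
```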
Conclusions:
- there is a clear upward trend in the time series
- the time series is not stationary
- the seasonal component is similar across time, not multiplicative
- this points to the model being additive
- seasonality in sales is higher in July and lower in January


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Prophet Data Prep & Training
- Prophet requires as input a dataframe with two columns
- ds will be the datetime column
- y represents the metric to be forecast
# Prophet
Model Training
- Prophet does not require hyperparameter specification
- however, if the data is multiplicative, seasonality_mode must be set to 'multiplicative', since Prophet defaults to an additive model
- for this data, no such parameter needs to be set
- Prophet Documentation
- interval_width, which refers to the confidence level, is 0.8 by default
- setting this parameter to 0.95
Forecasting:
- make_future_dataframe() creates a dataframe for future predictions to be stored
- the prediction length is based on the periods parameter
- the prediction periods here will be 90 for 90 days
- by default, it also includes the historical dates
# Prophet.make_future_dataframe()
# make predictions by calling predict on the future dataframe
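A minimal end-to-end sketch, assuming the prophet package and a dataframe df with 'ds' and 'y' columns:

```python
from prophet import Prophet

m = Prophet(interval_width=0.95)    # widen the uncertainty interval from the 0.8 default
m.fit(df)

future = m.make_future_dataframe(periods=90)   # 90 days ahead, plus the historical dates
forecast = m.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail())
```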
- the forecast dataframe contains Prophet's predictions on sales
- because it also includes the historical dates, Prophet provides an in-sample fit that can be used to evaluate the model
- the forecast dataframe includes a yhat column with the predicted values
- it also includes columns for components and uncertainty intervals
Plotting the Prophet Forecast
- the dark blue line is the forecast for sales, forecast['yhat']
- the black dots are the actual sales values
- the light blue shading is the 95% confidence interval around the forecast
- the uncertainty interval is bounded by forecast['yhat_lower'] and forecast['yhat_upper']
Trend Changepoints
- trend changepoints refer to the abrupt changes in trajectory that real-life time series data often contains
- these are caused by things like new product launches, unforeseen problems, etc.
- Prophet automatically detects changepoints and allows the trend to adapt appropriately
- the growth rate varies and makes the model more flexible
- this can cause overfitting or underfitting, however
- changepoint_prior_scale is the parameter that can be used to adjust trend flexibility and deal with over or underfitting
- a higher value fits a more flexible curve to the time series
- by default, changepoints are only inferred for the first 80% of the data
- changepoint_range can be used to affect this behavior
- changepoints can also be added manually using the changepoints argument
- Changepoint Documentation
- Changepoints are represented by the dotted lines in the plot below
# component plots
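The forecast, changepoints, and components can be plotted like this (a sketch reusing m and forecast from above):

```python
from prophet.plot import add_changepoints_to_plot

fig = m.plot(forecast)
add_changepoints_to_plot(fig.gca(), m, forecast)   # dotted lines at the detected changepoints

m.plot_components(forecast)                        # trend, weekly, and yearly panels
```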
Observations
- trend component - trends upwards
- weekly seasonality component - shows more purchases on weekends and a drop in sales from Sunday to Monday
- yearly seasonality component - confirms previously seen trends as well, in the increase in sales around July and decrease around January
- the model can be further improved by accounting for holidays, etc.
- Prophet: Seasonality, Holidays, and Special Events


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Prophet Model Evaluation
- the forecast dataframe also contains predictions made on the training data
- it is possible to use this in-sample fit to evaluate the model
# using MAE and MAPE to evaluate the model
Prophet's Diagnostic Tools
- cross-validation and hyperparameter tuning are also available through Prophet
- Prophet Diagnostic Tools Documentation
Cross Validation
- this compares predicted values with the actual values
- forecast horizon (horizon) must be specified
- the initial training period (initial) must also be specified
- period refers to the spacing between cutoff dates
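A sketch of the diagnostic calls; the exact initial/period/horizon values used in the notebook may differ from the ones shown here:

```python
from prophet.diagnostics import cross_validation, performance_metrics

df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='90 days')
df_p = performance_metrics(df_cv)
print(df_p[['horizon', 'mae', 'mape']].head())
```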
Performance Metrics
- the blue line shows the MAE (above) and MAPE (below), where the mean is taken over a rolling window represented by the dots
- errors around 18% are typical for predictions 9 days into the future
- error decreases to around 17% for predictions 90 days out


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Fine-Tuning Prophet
- using grid-search to fine-tune the hyperparameters
- hyperparameters to tune are:
    - changepoint_prior_scale
    - seasonality_prior_scale
- changepoint_prior_scale determines the flexibility of the trend, specifically how much the trend changes at the trend changepoints
- seasonality_prior_scale controls the flexibility of seasonality
- Prophet Hyperparameter Tuning
# code from the Prophet documentation that I turned into a function
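A sketch of that grid search, adapted from the Prophet documentation but scoring on MAPE to match the metrics used here (the cross-validation cutoff settings are assumptions):

```python
import itertools
import numpy as np
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

param_grid = {
    'changepoint_prior_scale': [0.001, 0.01, 0.1, 0.5],
    'seasonality_prior_scale': [0.01, 0.1, 1.0, 10.0],
}
all_params = [dict(zip(param_grid, v)) for v in itertools.product(*param_grid.values())]
mapes = []

for params in all_params:
    m = Prophet(**params).fit(df)
    df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='90 days')
    df_p = performance_metrics(df_cv, rolling_window=1)
    mapes.append(df_p['mape'].values[0])

best_params = all_params[int(np.argmin(mapes))]
print(best_params)
```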
Hypertuned Model
# results are the same as the model before hyperparameter tuning
# comparing prophet_01 and prophet_02
- the fine-tuned, second Prophet model slightly outperformed the first


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Incorporating Holidays
- adding US Holidays by using add_country_holidays(country_name='US')
Including the US holidays provided by Prophet improves the model a bit. Even though it is not a significant improvement, it shows that adjusting parameters based on the business case can work in our favor.
# adding holidays improved the model a bit more
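Holidays are added before fitting; a minimal sketch:

```python
m = Prophet(interval_width=0.95)
m.add_country_holidays(country_name='US')   # built-in US holiday list
m.fit(df)
print(m.train_holiday_names)                # the holidays Prophet actually added
```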
Saving and Loading Best Model
- in Python, Prophet models should not be saved with pickle
- the attached Stan backend does not serialize well with pickle
- instead use the built-in serialization functions to serialize the model to json
# saving model
# loading model
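A sketch of the JSON serialization round trip (the filename is illustrative):

```python
from prophet.serialize import model_to_json, model_from_json

with open('prophet_model.json', 'w') as f:
    f.write(model_to_json(m))               # save the fitted model

with open('prophet_model.json') as f:
    m_loaded = model_from_json(f.read())    # load it back for later forecasts
```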


| Top | Components | Additive / Multiplicative | Google Trends | Stationarity Tests | Making Stationary | Global Temperature | ARIMA | Walmart Data | Box-Jenkins Method | Summary Statistics | Adding Seasonality | Automated Model Selection | Model Forecasting Compared | Section Conclusions | Facebook Prophet | Prophet Pre & Training | Prophet Evaluation | Prophet Fine-Tuning | Adding Holidays | Prophet Conclusions |

Conclusion
- the Prophet models clearly outperform the ARIMA and SARIMA models
- the best Prophet model has an MAE 32.38% lower than the MAE of the best SARIMA model
- MAPE of the best Prophet is 21.13% lower than the MAPE of the best SARIMA model
Comparing all results - SARIMA models vs Prophet models