| Top | SKTime Overview | Traffic Dataset | Pre-processing | FB Prophet Model | Forecasting Function | AutoARIMA Model |

Forecasting with SKTime (with Dave Ebbelaar)
| YouTube Tutorial | Tutorials GitHub | Dave Ebbelaar |
Evan Marie online: | EvanMarie.com | EvanMarie@Proton.me | LinkedIn | GitHub | Hugging Face | Mastodon | Jovian.ai | TikTok | CodeWars | Discord ⇨ ✨ EvanMarie ✨#6114 |



SKTime Overview
- SKTime example
- Using the airline dataset from SKTime
- Running a forecasting model and predicting
# investigating imported data
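temporal_train_test_split() - the SKTime splitter; it keeps the time order intact, so the training data always comes before the test data
ForecastingHorizon() - matches the indices of the test data, i.e. the time points we want to predict
ThetaForecaster()
- The seasonal period (sp) is set to 12: the airline data is monthly, so one seasonal cycle is 12 observations, or 1 year
- sp is the number of observations per seasonal cycle, so it depends on the frequency of the data
- 1 for yearly data (no seasonal cycle), 12 for monthly data, and 4 for quarterly data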
# getting predictions
# predictions versus actual numbers
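The steps above can be mimicked without SKTime to see what each one does. The sketch below is an illustration in plain pandas and NumPy, not the notebook's code: the series is synthetic (a stand-in for the airline data), the split is a simple slice that mirrors what temporal_train_test_split() guarantees, and the forecast is a seasonal-naive rule swapped in for ThetaForecaster() purely to show what a seasonal period of 12 means.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (assumption: stands in for the airline data).
index = pd.period_range("2000-01", periods=48, freq="M")
months = np.arange(48)
y = pd.Series(100 + months + 10 * np.sin(2 * np.pi * months / 12), index=index)

# Chronological split: the last 12 months are held out and nothing is shuffled.
test_size = 12
y_train, y_test = y.iloc[:-test_size], y.iloc[-test_size:]

# The horizon is the set of test time points we want predictions for.
horizon = y_test.index

# Seasonal-naive forecast with sp = 12: repeat the last observed yearly cycle.
sp = 12
last_cycle = y_train.iloc[-sp:].to_numpy()
steps = np.arange(len(horizon))
y_pred = pd.Series(last_cycle[steps % sp], index=horizon)

# Predictions versus actual numbers, plus the mean absolute percentage error.
comparison = pd.DataFrame({"actual": y_test, "predicted": y_pred})
errors = (comparison["actual"] - comparison["predicted"]).abs()
mape = errors.div(comparison["actual"]).mean()
```

Because the prediction index is taken directly from the test data, the two columns of `comparison` line up row by row, which is the same reason the notebook builds its horizon from the test indices.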



Traffic Dataset
- importing the traffic dataset
- Dataset on Kaggle
# dataset overview
# pivot() - so that the columns are the 4 junctions, and the values are the number of vehicles at each junction in a given hour
# visualization - the hourly traffic at the 4 different junctions
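As a rough illustration of the pivot() step, the sketch below reshapes a tiny long-format table into one column per junction. The column names `DateTime`, `Junction` and `Vehicles` and all of the values are assumptions made for the example; only two junctions are used to keep it short.

```python
import pandas as pd

# Tiny long-format stand-in: one row per (timestamp, junction) pair.
long_df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2017-01-01 00:00", "2017-01-01 01:00", "2017-01-01 02:00"] * 2
    ),
    "Junction": [1, 1, 1, 2, 2, 2],
    "Vehicles": [15, 13, 10, 6, 7, 5],
})

# Wide format: timestamps as the index, one column per junction.
wide_df = long_df.pivot(index="DateTime", columns="Junction", values="Vehicles")
print(wide_df)
```

With the data in this shape, each column is a separate time series, which is what makes plotting or forecasting one junction at a time straightforward.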



Pre-processing the Data
- The data requires resampling, because Pandas does not recognize the frequency of the timestamps, i.e. the time step between the index entries
resample() for frequency
- rule = 'H' - resamples to hourly (pandas 2.2 and later prefer the lowercase alias 'h')
- sum() - we want to know the total number of vehicles per hour
daily - if the resampling frequency is set to daily (rule = 'D'), the data is aggregated for each column by day
- This results in far fewer rows
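A minimal sketch of the resampling step, assuming a small made-up table with two junction columns: the index is built from raw timestamps, so it carries no frequency until resample() is applied. The values and dates are hypothetical.

```python
import pandas as pd

# Index built from raw timestamps: pandas has no frequency information for it.
stamps = pd.to_datetime([
    "2017-01-01 00:00", "2017-01-01 01:00", "2017-01-01 02:00",
    "2017-01-02 00:00", "2017-01-02 01:00",
])
wide_df = pd.DataFrame({1: [15, 13, 10, 12, 9], 2: [6, 7, 5, 4, 8]}, index=stamps)
print(wide_df.index.freq)  # None

# Hourly resampling gives the index an explicit frequency.
hourly = wide_df.resample("h").sum()

# Daily resampling aggregates each column per day, so there are far fewer rows.
daily = wide_df.resample("D").sum()
print(len(hourly), len(daily))
```

One thing to keep in mind with sum(): hours that are missing from the input show up in the hourly result as rows of 0 rather than as gaps, so it is worth checking whether a 0 is a real count or a hole in the data.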



Forecasting with FB Prophet
"Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well." (Source)
horizon = 30 - predicting 30 days into the future
Working with just one of the four junctions at first
predictions: number of vehicles as well as confidence intervals
confidence interval predictions (90% coverage)
In frequentist statistics, a confidence interval (CI) is a range of estimates for an unknown parameter. A confidence interval is computed at a designated confidence level; the 95% confidence level is most common, but other levels, such as 90% or 99%, are sometimes used. The confidence level represents the long-run proportion of CIs (at the given confidence level) that theoretically contain the true value of the parameter. For example, out of all intervals computed at the 95% level, 95% of them should contain the parameter's true value. Factors affecting the width of the CI include the sample size, the variability in the sample, and the confidence level. All else being the same, a larger sample produces a narrower confidence interval, greater variability in the sample produces a wider confidence interval, and a higher confidence level produces a wider confidence interval.
(Source)
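To make the last sentence of that definition concrete, here is a small sketch using a textbook normal-approximation interval for a sample mean. It is not how Prophet derives its uncertainty bands; the function name and the sample values are hypothetical, and the point is only to show how the confidence level and the sample size change the width.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level):
    """Normal-approximation confidence interval for the mean of a sample."""
    # Two-sided critical value, e.g. about 1.645 for a 90% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    center = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return center - half_width, center + half_width


sample = [12, 15, 11, 14, 13, 16, 12, 15]  # hypothetical vehicle counts

low90, high90 = mean_confidence_interval(sample, 0.90)
low95, high95 = mean_confidence_interval(sample, 0.95)

# Same values repeated four times: a larger sample with similar variability.
big_low90, big_high90 = mean_confidence_interval(sample * 4, 0.90)

print(high90 - low90, high95 - low95, big_high90 - big_low90)
```

The 95% interval comes out wider than the 90% one, and the larger sample gives a narrower interval at the same level, matching the behaviour described above.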
Making Future Predictions
- use all data to train, since this will go into the future
- using the same 30 day forecast horizon
Forecasting Horizon
- this starts right after the end of the input data and runs 30 days (about one month) into the future
- this creates the date range of the days to predict
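One way such a date range could be built with pandas is sketched below. The history dates are made up, and `future_dates` is a hypothetical name; the only idea taken from the notes is a 30 day horizon that starts where the data ends.

```python
import pandas as pd

# Stand-in for the index of the daily training data (assumed dates).
history = pd.date_range("2017-06-01", "2017-06-30", freq="D")

horizon = 30  # same 30 day forecast horizon as before

# Start one day after the last observation and step forward daily.
future_dates = pd.date_range(
    start=history[-1] + pd.Timedelta(days=1), periods=horizon, freq="D"
)
print(future_dates[0], future_dates[-1])

# With SKTime installed, this index could then be wrapped as an absolute
# horizon, e.g. ForecastingHorizon(future_dates, is_relative=False).
```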



Forecasting Function
- loop over a dataframe of time series data
- train an SKTime forecasting model
- make predictions
- visualize the results
Forecaster
- yearly_seasonality and weekly_seasonality are set to True, because traffic data is assumed to show both yearly and weekly patterns
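The loop described above might look roughly like the sketch below. It is written against a generic fit/predict interface so it runs without SKTime: `LastValueForecaster` is a hypothetical stand-in, `forecast_all` is an assumed name, and the data is made up. In the notebook the forecaster would be the Prophet model with the seasonality settings listed above, and a real SKTime forecaster may need the horizon wrapped in ForecastingHorizon(). Plotting is left out here.

```python
import pandas as pd


class LastValueForecaster:
    """Hypothetical stand-in with an SKTime-like fit/predict interface."""

    def fit(self, y):
        self._last = y.iloc[-1]
        return self

    def predict(self, fh):
        # Repeat the last observed value over the whole horizon.
        return pd.Series(self._last, index=fh)


def forecast_all(df, make_forecaster, horizon=30):
    """Fit a fresh forecaster per column and predict `horizon` days ahead."""
    fh = pd.date_range(
        df.index[-1] + pd.Timedelta(days=1), periods=horizon, freq="D"
    )
    forecasts = {}
    for column in df.columns:
        forecaster = make_forecaster()  # new model for every time series
        forecaster.fit(df[column])
        forecasts[column] = forecaster.predict(fh)
    return pd.DataFrame(forecasts)


# Made-up daily data for two junctions.
index = pd.date_range("2017-06-01", periods=10, freq="D")
traffic = pd.DataFrame({1: range(10, 20), 2: range(30, 40)}, index=index)

predictions = forecast_all(traffic, LastValueForecaster, horizon=30)
```

Passing a factory (`make_forecaster`) rather than a single fitted object means every column gets its own untrained model, so one junction's fit cannot leak into another's.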



Forecasting with AutoARIMA
"An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time-series data to better understand the data set or predict future trends. A statistical model is autoregressive if it predicts future values based on past values. For example, an ARIMA model might seek to predict a stock’s future prices based on its past performance or forecast a company’s earnings based on past periods." (Source)
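The "autoregressive" part of that definition can be illustrated with the simplest case, an AR(1) model, where each value is predicted from the one before it. The sketch below is a simplified illustration with NumPy on a made-up, noise-free series; it is not AutoARIMA, which also selects the model orders and handles differencing and moving-average terms.

```python
import numpy as np

# Made-up series where each value is 0.8 times the previous one plus 2.
values = [1.0]
for _ in range(19):
    values.append(0.8 * values[-1] + 2.0)
y = np.array(values)

# Least-squares fit of y[t] = c + phi * y[t-1].
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
(c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# One-step-ahead forecast from the last observed value.
next_value = c + phi * y[-1]
print(c, phi, next_value)
```

Because the series has no noise, the fit recovers the generating coefficients almost exactly; with real traffic data the estimates would only approximate the underlying pattern.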

