How to Forecast Time Series With Multiple Seasonalities

data science python time series Aug 24, 2022

 When dealing with time series, we often encounter seasonality. Seasonality is defined as a periodical variation in our series. It is a cycle that occurs over a fixed period in our series. For example, let’s take a look at the popular airline dataset shown below.

Monthly total number of air passengers for an airline, from January 1949 to December 1960. We notice a clear seasonal pattern in the series, with more people travelling during the months of June, July, and August (image by the author)

Here, we can clearly see a seasonal cycle: every year, the number of air passengers peaks around July and then falls again.

To forecast this series, we can simply use a SARIMA model, since there is only one seasonal period with a length of one year.

Now, things get complicated when we are working with high frequency data. For example, an hourly time series can exhibit a daily, weekly, monthly and yearly seasonality, meaning that we now have multiple seasonal periods. Take a look at the hourly traffic volume on the Interstate 94 shown below.

Hourly traffic volume, westbound, on Interstate 94 in Minneapolis, Minnesota. Here we can see both a daily seasonality (more cars are on the road during the day than at night) and a weekly seasonality (more cars are on the road Monday to Friday than during the weekend). Image by the author

Looking at the data above, we can see that we have two seasonal periods! First, we have a daily seasonality, as we see that more cars travel on the road during the day than during the night. Second, we have a weekly seasonality, as traffic volume is higher during weekdays than during the weekend.

In this case, a SARIMA model cannot be used, because it lets us specify only one seasonal period, whereas our data clearly has two: a daily seasonality and a weekly seasonality.

We thus turn our attention to BATS and TBATS models. Using these models, we can fit and forecast time series that have more than one seasonal period.

In this article, we first explore the theory behind BATS and TBATS, and then apply them to forecast the hourly traffic volume for the next seven days. Let’s get started!

The intuition behind BATS and TBATS

Before we dive into the project, let’s first understand how BATS and TBATS work behind the scenes.

I promise the theory isn’t so bad (image by the author)

BATS

The acronym BATS refers to the method: exponential smoothing state-space model with Box-Cox transformation, ARMA errors, Trend, and Seasonal components. There is a lot to dissect here, so let’s go step by step.

  • Exponential smoothing is a family of forecasting methods. The general idea behind these methods is that future values are a weighted average of past values, with the weights decaying exponentially as we go back in time. The family includes simple exponential smoothing, double exponential smoothing or Holt’s method (for time series with a trend), and triple exponential smoothing or the Holt-Winters method (for time series with a trend and seasonality).
  • State-space modelling is a framework in which a time series is seen as a set of observed data that is influenced by a set of unobserved factors. The state-space model then expresses the relationship between the two sets. It is a general framework rather than a single model: for example, an ARMA model can itself be expressed as a state-space model.
  • Box-Cox transformation is a power transformation that stabilizes the variance of the series and brings its distribution closer to normal, which makes non-linear data easier to model.
  • ARMA errors is a process in which we apply an ARMA model on the residuals of the time series in order to find any unexplained relationship. Usually, the residuals of a model should be totally random, unless some information was not captured by the model. Here, we use an ARMA model to capture any remaining information in the residuals.
  • Trend is a component of a time series that explains the long-term change in the mean value of the series. When we have a positive trend, then our series is increasing over time. With a negative trend, the series decreases over time.
  • The seasonal component is what explains the periodical variation in the series.
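
To make the idea of "a weighted average with exponentially decaying weights" concrete, here is a minimal sketch of simple exponential smoothing in plain Python. The function names, the toy values, and the choice of alpha = 0.5 are illustrative assumptions, and initializing the level with the first observation is only one of several possible conventions. This is not how BATS is implemented internally.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}
    The first level is set to the first observation (an assumption).
    """
    levels = [series[0]]
    for y in series[1:]:
        levels.append(alpha * y + (1 - alpha) * levels[-1])
    return levels


def smoothing_weights(alpha, n_lags):
    """Weight given to the observation k steps back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


toy_series = [10.0, 12.0, 11.0, 13.0]  # made-up values, for illustration only
levels = simple_exp_smoothing(toy_series, alpha=0.5)
forecast = levels[-1]  # the one-step-ahead forecast is the last level
weights = smoothing_weights(alpha=0.5, n_lags=4)

print(levels)   # [10.0, 11.0, 11.0, 12.0]
print(weights)  # [0.5, 0.25, 0.125, 0.0625]
```

Each weight is half of the previous one, which is what "decaying exponentially" means in practice.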

To summarize, BATS is an extension of exponential smoothing methods that combines a Box-Cox transformation to handle non-linear data and uses an ARMA model to capture autocorrelation in the residuals.
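
As a rough illustration of the Box-Cox step, the sketch below implements the standard one-parameter transformation and its inverse in plain Python. The helper names and the sample values are assumptions made for this example. In practice the model estimates the parameter lambda during fitting rather than having it fixed by hand.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)  # limiting case of the power transform
    return (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


values = [1.0, 10.0, 100.0, 1000.0]  # made-up values spanning several orders of magnitude
transformed = [box_cox(v, 0.5) for v in values]
restored = [inverse_box_cox(z, 0.5) for z in transformed]

# Large values are compressed much more than small ones
print([round(z, 2) for z in transformed])
```

Note that the transformation is only defined for positive values, so a series containing zeros would need to be shifted first.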

The advantages of BATS are that it can handle non-linear data, address autocorrelation in the residuals through its ARMA component, and take multiple seasonal periods into account.

However, the seasonal periods must be integer numbers, otherwise BATS cannot be applied. For example, suppose that you have weekly data with a yearly seasonality, then your period is 365.25/7 which is approximately 52.2. In that case, BATS is ruled out.

Furthermore, BATS can take a long time to fit if the seasonal period is very large, meaning that it is not suitable if you have hourly data with a monthly seasonality (the period would be 730).
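
The period lengths quoted above are simple arithmetic: the seasonal period is the number of observations in one cycle. The short sketch below reproduces them; the constant and helper names are illustrative, and the monthly figure assumes a 365-day year split into 12 equal months.

```python
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7
DAYS_PER_YEAR = 365.25  # average year length, accounting for leap years

# Hourly data
daily_period = HOURS_PER_DAY                   # 24 observations per day
weekly_period = HOURS_PER_DAY * DAYS_PER_WEEK  # 168 observations per week
monthly_period = HOURS_PER_DAY * 365 / 12      # 730.0 observations per month

# Weekly data with a yearly seasonality
yearly_period_weekly_data = DAYS_PER_YEAR / DAYS_PER_WEEK  # about 52.18


def is_integer_period(period):
    """BATS needs integer periods; this checks whether one qualifies."""
    return float(period).is_integer()


print(is_integer_period(weekly_period))              # True
print(is_integer_period(yearly_period_weekly_data))  # False
```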

Thus, the TBATS model was developed to address that situation.

TBATS

The acronym TBATS stands for Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components.

It uses the same components as the BATS model, but it represents each seasonal period with trigonometric terms based on Fourier series. This allows the model to fit large seasonal periods and non-integer seasonal periods.

It is thus a better choice when dealing with high-frequency data and it usually fits faster than BATS.
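
To give a feel for the trigonometric representation, the sketch below builds the sine and cosine terms of a Fourier series for a given period. It only shows the basis functions, not the TBATS state-space equations, and the function name and the number of harmonics are assumptions chosen by hand for this example.

```python
import math


def fourier_terms(t, period, n_harmonics):
    """Return [sin, cos] pairs for harmonics k = 1..n_harmonics at time step t."""
    terms = []
    for k in range(1, n_harmonics + 1):
        angle = 2 * math.pi * k * t / period
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms


# A weekly cycle in hourly data spans 168 steps,
# yet it is described here by only 2 * 3 = 6 numbers per time step.
weekly_features = fourier_terms(t=42, period=168, n_harmonics=3)

# Nothing in the formula requires an integer period,
# so weekly data with a yearly cycle works the same way.
yearly_features = fourier_terms(t=10, period=365.25 / 7, n_harmonics=2)

print(len(weekly_features), len(yearly_features))  # 6 4
```

A long period therefore costs a handful of terms instead of one parameter per step in the cycle, which is why the period length matters much less for this representation.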

Here, I purposely avoided the math to avoid any confusion. For a detailed mathematical explanation of both BATS and TBATS, I suggest you read this paper.

Now that we have an intuition on how both models work, let’s apply them to forecast the next seven days of hourly traffic volume.

Applying BATS and TBATS in Python

Let’s see both models in action when forecasting the hourly traffic volume. You can refer to the entire source code on GitHub.

Exploration

First, we import the required libraries for this project.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Then, we read the CSV file containing our data. Note that you can download it from GitHub as well.

data = pd.read_csv('daily_traffic.csv')
data = data.dropna()

Great! With this done, we can now visualize our data.

fig, ax = plt.subplots(figsize=(14, 8))
ax.plot(data['traffic_volume'])
ax.set_xlabel('Time')
ax.set_ylabel('Traffic volume')
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Hourly traffic volume on the interstate 94, in Minneapolis, Minnesota (image by the author)

From the figure above, we can clearly see that we have two seasonal periods. Let’s zoom in and label the days of the week to identify both periods.

fig, ax = plt.subplots(figsize=(14, 8))
ax.plot(data['traffic_volume'])
ax.set_xlabel('Time')
ax.set_ylabel('Traffic volume')
plt.xticks(np.arange(7, 400, 24), ['Friday', 'Saturday', 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
plt.xlim(0, 400)
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Hourly traffic volume, westbound, on Interstate 94 in Minneapolis, Minnesota. Here we can see both a daily seasonality (more cars are on the road during the day than at night) and a weekly seasonality (more cars are on the road Monday to Friday than during the weekend). Image by the author

Of course, we recognize the plot from the beginning of this article and notice that the traffic volume is indeed lower during the weekend than during the weekdays. Also, we see a daily seasonality, with traffic being heavier during the day than at night.

Therefore, we have two periods: the daily period has a length of 24 hours, and the weekly period has a length of 168 hours. Let’s keep that in mind as we move on to modeling.

Modeling

We are now ready to start modeling our data. Here, we use the sktime package. I just discovered this framework which brings many statistical and machine learning methods for time series. It also uses a similar syntax convention to scikit-learn, making it easy to use.

The first step is to define our target and define the forecast horizon. Here, the target is the traffic volume itself. For the forecast horizon, we wish to predict one week of data. Since we have hourly data, we must then predict 168 timesteps (7 * 24) into the future.

y = data['traffic_volume']
fh = np.arange(1, 169)  # steps 1 to 168; the stop value is exclusive

Then, we split our data into a training set and a test set. We will keep the last week of data as a test set in order to evaluate our predictions.

Here, we use the temporal_train_test_split function from sktime.

from sktime.forecasting.model_selection import temporal_train_test_split
y_train, y_test = temporal_train_test_split(y, test_size=168)
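
Conceptually, a temporal split just cuts the series at a point in time without shuffling. The plain-Python sketch below shows the idea on placeholder data; the function name `temporal_split` is hypothetical and this is not sktime's implementation.

```python
def temporal_split(values, test_size):
    """Keep the last `test_size` points as the test set, in their original order."""
    if not 0 < test_size < len(values):
        raise ValueError("test_size must be between 1 and len(values) - 1")
    return values[:-test_size], values[-test_size:]


hourly_values = list(range(24 * 7 * 4))  # four weeks of placeholder hourly observations
train, test = temporal_split(hourly_values, test_size=168)

# The forecast horizon counts steps ahead of the end of the training set
horizon = list(range(1, len(test) + 1))

print(len(train), len(test))    # 504 168
print(horizon[0], horizon[-1])  # 1 168
```

The horizon has exactly as many steps as the test set, so every test observation gets a matching prediction.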

Optionally, we can visualize our test set.

fig, ax = plt.subplots(figsize=(14, 8))
ax.plot(y_train, ls='-', label='Train')
ax.plot(y_test, ls='--', label='Test')
ax.set_xlabel('time')
ax.set_ylabel('Traffic volume')
ax.legend(loc='best')
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Visualizing the training set and test set. The test set is simply the last week of data, as shown by the dashed orange line. The rest is used for fitting the models (image by the author).

Baseline model

Before we implement our more complex BATS and TBATS models, it’s always a good idea to have a baseline model. That way, we can determine if our more complex forecasting methods are actually performant.

Here, the simplest baseline I can think of is simply repeating the last week of data from the training set into the future.

y_pred_baseline = y_train[-168:].values
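
This kind of baseline is often called a seasonal naive forecast. The sketch below generalizes it slightly so that the horizon can be longer than one season; the function name and the toy data are assumptions for illustration.

```python
def seasonal_naive(train, season_length, horizon):
    """Forecast by repeating the last full season of the training data."""
    if len(train) < season_length:
        raise ValueError("need at least one full season of training data")
    last_season = train[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


train_toy = [1, 2, 3, 4, 5, 6]  # made-up series with a season of length 3

same_length = seasonal_naive(train_toy, season_length=3, horizon=3)
longer = seasonal_naive(train_toy, season_length=3, horizon=7)

print(same_length)  # [4, 5, 6]
print(longer)       # [4, 5, 6, 4, 5, 6, 4]
```

With a season length of 168 and a horizon of 168, this reduces to copying the last week of the training set.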

Applying BATS

Now that we have a baseline, let’s move on to implementing the BATS model.

We first import the BATS model from sktime. Then, we specify the parameters of the model for training. Here, we want to use the Box-Cox transformation as we are dealing with non-linear data. Then, since our dataset does not have an apparent trend, we remove those components from the model. Finally, we specify the seasonal periods, which are 24 (for the daily seasonality) and 168 (for the weekly seasonality).

Once the model is specified, we simply fit it on the training set and generate the predictions over the forecast horizon.

All of the steps outlined above translate into the code below.

from sktime.forecasting.bats import BATS
forecaster = BATS(use_box_cox=True,
                  use_trend=False,
                  use_damped_trend=False,
                  sp=[24, 168])
forecaster.fit(y_train)
y_pred_BATS = forecaster.predict(fh)

Applying TBATS

Forecasting using TBATS turns out to be exactly like using BATS, only now, well… we use TBATS!

from sktime.forecasting.tbats import TBATS
forecaster = TBATS(use_box_cox=True,
                   use_trend=False,
                   use_damped_trend=False,
                   sp=[24, 168])
forecaster.fit(y_train)
y_pred_TBATS = forecaster.predict(fh)

Evaluating the performance

At this point, we have predictions from our baseline model, BATS, and TBATS. We are then ready to visualize the predictions and see which model performs best.

Visualizing the predictions gives the following plot.

fig, ax = plt.subplots(figsize=(14, 8))
ax.plot(y_train, ls='-', label='Train')
ax.plot(y_test, ls='-', label='Test')
ax.plot(y_test.index, y_pred_baseline, ls=':', label='Baseline')
ax.plot(y_pred_BATS, ls='--', label='BATS')
ax.plot(y_pred_TBATS, ls='-.', label='TBATS')
ax.set_xlabel('time')
ax.set_ylabel('Traffic volume')
ax.legend(loc='best')
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Forecasting the next 168 hours of traffic volume. We can see that all models seem to generate similar predictions, as the lines overlap one another. It is hard to know which model performs best by looking at the plot. (image by the author)

Looking at the figure above, it seems that all of our models generate very similar predictions, as the lines are overlapping. It is very hard to determine which model performs best just by looking at the plot.

We can optionally zoom in on the test set to better visualize the predictions.

Zooming in on the test set. Here, it seems that BATS does a great job at modeling both seasonalities, whereas TBATS sometimes overshoots or undershoots. Note that the baseline also follows the actual values very well. (image by the author)

Looking at the figure above, we first notice that both models indeed model a double seasonality, which is great in itself! Also, it seems that BATS does a better job at predicting the future, since TBATS seems to sometimes overshoot or undershoot. Note also that the baseline model closely follows the curve of actual values.

We now compute an error metric to determine the best model and compare their performance. In this case, we use the mean absolute percentage error (MAPE), for its ease of interpretation. Recall that the closer the MAPE is to 0, the better the performance.

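scikit-learn has included a mean_absolute_percentage_error function since version 0.24, but it returns a fraction rather than a percentage. Here, we define our own function that returns a rounded percentage.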

def mape(y_true, y_pred):
    return round(np.mean(np.abs((y_true - y_pred) / y_true)) * 100, 2)
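
To check the formula by hand, here is a small worked example in plain Python with made-up numbers. The function name `mape_percent` is hypothetical. It adds two guards that the short version above does not have: one for mismatched lengths and one for zero actual values, where MAPE is undefined.

```python
def mape_percent(y_true, y_pred):
    """Mean absolute percentage error, in percent, for two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(actual == 0 for actual in y_true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((actual - pred) / actual) for actual, pred in zip(y_true, y_pred)]
    return round(100 * sum(errors) / len(errors), 2)


actual = [100, 200, 400]     # made-up actual values
predicted = [110, 180, 400]  # made-up predictions: off by 10%, 10%, and 0%

score = mape_percent(actual, predicted)
print(score)  # 6.67, the average of 10, 10, and 0
```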

Then, we simply compute the performance of each model and visualize it in a bar chart.

mape_baseline = mape(y_test, y_pred_baseline)
mape_BATS = mape(y_test, y_pred_BATS)
mape_TBATS = mape(y_test, y_pred_TBATS)
print(f'MAPE from baseline: {mape_baseline}')
print(f'MAPE from BATS: {mape_BATS}')
print(f'MAPE from TBATS: {mape_TBATS}')
fig, ax = plt.subplots()
x = ['Baseline', 'BATS', 'TBATS']
y = [mape_baseline, mape_BATS, mape_TBATS]
ax.bar(x, y, width=0.4)
ax.set_xlabel('Models')
ax.set_ylabel('MAPE (%)')
ax.set_ylim(0, 35)
for index, value in enumerate(y):
    plt.text(x=index, y=value + 1, s=str(round(value, 2)), ha='center')
plt.tight_layout()
plt.show()
MAPE of all models. Here, the baseline model achieves the best performance as it has the lowest MAPE. (image by the author)

From the figure above, we can see that BATS performed better than TBATS, which is to be expected as we observed from the plot. However, we see that the baseline model is the best performing model, achieving a MAPE of 11.97%.

That’s a bit anticlimactic, but let’s understand why this happened.

It is possible that our dataset is too small, or that the particular week we used for testing happens to favor the baseline model. One way to verify this would be to forecast over multiple 168-hour horizons and check whether the baseline still outperforms the other models.
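
One way to set up such a check is to cut the series into several consecutive train and test windows. The sketch below only computes the index boundaries; the function name `rolling_origin_splits` and the series length are assumptions for illustration, and the fitting and scoring steps are left out.

```python
def rolling_origin_splits(n_obs, horizon, n_folds):
    """Return (train_end, test_end) index pairs for the last n_folds test windows.

    Each fold trains on observations [0, train_end) and
    tests on observations [train_end, test_end).
    """
    splits = []
    for fold in range(n_folds, 0, -1):
        test_end = n_obs - (fold - 1) * horizon
        train_end = test_end - horizon
        if train_end <= 0:
            raise ValueError("not enough data for this many folds")
        splits.append((train_end, test_end))
    return splits


# Six weeks of hourly data (a made-up length), evaluated on the last three weeks
splits = rolling_origin_splits(n_obs=24 * 7 * 6, horizon=168, n_folds=3)
print(splits)  # [(504, 672), (672, 840), (840, 1008)]
```

For each pair, we would fit every model on the observations before train_end, forecast the next 168 hours, and average the MAPE across folds. sktime also provides splitters such as ExpandingWindowSplitter for the same purpose.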

Also, it may be that we were too strict with the models’ parameters. Here, we forced both models to use the Box-Cox transformation and to drop the trend component. Had we left those parameters unspecified, each model would have tried both options for each parameter and selected the combination with the lowest AIC (Akaike’s Information Criterion). While this makes the training process longer, it might also result in better performance from BATS and TBATS.

Nevertheless, a key takeaway is that building a baseline model is very important for any forecasting project.

Conclusion

In this article, we learned about BATS and TBATS models, and how they can be used to forecast time series that have more than one seasonal period, in which case a SARIMA model cannot be used.

We applied both models to forecast the hourly traffic volume, but it turned out that our baseline remained the best performing model.

Nevertheless, we saw that BATS and TBATS can indeed model time series with complex seasonalities.

Potential improvements

  • Forecast over multiple 168-hour horizons and see if the baseline is indeed the most performant model. You can use the original dataset, which contains much more data than what we worked with.
  • Do not specify the parameters use_box_cox, use_trend, and use_damped_trend, and allow the model to make the best selection based on the AIC.

Key takeaways

  • Always build a baseline model when forecasting
  • BATS and TBATS can be used for modeling time series with complex seasonality
  • BATS works well when the periods are short and integer numbers
  • TBATS trains faster than BATS and works with seasonal periods that are not integers

Thanks for reading, and I hope that you learned something useful! If you want to learn more about time series forecasting in Python using statistical and deep learning models, check out my free cheat sheet!

Cheers! 🍺
