The difference between univariate vs. multivariate, single-step vs. multistep, and sliding vs. expanding window time series problems
Getting started with time series forecasting can be overwhelming as there are many variations. Time series problems can vary depending on the number of input and output sequences, the number of steps to forecast, and whether the input sequence length is static or variable.
For example, you could have sales data for a chocolate bar and want to forecast its sales for the next six months based on the previous 12 months.
Or you could try to forecast the amount of snow for the next day based on all available past data on temperature and rain.
In this article, we will go over the different types of time series forecasting problems. Since time series problems can be a combination of different variations, we will use the example of a fast food combo meal to showcase the variations.
So, welcome to the “Time Series Bistro”!
Please have a look at the menu and pick one from each of the following categories:
First, we will look at the number of input and output time series. That means you can either have a single time series or multiple time series as an input. The same goes for the output to forecast.
We differentiate between univariate and multivariate time series problems. For the multivariate problem formulation, we also differentiate between equal and different input and output sequences.
In a univariate time series problem, you have only one time series, which serves as both the input sequence and the output sequence.
input_cols = ["chocolate_bar_sales"]
output_cols = input_cols
You would use this if the time series you are trying to forecast does not have any dependencies on other time series.
Use case example: Forecasting the future sales of a chocolate bar based on its past sales.
Multivariate with Equal Inputs and Outputs
In a multivariate time series problem, you can have multiple time series which are used as input sequences and as output sequences as well.
input_cols = ["stock_1", ..., "stock_n"]
output_cols = input_cols
You would usually use this problem formulation when there is a correlation between multiple time series.
Use case example: Forecasting the future prices of all stocks in a stock market index based on their past prices.
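To make the role of `input_cols` and `output_cols` more concrete, here is a minimal sketch in plain Python. The `select_columns` helper, the table layout (a dictionary of equally long lists), and the price values are assumptions made up for illustration, not part of any library.

```python
def select_columns(table, cols):
    """Return one row per timestep containing only the requested columns."""
    n_steps = len(next(iter(table.values())))
    return [[table[col][t] for col in cols] for t in range(n_steps)]


# Hypothetical price table with made-up values (two stocks, three timesteps)
table = {
    "stock_1": [10.0, 10.5, 11.0],
    "stock_2": [20.0, 19.5, 19.0],
}

input_cols = ["stock_1", "stock_2"]
output_cols = input_cols  # equal inputs and outputs

inputs = select_columns(table, input_cols)
outputs = select_columns(table, output_cols)
```

Each row of `inputs` holds one value per input series at a given timestep. With a single column name in `input_cols`, the same helper would describe the univariate case.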
Multivariate with Different Inputs and Outputs
In contrast to the above multivariate time series problem with equal inputs and outputs, you can also have different input and output sequences.
input_cols = ["precipitation", "temperature"]
output_cols = ["snowfall"]
You would usually use this problem formulation if you need additional information from another time series to forecast the target sequence. For example, you can provide some time awareness by feeding a modulo or sine wave signal to the model.
Use case example: Forecasting the amount of snowfall based on previous precipitation and temperature.
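The time-awareness signal mentioned above could look roughly like the following sketch. It combines a modulo with a sine wave to encode the hour of the day. The function name `cyclical_time_features`, the 24-hour period, and the additional cosine component are assumptions for illustration. The cosine is included here so that each hour maps to a unique point on a circle.

```python
import math


def cyclical_time_features(hours, period=24):
    """Encode each hour as a (sin, cos) pair on the unit circle."""
    features = []
    for hour in hours:
        # The modulo wraps the hour into one period; sin/cos make it smooth
        angle = 2 * math.pi * (hour % period) / period
        features.append((math.sin(angle), math.cos(angle)))
    return features


# Hypothetical hourly timestamps spanning midnight
hours = [22, 23, 0, 1]
time_features = cyclical_time_features(hours)
```

With this encoding, 23:00 and 00:00 end up close to each other, which a raw hour value (23 vs. 0) would not express.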
Next, we will look at the length of the output sequences (sides). That means you can either forecast a single time step or multiple time steps into the future.
We differentiate between single-step and multistep time series problems. For the multistep problem formulation, we also differentiate between single-shot and recursive predictions.
Single-Step Output Sequence
The single-step time series problem is probably the easiest because you only have to forecast one timestep into the future.
Use case example: Forecasting the number of goods to bake for the next day.
Single-Shot Multistep Output Sequence
In contrast to the single-step time series problem, the multistep time series problem is a little harder because you have to forecast more than one timestep into the future. The further into the future we try to forecast at once, the less reliable our predictions become.
Use case example: Forecasting the number of school lunches for the next week to buy the right amount of groceries.
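The difference between single-step and single-shot multistep targets can be sketched as follows. The helper `make_samples` and its parameters `input_len` and `horizon` are hypothetical names chosen for this example. A horizon of 1 gives the single-step case, and a larger horizon gives the single-shot multistep case.

```python
def make_samples(series, input_len, horizon):
    """Split a series into (input, target) pairs.

    input_len: number of past timesteps used as input
    horizon:   number of future timesteps forecast at once
    """
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        x = series[start:start + input_len]
        y = series[start + input_len:start + input_len + horizon]
        samples.append((x, y))
    return samples


series = list(range(10))  # toy series 0, 1, ..., 9

single_step = make_samples(series, input_len=3, horizon=1)
multi_step = make_samples(series, input_len=3, horizon=4)

print(single_step[0])  # ([0, 1, 2], [3])
print(multi_step[0])   # ([0, 1, 2], [3, 4, 5, 6])
```

Note that a longer horizon leaves fewer samples from the same series, because each sample needs more future values as its target.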
Recursive Multistep Output Sequence
Instead of forecasting multiple timesteps into the future at once, you can forecast single timesteps multiple times. While we mentioned that forecasting one timestep is more reliable than forecasting multiple timesteps at once, you have to keep in mind that in this method you will carry over the error from the previous prediction.
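The recursive approach could be sketched like this. The function `recursive_forecast` and the toy `mean_model` are assumptions for illustration. Any model that maps a window of past values to one next value could take the place of `mean_model`.

```python
def recursive_forecast(one_step_model, history, input_len, n_steps):
    """Forecast n_steps ahead by repeatedly applying a one-step model."""
    window = list(history[-input_len:])
    forecasts = []
    for _ in range(n_steps):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Feed the prediction back in as if it were an observed value;
        # this is where errors from earlier steps carry over
        window = window[1:] + [next_value]
    return forecasts


def mean_model(window):
    """Toy one-step model: predict the mean of the input window."""
    return sum(window) / len(window)


history = [2.0, 4.0, 6.0]
forecasts = recursive_forecast(mean_model, history, input_len=3, n_steps=2)
```

The second forecast is computed from a window that already contains the first forecast, so any error in the first step influences every later step.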
Finally, we will look at the type of input sequences (drinks). That means you can either use an input sequence with a fixed length or you can have an input sequence with a variable length.
We differentiate between the sliding window and expanding window time series problems.
Another factor to consider is the step size. While you could shift or expand your sequence window one timestep at a time, you could also shift it by a few timesteps at a time. The smaller your step size, the higher the number of available training samples.
Sliding window means that your input sequence always has a specified fixed length, e.g. one hour, one day, one week, six months, one year, five years, etc.
Use case example: Forecasting the demand for school lunches for the next year based on the previous year.
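A sliding window with a configurable step size might be generated as in the sketch below. The helper name `sliding_windows` and the parameters `window_len` and `step` are hypothetical. The example also shows how a smaller step size yields more samples from the same series.

```python
def sliding_windows(series, window_len, step=1):
    """Return fixed-length windows, shifted by `step` timesteps each time."""
    windows = []
    for start in range(0, len(series) - window_len + 1, step):
        windows.append(series[start:start + window_len])
    return windows


series = list(range(10))  # toy series 0, 1, ..., 9

step_1 = sliding_windows(series, window_len=4, step=1)
step_3 = sliding_windows(series, window_len=4, step=3)

print(len(step_1))  # 7
print(len(step_3))  # 3
print(step_3)       # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```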
As the name suggests, in the expanding window time series problem the input sequence grows over time: each new sample uses all data available up to that point.
Use case example: Forecasting the number of new users on a platform for the next month based on all historical data.
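An expanding window could be sketched as follows. The helper `expanding_windows` and its parameters `min_len` and `step` are assumptions for illustration. Each sample pairs all values seen so far with the next value as a single-step target.

```python
def expanding_windows(series, min_len, step=1):
    """Return (input, target) pairs where the input grows with each sample."""
    samples = []
    for end in range(min_len, len(series), step):
        x = series[:end]  # everything observed so far
        y = series[end]   # the next value to forecast
        samples.append((x, y))
    return samples


series = [5, 7, 6, 8, 9, 11]  # toy series
samples = expanding_windows(series, min_len=2)

for x, y in samples:
    print(len(x), x, "->", y)
```

Because the inputs have different lengths, this formulation needs a model that accepts variable-length input, or some padding or aggregation step before the data reaches the model.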
Are you hungry for some time series problems now?
Based on our combo menu, you can see that there are at least 18 (3 * 3 * 2) different types of time series problems. It’s no wonder that it feels overwhelming.
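The count of 18 can be checked by enumerating every combination of the three categories. The list names below are chosen for this sketch, and the labels follow the variations described in this article.

```python
from itertools import product

variables = [
    "univariate",
    "multivariate (equal inputs and outputs)",
    "multivariate (different inputs and outputs)",
]
output_steps = [
    "single-step",
    "single-shot multistep",
    "recursive multistep",
]
input_windows = ["sliding window", "expanding window"]

# Every combo meal is one pick from each category
combos = list(product(variables, output_steps, input_windows))

print(len(combos))  # 18
print(combos[0])    # ('univariate', 'single-step', 'sliding window')
```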
With the combo meal analogy, it became clear that there are three main building blocks for time series problems:
- Variation of the number of input and output time series
- Variation of how and how far to forecast into the future
- Variation of whether the input sequence length is fixed or variable