Time Series Analysis
 Summary

Time series data is an ordered sequence of observations of well-defined data items at regular time intervals. Examples include daily exchange rates, bank interest rates, monthly sales, heights of ocean tides, or humidity.^{} Time Series Analysis (TSA) finds hidden patterns and obtains useful insights from time series data.^{} TSA is useful in predicting future values or detecting anomalies across a variety of application areas.^{}
Historically, TSA was divided into time domain versus frequency domain approaches. The time domain approach used the autocorrelation function whereas the frequency domain approach used the Fourier transform of the autocorrelation function. Likewise, there are also Bayesian and non-Bayesian approaches. Today these distinctions are of less importance. Analysts use whatever suits the problem.^{}
While most methods of TSA are from classical statistics, since the 1990s artificial neural networks have been used.^{} However, these can excel only when sufficient data is available.^{}
Discussion
What are the main objectives of time series analysis? TSA has the following objectives:^{}
 Describe: Describe the important features of the time series data. The first step is to plot the data to look for the possible presence of trends, seasonal variations, outliers and turning points.
 Model: Investigate and find out the generating process of the time series.
 Predict: Forecast future values of an observed time series. Applications include predicting stock prices or product sales.
What are some applications of time series analysis? TSA is used in numerous practical fields such as business, economics, finance, science, and engineering. Some typical use cases are Economic Forecasting, Sales Forecasting, Budgetary Analysis, Stock Market Analysis, Yield Projections, Process and Quality Control, Inventory Studies, Workload Projections, Utility Studies, and Census Analysis.^{}
In TSA, we collect and study past observations of a time series data. We then develop an appropriate model that describes the inherent structure of the series. This model is then used to generate future values for the series, that is, to make forecasts. Time series analysis can be termed as the act of predicting the future by understanding the past.^{}
Forecasting is a common need in business and economics. Besides forecasting, TSA is also useful to see how a single event affects the time series. TSA can also help towards quality control by pointing out data points that are deviating too much from the norm. Control and monitoring applications of TSA are more common in science and industry.^{}
What are the main components of time series data? Many factors cause variations in time series data. Their effects are studied through the following four major components:^{}
 Trends: A trend exists when there is a long-term increase or decrease in the data. It doesn't have to be linear. Sometimes we will refer to a trend as "changing direction" when it goes from an increasing trend to a decreasing trend.
 Seasonal: A seasonal pattern exists when a series is influenced by seasonal factors (quarterly, monthly, half-yearly). Seasonality is always of a fixed and known period.
 Cyclic Variation: A cyclic pattern exists when data exhibits rises and falls that are not of fixed period. The duration of these cycles is more than a year. For example, stock prices cycle between periods of high and low values but there's no set amount of time between those fluctuations.
 Irregular: The variation of observations in a time series which is unusual or unexpected. It's also termed as a Random Variation and is usually unpredictable. Floods, fires, revolutions, epidemics, and strikes are some examples.
What is a stationary series and how important is it? Given a series of data points, if the mean and variance of the series remain constant over time, we call it a stationary series. If they vary with time, we call it a nonstationary series.^{}
Most prices (such as stock prices or the price of Bitcoin) are not stationary. They are either drifting upward or downward. Nonstationary data are unpredictable and cannot be modeled or forecasted directly. Results obtained from nonstationary time series may be spurious, in that they may indicate a relationship between two variables where none exists. To obtain consistent, reliable results, nonstationary data needs to be transformed into stationary data.
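As a quick illustration (a minimal NumPy sketch on synthetic data, not a formal stationarity test such as ADF), comparing summary statistics over different stretches of a series hints at whether it is stationary:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random walk (cumulative sum of noise) is nonstationary: its variance
# grows with time. White noise, by contrast, is stationary.
walk = np.cumsum(rng.normal(size=1000))
noise = rng.normal(size=1000)

def halves_stats(x):
    """Mean and variance of the first and second half of a series -
    a crude eyeball check for stationarity."""
    a, b = x[:len(x)//2], x[len(x)//2:]
    return (a.mean(), a.var()), (b.mean(), b.var())

print("walk :", halves_stats(walk))   # halves typically differ a lot
print("noise:", halves_stats(noise))  # halves look alike
```

A formal analysis would use a unit-root test instead of this eyeball comparison, but the idea is the same: under stationarity, statistics computed on any window of the series should agree.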
Given a nonstationary series, how can I make it stationary? The two most common ways to make a nonstationary time series stationary are:^{}
 Differencing: In order to make a series stationary, we take differences between consecutive data points. Suppose the original time series is \(X_1, X_2, X_3, \ldots, X_n\). The series with difference of degree 1 becomes \(X_2-X_1, X_3-X_2, X_4-X_3, \ldots, X_n-X_{n-1}\). If this transformation is done only once to a series, we say that the data has been first differenced. This process essentially eliminates the trend if the series is growing at a fairly constant rate. If it's growing at an increasing/decreasing rate, we can apply the same procedure and difference the data again. The data would then be second differenced.
 Transformation: If the series can't be made stationary, we can try transforming the variables. Log transform is probably the most commonly used transformation for a diverging time series. However, it's normally suggested to use transformation only when differencing is not working.
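The differencing described above can be computed directly; here's a minimal NumPy sketch with made-up values:

```python
import numpy as np

# A series growing at a roughly constant rate (linear trend plus noise).
x = np.array([10.0, 12.1, 13.9, 16.2, 18.0, 20.1, 22.0])

# First difference: X_2-X_1, X_3-X_2, ..., X_n-X_{n-1}.
d1 = np.diff(x)       # the trend is gone; values hover around 2
# Second difference, for a series whose growth rate itself changes.
d2 = np.diff(x, n=2)

print(d1)
print(d2)
```

Note that each round of differencing shortens the series by one observation, so the first difference has n-1 values and the second has n-2.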
What are the different models used in Time Series Analysis? Some commonly used models for TSA are:
 AutoRegressive (AR): A regression model, such as linear regression, models an output value based on a linear combination of input values: \(y = \beta_0 + \beta_1x + \epsilon\). In TSA, the input variables are observations from previous time steps, called lag variables. For p=2, where p is the order of the AR model, AR(p) is \( x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \epsilon_t\)
 Moving Average (MA): This uses past forecast errors in a regression-like model. For q=2, MA(q) is \(x_t = \theta_0 + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}\)
 AutoRegressive Moving Average (ARMA): This combines both AR and MA models. ARMA(p,q) is \(\begin{align}x_t = &\beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_p x_{t-p} + \\ &\epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q} \end{align}\)
 AutoRegressive Integrated Moving Average (ARIMA): The above models can't handle nonstationary data. ARIMA(p,d,q) handles the conversion of nonstationary data to stationary: I refers to the use of differencing, p is the order of the autoregressive part, d is the degree of differencing, and q is the order of the moving average part.^{}
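To make the AR idea concrete, here is a sketch (synthetic data and plain least squares on lag variables, rather than a full ARIMA library) that recovers the coefficients of a simulated AR(2) process:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate an AR(2) process: x_t = 0.6*x_{t-1} + 0.3*x_{t-2} + eps_t.
n, b1, b2 = 2000, 0.6, 0.3
x = np.zeros(n)
for t in range(2, n):
    x[t] = b1 * x[t-1] + b2 * x[t-2] + rng.normal()

# Build the lag matrix [1, x_{t-1}, x_{t-2}] and estimate by least squares.
X = np.column_stack([np.ones(n - 2), x[1:-1], x[:-2]])
y = x[2:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # intercept near 0, coefficients near 0.6 and 0.3
```

Dedicated libraries use maximum-likelihood estimation and also handle the MA and differencing parts, but the lag-variable regression above is the essence of the AR component.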
What are autocorrelations in the context of time series analysis? Autocorrelations are numerical values that indicate how a data series is related to itself over time. They measure how strongly data values separated by a specified number of periods (called the lag) are correlated to each other. The AutoCorrelation Function (ACF) defines the autocorrelation for a specific lag.^{}
Autocorrelations may range from +1 to −1. A value close to +1 indicates a high positive correlation while a value close to −1 implies a high negative correlation. These measures are most often evaluated through a graphical plot called a correlogram. A correlogram plots the autocorrelation values against the lag.^{} Such a plot helps us choose the order parameters for an ARIMA model.
In addition to suggesting the order of differencing, ACF plots can help in determining the order of MA(q) models. Partial AutoCorrelation Function (PACF) correlates a variable with its lags, conditioned on the values in between. PACF plots are useful when determining the order of AR(p) models.^{}
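A sample ACF is easy to compute by hand. The following minimal sketch (synthetic data, not from the article) shows the geometric decay of the ACF for a simulated AR(1) series:

```python
import numpy as np

def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(1)
eps = rng.normal(size=5000)
ar1 = np.zeros(5000)
for t in range(1, 5000):
    ar1[t] = 0.8 * ar1[t-1] + eps[t]

# For an AR(1) process with coefficient 0.8 the ACF decays geometrically:
# roughly 0.8, 0.64, 0.51, ... at lags 1, 2, 3.
print([round(acf(ar1, k), 2) for k in (1, 2, 3)])
```

Plotting these values against the lag gives the correlogram discussed above; the PACF requires an extra regression step per lag and is usually taken from a library.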
How do I build a time series model? ARMA and ARIMA are standard statistical models for time series forecasting and analysis. Along with the model's development, the authors Box and Jenkins also suggested a process for identifying, estimating, and checking models. This process is now referred to as the Box-Jenkins (BJ) Method. It's an iterative approach that consists of the following three steps:^{}
 Identification: Involves determining the order (p, d, q) of the model in order to capture the salient dynamic features of the data. This mainly relies on graphical procedures such as the time series plot, ACF and PACF.
 Estimation: Involves fitting the model with the chosen p, d and q orders to the actual time series, minimizing the loss or error term.
 Diagnostic checking: Evaluate the fitted model in the context of the available data and check for areas where the model may be improved.
How do we handle random variations in data? Whenever we collect data over a period of time, there's some form of random variation. Smoothing is a technique to reduce the effect of such variations and thereby bring out trends and cyclic components.^{} There are two distinct groups of smoothing methods:
 Averaging Methods:
(a) Moving Average: we forecast the next value by averaging the 'p' previous values.
(b) Weighted Average: we assign a weight to each of the previous observations and then take the average. The weights should sum to 1.
 Exponential Smoothing Methods: these assign exponentially decreasing weights as observations get older. In other words, recent observations are given relatively more weight in forecasting than older observations.^{} There are several varieties of this method:^{}
(a) Simple exponential smoothing for series with no trend or seasonality: the basic formula is \(S_{t+1} = \alpha y_t + (1-\alpha)S_t, \qquad 0 < \alpha \le 1, t > 0\)
(b) Double exponential smoothing for series with a trend but no seasonality.
(c) Triple exponential smoothing for series with both trend and seasonality.
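The averaging and simple-exponential-smoothing formulas above can be sketched in a few lines of plain Python (the function names are my own):

```python
def moving_average_forecast(y, p):
    """Forecast the next value as the mean of the p most recent observations."""
    return sum(y[-p:]) / p

def weighted_average_forecast(y, weights):
    """Forecast with one weight per recent observation; weights must sum to 1."""
    assert abs(sum(weights) - 1) < 1e-9
    return sum(w * v for w, v in zip(weights, y[-len(weights):]))

def simple_exp_smoothing(y, alpha):
    """Apply S_{t+1} = alpha*y_t + (1 - alpha)*S_t along the series."""
    assert 0 < alpha <= 1
    s = y[0]                      # a common choice: initialize S_1 = y_1
    smoothed = [s]
    for obs in y[1:]:
        s = alpha * obs + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

series = [3.0, 5.0, 9.0, 20.0]
print(moving_average_forecast(series, 2))   # (9 + 20) / 2 = 14.5
print(simple_exp_smoothing(series, 0.5))    # [3.0, 4.0, 6.5, 13.25]
```

A larger alpha tracks the data more closely (alpha = 1 reproduces the series exactly), while a smaller alpha smooths more aggressively. Double and triple exponential smoothing extend this recursion with separate equations for the trend and seasonal components.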
Milestones
John Graunt publishes a book titled Natural and Political Observations … Made upon the Bills of Mortality. The book contains the number of births and deaths recorded weekly for many years starting from early 17th century. It also includes the probability that a person dies by a certain age. Such tables of life expectancy later become known as actuarial tables. This is one of the earliest examples of time series style of thinking applied to medicine.^{}
Robert FitzRoy coins the term "weather forecast". Such forecasts start appearing in The Times from August 1861. Atmospheric data collected from many parts of England are relayed by telegraph to London, where FitzRoy analyzes the data (along with past data) to make forecasts. His forecasts forewarn sailors of impending storms and directly contribute to reducing shipwrecks.^{}
Augustus D. Waller, a doctor by profession, records what is possibly the first electrocardiogram (ECG). As practical ECG machines arrive in the early 20th century, TSA is applied to estimate the risk of cardiac arrests. In the 1920s, electroencephalogram (EEG) is introduced to measure brain activity. This gives doctors more opportunities to apply TSA.^{}
Yule applies harmonic analysis and regression to determine the periodicity of sunspots. He separates periodicity from superposed fluctuations and disturbances.^{} Yule's work starts the use of statistics in TSA.^{} In general, application of autoregressive models is due to Yule and Walker in the 1920s and 1930s.^{}
Muth establishes a statistical foundation for Simple Exponential Smoothing (SES) by showing that it's optimal for a random walk plus noise. Further advances to exponential smoothing happen in 1985: Gardner gives a comprehensive review of the topic; Snyder links SES to the innovation state-space model,^{} where innovation refers to the forecast error.^{}
Bates and Granger show that by combining forecasts from two independent models, we can achieve a lower mean squared error. They also propose how to derive the weights in which the two original forecasts are to be combined.^{} The same year, David Reid publishes his PhD thesis that's probably the first nontrivial study of time series forecast accuracy.^{}
Box and Jenkins publish a book titled Time Series Analysis: Forecasting and Control. This work popularizes the ARIMA model with an iterative modelling procedure. Once a suitable model is built, forecasts are conditional expectations of the model under the mean squared error (MSE) criterion.^{} In time, this model comes to be called the Box-Jenkins Model.^{}
Through the 1970s, many statisticians continue to believe that there's a single model waiting to be discovered that can best fit any given time series data. However, empirical evidence shows that an ensemble of models gives better results. These debates cause George Box to famously remark,^{}
All models are wrong but some are useful
Makridakis and Hibon use 111 time series and compare the performance of many forecasting methods. Their results claim that a combination of simpler methods can outperform a sophisticated method. This causes a stir within the research community.^{} To prove the point, Makridakis and Hibon organize a competition, called the M-Competition, starting from 1982: 1001 series (1982), 29 series (1993), 3003 series (2000), 100,000 series (2018),^{} and 42,840 series (2020).^{}
Although Kalman filtering was invented in 1960, it's only in the 1980s that statisticians use state-space parameterization and Kalman filtering for TSA. The recursive form of the filter enables efficient forecasting.^{} An ARIMA model can be put into a state-space model. Similarly, a state-space model suggests an ARIMA model.^{}
Robert Engle develops the Autoregressive Conditional Heteroskedasticity (ARCH) model to account for time-varying volatility observed in economic time series data. In 1986, his student Tim Bollerslev develops the Generalized ARCH (GARCH) model.^{} In general, the variance of the error term depends on past error terms and their variances.^{} ARCH and GARCH are nonlinear generalizations of the Box-Jenkins model.^{}
Engle and Granger propose cointegration as a technique for multivariate TSA. Cointegration is a linear combination of marginally unit-root nonstationary series that yields a stationary series. It becomes a popular method in econometrics because it captures long-term relationships between variables. An earlier method of multivariate TSA is the Vector Autoregressive (VAR) model.^{}
Zhang et al. publish a survey of neural networks applied to forecasting. They note an early work by Lapedes and Farber (1987), who proposed multilayer feedforward networks. However, the use of ANNs for forecasting happens mostly in the 1990s. In general, feedforward or recurrent networks are preferred. At most two hidden layers are used. The number of input nodes corresponds to the number of lagged observations needed to discover patterns in the data. The number of output nodes corresponds to the forecasting horizon.^{}
Sánchez-Sánchez et al. highlight many issues in using neural networks for TSA. There's no clarity on how to select the number of input or hidden neurons. There's no guidance on how best to partition the data into training and validation sets. It's not clear if data needs to be preprocessed or if seasonal/trend components have to be removed before data goes into the model.^{} In 2018, Hyndman commented that neural networks perform poorly due to insufficient data.^{} This is likely to change as data becomes more easily available.
References
 Bates, J. M. and C. W. J. Granger. 1969. "The Combination of Forecasts." Operational Research Quarterly, Operational Research Society, vol. 20, no. 4, pp. 451-468, December. Accessed 2020-08-19.
 Brownlee, Jason. 2017. "A Gentle Introduction to the Box-Jenkins Method for Time Series Forecasting." Machine Learning Mastery, January 13. Accessed 2018-07-28.
 Dahodwala, Murtuza. 2018. "Beginners Guide To Time Series Analysis with Implementation in R." Blog, Digital Vidya, February 20. Accessed 2018-07-28.
 Emmanuel, Joshua. 2015. "Forecasting: Exponential Smoothing, MSE." YouTube, July 9. Accessed 2020-08-19.
 Gavrilov, Viktor. 2015. "Time series analysis: smoothing." Accessed 2018-07-28.
 Gooijer, Jan G. De and Rob J. Hyndman. 2006. "25 Years of Time Series Forecasting." International Journal of Forecasting, vol. 22, no. 3, pp. 443-473. doi: 10.1016/j.ijforecast.2006.01.001. Accessed 2020-08-19.
 Holmes, E. E., M. D. Scheuerell, and E. J. Ward. 2020. "Correlation within and among time series." Section 4.4 in: Applied time series analysis for fisheries and environmental data, NOAA Fisheries, Northwest Fisheries Science Center, February 3. Accessed 2020-08-19.
 Hyndman, Rob J. 2018. "A brief history of time series forecasting competitions." Blog, Hyndsight, April 11. Accessed 2020-08-19.
 Hyndman, Rob J. and George Athanasopoulos. 2018. "Stationarity and differencing." Section 8.1 in: Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. Accessed 2020-08-19.
 Krishnan, Adithya. 2019. "Anomaly Detection with Time Series Forecasting." Towards Data Science, on Medium, March 3. Accessed 2020-08-19.
 Mitrani, Alex. 2020. "Achieving Stationarity With Time Series Data." Towards Data Science, on Medium, January 10. Accessed 2020-08-19.
 MOFC. 2020. "The M5 Competition." MOFC. Accessed 2020-08-19.
 Moore, Peter. 2015. "The science of weather forecasting: The pioneer who founded the Met Office." Independent, April 27. Accessed 2020-08-19.
 Morris, C., W. Petty, J. Graunt, and T. Birch. 1759. "A Collection of the yearly bills of mortality, from 1657 to 1758 inclusive: Together with several other bills of an earlier date." London: A. Millar. Accessed 2020-08-19.
 Morrison, Jeff. 2018. "Autoregressive Integrated Moving Average Models (ARIMA)." Accessed 2018-07-28.
 Nielsen, Aileen. 2019. "Time Series: An Overview and a Quick History." Chapter 1 in: Practical Time Series Analysis, O'Reilly Media, Inc. Accessed 2020-08-19.
 NIST. 2003a. "Definitions, Applications and Techniques." Section 6.4.1 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
 NIST. 2003b. "What are Moving Average or Smoothing Techniques?" Section 6.4.2 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
 NIST. 2003c. "What is Exponential Smoothing?" Section 6.4.3 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
 PennState. 2020. "Partial Autocorrelation Function (PACF)." Section 2.2 in: Applied Time Series Analysis, STAT 510, Penn State Univ. Accessed 2020-08-19.
 San-Juan, Juan Félix, Montserrat San-Martín, and Ivan Perez. 2012. "An Economic Hybrid J2 Analytical Orbit Propagator Program Based on SARIMA Models." Mathematical Problems in Engineering, Hindawi Publishing Corporation, vol. 2012, article ID 207381, August. doi: 10.1155/2012/207381. Accessed 2020-08-19.
 Senter, Anne. 2008. "Time Series Analysis." BIOL 710, San Francisco State University, June 3. Accessed 2020-08-19.
 Shmueli, Galit. 2016. "Smoothing 3: Differencing." National Tsing Hua Univ, on YouTube, November 30. Accessed 2020-08-19.
 Sánchez-Sánchez, Paola Andrea, José Rafael García-González, and Leidy Haidy Perez Coronell. 2019. "Encountered Problems of Time Series with Neural Networks: Models and Architectures." IntechOpen Limited, November 27. Accessed 2020-08-19.
 Toloi, Clélia M.C., and Sergio R. Martins. 2006. "How to teach some basic concepts in time series analysis." Accessed 2018-07-28.
 Tsay, Ruey S. 2000. "Time Series and Forecasting: Brief History and Future Research." Journal of the American Statistical Association, vol. 95, no. 450, pp. 638-643, June. Accessed 2020-08-19.
 Waller, Augustus D. 1887. "A Demonstration on Man of Electromotive Changes accompanying the Heart's Beat." J Physiol., vol. 8, no. 5, pp. 229-234, October. Accessed 2020-08-19.
 Wikipedia. 2020. "Time series." Wikipedia, August 16. Accessed 2020-08-19.
 Wikipedia. 2020b. "Makridakis Competitions." Wikipedia, April 14. Accessed 2020-08-19.
 Yule, G. Udny. 1927. "On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers." Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 226, pp. 267-298. Accessed 2020-08-19.
 Zhang, Guoqiang, B. Eddy Patuwo, and Michael Y. Hu. 1998. "Forecasting with artificial neural networks: The state of the art." International Journal of Forecasting, Elsevier, vol. 14, pp. 35-62. Accessed 2020-08-19.
 Zhao, Yanchang. 2011. "Time Series Analysis and Mining with R." Blog, R Data Mining, August 23. Accessed 2020-08-19.
 Zhu, Wei. 2019. "ARCH/GARCH Models." AMS 586, Time Series Analysis, State University of New York at Stony Brook, November. Accessed 2020-08-19.
 Zoubir, Leila. 2017. "A brief history of time series analysis." Department of Statistics, Stockholms universitet, October 31. Accessed 2020-08-19.
Further Reading
 ARIMA model
 Box-Jenkins methodology
 Hyndman, Rob J. and George Athanasopoulos. 2018. "Forecasting: principles and practice." 2nd edition, OTexts: Melbourne, Australia. Accessed 2020-08-19.
 Chatfield, Chris, Anne B. Koehler, J. Keith Ord and Ralph D. Snyder. 2001. "A New Look at Models for Exponential Smoothing." Journal of the Royal Statistical Society. Series D (The Statistician), vol. 50, no. 2, pp. 147-159. Accessed 2020-08-19.
 Time Series Forecast: A basic introduction using Python
 Using python to work with time series data
See Also
 Predictive Analytics
 ARIMA Model
 Regression Modelling
 Exploratory Data Analysis
 Time Series Smoothing
 Time Series Database