Chapter 1 Introduction

“Econometrics is the science and art of using economic theory and statistical techniques to analyze economic data.” (Stock, J.H., Watson, M.W., 2018)

1.1 Types of data

Cross section: data on different entities (workers, consumers, firms, etc.) collected at a single time period (e.g. number of inhabitants per postcode in 2010).

Figure 1.1: Cross section: time variable year 2010, ID variable postcode

Time series: data for a single entity (person, firm, country, etc.) collected at multiple time periods (e.g. GDP in the Netherlands).

Figure 1.2: Time series: time variable date, ID variable Netherlands. Source: FRED Real GDP Netherlands

Panel data: data for multiple entities in which each entity is observed at two or more time periods (e.g. number of inhabitants in postcodes over time).

Figure 1.3: Panel data: time variable year, ID variable postcode
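
The three data types differ only in which dimension (entity, time, or both) varies. The sketch below illustrates this with a handful of made-up records; the postcodes, years and inhabitant counts are hypothetical values chosen for illustration, not data from the figures.

```python
# Hypothetical records: (postcode, year, inhabitants). All values are made up.
records = [
    ("1011", 2010, 9500),
    ("1011", 2011, 9650),
    ("1012", 2010, 8200),
    ("1012", 2011, 8300),
]

# Panel data: several entities, each observed in two or more periods.
panel = records

# Cross section: all entities at a single period (here, the year 2010).
cross_section = [(pc, n) for pc, yr, n in records if yr == 2010]

# Time series: a single entity over several periods (here, postcode "1011").
time_series = [(yr, n) for pc, yr, n in records if pc == "1011"]

print(cross_section)
print(time_series)
```

Fixing the time variable yields a cross section, fixing the ID variable yields a time series, and keeping both yields the panel.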

Figure 1.4: Example of a time series - \(\text{CO}_2\) emissions

1.2 Time Series

A time series or a time-varying process is a set of observations \(\{y_t\}\) on a variable over a time period.

\[\begin{align} y_1,\ldots,y_n \implies \{y_t\}, \qquad t=1,\ldots,n \end{align}\]
The current value is the value of \(y\) at time \(t\), \(y_t\).
The first lag (or lagged value) is the value of \(y\) in the previous period, \(y_{t-1}\).
The \(p\)-th lag of \(y\) is its value \(p\) periods before \(t\), \(y_{t-p}\).
The difference between the value of \(y\) at time \(t\) and at time \(t-1\) is called the first difference: \[\begin{align} \Delta y_t&= y_t-y_{t-1} \end{align}\] where \(\Delta\) is the difference operator.
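
As a minimal sketch of these definitions, the code below computes the lag \(y_{t-p}\) and the difference \(\Delta y_t = y_t - y_{t-1}\) for a short series. The function names `lag` and `difference` and the sample values are assumptions made for illustration; entries that cannot be computed because earlier observations are missing are marked as `None`.

```python
def lag(y, p=1):
    """Return the series of p-th lags y_{t-p}, aligned with y.

    The first p entries are None because y_{t-p} is not observed there.
    """
    if not 0 <= p <= len(y):
        raise ValueError("p must be between 0 and len(y)")
    return [None] * p + y[:len(y) - p]


def difference(y):
    """Return Delta y_t = y_t - y_{t-1}; the first entry is None."""
    return [
        None if prev is None else cur - prev
        for cur, prev in zip(y, lag(y, 1))
    ]


# Hypothetical observations y_1, ..., y_4.
y = [10, 12, 15, 14]

print(lag(y, 1))      # [None, 10, 12, 15]
print(lag(y, 2))      # [None, None, 10, 12]
print(difference(y))  # [None, 2, 3, -1]
```

Each lag shortens the usable sample by one observation, which is why a series of \(n\) values yields only \(n-1\) differences.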

1.2.1 Time Series properties

Stationary data: time-series data whose mean, variance and autocovariances do not change over time.

Stationarity matters for forecasting: if these properties are stable over time, patterns estimated from past observations can be expected to carry over to future periods.

In detail, the stationarity condition requires the mean and variance of the time series \(y_t\) to be finite and constant over time, and the covariance between \(y_{t+h}\) and \(y_t\) to depend only on the lag \(h\), not on \(t\): \[\begin{align} \mathbb{E}(y_t)&=\mu \;\;\;\;\;\;\;\; \forall t \\ \text{$\mathbb{V}$ar}(y_t)&=\gamma(0) \;\;\;\; \forall t \\ \text{$\mathbb{C}$ov}(y_{t+h},y_t)&=\gamma(h)\;\;\;\; \forall t \end{align}\]

Expected value: the mean \(\mu\) of the series \(y_t\), which can be read as its long-run average.
Variance: the dispersion or spread of a probability distribution. It is the expected value of the square of the deviation of the variable from its expected value.

Autocovariance: the covariance between two values of the time series at different points in time.
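
In practice \(\mu\) and \(\gamma(h)\) are unknown and are estimated from the observed series. The sketch below computes their sample counterparts. The function names and the data are hypothetical, and dividing by \(n\) rather than \(n-h\) is one common convention, assumed here rather than taken from the text.

```python
def sample_mean(y):
    """Sample counterpart of E(y_t) = mu."""
    return sum(y) / len(y)


def sample_autocovariance(y, h):
    """Sample counterpart of Cov(y_{t+h}, y_t) = gamma(h).

    Uses the divisor n (an assumed convention); h = 0 gives the sample variance.
    """
    n = len(y)
    if not 0 <= h < n:
        raise ValueError("h must satisfy 0 <= h < len(y)")
    mean = sample_mean(y)
    return sum((y[t + h] - mean) * (y[t] - mean) for t in range(n - h)) / n


# Hypothetical observations.
y = [2.0, 4.0, 6.0, 8.0]

print(sample_mean(y))               # 5.0
print(sample_autocovariance(y, 0))  # 5.0 (sample variance)
print(sample_autocovariance(y, 1))  # 1.25
```

Comparing these estimates across sub-periods of a longer series gives a rough, informal indication of whether the mean, variance and autocovariances look stable over time.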