In time series analysis, the lag operator (L) or backshift operator (B) operates on an element of a time series to produce the previous element. For example, given some time series

$X=\{X_{1},X_{2},\dots \}$ then

$LX_{t}=X_{t-1}$ for all $t>1$ or similarly in terms of the backshift operator B: $BX_{t}=X_{t-1}$ for all $t>1$. Equivalently, this definition can be represented as

$X_{t}=LX_{t+1}$ for all $t\geq 1$. The lag operator (as well as backshift operator) can be raised to arbitrary integer powers so that

$L^{-1}X_{t}=X_{t+1}$ and

$L^{k}X_{t}=X_{t-k}.$

## Lag polynomials

Polynomials of the lag operator can be used, and this is a common notation for ARMA (autoregressive moving average) models. For example,

$\varepsilon _{t}=X_{t}-\sum _{i=1}^{p}\varphi _{i}X_{t-i}=\left(1-\sum _{i=1}^{p}\varphi _{i}L^{i}\right)X_{t}$ specifies an AR(p) model.
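
The identity above can be checked numerically. The following is a minimal sketch in Python, assuming the series is stored as a plain list indexed from 0 and that `None` marks values that fall outside the observed sample. The helper names `lag` and `apply_ar_polynomial` are hypothetical and do not come from any library.

```python
from typing import List, Optional, Sequence

Series = List[Optional[float]]


def lag(x: Sequence[Optional[float]], k: int = 1) -> Series:
    """Apply L^k: position t of the result holds x[t - k].

    Positions whose source index falls outside the sample are None.
    A negative k shifts the other way, as in L^{-1} X_t = X_{t+1}.
    """
    n = len(x)
    return [x[t - k] if 0 <= t - k < n else None for t in range(n)]


def apply_ar_polynomial(phi: Sequence[float], x: Sequence[float]) -> Series:
    """Compute (1 - sum_i phi_i L^i) X_t wherever all p lags are observed."""
    p = len(phi)
    lagged = [lag(x, i) for i in range(1, p + 1)]  # lagged[i - 1] is L^i X
    out: Series = []
    for t in range(len(x)):
        if t < p:
            out.append(None)  # X_{t-p} lies before the start of the sample
        else:
            out.append(x[t] - sum(phi[i] * lagged[i][t] for i in range(p)))
    return out


x = [1.0, 2.0, 4.0, 8.0, 16.0]
shifted = lag(x, 1)    # [None, 1.0, 2.0, 4.0, 8.0]
forward = lag(x, -1)   # [2.0, 4.0, 8.0, 16.0, None]

# Illustrative AR(1) with phi_1 = 2: the series doubles each step,
# so the residual X_t - 2 X_{t-1} is zero wherever it is defined.
eps = apply_ar_polynomial([2.0], x)  # [None, 0.0, 0.0, 0.0, 0.0]
```

The example series and the coefficient $\varphi _{1}=2$ are chosen only to make the arithmetic easy to follow. They are not meant as a realistic (or stationary) model.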

A polynomial of lag operators is called a lag polynomial so that, for example, the ARMA model can be concisely specified as

$\varphi (L)X_{t}=\theta (L)\varepsilon _{t}$ where $\varphi (L)$ and $\theta (L)$ respectively represent the lag polynomials

$\varphi (L)=1-\sum _{i=1}^{p}\varphi _{i}L^{i}$ and

$\theta (L)=1+\sum _{i=1}^{q}\theta _{i}L^{i}.\,$ Polynomials of lag operators follow similar rules of multiplication and division as do numbers and polynomials of variables. For example,

$X_{t}={\frac {\theta (L)}{\varphi (L)}}\varepsilon _{t},$ means the same thing as

$\varphi (L)X_{t}=\theta (L)\varepsilon _{t}.$ As with polynomials of variables, a polynomial in the lag operator can be divided by another one using polynomial long division. In general dividing one such polynomial by another, when each has a finite order (highest exponent), results in an infinite-order polynomial.
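
To illustrate why the quotient is generally of infinite order, the sketch below expands $\theta (L)/\varphi (L)$ as a power series in $L$, truncated after a chosen number of terms. It assumes both polynomials are given as coefficient lists in ascending powers of $L$ with a nonzero constant term in the divisor. The function name `divide_lag_polynomials` is hypothetical.

```python
from typing import List, Sequence


def divide_lag_polynomials(
    theta: Sequence[float], phi: Sequence[float], n_terms: int
) -> List[float]:
    """Return the first n_terms coefficients psi_0, psi_1, ... of theta(L) / phi(L).

    Coefficients are in ascending powers of L, and phi[0] must be nonzero.
    The quotient is found by matching powers of L in phi(L) * psi(L) = theta(L).
    """
    if phi[0] == 0:
        raise ValueError("phi(L) needs a nonzero constant term")
    psi: List[float] = []
    for k in range(n_terms):
        theta_k = theta[k] if k < len(theta) else 0.0
        # Contribution of the already-known psi_0 .. psi_{k-1} to the L^k term.
        acc = sum(phi[j] * psi[k - j] for j in range(1, min(k, len(phi) - 1) + 1))
        psi.append((theta_k - acc) / phi[0])
    return psi


# Illustrative AR(1): phi(L) = 1 - 0.5 L (so phi_1 = 0.5) and theta(L) = 1.
psi = divide_lag_polynomials([1.0], [1.0, -0.5], 5)
# psi == [1.0, 0.5, 0.25, 0.125, 0.0625]
```

In this example the coefficients are the powers of 0.5, which never reach zero, so the exact quotient $1+0.5L+0.25L^{2}+\dots$ has no highest power even though both inputs are of finite order.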

An annihilator operator, denoted $[\ ]_{+}$, removes the entries of the polynomial with negative power (future values).

Note that $\varphi \left(1\right)$ denotes the sum of coefficients:

$\varphi \left(1\right)=1-\sum _{i=1}^{p}\varphi _{i}$

## Difference operator

In time series analysis, the first difference operator $\Delta$ is defined by

${\begin{aligned}\Delta X_{t}&=X_{t}-X_{t-1}\\\Delta X_{t}&=(1-L)X_{t}~.\end{aligned}}$ Similarly, the second difference operator works as follows:

${\begin{aligned}\Delta (\Delta X_{t})&=\Delta X_{t}-\Delta X_{t-1}\\\Delta ^{2}X_{t}&=(1-L)\Delta X_{t}\\\Delta ^{2}X_{t}&=(1-L)(1-L)X_{t}\\\Delta ^{2}X_{t}&=(1-L)^{2}X_{t}~.\end{aligned}}$ The above approach generalises to the i-th difference operator $\Delta ^{i}X_{t}=(1-L)^{i}X_{t}.$

## Conditional expectation

It is common in stochastic processes to care about the expected value of a variable given a previous information set. Let $\Omega _{t}$ be all information that is common knowledge at time t (this is often subscripted below the expectation operator); then the expected value of the realisation of X, j time-steps in the future, can be written equivalently as:

$E[X_{t+j}|\Omega _{t}]=E_{t}[X_{t+j}].$ With these time-dependent conditional expectations, there is a need to distinguish between the backshift operator (B), which adjusts only the date of the forecasted variable, and the lag operator (L), which adjusts the date of the forecasted variable and the date of the information set equally:

$L^{n}E_{t}[X_{t+j}]=E_{t-n}[X_{t+j-n}],$

$B^{n}E_{t}[X_{t+j}]=E_{t}[X_{t+j-n}].$
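
As a worked instance of the two identities above, take the illustrative values $n=1$ and $j=2$:

$LE_{t}[X_{t+2}]=E_{t-1}[X_{t+1}],$

$BE_{t}[X_{t+2}]=E_{t}[X_{t+1}].$

Under the lag operator the forecast is still made two steps ahead, but from the information available at time $t-1$. Under the backshift operator the information set stays at time $t$ and the forecast horizon shrinks from two steps to one.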