Regression analysis
Part of a series on
Models
Linear regression Simple regression Polynomial regression General linear model
Generalized linear model Vector generalized linear model Discrete choice Binomial regression Binary regression Logistic regression Multinomial logistic regression Mixed logit Probit Multinomial probit Ordered logit Ordered probit Poisson
Multilevel model Fixed effects Random effects Linear mixed-effects model Nonlinear mixed-effects model
Nonlinear regression Nonparametric Semiparametric Robust Quantile Isotonic Principal components Least angle Local Segmented
Errors-in-variables
Estimation
Least squares Linear Non-linear
Ordinary Weighted Generalized Generalized estimating equation
Partial Total Non-negative Ridge regression Regularized
Least absolute deviations Iteratively reweighted Bayesian Bayesian multivariate Least-squares spectral analysis
Background
Regression validation Mean and predicted response Errors and residuals Goodness of fit Studentized residual Gauss–Markov theorem
Mathematics portal
v t e

Least-squares spectral analysis (LSSA) is a method of estimating a frequency spectrum based on a least-squares fit of sinusoids to data samples, similar to Fourier analysis.^[1]^[2] Fourier analysis, the most used spectral method in science, generally boosts long-periodic noise in the long and gapped records; LSSA mitigates such problems.^[3] Unlike in Fourier analysis, data need not be equally spaced to use LSSA.

Developed in 1969^[4] and 1971,^[5] LSSA is also known as the Vaníček method and the Gauss-Vaniček method after Petr Vaníček,^[6]^[7] and as the Lomb method^[3] or the Lomb–Scargle periodogram,^[2]^[8] based on the simplifications first by Nicholas R. Lomb^[9] and then by Jeffrey D. Scargle.^[10]

Historical background

The close connections between Fourier analysis, the periodogram, and the least-squares fitting of sinusoids have been known for a long time.^[11] However, most developments are restricted to complete data sets of equally spaced samples. In 1963, Freek J. M. Barning of Mathematisch Centrum, Amsterdam, handled unequally spaced data by similar techniques,^[12] including both a periodogram analysis equivalent to what nowadays is called the Lomb method and least-squares fitting of selected frequencies of sinusoids determined from such periodograms — and connected by a procedure known today as the matching pursuit with post-back fitting^[13] or the orthogonal matching pursuit.^[14]

Petr Vaníček, a Canadian geophysicist and geodesist of the University of New Brunswick, proposed in 1969 also the matching-pursuit approach for equally and unequally spaced data, which he called "successive spectral analysis" and the result a "least-squares periodogram".^[4] He generalized this method to account for any systematic components beyond a simple mean, such as a "predicted linear (quadratic, exponential, ...) secular trend of unknown magnitude", and applied it to a variety of samples, in 1971.^[5]

Vaníček's strictly least-squares method was then simplified in 1976 by Nicholas R. Lomb of the University of Sydney, who pointed out its close connection to periodogram analysis.^[9] Subsequently, the definition of a periodogram of unequally spaced data was modified and analyzed by Jeffrey D. Scargle of NASA Ames Research Center,^[10] who showed that, with minor changes, it becomes identical to Lomb's least-squares formula for fitting individual sinusoid frequencies.

Scargle states that his paper "does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced," and further points out regarding least-squares fitting of sinusoids compared to periodogram analysis, that his paper "establishes, apparently for the first time, that (with the proposed modifications) these two methods are exactly equivalent."^[10]

Press^[3] summarizes the development this way:

A completely different method of spectral analysis for unevenly sampled data, one that mitigates these difficulties and has some other very desirable properties, was developed by Lomb, based in part on earlier work by Barning and Vanicek, and additionally elaborated by Scargle.

In 1989, Michael J. Korenberg of Queen's University in Kingston, Ontario, developed the "fast orthogonal search" method of more quickly finding a near-optimal decomposition of spectra or other problems,^[15] similar to the technique that later became known as the orthogonal matching pursuit.

Development of LSSA and variants

The Vaníček method

In the Vaníček method, a discrete data set is approximated by a weighted sum of sinusoids of progressively determined frequencies using a standard linear regression or least-squares fit.^[16] The frequencies are chosen using a method similar to Barning's, but going further in optimizing the choice of each successive new frequency by picking the frequency that minimizes the residual after least-squares fitting (equivalent to the fitting technique now known as matching pursuit with pre-backfitting^[13]). The number of sinusoids must be less than or equal to the number of data samples (counting sines and cosines of the same frequency as separate sinusoids).

A data vector Φ is represented as a weighted sum of sinusoidal basis functions, tabulated in a matrix A by evaluating each function at the sample times, with weight vector x:

\phi \approx {\textbf {A))x

,

where the weights vector x is chosen to minimize the sum of squared errors in approximating Φ. The solution for x is closed-form, using standard linear regression:^[17]

x=({\textbf {A))^{\mathrm {T} }{\textbf {A)))^{-1}{\textbf {A))^{\mathrm {T} }\phi .

Here the matrix A can be based on any set of functions mutually independent (not necessarily orthogonal) when evaluated at the sample times; functions used for spectral analysis are typically sines and cosines evenly distributed over the frequency range of interest. If we choose too many frequencies in a too-narrow frequency range, the functions will be insufficiently independent, the matrix ill-conditioned, and the resulting spectrum meaningless.^[17]

When the basis functions in A are orthogonal (that is, not correlated, meaning the columns have zero pair-wise dot products), the matrix A^TA is diagonal; when the columns all have the same power (sum of squares of elements), then that matrix is an identity matrix times a constant, so the inversion is trivial. The latter is the case when the sample times are equally spaced and sinusoids chosen as sines and cosines equally spaced in pairs on the frequency interval 0 to a half cycle per sample (spaced by 1/N cycles per sample, omitting the sine phases at 0 and maximum frequency where they are identically zero). This case is known as the discrete Fourier transform, slightly rewritten in terms of measurements and coefficients.^[17]

x={\textbf {A))^{\mathrm {T} }\phi

— DFT case for N equally spaced samples and frequencies, within a scalar factor.

The Lomb method

A power spectrum (magnitude-squared) of two sinusoidal basis functions, calculated by the periodogram method

Trying to lower the computational burden of the Vaníček method in 1976 ^[9] (no longer an issue), Lomb proposed using the above simplification in general, except for pair-wise correlations between sine and cosine bases of the same frequency, since the correlations between pairs of sinusoids are often small, at least when they are not tightly spaced. This formulation is essentially that of the traditional periodogram but adapted for use with unevenly spaced samples. The vector x is a reasonably good estimate of an underlying spectrum, but since we ignore any correlations, Ax is no longer a good approximation to the signal, and the method is no longer a least-squares method — yet in the literature continues to be referred to as such.

Rather than just taking dot products of the data with sine and cosine waveforms directly, Scargle modified the standard periodogram formula so to find a time delay $\tau$ first, such that this pair of sinusoids would be mutually orthogonal at sample times ${\displaystyle t_{j))$ and also adjusted for the potentially unequal powers of these two basis functions, to obtain a better estimate of the power at a frequency.^[3]^[10] This procedure made his modified periodogram method exactly equivalent to Lomb's method. Time delay $\tau$ by definition equals to

\tan {2\omega \tau }={\frac {\sum _{j}\sin 2\omega t_{j)){\sum _{j}\cos 2\omega t_{j))}.

Then the periodogram at frequency $\omega$ is estimated as:

P_{x}(\omega )={\frac {1}{2))\left({\frac {\left[\sum _{j}X_{j}\cos \omega (t_{j}-\tau )\right]^{2)){\sum _{j}\cos ^{2}\omega (t_{j}-\tau )))+{\frac {\left[\sum _{j}X_{j}\sin \omega (t_{j}-\tau )\right]^{2)){\sum _{j}\sin ^{2}\omega (t_{j}-\tau )))\right)

,

which, as Scargle reports, has the same statistical distribution as the periodogram in the evenly sampled case.^[10]

At any individual frequency $\omega$ , this method gives the same power as does a least-squares fit to sinusoids of that frequency and of the form:

\phi (t)=A\sin \omega t+B\cos \omega t.

^[18]

In practice, it is always difficult to judge if a given Lomb peak is significant or not, especially when the nature of the noise is unknown, so for example a false-alarm spectral peak in the Lomb periodogram analysis of noisy periodic signal may result from noise in turbulence data.^[19] Fourier methods can also report false spectral peaks when analyzing patched-up or data edited otherwise.^[7]

The generalized Lomb–Scargle periodogram

The standard Lomb–Scargle periodogram is only valid for a model with a zero mean. Commonly, this is approximated — by subtracting the mean of the data before calculating the periodogram. However, this is an inaccurate assumption when the mean of the model (the fitted sinusoids) is non-zero. The generalized Lomb–Scargle periodogram removes this assumption and explicitly solves for the mean. In this case, the function fitted is

\phi (t)=A\sin \omega t+B\cos \omega t+C.

^[20]

The generalized Lomb–Scargle periodogram has also been referred to in the literature as a floating mean periodogram.^[21]

Korenberg's "fast orthogonal search" method

Michael Korenberg of Queen's University in Kingston, Ontario, developed a method for choosing a sparse set of components from an over-complete set — such as sinusoidal components for spectral analysis — called the fast orthogonal search (FOS). Mathematically, FOS uses a slightly modified Cholesky decomposition in a mean-square error reduction (MSER) process, implemented as a sparse matrix inversion.^[15]^[22] As with the other LSSA methods, FOS avoids the major shortcoming of discrete Fourier analysis, so it can accurately identify embedded periodicities and excel with unequally spaced data. The fast orthogonal search method was also applied to other problems, such as nonlinear system identification.

Palmer's Chi-squared method

Palmer has developed a method for finding the best-fit function to any chosen number of harmonics, allowing more freedom to find non-sinusoidal harmonic functions.^[23] His is a fast (FFT-based) technique for weighted least-squares analysis on arbitrarily spaced data with non-uniform standard errors. Source code that implements this technique is available.^[24] Because data are often not sampled at uniformly spaced discrete times, this method "grids" the data by sparsely filling a time series array at the sample times. All intervening grid points receive zero statistical weight, equivalent to having infinite error bars at times between samples.

Applications

The most useful feature of LSSA is enabling incomplete records to be spectrally analyzed — without the need to manipulate data or to invent otherwise non-existent data.

Magnitudes in the LSSA spectrum depict the contribution of a frequency or period to the variance of the time series.^[4] Generally, spectral magnitudes thus defined enable the output's straightforward significance level regime.^[25] Alternatively, spectral magnitudes in the Vaníček spectrum can also be expressed in dB.^[26] Note that spectral magnitudes in the Vaníček spectrum follow β-distribution.^[27]

Inverse transformation of Vaníček's LSSA is possible, as is most easily seen by writing the forward transform as a matrix; the matrix inverse (when the matrix is not singular) or pseudo-inverse will then be an inverse transformation; the inverse will exactly match the original data if the chosen sinusoids are mutually independent at the sample points and their number is equal to the number of data points.^[17] No such inverse procedure is known for the periodogram method.

Implementation

The LSSA can be implemented in less than a page of MATLAB code.^[28] In essence:^[16]

"to compute the least-squares spectrum we must compute m spectral values ... which involves performing the least-squares approximation m times, each time to get [the spectral power] for a different frequency"

I.e., for each frequency in a desired set of frequencies, sine and cosine functions are evaluated at the times corresponding to the data samples, and dot products of the data vector with the sinusoid vectors are taken and appropriately normalized; following the method known as Lomb/Scargle periodogram, a time shift is calculated for each frequency to orthogonalize the sine and cosine components before the dot product;^[17] finally, a power is computed from those two amplitude components. This same process implements a discrete Fourier transform when the data are uniformly spaced in time and the frequencies chosen correspond to integer numbers of cycles over the finite data record.

This method treats each sinusoidal component independently, or out of context, even though they may not be orthogonal to data points; it is Vaníček's original method. In addition, it is possible to perform a full simultaneous or in-context least-squares fit by solving a matrix equation and partitioning the total data variance between the specified sinusoid frequencies.^[17] Such a matrix least-squares solution is natively available in MATLAB as the backslash operator.^[29]

Furthermore, the simultaneous or in-context method, as opposed to the independent or out-of-context version (as well as the periodogram version due to Lomb), cannot fit more components (sines and cosines) than there are data samples, so that:^[17]

"...serious repercussions can also arise if the selected frequencies result in some of the Fourier components (trig functions) becoming nearly linearly dependent with each other, thereby producing an ill-conditioned or near singular N. To avoid such ill conditioning it becomes necessary to either select a different set of frequencies to be estimated (e.g., equally spaced frequencies) or simply neglect the correlations in N (i.e., the off-diagonal blocks) and estimate the inverse least squares transform separately for the individual frequencies..."

Lomb's periodogram method, on the other hand, can use an arbitrarily high number of, or density of, frequency components, as in a standard periodogram; that is, the frequency domain can be over-sampled by an arbitrary factor.^[3] However, as mentioned above, one should keep in mind that Lomb's simplification and diverging from the least squares criterion opened up his technique to grave sources of errors, resulting even in false spectral peaks.^[19]

In Fourier analysis, such as the Fourier transform and discrete Fourier transform, the sinusoids fitted to data are all mutually orthogonal, so there is no distinction between the simple out-of-context dot-product-based projection onto basis functions versus an in-context simultaneous least-squares fit; that is, no matrix inversion is required to least-squares partition the variance between orthogonal sinusoids of different frequencies.^[30] In the past, Fourier's was for many a method of choice thanks to its processing-efficient fast Fourier transform implementation when complete data records with equally spaced samples are available, and they used the Fourier family of techniques to analyze gapped records as well, which, however, required manipulating and even inventing non-existent data just so to be able to run a Fourier-based algorithm.

References

External links

Major topics in mathematical analysis

Major topics in mathematical analysis
Calculus: Integration Differentiation Differential equations ordinary partial stochastic Fundamental theorem of calculus Calculus of variations Vector calculus Tensor calculus Matrix calculus Lists of integrals Table of derivatives
Real analysis Complex analysis Hypercomplex analysis (quaternionic analysis) Functional analysis Fourier analysis Least-squares spectral analysis Harmonic analysis P-adic analysis (P-adic numbers) Measure theory Representation theory Functions Continuous function Special functions Limit Series Infinity
Mathematics portal

Statistics

Descriptive statistics

Continuous data

Center	Mean Arithmetic Arithmetic-Geometric Cubic Generalized/power Geometric Harmonic Heronian Heinz Lehmer Median Mode
Dispersion	Average absolute deviation Coefficient of variation Interquartile range Percentile Range Standard deviation Variance
Shape	Central limit theorem Moments Kurtosis L-moments Skewness

Count data

Index of dispersion

Summary tables

Dependence

Graphics

Data collection

Study design	Effect size Missing data Optimal design Population Replication Sample size determination Statistic Statistical power
Survey methodology	Sampling Cluster Stratified Opinion poll Questionnaire Standard error
Controlled experiments	Blocking Factorial experiment Interaction Random assignment Randomized controlled trial Randomized experiment Scientific control
Adaptive designs	Adaptive clinical trial Stochastic approximation Up-and-down designs
Observational studies	Cohort study Cross-sectional study Natural experiment Quasi-experiment

Statistical inference

Statistical theory

Frequentist inference

Point estimation	Estimating equations Maximum likelihood Method of moments M-estimator Minimum distance Unbiased estimators Mean-unbiased minimum-variance Rao–Blackwellization Lehmann–Scheffé theorem Median unbiased Plug-in
Interval estimation	Confidence interval Pivot Likelihood interval Prediction interval Tolerance interval Resampling Bootstrap Jackknife
Testing hypotheses	1- & 2-tails Power Uniformly most powerful test Permutation test Randomization test Multiple comparisons
Parametric tests	Likelihood-ratio Score/Lagrange multiplier Wald

Specific tests

Z-test (normal) Student's t-test F-test
Goodness of fit	Chi-squared G-test Kolmogorov–Smirnov Anderson–Darling Lilliefors Jarque–Bera Normality (Shapiro–Wilk) Likelihood-ratio test Model selection Cross validation AIC BIC
Rank statistics	Sign Sample median Signed rank (Wilcoxon) Hodges–Lehmann estimator Rank sum (Mann–Whitney) Nonparametric anova 1-way (Kruskal–Wallis) 2-way (Friedman) Ordered alternative (Jonckheere–Terpstra) Van der Waerden test

Bayesian inference

Correlation	Pearson product-moment Partial correlation Confounding variable Coefficient of determination
Regression analysis	Errors and residuals Regression validation Mixed effects models Simultaneous equations models Multivariate adaptive regression splines (MARS)
Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression
Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Heteroscedasticity Homoscedasticity
Generalized linear model	Exponential families Logistic (Bernoulli) / Binomial / Poisson regressions
Partition of variance	Analysis of variance (ANOVA, anova) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical / Multivariate / Time-series / Survival analysis

Categorical

Multivariate

Time-series

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration Structural break Granger causality
Specific tests	Dickey–Fuller Johansen Q-statistic (Ljung–Box) Durbin–Watson Breusch–Godfrey
Time domain	Autocorrelation (ACF) partial (PACF) Cross-correlation (XCF) ARMA model ARIMA model (Box–Jenkins) Autoregressive conditional heteroskedasticity (ARCH) Vector autoregression (VAR)
Frequency domain	Spectral density estimation Fourier analysis Least-squares spectral analysis Wavelet Whittle likelihood

Survival

Survival function	Kaplan–Meier estimator (product limit) Proportional hazards models Accelerated failure time (AFT) model First hitting time
Hazard function	Nelson–Aalen estimator
Test	Log-rank test

Applications

Biostatistics	Bioinformatics Clinical trials / studies Epidemiology Medical statistics
Engineering statistics	Chemometrics Methods engineering Probabilistic design Process / quality control Reliability System identification
Social statistics	Actuarial science Census Crime statistics Demography Econometrics Jurimetrics National accounts Official statistics Population statistics Psychometrics
Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging

Historical background

Development of LSSA and variants

The Vaníček method

The Lomb method

The generalized Lomb–Scargle periodogram

Korenberg's "fast orthogonal search" method

Palmer's Chi-squared method

Applications

Implementation

See also

References

External links