Video quality is a characteristic of a video passed through a video transmission or processing system that describes perceived video degradation (typically, compared to the original video). Video processing systems may introduce some amount of distortion or artifacts in the video signal that negatively impact the user's perception of the system. For many stakeholders in video production and distribution, assurance of video quality is an important task.
Video quality evaluation is performed to describe the quality of a set of video sequences under study. Video quality can be evaluated objectively (by mathematical models) or subjectively (by asking users for their rating). Also, the quality of a system can be determined offline (i.e., in a laboratory setting for developing new codecs or services), or in-service (to monitor and ensure a certain level of quality).
Since the world's first video sequence was recorded and transmitted, many video processing systems have been designed. Such systems encode video streams and transmit them over various kinds of networks or channels. In the era of analog video systems, it was possible to evaluate the quality aspects of a video processing system by calculating the system's frequency response using test signals (for example, a collection of color bars and circles).
Digital video systems have almost fully replaced analog ones, and quality evaluation methods have changed. The performance of a digital video processing and transmission system can vary significantly and depends on many factors including the characteristics of the input video signal (e.g. amount of motion or spatial details), the settings used for encoding and transmission, and the channel fidelity or network performance.
Objective video quality models are mathematical models that approximate results from subjective quality assessment, in which human observers are asked to rate the quality of a video. In this context, the term model may refer to a simple statistical model in which several independent variables (e.g. the packet loss rate on a network and the video coding parameters) are fit against results obtained in a subjective quality evaluation test using regression techniques. A model may also be a more complicated algorithm implemented in software or hardware.
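As a minimal sketch of the "simple statistical model" case, the following fits a linear model that maps two independent variables to subjective scores by ordinary least squares. The choice of variables (packet loss in percent, bitrate in Mbit/s), the linear form, the function names, and all data values are assumptions made for illustration; they do not come from any real subjective test.

```python
def fit_linear_model(features, scores):
    """Ordinary least squares; returns [intercept, w1, w2, ...]."""
    rows = [[1.0] + list(f) for f in features]  # prepend a bias term
    n = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def predict(weights, feature):
    """Apply the fitted model to one (loss, bitrate) pair."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Hypothetical training data: (packet loss %, bitrate Mbit/s) -> score.
# The scores are synthetic, generated from 1.0 - 0.4*loss + 0.5*bitrate.
conditions = [(0, 2), (1, 2), (0, 4), (2, 4), (1, 6), (0, 6)]
subjective = [2.0, 1.6, 3.0, 2.2, 3.6, 4.0]

weights = fit_linear_model(conditions, subjective)
estimate = predict(weights, (1, 4))  # quality estimate for an unseen condition
```

Real parametric models are usually non-linear and are fitted to far more test conditions; the sketch only shows the fitting step itself.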
The terms model and metric are often used interchangeably in the field to mean a descriptive statistic which provides an indicator of quality. The term “objective” relates to the fact that, in general, quality models are based on criteria that can be measured objectively – that is, free from human interpretation. They can be automatically evaluated by a computer program. Unlike a panel of human observers, an objective model should always deterministically output the same quality score for a given set of input parameters.
Objective quality models are sometimes also referred to as instrumental (quality) models, in order to emphasize their application as measurement instruments. Some authors suggest that the term “objective” is misleading, as it “implies that instrumental measurements bear objectivity, which they only do in case that they can be generalized.”
Objective models can be classified by the amount of information available about the original signal, the received signal, or whether there is a signal present at all:
Some models that are used for video quality assessment (such as PSNR or SSIM) are simply image quality models, whose output is calculated for every frame of a video sequence. This quality measure of every frame can then be recorded and pooled over time to assess the quality of an entire video sequence. While this method is easy to implement, it does not factor in certain kinds of degradations that develop over time, such as the moving artifacts caused by packet loss and its concealment. A video quality model that considers the temporal aspects of quality degradations, like VQM or the MOVIE Index, may be able to produce more accurate predictions of human-perceived quality.
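The frame-by-frame approach can be sketched as follows: PSNR is computed for each pair of frames and the per-frame values are pooled with the arithmetic mean. Representing a frame as a flat list of 8-bit luma samples, the function names, and the choice of mean pooling are simplifying assumptions for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two frames given as equal-length sample lists."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def pooled_psnr(ref_frames, dist_frames):
    """Return the per-frame PSNR values and their arithmetic mean."""
    per_frame = [psnr(r, d) for r, d in zip(ref_frames, dist_frames)]
    return per_frame, sum(per_frame) / len(per_frame)


# Hypothetical two-frame sequence with four samples per frame.
ref_frames = [[100, 100, 100, 100], [50, 60, 70, 80]]
dist_frames = [[101, 99, 101, 99], [52, 58, 72, 78]]
per_frame, sequence_score = pooled_psnr(ref_frames, dist_frames)
```

Because each frame is scored in isolation, reordering the frames leaves the pooled score unchanged, which illustrates why such a measure cannot capture degradations that develop over time.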
|Full-Reference||PSNR (Peak Signal-to-Noise Ratio)||Image||It is calculated between every frame of the original and the degraded video signal. PSNR is the most widely used objective image quality metric. However, PSNR values do not correlate well with perceived picture quality due to the complex, highly non-linear behaviour of the human visual system.|
|SSIM (Structural SIMilarity)||Image||SSIM is a perception-based model that considers image degradation as perceived change in structural information, while also incorporating important perceptual phenomena, including both luminance masking and contrast masking terms.|
|MOVIE Index (MOtion-based Video Integrity Evaluation)||Video||The MOVIE index is a neuroscience-based model for predicting the perceptual quality of a (possibly compressed or otherwise distorted) motion picture or video against a pristine reference video.|
|VMAF (Video Multimethod Assessment Fusion)||Video||VMAF uses four features to predict video quality: VIF, DLM, MCPD, and AN-SNR. These features are fused using an SVM-based regression to provide a single output score. These scores are then temporally pooled over the entire video sequence using the arithmetic mean to provide an overall differential mean opinion score (DMOS).|
|Reduced-Reference||SRR (SSIM Reduced-Reference)||Video||The SRR value is calculated as the ratio of the SSIM of the received (target) video signal to the SSIM of the reference video pattern.|
|ST-RRED||Video||Computes wavelet coefficients of frame differences between adjacent frames in a video sequence (modelled by a Gaussian scale mixture, GSM). These are used to evaluate RR entropic differences, leading to temporal RRED. In conjunction with spatial RRED indices, evaluated by applying the RRED index on every frame of the video, they yield the spatio-temporal RRED.|
|No-Reference||NIQE (Naturalness Image Quality Evaluator)||Image||This IQA model is founded on perceptually relevant spatial-domain natural scene statistic (NSS) features extracted from local image patches that effectively capture the essential low-order statistics of natural images.|
|BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator)||Image||The method extracts the pointwise statistics of local normalized luminance signals and measures image naturalness (or lack thereof) based on measured deviations from a natural image model. It also models the distribution of pairwise statistics of adjacent normalized luminance signals, which provides distortion orientation information.|
|Video-BLIINDS||Video||Computes statistical models on DCT coefficients of frame differences and calculates motion characterization. Predicts a score based on those features using an SVM.|
An overview of recent no-reference image quality models has been given in a journal paper by Shahid et al. As mentioned above, these can be used for video applications as well. The Video Quality Experts Group has a dedicated working group on developing no-reference metrics (called NORM).
Full or reduced-reference metrics still require access to the original video bitstream before transmission or at least part of it. In practice, an original stream may not always be available for comparison, for example when measuring the quality from the user side. In other situations, a network operator may want to measure the quality of video streams passing through their network, without fully decoding them. For a more efficient estimation of video quality in such cases, parametric/bitstream-based metrics have also been standardized:
Since objective video quality models are expected to predict results given by human observers, they are developed with the aid of subjective test results. During the development of an objective model, its parameters should be trained so as to achieve the best correlation between the objectively predicted values and the subjective scores, often available as mean opinion scores (MOS).
The most widely used subjective test materials are in the public domain and include still pictures, motion pictures, streaming video, high definition, 3-D (stereoscopic), and special-purpose picture-quality datasets. These so-called databases are created by various research laboratories around the world. Some of them have become de facto standards, including several public-domain subjective picture quality databases created and maintained by the Laboratory for Image and Video Engineering (LIVE) as well as the Tampere Image Database 2008. A collection of databases can be found in the QUALINET Databases repository. The Consumer Digital Video Library (CDVL) hosts freely available video test sequences for model development.
In theory, a model can be trained on a set of data in such a way that it produces perfectly matching scores on that dataset. However, such a model will be over-trained and will therefore not perform well on new datasets. It is therefore advised to validate models against new data and use the resulting performance as a real indicator of the model's prediction accuracy.
To measure the performance of a model, some frequently used metrics are the linear correlation coefficient, Spearman's rank correlation coefficient, and the root mean square error (RMSE). Other metrics are the kappa coefficient and the outliers ratio. ITU-T Rec. P.1401 gives an overview of statistical procedures to evaluate and compare objective models.
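A minimal sketch of the three most common measures, written in plain Python: the linear (Pearson) correlation coefficient, Spearman's rank correlation (computed as the Pearson correlation of average ranks), and the RMSE. The function names and the example scores are hypothetical, and the sketch omits steps that a full evaluation may include, such as mapping model outputs to the subjective scale before computing the RMSE.

```python
import math


def pearson(x, y):
    """Linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(predicted, observed):
    """Root mean square error between predictions and subjective scores."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical model outputs and mean opinion scores for five sequences.
predicted = [1.8, 2.9, 3.1, 4.2, 4.6]
mos = [2.0, 2.5, 3.4, 4.0, 4.8]
plcc, srocc, error = pearson(predicted, mos), spearman(predicted, mos), rmse(predicted, mos)
```

The linear correlation rewards a linear relationship between predictions and scores, Spearman's coefficient only checks that the ordering agrees, and the RMSE measures the absolute size of the prediction error.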
Objective video quality models can be used in various application areas. In video codec development, the performance of a codec is often evaluated in terms of PSNR or SSIM. For service providers, objective models can be used for monitoring a system. For example, an IPTV provider may choose to monitor their service quality by means of objective models, rather than asking users for their opinion or waiting for customer complaints about bad video quality. A few of these standards have found commercial applications, including PEVQ and VQuad-HD. SSIM is also part of a commercially available video quality toolset (SSIMWAVE). VMAF is used by Netflix to tune their encoding and streaming algorithms, and to quality-control all streamed content. It is also being used by other technology companies like Bitmovin and has been integrated into software such as FFmpeg.
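As an illustration of the FFmpeg integration, the sketch below builds a command line that compares a distorted file against a reference using one of FFmpeg's `psnr`, `ssim`, or `libvmaf` filters. The helper names and file names are hypothetical, and the `libvmaf` filter is assumed to be available only in FFmpeg builds compiled with libvmaf support.

```python
import subprocess

# FFmpeg comparison filters; each takes the distorted video as its
# first input and the reference video as its second input.
SUPPORTED_FILTERS = ("psnr", "ssim", "libvmaf")


def build_metric_command(distorted, reference, metric="psnr"):
    """Return the argv list for a full-reference comparison with FFmpeg."""
    if metric not in SUPPORTED_FILTERS:
        raise ValueError(f"unsupported metric: {metric}")
    return [
        "ffmpeg",
        "-i", distorted,   # first input: processed/degraded video
        "-i", reference,   # second input: original video
        "-lavfi", metric,  # apply the comparison filter to both inputs
        "-f", "null", "-", # discard the output; only the log is of interest
    ]


def run_metric(distorted, reference, metric="psnr"):
    """Run FFmpeg and return its log text, which contains the metric summary."""
    command = build_metric_command(distorted, reference, metric)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stderr  # FFmpeg writes its log to standard error
```

Both inputs are expected to have the same resolution and frame count; otherwise a scaling or trimming step would be needed before the comparison.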
An objective model should only be used in the context that it was developed for. For example, a model that was developed using a particular video codec is not guaranteed to be accurate for another video codec. Similarly, a model trained on tests performed on a large TV screen should not be used for evaluating the quality of a video watched on a mobile phone.
When estimating the quality of a video codec, all of the objective methods mentioned above may require repeated post-encoding tests in order to determine the encoding parameters that satisfy a required level of visual quality, making them time-consuming, complex, and impractical for implementation in real commercial applications. There is ongoing research into developing novel objective evaluation methods that enable prediction of the perceived quality level of the encoded video before the actual encoding is performed.
All visual artifacts remain relevant to video quality. Attributes not mentioned above include:
The majority of them can be grouped into compression artifacts.
The main goal of many objective video quality metrics is to automatically estimate the average user's (viewer's) opinion on the quality of a video processed by a system. Procedures for subjective video quality measurements are described in ITU-R recommendation BT.500 and ITU-T recommendation P.910. In such tests, video sequences are shown to a group of viewers. The viewers' opinions are recorded and averaged into the mean opinion score to evaluate the quality of each video sequence. However, the testing procedure may vary depending on what kind of system is tested.
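The averaging step can be sketched as follows. The five-point rating scale, the function name, and the normal-approximation 95% confidence interval are assumptions added for illustration; the ratings are hypothetical.

```python
import math


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    n = len(ratings)
    mos = sum(ratings) / n
    # Sample variance of the individual ratings (n - 1 in the denominator)
    variance = sum((r - mos) ** 2 for r in ratings) / (n - 1)
    # Normal approximation; small panels would call for a Student's t quantile
    half_width = 1.96 * math.sqrt(variance / n)
    return mos, half_width


# Hypothetical ratings from five viewers for one video sequence (1 = bad, 5 = excellent).
ratings = [4, 5, 3, 4, 4]
mos, half_width = mean_opinion_score(ratings)
```

Reporting the interval alongside the mean indicates how much the viewers disagreed, which matters when two sequences have similar mean scores.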
|FFmpeg||Free||PSNR, SSIM, VMAF|
|MSU VQMT||Free for basic metrics; paid for HDR metrics||PSNR, SSIM, MS-SSIM, 3SSIM, VMAF, NIQE, VQM, Delta, MSAD, MSE; MSU-developed metrics: Blurring Metric, Blocking Metric, Brightness Flicking Metric, Drop Frame Metric, Noise Estimation Metric|
|EPFL VQMT||Free||PSNR, PSNR-HVS, PSNR-HVS-M, SSIM, MS-SSIM, VIFp|
|OpenVQ||Free||PSNR, SSIM, OPVQ - The Open Perceptual Video Quality metric|
|Elecard||Demo version available||PSNR, APSNR, MSAD, MSE, SSIM, Delta, VQM, NQI, VMAF and VMAF phone, VIF|
|VQ Probe||Free||PSNR, SSIM, VMAF|
Predicting quality of experience (QoE) for video is a major challenge because of the many situations that may arise and the subjective character of QoE. To predict QoE as precisely as possible, a classifier is needed that can detect most of the types of errors or unexpected situations that affect video quality. Some studies have demonstrated that a Gaussian process classifier gives good results for this type of classification.