Microsoft first announced Windows Media Photo at WinHEC 2006,[10] and then renamed it to HD Photo in November of that year. In July 2007, the Joint Photographic Experts Group and Microsoft announced HD Photo to be under consideration to become a JPEG standard known as JPEG XR.[11][12] On 16 March 2009, JPEG XR was given final approval as ITU-T Recommendation T.832 and starting in April 2009, it became available from the ITU-T in "pre-published" form.[1] On 19 June 2009, it passed an ISO/IEC Final Draft International Standard (FDIS) ballot, resulting in final approval as International Standard ISO/IEC 29199-2.[13][14] The ITU-T updated its publication with a corrigendum approved in December 2009,[1] and ISO/IEC issued a new edition with similar corrections on 30 September 2010.[15]
In 2010, after completion of the image coding specification, the ITU-T and ISO/IEC also published a motion format specification (ITU-T T.833 | ISO/IEC 29199-3), a conformance test set (ITU-T T.834 | ISO/IEC 29199-4), and reference software (ITU-T T.835 | ISO/IEC 29199-5) for JPEG XR. In 2011, they published a technical report describing the workflow architecture for the use of JPEG XR images in applications (ITU-T T.Sup2 | ISO/IEC TR 29199-1).
Description
Capabilities
JPEG XR is an image file format that offers several key improvements over JPEG, including:[16]
Better compression
JPEG XR file format supports higher compression ratios in comparison to JPEG for encoding an image with equivalent quality.
Lossless compression
JPEG XR also supports lossless compression. The signal processing steps in JPEG XR are the same for both lossless and lossy coding. This makes the lossless mode simple to support and enables the "trimming" of some bits from a lossless compressed image to produce a lossy compressed image.
Tile structure support
A JPEG XR coded image can be segmented into tile regions. The data for each region can be decoded separately. This enables rapid access to parts of an image without needing to decode the entire image. When a type of tiling referred to as "soft tiling" is used, the tile region structuring can be changed without fully decoding the image and without introducing additional distortion.
Support for more color accuracy
JPEG XR supports a wide variety of image color representations in addition to the conventional 8-bit-per-sample YUV (formally YCbCr) 4:2:0 encoding that is typically used for the original JPEG standard.
For support of images using an RGB color space, JPEG XR includes an internal conversion to the YCgCo color space, and supports a variety of bit depth and color representation packing schemes. These can be used with and without an accompanying alpha channel for shape masking and semi-transparency support, and some of them have much higher precision than what has typically been used for image coding. They include:
Low bit-depth packings of RGB into 16 bits per pixel using 5 bits for each channel or 5 bits for red and blue and 6 bits for green
8 bits per component (sometimes called true color) packed into 24 or 32 bits per pixel
10 bits per component in a 32 bit packed representation (one of several higher-precision varieties of color representation known as deep color)
16 bits per component as integers, fixed-point numbers, or half-precision floating-point numbers packed into 48 or 64 bits
32 bits per component as fixed-point numbers or full-precision floating point numbers packed into 96 or 128 bits (for which lossless coding is not supported due to the excessively high precision)
JPEG XR also supports 16-bit per component (64-bit per pixel) integer CMYK color model.[17]
16-bit and 32-bit fixed point color component codings are also supported in JPEG XR. In such encodings, the most-significant 4 bits of each color channel are treated as providing additional "headroom" and "toe room" beyond the range of values that represents the nominal black-to-white signal range.
Moreover, 16-bit and 32-bit floating point color component codings are supported in JPEG XR. In these cases the image is interpreted as floating point data, although the JPEG XR encoding and decoding steps are all performed using only integer operations (to simplify the compression processing).
In addition to RGB and CMYK formats, JPEG XR also supports grayscale and multi-channel color encodings with an arbitrary number of channels.
The color representations, in most cases, are transformed to an internal color representation. The transformation is entirely reversible, so that this color transformation step does not introduce distortion and thus lossless coding modes can be supported.
Transparency map support
An alpha channel may be present to represent transparency, so that alpha blending overlay capability is enabled.
Compressed-domain image modification
In JPEG XR, full decoding of the image is unnecessary for converting an image from a lossless to lossy encoding, reducing the fidelity of a lossy encoding, or reducing the encoded image resolution.
Full decoding is also unnecessary for certain editing operations such as cropping, horizontal or vertical flips, or cardinal rotations.
The tile structure for access to image regions can also be changed without full decoding and without introducing distortion.
Metadata support
A JPEG XR image file may optionally contain an embedded ICC color profile, to achieve consistent color representation across multiple devices.
One file container format that can be used to store JPEG XR image data is specified in Annex A of the JPEG XR standard. It is a TIFF-like format using a table of Image File Directory (IFD) tags. A JPEG XR file contains image data, optional alpha channel data, metadata, optional XMP metadata stored as RDF/XML, and optional Exif metadata, in IFD tags. The image data is a contiguous self-contained chunk of data. The optional alpha channel, if present, can be compressed as a separate image record, enabling decoding of the image data independently of transparency data in applications which do not support transparency. (Alternatively, JPEG XR also supports an "interleaved" alpha channel format in which the alpha channel data is encoded together with the other image data in a single compressed codestream.)
Being TIFF-based, this format inherits all of the limitations of the TIFF format including the 4 GB file-size limit, which according to the HD Photo specification "will be addressed in a future update".[18]
New work has been started in the JPEG committee to enable the use of JPEG XR image coding within the JPX file storage format — enabling use of the JPIP protocol, which allows interactive browsing of networked images.[13] Additionally, a Motion JPEG XR specification was approved as an ISO standard for motion (video) compression in March 2010.[19]
Compression algorithm
JPEG XR's design[1][20] is conceptually very similar to JPEG: the source image is optionally converted to a luma-chroma colorspace, the chroma planes are optionally subsampled, each plane is divided into fixed-size blocks, the blocks are transformed into the frequency domain, and the frequency coefficients are quantized and entropy coded. Major differences include the following:
JPEG supports bit depths of 8 and 12 bits; JPEG XR supports bit depths of up to 32 bits. JPEG XR also supports lossless and lossy compression of floating-point image data (by representing the floating-point values in an IEEE 754-like format, and encoding them as though they were integers) and RGBE imagery.
JFIF and other typical image encoding practices specify a linear transformation from RGB to YCbCr, which is slightly lossy in practice because of roundoff error. JPEG XR specifies a lossless colorspace transformation, given (for RGB) by
While JPEG uses 8 × 8 blocks for its frequency transformation, JPEG XR primarily uses 4 × 4 block transforms. (2 × 4 and 2 × 2 transformations are also defined for special cases involving chroma subsampling; encoder options include YUV_444, YUV_422, YUV_420, and a monochrome Y_only.)[21]
While JPEG uses a single transformation stage, JPEG XR applies its 4 × 4 core transform in a two-level hierarchical fashion within 16 × 16 macroblock regions. This gives the transform a wavelet-like multi-resolution hierarchy and improves its compression capability.
The DCT, the frequency transformation used by JPEG, is slightly lossy because of roundoff error. JPEG XR uses a type of integer transform employing a lifting scheme.[22] The required transform, called the Photo Core Transform (PCT), resembles a 4 × 4 DCT but is lossless (exactly invertible). In fact, it is a particular realization of a larger family of binary-friendly multiplier-less transforms called the binDCT.[23]
JPEG XR allows an optional overlap prefiltering step, called the Photo Overlap Transform (POT), before each of its 4 × 4 core transform PCT stages.[22] The filter operates on 4 × 4 blocks which are offset by 2 samples in each direction from the 4 × 4 core transform blocks. Its purpose is to improve compression capability and reduce block-boundary artifacts at low bitrates. At high bitrates, where such artifacts are typically not a problem, the prefiltering can be omitted to reduce encoding and decoding time. The overlap filtering is constructed using integer operations following a lifting scheme, so that it is also lossless. When appropriately combined, the POT and the PCT in JPEG-XR form a lapped transform.[24]
In JPEG, the image DC coefficients of the DCT blocks are predicted by applying DC prediction from the left neighbor transform block, and no other coeffients are predicted. In JPEG XR, 4 × 4 blocks are grouped into macroblocks of 16 × 16 samples, and the 16 DC coefficients from the 4 × 4 blocks of each macroblock are passed through another level of frequency transformation, leaving three types of coefficients to be entropy coded: the macroblock DC coefficients (called DC), macroblock-level AC coefficients (called "lowpass"), and lower-level AC coefficients (called AC). Prediction of coefficient values across transform blocks is applied to the DC coefficients and to an additional row or column of AC coefficients as well.
JPEG XR supports the encoding of an image by decomposing it into smaller individual rectangular tile area regions. Each tile area can be decoded independently from the other areas of the picture. This allows fast access to spatial areas of pictures without decoding the entire picture.
JPEG XR's entropy coding phase is more adaptive and complex than JPEG's, involving a DC and AC coefficient prediction scheme, adaptive coefficient reordering (in contrast to JPEG's fixed zigzag ordering), and a form of adaptive Huffman coding for the coefficients themselves.
JPEG uses a single quantization step size per DC/AC component per color plane per image. JPEG XR allows a selection of DC quantization step sizes on a tile region basis, and allows lowpass and AC quantization step sizes to vary from macroblock to macroblock.
Because all encoding phases except quantization are lossless, JPEG XR is lossless when all quantization coefficients are equal to 1. This is not true of JPEG. JPEG defines a separate lossless mode which does not use the DCT, but it is not implemented by libjpeg and therefore not widely supported.
The HD Photo bitstream specification claims that "HD Photo offers image quality comparable to JPEG-2000 with computational and memory performance more closely comparable to JPEG", that it "delivers a lossy compressed image of better perceptive quality than JPEG at less than half the file size", and that "lossless compressed images … are typically 2.5 times smaller than the original uncompressed data".
Software support
A reference software implementation of JPEG XR has been published as ITU-T Recommendation T.835 and ISO/IEC International Standard 29199-5.
The following notable software products natively support JPEG XR:
The 2011 video game Rage employs JPEG XR compression to compress its textures.[46]
Licensing
Microsoft has patents on the technology in JPEG XR. A Microsoft representative stated in a January 2007 interview that in order to encourage the adoption and use of HD Photo, the specification is made available under the Microsoft Open Specification Promise, which asserts that Microsoft allows implementation of the specification for free, and will not file suits on the patented technology for its implementation,[47] as reportedly stated by Josh Weisberg, director of Microsoft's Rich Media Group. As of 15 August 2010, Microsoft made the resulting JPEG XR standard available under its Community Promise.[48]
In July 2010, reference software to implement the JPEG XR standard was published as ITU-T Recommendation T.835 and International Standard ISO/IEC 29199-5. Microsoft included these publications in the list of specifications covered by its Community Promise.[48]
In April 2013, Microsoft released an open source JPEG XR library under the BSD licence.[49][50] This resolved any licensing issues with the library being implemented in software packages distributed under popular open source licences such as the GNU General Public License, with which the previously released "HD Photo Device Porting Kit"[51] was incompatible.
See also
JPEG, an image format used for lossy compression, which JPEG XR lossy is comparable with
JPEG 2000, an improvement intended to replace JPEG by the JPEG committee as of 2000
PNG, a format for lossless compression, which JPEG XR lossless is comparable with
WebP, a format with lossy or lossless compression, proposed by Google in 2010
^matthewu (31 January 2014). "jxrlib". CodePlex. Retrieved 15 March 2014. The JPEG XR format replaces the HD Photo/Windows Media™ Photo format in both Windows 8 and the Windows Image Component (WIC). WIC accompanies the Internet Explorer 10 redistributable packages for down-level versions of Windows.