The Methodology of the Learning-Based Image Compression Benchmark

Comparison methodology

Metrics and implementations

JPEG AI Quality Assessment Framework

MSU Video Quality Measurement Tool 14.1

Others

  • LPIPS[9] - We use this implementation

BSQ-rate

To measure the bitrate/quality trade-off, we use BSQ-rate[10] (Bitrate-for-the-Same-Quality rate). BSQ-rate is calculated in four steps:

  1. Calculate rate/distortion values (points on the rate-distortion plot) for the reference and test codecs.
  2. Remove any outlying points that make a rate-distortion curve non-monotonic.
  3. Swap the bitrate and quality axes, so that bitrate becomes a function of quality, and apply linear interpolation to the resulting points.
  4. Take the segment where the two curves overlap as the integration interval, calculate the areas S1 and S2 under the curves on that interval, and take their ratio: BSQ-rate = S1/S2.
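
The four steps above can be sketched in a few lines of Python. This is a minimal illustration and not the benchmark's own code: the function names are hypothetical, the outlier filter is a simple greedy pass, and assigning S1 to the test codec and S2 to the reference codec is an assumption, since the text does not say which curve is which.

```python
def drop_non_monotonic(points):
    """Step 2: keep only points whose quality rises with bitrate.

    points is a list of (bitrate, quality) pairs. The greedy rule used
    here is an assumption; the source does not specify the filter.
    """
    kept = []
    for rate, quality in sorted(points):
        if not kept or quality > kept[-1][1]:
            kept.append((rate, quality))
    return kept


def rate_at(curve, q):
    """Step 3: linearly interpolate bitrate as a function of quality."""
    for (r0, q0), (r1, q1) in zip(curve, curve[1:]):
        if q0 <= q <= q1:
            return r0 + (r1 - r0) * (q - q0) / (q1 - q0)
    raise ValueError("quality value lies outside the curve")


def area(curve, q_lo, q_hi):
    """Area under the piecewise-linear bitrate(quality) curve on [q_lo, q_hi]."""
    knots = sorted({q_lo, q_hi, *(q for _, q in curve if q_lo < q < q_hi)})
    return sum(
        0.5 * (rate_at(curve, a) + rate_at(curve, b)) * (b - a)
        for a, b in zip(knots, knots[1:])
    )


def bsq_rate(test_points, ref_points):
    """Step 4: ratio of areas over the overlapping quality interval."""
    test = drop_non_monotonic(test_points)
    ref = drop_non_monotonic(ref_points)
    q_lo = max(test[0][1], ref[0][1])
    q_hi = min(test[-1][1], ref[-1][1])
    if q_lo >= q_hi:
        raise ValueError("the curves do not overlap in quality")
    # Assumption: S1 belongs to the test codec, S2 to the reference codec.
    return area(test, q_lo, q_hi) / area(ref, q_lo, q_hi)
```

Under this convention, a value below 1 would mean that the test codec needs fewer bits than the reference to reach the same quality.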

Execution Time Measure

We measure the execution time of each (codec, image, bitrate) triplet three times and take the minimum.
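
A minimal sketch of this timing rule is shown below. The helper name `min_run_time` and the `run` callable, which stands for one codec invocation on one image at one bitrate, are hypothetical; the use of `time.perf_counter` is also an assumption about how the time could be measured.

```python
import time


def min_run_time(run, repeats=3):
    """Call run() `repeats` times and return the shortest wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return min(timings)
```

Taking the minimum rather than the mean reduces the influence of runs that were slowed down by unrelated system activity.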

Dataset Preparation

Optimal Sequence Number

To determine the optimal number of sequences (images) in the dataset, we conducted the following study:

  1. Select a sample of 700 images
  2. Compress images with all codecs, calculate metrics and BSQ-rate
  3. Perform dataset subsampling with replacement
  4. Calculate the average deviation of BSQ-rate across all methods for each metric
  5. If the average deviation of BSQ-rate for a subsample of 250 images is below 1%, the results on the subsample can be considered equivalent to the results on the full sample.

Subsampling
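
The subsampling study can be illustrated with the following sketch. It rests on several assumptions that the text does not confirm: BSQ-rate is available per image and per codec, the dataset-level value is the mean over images, and "deviation" means the relative difference between the subsample value and the full-sample value. The function name and its parameters are hypothetical.

```python
import random
from statistics import mean


def mean_relative_deviation(per_image_bsq, subsample_size, n_trials=1000, seed=0):
    """Average relative deviation of subsample BSQ-rate from the full sample.

    per_image_bsq maps a codec name to a list of per-image BSQ-rate values,
    with the same image order for every codec (an assumed data layout).
    """
    rng = random.Random(seed)
    n_images = len(next(iter(per_image_bsq.values())))
    full = {codec: mean(vals) for codec, vals in per_image_bsq.items()}
    deviations = []
    for _ in range(n_trials):
        # Draw image indices with replacement, shared by all codecs.
        idx = [rng.randrange(n_images) for _ in range(subsample_size)]
        for codec, vals in per_image_bsq.items():
            sub = mean([vals[i] for i in idx])
            deviations.append(abs(sub - full[codec]) / full[codec])
    return mean(deviations)
```

With such a helper, the criterion from the last step would read `mean_relative_deviation(data, 250) < 0.01`, evaluated separately for each metric.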

Image Resolutions

We evaluate codecs on images of three resolutions:

  • HD
  • Full HD
  • 4K

Image Sources

We use images from 2 sources:

  1. Flickr - for images of Full HD and 4K resolution
  2. OpenImages Dataset - for images of HD resolution

Image Selection

We processed over 1M images and selected 250 images per resolution. The selection was based on two features:

    \[SI(X) = std(\|(Sobel_v(X), Sobel_h(X))\|_2),\]
    where \(Sobel_h\) and \(Sobel_v\) are the horizontal and vertical Sobel transformations.
    \[LogBlurLap(X) = log(var(Laplace(X))),\]
    where Laplace is the Laplacian operator \(Laplace(f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}\).
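
As an illustration, the two features defined above can be computed for a grayscale image as follows. This is a dependency-free sketch that treats the image as a list of pixel rows; the 3x3 kernels, the skipping of border pixels, and the small `eps` added before the logarithm are assumptions, and a production pipeline would more likely rely on an array library.

```python
import math
from statistics import pstdev, pvariance

# Standard 3x3 kernels (assumed; the source does not give kernel sizes).
SOBEL_H = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_V = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
LAPLACE = ((0, 1, 0), (1, -4, 1), (0, 1, 0))


def filter3x3(img, kernel):
    """Apply a 3x3 kernel to every interior pixel; return a flat list."""
    h, w = len(img), len(img[0])
    return [
        sum(kernel[dy][dx] * img[y + dy - 1][x + dx - 1]
            for dy in range(3) for dx in range(3))
        for y in range(1, h - 1) for x in range(1, w - 1)
    ]


def spatial_information(img):
    """SI: standard deviation of the Sobel gradient magnitude."""
    gh = filter3x3(img, SOBEL_H)
    gv = filter3x3(img, SOBEL_V)
    return pstdev([math.hypot(a, b) for a, b in zip(gh, gv)])


def log_blur_lap(img, eps=1e-12):
    """LogBlurLap: log of the variance of the Laplacian response.

    eps guards against log(0) on perfectly flat images (an assumption).
    """
    return math.log(pvariance(filter3x3(img, LAPLACE)) + eps)
```

A flat image yields an SI of zero and a very low LogBlurLap, while an image with a sharp edge scores higher on both.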

  1. Images of each resolution were divided into 250 clusters based on these features.
  2. The image closest to each cluster center was selected as a candidate.
  3. Each candidate was manually verified.
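
The first two steps can be sketched as below. The text does not name the clustering algorithm, so plain k-means is used here purely as an assumption, and the function names are hypothetical. Each image is represented by its (SI, LogBlurLap) pair; in practice the two features would probably need to be brought to a comparable scale first.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep empty ones.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def pick_candidates(features, k, seed=0):
    """Return, for each cluster center, the name of the closest image.

    features maps an image name to its (SI, LogBlurLap) tuple.
    """
    names = list(features)
    centers = kmeans([features[n] for n in names], k, seed=seed)
    return [min(names, key=lambda n: math.dist(features[n], c)) for c in centers]
```

For the benchmark setting, `k` would be 250 per resolution, and the returned candidates would then go through manual verification.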

Clustering

Examples


Hardware

Calculations were made using the following hardware:

  • NVIDIA GeForce RTX 3090 GPU, Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
  • NVIDIA RTX A6000 GPU, AMD EPYC 7532 32-Core Processor @ 2.40GHz

References

  1. Z. Wang, E. Simoncelli, and A. Bovik. Multiscale structural similarity for image quality assessment. Conference Record of the Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398-1402, 2003. doi:10.1109/ACSSC.2003.1292216
  2. L. Zhang, L. Zhang, X. Mou, and D. Zhang. FSIM: A Feature Similarity Index for Image Quality Assessment. IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378-2386, Aug. 2011.
  3. Z. Wang and Q. Li. Information Content Weighting for Perceptual Image Quality Assessment. IEEE Transactions on Image Processing, vol. 20, no. 5, pp. 1185-1198, May 2011.
  4. N. Ponomarenko, F. Silvestri, K. Egiazarian, M. Carli, and V. Lukin. On Between-Coefficient Contrast Masking of DCT Basis Functions. CD-ROM Proceedings of the Third International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM-07), Jan. 2007, 4 p.
  5. V. Laparra, A. Berardino, J. Ballé, and E. P. Simoncelli. Perceptually optimized image rendering. J. Opt. Soc. Am. A, vol. 34, pp. 1511-1525, 2017.
  6. H. R. Sheikh and A. C. Bovik. Image Information and Visual Quality. IEEE Transactions on Image Processing, vol. 15, pp. 430-444, Feb. 2006.
  7. VMAF repository
  8. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, vol. 13, pp. 600-612, 2004.
  9. R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 586-595, 2018.
  10. A. Antsiferova, D. Kulikov, S. Zvezdakov, and D. Vatolin. BSQ-rate: a new approach for video-codec performance comparison and drawbacks of current solutions. Proceedings of the Institute for System Programming of the RAS, vol. 32, pp. 89-108, 2020. doi:10.15514/ISPRAS-2020-32(1)-5
25 Jun 2024