Metrics correlations of the Video Colorization Benchmark

Metrics charts

Metrics

PSNR

PSNR (peak signal-to-noise ratio) is a commonly used metric of reconstruction quality for images and video. In our benchmark, we calculate PSNR on the A and B components in LAB color space.
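As an illustration, here is a minimal, dependency-free sketch of PSNR restricted to the two chroma channels. It assumes the frames have already been converted to LAB and that the A and B planes are passed as flat sequences of numbers. The function name `psnr_ab`, the pooling of both channels into one mean squared error, and the peak value of 255 (8-bit encoded channels) are assumptions made for this example, not a description of the benchmark's exact implementation.

```python
import math


def psnr_ab(ref_a, ref_b, out_a, out_b, peak=255.0):
    """PSNR over the pooled A and B channels (flat sequences of equal length).

    `peak` is the maximum possible channel value; 255 assumes 8-bit encoding.
    """
    ref = list(ref_a) + list(ref_b)
    out = list(out_a) + list(out_b)
    if not ref or len(ref) != len(out):
        raise ValueError("reference and output must be non-empty and equal in size")

    # Mean squared error over every A and B sample.
    mse = sum((r - o) ** 2 for r, o in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical chroma planes
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Every chroma sample is off by one, so MSE = 1.
    print(psnr_ab([10, 20], [30, 40], [11, 21], [31, 41]))
```

Because the L channel is ignored, a colorization that reproduces the input luminance exactly gets no credit for it; only chroma errors affect the score.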

SSIM

SSIM (structural similarity index measure) is a metric based on structural similarity. In our benchmark, we calculate SSIM on the A and B components in LAB color space.
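The SSIM formula itself can be sketched in a few lines. The example below is a deliberate simplification: it evaluates SSIM once over a whole channel (a single global window), whereas standard SSIM implementations compute it over local sliding windows and average the results. The names `ssim_global` and `ssim_ab`, the dynamic range of 255, and the equal-weight average over A and B are assumptions made for illustration.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-window SSIM between two flat channels of equal length."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("channels must be non-empty and equal in size")

    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((v - mu_x) * (v - mu_x) for v in x) / n
    var_y = sum((v - mu_y) * (v - mu_y) for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_ab(ref_a, ref_b, out_a, out_b):
    """Average of the global SSIM on the A channel and on the B channel."""
    return 0.5 * (ssim_global(ref_a, out_a) + ssim_global(ref_b, out_b))
```

Identical channels give a score of 1, and the score drops as the means, variances, or correlation of the two channels diverge.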

LPIPS

LPIPS (Learned Perceptual Image Patch Similarity) evaluates the perceptual distance between image patches using deep network features. A higher value means the patches are more different; a lower value means they are more similar.

Color^[1]

Color is a no-reference metric proposed to evaluate the colorfulness of an image. It uses statistics calculated on the A and B components in LAB color space.
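To make "statistics calculated on the A and B components" concrete, the sketch below combines the spread and the mean of the chroma values into one score: the standard deviation term grows with color variety, and the mean term grows with overall saturation. The function name `colorfulness_ab` is hypothetical, and the default weight of 0.37 is an assumption based on one of the LAB variants described by Hasler and Süsstrunk; the exact constants used in the benchmark are not stated here. The A and B values are assumed to be signed, so a neutral gray pixel has a = b = 0.

```python
import math
from statistics import fmean, pstdev


def colorfulness_ab(a, b, mean_weight=0.37):
    """Colorfulness from flat sequences of signed A and B chroma values.

    `mean_weight` is an assumed constant; adjust it to match the variant you need.
    """
    # Combined standard deviation of the two chroma channels.
    sigma_ab = math.hypot(pstdev(a), pstdev(b))
    # Distance of the mean chroma from the neutral (gray) axis.
    mu_ab = math.hypot(fmean(a), fmean(b))
    return sigma_ab + mean_weight * mu_ab
```

A grayscale frame scores 0, which is why this metric is useful for spotting colorization methods that produce washed-out results even when reference-based scores look acceptable.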

Warp Error^[2]

Warp Error is a metric that evaluates the temporal stability of a video. It warps one frame onto another using the corresponding optical flow and measures the difference between the warped frame and the target frame. In our benchmark, we calculate Warp Error on the A and B components in LAB color space.
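The warping step can be sketched as follows. This is a simplified illustration, not the benchmark's implementation: it assumes the optical flow is already available (estimating it is out of scope here), uses nearest-neighbor lookup instead of bilinear interpolation, and skips pixels whose flow points outside the frame rather than using a proper occlusion mask. The function name `warp_error` and the data layout (nested lists of `(a, b)` tuples and `(dx, dy)` flow vectors) are assumptions.

```python
def warp_error(prev, curr, flow):
    """Mean squared chroma difference between flow-matched pixels.

    prev, curr: H x W nested lists of (a, b) chroma tuples.
    flow:       H x W nested lists of (dx, dy); pixel (x, y) in `prev`
                is assumed to move to (x + dx, y + dy) in `curr`.
    """
    height, width = len(prev), len(prev[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Nearest-neighbor target position in the current frame.
            tx, ty = round(x + dx), round(y + dy)
            if 0 <= tx < width and 0 <= ty < height:
                pa, pb = prev[y][x]
                ca, cb = curr[ty][tx]
                total += (pa - ca) ** 2 + (pb - cb) ** 2
                count += 1
    # Average over pixels that stayed inside the frame.
    return total / count if count else 0.0


if __name__ == "__main__":
    prev = [[(1, 1), (2, 2), (3, 3)]]
    curr = [[(0, 0), (1, 1), (2, 2)]]       # content shifted one pixel right
    shift = [[(1, 0), (1, 0), (1, 0)]]
    print(warp_error(prev, curr, shift))     # colors follow the motion
```

When the colors move exactly with the flow, the error is zero; flickering colors on a static or correctly tracked object raise it.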

CDC^[2]

CDC (the Color Distribution Consistency index) measures the Jensen–Shannon (JS) divergence of the color distribution between consecutive frames. Unlike the commonly used warping error, CDC is specifically designed for the video colorization task and can better reflect the consistency of color.
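A minimal sketch of the idea, assuming each frame is given as a tuple of flat channel sequences: build a normalized histogram per channel, compute the JS divergence between the histograms of consecutive frames, and average over channels and frame pairs. The helper names, the 32-bin histogram, the [0, 256) value range, and the base-2 logarithm (which bounds the divergence to [0, 1]) are assumptions for this example; the published CDC definition may differ in details such as the frame intervals it considers.

```python
import math


def histogram(values, bins=32, lo=0.0, hi=256.0):
    """Normalized histogram of channel values over the range [lo, hi)."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[min(bins - 1, max(0, idx))] += 1  # clamp out-of-range values
    total = len(values)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing to the KL divergence.
        return sum(ui * math.log2(ui / vi) for ui, vi in zip(u, v) if ui > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cdc_consecutive(frames, bins=32):
    """Average JS divergence over channels and consecutive frame pairs.

    frames: list of frames, each a tuple of flat channel sequences.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    total, count = 0.0, 0
    for prev, curr in zip(frames, frames[1:]):
        for ch_prev, ch_curr in zip(prev, curr):
            total += js_divergence(histogram(ch_prev, bins), histogram(ch_curr, bins))
            count += 1
    return total / count
```

Because it compares distributions rather than pixel positions, this kind of measure needs no optical flow and is insensitive to motion, but it reacts strongly when the overall palette shifts from one frame to the next.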

ID

Fréchet Inception Distance (FID) is a metric for quantifying the realism and diversity of images produced by generative models such as generative adversarial networks (GANs). FID is frequently used in papers on colorization, but we want to point out that it is meant for unpaired datasets. For paired datasets, it is enough to calculate just the Inception Distance (ID).

Metrics runtime

We measured the runtime of each metric. CPU-compatible metrics (PSNR, SSIM, Color, CDC, Warp Error) were run on an AMD EPYC 7532 32-Core Processor @ 1.50 GHz, and GPU-based metrics (LPIPS, ID) were run on an NVIDIA RTX A6000. For each metric, we calculated the average time per frame over a video, performed three runs, and took the minimum of the three.
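The timing protocol described above (average time per frame, minimum over three runs) can be sketched with the standard library alone. The function name `min_time_per_frame` and the single-argument `metric(frame)` callable are hypothetical; a real metric would typically also take a reference frame, and GPU metrics would need device synchronization before reading the clock.

```python
import math
import time


def min_time_per_frame(metric, frames, runs=3):
    """Best (minimum) average per-frame time of `metric` over several runs."""
    if not frames:
        raise ValueError("need at least one frame")
    best = math.inf
    for _ in range(runs):
        start = time.perf_counter()
        for frame in frames:
            metric(frame)
        elapsed = time.perf_counter() - start
        # Average time per frame for this run; keep the fastest run.
        best = min(best, elapsed / len(frames))
    return best


if __name__ == "__main__":
    dummy_frames = [list(range(1000)) for _ in range(50)]
    print(min_time_per_frame(sum, dummy_frames))  # seconds per frame
```

Taking the minimum rather than the mean reduces the influence of background load and cache warm-up on the reported number.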

References

[1] Hasler, D., & Suesstrunk, S. E. (2003, June). Measuring colorfulness in natural images. In Human vision and electronic imaging VIII (Vol. 5007, pp. 87-95).
[2] Liu, Y., Zhao, H., Chan, K. C., Wang, X., Loy, C. C., Qiao, Y., & Dong, C. (2024). Temporally consistent video colorization with deep feature propagation and self-regularization learning. Computational Visual Media, 10(2), 375-395.

21 Oct 2024

Video processing, compression and quality research group Based in MSU Graphics & Media Laboratory