Temporal shift estimation for stereoscopic videos
- Author: Aleksandr Ploshkin
- Supervisor: dr. Dmitriy Vatolin
Introduction
Video synchronization is a fundamental computer-vision task required by a wide range of applications. A 3D video consists of two streams that show the scene from different angles simultaneously. Desynchronization between the streams has been shown to cause severe discomfort for people watching the stereo video.
We propose a method for estimating the temporal shift (time difference) between the two streams. The method assumes that the temporal shift and the geometric distortion between the streams are constant throughout each scene. Its output is a shift value measured in fractions of a frame step, where one frame step lasts 1/FPS seconds; at 30 FPS, for example, a shift of 0.5 frames is about 16.7 ms.
Example of a detected shot with a temporal shift (Drive Angry)
We treat the task as a regression problem: we construct an equation that describes the spatio-temporal dependency between the views in terms of the motion vectors and the stereo parallax vectors.
The proposed algorithm consists of the following two main stages:
- Calculate the stereo parallax and motion vectors using block-based matching for each stereo frame;
- Estimate model parameters from motion vectors with high confidence using the RANSAC algorithm.
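The first stage can be illustrated with a minimal exhaustive block-matching sketch. It is not the authors' implementation: the sum-of-absolute-differences cost, the block size, the search radius, and the helper names `sad` and `match_block` are all assumptions made for illustration. Matching a block between the left and right views gives a parallax vector; matching it between consecutive frames of one view gives a motion vector.

```python
import random


def sad(a, b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(a[ay + i][ax + j] - b[by + i][bx + j])
        for i in range(size)
        for j in range(size)
    )


def match_block(ref, tgt, y, x, size=4, radius=2):
    """Return the (dy, dx) displacement of the ref block at (y, x) inside tgt.

    Hypothetical helper: exhaustive search over a (2*radius+1)^2 window.
    """
    h, w = len(tgt), len(tgt[0])
    best_cost, best_dy, best_dx = float("inf"), 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ty, tx = y + dy, x + dx
            # Skip candidates that fall outside the target image.
            if 0 <= ty <= h - size and 0 <= tx <= w - size:
                cost = sad(ref, tgt, y, x, ty, tx, size)
                if cost < best_cost:
                    best_cost, best_dy, best_dx = cost, dy, dx
    return best_dy, best_dx


# Synthetic check: tgt is ref moved down by 1 pixel and right by 2 pixels.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
tgt = [
    [ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(16)]
    for y in range(16)
]
print(match_block(ref, tgt, 6, 6))  # (1, 2)
```

Only the vertical component `dy` of each vector matters for the second stage; a practical implementation would also keep the matching cost as a confidence measure.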
Stereoscopic video uses horizontal disparity by design to achieve the stereo effect, but vertical disparity is always the result of spatio-temporal misalignment. The algorithm relies on this assumption to recover the temporal shift from the vertical components of the vectors. A detailed description of the algorithm is published in [1].
A histogram of the found values. The tangent of the slope angle is the shift value in fractions of a frame
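The idea that the shift appears as a slope can be sketched with a simplified model. Assume, for illustration only, that the vertical parallax of a block equals the temporal shift times its vertical motion (in pixels per frame) plus a constant vertical offset; the full model in [1] may contain more terms. The function `ransac_slope`, its tolerance, and its iteration count are hypothetical choices, and the sign of the result depends on which view is taken as the reference.

```python
import random


def ransac_slope(motion_y, parallax_y, n_iter=500, tol=0.05, seed=0):
    """Fit parallax_y ~ shift * motion_y + offset with a basic RANSAC loop."""
    rng = random.Random(seed)
    points = list(zip(motion_y, parallax_y))
    best_inliers = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        offset = y1 - slope * x1
        inliers = [
            (x, y) for x, y in points if abs(y - (slope * x + offset)) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        raise ValueError("no valid line found: all motion values are equal")
    # Refine the best candidate with least squares over its inliers.
    n = len(best_inliers)
    mean_x = sum(x for x, _ in best_inliers) / n
    mean_y = sum(y for _, y in best_inliers) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in best_inliers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in best_inliers)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic vectors: true shift 0.5 frames, offset 0.2 px, 20% gross outliers.
rng = random.Random(1)
motion = [rng.uniform(-4, 4) for _ in range(200)]
parallax = [0.5 * v + 0.2 + rng.gauss(0, 0.01) for v in motion]
for i in range(0, 200, 5):
    parallax[i] = rng.uniform(-3, 3)  # simulated matching failures

shift, offset = ransac_slope(motion, parallax)
print(round(shift, 2), round(offset, 2))
```

RANSAC is used here because block matching produces many unreliable vectors; a plain least-squares fit over all points would be pulled away from the true slope by those outliers.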
Experiments
The algorithm was tested on a synthetic dataset that we created. The video set contained 396 stereoscopic scenes with a frame rate of 30 FPS, taken only from converted stereoscopic movies because these contain no temporal shifts. The frames were subsampled to simulate a temporal shift (e.g. taking only even frames for the left view and only odd frames for the right view results in a shift of 0.5 frames). The final dataset consisted of subsampled views with relative temporal shifts of ±{0.25, 0.5, 1.0, 2.0} frames.
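The subsampling scheme can be sketched as index selection. The source gives only the 0.5-frame example; extending it to the other shift values by writing the shift as a fraction `offset / step` is an assumption, as is the function name `subsample_indices`.

```python
from fractions import Fraction


def subsample_indices(n_frames, shift):
    """Return (left, right) source-frame indices that simulate `shift`.

    The shift is expressed in frame steps of the subsampled video.
    """
    frac = Fraction(shift).limit_denominator(64)
    step, offset = frac.denominator, frac.numerator  # shift == offset / step
    start = max(0, -offset)  # keep right-view indices non-negative
    stop = n_frames - max(0, offset)  # keep right-view indices in range
    left = list(range(start, stop, step))
    right = [i + offset for i in left]
    return left, right


left, right = subsample_indices(10, 0.5)
print(left)   # [0, 2, 4, 6, 8]
print(right)  # [1, 3, 5, 7, 9]
```

With this scheme each view keeps every `step`-th source frame, so the subsampled views play at the source frame rate divided by `step`.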
A comparison of the proposed algorithm with previous work shows a significant gain. The error was calculated as the absolute difference between the target and estimated shift values, measured in frame steps. Evaluation was framed as a classification problem: whether the error falls below a threshold value. In the experiments, the smallest noticeable temporal shift was estimated to be 0.10 frames, so this value was used as the error threshold for comparing algorithms.
Additionally, we processed 60 full-length stereoscopic movies and found 198 scenes with a temporal shift of at least 0.10 frames. Further examples can be found in our VQMT3D reports 8 and 9.
Histogram of detected scenes with a temporal shift
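The evaluation described above reduces to a few lines. The function name and the sample numbers below are made up for illustration and are not taken from the experiments.

```python
def accuracy_at_threshold(targets, estimates, threshold=0.10):
    """Fraction of scenes whose absolute shift error is below the threshold."""
    errors = [abs(t - e) for t, e in zip(targets, estimates)]
    return sum(err < threshold for err in errors) / len(errors)


# Illustrative values only: three of the four estimates are within 0.10 frames.
targets = [0.5, -0.25, 1.0, 2.0]
estimates = [0.46, -0.31, 1.2, 2.05]
print(accuracy_at_threshold(targets, estimates))  # 0.75
```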
Results
- Developed a temporal shift estimation algorithm
- Speed: 0.9 FPS on an Intel Core i7-4700HQ CPU
- Accuracy: 0.9798 at an error threshold of 0.1 frames
- Found 198 shifted scenes in 60 feature-length stereoscopic movies
Publications
1. Ploshkin, A., and Vatolin, D., “Accurate method of temporal-shift estimation for 3D video,” 2018-3DTV-Conference: 3D at any scale and any perspective (3DTV-CON), 2018. doi:10.1109/3DTV.2018.8478431