Automatic detection and analysis of techniques for 2D to 3D video conversion

Introduction

One of the most common ways to create 3D movies is conversion from 2D. In this process, two separate views, one for each eye, are created from a single source image.

The most widely used method for 2D to 3D conversion is Depth Image-Based Rendering (DIBR): the source video is warped according to a depth map, with the original image pixels shifted horizontally by an amount that depends on the corresponding depth value. This warping leaves unfilled areas in occlusions, the parts of the scene that are not visible in the original frame. Filling such areas is a difficult task that has not been completely solved yet.
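The horizontal shift described above can be illustrated with a minimal sketch. It is not the implementation used by any particular converter: the function name `dibr_warp`, the `max_disparity` parameter, the shift direction, and the convention that depth lies in [0, 1] with 1 nearest to the viewer are all assumptions made for illustration.

```python
def dibr_warp(image, depth, max_disparity):
    """Shift each pixel horizontally in proportion to its depth value.

    image: list of rows of pixel values.
    depth: list of rows of values in [0, 1]; 1 is nearest (assumed convention).
    Returns the warped view and a mask that is True where no pixel landed.
    """
    height, width = len(image), len(image[0])
    warped = [[None] * width for _ in range(height)]
    # Depth of the pixel currently stored at each target position.
    best_depth = [[-1.0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            shift = round(depth[y][x] * max_disparity)
            tx = x + shift
            # Keep the nearest pixel when several land on the same position.
            if 0 <= tx < width and depth[y][x] > best_depth[y][tx]:
                warped[y][tx] = image[y][x]
                best_depth[y][tx] = depth[y][x]

    # Positions that received no pixel are the unfilled (occlusion) areas.
    holes = [[value is None for value in row] for row in warped]
    return warped, holes


row = [[10, 20, 30, 40, 50]]
row_depth = [[0.0, 0.0, 1.0, 1.0, 0.0]]
warped, holes = dibr_warp(row, row_depth, max_disparity=2)
```

In this toy row the two near pixels move two positions to the right (one of them leaves the frame), and the positions they vacated are reported as holes. These holes are exactly the areas a converter then has to fill.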

Incorrect filling in occlusions
Valerian and the City of a Thousand Planets

Opening areas


In addition to filling the occlusions, the following conversion techniques are used:

Enlarged object example
Spider-Man: Homecoming


Warped background example
Ant-Man


Deleted object example
The Legend of Tarzan

The proposed method detects which conversion technique was used and estimates how much the final frame differs from the source frame.

Proposed method

Algorithm scheme

Experiments

To verify the correctness of the classifiers, a test dataset containing 4 classes (1000 examples per class) was compiled from 35 full-length converted stereoscopic movies.

Analysis of the boundaries of deformed areas
Alice Through the Looking Glass

Panels: 2D, Left view, Final map, Marked borders

Blue marks borders with a positive depth change, green and red mark borders without any depth change, and purple marks borders with a negative depth change.

The evaluation results of the proposed algorithms on the test dataset are presented in the following graph. The classification accuracy was at least 90%.

Graph
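For readers who want to reproduce this kind of check, the sketch below shows one common way to compute accuracy separately for each class and then take the minimum. It is only an illustration: the function name `per_class_accuracy` and the placeholder labels "a" and "b" are hypothetical and do not correspond to the classes or the evaluation code used in the experiments.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return, for each class, the fraction of its examples predicted correctly."""
    totals = Counter(true_labels)
    correct = Counter(
        truth
        for truth, predicted in zip(true_labels, predicted_labels)
        if truth == predicted
    )
    return {label: correct[label] / totals[label] for label in totals}


# Placeholder labels, not the classes from the test dataset.
true_labels = ["a", "a", "b", "b"]
predicted_labels = ["a", "b", "b", "b"]
accuracy = per_class_accuracy(true_labels, predicted_labels)
worst_class_accuracy = min(accuracy.values())
```

Reporting the minimum over classes is what supports a statement of the form "accuracy was at least X%" for every class rather than only on average.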

Results

The average runtime of the proposed method on video sequences with a resolution of 960 × 540 is approximately 1 second on a computer with a 3.20 GHz Intel Core i5 processor and 8 GB of RAM.

05 May 2020