M.S. Thesis:
Digital Watermarking For Free Viewpoint Television
(Depth Based)

Introduction

Since the invention of the motion picture camera in the late 19th century, technological advances have been among the most important factors shaping the path of the entertainment industry. The first breakthrough was the introduction of sound into film. Then came colour. Now, with massive ongoing research, it seems inevitable that 3D televisions will find their way into this market in the not-so-distant future.

For 3D televisions to work, new types of content must be created, and new types of content require new content protection schemes. Digital watermarking is one of the popular methods for content and copyright protection. In this thesis we have tried to find a watermarking method for one of the 3D television applications, namely Free Viewpoint Television (Depth Based).

Free Viewpoint Television is a system built around a set-top box that takes as input source videos captured by multiple cameras, along with the depth maps of these videos. The user can then view the scene from any arbitrary viewpoint, which does not need to coincide with the location of any source camera. The output is a 2D video rendered by the set-top box.

Because the output can be entirely different from all of the source videos, traditional watermarking techniques become unusable. Our watermarking technique addresses this problem. Our goal is that once content creators have watermarked their source videos, they can detect the watermark in a rendered 2D output video recorded from an unknown, arbitrary viewpoint.

As there is no standardized Free Viewpoint Television implementation in existence yet, we had to develop our own implementation, which we believe will be quite similar to (but simpler than) a future standardized one. We have also tried to make our watermarking technique as independent as possible of differences between Free Viewpoint Television implementations.

Free Viewpoint Television

Figure: Sample video of our Free Viewpoint Television implementation (a YouTube version is also available).

A sample video of our Free Viewpoint Television implementation is available in QuickTime format. The video was recorded in real time on a 2.4 GHz MacBook Pro laptop. Because of the recording process, the performance of our software is slightly degraded in the video.

We take a rather simple approach in our implementation: Using the depth map, the colour map and the projection matrix of the nearest source camera to the viewpoint the user wants to generate, we find the 3D locations of every pixel in the source camera. Then we reproject these 3D points to the viewpoint the user wants to generate. The same steps are reproduced also for the second nearest source camera. Then we blend the two images we obtained, filling the occlusion pixels from one another. We then interpolate any occlusion pixels still left, using the neighbouring pixels.
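The per-pixel steps above can be sketched as follows. This is a minimal illustration, not our actual implementation: it assumes a pinhole camera described by intrinsics (fx, fy, cx, cy), a rotation R and a translation t instead of a single projection matrix, it assumes the depth map stores distance along the optical axis, and it blends with equal weights on one scanline. All function and field names are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def unproject(u, v, depth, cam):
    """Pixel (u, v) plus its depth along the optical axis -> 3D world point."""
    fx, fy, cx, cy = cam["intrinsics"]
    p_cam = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Invert x_cam = R * x_world + t; for a rotation matrix R^-1 equals R^T.
    shifted = [p_cam[i] - cam["t"][i] for i in range(3)]
    r_transposed = [list(col) for col in zip(*cam["R"])]
    return mat_vec(r_transposed, shifted)


def project(point, cam):
    """3D world point -> (u, v, depth) in the given camera."""
    fx, fy, cx, cy = cam["intrinsics"]
    rotated = mat_vec(cam["R"], point)
    x, y, z = [rotated[i] + cam["t"][i] for i in range(3)]
    return fx * x / z + cx, fy * y / z + cy, z


def reproject_pixel(u, v, depth, source_cam, target_cam):
    """Move one source pixel into the user-selected viewpoint."""
    return project(unproject(u, v, depth, source_cam), target_cam)


def blend(nearest, second):
    """Blend two warped scanlines; None marks an occlusion hole.

    Equal weights are an assumption made for this sketch.
    """
    out = []
    for a, b in zip(nearest, second):
        if a is not None and b is not None:
            out.append((a + b) / 2)
        else:
            out.append(a if a is not None else b)  # fill from the other view
    return out


def fill_holes(row):
    """Replace remaining holes by the mean of their known direct neighbours."""
    out = list(row)
    for i, value in enumerate(row):
        if value is None:
            near = [row[j] for j in (i - 1, i + 1)
                    if 0 <= j < len(row) and row[j] is not None]
            out[i] = sum(near) / len(near) if near else None
    return out


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
source = {"intrinsics": (500.0, 500.0, 320.0, 240.0), "R": identity, "t": [0.0, 0.0, 0.0]}
target = {"intrinsics": (500.0, 500.0, 320.0, 240.0), "R": identity, "t": [-0.1, 0.0, 0.0]}
print(reproject_pixel(320, 240, 2.0, source, target))
print(fill_holes(blend([10, None, None, 40], [20, 30, None, None])))
```

A full renderer would also need a depth test when several source pixels land on the same target pixel. That step is omitted from this sketch.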

The reprojection of the 3D points is implemented using OpenGL for efficiency reasons. The other steps mentioned above (i.e. the calculation of the 3D locations of the source camera pixels and the blending of the final images) could also have been implemented with fragment and vertex shaders on the GPU for large performance gains; however, performance was not our primary objective.

Another logical approach to implementing Free Viewpoint Television, preferred by some, is to first create a depth map for the viewpoint the user wants to generate. This depth map can be generated in a manner similar to the creation of the colour images in the approach described previously. Using this depth map, the 3D locations of the pixels of the image to be generated can then be found. Projecting these 3D locations onto the source cameras gives the colour of the corresponding pixels in the output image. The advantage of this method is that applying some post-filtering to the generated depth map yields better occlusion handling and prevents some visual artifacts along the edges.

Watermarking Results

We have performed different tests measuring the robustness of our watermarking algorithm. Two of those tests are presented here.

In order to detect the watermark in the user-generated 2D images, we need to estimate the projection matrix of those images. To do that, we find SIFT matches between the reference camera and the user-generated images. As we already have the depth map of the reference camera, what we actually have are the 3D locations and the corresponding projected 2D locations of those matches. Using the RANSAC algorithm, we can easily estimate the projection matrix from these correspondences.
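The estimation step can be illustrated with the sketch below. It assumes the SIFT matches have already been turned into pairs of a 3D point and a 2D pixel, fits a 3x4 projection matrix by a linear least-squares (DLT-style) solve with the last entry fixed to 1, and wraps that fit in a basic RANSAC loop. The sample size, iteration count, pixel threshold and all function names are assumptions chosen for illustration, not the values used in the thesis.

```python
import math
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # degenerate configuration, no unique solution
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_projection(matches):
    """Least-squares fit of a 3x4 matrix P (with P[2][3] = 1) to 3D-2D matches."""
    rows, rhs = [], []
    for (x, y, z), (u, v) in matches:
        rows.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z])
        rhs.append(u)
        rows.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z])
        rhs.append(v)
    # Normal equations (A^T A) p = A^T b for the 11 unknown entries.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(11)] for i in range(11)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(11)]
    p = solve_linear(ata, atb)
    if p is None:
        return None
    p = p + [1.0]
    return [p[0:4], p[4:8], p[8:12]]


def reprojection_error(P, point, pixel):
    """Pixel distance between the projection of point under P and pixel."""
    x, y, z = point
    u, v, w = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
    if abs(w) < 1e-12:
        return float("inf")
    return math.hypot(u / w - pixel[0], v / w - pixel[1])


def ransac_projection(matches, iterations=200, threshold=2.0, seed=0):
    """Return (P, inliers); needs at least 6 matches."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        P = fit_projection(rng.sample(matches, 6))
        if P is None:
            continue
        inliers = [m for m in matches if reprojection_error(P, *m) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis (None if nothing was found).
    return fit_projection(best_inliers), best_inliers
```

Fixing the last matrix entry to 1 keeps the solve linear but fails for the rare camera whose true value there is zero. A production implementation would more likely normalise the data and use an SVD-based solve.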

Figure: Watermark Detection Quality vs. Viewpoint Position

However, as the user-generated viewpoint gets farther away from the location of the reference camera, the quality of the estimation decreases. Therefore, after the initial estimation, we repeat the whole process, but this time we use the source camera nearest to the estimated projection matrix instead of the reference camera.
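One way to pick the camera for the second pass is sketched below. It assumes that "nearest" means the smallest Euclidean distance between camera centres, which is our reading for this illustration rather than a definition given above. The centre C of a 3x4 matrix P = [M | p4] is recovered from M C = -p4; the function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def camera_centre(P):
    """Camera centre C of P = [M | p4], i.e. the solution of M C = -p4."""
    M = [row[:3] for row in P]
    rhs = [-row[3] for row in P]
    d = det3(M)
    centre = []
    for k in range(3):
        # Cramer's rule: replace column k of M by the right-hand side.
        Mk = [[rhs[i] if j == k else M[i][j] for j in range(3)] for i in range(3)]
        centre.append(det3(Mk) / d)
    return centre


def nearest_camera(P_estimate, source_matrices):
    """Index of the source camera whose centre is closest to the estimate."""
    c = camera_centre(P_estimate)
    return min(range(len(source_matrices)),
               key=lambda i: math.dist(c, camera_centre(source_matrices[i])))
```

The index returned by `nearest_camera` identifies the source camera whose depth map and image would be used when the matching and estimation steps are repeated.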

In order to see the relationship between the watermark detection quality and the position of the user-generated view, we moved the viewpoint along a path connecting all of the source cameras and plotted the watermark detection quality. The results are shown in the figure "Watermark Detection Quality vs. Viewpoint Position". In this test the PSNR of the watermarked images is approximately 44.6 dB.
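For reference, PSNR compares a watermarked image with its original through the mean squared error: PSNR = 10 log10(MAX^2 / MSE), expressed in dB, so a higher PSNR means a weaker and less visible watermark. The sketch below assumes 8-bit images (MAX = 255) passed as flat lists of pixel values; the function name is hypothetical.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """PSNR in dB between two equally sized images given as flat pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Every pixel changed by one grey level gives an MSE of exactly 1.
print(psnr([100, 100, 100, 100], [101, 99, 101, 99]))
```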

In the graph, a higher ratio means that a watermark is more likely to be present. If the ratio is above a certain threshold (we use 2), the watermark is considered successfully detected. The dots in the graph are actual data points. The data points taken at locations where the generated viewpoint coincides with one of the source cameras are shown in red. The solid black curve is the result of a 6th-degree curve fit.

Figure: Watermark Detection Quality vs. Watermark Strength (measured in PSNR)

Examining the viewpoint position graph reveals an expected result: when the viewpoint is near the location of one of the cameras, the watermark is detected more easily. This is a direct consequence of the better projection matrix estimation near the source cameras, as mentioned above.

The second test presented here shows the relationship between the watermark strength and the watermark detection quality.

In order to see the results of the worst-case scenario, the viewpoint is placed between the second and the third camera, where the watermark detection quality is already low. The graph shows that even for this problematic location the watermark can be successfully detected up to PSNR values of around 50 dB.

This site was created by Eren HALICI.