Video from a Single Exposure Coded Photograph

Cameras face a fundamental tradeoff between spatial and temporal resolution: digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is difficult to overcome this tradeoff without incurring a significant increase in hardware cost.

In this project, we propose techniques for sampling, representing, and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions from previous work: (1) we achieve a sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to practical constraints on the sampling scheme imposed by the architectures of present image sensors. Consequently, our sampling scheme can be implemented on image sensors with a straightforward modification to the control unit.

To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single coded image while maintaining high spatial resolution.

This project was done in collaboration with Yasunobu Hitomi and Tomoo Mitsunaga of Sony Corporation.

Publications

Video from a Single Coded Exposure Photograph using a Learned Over-Complete Dictionary

Yasunobu Hitomi, Jinwei Gu, Mohit Gupta, Tomoo Mitsunaga, and Shree Nayar

Proc. ICCV 2011

Efficient Space-Time Sampling with Pixel-wise Coded Exposure for High Speed Imaging

Dengyu Liu, Jinwei Gu, Yasunobu Hitomi, Mohit Gupta, Tomoo Mitsunaga, and Shree Nayar

IEEE Trans. PAMI

A fundamental tradeoff between temporal and spatial resolution

Due to limitations of image sensor hardware, spatial resolution decreases as the frame rate increases, which degrades image quality.

The goal of our work

The goal of our work is to design an imaging system that can capture videos with both high spatial and high temporal resolution. In this project, we focus on two problems in designing practical compressive video acquisition systems: (1) sampling and (2) representation of space-time volumes.

How to sample space-time volumes while accounting for the restrictions imposed by imaging hardware?

For maximum flexibility in designing sampling schemes, it is important to have pixel-wise exposure control. At the same time, we design sampling functions that adhere to the practical constraints imposed by the architectures of present image sensors.
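To make this concrete, here is a minimal sketch of such a sampling function and the corresponding image formation, assuming one common form of the hardware constraint: each pixel is exposed for a single continuous interval of fixed length, with a randomly chosen start time within the frame. The function names, array shapes, and use of NumPy are illustrative only.

```python
import numpy as np

def coded_exposure_mask(height, width, num_subframes, bump_length, seed=None):
    """Per-pixel exposure mask S of shape (num_subframes, height, width).

    Assumption (illustrative): each pixel is on for a single continuous run
    of `bump_length` sub-frames, starting at a random time -- the kind of
    constraint a conventional CMOS readout can support.
    """
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, num_subframes - bump_length + 1,
                          size=(height, width))
    S = np.zeros((num_subframes, height, width), dtype=np.float32)
    for t in range(num_subframes):
        S[t] = ((starts <= t) & (t < starts + bump_length)).astype(np.float32)
    return S

def coded_capture(video, S):
    """Forward model: the single coded photograph is the per-pixel
    time integral of the scene modulated by the exposure mask."""
    return (S * video).sum(axis=0)

# Example: simulate one coded photograph of a 36-sub-frame space-time volume.
T, H, W = 36, 64, 64
video = np.random.rand(T, H, W).astype(np.float32)   # stand-in for a real scene
S = coded_exposure_mask(H, W, num_subframes=T, bump_length=4, seed=0)
I_coded = coded_capture(video, S)                     # single image, shape (H, W)
```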

How to efficiently represent space-time volumes for sparse reconstruction?

We propose learning an over-complete dictionary from a large collection of videos and representing any given video as a sparse linear combination of dictionary elements. The redundant nature of these dictionaries leads to highly sparse representations.
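As an illustration of the reconstruction step, the sketch below recovers a single space-time patch from its coded measurement by sparse coding against a dictionary, using orthogonal matching pursuit. The dictionary D is assumed to have been learned offline on space-time video patches (here it is replaced by a random stand-in), and the measurement matrix simply encodes the per-pixel exposure pattern for that patch; all names and sizes are illustrative.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: find a sparse alpha with y ~= A @ alpha."""
    residual = y.copy()
    support, alpha = [], np.zeros(A.shape[1])
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(A.T @ residual)))   # most correlated atom
        if j not in support:
            support.append(j)
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs        # refit on current support
    alpha[support] = coeffs
    return alpha

def measurement_matrix(S_patch):
    """Phi such that y = Phi @ x, where x is the space-time patch flattened
    in (t, row, col) order and y is the observed coded-image patch."""
    T, p, q = S_patch.shape
    n_pix = p * q
    Phi = np.zeros((n_pix, T * n_pix), dtype=np.float32)
    for t in range(T):
        Phi[np.arange(n_pix), t * n_pix + np.arange(n_pix)] = S_patch[t].ravel()
    return Phi

def reconstruct_patch(y, Phi, D, sparsity=10):
    """Recover a space-time patch as a sparse combination of dictionary atoms."""
    A = Phi @ D                    # project each atom through the sampling
    alpha = omp(A, y, sparsity)    # sparse code of the coded measurement
    return D @ alpha               # reconstructed space-time patch

# Example with stand-in data (a learned dictionary would replace D).
T, p, k = 36, 7, 2000
D = np.random.randn(T * p * p, k).astype(np.float32)
D /= np.linalg.norm(D, axis=0)                           # unit-norm atoms
S_patch = (np.random.rand(T, p, p) > 0.8).astype(np.float32)  # stand-in pattern
x_true = np.random.randn(T * p * p).astype(np.float32)   # stand-in patch
Phi = measurement_matrix(S_patch)
y = Phi @ x_true                                          # coded-image patch
x_hat = reconstruct_patch(y, Phi, D)
```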

Hardware prototype and experiments

While we have not yet fabricated a CMOS image sensor chip with per-pixel exposure control, we have built an emulation imaging system that uses an LCoS device to achieve pixel-wise exposure control. We show video reconstruction results for a variety of motions, ranging from simple linear translation to complex fluid motion and muscle deformation.
