Motivations and related problems of rate control at the encoder


Performance and related works

The experiments reported in [Puri et al., 2007] show that the Prism architecture approaches the performance of the H.263+ inter-frame coder for some test sequences. These performances were theoretically analyzed in [Majumdar et al., 2005], which confirmed that the Prism architecture can achieve good compression for sequences containing slow and easily estimated motion, but lower efficiency for more complex sequences, such as football. An open-source implementation of this architecture was proposed by Fowler in 2005 [Fowler, 2005].
The main drawback of this coding scheme is that the proposed approach is not strictly distributed, since the encoder needs a reference block and then performs an inter-frame comparison.

The drawbacks of the backward channel

One of the major drawbacks of the Stanford DVC scheme is the necessity of a backward channel. Indeed, the encoder needs to wait for the decoder's requests in order to send the correct amount of parity information. This forces DVC schemes to perform real-time decoding. This real-time constraint is hardly feasible, in the sense that it would require a strong reduction of the decoder complexity, which remains very high for the moment because of the iterative algorithms used in turbo decoding.
Some works in the literature have tried to get rid of this return loop. We present these methods in detail in Section 3.3.1.2. Removing the backward channel implies a significant loss in performance (around 1 dB), and it also implies betraying the spirit of distributed coding, by performing a low-complexity comparison between the previous and next key frames in order to obtain a coarse estimation of the correlation at the encoder. In Section 3.3, we propose our own encoder rate estimation method, based on the rate-distortion model introduced in Chapter 2.
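The decoder-driven rate control discussed above can be sketched as a simple feedback loop: the encoder buffers the Wyner-Ziv frame and answers parity requests until the decoder converges. This is a hypothetical illustration, not the behavior of any specific codec; the names `decode_ok`, `feedback_loop` and the value of `CHUNK` are our own assumptions.

```python
CHUNK = 64  # parity bits sent per decoder request (illustrative value)

def decode_ok(received_parity, needed_parity):
    """Stand-in for the channel decoder's convergence test:
    decoding succeeds once enough parity has been received."""
    return received_parity >= needed_parity

def feedback_loop(needed_parity):
    """Encoder keeps the frame buffered and sends parity in small
    increments; each iteration is one round trip on the backward
    channel, which is why real-time decoding is required."""
    sent = 0
    requests = 0
    while not decode_ok(sent, needed_parity):
        sent += CHUNK      # encoder answers one more request
        requests += 1      # one round trip on the backward channel
    return sent, requests

sent, requests = feedback_loop(needed_parity=300)
print(sent, requests)  # 320 5: five round trips before success
```

Suppressing the backward channel amounts to replacing this loop by a single encoder-side guess of `needed_parity`, which is exactly the rate estimation problem addressed in Section 3.3.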

Decorrelation between the quantization and the motion/disparity estimation errors

As in Section 2.3.1, several experiments have been run in order to check the validity of Hypothesis 2. For several sequences and for several rates (obtained by modifying the quantization step of the key frames), the real distortion σ²_{e_I} is measured and compared to the approximation σ̂²_{e_I}. We calculate the percent error between them. The obtained results are reported in Table 2.2 and in Figures 2.5 and 2.6. While the distance is quite small (under 10%) for most of the statistics, there are some larger values (the maximum being 17.42% for mobile at low bitrates), which demonstrates that, in some cases (mainly for high QP), the approximation σ̂²_{e_I} does not fully reflect reality. The plots in Figures 2.5 and 2.6 confirm this tendency. They show the evolution of σ²_{e_I} (plain black line) and σ̂²_{e_I} (dash-dotted red line). The figures also display the behavior of the cross terms σ_{e_{I1},e_{I2}} (dotted green line), σ_{e_I,e_{I1}} and σ_{e_I,e_{I2}} (dotted blue lines), which are supposed to be negligible compared to M_{d1,d2} (plain green line), k₁²D_{I1} and k₂²D_{I2} (plain blue lines). In Figure 2.5, which displays results obtained at high bitrate, the approximation σ̂²_{e_I} is very close to the original distortion σ²_{e_I}. At low bitrate (Figure 2.6), the approximation error is wider and confirms the poor results in Table 2.2. In a rate allocation/estimation
framework, the crux of the matter is to approximate the evolution of the distortion over time. To this end, we do not need access to the exact distortion value. In this light, and since the gap between the true and the estimated distortion remains unchanged along the sequence, the obtained results are adequate for the rate allocation/estimation problem and can thus be deemed satisfactory. We then calculated the numerical temporal differential of the distortions and measured the difference (in %) between them. The obtained results, in Table 2.3, may seem disappointing, but it is known that the differential is more sensitive to errors. For example, the plots in Figure 2.5 have very close evolutions, but the differential error is about 16%. In this light, the results in Table 2.3 are quite good, and show that even if, at low bitrate, there is a gap between σ²_{e_I} and σ̂²_{e_I}, it remains constant along the sequence. The approximation σ̂²_{e_I} thus at least reliably predicts the evolution of the original distortion σ²_{e_I}, and at high bitrate predicts almost its exact value. To conclude, in the light of these acceptable results, the proposed distortion model seems well suited to the targeted applications.
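The two validation metrics used above (percent error between measured and modeled distortion, then the same comparison on their numerical temporal differentials) can be sketched as follows. The arrays are placeholders standing in for per-frame measurements, not actual data from the experiments; the example only illustrates why a constant gap hurts the first metric but not the second.

```python
import numpy as np

def percent_error(true_vals, approx_vals):
    """Mean per-frame relative gap (in %) between the measured
    distortion and its model approximation."""
    true_vals = np.asarray(true_vals, dtype=float)
    approx_vals = np.asarray(approx_vals, dtype=float)
    return 100.0 * np.mean(np.abs(true_vals - approx_vals) / true_vals)

def differential_percent_error(true_vals, approx_vals):
    """Same comparison applied to the numerical temporal
    differentials, which is far more sensitive to small errors."""
    d_true = np.diff(np.asarray(true_vals, dtype=float))
    d_approx = np.diff(np.asarray(approx_vals, dtype=float))
    return 100.0 * np.mean(np.abs(d_true - d_approx) / np.abs(d_true))

# A constant offset between the two curves, as observed at low
# bitrate: the absolute gap is visible, yet the evolution along
# time is tracked exactly.
true_d = np.array([10.0, 12.0, 11.0, 13.0])
approx_d = true_d + 1.5
print(percent_error(true_d, approx_d))               # noticeable gap (about 13%)
print(differential_percent_error(true_d, approx_d))  # 0.0: evolution predicted exactly
```

This is why a constant bias at low bitrate is acceptable for rate allocation: only the evolution of the distortion, captured by the differential, matters.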

Table of contents :

Introduction
1 Distributed coding principles 
1.1 Distributed source coding
1.1.1 Theoretical statement
1.1.1.1 Definition and problem statement
1.1.1.1.a Probability mass function and entropy
1.1.1.1.b Rate and admissibility of the rate
1.1.1.1.c Extension to the case of two correlated sources
1.1.1.1.d Distortion
1.1.1.2 Problem statement
1.1.1.3 Lossless transmission
1.1.1.4 Lossy transmission
1.1.2 Applications
1.2 Distributed video coding
1.2.1 Prism Architecture
1.2.1.1 Prism encoder
1.2.1.2 Prism decoder
1.2.1.3 Performance and related works
1.2.2 Stanford approach
1.2.2.1 Key frame coding
1.2.2.2 WZ frame coding
1.2.2.2.a Image classification
1.2.2.2.b Transform
1.2.2.2.c Quantization
1.2.2.2.d Channel encoder
1.2.2.2.e Side information generation
1.2.2.2.f Channel decoder
1.2.2.2.g Reconstruction
1.2.2.2.h The drawbacks of the backward channel
1.2.2.2.i Hash-based schemes
1.2.3 Multiview distributed video coding
1.2.3.1 Schemes
1.2.3.2 Side information
1.3 Conclusion
I Rate distortion model and applications 
2 Rate distortion model for the prediction error 
2.1 Context
2.2 Hypotheses and calculation
2.3 Model validation
2.3.1 Approximation for quantization distortion
2.3.2 Decorrelation between the quantization and the motion/disparity estimation errors
2.3.3 M_{d1,d2} does not depend on the quantization level
2.3.4 Discussion about hypothesis validation
2.4 Rate distortion model
2.4.1 Results from information theory
2.4.2 Proposed model
2.5 Conclusion
3 Applications of the rate-distortion model 
3.1 Multiview schemes
3.1.1 State-of-the-art
3.1.2 Symmetric schemes
3.1.3 Experimental validation
3.2 Frame loss analysis
3.2.1 Context
3.2.2 Theoretical analysis
3.2.3 Experimental validation
3.3 Backward channel suppression
3.3.1 Introduction
3.3.1.1 Motivations and related problems of rate control at the encoder
3.3.1.2 Existing rate estimation algorithms
3.3.1.3 Hypotheses and main idea of the proposed approach
3.3.2 Frame rate estimation
3.3.2.1 Rate expression
3.3.2.2 Homogeneous distortion inside the GOP
3.3.2.3 Practical approach
3.3.2.4 Experiments
3.3.3 Bitplane rate estimation
3.3.3.1 Wyner-Ziv frame encoding
3.3.3.2 Proposed algorithm
3.3.3.3 Experiments
3.4 Conclusion
II Side information construction 
4 State-of-the-art of the side information generation 
4.1 Estimation methods
4.1.1 Interpolation
4.1.2 Extrapolation
4.1.3 Disparity
4.1.4 Spatial estimation
4.1.5 Refinement methods
4.2 Fusion
4.2.1 Problem statement
4.2.2 Symmetric schemes
4.2.3 Other schemes
4.3 Hash-based schemes
4.3.1 Definition of a hash-based scheme
4.3.2 Hash information transmission
4.3.2.1 Hash selection
4.3.2.2 Hash compression
4.3.3 Hash based side information generation methods
4.3.3.1 Hash motion estimation / interpolation
4.3.3.2 Genetic algorithm fusion
4.4 Conclusion
5 Essor project scheme 
5.1 A wavelet based distributed video coding scheme
5.1.1 Key Frame Encoding and Decoding
5.1.2 Wyner Ziv Frame Encoding
5.1.2.1 Discrete Wavelet Transform and quantization
5.1.2.2 Accumulate LDPC coding
5.1.3 Wyner-Ziv Frame Decoding
5.1.3.1 Accumulate LDPC Decoding
5.2 Proposed interpolation method
5.2.0.2 Forward and Backward motion estimation
5.2.0.3 Bidirectional Interpolation
5.3 Experimental results
5.3.1 Lossless Key frames
5.3.2 Lossy Key frame encoding with H.264 Intra
5.3.3 Lossy Key frame encoding with JPEG-2000
5.3.4 Interpolation error analysis
5.3.5 Rate-distortion performances
5.4 Conclusion
6 Side information refinement 
6.1 Generation of dense vector fields
6.1.1 Motivations and general structure
6.1.2 Cafforio-Rocca algorithm (CRA)
6.1.2.1 Monodirectional refinement
6.1.2.1.a Principle
6.1.2.1.b First experiments
6.1.2.2 Bidirectional refinement
6.1.2.2.a Principle
6.1.2.2.b First experiments
6.1.3 Total variation based algorithm
6.1.3.1 Monodirectional refinement
6.1.3.1.a Principle
6.1.3.1.b First experiments
6.1.3.2 Bidirectional refinement
6.1.3.2.a Principle
6.1.3.2.b First experiments
6.1.4 Experiments
6.2 Proposed fusion methods
6.2.1 Recall of the context
6.2.2 Proposed techniques
6.2.3 Experimental results
6.3 Conclusion
7 Hash-based side information generation 
7.1 Proposed algorithm
7.1.1 General structure
7.1.2 Hash information generation
7.1.3 Genetic algorithm
7.2 Zoom on the three setting-dependent steps
7.2.1 Initial side information generation
7.2.2 Side information block distortion estimation
7.2.3 Candidates of the Genetic Algorithm
7.3 Experimental results
7.3.1 First results
7.3.2 Rate-distortion results
7.4 Conclusion
III Zoom on Wyner Ziv decoding 
8 Correlation noise estimation at the Slepian-Wolf decoder 
8.1 State-of-the-art: existing models
8.1.1 Pixel domain
8.1.1.1 Sequence level
8.1.1.2 Frame Level
8.1.1.3 Block level
8.1.1.4 Pixel Level
8.1.2 Transform domain
8.1.2.1 Sequence level
8.1.2.2 Frame Level
8.1.2.3 Coefficient level
8.1.3 Performance evaluation
8.2 Proposed model: Generalized Gaussian model
8.2.1 Definition and parameter estimation
8.2.1.1 Moment estimation
8.2.1.2 Maximum likelihood estimation
8.2.1.3 Comparison
8.2.2 Approach validation
8.2.3 Experimental results
8.2.3.1 Experimental setting
8.2.3.2 Comparison in the offline setting
8.2.3.3 Comparison in the online scenario
8.2.3.4 Comparison between the offline and online settings
8.2.3.5 Discussion
8.3 A more complete study
8.3.1 Motivations
8.3.2 Experiments and results
8.3.2.1 Experiments setting and results
8.3.2.2 Discussion
8.3.3 Conclusion
9 Side information quality estimation 
9.1 Motivations
9.2 State-of-the-art
9.2.1 PSNR metric
9.2.2 SIQ
9.3 Proposed metric
9.3.1 Generalization of the SIQ
9.3.2 A Hamming distance based metric
9.4 Methodology of metric comparison
9.5 Experimental results
9.5.1 Common side information features
9.5.2 The reasons why the PSNR is commonly used
9.5.2.1 Experiment settings
9.5.2.2 Discussion
9.5.3 The limits of the PSNR
9.5.3.1 Experiment settings
9.5.3.2 Discussion
9.6 Conclusion
Conclusion
List of publications
Appendix – Compressed sensing of multiview images based on disparity estimation methods
Bibliography
