This research aims to sufficiently increase the quality of visual-attention modeling to enable practical applications. We found that automatic models are significantly worse at predicting attention than even single-observer eye tracking. We propose a semiautomatic approach that requires eye tracking of only one observer and is based on the temporal consistency of the observer's attention.
Our comparisons showed that the proposed approach achieves high objective quality relative to automatic methods and to the results of single-observer eye tracking without postprocessing. We demonstrated the practical applicability of the proposed concept on the task of saliency-based video compression.
During this work we created a database of human eye movements recorded while observers viewed various videos (static and dynamic scenes, shots from cinema-like films, and scientific databases).
Key features:
Our short-term memory retains a representation of our environment for some time. In fact, an observer's next eye movement may be determined by short-term memory of the scene as much as by the current perception of it. This behavior can be viewed as temporal consistency of attention, i.e., objects that are salient in a certain frame are assumed to be salient in neighboring frames as well. This leads us to the idea of bidirectional temporal saliency propagation.
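The following is a minimal sketch of this propagation idea, not the project's exact algorithm. It assumes the single observer's gaze is turned into a Gaussian saliency blob per frame and spread to neighboring frames in both directions with an exponential temporal decay; all names and parameters are illustrative.

import numpy as np

def gaze_to_saliency(gaze, shape, sigma=30.0):
    """Build a Gaussian saliency blob around one gaze point (x, y)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - gaze[0]) ** 2 + (ys - gaze[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def propagate_bidirectional(gaze_points, shape, radius=5, decay=0.7):
    """Spread each frame's saliency to +/- `radius` neighboring frames."""
    n = len(gaze_points)
    per_frame = [gaze_to_saliency(g, shape) for g in gaze_points]
    out = [np.zeros(shape) for _ in range(n)]
    for t in range(n):
        for dt in range(-radius, radius + 1):
            s = t + dt
            if 0 <= s < n:
                w = decay ** abs(dt)  # weight falls off with temporal distance
                out[s] = np.maximum(out[s], w * per_frame[t])
    return out

# Usage: five frames of 120x160 video with one gaze point per frame.
gazes = [(80, 60), (82, 61), (85, 63), (90, 66), (95, 70)]
maps = propagate_bidirectional(gazes, shape=(120, 160))
print(len(maps), maps[0].max())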
We compared the proposed approach with 11 automatic visual-attention models. To ensure a fair comparison, we developed a metric that is invariant to most brightness transforms and to mixing with a center-prior model (see the paper and the open-source implementation for details).
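A minimal sketch of the evaluation idea follows: neutralize monotone brightness transforms (here via rank normalization) and blend each model's map with a center prior before scoring, keeping the blend weight that best fits the ground truth. This is only an illustration; the actual metric is defined in the paper and its open-source implementation.

import numpy as np

def rank_normalize(s):
    """Map values to ranks in [0, 1]; invariant to monotone brightness transforms."""
    flat = s.ravel()
    ranks = np.argsort(np.argsort(flat)).astype(np.float64)
    return (ranks / (flat.size - 1)).reshape(s.shape)

def center_prior(shape, sigma_frac=0.3):
    """Isotropic Gaussian centered in the frame (a common center-prior model)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = ((xs - w / 2) / (sigma_frac * w)) ** 2 + ((ys - h / 2) / (sigma_frac * h)) ** 2
    return np.exp(-d2 / 2.0)

def score(pred, gt, weights=np.linspace(0.0, 1.0, 11)):
    """Best Pearson correlation over center-prior blend weights."""
    gt_n = rank_normalize(gt)
    cp = rank_normalize(center_prior(gt.shape))
    best = -1.0
    for a in weights:
        mixed = rank_normalize((1 - a) * rank_normalize(pred) + a * cp)
        best = max(best, np.corrcoef(mixed.ravel(), gt_n.ravel())[0, 1])
    return best

# Usage with random stand-ins for a model's map and a ground-truth fixation map.
rng = np.random.default_rng(0)
pred, gt = rng.random((60, 80)), rng.random((60, 80))
print(round(score(pred, gt), 3))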
We modified the x264 H.264 video encoder to enable saliency-aware compression. More precisely, saliency maps were used to modify macroblock quantization-parameter (QP) values, spending more bits on salient regions and fewer on the rest.
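The sketch below illustrates the QP-adjustment idea: average the saliency inside each 16x16 macroblock and map it to a QP offset so that salient blocks get a lower QP (more bits) and non-salient blocks a higher one. The mapping and the way offsets would be fed to the modified encoder are assumptions for illustration, not the project's exact implementation.

import numpy as np

MB = 16  # H.264 macroblock size in luma pixels

def macroblock_qp_offsets(saliency, max_delta=6):
    """Per-macroblock QP deltas: negative where saliency is above average."""
    h, w = saliency.shape
    mb_h, mb_w = h // MB, w // MB
    blocks = saliency[:mb_h * MB, :mb_w * MB].reshape(mb_h, MB, mb_w, MB)
    mb_sal = blocks.mean(axis=(1, 3))  # mean saliency per macroblock
    norm = (mb_sal - mb_sal.mean()) / (mb_sal.std() + 1e-8)
    return np.clip(np.round(-norm * max_delta), -max_delta, max_delta).astype(int)

# Usage: a synthetic saliency map with a salient region in the middle.
sal = np.zeros((288, 352))
sal[100:180, 140:220] = 1.0
deltas = macroblock_qp_offsets(sal)
print(deltas.shape, deltas.min(), deltas.max())  # negative deltas in the salient center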
We used saliency maps obtained with different visual-attention models for saliency-aware x264-based video compression. By spending fewer bits on non-salient areas, we achieved a quality increase in the salient region of up to 0.022 EWSSIM at the same bit rate.