We found that to be able to annotate dynamic texture sequences unambiguously we need some additional terminology. First, a sequence may contain more than one dynamic texture. We refer to each of these as a dynamic texture process, e.g. we may have waving grass in front of waving water, giving two processes. When processes visually overlap we say that the processes interfere.
Furthermore, texture processes are themselves often composed of a mix of behaviors. For instance, a sea wave may consist of an oscillating motion and, at the same time, a turbulent foam part. To be able to annotate quantities such as <SpeedFrequency> and <TemporalRegularity> unambiguously, i.e. without having to make arbitrary choices about which behavior is the most important or dominant, we introduce the notion of a dynamic texture mode. The dynamic texture processes are then described as compositions of visually relevant modes, where each mode can be considered a dynamic texture by itself but consists of only a single type of dynamics.
A texture mode may also be either continuous or discrete. Discrete textures have directly discernible parts, e.g. a group of ants crawling around, or leaves fluttering in the wind. Continuous texture modes have continuous media, e.g. a waving flag, or are practically indiscernible from continuous media, e.g. a waving field of grass seen from considerable distance.
Figure shows different dynamic textures of the DynTex
database with different numbers of processes and modes.
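The terminology above implies a simple containment hierarchy: a sequence holds one or more processes, and each process holds one or more modes. The following is a minimal Python sketch of that hierarchy. The class and field names (`TextureSequence`, `Process`, `Mode`, `discrete`) are illustrative assumptions and are not part of the DynTex distribution.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mode:
    """A single type of dynamics within a process (hypothetical model)."""
    description: str
    discrete: bool  # True: directly discernible parts; False: continuous medium


@dataclass
class Process:
    """One dynamic texture in a sequence, composed of one or more modes."""
    name: str
    modes: List[Mode] = field(default_factory=list)


@dataclass
class TextureSequence:
    """A video sequence containing one or more dynamic texture processes."""
    processes: List[Process] = field(default_factory=list)

    def number_of_processes(self) -> int:
        return len(self.processes)

    def number_of_modes(self) -> int:
        # Total over all processes, as in the global sequence descriptors.
        return sum(len(p.modes) for p in self.processes)


# The sea-wave example from the text: one process with two modes.
# Treating both modes as continuous is an assumption made for this sketch.
wave = Process("sea wave", [
    Mode("oscillating wave motion", discrete=False),
    Mode("turbulent foam", discrete=False),
])
seq = TextureSequence([wave])
```

Under this model the wave sequence counts as one process with two modes.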
|
The first group of descriptors applies to the video sequence as a whole. These include administrative descriptors such as a unique identifier, location and date. The <Protocol> variable indicates whether the quality of the sequence is sufficient for inclusion in the DynTex golden set; if not, the reason is described in the <Deviation> field. Next, there are descriptors of the shooting conditions, e.g. indicating whether the shot was taken outdoors and whether camera motion or a disturbance (e.g. a duck swimming into the shot) was present. Finally, a number of important global properties are described: the shot type (close-up or in context), the number of dynamic texture processes visible in the sequence, and the total number of dynamic texture modes (see Section ). The complete list of management descriptors is shown in Table .
Every dynamic texture mode has a unique <DynTexId>. If a mode is part of a texture process in which it is superimposed on another mode, this is indicated by the <Superposition> descriptor. If a mode visually interferes with one or more modes from another process, this is indicated by the <Interference> descriptor. The global texture mode descriptors are listed in Table . Next we turn to the visual description of the modes.
|
In Section we saw that (strong) dynamic textures become dynamic due to intrinsic changes in the spatial pattern. These intrinsic changes can be divided into two main types. By far the most common cause of change is motion of the constituent objects or of the medium. Motion leads to direct change, and it can also lead indirectly to a changed appearance as a result of a new orientation towards light sources and camera. An example of such motion-induced appearance change is the light reflection on rippling water surfaces.
The second, much less common cause of intrinsic pattern change is direct change of appearance without any underlying motion. This can be the result either of the medium or object changing its luminous properties (e.g. a grid of flickering LEDs in a server room), or of constituent parts changing their size or shape (e.g. a flock of birds flapping their wings).
For the annotation of the temporal dynamics, we found that for both types of pattern change, a small number of qualitative behavior types suffice to characterize most texture modes. First we discern directed (i.e. unidirectional) change and oscillating change. These change types apply mainly to dynamics that are constrained to some extent, e.g. a fluttering leaf connected to its branch, or a car moving along the highway. Unconstrained, free change is usually described well as "irregular". For motion we work this out more precisely by means of a <TrajectoryType> variable with the following possible values: still, oscillation, directed straight, directed curved and irregular. Note that we distinguish between straight and curved directed change. Naturally, the other types could also be further qualified, but we did not find this useful. Similarly, for the <AppearanceChange> variable we take the values no/little, oscillation, directed (e.g. going from one color to another, without returning to the original color), and irregular.
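The two value sets just listed can be written down as enumerations, which makes the permitted annotation values explicit. This is a sketch only: the member names and the helper `is_directed` are hypothetical, and the string values copy the wording used in the text, which may differ from the exact spelling stored in the database.

```python
from enum import Enum


class TrajectoryType(Enum):
    """Values of <TrajectoryType> as listed in the text."""
    STILL = "still"
    OSCILLATION = "oscillation"
    DIRECTED_STRAIGHT = "directed straight"
    DIRECTED_CURVED = "directed curved"
    IRREGULAR = "irregular"


class AppearanceChange(Enum):
    """Values of <AppearanceChange> as listed in the text."""
    NO_LITTLE = "no/little"
    OSCILLATION = "oscillation"
    DIRECTED = "directed"
    IRREGULAR = "irregular"


def is_directed(trajectory: TrajectoryType) -> bool:
    """Hypothetical helper: True for straight and curved directed motion."""
    return trajectory in (TrajectoryType.DIRECTED_STRAIGHT,
                          TrajectoryType.DIRECTED_CURVED)
```

Only motion separates straight from curved directed change, so `TrajectoryType` has five members while `AppearanceChange` has four.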
For convenient searching, we also found it useful to introduce a <MainClass> variable, which provides an overall characterization of the dynamic texture mode by essentially summarizing the change type and the discreteness (or continuity) of the process. We chose the following classes:
Note that, just as for motion, the direct appearance changes could be further divided using the values of the change type (i.e. <AppearanceChange>) and <Discrete> variables, but since these modes are so rarely visually relevant, we have kept these in a single class.
|
For each dynamic texture mode the visual structure is annotated by its main class and the descriptors listed in Table . The variables <SpeedFrequency> and <Amplitude> describe the speed and extent of the modal dynamics; <TemporalRegularity> describes to what extent the dynamics, in terms of the former two variables, are regular in time. Similarly, <SpatialRegularity> measures how the dynamics vary between parts of the texture: do all parts have similar dynamics, or is there a lot of spatial variation? The variables <SpatialScale> and <SpatialContrast> describe the scale and extent of spatial variation between the moving parts of the texture; <Density> applies only to discrete modes. For discrete textures where the parts are clearly separated (i.e. with low density), <SpatialContrast> refers to the contrast between the discrete objects and their background.
Each mode is also described by two semantic descriptors. The <ObjectMedium> variable is used foremost to identify the main constituent of the physical dynamic texture. To this end we must make a distinction between continuous and discrete modes. For continuous textures we provide a semantic description of the continuous medium. For example, for sea waves the semantic category is water. For discrete textures we provide category labels for the main texture objects, e.g. for cars on a highway this label is car.
For both types of textures we provide an additional semantic specifier which allows us to identify the embedding process more specifically. In the examples above, the associated specifiers are sea and traffic, respectively. Note that modes from the same process will always share the same value for this variable. To describe the processes we only use nouns; the types of change (e.g. "waving" or "rippling") are already characterized sufficiently by the structural descriptors.
The values currently used to annotate the DynTex textures for the <ObjectMedium> and <Process> descriptors are listed below in Table . In some cases more than one value may be applicable.
|
A Microsoft Access database with a specially designed interface is made available for quick browsing of the data set and for convenient selection of dynamic textures of interest. A snapshot of the browsing interface can be seen in Figure . Sequences can be played in an embedded player for a quick overview of their content.
The sequences are annotated according to the scheme described in Section . The annotations can be accessed through the browser mentioned above and are also made available as XML files. An example of the XML description for the wave sequence of Figure is shown below:
<Dyntex_sequence>
  <SequenceId>681</SequenceId>
  <Name>54ab110</Name>
  <CloneId>0</CloneId>
  <Date>2005-04-10</Date>
  <Location>Castricum, The Netherlands</Location>
  <Outdoor>1</Outdoor>
  <ArtificialLight>0</ArtificialLight>
  <Protocol>Golden</Protocol>
  <Sound>0</Sound>
  <CameraMotion>0</CameraMotion>
  <NrOfFrames>250</NrOfFrames>
  <NrOfTextures>1</NrOfTextures>
  <NumberOfModes>2</NumberOfModes>
  <ShotType>Closeup</ShotType>
  <Combination>0</Combination>
  <Disturbance>0</Disturbance>
  <Notes>
    mode 1: the `turbulence'; mode 2 : the incoming wave
  </Notes>
  <ImagePath>C:\Databases\dyntex_web\img\54ab110.png</ImagePath>
  <VideoPath>C:\Databases\dyntex_web\mpeg4\54ab110.avi</VideoPath>
</Dyntex_sequence>
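As an illustration of how such a file could be read programmatically, the sketch below parses an abridged copy of the annotation with Python's standard `xml.etree.ElementTree` module. The function name `parse_annotation` is hypothetical, and the grouping of fields into integers and 0/1 flags is an assumption based on the example values alone, not on a published schema.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the annotation shown above.
XML_TEXT = """<Dyntex_sequence>
<SequenceId>681</SequenceId>
<Name>54ab110</Name>
<Outdoor>1</Outdoor>
<Protocol>Golden</Protocol>
<NrOfFrames>250</NrOfFrames>
<NrOfTextures>1</NrOfTextures>
<NumberOfModes>2</NumberOfModes>
<ShotType>Closeup</ShotType>
<Disturbance>0</Disturbance>
</Dyntex_sequence>"""

# Assumed typing of fields, inferred from the example values only.
INT_FIELDS = {"SequenceId", "NrOfFrames", "NrOfTextures", "NumberOfModes"}
FLAG_FIELDS = {"Outdoor", "Disturbance"}


def parse_annotation(xml_text):
    """Convert one <Dyntex_sequence> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        text = (child.text or "").strip()
        if child.tag in INT_FIELDS:
            record[child.tag] = int(text)
        elif child.tag in FLAG_FIELDS:
            record[child.tag] = text == "1"  # 0/1 read as a boolean flag
        else:
            record[child.tag] = text  # everything else stays a string
    return record


record = parse_annotation(XML_TEXT)
```

Fields not named in either set are kept as strings, so the sketch also accepts the full annotation without modification.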
Using the browser, database selections can be compiled based on all variables of the annotation scheme. For example, one can select all sequences with more than one texture process, all sequences containing disturbances, all close-ups, all sequences with irregular trajectories or any other specific combination of structural properties.
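The same kind of selection can be sketched outside the browser on annotation records held as dictionaries. This is not the Access interface itself. The helper names `matches` and `select` are hypothetical, the second and third records are invented purely for illustration, and the shot-type value "Context" is an assumed spelling.

```python
def matches(record, criteria):
    """True if the record satisfies every criterion.

    A criterion is either a plain value (tested for equality) or a
    callable predicate applied to the field value.
    """
    for key, want in criteria.items():
        if key not in record:
            return False
        value = record[key]
        if callable(want):
            if not want(value):
                return False
        elif value != want:
            return False
    return True


def select(records, **criteria):
    """Return all records that match the given field criteria."""
    return [r for r in records if matches(r, criteria)]


records = [
    # Taken from the XML example in the text.
    {"SequenceId": 681, "NrOfTextures": 1, "Disturbance": 0, "ShotType": "Closeup"},
    # Invented records, for illustration only.
    {"SequenceId": 9001, "NrOfTextures": 2, "Disturbance": 1, "ShotType": "Context"},
    {"SequenceId": 9002, "NrOfTextures": 2, "Disturbance": 0, "ShotType": "Closeup"},
]

# All sequences with more than one texture process.
multi_process = select(records, NrOfTextures=lambda n: n > 1)

# All undisturbed close-ups.
clean_closeups = select(records, ShotType="Closeup", Disturbance=0)
```

Criteria combine conjunctively, which mirrors selecting on a specific combination of properties.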