To annotate dynamic texture sequences unambiguously, we need some additional terminology. First, a sequence may contain more than one dynamic texture. We refer to each of these as a dynamic texture process, e.g. we may have waving grass in front of waving water, giving two processes. When processes visually overlap we say that the processes interfere.
Furthermore, texture processes are often themselves composed of a mix of behaviors. For instance, a sea wave may consist of an oscillating motion and may also have a turbulent foam part at the same time. To be able to annotate quantities such as <SpeedFrequency> and <TemporalRegularity> unambiguously, i.e. without having to make arbitrary choices about which behavior is the most important or dominant, we introduce the notion of a dynamic texture mode. The dynamic texture processes are then described as compositions of visually relevant modes, where each mode can be considered a dynamic texture by itself but consists of only a single type of dynamics.
A texture mode may also be either continuous or discrete. Discrete textures have directly discernible parts, e.g. a group of ants crawling around, or leaves fluttering in the wind. Continuous texture modes have continuous media, e.g. a waving flag, or are practically indiscernible from continuous media, e.g. a waving field of grass seen from considerable distance.
The first group of descriptors applies to the video sequence as a whole. It includes administrative descriptors such as a unique identifier, location and date. The <Protocol> variable indicates whether the quality of the sequence is sufficient for inclusion in the DynTex golden set; if not, the reason is described in the <Deviation> field. Next, there are descriptors describing the shooting conditions, e.g. indicating whether the shot was taken outdoors and whether camera motion or a disturbance (e.g. a duck swimming into the shot) was present. Finally, a number of important global properties are described: the shot type (close-up or in context), the number of dynamic texture processes visible in the sequence, and the total number of dynamic texture modes (see Section ).
Every dynamic texture mode has a unique <DynTexId>. If a mode is part of a texture process in which it is superimposed on another mode, this is indicated by the <Superposition> descriptor. If a mode visually interferes with a mode or modes from another process, this is indicated by the <Interference> descriptor. The global texture mode descriptors are listed in Table . Next we turn to the visual description of the modes.
In Section we saw that (strong) dynamic textures become dynamic due to intrinsic changes in the spatial pattern. The intrinsic changes can be divided into two main types. By far the most common cause of change is motion of the constituent objects or of the medium. Motion leads to direct change, and it can also indirectly lead to a changed appearance as a result of a new orientation towards light sources and camera. An example of such motion-induced appearance change is the light reflection on rippling water surfaces. The second, much less common cause of intrinsic pattern change is direct change of appearance without an underlying motion. This can be the result either of the medium or object changing its luminous properties (e.g. a grid of flickering LEDs in a server room), or of constituent parts changing their size or shape (e.g. a flock of birds flapping their wings).
For the annotation of the temporal dynamics, we found that for both types of pattern change, a small number of qualitative behavior types suffices to characterize most texture modes. First we discern directed (i.e. unidirectional) change and oscillating change. These change types apply mainly to dynamics that are constrained to some extent, e.g. a fluttering leaf connected to its branch, or a car moving along the highway. Unconstrained (free) change is usually well described as ``irregular''. For motion we work this out more precisely by means of a <TrajectoryType> variable with the following possible values: still, oscillation, directed straight, directed curved and irregular. Note that we discern straight and curved directed change. Naturally, the other types could also be qualified further, but we did not find this useful. Similarly, for the <AppearanceChange> variable we take the values no/little, oscillation, directed (e.g. going from one color to another, without returning to the original color), and irregular.
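The two value sets above lend themselves to a closed enumeration. The following is a minimal sketch in Python, not part of the DynTex tooling: the class names mirror the variable names, while the helper functions `parse_trajectory` and `is_directed` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class TrajectoryType(Enum):
    """Possible values of the <TrajectoryType> variable."""
    STILL = "still"
    OSCILLATION = "oscillation"
    DIRECTED_STRAIGHT = "directed straight"
    DIRECTED_CURVED = "directed curved"
    IRREGULAR = "irregular"


class AppearanceChange(Enum):
    """Possible values of the <AppearanceChange> variable."""
    NO_LITTLE = "no/little"
    OSCILLATION = "oscillation"
    DIRECTED = "directed"
    IRREGULAR = "irregular"


def parse_trajectory(label: str) -> TrajectoryType:
    """Map an annotation string to a TrajectoryType (hypothetical helper).

    Raises ValueError for labels outside the value set.
    """
    return TrajectoryType(label.strip().lower())


def is_directed(trajectory: TrajectoryType) -> bool:
    """True for both the straight and the curved directed trajectory."""
    return trajectory in (
        TrajectoryType.DIRECTED_STRAIGHT,
        TrajectoryType.DIRECTED_CURVED,
    )
```

Representing the values as enumerations rather than free strings means that a misspelled label is rejected when an annotation is read, instead of silently creating a new category.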
For convenient searching, we also found it useful to introduce a <MainClass> variable, which provides an overall characterization of the dynamic texture mode by essentially summarizing the change type and the discreteness (or continuity) of the process. We chose the following classes:
Note that, just as for motion, the direct appearance changes could be further divided using the values of the change type (i.e. <AppearanceChange>) and <Discrete> variables, but since these modes are so rarely visually relevant, we have kept these in a single class.
For each dynamic texture mode the visual structure is annotated by its main class and the descriptors listed in Table . The variables <SpeedFrequency> and <Amplitude> describe the speed and extent of the modal dynamics; <TemporalRegularity> describes to what extent the dynamics, in terms of the former two variables, are regular in time. Similarly, <SpatialRegularity> measures how the dynamics vary between parts of the texture: do all parts have similar dynamics or is there a lot of spatial variation? The variables <SpatialScale> and <SpatialContrast> describe the scale and extent of spatial variation between the moving parts of the texture; <Density> applies only to discrete modes. For discrete textures where the parts are clearly separated (i.e. with low density), <SpatialContrast> refers to the contrast between the discrete objects and their background.
Each mode is also described by two semantic descriptors. The <ObjectMedium> variable is used foremost to identify the main constituent of the physical dynamic texture. To this end we must make a distinction between continuous and discrete modes. For continuous textures we provide a semantic description of the continuous medium. For example, for sea waves the semantic category is water. For discrete textures we provide category labels for the main texture objects, e.g. for cars on a highway this label is car.
For both types of textures we provide an additional semantic specifier which allows us to identify the embedding process more specifically. In the examples above, the associated specifiers are sea and traffic, respectively. Note that modes from the same process will always share the same value for this variable. To describe the processes we only use nouns; the types of change (e.g. ``waving'' or ``rippling'') are already characterized sufficiently by the structural descriptors.
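The rule that modes of the same process share the same specifier can be checked mechanically. The sketch below is only an illustration under stated assumptions: the record type `ModeSemantics`, the field names `process_index` and `process_specifier`, and the mode identifiers are hypothetical, since the annotation scheme does not name them here; only `<DynTexId>` and `<ObjectMedium>` and the example values (water, sea, car, traffic) come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeSemantics:
    dyntex_id: str          # <DynTexId> of the mode
    process_index: int      # hypothetical: process the mode belongs to
    discrete: bool          # discrete (True) or continuous (False) mode
    object_medium: str      # <ObjectMedium>: medium or object label
    process_specifier: str  # hypothetical name for the semantic specifier


def inconsistent_processes(modes):
    """Return the process indices whose modes disagree on the specifier."""
    specifiers = defaultdict(set)
    for mode in modes:
        specifiers[mode.process_index].add(mode.process_specifier)
    return sorted(p for p, values in specifiers.items() if len(values) > 1)


# Two modes of a sea-wave process and one mode of a traffic process.
# The identifiers are placeholders, not real DynTex ids.
modes = [
    ModeSemantics("mode-a", 1, False, "water", "sea"),
    ModeSemantics("mode-b", 1, False, "water", "sea"),
    ModeSemantics("mode-c", 2, True, "car", "traffic"),
]
print(inconsistent_processes(modes))  # prints []
```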
A Microsoft Access database with a specially designed interface is made
available for quick browsing of the data set and for convenient selection of
dynamic textures of interest. A snapshot of the browsing interface can be seen
in Figure .
Sequences can be played in an embedded player for a quick overview of their content.
The sequences are annotated according to the scheme described in
Section . The annotations can be accessed through the
browser mentioned above and are also made available as XML files.
An example of the XML description for the wave sequence of Figure is shown below:
<Location>Castricum, The Netherlands</Location>
mode 1: the `turbulence'; mode 2 : the incoming wave
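Because the annotations are distributed as XML, they can be read with any standard XML parser. The following sketch uses Python's `xml.etree.ElementTree`. Note that the document layout in `SAMPLE` is an assumption made for illustration: the root element `<Sequence>`, the wrapper element `<Mode>`, the identifier values and the `<TrajectoryType>` values are hypothetical placeholders, and only the `<Location>` element and the descriptor names are taken from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; the real DynTex XML files may be structured differently.
SAMPLE = """
<Sequence>
  <Location>Castricum, The Netherlands</Location>
  <Mode>
    <DynTexId>example-1</DynTexId>
    <TrajectoryType>irregular</TrajectoryType>
  </Mode>
  <Mode>
    <DynTexId>example-2</DynTexId>
    <TrajectoryType>directed straight</TrajectoryType>
  </Mode>
</Sequence>
"""


def read_annotation(xml_text):
    """Parse one annotation into a dict with sequence and mode descriptors."""
    root = ET.fromstring(xml_text.strip())
    modes = [
        # One dict per mode, mapping descriptor name to its text value.
        {child.tag: (child.text or "").strip() for child in mode}
        for mode in root.iter("Mode")
    ]
    location = root.findtext("Location", default="").strip()
    return {"Location": location, "modes": modes}


annotation = read_annotation(SAMPLE)
print(annotation["Location"])  # prints Castricum, The Netherlands
print(len(annotation["modes"]))  # prints 2
```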
Using the browser, database selections can be compiled based on all variables of the annotation scheme. For example, one can select all sequences with more than one texture process, all sequences containing disturbances, all close-ups, all sequences with irregular trajectories or any other specific combination of structural properties.
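The same kind of selection can also be expressed outside the browser, for instance on annotations loaded from the XML files. The sketch below is not the database interface itself; it assumes each sequence is held as a Python dictionary, and the keys `n_processes`, `disturbance`, `shot_type` and `modes`, as well as the three sample records, are hypothetical.

```python
def select(sequences, *predicates):
    """Return the sequences that satisfy every predicate."""
    return [seq for seq in sequences if all(pred(seq) for pred in predicates)]


def has_multiple_processes(seq):
    return seq["n_processes"] > 1


def has_disturbance(seq):
    return seq["disturbance"]


def is_close_up(seq):
    return seq["shot_type"] == "close-up"


def has_irregular_trajectory(seq):
    # A sequence qualifies if at least one of its modes is irregular.
    return any(mode["TrajectoryType"] == "irregular" for mode in seq["modes"])


# Made-up records for illustration only.
SEQUENCES = [
    {"id": "a", "n_processes": 2, "disturbance": False, "shot_type": "in context",
     "modes": [{"TrajectoryType": "oscillation"}, {"TrajectoryType": "irregular"}]},
    {"id": "b", "n_processes": 1, "disturbance": True, "shot_type": "close-up",
     "modes": [{"TrajectoryType": "directed straight"}]},
    {"id": "c", "n_processes": 1, "disturbance": False, "shot_type": "close-up",
     "modes": [{"TrajectoryType": "irregular"}]},
]

close_up_irregular = select(SEQUENCES, is_close_up, has_irregular_trajectory)
print([seq["id"] for seq in close_up_irregular])  # prints ['c']
```

Combining predicates with a logical AND corresponds to the "specific combination of structural properties" mentioned above; other combinations (e.g. OR) would need a different `select`.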