The DynTex database Homepage

Annotation of the DynTex Database

We have annotated the DynTex database by means of a description scheme based on the physical texture processes occurring in the sequences.
The descriptors are divided into three categories:
content management descriptors, structural descriptors and semantic descriptors.
The annotations serve at least three purposes:
(i) they can assist users in retrieving particular dynamic textures; for example, when looking for trees in heavy wind, we may filter by selecting oscillating motions with large amplitude; or when looking for turbulent water, we select continuous texture with an irregular trajectory type and set the medium to water;
(ii) similarly, they allow users to quickly tailor test sets for particular research purposes (a number of tools to assist in this process are described in Section [*]); and
(iii) the annotations themselves can serve as a ground truth for various problems, e.g. automatic texture characterization and texture recognition.
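
As a minimal sketch of the retrieval use case in (i), the two example searches can be written as simple filters over annotation records. The record layout, the mode ids and the values below are made up for illustration; only the descriptor names are taken from the annotation scheme described on this page.

```python
# Hypothetical annotation records: one dict per dynamic texture mode.
# Keys follow the descriptor names of the annotation scheme; the ids
# and values are invented for illustration.
modes = [
    {"DynTexId": 1, "TrajectoryType": "oscillation", "Amplitude": "large",
     "Discrete": False, "ObjectMedium": "vegetation", "Process": "tree"},
    {"DynTexId": 2, "TrajectoryType": "irregular", "Amplitude": "medium",
     "Discrete": False, "ObjectMedium": "water", "Process": "river"},
    {"DynTexId": 3, "TrajectoryType": "oscillation", "Amplitude": "small",
     "Discrete": True, "ObjectMedium": "leaf", "Process": "tree"},
]


def select(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


# "Trees in heavy wind": oscillating motion with large amplitude.
heavy_wind = select(modes, TrajectoryType="oscillation", Amplitude="large")

# "Turbulent water": continuous texture, irregular trajectory, medium water.
turbulent_water = select(modes, Discrete=False,
                         TrajectoryType="irregular", ObjectMedium="water")

print([m["DynTexId"] for m in heavy_wind])       # [1]
print([m["DynTexId"] for m in turbulent_water])  # [2]
```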

Dynamic Texture Modes

We found that, to annotate dynamic texture sequences unambiguously, we need some additional terminology. First, a sequence may contain more than one dynamic texture. We refer to each of these as a dynamic texture process; for example, waving grass in front of waving water gives two processes. When processes visually overlap, we say that the processes interfere.

Furthermore, texture processes are often themselves composed of a mix of behaviors. For instance, a sea wave may consist of an oscillating motion and have a turbulent foam part at the same time. To be able to annotate quantities such as <SpeedFrequency> and <TemporalRegularity> unambiguously, i.e. without having to make arbitrary choices about which behavior is the most important or dominant, we introduce the notion of a dynamic texture mode. The dynamic texture processes are then described as compositions of visually relevant modes, where each mode can be considered a dynamic texture by itself but consists of only a single type of dynamics.

A texture mode may also be either continuous or discrete. Discrete textures have directly discernible parts, e.g. a group of ants crawling around, or leaves fluttering in the wind. Continuous texture modes have continuous media, e.g. a waving flag, or are practically indiscernible from continuous media, e.g. a waving field of grass seen from considerable distance.

Figure [*] shows different dynamic textures of the DynTex database with different numbers of processes and modes.

Figure: (a) A straw field sequence (1 process, 1 mode). (b) A wave sequence (1 process, 2 modes). (c) The grass and river sequence (2 processes, 1 mode each).
Image 54ac110 (a)
Image 54ab110 (b)
Image 54pf210 (c)
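
The hierarchy of sequences, processes and modes can be sketched as a small data model. This is an illustrative sketch only: the class and field names are assumptions, and the three instances simply mirror the process and mode counts given in the figure caption.

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    """A dynamic texture mode: a single type of dynamics."""
    description: str
    discrete: bool = False  # continuous unless flagged otherwise


@dataclass
class Process:
    """A dynamic texture process, composed of one or more modes."""
    description: str
    modes: list = field(default_factory=list)


@dataclass
class Sequence:
    """A video sequence containing one or more texture processes."""
    name: str
    processes: list = field(default_factory=list)

    @property
    def n_processes(self):
        return len(self.processes)

    @property
    def n_modes(self):
        # Total number of modes over all processes in the sequence.
        return sum(len(p.modes) for p in self.processes)


straw = Sequence("straw field", [Process("field", [Mode("waving straw")])])
wave = Sequence("wave", [
    Process("sea", [Mode("turbulence"), Mode("incoming wave")]),
])
grass_river = Sequence("grass and river", [
    Process("field", [Mode("waving grass")]),
    Process("river", [Mode("waving water")]),
])

for seq in (straw, wave, grass_river):
    print(seq.name, seq.n_processes, seq.n_modes)
```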

Content Management Descriptors

The first group of descriptors applies to the video sequence as a whole. They include administrative descriptors such as a unique identifier, location and date. The <Protocol> variable indicates whether the quality of the sequence is sufficient for inclusion in the DynTex golden set; if not, the reason is described in the <Deviation> field. Next, there are descriptors describing the shooting conditions, e.g. indicating whether the shot was taken outdoors and whether camera motion or a disturbance (e.g. a duck swimming into the shot) was present. Finally, a number of important global properties are described: the shot type (close-up or in context), the number of dynamic texture processes visible in the sequence, and the total number of dynamic texture modes (see Section [*]).

The complete list of management descriptors is shown in Table [*].

Table: Content Management descriptors. Brackets indicate default values.
Descriptor          Values           Description
<SequenceId>        number           unique sequence identifier
<Name>              text             sequence name, corresponds to filename
<Location>          text             location (city, country) of shot
<Date>              text             date shot was taken
<Protocol>          [Golden]/Extra   flags membership of the golden set
<Deviation>         text             deviation if not a member of the golden set
Acquisition conditions
<Outdoor>           [true]/false     flags if shot was taken outdoors
<ArtificialLight>   true/[false]     flags if main light source is artificial
<CameraMotion>      true/[false]     flags if camera motion is present
<Disturbance>       true/[false]     flags if a disturbance is present
<Sound>             true/[false]     only if relevant to texture
Global properties
<ShotType>          Closeup/Context  type of shot
<nProcesses>        number           number of dynamic texture processes
<nModes>            number           total number of dynamic texture modes
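
The defaults and the dependency between <Protocol> and <Deviation> can be illustrated with a small record type. This is a sketch under stated assumptions: the Python field names are invented, the example records are hypothetical, and the check that a sequence cannot have fewer modes than processes assumes that every process consists of at least one mode.

```python
from dataclasses import dataclass


@dataclass
class ContentManagement:
    """Sequence-level descriptors; defaults follow the bracketed values."""
    sequence_id: int
    name: str
    shot_type: str                  # Closeup/Context (no default documented)
    n_processes: int
    n_modes: int
    location: str = ""
    date: str = ""
    protocol: str = "Golden"        # [Golden]/Extra
    deviation: str = ""             # reason when not in the golden set
    outdoor: bool = True            # [true]/false
    artificial_light: bool = False  # true/[false]
    camera_motion: bool = False     # true/[false]
    disturbance: bool = False       # true/[false]
    sound: bool = False             # true/[false]

    def problems(self):
        """Return a list of consistency problems (empty if none)."""
        found = []
        if self.protocol not in ("Golden", "Extra"):
            found.append("Protocol must be Golden or Extra")
        if self.protocol == "Extra" and not self.deviation:
            found.append("Extra sequence needs a Deviation text")
        if self.shot_type not in ("Closeup", "Context"):
            found.append("ShotType must be Closeup or Context")
        # Assumption: every process consists of at least one mode.
        if self.n_modes < self.n_processes:
            found.append("nModes cannot be smaller than nProcesses")
        return found


# Hypothetical records, not actual DynTex entries.
golden = ContentManagement(1, "example_a", "Context", 1, 2)
extra = ContentManagement(2, "example_b", "Closeup", 1, 1, protocol="Extra")
print(golden.problems())  # []
print(extra.problems())   # ['Extra sequence needs a Deviation text']
```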

Structural Descriptors

Every dynamic texture mode has a unique <DynTexId>. If a mode is part of a texture process in which it is superimposed on another mode, this is indicated by the <Superposition> descriptor. If a mode visually interferes with a mode or modes from another process, this is indicated by the <Interference> descriptor. The global texture mode descriptors are listed in Table [*]. Next we turn to the visual description of the modes.

Table: Global texture mode descriptors
Descriptor        Values         Description
<DynTexId>        number         unique mode id
<SequenceId>      number         id of embedding sequence
<Weak>            true/[false]   flags weak dynamic texture
<Discrete>        true/[false]   flags discrete process
Interaction Descriptors
<Superposition>   true/[false]   in superposition with other mode(s)
<Interference>    true/[false]   interference with other process
Semantic Descriptors
<ObjectMedium>    text           see Section [*]
<Process>         text           see Section [*]

In Section [*] we saw that (strong) dynamic textures become dynamic due to intrinsic changes in the spatial pattern. The intrinsic changes can be divided into two main types. By far the most common cause of change is motion of the constituent objects or of the medium. Motion leads to direct change, and it can also indirectly lead to a changed appearance as a result of a new orientation towards light sources and camera. An example of such motion-induced appearance change is the light reflection on rippling water surfaces. The second, much less common cause of intrinsic pattern change is a direct change of appearance without any underlying motion. This can be the result either of the medium or object changing its luminous properties (e.g. a grid of flickering LEDs in a server room), or of constituent parts changing their size or shape (e.g. a flock of birds flapping their wings).

For the annotation of the temporal dynamics, we found that for both types of pattern change a small number of qualitative behavior types suffices to characterize most texture modes. First, we discern directed (i.e. unidirectional) change and oscillating change. These change types apply mainly to dynamics that are constrained to some extent, e.g. a fluttering leaf connected to its branch, or a car moving along the highway. Unconstrained, free change is usually described well as "irregular". For motion we work this out more precisely by means of a <TrajectoryType> variable with the following possible values: still, oscillation, directed straight, directed curved and irregular. Note that we discern straight and curved directed change. Naturally, the other types could also be qualified further, but we did not find this useful. Similarly, the <AppearanceChange> variable takes the values no/little, oscillation, directed (e.g. going from one color to another, without returning to the original color), and irregular.

For convenient searching, we also found it useful to introduce a <MainClass> variable, which provides an overall characterization of the dynamic texture mode by essentially summarizing the change type and the discreteness (or continuity) of the process. We chose the following classes:

  1. Waving/Oscillating Motion (continuous)
  2. Directed Motion (continuous)
  3. Turbulent/irregular Motion (continuous)
  4. Oscillating Motions (discrete)
  5. Directed Motions (discrete)
  6. Irregular Motions (discrete)
  7. Direct Appearance Change

Note that, just as for motion, the direct appearance changes could be further divided using the values of the change type (i.e. <AppearanceChange>) and <Discrete> variables, but since these modes are so rarely visually relevant, we have kept these in a single class.
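
One possible way to derive the main class from the change type and the discreteness of a mode is sketched below. The exact rule is not spelled out in the text, so the mapping is an assumption: a moving mode is classified by its trajectory type, and a still mode with an appearance change falls into class 7. The function name and the value spellings (taken from the structural descriptor table) are illustrative.

```python
# Trajectory type -> (class number if continuous, class number if discrete).
# Class numbers refer to the numbered list of main classes above.
MOTION_CLASSES = {
    "oscillation": (1, 4),
    "directedStraight": (2, 5),
    "directedCurved": (2, 5),
    "irregular": (3, 6),
}


def main_class(trajectory_type, appearance_change="none", discrete=False):
    """Return the main class number (1-7) for a dynamic texture mode.

    Assumed rule: motion takes precedence; direct appearance change
    (class 7) is used only for modes without motion.
    """
    if trajectory_type in MOTION_CLASSES:
        continuous_cls, discrete_cls = MOTION_CLASSES[trajectory_type]
        return discrete_cls if discrete else continuous_cls
    if trajectory_type == "still" and appearance_change != "none":
        return 7
    raise ValueError("no main class for this combination of values")


print(main_class("oscillation"))                      # waving flag
print(main_class("directedStraight", discrete=True))  # cars on a highway
print(main_class("still", "irregular"))               # flickering LEDs
```

Under this assumed rule, the three calls yield classes 1, 5 and 7 respectively.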

Table: Texture mode structural descriptors
Descriptor             Values
Temporal Dynamics
<MainClass>            see text
<TrajectoryType>       still/oscillation/directedStraight/directedCurved/irregular
<AppearanceChange>     none/directed/oscillation/irregular
<SpeedFrequency>       low/[medium]/high
<Amplitude>            small/[medium]/large
<TemporalRegularity>   low/[medium]/high
Spatial Variation
<SpatialRegularity>    low/[medium]/high
<SpatialScale>         fine/[medium]/coarse
<SpatialContrast>      low/[medium]/high
<Density>              sparse/[medium]/dense

For each dynamic texture mode the visual structure is annotated by its main class and the descriptors listed in Table [*]. The variables <SpeedFrequency> and <Amplitude> describe the speed and extent of the modal dynamics; <TemporalRegularity> describes to what extent the dynamics in terms of the former two variables are regular in time. Similarly, <SpatialRegularity> measures how the dynamics are varying between parts of the texture: do all parts have similar dynamics or is there a lot of spatial variation? The variables <SpatialScale> and <SpatialContrast> describe the scale and extent of spatial variation between the moving parts of the texture; <Density> applies only to discrete modes. For discrete textures where the parts are clearly separated (i.e. with low density), <SpatialContrast> refers to contrast between the discrete objects and their background.
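
The value sets and defaults of the ordinal structural descriptors lend themselves to a simple completion and validation step. The following is a minimal sketch, assuming annotations are held as dictionaries keyed by descriptor name; the function name `complete` is hypothetical.

```python
# Allowed values per ordinal descriptor; "medium" is the default for each.
STRUCTURAL_VALUES = {
    "SpeedFrequency": ("low", "medium", "high"),
    "Amplitude": ("small", "medium", "large"),
    "TemporalRegularity": ("low", "medium", "high"),
    "SpatialRegularity": ("low", "medium", "high"),
    "SpatialScale": ("fine", "medium", "coarse"),
    "SpatialContrast": ("low", "medium", "high"),
    "Density": ("sparse", "medium", "dense"),
}


def complete(annotation, discrete):
    """Fill in defaults and validate a mode's structural descriptors."""
    unknown = set(annotation) - set(STRUCTURAL_VALUES)
    if unknown:
        raise ValueError(f"unknown descriptors: {sorted(unknown)}")
    result = {}
    for name, allowed in STRUCTURAL_VALUES.items():
        if name == "Density" and not discrete:
            # Density applies only to discrete modes.
            if name in annotation:
                raise ValueError("Density applies only to discrete modes")
            continue
        value = annotation.get(name, "medium")
        if value not in allowed:
            raise ValueError(f"{name}: {value!r} not in {allowed}")
        result[name] = value
    return result


print(complete({"Amplitude": "large"}, discrete=False))
print(complete({"Density": "sparse"}, discrete=True))
```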

Semantic Descriptors

Each mode is also described by two semantic descriptors. The <ObjectMedium> variable is used foremost to identify the main constituent of the physical dynamic texture. To this end we must make a distinction between continuous and discrete modes. For continuous textures we provide a semantic description of the continuous medium. For example, for sea waves the semantic category is water. For discrete textures we provide category labels for the main texture objects, e.g. for cars on a highway this label is car.

For both types of textures we provide an additional semantic specifier, the <Process> variable, which allows us to identify the embedding process more specifically. In the examples above, the associated specifiers are sea and traffic, respectively. Note that modes from the same process will always share the same value for this variable. To describe the processes we only use nouns; the types of change (e.g. "waving" or "rippling") are already characterized sufficiently by the structural descriptors.

The values currently used to annotate the DynTex textures for the <ObjectMedium> and <Process> descriptors are listed below in Table [*]. In some cases more than one value may be applicable.

Table: Semantic Categories
Medium Categories
textile, water, vegetation, smoke, steam, fire, cloud,
foam, spray, hair, fur, paper
Object Categories
droplet, branch/stem, leaf, needle, flower, car, bird, fish,
flame, cloud, person, light, tentacle, insects, bubbles, CD,
streaks, wings
Process Categories
sea, field, river, shower, flag, tree, shrub/plant, road,
stream, waterfall, fountain, boiling, shadow, boat, aquarium,
curtain, carpet, cloth, candle, sunblind, toilet, pond, source,
fog, rain, escalator, windmill, net, wheel, anemone, laundry,
server, beer, traffic, flock
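
The category lists can serve as controlled vocabularies when checking annotations. The sketch below assumes that semantic values are stored as lists (since more than one value may be applicable) and that continuous modes draw on the medium categories while discrete modes draw on the object categories; the function name is hypothetical.

```python
MEDIUM = {"textile", "water", "vegetation", "smoke", "steam", "fire",
          "cloud", "foam", "spray", "hair", "fur", "paper"}
OBJECT = {"droplet", "branch/stem", "leaf", "needle", "flower", "car",
          "bird", "fish", "flame", "cloud", "person", "light", "tentacle",
          "insects", "bubbles", "CD", "streaks", "wings"}
PROCESS = {"sea", "field", "river", "shower", "flag", "tree", "shrub/plant",
           "road", "stream", "waterfall", "fountain", "boiling", "shadow",
           "boat", "aquarium", "curtain", "carpet", "cloth", "candle",
           "sunblind", "toilet", "pond", "source", "fog", "rain",
           "escalator", "windmill", "net", "wheel", "anemone", "laundry",
           "server", "beer", "traffic", "flock"}


def check_semantics(object_medium, process, discrete):
    """Return a list of vocabulary problems for one mode (empty if none)."""
    vocabulary = OBJECT if discrete else MEDIUM
    kind = "object" if discrete else "medium"
    problems = [f"unknown {kind} category: {value}"
                for value in object_medium if value not in vocabulary]
    problems += [f"unknown process category: {value}"
                 for value in process if value not in PROCESS]
    return problems


print(check_semantics(["water"], ["sea"], discrete=False))   # []
print(check_semantics(["car"], ["traffic"], discrete=True))  # []
print(check_semantics(["car"], ["traffic"], discrete=False))
```

The last call reports a problem because car is an object category and the mode is declared continuous.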

Querying and Browsing DynTex

A Microsoft Access database with a specially designed interface is made available for quick browsing of the data set and for convenient selection of dynamic textures of interest. A snapshot of the browsing interface can be seen in Figure [*].
Sequences can be played in an embedded player for a quick overview of their content.

Figure: Screenshot of the DynTex browser.

The sequences are annotated according to the scheme described in Section [*]. The annotations can be accessed through the browser mentioned above and are also made available as XML files.
An example of the XML description for the wave sequence of Figure [*] is shown below:

<Location>Castricum, The Netherlands</Location>
mode 1: the 'turbulence'; mode 2: the incoming wave
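
Such XML annotations can be read with a standard XML parser. The sketch below uses Python's xml.etree.ElementTree on a hypothetical document: only the <Location> element and the two mode descriptions come from the example above, while the <Sequence> and <Mode> wrappers, the id attribute and the <Description> element are assumptions made for illustration and may not match the actual file layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation document. Only <Location> is taken from the
# example above; all other element and attribute names are assumed.
SAMPLE = """
<Sequence>
  <Location>Castricum, The Netherlands</Location>
  <Mode id="1"><Description>the turbulence</Description></Mode>
  <Mode id="2"><Description>the incoming wave</Description></Mode>
</Sequence>
"""

root = ET.fromstring(SAMPLE)

# Text content of a direct child element.
location = root.findtext("Location")

# Map each (assumed) mode id to its description.
mode_descriptions = {m.get("id"): m.findtext("Description")
                     for m in root.findall("Mode")}

print(location)
print(mode_descriptions)
```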

Using the browser, database selections can be compiled based on all variables of the annotation scheme. For example, one can select all sequences with more than one texture process, all sequences containing disturbances, all close-ups, all sequences with irregular trajectories or any other specific combination of structural properties.
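
The selections mentioned above correspond to straightforward database queries. The following sketch uses an in-memory SQLite database as a stand-in; the table and column names are assumptions modelled on the descriptor names and are not the actual schema of the Access database, and the rows are made-up examples.

```python
import sqlite3

# In-memory stand-in for the annotation database (assumed schema).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sequences (SequenceId INTEGER PRIMARY KEY, Name TEXT,
                        ShotType TEXT, Disturbance INTEGER,
                        nProcesses INTEGER);
CREATE TABLE modes (DynTexId INTEGER PRIMARY KEY, SequenceId INTEGER,
                    TrajectoryType TEXT);
INSERT INTO sequences VALUES (1, 'seq_a', 'Closeup', 0, 1),
                             (2, 'seq_b', 'Context', 1, 2),
                             (3, 'seq_c', 'Context', 0, 1);
INSERT INTO modes VALUES (10, 1, 'oscillation'), (20, 2, 'irregular'),
                         (21, 2, 'oscillation'), (30, 3, 'irregular');
""")


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in con.execute(sql)]


# All sequences with more than one texture process.
multi_process = names(
    "SELECT Name FROM sequences WHERE nProcesses > 1 ORDER BY Name")
# All sequences containing disturbances.
disturbed = names(
    "SELECT Name FROM sequences WHERE Disturbance = 1 ORDER BY Name")
# All close-ups.
closeups = names(
    "SELECT Name FROM sequences WHERE ShotType = 'Closeup' ORDER BY Name")
# All sequences with at least one mode that has an irregular trajectory.
irregular = names("""
    SELECT DISTINCT s.Name FROM sequences s
    JOIN modes m ON m.SequenceId = s.SequenceId
    WHERE m.TrajectoryType = 'irregular' ORDER BY s.Name
""")

print(multi_process, disturbed, closeups, irregular)
```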
