Annotator utility for Photobook
-------------------------------

Annotate is a utility program which creates FRAMER media databases
suitable for use with Photobook. The usage is

    annotate <prefix>

where <prefix> specifies the file(s) which will be used to build the
database. The environment variable FRAMER_SEARCH_PATH must be a
comma-separated list of FRAMER data directories, e.g.

    framer,.ph-central/lib/common/data

The database to be created or modified is the first "radix.fra" found
in the directories in this list.

The files which annotate reads are:

1. <prefix>.spec

   The format of this file is

       <specifications>

   where a specification is one of:

   i) An image annotation field. These have the form

          <type> <name> [=<value>]

      The type is one of

          text      A string field.
          symbol    A searchable string field (a "primitive").
          symbols   A list of primitives (possibly empty).
          data      A data vector specification (for a search metric
                    or display mode).
          ignore    (Causes the annotator to ignore this annotation)

      Every frame listed in the index file will have this annotation,
      and the value of the field must either be initialized or appear
      on a line by itself in the <prefix>.data file (see below). The
      optional value initializer can be used to avoid redundant
      entries in the <prefix>.data file.

      The value initializer is a string with possible file or variable
      references. A file reference has the form `filename`. The
      contents of the file are read in as a string, which may contain
      file or variable references which are further expanded. A
      variable reference has the form $(field), where "field" is the
      name of an annotation field occurring before this one in the
      <prefix>.spec file. The reference is replaced with the field
      string, which may also contain file or variable references.
      (There is no detection of circular references, so be careful!)
      The two special variables $# and $* are replaced by the frame
      number and frame name, respectively. Frame numbers start at one.
      To get a dollar sign, use $$.
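
The variable expansion described above can be sketched in Python as
follows. This is only an illustration of the rules as stated, not the
code annotate uses: the function name "expand" and its arguments are
hypothetical, file references and range suffixes are not handled, and
the depth limit is an addition of this sketch (annotate itself does not
detect circular references).

```python
import re

# Matches $$, $#, $* and $(field). Hypothetical helper, not part of annotate.
_TOKEN = re.compile(r"\$\$|\$#|\$\*|\$\(([^)]+)\)")


def expand(value, fields, frame_number, frame_name, depth=0, max_depth=32):
    """Expand $$, $#, $* and $(field) references in an initializer string.

    fields maps earlier field names to their (unexpanded) strings.
    The depth limit is an assumption of this sketch; it guards against
    circular references, which annotate does not detect.
    """
    if depth > max_depth:
        raise ValueError("references nested too deeply (circular reference?)")

    def substitute(match):
        token = match.group(0)
        if token == "$$":
            return "$"                  # literal dollar sign
        if token == "$#":
            return str(frame_number)    # frame numbers start at one
        if token == "$*":
            return frame_name
        # $(field): the field string may itself contain references.
        return expand(fields[match.group(1)], fields,
                      frame_number, frame_name, depth + 1, max_depth)

    return _TOKEN.sub(substitute, value)


if __name__ == "__main__":
    fields = {"place": "Boston.$#"}
    print(expand("$(place) in $*", fields, 3, "foo3"))
```

Substituted text is not rescanned, so the "$" produced by "$$" is never
mistaken for the start of another reference.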

      Ranges can be applied to file or variable references to select
      arbitrary portions of their contents. The form is

          <ref>:[<start>]-[<end>][-<step>]

      where <ref> is the reference, "start" is an optional starting
      index, "end" is an optional ending index, and "step" is an
      optional increment factor. At least one of "start" or "end" must
      be specified; "start" defaults to zero, "end" defaults to the
      maximum index, and "step" defaults to one. When applied to file
      references, a range selects lines. When applied to variables, a
      range selects characters. Numbering starts at zero.

      Examples:

          text place
          text place =Boston
          text place =Boston.$#       (Boston.1, Boston.2, ...)
          symbol city =$(place):-5    (just gets "Boston")
          symbols objects             (data file can contain lines like
                                      "car street building",
                                      "building door", or "chair table")

      Data vector specifications are special because they do not
      introduce explicit annotations. Rather, they specify the
      existence of a data vector for that frame in the
      PHOTOBOOK_DATA_DIR, and specify its type. The only type
      available in version 5.0 is "ptr N double", which is a vector of
      N double-precision floating-point numbers. Data vectors are
      usually specified using initializers, since every frame in the
      database uses the same type. Alternatively, the type can be
      placed in the <prefix>.data file, so that each frame can have a
      different sized vector.

   ii) A display field specification. These are multi-line; they have
      the form

          display <disp-mode-name>
          class <disp-mode-class>
          width <w>
          height <h>
          channels <c>
          <field> <value>
          .
          .
          end

      <disp-mode-name> is the name of the display mode as it will
      appear in Photobook. <disp-mode-class> is the class of the
      display mode, i.e. what Photobook actually does to produce it.
      See the Photobook documentation for the list of display mode
      classes. If not specified, <disp-mode-class> defaults to
      <disp-mode-name>. Names are case-sensitive; all display mode
      classes are in lower case.

      A database must have at least one display mode; it is suggested
      that the first one be of class "image", which instructs
      Photobook to simply read the image in from a file. The width,
      height, etc. specs can appear in any order and are optional.
      "width" and "height" default to 128, and "channels" defaults
      to 1. Other fields are defined by the particular display mode
      class.

   iii) A search metric specification. These are multi-line; they have
      the form

          search <metric-name>
          class <metric-class>
          <field> <value>
          .
          .
          end

      As with display modes, <metric-name> is the name of the metric
      as it will appear in Photobook, and <metric-class> is the class
      of the metric. See the Photobook documentation for the list of
      metric classes. The fields and values recognized/required are
      metric dependent. For example, the euclidean metric requires the
      fields "vector-size" and "field", and the fields "from" and "to"
      are optional.

      Example:

          search picture-ev
          class euclidean
          field picture-ev
          vector-size 50
          from 0
          to 39
          end

   iv) A labeling specification. These are multi-line; they have the
      form

          deflabel
          <label>
          .
          .
          end

      or

          deftree
          <tree>
          .
          .
          end

      and define the labels which appear in the labeling dialog menu
      or the similarity trees used in labeling, respectively. A
      similarity tree names a file in the database directory which
      contains a tree. An example tree file is:

          3
          1
          : 2 0
          2
          : 2 1

      which specifies (in postfix notation) the tree

                1
               / \
              0   2
             / \
            3   1

      The lines without colons denote leaves; a colon is followed by
      the number of nodes to join and the value of the parent node.
      For annotation, the leaf values are image indices, starting
      at 1; the interior node values are ignored.

2. <prefix>.index

   If the <prefix>.spec file specifies any annotation fields (type i)
   then the <prefix>.index file must exist. Alternatively, the
   <prefix>.index file can be used to add new (empty) frames to the
   database.

   This file contains one frame name per line, and specifies the order
   in which the annotations in the <prefix>.data file are read.

   A special range specification can be given as part of an index
   line. These have the form $(<start>-<stop>[-<step>]), and a line
   may contain multiple ranges (which are applied combinatorially).
   Ranges are *not* equivalent to a list of frame names, because all
   frames indexed by the range are given the same annotation values;
   i.e. they are treated as one frame. For example,

       foo1
       foo2

   creates two frames which require separate entries in the
   <prefix>.data file, while

       foo$(1-2)

   creates two frames foo1 and foo2 which are assigned to the same
   entry in the <prefix>.data file. The $# and $* variables still
   refer to the actual frame number and name, so the frames generated
   by a range can be distinguished by including these variables in the
   annotations.

3. <prefix>.data

   If the <prefix>.spec file contains any annotation specifiers
   (type i) which do not have value initializers, then the
   <prefix>.data file is required. This file contains a sequence of
   lines, one line for each annotation for each frame. The annotations
   for frame 1 come first, followed by the annotations for frame 2,
   etc. The frame ordering is defined by the <prefix>.index file, and
   the annotation ordering is defined by the <prefix>.spec file. File
   and variable references are allowed (see the section on value
   initializers). "symbol" type annotations must not be blank
   ("symbols" annotations, however, may).

Unless otherwise specified, all names can consist of any printing
characters besides #, ^, or / (which are FRAMER meta-characters).

tpminka@media.mit.edu 12/12/94
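
As an illustration of the index range syntax in section 2, the
following Python sketch expands one index line into the frame names it
generates. The function name "expand_index_line" is hypothetical, and
several details are assumptions of this sketch rather than documented
behavior: indices are non-negative integers, and when a line holds
several ranges the rightmost one varies fastest. The inclusive stop
follows from the foo$(1-2) example.

```python
import itertools
import re

# $(<start>-<stop>[-<step>]) with non-negative integer fields (assumed).
_RANGE = re.compile(r"\$\((\d+)-(\d+)(?:-(\d+))?\)")


def expand_index_line(line):
    """Return the frame names generated by one <prefix>.index line.

    All names returned for a single line share one entry in the
    <prefix>.data file.
    """
    literals = []   # text between (and around) the ranges
    choices = []    # one list of index strings per range
    pos = 0
    for match in _RANGE.finditer(line):
        literals.append(line[pos:match.start()])
        start, stop = int(match.group(1)), int(match.group(2))
        step = int(match.group(3)) if match.group(3) else 1
        # The stop index is inclusive: foo$(1-2) gives foo1 and foo2.
        choices.append([str(n) for n in range(start, stop + 1, step)])
        pos = match.end()
    literals.append(line[pos:])

    names = []
    # Multiple ranges combine as a Cartesian product.
    for combo in itertools.product(*choices):
        pieces = [literals[0]]
        for number, literal in zip(combo, literals[1:]):
            pieces.append(number)
            pieces.append(literal)
        names.append("".join(pieces))
    return names


if __name__ == "__main__":
    print(expand_index_line("foo$(1-2)"))
```

A line without any range yields a single name, so ordinary index lines
pass through unchanged.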