Configuring the labeled data collection

The labeled document editor is a text editor in which you can manually label documents. Before you can label a document, you must configure the collection that contains it.

About this task

You can configure a collection by specifying the following options:
  • The annotation types (view names and field names) of the annotated text
  • Whether you want the editor to automatically recognize word boundaries

    The Detect Word boundaries automatically option is enabled by default.

    Tip: If this option is enabled, you do not need to select a span of text. Place the cursor within a word, and the editor infers the word boundaries when it applies a label or deletes an annotation.
  • The language
  • The default annotation type, which simplifies the process of manually annotating a document
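
The word-boundary inference described in the tip above can be pictured with a minimal Python sketch. It is an illustration only, not the editor's implementation: the helper name infer_word_span is hypothetical, and the sketch assumes that a word is a run of alphanumeric characters, whereas the editor's detection depends on the language that you configure.

```python
from typing import Tuple


def infer_word_span(text: str, cursor: int) -> Tuple[int, int]:
    """Return (start, end) of the word around the cursor; end is exclusive.

    Hypothetical helper: assumes words are runs of alphanumeric characters.
    """
    if not 0 <= cursor <= len(text):
        raise ValueError("cursor is outside the text")
    # Walk left while the character before the span is part of a word.
    start = cursor
    while start > 0 and text[start - 1].isalnum():
        start -= 1
    # Walk right while the character at the end of the span is part of a word.
    end = cursor
    while end < len(text) and text[end].isalnum():
        end += 1
    return start, end


text = "Contact John Smith today"
start, end = infer_word_span(text, 10)  # cursor placed inside "John"
print(text[start:end])                  # John
```

With the cursor at offset 10 and no selection, the sketch expands to the span 8-12, which is the whole word that a label would be applied to.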

Procedure

  1. Right-click the appropriate labeled collection directory, and click Labeled Data Collection > Configure.
  2. In the Preferences dialog, define the annotation types.
    1. Expand the labeled collection node, and click Annotation types.
    2. Click New.
    3. Type a name in the View Name and Field Name fields.
      Remember: Ensure that the view and field (attribute) names that you specify are identical to the ones that are defined in your AQL code.

      For example, if you create an annotation type and specify Person as the view name and name as the field name, ensure that Person is defined as the view name and name is defined as the field name in the AQL code.

    4. Specify a shortcut key.
      Note: You can define keyboard shortcuts for up to 10 annotation types. A keyboard shortcut is a key combination that you can use in the editor to quickly label selected text. For example, if the keyboard shortcut is 0 for annotation type Person.name, then the combination Ctrl+0 labels the selected portion of text with the annotation type Person.name.

      To label a span of text with an annotation type that has no keyboard shortcut, use one of the following approaches:

      • From the Annotate As menu, choose the annotation type.
      • On the labeled data collection configuration page, set the annotation type as the default annotation type. Then, in the labeled document editor, select a span of text and press Ctrl+Enter to apply the default annotation type to the selection.
    5. Repeat the previous steps until you define all the annotation types.
    6. Click Save.
  3. Specify the general properties.
    1. Click General in the left navigation pane.
    2. Specify whether you want the editor to recognize word boundaries. If you select this option, you must specify the language of the data collection.
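
The relationship between annotation types, shortcut keys, and the default annotation type in step 2 can be summarized in a small Python sketch. Everything in it is an assumption made for illustration: the AnnotationConfig class, its methods, and the Organization.name type are hypothetical and are not part of the product. Only the limit of 10 shortcuts, the Ctrl+0 example, and the Ctrl+Enter default come from the procedure.

```python
from typing import Dict, List, Optional

MAX_SHORTCUTS = 10  # limit stated in the note on shortcut keys


class AnnotationConfig:
    """Hypothetical model of a collection's annotation types (not a product API)."""

    def __init__(self) -> None:
        self.types: List[str] = []           # "View.field" names, e.g. "Person.name"
        self.shortcuts: Dict[str, str] = {}  # shortcut key -> annotation type
        self.default_type: Optional[str] = None

    def add_type(self, view: str, field: str, shortcut: Optional[str] = None) -> str:
        name = f"{view}.{field}"
        if name in self.types:
            raise ValueError(f"{name} is already defined")
        if shortcut is not None:
            if shortcut in self.shortcuts:
                raise ValueError(f"shortcut {shortcut} is already in use")
            if len(self.shortcuts) >= MAX_SHORTCUTS:
                raise ValueError("only 10 annotation types can have shortcuts")
            self.shortcuts[shortcut] = name
        self.types.append(name)
        return name

    def set_default(self, name: str) -> None:
        if name not in self.types:
            raise ValueError(f"{name} is not a defined annotation type")
        self.default_type = name

    def resolve(self, key_combo: str) -> Optional[str]:
        """Map a key combination such as "Ctrl+0" or "Ctrl+Enter" to a type."""
        modifier, _, key = key_combo.partition("+")
        if modifier.lower() != "ctrl":
            return None
        if key.lower() == "enter":
            return self.default_type
        return self.shortcuts.get(key)


config = AnnotationConfig()
config.add_type("Person", "name", shortcut="0")
config.add_type("Organization", "name")  # hypothetical type without a shortcut
config.set_default("Organization.name")

print(config.resolve("Ctrl+0"))      # Person.name
print(config.resolve("Ctrl+Enter"))  # Organization.name
```

In this sketch, a type without a shortcut is still reachable through the default annotation type, which mirrors the second approach described in the note.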