Project properties

Text Analytics Project Properties differ depending on whether you are working with a modular or non-modular project.

Project properties for modular projects

General Tab
If the Show enable provenance option in the Text Analytics project property page preference is selected in the Text Analytics Preferences panel, the Enable Provenance check box is also displayed in the Project Properties dialog.

The provenance of an output tuple is its lineage: a record of how the extractor generated that tuple. It explains the origin and flow of the output tuple. For more information, see Viewing the lineage of analysis results.

Enable Provenance
By default, the Provenance feature is enabled in the Text Analytics system. If you want to disable the Provenance feature, clear the Enable Provenance check box in the Project Properties dialog.
Path to find the precompiled TAM files
You can specify a directory, or a ZIP or JAR file, that contains the TAM files. Do not store the precompiled TAM files in the build output directory because that directory is cleared with every build.
Pagination on Annotation Explorer and Result Table views
Enable Pagination
Select this check box to separate the results in Annotation Explorer and Result Tables into pages the next time they are opened.
Number of input files per page
Specify the number of input documents whose results are displayed on each page. This number controls pagination.
Source Tab
Source directory
Modules are created under the source directory. The default path is <project>/textAnalytics/src. You can browse and change the default path.
Build output directory
The build output directory stores the TAM files that are generated when the project is compiled. The default path is <project>/textAnalytics/bin. You can browse and change the default path. This directory is cleared every time the extractor is built, so do not store precompiled TAM files in it.
Project Tab
In the Specify the projects that are required for this project list, add the project dependencies by selecting other projects in the workbench. The projects that are added here are automatically added to the Java™ Build path. The projects that are removed from here are automatically removed from the Java Build path. After you add a project, you can import the views that it exports.
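As a minimal sketch of what a project dependency makes possible, assume that a required project contains a module named phone that exports a view, and that the current project contains a module named main that imports it. The module names, the regular expression, and the split into two files are hypothetical and serve only as illustration:
    -- File in the required project (hypothetical module: phone)
    module phone;

    create view PhoneNumber as
      extract regex /[0-9]{3}-[0-9]{4}/
        on D.text as number
      from Document D;

    export view PhoneNumber;

    -- File in the current project (hypothetical module: main)
    module main;

    import view PhoneNumber from module phone as PhoneNumber;

    output view PhoneNumber;
Only views that the required project explicitly exports are available for import in this way.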
Advanced Tab
You can optionally change the tokenization configuration. By default, the InfoSphere® BigInsights™ Eclipse environment uses the Multilingual tokenizer that is included in InfoSphere BigInsights. If you want to use a different tokenizer, complete the following steps:
  1. Click the Advanced tab in the content pane. If the Advanced tab is not visible:
    1. Close the Text Analytics properties panel.
    2. Click Window > Preferences > BigInsights > Text Analytics.
    3. Select Show advanced tab in the Text Analytics project property page, and click OK.
    4. Reopen the Text Analytics properties panel.
  2. Select either of the following parameters:
    Standard tokenizer
    Splits text into tokens based on white space and punctuation.
    Custom Multilingual tokenizer
    Splits text into tokens based on a custom configuration.

For more information about tokenizers, see Tokenization.

Project properties for non-modular projects

General Tab
Migrate Properties
You can migrate the project to a modular project by clicking Migrate Properties.
If the Show enable provenance option in the Text Analytics project property page preference is selected in the Text Analytics Preferences panel, the Enable Provenance check box is also displayed in the Project Properties dialog.
Enable Provenance
The Provenance feature is enabled by default in the Text Analytics system. If you want to disable the Provenance feature, clear the Enable Provenance check box in the Project Properties dialog.
Location of Main AQL file
A Text Analytics project can contain one or more AQL files. These AQL files can be stand-alone files, or they can be included in a larger AQL file by using the include directive, for example:
    include 'person-simple.aql';
    include 'phone-simple.aql';
    output view PersonSimple;
    output view PhoneNumber;
Select one .aql file as the main AQL file. The project builder compiles only the main AQL file and its dependencies and places the compiled TAM file in the <project_directory>/.aog directory.
You can specify only one main AQL file at a time in the project properties. If you want to work on a different extractor in the same project, change the Location of Main AQL file property to point to the new main AQL file, and adjust the data path property accordingly.
Data path to dependent AQL files, dictionaries, and UDF JAR files
The data path defines where dependent AQL files, dictionaries, and UDF Java archive files are located. If no data path is specified, the default path is the project root directory. You can select either a workspace folder or an external folder on the file system to specify the search path.
If you change the main AQL file property or the search path property, and if the Build Automatically option is enabled, the AQL builder is triggered to regenerate the .tam file. If the Build Automatically option is disabled, you must manually start the project builder to regenerate the .tam file.
The properties for Text Analytics are stored in the <project_directory>/.textanalytics file. Use only the Text Analytics Project Properties dialog to edit project properties. Do not directly edit this file with a text editor.
Pagination on Annotation Explorer and Result Table views
Enable Pagination
Select this check box to separate the results in Annotation Explorer and Result Tables into pages the next time they are opened.
Number of input files per page
Specify the number of input documents whose results are displayed on each page. This number controls pagination.
Advanced Tab
The Advanced tab of the non-modular project property dialog is the same as that of the modular project property dialog.