Text Analytics Project Properties differ
depending on whether you are working with a modular or non-modular
project.
Project properties for modular projects
- General Tab
-
If the Show enable provenance
option in the Text Analytics project property page is selected in
the Text Analytics
Preferences panel, the Enable
Provenance check box is also displayed in the Project Properties
dialog.
The provenance of an output tuple is its lineage: a record of how
the extractor generated the tuple. It explains the origin and
flow of the output tuple. For more information, see Viewing
the lineage of analysis results.
- Enable Provenance
- By default, the Provenance feature is enabled
in the Text Analytics system. If you want to disable the
Provenance feature, clear the Enable Provenance check box in the
Project Properties dialog.
-
Path to find the precompiled TAM
files
-
You can specify a directory, or a ZIP
or JAR file, that contains the TAM files. Do not store the
precompiled TAM files in the
build output directory because it is cleared with every build.
- Pagination on Annotation Explorer and
Result Table views
-
- Enable Pagination
- Select this check box to separate the
results in Annotation Explorer and Result Tables into pages the
next time they are opened.
- Number of input files per page
- Specify the number of input documents whose
results are displayed on each page. This number controls pagination.
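The effect of the Number of input files per page setting can be pictured with a short sketch. This is only an illustration of splitting a list of input documents into fixed-size pages. The function name `paginate` and the file names are hypothetical, and the tool's actual implementation is not described here.

```python
def paginate(documents, files_per_page):
    """Split a list of input documents into pages of at most files_per_page items."""
    if files_per_page < 1:
        raise ValueError("files_per_page must be at least 1")
    # Each page is a consecutive slice of the document list.
    return [documents[i:i + files_per_page]
            for i in range(0, len(documents), files_per_page)]


# Hypothetical input: five documents, two per page.
docs = ["doc1.txt", "doc2.txt", "doc3.txt", "doc4.txt", "doc5.txt"]
pages = paginate(docs, 2)
```

With these assumed inputs, the results of five documents are spread over three pages, and the last page holds the single remaining document.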
- Source Tab
-
- Source directory
-
Modules are created under the Src
directory. The default path is <project>/textAnalytics/src.
You can browse and change the default path.
- Build output directory
-
The Build output directory
stores the TAM files that are
generated when the project is compiled. The default path is <project>/textAnalytics/bin.
You can browse and change the default path. This
build output directory is cleared every time the extractor is
built, so ensure that no precompiled TAM
files are stored in it.
- Project Tab
-
In the Specify the projects that
are required for this project list, add the project dependencies
by selecting other projects in the workbench. Projects that are
added here are automatically added to the Java™ Build path.
Projects that are removed from here are automatically removed from
the Java Build path. After you add a project, you can import the
views that are exported by that project.
- Advanced Tab
-
You can optionally change the tokenization configuration. By
default, the InfoSphere® BigInsights™
Eclipse environment uses a default Multilingual tokenizer that is
included in InfoSphere BigInsights. If
you want to use a different tokenizer, complete the following
steps:
- Click the Advanced
tab in the content pane. If the Advanced
tab is not visible:
- Close the Text Analytics properties panel.
- Click .
- Select Show
advanced tab in the Text Analytics project property page, and
click OK.
- Reopen the Text Analytics properties panel.
- Select either of the following parameters:
- Standard tokenizer
- Splits text into tokens based on white space
and punctuation.
- Custom Multilingual tokenizer
- Splits text into tokens based on a custom
configuration.
For more information about tokenizers, see Tokenization.
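To give a rough feel for splitting on white space and punctuation, the following sketch approximates that behavior with a regular expression. It is not the product's Standard tokenizer. It assumes that each punctuation character becomes its own token and that white space is discarded, and the name `tokenize` is hypothetical.

```python
import re

# Assumption for illustration only: a run of word characters is one token,
# each punctuation character is its own token, and white space is dropped.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Return (token, begin, end) tuples for each token found in text."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_PATTERN.finditer(text)]


spans = tokenize("Call John, now!")
tokens = [token for token, _, _ in spans]
```

Keeping the begin and end offsets alongside each token mirrors the way annotations are usually tied to positions in the input text, although the exact span representation in Text Analytics may differ.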
Project properties for non-modular projects
- General Tab
-
- Migrate Properties
-
You can migrate
the project to a Modular Project by clicking Migrate Properties.
-
If you select the Show enable
provenance option in the Text Analytics project property page in
the Text Analytics
Preferences panel, the Enable
Provenance check box is also displayed in the Project Properties
dialog.
- Enable Provenance
-
The Provenance feature is enabled by default in the Text
Analytics system. If you want to disable the Provenance feature,
clear the Enable Provenance
check box in the Project Properties dialog.
- Location of Main AQL file
- A Text Analytics project can contain one or
more AQL files. These AQL files can be stand-alone files or can be
included in a larger AQL file by using the include directive.
-
For example:
include 'person-simple.aql';
include 'phone-simple.aql';
output view PersonSimple;
output view PhoneNumber;
Select one .aql file as the main
AQL file. The project builder compiles only the main AQL file and
its dependencies and places the compiled TAM file in the <project_directory>/.aog
directory.
- You can specify only one main AQL file at a time in the
project properties. If you want to work on a different
extractor in the same project, change the location of the main
AQL file property to point to the new main AQL file, and adjust
the data path property accordingly.
- Data path to dependent AQL files,
dictionaries, and UDF JAR files
- The data path defines where dependent AQL
files, dictionaries, and UDF Java archive files are located. If
no data path is specified, the default path is the project root
directory. You can select either a workspace folder or an
external folder on the file system to specify the search path.
-
If you change the main AQL file property or the search path
property, and if the Build
Automatically option is enabled, the AQL builder is triggered to
regenerate the .tam file. If the
Build Automatically option is
disabled, you must manually start the project builder to
regenerate the .tam file.
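The fallback from an unset data path to the project root directory can be sketched as follows. This is a minimal illustration under assumptions, not the AQL builder's lookup logic. The function `resolve_dependency`, its parameters, and the temporary directory that stands in for a project are all hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_dependency(name, project_root, data_path=None):
    """Return the path of a dependent file, or None if it is not found.

    Assumption: when no data path is set, the project root is searched.
    """
    base = Path(data_path) if data_path else Path(project_root)
    candidate = base / name
    return candidate if candidate.is_file() else None


# A temporary directory stands in for the project root in this sketch.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "person-simple.aql").write_text("-- placeholder\n")
    found = resolve_dependency("person-simple.aql", project_root=root)
    missing = resolve_dependency("phone-simple.aql", project_root=root)
    found_name = found.name if found else None
```

In this sketch, a file that is present under the project root is found without any data path being set, while a file that is absent resolves to `None`.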
-
The properties for Text Analytics are stored in the <project_directory>/.textanalytics
file. Use only the Text Analytics Project Properties dialog to
edit project properties. Do not directly edit this file with a
text editor.
- Pagination on Annotation Explorer and
Result Table views
-
- Enable Pagination
- Select this check box to separate the
results in Annotation Explorer and Result Tables into pages the
next time they are opened.
- Number of input files per page
- Specify the number of input documents whose
results are displayed on each page. This number controls pagination.
- Advanced Tab
- The Advanced tab of the non-modular project
properties dialog is the same as that of the modular project
properties dialog.