| Problem number | Problem | Likely cause | Correction |
|---|---|---|---|
| 1 | The Text Analytics engine generates incorrect token boundaries. | An incorrect language value was set in the launch configuration. | Confirm the language of the data collection, and check the value in the launch configuration. |
| 2 | The highlighting for span values in the Result Tree View, Labeled Document Collection Viewer, or File Side-by-Side Differences Viewer is off by a few characters. | The text file encoding of the InfoSphere® BigInsights™ project is not set to UTF-8. | Set the text file encoding of the InfoSphere BigInsights project to UTF-8: |
| 3 | In the InfoSphere BigInsights Tools for Eclipse, an out-of-memory error (OutOfMemoryError) occurs when you run your extractor on a data collection or navigate through the extracted results. | Because the Text Analytics system has a document-at-a-time execution model, the memory utilization depends on the number of results that are generated for the largest document when you run an extractor in the Text Analytics runtime component or call Text Analytics APIs. However, when you run a Text Analytics extractor in the InfoSphere BigInsights Tools for Eclipse, all the results for all documents in the data collection are loaded in main memory so they can be displayed in the Annotation Explorer and the other result viewers. | Choose a smaller data collection to test your extractor, or increase the maximum heap size for your Eclipse application. To increase the heap size: |
| 4 | The Annotation Explorer displays zero results for some extractors. | The extractor results are non-span values and therefore are not shown in the Annotation Explorer. Use the Result Table Viewers to examine the non-span values. | No action is required. |
| 5 | The Annotation Explorer displays empty spans of zero length, such as [0-0]. When the span is opened: | The extractor contains AQL code in which empty spans are created, for example: | In the example, empty text is created in PeopleNames2 for the fName field to indicate NULL or empty values for this field. The attributes in PeopleNames1, however, are not empty. Later, a union all AQL construct combines the tuples from PeopleNames1 and PeopleNames2 to create the final output view, PeopleNames. The two views are not union-compatible in a strict sense: the first view returns tuples whose first column is of type Span over Document.text, while the second view returns tuples whose first column is of type Text. However, the Text Analytics runtime is not strict in this case and allows such unions, because using empty values to indicate null values is a convenient feature. When an extractor that contains this kind of AQL code is run, PeopleNames.fName contains empty span values of zero length [0-0]. These empty spans correspond to the empty text created in PeopleNames2. When one of these spans is opened in the Result Editor, an empty document opens with Anonymous in the title. This title indicates that the value is a span over text created in the AQL code, not a Span over Document.text or a Span over other text that is derived from the input document text, for example by using the AQL detag statement. For more information, see the AQL Reference in the Information Center. |
| 6 | The Text Analytics Indexer is corrupted, new index files must be created, and refactoring and AQL Doc Hover do not work. | The Text Analytics Indexer can become corrupted in the following scenarios: | Generate new index files for the InfoSphere BigInsights Tools for Eclipse workspace: |
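
The off-by-a-few-characters highlighting in problem 2 can be illustrated with a small sketch. It is not taken from the product; it only shows the general mechanism by which span offsets computed on correctly decoded UTF-8 text stop lining up once the same bytes are decoded with another encoding. The sample string, the helper `span_of`, and the choice of `cp1252` as the mismatched encoding are all assumptions made for illustration.

```python
def span_of(text, needle):
    """Return the [begin, end) character offsets of needle in text (hypothetical helper)."""
    begin = text.index(needle)
    return begin, begin + len(needle)


# Assumed sample document containing two non-ASCII characters before the match.
original = "café señor Smith"
raw_bytes = original.encode("utf-8")

# The same bytes decoded with the right and with an assumed wrong encoding.
correct = raw_bytes.decode("utf-8")
wrong = raw_bytes.decode("cp1252")  # each two-byte character becomes two characters

# Offsets computed against the correctly decoded text.
begin, end = span_of(correct, "Smith")

# Where the same word actually sits in the misdecoded text.
shifted_begin, _ = span_of(wrong, "Smith")

print("span on UTF-8 text:      ", repr(correct[begin:end]))
print("same offsets, wrong text:", repr(wrong[begin:end]))
print("offset drift in chars:   ", shifted_begin - begin)
```

In this sketch the drift equals the number of two-byte characters that precede the span, which is consistent with highlighting that is off by only a few characters and grows worse further into a document.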