| Package | Description |
|---|---|
| edu.nyu.jet |
The root Jet package provides the methods for top-level system
control and for the Console.
|
| edu.nyu.jet.aceJet |
The AceJet package provides the classes and methods for the
Automatic
Content Extraction (ACE) evaluation.
|
| edu.nyu.jet.chunk | |
| edu.nyu.jet.format | |
| edu.nyu.jet.hmm |
The HMM package includes the classes for Hidden Markov Models,
and part-of-speech and name taggers implemented using HMMs. A separate
description is available of the overall structure and external representation
of these models, provided for those who wish to modify the models.
|
| edu.nyu.jet.lex |
The Lex package incorporates the code for reading dictionaries,
looking words up in dictionaries, and tokenizing text.
|
| edu.nyu.jet.ne |
The NE package contains code for annotating extended named entities using a
large dictionary and a set of transformation rules.
|
| edu.nyu.jet.parser |
The Parser package includes several types of parsers (top-down, bottom-up,
and chart).
|
| edu.nyu.jet.pat |
The Pat package encapsulates the basic pattern application
mechanism of Jet, sets of pattern/action rules which can be applied
to a document to add or modify annotations on the document. The external
form of the pattern language is described below; the classes used
to encode these patterns are summarized separately.
|
| edu.nyu.jet.refres |
The Refres package provides the methods for identifying coreference
relations within a document.
|
| edu.nyu.jet.scorer |
The Scorer package provides the classes for scoring an annotated
document against an answer key.
|
| edu.nyu.jet.time |
The Time package contains code for annotating time expressions in text
following the TIMEX2 standard.
|
| edu.nyu.jet.tipster |
The Tipster package provides the basic methods for recording
information about documents. It is loosely based on the 'Tipster
Architecture' developed by R. Grishman as part of the Government-sponsored
Tipster program. The basic objects are Documents and Annotations;
a Document is a container for the text of the document and for the set of
Annotations on the Document.
|
| edu.nyu.jet.zoner |
The Zoner package contains methods for identifying text segments
(sentences, etc.) within the document.
|
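
The Tipster description above (a Document holds the text of the document together with a set of Annotations, each tied to a Span) can be pictured with a small stand-alone model. This is a minimal sketch under stated assumptions, not the Jet classes: `MiniSpan`, `MiniAnnotation`, and `MiniDocument` are hypothetical names, and offsets are assumed to be half-open character positions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a span: half-open character offsets [start, end). */
final class MiniSpan {
    final int start;
    final int end;

    MiniSpan(int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span " + start + ".." + end);
        }
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical stand-in for an annotation: a type label attached to a span. */
final class MiniAnnotation {
    final String type;
    final MiniSpan span;

    MiniAnnotation(String type, MiniSpan span) {
        this.type = type;
        this.span = span;
    }
}

/** Hypothetical stand-in for a document: the text plus the annotations on it. */
final class MiniDocument {
    private final String text;
    private final List<MiniAnnotation> annotations = new ArrayList<>();

    MiniDocument(String text) {
        this.text = text;
    }

    /** A span covering the whole text. */
    MiniSpan fullSpan() {
        return new MiniSpan(0, text.length());
    }

    /** Creates an annotation, records it on the document, and returns it. */
    MiniAnnotation annotate(String type, MiniSpan span) {
        MiniAnnotation a = new MiniAnnotation(type, span);
        annotations.add(a);
        return a;
    }

    /** The text covered by the given span. */
    String text(MiniSpan span) {
        return text.substring(span.start, span.end);
    }

    /** All recorded annotations with the given type, in insertion order. */
    List<MiniAnnotation> annotationsOfType(String type) {
        List<MiniAnnotation> result = new ArrayList<>();
        for (MiniAnnotation a : annotations) {
            if (a.type.equals(type)) {
                result.add(a);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MiniDocument doc = new MiniDocument("Jet reads text.");
        doc.annotate("sentence", doc.fullSpan());
        System.out.println(doc.annotationsOfType("sentence").size());
    }
}
```

The point of the design is that the text itself is never modified: every layer of analysis (tokens, names, sentences) is recorded as additional annotations over spans of the same text.
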
| Modifier and Type | Method and Description |
|---|---|
static void |
Control.applyScript(Document doc,
Span span,
String script)
apply script 'script' to span 'span' of
document 'doc'. |
static void |
Control.processSentence(Document doc,
Span sentenceSpan)
apply the processSentence script to span 'sentenceSpan'
of document 'doc'. |
| Modifier and Type | Field and Description |
|---|---|
Span |
AceEventMention.anchorExtent
the span of the anchor of the event, with start and end positions based
on the ACE offsets (excluding XML tags).
|
Span |
AceEventMention.anchorJetExtent
the span of the anchor of the event, with start and end positions based
on Jet offsets (and so including following whitespace).
|
Span |
AceEntityName.extent
the extent of the mention, with start and end positions based on
ACE offsets (excluding XML tags).
|
Span |
AceMention.extent
the extent of the mention, with start and end positions based on
ACE offsets (excluding XML tags).
|
Span |
AceEventMention.extent
the span of the extent of the event, with start and end positions based
on the ACE offsets (excluding XML tags).
|
Span |
AceRelationMention.extent
the span of the extent of the relation, with start and end positions based
on the ACE offsets (excluding XML tags).
|
Span |
AceEntityMention.head
the span of the head of the mention, with start and end positions based
on the ACE offsets (excluding XML tags).
|
Span |
AceMention.jetExtent
the extent of the mention, with start and end positions based on
Jet offsets (including all following whitespace).
|
Span |
AceEventMention.jetExtent
the span of the extent of the event, with start and end positions based
on Jet offsets (and so including following whitespace).
|
Span |
AceEntityMention.jetHead
the span of the head of the mention, with start and end positions based
on Jet offsets (and so including following whitespace).
|
| Modifier and Type | Method and Description |
|---|---|
static Span |
AceEntityMention.convertSpan(Span jetSpan,
String fileText)
converts a Jet Span to an APF span.
|
Span |
AceEventAnchor.getJetHead() |
Span |
AceMention.getJetHead() |
Span |
AceEntityMention.getJetHead() |
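
The field descriptions above distinguish two offset conventions: ACE offsets exclude XML tags, while Jet offsets count every character of the file and let a span run over following whitespace. The sketch below illustrates one way such a conversion could work. It is not the code behind `convertSpan`: the class `OffsetSketch`, its method names, the simple `<...>` tag detection, and the choice of an inclusive ACE end offset are all assumptions made for the example.

```java
/** Hypothetical sketch of mapping a Jet-style span to ACE-style offsets. */
final class OffsetSketch {

    /** Counts the characters before position pos that lie outside <...> tags. */
    static int countOutsideTags(String text, int pos) {
        int count = 0;
        boolean inTag = false;
        for (int i = 0; i < pos; i++) {
            char c = text.charAt(i);
            if (c == '<') {
                inTag = true;            // tag opener: not counted
            } else if (c == '>' && inTag) {
                inTag = false;           // tag closer: not counted
            } else if (!inTag) {
                count++;                 // ordinary text character
            }
        }
        return count;
    }

    /**
     * Converts the half-open file span [jetStart, jetEnd) to {start, end}
     * in tag-free offsets. Trailing whitespace is dropped, and the returned
     * end is the offset of the last character (inclusive), which is an
     * assumption of this sketch.
     */
    static int[] toAceSpan(String fileText, int jetStart, int jetEnd) {
        int end = jetEnd;
        while (end > jetStart && Character.isWhitespace(fileText.charAt(end - 1))) {
            end--;
        }
        int aceStart = countOutsideTags(fileText, jetStart);
        int aceEnd = countOutsideTags(fileText, end) - 1;
        return new int[] {aceStart, aceEnd};
    }

    public static void main(String[] args) {
        int[] ace = toAceSpan("<DOC>John Smith left.</DOC>", 10, 16);
        System.out.println(ace[0] + ".." + ace[1]);
    }
}
```

In the usage shown in `main`, the file span covers "Smith" plus the blank after it; the converted span drops that blank and ignores the five characters of the opening tag.
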
| Modifier and Type | Method and Description |
|---|---|
static boolean |
Ace.allLowerCase(Document doc,
Span span)
return true if either all the letters in span are
lower case, or the fraction of letters which are upper case
exceeds MAX_UPPER.
|
static void |
FindAceValues.buildAceValue(String id,
String typeSubtype,
Span extent,
AceDocument aceDoc,
String fileText)
constructs an AceValue and adds it to the AceDocument.
|
static Span |
AceEntityMention.convertSpan(Span jetSpan,
String fileText)
converts a Jet Span to an APF span.
|
AceEvent |
EventPattern.match(Span anchorExtent,
String anchor,
Document doc,
SyntacticRelationSet relations,
AceDocument aceDoc)
match an anchor and its context against the event patterns; if the
match is successful, build and return an AceEvent.
|
void |
PerfectNameTagger.tag(Document doc,
Span span)
tag Span 'span' of Document 'doc' with ENAMEX annotations.
|
static boolean |
Ace.titleCase(Document doc,
Span span)
returns true if Span 'span' of Document 'doc'
appears to be capitalized as a title: that is, if no word begins with
a lower-case letter except for a small list of
function words (articles, possessive pronouns, prepositions, ...). |
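
The title-case test described for `Ace.titleCase` can be illustrated on plain strings. This is a minimal sketch, not the Jet implementation: `TitleCaseSketch` and `looksLikeTitle` are hypothetical names, the function-word list is invented for the example (the list Jet uses is not given here), and words are assumed to be separated by whitespace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a "capitalized as a title" heuristic. */
final class TitleCaseSketch {

    // Illustrative function words only; an assumption of this sketch.
    private static final Set<String> FUNCTION_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "the", "of", "in", "on", "for", "to", "and",
            "his", "her", "its", "their"));

    /**
     * Returns true if no word starts with a lower-case letter,
     * apart from the listed function words.
     */
    static boolean looksLikeTitle(String text) {
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;   // happens only when the input is blank
            }
            boolean startsLower = Character.isLowerCase(word.charAt(0));
            if (startsLower && !FUNCTION_WORDS.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeTitle("The Lord of the Rings"));
        System.out.println(looksLikeTitle("The lord of the rings"));
    }
}
```
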
| Modifier and Type | Method and Description |
|---|---|
static String |
EventSyntacticPattern.buildSyntacticPathOnSpans(int fromPosn,
int toPosn,
SyntacticRelationSet relations,
List<Span> localSpans)
returns the syntactic path from the anchor to an argument.
|
| Constructor and Description |
|---|
AceEntityMention(String id,
String type,
Span extent,
Span head,
String fileText) |
AceEntityName(Span extent,
String fileText) |
AceEventAnchor(Span head,
Span jetHead,
String text,
Document doc) |
AceEventMention(String id,
Span jetExtent,
Span anchorJetExtent,
String fileText) |
AceTimexMention(String id,
Span extent,
String fileText)
create a new Timex mention with the specified id and extent.
|
AceValueMention(String id,
Span extent,
String fileText)
create a new Value mention with the specified id and extent.
|
| Modifier and Type | Method and Description |
|---|---|
static void |
Chunker.chunk(Document doc,
Span span)
adds chunks (annotations of type ng) to Span 'span' of
Document 'doc'.
|
void |
MENameTagger.tag(Document doc,
Span span)
tag span 'span' of Document 'doc' with Named Entity annotations.
|
static void |
Onoma.tagDrugs(Document doc,
Span span)
This is a stub which remains from code that was added at SRI's
request for Dovetail in order to tag drug names.
|
static void |
Onoma.tagNames(Document doc,
Span span)
tag names which appear in the onomasticon, adding an ENAMEX annotation
with features TYPE and SUBTYPE.
|
| Modifier and Type | Method and Description |
|---|---|
void |
PTBReader.addAnnotations(List<ParseTreeNode> trees,
Document doc,
String targetAnnotation,
Span span,
boolean jetCategories)
Adds constit annotations to an existing Document 'doc' to
represent the parse tree structure of a set of trees 'trees'. |
void |
PTBReader.addAnnotations(List<ParseTreeNode> trees,
List<Integer> offsets,
Document doc,
String targetAnnotation,
Span span,
boolean jetCategories)
Adds constit annotations to an existing Document 'doc' to
represent the parse tree structure of a set of trees 'trees'. |
void |
PTBReader.addAnnotations(ParseTreeNode tree,
Document doc,
Span span,
boolean jetCategories)
Adds constit annotations to an existing Document 'doc' to
represent the parse tree structure 'tree'. |
| Modifier and Type | Method and Description |
|---|---|
void |
HMMTagger.annotate(Document doc,
Span span,
String type)
tag 'span' of 'doc' according to the Penn Tree Bank tag set.
|
void |
HMMannotator.annotateSpan(Document doc,
Span textSpan)
use the HMM to add annotations to Span 'textSpan' of Document 'doc'.
|
ArrayList |
HMMannotator.annotateSpanNbest(Document doc,
Span textSpan,
int n,
String hypId)
use the HMM to add annotations to Span 'textSpan' of Document 'doc'.
|
static boolean |
HMMNameTagger.inZone(Document doc,
Span span,
String zoneType)
returns 'true' if Span 'span' is enclosed in an annotation of type
'zoneType'.
|
void |
HMMTagger.prune(Document doc,
Span span)
prune existing 'constit' annotations on 'span' of 'doc' using information
from a part-of-speech tagger.
|
static void |
Retagger.pruneConstit(Document d,
Span zone)
prunes constit annotations obtained from lexical look-up
using Penn tags (recorded as tagger annotations).
|
void |
HMMNameTagger.tag(Document doc,
Span span)
tag span 'span' of Document 'doc' with Named Entity annotations.
|
ArrayList |
XNameTagger.tag(Document doc,
Span span,
String sentno)
tag span 'span' of Document 'doc' with N-best Named Entity annotations.
|
void |
HMMTagger.tagJet(Document doc,
Span span)
tag 'span' of 'doc' according to the Jet part of speech set.
|
void |
HMMTagger.tagPenn(Document doc,
Span span)
tag 'span' of 'doc' according to the Penn Tree Bank tag set.
|
static void |
HMMNameTagger.tagPersonZone(Document doc,
Span span,
HMMannotator annotator) |
void |
HMMannotator.trainOnSpan(Document doc,
Span textSpan)
use the annotations on Span 'textSpan' of Document 'doc' to train the HMM.
|
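
The taggers listed above assign one label per token using an HMM. A standard way to find the most probable label sequence in such a model is the Viterbi algorithm. The sketch below is a generic textbook version on a toy two-state model, offered only as background: it does not use the Jet HMM classes, nothing here states how Jet itself decodes, and `ViterbiSketch`, its parameters, and the toy probabilities are assumptions made for the example.

```java
import java.util.Arrays;

/** Hypothetical, generic Viterbi decoder for a small discrete HMM. */
final class ViterbiSketch {

    /**
     * @param start start[s] is the probability of beginning in state s
     * @param trans trans[p][s] is the probability of moving from state p to state s
     * @param emit  emit[s][o] is the probability that state s emits symbol o
     * @param obs   the observed symbol indices
     * @return the most probable state sequence for obs
     */
    static int[] decode(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int n = start.length;
        int len = obs.length;
        if (len == 0) {
            return new int[0];
        }
        double[][] best = new double[len][n];  // best path probability ending in state s at time t
        int[][] back = new int[len][n];        // predecessor state on that best path

        for (int s = 0; s < n; s++) {
            best[0][s] = start[s] * emit[s][obs[0]];
        }
        for (int t = 1; t < len; t++) {
            for (int s = 0; s < n; s++) {
                int arg = 0;
                double max = -1.0;
                for (int p = 0; p < n; p++) {
                    double v = best[t - 1][p] * trans[p][s];
                    if (v > max) {
                        max = v;
                        arg = p;
                    }
                }
                best[t][s] = max * emit[s][obs[t]];
                back[t][s] = arg;
            }
        }
        // Pick the best final state, then follow the back-pointers.
        int last = 0;
        for (int s = 1; s < n; s++) {
            if (best[len - 1][s] > best[len - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[len];
        path[len - 1] = last;
        for (int t = len - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        double[] start = {0.6, 0.4};
        double[][] trans = {{0.7, 0.3}, {0.4, 0.6}};
        double[][] emit = {{0.5, 0.4, 0.1}, {0.1, 0.3, 0.6}};
        System.out.println(Arrays.toString(decode(start, trans, emit, new int[] {0, 1, 2})));
    }
}
```

The sketch multiplies raw probabilities for readability; on sentence-length input a practical decoder would typically sum log probabilities instead to avoid numeric underflow.
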
| Modifier and Type | Method and Description |
|---|---|
static Annotation[] |
Tokenizer.gatherTokens(Document doc,
Span span)
returns an array containing all token annotations in
span of doc. |
static String[] |
Tokenizer.gatherTokenStrings(Document doc,
Span span)
returns an array of Strings corresponding to all the tokens
in
span of doc. |
void |
Stemmer.tagStem(Document doc,
Span span)
Adds a stem feature to each token annotation if the token text and stem are
different.
|
static void |
Tokenizer.tokenize(Document doc,
Span span)
tokenizes the portion of Document doc covered by span.
|
static void |
Tokenizer.tokenizeOnWS(Document doc,
Span span)
tokenizes portion 'span' of 'doc', splitting only on white space.
|
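
Splitting only on white space, as described for `Tokenizer.tokenizeOnWS`, can be sketched on a plain string. This is an illustration rather than the Jet tokenizer: `WhitespaceTokenSketch` and `tokenSpans` are hypothetical names, and letting each token span absorb the whitespace that follows it is an assumption borrowed from the description of Jet offsets in the aceJet field table.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of whitespace-only tokenization that yields spans. */
final class WhitespaceTokenSketch {

    /**
     * Returns one {start, end} pair per token found in text[start, end).
     * Each pair is half-open and includes the whitespace after the token.
     */
    static List<int[]> tokenSpans(String text, int start, int end) {
        List<int[]> spans = new ArrayList<>();
        int i = start;
        // Leading whitespace belongs to no token.
        while (i < end && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        while (i < end) {
            int tokenStart = i;
            while (i < end && !Character.isWhitespace(text.charAt(i))) {
                i++;    // consume the token itself
            }
            while (i < end && Character.isWhitespace(text.charAt(i))) {
                i++;    // consume the whitespace that follows it
            }
            spans.add(new int[] {tokenStart, i});
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Jet reads text.";
        for (int[] s : tokenSpans(text, 0, text.length())) {
            System.out.println("[" + text.substring(s[0], s[1]) + "]");
        }
    }
}
```

With this convention adjacent token spans meet exactly, so the tokens of a zone tile it without gaps once any leading whitespace is skipped.
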
| Modifier and Type | Method and Description |
|---|---|
void |
ClassAnnotator.annotate(Document doc,
Span span) |
void |
NameAnnotator.annotate(Document doc,
Span span)
annotate the text in
span with named entity (ENAMEX)
annotations using the dictionary and rules of the ENE tagger. |
void |
CRFNameTagger.annotate(Document doc,
Span span) |
void |
DictionaryTagger.annotate(Document doc,
Span span)
look up the tokens in
span in the ENE dictionary and
record the results on the NE_INTERNAL annotations for these tokens. |
void |
TransformRules.apply(Document doc,
Span span)
applies the transformation rules to 'span'.
|
static void |
NamedEntityUtil.packNamedEntity(Document doc,
Span span,
String system)
create ENAMEX annotations from the NE_INTERNAL annotations used internally
by the Extended Named Entity annotator.
|
static void |
NamedEntityUtil.splitToNamedEntity(Document doc,
Span span)
create NE_INTERNAL annotations for use by the Extended Named Entity
annotator.
|
| Modifier and Type | Method and Description |
|---|---|
static void |
AddSyntacticRelations.annotate(Document doc,
Span span)
annotate the constituents of document 'doc' within Span 'span'.
|
static Annotation |
StatParser.buildWordDefn(Document doc,
String word,
Span span,
Annotation wordDefn,
String pennPOS) |
static void |
StatParser.deleteUnusedConstits(Document doc,
Span span,
Annotation rootAnnotation)
deletes all annotations of type 'constit' within span 'span' of
Document 'doc' which are not descendants of 'rootAnnotation'.
|
static ParseTreeNode |
StatParser.parse(Document doc,
Span span)
parse the sentence in 'span' of Document 'doc'.
|
static void |
DepParser.parseSentence(Document doc,
Span span,
SyntacticRelationSet relations)
generate the dependency parse for a sentence, adding its arcs to
'relations'.
|
| Modifier and Type | Method and Description |
|---|---|
void |
PatternSet.apply(Document doc,
Span span) |
void |
PatternCollection.apply(String patternSetName,
Document doc,
Span span)
applies the rules in the named PatternSet to the specified span.
|
static void |
NewAnnotationAction.hideAnnotations(Document doc,
String type,
Span span)
hides (adds the 'hidden' feature to) all annotations of type 'type'
beginning at the starting position of span 'span'.
|
| Modifier and Type | Method and Description |
|---|---|
static Vector<Annotation> |
Resolve.gatherClauses(Document doc,
Span span)
returns the set of all clauses (constituents of category s, rn-wh,
or rn-vingo) within Span 'span' of Document 'doc'. |
static Vector<Annotation> |
Resolve.gatherMentions(Document doc,
Span span)
returns the set of all mentions -- constituents which are
subject to reference resolution.
|
static void |
MaxEntResolve.references(Document doc,
Span span)
resolves the mentions (noun groups) in
'span' of Document 'doc'. |
static void |
Resolve.references(Document doc,
Span span)
Resolve.references resolves the mentions (noun groups) in
'span' of Document 'doc'. |
static void |
Resolve.references(Document doc,
Span span,
Vector<Annotation> mentions,
Vector<Annotation> clauses) |
static void |
MaxEntResolve.references(Document doc,
Span span,
Vector mentions,
Vector clauses) |
static void |
Resolve.updateEvents(Document doc,
Span span,
Map mentionToEntity)
updates events based on reference resolution.
|
| Modifier and Type | Method and Description |
|---|---|
void |
SGMLScorer.match(String annType1,
String annType2,
Span span) |
void |
NameTagger.tag(Document doc,
Span span) |
| Modifier and Type | Field and Description |
|---|---|
Span |
PatternMatchResult.span |
| Modifier and Type | Method and Description |
|---|---|
Span |
TimeRule.matches(Document doc,
List<Annotation> tokens,
int offset,
org.joda.time.DateTime ref,
List<Object> values)
matches the pattern portion of the current TimeRule against the sequence
of
tokens in doc starting with |
| Modifier and Type | Method and Description |
|---|---|
void |
NumberAnnotator.annotate(Document doc,
Span span)
Annotates number expressions and normalizes their values.
|
void |
TimeAnnotator.annotate(Document doc,
Span span,
org.joda.time.DateTime ref)
annotate the time expressions in 'span' with TIMEX2 annotations.
|
void |
ScriptRule.apply(Document doc,
List<Object> values,
Span span,
org.joda.time.DateTime ref) |
abstract void |
TimeRule.apply(Document doc,
List<Object> values,
Span span,
org.joda.time.DateTime ref) |
void |
SimpleTimeRule.apply(Document doc,
List<Object> values,
Span span,
org.joda.time.DateTime ref) |
protected int |
TimeRule.nextOffset(List<Annotation> tokens,
int offset,
Span span) |
| Constructor and Description |
|---|
PatternMatchResult(Object value,
Span span) |
| Modifier and Type | Method and Description |
|---|---|
Span |
Document.fullSpan()
Returns a Span covering the entire document.
|
Span |
Annotation.span()
returns the span (of text) associated with the annotation.
|
| Modifier and Type | Method and Description |
|---|---|
Annotation |
Document.annotate(String tp,
Span sp,
FeatureSet att)
Creates an annotation and adds it to the document.
|
boolean |
AnnotationTool.annotateDocument(Document doc,
Span annotationZone)
display annotation tool with Document 'doc', allowing user to
add annotations within Span 'annotationZone' of the document.
|
Vector<Annotation> |
Document.annotationsOfType(String type,
Span span)
Returns a vector of all annotations of type 'type' whose span is
contained within 'span'.
|
String |
Document.normalizedText(Span s)
Returns the text subsumed by span s, with leading and trailing
whitespace removed, and other whitespace sequences replaced by a single
blank.
|
String |
Document.text(Span s)
Returns the text subsumed by span s.
|
boolean |
Span.within(Span s)
Returns true if Span 's' contains the span.
|
StringBuffer |
Document.writeSGML(String type,
Span span)
performs writeSGML over portion 'span' of the Document.
|
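
Two of the behaviours described above are easy to show on plain strings and integer offsets: span containment (`Span.within`) and whitespace normalization (`Document.normalizedText`). The sketch below follows only the wording of the descriptions. `SpanTextSketch` and its methods are hypothetical stand-ins, offsets are assumed to be half-open, and the real methods may treat edge cases differently.

```java
/** Hypothetical sketch of span containment and whitespace normalization. */
final class SpanTextSketch {

    /** True if [start, end) lies entirely inside [outerStart, outerEnd). */
    static boolean within(int start, int end, int outerStart, int outerEnd) {
        return start >= outerStart && end <= outerEnd;
    }

    /**
     * Returns text[start, end) with leading and trailing whitespace removed
     * and every other run of whitespace replaced by a single blank.
     */
    static String normalizedText(String text, int start, int end) {
        return text.substring(start, end).trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String text = "  New \n York  University ";
        System.out.println(normalizedText(text, 0, text.length()));
        System.out.println(within(2, 5, 0, 10));
    }
}
```
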
| Constructor and Description |
|---|
Annotation(String tp,
Span sp,
FeatureSet att) |
| Modifier and Type | Method and Description |
|---|---|
static void |
SpeechSplitter.split(Document doc,
Span textSpan) |
static void |
SentenceSplitter.split(Document doc,
Span textSpan)
splits the text in textSpan into sentences, adding sentence
annotations to the document.
|
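
Splitting a zone of text into sentence spans, as `SentenceSplitter.split` is described as doing, can be approximated with the JDK's `java.text.BreakIterator`. This is a stand-in technique, not the Jet algorithm, whose rules are not given here. `SentenceSplitSketch` and `sentenceSpans` are hypothetical names, and the returned pairs are half-open offsets into the original text.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: sentence spans computed with the JDK BreakIterator. */
final class SentenceSplitSketch {

    /** Returns one {start, end} pair per sentence found in text[start, end). */
    static List<int[]> sentenceSpans(String text, int start, int end) {
        String zone = text.substring(start, end);
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(zone);

        List<int[]> spans = new ArrayList<>();
        int from = it.first();
        for (int to = it.next(); to != BreakIterator.DONE; from = to, to = it.next()) {
            // Boundaries are relative to the zone, so shift them back to text offsets.
            spans.add(new int[] {start + from, start + to});
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Hello world. This is Jet.";
        for (int[] s : sentenceSpans(text, 0, text.length())) {
            System.out.println("[" + text.substring(s[0], s[1]) + "]");
        }
    }
}
```

In a Jet-style pipeline each resulting pair would then be recorded as a sentence annotation on the document rather than returned as a list.
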
Copyright © 2016 New York University. All rights reserved.