| Package | Description |
|---|---|
| edu.nyu.jet.aceJet | The AceJet package provides the classes and methods for the Automatic Content Extraction (ACE) evaluation. |
| edu.nyu.jet.chunk | |
| edu.nyu.jet.hmm | The HMM package includes the classes for Hidden Markov Models, and part-of-speech and name taggers implemented using HMMs. A separate description is available of the overall structure and external representation of these models, provided for those who wish to modify the models. |
| edu.nyu.jet.lex | The Lex package incorporates the code for reading dictionaries, looking words up in dictionaries, and tokenizing text. |
| edu.nyu.jet.ne | The NE package contains code for annotating extended named entities using a large dictionary and a set of transformation rules. |
| edu.nyu.jet.parser | The Parser package includes several types of parsers (top-down, bottom-up, and chart). |
| edu.nyu.jet.pat | The Pat package encapsulates the basic pattern application mechanism of Jet: sets of pattern/action rules which can be applied to a document to add or modify annotations on the document. The external form of the pattern language is described below; the classes used to encode these patterns are summarized separately. |
| edu.nyu.jet.refres | The Refres package provides the methods for identifying coreference relations within a document. |
| edu.nyu.jet.time | The Time package contains code for annotating time expressions in text following the TIMEX2 standard. |
| edu.nyu.jet.tipster | The Tipster package provides the basic methods for recording information about documents. It is loosely based on the 'Tipster Architecture' developed by R. Grishman as part of the Government-sponsored Tipster program. The basic objects are Documents and Annotations; a Document is a container for the text of the document and a set of Annotations on the Document. |
| edu.nyu.jet.zoner | The Zoner package contains methods for identifying text segments (sentences, etc.) within the document. |
| Modifier and Type | Method and Description |
|---|---|
static String |
PerfectAce.getEntityID(Annotation head) |
static String |
PerfectAce.getTypeSubtype(Annotation head) |
static String |
FindAceValues.getTypeSubtype(Document doc,
Annotation mention)
returns the AceValue type and subtype of a mention: Numeric, Crime,
Sentence, Contact-Info, ...
|
static String |
EDTtype.getTypeSubtype(Document doc,
Annotation entity,
Annotation mention)
returns the EDT type of a mention: PERSON, GPE, ORGANIZATION,
LOCATION, FACILITY, or OTHER (where OTHER indicates that it is not
an EDT mention).
|
static boolean |
EDTtype.hasGenericHead(Document doc,
Annotation mention) |
static boolean |
PerfectAce.validMention(Document doc,
Annotation head,
String cat) |
| Modifier and Type | Method and Description |
|---|---|
double |
TokenClassifier.getLocalMargin(Document doc,
Annotation[] tokens,
String excludedTag,
int excludedTagStart,
int excludedTagEnd) |
String[] |
MaxEntNE.simpleDecoder(Document doc,
Annotation[] tokens)
assign the best tag for each token using a simple deterministic
left-to-right tagger (which may not find the most probable path).
|
void |
MaxEntNE.train(Document doc,
Annotation[] tokens,
String[] tags)
train the model on a sequence of words from Document doc.
|
abstract void |
TokenClassifier.train(Document doc,
Annotation[] tokens,
String[] tags) |
String[] |
MaxEntNE.viterbi(Document doc,
Annotation[] tokens)
assign the best tag for each token using a Viterbi decoder.
|
abstract String[] |
TokenClassifier.viterbi(Document doc,
Annotation[] tokens) |
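
The difference between a simple left-to-right tagger and a Viterbi decoder, as listed above, can be seen on a toy model. The following is a minimal sketch and not Jet's `MaxEntNE` code: the class `GreedyTaggerSketch`, its methods, and the probability tables are hypothetical stand-ins for whatever model scores the tags.

```java
import java.util.Arrays;

// Hypothetical sketch, not part of Jet: tags are chosen one at a time, left to right.
class GreedyTaggerSketch {

    /** Chooses each tag in turn, keeping only the single best choice made so far. */
    static int[] greedy(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int[] path = new int[obs.length];
        for (int t = 0; t < obs.length; t++) {
            double best = -1.0;
            for (int s = 0; s < start.length; s++) {
                double p = (t == 0 ? start[s] : trans[path[t - 1]][s]) * emit[s][obs[t]];
                if (p > best) {
                    best = p;
                    path[t] = s;
                }
            }
        }
        return path;
    }

    /** Joint probability of a complete tag path together with the observations. */
    static double pathProb(double[] start, double[][] trans, double[][] emit,
                           int[] obs, int[] path) {
        double p = 1.0;
        for (int t = 0; t < obs.length; t++) {
            double step = (t == 0) ? start[path[0]] : trans[path[t - 1]][path[t]];
            p *= step * emit[path[t]][obs[t]];
        }
        return p;
    }

    public static void main(String[] args) {
        // Made-up numbers: two tags, two observation symbols.
        double[] start = {0.6, 0.4};
        double[][] trans = {{0.6, 0.4}, {0.05, 0.95}};
        double[][] emit = {{0.5, 0.5}, {0.5, 0.5}};
        int[] obs = {0, 1};

        int[] g = greedy(start, trans, emit, obs);
        System.out.println("greedy path: " + Arrays.toString(g));
        System.out.println("greedy path probability: " + pathProb(start, trans, emit, obs, g));
        System.out.println("probability of path [1, 1]: "
                + pathProb(start, trans, emit, obs, new int[] {1, 1}));
    }
}
```

With these numbers the greedy tagger commits to tag 0 at the first token because it looks best locally, while the path that stays in tag 1 has a higher overall probability. A decoder that scores whole paths would find the latter.
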
| Modifier and Type | Method and Description |
|---|---|
boolean |
HMMstate.allowedToken(Annotation token)
returns true if token 'token' can be emitted by this state. |
static boolean |
Retagger.compatible(String word,
String pennPOS,
Annotation jetDefn)
returns true if Penn part-of-speech tag 'pennPOS', as a tag for 'word', is
compatible with Jet word definition 'jetDefn'.
|
double |
HMMstate.getEmissionProb(String tokenText,
String priorToken,
Annotation token)
returns the probability of emitting 'token' with attributes 'fs'
when in this state.
|
double |
HMM.getLocalMargin(Document doc,
Annotation[] tokens,
String excludedTag,
int excludedTagStart,
int excludedTagEnd)
returns the margin for assigning a particular tag to a sequence of
tokens.
|
void |
HMM.train(Document doc,
Annotation[] tokens,
String[] tags)
a slower algorithm for training the HMM.
|
void |
HMM.train0(Document doc,
Annotation[] tokens,
String[] tags)
a fast, simple algorithm for training the HMM.
|
String[] |
HMM.viterbi(Document doc,
Annotation[] tokens)
a Viterbi decoder for HMMs.
|
int[] |
HMM.viterbiPath(Document doc,
Annotation[] tokens)
a Viterbi decoder for HMMs.
|
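
As an illustration of what a Viterbi decoder computes, here is a minimal, self-contained sketch. It is not Jet's `HMM` class: `ViterbiSketch`, `decode`, and the explicit probability tables are assumptions made for the example, and it returns state indices (in the spirit of the `int[]` returned by `viterbiPath`) rather than tag strings.

```java
import java.util.Arrays;

// Hypothetical stand-in, not Jet's HMM class: Viterbi decoding over explicit tables.
class ViterbiSketch {

    /**
     * Returns the most probable state sequence for the observations.
     * start[s]    : probability of starting in state s
     * trans[r][s] : probability of moving from state r to state s
     * emit[s][o]  : probability that state s emits observation symbol o
     */
    static int[] decode(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int n = obs.length;
        int k = start.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] score = new double[n][k]; // best log-probability of a path ending in s at t
        int[][] back = new int[n][k];        // predecessor state on that best path
        for (int s = 0; s < k; s++) {
            score[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                double best = Double.NEGATIVE_INFINITY;
                int arg = 0;
                for (int r = 0; r < k; r++) {
                    double cand = score[t - 1][r] + Math.log(trans[r][s]);
                    if (cand > best) {
                        best = cand;
                        arg = r;
                    }
                }
                score[t][s] = best + Math.log(emit[s][obs[t]]);
                back[t][s] = arg;
            }
        }
        // Pick the best final state, then follow the back pointers to the start.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (score[n - 1][s] > score[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        // Made-up model: state 0 = "other", state 1 = "name";
        // symbol 0 = lowercase word, symbol 1 = capitalized word.
        double[] start = {0.8, 0.2};
        double[][] trans = {{0.7, 0.3}, {0.4, 0.6}};
        double[][] emit = {{0.9, 0.1}, {0.2, 0.8}};
        int[] obs = {0, 1, 1, 0};
        System.out.println(Arrays.toString(decode(start, trans, emit, obs)));
    }
}
```

Working in log space avoids numeric underflow on long token sequences; a zero probability simply becomes negative infinity and is never selected unless every alternative is also impossible.
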
| Modifier and Type | Method and Description |
|---|---|
static Annotation[] |
Tokenizer.gatherTokens(Document doc,
Span span)
returns an array containing all token annotations in 'span' of 'doc'. |
| Modifier and Type | Method and Description |
|---|---|
boolean |
PartOfSpeechRule.accept(Document doc,
Annotation[] tokens,
int n) |
boolean |
RegexpRule.accept(Document doc,
Annotation[] tokens,
int pos) |
boolean |
NamedEntityRule.accept(Document doc,
Annotation[] tokens,
int n) |
boolean |
StringRule.accept(Document doc,
Annotation[] tokens,
int n) |
boolean |
MatchRule.accept(Document doc,
Annotation[] tokens,
int n) |
boolean |
ClassRule.accept(Document doc,
Annotation[] tokens,
int n,
ClassHierarchyResolver resolver) |
boolean |
TransformRule.accept(Document doc,
Annotation[] tokens,
int pos,
ClassHierarchyResolver resolver)
determines whether the left-hand side of the rule matches the tokens
beginning with token[pos].
|
boolean |
MatchRuleItem.accept(Document doc,
Annotation[] tokens,
int pos,
ClassHierarchyResolver resolver) |
void |
TransformRule.transform(Document doc,
Annotation[] tokens,
int pos)
applies the transformation (right-hand part) of the rule to the tokens
starting with token[pos].
|
| Modifier and Type | Field and Description |
|---|---|
Annotation |
ParseTreeNode.ann
for leaf nodes, the (token or constit) annotation matched by this node.
|
| Modifier and Type | Method and Description |
|---|---|
static Annotation |
StatParser.buildWordDefn(Document doc,
String word,
Span span,
Annotation wordDefn,
String pennPOS) |
static Annotation[] |
ParseTreeNode.children(Annotation node)
given a parse tree Annotation 'node', as created by makeParseAnnotations,
returns an array containing the children of 'node', or null if the node
has no children.
|
static Annotation |
ParseTreeNode.makeParseAnnotations(Document doc,
ParseTreeNode n)
given a parse tree in the form of nested ParseTreeNodes, adds an
Annotation of type 'constit' to Document 'doc' for each non-terminal node
in the tree.
|
| Modifier and Type | Method and Description |
|---|---|
static HashMap<Annotation,Annotation> |
SynFun.collectParents(Annotation root)
returns a map from each child node to its parent in the parse tree
rooted at 'root'. |
static Set<Annotation> |
StatParser.descendants(Annotation node)
returns a Set containing the parse tree node and all of its
descendants (its children, the children of its children, etc.).
|
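
The parent map and the descendant set described above are both simple tree traversals. The sketch below shows the idea on a hypothetical `TreeNode` class; Jet itself represents parse trees as Annotations, which this example does not model, and `ParseTreeSketch` and its method bodies are assumptions rather than Jet's implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a parse tree node.
class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label, TreeNode... kids) {
        this.label = label;
        children.addAll(Arrays.asList(kids));
    }
}

class ParseTreeSketch {

    /** Maps every node below root to its parent; root itself has no entry. */
    static Map<TreeNode, TreeNode> collectParents(TreeNode root) {
        Map<TreeNode, TreeNode> parents = new HashMap<>();
        Deque<TreeNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode n = stack.pop();
            for (TreeNode c : n.children) {
                parents.put(c, n);
                stack.push(c);
            }
        }
        return parents;
    }

    /** Returns node together with all of its descendants. */
    static Set<TreeNode> descendants(TreeNode node) {
        Set<TreeNode> out = new HashSet<>();
        out.add(node);
        for (TreeNode c : node.children) {
            out.addAll(descendants(c));
        }
        return out;
    }

    public static void main(String[] args) {
        TreeNode det = new TreeNode("det");
        TreeNode np = new TreeNode("np", det, new TreeNode("n"));
        TreeNode s = new TreeNode("s", np, new TreeNode("vp", new TreeNode("v")));
        System.out.println("parent of det: " + collectParents(s).get(det).label);
        System.out.println("nodes under np (inclusive): " + descendants(np).size());
    }
}
```
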
| Modifier and Type | Method and Description |
|---|---|
static Annotation |
StatParser.buildWordDefn(Document doc,
String word,
Span span,
Annotation wordDefn,
String pennPOS) |
static Annotation[] |
ParseTreeNode.children(Annotation node)
given a parse tree Annotation 'node', as created by makeParseAnnotations,
returns an array containing the children of 'node', or null if the node
has no children.
|
static HashMap<Annotation,Annotation> |
SynFun.collectParents(Annotation root)
returns a map from each child node to its parent in the parse tree
rooted at 'root'. |
static void |
StatParser.deleteUnusedConstits(Document doc,
Span span,
Annotation rootAnnotation)
deletes all annotations of type 'constit' within span 'span' of
Document 'doc' which are not descendants of 'rootAnnotation'.
|
static Set<Annotation> |
StatParser.descendants(Annotation node)
returns a Set containing the parse tree node and all of its
descendants (its children, the children of its children, etc.).
|
static int |
AddSyntacticRelations.findPrepositionIndex(Annotation pp)
starting from a PP constituent node, returns the index of the
child node of category P or DP, or -1 if no such category is found.
|
static String |
SynFun.getDet(Annotation constit)
returns the determiner of 'constit', or null if the
constituent has no determiner.
|
static String |
SynFun.getHead(Document doc,
Annotation ann)
returns the head string of constituent 'ann' in a parse tree.
|
static boolean |
SynFun.getHuman(Annotation constit)
returns true if noun phrase 'constit' has a human head,
as recorded either in a 'human' feature on PA (by the
chunk patterns) or an 'nhuman' feature in the dictionary.
|
static String |
SynFun.getImmediateHead(Annotation node)
returns the head string for the current node.
|
static String |
SynFun.getName(Document doc,
Annotation constit)
returns the name associated with a noun phrase, as a single
string, or null if the np does not have a name.
|
static String |
SynFun.getNameOrHead(Document doc,
Annotation ann)
if the head (the end of the 'headC' chain) of constituent 'ann'
is a name, return the name itself (with tokens connected by '-');
otherwise return the head as determined by 'getHead'.
|
static String |
SynFun.getNumber(Annotation constit)
returns the number feature of noun phrase 'constit'
(singular or plural), or 'null' if the number feature is
not specified.
|
static Object |
SynFun.getPA(Annotation constit)
returns the 'pa' feature directly or indirectly associated with
parse tree node 'constit'.
|
static String |
SynFun.headOfPa(Object pa,
Annotation ann)
returns the head string from the value of the 'pa' feature.
|
| Constructor and Description |
|---|
ParseTreeNode(Object category,
ParseTreeNode[] children,
int start,
int end,
Annotation ann,
String word)
create a ParseTreeNode corresponding to a leaf of the parse tree.
|
ParseTreeNode(Object category,
ParseTreeNode[] children,
int start,
int end,
Annotation ann,
String word,
String function)
create a ParseTreeNode corresponding to a leaf of the parse tree.
|
ParseView(String title,
Annotation root)
creates a new Frame entitled 'title' displaying the parse tree rooted
at 'root'.
|
| Modifier and Type | Method and Description |
|---|---|
static HashMap |
Pat.matchAnnotations(Annotation ann1,
Annotation ann2,
HashMap bindings)
determines whether annotations 'ann1' and 'ann2' can be
matched (unified), consistent with variable bindings 'bindings'.
|
| Modifier and Type | Method and Description |
|---|---|
static Annotation |
Resolve.getHeadC(Annotation ann)
returns the head constituent associated with constituent 'ann'.
|
static Annotation |
Resolve.getNgHead(Annotation ng) |
| Modifier and Type | Method and Description |
|---|---|
static Vector<Annotation> |
Resolve.gatherClauses(Document doc,
Span span)
returns the set of all clauses (constituents of category s, rn-wh,
or rn-vingo) within Span 'span' of Document 'doc'. |
static Vector<Annotation> |
Resolve.gatherMentions(Document doc,
Span span)
returns the set of all mentions -- constituents which are
subject to reference resolution.
|
static HashMap<Annotation,Annotation> |
Resolve.gatherSyntacticCoref(Document doc,
Vector<Annotation> mentions,
Vector<Annotation> clauses)
gatherSyntacticCoref looks for particular syntactic patterns in the
text which indicate coreference, and returns a Map with one entry
for each such syntactic coreference, linking the anaphor to the
antecedent.
|
| Modifier and Type | Method and Description |
|---|---|
static int |
Hobbs.distance(Document doc,
Annotation m1,
Annotation m2,
ArrayList<Annotation> antecedents,
Vector sentences)
computes the distance (number of mention nodes traversed) in a Hobbs search
starting from parse tree node 'm2' and searching backwards for parse
tree node 'm1'.
|
static Annotation |
Resolve.getHeadC(Annotation ann)
returns the head constituent associated with constituent 'ann'.
|
static String[] |
Resolve.getHeadTokens(Document doc,
Annotation constit) |
static String[] |
Resolve.getNameTokens(Document doc,
Annotation constit)
returns the name associated with a noun phrase, as an array of token
strings, or null if the np does not have a name.
|
static Annotation |
Resolve.getNgHead(Annotation ng) |
static boolean |
Resolve.isName(Annotation constit)
returns true if 'constit' is a name.
|
static boolean |
Resolve.matchPronoun(Document doc,
Annotation anaphor,
String mentionHead,
Annotation ent)
return true if pronoun 'mentionHead' is a possible anaphor for
entity 'ent' (this also includes possessive pronouns of category
'det', and headless noun phrases of category 'np').
|
static float |
MaxEntResolve.matchPronoun(Document doc,
Annotation anaphor,
String pronoun,
Annotation entity,
boolean parse,
ArrayList<Annotation> antecedents)
return the probability that pronoun 'pronoun' is a possible anaphor for
entity 'entity' (this also includes possessive pronouns of category
'det', and headless noun phrases of category 'np').
|
static boolean |
Resolve.nameNomCoref(Document doc,
String det,
String mentionHead,
Annotation mention,
Annotation entity)
return true if a common noun phrase headed by 'mentionHead' is a possible
anaphoric reference to the (named) entity 'entity'.
|
static boolean |
Resolve.nomInName(Document doc,
Annotation mention,
Annotation entity) |
static boolean |
Hobbs.sameSimplex(Annotation x,
Annotation y,
HashMap<Annotation,Annotation> parents)
returns true if parse tree nodes 'x' and 'y' are part of the same
simplex sentence (used for reflexive pronoun tests). |
static void |
MaxEntResolve.trainOnMention(Document doc,
Annotation mention)
adds information on mention 'mention' and its possible
antecedents to the training data which will be used to train the
coreference model. |
| Modifier and Type | Method and Description |
|---|---|
static int |
Hobbs.distance(Document doc,
Annotation m1,
Annotation m2,
ArrayList<Annotation> antecedents,
Vector sentences)
computes the distance (number of mention nodes traversed) in a Hobbs search
starting from parse tree node 'm2' and searching backwards for parse
tree node 'm1'.
|
static HashMap<Annotation,Annotation> |
Resolve.gatherSyntacticCoref(Document doc,
Vector<Annotation> mentions,
Vector<Annotation> clauses)
gatherSyntacticCoref looks for particular syntactic patterns in the
text which indicate coreference, and returns a Map with one entry
for each such syntactic coreference, linking the anaphor to the
antecedent.
|
static float |
MaxEntResolve.matchPronoun(Document doc,
Annotation anaphor,
String pronoun,
Annotation entity,
boolean parse,
ArrayList<Annotation> antecedents)
return the probability that pronoun 'pronoun' is a possible anaphor for
entity 'entity' (this also includes possessive pronouns of category
'det', and headless noun phrases of category 'np').
|
static void |
Resolve.references(Document doc,
Span span,
Vector<Annotation> mentions,
Vector<Annotation> clauses) |
static boolean |
Hobbs.sameSimplex(Annotation x,
Annotation y,
HashMap<Annotation,Annotation> parents)
returns true if parse tree nodes 'x' and 'y' are part of the same
simplex sentence (used for reflexive pronoun tests). |
| Modifier and Type | Method and Description |
|---|---|
PatternMatchResult |
DayOfWeekPattern.match(Document doc,
List<Annotation> tokens,
int offset) |
PatternMatchResult |
NumberPattern.match(Document doc,
List<Annotation> tokens,
int offset) |
PatternMatchResult |
RegexPattern.match(Document doc,
List<Annotation> tokens,
int offset) |
PatternMatchResult |
MonthPattern.match(Document doc,
List<Annotation> tokens,
int offset) |
abstract PatternMatchResult |
PatternItem.match(Document doc,
List<Annotation> tokens,
int offset)
if tokens[offset] matches the PatternItem, return a PatternMatchResult
containing the normalized value of the matched token along with
the span of the matched token; otherwise return null. |
PatternMatchResult |
TimePattern.match(Document doc,
List<Annotation> tokens,
int offset)
if the tokens beginning at tokens[offset] constitute a time expression,
return a PatternMatchResult incorporating that time expression and
its span; otherwise return null. |
PatternMatchResult |
StringPattern.match(Document doc,
List<Annotation> tokens,
int offset) |
Span |
TimeRule.matches(Document doc,
List<Annotation> tokens,
int offset,
org.joda.time.DateTime ref,
List<Object> values)
matches the pattern portion of the current TimeRule against the sequence
of tokens in 'doc' starting with |
protected int |
TimeRule.nextOffset(List<Annotation> tokens,
int offset,
Span span) |
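
The `match` contract described above (return a normalized value plus the matched span, or null when the token at `offset` does not match) can be sketched for month names as follows. This is not Jet's `MonthPattern`: the class `MonthMatchSketch`, its nested `Match` result type, and the use of plain strings and token indices in place of Annotations and character spans are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of a single-token pattern item.
class MonthMatchSketch {

    /** Result of a successful match: normalized value plus the matched token range. */
    static class Match {
        final int value;      // month number, 1-12
        final int start;      // index of the first matched token
        final int end;        // index just past the last matched token

        Match(int value, int start, int end) {
            this.value = value;
            this.start = start;
            this.end = end;
        }
    }

    static final List<String> MONTHS = Arrays.asList(
            "january", "february", "march", "april", "may", "june",
            "july", "august", "september", "october", "november", "december");

    /** Returns a Match if tokens.get(offset) is a month name, otherwise null. */
    static Match match(List<String> tokens, int offset) {
        if (offset < 0 || offset >= tokens.size()) {
            return null;
        }
        int i = MONTHS.indexOf(tokens.get(offset).toLowerCase(Locale.ROOT));
        return (i < 0) ? null : new Match(i + 1, offset, offset + 1);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("on", "March", "3");
        Match m = match(tokens, 1);
        System.out.println(m == null ? "no match" : "month " + m.value);
        System.out.println(match(tokens, 0) == null ? "no match" : "match");
    }
}
```

A rule made of several such items would call each item in turn, advancing the offset past the span returned by the previous match.
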
| Modifier and Type | Method and Description |
|---|---|
Annotation |
Document.addAnnotation(Annotation ann)
Adds an annotation to the document.
|
Annotation |
Document.annotate(String tp,
Span sp,
FeatureSet att)
Creates an annotation and adds it to the document.
|
Annotation |
Document.tokenAt(int start)
Returns the token annotation starting at position start, or
null if no token starts at this position.
|
Annotation |
Document.tokenEndingAt(int end)
Returns the token annotation ending at position end, or null
if no token ends at this position.
|
| Modifier and Type | Method and Description |
|---|---|
Vector<Annotation> |
Document.annotationsAt(int start)
Returns the annotations beginning at character position start.
|
Vector<Annotation> |
Document.annotationsAt(int start,
String type)
Returns the annotations of type type beginning at character
position start.
|
Vector<Annotation> |
Document.annotationsAt(int start,
String[] types)
Returns the annotations which begin at character position start
and whose type is in array types.
|
Vector<Annotation> |
Document.annotationsEndingAt(int end)
Returns the annotations ending at character position end.
|
Vector<Annotation> |
Document.annotationsEndingAt(int end,
String type)
Returns the annotations of type type ending at character position
end.
|
Vector<Annotation> |
Document.annotationsOfType(String type)
Returns a vector of all annotations of type type.
|
Vector<Annotation> |
Document.annotationsOfType(String type,
Span span)
Returns a vector of all annotations of type type whose span is
contained within span.
|
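
To make the lookup methods above concrete, here is a minimal sketch of a document that holds text plus typed annotations over character spans. The classes `SpanSketch`, `AnnotationSketch`, and `DocumentSketch` are hypothetical stand-ins, not the Jet classes: the sketch returns a `List` rather than a `Vector`, uses a linear scan, treats the end offset as exclusive, and returns an empty list when nothing matches. None of these choices is claimed to mirror Jet's behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Jet classes.
class SpanSketch {
    final int start;
    final int end; // treated as exclusive in this sketch

    SpanSketch(int start, int end) {
        this.start = start;
        this.end = end;
    }

    boolean contains(SpanSketch other) {
        return start <= other.start && other.end <= end;
    }
}

class AnnotationSketch {
    final String type;
    final SpanSketch span;

    AnnotationSketch(String type, SpanSketch span) {
        this.type = type;
        this.span = span;
    }
}

class DocumentSketch {
    private final String text;
    private final List<AnnotationSketch> annotations = new ArrayList<>();

    DocumentSketch(String text) {
        this.text = text;
    }

    /** Creates an annotation, records it on the document, and returns it. */
    AnnotationSketch annotate(String type, int start, int end) {
        AnnotationSketch a = new AnnotationSketch(type, new SpanSketch(start, end));
        annotations.add(a);
        return a;
    }

    /** Annotations of the given type that begin at character position start. */
    List<AnnotationSketch> annotationsAt(int start, String type) {
        List<AnnotationSketch> out = new ArrayList<>();
        for (AnnotationSketch a : annotations) {
            if (a.span.start == start && a.type.equals(type)) {
                out.add(a);
            }
        }
        return out;
    }

    /** Annotations of the given type whose span lies inside 'within'. */
    List<AnnotationSketch> annotationsOfType(String type, SpanSketch within) {
        List<AnnotationSketch> out = new ArrayList<>();
        for (AnnotationSketch a : annotations) {
            if (a.type.equals(type) && within.contains(a.span)) {
                out.add(a);
            }
        }
        return out;
    }

    /** The text covered by an annotation. */
    String text(AnnotationSketch a) {
        return text.substring(a.span.start, a.span.end);
    }

    public static void main(String[] args) {
        DocumentSketch doc = new DocumentSketch("Fred Smith runs.");
        doc.annotate("token", 0, 4);
        doc.annotate("token", 5, 10);
        doc.annotate("token", 11, 15);
        AnnotationSketch name = doc.annotate("name", 0, 10);
        System.out.println(doc.text(name));
        System.out.println(doc.annotationsOfType("token", name.span).size());
    }
}
```
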
| Modifier and Type | Method and Description |
|---|---|
Annotation |
Document.addAnnotation(Annotation ann)
Adds an annotation to the document.
|
void |
AnnotationTool.addType(char key,
Annotation annotationPrototype)
specifies that pressing key 'key' will cause an Annotation with
type and features specified by 'annotationPrototype' to be added
to the document over the selected text.
|
static Color |
AnnotationColor.getColor(Annotation ann)
returns the Color associated with Annotation ann, or null if there
is no Color association for this Annotation.
|
String |
Document.normalizedText(Annotation ann)
Returns the text subsumed by annotation ann, with leading and
trailing whitespace removed, and other whitespace sequences replaced by a
single blank.
|
void |
Document.removeAnnotation(Annotation ann)
Removes annotation ann from the document.
|
void |
Document.shrink(Annotation ann)
shrink the endpoint of Annotation ann to remove any
trailing whitespace. |
void |
Document.stretch(Annotation ann)
extend the endpoint of Annotation ann to include the following whitespace
past the current endpoint.
|
String |
Document.text(Annotation ann)
Returns the text subsumed by annotation ann.
|
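
The whitespace handling described for `normalizedText`, `shrink`, and `stretch` can be sketched on plain strings and offsets. The class `WhitespaceSketch` and its methods are hypothetical helpers written for this example; they operate on a string and integer offsets rather than on a Jet Document and Annotation, and Jet's exact definition of whitespace may differ.

```java
// Hypothetical helpers, not part of Jet.
class WhitespaceSketch {

    /** Removes leading and trailing whitespace and collapses inner runs to one blank. */
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    /** Moves the end offset of text[start, end) left past any trailing whitespace. */
    static int shrinkEnd(String text, int start, int end) {
        while (end > start && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    /** Moves an end offset right past any whitespace that immediately follows it. */
    static int stretchEnd(String text, int end) {
        while (end < text.length() && Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return end;
    }

    public static void main(String[] args) {
        String text = "Fred  Smith";
        System.out.println("[" + normalize("  New \n  York  ") + "]");
        System.out.println(shrinkEnd(text, 0, 6));
        System.out.println(stretchEnd(text, 4));
    }
}
```
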
| Modifier and Type | Method and Description |
|---|---|
static void |
Annotation.sortByStartPosition(List<Annotation> annotations)
sorts a list of annotations by start position.
|
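
Ordering annotations by start position amounts to a sort with a comparator on the start offset. The sketch below uses a hypothetical `Ann` class in place of Jet's Annotation. How Jet orders annotations that share a start position is not stated in this listing, so the tie behavior here (original order preserved, because `List.sort` is stable) is only a property of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Jet's Annotation class.
class StartOrderSketch {

    static class Ann {
        final String type;
        final int start;
        final int end;

        Ann(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    /** Sorts the list in place by ascending start offset. */
    static void sortByStart(List<Ann> anns) {
        anns.sort(Comparator.comparingInt((Ann a) -> a.start));
    }

    public static void main(String[] args) {
        List<Ann> anns = new ArrayList<>(Arrays.asList(
                new Ann("token", 5, 10),
                new Ann("name", 0, 10),
                new Ann("token", 0, 4)));
        sortByStart(anns);
        System.out.println(anns);
    }
}
```
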
| Modifier and Type | Method and Description |
|---|---|
Vector<Annotation> |
SentenceSet.sentences()
returns a Vector of sentence Annotations.
|
Copyright © 2016 New York University. All rights reserved.