public class Document extends Object implements Serializable
Hypotheses: Each annotation on a document can be associated with a
hypothesis. If currentHypothesis (set by
setCurrentHypothesis(java.lang.Object)) is non-null, each annotation added to the
document is given a feature hypo with that value. If
activeHypotheses (set by setActiveHypotheses(java.util.Set)) is
non-null, methods which return annotations only return those annotations
whose hypo feature is among the activeHypotheses.
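To make the interplay of the two settings concrete, here is a minimal, self-contained sketch. It does not use the Jet classes: HypoSketch and Ann are hypothetical stand-ins, and hiding annotations that carry no hypo feature while a filter is active is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a document; NOT the Jet Document class.
class HypoSketch {
    // Hypothetical stand-in for an annotation: a type plus a feature map.
    static class Ann {
        final String type;
        final Map<String, Object> features = new HashMap<>();
        Ann(String type) { this.type = type; }
    }

    private final List<Ann> annotations = new ArrayList<>();
    private Object currentHypothesis = null;
    private Set<?> activeHypotheses = null;

    void setCurrentHypothesis(Object hypoId) { currentHypothesis = hypoId; }
    void setActiveHypotheses(Set<?> hypoIdSet) { activeHypotheses = hypoIdSet; }

    // While currentHypothesis is non-null, stamp each new annotation with it.
    Ann addAnnotation(Ann ann) {
        if (currentHypothesis != null) {
            ann.features.put("hypo", currentHypothesis);
        }
        annotations.add(ann);
        return ann;
    }

    // While activeHypotheses is non-null, return only annotations whose
    // "hypo" feature is a member of that set.
    List<Ann> annotationsOfType(String type) {
        List<Ann> result = new ArrayList<>();
        for (Ann ann : annotations) {
            if (!ann.type.equals(type)) continue;
            Object hypo = ann.features.get("hypo");
            // Assumption: untagged annotations are hidden while a filter is active.
            if (activeHypotheses != null
                    && (hypo == null || !activeHypotheses.contains(hypo))) continue;
            result.add(ann);
        }
        return result;
    }
}
```

In this sketch, adding one annotation under hypothesis "h1" and another under "h2", then calling setActiveHypotheses with a set containing only "h1", leaves a single annotation visible through annotationsOfType.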
| Modifier and Type | Field and Description |
|---|---|
| SyntacticRelationSet | relations |
| Constructor and Description |
|---|
| Document() Creates a new document with no text or annotations. |
| Document(Document doc) |
| Document(String stg) Creates a new document with text stg and no annotations. |
| Modifier and Type | Method and Description |
|---|---|
| Annotation | addAnnotation(Annotation ann) Adds an annotation to the document. |
| Annotation | annotate(String tp, Span sp, FeatureSet att) Creates an annotation and adds it to the document. |
| void | annotateWithTag(String tag) Annotates the document with the Span of text between <tag> and </tag>. |
| void | annotateWithTag(String tag, int start, int end) Annotates the document with the Span of text between <tag> and </tag>. |
| Vector<Annotation> | annotationsAt(int start) Returns the annotations beginning at character position start. |
| Vector<Annotation> | annotationsAt(int start, String type) Returns the annotations of type type beginning at character position start. |
| Vector<Annotation> | annotationsAt(int start, String[] types) Returns the annotations which begin at character position start and whose type is in array types. |
| Vector<Annotation> | annotationsEndingAt(int end) Returns the annotations ending at character position end. |
| Vector<Annotation> | annotationsEndingAt(int end, String type) Returns the annotations of type type ending at character position end. |
| Vector<Annotation> | annotationsOfType(String type) Returns a vector of all annotations of type type. |
| Vector<Annotation> | annotationsOfType(String type, Span span) Returns a vector of all annotations of type type whose span is contained within span. |
| StringBuffer | append(char c) Adds the char c to the end of the document. |
| StringBuffer | append(String stg) Adds the text stg to the end of the document. |
| char | charAt(int posn) Returns the character at position posn in the document. |
| void | clear() Deletes the text and all annotations on a document, creating an empty document. |
| void | clearAnnotations() Removes all annotations on the document. |
| Span | fullSpan() Returns a Span covering the entire document. |
| String[] | getAnnotationTypes() Returns an array of all annotation types. |
| int | getNextAnnotationId() Returns a unique integer for this Document, to be used in assigning an 'id' feature to an Annotation on this Document. |
| int | length() Returns the length of the document (in characters). |
| String | normalizedText(Annotation ann) Returns the text subsumed by annotation ann, with leading and trailing whitespace removed, and other whitespace sequences replaced by a single blank. |
| String | normalizedText(Span s) Returns the text subsumed by span s, with leading and trailing whitespace removed, and other whitespace sequences replaced by a single blank. |
| void | removeAnnotation(Annotation ann) Removes annotation ann from the document. |
| void | removeAnnotationsOfType(String type) Removes all annotations of type type from the document. |
| void | setActiveHypotheses(Set hypoIdSet) Sets the value of activeHypotheses. |
| void | setCharAt(int posn, char c) Sets the character at position posn to c. |
| void | setCurrentHypothesis(Object hypoId) Sets the value of currentHypothesis. |
| void | setSGMLindent(int n) Sets the amount to indent SGML tags, per level of tag nesting, for writeSGML. |
| void | setSGMLwrapMargin(int n) Sets the right margin for wrapping SGML tags (inserting newlines into them). |
| void | setText(String stg) Sets the text of a document. |
| void | shrink(Annotation ann) Shrinks the endpoint of Annotation ann to remove any trailing whitespace. |
| void | shrink(String type) Shrinks the endpoint, using shrink, of all annotations of type type, to eliminate trailing whitespace. |
| void | shrinkAll() Shrinks the endpoint, using shrink, of all annotations in the document. |
| void | stretch(Annotation ann) Extends the endpoint of Annotation ann to include the whitespace following the current endpoint. |
| void | stretch(String type) Extends the endpoint, using stretch, of all annotations of type type. |
| void | stretchAll() Extends the endpoint, using stretch, of all annotations in the document. |
| String | text() Returns the entire text of the document. |
| String | text(Annotation ann) Returns the text subsumed by annotation ann. |
| String | text(Span s) Returns the text subsumed by span s. |
| Annotation | tokenAt(int start) Returns the token annotation starting at position start, or null if no token starts at this position. |
| Annotation | tokenEndingAt(int end) Returns the token annotation ending at position end, or null if no token ends at this position. |
| StringBuffer | writeSGML(String type) Returns the text of the document with each instance of an annotation of type type enclosed in SGML tags. |
| StringBuffer | writeSGML(String[] types, int start, int end) Performs writeSGML for characters start through end of the Document, generating tags for annotations whose type appears in array types. |
| StringBuffer | writeSGML(String type, Span span) Performs writeSGML over the portion span of the Document. |
public SyntacticRelationSet relations
public Document()
public Document(String stg)
public Document(Document doc)
public void clear()
public void setText(String stg)
public String text()
public String text(Annotation ann)
public String normalizedText(Span s)
public String normalizedText(Annotation ann)
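The whitespace handling described for normalizedText can be approximated with standard Java string methods. This is a sketch, not the Jet implementation: NormalizeSketch and normalize are hypothetical names, and it assumes "whitespace" means what Java's `\s` regex class and String.trim() treat as whitespace.

```java
// Hypothetical helper; not part of the Jet API.
class NormalizeSketch {
    // Trim leading and trailing whitespace, then collapse each remaining
    // whitespace run (spaces, tabs, newlines) into a single blank.
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }
}
```

For example, under these assumptions the input "  New \n  York\tUniversity " normalizes to "New York University".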
public StringBuffer append(String stg)
public StringBuffer append(char c)
public int length()
public Span fullSpan()
public char charAt(int posn)
public void setCharAt(int posn, char c)
public void clearAnnotations()
public Annotation addAnnotation(Annotation ann)
public Annotation annotate(String tp, Span sp, FeatureSet att)
public void removeAnnotation(Annotation ann)
public void removeAnnotationsOfType(String type)
public Vector<Annotation> annotationsAt(int start)
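public Vector<Annotation> annotationsAt(int start, String type)
Returns the annotations of type type beginning at character position start. If there are no such annotations, returns null.
public Vector<Annotation> annotationsAt(int start, String[] types)
Returns the annotations which begin at character position start and whose type is in array types. If there are no such annotations, returns null.
public Vector<Annotation> annotationsEndingAt(int end)
public Vector<Annotation> annotationsEndingAt(int end, String type)
Returns the annotations of type type ending at character position end. If there are no such annotations, returns null.
public Annotation tokenAt(int start)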
public Annotation tokenEndingAt(int end)
public Vector<Annotation> annotationsOfType(String type)
public Vector<Annotation> annotationsOfType(String type, Span span)
Returns a vector of all annotations of type type whose span is contained within span. If span is null, all annotations of that type are returned. Returns null if there are no annotations starting at this position.
public void setCurrentHypothesis(Object hypoId)
Sets the value of currentHypothesis. If currentHypothesis is non-null, a hypo feature with this value is added to every new annotation on this document.
public void setActiveHypotheses(Set hypoIdSet)
Sets the value of activeHypotheses. If activeHypotheses is non-null, methods which retrieve annotations on a document only return those annotations whose hypo value is in activeHypotheses.
public String[] getAnnotationTypes()
public void annotateWithTag(String tag, int start, int end)
Annotates the document with the Span of text between <tag> and </tag>. Sets the type of the annotation to the tag name.
tag - name of a tag to find a Span between tags
start - where to start searching for a tag
end - where to end searching for a tag
public void annotateWithTag(String tag)
Annotates the document with the Span of text between <tag> and </tag>. Sets the type of the annotation to the tag name.
tag - name of a tag to find a Span between tags
public int getNextAnnotationId()
public void setSGMLwrapMargin(int n)
public void setSGMLindent(int n)
public StringBuffer writeSGML(String type)
A Jet Document may contain annotations that are not nested, but these cannot be represented in SGML or XML. If the endpoint of an annotation is greater than the endpoint of a preceding annotation that is still open, the annotation is not written out.
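The skipping rule above can be illustrated with a stack of open annotations. The following is only a sketch of that rule, not the writeSGML implementation: NestingSketch, Sp, and writable are hypothetical names, offsets are assumed to be end-exclusive, and the input is assumed to be ordered by start offset (longer annotation first on ties).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not part of the Jet API.
class NestingSketch {
    // Hypothetical stand-in for an annotation's type and extent [start, end).
    static class Sp {
        final String type;
        final int start;
        final int end;
        Sp(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the annotations that can be written as properly nested tags.
    // Assumes 'ordered' is sorted by start offset.
    static List<Sp> writable(List<Sp> ordered) {
        Deque<Sp> open = new ArrayDeque<>();   // innermost open annotation on top
        List<Sp> kept = new ArrayList<>();
        for (Sp sp : ordered) {
            // Close every open annotation that ends at or before this start.
            while (!open.isEmpty() && open.peek().end <= sp.start) {
                open.pop();
            }
            // If this annotation ends after a still-open one, its tags would
            // cross that annotation's tags, so it is skipped.
            if (!open.isEmpty() && sp.end > open.peek().end) {
                continue;
            }
            open.push(sp);
            kept.add(sp);
        }
        return kept;
    }
}
```

Because the kept annotations are always properly nested, the innermost open annotation has the smallest endpoint, so comparing against the top of the stack is enough.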
public StringBuffer writeSGML(String type, Span span)
public StringBuffer writeSGML(String[] types, int start, int end)
public void stretch(Annotation ann)
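The endpoint adjustments made by stretch and shrink can be pictured on a plain string. This sketch is not the Jet implementation: EndpointSketch and its methods are hypothetical, the end offset is assumed to be exclusive, and Character.isWhitespace is assumed as the whitespace test.

```java
// Hypothetical helpers operating on a String and an end offset;
// not part of the Jet API.
class EndpointSketch {
    // Move the (exclusive) end offset right, past any whitespace following it.
    static int stretch(String text, int end) {
        while (end < text.length() && Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return end;
    }

    // Move the (exclusive) end offset left, past any whitespace preceding it,
    // without moving below start.
    static int shrink(String text, int start, int end) {
        while (end > start && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        return end;
    }
}
```

On the text "New York  City", a span covering "York" (offsets 4 to 8) stretches to end at 10, and shrinking that stretched span returns its end to 8.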
public void stretch(String type)
Extends the endpoint, using stretch, of all annotations of type type.
public void stretchAll()
Extends the endpoint, using stretch, of all annotations in the document.
public void shrink(Annotation ann)
Shrinks the endpoint of Annotation ann to remove any trailing whitespace.
public void shrink(String type)
Shrinks the endpoint, using shrink, of all annotations of type type, to eliminate trailing whitespace.
public void shrinkAll()
Shrinks the endpoint, using shrink, of all annotations in the document.
Copyright © 2016 New York University. All rights reserved.