public class Retagger extends Object
| Constructor and Description |
|---|
| Retagger() |
| Modifier and Type | Method and Description |
|---|---|
| static boolean | compatible(String word, String pennPOS, Annotation jetDefn): returns true if Penn part-of-speech tag 'pennPOS', as a tag for 'word', is compatible with Jet word definition 'jetDefn'. |
| static String | jetToPtbPos(FeatureSet fs): given a FeatureSet fs for a Jet lexical constituent (with a 'cat' feature and possibly other features), return a Penn POS consistent with 'fs'. |
| static void | pruneConstit(Document d, Span zone): prunes constit annotations obtained from lexical look-up using Penn tags (recorded as tagger annotations). |
| static FeatureSet[] | ptbToJetFS(String word, String pennPOS): given an annotation based on Penn tag set, returns an array (possibly empty) of corresponding Jet FeatureSets, with one entry for each possible Jet category and attributes. |
public static FeatureSet[] ptbToJetFS(String word, String pennPOS)
public static String jetToPtbPos(FeatureSet fs)
public static void pruneConstit(Document d, Span zone)
The following rules are applied at each token:
1. If there are no lexical entries (constit tags), generate
constit annotations based on tagger tags.
2. If there is a lexical entry spanning more than one token,
keep it.
3. If there are lexical entries, and some of them are
consistent with the tagger tags, keep only the consistent
entries.
4. If the tagger tag is POS (possessive), delete all lexical
entries.
5. Otherwise (there are lexical entries, but none are consistent
with the tagger tags) keep all lexical entries.
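The rules above can be read as a decision procedure. The following is a minimal, self-contained sketch of that reading, not the Jet implementation. It assumes the rules are checked in the order listed and that the first applicable rule decides the outcome. The class PruneRulesSketch, the Entry type, and the Action enum are hypothetical names introduced only for illustration; the real method works on Jet Document and Annotation objects.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PruneRulesSketch {

    /** Hypothetical outcome labels, one per rule in the list above. */
    public enum Action {
        GENERATE_FROM_TAGGER, // rule 1
        KEEP_MULTI_TOKEN,     // rule 2
        KEEP_CONSISTENT,      // rule 3
        DELETE_ALL,           // rule 4
        KEEP_ALL              // rule 5
    }

    /** Hypothetical stand-in for one lexical (constit) entry at a token. */
    public static final class Entry {
        final int tokenCount;     // number of tokens the entry spans
        final boolean consistent; // true if it agrees with the tagger tag

        public Entry(int tokenCount, boolean consistent) {
            this.tokenCount = tokenCount;
            this.consistent = consistent;
        }
    }

    /**
     * Returns the rule outcome for one token. Assumption: rules are tried
     * in listed order and the first one that applies wins.
     */
    public static Action decide(String taggerTag, List<Entry> entries) {
        // Rule 1: no lexical entries, so fall back on the tagger tags.
        if (entries.isEmpty()) {
            return Action.GENERATE_FROM_TAGGER;
        }
        // Rule 2: an entry spanning more than one token is kept.
        for (Entry e : entries) {
            if (e.tokenCount > 1) {
                return Action.KEEP_MULTI_TOKEN;
            }
        }
        // Rule 3: at least one entry is consistent with the tagger tag.
        for (Entry e : entries) {
            if (e.consistent) {
                return Action.KEEP_CONSISTENT;
            }
        }
        // Rule 4: the tagger tag is POS (possessive).
        if ("POS".equals(taggerTag)) {
            return Action.DELETE_ALL;
        }
        // Rule 5: entries exist, but none are consistent.
        return Action.KEEP_ALL;
    }

    public static void main(String[] args) {
        List<Entry> none = Collections.emptyList();
        List<Entry> mixed = Arrays.asList(new Entry(1, true), new Entry(1, false));
        System.out.println(decide("NN", none));
        System.out.println(decide("NN", mixed));
    }
}
```

Returning a label rather than a pruned list keeps the sketch neutral on details the rule list leaves open, such as what happens to single-token entries when rule 2 applies.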
public static boolean compatible(String word, String pennPOS, Annotation jetDefn)
Copyright © 2016 New York University. All rights reserved.