edu.washington.cs.knowitall.sequence
Class RegexTagger

java.lang.Object
  extended by edu.washington.cs.knowitall.sequence.RegexTagger

public class RegexTagger
extends Object

Tags a sequence using a LayeredTokenPattern. A tagger is defined by a pattern and a tag. Given a LayeredSequence, the tag(LayeredSequence) method returns a list of strings, one per token in the sequence, where each string is either the tag (for a token matched by the pattern) or the OUT_TAG symbol (for a token that was not matched).

For example, given the sequence "she sells sea shells by the shore", the tag symbol "X", and a regular expression that matches words starting with "s", the tagger returns the list [X, X, X, X, O, O, X].
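The example above can be reproduced with a minimal stand-in that does not use this library. The sketch below is not the RegexTagger implementation: it uses java.util.regex.Pattern on plain strings instead of a LayeredTokenPattern on a LayeredSequence, it tests each token on its own rather than matching a pattern across a run of tokens, and it assumes OUT_TAG is "O" as the example output suggests. The class name TokenTaggingSketch and its tag method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TokenTaggingSketch {
    // Assumed value: the example output in the text uses "O" for unmatched tokens.
    static final String OUT_TAG = "O";

    // Returns one entry per token: the tag if the whole token matches, OUT_TAG otherwise.
    static List<String> tag(List<String> tokens, Pattern tokenPattern, String tag) {
        List<String> tags = new ArrayList<>();
        for (String token : tokens) {
            tags.add(tokenPattern.matcher(token).matches() ? tag : OUT_TAG);
        }
        return tags;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("she sells sea shells by the shore".split(" "));
        Pattern startsWithS = Pattern.compile("s.*");
        System.out.println(tag(tokens, startsWithS, "X")); // prints [X, X, X, X, O, O, X]
    }
}
```

The result always has the same length as the input, so the tag at index i describes the token at index i.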

Author:
afader

Field Summary
static String OUT_TAG
          The symbol used to represent a token that did not match the pattern.
 
Constructor Summary
RegexTagger(LayeredTokenPattern pattern, String tag)
           
 
Method Summary
 List<String> tag(LayeredSequence seq)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

OUT_TAG

public static final String OUT_TAG
The symbol used to represent a token that did not match the pattern.

See Also:
Constant Field Values

Constructor Detail

RegexTagger

public RegexTagger(LayeredTokenPattern pattern,
                   String tag)
Parameters:
pattern - the regular expression to match
tag - the tag to use for matching tokens

Method Detail

tag

public List<String> tag(LayeredSequence seq)
                 throws SequenceException
Parameters:
seq - the sequence to tag
Returns:
a list with one string per token in seq: the tag for tokens matched by the pattern, OUT_TAG otherwise
Throws:
SequenceException - if unable to match against seq
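A caller will often want the matched regions rather than the per-token list. The sketch below shows one way to read the result, assuming it is a flat list like the one in the class description. It groups consecutive entries equal to the tag into half-open [start, end) index ranges. The class name TaggedSpans and the spans helper are hypothetical and are not part of this library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaggedSpans {
    // Returns a {start, end} pair (end exclusive) for each maximal run of entries equal to tag.
    static List<int[]> spans(List<String> tags, String tag) {
        List<int[]> result = new ArrayList<>();
        int start = -1; // -1 means we are not currently inside a run
        for (int i = 0; i < tags.size(); i++) {
            boolean inside = tag.equals(tags.get(i));
            if (inside && start < 0) {
                start = i;                          // a run begins here
            } else if (!inside && start >= 0) {
                result.add(new int[] {start, i});   // the run ended at i - 1
                start = -1;
            }
        }
        if (start >= 0) {
            result.add(new int[] {start, tags.size()}); // run reaches the end of the list
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("X", "X", "X", "X", "O", "O", "X");
        for (int[] span : spans(tags, "X")) {
            System.out.println(span[0] + ".." + span[1]); // prints 0..4 then 6..7
        }
    }
}
```

Because the list carries a single tag symbol, two matches that sit directly next to each other are indistinguishable from one longer match and come out as a single range.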


Copyright © 2010-2012 University of Washington CSE. All Rights Reserved.