edu.washington.cs.knowitall.nlp
Class ChunkedSentencePattern

java.lang.Object
  extended by edu.washington.cs.knowitall.nlp.ChunkedSentencePattern

public class ChunkedSentencePattern
extends Object


Constructor Summary
ChunkedSentencePattern()
           
 
Method Summary
static edu.washington.cs.knowitall.regex.RegularExpression<ChunkedSentenceToken> compile(String regex)
          Compiles a regular expression over the ChunkedSentenceTokens in a sentence into an NFA.
static void main(String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChunkedSentencePattern

public ChunkedSentencePattern()
Method Detail

compile

public static edu.washington.cs.knowitall.regex.RegularExpression<ChunkedSentenceToken> compile(String regex)
Compiles a regular expression over the ChunkedSentenceTokens in a sentence into an NFA. The pattern language is deliberately redundant, largely because each token expression can itself pattern-match on the token's fields. This is not strictly necessary, but it is both an optimization and a shorthand: for example, <pos="NNPS?"> is equivalent to <pos="NNP" | pos="NNPS"> and to (?:<pos="NNP"> | <pos="NNPS">).
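
As a rough illustration of what matching over a token sequence means, the sketch below finds the spans that a pattern such as <pos="JJ">* <pos="NNPS?">+ would select. It is a stand-in built on java.util.regex, not the NFA this class builds, and it does not call the library: the sentence is reduced to its part-of-speech tags, and the class name TagSequenceSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone sketch; not part of edu.washington.cs.knowitall.
class TagSequenceSketch {
    // Counts the spaces in s[0, end), which equals the number of whole tokens before 'end'.
    static int countSpaces(CharSequence s, int end) {
        int n = 0;
        for (int i = 0; i < end; i++) {
            if (s.charAt(i) == ' ') n++;
        }
        return n;
    }

    // Returns matched spans as {startToken, endTokenExclusive} pairs.
    static List<int[]> findSpans(String[] posTags) {
        // Encode the sentence as "TAG TAG TAG " so that every token ends with one space.
        StringBuilder encoded = new StringBuilder();
        for (String tag : posTags) {
            encoded.append(tag).append(' ');
        }
        // Character-level stand-in for <pos="JJ">* <pos="NNPS?">+.
        // The lookbehind forces a match to begin at a token boundary.
        Pattern p = Pattern.compile("(?<![^ ])(?:JJ )*(?:NNPS? )+");
        Matcher m = p.matcher(encoded);
        List<int[]> spans = new ArrayList<>();
        while (m.find()) {
            spans.add(new int[] { countSpaces(encoded, m.start()), countSpaces(encoded, m.end()) });
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tags = { "DT", "JJ", "NNP", "NNPS", "VBD", "NNP" };
        for (int[] span : findSpans(tags)) {
            System.out.println(span[0] + ".." + span[1]); // prints 1..4 then 5..6
        }
    }
}
```

Encoding tags into a string only works for this toy case; a token-level matcher avoids the encoding step and can test several fields per token.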

Here are some equivalent examples:

  1. <pos="JJ">* <pos="NNP.?">+
  2. <pos="JJ">* <pos="NNPS?">+
  3. <pos="JJ">* <pos="NNP" | pos="NNPS">+
  4. <pos="JJ">* (?:<pos="NNP"> | <pos="NNPS">)+
Note that (3) and (4) are not preferred for efficiency reasons. Regex OR (in example (4)) should only be used on multi-ChunkedSentenceToken sequences.
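
The equivalence of the field-level forms can be checked outside the library. The sketch below uses plain java.util.regex and does not call ChunkedSentencePattern. It assumes that a field pattern must match the entire tag, and the class name FieldPatternEquivalence and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Stand-alone sketch; not part of edu.washington.cs.knowitall.
class FieldPatternEquivalence {
    // Returns the tags accepted by a field pattern.
    // Assumption: the pattern must match the whole tag, as Matcher.matches() requires.
    static List<String> accepted(String fieldRegex, List<String> tags) {
        Pattern p = Pattern.compile(fieldRegex);
        List<String> out = new ArrayList<>();
        for (String tag : tags) {
            if (p.matcher(tag).matches()) {
                out.add(tag);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("JJ", "NN", "NNS", "NNP", "NNPS", "VBD");
        // Shorthand inside the field value, as in example (2).
        System.out.println(accepted("NNPS?", tags));    // prints [NNP, NNPS]
        // Explicit alternatives, as in examples (3) and (4).
        System.out.println(accepted("NNP|NNPS", tags)); // prints [NNP, NNPS]
    }
}
```

Both forms accept the same tags here; the difference the note describes is in matching cost, which this sketch does not measure.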

The regular expressions support named groups (<name>: ... ), unnamed groups (?: ... ), and capturing groups ( ... ). The operators allowed are +, ?, *, and |. The logic expressions (which describe each ChunkedSentenceToken) allow grouping "( ... )", not '!', or '|', and and '&'.
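
The token-level logic operators behave like ordinary boolean connectives over field tests. The following is a minimal sketch of that reading using java.util.function.Predicate, not the library's implementation. A token is modeled as a map of field names to values; the field name "string" and the helpers field and token are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Stand-alone sketch; not part of edu.washington.cs.knowitall.
class TokenLogicSketch {
    // Hypothetical helper for a test such as pos="NNP": the regex must match the whole field value.
    static Predicate<Map<String, String>> field(String name, String regex) {
        return token -> token.containsKey(name) && token.get(name).matches(regex);
    }

    // Hypothetical token model: only the word and its part-of-speech tag.
    static Map<String, String> token(String string, String pos) {
        Map<String, String> t = new HashMap<>();
        t.put("string", string);
        t.put("pos", pos);
        return t;
    }

    public static void main(String[] args) {
        // <pos="NNP" | pos="NNPS">
        Predicate<Map<String, String>> properNoun =
                field("pos", "NNP").or(field("pos", "NNPS"));
        // <(pos="NNP" | pos="NNPS") & !string="Inc\.">
        Predicate<Map<String, String>> properNounNotInc =
                properNoun.and(field("string", "Inc\\.").negate());

        System.out.println(properNoun.test(token("Seattle", "NNP")));    // prints true
        System.out.println(properNoun.test(token("big", "JJ")));         // prints false
        System.out.println(properNounNotInc.test(token("Inc.", "NNP"))); // prints false
    }
}
```

The explicit parentheses in the second expression make the grouping unambiguous without relying on any operator precedence.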

Parameters:
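regex - the pattern to compile, written in the syntax described above
Returns:
the compiled regular expression over ChunkedSentenceTokens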

main

public static void main(String[] args)
                 throws ChunkerException,
                        IOException
Throws:
ChunkerException
IOException


Copyright © 2010-2012 University of Washington CSE. All Rights Reserved.