edu.washington.cs.knowitall.extractor.conf.featureset
Class ExtractionFeature
java.lang.Object
edu.washington.cs.knowitall.extractor.conf.featureset.ExtractionFeature
- All Implemented Interfaces:
- com.google.common.base.Predicate<ChunkedBinaryExtraction>
- Direct Known Subclasses:
- ChunkFeature, PosFeature, TokenFeature, VerbTokenFeature
public abstract class ExtractionFeature
- extends Object
- implements com.google.common.base.Predicate<ChunkedBinaryExtraction>
A parent class for any feature that selects a particular range of a sentence
and applies a test to each index within that range.
For example, the feature `return true if arg2 contains token "fish"` would be
implemented by having rangeToExamine() return arg2.getRange() and
testAtIndex() return sentence.getToken(index).equalsIgnoreCase("fish").
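The "fish" example can be sketched in a self-contained form. This is only an illustration of the pattern, not the library's implementation: `Extraction`, `RangeFeature`, and `Arg2ContainsFish` are hypothetical stand-ins for ChunkedBinaryExtraction, ExtractionFeature, and a concrete subclass, and the range is modeled as a plain `{start, end}` pair rather than the library's Range type. It also assumes that apply() returns true as soon as any index in the range passes the test, which is what "contains" suggests but which this page does not state.

```java
import java.util.Arrays;
import java.util.List;

public class ContainsTokenSketch {

    /** Hypothetical stand-in for an extraction: sentence tokens plus the span [arg2Start, arg2End) of arg2. */
    static final class Extraction {
        final List<String> tokens;
        final int arg2Start;
        final int arg2End;

        Extraction(List<String> tokens, int arg2Start, int arg2End) {
            this.tokens = tokens;
            this.arg2Start = arg2Start;
            this.arg2End = arg2End;
        }
    }

    /** Simplified analogue of ExtractionFeature: pick a range, then test each index in it. */
    abstract static class RangeFeature {
        /** Returns {start, end} of the span to examine; end is exclusive. */
        protected abstract int[] rangeToExamine(Extraction e);

        /** Tests a single token position. */
        protected abstract boolean testAtIndex(int index, List<String> tokens);

        /** Assumption: the feature fires if ANY index in the range passes the test. */
        public boolean apply(Extraction e) {
            int[] range = rangeToExamine(e);
            for (int i = range[0]; i < range[1]; i++) {
                if (testAtIndex(i, e.tokens)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** The feature "arg2 contains the token fish", ignoring case. */
    static final class Arg2ContainsFish extends RangeFeature {
        @Override
        protected int[] rangeToExamine(Extraction e) {
            return new int[] {e.arg2Start, e.arg2End};
        }

        @Override
        protected boolean testAtIndex(int index, List<String> tokens) {
            return tokens.get(index).equalsIgnoreCase("fish");
        }
    }

    /** Convenience wrapper used by the usage example below. */
    public static boolean arg2ContainsFish(List<String> tokens, int arg2Start, int arg2End) {
        return new Arg2ContainsFish().apply(new Extraction(tokens, arg2Start, arg2End));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Bears", "eat", "fresh", "Fish");
        System.out.println(arg2ContainsFish(tokens, 2, 4)); // prints true
        System.out.println(arg2ContainsFish(tokens, 0, 1)); // prints false
    }
}
```

Because the subclass supplies only the range and the per-index test, the iteration logic stays in one place in the parent class.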
- Author:
- Rob
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.google.common.base.Predicate
equals
stemmer
protected BasicFieldNormalizer stemmer
ExtractionFeature
protected ExtractionFeature()
rangeToExamine
protected abstract edu.washington.cs.knowitall.commonlib.Range rangeToExamine(ChunkedBinaryExtraction cbe)
apply
public boolean apply(ChunkedBinaryExtraction cbe)
- Specified by:
apply in interface com.google.common.base.Predicate<ChunkedBinaryExtraction>
testAtIndex
protected abstract boolean testAtIndex(Integer index,
ChunkedSentence sentence)
indexOfHeadVerb
public static Integer indexOfHeadVerb(ChunkedExtraction relation,
boolean exception)
- Implements the following naive algorithm for locating the head verb
within a verb phrase:
1. Start at the end of the phrase.
2. Work backward until you encounter a token whose posTag.startsWith("V").
3. Return the index of that token.
If exception is true, throws an IllegalArgumentException when the relation
contains no verb. If exception is false, returns null to the client
instead.
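- Parameters:
relation - the relation (verb phrase) to search for a head verb
exception - whether to throw an IllegalArgumentException (true) or return null (false) when the relation contains no verb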
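The backward scan described for indexOfHeadVerb can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the hypothetical `HeadVerbSketch.indexOfHeadVerb` takes a plain list of POS tags instead of a ChunkedExtraction, and it returns an index relative to that list, whereas the real method may report a position relative to the whole sentence.

```java
import java.util.Arrays;
import java.util.List;

public class HeadVerbSketch {

    /**
     * Scans the POS tags from the end of the phrase and returns the index of
     * the last tag that starts with "V". If no such tag exists, either throws
     * IllegalArgumentException (exception == true) or returns null.
     */
    public static Integer indexOfHeadVerb(List<String> posTags, boolean exception) {
        for (int i = posTags.size() - 1; i >= 0; i--) {
            if (posTags.get(i).startsWith("V")) {
                return i;
            }
        }
        if (exception) {
            throw new IllegalArgumentException("no verb in relation: " + posTags);
        }
        return null;
    }

    public static void main(String[] args) {
        // Tags for a phrase such as "has lived in": VBZ VBN IN
        System.out.println(indexOfHeadVerb(Arrays.asList("VBZ", "VBN", "IN"), false)); // prints 1
        // No verb tag and exception == false, so the result is null
        System.out.println(indexOfHeadVerb(Arrays.asList("DT", "NN"), false)); // prints null
    }
}
```

Callers that pass false for exception need to check the result for null before unboxing it.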
Copyright © 2010-2012 University of Washington CSE. All Rights Reserved.