package sequences
Type Members
- sealed trait BIOETag [+L] extends AnyRef
A BIOETag is a tag that is used to represent epic.sequences.Segmentations as epic.sequences.TaggedSequences. It includes Begin, Inside, Outside, and End tags. Sometimes we just use IO, or BIO.
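The idea of representing a segmentation as a tag sequence can be sketched as follows. This is a minimal, self-contained illustration, not Epic's implementation: the `Tag` hierarchy, the `(label, begin, end)` tuple encoding of segments, and the `encode` helper are all assumptions made for the example, and one-token segments are tagged with Begin only.

```scala
object BioeSketch {
  // Hypothetical stand-ins; these are NOT Epic's BIOETag definitions.
  sealed trait Tag
  case class Begin(label: String) extends Tag
  case class Inside(label: String) extends Tag
  case class End(label: String) extends Tag
  case object Outside extends Tag

  /** Encode labeled half-open spans [begin, end) over `length` tokens as one tag per token. */
  def encode(segments: Seq[(String, Int, Int)], length: Int): IndexedSeq[Tag] = {
    // Every token starts as Outside; segments overwrite the tokens they cover.
    val tags = Array.fill[Tag](length)(Outside)
    for ((label, begin, end) <- segments if end > begin) {
      for (i <- begin until end) tags(i) = Inside(label)
      tags(begin) = Begin(label)
      // A one-token segment keeps its Begin tag in this sketch.
      if (end - begin > 1) tags(end - 1) = End(label)
    }
    tags.toIndexedSeq
  }
}
```

Dropping the `End` case gives a BIO scheme, and dropping `Begin` as well gives IO.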
- trait CRF [L, W] extends Serializable
A Linear Chain Conditional Random Field. Useful for POS tagging, etc.
As usual in Epic, all the heavy lifting is done in the companion object and Marginals.
CRFs can produce epic.sequences.TaggedSequence from an input sequence of words. They can also produce marginals, etc.
- Annotations
- @SerialVersionUID()
- class CRFInference [L, W] extends AugmentableInference[TaggedSequence[L, W], Anchoring[L, W]] with CRF[L, W] with AnnotatingInference[TaggedSequence[L, W]] with Serializable
- Annotations
- @SerialVersionUID()
- class CRFModel [L, W] extends Model[TaggedSequence[L, W]] with Model[TaggedSequence[L, W]] with Serializable
- Annotations
- @SerialVersionUID()
- trait Gazetteer [+L, W] extends SurfaceFeaturizer[W] with WordFeaturizer[W]
A Gazetteer is a map from IndexedSeq[W]->L. That is, it maps sequences of words to a label that we've seen before. For example, you might use a list of countries. These are very useful for named entity recognition.
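The "map from word sequences to labels" view can be illustrated with a plain Scala `Map`. This is a sketch under stated assumptions, not the Gazetteer trait's actual interface: the `countries` map, the `longestMatch` helper, and the "LOC" label are hypothetical names chosen for the example.

```scala
object GazetteerSketch {
  // Hypothetical toy gazetteer: phrase (as a word sequence) -> label.
  val countries: Map[IndexedSeq[String], String] = Map(
    IndexedSeq("France") -> "LOC",
    IndexedSeq("United", "States") -> "LOC"
  )

  /** Longest gazetteer entry starting at `begin`, as (label, end) with `end` exclusive. */
  def longestMatch(words: IndexedSeq[String], begin: Int,
                   gazetteer: Map[IndexedSeq[String], String]): Option[(String, Int)] = {
    val maxLen = if (gazetteer.isEmpty) 0 else gazetteer.keysIterator.map(_.length).max
    // Try candidate end positions from longest to shortest and stop at the first hit.
    val ends = math.min(words.length, begin + maxLen) until begin by -1
    ends.iterator
      .map(end => gazetteer.get(words.slice(begin, end)).map(label => (label, end)))
      .collectFirst { case Some(hit) => hit }
  }
}
```

In a named entity recognizer, a hit like this would typically become a feature on the matched span or words rather than a hard decision.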
- case class GazetteerSpanFeature (label: Any) extends Feature with Product with Serializable
- case class GazetteerWordFeature (label: Any) extends Feature with Product with Serializable
- trait GoldSegmentPolicy [L] extends AnyRef
- class HammingLossAugmentation [L, W] extends LossAugmentation[Segmentation[L, W], Anchoring[L, W]]
TODO
- case class Segmentation [+L, +W](segments: IndexedSeq[(L, Span)], words: IndexedSeq[W], id: String = "") extends Example[IndexedSeq[(L, Span)], IndexedSeq[W]] with Product with Serializable
- class SegmentationModelFactory [L] extends SerializableLogging
Factory class for making an epic.sequences.SemiCRFModel based on some data and an optional gazetteer.
- trait Segmenter [Tag] extends StringAnalysisFunction[Sentence with Token, Tag] with (IndexedSeq[String]) ⇒ IndexedSeq[(Tag, Span)]
An epic.sequences.Segmenter splits up a sentence into labeled segments. For instance, it might find all the people, places and things (Named Entity Recognition) in a document.
- Tag
the type of tag that is annotated
- trait SemiCRF [L, W] extends Serializable
A Semi-Markov Linear Chain Conditional Random Field, that is, the length of time spent in a state may be longer than 1 tick. Useful for field segmentation or NER.
As usual in Epic, all the heavy lifting is done in the companion object and Marginals.
- Annotations
- @SerialVersionUID()
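To make the semi-Markov idea behind SemiCRF concrete (a state may span more than one tick), here is a generic Viterbi-style search over labeled segments. It is only a sketch and does not reflect how Epic's SemiCRF or its Marginals are implemented: the `viterbi` function, the `score` callback, and the `(label, begin, end)` tuples are assumptions for illustration, and scores are assumed to be finite.

```scala
object SemiMarkovSketch {
  /**
   * Best-scoring segmentation of `n` tokens into labeled half-open spans [begin, end).
   * `score(prev, label, begin, end)` is a hypothetical scoring function; `prev` is None
   * for the first segment. Segments are at most `maxLen` tokens long.
   */
  def viterbi(n: Int, labels: IndexedSeq[String], maxLen: Int,
              score: (Option[String], String, Int, Int) => Double): List[(String, Int, Int)] = {
    require(labels.nonEmpty && maxLen >= 1, "need at least one label and maxLen >= 1")
    // best(end)(l): best score of a segmentation of [0, end) whose last segment has label l
    val best = Array.fill(n + 1, labels.length)(Double.NegativeInfinity)
    // back(end)(l): (begin of the last segment, index of the previous label or -1)
    val back = Array.fill(n + 1, labels.length)((-1, -1))
    for (end <- 1 to n; l <- labels.indices; begin <- math.max(0, end - maxLen) until end) {
      if (begin == 0) {
        val s = score(None, labels(l), 0, end)
        if (s > best(end)(l)) { best(end)(l) = s; back(end)(l) = (0, -1) }
      } else for (p <- labels.indices) {
        val s = best(begin)(p) + score(Some(labels(p)), labels(l), begin, end)
        if (s > best(end)(l)) { best(end)(l) = s; back(end)(l) = (begin, p) }
      }
    }
    // Follow back-pointers from the best final label to recover the segments.
    var cur = labels.indices.maxBy(best(n)(_))
    var end = n
    var out = List.empty[(String, Int, Int)]
    while (end > 0) {
      val (begin, prev) = back(end)(cur)
      out = (labels(cur), begin, end) :: out
      end = begin
      cur = prev
    }
    out
  }
}
```

With `maxLen = 1` every segment covers exactly one token, so the search reduces to ordinary linear-chain Viterbi decoding.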
- class SemiCRFInference [L, W] extends AugmentableInference[Segmentation[L, W], Anchoring[L, W]] with SemiCRF[L, W] with Serializable
- Annotations
- @SerialVersionUID()
- class SemiCRFModel [L, W] extends Model[Segmentation[L, W]] with Serializable
- Annotations
- @SerialVersionUID()
- case class TaggedSequence [+L, +W](tags: IndexedSeq[L], words: IndexedSeq[W], id: String = "") extends Example[IndexedSeq[L], IndexedSeq[W]] with Product with Serializable
A tagged sequence has a sequence of tags and a sequence of words that are in one-to-one correspondence. Think POS tags, etc.
- class TaggedSequenceModelFactory [L] extends SerializableLogging
- trait Tagger [Tag] extends StringAnalysisFunction[Sentence with Token, Tag] with (IndexedSeq[String]) ⇒ IndexedSeq[Tag]
A Tagger assigns a sequence of Tags to a
- Tag
the type of tag that is annotated
Value Members
- object BIOETag
- object CRF extends Serializable
- object Gazetteer extends Serializable
- object GoldSegmentPolicy
- object HMM
Hidden Markov Model, which is the generative special case of an epic.sequences.CRF.
- object SegmentText extends ProcessTextMain[SemiCRF[Any, String], Segmentation[Any, String]]
Simple class that reads in a bunch of files and segments them. Output is dumped to standard out.
- object Segmentation extends Serializable
- object SegmentationEval extends SerializableLogging
Object for evaluating epic.sequences.Segmentations. Returned metrics are precision, recall, and F1.
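As a reminder of how these three metrics relate, here is a small self-contained sketch. It is not SegmentationEval's code: the `Stats` class, the `evaluate` helper, and the rule that a guessed segment counts as correct only when both its label and its span match a gold segment exactly are assumptions for illustration.

```scala
object SegmentEvalSketch {
  type Segment = (String, Int, Int) // (label, begin, end); hypothetical encoding

  case class Stats(nRight: Int, nGuess: Int, nGold: Int) {
    // Fraction of guessed segments that are correct.
    def precision: Double = if (nGuess == 0) 0.0 else nRight.toDouble / nGuess
    // Fraction of gold segments that were found.
    def recall: Double = if (nGold == 0) 0.0 else nRight.toDouble / nGold
    // Harmonic mean of precision and recall.
    def f1: Double =
      if (precision + recall == 0.0) 0.0
      else 2 * precision * recall / (precision + recall)
  }

  /** A guessed segment is right only if the same (label, begin, end) appears in gold. */
  def evaluate(guess: Set[Segment], gold: Set[Segment]): Stats =
    Stats((guess intersect gold).size, guess.size, gold.size)
}
```

Because the match is exact, a segment with the right label but a boundary off by one token counts against both precision and recall.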
- object SegmentationModelFactory extends Serializable
- Annotations
- @SerialVersionUID()
- object Segmenter
- object SemiCRF extends Serializable
- object SemiCRFModel extends Serializable
- object SemiConllNerPipeline extends SerializableLogging
- object SemiNerPipeline extends SerializableLogging
- object SemiPOSTagger extends SerializableLogging
Mostly for debugging SemiCRFs. Just uses a SemiCRF as a CRF.
- object TagText extends ProcessTextMain[CRF[AnnotatedLabel, String], TaggedSequence[AnnotatedLabel, String]]
Simple class that reads in a bunch of files and tags them. Output is dumped to standard out.
- object TaggedSequenceEval
Object for evaluating epic.sequences.TaggedSequences. Returned metrics are accuracy and exact match.
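The two metrics can be sketched as follows, assuming accuracy is the fraction of tokens tagged correctly and exact match is the fraction of sequences tagged entirely correctly. The `evaluate` helper and its input shape are hypothetical and do not mirror TaggedSequenceEval's signature.

```scala
object TagEvalSketch {
  /** Returns (token accuracy, exact-match rate) over paired (guess, gold) tag sequences. */
  def evaluate(pairs: Seq[(IndexedSeq[String], IndexedSeq[String])]): (Double, Double) = {
    require(pairs.forall { case (guess, gold) => guess.length == gold.length },
      "guess and gold must have one tag per word")
    val total = pairs.map(_._2.length).sum
    // Count positions where the guessed tag equals the gold tag.
    val right = pairs.map { case (guess, gold) =>
      guess.zip(gold).count { case (a, b) => a == b }
    }.sum
    // Count sequences that are correct at every position.
    val exact = pairs.count { case (guess, gold) => guess == gold }
    val accuracy = if (total == 0) 0.0 else right.toDouble / total
    val exactMatch = if (pairs.isEmpty) 0.0 else exact.toDouble / pairs.size
    (accuracy, exactMatch)
  }
}
```

Exact match is the stricter of the two: a single wrong tag in a long sentence barely moves accuracy but zeroes out that sentence's contribution to exact match.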
- object TaggedSequenceModelFactory extends Serializable
- object Tagger
- object TrainPosTagger extends SerializableLogging