public class AnnotateProperties extends Object
| Modifier and Type | Field and Description |
|---|---|
| static String | DEFAULT_ALL_MORPHOLOGY: Output all POS and Lemma analysis before disambiguation. |
| static String | DEFAULT_DICTAG: Correct statistical POS tagger output with monosemic dictionary. |
| static String | DEFAULT_HARD_PARAGRAPH: Choose between 'yes' and 'no'. |
| static String | DEFAULT_MULTIWORDS: Detect multiwords. |
| static String | DEFAULT_NORMALIZE: Choose corpus conventions for normalization of punctuation. |
| static String | DEFAULT_UNTOKENIZABLE_STRING: Choose between 'yes' and 'no'. |
| Modifier and Type | Method and Description |
|---|---|
| Properties | setChunkingProperties(String model, String language): Generate Properties object for chunking. |
| static Properties | setPOSLemmaProperties(String model, String lemmatizerModel, String language, String multiwords, String dictag, String allMorphology): Generate Properties object for POS tagging and Lemmatizing. |
| static Properties | setTokenizeProperties(String lang, String normalize, String untokenizable, String hardParagraph): Creates the Properties object required to construct a Sentence Segmenter and a Tokenizer. |
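
The following is a minimal sketch of what a helper with the same parameter list as setTokenizeProperties might do, included only to illustrate how four String options can be packed into a java.util.Properties object. The class name TokenizePropertiesSketch, the method name tokenizeProperties, the property keys, the null/empty check, and the argument values in main are all assumptions made for illustration; this page does not document the keys or default values that IXA pipes actually uses.

```java
import java.util.Properties;

// Hypothetical stand-in for illustration only; this is NOT the IXA pipes
// implementation of AnnotateProperties.
class TokenizePropertiesSketch {

  // Mirrors the documented parameter list of setTokenizeProperties.
  // The property keys used below are assumed, not taken from the library.
  static Properties tokenizeProperties(String lang, String normalize,
      String untokenizable, String hardParagraph) {
    // The parameter notes say a language code is required, so this sketch
    // rejects a missing one (the real method may behave differently).
    if (lang == null || lang.isEmpty()) {
      throw new IllegalArgumentException("a language code is required");
    }
    Properties props = new Properties();
    props.setProperty("language", lang);
    props.setProperty("normalize", normalize);
    props.setProperty("untokenizable", untokenizable);
    props.setProperty("hardParagraph", hardParagraph);
    return props;
  }

  public static void main(String[] args) {
    // Illustrative argument values; not necessarily the library defaults.
    Properties props = tokenizeProperties("en", "default", "no", "no");
    // Print every key/value pair stored in the Properties object.
    props.list(System.out);
  }
}
```

With the real class, the equivalent call would presumably be AnnotateProperties.setTokenizeProperties(lang, normalize, untokenizable, hardParagraph), with the DEFAULT_* constants supplying fallback values for options the caller does not set.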
public static final String DEFAULT_NORMALIZE
public static final String DEFAULT_UNTOKENIZABLE_STRING
public static final String DEFAULT_HARD_PARAGRAPH
public static final String DEFAULT_MULTIWORDS
public static final String DEFAULT_DICTAG
public static final String DEFAULT_ALL_MORPHOLOGY
public static Properties setTokenizeProperties(String lang, String normalize, String untokenizable, String hardParagraph)
lang - it is required to provide a language code
normalize - the normalization option
untokenizable - print untokenizable tokens
hardParagraph - do not segment paragraph marks
public static Properties setPOSLemmaProperties(String model, String lemmatizerModel, String language, String multiwords, String dictag, String allMorphology)
model - the pos tagger model
lemmatizerModel - the lemmatizer model
language - the language
multiwords - whether multiwords are to be detected
dictag - whether tagging from a dictionary is activated
allMorphology - whether to disclose all pos tags and lemmas before disambiguation
public Properties setChunkingProperties(String model, String language)
model - the model to perform the annotation
language - the language
Copyright © 2016 IXA pipes. All rights reserved.