A
addUnitOnValues(List<String>, String)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
Adds the unit to each value of the list.
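The index gives only the signature and a one-line summary, so the following is a minimal sketch of what "adding a unit to each value" might look like, not the service's actual code. The class name `UnitSketch`, the return type, the single-space separator, and the handling of blank values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; this is NOT the TikaExtractorService implementation.
final class UnitSketch {

    // Assumption: the unit is appended after a single space, and null or
    // blank values are passed through unchanged. The real method may differ.
    static List<String> addUnitOnValues(List<String> values, String unit) {
        List<String> result = new ArrayList<>(values.size());
        for (String value : values) {
            if (value == null || value.trim().isEmpty()) {
                result.add(value);
            } else {
                result.add(value.trim() + " " + unit);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [640 pixels, 480 pixels]
        System.out.println(addUnitOnValues(Arrays.asList("640", "480"), "pixels"));
    }
}
```

Returning a new list rather than mutating the argument is a design choice of this sketch only; the index does not say which approach the service takes.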
annotate(ComposedUnit, Map<String, List<String>>)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
Annotates cu with the predicates and literals contained in toAnnot.
C
characters(char[], int, int)
- Method in class org.ow2.weblab.services.normaliser.tika.
MediaUnitContentHandler
checkArgs(ProcessArgs)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
cleanMap(Map<String, List<String>>)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
Modifies the Map passed as a parameter.
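The summary says only that the map is modified in place; it does not say what "cleaning" means. As a hedged illustration, the sketch below assumes it means dropping null or blank values and then removing keys left without any value. The class name `CleanMapSketch` and these rules are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; this is NOT the TikaExtractorService implementation.
final class CleanMapSketch {

    // Assumption: "cleaning" removes null/blank values and then removes
    // entries whose value list is null or empty. The map is edited in place,
    // so the value lists must be mutable.
    static void cleanMap(Map<String, List<String>> map) {
        Iterator<Map.Entry<String, List<String>>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            List<String> values = it.next().getValue();
            if (values != null) {
                values.removeIf(v -> v == null || v.trim().isEmpty());
            }
            if (values == null || values.isEmpty()) {
                it.remove(); // safe removal while iterating
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        map.put("dc:title", new ArrayList<>(Arrays.asList("Report", " ")));
        map.put("dc:creator", new ArrayList<>(Arrays.asList("")));
        cleanMap(map);
        // prints {dc:title=[Report]}
        System.out.println(map);
    }
}
```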
convertToISO8601Date(String)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
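The index does not document which input formats convertToISO8601Date accepts or what it returns on failure. The sketch below shows one possible approach with `java.time`, assuming inputs of the form `yyyy-MM-dd HH:mm:ss` or `yyyy-MM-dd`, a UTC time zone, and a `null` result for unparseable text. All of these, and the class name `Iso8601Sketch`, are assumptions.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in; this is NOT the TikaExtractorService implementation.
final class Iso8601Sketch {

    // Assumed input pattern for date-time values.
    private static final DateTimeFormatter DATE_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns an ISO 8601 date-time in UTC, or null if the text matches
    // neither assumed pattern.
    static String convertToISO8601Date(String raw) {
        String text = raw.trim();
        try {
            LocalDateTime dateTime = LocalDateTime.parse(text, DATE_TIME);
            return dateTime.atOffset(ZoneOffset.UTC)
                    .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        } catch (DateTimeParseException ignored) {
            // fall through to the date-only pattern
        }
        try {
            LocalDate date = LocalDate.parse(text); // ISO yyyy-MM-dd
            return date.atStartOfDay().atOffset(ZoneOffset.UTC)
                    .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        } catch (DateTimeParseException ignored) {
            return null;
        }
    }

    public static void main(String[] args) {
        // prints 2010-03-15T14:30:00Z
        System.out.println(convertToISO8601Date("2010-03-15 14:30:00"));
    }
}
```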
CustomOfficeParser
- Class in
org.apache.tika.parser.microsoft
Defines a Microsoft document content extractor.
CustomOfficeParser()
- Constructor for class org.apache.tika.parser.microsoft.
CustomOfficeParser
E
EmlParser
- Class in
org.apache.tika.parser.microsoft
Defines an EML document content extractor.
EmlParser()
- Constructor for class org.apache.tika.parser.microsoft.
EmlParser
endElement(String, String, String)
- Method in class org.ow2.weblab.services.normaliser.tika.
MediaUnitContentHandler
extractTextAndMetadata(ComposedUnit, File, Map<String, List<String>>, boolean)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
F
fillMapWithMetadata(Map<String, List<String>>, Metadata)
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
The method converts the metadata extracted by Tika into a Map of predicates with their values that can be annotated.
G
getTikaConfig()
- Static method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
M
MediaUnitContentHandler
- Class in
org.ow2.weblab.services.normaliser.tika
MediaUnitContentHandler(ContentHandler, ComposedUnit)
- Constructor for class org.ow2.weblab.services.normaliser.tika.
MediaUnitContentHandler
O
org.apache.tika.parser.microsoft
- package org.apache.tika.parser.microsoft
org.ow2.weblab.services.normaliser.tika
- package org.ow2.weblab.services.normaliser.tika
P
parse(InputStream, ContentHandler, Metadata, ParseContext)
- Method in class org.apache.tika.parser.microsoft.
CustomOfficeParser
Extracts properties and text from an MS document input stream.
parse(InputStream, ContentHandler, Metadata)
- Method in class org.apache.tika.parser.microsoft.
CustomOfficeParser
Deprecated.
This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext)
- Method in class org.apache.tika.parser.microsoft.
EmlParser
Extracts properties and text from an EML document input stream.
parse(InputStream, ContentHandler, Metadata)
- Method in class org.apache.tika.parser.microsoft.
EmlParser
Deprecated.
This method will be removed in Apache Tika 1.0.
process(ProcessArgs)
- Method in class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
S
startElement(String, String, String, Attributes)
- Method in class org.ow2.weblab.services.normaliser.tika.
MediaUnitContentHandler
T
TikaExtractorService
- Class in
org.ow2.weblab.services.normaliser.tika
The Tika extractor is quite simple, since it does not handle the structure of documents (sheets in Excel, paragraphs in Word, etc.).
TikaExtractorService()
- Constructor for class org.ow2.weblab.services.normaliser.tika.
TikaExtractorService
The default and only constructor.
Copyright © 2004-2010. All Rights Reserved.