public interface PepperExporter extends PepperModule
A mapping task in the Pepper workflow is not a monolithic block. It consists of several smaller steps.
public MyModule() {
super("Name of the module");
setSupplierContact(URI.createURI("Contact address of the module's supplier"));
setSupplierHomepage(URI.createURI("homepage of the module"));
setDesc("A short description of what is the intention of this module, for instance which formats are importable. ");
this.addSupportedFormat("The name of a format which is importable e.g. txt",
"The version corresponding to the format name", null);
}
public boolean isReadyToStart() {
return (true);
}
exportCorpusStructure() is invoked at the top of the method start() of the
PepperExporter. To change the default behavior completely, override this
method. The aim of exportCorpusStructure() is to fill the map of
corresponding corpus-structure and file structure. The file structure is
created automatically, but only virtually: the map holds just URIs pointing
to the virtual file or folder. Creating the actual file or folder has to be
done by the Pepper module itself in method PepperMapper.mapSCorpus() or
PepperMapper.mapSDocument(). To adapt the creation of this 'virtual' file
structure, you first have to choose the mode of export. You can do this, for
instance, in method isReadyToStart(), as shown in the following snippet, or
in the constructor.
public boolean isReadyToStart(){
... //option 1
setExportMode(EXPORT_MODE.NO_EXPORT);
//option 2
setExportMode(EXPORT_MODE.CORPORA_ONLY);
//option 3
setExportMode(EXPORT_MODE.DOCUMENTS_IN_FILES);
//sets the ending, which should be added to the documents name
setDocumentEnding(ENDING_TAB);
..
}
In this snippet, option 1 means that nothing will be mapped. Option 2 means
that only SCorpus objects are mapped to a folder and SDocument objects are
ignored. Option 3 means that SCorpus objects are mapped to a folder and
SDocument objects are mapped to a file. The ending of that file is set by
passing it to the method setDocumentEnding(String). In the given snippet, a
URI having the ending 'tab' is created for each SDocument.
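The effect of the three export modes can be pictured with a small, self-contained model. This is only a sketch and not Pepper's implementation: the enum ExportMode, the class Element and the method buildTable are hypothetical stand-ins, identifiers are modelled as plain path strings, and joining the ending with a dot is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExportModeSketch {

    /** Stand-in for PepperExporter.EXPORT_MODE; not the real Pepper type. */
    public enum ExportMode { NO_EXPORT, CORPORA_ONLY, DOCUMENTS_IN_FILES }

    /** Hypothetical corpus-structure element: a path-like id plus a document flag. */
    public static final class Element {
        final String idPath;
        final boolean isDocument;

        public Element(String idPath, boolean isDocument) {
            this.idPath = idPath;
            this.isDocument = isDocument;
        }
    }

    /** Builds an id-to-location table in the spirit of the three export modes. */
    public static Map<String, String> buildTable(String root, List<Element> elements,
            ExportMode mode, String ending) {
        Map<String, String> table = new LinkedHashMap<>();
        if (mode == ExportMode.NO_EXPORT) {
            return table; // option 1: nothing is mapped
        }
        for (Element element : elements) {
            if (!element.isDocument) {
                // corpora become folders in both remaining modes
                table.put(element.idPath, root + "/" + element.idPath);
            } else if (mode == ExportMode.DOCUMENTS_IN_FILES) {
                // option 3: documents become files carrying the configured ending
                // (assumption: the ending is appended after a dot)
                table.put(element.idPath, root + "/" + element.idPath + "." + ending);
            }
            // option 2 (CORPORA_ONLY): documents are skipped
        }
        return table;
    }

    public static void main(String[] args) {
        List<Element> elements = Arrays.asList(
                new Element("myCorpus", false),
                new Element("myCorpus/document1", true));
        for (ExportMode mode : ExportMode.values()) {
            System.out.println(mode + " -> " + buildTable("/home/me/out", elements, mode, "tab"));
        }
    }
}
```

The table only records locations; in this model, as in the description above, nothing is written to disk at this stage.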
In PepperModule.createPepperMapper(Identifier), a PepperMapper object needs
to be initialized and returned. The PepperMapper is the major part doing the
mapping. It provides the methods PepperMapper.mapSCorpus() to handle the
mapping of a single SCorpus object and PepperMapper.mapSDocument() to handle
a single SDocument object. Both methods are invoked by the Pepper framework.
PepperMapper.getResourceURI() offers the mapper the file or folder of the
current SCorpus or SDocument object; this field needs to be set in the
PepperModule.createPepperMapper(Identifier) method. The following snippet
shows a dummy of that method:
public PepperMapper createPepperMapper(Identifier sElementId) {
PepperMapper mapper = new PepperMapperImpl() {
@Override
public DOCUMENT_STATUS mapSCorpus() {
// handling the mapping of a single corpus
// accessing the current file or folder
getResourceURI();
// returning, that the corpus was mapped successfully
return (DOCUMENT_STATUS.COMPLETED);
}
@Override
public DOCUMENT_STATUS mapSDocument() {
// handling the mapping of a single document
// accessing the current file or folder
getResourceURI();
// returning, that the document was mapped successfully
return (DOCUMENT_STATUS.COMPLETED);
}
};
// pass current file or folder to mapper. When using
// PepperImporter.importCorpusStructure or
// PepperExporter.exportCorpusStructure, the mapping between file or
// folder
// and SCorpus or SDocument was stored here
mapper.setResourceURI(getIdentifier2ResourceTable().get(sElementId));
return (mapper);
}
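The hand-off shown above (look up the resource for an element, pass it to the mapper, let the mapper write to it) can be tried out without Pepper on the classpath. The following is a minimal sketch under stated assumptions: SketchMapper, Status, createMapper and mapDocument are hypothetical stand-ins for PepperMapper, DOCUMENT_STATUS, createPepperMapper and mapSDocument, and a java.nio.file.Path replaces the EMF URI.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class MapperHandoffSketch {

    /** Stand-in for DOCUMENT_STATUS; not the real Pepper type. */
    public enum Status { COMPLETED, FAILED }

    /** Hypothetical stand-in for a PepperMapper: it only knows its resource location. */
    public static class SketchMapper {
        private Path resource;

        public void setResource(Path resource) { this.resource = resource; }

        public Path getResource() { return resource; }

        /** Writes the content to the resource; the mapper creates the file itself. */
        public Status mapDocument(String content) {
            if (resource == null) {
                return Status.FAILED; // no location was stored for this element
            }
            try {
                Path parent = resource.getParent();
                if (parent != null) {
                    Files.createDirectories(parent);
                }
                Files.write(resource, content.getBytes(StandardCharsets.UTF_8));
                return Status.COMPLETED;
            } catch (IOException e) {
                return Status.FAILED;
            }
        }
    }

    /** Mirrors the dummy method: create a mapper and pass it the stored location. */
    public static SketchMapper createMapper(Map<String, Path> table, String elementId) {
        SketchMapper mapper = new SketchMapper();
        mapper.setResource(table.get(elementId));
        return mapper;
    }

    public static void main(String[] args) {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "handoff-sketch-" + System.nanoTime());
        Map<String, Path> table = new HashMap<>();
        table.put("doc_1", root.resolve("myCorpus").resolve("document1.tab"));
        SketchMapper mapper = createMapper(table, "doc_1");
        System.out.println(mapper.mapDocument("some exported content") + " " + mapper.getResource());
    }
}
```

Keeping the lookup in the factory method and the writing in the mapper follows the split described above: the module knows the table, the mapper only sees its own file or folder.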
public void end() {
super.end();
// do some clean up like closing of streams etc.
}
| Modifier and Type | Interface and Description |
|---|---|
| static class | PepperExporter.EXPORT_MODE: Determines how the corpus-structure should be exported. |
ENDING_ALL_FILES, ENDING_FOLDER, ENDING_LEAF_FOLDER, ENDING_TAB, ENDING_TXT, ENDING_XML
| Modifier and Type | Method and Description |
|---|---|
| FormatDesc | addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference) |
| org.eclipse.emf.common.util.URI | createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId): Creates a folder structure based on the passed corpus path (CorpusDesc.getCorpusPath()). |
| void | exportCorpusStructure(): This method is called by PepperModule.start() to export the corpus-structure into a folder-structure. |
| CorpusDesc | getCorpusDesc(): TODO docu |
| String | getDocumentEnding(): Returns the format ending for files to be exported and related to SDocument objects. |
| PepperExporter.EXPORT_MODE | getExportMode(): Returns how corpus-structure is exported. |
| Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> | getIdentifier2ResourceTable(): Returns the table of correspondences between Identifier and a resource. |
| List<FormatDesc> | getSupportedFormats(): TODO docu |
| void | setCorpusDesc(CorpusDesc corpusDesc): TODO docu |
| void | setDocumentEnding(String sDocumentEnding): Sets the format ending for files to be exported and related to SDocument objects. |
| void | setExportMode(PepperExporter.EXPORT_MODE exportMode): Determines how the corpus-structure should be exported. |
createPepperMapper, done, done, end, getComponentContext, getCorpusGraph, getDesc, getFingerprint, getModuleController, getModuleType, getName, getProgress, getProgress, getProperties, getResources, getSaltProject, getSelfTestDesc, getStartProblems, getSupplierContact, getSupplierHomepage, getSymbolicName, getTemproraries, getVersion, isMultithreaded, isReadyToStart, proposeImportOrder, setCorpusGraph, setDesc, setIsMultithreaded, setPepperModuleController_basic, setPepperModuleController, setProperties, setResources, setSaltProject, setSupplierContact, setSupplierHomepage, setSymbolicName, setTemproraries, setVersion, start, start

List<FormatDesc> getSupportedFormats()

CorpusDesc getCorpusDesc()

String getDocumentEnding()
Returns the format ending for files to be exported and related to SDocument objects.
Returns: file ending for SDocument objects to be exported.

void setDocumentEnding(String sDocumentEnding)
Sets the format ending for files to be exported and related to SDocument objects.
Parameters: sDocumentEnding - file ending for SDocument objects to be exported.

void setCorpusDesc(CorpusDesc corpusDesc)

Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> getIdentifier2ResourceTable()
Returns the table of correspondences between Identifier and a resource.
Stores Identifier objects corresponding to either a SDocument or a SCorpus
object, which has been created during the run of
#importCorpusStructure(SCorpusGraph). Corresponding to the Identifier object
this table stores the resource from where the element shall be imported.
| Identifier | Resource |
|---|---|
| corpus_1 | /home/me/corpora/myCorpus |
| corpus_2 | /home/me/corpora/myCorpus/subcorpus |
| doc_1 | /home/me/corpora/myCorpus/subcorpus/document1.xml |
| doc_2 | /home/me/corpora/myCorpus/subcorpus/document2.xml |
Returns: table of correspondences between Identifier and a resource.

org.eclipse.emf.common.util.URI createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId)
Creates a folder structure based on the passed corpus path
(CorpusDesc.getCorpusPath()). For each segment in Identifier a folder is created.
Returns: the Identifier as file path, which was created on disk.

PepperExporter.EXPORT_MODE getExportMode()
Returns how corpus-structure is exported.

void setExportMode(PepperExporter.EXPORT_MODE exportMode)
Determines how the corpus-structure should be exported.
CORPORA_ONLY: SCorpus objects are exported into a folder structure, but SDocument objects are not exported.
DOCUMENTS_IN_FILES: SCorpus objects are exported into a folder structure and SDocument objects are stored in files having the ending determined by PepperExporter#getDocumentEnding().
Parameters: exportMode

void exportCorpusStructure()
This method is called by PepperModule.start() to export the corpus-structure
into a folder-structure. That means, each Identifier belonging to a SDocument
or SCorpus object is stored in getIdentifier2ResourceTable() together with
the corresponding file-structure object (file or folder) located by a URI.
The URI object corresponding to files will get the file ending returned by
getDocumentEnding(), which can be set by setDocumentEnding(String). To adapt
the creation of URIs, set the export mode via setExportMode(EXPORT_MODE).

FormatDesc addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference)
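
The description of createFolderStructure (one folder per identifier segment, below the corpus path) can be illustrated with plain java.nio.file calls. This is a sketch only and not the Pepper implementation: the class FolderStructureSketch and the method createFolders are hypothetical, and the identifier is assumed to be a slash-separated path such as /corpus_1/corpus_2.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FolderStructureSketch {

    /**
     * Creates one folder per segment of the identifier path below corpusPath
     * and returns the innermost folder.
     */
    public static Path createFolders(Path corpusPath, String identifierPath) {
        Path current = corpusPath;
        for (String segment : identifierPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty segment produced by a leading slash
            }
            current = current.resolve(segment);
            try {
                // also creates missing parents, e.g. corpusPath on the first pass
                Files.createDirectories(current);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "folder-sketch-" + System.nanoTime());
        System.out.println(createFolders(root, "/corpus_1/corpus_2"));
    }
}
```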
Copyright © 2009–2018 Humboldt-Universität zu Berlin, INRIA. All rights reserved.