Package org.corpus_tools.pepper.modules
Interface PepperExporter
-
- All Superinterfaces:
PepperModule
- All Known Implementing Classes:
DoNothingExporter, DOTExporter, PepperExporterImpl, SaltXMLExporter, TextExporter
public interface PepperExporter extends PepperModule
A mapping task in the Pepper workflow is not a monolithic block. It consists of several smaller steps.
- Declare the fingerprint of the module. This is part of the constructor.
- Check readiness of the module.
- Export the corpus structure.
- Export the document structure and create a mapper for each corpus and document.
- Clean-up
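The order of the steps listed above can be pictured with a small, self-contained sketch. It does not use the Pepper API: the class `LifecycleSketch`, its `trace` list and the `run` driver are hypothetical stand-ins, and only the method names mirror the ones documented on this page. The real framework may interleave or parallelize the mapping of corpora and documents, which this sketch does not attempt to show.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an exporter module; NOT the Pepper API. */
final class LifecycleSketch {
    /** Records the order in which the steps run. */
    final List<String> trace = new ArrayList<>();

    LifecycleSketch() {
        // step 1: the fingerprint is declared in the constructor
        trace.add("fingerprint");
    }

    boolean isReadyToStart() {
        // step 2: report problems here and return false if the module is unusable
        trace.add("isReadyToStart");
        return true;
    }

    void exportCorpusStructure() {
        // step 3: relate each corpus and document to a file or folder
        trace.add("exportCorpusStructure");
    }

    void createPepperMapper(String elementId) {
        // step 4: one mapper per corpus and per document
        trace.add("createPepperMapper:" + elementId);
    }

    void end() {
        // step 5: clean-up, for instance closing streams
        trace.add("end");
    }

    /** Plays the role of the framework and drives the steps in order. */
    static List<String> run(List<String> elementIds) {
        LifecycleSketch module = new LifecycleSketch();
        if (!module.isReadyToStart()) {
            return module.trace; // the module is not used in the workflow
        }
        module.exportCorpusStructure();
        for (String id : elementIds) {
            module.createPepperMapper(id);
        }
        module.end();
        return module.trace;
    }
}
```

Calling `LifecycleSketch.run(List.of("corpus_1", "doc_1"))` returns the recorded step names in the order in which they were executed.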
Declare the fingerprint
Initialize the module and set the module's name, its description and the description of the data formats it can export. This is part of the constructor:

    public MyModule() {
        super("Name of the module");
        setSupplierContact(URI.createURI("Contact address of the module's supplier"));
        setSupplierHomepage(URI.createURI("homepage of the module"));
        setDesc("A short description of what is the intention of this module, for instance which formats are exportable. ");
        this.addSupportedFormat("The name of a format which is exportable e.g. txt", "The version corresponding to the format name", null);
    }

Check readiness of the module
This method is invoked by the Pepper framework before the mapping process is started. It must return true; otherwise, this Pepper module cannot be used in a Pepper workflow. At this point you can report to the user all problems which prevent the module from being used, for instance that a database connection could not be established.

    public boolean isReadyToStart() {
        return (true);
    }

Export corpus structure
The corpus-structure export is handled in the method exportCorpusStructure(). It is invoked at the top of the method start() of the PepperExporter. To change the default behavior completely, just override this method. The aim of the method exportCorpusStructure() is to fill the map that relates the corpus-structure to the file structure. The file structure is created automatically, but only virtually: there are just URIs pointing to the virtual file or folder. The creation of the file or folder itself has to be done by the Pepper module in the method PepperMapper.mapSCorpus() or PepperMapper.mapSDocument(). To adapt the creation of this 'virtual' file structure, you first have to choose the mode of export. You can do this, for instance, in the method isReadyToStart(), as shown in the following snippet, or in the constructor.

    public boolean isReadyToStart() {
        ...
        // option 1
        setExportMode(EXPORT_MODE.NO_EXPORT);
        // option 2
        setExportMode(EXPORT_MODE.CORPORA_ONLY);
        // option 3
        setExportMode(EXPORT_MODE.DOCUMENTS_IN_FILES);
        // sets the ending, which should be added to the document's name
        setDocumentEnding(ENDING_TAB);
        ...
    }

In this snippet, option 1 means that nothing will be mapped. Option 2 means that only SCorpus objects are mapped to a folder and SDocument objects are ignored. Option 3 means that SCorpus objects are mapped to a folder and SDocument objects are mapped to a file. The ending of that file can be set with the method setDocumentEnding(String). In the given snippet, a URI having the ending 'tab' is created for each SDocument.

Export the document structure
In the method PepperModule.createPepperMapper(Identifier), a PepperMapper object needs to be initialized and returned. The PepperMapper is the major part doing the mapping. It provides the method PepperMapper.mapSCorpus() to handle the mapping of a single SCorpus object and the method PepperMapper.mapSDocument() to handle a single SDocument object. Both methods are invoked by the Pepper framework. PepperMapper.getResourceURI() offers the mapper the file or folder of the current SCorpus or SDocument object. For it to return a value, this field needs to be set in the PepperModule.createPepperMapper(Identifier) method. The following snippet shows a dummy of that method:

    public PepperMapper createPepperMapper(Identifier sElementId) {
        PepperMapper mapper = new PepperMapperImpl() {
            @Override
            public DOCUMENT_STATUS mapSCorpus() {
                // handling the mapping of a single corpus

                // accessing the current file or folder
                getResourceURI();

                // returning, that the corpus was mapped successfully
                return (DOCUMENT_STATUS.COMPLETED);
            }

            @Override
            public DOCUMENT_STATUS mapSDocument() {
                // handling the mapping of a single document

                // accessing the current file or folder
                getResourceURI();

                // returning, that the document was mapped successfully
                return (DOCUMENT_STATUS.COMPLETED);
            }
        };
        // pass current file or folder to mapper. When using
        // PepperImporter.importCorpusStructure or
        // PepperExporter.exportCorpusStructure, the mapping between file or
        // folder and SCorpus or SDocument was stored here
        mapper.setResourceURI(getIdentifier2ResourceTable().get(sElementId));
        return (mapper);
    }

Clean-up
Sometimes it might be necessary to clean up after the module has done its job. For instance, when writing an importer or an exporter it might be necessary to close file streams, a database connection etc. Therefore, after the processing is done, the Pepper framework calls the method described in the following snippet:

    public void end() {
        super.end();
        // do some clean up like closing of streams etc.
    }

- Author:
- Florian Zipser
Nested Class Summary
Nested Classes
Modifier and Type | Interface | Description
static class | PepperExporter.EXPORT_MODE | Determines how the corpus-structure should be exported.
-
Field Summary
-
Fields inherited from interface org.corpus_tools.pepper.modules.PepperModule
ENDING_ALL_FILES, ENDING_FOLDER, ENDING_LEAF_FOLDER, ENDING_TAB, ENDING_TXT, ENDING_XML
-
-
Method Summary
All Methods | Instance Methods | Abstract Methods
Modifier and Type | Method | Description
FormatDesc | addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference) | {@inheritDoc PepperModuleDesc#addSupportedFormat(String, String, URI)}
org.eclipse.emf.common.util.URI | createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId) | Creates a folder structure based on the passed corpus path (CorpusDesc.getCorpusPath()).
void | exportCorpusStructure() | This method is called by PepperModule.start() to export the corpus-structure into a folder-structure.
CorpusDesc | getCorpusDesc() | TODO docu
String | getDocumentEnding() | Returns the format ending for files to be exported and related to SDocument objects.
PepperExporter.EXPORT_MODE | getExportMode() | Returns how corpus-structure is exported.
Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> | getIdentifier2ResourceTable() | Returns table correspondence between Identifier and a resource.
List<FormatDesc> | getSupportedFormats() | TODO docu
void | setCorpusDesc(CorpusDesc corpusDesc) | TODO docu
void | setDocumentEnding(String sDocumentEnding) | Sets the format ending for files to be exported and related to SDocument objects.
void | setExportMode(PepperExporter.EXPORT_MODE exportMode) | Determines how the corpus-structure should be exported.
Methods inherited from interface org.corpus_tools.pepper.modules.PepperModule
createPepperMapper, done, done, end, getComponentContext, getCorpusGraph, getDesc, getFingerprint, getModuleController, getModuleType, getName, getProgress, getProgress, getProperties, getResources, getSaltProject, getSelfTestDesc, getStartProblems, getSupplierContact, getSupplierHomepage, getSymbolicName, getTemproraries, getVersion, isMultithreaded, isReadyToStart, proposeImportOrder, setCorpusGraph, setDesc, setIsMultithreaded, setPepperModuleController, setPepperModuleController_basic, setProperties, setResources, setSaltProject, setSupplierContact, setSupplierHomepage, setSymbolicName, setTemproraries, setVersion, start, start
Method Detail
-
getSupportedFormats
List<FormatDesc> getSupportedFormats()
TODO docu
- Returns:
-
getCorpusDesc
CorpusDesc getCorpusDesc()
TODO docu
- Returns:
-
getDocumentEnding
String getDocumentEnding()
Returns the format ending for files to be exported and related to SDocument objects.
- Returns:
- file ending for SDocument objects to be exported.
-
setDocumentEnding
void setDocumentEnding(String sDocumentEnding)
Sets the format ending for files to be exported and related to SDocument objects.
- Parameters:
sDocumentEnding - file ending for SDocument objects to be exported.
-
setCorpusDesc
void setCorpusDesc(CorpusDesc corpusDesc)
TODO docu
-
getIdentifier2ResourceTable
Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> getIdentifier2ResourceTable()
Returns the table of correspondences between Identifier objects and resources. The table stores the Identifier objects of the SDocument and SCorpus objects which have been created during the run of #importCorpusStructure(SCorpusGraph). For each Identifier object, this table stores the resource from which the element shall be imported or, for an exporter, the file or folder to which it is exported.
For instance:

    corpus_1    /home/me/corpora/myCorpus
    corpus_2    /home/me/corpora/myCorpus/subcorpus
    doc_1       /home/me/corpora/myCorpus/subcorpus/document1.xml
    doc_2       /home/me/corpora/myCorpus/subcorpus/document2.xml

- Returns:
- table of correspondences between Identifier objects and resources.
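The example table can be pictured as an ordinary map. The following is a minimal sketch under stated assumptions, not the Pepper implementation: plain String keys stand in for Salt Identifier objects, java.net.URI stands in for org.eclipse.emf.common.util.URI, and the class name `ResourceTableSketch` is hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the identifier-to-resource table; NOT the Pepper API. */
final class ResourceTableSketch {

    /** Builds the example table; String ids stand in for Identifier objects. */
    static Map<String, URI> exampleTable() {
        // LinkedHashMap keeps the insertion order, which makes the table easy to print
        Map<String, URI> table = new LinkedHashMap<>();
        table.put("corpus_1", URI.create("file:/home/me/corpora/myCorpus"));
        table.put("corpus_2", URI.create("file:/home/me/corpora/myCorpus/subcorpus"));
        table.put("doc_1", URI.create("file:/home/me/corpora/myCorpus/subcorpus/document1.xml"));
        table.put("doc_2", URI.create("file:/home/me/corpora/myCorpus/subcorpus/document2.xml"));
        return table;
    }

    public static void main(String[] args) {
        Map<String, URI> table = exampleTable();
        // a mapper would receive the resource looked up for its element
        System.out.println(table.get("doc_1").getPath());
    }
}
```

A lookup with an identifier that was never registered returns null, so code that passes the result on to a mapper may want to check for that case.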
-
createFolderStructure
org.eclipse.emf.common.util.URI createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId)
Creates a folder structure based on the passed corpus path (CorpusDesc.getCorpusPath()). For each segment in the Identifier a folder is created.
- Returns:
- the entire path of the Identifier as a file path, which was created on disk
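The idea of creating one folder per path segment can be sketched with the standard java.nio.file API. This is an illustration under stated assumptions, not the Pepper implementation: the class `FolderStructureSketch` is hypothetical, a java.nio.file.Path stands in for the corpus path, and a slash-separated String stands in for the segments of the Identifier.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; NOT the Pepper API. */
final class FolderStructureSketch {

    /**
     * Creates one folder per segment of elementPath below corpusPath.
     *
     * @param corpusPath  stand-in for CorpusDesc.getCorpusPath()
     * @param elementPath stand-in for the Identifier's segments, e.g. "myCorpus/subcorpus"
     * @return the innermost folder that now exists on disk
     */
    static Path createFolderStructure(Path corpusPath, String elementPath) throws IOException {
        Path target = corpusPath;
        for (String segment : elementPath.split("/")) {
            if (!segment.isEmpty()) { // skip empty segments from leading or doubled slashes
                target = target.resolve(segment);
            }
        }
        // createDirectories creates all missing parent folders and accepts existing ones
        return Files.createDirectories(target);
    }
}
```

For example, `createFolderStructure(Path.of("/tmp/out"), "myCorpus/subcorpus")` would create `/tmp/out/myCorpus` and `/tmp/out/myCorpus/subcorpus` if they do not exist yet. The sketch does not guard against segments such as `..`.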
-
getExportMode
PepperExporter.EXPORT_MODE getExportMode()
Returns how corpus-structure is exported.
- Returns:
-
setExportMode
void setExportMode(PepperExporter.EXPORT_MODE exportMode)
Determines how the corpus-structure should be exported.
- EXPORT_MODE#NO_EXPORT - corpus-structure should not be exported
- EXPORT_MODE#CORPORA_ONLY - SCorpus objects are exported into a folder structure, but SDocument objects are not exported
- EXPORT_MODE#DOCUMENTS_IN_FILES - SCorpus objects are exported into a folder structure and SDocument objects are stored in files having the ending determined by PepperExporter#getDocumentEnding()
- Parameters:
exportMode - the mode determining how the corpus-structure is exported
-
exportCorpusStructure
void exportCorpusStructure()
This method is called by PepperModule.start() to export the corpus-structure into a folder-structure. That means each Identifier belonging to an SDocument or SCorpus object is stored in getIdentifier2ResourceTable() together with the corresponding file-structure object (file or folder) located by a URI. The URI objects corresponding to files get the file ending returned by getDocumentEnding(), which can be set by setDocumentEnding(String).
To adapt the creation of URIs, set the export mode via setExportMode(EXPORT_MODE).
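How the export mode and the document ending shape the resulting table can be sketched as follows. This is not the Pepper implementation: the class `ExportStructureSketch`, its nested `ExportMode` enum (a stand-in for PepperExporter.EXPORT_MODE) and the String-based table are hypothetical, and joining the ending to the document name with a dot is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the table filling; NOT the Pepper API. */
final class ExportStructureSketch {

    /** Stand-in for PepperExporter.EXPORT_MODE. */
    enum ExportMode { NO_EXPORT, CORPORA_ONLY, DOCUMENTS_IN_FILES }

    /**
     * Fills a table from element path to resource location.
     *
     * @param corpusPath     root folder of the export
     * @param elements       element paths such as "myCorpus/doc1"; the value is
     *                       true for a document and false for a corpus
     * @param mode           the export mode
     * @param documentEnding ending for document files, e.g. "tab"
     */
    static Map<String, String> exportCorpusStructure(String corpusPath,
            Map<String, Boolean> elements, ExportMode mode, String documentEnding) {
        Map<String, String> table = new LinkedHashMap<>();
        if (mode == ExportMode.NO_EXPORT) {
            return table; // nothing is mapped
        }
        for (Map.Entry<String, Boolean> element : elements.entrySet()) {
            boolean isDocument = element.getValue();
            String location = corpusPath + "/" + element.getKey();
            if (!isDocument) {
                // corpora are mapped to folders in both remaining modes
                table.put(element.getKey(), location);
            } else if (mode == ExportMode.DOCUMENTS_IN_FILES) {
                // assumption: the ending is appended with a dot
                table.put(element.getKey(), location + "." + documentEnding);
            }
            // in CORPORA_ONLY mode documents are ignored
        }
        return table;
    }
}
```

With the elements `myCorpus` (corpus) and `myCorpus/doc1` (document), the root `/out` and the ending `tab`, the mode DOCUMENTS_IN_FILES yields two entries, CORPORA_ONLY yields only the folder entry, and NO_EXPORT yields an empty table.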
-
addSupportedFormat
FormatDesc addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference)
{@inheritDoc PepperModuleDesc#addSupportedFormat(String, String, URI)}
-
-