org.ow2.weblab.service.transcript.sphinx
Class WebLabTextTranscriptCreator

java.lang.Object
  extended by org.ow2.weblab.service.transcript.sphinx.WebLabTextTranscriptCreator

public class WebLabTextTranscriptCreator
extends java.lang.Object


Field Summary
protected  org.ow2.weblab.content.api.ContentManager contentManager
           
protected  SphinxTranscriptor sphinxTranscriptor
           
protected  java.lang.String targetLang
           
protected  boolean writeFiller
           
protected  boolean writePronunciation
           
protected  boolean writeScore
           
protected  boolean writeTokenInfo
           
 
Constructor Summary
WebLabTextTranscriptCreator(SphinxTranscriptor sphinxTranscriptor, org.ow2.weblab.content.api.ContentManager contentManager, boolean writeTokenInfo, boolean writeFiller, boolean writeScore, boolean writePronunciation, java.lang.String targetLang)
          Generates text transcripts from all audio media units contained in a WebLab document, using the given sphinxTranscriptor.
 
Method Summary
protected  void addWord(java.lang.StringBuffer sb, org.ow2.weblab.core.model.Annotation textAnnotation, org.ow2.weblab.core.model.Annotation audioAnnotation, org.ow2.weblab.core.model.Text text, org.ow2.weblab.core.model.Audio audio, edu.cmu.sphinx.decoder.search.Token token, edu.cmu.sphinx.frontend.FloatData startFeature, edu.cmu.sphinx.frontend.FloatData endFeature)
          Generates the two segments (a linear segment on the Text unit and a temporal segment on the Audio unit) with all configured annotations.
protected  void generatedTranscriptedText(edu.cmu.sphinx.decoder.search.Token curToken, org.ow2.weblab.core.model.Text text, org.ow2.weblab.core.model.Audio audio, org.ow2.weblab.core.model.Annotation textAnnotation, org.ow2.weblab.core.model.Annotation audioAnnotation)
          Creates aligned segments from Token, Text and Audio.
 java.lang.String toString()
           
 void transcriptDocument(org.ow2.weblab.core.model.Document parent)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

sphinxTranscriptor

protected SphinxTranscriptor sphinxTranscriptor

writeTokenInfo

protected boolean writeTokenInfo

writeFiller

protected boolean writeFiller

writeScore

protected boolean writeScore

writePronunciation

protected boolean writePronunciation

targetLang

protected java.lang.String targetLang

contentManager

protected org.ow2.weblab.content.api.ContentManager contentManager
Constructor Detail

WebLabTextTranscriptCreator

public WebLabTextTranscriptCreator(SphinxTranscriptor sphinxTranscriptor,
                                   org.ow2.weblab.content.api.ContentManager contentManager,
                                   boolean writeTokenInfo,
                                   boolean writeFiller,
                                   boolean writeScore,
                                   boolean writePronunciation,
                                   java.lang.String targetLang)
Generates text transcripts from all audio media units contained in a WebLab document, using the given sphinxTranscriptor.

Parameters:
sphinxTranscriptor - model used for transcription
contentManager - content manager to be used
writeTokenInfo - true if you want to add extra information on each token
writeFiller - true if you want filler words to be included
writeScore - true if you want to write the score
writePronunciation - true if you want to write the pronunciation
targetLang - target language
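
The constructor takes four adjacent boolean flags, which are easy to swap at a call site. The sketch below is one possible way to keep them named. TranscriptFlags is a hypothetical helper, not part of WebLab, and the commented-out call assumes only the constructor signature documented above.

```java
/** Hypothetical helper (not part of WebLab) that names the four boolean flags. */
public final class TranscriptFlags {
    public final boolean writeTokenInfo;
    public final boolean writeFiller;
    public final boolean writeScore;
    public final boolean writePronunciation;

    private TranscriptFlags(boolean writeTokenInfo, boolean writeFiller,
                            boolean writeScore, boolean writePronunciation) {
        this.writeTokenInfo = writeTokenInfo;
        this.writeFiller = writeFiller;
        this.writeScore = writeScore;
        this.writePronunciation = writePronunciation;
    }

    /** All flags off. */
    public static TranscriptFlags none() {
        return new TranscriptFlags(false, false, false, false);
    }

    public TranscriptFlags withTokenInfo(boolean v) {
        return new TranscriptFlags(v, writeFiller, writeScore, writePronunciation);
    }

    public TranscriptFlags withFiller(boolean v) {
        return new TranscriptFlags(writeTokenInfo, v, writeScore, writePronunciation);
    }

    public TranscriptFlags withScore(boolean v) {
        return new TranscriptFlags(writeTokenInfo, writeFiller, v, writePronunciation);
    }

    public TranscriptFlags withPronunciation(boolean v) {
        return new TranscriptFlags(writeTokenInfo, writeFiller, writeScore, v);
    }

    @Override
    public String toString() {
        return "tokenInfo=" + writeTokenInfo + ", filler=" + writeFiller
                + ", score=" + writeScore + ", pronunciation=" + writePronunciation;
    }

    public static void main(String[] args) {
        TranscriptFlags flags = TranscriptFlags.none().withScore(true).withFiller(true);
        System.out.println(flags);
        // With real WebLab objects in scope, the flags would be passed in the
        // documented parameter order:
        // new WebLabTextTranscriptCreator(sphinxTranscriptor, contentManager,
        //         flags.writeTokenInfo, flags.writeFiller, flags.writeScore,
        //         flags.writePronunciation, targetLang);
    }
}
```

Note that the constructor's parameter order (writeTokenInfo, writeFiller, writeScore, writePronunciation) differs from the alphabetical order used in the Field Summary.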
Method Detail

transcriptDocument

public void transcriptDocument(org.ow2.weblab.core.model.Document parent)
                        throws org.ow2.weblab.core.services.InvalidParameterException,
                               org.ow2.weblab.core.services.ContentNotAvailableException
Throws:
org.ow2.weblab.core.services.InvalidParameterException
org.ow2.weblab.core.services.ContentNotAvailableException

generatedTranscriptedText

protected void generatedTranscriptedText(edu.cmu.sphinx.decoder.search.Token curToken,
                                         org.ow2.weblab.core.model.Text text,
                                         org.ow2.weblab.core.model.Audio audio,
                                         org.ow2.weblab.core.model.Annotation textAnnotation,
                                         org.ow2.weblab.core.model.Annotation audioAnnotation)
Creates aligned segments from Token, Text and Audio. Inspired by the Sphinx code in Result.getTimedWordPath.

Parameters:
curToken - Sphinx Token
text - transcribed Text media unit
audio - Audio source media unit
audioAnnotation - Annotation on Audio unit
textAnnotation - Annotation on Text unit
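
Sphinx search tokens are linked backward through their predecessors, so a word path is usually read from the last token back to the first and then reversed. The sketch below illustrates that traversal pattern only. SimpleToken is a hypothetical stand-in, not the edu.cmu.sphinx.decoder.search.Token class, and this is not the actual body of generatedTranscriptedText.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TokenWalkSketch {

    /** Hypothetical stand-in for a search token: a word (or null), a filler flag, a back link. */
    static final class SimpleToken {
        final String word;             // null for non-word tokens
        final boolean filler;          // true for silence/noise words
        final SimpleToken predecessor; // null at the start of the path

        SimpleToken(String word, boolean filler, SimpleToken predecessor) {
            this.word = word;
            this.filler = filler;
            this.predecessor = predecessor;
        }
    }

    /** Walks from the last token back to the first and returns the words in spoken order. */
    static List<String> wordsInOrder(SimpleToken last, boolean wantFiller) {
        List<String> words = new ArrayList<>();
        for (SimpleToken t = last; t != null; t = t.predecessor) {
            if (t.word == null) {
                continue; // skip non-word tokens
            }
            if (t.filler && !wantFiller) {
                continue; // skip fillers unless requested
            }
            words.add(t.word);
        }
        Collections.reverse(words); // the path was collected last-to-first
        return words;
    }

    public static void main(String[] args) {
        SimpleToken t1 = new SimpleToken("hello", false, null);
        SimpleToken t2 = new SimpleToken(null, false, t1);
        SimpleToken t3 = new SimpleToken("<sil>", true, t2);
        SimpleToken t4 = new SimpleToken("world", false, t3);
        System.out.println(wordsInOrder(t4, false)); // [hello, world]
        System.out.println(wordsInOrder(t4, true));  // [hello, <sil>, world]
    }
}
```

The wantFiller argument plays the role that the writeFiller field presumably plays in this class.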

addWord

protected void addWord(java.lang.StringBuffer sb,
                       org.ow2.weblab.core.model.Annotation textAnnotation,
                       org.ow2.weblab.core.model.Annotation audioAnnotation,
                       org.ow2.weblab.core.model.Text text,
                       org.ow2.weblab.core.model.Audio audio,
                       edu.cmu.sphinx.decoder.search.Token token,
                       edu.cmu.sphinx.frontend.FloatData startFeature,
                       edu.cmu.sphinx.frontend.FloatData endFeature)
Generates the two segments (a linear segment on the Text unit and a temporal segment on the Audio unit) with all configured annotations.

Parameters:
textAnnotation - Annotation where to add text segment metadata
audioAnnotation - Annotation where to add audio segment metadata
text - Text unit to add linear segment to
audio - Audio unit to add temporal segment to
token - the Sphinx recognized token
startFeature - Sphinx starting feature
endFeature - Sphinx ending feature
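
The pairing of a text segment with an audio segment can be sketched as follows. This is a minimal illustration, not the actual implementation. AlignedWord and sampleToMillis are hypothetical names, and the sketch assumes that the text segment is a character range in the transcript buffer and that times are derived from sample indices and a sample rate.

```java
public class AlignedSegmentSketch {

    /** Hypothetical pairing of a character range in the text with a time range in the audio. */
    static final class AlignedWord {
        final int charStart; // inclusive offset in the transcript
        final int charEnd;   // exclusive offset in the transcript
        final long startMs;
        final long endMs;

        AlignedWord(int charStart, int charEnd, long startMs, long endMs) {
            this.charStart = charStart;
            this.charEnd = charEnd;
            this.startMs = startMs;
            this.endMs = endMs;
        }

        @Override
        public String toString() {
            return "chars[" + charStart + "," + charEnd + ") time[" + startMs + "ms," + endMs + "ms]";
        }
    }

    /** Converts a sample index to milliseconds for a given sample rate in Hz. */
    static long sampleToMillis(long sampleNumber, int sampleRate) {
        return sampleNumber * 1000L / sampleRate;
    }

    /** Appends the word to the transcript and records its character and time ranges. */
    static AlignedWord addWord(StringBuilder sb, String word,
                               long startSample, long endSample, int sampleRate) {
        if (sb.length() > 0) {
            sb.append(' '); // separate from the previous word
        }
        int charStart = sb.length();
        sb.append(word);
        return new AlignedWord(charStart, sb.length(),
                sampleToMillis(startSample, sampleRate),
                sampleToMillis(endSample, sampleRate));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        AlignedWord w1 = addWord(sb, "hello", 0, 8000, 16000);
        AlignedWord w2 = addWord(sb, "world", 8000, 20000, 16000);
        System.out.println(sb); // hello world
        System.out.println(w1); // chars[0,5) time[0ms,500ms]
        System.out.println(w2); // chars[6,11) time[500ms,1250ms]
    }
}
```

The sketch uses StringBuilder for brevity, whereas the documented method takes a java.lang.StringBuffer.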

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object


Copyright © 2004-2011. All Rights Reserved.