Class CmdLineCrawl
java.lang.Object
org.lockss.laaws.crawler.impl.pluggable.PluggableCrawl
org.lockss.laaws.crawler.impl.pluggable.CmdLineCrawl
A class to wrap a single CommandLineCrawl
-
Nested Class Summary
Nested classes/interfaces inherited from class org.lockss.laaws.crawler.impl.pluggable.PluggableCrawl
PluggableCrawl.PluggableCrawlerStatus -
Field Summary
Fields
Modifier and Type, Field, Description
protected static Pattern bytesPattern
protected CmdLineCrawler crawler
protected String errorLogLevel
protected static Pattern errorPattern
protected String outputLogLevel
protected static Pattern successPattern
protected String threadName
protected File tmpDir: The temp directory used to store any files.
protected static Pattern urlPattern
Fields inherited from class org.lockss.laaws.crawler.impl.pluggable.PluggableCrawl
au, crawlDesc, crawlerConfig, crawlerStatus, crawlJob -
Constructor Summary
Constructors
Constructor, Description
CmdLineCrawl(CmdLineCrawler crawler, org.lockss.plugin.ArchivalUnit au, org.lockss.util.rest.crawler.CrawlJob crawlJob): Instantiates a new Cmd line crawl.
-
Method Summary
Modifier and Type, Method, Description
static long extractBytes(String str)
extractUrls(String text): Returns a list with all links contained in the input.
getCommand: Gets command.
org.lockss.daemon.LockssRunnable getRunnable()
getStems()
getTmpDir: Gets tmp dir.
getWarcFiles(List<String> exts)
void parseLine
org.lockss.crawler.CrawlerStatus startCrawl(): Enqueue a crawl request.
org.lockss.crawler.CrawlerStatus stopCrawl(): Stop crawl crawler status.
toString()
Methods inherited from class org.lockss.laaws.crawler.impl.pluggable.PluggableCrawl
generateKey, getAu, getAuId, getCrawlDesc, getCrawlerConfig, getCrawlerId, getCrawlerStatus, getCrawlKey, getCrawlKind, getCrawlStatus, getJobStatus, setCrawlerStatus
-
Field Details
-
crawler
-
threadName
-
command
-
tmpDir
The temp directory used to store any files. -
outputLogLevel
-
errorLogLevel
-
successPattern
-
errorPattern
-
urlPattern
-
bytesPattern
-
-
Constructor Details
-
CmdLineCrawl
public CmdLineCrawl(CmdLineCrawler crawler, org.lockss.plugin.ArchivalUnit au, org.lockss.util.rest.crawler.CrawlJob crawlJob)
Instantiates a new Cmd line crawl.
Parameters:
crawler - the crawler for this crawl
crawlJob - the job for this crawl
-
-
Method Details
-
startCrawl
public org.lockss.crawler.CrawlerStatus startCrawl()
Description copied from class: PluggableCrawl
Enqueue a crawl request.
Specified by:
startCrawl in class PluggableCrawl
Returns:
the crawler status
-
stopCrawl
public org.lockss.crawler.CrawlerStatus stopCrawl()
Description copied from class: PluggableCrawl
Stop crawl crawler status.
Specified by:
stopCrawl in class PluggableCrawl
Returns:
the crawler status
-
getTmpDir
Gets tmp dir.
Returns:
the tmp dir
-
getWarcFiles
-
getCommand
Gets command.
Returns:
the command
-
getReqUrls
-
getStems
-
getRunnable
public org.lockss.daemon.LockssRunnable getRunnable() -
parseLine
-
extractUrls
Returns a list with all links contained in the input -
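The description of extractUrls says only that it returns every link found in the input text. As an illustration of that idea, the following is a minimal, self-contained sketch that pulls http and https links out of a string with a regular expression. The class name UrlExtractSketch and the pattern URL_PATTERN are hypothetical; the actual urlPattern used by CmdLineCrawl, and the real return type of extractUrls, are not shown on this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not the LOCKSS implementation.
public class UrlExtractSketch {

  // Assumed pattern: an http or https scheme followed by any run of
  // characters that are not whitespace, quotes, or angle brackets.
  private static final Pattern URL_PATTERN =
      Pattern.compile("https?://[^\\s\"'<>]+");

  // Returns every substring of text that matches URL_PATTERN, in order
  // of appearance. A null input yields an empty list.
  public static List<String> extractUrls(String text) {
    List<String> urls = new ArrayList<>();
    if (text == null) {
      return urls;
    }
    Matcher matcher = URL_PATTERN.matcher(text);
    while (matcher.find()) {
      urls.add(matcher.group());
    }
    return urls;
  }

  public static void main(String[] args) {
    String line = "fetched http://example.com/a.html then https://example.org/b ok";
    for (String url : extractUrls(line)) {
      System.out.println(url);
    }
  }
}
```

A pattern this loose will also capture trailing punctuation such as a closing parenthesis, so real crawler output may need a stricter expression.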
extractBytes
-
toString
-