public class WgetCmdLineCrawler extends CmdLineCrawler
Nested classes inherited from CmdLineCrawler: CmdLineCrawler.CommandLineBuilder, CmdLineCrawler.RunnableCrawlJob

| Modifier and Type | Field and Description |
|---|---|
| static String | ATTR_OUTPUT_LEVEL: The constant ATTR_OUTPUT_LEVEL. |
| static String | ATTR_SUCCESS_CODE: The constant ATTR_SUCCESS_CODE. |

Inherited fields: ATTR_COMPRESS_WARC, ATTR_COMPRESSED_WARC_FILE_EXTENSION, ATTR_CRAWL_EXECUTOR_SPEC, ATTR_ERROR_LOG_LEVEL, ATTR_EXCLUDE_STATUS_PATTERN, ATTR_JOIN_OUTPUT_STREAMS, ATTR_OUTPUT_LOG_LEVEL, ATTR_PROC_EXIT_WAIT, ATTR_UNCOMPRESSED_WARC_FILE_EXTENSION, ATTR_UNSUPPORTED_PARAMS, cmdLineBuilder, compressWarc, config, crawlMap, DEFAULT_CMDLINE_CRAWL_EXECUTOR_SPEC, DEFAULT_COMPRESS_WARC, DEFAULT_COMPRESSED_WARC_FILE_EXTENSION, DEFAULT_ERROR_LOG_LEVEL, DEFAULT_EXCLUDE_STATUS_PATTERN, DEFAULT_JOIN_OUTPUT_STREAMS, DEFAULT_OUTPUT_LOG_LEVEL, DEFAULT_PROC_EXIT_WAIT, DEFAULT_UNCOMPRESSED_WARC_FILE_EXTENSION, errorLogLevel, excludeStatusPattern, outputLogLevel, pcManager, PREFIX, procExitWait, START_URL_KEY, unsupportedParams, URL_STEMS_KEY, warcFileFilter

| Constructor and Description |
|---|
| WgetCmdLineCrawler() |

| Modifier and Type | Method and Description |
|---|---|
| protected boolean | didCrawlSucceed(int exitCode) |
| List<String> | getConfigOptions() |
| double | getConnectTimeout() |
| double | getFetchDelay() |
| long | getMaxRetries() |
| double | getReadTimeout() |
| double | getRetryDelay() |
| void | updateCrawlerConfig(CrawlerConfig crawlerConfig): Set the configuration parameters for this crawler. |

Inherited methods: deleteAllCrawls, disable, getCmdLineBuilder, getCompressedWarcExtension, getConfig, getCrawl, getCrawlerConfig, getCrawlerId, getErrorLogLevel, getOutputLogLevel, getPluggableCrawlManager, getProcExitWait, getUncompressedWarcExtension, getUnsupportedParams, getWarcFileFilter, initCrawlScheduler, isCrawlerEnabled, isElgibleForCrawl, isJoinOutputStreams, requestCrawl, setCmdLineBuilder, setConfig, setCrawlManager, setNamespace, setPluggableCrawlManager, setV2Repo, shutdown, shutdownWithWait, stopCrawl, storeInRepository, updateAuConfig, useCompressWarc

public static final String ATTR_SUCCESS_CODE
public static final String ATTR_OUTPUT_LEVEL
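
The names ATTR_SUCCESS_CODE and didCrawlSucceed(int exitCode) suggest that a crawl is judged by comparing the wget process exit code against a configured set of acceptable codes. This page does not document the attribute's value format or the comparison logic, so the following is only a minimal sketch under stated assumptions: the class `ExitCodeCheck`, the helper `parseSuccessCodes`, and the comma-separated format are all hypothetical. The exit-code meanings in the comments come from the GNU Wget manual.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not the LOCKSS implementation.
class ExitCodeCheck {

    /** Parses a comma-separated list of exit codes such as "0, 8" (assumed format). */
    static Set<Integer> parseSuccessCodes(String spec) {
        Set<Integer> codes = new TreeSet<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                codes.add(Integer.parseInt(trimmed));
            }
        }
        return codes;
    }

    /** Returns true when the process exit code is one of the accepted codes. */
    static boolean didCrawlSucceed(int exitCode, Set<Integer> successCodes) {
        return successCodes.contains(exitCode);
    }

    public static void main(String[] args) {
        // wget exits with 0 when no problems occurred and with 8 when the
        // server issued an error response (for example a 404 on one URL).
        Set<Integer> accepted = parseSuccessCodes("0, 8");
        System.out.println(didCrawlSucceed(0, accepted)); // accepted code
        System.out.println(didCrawlSucceed(4, accepted)); // 4 = network failure
    }
}
```

Treating exit code 8 as acceptable is a design choice rather than a recommendation: a large crawl often hits a few broken links, and a deployment may or may not want that to fail the whole job.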
public void updateCrawlerConfig(CrawlerConfig crawlerConfig)
Description copied from interface: PluggableCrawler
Specified by: updateCrawlerConfig in interface PluggableCrawler
Overrides: updateCrawlerConfig in class CmdLineCrawler
Parameters: crawlerConfig - the configuration parameters to use

public long getMaxRetries()
public double getRetryDelay()
public double getConnectTimeout()
public double getReadTimeout()
public double getFetchDelay()
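
The five getters above return retry and timing settings, and wget exposes command-line options with closely matching meanings. This page does not say how the crawler translates these values, so the sketch below only shows one plausible mapping. The class `WgetOptionSketch`, the method `buildOptions`, and the assumption that the double values are seconds are all hypothetical. The flags themselves (`--tries`, `--waitretry`, `--connect-timeout`, `--read-timeout`, `--wait`) are documented wget options.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the LOCKSS implementation.
class WgetOptionSketch {

    /** Builds wget options from values shaped like the getters' return types. */
    static List<String> buildOptions(long maxRetries,
                                     double retryDelay,
                                     double connectTimeout,
                                     double readTimeout,
                                     double fetchDelay) {
        List<String> opts = new ArrayList<>();
        opts.add("--tries=" + maxRetries);               // attempts per URL
        opts.add("--waitretry=" + retryDelay);           // max wait between retries
        opts.add("--connect-timeout=" + connectTimeout); // TCP connect timeout
        opts.add("--read-timeout=" + readTimeout);       // idle read timeout
        opts.add("--wait=" + fetchDelay);                // pause between fetches
        return opts;
    }

    public static void main(String[] args) {
        // Example values are made up for the demonstration.
        List<String> opts = buildOptions(3, 10.0, 30.0, 60.0, 1.5);
        System.out.println("wget " + String.join(" ", opts));
    }
}
```
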
protected boolean didCrawlSucceed(int exitCode)
Overrides: didCrawlSucceed in class CmdLineCrawler

Copyright © 2000–2023 LOCKSS Program. All rights reserved.