public abstract class BaseSeimiCrawler extends java.lang.Object implements SeimiCrawler
| Modifier and Type | Field and Description |
|---|---|
| protected java.lang.String | crawlerName |
| protected java.lang.String[] | defUAs |
| protected org.slf4j.Logger | logger |

| Constructor and Description |
|---|
| BaseSeimiCrawler() |

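| Modifier and Type | Method and Description |
|---|---|
| java.lang.String[] | allowRules(): Sets the URL matching rules for requests that are allowed. |
| java.lang.String[] | denyRules(): Sets the URL matching rules for requests that should not be visited. |
| java.lang.String | getCrawlerName() |
| java.lang.String | getUserAgent() |
| void | handleErrorRequest(Request request): Called to record a failed request once the number of processing errors for that request exceeds the maximum number of retries, whether set by the developer or left at the default. |
| java.lang.String | proxy() |
| protected void | push(Request request) |
| void | setCrawlerName(java.lang.String crawlerName) |
| java.util.List<Request> | startRequests() |
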
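The allowRules() and denyRules() entries above only state that each returns an array of URL matching rules. As a rough illustration of how such rule arrays might be applied to a candidate URL, here is a minimal sketch. It is not SeimiCrawler code: the class UrlRuleSketch and its methods are hypothetical, and it assumes that rules are regular expressions matched against the whole URL, that deny rules take precedence, and that an empty allow list permits everything. None of these assumptions is confirmed by this page.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in, not part of SeimiCrawler: shows one plausible way
// allow/deny rule arrays could be applied to a candidate URL.
final class UrlRuleSketch {

    // Returns true when any rule, treated as a regular expression, matches the whole URL.
    static boolean matchesAny(String[] rules, String url) {
        if (rules == null) {
            return false;
        }
        for (String rule : rules) {
            if (Pattern.matches(rule, url)) {
                return true;
            }
        }
        return false;
    }

    // Assumption: deny rules win over allow rules, and a null or empty
    // allow list permits every URL that is not denied.
    static boolean shouldVisit(String[] allowRules, String[] denyRules, String url) {
        if (matchesAny(denyRules, url)) {
            return false;
        }
        if (allowRules == null || allowRules.length == 0) {
            return true;
        }
        return matchesAny(allowRules, url);
    }
}
```

A crawler subclass would typically return its own arrays from allowRules() and denyRules(); the precedence shown here is a design choice of the sketch, so check the library source before relying on it.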
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface SeimiCrawler: start, startUrls

protected org.slf4j.Logger logger

protected java.lang.String crawlerName

protected java.lang.String[] defUAs

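protected void push(Request request)

public java.lang.String getUserAgent()
Specified by: getUserAgent in interface SeimiCrawler

public java.lang.String[] allowRules()
Description copied from interface: SeimiCrawler
Sets the URL matching rules for requests that are allowed.
Specified by: allowRules in interface SeimiCrawler

public java.lang.String[] denyRules()
Description copied from interface: SeimiCrawler
Sets the URL matching rules for requests that should not be visited.
Specified by: denyRules in interface SeimiCrawler

public java.lang.String proxy()
Specified by: proxy in interface SeimiCrawler

public void handleErrorRequest(Request request)
Description copied from interface: SeimiCrawler
Called to record a failed request once the number of processing errors for that request exceeds the maximum number of retries, whether set by the developer or left at the default.
Specified by: handleErrorRequest in interface SeimiCrawler
Parameters: request - --

public java.util.List<Request> startRequests()
Specified by: startRequests in interface SeimiCrawler
Returns: recommended for use when String[] startUrls(); cannot meet the requirements

public void setCrawlerName(java.lang.String crawlerName)

public java.lang.String getCrawlerName()
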
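handleErrorRequest(Request request) is documented as being called once a request has failed more times than the maximum number of retries allows. The sketch below illustrates that contract with stand-in types only. RetrySketch, SketchRequest, onFailure, and failedUrls are hypothetical names invented for this example, and the real library's bookkeeping may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; none of these names come from SeimiCrawler.
final class RetrySketch {

    // Minimal request model: a URL plus a per-request retry limit.
    static final class SketchRequest {
        final String url;
        final int maxRetries;
        int failures = 0;

        SketchRequest(String url, int maxRetries) {
            this.url = url;
            this.maxRetries = maxRetries;
        }
    }

    // URLs of requests that exhausted their retries.
    final List<String> failedUrls = new ArrayList<>();

    // Plays the role of handleErrorRequest: records the request that could not be processed.
    void handleErrorRequest(SketchRequest request) {
        failedUrls.add(request.url);
    }

    // Registers one processing failure. Returns true if the request should be retried,
    // false once the failure count exceeds maxRetries and the handler has been called.
    boolean onFailure(SketchRequest request) {
        request.failures++;
        if (request.failures > request.maxRetries) {
            handleErrorRequest(request);
            return false;
        }
        return true;
    }
}
```

With a limit of two retries, the first two failures lead to a retry and the third hands the request to the error handler. In a real crawler, an override of handleErrorRequest might log the request or store it for later inspection.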
Copyright © 2019. All Rights Reserved.