public class DefaultPageCrawler extends Object implements PageCrawler
This crawler looks for classes annotated with PageHandler and uses their declarations of
UrlMapping as applicable url patterns.

| Modifier and Type | Field and Description |
|---|---|
| protected LinkUtils | linkUtils |
| protected org.apache.commons.logging.Log | logger |
| protected PageHandlerFactory | pageHandlerFactory |
| protected List<PageHandler> | pageHandlers |
| protected int | pageLoadAjaxSeconds |
| protected int | pageLoadWaitSeconds |
| protected Set<String> | urlPatterns |
| protected VaniContext | vaniContext |
| protected WaitUtil | waitUtil |
| protected org.openqa.selenium.WebDriver | webDriver |

| Constructor and Description |
|---|
| DefaultPageCrawler() |

| Modifier and Type | Method and Description |
|---|---|
| protected void | crawl() - This method does the crawling. |
| protected List<String> | getApplicableUrls() - This method looks for all urls on the current page which match the declared url patterns. |
| protected void | handle(String url) - This method opens the given url, executes waits and calls every applicable PageHandler. |
| void | initializeHandlers() - This method looks for classes annotated with PageHandler. |
| protected boolean | isVisited(String url, Set<String> visitedUrls) - This method checks whether the provided url is in the given visitedUrls set. |
| protected String | removeJSessionId(String url) - This method removes the JSESSIONID from the provided url. |
| void | start() - This method will crawl all applicable urls with the WebDriver resolved by the spring context. |
| void | start(org.openqa.selenium.WebDriver webDriver) - This method will crawl all applicable urls with the specified WebDriver. |
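
The interplay of getApplicableUrls(), isVisited(String, Set) and removeJSessionId(String) summarized above can be illustrated with a minimal, self-contained sketch. It is not the DefaultPageCrawler source. It assumes that the session id appears as a `;jsessionid=...` path parameter, that url patterns are regular expressions, and that pages are visited breadth-first. The `site` map is a hypothetical in-memory stand-in for pages that the real class loads through WebDriver.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

final class CrawlSketch {

    // Assumption: the session id is a ";jsessionid=..." path parameter that
    // ends at the next ';', '?' or '#'. Matching is case-insensitive, which
    // covers the uppercase and lowercase spellings named in the docs.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^;?#]*", Pattern.CASE_INSENSITIVE);

    // Returns the url without the session id, or the url unchanged if absent.
    static String removeJSessionId(String url) {
        return JSESSIONID.matcher(url).replaceFirst("");
    }

    // The session id is stripped before the lookup, so the same page reached
    // with two different session ids counts as visited.
    static boolean isVisited(String url, Set<String> visitedUrls) {
        return visitedUrls.contains(removeJSessionId(url));
    }

    // Assumption: url patterns are regular expressions; the real pattern
    // syntax is not stated in this page.
    static List<String> getApplicableUrls(List<String> linksOnPage, Set<String> urlPatterns) {
        List<String> applicable = new ArrayList<>();
        for (String link : linksOnPage) {
            for (String pattern : urlPatterns) {
                if (link.matches(pattern)) {
                    applicable.add(link);
                    break;
                }
            }
        }
        return applicable;
    }

    // Breadth-first crawl over the hypothetical in-memory site map.
    // Returns the visited urls (without session ids) in visiting order.
    static Set<String> crawl(String startUrl, Map<String, List<String>> site, Set<String> urlPatterns) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUrl);
        while (!queue.isEmpty()) {
            String url = queue.poll();
            if (isVisited(url, visited)) {
                continue;
            }
            String clean = removeJSessionId(url);
            visited.add(clean);
            List<String> links = site.getOrDefault(clean, Collections.<String>emptyList());
            queue.addAll(getApplicableUrls(links, urlPatterns));
        }
        return visited;
    }
}
```

With this sketch, `CrawlSketch.removeJSessionId("http://example.org/shop;JSESSIONID=ABC123")` yields `http://example.org/shop`, and a url without the marker is returned unchanged.
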
protected final org.apache.commons.logging.Log logger
protected org.openqa.selenium.WebDriver webDriver
@Autowired protected VaniContext vaniContext
@Autowired protected PageHandlerFactory pageHandlerFactory
@Autowired protected WaitUtil waitUtil
@Autowired protected LinkUtils linkUtils
@Value(value="${vani.pageCrawler.pageLoadWaitSeconds:1}")
protected int pageLoadWaitSeconds
@Value(value="${vani.pageCrawler.pageLoadAjaxSeconds:1}")
protected int pageLoadAjaxSeconds
protected List<PageHandler> pageHandlers
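
The two @Value declarations above use Spring's `${name:default}` placeholder form, so both wait times fall back to 1 when the property is not set. The following plain-Java analogy shows that fallback behavior only. It does not use Spring's placeholder resolver, and the class and method names are hypothetical.

```java
import java.util.Properties;

final class WaitSettingsSketch {

    // Mimics "${key:1}": use the configured value if present, else the default 1.
    // Hypothetical helper for illustration; Spring resolves the placeholder itself.
    static int resolveSeconds(Properties props, String key) {
        return Integer.parseInt(props.getProperty(key, "1").trim());
    }
}
```

Setting `vani.pageCrawler.pageLoadWaitSeconds` to 5 in the application's property source would therefore override the default for pageLoadWaitSeconds, while pageLoadAjaxSeconds would stay at 1 unless its own property is set.
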
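@PostConstruct public void initializeHandlers()
This method looks for classes annotated with PageHandler. The found classes will be
instantiated by PageHandlerFactory and the declared url patterns
are registered.

public void start(org.openqa.selenium.WebDriver webDriver)
Description copied from interface: PageCrawler
This method will crawl all applicable urls with specified WebDriver.
Specified by: start in interface PageCrawler

public void start()
Description copied from interface: PageCrawler
This method will crawl all applicable urls with WebDriver
resolved by spring context.
Specified by: start in interface PageCrawler

protected void crawl()
This method does the crawling.

protected void handle(String url)
This method opens given url, executes waits and calls every applicable PageHandler.
Parameters:
url

protected boolean isVisited(String url, Set<String> visitedUrls)
This method will check whether provided url is in given visitedUrls set. If specified url
contains a jSessionId, it will be removed before checking.
Parameters:
url
visitedUrls - set of all visited urls
Returns: true if given visitedUrls contains provided url (without jSessionId), else false

protected String removeJSessionId(String url)
This method removes the JSESSIONID from provided url. It checks
uppercase and lowercase.
Parameters:
url
Returns: the url without JSESSIONID, or provided url if it doesn't contain the marker.

Copyright © 2016. All rights reserved.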