Package ai.preferred.venom
Class Crawler.Builder
- java.lang.Object
  - ai.preferred.venom.Crawler.Builder
- Enclosing class:
  Crawler
public static final class Crawler.Builder extends java.lang.Object
A builder for the Crawler class.
Method Summary
Modifier and Type | Method | Description
Crawler | build() | Builds the crawler with the options specified.
Crawler.Builder | setFetcher(@NotNull Fetcher fetcher) | Sets the Fetcher to be used; if not set, a default is chosen.
Crawler.Builder | setHandlerRouter(@NotNull HandlerRouter router) | Sets the HandlerRouter to be used.
Crawler.Builder | setMaxConnections(int maxConnections) | Sets the number of concurrent connections allowed out of the client.
Crawler.Builder | setMaxTries(int maxTries) | Sets the number of times to retry a request.
Crawler.Builder | setName(@NotNull java.lang.String name) | Sets the name for the crawler thread.
Crawler.Builder | setParallelism(int parallelism) | Sets the parallelism level.
Crawler.Builder | setPropRetainProxy(double propRetainProxy) | Sets the proportion of max tries during which a proxy specified for the request, if any, is used.
Crawler.Builder | setScheduler(@NotNull AbstractQueueScheduler scheduler) | Sets the Scheduler to be used; if not set, a default is chosen.
Crawler.Builder | setSession(@NotNull Session session) | Sets the Session to be used; if not set, defaults to none.
Crawler.Builder | setSleepScheduler(SleepScheduler sleepScheduler) | Sets the SleepScheduler to be used; if not set, a default is chosen.
Crawler.Builder | setWorkerManager(@NotNull WorkerManager workerManager) | Sets the WorkerManager to be used; if not set, a default is chosen.
Method Detail
-
setName
public Crawler.Builder setName(@NotNull java.lang.String name)
Sets the name for the crawler thread.
- Parameters:
  name - name for the crawler thread
- Returns:
  this
-
setFetcher
public Crawler.Builder setFetcher(@NotNull Fetcher fetcher)
Sets the Fetcher to be used; if not set, a default is chosen.
- Parameters:
  fetcher - fetcher to be used
- Returns:
  this
-
setParallelism
public Crawler.Builder setParallelism(int parallelism)
Sets the parallelism level. Defaults to the system thread count.
- Parameters:
  parallelism - the parallelism level
- Returns:
  this
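The page does not say how the "system thread count" default is obtained. One common way to read such a figure in Java is `Runtime.getRuntime().availableProcessors()`; that Venom uses this exact call is an assumption, and the class name `ParallelismDefaultSketch` is hypothetical.

```java
// Illustration only: NOT Venom code. Assumes "system thread count" means the
// number of processors the JVM reports as available.
final class ParallelismDefaultSketch {

  // Returns the number of processors available to this JVM (always at least 1).
  static int defaultParallelism() {
    return Runtime.getRuntime().availableProcessors();
  }

  public static void main(String[] args) {
    System.out.println("Assumed default parallelism: " + defaultParallelism());
  }
}
```

A value like this could be passed explicitly to `setParallelism(int)` when the default should be made visible in configuration.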
-
setWorkerManager
public Crawler.Builder setWorkerManager(@NotNull WorkerManager workerManager)
Sets the WorkerManager to be used; if not set, a default is chosen.
- Parameters:
  workerManager - worker manager to be used
- Returns:
  this
-
setScheduler
public Crawler.Builder setScheduler(@NotNull AbstractQueueScheduler scheduler)
Sets the Scheduler to be used; if not set, a default is chosen.
- Parameters:
  scheduler - scheduler to be used
- Returns:
  this
-
setHandlerRouter
public Crawler.Builder setHandlerRouter(@NotNull HandlerRouter router)
Sets the HandlerRouter to be used. Defaults to none.
- Parameters:
  router - handler router to be used
- Returns:
  this
-
setMaxConnections
public Crawler.Builder setMaxConnections(int maxConnections)
Sets the number of concurrent connections allowed out of the client.
- Parameters:
  maxConnections - maximum number of concurrent connections
- Returns:
  this
-
setMaxTries
public Crawler.Builder setMaxTries(int maxTries)
Sets the number of times to retry a request. This number excludes the first try. Defaults to 50.
- Parameters:
  maxTries - maximum number of retries
- Returns:
  this
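Because the retry count excludes the first try, the total number of attempts a request can receive is one more than `maxTries`. The arithmetic is sketched below; `RetryBudgetSketch` and `totalAttempts` are hypothetical names used for illustration and are not part of Venom.

```java
// Illustration only: NOT Venom code. Shows the arithmetic implied by
// "this number excludes the first try".
final class RetryBudgetSketch {

  // Total attempts = the first try plus the configured number of retries.
  static int totalAttempts(int maxTries) {
    return 1 + maxTries;
  }

  public static void main(String[] args) {
    // With the documented default of 50 retries:
    System.out.println(totalAttempts(50)); // prints 51
  }
}
```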
-
setPropRetainProxy
public Crawler.Builder setPropRetainProxy(double propRetainProxy)
Sets the proportion of max tries during which a proxy specified for the request, if any, is used. The value should be between 0 and 1 inclusive. Defaults to 0.05.
This only takes effect when a specific proxy is set for the request. Beyond this threshold, the proxy set on the request is overridden.
- Parameters:
  propRetainProxy - threshold, as a proportion between 0 and 1
- Returns:
  this
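To make the threshold concrete, the sketch below multiplies the two settings and checks a try count against the result. The exact comparison Venom performs is not stated on this page; treating the proxy as retained while the try count is at most `maxTries * propRetainProxy` is an assumption, and the names `ProxyRetentionSketch`, `threshold`, and `retainsProxy` are hypothetical.

```java
// Illustration only: NOT Venom code. The comparison rule below is an assumed
// reading of "proportion of max tries where a specified proxy will be used".
final class ProxyRetentionSketch {

  // Number of tries (possibly fractional) covered by the retained proxy.
  static double threshold(int maxTries, double propRetainProxy) {
    return maxTries * propRetainProxy;
  }

  // tryCount is 1 for the first attempt, 2 for the second, and so on.
  // Assumption: the request's own proxy is kept while tryCount <= threshold.
  static boolean retainsProxy(int tryCount, int maxTries, double propRetainProxy) {
    return tryCount <= threshold(maxTries, propRetainProxy);
  }

  public static void main(String[] args) {
    // Documented defaults: maxTries = 50, propRetainProxy = 0.05.
    for (int attempt = 1; attempt <= 4; attempt++) {
      System.out.println("attempt " + attempt + " keeps proxy: "
          + retainsProxy(attempt, 50, 0.05));
    }
  }
}
```

Under this reading, the defaults give a threshold of about 2.5 tries, so only the first two attempts would keep the request's own proxy. Raising `propRetainProxy` toward 1 keeps the proxy for more of the retry budget.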
-
setSleepScheduler
public Crawler.Builder setSleepScheduler(SleepScheduler sleepScheduler)
Sets the SleepScheduler to be used; if not set, a default is chosen.
- Parameters:
  sleepScheduler - sleep scheduler to be used
- Returns:
  this
-
setSession
public Crawler.Builder setSession(@NotNull Session session)
Sets the Session to be used; if not set, defaults to none.
- Parameters:
  session - session in which variables are defined
- Returns:
  this
-
build
public Crawler build()
Builds the crawler with the options specified.
- Returns:
  an instance of Crawler
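Since every setter returns `this` and `build()` returns the finished object, calls can be chained. The sketch below shows that pattern with stand-in classes so it runs without the library: `SketchCrawler` and its nested `Builder` are hypothetical and are not Venom's classes. Only the setter names and the documented defaults (50 and 0.05) come from this page; the default name and the argument checks are assumptions.

```java
import java.util.Objects;

// Illustration only: stand-in types, NOT Venom's Crawler or Crawler.Builder.
final class SketchCrawler {
  final String name;
  final int maxTries;
  final double propRetainProxy;

  private SketchCrawler(Builder b) {
    this.name = b.name;
    this.maxTries = b.maxTries;
    this.propRetainProxy = b.propRetainProxy;
  }

  static final class Builder {
    private String name = "crawler";        // hypothetical default
    private int maxTries = 50;              // default documented on this page
    private double propRetainProxy = 0.05;  // default documented on this page

    Builder setName(String name) {
      // Mirrors the @NotNull contract; rejecting null here is an assumption.
      this.name = Objects.requireNonNull(name, "name");
      return this;
    }

    Builder setMaxTries(int maxTries) {
      if (maxTries < 0) { // assumed check, not documented
        throw new IllegalArgumentException("maxTries must be >= 0");
      }
      this.maxTries = maxTries;
      return this;
    }

    Builder setPropRetainProxy(double propRetainProxy) {
      // The page says the value should be within [0, 1]; throwing is an assumption.
      if (propRetainProxy < 0.0 || propRetainProxy > 1.0) {
        throw new IllegalArgumentException("propRetainProxy must be in [0, 1]");
      }
      this.propRetainProxy = propRetainProxy;
      return this;
    }

    SketchCrawler build() {
      return new SketchCrawler(this);
    }
  }

  public static void main(String[] args) {
    // Options not set explicitly keep their defaults.
    SketchCrawler crawler = new SketchCrawler.Builder()
        .setName("news")
        .setMaxTries(10)
        .build();
    System.out.println(crawler.name + " " + crawler.maxTries + " " + crawler.propRetainProxy);
  }
}
```

With the real library on the classpath, the same chain would be written against a `Crawler.Builder` instance; how that instance is obtained is not shown on this page.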