Class CertificateCrawler

  • All Implemented Interfaces:
    Runnable

    public class CertificateCrawler
    extends edu.uci.ics.crawler4j.crawler.WebCrawler
    This class crawls the web for certificates and inserts them into the database.
    • Field Summary

      • Fields inherited from class edu.uci.ics.crawler4j.crawler.WebCrawler

        logger, myController, myId
    • Method Summary

      Modifier and Type Method Description
      boolean shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage, edu.uci.ics.crawler4j.url.WebURL url)
      Decides whether the given URL may lead to content of interest and should therefore be visited.
      void visit(edu.uci.ics.crawler4j.crawler.Page page)
      Retrieves certificates from the visited page.
      • Methods inherited from class edu.uci.ics.crawler4j.crawler.WebCrawler

        getMyController, getMyId, getMyLocalData, getThread, handlePageStatusCode, handleUrlBeforeProcess, init, isNotWaitingForNewURLs, onBeforeExit, onContentFetchError, onContentFetchError, onPageBiggerThanMaxSize, onParseError, onRedirectedStatusCode, onStart, onUnexpectedStatusCode, onUnhandledException, run, setThread, shouldFollowLinksIn
    • Method Detail

      • shouldVisit

        public boolean shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage,
                                   edu.uci.ics.crawler4j.url.WebURL url)
        Decides whether the given URL may lead to content of interest and should therefore be visited.
        Overrides:
        shouldVisit in class edu.uci.ics.crawler4j.crawler.WebCrawler
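
The criteria that shouldVisit applies are not documented here. As a minimal sketch, assuming the crawler only follows HTTP(S) links and skips URLs that end in common media, script, or archive extensions, the decision could look like the following. The class name ShouldVisitSketch, the method looksInteresting, and the extension list are hypothetical; in an actual override the string would typically come from url.getURL().

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class ShouldVisitSketch {
    // Hypothetical skip list: file types assumed unlikely to contain certificates.
    private static final Pattern SKIPPED = Pattern.compile(
            ".*\\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz)(\\?.*)?");

    // Returns true when the URL uses HTTP(S) and does not end in a skipped extension.
    static boolean looksInteresting(String href) {
        if (href == null) {
            return false;
        }
        String lower = href.toLowerCase(Locale.ROOT);
        boolean http = lower.startsWith("http://") || lower.startsWith("https://");
        return http && !SKIPPED.matcher(lower).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksInteresting("https://example.org/certs/root.pem"));
        System.out.println(looksInteresting("https://example.org/logo.png"));
    }
}
```

Keeping the filter in a plain static method makes it testable without starting a crawl; the override would then only delegate to it.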
      • visit

        public void visit(edu.uci.ics.crawler4j.crawler.Page page)
        Retrieves certificates from the visited page.
        Overrides:
        visit in class edu.uci.ics.crawler4j.crawler.WebCrawler
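
How visit locates certificates is not described on this page. One possibility, shown purely as an assumption, is to scan the page text for PEM-encoded certificate blocks. The class VisitSketch and the method findPemBlocks are hypothetical, and the database insertion mentioned in the class description is left out because its interface is not documented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VisitSketch {
    // Matches one PEM certificate block, including its BEGIN and END markers.
    // The reluctant quantifier stops at the first END marker after each BEGIN.
    private static final Pattern PEM = Pattern.compile(
            "-----BEGIN CERTIFICATE-----(.+?)-----END CERTIFICATE-----",
            Pattern.DOTALL);

    // Returns every PEM certificate block found in the text, in order of appearance.
    static List<String> findPemBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = PEM.matcher(text);
        while (matcher.find()) {
            blocks.add(matcher.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Placeholder payload, not a real certificate.
        String text = "header\n-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\nfooter";
        System.out.println(findPemBlocks(text).size());
    }
}
```

In an override of visit, the input text could be taken from the page's parse data, for example HtmlParseData.getText() when page.getParseData() returns an HtmlParseData instance.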