Class URLExtractor

    • Field Detail

      • dir

        protected final java.io.File dir
      • baseURL

        protected final java.lang.String baseURL
      • currentURL

        protected final java.lang.String currentURL
      • saved

        protected final java.util.Map<java.lang.String,java.lang.String> saved
      • deleteFile

        protected boolean deleteFile
    • Constructor Detail

      • URLExtractor

        public URLExtractor(java.io.File dumpDir,
                            java.lang.String currentURL,
                            java.lang.String baseURL)
        Parameters:
        dumpDir - the local directory where any files are dumped
        currentURL - the current local input URL against which relative URLs are resolved; in general this is a file URL (the current working directory)
        baseURL - the base output URL of the extracted data, for instance in an HTTP server environment
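The roles of currentURL and baseURL can be illustrated with a minimal, self-contained sketch. It does not use URLExtractor itself and is not the library's implementation: the class UrlMappingSketch and its methods resolveInput and outputUrl are hypothetical names, and the rule that the output URL is simply baseURL plus the file name is an assumption made for illustration only.

```java
import java.net.URI;

// Hypothetical helper, NOT part of the documented API.
class UrlMappingSketch {

    // Resolve a reference against the current input URL.
    // Assumption: currentURL names a directory, so a trailing slash is added if missing.
    static URI resolveInput(String currentURL, String reference) {
        String base = currentURL.endsWith("/") ? currentURL : currentURL + "/";
        // An absolute reference is returned unchanged by URI.resolve.
        return URI.create(base).resolve(reference);
    }

    // Assumed mapping: the dumped file's name is appended to the base output URL.
    static String outputUrl(String baseURL, String fileName) {
        String base = baseURL.endsWith("/") ? baseURL : baseURL + "/";
        return base + fileName;
    }

    public static void main(String[] args) {
        URI input = resolveInput("file:/data/in", "images/logo.png");
        System.out.println("input:  " + input);
        System.out.println("output: " + outputUrl("http://localhost:8080/dump", "logo.png"));
    }
}
```

In this sketch the relative reference is located through the input URL, while the URL that ends up in the extracted data points at the output location, which is the separation the two constructor parameters describe.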
    • Method Detail

      • isDeleteFile

        public boolean isDeleteFile()
        Returns:
        the deleteFile flag; if true, files are moved rather than copied
      • setDeleteFile

        public void setDeleteFile(boolean deleteFile)
        Parameters:
        deleteFile - the deleteFile to set; if true, files are moved rather than copied. Note that files are NOT removed from zip or mime packages.
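The move-versus-copy distinction controlled by deleteFile can be sketched with the standard java.nio.file API. This is an illustration under assumptions, not the class's actual code: MoveOrCopySketch and its dump method are hypothetical, and overwriting an existing target is a choice made only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, NOT part of the documented API.
class MoveOrCopySketch {

    // Place source into dumpDir; move it if deleteFile is true, otherwise copy it.
    static Path dump(Path source, Path dumpDir, boolean deleteFile) throws IOException {
        Files.createDirectories(dumpDir);
        Path target = dumpDir.resolve(source.getFileName());
        if (deleteFile) {
            // The source no longer exists after a move.
            return Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        // The source is left in place after a copy.
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempDirectory("in");
        Path out = Files.createTempDirectory("out");
        Path file = Files.writeString(in.resolve("a.txt"), "data");

        dump(file, out, false);
        System.out.println("source exists after copy: " + Files.exists(file));

        dump(file, out, true);
        System.out.println("source exists after move: " + Files.exists(file));
    }
}
```

The sketch only covers plain files on disk; per the parameter description, content that lives inside a zip or mime package is not removed even when deleteFile is true.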
      • getSaved

        public java.util.Set<java.lang.String> getSaved()
        Getter for the set of saved files
        Returns:
        the set of saved files
      • setWantLog

        public void setWantLog(boolean bWant)
        Parameters:
        bWant - if true, each move is logged
      • addProtocol

        public void addProtocol(UrlUtil.URLProtocol protocol)
        Adds a protocol to the list of supported protocols
        Parameters:
        protocol - the protocol to add