Class AbstractGenerator

  • Direct Known Subclasses:
    HTMLSitemapGenerator, SitemapsOrgGenerator

    public abstract class AbstractGenerator
    extends Object
    Base class for creating sitemaps of various kinds. A sitemap consists of one or more files that list the significant URLs on a site so that search engines can crawl it efficiently. Modification dates may also be included. A sitemap index file linking to each of the sitemap files is also generated; search engines should be directed to this index file.

    Provides most of the required functionality; subclasses need only implement a few methods that supply the "boilerplate" and the text for each included URL.

    Typical usage:

       AbstractGenerator g = new FooGenerator(...);
       while (...) {
         g.addURL(url, date);
       }
       g.finish();
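
FooGenerator in the usage above stands for any concrete subclass. As a rough illustration of the markup-producing methods such a subclass supplies, the sketch below emits sitemaps.org-style XML. The class name SitemapMarkupSketch and its escape helper are hypothetical, and the class deliberately does not extend AbstractGenerator so that it compiles on its own; the real subclasses may format their output differently.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-alone sketch; not part of the documented API.
class SitemapMarkupSketch {

    // Markup written at the top of each sitemap file.
    static String getLeadingBoilerPlate() {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">";
    }

    // Markup written at the end of each sitemap file.
    static String getTrailingBoilerPlate() {
        return "</urlset>";
    }

    // Markup for a single URL; lastMod may be null, in which case
    // the <lastmod> element is omitted.
    static String getURLText(String url, Date lastMod) {
        StringBuilder sb = new StringBuilder("<url><loc>")
                .append(escape(url)).append("</loc>");
        if (lastMod != null) {
            // W3C datetime in UTC, as accepted by the sitemaps.org protocol.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
            fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
            sb.append("<lastmod>").append(fmt.format(lastMod)).append("</lastmod>");
        }
        return sb.append("</url>").toString();
    }

    // Escape the five XML special characters ('&' must be replaced first).
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }
}
```

A fresh SimpleDateFormat is created per call because that class is not thread-safe.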
     
    Author:
    Robert Tansley
    • Field Detail

      • fileCount

        protected int fileCount
        Number of files written so far
      • bytesWritten

        protected int bytesWritten
        Number of bytes written to current file
      • urlsWritten

        protected int urlsWritten
        Number of URLs written to current file
      • outputDir

        protected File outputDir
        Directory files are written to
      • currentOutput

        protected PrintStream currentOutput
        Current output
    • Constructor Detail

      • AbstractGenerator

        public AbstractGenerator(File outputDirIn)
        Initialize this generator to write to the given directory. This must be called by any subclass constructor.
        Parameters:
        outputDirIn - directory to write sitemap files to
    • Method Detail

      • startNewFile

        protected void startNewFile()
                             throws IOException
        Start writing a new sitemap file.
        Throws:
        IOException - if an error occurs creating the file
      • addURL

        public void addURL(String url,
                           Date lastMod)
                    throws IOException
        Add the given URL to the sitemap.
        Parameters:
        url - Full URL to add
        lastMod - Date URL was last modified, or null
        Throws:
        IOException - if an error occurs writing
      • closeCurrentFile

        protected void closeCurrentFile()
                                 throws IOException
        Finish with the current sitemap file.
        Throws:
        IOException - if an error occurs writing
      • finish

        public int finish()
                   throws IOException
        Complete writing the sitemap files and write the index file. Call this once all calls to addURL(String, Date) have been made; it invalidates the generator.
        Returns:
        number of sitemap files written.
        Throws:
        IOException - if an error occurs writing
      • getURLText

        public abstract String getURLText(String url,
                                          Date lastMod)
        Return marked-up text to be included in a sitemap about a given URL.
        Parameters:
        url - URL to add information about
        lastMod - date URL was last modified, or null if unknown or not applicable
        Returns:
        the mark-up to include
      • getLeadingBoilerPlate

        public abstract String getLeadingBoilerPlate()
        Return the boilerplate at the top of a sitemap file.
        Returns:
        The boilerplate markup.
      • getTrailingBoilerPlate

        public abstract String getTrailingBoilerPlate()
        Return the boilerplate at the end of a sitemap file.
        Returns:
        The boilerplate markup.
      • getMaxSize

        public abstract int getMaxSize()
        Return the maximum size in bytes that an individual sitemap file should be.
        Returns:
        the size in bytes.
      • getMaxURLs

        public abstract int getMaxURLs()
        Return the maximum number of URLs that an individual sitemap file should contain.
        Returns:
        the maximum number of URLs.
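
One way the two limits could interact is sketched below: before writing each URL, a generator checks whether adding it would exceed either the byte limit or the URL limit, and starts a new file if so. The class RolloverSketch, its method, and the trailerBytes parameter are hypothetical; the actual logic inside addURL is not shown in this documentation and may differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not part of the documented API.
class RolloverSketch {

    /**
     * Decide whether the next entry must go into a new sitemap file.
     *
     * @param bytesWritten bytes already written to the current file
     * @param urlsWritten  URLs already written to the current file
     * @param nextEntry    markup for the URL about to be added
     * @param trailerBytes size of the trailing boilerplate still to be written
     * @param maxSize      value a subclass would return from getMaxSize()
     * @param maxURLs      value a subclass would return from getMaxURLs()
     */
    static boolean needsNewFile(int bytesWritten, int urlsWritten,
                                String nextEntry, int trailerBytes,
                                int maxSize, int maxURLs) {
        int entryBytes = nextEntry.getBytes(StandardCharsets.UTF_8).length;
        // Use long arithmetic so the sum cannot overflow near Integer.MAX_VALUE.
        boolean tooBig = (long) bytesWritten + entryBytes + trailerBytes > maxSize;
        boolean tooMany = urlsWritten + 1 > maxURLs;
        return tooBig || tooMany;
    }
}
```

For sitemaps.org-style output, the protocol caps each file at 50,000 URLs and 50 MB uncompressed, so a subclass would typically return values at or below those limits.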
      • useCompression

        public abstract boolean useCompression()
        Return whether the written sitemap files and index should be GZIP-compressed.
        Returns:
        true if GZIP compression should be used, false otherwise.
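
A minimal sketch of how this flag might be honored when opening an output stream, using only the standard java.util.zip classes. The class CompressionSketch and its open method are hypothetical; how the base class actually opens its files is not described here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-alone sketch; not part of the documented API.
class CompressionSketch {

    // Wrap the raw stream in a GZIP stream when compression is requested,
    // then expose it as a UTF-8 PrintStream.
    static PrintStream open(OutputStream raw, boolean useCompression) throws IOException {
        OutputStream out = useCompression ? new GZIPOutputStream(raw) : raw;
        return new PrintStream(out, false, "UTF-8");
    }
}
```

Closing the returned PrintStream also closes the GZIPOutputStream, which writes the GZIP trailer; a compressed file that is never closed will be truncated.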
      • getFilename

        public abstract String getFilename(int number)
        Return the filename under which the sitemap with the given index should be stored.
        Parameters:
        number - index of the sitemap file (zero is first).
        Returns:
        the filename to write the sitemap to.
      • getIndexFilename

        public abstract String getIndexFilename()
        Get the filename the index should be written to.
        Returns:
        the filename of the index.
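
As an illustration of the two filename methods, the sketch below derives names from the file number and appends ".gz" when compression is on. The class FilenameSketch and the names "sitemap<N>.xml" and "sitemap_index.xml" are assumptions made for the example, not the names used by the actual subclasses.

```java
// Hypothetical stand-alone sketch; not part of the documented API.
class FilenameSketch {

    // Name of the sitemap file with the given zero-based number.
    static String getFilename(int number, boolean compressed) {
        return "sitemap" + number + ".xml" + (compressed ? ".gz" : "");
    }

    // Name of the index file that links to all sitemap files.
    static String getIndexFilename(boolean compressed) {
        return "sitemap_index.xml" + (compressed ? ".gz" : "");
    }
}
```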
      • writeIndex

        public abstract void writeIndex(PrintStream output,
                                        int sitemapCount)
                                 throws IOException
        Write the index file.
        Parameters:
        output - stream to write the index to
        sitemapCount - number of sitemaps that were generated
        Throws:
        IOException - if an IO error occurs
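
A sketch of what a writeIndex implementation could emit for sitemaps.org-style output: one sitemap entry per generated file, wrapped in a sitemapindex element. The class IndexSketch, the extra baseUrl parameter, and the "sitemap<N>.xml" naming are hypothetical; baseUrl is assumed to end with a slash and to be already XML-safe.

```java
import java.io.PrintStream;

// Hypothetical stand-alone sketch; not part of the documented API.
class IndexSketch {

    // Write an index listing sitemapCount files, numbered from zero.
    static void writeIndex(PrintStream output, int sitemapCount, String baseUrl) {
        output.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        output.println("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");
        for (int i = 0; i < sitemapCount; i++) {
            output.println("  <sitemap><loc>" + baseUrl + "sitemap" + i + ".xml</loc></sitemap>");
        }
        output.println("</sitemapindex>");
    }
}
```

A real implementation would more likely build each location from getFilename(int) so that the index and the files on disk cannot drift apart.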