Class HTMLSitemapGenerator


  • public class HTMLSitemapGenerator
    extends AbstractGenerator
    Class for generating HTML "sitemaps" which contain links to various pages in a DSpace site. This should improve search engine coverage of the DSpace site and limit the server load caused by crawlers.
    Author:
    Robert Tansley, Stuart Lewis
    • Field Detail

      • indexURLStem

        protected String indexURLStem
        Stem (leading part) of the URLs at which sitemaps will eventually appear
      • indexURLTail

        protected String indexURLTail
        Tail (trailing part) of the URLs at which sitemaps will eventually appear
    • Constructor Detail

      • HTMLSitemapGenerator

        public HTMLSitemapGenerator(File outputDirIn,
                                    String urlStem,
                                    String urlTail)
        Construct an HTML sitemap generator that writes files to the given directory, with the sitemaps eventually exposed at URLs that start with the given stem and end with the given tail.
        Parameters:
        outputDirIn - Directory to write sitemap files to
        urlStem - start of URL that sitemap files will appear at, e.g. http://dspace.myu.edu/sitemap?sitemap=
        urlTail - end of URL that sitemap files will appear at, e.g. .html or null
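
The stem and tail parameters above suggest that a sitemap's public URL is formed by placing the sitemap number between them. The following is a minimal sketch of that composition, not the DSpace implementation: the class `SitemapUrlSketch` and the method `sitemapUrl` are hypothetical names, and the rule "stem + number + tail" is an assumption drawn from the parameter examples.

```java
// Hypothetical helper, not part of DSpace: shows how a URL stem, a sitemap
// number, and an optional tail could combine into a public sitemap URL.
final class SitemapUrlSketch {

    static String sitemapUrl(String urlStem, int number, String urlTail) {
        // A null tail is treated as "no suffix", matching the "or null" note
        // in the parameter description.
        String tail = (urlTail == null) ? "" : urlTail;
        return urlStem + number + tail;
    }

    public static void main(String[] args) {
        String stem = "http://dspace.myu.edu/sitemap?sitemap=";
        System.out.println(sitemapUrl(stem, 0, ".html"));
        System.out.println(sitemapUrl(stem, 1, null));
    }
}
```

Treating a null tail as an empty string keeps the caller from having to special-case sites whose sitemap URLs carry no file extension.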
    • Method Detail

      • getFilename

        public String getFilename(int number)
        Description copied from class: AbstractGenerator
        Return the filename a sitemap at the given index should be stored at.
        Specified by:
        getFilename in class AbstractGenerator
        Parameters:
        number - index of the sitemap file (zero is first).
        Returns:
        the filename to write the sitemap to.
      • getMaxSize

        public int getMaxSize()
        Description copied from class: AbstractGenerator
        Return the maximum size in bytes that an individual sitemap file should be.
        Specified by:
        getMaxSize in class AbstractGenerator
        Returns:
        the size in bytes.
      • getMaxURLs

        public int getMaxURLs()
        Description copied from class: AbstractGenerator
        Return the maximum number of URLs that an individual sitemap file should contain.
        Specified by:
        getMaxURLs in class AbstractGenerator
        Returns:
        the maximum number of URLs.
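
The two limits returned by getMaxSize and getMaxURLs imply that a caller starts a new sitemap file whenever the current one would exceed either bound. The sketch below illustrates one way such a rollover could work. It is an assumption about the calling logic, not code from AbstractGenerator: `SitemapSplitSketch` and `split` are hypothetical names, and the sketch counts only the bytes of the entries themselves, ignoring any header or footer a real file would carry.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of rolling over to a new sitemap file when either
// the URL-count limit or the byte-size limit would be exceeded.
final class SitemapSplitSketch {

    static List<List<String>> split(List<String> entries, int maxUrls, int maxSize) {
        List<List<String>> files = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int bytes = 0;

        for (String entry : entries) {
            int entryBytes = entry.getBytes(StandardCharsets.UTF_8).length;
            boolean wouldOverflow = current.size() >= maxUrls || bytes + entryBytes > maxSize;

            // Close the current file before adding an entry that does not fit.
            // A single oversized entry still gets a file of its own.
            if (wouldOverflow && !current.isEmpty()) {
                files.add(current);
                current = new ArrayList<>();
                bytes = 0;
            }
            current.add(entry);
            bytes += entryBytes;
        }
        if (!current.isEmpty()) {
            files.add(current);
        }
        return files;
    }
}
```

The number of lists returned corresponds to the sitemapCount that would later be passed to writeIndex.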
      • getURLText

        public String getURLText(String url,
                                 Date lastMod)
        Description copied from class: AbstractGenerator
        Return marked-up text to be included in a sitemap about a given URL.
        Specified by:
        getURLText in class AbstractGenerator
        Parameters:
        url - URL to add information about
        lastMod - date URL was last modified, or null if unknown or not applicable
        Returns:
        the mark-up to include
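
For an HTML sitemap, the marked-up text for a URL is naturally a hyperlink. The sketch below shows one plausible shape for it, assuming each URL is rendered as a list item. It is not the DSpace implementation: `UrlTextSketch`, `urlText`, and `escape` are hypothetical names, and ignoring `lastMod` is a choice made for this sketch only.

```java
import java.util.Date;

// Hypothetical illustration of marking up one URL for an HTML sitemap.
final class UrlTextSketch {

    // Escape the characters that are unsafe inside HTML text and attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String urlText(String url, Date lastMod) {
        String safe = escape(url);
        // lastMod is accepted to mirror the documented signature, but this
        // sketch does not render it.
        return "<li><a href=\"" + safe + "\">" + safe + "</a></li>\n";
    }

    public static void main(String[] args) {
        System.out.print(urlText("http://dspace.myu.edu/handle/123/456", null));
    }
}
```

Escaping matters because repository URLs often contain `&` in query strings, which would otherwise produce invalid HTML.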
      • useCompression

        public boolean useCompression()
        Description copied from class: AbstractGenerator
        Return whether the written sitemap files and index should be GZIP-compressed.
        Specified by:
        useCompression in class AbstractGenerator
        Returns:
        true if GZIP compression should be used, false otherwise.
      • writeIndex

        public void writeIndex(PrintStream output,
                               int sitemapCount)
                        throws IOException
        Description copied from class: AbstractGenerator
        Write the index file.
        Specified by:
        writeIndex in class AbstractGenerator
        Parameters:
        output - stream to write the index to
        sitemapCount - number of sitemaps that were generated
        Throws:
        IOException - if an IO error occurs
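
As an illustration of what writing the index might involve, the sketch below emits a small HTML page that links to each generated sitemap. It is a guess at the general shape rather than the DSpace implementation: `IndexSketch`, its `writeIndex` variant that takes the stem and tail as arguments, and the `render` helper are all hypothetical, and the URL rule "stem + number + tail" is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical illustration of writing an HTML index that links to
// sitemapCount individual sitemap files.
final class IndexSketch {

    static void writeIndex(PrintStream output, int sitemapCount,
                           String urlStem, String urlTail) {
        String tail = (urlTail == null) ? "" : urlTail;
        output.println("<html><head><title>Sitemap index</title></head><body><ul>");
        for (int i = 0; i < sitemapCount; i++) {
            // Sitemap numbering starts at zero, as in getFilename.
            String url = urlStem + i + tail;
            output.println("<li><a href=\"" + url + "\">sitemap " + i + "</a></li>");
        }
        output.println("</ul></body></html>");
    }

    // Convenience wrapper that captures the index in memory as a string.
    static String render(int sitemapCount, String urlStem, String urlTail) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            writeIndex(out, sitemapCount, urlStem, urlTail);
            out.flush();
            return buffer.toString("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(2, "http://dspace.myu.edu/sitemap?sitemap=", ".html"));
    }
}
```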