Package org.dspace.app.sitemap
Class AbstractGenerator
java.lang.Object
org.dspace.app.sitemap.AbstractGenerator
- Direct Known Subclasses:
HTMLSitemapGenerator,SitemapsOrgGenerator
Base class for creating sitemaps of various kinds. A sitemap consists of one
or more files which list significant URLs on a site for search engines to
efficiently crawl. Dates of modification may also be included. A sitemap
index file that links to each of the sitemap files is also generated. It is
this index file that search engines should be directed towards.
This class provides most of the required functionality; subclasses need only implement a few methods that specify the "boilerplate" and the text for including URLs.
Typical usage:
AbstractGenerator g = new FooGenerator(...);
while (...) {
    g.addURL(url, date);
}
g.finish();
- Author:
- Robert Tansley
-
Field Summary
Fields (modifier and type, field, description):
protected int bytesWritten
    Number of bytes written to current file
protected PrintStream currentOutput
    Current output
protected int fileCount
    Number of files written so far
protected File outputDir
    Directory files are written to
protected int urlsWritten
    Number of URLs written to current file
-
Constructor Summary
Constructors (constructor, description):
AbstractGenerator(File outputDirIn)
    Initialize this generator to write to the given directory.
-
Method Summary
Methods (modifier and type, method, description):
void addURL(String url, Instant lastMod)
    Add the given URL to the sitemap.
protected void closeCurrentFile()
    Finish with the current sitemap file.
int finish()
    Complete writing sitemap files and write the index files.
abstract String getFilename(int number)
    Return the filename a sitemap at the given index should be stored at.
abstract String getIndexFilename()
    Get the filename the index should be written to.
abstract String getLeadingBoilerPlate()
    Return the boilerplate at the top of a sitemap file.
abstract int getMaxSize()
    Return the maximum size in bytes that an individual sitemap file should be.
abstract int getMaxURLs()
    Return the maximum number of URLs that an individual sitemap file should contain.
abstract String getTrailingBoilerPlate()
    Return the boilerplate at the end of a sitemap file.
abstract String getURLText(String url, Instant lastMod)
    Return marked-up text to be included in a sitemap about a given URL.
protected void startNewFile()
    Start writing a new sitemap file.
abstract boolean useCompression()
    Return whether the written sitemap files and index should be GZIP-compressed.
abstract void writeIndex(PrintStream output, int sitemapCount)
    Write the index file.
-
Field Details
-
fileCount
protected int fileCount
Number of files written so far
-
bytesWritten
protected int bytesWritten
Number of bytes written to current file
-
urlsWritten
protected int urlsWritten
Number of URLs written to current file
-
outputDir
protected File outputDir
Directory files are written to
-
currentOutput
protected PrintStream currentOutput
Current output
-
-
Constructor Details
-
AbstractGenerator
Initialize this generator to write to the given directory. This must be called by any subclass constructor.
- Parameters:
outputDirIn - directory to write sitemap files to
-
-
Method Details
-
startNewFile
Start writing a new sitemap file.
- Throws:
IOException - if an error occurs creating the file
-
addURL
Add the given URL to the sitemap.
- Parameters:
url - full URL to add
lastMod - date the URL was last modified, or null
- Throws:
IOException - if an error occurs writing
-
closeCurrentFile
Finish with the current sitemap file.
- Throws:
IOException - if an error occurs writing
-
finish
Complete writing sitemap files and write the index files. This is invoked when all calls to addURL(String, Instant) have been completed, and invalidates the generator.
- Returns:
- number of sitemap files written.
- Throws:
IOException - if an error occurs writing
-
getURLText
Return marked-up text to be included in a sitemap about a given URL.
- Parameters:
url - URL to add information about
lastMod - date the URL was last modified, or null if unknown or not applicable
- Returns:
- the mark-up to include
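
As an illustration of what an implementation of getURLText might return, here is a minimal sketch for the sitemaps.org XML format. The class name UrlTextSketch and the escapeXml helper are hypothetical and are not taken from the DSpace source; the actual SitemapsOrgGenerator may format entries differently.

```java
import java.time.Instant;

// Hypothetical stand-in for illustration; not the DSpace implementation.
class UrlTextSketch {

    // Entity-escape the characters that must not appear raw in XML text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Build one <url> entry. lastMod may be null, in which case the
    // optional <lastmod> element is left out.
    static String getURLText(String url, Instant lastMod) {
        StringBuilder sb = new StringBuilder();
        sb.append("<url><loc>").append(escapeXml(url)).append("</loc>");
        if (lastMod != null) {
            // Instant.toString() produces an ISO-8601 UTC timestamp.
            sb.append("<lastmod>").append(lastMod).append("</lastmod>");
        }
        sb.append("</url>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(getURLText("https://example.org/items?id=1&x=2",
                Instant.parse("2024-01-02T03:04:05Z")));
        System.out.print(getURLText("https://example.org/home", null));
    }
}
```

Escaping the URL matters because query strings commonly contain an ampersand, which would otherwise make the sitemap invalid XML.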
-
getLeadingBoilerPlate
Return the boilerplate at the top of a sitemap file.
- Returns:
- The boilerplate markup.
-
getTrailingBoilerPlate
Return the boilerplate at the end of a sitemap file.
- Returns:
- The boilerplate markup.
-
getMaxSize
public abstract int getMaxSize()
Return the maximum size in bytes that an individual sitemap file should be.
- Returns:
- the size in bytes.
-
getMaxURLs
public abstract int getMaxURLs()
Return the maximum number of URLs that an individual sitemap file should contain.
- Returns:
- the maximum number of URLs.
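
The two limits above suggest a rollover rule: begin a new sitemap file once the current one holds the maximum number of URLs, or once the next entry would push it past the maximum size. The sketch below simulates that rule with counters named after the documented fields. The class RolloverSketch, its add method, and the trailerBytes allowance are assumptions made for illustration; the page does not state exactly how AbstractGenerator applies the limits.

```java
// Hypothetical simulation of limit-based rollover; not the DSpace code.
class RolloverSketch {
    // Counters named after the documented protected fields.
    int fileCount = 0;     // files started so far
    int urlsWritten = 0;   // URLs in the current file
    int bytesWritten = 0;  // bytes in the current file

    final int maxURLs;      // what a subclass would return from getMaxURLs()
    final int maxSize;      // what a subclass would return from getMaxSize()
    final int trailerBytes; // assumed room reserved for trailing boilerplate

    RolloverSketch(int maxURLs, int maxSize, int trailerBytes) {
        this.maxURLs = maxURLs;
        this.maxSize = maxSize;
        this.trailerBytes = trailerBytes;
    }

    // Record one entry of the given encoded length, starting a new
    // file first if there is none yet or the current one is full.
    void add(int entryBytes) {
        boolean full = urlsWritten >= maxURLs
                || bytesWritten + entryBytes + trailerBytes > maxSize;
        if (fileCount == 0 || full) {
            fileCount++;
            urlsWritten = 0;
            bytesWritten = 0;
        }
        urlsWritten++;
        bytesWritten += entryBytes;
    }

    public static void main(String[] args) {
        RolloverSketch sketch = new RolloverSketch(3, 1000, 10);
        for (int i = 0; i < 7; i++) {
            sketch.add(100);
        }
        System.out.println("files: " + sketch.fileCount
                + ", URLs in last file: " + sketch.urlsWritten);
    }
}
```

For simplicity the sketch ignores the bytes taken by the leading boilerplate; a real generator would need to count them as well.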
-
useCompression
public abstract boolean useCompression()
Return whether the written sitemap files and index should be GZIP-compressed.
- Returns:
true if GZIP compression should be used, false otherwise.
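
One way a generator could honor this flag is to wrap the underlying stream in a GZIPOutputStream before handing out a PrintStream. The following is a minimal sketch using only standard java.io and java.util.zip classes. CompressionSketch and its methods are hypothetical names, and writing to an in-memory buffer instead of a file is a simplification for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not the DSpace implementation.
class CompressionSketch {

    // Wrap the raw stream in GZIP when requested, then expose a PrintStream.
    static PrintStream open(OutputStream raw, boolean useCompression) throws IOException {
        OutputStream out = useCompression ? new GZIPOutputStream(raw) : raw;
        return new PrintStream(out, false, "UTF-8");
    }

    // Write text through the (possibly compressed) stream and return the bytes.
    static byte[] write(String text, boolean useCompression) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the PrintStream also finishes the GZIP trailer.
        try (PrintStream ps = open(sink, useCompression)) {
            ps.print(text);
        }
        return sink.toByteArray();
    }

    // Decompress GZIP data back to a string, to check the round trip.
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = write("<urlset></urlset>", true);
        System.out.println(gunzip(compressed));
    }
}
```

The stream must be closed, not just flushed, before the file is considered complete; otherwise the GZIP trailer is missing and readers will reject the file.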
-
getFilename
Return the filename a sitemap at the given index should be stored at.
- Parameters:
number - index of the sitemap file (zero is first).
- Returns:
- the filename to write the sitemap to.
-
getIndexFilename
Get the filename the index should be written to.
- Returns:
- the filename of the index.
-
writeIndex
Write the index file.
- Parameters:
output - stream to write the index to
sitemapCount - number of sitemaps that were generated
- Throws:
IOException - if an IO error occurs
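
To make the role of writeIndex concrete, here is a minimal sketch that writes a sitemaps.org-style index listing each generated sitemap file. The class IndexSketch, the extra baseUrl parameter, and the sitemapN.xml naming scheme are assumptions for the example; the documented method takes only the output stream and the sitemap count, and a real subclass would obtain the base URL and file names elsewhere.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical stand-in for illustration; not the DSpace implementation.
class IndexSketch {

    // Assumed naming scheme; zero is the first sitemap file.
    static String getFilename(int number) {
        return "sitemap" + number + ".xml";
    }

    // Write one <sitemap> entry per generated file.
    // baseUrl is assumed to end with "/" and to be XML-safe already.
    static void writeIndex(PrintStream output, int sitemapCount, String baseUrl) {
        output.print("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        output.print("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (int i = 0; i < sitemapCount; i++) {
            output.print("<sitemap><loc>" + baseUrl + getFilename(i) + "</loc></sitemap>\n");
        }
        output.print("</sitemapindex>\n");
    }

    // Convenience wrapper that renders the index into a string.
    static String render(int sitemapCount, String baseUrl) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(buf);
        writeIndex(ps, sitemapCount, baseUrl);
        ps.flush();
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(2, "https://example.org/"));
    }
}
```

The sitemapCount argument would typically be the value returned by finish(), so that the index lists exactly the files that were written.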
-