Package org.dspace.storage.bitstore
Provides an API for storing, retrieving and deleting streams of bits in a transactionally safe fashion. The main class is BitstreamStorageManager.
Using the Bitstore API
An example use of the Bitstore API is shown below:
// Create or obtain a context object
Context context;
// Stream to store
InputStream stream;
try
{
    // Store the stream
    int id = BitstreamStorageManager.store(context, stream);
    // Retrieve it
    InputStream retrieved = BitstreamStorageManager.retrieve(context, id);
    // Delete it
    BitstreamStorageManager.delete(context, id);
    // Complete the context object so changes are written
    context.complete();
}
catch (IOException ioe)
{
    // Error with I/O operations
}
catch (SQLException sqle)
{
    // Database error
}
Storage mechanism
The BitstreamStorageManager stores files in one or more asset store
directories. These can be configured in dspace.cfg. For
example:
assetstore.dir = /dspace/assetstore
The above example specifies a single asset store.
assetstore.dir = /dspace/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1
The above example specifies two asset stores. assetstore.dir
specifies asset store number 0 (zero); after that use
assetstore.dir.1, assetstore.dir.2 and so on. The database
records which asset store each bitstream is stored in, so do not move
bitstreams between asset stores, and do not renumber the asset stores.
By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the
assetstore.dir property). To change this, for example when asset
store 0 is getting full, add a line to dspace.cfg like:
assetstore.incoming = 1
Then restart DSpace (Tomcat). New bitstreams will be written to the asset
store specified by assetstore.dir.1, which is
/mnt/other_filesystem/assetstore_1 in the above example.
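The numbering scheme described above can be sketched as follows. This is only an illustration, not DSpace's own configuration code: the class AssetStoreConfigSketch and its methods are hypothetical, and a java.util.Properties object stands in for dspace.cfg. The default of 0 for assetstore.incoming follows the text above.

```java
import java.util.Properties;

public class AssetStoreConfigSketch {
    /** Asset store 0 uses "assetstore.dir"; store N uses "assetstore.dir.N". */
    public static String propertyNameFor(int storeNumber) {
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    /** Looks up the directory configured for the given asset store number. */
    public static String directoryFor(Properties config, int storeNumber) {
        String dir = config.getProperty(propertyNameFor(storeNumber));
        if (dir == null) {
            throw new IllegalArgumentException("No asset store configured with number " + storeNumber);
        }
        return dir.trim();
    }

    /** Number of the store that receives new bitstreams; 0 when assetstore.incoming is absent. */
    public static int incomingStore(Properties config) {
        return Integer.parseInt(config.getProperty("assetstore.incoming", "0").trim());
    }

    public static void main(String[] args) {
        // Stand-in for the dspace.cfg lines shown in the examples above
        Properties config = new Properties();
        config.setProperty("assetstore.dir", "/dspace/assetstore_0");
        config.setProperty("assetstore.dir.1", "/mnt/other_filesystem/assetstore_1");
        config.setProperty("assetstore.incoming", "1");

        // prints /mnt/other_filesystem/assetstore_1
        System.out.println(directoryFor(config, incomingStore(config)));
    }
}
```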
Moving an Asset Store
You can move an asset store as a whole to a new location in the file system: stop DSpace
(Tomcat), move all of the contents to the new location, change the appropriate
line in dspace.cfg, and restart DSpace (Tomcat).
We will be providing administration tools for more sophisticated management of these asset stores in the future.
When given a stream of bits to store, the BitstreamStorageManager generates a unique key for the stream. The key takes the form of a long sequence of digits, which is transformed into a file path. The BitstreamStorageManager stores the contents of the stream in this path, creating parent directories as necessary.
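As an illustration of turning a digit key into a file path, here is a minimal sketch. The exact transformation is not specified in this description, so the layout used here (three directory levels of two digits each, followed by the full key as the file name) is an assumption, and the class KeyToPathSketch and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class KeyToPathSketch {
    // Assumed layout, not taken from the package description
    static final int DIGITS_PER_LEVEL = 2;
    static final int DIRECTORY_LEVELS = 3;

    /** Turns a digit key into a relative path such as 12/34/56/123456... */
    public static String relativePath(String key) {
        int prefixLength = DIGITS_PER_LEVEL * DIRECTORY_LEVELS;
        if (key.length() < prefixLength || !key.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("Key must be at least " + prefixLength + " digits: " + key);
        }
        StringBuilder path = new StringBuilder();
        for (int level = 0; level < DIRECTORY_LEVELS; level++) {
            int start = level * DIGITS_PER_LEVEL;
            path.append(key, start, start + DIGITS_PER_LEVEL).append('/');
        }
        return path.append(key).toString();
    }

    /** Writes the stream under the asset store directory, creating parent directories as necessary. */
    public static Path write(Path assetStoreDir, String key, InputStream bits) throws IOException {
        Path target = assetStoreDir.resolve(relativePath(key));
        Files.createDirectories(target.getParent());
        Files.copy(bits, target);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for a configured asset store
        Path store = Files.createTempDirectory("assetstore-sketch");
        Path written = write(store, "12345678901234567890",
                new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(store.relativize(written));
    }
}
```

Spreading files over nested directories keeps any single directory from accumulating a very large number of entries, which is one plausible reason for transforming the key rather than using it directly as a flat file name.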
The Bitstore and Transactions
The bitstore is carefully engineered to prevent data loss, using transactional flags in the database. Before a bitstream is actually stored, a metadata entry with the unique bitstream id is committed to the database with its deleted flag set. If the storage operation fails or is aborted, the deleted flag remains set. The bitstore API then ensures that the bitstream cannot be retrieved, and after an hour, the bitstream is eligible for cleanup. The bitstream is accessible only after all database operations have been successfully committed.
Similarly, bitstreams are deleted by simply setting the deleted flag. If a deletion operation is rolled back, the bitstream is still present in the asset store.
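The role of the deleted flag can be illustrated with a small in-memory model. This is a sketch, not the bitstore implementation: the class DeletedFlagSketch and its methods are hypothetical, two maps stand in for the database and the asset store, and it assumes the metadata entry is first committed with the deleted flag set and that the flag is cleared only once everything has been committed.

```java
import java.util.HashMap;
import java.util.Map;

public class DeletedFlagSketch {
    // Stand-in for the database: bitstream id -> deleted flag
    private final Map<Integer, Boolean> deletedFlags = new HashMap<>();
    // Stand-in for the asset store: bitstream id -> stored bits
    private final Map<Integer, byte[]> assetStore = new HashMap<>();
    private int nextId = 1;

    /** Step 1: commit a metadata entry, flagged deleted, before any bits are written. */
    public int begin() {
        int id = nextId++;
        deletedFlags.put(id, Boolean.TRUE);
        return id;
    }

    /** Step 2: write the bits; a failure here leaves the flag set. */
    public void write(int id, byte[] bits) {
        assetStore.put(id, bits.clone());
    }

    /** Step 3: clear the flag once all database operations have committed. */
    public void commit(int id) {
        deletedFlags.put(id, Boolean.FALSE);
    }

    /** Deletion only sets the flag; the bits stay in the asset store until cleanup. */
    public void delete(int id) {
        deletedFlags.put(id, Boolean.TRUE);
    }

    /** A bitstream can be retrieved only when its flag is clear and its bits exist. */
    public boolean isRetrievable(int id) {
        return Boolean.FALSE.equals(deletedFlags.get(id)) && assetStore.containsKey(id);
    }

    public boolean isPhysicallyPresent(int id) {
        return assetStore.containsKey(id);
    }

    public static void main(String[] args) {
        DeletedFlagSketch bitstore = new DeletedFlagSketch();
        int id = bitstore.begin();
        bitstore.write(id, new byte[] {1, 2, 3});
        System.out.println(bitstore.isRetrievable(id)); // false: not yet committed
        bitstore.commit(id);
        System.out.println(bitstore.isRetrievable(id)); // true
        bitstore.delete(id);
        System.out.println(bitstore.isRetrievable(id)); // false, but the bits are still present
    }
}
```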
Cleaning up the Asset Store
As noted above, sometimes files will be physically present in the
Asset Store even though they are marked deleted in the database.
You can use the command-line utility class
org.dspace.storage.bitstore.Cleanup (which is invoked via
/dspace/bin/cleanup)
to remove the bitstreams which are marked deleted from the Asset Store.
To prevent accidental deletion of bitstreams which are in the process
of being stored, cleanup only removes bitstreams which are more than
an hour old.
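The one-hour rule can be expressed as a simple check. The sketch below is not the code of the Cleanup class: CleanupEligibilitySketch and its method are hypothetical, and how the age of a bitstream is measured (here, a timestamp supplied by the caller) is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

public class CleanupEligibilitySketch {
    // Only bitstreams more than an hour old are removed
    static final Duration MINIMUM_AGE = Duration.ofHours(1);

    /**
     * A bitstream qualifies for removal only when it is marked deleted
     * and its age is strictly greater than one hour.
     */
    public static boolean eligibleForCleanup(boolean markedDeleted, Instant created, Instant now) {
        Duration age = Duration.between(created, now);
        return markedDeleted && age.compareTo(MINIMUM_AGE) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        // Marked deleted two hours ago: prints true
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofHours(2)), now));
        // Marked deleted ten minutes ago, possibly still being stored: prints false
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofMinutes(10)), now));
    }
}
```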
Interface Summary
Interface                      Description
BitStoreService                A low-level asset store interface

Class Summary
Class                          Description
BitStoreMigrate                Command Line Utility to migrate bitstreams from one assetstore to another
BitstreamStorageServiceImpl    Stores, retrieves and deletes bitstreams.
Cleanup                        Cleans up asset store.
DSBitStoreService              Native DSpace (or "Directory Scatter" if you prefer) asset store.
S3BitStoreService              Asset store using Amazon's Simple Storage Service (S3).