Class Bucket<T>
- java.lang.Object
-
- org.pipecraft.infra.storage.Bucket<T>
-
- Direct Known Subclasses:
LocalDiskBucket
public abstract class Bucket<T> extends Object
A base class for storage bucket implementations. Supports:
1. Download/upload (single and multiple files)
2. File listing
3. File deletion (single and multiple files)
4. Remote file copying (single and multiple files)
5. Remote file moving
6. Generating signed URLs with expiration for uploading/downloading files
7. Getting file metadata
8. Acquiring a lock using an atomic file write
9. Remotely composing files, creating a single concatenated file
10. Controlling the access scope of uploaded files (public/private)

Bucket terms:
- File - A remote path not ending with '/'
- Folder - A remote path ending with '/'
- Object - Any remote entity (either file or folder)

Underlying bucket implementations usually treat all objects as files, and don't have a folder concept at all. Here we follow the standard path conventions, and use '/' as a (virtual) folder separator. While this class won't allow it through its API, it is still possible to use external tools to create "folder" entities (objects with paths ending with '/'). We highly recommend avoiding that, in order to prevent weird behaviors.

Most of the Bucket operations support retries, and most are interruptible. See the documentation of each method.

All implementations must have the following properties:
1. Partial data should never be seen: during a write, no reader should see partial data. Readers should see the previous file version, if any.
2. Read-after-write consistency: a thread that writes a file successfully and then tries to read it should see the latest version.
3. If the caller attempts to create a remote file with a path ending with '/', the implementation should reject the request and throw IOException. The motivation is to avoid the existence of "folder files" (i.e. standard objects from the point of view of the cloud library, but following the naming convention of a folder), since they usually create inconsistent behaviors.
4. Creating a remote file always creates the (virtual) folders leading to the file. When listing direct objects under some path, these virtual folders should be included. A folder doesn't exist by itself, and when all of a folder's files are deleted, the folder should disappear as well.
5. Uploading files with private access permissions is supported.

Optional features which implementations may leave unsupported:
1. List-after-write consistency: a thread listing objects after a successful write should see the new file
2. Exclusive file creation operation (lock functionality)
3. Generating signed URLs and resumable URLs for uploads/downloads
4. Composing remote files
5. Data upload into a remote file using an OutputStream
6. Public access for uploaded files
- Author:
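The file/folder path conventions above can be sketched as plain string checks. This is an illustrative, self-contained snippet (not the library's code), mirroring what Bucket's isFilePath, isFolderPath and normalizeFolderPath helpers are documented to do:

```java
// Sketch of the Bucket path conventions: a "file" is a remote path not
// ending with '/', a "folder" is a path ending with '/'.
public class PathConventions {

  // True for file paths (no trailing '/')
  public static boolean isFilePath(String path) {
    return !path.endsWith("/");
  }

  // True for folder paths (trailing '/')
  public static boolean isFolderPath(String path) {
    return path.endsWith("/");
  }

  // Appends '/' to the path if not already present
  public static String normalizeFolderPath(String path) {
    return path.endsWith("/") ? path : path + "/";
  }

  public static void main(String[] args) {
    System.out.println(isFilePath("data/part-0001.csv")); // true: a file
    System.out.println(isFolderPath("data/"));            // true: a folder
    System.out.println(normalizeFolderPath("data"));      // "data/"
  }
}
```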
- Oren Peer, Eyal Schneider
-
-
Field Summary
Fields
- protected static Retrier DEFAULT_RETRIER
- protected static int DEFAULT_RETRY_INITIAL_SLEEP_SEC
- protected static int DEFAULT_RETRY_MAX_ATTEMPTS
- protected static double DEFAULT_RETRY_WAIT_TIME_FACTOR
- static String DONE_FILE_NAME
-
Method Summary
- abstract T compose(List<String> paths, String composedFilePath, boolean removeComprisingFiles) - Composes (concatenates) remote files.
- void copy(String fromKey, String toKey) - Copies a remote file in the current bucket to a different location in the same bucket.
- void copyFolderRecursiveInterruptibly(String srcPath, String dstPath, int parallelism) - Copies all files, recursively, from one folder in this bucket to another.
- void copyFolderRecursiveInterruptibly(String srcPath, String dstPath, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Copies all files, recursively, from one folder in this bucket to another.
- void copyInterruptibly(String fromKey, String toKey) - Copies a remote file in the current bucket to a different location in the same bucket.
- void copyInterruptibly(String fromKey, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Copies a remote file in the current bucket to a different location in the same bucket.
- abstract void copyToAnotherBucket(String fromKey, String toBucket, String toKey) - Copies a file from this bucket to another bucket.
- void copyToAnotherBucketInterruptibly(String fromKey, String toBucket, String toKey) - Copies a file from this bucket to another bucket, using retries.
- void copyToAnotherBucketInterruptibly(String fromKey, String toBucket, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Copies a file from this bucket to another bucket, using retries.
- void delete(String key) - Deletes a single object.
- abstract void delete(T objectMeta) - Deletes a single object.
- void deleteAllByMetaInterruptibly(Collection<T> fileRefs, int parallelism) - Deletes a set of remote files.
- void deleteAllByMetaInterruptibly(Collection<T> fileRefs, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a set of remote files.
- void deleteAllByMetaInterruptibly(Iterator<T> fileRefsIt, int parallelism) - Deletes a set of remote files.
- void deleteAllByMetaInterruptibly(Iterator<T> fileRefsIt, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a set of remote files.
- void deleteAllInterruptibly(Collection<String> filePaths, int parallelism) - Deletes a set of remote files.
- void deleteAllInterruptibly(Collection<String> filePaths, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a set of remote files.
- void deleteAllInterruptibly(Iterator<String> filePathsIt, int parallelism) - Deletes a set of remote files.
- void deleteAllInterruptibly(Iterator<String> filePathsIt, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a set of remote files.
- void deleteFolderRecursiveInterruptibly(String path, int parallelism) - Deletes all folder contents, recursively.
- void deleteFolderRecursiveInterruptibly(String path, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes all folder contents, recursively.
- void deleteFolderRegularFiles(String path) - Deletes all regular files under a remote folder.
- void deleteInterruptibly(String key) - Deletes a single object with retries.
- void deleteInterruptibly(String key, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a single object with retries.
- void deleteInterruptibly(T objectMeta) - Deletes a single object with retries.
- void deleteInterruptibly(T objectMeta, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Deletes a single object with retries.
- abstract boolean exists(String key)
- abstract URL generateReadOnlyUrl(String key, int expirationSeconds) - Generates a signed URL for reading a specific file.
- URL generateResumableSignedUrlForUpload(String key, String contentType, int expirationSeconds) - Generates a resumable signed URL for uploading to a specific file with private access.
- abstract URL generateResumableSignedUrlForUpload(String key, String contentType, int expirationSeconds, Long maxContentLengthInBytes, boolean isPublic) - Generates a resumable signed URL for uploading to a specific file.
- URL generateSignedUrl(String key, String contentType, int expirationSeconds) - Generates a signed URL for an upload to a specific file.
- abstract URL generateSignedUrl(String key, String contentType, int expirationSeconds, boolean isPublic) - Generates a signed URL for an upload to a specific file.
- void get(String key, File output) - Downloads a file given the remote file path.
- abstract void get(T meta, File output) - Downloads a file given the remote file's metadata object.
- Set<File> getAllRegularFiles(Collection<String> folderPaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, int parallelism) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, Function<String,String> fileNameResolver, int parallelism) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, Function<String,String> fileNameResolver, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesInterruptibly(String folderPath, File targetFolder, int parallelism) - Downloads all regular files from a given folder, performing the task in parallel. Uses retries on individual files, with default retry settings.
- Set<File> getAllRegularFilesInterruptibly(String folderPath, Predicate<T> predicate, File targetFolder, int parallelism) - Downloads regular files from a given remote folder.
- Set<File> getAllRegularFilesInterruptibly(String folderPath, Predicate<T> predicate, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads regular files from a given remote folder.
- Set<File> getAllRegularFilesInterruptibly(Collection<String> filePaths, File targetFolder, int parallelism) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism) - Downloads a set of regular files from different bucket locations.
- Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a set of regular files from different bucket locations.
- SizedInputStream getAsStream(String key) - Gets an input stream for a remote file, using the default chunk size.
- SizedInputStream getAsStream(String key, int chunkSize) - Gets an input stream for a remote file, using a given chunk size.
- SizedInputStream getAsStream(T meta) - Gets an input stream for a remote file, using the default chunk size.
- abstract SizedInputStream getAsStream(T meta, int chunkSize) - Gets an input stream for a remote file, using a given chunk size.
- String getBucketName()
- <C> C getFromJson(String key, Class<C> clazz)
- void getInterruptibly(String key, File output) - Downloads a file with retries. Uses default retry settings.
- void getInterruptibly(String key, File output, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a file with retries.
- void getInterruptibly(T meta, File output) - Downloads a file with retries. Uses default retry settings.
- void getInterruptibly(T meta, File output, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Downloads a file with retries.
- T getLastFile(String path, Pattern pattern) - Returns the maximal file (by lexicographic ordering of the full path string) in the path, filtering by the given pattern.
- abstract Long getLastUpdated(T objMetadata)
- abstract long getLength(T objMetadata)
- abstract T getObjectMetadata(String filePath) - Supplies the metadata of a file, if it exists.
- Map<String,T> getObjectMetadata(Collection<String> filePaths) - Supplies metadata objects for a given set of remote file paths.
- Map<String,T> getObjectMetadata(Collection<String> filePaths, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Supplies metadata objects for a given set of remote file paths.
- OutputStream getOutputStream(String key) - Gets an output stream for writing to a remote file.
- abstract OutputStream getOutputStream(String key, int chunkSize) - Gets an output stream for writing to a remote file.
- abstract String getPath(T objMetadata)
- void getSliced(String key, File output) - Downloads a file from cloud storage, using sliced download.
- void getSliced(String key, File output, int chunkSize) - Downloads a file from cloud storage, using sliced download.
- boolean isFile(T objMetadata)
- boolean isFilePath(String path)
- boolean isFolderPath(String path)
- Iterator<T> listFiles(String folderPath) - Lists all files directly under the given path (excluding folders).
- Iterator<T> listFiles(String folderPath, String fileRegex) - Lists all files directly under the given path that match the regular expression.
- Iterator<T> listFiles(String folderPath, Pattern filePattern) - Lists all files that match the regex pattern and are located directly under the given path.
- Iterator<T> listFilesRecursive(String folderPath) - Returns an iterator over all the files in a given remote folder, recursively.
- Iterator<T> listFilesRecursive(String folderPath, String fileRegex) - Returns an iterator over all the files in a given remote folder matching the pattern, recursively.
- Iterator<T> listFilesRecursive(String folderPath, Pattern pattern) - Returns an iterator over all the files in a given remote folder matching a regex pattern, recursively.
- Iterator<T> listFolders(String folderPath) - Lists all sub-folders in the given path.
- Iterator<T> listObjects(String folderPath)
- abstract Iterator<T> listObjects(String folderPath, boolean recursive)
- Iterator<T> listObjects(String folderPath, Predicate<T> condition) - Creates an iterator over all objects directly under the provided path that satisfy the provided condition.
- void moveFolderRecursive(String srcPath, String dstPath, int parallelism) - Moves all files from one folder to another, recursively.
- void moveFolderRecursive(String srcPath, String dstPath, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Moves all files from one folder to another, recursively.
- void moveInterruptibly(String fromKey, String toKey) - Moves a file from one location to another, using retries with default settings.
- void moveInterruptibly(String fromKey, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Moves a file from one location to another, using retries.
- String normalizeFolderPath(String path) - Appends '/' to the path if not already present.
- void put(String key, InputStream input, long length, String contentType, boolean isPublic) - Uploads data into a remote file from a given InputStream.
- abstract void put(String key, InputStream input, long length, String contentType, boolean isPublic, boolean allowOverride) - Uploads data into a remote file from a given InputStream.
- void putAllInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic) - Uploads all regular files from a given folder.
- void putAllInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Uploads all regular files from a given folder.
- void putAllRecursiveInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic) - Uploads a complete local folder to the cloud, recursively.
- void putAllRecursiveInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Uploads a complete local folder to the cloud, recursively.
- String putDoneFile(String folderPath) - Puts a _DONE file in a given cloud folder.
- void putEmptyFile(String key) - Writes an empty file to the given path, setting private access.
- void putFile(String key, File input, boolean isPublic) - Uploads a file to the given path.
- void putFileInterruptibly(String key, File input, boolean isPublic) - Uploads a file to the given path, with retries.
- void putFileInterruptibly(String key, File input, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) - Uploads a file to the given path, with retries.
- boolean putLockFile(String key) - Writes a file in an exclusive manner, providing lock semantics.
- void putPrivate(String key, File input) - Uploads a file to the given path and sets private read access.
- void putPrivate(String key, InputStream input, long length, String contentType) - Uploads a file from an InputStream and sets private read access.
- void putPublic(String key, File input) - Uploads a file to the given path and sets public read access.
- void putPublic(String key, InputStream input, long length, String contentType) - Uploads a file from an InputStream and sets public read access.
- String putUniquePrivate(String folderPath, File input) - Uploads a file, using an auto-generated key, to the given remote folder and sets private read access.
- String putUniquePrivate(String folderPath, InputStream input, long length, String contentType) - Uploads a file from an InputStream to a unique key under the given path and sets private read access.
- String putUniquePrivate(String folderPath, InputStream input, String extension, long length, String contentType) - Uploads a file from an InputStream to a unique key under the given path and sets private read access.
- String putUniquePublic(String folderPath, File input) - Uploads a file, using an auto-generated key, to the given remote folder and sets public read access.
- String putUniquePublic(String folderPath, InputStream input, long length, String contentType) - Uploads a file from an InputStream to a unique key under the given path and sets public read access.
- String putUniquePublic(String folderPath, InputStream input, String extension, long length, String contentType) - Uploads a file from an InputStream to a unique key under the given path and sets public read access.
- protected void validateNotFolderPath(String path)
-
-
-
Field Detail
-
DONE_FILE_NAME
public static final String DONE_FILE_NAME
- See Also:
- Constant Field Values
-
DEFAULT_RETRY_INITIAL_SLEEP_SEC
protected static final int DEFAULT_RETRY_INITIAL_SLEEP_SEC
- See Also:
- Constant Field Values
-
DEFAULT_RETRY_MAX_ATTEMPTS
protected static final int DEFAULT_RETRY_MAX_ATTEMPTS
- See Also:
- Constant Field Values
-
DEFAULT_RETRY_WAIT_TIME_FACTOR
protected static final double DEFAULT_RETRY_WAIT_TIME_FACTOR
- See Also:
- Constant Field Values
-
DEFAULT_RETRIER
protected static final Retrier DEFAULT_RETRIER
-
-
Constructor Detail
-
Bucket
public Bucket(String bucketName)
Constructor
- Parameters:
- bucketName - The bucket name
-
-
Method Detail
-
getBucketName
public String getBucketName()
- Returns:
- The bucket name
-
put
public abstract void put(String key, InputStream input, long length, String contentType, boolean isPublic, boolean allowOverride) throws IOException
Uploads data into a remote file from a given InputStream.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation. The stream is closed by this method.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- allowOverride - When false, if the remote file already exists the call fails with FileAlreadyExistsException. Exclusive creation is an optional feature, so implementations may throw UnsupportedOperationException when this is set to false.
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal here, because it follows the folder naming convention.
- FileAlreadyExistsException - in case allowOverride is off (and exclusive creation is supported), and the target file already exists
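The allowOverride semantics can be illustrated on the local file system, where NIO's CREATE_NEW option gives the same exclusive-create behavior. This is a self-contained analogy, not the library's implementation (the Bucket method itself needs a concrete subclass such as LocalDiskBucket):

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Local-disk analogy of put(..., allowOverride): true overwrites silently,
// false performs an exclusive create that fails if the target exists.
public class AllowOverrideDemo {

  public static void write(Path target, byte[] data, boolean allowOverride) throws IOException {
    if (allowOverride) {
      Files.write(target, data); // CREATE + TRUNCATE_EXISTING by default
    } else {
      Files.write(target, data, StandardOpenOption.CREATE_NEW); // exclusive create
    }
  }

  public static void main(String[] args) throws IOException {
    Path f = Files.createTempDirectory("bucket-demo").resolve("key.txt");
    write(f, "v1".getBytes(), false); // succeeds: file is new
    write(f, "v2".getBytes(), true);  // succeeds: overwrite allowed
    try {
      write(f, "v3".getBytes(), false); // fails: file already exists
    } catch (FileAlreadyExistsException expected) {
      System.out.println("exclusive create rejected existing file");
    }
  }
}
```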
-
put
public void put(String key, InputStream input, long length, String contentType, boolean isPublic) throws IOException
Uploads data into a remote file from a given InputStream. If the file exists, it is overwritten.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation. The stream is closed by this method.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putFile
public void putFile(String key, File input, boolean isPublic) throws IOException
Uploads a file to the given path.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - The local input file to read from
- isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putLockFile
public boolean putLockFile(String key) throws IOException
Writes a file in an exclusive manner, providing lock semantics. This method guarantees that at most one writer will succeed in creating the file. The created file is empty, with private access. Not all implementations support this feature, and UnsupportedOperationException may be thrown.
- Parameters:
- key - the lock file path (including file name), relative to the bucket
- Returns:
- true when the lock operation succeeded, false when the lock was already acquired
- Throws:
- IOException - when the locking attempt failed due to an actual error (failure to acquire the lock is indicated by the returned value, not by an exception)
- UnsupportedOperationException - in case the operation isn't supported
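The lock semantics above can be sketched with an atomic exclusive create on the local file system. This is a self-contained illustration of the contract (at most one caller succeeds; later callers get false rather than an exception), not the library's own code:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-file sketch of putLockFile's contract: an atomic create-if-absent,
// where contention is reported via the return value, not an exception.
public class LockFileDemo {

  public static boolean tryLock(Path lockFile) throws IOException {
    try {
      Files.createFile(lockFile); // atomic: fails if the file already exists
      return true;                // lock acquired
    } catch (FileAlreadyExistsException e) {
      return false;               // lock already held by another writer
    }
  }

  public static void main(String[] args) throws IOException {
    Path lock = Files.createTempDirectory("lock-demo").resolve("job.lock");
    System.out.println(tryLock(lock)); // true: first caller wins
    System.out.println(tryLock(lock)); // false: already acquired
  }
}
```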
-
putEmptyFile
public void putEmptyFile(String key) throws IOException
Writes an empty file to the given path, setting private access. If the file exists, it is overwritten.
- Parameters:
- key - the file path (including file name), relative to the bucket
- Throws:
- IOException - in case of an IO error, or if the supplied key has the form of a folder rather than a file
-
putDoneFile
public String putDoneFile(String folderPath) throws IOException
Puts a _DONE file in a given cloud folder. _DONE files are empty files used, by convention, to signal a consumer of the folder's files that the data is complete, in order to avoid a scenario where the consumer reads partial data while the producer is still uploading files to the folder. While single-file consistency is guaranteed, done files are the only means of guaranteeing that a complete folder has been created and is ready. If the done file already exists, it is overwritten.
- Parameters:
- folderPath - The remote folder path. Treated as a folder path either way - the caller may or may not add a '/' to the path.
- Returns:
- The full path (relative to the bucket) of the new remote _DONE file
- Throws:
- IOException
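The producer/consumer convention behind _DONE files can be sketched on the local file system. This is a self-contained illustration; the "_DONE" name is assumed here to match DONE_FILE_NAME, and the actual Bucket calls would be putDoneFile on the producer side and a listing/existence check on the consumer side:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the _DONE convention: the producer writes data files first and
// the done marker last; the consumer reads the folder only once the marker
// is present, so it never observes a partially produced folder.
public class DoneFileDemo {
  public static final String DONE_FILE_NAME = "_DONE"; // assumed marker name

  public static boolean isFolderReady(Path folder) {
    return Files.exists(folder.resolve(DONE_FILE_NAME));
  }

  public static void main(String[] args) throws IOException {
    Path folder = Files.createTempDirectory("done-demo");
    Files.write(folder.resolve("part-0001"), "data".getBytes());
    System.out.println(isFolderReady(folder)); // false: producer not finished
    Files.write(folder.resolve(DONE_FILE_NAME), new byte[0]); // signal completion
    System.out.println(isFolderReady(folder)); // true: safe to consume
  }
}
```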
-
getOutputStream
public OutputStream getOutputStream(String key) throws IOException
Gets an output stream for writing to a remote file. The file is assigned private access permissions. Note that since this is an optional feature, implementations may throw UnsupportedOperationException.
- Parameters:
- key - The remote target file path, relative to the bucket. May or may not exist; overwritten by this method if it exists.
- Returns:
- The output stream to write to. Not buffered. There is no guarantee that the file is written unless the writing completes with no errors and the stream is closed by the caller. The written data overrides the existing remote file, if any, in an atomic manner. The output stream is not allowed to throw runtime exceptions for IO errors while writing - only IOExceptions are allowed.
- Throws:
- IOException - in case of a write error, or if the supplied key has the form of a folder rather than a file
- UnsupportedOperationException
-
getOutputStream
public abstract OutputStream getOutputStream(String key, int chunkSize) throws IOException
Gets an output stream for writing to a remote file. The file is assigned private access permissions. Note that since this is an optional feature, implementations may throw UnsupportedOperationException.
- Parameters:
- key - The remote target file path, relative to the bucket. May or may not exist; overwritten by this method if it exists.
- chunkSize - The size (in bytes) of each written chunk, or 0 for using the default one.
- Returns:
- The output stream to write to. Not buffered. There is no guarantee that the file is written unless the writing completes with no errors and the stream is closed by the caller. The written data overrides the existing remote file, if any, in an atomic manner. The output stream is not allowed to throw runtime exceptions for IO errors while writing - only IOExceptions are allowed.
- Throws:
- IOException - in case of a write error, or if the supplied key has the form of a folder rather than a file
- UnsupportedOperationException
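Since the returned stream is unbuffered and the write is only committed when the stream closes cleanly, the caller-side pattern is buffering plus try-with-resources. A local file stream stands in below for the stream a call like bucket.getOutputStream(key, chunkSize) would return; this is an illustrative sketch, not library code:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Caller-side pattern for the (optional) getOutputStream feature: wrap the
// unbuffered stream yourself, and rely on close() to commit the write.
public class OutputStreamPattern {

  public static void writeAll(OutputStream raw, byte[] data) throws IOException {
    // try-with-resources guarantees close(), which commits the write
    try (OutputStream out = new BufferedOutputStream(raw)) {
      out.write(data);
    }
  }

  public static void main(String[] args) throws IOException {
    Path target = Files.createTempDirectory("os-demo").resolve("report.csv");
    writeAll(Files.newOutputStream(target), "a,b\n1,2\n".getBytes());
    System.out.println(Files.size(target)); // 8 bytes written
  }
}
```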
-
putPublic
public void putPublic(String key, File input) throws IOException
Uploads a file to the given path and sets public read access.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - The local input file to read from
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putPublic
public void putPublic(String key, InputStream input, long length, String contentType) throws IOException
Uploads a file from an InputStream and sets public read access.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putUniquePublic
public String putUniquePublic(String folderPath, File input) throws IOException
Uploads a file, using an auto-generated key, to the given remote folder and sets public read access.
- Parameters:
- folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not add '/' to the path.
- input - The local input file to read from
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, with no extension.
- Throws:
- IOException - in case of an IO error
-
putUniquePublic
public String putUniquePublic(String folderPath, InputStream input, long length, String contentType) throws IOException
Uploads a file from an InputStream to a unique key under the given path and sets public read access.
- Parameters:
- folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not add '/' to the path.
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, with no extension.
- Throws:
- IOException
-
putUniquePublic
public String putUniquePublic(String folderPath, InputStream input, String extension, long length, String contentType) throws IOException
Uploads a file from an InputStream to a unique key under the given path and sets public read access. This method allows adding a suffix (or file extension) to the generated key.
- Parameters:
- folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not add '/' to the path.
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation.
- extension - The suffix that should be added to the unique key. If this parameter is null, no extension is added (the whole file name is auto-generated).
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, ending with the given extension (if non-null).
- Throws:
- IOException
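The shape of the keys the putUnique* methods return can be sketched as follows. The exact generated-character-sequence format is not specified by this documentation, so the UUID-based scheme below is an assumption for illustration only:

```java
import java.util.UUID;

// Sketch of a unique key under a folder, as putUnique* describes: a
// normalized folder path, an auto-generated name, and an optional
// caller-supplied extension. The library's real key format may differ.
public class UniqueKeyDemo {

  public static String uniqueKey(String folderPath, String extension) {
    String folder = folderPath.endsWith("/") ? folderPath : folderPath + "/"; // normalize
    String name = UUID.randomUUID().toString().replace("-", ""); // generated sequence (assumed scheme)
    return extension == null ? folder + name : folder + name + extension;
  }

  public static void main(String[] args) {
    System.out.println(uniqueKey("images", ".jpeg")); // e.g. images/<generated>.jpeg
    System.out.println(uniqueKey("blobs/", null));    // no extension added
  }
}
```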
-
putPrivate
public void putPrivate(String key, File input) throws IOException
Uploads a file to the given path and sets private read access.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - The local input file to read from
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putPrivate
public void putPrivate(String key, InputStream input, long length, String contentType) throws IOException
Uploads a file from an InputStream and sets private read access.
- Parameters:
- key - full path of remote file, relative to the bucket
- input - The input stream to read from. Buffering is not required, and is assumed to be added by the implementation.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Throws:
- IOException - in case of an actual IO error, or if the given key is illegal. A path ending with '/' is considered illegal in this context, because it follows the folder naming convention.
-
putUniquePrivate
public String putUniquePrivate(String folderPath, File input) throws IOException
Uploads a file, using an auto-generated key, to the given remote folder and sets private read access.
- Parameters:
- folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not add '/' to the path.
- input - The local input file to read from
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, with no extension.
- Throws:
- IOException - in case of an IO error
-
putUniquePrivate
public String putUniquePrivate(String folderPath, InputStream input, long length, String contentType) throws IOException
Uploads a file from an InputStream to a unique key under the given path and sets private read access.
- Parameters:
- folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not add '/' to the path.
- input - InputStream source. Buffering is not required, and is assumed to be added by the implementation.
- length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
- contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, with no extension.
- Throws:
- IOException
-
putUniquePrivate
public String putUniquePrivate(String folderPath, InputStream input, String extension, long length, String contentType) throws IOException
Uploads a file from an InputStream to a unique key under the given path and sets private read access. This method allows adding a suffix (or file extension) to the generated key.
- Parameters:
folderPath - full path of the target folder in storage, relative to the bucket. Treated as a folder either way - the caller may or may not append '/' to the path.
input - The InputStream source. Buffering is not required, and is assumed to be added by the implementation.
extension - The suffix to add to the unique key. If this parameter is null, no extension is added (the whole filename is auto-generated).
length - size in bytes of the input stream. Not all implementations require this field; it depends on the cloud library requirements. Please read the specific implementation details.
contentType - content mime-type (e.g. "image/jpeg"). Not mandatory, may be null.
- Returns:
- The full path (relative to the bucket) of the generated file. The file name itself is composed of a generated character sequence, ending with the given extension (if non-null).
- Throws:
IOException
-
putFileInterruptibly
public void putFileInterruptibly(String key, File input, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Uploads a file to the given path, with retries.
- Parameters:
key - full path of the remote file, relative to the bucket
input - The local input File
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - in case of an upload failure, or if the supplied key has the form of a folder rather than a file
InterruptedException - in case of an interruption during retries
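Read together, maxRetries, initialRetrySleepSec, and waitTimeFactor describe an exponential backoff: the first retry waits initialRetrySleepSec seconds, and each subsequent wait is multiplied by waitTimeFactor, with millisecond precision. The sketch below illustrates the resulting schedule under that reading of the parameters; it is not the library's actual retry code.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSchedule {
  // Sleep time (in ms) before retry number 'attempt' (0-based), assuming
  // exponential growth: initialRetrySleepSec * waitTimeFactor^attempt.
  public static long sleepMillis(int initialRetrySleepSec, double waitTimeFactor, int attempt) {
    return (long) (initialRetrySleepSec * 1000L * Math.pow(waitTimeFactor, attempt));
  }

  // The full sleep schedule for maxRetries attempts.
  public static List<Long> schedule(int maxRetries, int initialRetrySleepSec, double waitTimeFactor) {
    List<Long> sleeps = new ArrayList<>();
    for (int i = 0; i < maxRetries; i++) {
      sleeps.add(sleepMillis(initialRetrySleepSec, waitTimeFactor, i));
    }
    return sleeps;
  }
}
```

For example, maxRetries=3, initialRetrySleepSec=1, waitTimeFactor=2.0 yields waits of 1s, 2s, and 4s.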
-
putFileInterruptibly
public void putFileInterruptibly(String key, File input, boolean isPublic) throws IOException, InterruptedException
Uploads a file to the given path, with retries. Uses default retry settings.
- Parameters:
key - full path of the remote file, relative to the bucket
input - The local input File
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- Throws:
IOException - in case of an upload failure, or if the supplied key has the form of a folder rather than a file
InterruptedException - in case of an interruption during retries
-
putAllInterruptibly
public void putAllInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Uploads all regular files from a given local folder. Performs the task in parallel, using retries on individual files.
- Parameters:
targetFolder - full path of the remote folder, relative to the bucket. Treated as a folder path either way - the caller may or may not append '/' to it.
inputFolder - The local input folder to read all files from
parallelism - The number of threads to use for the task
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - in case of an upload failure
InterruptedException - in case of an interruption during retries
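The "perform in parallel with N threads" pattern shared by the putAll*/getAll* methods can be sketched with a fixed thread pool. The task lambda below stands in for the real per-file upload and is purely illustrative; the real methods rethrow failures as checked IOException/InterruptedException, while this sketch wraps them in a RuntimeException for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ParallelFiles {
  // Runs 'task' on every item using 'parallelism' threads, then waits for completion.
  public static <F> void forEachParallel(List<F> items, int parallelism, Consumer<F> task) {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (F item : items) {
        futures.add(pool.submit(() -> task.accept(item)));
      }
      for (Future<?> f : futures) {
        f.get(); // surfaces any per-item exception
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```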
-
putAllInterruptibly
public void putAllInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic) throws IOException, InterruptedException
Uploads all regular files from a given local folder. Performs the task in parallel, using retries on individual files. Uses default retry settings.
- Parameters:
targetFolder - full path of the remote folder, relative to the bucket. Treated as a folder path either way - the caller may or may not append '/' to it.
inputFolder - The local input folder to read all files from
parallelism - The number of threads to use for the task
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- Throws:
IOException - in case of an upload failure
InterruptedException - in case of an interruption during retries
-
putAllRecursiveInterruptibly
public void putAllRecursiveInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Uploads a complete local folder to the cloud, recursively. Performs the task in parallel, using retries on individual files.
- Parameters:
targetFolder - full path of the remote folder, relative to the bucket. Treated as a folder path either way - the caller may or may not append '/' to it.
inputFolder - The local input folder to read all contents from, recursively. Empty folders under the input folder aren't copied.
parallelism - The number of threads to use for the task
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while uploading the files
InterruptedException
-
putAllRecursiveInterruptibly
public void putAllRecursiveInterruptibly(String targetFolder, File inputFolder, int parallelism, boolean isPublic) throws IOException, InterruptedException
Uploads a complete local folder to the cloud, recursively. Performs the task in parallel, using retries on individual files. Uses default retry settings.
- Parameters:
targetFolder - full path of the remote folder, relative to the bucket. Treated as a folder path either way - the caller may or may not append '/' to it.
inputFolder - The local input folder to read all contents from, recursively. Empty folders under the input folder aren't copied.
parallelism - The number of threads to use for the task
isPublic - true to set public file access, false for private. Not all implementations support public access, so this parameter may be ignored.
- Throws:
IOException - In case of an IO error while uploading the files
InterruptedException
-
get
public abstract void get(T meta, File output) throws IOException
Downloads a file given the remote file's metadata object.
- Parameters:
meta - The metadata object pointing to the file to download
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
- Throws:
FileNotFoundException - In case the required object wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
-
get
public void get(String key, File output) throws IOException
Downloads a file given the remote file path.
- Parameters:
key - Remote file path, relative to the bucket
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
-
getSliced
public void getSliced(String key, File output, int chunkSize) throws IOException, InterruptedException
Downloads a file from cloud storage, using sliced download. The default implementation simply delegates to get(key, File), but subclasses may implement this using concurrent download of different file slices, as the method name indicates.
- Parameters:
key - Remote file path, relative to the bucket
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
chunkSize - The size (in bytes) of each chunk read from the storage at once, or 0 to use the default. This parameter may be ignored in implementations where it's not relevant.
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the target file. This includes a non-existing folder in the local file path.
InterruptedException
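Sliced download splits the remote file's byte range into chunks that can be fetched concurrently. The sketch below shows how slice boundaries could be derived from the file length and chunkSize; it illustrates the idea only and is not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class Slices {
  // [start, end) byte ranges covering a file of 'length' bytes, at most
  // 'chunkSize' bytes each. Each range can then be fetched by a separate thread.
  public static List<long[]> sliceRanges(long length, long chunkSize) {
    List<long[]> ranges = new ArrayList<>();
    for (long start = 0; start < length; start += chunkSize) {
      ranges.add(new long[] {start, Math.min(start + chunkSize, length)});
    }
    return ranges;
  }
}
```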
-
getSliced
public void getSliced(String key, File output) throws IOException
Downloads a file from cloud storage, using sliced download. The default implementation simply delegates to getSliced(key, File, chunkSize), so if subclasses don't override the latter, the default single-threaded download is used.
- Parameters:
key - Remote file path, relative to the bucket
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the target file. This includes a non-existing folder in the local file path.
-
getAsStream
public abstract SizedInputStream getAsStream(T meta, int chunkSize) throws IOException
Gets an input stream for a remote file, using a given chunk size. The chunk size indicates how many bytes to fetch in each request, and may be ignored by some implementations.
- Parameters:
meta - The remote object metadata
chunkSize - The size (in bytes) of each chunk read from the storage at once, or 0 to use the default. This parameter may be ignored in implementations where it's not relevant.
- Returns:
- The input stream to read the requested resource from. Note that this stream is a SizedInputStream, therefore it provides the data size. The stream is aware of any RuntimeException thrown by the underlying cloud library, and converts it to a proper IOException. The returned stream isn't buffered.
- Throws:
IOException
FileNotFoundException - In case the key wasn't found in the bucket
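The "converts RuntimeException to a proper IOException" behavior can be pictured as a FilterInputStream wrapper around the cloud library's raw stream. This is a sketch of the pattern only, not the SizedInputStream implementation itself:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class RuntimeSafeStream extends FilterInputStream {
  public RuntimeSafeStream(InputStream in) {
    super(in);
  }

  @Override
  public int read() throws IOException {
    try {
      return super.read();
    } catch (RuntimeException e) {
      // Cloud SDKs often throw unchecked exceptions mid-read; surface them as IO errors.
      throw new IOException("Underlying storage read failed", e);
    }
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    try {
      return super.read(b, off, len);
    } catch (RuntimeException e) {
      throw new IOException("Underlying storage read failed", e);
    }
  }
}
```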
-
getAsStream
public SizedInputStream getAsStream(String key) throws IOException
Gets an input stream for a remote file, using the default chunk size.
- Parameters:
key - The remote file path, relative to the bucket
- Returns:
- The input stream to read the requested resource from. Note that this stream is a SizedInputStream, therefore it provides the data size. The stream is aware of any RuntimeException thrown by the underlying cloud library, and converts it to a proper IOException. The returned stream isn't buffered.
- Throws:
IOException
FileNotFoundException - In case the key wasn't found in the bucket
-
getAsStream
public SizedInputStream getAsStream(T meta) throws IOException
Gets an input stream for a remote file, using the default chunk size.
- Parameters:
meta - The remote object metadata
- Returns:
- The input stream to read the requested resource from. Note that this stream is a SizedInputStream, therefore it provides the data size. The stream is aware of any RuntimeException thrown by the underlying cloud library, and converts it to a proper IOException. The returned stream isn't buffered.
- Throws:
IOException
FileNotFoundException - In case the key wasn't found in the bucket
-
getAsStream
public SizedInputStream getAsStream(String key, int chunkSize) throws IOException
Gets an input stream for a remote file, using a given chunk size. The chunk size indicates how many bytes to fetch in each request, and may be ignored by some implementations.
- Parameters:
key - The remote file path, relative to the bucket
chunkSize - The size (in bytes) of each chunk read from the storage at once, or 0 to use the default. This parameter may be ignored in implementations where it's not relevant.
- Returns:
- The input stream to read the requested resource from. Note that this stream is a SizedInputStream, therefore it provides the data size. The stream is aware of any RuntimeException thrown by the underlying cloud library, and converts it to a proper IOException. The returned stream isn't buffered.
- Throws:
IOException
FileNotFoundException - In case the key wasn't found in the bucket
-
getFromJson
public <C> C getFromJson(String key, Class<C> clazz) throws IOException, IllegalJsonException
Reads a remote json file and deserializes it into an instance of the given class.
- Parameters:
key - Path to an existing json file, relative to the bucket
clazz - The java class corresponding to the json file's structure. This serves as a deserialization spec.
- Returns:
- An instance of the given class, populated with the data read from the json file.
- Throws:
IOException - In case of a read error
IllegalJsonException - In case the json is illegal and can't be deserialized properly
-
getInterruptibly
public void getInterruptibly(String key, File output, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a file, with retries.
- Parameters:
key - Remote file path, relative to the bucket
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
InterruptedException
-
getInterruptibly
public void getInterruptibly(T meta, File output, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a file, with retries.
- Parameters:
meta - The metadata object pointing to the file to download
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
InterruptedException
-
getInterruptibly
public void getInterruptibly(String key, File output) throws IOException, InterruptedException
Downloads a file, with retries. Uses default retry settings.
- Parameters:
key - Remote file path, relative to the bucket
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
InterruptedException
-
getInterruptibly
public void getInterruptibly(T meta, File output) throws IOException, InterruptedException
Downloads a file, with retries. Uses default retry settings.
- Parameters:
meta - The metadata object pointing to the file to download
output - The target file to write to (overriding write). The file itself need not exist, but its folder must exist.
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an IO error while reading the remote file or writing the local file. This includes a non-existing folder in the local file path.
InterruptedException
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(String folderPath, File targetFolder, int parallelism) throws IOException, InterruptedException
Downloads all regular files from a given remote folder, performing the task in parallel. Uses retries on individual files, with default retry settings.
- Parameters:
folderPath - The source folder path, relative to the bucket. Treated as a folder either way, meaning that the caller may or may not append '/' to it.
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(String folderPath, Predicate<T> predicate, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads regular files from a given remote folder. Performs the task in parallel, and allows filtering the files to download by a predicate. Retries are done on individual files.
- Parameters:
folderPath - The source folder path, relative to the bucket. Treated as a folder either way, meaning that the caller may or may not append '/' to it.
predicate - A predicate on the file objects, defining which files to select for download
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The set of downloaded file objects
- Throws:
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(String folderPath, Predicate<T> predicate, File targetFolder, int parallelism) throws IOException, InterruptedException
Downloads regular files from a given remote folder. Performs the task in parallel, and allows filtering the files to download by a predicate. Retries are done on individual files, using default retry settings.
- Parameters:
folderPath - The source folder path, relative to the bucket. Treated as a folder either way, meaning that the caller may or may not append '/' to it.
predicate - A predicate on the file objects, defining which files to select for download
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFiles
public Set<File> getAllRegularFiles(Collection<String> folderPaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism) throws IOException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Uses default retry settings.
- Parameters:
folderPaths - The list of paths of folders to download all direct files from
targetFolder - The local target folder to write files to. Created if needed.
fileNameResolver - A mapper from a remote path (relative to the bucket) to the local file name to assign to it (just the name, without path).
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Expects a fileNameResolver, which allows assigning different names to target files. This is useful, for example, to prevent remote files with the same name from overriding each other.
- Parameters:
cloudFilePaths - The list of remote paths of files to download, relative to the bucket.
targetFolder - The local target folder to write files to. Created if needed.
fileNameResolver - A mapper from a remote path (relative to the bucket) to the local file name to assign to it (just the name, without path).
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
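A fileNameResolver maps each remote path to the local file name to use. For instance, to keep same-named files from different remote folders from colliding, a resolver could encode the folder into the name. Both resolvers below are hypothetical examples, not part of the Bucket API:

```java
import java.util.function.Function;

public class Resolvers {
  // Hypothetical resolver: flattens the remote path into a unique local file name
  // by replacing '/' with '_', so same-named files in different folders don't collide.
  public static final Function<String, String> FLATTEN =
      remotePath -> remotePath.replace('/', '_');

  // Resolver keeping only the base name (mirrors the collision-prone default behavior).
  public static final Function<String, String> BASE_NAME =
      remotePath -> remotePath.substring(remotePath.lastIndexOf('/') + 1);
}
```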
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. The local file names will be exactly as in their remote copy. This means that remote files from different folders will override each other if they have the same name.
- Parameters:
cloudFilePaths - The list of remote paths of files to download, relative to the bucket.
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(Collection<String> cloudFilePaths, File targetFolder, Function<String,String> fileNameResolver, int parallelism) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Uses default retry settings. Expects a fileNameResolver, which allows assigning different names to target files. This is useful, for example, to prevent remote files with the same name from overriding each other.
- Parameters:
cloudFilePaths - The collection of remote paths of files to download, relative to the bucket.
targetFolder - The local target folder to write files to. Created if needed.
fileNameResolver - A mapper from a remote path (relative to the bucket) to the local file name to assign to it (just the name, without path).
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesInterruptibly
public Set<File> getAllRegularFilesInterruptibly(Collection<String> filePaths, File targetFolder, int parallelism) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Uses default retry settings. The local file names will be exactly as in their remote copy. This means that remote files from different folders will override each other if they have the same name.
- Parameters:
filePaths - The collection of remote paths of files to download, relative to the bucket.
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesByMetaInterruptibly
public Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, Function<String,String> fileNameResolver, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Expects a fileNameResolver, which allows assigning different names to target files. This is useful, for example, to prevent remote files with the same name from overriding each other.
- Parameters:
metaObjects - The collection of objects pointing to the remote bucket files to download
targetFolder - The local target folder to write files to. Created if needed.
fileNameResolver - A mapper from a remote path (relative to the bucket) to the local file name to assign to it (just the name, without path).
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesByMetaInterruptibly
public Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. The local file names will be exactly as in their remote copy. This means that remote files from different folders will override each other if they have the same name.
- Parameters:
metaObjects - The collection of objects pointing to the remote files to download
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesByMetaInterruptibly
public Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, Function<String,String> fileNameResolver, int parallelism) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Uses default retry settings. Expects a fileNameResolver, which allows assigning different names to target files. This is useful, for example, to prevent remote files with the same name from overriding each other.
- Parameters:
metaObjects - The collection of objects pointing to the remote files to download
targetFolder - The local target folder to write files to. Created if needed.
fileNameResolver - A mapper from a remote path (relative to the bucket) to the local file name to assign to it (just the name, without path).
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
getAllRegularFilesByMetaInterruptibly
public Set<File> getAllRegularFilesByMetaInterruptibly(Collection<T> metaObjects, File targetFolder, int parallelism) throws IOException, InterruptedException
Downloads a set of regular files from different bucket locations. Performs the task in parallel, using retries on individual files. Uses default retry settings. The local file names will be exactly as in their remote copy. This means that remote files from different folders will override each other if they have the same name.
- Parameters:
metaObjects - The collection of objects pointing to the remote files to download
targetFolder - The local target folder to write files to. Created if needed.
parallelism - The number of threads to use for the task
- Returns:
- The set of downloaded file objects
- Throws:
FileNotFoundException - In case one of the paths wasn't found in the bucket
IOException - In case of an IO error while downloading the files
InterruptedException
-
copyToAnotherBucket
public abstract void copyToAnotherBucket(String fromKey, String toBucket, String toKey) throws IOException
Copies a file from this bucket to another bucket.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toBucket - The name of the target bucket (may be the current bucket name)
toKey - The path of the target file, relative to the target bucket. Overridden if it exists.
- Throws:
IOException
-
copy
public void copy(String fromKey, String toKey) throws IOException
Copies a remote file in the current bucket to a different location in the same bucket.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toKey - The path of the target file, relative to this bucket
- Throws:
IOException
-
copyInterruptibly
public void copyInterruptibly(String fromKey, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Copies a remote file in the current bucket to a different location in the same bucket. Uses retries.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toKey - The path of the target file, relative to this bucket
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - in case the client or the service has failed
InterruptedException
-
copyInterruptibly
public void copyInterruptibly(String fromKey, String toKey) throws IOException, InterruptedException
Copies a remote file in the current bucket to a different location in the same bucket. Uses retries with default retry settings.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toKey - The path of the target file, relative to this bucket
- Throws:
IOException - in case the client or the service has failed
InterruptedException
-
copyToAnotherBucketInterruptibly
public void copyToAnotherBucketInterruptibly(String fromKey, String toBucket, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Copies a file from this bucket to another bucket, using retries.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toBucket - The name of the target bucket
toKey - The path of the target file, relative to the target bucket
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - in case the client or the service has failed
InterruptedException
-
copyToAnotherBucketInterruptibly
public void copyToAnotherBucketInterruptibly(String fromKey, String toBucket, String toKey) throws IOException, InterruptedException
Copies a file from this bucket to another bucket, using retries. Uses default retry settings.
- Parameters:
fromKey - The path of the source file, relative to this bucket
toBucket - The name of the target bucket
toKey - The path of the target file, relative to the target bucket
- Throws:
IOException - in case the client or the service has failed
InterruptedException
-
copyFolderRecursiveInterruptibly
public void copyFolderRecursiveInterruptibly(String srcPath, String dstPath, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Copies all files, recursively, from one folder in this bucket to another. Performs the task in parallel, using retries on individual files.
- Parameters:
srcPath - The source path of the folder to copy, relative to the bucket. Treated as a folder; the caller may or may not append '/' to the path.
dstPath - The destination path of the folder to copy into, relative to the bucket. Treated as a folder; the caller may or may not append '/' to the path.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an error copying the files
InterruptedException
-
copyFolderRecursiveInterruptibly
public void copyFolderRecursiveInterruptibly(String srcPath, String dstPath, int parallelism) throws IOException, InterruptedException
Copies all files, recursively, from one folder in this bucket to another. Performs the task in parallel, using retries on individual files. Uses default retry settings.
- Parameters:
srcPath - The source path of the folder to copy, relative to the bucket. Treated as a folder; the caller may or may not append '/' to the path.
dstPath - The destination path of the folder to copy into, relative to the bucket. Treated as a folder; the caller may or may not append '/' to the path.
parallelism - The number of threads to use for the task
- Throws:
FileNotFoundException - In case the key wasn't found in the bucket
IOException - In case of an error copying the files
InterruptedException
-
delete
public abstract void delete(T objectMeta) throws IOException
Deletes a single object. Nothing happens and no exception is thrown in case the file doesn't exist.
- Parameters:
objectMeta - The metadata object pointing to the remote file to be deleted
- Throws:
IOException
-
delete
public void delete(String key) throws IOException
Deletes a single object. Nothing is done in case the file doesn't exist.
- Parameters:
key - The path of the remote file, relative to the bucket
- Throws:
IOException
-
deleteInterruptibly
public void deleteInterruptibly(String key, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a single object, with retries. Nothing is done in case the file doesn't exist.
- Parameters:
key - The path of the remote file, relative to the bucket
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException
InterruptedException
-
deleteInterruptibly
public void deleteInterruptibly(T objectMeta, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a single object, with retries. Nothing is done in case the file doesn't exist.
- Parameters:
objectMeta - The metadata object pointing to the remote file to be deleted
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException
InterruptedException
-
deleteInterruptibly
public void deleteInterruptibly(String key) throws IOException, InterruptedException
Deletes a single object, with retries. Nothing is done in case the file doesn't exist. Uses default retry settings.
- Parameters:
key - The path of the remote file, relative to the bucket
- Throws:
IOException
InterruptedException
-
deleteInterruptibly
public void deleteInterruptibly(T objectMeta) throws IOException, InterruptedException
Deletes a single object, with retries. Nothing is done in case the file doesn't exist. Uses default retry settings.
- Parameters:
objectMeta - The metadata object pointing to the remote file to be deleted
- Throws:
IOException
InterruptedException
-
deleteFolderRegularFiles
public void deleteFolderRegularFiles(String path) throws IOException
Deletes all regular files under a remote folder.
- Parameters:
path - A path to a folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
- Throws:
IOException
-
deleteFolderRecursiveInterruptibly
public void deleteFolderRecursiveInterruptibly(String path, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes all folder contents, recursively. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
path - A path to a folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteFolderRecursiveInterruptibly
public void deleteFolderRecursiveInterruptibly(String path, int parallelism) throws IOException, InterruptedException
Deletes all folder contents, recursively. Uses default retry settings. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
path - A path to a folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
parallelism - The number of threads to use for the task
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
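As a rough illustration of the default behavior described above (per-file deletion on a fixed-size thread pool), here is a self-contained sketch against a map-backed stand-in for the bucket. ParallelDelete, deleteFolder, and the map-based store are hypothetical, illustrating only the listing-then-parallel-delete pattern:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical sketch of the default parallel recursive folder deletion:
// list all keys under the folder prefix, then delete each one on a
// fixed-size thread pool. A concurrent map stands in for the bucket.
public class ParallelDelete {
  public static void deleteFolder(ConcurrentHashMap<String, byte[]> store,
                                  String path, int parallelism) {
    String prefix = path.endsWith("/") ? path : path + "/"; // treat the path as a folder
    List<String> keys = store.keySet().stream()
        .filter(k -> k.startsWith(prefix))
        .collect(Collectors.toList());
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    for (String key : keys) {
      pool.submit(() -> store.remove(key)); // removing a missing key is a no-op
    }
    pool.shutdown();
    try {
      pool.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interruption status
    }
  }

  public static void main(String[] args) {
    ConcurrentHashMap<String, byte[]> store = new ConcurrentHashMap<>();
    store.put("data/a.txt", new byte[0]);
    store.put("data/sub/b.txt", new byte[0]);
    store.put("other/c.txt", new byte[0]);
    deleteFolder(store, "data", 4);
    System.out.println(store.keySet()); // only other/c.txt remains
  }
}
```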
deleteAllByMetaInterruptibly
public void deleteAllByMetaInterruptibly(Iterator<T> fileRefsIt, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a set of remote files. This is the preferred method to use when the number of remote files to delete is large and we want to be memory sensitive. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
fileRefsIt - An iterator over file references of all files to delete
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllByMetaInterruptibly
public void deleteAllByMetaInterruptibly(Iterator<T> fileRefsIt, int parallelism) throws IOException, InterruptedException
Deletes a set of remote files. This is the preferred method to use when the number of remote files to delete is large and we want to be memory sensitive. Uses default retry settings. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
fileRefsIt - An iterator over file references of all files to delete
parallelism - The number of threads to use for the task
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllByMetaInterruptibly
public void deleteAllByMetaInterruptibly(Collection<T> fileRefs, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a set of remote files. Implementations which support fast batch operations are encouraged to override this method. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
fileRefs - The metadata objects pointing to all bucket files to delete
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllByMetaInterruptibly
public void deleteAllByMetaInterruptibly(Collection<T> fileRefs, int parallelism) throws IOException, InterruptedException
Deletes a set of remote files. Uses default retry settings. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
fileRefs - The metadata objects pointing to all bucket files to delete
parallelism - The number of threads to use for the task
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllInterruptibly
public void deleteAllInterruptibly(Iterator<String> filePathsIt, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a set of remote files. This is the preferred method to use when the number of remote files to delete is large and we want to be memory sensitive. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
filePathsIt - An iterator over the paths of all files to delete. Paths are relative to the bucket.
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries on individual files in case of IOException
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllInterruptibly
public void deleteAllInterruptibly(Iterator<String> filePathsIt, int parallelism) throws IOException, InterruptedException
Deletes a set of remote files. This is the preferred method to use when the number of remote files to delete is large and we want to be memory sensitive. Uses default retry settings. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
filePathsIt - An iterator over the paths of all files to delete. Paths are relative to the bucket.
parallelism - The number of threads to use for the task
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllInterruptibly
public void deleteAllInterruptibly(Collection<String> filePaths, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Deletes a set of remote files. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
filePaths - The paths of all files to delete, relative to the bucket
parallelism - The number of threads to use for the task
maxRetries - Maximum number of retries in case of IOException
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
deleteAllInterruptibly
public void deleteAllInterruptibly(Collection<String> filePaths, int parallelism) throws IOException, InterruptedException
Deletes a set of remote files. Uses default retry settings. In the default implementation, the 'parallelism' parameter controls the number of threads used for the operation. However, some implementations support fast batch operations and may ignore this parameter. The retry policy depends on the implementation: the default implementation retries at the individual file level, but implementations may use global-level retries.
- Parameters:
filePaths - The paths of all files to delete, relative to the bucket
parallelism - The number of threads to use for the task
- Throws:
IOException - In case of an IO error while deleting the files
InterruptedException
-
moveInterruptibly
public void moveInterruptibly(String fromKey, String toKey, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Moves a file from one location to another, using retries. By default, implemented as a combination of copy and delete, which means that the operation isn't atomic and partial results may be seen. However, it is guaranteed that the source file isn't removed before the copy is complete.
- Parameters:
fromKey - The path of the source file, relative to the bucket
toKey - The path of the target file, relative to the bucket
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Throws:
IOException
InterruptedException
-
moveInterruptibly
public void moveInterruptibly(String fromKey, String toKey) throws IOException, InterruptedException
Moves a file from one location to another, using retries with default settings. By default, implemented as a combination of copy and delete, which means that partial results may be seen. However, it is guaranteed that the source file isn't removed before the copy is complete.
- Parameters:
fromKey - The path of the source file, relative to the bucket
toKey - The path of the target file, relative to the bucket
- Throws:
IOException - in case the client or the service has failed
InterruptedException
-
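The copy-then-delete contract described above (the source is removed only after the copy completes, so readers may briefly see both keys but never lose the data) can be sketched against a simple map-backed store. MapBucket is a hypothetical stand-in for the real implementation:

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a non-atomic move: copy first, delete second.
// A reader in between the two steps may see both keys, but the source
// is never gone before the target exists.
public class MapBucket {
  private final Map<String, byte[]> files = new HashMap<>();

  public void put(String key, byte[] data) { files.put(key, data); }
  public boolean exists(String key) { return files.containsKey(key); }

  public void move(String fromKey, String toKey) throws IOException {
    byte[] data = files.get(fromKey);
    if (data == null) {
      throw new FileNotFoundException(fromKey);
    }
    files.put(toKey, data);   // step 1: copy (overwrites the target if present)
    files.remove(fromKey);    // step 2: delete the source only after the copy
  }
}
```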
moveFolderRecursive
public void moveFolderRecursive(String srcPath, String dstPath, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Moves all files from one folder to another, recursively. Performs the task in parallel, using retries on individual files. By default, implemented as a combination of copy and delete, which means that partial results may be seen. However, it is guaranteed that no source file is removed before its copy is complete.
- Parameters:
srcPath - The path to the source folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
dstPath - The path to the destination folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
parallelism - The number of threads to use for the task
- Throws:
IOException
InterruptedException
-
moveFolderRecursive
public void moveFolderRecursive(String srcPath, String dstPath, int parallelism) throws IOException, InterruptedException
Moves all files from one folder to another, recursively. Performs the task in parallel, using retries on individual files. By default, implemented as a combination of copy and delete, which means that partial results may be seen. However, it is guaranteed that no source file is removed before its copy is complete. Uses default retry settings.
- Parameters:
srcPath - The path to the source folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
dstPath - The path to the destination folder, relative to the bucket. Treated as a folder regardless of whether it ends with '/'.
parallelism - The number of threads to use for the task
- Throws:
IOException
InterruptedException
-
normalizeFolderPath
public String normalizeFolderPath(String path)
Appends '/' to the path if it doesn't already end with one, and returns the result.
-
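A minimal sketch of the normalization described above (the actual implementation may differ, but the contract reduces to a suffix check):

```java
// Hypothetical sketch of normalizeFolderPath: a path is treated as a
// folder by ensuring it ends with the '/' separator.
public class FolderPaths {
  public static String normalizeFolderPath(String path) {
    return path.endsWith("/") ? path : path + "/";
  }
}
```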
exists
public abstract boolean exists(String key) throws IOException
- Parameters:
key - A file path, relative to the bucket
- Returns:
- true if and only if the file exists in the bucket. In case the key points to a folder (either existing or not), false is returned.
- Throws:
IOException- upon failure to inspect the bucket
-
listObjects
public abstract Iterator<T> listObjects(String folderPath, boolean recursive) throws IOException
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
recursive - Indicates whether to return objects recursively beneath the given folder
- Returns:
- The set of all objects (files and folders) under the given path, as an iterator. In case the path doesn't exist, returns an empty iterator. No particular order is guaranteed. In case of an IO error during iteration, any runtime exception is allowed, but UncheckedIOException is recommended. We recommend that implementations return a lazy iterator (fetching pages in the background) in order to be more memory friendly.
- Throws:
IOException- upon failure to list the bucket
-
listObjects
public Iterator<T> listObjects(String folderPath) throws IOException
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
- Returns:
- The set of all objects (files and folders) directly under the given path, as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listObjects
public Iterator<T> listObjects(String folderPath, Predicate<T> condition) throws IOException
Creates an iterator over all objects directly under the provided path that satisfy the provided condition.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
condition - A predicate on the listed object, determining whether to include it in the iteration
- Returns:
- The set of all condition-satisfying objects (files/folders) under the given path. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
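The predicate-filtered listing above can be built on top of a plain object iterator. This hypothetical sketch filters eagerly for simplicity; a real implementation would more likely wrap the iterator lazily to stay memory friendly:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: wraps an object iterator, keeping only the
// entries that satisfy the given condition.
public class FilteredListing {
  public static <T> Iterator<T> filter(Iterator<T> objects, Predicate<T> condition) {
    List<T> kept = new ArrayList<>();
    while (objects.hasNext()) {
      T obj = objects.next();
      if (condition.test(obj)) {
        kept.add(obj);
      }
    }
    return kept.iterator();
  }
}
```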
listFilesRecursive
public Iterator<T> listFilesRecursive(String folderPath) throws IOException
Returns an iterator over all the files in a given remote folder, recursively.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
- Returns:
- The list of all files in the tree of the given path as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listFilesRecursive
public Iterator<T> listFilesRecursive(String folderPath, Pattern pattern) throws IOException
Returns an iterator over all the files in a given remote folder matching a regex pattern, recursively.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
pattern - The file pattern to look for. Matched against the whole file path (folders included, not only the file name).
- Returns:
- The list of all files in the tree of the given path as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listFilesRecursive
public Iterator<T> listFilesRecursive(String folderPath, String fileRegex) throws IOException
Returns an iterator over all the files in a given remote folder matching the pattern, recursively.
- Parameters:
folderPath - A path under the bucket
fileRegex - The file pattern to look for. Matched against the whole file path (folders included, not only the file name).
- Returns:
- The list of all files in the tree of the given path as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
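Note that in the recursive variants the pattern applies to the whole relative path, not just the file name. A hypothetical sketch of that matching rule over plain path strings:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: keeps paths whose full relative path matches
// the pattern, mirroring the "whole file path" matching described above.
public class PathMatching {
  public static List<String> matchFullPath(List<String> paths, String fileRegex) {
    Pattern p = Pattern.compile(fileRegex);
    return paths.stream()
        .filter(path -> p.matcher(path).matches())
        .collect(Collectors.toList());
  }
}
```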
listFiles
public Iterator<T> listFiles(String folderPath) throws IOException
Lists all files directly under the given path (excluding folders).
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
- Returns:
- The list of all files under the given path, as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listFiles
public Iterator<T> listFiles(String folderPath, Pattern filePattern) throws IOException
Lists all files that match the regex pattern and are located directly under the given path.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
filePattern - A regex pattern to match the file name (without path) and filter results by
- Returns:
- The list of all objects in the given path as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listFolders
public Iterator<T> listFolders(String folderPath) throws IOException
Lists all sub-folders in the given path.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
- Returns:
- The list of all folder objects under the given path. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
listFiles
public Iterator<T> listFiles(String folderPath, String fileRegex) throws IOException
Lists all files directly under the given path that match the regular expression.
- Parameters:
folderPath - A folder path, relative to the bucket. Treated as a folder path; the caller may or may not append '/' to it.
fileRegex - A regular expression to match the file name (without path) and filter results by
- Returns:
- The list of all objects in the given path, as an iterator. In case that the path doesn't exist, returns an empty iterator. No particular order is guaranteed.
- Throws:
IOException- upon failure to list the bucket
-
getLastFile
public T getLastFile(String path, Pattern pattern) throws IOException
Returns the maximal file (by lexicographic ordering of the full path string) in the path, filtering by the given pattern. The search is recursive.
- Parameters:
path - The base path to look in, relative to the bucket
pattern - The full path pattern (relative to the bucket) to match against
- Returns:
- The metadata object corresponding to the last file, or null if not found
- Throws:
IOException
-
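The "maximal file" selection above reduces to a plain lexicographic maximum over the matching paths. A hypothetical sketch over path strings (the real method operates on metadata objects):

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of getLastFile's selection rule: the
// lexicographically greatest path matching the pattern, or null.
public class LastFile {
  public static String getLastFile(List<String> paths, Pattern pattern) {
    String last = null;
    for (String path : paths) {
      if (pattern.matcher(path).matches()
          && (last == null || path.compareTo(last) > 0)) {
        last = path;
      }
    }
    return last;
  }
}
```

Lexicographic ordering makes this handy for sortable names such as zero-padded dates, where the greatest path is also the most recent one.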
getObjectMetadata
public abstract T getObjectMetadata(String filePath) throws IOException
Supplies the metadata of a file, if it exists.
- Parameters:
filePath - The path of the file, relative to the current bucket
- Returns:
- the metadata of the required object
- Throws:
IOException - upon failure to retrieve the metadata for the specified file
FileNotFoundException - if the provided path refers to a non-existing remote file, or if it refers to a folder
-
getObjectMetadata
public Map<String,T> getObjectMetadata(Collection<String> filePaths) throws IOException, InterruptedException
Supplies metadata objects for a given set of remote file paths. This basic implementation delegates to getObjectMetadata(String), and runs requests in parallel (using #cores threads). Implementations are encouraged to override this method with a more efficient batch operation if available.
- Parameters:
filePaths - The remote paths to fetch the metadata objects for
- Returns:
- The metadata objects map, where keys are input paths and values are corresponding metadata objects. Nulls will be returned for a path if and only if that path doesn't exist.
- Throws:
IOException - upon storage failure while trying to retrieve the metadata objects
InterruptedException - in case the thread is interrupted
-
getObjectMetadata
public Map<String,T> getObjectMetadata(Collection<String> filePaths, int maxRetries, int initialRetrySleepSec, double waitTimeFactor) throws IOException, InterruptedException
Supplies metadata objects for a given set of remote file paths. This basic implementation delegates to getObjectMetadata(String), and runs requests in parallel (using #cores threads). Implementations are encouraged to override this method with a more efficient batch operation if available.
- Parameters:
filePaths - The remote paths to fetch the metadata objects for
maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException, which won't trigger retries)
initialRetrySleepSec - The initial number of seconds to sleep before the first retry
waitTimeFactor - A factor by which the sleep time between retries increases (millisecond precision)
- Returns:
- The metadata objects map, where keys are input paths and values are corresponding metadata objects. Nulls will be returned for a path if and only if that path doesn't exist.
- Throws:
IOException - upon storage failure while trying to retrieve the metadata objects
InterruptedException - in case the thread is interrupted
-
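The batch contract above (keys are the input paths; a null value means the path doesn't exist) can be sketched with a thread pool running one lookup per path. BatchMetadata is hypothetical, and fetchOne stands in for a per-path getObjectMetadata(String) call:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch of the default batch metadata fetch: one lookup
// per path on a pool sized to the core count, collected into a map
// whose values may be null for missing paths.
public class BatchMetadata {
  public static <T> Map<String, T> fetchAll(Collection<String> filePaths,
                                            Function<String, T> fetchOne)
      throws InterruptedException, ExecutionException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    try {
      Map<String, Future<T>> futures = new HashMap<>();
      for (String path : filePaths) {
        futures.put(path, pool.submit(() -> fetchOne.apply(path)));
      }
      Map<String, T> result = new HashMap<>(); // HashMap allows null values
      for (Map.Entry<String, Future<T>> e : futures.entrySet()) {
        result.put(e.getKey(), e.getValue().get());
      }
      return result;
    } finally {
      pool.shutdown();
    }
  }
}
```

A HashMap (rather than a ConcurrentHashMap) is used for the result precisely because the contract requires null values for missing paths.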
getPath
public abstract String getPath(T objMetadata)
- Parameters:
objMetadata - The metadata of an object under the bucket
- Returns:
- The path of the object represented by this metadata (relative to the bucket). In case the metadata object represents a folder (and only in this case), the path should be terminated with '/'. In case the metadata indicates that the object doesn't belong to the current bucket, IllegalArgumentException is thrown.
-
getLength
public abstract long getLength(T objMetadata)
- Parameters:
objMetadata - The metadata of an object under this bucket or another
- Returns:
- The size in bytes of the object as specified by the metadata. In case the metadata object refers to a folder, 0 should be returned.
-
getLastUpdated
public abstract Long getLastUpdated(T objMetadata)
- Parameters:
objMetadata - The metadata of an object under this bucket or another
- Returns:
- The time the file was last updated, as milliseconds since epoch. In case the metadata object refers to a folder, null should be returned.
-
isFile
public boolean isFile(T objMetadata)
- Parameters:
objMetadata - The metadata of an object under this bucket
- Returns:
- true if and only if the metadata object refers to a file. We consider any object with a path ending with "/" a folder, and any other object a file. In case the object doesn't belong to the current bucket, IllegalArgumentException is thrown.
-
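The file/folder convention above reduces to a suffix check on the object's path; a minimal sketch:

```java
// Hypothetical sketch of the file/folder convention: objects whose
// path ends with "/" are folders; everything else is a file.
public class ObjectKinds {
  public static boolean isFile(String path) {
    return !path.endsWith("/");
  }
}
```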
generateSignedUrl
public URL generateSignedUrl(String key, String contentType, int expirationSeconds)
Generates a signed URL for uploading to a specific file. The generated URL expires after the given number of seconds.
The file will have private read access.
The generated URL only allows the client to use the PUT operation on it.
- Parameters:
key - The target file path, relative to the bucket
contentType - The key's content type. This content type must be supplied by the client when uploading to the URL.
expirationSeconds - The number of seconds during which the signed URL is valid
- Returns:
- The signed url for the required key
- Throws:
UnsupportedOperationException- In case that the specific implementation doesn't support this operation
-
generateSignedUrl
public abstract URL generateSignedUrl(String key, String contentType, int expirationSeconds, boolean isPublic)
Generates a signed URL for uploading to a specific file. The generated URL expires after the required number of seconds.
The generated URL only allows the client to perform a PUT operation.
- Parameters:
key - the target key, relative to the bucket
contentType - the key's content type; the client must supply this content type when uploading to the URL
expirationSeconds - the number of seconds during which the signed URL is valid
isPublic - true for public file access, false for private
- Returns:
- A signed URL for the required key
- Throws:
UnsupportedOperationException - If the specific implementation doesn't support this operation
-
generateReadOnlyUrl
public abstract URL generateReadOnlyUrl(String key, int expirationSeconds)
Generates a signed URL for reading a specific file. The generated URL expires after the required number of seconds.
The generated URL only allows the client to read (GET) from it.
- Parameters:
key - the remote file path, relative to the bucket
expirationSeconds - the number of seconds during which the signed URL is valid
- Returns:
- A signed URL for the required file
- Throws:
UnsupportedOperationException - If the specific implementation doesn't support this operation
-
generateResumableSignedUrlForUpload
public URL generateResumableSignedUrlForUpload(String key, String contentType, int expirationSeconds) throws IOException
Generates a resumable signed URL for uploading to a specific file with private access. The generated URL expires after the required number of seconds.
Use this variant when the size of the uploaded file cannot be anticipated, or when the URL is given to a non-third party; otherwise, use the size-limited variant.
The generated URL only allows the client to perform a PUT operation.
- Parameters:
key - the target file the signed URL points to
contentType - the key's content type; the client must supply this content type when uploading to the URL
expirationSeconds - the number of seconds during which the signed URL is valid
- Returns:
- A signed URL for the required key
- Throws:
IOException - If the URL could not be signed
UnsupportedOperationException - If the specific implementation doesn't support this operation
-
generateResumableSignedUrlForUpload
public abstract URL generateResumableSignedUrlForUpload(String key, String contentType, int expirationSeconds, Long maxContentLengthInBytes, boolean isPublic) throws IOException
Generates a resumable signed URL for uploading to a specific file. The generated URL expires after the required number of seconds.
If maxContentLengthInBytes is not null, the uploaded file is limited to the declared size in bytes.
The generated URL only allows the client to perform a PUT operation.
- Parameters:
key - the target file the signed URL points to
contentType - the key's content type; the client must supply this content type when uploading to the URL
expirationSeconds - the number of seconds during which the signed URL is valid
maxContentLengthInBytes - if not null, limits the number of uploaded content bytes
isPublic - true for public file access, false for private
- Returns:
- A signed URL for the required key
- Throws:
IOException - If the URL could not be signed
UnsupportedOperationException - If the specific implementation doesn't support this operation
-
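The point of a resumable URL is that the client can send the content in pieces and retry from the last confirmed offset. The wire protocol (headers, session URIs) is implementation-specific and not specified here; as a neutral illustration, this sketch shows only the chunking a client would typically perform before issuing per-chunk PUTs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChunkingDemo {
  // Splits content into fixed-size chunks; the last chunk may be shorter.
  // Each chunk would be sent with its byte offset against the resumable URL.
  static List<byte[]> split(byte[] content, int chunkSize) {
    List<byte[]> chunks = new ArrayList<>();
    for (int off = 0; off < content.length; off += chunkSize) {
      int end = Math.min(off + chunkSize, content.length);
      chunks.add(Arrays.copyOfRange(content, off, end));
    }
    return chunks;
  }

  public static void main(String[] args) {
    List<byte[]> chunks = split(new byte[10], 4);
    System.out.println(chunks.size()); // 3 chunks: 4 + 4 + 2 bytes
  }
}
```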
compose
public abstract T compose(List<String> paths, String composedFilePath, boolean removeComprisingFiles) throws IOException
Composes (concatenates) remote files. The operation is performed remotely, and is typically implemented efficiently by creating a virtual file rather than copying data. Important notes:
1) Be careful when composing files, since not all file formats (especially compressed ones) remain valid after concatenation.
2) The final file is created atomically, but intermediate remote files may be created in the target folder during the operation. The intermediate files are removed (best effort) once the operation terminates, whether successfully or not.
- Parameters:
paths - The non-empty list of paths (relative to the bucket) of the files to compose, in the required order
composedFilePath - The target path (relative to the bucket) of the composed file
removeComprisingFiles - Whether to remove the files that were concatenated. Deletion is performed only after all files have been composed successfully.
- Returns:
- The path of the composed object, relative to the bucket. The target may be one of the input files (to achieve a kind of append functionality, for example), and is overwritten if it already exists.
- Throws:
IOException - In case of an IO error, or if the supplied target path has the form of a folder rather than a file
UnsupportedOperationException - If the specific implementation doesn't support this operation
-
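What compose() conceptually produces is the byte-wise concatenation of the inputs, in list order. The local sketch below is illustrative only (the real operation runs remotely, usually without copying data), but it makes the ordering guarantee and the format caveat concrete: the result is valid only if the format tolerates raw concatenation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class ComposeDemo {
  // Byte-wise concatenation: the essence of what a composed object contains.
  static byte[] concat(List<byte[]> parts) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] part : parts) {
      out.write(part);  // order matters: parts are appended in list order
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] merged = concat(List.of(
        "header\n".getBytes(StandardCharsets.UTF_8),
        "row1\n".getBytes(StandardCharsets.UTF_8)));
    // Prints "header" and "row1" on separate lines, as one concatenated text
    System.out.print(new String(merged, StandardCharsets.UTF_8));
  }
}
```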
validateNotFolderPath
protected void validateNotFolderPath(String path) throws IOException
- Parameters:
path - A remote path, expected to have a file form
- Throws:
IOException - If the path has the form of a folder path
-
isFolderPath
public boolean isFolderPath(String path)
- Parameters:
path - A remote path
- Returns:
- true if and only if the path ends with "/"
-
isFilePath
public boolean isFilePath(String path)
- Parameters:
path - A remote object path
- Returns:
- true if and only if the path refers to a file. Any path ending with "/" denotes a folder; all others denote files.
-
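The three path helpers above reduce to one rule: a trailing "/" marks a folder. A minimal sketch of that logic, assuming nothing beyond the documented convention (the class name below is hypothetical):

```java
import java.io.IOException;

class PathChecksDemo {
  static boolean isFolderPath(String path) {
    return path.endsWith("/");
  }

  static boolean isFilePath(String path) {
    return !isFolderPath(path);
  }

  // Mirrors validateNotFolderPath: rejects folder-form paths where a file is expected
  static void validateNotFolderPath(String path) throws IOException {
    if (isFolderPath(path)) {
      throw new IOException("Expected a file path, got a folder path: " + path);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(isFolderPath("data/"));  // true
    validateNotFolderPath("data/file.bin");     // passes silently
  }
}
```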
-