Class GoogleStorageBucket


  • public class GoogleStorageBucket
    extends org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
    A storage bucket implementation based on Google Storage. This implementation supports all optional Bucket operations, and maintains List-after-write consistency.
    Author:
    Oren Peer, Eyal Schneider
    • Method Summary

      Modifier and Type Method Description
      com.google.cloud.storage.Blob compose​(List<String> gsPaths, String composedFilePath, boolean removeComprisingFiles)  
      void copyToAnotherBucket​(String fromKey, String toBucket, String toKey)  
      void delete​(com.google.cloud.storage.Blob obj)  
      void deleteAllByMetaInterruptibly​(Collection<com.google.cloud.storage.Blob> fileRefs, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor)  
      boolean exists​(String key)  
      URL generateReadOnlyUrl​(String key, int expirationSeconds)  
      URL generateResumableSignedUrlForUpload​(String key, String contentType, int expirationSeconds, Long maxContentLengthInBytes, boolean isPublic)  
      URL generateSignedUrl​(String key, String contentType, int expirationSeconds, boolean isPublicRead)  
      void get​(com.google.cloud.storage.Blob meta, File output)  
      void get​(ExecutorService ex, String key, File output)
      Downloads a file from Google Storage, using sliced download.
      void get​(ExecutorService ex, String key, File output, int chunkSize)
      Downloads a file from Google Storage, using sliced download.
      Set<File> getAllRegularFilesByMetaInterruptibly​(Collection<com.google.cloud.storage.Blob> metaObjects, File targetFolder, Function<String,​String> fileNameResolver, int parallelism, int maxRetries, int initialRetrySleepSec, double waitTimeFactor)
      Retrieves a set of files from different paths, in an efficient manner.
      org.pipecraft.infra.io.SizedInputStream getAsStream​(com.google.cloud.storage.Blob meta, int chunkSize)  
      <C> C getFromJson​(String key, Class<C> clazz)  
      Long getLastUpdated​(com.google.cloud.storage.Blob keyMetadata)  
      long getLength​(com.google.cloud.storage.Blob keyMetadata)  
      com.google.cloud.storage.Blob getObjectMetadata​(String key)  
      Map<String,​com.google.cloud.storage.Blob> getObjectMetadata​(Collection<String> filePaths)  
      OutputStream getOutputStream​(String key, int chunkSize)  
      String getPath​(com.google.cloud.storage.Blob keyMetadata)  
      void getSliced​(String key, File output)
      Deprecated.
      Use the get(ExecutorService ..) methods or the other getSliced(..) method, as they handle interruptions better.
      void getSliced​(String key, File output, int chunkSize)
      Downloads a file from Google Storage, using sliced download.
      Iterator<com.google.cloud.storage.Blob> listObjects​(String folderPath, boolean recursive)  
      void put​(String key, InputStream input, long length, String contentType, boolean isPublic, boolean allowOverride)  
      • Methods inherited from class org.pipecraft.infra.storage.Bucket

        copy, copyFolderRecursiveInterruptibly, copyFolderRecursiveInterruptibly, copyInterruptibly, copyInterruptibly, copyToAnotherBucketInterruptibly, copyToAnotherBucketInterruptibly, delete, deleteAllByMetaInterruptibly, deleteAllByMetaInterruptibly, deleteAllByMetaInterruptibly, deleteAllInterruptibly, deleteAllInterruptibly, deleteAllInterruptibly, deleteAllInterruptibly, deleteFolderRecursiveInterruptibly, deleteFolderRecursiveInterruptibly, deleteFolderRegularFiles, deleteInterruptibly, deleteInterruptibly, deleteInterruptibly, deleteInterruptibly, generateResumableSignedUrlForUpload, generateSignedUrl, get, getAllRegularFiles, getAllRegularFilesByMetaInterruptibly, getAllRegularFilesByMetaInterruptibly, getAllRegularFilesByMetaInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAllRegularFilesInterruptibly, getAsStream, getAsStream, getAsStream, getBucketName, getInterruptibly, getInterruptibly, getInterruptibly, getInterruptibly, getLastFile, getObjectMetadata, getOutputStream, isFile, isFilePath, isFolderPath, listFiles, listFiles, listFiles, listFilesRecursive, listFilesRecursive, listFilesRecursive, listFolders, listObjects, listObjects, moveFolderRecursive, moveFolderRecursive, moveInterruptibly, moveInterruptibly, normalizeFolderPath, put, putAllInterruptibly, putAllInterruptibly, putAllRecursiveInterruptibly, putAllRecursiveInterruptibly, putDoneFile, putEmptyFile, putFile, putFileInterruptibly, putFileInterruptibly, putLockFile, putPrivate, putPrivate, putPublic, putPublic, putUniquePrivate, putUniquePrivate, putUniquePrivate, putUniquePublic, putUniquePublic, putUniquePublic, validateNotFolderPath
    • Method Detail

      • put

        public void put​(String key,
                        InputStream input,
                        long length,
                        String contentType,
                        boolean isPublic,
                        boolean allowOverride)
                 throws IOException
        Specified by:
        put in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • getOutputStream

        public OutputStream getOutputStream​(String key,
                                            int chunkSize)
                                     throws IOException
        Specified by:
        getOutputStream in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • get

        public void get​(com.google.cloud.storage.Blob meta,
                        File output)
                 throws IOException
        Specified by:
        get in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • get

        public void get​(ExecutorService ex,
                        String key,
                        File output,
                        int chunkSize)
                 throws IOException,
                        InterruptedException
        Downloads a file from Google Storage, using sliced download. Retries are performed on individual slices.
        Parameters:
        ex - The executor to use for downloading the file
        key - Google Storage key name (not including bucket)
        output - The target file to write to (overwritten if it already exists). The file need not exist, but its parent folder must.
        chunkSize - The size (in bytes) of each chunk read from GS at once, or 0 to use the default. A chunk size between 2MB and 6MB is recommended. Memory usage will be at least MAX_THREAD_FOR_DOWNLOAD * chunkSize.
        Throws:
        FileNotFoundException - In case that the key wasn't found in the bucket
        IOException - In case of IO error while reading the file, or in case that the thread has been interrupted
        InterruptedException - in case of an interruption while waiting for all slices to be downloaded
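
        To make the chunkSize guidance concrete, here is a minimal sketch of how a file could be divided into fixed-size slices, and of how the stated memory lower bound works out. It is not the library's implementation. The class SliceMathSketch, its methods, and the thread count passed in (standing in for MAX_THREAD_FOR_DOWNLOAD, whose actual value is not given here) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: slice arithmetic for a chunked download (hypothetical helper, not part of the library). */
class SliceMathSketch {

  /** Returns {startInclusive, endExclusive} byte ranges that together cover a file of the given length. */
  static List<long[]> sliceRanges(long fileLength, int chunkSize) {
    if (fileLength < 0 || chunkSize <= 0) {
      throw new IllegalArgumentException("fileLength must be >= 0 and chunkSize must be > 0");
    }
    List<long[]> ranges = new ArrayList<>();
    for (long start = 0; start < fileLength; start += chunkSize) {
      // The last slice may be shorter than chunkSize.
      long end = Math.min(start + chunkSize, fileLength);
      ranges.add(new long[] {start, end});
    }
    return ranges;
  }

  /** Lower bound on buffer memory, assuming one chunk-sized buffer per concurrently downloading thread. */
  static long minBufferBytes(int downloadThreads, int chunkSize) {
    return (long) downloadThreads * chunkSize;
  }
}
```

        Under these assumptions, a 10 MiB file read with a 4 MiB chunk size yields three slices, and 8 download threads (an assumed count) would need at least 32 MiB of buffer memory.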
      • get

        public void get​(ExecutorService ex,
                        String key,
                        File output)
                 throws IOException,
                        InterruptedException
        Downloads a file from Google Storage, using sliced download.
        Parameters:
        ex - The executor to use for downloading the file
        key - Google Storage key name (not including bucket)
        output - The target file to write to
        Throws:
        FileNotFoundException - In case that the key wasn't found in the bucket
        IOException - In case of IO error while reading the file
        InterruptedException - in case of an interruption while waiting for all slices to be downloaded
      • getSliced

        public void getSliced​(String key,
                              File output,
                              int chunkSize)
                       throws IOException,
                              InterruptedException
        Downloads a file from Google Storage, using sliced download. Uses a temporary pooled executor.
        Overrides:
        getSliced in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Parameters:
        key - Google Storage key name (not including bucket)
        output - The target file to write to
        chunkSize - The size (in bytes) of each chunk read from GS at once, or 0 to use the default. A chunk size between 2MB and 6MB is recommended. Memory usage will be at least MAX_THREAD_FOR_DOWNLOAD * chunkSize.
        Throws:
        FileNotFoundException - In case that the key wasn't found in the bucket
        IOException - In case of IO error while reading the file
        InterruptedException - in case of an interruption while waiting for all slices to be downloaded
      • getSliced

        @Deprecated
        public void getSliced​(String key,
                              File output)
                       throws IOException
        Deprecated.
        Use the get(ExecutorService ..) methods or the other getSliced(..) method, as they handle interruptions better.
        Downloads a file from Google Storage, using sliced download. Uses a temporary pooled executor, and default chunk size.
        Overrides:
        getSliced in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Parameters:
        key - Google Storage key name (not including bucket)
        output - The target file to write to
        Throws:
        FileNotFoundException - In case that the key wasn't found in the bucket
        IOException - In case of IO error while reading the file, or if the thread is interrupted while waiting for all slices to download
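
        The deprecation note concerns interruption handling: this variant reports an interruption as an IOException, while the non-deprecated variants declare InterruptedException. The following sketch shows one common way a caller might handle an interruptible download while keeping the thread's interrupt status visible. The class InterruptHandlingSketch and its InterruptibleDownload interface are hypothetical stand-ins and not part of the library.

```java
import java.io.IOException;

/** Illustrative only: caller-side handling of an interruptible download (hypothetical helper). */
class InterruptHandlingSketch {

  /** Stand-in for a call such as get(ExecutorService, String, File); not a library type. */
  interface InterruptibleDownload {
    void run() throws IOException, InterruptedException;
  }

  /**
   * Runs the download and reports whether it completed.
   * On interruption, the interrupt flag is restored so that code further up the stack can still observe it.
   */
  static boolean runPreservingInterrupt(InterruptibleDownload download) throws IOException {
    try {
      download.run();
      return true;
    } catch (InterruptedException e) {
      // Restore the flag instead of swallowing the interruption.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

        A caller could wrap a download call in a lambda and pass it to runPreservingInterrupt. An I/O failure still surfaces as an IOException, separate from the interruption case.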
      • getFromJson

        public <C> C getFromJson​(String key,
                                 Class<C> clazz)
                          throws IOException,
                                 org.pipecraft.infra.storage.IllegalJsonException
        Overrides:
        getFromJson in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
        org.pipecraft.infra.storage.IllegalJsonException
      • getAsStream

        public org.pipecraft.infra.io.SizedInputStream getAsStream​(com.google.cloud.storage.Blob meta,
                                                                   int chunkSize)
                                                            throws IOException
        Specified by:
        getAsStream in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • getAllRegularFilesByMetaInterruptibly

        public Set<File> getAllRegularFilesByMetaInterruptibly​(Collection<com.google.cloud.storage.Blob> metaObjects,
                                                               File targetFolder,
                                                               Function<String,​String> fileNameResolver,
                                                               int parallelism,
                                                               int maxRetries,
                                                               int initialRetrySleepSec,
                                                               double waitTimeFactor)
                                                        throws IOException,
                                                               InterruptedException
        Retrieves a set of files from different paths, in an efficient manner. This method is intended to speed up multi-file download by applying concurrent sliced download to multiple files. The implementation bounds the total number of file handles (connections + local files) it opens at once. The bound is twice the value passed in the parallelism parameter. Expects a fileNameResolver, which allows assigning different names to target files. This is useful, for example, to prevent remote files with the same name from overwriting each other.
        Overrides:
        getAllRegularFilesByMetaInterruptibly in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Parameters:
        metaObjects - The collection of objects pointing to the remote files to download
        targetFolder - The local target folder to write files to. Created if needed.
        fileNameResolver - A mapper from the remote path (relative to the bucket) to the local file name to assign to it (name only, without a path).
        parallelism - The number of threads to use for the task
        maxRetries - Maximum number of retries in case of IOException (except for FileNotFoundException which won't trigger retries).
        initialRetrySleepSec - The initial number of seconds to sleep between retries (increases exponentially)
        waitTimeFactor - A factor by which the sleep times between retries increase (millisecond precision)
        Returns:
        The set of downloaded file objects
        Throws:
        FileNotFoundException - In case that one of the paths wasn't found in the bucket
        IOException - In case of IO error while downloading the files
        InterruptedException - In case that the thread is interrupted
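
        As an illustration of the fileNameResolver parameter, the sketch below flattens a remote path into a single local file name, so that remote files sharing the same last path segment end up in distinct local files. The class FileNameResolverSketch and the escaping scheme are assumptions chosen for this example. The library does not prescribe any particular resolver.

```java
import java.util.function.Function;

/** Illustrative only: a fileNameResolver that keeps same-named remote files apart (hypothetical helper). */
class FileNameResolverSketch {

  /**
   * Maps a remote path (relative to the bucket) to a flat local file name.
   * Existing underscores are doubled first, then each '/' becomes "_-",
   * so two different remote paths never map to the same local name.
   */
  static final Function<String, String> FLATTENING_RESOLVER =
      remotePath -> remotePath.replace("_", "__").replace("/", "_-");
}
```

        A function like this could be passed as the fileNameResolver argument. By contrast, a resolver that keeps only the last path segment would map 2021/01/data.csv and 2021/02/data.csv to the same local name, so one download would overwrite the other.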
      • copyToAnotherBucket

        public void copyToAnotherBucket​(String fromKey,
                                        String toBucket,
                                        String toKey)
                                 throws IOException
        Specified by:
        copyToAnotherBucket in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • delete

        public void delete​(com.google.cloud.storage.Blob obj)
                    throws IOException
        Specified by:
        delete in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • deleteAllByMetaInterruptibly

        public void deleteAllByMetaInterruptibly​(Collection<com.google.cloud.storage.Blob> fileRefs,
                                                 int parallelism,
                                                 int maxRetries,
                                                 int initialRetrySleepSec,
                                                 double waitTimeFactor)
                                          throws IOException,
                                                 InterruptedException
        Overrides:
        deleteAllByMetaInterruptibly in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
        InterruptedException
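
        The retry parameters of this method have the same names as those documented for getAllRegularFilesByMetaInterruptibly. Assuming they behave the same way (an initial sleep in seconds that grows by waitTimeFactor after each failed attempt), the resulting wait times can be sketched as follows. The class RetryBackoffSketch and its method are hypothetical, and the exact schedule the library uses, including any rounding or jitter, is not specified here.

```java
/** Illustrative only: exponential backoff arithmetic (hypothetical helper, not the library's code). */
class RetryBackoffSketch {

  /** Returns the assumed sleep time in milliseconds before each of the maxRetries retries. */
  static long[] sleepMillis(int maxRetries, int initialRetrySleepSec, double waitTimeFactor) {
    // This sketch assumes a non-shrinking schedule, hence waitTimeFactor >= 1.
    if (maxRetries < 0 || initialRetrySleepSec < 0 || waitTimeFactor < 1.0) {
      throw new IllegalArgumentException("expected maxRetries >= 0, initialRetrySleepSec >= 0, waitTimeFactor >= 1");
    }
    long[] sleeps = new long[maxRetries];
    double current = initialRetrySleepSec * 1000.0;
    for (int i = 0; i < maxRetries; i++) {
      sleeps[i] = Math.round(current); // rounded to millisecond precision
      current *= waitTimeFactor;       // grow the wait for the next retry
    }
    return sleeps;
  }
}
```

        Under these assumptions, maxRetries = 3, initialRetrySleepSec = 2 and waitTimeFactor = 1.5 give waits of 2000, 3000 and 4500 milliseconds.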
      • exists

        public boolean exists​(String key)
                       throws IOException
        Specified by:
        exists in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • listObjects

        public Iterator<com.google.cloud.storage.Blob> listObjects​(String folderPath,
                                                                   boolean recursive)
                                                            throws IOException
        Specified by:
        listObjects in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • generateSignedUrl

        public URL generateSignedUrl​(String key,
                                     String contentType,
                                     int expirationSeconds,
                                     boolean isPublicRead)
        Specified by:
        generateSignedUrl in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
      • generateReadOnlyUrl

        public URL generateReadOnlyUrl​(String key,
                                       int expirationSeconds)
        Specified by:
        generateReadOnlyUrl in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
      • generateResumableSignedUrlForUpload

        public URL generateResumableSignedUrlForUpload​(String key,
                                                       String contentType,
                                                       int expirationSeconds,
                                                       Long maxContentLengthInBytes,
                                                       boolean isPublic)
                                                throws IOException
        Specified by:
        generateResumableSignedUrlForUpload in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • getObjectMetadata

        public com.google.cloud.storage.Blob getObjectMetadata​(String key)
                                                        throws IOException
        Specified by:
        getObjectMetadata in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException
      • getPath

        public String getPath​(com.google.cloud.storage.Blob keyMetadata)
        Specified by:
        getPath in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
      • getLength

        public long getLength​(com.google.cloud.storage.Blob keyMetadata)
        Specified by:
        getLength in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
      • getLastUpdated

        public Long getLastUpdated​(com.google.cloud.storage.Blob keyMetadata)
        Specified by:
        getLastUpdated in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
      • compose

        public com.google.cloud.storage.Blob compose​(List<String> gsPaths,
                                                     String composedFilePath,
                                                     boolean removeComprisingFiles)
                                              throws IOException
        Specified by:
        compose in class org.pipecraft.infra.storage.Bucket<com.google.cloud.storage.Blob>
        Throws:
        IOException