Package org.projectnessie.versioned.gc
Class IdentifyUnreferencedAssets<T,R extends AssetKey>
- java.lang.Object
  - org.projectnessie.versioned.gc.IdentifyUnreferencedAssets<T,R>
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
- static class IdentifyUnreferencedAssets.AssetFilter: Spark filter to determine if a value is referenced by checking if the byte[] serialization is in a bloom filter.
- static class IdentifyUnreferencedAssets.AssetFlatMapper<T,R extends AssetKey>: Spark flat map function to convert a value into an iterator of AssetKeys keeping their reference state.
- static class IdentifyUnreferencedAssets.CategorizedAssetKey: Pair of referenced state of an asset key and its byte[] representation.
- static class IdentifyUnreferencedAssets.UnreferencedItem: Unreferenced Item.
- static class IdentifyUnreferencedAssets.UnreferencedItemConverter: Spark function to convert a Row into a concrete UnreferencedItem object.
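The AssetFilter description relies on a bloom filter keyed by the byte[] serialization of a value. The following is a minimal, self-contained sketch of that membership check in plain Java. ByteArrayBloomFilter, its sizing parameters, and its hash functions are hypothetical stand-ins chosen for illustration; they are not the filter implementation this package uses.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative stand-in only; not the bloom filter used by IdentifyUnreferencedAssets. */
final class ByteArrayBloomFilter {
  private final BitSet bits;
  private final int numBits;
  private final int numHashes;

  /** Both arguments are assumed to be positive. */
  ByteArrayBloomFilter(int numBits, int numHashes) {
    this.bits = new BitSet(numBits);
    this.numBits = numBits;
    this.numHashes = numHashes;
  }

  /** Record the serialized form of a referenced value. */
  void put(byte[] serialized) {
    for (int i = 0; i < numHashes; i++) {
      bits.set(index(serialized, i));
    }
  }

  /** Returns false if the value was never added; true means it was possibly added. */
  boolean mightContain(byte[] serialized) {
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(index(serialized, i))) {
        return false;
      }
    }
    return true;
  }

  // Double hashing: derive the i-th probe position from two base hashes.
  private int index(byte[] data, int i) {
    int h1 = Arrays.hashCode(data);
    int h2 = fnv1a(data);
    return Math.floorMod(h1 + i * h2, numBits);
  }

  // 32-bit FNV-1a hash over the raw bytes.
  private static int fnv1a(byte[] data) {
    int hash = 0x811c9dc5;
    for (byte b : data) {
      hash ^= (b & 0xff);
      hash *= 0x01000193;
    }
    return hash;
  }
}
```

A bloom filter can report a false positive but never a false negative. If the filter holds the referenced values, an unreferenced value may occasionally be treated as referenced and kept, but a referenced value is never reported as absent.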
Field Summary
Fields (Modifier and Type, Field)
- protected Serializer<T> valueSerializer
Constructor Summary
Constructors (Constructor: Description)
- IdentifyUnreferencedAssets(Serializer<T> valueSerializer, Serializer<AssetKey> assetKeySerializer, AssetKeyConverter<T,R> assetKeyConverter, org.apache.spark.api.java.function.FilterFunction<CategorizedValue> valueTypeFilter, org.apache.spark.sql.SparkSession spark): Drive a job that generates a dataset of unreferenced assets.
Method Summary
Methods (Modifier and Type, Method)
- org.apache.spark.sql.Dataset<IdentifyUnreferencedAssets.UnreferencedItem> identify(org.apache.spark.sql.Dataset<CategorizedValue> categorizedValues)
Field Detail
valueSerializer
protected final Serializer<T> valueSerializer
Constructor Detail
IdentifyUnreferencedAssets
public IdentifyUnreferencedAssets(Serializer<T> valueSerializer, Serializer<AssetKey> assetKeySerializer, AssetKeyConverter<T,R> assetKeyConverter, org.apache.spark.api.java.function.FilterFunction<CategorizedValue> valueTypeFilter, org.apache.spark.sql.SparkSession spark)
Drive a job that generates a dataset of unreferenced assets.
Method Detail
identify
public org.apache.spark.sql.Dataset<IdentifyUnreferencedAssets.UnreferencedItem> identify(org.apache.spark.sql.Dataset<CategorizedValue> categorizedValues)
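The page gives no description for identify, so the sketch below is only one plausible reading of the nested class summaries: values are expanded into asset keys that carry the referenced state of their originating value, and an asset counts as unreferenced when no referenced value uses it. It uses plain Java collections instead of Spark datasets and an exact set instead of a bloom filter. UnreferencedAssetsSketch, Value, and the String asset keys are hypothetical names introduced for illustration; the actual Spark job may work differently.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative in-memory sketch; not the Spark job run by IdentifyUnreferencedAssets. */
final class UnreferencedAssetsSketch {

  /** Hypothetical stand-in for a categorized value: its asset keys plus its referenced state. */
  static final class Value {
    final boolean referenced;
    final List<String> assetKeys;

    Value(boolean referenced, List<String> assetKeys) {
      this.referenced = referenced;
      this.assetKeys = assetKeys;
    }
  }

  /** Returns the asset keys used only by unreferenced values, in encounter order. */
  static Set<String> identify(List<Value> values) {
    // Step 1: collect every asset key reachable from a referenced value.
    Set<String> referencedKeys = new HashSet<>();
    for (Value value : values) {
      if (value.referenced) {
        referencedKeys.addAll(value.assetKeys);
      }
    }

    // Step 2: keep the keys of unreferenced values that no referenced value uses.
    Set<String> unreferenced = new LinkedHashSet<>();
    for (Value value : values) {
      if (value.referenced) {
        continue;
      }
      for (String key : value.assetKeys) {
        if (!referencedKeys.contains(key)) {
          unreferenced.add(key);
        }
      }
    }
    return unreferenced;
  }
}
```

In this sketch an asset shared by a referenced and an unreferenced value is kept, because step 1 records it as referenced before step 2 runs.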