Class DistributedShufflerConfig.Builder<T>

    • Method Detail

      • workers

        public DistributedShufflerConfig.Builder<T> workers(Set<HostPort> workers)
        Parameters:
        workers - The set of workers (each specified by address and port) taking part in the distributed shuffling. The set must (1) include the current worker, with the same port as specified by the getPort(..) method, and (2) be identical on every worker taking part in the distributed shuffling.
        Returns:
        This builder.
      • shardFunc

        public DistributedShufflerConfig.Builder<T> shardFunc(Function<T,Integer> shardFunction)
        Parameters:
        shardFunction - The explicit item sharding function to use. For each item, it returns the id of the shard responsible for handling it, in the range 0 .. workers.size() - 1. The mapping between shard ids and workers is done internally, but the caller can obtain it using DistributedShufflerPipe.getWorkerShardId(..). Note that this method overrides shardBy, and vice versa. By default, sharding is based on hashing the item itself.
        Returns:
        This builder.
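
        A minimal sketch of an explicit sharding function, assuming items are strings of the hypothetical form "tenant:payload" and that the number of workers is known up front. The class and method names are illustrative and not part of the library; only the contract (a result in 0 .. workers.size() - 1) is taken from the description above.

```java
import java.util.function.Function;

/** Illustrative only: not part of the DistributedShuffler API. */
final class ShardFunctionSketch {

    /**
     * Builds a function that maps an item to a shard id in [0, workerCount).
     * Math.floorMod keeps the result non-negative even when hashCode() is negative.
     */
    static Function<String, Integer> byTenantPrefix(int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        return item -> {
            // Hypothetical item layout "tenant:payload": shard on the tenant part only,
            // so all items of one tenant land on the same shard.
            int sep = item.indexOf(':');
            String tenant = sep >= 0 ? item.substring(0, sep) : item;
            return Math.floorMod(tenant.hashCode(), workerCount);
        };
    }
}
```

        A function built this way could be passed to shardFunc(..) with workerCount equal to workers.size(). Using floorMod rather than the % operator matters here, since % can yield a negative shard id for a negative hash code.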
      • shardBy

        public DistributedShufflerConfig.Builder<T> shardBy(Function<T,Object> shardBy)
        Parameters:
        shardBy - A function that specifies the object to shard by. For each item, it returns an object whose hash value is used internally to determine the shard the item belongs to. Note that this method overrides shardFunc, and vice versa. By default, sharding is based on hashing the item itself.
        Returns:
        This builder.
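
        A minimal sketch of a key extractor of the shape shardBy(..) accepts, together with one plausible way a shard id could be derived from the extracted key. The Order type, the BY_CUSTOMER constant, and the shardOf helper are hypothetical, and the hash-to-shard step is an assumption for illustration; the library's actual internal hashing is not described here.

```java
import java.util.function.Function;

/** Illustrative only: not part of the DistributedShuffler API. */
final class ShardBySketch {

    /** Hypothetical item type used for the example. */
    static final class Order {
        final String customerId;
        final long amountCents;

        Order(String customerId, long amountCents) {
            this.customerId = customerId;
            this.amountCents = amountCents;
        }
    }

    /** Key extractor: item in, object to hash out (here, the customer id). */
    static final Function<Order, Object> BY_CUSTOMER = order -> order.customerId;

    /**
     * Assumed hash-to-shard step: hash the extracted key and reduce it
     * to the range [0, workerCount). Not the library's confirmed behavior.
     */
    static <T> int shardOf(T item, Function<T, Object> shardBy, int workerCount) {
        Object key = shardBy.apply(item);
        return Math.floorMod(key.hashCode(), workerCount);
    }
}
```

        With such an extractor, two orders of the same customer share a key and therefore a shard, regardless of their other fields. The returned object needs a stable hashCode() across all workers for this to hold; String satisfies that, whereas types relying on identity hash codes would not.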
      • build

        public DistributedShufflerConfig<T> build()
        Returns:
        A new config object based on the current settings, with defaults applied where a setting was not specified.