Package org.pipecraft.infra.bq
Class TableLoadConfig.Builder
- java.lang.Object
-
- org.pipecraft.infra.bq.TableLoadConfig.Builder
-
- Enclosing class:
- TableLoadConfig
public static class TableLoadConfig.Builder extends Object
-
-
Method Detail
-
getSourceURIs
public Set<String> getSourceURIs()
- Returns:
- the full paths to the source data. Each URI should be fully qualified, and may contain one wildcard ('*') in the file name part of the path. When loading local files, all URIs should have the form of a full local file system path. For remote loading they should all be valid cloud storage paths.
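The wildcard rule above can be checked before a load is submitted. The following is a minimal sketch and not part of the pipecraft API: SourceUriCheck, hasValidWildcard and allValid are hypothetical names, and the sketch assumes that the "file name part" is everything after the last '/'.

```java
import java.util.Set;

/** Hypothetical helper, not part of org.pipecraft.infra.bq. */
final class SourceUriCheck {
    private SourceUriCheck() {}

    /** True if the URI has no wildcard, or exactly one '*' located after the last '/'. */
    static boolean hasValidWildcard(String uri) {
        int first = uri.indexOf('*');
        if (first < 0) {
            return true; // no wildcard at all
        }
        boolean single = first == uri.lastIndexOf('*');
        // Assumption: the file name part starts right after the last '/'.
        boolean inFileName = first > uri.lastIndexOf('/');
        return single && inFileName;
    }

    /** True if every URI in the set passes the wildcard check. */
    static boolean allValid(Set<String> uris) {
        return uris.stream().allMatch(SourceUriCheck::hasValidWildcard);
    }
}
```

The check only covers the wildcard rule; it does not verify that all URIs are of the same kind (all local or all cloud storage paths).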
-
getDestinationTableReference
public com.google.cloud.bigquery.TableId getDestinationTableReference()
- Returns:
- the destination table reference
-
setDestinationTablePartition
public TableLoadConfig.Builder setDestinationTablePartition(LocalDate partition)
- Parameters:
partition - The destination table partition to write to, as a date. Should be specified only for partitioned tables.
-
getDestinationTablePartition
public LocalDate getDestinationTablePartition()
- Returns:
- The destination table partition to write to, as a date. Should be specified only for partitioned tables.
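BigQuery addresses a single daily partition with a decorator of the form table$YYYYMMDD. The sketch below shows how a LocalDate maps to that form. Whether this class builds the decorator the same way internally is an assumption; PartitionDecorator and decorate are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper, not part of org.pipecraft.infra.bq. */
final class PartitionDecorator {
    private PartitionDecorator() {}

    /** Appends a daily partition decorator ($YYYYMMDD) to a table name. */
    static String decorate(String tableName, LocalDate partition) {
        // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 20240115.
        return tableName + "$" + partition.format(DateTimeFormatter.BASIC_ISO_DATE);
    }
}
```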
-
setLoadFormat
public TableLoadConfig.Builder setLoadFormat(TableLoadConfig.LoadFormat loadFormat)
- Parameters:
loadFormat - the format of the input file. Default is CSV.
- Returns:
- This builder object
-
getLoadFormat
public TableLoadConfig.LoadFormat getLoadFormat()
- Returns:
- the format of the input file. Default is CSV.
-
setCSVFieldDelimiter
public TableLoadConfig.Builder setCSVFieldDelimiter(String csvFieldDelimiter)
Applies to CSV format only.
- Parameters:
csvFieldDelimiter - the input file field delimiter. Default is ",".
- Returns:
- This builder object
-
getCSVFieldDelimiter
public String getCSVFieldDelimiter()
- Returns:
- the input file field delimiter. Applies to CSV format only. Default is ",".
-
setCSVHasHeader
public TableLoadConfig.Builder setCSVHasHeader(boolean csvHasHeader)
Relevant for CSV format only.
- Parameters:
csvHasHeader - True if and only if the CSV file has a header line to skip. Default is true.
-
getCSVHasHeader
public boolean getCSVHasHeader()
Relevant for CSV format only.
- Returns:
- True if and only if the CSV file has a header line to skip. Default is true.
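To make the two CSV options concrete, the sketch below applies a field delimiter and a header flag to a few in-memory lines. It only illustrates what the options mean and is not how BigQuery parses files: it ignores quoting and escaping, and CsvSketch and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration, not part of org.pipecraft.infra.bq. */
final class CsvSketch {
    private CsvSketch() {}

    /** Splits each line on the delimiter, skipping the first line when hasHeader is true. */
    static List<String[]> parse(List<String> lines, String delimiter, boolean hasHeader) {
        List<String[]> rows = new ArrayList<>();
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String regex = Pattern.quote(delimiter);
        for (int i = hasHeader ? 1 : 0; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields.
            rows.add(lines.get(i).split(regex, -1));
        }
        return rows;
    }
}
```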
-
setTableSchema
public TableLoadConfig.Builder setTableSchema(com.google.cloud.bigquery.Schema tableSchema)
- Parameters:
tableSchema - the destination table schema. The schema can be null if the destination table already exists. If specified, it can serve for adding columns dynamically.
- Returns:
- This builder object
-
getTableSchema
public com.google.cloud.bigquery.Schema getTableSchema()
- Returns:
- the destination table schema. The schema can be omitted if the destination table already exists. If specified, it can serve for adding columns dynamically.
-
setDestinationTableExpirationHs
public TableLoadConfig.Builder setDestinationTableExpirationHs(Integer destinationTableExpirationHs)
- Parameters:
destinationTableExpirationHs - the destination table expiration in hours. Null means no expiration.
- Returns:
- This builder object
-
getDestinationTableExpirationHs
public Integer getDestinationTableExpirationHs()
- Returns:
- the destination table expiration in hours. Null means no expiration.
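BigQuery table metadata expresses expiration as an absolute time in milliseconds since the epoch, so a value given in hours has to be converted at some point. How this class performs that conversion is not documented here; the following is only a sketch of the arithmetic, with ExpirationSketch and expirationEpochMs as hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not part of org.pipecraft.infra.bq. */
final class ExpirationSketch {
    private ExpirationSketch() {}

    /**
     * Converts an expiration in hours, counted from the given instant,
     * to epoch milliseconds. Null in means null out (no expiration).
     */
    static Long expirationEpochMs(Integer expirationHs, Instant now) {
        if (expirationHs == null) {
            return null;
        }
        return now.plus(Duration.ofHours(expirationHs)).toEpochMilli();
    }
}
```

Passing the reference instant as a parameter keeps the calculation deterministic; callers would typically pass Instant.now().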
-
setCreateDisposition
public TableLoadConfig.Builder setCreateDisposition(com.google.cloud.bigquery.JobInfo.CreateDisposition createDisposition)
- Parameters:
createDisposition - The table creation mode. Defines how the command behaves when the table to load into does not exist yet. Default is CREATE_IF_NEEDED.
- Returns:
- This builder object
-
getCreateDisposition
public com.google.cloud.bigquery.JobInfo.CreateDisposition getCreateDisposition()
- Returns:
- The table creation mode. Defines how the command behaves when the table to load into does not exist yet. Default is CREATE_IF_NEEDED.
-
setWriteDisposition
public TableLoadConfig.Builder setWriteDisposition(com.google.cloud.bigquery.JobInfo.WriteDisposition writeDisposition)
- Parameters:
writeDisposition - The mode defining how the command deals with existing rows in the target table. Default is WRITE_APPEND.
- Returns:
- This builder object
-
getWriteDisposition
public com.google.cloud.bigquery.JobInfo.WriteDisposition getWriteDisposition()
- Returns:
- The mode defining how the command deals with existing rows in the target table. Default is WRITE_APPEND.
-
setAllowJaggedRows
public TableLoadConfig.Builder setAllowJaggedRows(boolean allowJaggedRows)
- Parameters:
allowJaggedRows - true if and only if jagged rows are allowed. Jagged rows are rows that are missing optional columns (trailing columns only). When true, the missing values are treated as nulls. When false, missing values are considered an error. Default is false.
- Returns:
- This builder object
-
getAllowJaggedRows
public boolean getAllowJaggedRows()
- Returns:
- true if and only if jagged rows are allowed. Jagged rows are rows that are missing optional columns (trailing columns only). When true, the missing values are treated as nulls. When false, missing values are considered an error. Default is false.
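The jagged-row rule can be illustrated on a single parsed row. This is a sketch of the described semantics only, not BigQuery's implementation; JaggedRows and normalize are hypothetical names, and the check that missing columns are optional in the schema is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration, not part of org.pipecraft.infra.bq. */
final class JaggedRows {
    private JaggedRows() {}

    /**
     * Pads a row that is missing trailing columns with nulls when jagged rows
     * are allowed; otherwise a short row is rejected.
     */
    static List<String> normalize(List<String> row, int columnCount, boolean allowJaggedRows) {
        if (row.size() > columnCount) {
            throw new IllegalArgumentException("Row has more values than columns");
        }
        if (row.size() < columnCount && !allowJaggedRows) {
            throw new IllegalArgumentException("Row is missing trailing columns");
        }
        List<String> padded = new ArrayList<>(row);
        while (padded.size() < columnCount) {
            padded.add(null); // missing trailing value treated as null
        }
        return padded;
    }
}
```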
-
setClusteringFields
public TableLoadConfig.Builder setClusteringFields(Set<String> clusteringFields)
- Parameters:
clusteringFields - The set of names of table fields defined as clustering fields. Mandatory when the table has clustering fields.
-
getClusteringFields
public Set<String> getClusteringFields()
- Returns:
- The set of names of table fields defined as clustering fields. Mandatory when the table has clustering fields.
-
setTimeoutMs
public TableLoadConfig.Builder setTimeoutMs(Long timeoutMs)
- Parameters:
timeoutMs - The load execution timeout to set, in milliseconds. Must be positive. Null means no timeout. NOTE: Google's API doesn't seem to always respect this limit, and it's not always clear which timeout applies (the load-level timeout set here or the global one provided in the BigQueryConnector's constructor).
- Returns:
- This builder object
-
getTimeoutMs
public Long getTimeoutMs()
- Returns:
- The load execution timeout, in milliseconds. Null means no timeout (default value). NOTE: Google's API doesn't seem to always respect this limit, and it's not always clear which timeout applies (the load-level timeout set here or the global one provided in the BigQueryConnector's constructor).
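Because the limit may not always be respected, a caller that needs a hard upper bound could add a client-side guard around the blocking call. The sketch below uses only java.util.concurrent and is not part of the pipecraft API; ClientSideTimeout and callWithTimeout are hypothetical names. It bounds how long the caller waits, but it does not cancel a job that is already running remotely.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of org.pipecraft.infra.bq. */
final class ClientSideTimeout {
    private ClientSideTimeout() {}

    /**
     * Runs the task on a worker thread and waits at most timeoutMs for it.
     * A null timeout means wait indefinitely, mirroring the setting above.
     * Throws java.util.concurrent.TimeoutException if the wait expires.
     */
    static <T> T callWithTimeout(Callable<T> task, Long timeoutMs) throws Exception {
        if (timeoutMs != null && timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            if (timeoutMs == null) {
                return future.get();
            }
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupts the worker thread; any remote job keeps running.
            executor.shutdownNow();
        }
    }
}
```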
-
build
public TableLoadConfig build()
- Returns:
- A new TableLoadConfig with the current settings.
-
-