package util
Type Members
- class ArrayBasedMapBuilder extends Serializable
A builder of ArrayBasedMapData, which fails if a null map key is detected, and removes duplicate map keys using a last-wins policy.
- class ArrayBasedMapData extends MapData
A simple MapData implementation backed by two arrays. Note that the user is responsible for guaranteeing that the key array contains no duplicate elements; otherwise the behavior is undefined.
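As a rough illustration of the two-array layout described for ArrayBasedMapData (this is a toy sketch, not Spark's class: the name TwoArrayMap and its members are invented here), keys(i) pairs with values(i) and lookup is a linear scan:

```scala
// A minimal sketch of a map backed by two parallel arrays, as
// ArrayBasedMapData is: keys(i) pairs with values(i). Lookup is a
// linear scan; correctness assumes the key array has no duplicates.
final class TwoArrayMap[K, V](keys: Array[K], values: Array[V]) {
  require(keys.length == values.length, "key and value arrays must align")

  def get(key: K): Option[V] = {
    val i = keys.indexOf(key)
    if (i >= 0) Some(values(i)) else None
  }

  def numElements: Int = keys.length
}

val m = new TwoArrayMap(Array("a", "b"), Array(1, 2))
```

With duplicate keys, the scan would always return the first match, which is why the real class documents the behavior as undefined.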
- abstract class ArrayData extends SpecializedGetters with Serializable
- class ArrayDataIndexedSeq[T] extends IndexedSeq[T]
Implements an IndexedSeq interface for ArrayData. Note that if the original ArrayData is a primitive array and contains null elements, it is better to ask for IndexedSeq[Any] instead of IndexedSeq[Int], in order to keep the null elements.
- case class BadRecordException(record: () ⇒ UTF8String, partialResult: () ⇒ Option[InternalRow], cause: Throwable) extends Exception with Product with Serializable
Exception thrown when the underlying parser meets a bad record and cannot parse it.
- record
a function that returns the record that caused the parser to fail
- partialResult
a function that returns an optional row, which is the partial result of parsing this bad record.
- cause
the actual exception describing why the record is bad and cannot be parsed.
- class CaseInsensitiveMap[T] extends Map[String, T] with Serializable
Builds a map in which keys are case-insensitive. The input map can be accessed for cases where case-sensitive information is required. The primary constructor is marked private to avoid nested case-insensitive map creation; otherwise, the keys in the original map would become case-insensitive. Note: CaseInsensitiveMap is serializable; however, after a transformation such as filterKeys(), it may no longer be serializable.
- sealed trait DateFormatter extends Serializable
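The case-insensitive-keys idea behind CaseInsensitiveMap can be sketched in a few lines (a hedged illustration, not Spark's implementation; the name CIMap is invented here): normalize keys on the way in and on lookup, while keeping the original map for case-sensitive access.

```scala
import java.util.Locale

// Toy sketch of a case-insensitive map: keys are compared by lower-casing
// with a fixed locale, and the original map is retained for callers that
// need the case-sensitive spelling of the keys.
final class CIMap[T](val originalMap: Map[String, T]) {
  private val lowered: Map[String, T] =
    originalMap.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  def get(key: String): Option[T] = lowered.get(key.toLowerCase(Locale.ROOT))
  def contains(key: String): Boolean = lowered.contains(key.toLowerCase(Locale.ROOT))
}

val ci = new CIMap(Map("Path" -> "/tmp", "Mode" -> "overwrite"))
```

Using a fixed locale for the lower-casing avoids surprises such as the Turkish dotless-i when the JVM default locale changes.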
- trait DateTimeFormatterHelper extends AnyRef
- class FailureSafeParser[IN] extends AnyRef
- class FractionTimestampFormatter extends Iso8601TimestampFormatter
The formatter parses/formats timestamps according to the pattern yyyy-MM-dd HH:mm:ss.[..fff..], where [..fff..] is a fraction of a second up to microsecond resolution. The formatter does not output trailing zeros in the fraction. For example, the timestamp 2019-03-05 15:00:01.123400 is formatted as the string 2019-03-05 15:00:01.1234.
- class GenericArrayData extends ArrayData
- class HyperLogLogPlusPlusHelper extends Serializable
- class Iso8601DateFormatter extends DateFormatter with DateTimeFormatterHelper
- class Iso8601TimestampFormatter extends TimestampFormatter with DateTimeFormatterHelper
- trait LegacyDateFormatter extends DateFormatter
- class LegacyFastDateFormatter extends LegacyDateFormatter
The legacy formatter is based on Apache Commons FastDateFormat. The formatter uses the default JVM time zone intentionally, for compatibility with Spark 2.4 and earlier versions. Note: use of the default JVM time zone makes the formatter compatible with the legacy DateTimeUtils methods toJavaDate and fromJavaDate, which are based on the default JVM time zone too.
- class LegacyFastTimestampFormatter extends TimestampFormatter
- class LegacySimpleDateFormatter extends LegacyDateFormatter
The legacy formatter is based on java.text.SimpleDateFormat. The formatter uses the default JVM time zone intentionally, for compatibility with Spark 2.4 and earlier versions. Note: use of the default JVM time zone makes the formatter compatible with the legacy DateTimeUtils methods toJavaDate and fromJavaDate, which are based on the default JVM time zone too.
- class LegacySimpleTimestampFormatter extends TimestampFormatter
- abstract class MapData extends Serializable
This is an internal data representation for the map type in Spark SQL. It should not implement equals and hashCode because the type cannot be used as join keys, grouping keys, or in equality tests. See SPARK-9415 and PR#13847 for the discussions.
- class MicrosCalendar extends GregorianCalendar
This custom subclass of GregorianCalendar is needed to get access to the protected fields immediately after parsing. We cannot use the get() method because it performs normalization of the fraction part; accordingly, the MILLISECOND field would not contain the original value. This class also allows setting a raw value on the MILLISECOND field directly before formatting.
- sealed trait ParseMode extends AnyRef
- case class PartialResultException(partialResult: InternalRow, cause: Throwable) extends Exception with Product with Serializable
Exception thrown when the underlying parser returns a partial result of parsing.
- partialResult
the partial result of parsing a bad record.
- cause
the actual exception describing why the parser cannot return a full result.
- class QuantileSummaries extends Serializable
Helper class to compute an approximate quantile summary. This implementation is based on the algorithm proposed in the paper "Space-efficient Online Computation of Quantile Summaries" by Michael Greenwald and Sanjeev Khanna (https://doi.org/10.1145/375663.375670).
In order to optimize for speed, it maintains an internal buffer of the last seen samples, and only inserts them after crossing a certain size threshold. This guarantees a near-constant runtime complexity compared to the original algorithm.
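The buffering strategy described above can be illustrated with a deliberately naive sketch (this is a toy, not the Greenwald-Khanna algorithm with its per-sample error bounds, and the name ToyQuantiles is invented here): samples accumulate in a buffer and are merged into the sorted summary only once the buffer crosses a size threshold.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy illustration of the buffer-then-merge strategy: inserts are O(1)
// appends; the sort cost is paid only when the buffer is flushed.
final class ToyQuantiles(bufferSize: Int = 1000) {
  private var summary = Vector.empty[Double]          // kept sorted
  private val buffer = ArrayBuffer.empty[Double]

  def insert(x: Double): Unit = {
    buffer += x
    if (buffer.size >= bufferSize) flush()
  }

  private def flush(): Unit = {
    summary = (summary ++ buffer).sorted
    buffer.clear()
  }

  // Approximate quantile by rank in the merged, sorted samples.
  def query(q: Double): Double = {
    flush()
    val rank = math.min(summary.size - 1, (q * summary.size).toInt)
    summary(rank)
  }
}

val qs = new ToyQuantiles()
(1 to 100).foreach(i => qs.insert(i.toDouble))
```

The real class additionally compresses the summary so that memory stays bounded while each retained sample carries a rank-error guarantee.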
- case class RandomIndicesGenerator(randomSeed: Long) extends Product with Serializable
This class is used to generate random indices of a given length. The implementation uses the "inside-out" version of the Fisher-Yates shuffle. Reference: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_%22inside-out%22_algorithm
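The "inside-out" Fisher-Yates variant referenced above builds a random permutation of 0 until n in a single forward pass. A seeded sketch (the function name randomIndices is illustrative, not Spark's API):

```scala
import scala.util.Random

// Inside-out Fisher-Yates: for each new element i, pick a slot j in
// 0..i, move the element currently at j to slot i, and place i at j.
def randomIndices(n: Int, seed: Long): Array[Int] = {
  val rng = new Random(seed)
  val out = new Array[Int](n)
  for (i <- 0 until n) {
    val j = rng.nextInt(i + 1)   // 0 <= j <= i
    out(i) = out(j)              // displaced element moves to slot i
    out(j) = i                   // new element i lands at slot j
  }
  out
}

val perm = randomIndices(10, seed = 42L)
```

Seeding makes the permutation reproducible, which matches the case class taking a randomSeed parameter.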
- case class RandomUUIDGenerator(randomSeed: Long) extends Product with Serializable
This class is used to generate a UUID from pseudo-random numbers. For the algorithm, see RFC 4122, "A Universally Unique IDentifier (UUID) URN Namespace", section 4.4, "Algorithms for Creating a UUID from Truly Random or Pseudo-Random Numbers".
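The RFC 4122 section 4.4 construction can be sketched as follows (a hedged illustration of the standard technique, not Spark's exact code; the function name uuidFromSeed is invented here): draw 128 pseudo-random bits, then force the version nibble to 4 and the variant bits to the IETF pattern 10.

```scala
import java.util.UUID
import scala.util.Random

// Build a version-4 UUID from a seeded PRNG per RFC 4122 section 4.4:
// all bits random except the version (4 bits) and variant (2 bits).
def uuidFromSeed(seed: Long): UUID = {
  val rng = new Random(seed)
  var msb = rng.nextLong()
  var lsb = rng.nextLong()
  msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L  // version = 4
  lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L  // variant = IETF (10)
  new UUID(msb, lsb)
}

val u = uuidFromSeed(7L)
```

Because the generator is seeded, the same seed reproduces the same UUID, which is useful for deterministic testing.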
- class StringKeyHashMap[T] extends AnyRef
- sealed trait TimestampFormatter extends Serializable
Value Members
- def escapeSingleQuotedString(str: String): String
- def fileToString(file: File, encoding: Charset = UTF_8): String
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
Attributes: protected. Definition Classes: Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
Attributes: protected. Definition Classes: Logging
- def isTraceEnabled(): Boolean
Attributes: protected. Definition Classes: Logging
- def log: Logger
Attributes: protected. Definition Classes: Logging
- def logDebug(msg: ⇒ String, throwable: Throwable): Unit
Attributes: protected. Definition Classes: Logging
- def logDebug(msg: ⇒ String): Unit
Attributes: protected. Definition Classes: Logging
- def logError(msg: ⇒ String, throwable: Throwable): Unit
Attributes: protected. Definition Classes: Logging
- def logError(msg: ⇒ String): Unit
Attributes: protected. Definition Classes: Logging
- def logInfo(msg: ⇒ String, throwable: Throwable): Unit
Attributes: protected. Definition Classes: Logging
- def logInfo(msg: ⇒ String): Unit
Attributes: protected. Definition Classes: Logging
- def logName: String
Attributes: protected. Definition Classes: Logging
- def logTrace(msg: ⇒ String, throwable: Throwable): Unit
Attributes: protected. Definition Classes: Logging
- def logTrace(msg: ⇒ String): Unit
Attributes: protected. Definition Classes: Logging
- def logWarning(msg: ⇒ String, throwable: Throwable): Unit
Attributes: protected. Definition Classes: Logging
- def logWarning(msg: ⇒ String): Unit
Attributes: protected. Definition Classes: Logging
- def quietly[A](f: ⇒ A): A
Silences output to stderr or stdout for the duration of f.
- def quoteIdentifier(name: String): String
- def resourceToBytes(resource: String, classLoader: ClassLoader = Utils.getSparkClassLoader): Array[Byte]
- def resourceToString(resource: String, encoding: String = UTF_8.name(), classLoader: ClassLoader = Utils.getSparkClassLoader): String
- def sideBySide(left: Seq[String], right: Seq[String]): Seq[String]
- def sideBySide(left: String, right: String): Seq[String]
- def stackTraceToString(t: Throwable): String
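A helper like stackTraceToString above is conventionally implemented by printing the trace into an in-memory writer (a sketch of the common technique, not necessarily Spark's exact code):

```scala
import java.io.{PrintWriter, StringWriter}

// Render a Throwable's full stack trace as a String by printing it
// into a StringWriter through a PrintWriter.
def stackTraceToString(t: Throwable): String = {
  val sw = new StringWriter()
  t.printStackTrace(new PrintWriter(sw, true))
  sw.toString
}

val trace = stackTraceToString(new IllegalStateException("boom"))
```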
- def stringToFile(file: File, str: String): File
- def toPrettySQL(e: Expression): String
- def truncatedString[T](seq: Seq[T], sep: String, maxFields: Int): String
Shorthand for calling truncatedString() without start or end strings.
- def truncatedString[T](seq: Seq[T], start: String, sep: String, end: String, maxFields: Int): String
Format a sequence with semantics similar to calling .mkString(). Any elements beyond maxNumToStringFields will be dropped and replaced by a "... N more fields" placeholder.
- returns
the trimmed and formatted string.
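The semantics described for truncatedString can be sketched in plain Scala (an illustration of the described behavior, not Spark's exact implementation; the name truncated is used here to avoid implying it is the real function):

```scala
// mkString-like formatting that keeps at most maxFields elements and
// replaces the rest with a "... N more fields" placeholder.
def truncated[T](seq: Seq[T], start: String, sep: String, end: String,
                 maxFields: Int): String = {
  if (seq.length <= maxFields) seq.mkString(start, sep, end)
  else {
    val dropped = seq.length - maxFields
    seq.take(maxFields).mkString(start, sep, s"$sep... $dropped more fields$end")
  }
}

val s = truncated(Seq("a", "b", "c", "d"), "[", ", ", "]", maxFields = 2)
```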
- def usePrettyExpression(e: Expression): Expression
- object ArrayBasedMapData extends Serializable
- object ArrayData extends Serializable
- object CaseInsensitiveMap extends Serializable
- object CompressionCodecs
- object DataTypeJsonUtils
- object DateFormatter extends Serializable
- object DateTimeUtils
Helper functions for converting between internal and external date and time representations. Dates are exposed externally as java.sql.Date and are represented internally as the number of days since the Unix epoch (1970-01-01). Timestamps are exposed externally as java.sql.Timestamp and are stored internally as longs, which are capable of storing timestamps with microsecond precision.
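The two internal representations described above can be computed directly with java.time (a sketch of the encoding, not DateTimeUtils' actual API; the function names here are illustrative):

```scala
import java.time.{Instant, LocalDate}
import java.time.temporal.ChronoUnit

// Dates as a count of days since 1970-01-01; timestamps as a count of
// microseconds since the epoch. Both fit comfortably in a Long.
def daysSinceEpoch(d: LocalDate): Long =
  ChronoUnit.DAYS.between(LocalDate.EPOCH, d)

def microsSinceEpoch(i: Instant): Long =
  ChronoUnit.MICROS.between(Instant.EPOCH, i)

val days = daysSinceEpoch(LocalDate.of(1970, 1, 2))
val micros = microsSinceEpoch(Instant.parse("1970-01-01T00:00:01Z"))
```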
- object DropMalformedMode extends ParseMode with Product with Serializable
This mode ignores whole corrupted records.
- object FailFastMode extends ParseMode with Product with Serializable
This mode throws an exception when it encounters corrupted records.
- object HyperLogLogPlusPlusHelper extends Serializable
Constants used in the implementation of the HyperLogLogPlusPlus aggregate function. See the appendix of "HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm" (https://docs.google.com/document/d/1gyjfMHy43U9OWBXxfaeG-3MjGzejW1dlpyMwEYAAWEI/view?fullscreen) for more information.
- object IntervalUtils
- object LegacyDateFormats extends Enumeration
- object NumberConverter
- object ParseMode extends Logging
- object PermissiveMode extends ParseMode with Product with Serializable
This mode permissively parses the records.
- object QuantileSummaries extends Serializable
- object RebaseDateTime
The collection of functions for rebasing days and microseconds from/to the hybrid calendar (Julian plus Gregorian since 1582-10-15), which is used by Spark 2.4 and earlier versions, to/from the Proleptic Gregorian calendar, which is used by Spark since version 3.0. See SPARK-26651.
- object StringKeyHashMap
Builds a map with String keys, supporting either case-sensitive or case-insensitive key lookup.
- object StringUtils extends Logging
- object TimestampFormatter extends Serializable
- object TypeUtils
Functions to help with checking for valid data types and value comparison of various types.