Tuples
Inspired by ScalaTuplesInKotlin, the API introduces many helper extension functions to make working with Scala Tuples a breeze in your Kotlin Spark projects. While working with data classes is encouraged, Scala Tuples are recommended for pair-like Datasets / RDDs / DStreams, both for the useful helper functions and for Spark performance. To enable these features, simply add
import org.jetbrains.kotlinx.spark.api.tuples.*
to the start of your file.
Tuples can be created in the following ways:
val a: Tuple2<Int, Long> = tupleOf(1, 2L)
val b: Tuple3<String, Double, Int> = t("test", 1.0, 2)
val c: Tuple3<Float, String, Int> = 5f X "aaa" X 1
NOTE: While the X method is the quickest way to create a tuple, some caution is necessary. Applied to an existing tuple, X appends to it instead of nesting it, so
tupleOf(1) X 2 != tupleOf(tupleOf(1), 2)
but, due to the way the infix method works:
tupleOf(1) X 2 == tupleOf(1, 2)
Tuples can be expanded and merged like this:
// expand
tupleOf(1, 2).appendedBy(3) == tupleOf(1, 2, 3)
tupleOf(1, 2) + 3 == tupleOf(1, 2, 3)
tupleOf(2, 3).prependedBy(1) == tupleOf(1, 2, 3)
1 + tupleOf(2, 3) == tupleOf(1, 2, 3)
// merge
tupleOf(1, 2) concat tupleOf(3, 4) == tupleOf(1, 2, 3, 4)
tupleOf(1, 2) + tupleOf(3, 4) == tupleOf(1, 2, 3, 4)
// extend tuple instead of merging with it
tupleOf(1, 2).appendedBy(tupleOf(3, 4)) == tupleOf(1, 2, tupleOf(3, 4))
tupleOf(1, 2) + tupleOf(tupleOf(3, 4)) == tupleOf(1, 2, tupleOf(3, 4))
NOTE: Prepending a tuple with a String might result in unexpected behavior like this, since String has the operator fun plus(other: Any?):
"some string" + tupleOf(1, 2) == "some string(1,2)"
In these cases you can turn to
tupleOf(1, 2).prependedBy("some string") == tupleOf("some string", 1, 2)
The concept of EmptyTuple from Scala 3 is also already present:
tupleOf(1).dropLast() == tupleOf() == emptyTuple() == EmptyTuple
Finally, all these tuple helper functions are also baked in:
- componentX()
  - for destructuring: val (a, b) = tuple
- contains(x)
  - for if (x in tuple) { ... }
- iterator()
  - for for (x in tuple) { ... }
  - generalizes types to smallest common ancestor
- asIterable()
  - generalizes types to smallest common ancestor
- size
- get(n) / get(i..j)
  - for tuple[1] / tuple[i..j]
  - returns single item or list of items
  - generalizes types to smallest common ancestor
  - can throw IndexOutOfBoundsException
- getOrNull(n) / getOrNull(i..j)
  - same as get(n), but returns null instead of throwing an exception
- getAs<T>(n) / getAs<T>(i..j)
  - returns a single item or list of items cast to T
  - can throw ClassCastException and IndexOutOfBoundsException
- getAsOrNull<T>(n) / getAsOrNull<T>(i..j)
  - same as getAs<T>(n), but returns null instead of throwing an exception
- copy(_1 = ..., _5 = ...)
  - similar to data classes, this returns a copy of the Tuple with only the provided arguments replaced
- first() / last()
- _1, _6 etc. (instead of _1(), _6())
- zip
  - zips two tuples into one large Tuple of Tuple2s
  - is infix
  - on different sizes, the smallest size is kept
- dropLast() / dropFirst()
  - returns a new tuple without the last or first element
  - same as dropLast1() / dropFirst1()
- dropN() / dropLastN()
  - returns a new tuple with the first or last N elements dropped
  - used like drop11()
  - drop0() simply copies the tuple
  - returns EmptyTuple if all elements are dropped
- takeN() / takeLastN()
  - returns a new tuple containing only the first or last N elements
  - used like take11()
  - take0() simply returns EmptyTuple
- splitAtN()
  - returns a Tuple2 with the original split at position N
  - for:
    val a: Tuple3<Int, Double, String> = tupleOf(1, 2.0, "3.0")
    val (c: Tuple2<Int, Double>, d: Tuple1<String>) = a.splitAt2()
  - can also return EmptyTuple as one of the parts, e.g. for splitAt0()
- map
  - generalizes types to smallest common ancestor
  - can be used to convert all values in a tuple at once
- cast
  - used to cast contents of a tuple
  - used like tuple.cast<Int, String, Int>()
  - can throw ClassCastException
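
The two NOTEs earlier on this page (X appending to a tuple instead of nesting it, and a String on the left of + swallowing the tuple) both follow from Kotlin's resolution rules rather than from anything Spark-specific. The sketch below reproduces them with the standard library only. It is an illustration under stated assumptions, not the library's actual code: Kotlin's Pair and Triple stand in for Scala tuples, and the X and plus functions defined here are hypothetical re-implementations written for this example.

```kotlin
// Hypothetical stand-ins for the library's X, built on Kotlin's Pair/Triple
// instead of Scala tuples. This is NOT the library's implementation.
infix fun <A, B> A.X(other: B): Pair<A, B> = Pair(this, other)

// More specific receiver: if the left side is already a Pair, append to it.
infix fun <A, B, C> Pair<A, B>.X(other: C): Triple<A, B, C> =
    Triple(first, second, other)

// Hypothetical "prepend" operator, analogous to `1 + tupleOf(2, 3)`.
operator fun <T, A, B> T.plus(pair: Pair<A, B>): Triple<T, A, B> =
    Triple(this, pair.first, pair.second)

fun main() {
    // Infix calls group left to right: (1 X 2) X 3.
    // The second call resolves to the Pair overload, so the result is flat.
    val appended = 1 X 2 X 3
    println(appended == Triple(1, 2, 3)) // true

    // Nesting has to be requested explicitly.
    val nested = Pair(Pair(1, 2), 3)
    println(nested) // ((1, 2), 3)

    // Int has no member plus that accepts a Pair, so the extension is used.
    println(1 + Pair(2, 3)) // (1, 2, 3)

    // String has a member plus(other: Any?), and members win over extensions,
    // so this is plain string concatenation.
    println("some string" + Pair(1, 2)) // some string(1, 2)
}
```

Kotlin's Pair prints as (1, 2), with a space, whereas a Scala Tuple2 prints as (1,2); the mechanism is the same. This is why prependedBy is the safer choice when the prepended value is a String.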