Kotlin Sequences: Getting Started
In this Kotlin Sequences tutorial, you’ll learn what a sequence is, its operators and when you should consider using them instead of collections. By Ricardo Costeira.
Using Sequence Operators
Sequences have two kinds of operators:
- Intermediate operators: Operators used to build the sequence.
- Terminal operators: Operators used to execute the operations the sequence was built with.
You'll learn about intermediate operators first.
Intermediate Operators
To start understanding how operators work, write that last sequence in your scratch file:
val naturalNumbersUpToTwoHundredMillion =
    generateSequence(seed = 1) { previousNumber ->
        if (previousNumber < 200_000_000) {
            previousNumber + 1
        } else {
            null
        }
    }
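To see how the seed and the null return value bound the sequence, here's a minimal sketch with a much smaller limit. The naturalNumbersUpTo helper is a hypothetical name used only for illustration; it isn't part of the tutorial's project.

```kotlin
// Hypothetical helper for illustration only: same shape as the tutorial's
// sequence, but with a configurable upper limit.
fun naturalNumbersUpTo(limit: Int): Sequence<Int> =
    generateSequence(seed = 1) { previousNumber ->
        // Returning null tells generateSequence that there are no more elements.
        if (previousNumber < limit) previousNumber + 1 else null
    }

fun main() {
    // The seed is the first element; each following element is computed
    // from the previous one until the lambda returns null.
    println(naturalNumbersUpTo(5).toList()) // [1, 2, 3, 4, 5]
}
```

Calling toList() here is safe only because the limit is tiny. Doing the same on the two-hundred-million version would try to hold every element in memory.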
Now, build a new sequence from it by adding two intermediate operators. You'll probably recognize these, as sequences and collections have a lot of similar operators:
val firstHundredEvenNaturalNumbers = naturalNumbersUpToTwoHundredMillion
    .filter { number -> number % 2 == 0 } // 1
    .take(100) // 2
In this code, you:
1. Filter the elements by their parity, accepting only the even ones, i.e., the ones divisible by two.
2. Take the first 100 elements, discarding the rest.
As mentioned before, sequences process their operations one element at a time. In other words, filter starts by operating on the first number, 1, and discards it since it's not divisible by two. Then, it operates on 2, letting it proceed to take, as 2 is an even number. The operations keep going until the element operated on is 200 since, in the [1, 200_000_000] interval, 200 is the hundredth even number. At that point, neither take nor filter handles any more elements.
This might get confusing to read, so here's a visualization of what's happening:
Thanks to take(100), 200,000,000 never gets operated on, and neither does any other number after 200.
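The element-by-element order can also be traced in code. The sketch below is an illustration rather than part of the tutorial's project: it uses a smaller take(2) and a hypothetical traceFirstTwoEvenNumbers function that logs each step, so you can see each element pass through filter and reach the end of the chain before the next element is touched.

```kotlin
// Hypothetical helper: records every step so the processing order
// can be inspected afterwards.
fun traceFirstTwoEvenNumbers(): List<String> {
    val log = mutableListOf<String>()
    generateSequence(1) { it + 1 } // unbounded sequence: 1, 2, 3, ...
        .filter { number ->
            log.add("filter($number)")
            number % 2 == 0
        }
        .take(2)
        .forEach { number -> log.add("forEach($number)") }
    return log
}

fun main() {
    // Prints, one per line:
    // filter(1), filter(2), forEach(2), filter(3), filter(4), forEach(4)
    traceFirstTwoEvenNumbers().forEach(::println)
}
```

Note that filter(5) never appears in the log: once take has delivered its second element, nothing further is requested from filter, even though the source sequence is unbounded.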
As you'll notice in your scratch file, firstHundredEvenNaturalNumbers isn't actually outputting any values yet. In fact, the scratch file just shows its type, a Sequence<Int>, rather than any elements.
As you may suspect already, you still need a terminal operator to output the sequence's result.
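One way to convince yourself that nothing runs before a terminal operator is to count how often the filter predicate is called. This is a minimal sketch; predicateCallsBeforeAndAfter is a hypothetical name introduced for the example.

```kotlin
// Hypothetical helper: returns the number of predicate calls observed
// before and after a terminal operator runs.
fun predicateCallsBeforeAndAfter(): Pair<Int, Int> {
    var predicateCalls = 0

    val firstThreeEvenNumbers = generateSequence(1) { it + 1 }
        .filter { number ->
            predicateCalls++
            number % 2 == 0
        }
        .take(3)

    // Only intermediate operators so far: the predicate hasn't run at all.
    val before = predicateCalls

    // toList() is a terminal operator, so this line triggers the work.
    firstThreeEvenNumbers.toList()
    val after = predicateCalls

    return before to after
}

fun main() {
    println(predicateCallsBeforeAndAfter()) // (0, 6)
}
```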
Terminal Operators
Terminal operators can take many forms. Some, like toList() or toSet(), can output the sequence results as a collection. Others, like first() or sum(), output a single value.
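As a quick illustration of the four operators just mentioned, here's a small sketch on a hand-written sequence. The numbers value is an assumption made for the example.

```kotlin
// A small, re-iterable sequence used only for this example.
val numbers: Sequence<Int> = sequenceOf(3, 1, 3, 2)

fun main() {
    // Terminal operators that produce a collection.
    val asList: List<Int> = numbers.toList() // [3, 1, 3, 2]
    val asSet: Set<Int> = numbers.toSet()    // [3, 1, 2]

    // Terminal operators that produce a single value.
    val firstNumber: Int = numbers.first()   // 3
    val total: Int = numbers.sum()           // 9

    println("$asList $asSet $firstNumber $total")
}
```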
There are a lot of terminal operators, but there's an easy way to identify them without having to dig into the implementation or documentation.
Back in your scratch file, just below take(100), start typing the map operator. As you type, Android Studio will pop up code completion. If you look at the suggestions, you'll see that map has the return type of Sequence<R>, with R being the type of the elements that map produces.
Now, delete it! Delete the map you just typed. And in its place, start typing the forEach terminal operator. When code completion pops up, notice the return type of forEach.
Unlike map, forEach doesn't return a Sequence. Which makes sense, right? It's a terminal operator, after all. So, long story short, that's how you can distinguish them at a glance:
- Intermediate operators always return a Sequence.
- Terminal operators never return a Sequence.
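If you'd rather see the rule in code than in the completion popup, explicit type annotations make it visible. This is just a sketch; the doubled function is a hypothetical name for the example.

```kotlin
// Intermediate operator: map returns another Sequence, so nothing runs yet.
fun doubled(numbers: Sequence<Int>): Sequence<Int> =
    numbers.map { number -> number * 2 }

fun main() {
    val numbers: Sequence<Int> = sequenceOf(1, 2, 3)
    val stillASequence: Sequence<Int> = doubled(numbers)

    // Terminal operators: none of these return a Sequence.
    val asList: List<Int> = stillASequence.toList()
    val howMany: Int = stillASequence.count()
    val nothing: Unit = stillASequence.forEach { number -> println(number) }

    println("$asList $howMany $nothing")
}
```

The annotations are redundant for the compiler, but they'd fail to compile if any of the declared types were wrong, which makes them a handy check while you're learning the operators.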
You now know how to build a sequence and output its result. So, now it's time to try it out! Finish that terminal operator you were just writing by printing each element with it. In the end, you should have something like:
val firstHundredEvenNaturalNumbers = naturalNumbersUpToTwoHundredMillion
    .filter { number -> number % 2 == 0 }
    .take(100)
    .forEach { number -> println(number) }
You'll see the result printed on the top right side of the scratch file.
If you expand it, you'll see that it printed every even number up to 200.
Just like with collections, operator order is important in sequences. For instance, swap take with filter, like so:
val firstHundredEvenNaturalNumbers = naturalNumbersUpToTwoHundredMillion
    .take(100)
    .filter { number -> number % 2 == 0 }
    .forEach { number -> println(number) }
Note: If you cut the take(100) line, intending to paste it back later, the IDE will run the code from the scratch file right away, and it'll take a while before you get any results. This is because forEach is a terminal operator, so without take it'll iterate over all two hundred million elements.
After a few seconds, the scratch file should run your code again. Expand it, and you'll see that it has printed every even number up to 100. Since take is running first, filter only gets to operate on the first 100 natural numbers, starting from one.
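To compare the two orderings side by side without printing every number, here's a sketch that collects the results into lists instead. The naturalNumbers value and both function names are hypothetical, chosen for the example.

```kotlin
// Unbounded sequence of natural numbers; safe here because both
// chains below are cut short by take(100).
val naturalNumbers: Sequence<Int> = generateSequence(1) { it + 1 }

// Keeps even numbers first, then stops after 100 of them.
fun filterThenTake(): List<Int> =
    naturalNumbers.filter { it % 2 == 0 }.take(100).toList()

// Stops after the first 100 numbers, then keeps the even ones among them.
fun takeThenFilter(): List<Int> =
    naturalNumbers.take(100).filter { it % 2 == 0 }.toList()

fun main() {
    println(filterThenTake().size)   // 100
    println(filterThenTake().last()) // 200
    println(takeThenFilter().size)   // 50
    println(takeThenFilter().last()) // 100
}
```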
Now that you've played around with sequences a bit, all that's left is to address the elephant in the room: When should you use sequences?
Sequences vs. Collections
You now know how to build and use sequences. But when should you use them instead of collections? Should you use them at all?
This can be quickly answered with one of the most famous sayings in software development: It depends. :]
The long answer is a bit more complex. It always depends on your use case. In fact, to be really sure, you should always measure both implementations to check which one is faster. However, knowing about a few quirks surrounding sequences will also help you make a better-informed decision.
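As a rough idea of what measuring both implementations could look like, here's a naive timing sketch. The workload and both function names are made up for the example, and a single measureTimeMillis call is only a coarse signal: JVM warm-up and garbage collection can skew it, so a proper benchmarking harness is the safer choice for real decisions.

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical workload: sum of the squares that are divisible by three.
fun sumWithList(numbers: List<Int>): Long =
    numbers
        .map { it.toLong() * it }
        .filter { it % 3 == 0L }
        .sum()

// Same workload, but evaluated lazily through a sequence.
fun sumWithSequence(numbers: List<Int>): Long =
    numbers.asSequence()
        .map { it.toLong() * it }
        .filter { it % 3 == 0L }
        .sum()

fun main() {
    val numbers = (1..1_000_000).toList()
    var listResult = 0L
    var sequenceResult = 0L

    val listMillis = measureTimeMillis { listResult = sumWithList(numbers) }
    val sequenceMillis = measureTimeMillis { sequenceResult = sumWithSequence(numbers) }

    // Timings vary between runs and machines; only the results must match.
    println("List: $listMillis ms, sequence: $sequenceMillis ms")
    println("Same result: ${listResult == sequenceResult}")
}
```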
Element Operation Order
In case you have the memory of a goldfish, remember that sequences operate on one element at a time. Collections, on the other hand, execute each operation for the whole collection before proceeding to the next operation. So, each collection operation creates an intermediate collection with its results, which the next operation then works on:
val list = naturalNumbersUpToTwoHundredMillion
    .toList()
    .filter { number -> number % 2 == 0 }
    .take(100)
    .forEach { number -> println(number) }
In the code above, filter would create a new list, then take would operate on that list, creating a new one of its own, and so on and so forth. That's a lot of wasted work! Especially since you're only taking 100 elements in the end. There's absolutely no need to bother with the elements after the hundredth one.
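The wasted work can be made visible by counting predicate calls. The sketch below uses a much smaller range of 1,000 numbers so it runs instantly; both function names are hypothetical.

```kotlin
// Eager: filter runs over the whole list before take sees anything.
fun predicateCallsWithList(): Int {
    var calls = 0
    (1..1_000).toList()
        .filter { number -> calls++; number % 2 == 0 }
        .take(100)
    return calls
}

// Lazy: filter only runs until take has received its 100 elements.
fun predicateCallsWithSequence(): Int {
    var calls = 0
    (1..1_000).asSequence()
        .filter { number -> calls++; number % 2 == 0 }
        .take(100)
        .toList()
    return calls
}

fun main() {
    println(predicateCallsWithList())     // 1000
    println(predicateCallsWithSequence()) // 200
}
```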
Sequences effectively avoid computing intermediate results, being able to outperform collections in cases like this one. However, it's not all roses and unicorns.
Each intermediate operation you add introduces some overhead, because each one involves creating a new function object to store the transformation to be executed later. For datasets that aren't large enough, or in cases where you don't need that many operations, this overhead can be problematic. It may even outweigh the gains from avoiding intermediate results.
To better understand where this overhead comes from, look at filter's implementation:
public fun <T> Sequence<T>.filter(predicate: (T) -> Boolean): Sequence<T> {
    return FilteringSequence(this, true, predicate)
}
Note: You can't navigate to the implementation of filter from the scratch file. If you try, the IDE will show you a decompiled .class file. For that reason, the final project has a Sequences.kt file with all the tutorial code, where you can easily check the inner workings of sequences. Alternatively, you can check the JetBrains source code.
That FilteringSequence is a Sequence of its own. It wraps the Sequence you call filter on. In other words, each intermediate operator creates a new Sequence object that decorates the previous Sequence. In the end, you're left with at least as many objects as intermediate operators, all wrapped around each other.
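To make the decorator idea concrete, here's a simplified sketch of what such a wrapper could look like. It is not the standard library's FilteringSequence: SimpleFilteringSequence and simpleFilter are hypothetical names, and the real implementation differs in its details.

```kotlin
// Hypothetical, simplified stand-in for a filtering decorator.
// It wraps a source sequence and only lets matching elements through.
class SimpleFilteringSequence<T>(
    private val source: Sequence<T>,
    private val predicate: (T) -> Boolean
) : Sequence<T> {

    override fun iterator(): Iterator<T> = object : Iterator<T> {
        private val sourceIterator = source.iterator()
        private var nextItem: T? = null
        private var hasNextItem = false

        // Pulls from the wrapped iterator until one element matches.
        private fun advance() {
            while (!hasNextItem && sourceIterator.hasNext()) {
                val candidate = sourceIterator.next()
                if (predicate(candidate)) {
                    nextItem = candidate
                    hasNextItem = true
                }
            }
        }

        override fun hasNext(): Boolean {
            advance()
            return hasNextItem
        }

        @Suppress("UNCHECKED_CAST")
        override fun next(): T {
            advance()
            if (!hasNextItem) throw NoSuchElementException()
            hasNextItem = false
            return nextItem as T
        }
    }
}

// Hypothetical operator: creates one new wrapper object per call.
fun <T> Sequence<T>.simpleFilter(predicate: (T) -> Boolean): Sequence<T> =
    SimpleFilteringSequence(this, predicate)

fun main() {
    val evens = sequenceOf(1, 2, 3, 4, 5, 6).simpleFilter { it % 2 == 0 }
    println(evens.toList()) // [2, 4, 6]
}
```

Chaining simpleFilter twice would produce two wrapper objects, the outer one pulling elements from the inner one, which is the layering described above.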
To complicate things a bit, not all intermediate operators limit themselves to just decorating the previous sequence. Some of them need to be aware of the sequence's state.