Java Streams — A quick primer and application
Introduction
Java is one of my favourite languages. Probably my favourite. Whenever I say this to anyone, I usually get the kind of response I’d expect if I’d said I enjoy watching paint dry. I get it. Java is a language that is kinda boring and generally not very fun to code in for most people. In a world where Python and JavaScript are the most popular programming languages, I can see why people don’t want to fight an archaic type system and try to work with features that the language certainly wasn’t designed for (looking at you, Java 9 modules). But there are excellent Java features that make coding in it a joy IMHO. One of them is Streams.
A Java Stream (not to be confused with Java I/O streams) is a view over a data source, like a Set or a List, that allows you to operate on its elements singly or in bulk without mutating the underlying structure. This of course is great when programming in a functional style, and Streams support the normal functional-style methods like reduce and map. Let’s look at an example to showcase how powerful this feature is.
Examples
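The discussion below starts from an ordinary index-based loop. As a minimal sketch of that kind of function, assuming a hypothetical printNames method and a made-up list of names (neither is taken from the original code):

```java
import java.util.List;

class IndexedLoopDemo {
    // Hypothetical helper: prints every name using a classic index-based loop.
    static void printNames(List<String> names) {
        // The index variable and the bounds check are both managed by hand,
        // which is where off-by-one mistakes tend to creep in.
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.get(i));
        }
    }

    public static void main(String[] args) {
        printNames(List.of("Ada", "Grace", "Linus"));
    }
}
```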
Let’s take this function as an example. It’s simple and fairly innocuous. You’ve probably seen loops like this all over the place. What if I told you this was extremely prone to bugs and old-fashioned by Java standards? Most people baulk when I say this to them. Isn’t Java old? It doesn’t do the nice things JavaScript or Python do with lists, right? Actually, Java has the enhanced for loop.
Let’s refactor this method to use it.
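A sketch of the same hypothetical printNames method rewritten with the enhanced for loop; the names and sample data are again illustrative assumptions:

```java
import java.util.List;

class EnhancedForDemo {
    // Hypothetical helper: same behaviour as an index-based loop,
    // but without a manual index or bounds check.
    static void printNames(List<String> names) {
        for (String name : names) {
            System.out.println(name);
        }
    }

    public static void main(String[] args) {
        printNames(List.of("Ada", "Grace", "Linus"));
    }
}
```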
This is definitely better, but still not great. Do we really need a for loop to iterate over the data in the list? In Java we have Iterators. These could work very well for a simple use case, but is it readable? We could use the method forEachRemaining(), which also accepts a lambda. The problem with this is that the iterator only goes in one direction, forward. This would mean we couldn’t reuse the iterator anywhere else and would have to create a new one from the collection. Iterators can also mutate the underlying collection (through remove()), meaning we could end up in an invalid state if the collection is reused without copying the original first. We just want to loop over the items in the list. We shouldn’t have to worry about making copies of things! Well, luckily, Streams address all of these problems. Using a Stream, we can express this function’s work in one line, and a pretty short one at that. We also do not touch the underlying collection; we only take each of its values in turn and pass it to a lambda.
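As a sketch of that one-liner, still using the hypothetical printNames method and sample names from nowhere in particular:

```java
import java.util.List;

class StreamForEachDemo {
    // Hypothetical helper: the whole loop collapses into one expression.
    // System.out::println is a method reference that receives each element.
    static void printNames(List<String> names) {
        names.stream().forEach(System.out::println);
    }

    public static void main(String[] args) {
        printNames(List.of("Ada", "Grace", "Linus"));
    }
}
```

For a bare forEach like this, names.forEach(System.out::println) would also work without a Stream; stream() is kept here because it is the entry point for chaining further operations.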
There are a couple of things to unpack here. First, every Collection in Java has a .stream() method that returns a Stream over the caller’s elements. From here, we can then access the items using one of the many different functions available to a Stream. Here is the documentation for the Stream interface.
There are a load of different methods available, some obvious, some not. Each of these generally accepts a lambda expression, and the value(s) is then bound to the argument(s) of this lambda. The interesting thing though, while not Stream related, is the fact that the value’s type is inferred. This means that the compiler will work out the type of the value for us during compilation. Then, by using a method reference to a method the value’s type responds to, we remove the need to name the item inside a lambda. This turns what was a for loop into a one-line method. The savings from using Streams only get better the more complex things get.
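To make the inference and method-reference points concrete, here is a minimal sketch that prints the length of each item in a list, written once with lambdas and once with method references. The class name, method names and sample list are hypothetical, not taken from the original code.

```java
import java.util.List;

class PrintLengthsDemo {
    // Lambda form: no type is declared for item or length;
    // the compiler infers String and Integer from the Stream's element type.
    static void printLengthsWithLambdas(List<String> items) {
        items.stream()
             .map(item -> item.length())
             .forEach(length -> System.out.println(length));
    }

    // Method-reference form: same behaviour, without naming the element at all.
    static void printLengths(List<String> items) {
        items.stream()
             .map(String::length)
             .forEach(System.out::println);
    }

    public static void main(String[] args) {
        printLengths(List.of("stream", "map", "collector")); // prints 6, 3, 9 on separate lines
    }
}
```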
Consider this example. The purpose of this code is to iterate over the list and print out the length of each item. This is a simple use case and hardly seems worth using a Stream over just looping normally. But the benefit comes when we want to extend this. Let’s say I only want to print items that have a length greater than 5. In the conventional (or enhanced) for loop example, we’d have to use an if statement or some other mechanism for checking the value. This is more LoC than we really need. Using a Stream, we can simply refactor the current code to:
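A sketch of that refactor, assuming the goal is to keep printing lengths but only for items longer than 5 characters; the method name and sample data are hypothetical:

```java
import java.util.List;

class FilterLengthsDemo {
    // Hypothetical helper: the length check becomes one extra filter step
    // instead of an if statement wrapped around the loop body.
    static void printLongLengths(List<String> items) {
        items.stream()
             .filter(item -> item.length() > 5)
             .map(String::length)
             .forEach(System.out::println);
    }

    public static void main(String[] args) {
        // "map" is dropped by the filter; the others print 6, 9 and 6.
        printLongLengths(List.of("stream", "map", "collector", "filter"));
    }
}
```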
We still do the same check, but see how much simpler it is. The declarative style of Streams makes them incredibly simple to work with and read. Now, extending this function to do anything else is likely to be a simple one-line change especially if we use functional interfaces (more on this another time).
One final thing to mention about Streams is that they come with a companion piece, Collectors. Collectors is a utility class of static methods, each of which builds a reduction you can pass to a Stream’s collect() method, one of the operations that terminate a Stream. Streams are similar to iterators in that they emit values until they are exhausted or terminated. However, an iterator emits a value each time the next() method is called, whereas a Stream is lazy: nothing is emitted until a terminal operation such as collect() is called, and from then on values flow through the pipeline until the source is exhausted. This normally isn’t a major problem, but in the cases that a Stream is infinite (we can generate these in a similar way to a list generator in Python), a terminal operation will never finish, and may eventually run out of memory, unless the Stream is short-circuited first, for example with limit(). Collectors do not bound a Stream for us, but they are how a finite Stream gets turned into a result. Check the documentation here to see what is available:
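A minimal sketch of a Collector in use, with a hypothetical totalLength method and sample list:

```java
import java.util.List;
import java.util.stream.Collectors;

class TotalLengthDemo {
    // Hypothetical helper: sums the length of every string in the list.
    // collect() is the terminal operation; Collectors.summingInt describes the reduction.
    static int totalLength(List<String> items) {
        return items.stream()
                    .collect(Collectors.summingInt(String::length));
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("stream", "map", "collector"))); // prints 18
    }
}
```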
The above example uses a Collector to resolve a value. The method returns the total of all the lengths of the strings in the list. I bet you knew that without having to be told though. The name of the Collector method is summingInt and it takes the method reference String::length. This code is clear to almost any programmer, and its succinctness makes it easy to digest and reduces the cognitive load of interpreting it. As showcased previously, extending this is also extremely simple. But be aware that the collect method terminates the Stream. Nothing else will be emitted from it other than the value collected.
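Because a Stream does no work until a terminal operation runs, an infinite Stream is safe to build as long as it is bounded before being collected. A small sketch, assuming Stream.iterate as the source (the text above does not say how the infinite Stream is generated) and limit() as the short-circuiting step; the method name is hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {
    // Hypothetical helper: Stream.iterate describes an endless sequence 1, 2, 4, 8, ...
    // limit(count) cuts it off, so collect() only ever sees count elements.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)
                     .limit(count)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // prints [1, 2, 4, 8, 16]
    }
}
```

Removing the limit() call would leave collect() consuming elements forever, which is the situation to avoid.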
Let’s apply what we know
Considering what we now know, let’s refactor some code in a real-life project to use Streams.
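The project’s actual code is not reproduced here, so the following is only a sketch of what a loop-based membership check on an enum might look like. The ExpenseCategory enum, its constants and the isValidCategory method are all hypothetical stand-ins.

```java
class EnumLoopDemo {
    // Hypothetical categories; the real enum in the app is not shown in the text.
    enum ExpenseCategory { FOOD, RENT, TRAVEL }

    // Hypothetical check: does the submitted value name one of the enum constants?
    static boolean isValidCategory(String value) {
        // Guard clause: do no work for missing or empty input.
        if (value == null || value.isEmpty()) {
            return false;
        }
        for (ExpenseCategory category : ExpenseCategory.values()) {
            if (category.name().equals(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValidCategory("FOOD")); // prints true
        System.out.println(isValidCategory("PETS")); // prints false
    }
}
```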
The above code example is a snippet from a budgeting app I’m working on called Calculo. The purpose of this code is to check if a value exists inside of an enum. A user will create an expense and assign it a category, for which this enum holds the values. I want to check if the value the user has submitted is a valid category before attempting to assign it to the created entity and persisting it. If I used a collection, this would be simple, but I wanted an enum here to ensure the value was correct and not changeable at runtime.
The code isn’t bad. But, it is somewhat complicated to read and could be tricky to extend in future. This is a prime candidate for refactoring to a Stream. Applying what we now understand, the code looks like this:
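A sketch of the Stream-based version under the same assumptions: ExpenseCategory and isValidCategory are hypothetical names, and mapping each constant to its name before matching is one possible way to compare against the submitted string, not necessarily how the app does it.

```java
import java.util.Arrays;

class EnumStreamDemo {
    // Hypothetical categories; the real enum in the app is not shown in the text.
    enum ExpenseCategory { FOOD, RENT, TRAVEL }

    static boolean isValidCategory(String value) {
        // Guard clause kept from the loop version: do no work for empty input.
        if (value == null || value.isEmpty()) {
            return false;
        }
        // values() returns an array, so Arrays.stream is used to obtain a Stream.
        // anyMatch short-circuits as soon as one name equals the input.
        return Arrays.stream(ExpenseCategory.values())
                     .map(ExpenseCategory::name)
                     .anyMatch(name -> name.equals(value));
    }

    public static void main(String[] args) {
        System.out.println(isValidCategory("RENT")); // prints true
        System.out.println(isValidCategory("PETS")); // prints false
    }
}
```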
Of course, we keep the guard clause to ensure we do no work if the input is empty. But look at how much simpler this code is now. Removing the for loop expresses our work declaratively, making it much easier to digest. As mentioned earlier, the lambda passed to the anyMatch() method accepts the value emitted from the Stream and binds it to the named argument in the lambda. The type is worked out implicitly too, meaning there is no need to tell the lambda that I expect the type to be a string. One thing that is different from the examples showcased is the use of a static method, rather than a collection’s own method, to create a Stream. An enum’s values() method returns an array. Unfortunately, arrays in Java do not implement the Collection interface, so they do not have a .stream() method of their own. So, to get around this, we use the static Arrays.stream() method in the Arrays class. This does add some overhead here, but it is certainly worth it in order to reduce the LoC of what should be a very simple method.
Hopefully, I’ve been able to show that Java isn’t a dinosaur language and that it provides some excellent tools to code in a modern style. If you want to have a play with Streams, fork this repl with the code examples used above: