Java 8 Stream API Tutorial

Java 8 introduces several new features. One of the most important is the Streams API. Streams are sequences of elements that support chained operations. They use a source and allow different intermediate and terminal operations. The combination of a source and all the operations involved is called a stream pipeline (because Streams allow operations to be pipelined or concatenated).
 
As a source we can have collections, arrays, lines of a file, files in a directory or numeric ranges. Intermediate operations include filter, map, distinct and flatMap; several intermediate operations can be chained. Terminal operations include forEach, collect, reduce, min and max. A pipeline has exactly one terminal operation: once it has been executed, the stream is consumed and cannot be used again.
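As a rough sketch of such a pipeline, a collection source can be combined with several intermediate operations and one terminal operation. The class and method names below are illustrative only and are not part of the article's code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineSketch {

    // Hypothetical helper: keeps the words starting with the given prefix,
    // removes duplicates and converts the remaining words to upper case.
    static List<String> upperCaseWithPrefix(List<String> words, String prefix) {
        return words.stream()                        // source: a collection
                .filter(w -> w.startsWith(prefix))   // intermediate operation
                .distinct()                          // intermediate operation
                .map(String::toUpperCase)            // intermediate operation
                .collect(Collectors.toList());       // terminal operation, ends the pipeline
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("seis", "siete", "uno", "seis");
        System.out.println(upperCaseWithPrefix(words, "s")); // prints [SEIS, SIETE]
    }
}
```

Nothing is processed until the terminal operation collect is invoked; the intermediate operations only describe the pipeline.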

Streams and Lambdas work very well together: they make the code clearer and more concise and open up possibilities like parallelism. The Streams API itself has also been implemented using Lambdas.

In this article we are going to show how to create Streams from different sources and how to use the main Streams operations.

All examples were implemented using Eclipse Luna version 4.4 and Java version 8 update 5.

First examples

In this chapter we are going to show a couple of simple examples with possible usages of the Stream API.

As explained above, in order to create a stream we always need a source. A source can be an array:

        // you can use arrays as Stream sources
        int[] numbers = { 1, 2, 3, 4 };
        IntStream numbersFromArray = Arrays.stream( numbers );
        numbersFromArray.forEach( System.out::println );

In the code above we can see an integer stream (IntStream) being created from an array, and the use of the terminal operation forEach.

We can create Streams directly from values of different types:

        // you can create a Stream directly
        Stream.of(1,2,"asdfas",4,5,"adsfasa",7,8,9,10).forEach( System.out::println );

We can use a collection as source:

        // you can use a collection as a Stream source as well
        List<String> collectionStr = new ArrayList<>();
        collectionStr.add( "uno" );
        collectionStr.add( "dos" );
        collectionStr.add( "tres" );
        collectionStr.add( "cuatro" );
        collectionStr.add( "cinco" );
        collectionStr.add( "seis" );
        collectionStr.add( "siete" );
        collectionStr.add( "ocho" );
        Stream<String> numbersFromCollection = collectionStr.stream();

Or the entries of a directory, in combination with the NIO file API:

        // you can use a directory as source (in combination with the NIO API)
        Files.list( new File( "." ).toPath() ).forEach( System.out::println );

In the code shown above, we can see how to use streams with the java.nio.file.Files methods added in Java 8. In this case, Files.list() returns a stream with the entries of the directory passed as parameter, which can be manipulated with the operations mentioned before. We are going to explain this in more depth later in this article.
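The stream returned by Files.list() keeps the directory open until the stream is closed, so it is usually wrapped in a try-with-resources statement. The following is a minimal sketch of that pattern; the class and method names are assumptions made for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DirectoryListing {

    // Hypothetical helper: returns the sorted names of the entries in a directory.
    static List<String> entryNames(Path directory) throws IOException {
        // try-with-resources closes the stream and the directory handle behind it
        try (Stream<Path> entries = Files.list(directory)) {
            return entries.map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        entryNames(Paths.get(".")).forEach(System.out::println);
    }
}
```

Collecting the names inside the try block means the result can still be used after the directory stream has been closed.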

At the beginning of this article we explained that Streams support several operations, which fall into two main groups: intermediate and terminal ones. Intermediate operations produce another stream. Terminal operations do not produce a stream but a value of another type (such as double, int or an Optional) or only a side effect. A good example of a terminal operation is forEach.

        // you can use Streams for filtering in combination with lambdas
        numbersFromCollection.filter( ( s ) -> s.startsWith( "s" ) ).forEach( System.out::println );

In the code above we can see the intermediate operation filter (using a Lambda expression) and the terminal operation forEach, which prints to the standard console. Note that this code fails at runtime if the stream numbersFromCollection has already been consumed by a terminal operation or closed. In that case the output would be:

		Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
		at java.util.stream.AbstractPipeline.<init>(Unknown Source)
		at java.util.stream.ReferencePipeline.<init>(Unknown Source)
		at java.util.stream.ReferencePipeline$StatefulOp.<init>(Unknown Source)
		...

This happens because a terminal operation has already been executed on the stream numbersFromCollection. So we should create the stream again:

		collectionStr.stream().filter( ( s ) -> s.startsWith( "s" ) ).forEach( System.out::println );
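When the same source has to be queried several times, one possible approach is to pass around a Supplier that builds a new stream on every call instead of a Stream object. The sketch below first shows the failure and then the Supplier variant; all names in it are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuse {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(List<String> words) {
        Stream<String> stream = words.stream();
        stream.forEach(w -> { });     // first terminal operation consumes the stream
        try {
            stream.count();           // second terminal operation on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Every call to source.get() creates a fresh stream, so this can be called repeatedly.
    static long countStartingWith(Supplier<Stream<String>> source, String prefix) {
        return source.get().filter(s -> s.startsWith(prefix)).count();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("uno", "dos", "tres", "cuatro", "cinco", "seis", "siete", "ocho");
        Supplier<Stream<String>> source = words::stream;
        System.out.println(secondUseFails(words));           // prints true
        System.out.println(countStartingWith(source, "s"));  // prints 2
        System.out.println(countStartingWith(source, "c"));  // prints 2
    }
}
```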

There are several operations that we can apply while using Streams like sorting:

        // for sorting
        collectionStr.stream().sorted().forEach( System.out::println );

mapping:

        // mapping -> convert to upper case
        collectionStr.stream().map( String::toUpperCase ).forEach( System.out::println );

searching and matching:

        // for matching purposes
        collectionStr.stream().anyMatch( ( s ) -> s.startsWith( "s" ) );
        collectionStr.stream().noneMatch( ( s ) -> s.startsWith( "z" ) );

retrieving statistics:

        // for counting and retrieving statistics
        collectionStr.stream().filter( ( s ) -> s.startsWith( "s" ) ).count();

reducing and grouping:

        // for reducing the original pipeline
        Optional<String> reduced = collectionStr.stream().sorted().reduce( ( s1, s2 ) -> s1 + "#" + s2 );
        reduced.ifPresent( System.out::println );
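Grouping is typically done with the collect operation and Collectors.groupingBy. As a small sketch (the class name and the choice of grouping by word length are assumptions for illustration), the words of the list used above could be grouped like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingSketch {

    // Groups the words by their length; the TreeMap keeps the keys sorted.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("uno", "dos", "tres", "cuatro", "cinco", "seis", "siete", "ocho");
        // prints {3=[uno, dos], 4=[tres, seis, ocho], 5=[cinco, siete], 6=[cuatro]}
        System.out.println(byLength(words));
    }
}
```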

These are just a few examples of the Stream API usage; there are many more types of Streams and operations (intermediate and terminal).

Streams API applications

We are now going to show a more realistic example.

Suppose that we have a directory with several files. These files contain two types of information: song lyrics and meal menus. It is not possible to know beforehand what kind of file each one is, so we need to read a file before we know what is inside it and can analyze it.
For the menus we are going to calculate the total price and print it out in the console; for lyrics we are going to print them out completely and count the number of times that the word “love” appears in each song.

The traditional approach would be to iterate through all files in the directory, open them, check whether they are songs or menus and either count the appearances of the word “love” or print out the total price. This does not seem very difficult to implement, but we are going to do it using the Streams API.

We saw that it is possible to generate a Stream with all the entries located in a given directory:

		Files.list( new File( PATH2FILES ).toPath() );

If we want to filter the files by the prefix we can do it using the filter() method:

		Files.list( new File( PATH2FILES ).toPath() ).filter(x -> checkPrefix(x))

So the problem of retrieving all interesting files in a directory is already solved. Now we have to open these files and read their content. Using the nio.file.Files API we can read all the lines of a given path using Streams:

		Files.lines( path ).forEach( x -> System.out.println(x) );

and in order to filter the empty lines:

		Files.lines( path ).filter( x -> !checkEmpty( x ) )
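The helpers checkPrefix() and checkEmpty() are used in these snippets without their bodies being shown. A hypothetical sketch of what they might look like follows; the prefix value and both implementations are assumptions, not the article's actual code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class FileChecks {

    // Assumption: the interesting files share this file name prefix.
    static final String PREFIX = "file_";

    // True if the file name (not the whole path) starts with the assumed prefix.
    static boolean checkPrefix(Path path) {
        Path name = path.getFileName();
        return name != null && name.toString().startsWith(PREFIX);
    }

    // True if the line is null or contains only whitespace.
    static boolean checkEmpty(String line) {
        return line == null || line.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(checkPrefix(Paths.get("data", "file_menu.txt"))); // prints true
        System.out.println(checkEmpty("   "));                               // prints true
    }
}
```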

Now we need to differentiate between menus and songs by reading the content. Since we do not have strong requirements we are going to make our life easy: menu files contain a minimum of 2 and a maximum of 10 lines and should contain the “total price:” string; songs, on the other hand, should start with the title in quotation marks (“Blowin’ In The Wind” for example) and should have more than 10 lines.
We do not care about computation time and performance for the moment; we are just going to process every file the same way.

In order to check if the string “total price:” is contained in the file, we can write:

	Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).anyMatch( line -> line.contains( "total price:" ) ) 	

The code shown above uses the terminal operation anyMatch, which returns a boolean depending on the Predicate passed as argument. In order to show this price we can add a new filter for the string “total price:” by typing something like this:

		Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).filter( line -> line.contains( "total price:" ) ).forEach( x -> System.out.println( "total price of menu " + path + " : " + x ) );

Here we are simplifying things a bit, because we are just printing the whole line, whatever it contains. Anyway, we should continue with our program. In order to check if the number of lines is the expected one for menus we can write:

		// menus have between 2 and 10 non-empty lines
		long countLines = Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).count();
		isMenu = 2 <= countLines && countLines <= 10;

Here we are using the terminal operation count(), which returns the number of elements in the Stream.

In order to retrieve the first line for checking if it is a title of a song we can type:

		String title = Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).findFirst().get();

using the operation findFirst() to retrieve the first element in the Stream. And finally we can do something like this in order to count the number of times the word “love” appears in each file:

		Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).mapToInt( line -> line.toLowerCase().split( "love", -1 ).length - 1 ).sum()

There are several things that we should explain here. We are using the mapToInt() operation to map each line (element of the stream) to the number of appearances of the word “love” in it, which creates an IntStream with these numbers. The negative limit passed to split() keeps trailing empty strings, so an occurrence at the end of a line is counted as well. Afterwards the sum() operation is applied in order to add up all the occurrences.
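Counting with split() needs some care, because String.split discards trailing empty strings unless a negative limit is given, which affects lines that end with the searched word. An alternative sketch that avoids this by counting regular expression matches is shown below; the class and method names are made up for this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class LoveCounter {

    // Case-insensitive pattern, so "Love" and "love" are both counted.
    private static final Pattern LOVE = Pattern.compile("love", Pattern.CASE_INSENSITIVE);

    // Counts the matches of the pattern in a single line.
    static int occurrences(String line) {
        Matcher matcher = LOVE.matcher(line);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Maps every line to its number of matches and sums them up.
    static int total(Stream<String> lines) {
        return lines.mapToInt(LoveCounter::occurrences).sum();
    }

    public static void main(String[] args) {
        Stream<String> lines = Stream.of("Love me do", "you know I love you", "so please, love");
        System.out.println(total(lines)); // prints 3
    }
}
```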

We have just mentioned a special Stream type, in this case the IntStream. There are several specialized streams for primitive values (IntStream, DoubleStream and LongStream, besides the generic Stream that we have been using in our examples until now) with specific operations like sum(), summaryStatistics() and average().
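To illustrate these primitive stream operations, the following sketch computes statistics over the lengths of a few words. The class name and the sample data are assumptions chosen for the example:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.OptionalDouble;

class WordLengthStats {

    // mapToInt turns the Stream<String> into an IntStream of word lengths.
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("uno", "dos", "tres", "cuatro", "cinco", "seis", "siete", "ocho");

        IntSummaryStatistics stats = lengthStats(words);
        System.out.println(stats.getMin() + " " + stats.getMax() + " " + stats.getSum()); // prints 3 6 34

        // average() returns an OptionalDouble because the stream could be empty
        OptionalDouble average = words.stream().mapToInt(String::length).average();
        average.ifPresent(System.out::println); // prints 4.25
    }
}
```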

After some refactoring the code would look like:

		 // retrieving all files in directory
		 Files.list( new File( PATH2FILES ).toPath() ).filter( x -> checkPrefix( x ) )
						.forEach( path -> handleFile( path ) );
		
		 ...
 
		// for each file check if it is a menu (2 to 10 non-empty lines)
		long count = Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).count();
		if( 2 <= count && count <= 10 )
		{
			Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) )
					.filter( line -> line.contains( "total price:" ) ).forEach( x -> System.out.println( "total price of menu " + path + " : " + x ) );
		}
		else
		{
			// check if it is a song: the first non-empty line is a title in quotation marks
			String title = Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) ).findFirst().orElse( "" ).trim();
			if( title.length() > 1 && title.startsWith( "\"" ) && title.endsWith( "\"" ) )
			{
				// print out the appearances of "love" (lines are lower-cased first)
				System.out.println( "Love in " + path + "  :" + Files.lines( path ).filter( pathName -> !checkEmpty( pathName ) )
							.mapToInt( line -> line.toLowerCase().split( "love", -1 ).length - 1 ).sum() );
			}
		}

This example shows the power of the Streams API and many of its main features. The code is compact and easy to read, test and maintain. There are things that have not been taken into account, like performance or security. These are very important when manipulating files in production and should be taken into consideration. Applying several terminal operations, each of which reads the file again, can be a very expensive task, and it should be analyzed whether there are better options for each individual case. The Stream API also offers the possibility to process stream operations in parallel, but this is not in the scope of this article.
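One possible way to reduce the repeated file reads is to load the non-empty lines once and run the checks on the in-memory list. The following is only a sketch under the simplified menu and song rules described earlier; the class, enum and method names are hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SinglePassClassifier {

    enum Kind { MENU, SONG, UNKNOWN }

    // Reads the non-empty lines of a file once and closes the file afterwards.
    static List<String> nonEmptyLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.filter(l -> !l.trim().isEmpty()).collect(Collectors.toList());
        }
    }

    // Applies the simplified rules to lines that are already in memory.
    static Kind classify(List<String> lines) {
        boolean hasTotal = lines.stream().anyMatch(l -> l.contains("total price:"));
        if (lines.size() >= 2 && lines.size() <= 10 && hasTotal) {
            return Kind.MENU;
        }
        if (lines.size() > 10) {
            String title = lines.get(0).trim();
            if (title.length() > 1 && title.startsWith("\"") && title.endsWith("\"")) {
                return Kind.SONG;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) throws IOException {
        // Classifies every file path passed on the command line.
        for (String arg : args) {
            Path path = Paths.get(arg);
            System.out.println(path + " -> " + classify(nonEmptyLines(path)));
        }
    }
}
```

The trade-off is memory: the whole file is held in a list, which is acceptable for small menus and lyrics but may not be for large files.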

Summary

In this article we explained briefly what the new Streams API offers and how it can be used in real-life applications. We explained its main operations and behaviors and showed how powerful it is in combination with Lambda expressions.

In the following link you can find a list of articles with more information about many Java 8 features: http://www.javacodegeeks.com/2014/05/java-8-features-tutorial.html.

For more information about the Stream API you can visit the Oracle official page: http://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html

If you want to download all the code shown in this article, please click on the following link: streams

Dani Buiza

Daniel Gutierrez Diez holds a Master in Computer Science Engineering from the University of Oviedo (Spain) and a Post Grade as Specialist in Foreign Trade from the UNED (Spain). Daniel has been working for different clients and companies in several Java projects as programmer, designer, trainer, consultant and technical lead.