Sunday, June 5, 2011

Java Without Semicolons: An Introduction to Scala - Part 1: Iteration

This series of posts was inspired by a suggestion from Martin Odersky at Scala Days 2011 as well as something I have heard Dick Wall say on several occasions. To paraphrase Dick, you can get started with Scala by converting Java code to Scala with few changes beyond getting rid of the semicolons. It may not take advantage of many of Scala's more powerful features, but it's a great way to get started and an excellent jumping off point to learning Scala's richer language features.

Scala is a deep and flexible language that allows an enormous degree of freedom to library developers and advanced users, but those features can be daunting to new developers trying to take them all in. So, which areas should a Java programmer new to Scala explore first? This blog series will cover Scala features that Java programmers should be able to put to immediate use. In other words, I'll suggest some of the first things a newcomer to Scala can try after getting rid of the semicolons.

Setup

You're encouraged to follow along with these examples in the Scala REPL and use them as a jumping off point for your own exploration. Installing Scala is easy: if you already have Java set up, just unzip the Scala distribution, set SCALA_HOME and add $SCALA_HOME/bin to your path. You can run the REPL by simply typing 'scala' in a shell. Further instructions are available at scala-lang.org or typesafe.com. I'll be using Scala 2.9.0.1, which is the latest version at the time of this writing.

Iteration

As a longtime Java developer, I was immediately drawn to Scala's alternatives to for-loop iteration. Java loops are easy enough to write, but can be buggy and confusing to read and comprehend.

This post will take a look at some of the functions that Scala's collections provide for iteration, functions that are as easy to read and understand as they are to write.

Function Literals

Assuming that Java is the only language you are familiar with, there's one important Scala language feature you'll need to understand in these examples, and that is function literals.

A function literal is analogous to String or int literals in Java in that it is a value that can be provided directly where it is used.

In Scala, you can define a function inline, that is, pass a bit of functionality right where it is needed. This is what a simple function literal looks like:

x => x + 1

You can read this as "a function that takes one argument x and returns the result of adding 1 to the value of x".
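
One thing to watch for if you are following along: typed on its own at the REPL, x => x + 1 is rejected with a "missing parameter type" error, because nothing tells the compiler what x is. Here is a minimal sketch of giving the literal a home. The name addOne is my own, purely for illustration:

```scala
// A function literal stored in a val. The annotation Int => Int reads
// "a function from Int to Int" and tells the compiler the type of x.
val addOne: Int => Int = x => x + 1

// Calling it looks just like calling a method.
val six = addOne(5)   // 6
```

When a literal is passed straight to a collection function such as map, the element type of the collection supplies the missing information, which is why the examples in this post need no annotation.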

Scala's collections have special functions that take functions as arguments to be applied to the members of the collection. So unlike a Java for loop, where you get a handle on each member in the collection one at a time and do something to it, in Scala you can tell the collection to apply some functionality to all of its members and the actual iteration over elements is handled internally. Building on our example:

val someInts = List(4, 8, 15, 16, 23, 42)
val incrementedInts = someInts.map(x => x + 1)

The map function takes a single-argument function as its argument. It applies that function to every element in the List and returns a new List whose contents are the result of applying the function argument. In the example above, each element of the list is incremented by 1 and the result looks like this:

List(5, 9, 16, 17, 24, 43)

Note that the original list still exists unmodified.
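
The function passed to map does not have to return the same type it receives. As a small sketch (the val name labels is mine), this turns the list of Ints into a list of Strings and shows that the source list is left alone:

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

// Int in, String out: the result is a List[String]
val labels = someInts.map(x => "#" + x)     // List(#4, #8, #15, #16, #23, #42)

// someInts itself is still List(4, 8, 15, 16, 23, 42)
```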

In Scala code you will often see a shorter form of function literals. Scala allows us to substitute a _ character for the function parameters in many cases, so the above map call could be written as:

val incrementedInts = someInts.map(_ + 1)

The two are functionally identical, and it's up to you which you want to use.
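
One detail that tripped me up early: each _ stands for a different parameter, in order, so it can only replace a parameter that is used exactly once. A quick sketch, using reduceLeft (which combines neighbouring elements with a two-argument function) alongside map:

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

// Two underscores mean two parameters: this is (a, b) => a + b
val total = someInts.reduceLeft(_ + _)      // 108

// A parameter used twice needs a name; _ * _ would mean (a, b) => a * b
val squares = someInts.map(x => x * x)      // List(16, 64, 225, 256, 529, 1764)
```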

In some cases the call can be even shorter. When a function literal does nothing but pass its single argument to another function or method, you can pass the name of that function or method directly and drop the _ as well. The following examples are functionally equivalent and use the foreach function, which is executed only for its side effects and does not return a useful value (it returns Unit, the equivalent of Java's void return type).

someInts.foreach(x => println(x))
someInts.foreach(println(_))
someInts.foreach(println)
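
The shortest form is not special to println or foreach. As a sketch, here is the same progression with map and a method of my own (the name double is just for illustration):

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

// An ordinary method that takes a single Int
def double(n: Int): Int = n * 2

// All three forms produce List(8, 16, 30, 32, 46, 84)
val viaLiteral     = someInts.map(x => double(x))
val viaPlaceholder = someInts.map(double(_))
val viaName        = someInts.map(double)
```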

Now let's say we only want the odd numbers in our list. The filter function is useful for such operations. It takes a function argument that returns a Boolean to determine whether an element should be included in the output:

someInts.filter(x => x % 2 == 1)
//someInts.filter(_ % 2 == 1)
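
A word of caution that applies in Java too: the % operator keeps the sign of the left operand, so the x % 2 == 1 test misses negative odd numbers. It is fine for our all-positive list, but here is a sketch with a list of my own (mixed) showing the difference:

```scala
val mixed = List(-3, -2, 1, 4, 7)   // hypothetical list with negative values

// -3 % 2 is -1, not 1, so -3 slips through
val oddsMissed = mixed.filter(x => x % 2 == 1)   // List(1, 7)

// Testing for "not even" handles both signs
val odds = mixed.filter(x => x % 2 != 0)         // List(-3, 1, 7)
```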

Maybe we want the first number in the list that is greater than 20. We can use find:

someInts.find(x => x > 20)
//someInts.find(_ > 20)
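
Since there may be no matching element at all, find does not return the element directly. It returns an Option: Some(value) when there is a match and None when there is not. A minimal sketch of both cases (the val names are mine):

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

val firstOver20  = someInts.find(x => x > 20)    // Some(23)
val firstOver100 = someInts.find(x => x > 100)   // None

// getOrElse unwraps the value, or supplies a fallback when there is none
val found    = firstOver20.getOrElse(-1)    // 23
val fallback = firstOver100.getOrElse(-1)   // -1
```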

If you want to discover whether all elements of a collection satisfy some requirement, use forall, which takes a function that returns a Boolean and itself returns a Boolean:

someInts.forall(x => x > 0)
//someInts.forall(_ > 0)
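
The natural companion to forall is exists, which asks whether at least one element satisfies the requirement. A short sketch of the two side by side, including the empty-list case, which can be surprising at first:

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

val allPositive = someInts.forall(x => x > 0)    // true
val allOver10   = someInts.forall(x => x > 10)   // false: 4 and 8 fail
val anyOver40   = someInts.exists(x => x > 40)   // true: 42 qualifies

// With no elements there is nothing to fail the test, so forall is true
// and, with nothing to pass it, exists is false.
val emptyForall = List[Int]().forall(x => x > 0)   // true
val emptyExists = List[Int]().exists(x => x > 0)   // false
```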

You can get the leading run of elements that meet a certain requirement by using takeWhile, which stops at the first element that fails the test. The following example returns a new list containing all of the words up to the first word that begins with a vowel:

val words = List("the","quick","brown","fox","jumped","over","the","lazy","dog")
words.takeWhile(w => !"aeiou".contains(w(0)))
// returns List(the, quick, brown, fox, jumped)
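
It is easy to confuse takeWhile with filter, so here is a sketch contrasting them, along with dropWhile, which returns whatever takeWhile leaves behind. The name startsWithConsonant is mine; it is just the function literal from above stored in a val so it can be reused:

```scala
val words = List("the","quick","brown","fox","jumped","over","the","lazy","dog")

// The same test as above, stored so we can pass it three times
val startsWithConsonant = (w: String) => !"aeiou".contains(w(0))

// Stops for good at "over"
val prefix = words.takeWhile(startsWithConsonant)
// List(the, quick, brown, fox, jumped)

// Checks every element, so it skips "over" and keeps going
val allMatches = words.filter(startsWithConsonant)
// List(the, quick, brown, fox, jumped, the, lazy, dog)

// Everything from the first failing element onward
val remainder = words.dropWhile(startsWithConsonant)
// List(over, the, lazy, dog)
```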

If you want to keep the remainder of the list as well, you can pass the same function to span, which returns a tuple of two Lists:

words.span(w => !"aeiou".contains(w(0)))
// returns (List(the, quick, brown, fox, jumped),List(over, the, lazy, dog))
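
Tuples may be new if you are coming from Java. You can reach the two halves with ._1 and ._2, but it usually reads better to unpack them into named vals in one step. A sketch (the names leading and rest are mine):

```scala
val words = List("the","quick","brown","fox","jumped","over","the","lazy","dog")

// Unpack the tuple returned by span into two vals at once
val (leading, rest) = words.span(w => !"aeiou".contains(w(0)))
// leading: List(the, quick, brown, fox, jumped)
// rest:    List(over, the, lazy, dog)

// Nothing is lost: gluing the halves back together gives the original list
val rejoined = leading ++ rest
```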

The last example of this sort of collection iteration function I'll demonstrate is foldLeft. It's a bit different in that it does not return a collection, but rather a single value as a result of applying a two-argument function to each element of a list and the result of the previous function execution. You need to provide it a seed value, and the first iteration will execute the function with the seed value and first element of the list. In this example we'll get the sum of all numbers with more than one digit:

someInts.foldLeft(0)((a,b) => if (b.toString.length > 1) a+b else a)

In the example, foldLeft is seeded with a 0. The function is then called with 0 and 4, returning 0. Then 0 and 8 returning 0. Then 0 and 15 returning 15, and so on. Folding operations take a bit of getting used to, but are a powerful functional programming paradigm.
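
If you would like to see those intermediate results for yourself, scanLeft takes the same arguments as foldLeft but returns every running value, seed included, instead of only the last one. A sketch (the val names are mine):

```scala
val someInts = List(4, 8, 15, 16, 23, 42)   // same list as above

// Same seed and same function as the foldLeft call above
val running = someInts.scanLeft(0)((a, b) => if (b.toString.length > 1) a + b else a)
// List(0, 0, 0, 15, 31, 54, 96)

// The final running value is exactly what foldLeft returns
val total = someInts.foldLeft(0)((a, b) => if (b.toString.length > 1) a + b else a)
// 96
```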

For Comprehension

Scala does have a for loop structure that may look more familiar to Java developers. In Scala it's called a for comprehension, and it can be used in very powerful ways. It can also be used to iterate over collections in a more imperative way:

for (word <- words) {
  println(word)
}
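
To give a small taste of the more powerful side, a for comprehension can carry an if guard and, with yield, build a new collection instead of just looping. This sketch (val names are mine) does the work of a filter followed by a map:

```scala
val words = List("the","quick","brown","fox","jumped","over","the","lazy","dog")

// Keep words longer than three letters and upper-case them
val shouted = for (word <- words if word.length > 3) yield word.toUpperCase
// List(QUICK, BROWN, JUMPED, OVER, LAZY)

// The same result written with the collection functions from earlier
val shoutedToo = words.filter(w => w.length > 3).map(w => w.toUpperCase)
```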

But often the functions on the various Scala collection classes allow you to perform common iteration tasks without using loops. This change can result in fewer, more readable lines of code. In Java, even the contrived examples above would necessitate loops, temporary variables, temporary lists, and, in general, more boilerplate code not relating to business logic. The Scala collection libraries allow us to focus on the code that performs the actions we're really interested in. And it makes for some nifty one-liners:

//find all numbers from 1 to a million that are evenly divisible by 3 and 5,
//remove all instances of the digit '9' and sum the resulting numbers
//(converted with toLong because the total is too large for an Int)
(1 to 1000000).filter(i => i % 3 == 0 && i % 5 == 0).map(_.toString.replace("9","").toLong).sum
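
For comparison, here is roughly what the loop-and-temporary-variable version looks like next to the collection version. Both are wrapped in methods so they can be tried with smaller limits; the method names and the limit parameter are my own. Both accumulate into a Long, because the total for a limit of one million is larger than an Int can hold and an Int sum would silently overflow:

```scala
// Java-style: a counter, a mutable accumulator, and a loop
def sumImperative(limit: Int): Long = {
  var total = 0L
  var i = 1
  while (i <= limit) {
    if (i % 3 == 0 && i % 5 == 0) {
      total += i.toString.replace("9", "").toLong
    }
    i += 1
  }
  total
}

// Collection-style: say what you want and let the collection iterate
def sumFunctional(limit: Int): Long =
  (1 to limit).filter(i => i % 3 == 0 && i % 5 == 0).map(_.toString.replace("9", "").toLong).sum

// With a limit of 100 the matches are 15, 30, 45, 60, 75 and 90;
// 90 becomes 0 once the 9 is removed, so both versions return 225.
val small = sumFunctional(100)
```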

That's it for this installment. Hopefully someone getting started with Scala finds these examples helpful as a jumping off point.
