Some time ago Cedric Beust caused some excitement in the Scala community by declaring that
he doesn't see the advantages of using Scala's Option type, which is similar to Haskell's
Maybe type.
There were a lot of insightful comments which outlined the benefits of using
Option. James Iry has written a well-reasoned post called
Why Scala's "Option" and Haskell's "Maybe" types will save you from null.
I wanted to approach things differently. I wanted to show people some patterns which usually come with experience and let them decide which is better. That's why this turned into a very long post which looks more like a tutorial. Of course, I'm not adding anything new to the discussion, but just summarizing some of the
common wisdom accumulated by the community. I'm sure this is not the last typical blog post showing the wonders of Option, but I hope it can clear up some (Some?) misunderstandings. Or Maybe not.
But more importantly, I wanted to explain why solving this problem with a dedicated language feature is a premature optimization. A language feature is neither as flexible nor as powerful as a plain type in the library.
NullPointerException
Cedric is definitely not alone: programmers who decide to give Scala a try (and even more so Haskell or the ML family) face conceptual differences from popular imperative languages. Just reading that
Option is a type wrapper does not make it easy to wrap one's head around it.
One advantage of using Scala is that if you are convinced that
NullPointerExceptions solve your problem better, you are free to use them.
Option is just one option. And of course, you can come back later any time if you make up your mind that
Option has Some advantages (e.g. composability). Of course, some might view having too many choices as a disadvantage.
But both Scala and Clojure must live with the design decisions on the JVM which were taken before them, and with interoperating with a wealth of existing libraries. So allowing
null is first of all a practical decision.
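To make the interoperability point concrete, here is a minimal sketch of how a null coming from a Java API can be lifted into an Option at the boundary. It relies on Option(...) turning null into None; the property key is made up for this example and is assumed not to be set on your machine.

```scala
object NullInterop {
  // System.getProperty returns null when the key is not set.
  // The key below is hypothetical and assumed to be unset.
  val missing: Option[String] =
    Option(System.getProperty("no.such.property.for.this.example"))

  // A non-null result is simply wrapped in Some.
  val present: Option[String] =
    Option(System.getProperty("java.version"))

  def main(args: Array[String]): Unit = {
    println(missing)                      // None
    println(present.isDefined)            // true
    println(missing.getOrElse("not set")) // not set
  }
}
```

From this point on, the rest of the code never has to see the null.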
Getting started
First of all, let's define a simplistic employee type, a list of employees, and a map where the "builder" occupation points to the list of employees.
case class Person(name: String, email: String)
val employees = List(Person("bob", "bob@builder.com"))
val occupations = Map("builder" -> employees)
Compile-time safety and backward compatibility
One of the examples given by Cedric is pattern matching on an
Option type. His main point of contention is that this looks very much like testing for
null. Except for one minor point: where is the example that corresponds to the case when you
don't want to test for
null?
Exactly. No such valid example exists if you want to use the value inside
Option, at least not unless you're explicit about it. As
Paul Snively mentions, the compiler will stop you. Cedric has noticed, "the worst part about this example is that it forces me to deal with the null case right here". But this is not the worst part; it may well be the best part.
Do you remember a feature which was added to Java 5, which was intended to save you from another type of exception,
ClassCastException? Of course, that would be generics! The problem is, it gives you type-safety, but only as long as you use classes compiled with the Java 5 compiler
and you don't use the escape hatch of raw types. As soon as you start using legacy code, you leave the safety of compiler checked code. You can ignore the warnings at your own risk, because then there's no guarantee you won't get a
ClassCastException.
Does this remind you of something? Compile-time safety as long as you don't use legacy code or the escape hatch? These restrictions sound exactly like the ones
Option has.
And of course, there is an escape hatch. You can use
Option.get or instead of pattern matching, you can even use your old friend the
if statement (Scala veterans, please close your eyes
now):
val evangelists = occupations.get("evangelist")
// ugly, ugly, ugly
if (evangelists == None)
  println("No such occupation here!")
else
  println("Found occupation " + evangelists.get)
But, as you'll see later, this doesn't mean that you have to deal with the "no value" case right here. Pattern matching is not the only option.
Simplification through a generalization
Instead of solving only the most obvious problem, it pays to check whether it is a manifestation of a bigger class of problems. Having a related class of problems solved by a common pattern simplifies things. There are fewer rules to remember. Not only that, but the specific applications of the general solution begin to interact in ways you couldn't have anticipated before. Eventually problems will appear that the general solution also handles, problems which you didn't know about when you implemented it.
As it turns out, the problem which
the safe dereference operator addresses can be solved in Scala too. I would say that, for something with no explicit syntax, it's a fairly elegant solution, but this is a subjective opinion.
Handle both value and lack of value and stop processing
This is handled by our well-known pattern match. It seems easy to use and obvious in what it does.
The advantage of pattern matching is that it uses the type system in such a way that forgetting to handle one of the cases explicitly will result in a compile-time warning.
occupations.get("builder") match {
  case Some(_) => println("builder occupation exists")
  // oops, forgot to check for None or the catch-all _
}
// warning: match is not exhaustive!
// missing combination None
The disadvantage is that pattern matching doesn't compose very elegantly. If the result of the pattern match is just an intermediate step, you'll need to add another one, and another, and pattern matching does take some screen real estate.
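To give an idea of how the screen real estate adds up, here is a rough sketch of a two-step lookup written only with pattern matching. The emailOf helper is a name invented for this illustration; the data is the same as in the "Getting started" section.

```scala
object NestedMatch {
  case class Person(name: String, email: String)
  val employees = List(Person("bob", "bob@builder.com"))
  val occupations = Map("builder" -> employees)

  // Hypothetical helper: every intermediate Option needs its own match,
  // and every level needs its own None case.
  def emailOf(occupation: String, name: String): String =
    occupations.get(occupation) match {
      case Some(people) =>
        people.find(_.name == name) match {
          case Some(person) => person.email
          case None         => "nobody@nowhere.com"
        }
      case None => "nobody@nowhere.com"
    }

  def main(args: Array[String]): Unit = {
    println(emailOf("builder", "bob"))    // bob@builder.com
    println(emailOf("evangelist", "bob")) // nobody@nowhere.com
  }
}
```

Two levels are still readable, but the default value is already repeated, and a third step would nest one level deeper.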
Pattern matching is a bit like exception handling with
try/
catch blocks: you usually do it when you're interested in both the normal behaviour and the exceptional behaviour, and that's fairly verbose. On a related note, did you know that you can use pattern matching in Scala's exception handlers?
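In case you haven't seen pattern matching in an exception handler, here is a small sketch. The describe and divide helpers are names made up for this example; the catch block itself is just a list of case clauses tried in order.

```scala
object CatchPatterns {
  // Hypothetical helper, used only to trigger an ArithmeticException.
  def divide(a: Int, b: Int): Int = a / b

  // The by-name parameter is evaluated inside the try block.
  def describe(action: => Any): String =
    try {
      "ok: " + action
    } catch {
      case _: NumberFormatException => "not a number"
      case e: ArithmeticException   => "arithmetic problem: " + e.getMessage
    }

  def main(args: Array[String]): Unit = {
    println(describe("42".toInt))  // ok: 42
    println(describe("abc".toInt)) // not a number
    println(describe(divide(1, 0)))
  }
}
```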
So let's see what we can do to get more composable data processing.
Transform value
When we're interested in creating a series of steps for processing a value, we can use
map. It will transform the value if it's there, but will leave it inside the
Option. And
map won't change the
Option if it's empty (
None.map will result in
None).
val employee = employees find ( _.name == "bob" )
// Some(Person(bob,bob@builder.com))
employee map ( _.email )
// Some(bob@builder.com)
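For comparison, here is a sketch of the same steps when the search finds nothing, plus a second map stacked on top of the first. The domain step is an extra transformation added just for this illustration.

```scala
object MapOnOption {
  case class Person(name: String, email: String)
  val employees = List(Person("bob", "bob@builder.com"))

  // The value is there: map transforms it inside the Option.
  val bobsEmail: Option[String] =
    employees.find(_.name == "bob").map(_.email)

  // No larry: find gives None, and map leaves None alone.
  val larrysEmail: Option[String] =
    employees.find(_.name == "larry").map(_.email)

  // Steps can be stacked; each one only runs if there is a value.
  val bobsDomain: Option[String] =
    bobsEmail.map(_.split("@").last).map(_.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(bobsEmail)   // Some(bob@builder.com)
    println(larrysEmail) // None
    println(bobsDomain)  // Some(BUILDER.COM)
  }
}
```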
If you need to "flatten" the result you can use
flatMap. This means that instead of an
Option nested inside an
Option, you get just one
Option. It only results in Some (a "full" Option) if it's called on a Some and the function you pass in also results in Some:
val builders = occupations.get("builder")
// Some(List(Person(bob,bob@builder.com)))
val bobTheBuilder = builders flatMap { _ find ( _.name == "bob" ) }
// Some(Person(bob,bob@builder.com))
If you were using just
map, you would get
Some(Some(Person(bob,bob@builder.com))), which is probably a bit too nested for your taste.
Some of you are probably familiar with other languages which have
map (like Ruby or Python) and are scratching their heads: "Wait, wasn't
map defined only for lists/
Enumerables?". Please be patient.
Only get the value if it satisfies a test
If you find only some of the possible values useful, you can weed out what you have by using
filter. It will only result in
Some for values which satisfy a certain condition (called a
predicate).
bobTheBuilder filter { _.email endsWith "builder.com"}
// Some(Person(bob,bob@builder.com))
I'm sure at this point the folks who have used Google Collections have also joined the folks with past Ruby or Python experience screaming: "Hey, but filter is only used for Collections!"
Transform lack of value
That's fine, but eventually you want to get the value out. If there's no value, just assume some default value. We have to use pattern matching again, right?
But there is a shorter solution.
getOrElse extracts the value, or substitutes a default value of the same type if there's nothing in the
Option container:
val larryWho = employees find ( _.name == "larry" )
// None
val emptyEmail = larryWho map ( _.email )
// None
emptyEmail.getOrElse("nobody@nowhere.com")
// nobody@nowhere.com
Groovy has this in the form of the Elvis operator. The trouble is that you can't get rid of the Elvis operator: useful as it is, it adds cruft to the language. It's also somewhat restrictive that this is all the operator can do.
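Because getOrElse is an ordinary method, it sits next to siblings that cover neighbouring cases. As a sketch, orElse falls back to another Option instead of a plain value. The contractors list and the emailOf helper are invented for this example.

```scala
object Fallbacks {
  case class Person(name: String, email: String)
  val employees = List(Person("bob", "bob@builder.com"))
  // Hypothetical second source to fall back on.
  val contractors = List(Person("larry", "larry@contractor.com"))

  // Try the employees first, then the contractors; still an Option at the end.
  def emailOf(name: String): Option[String] =
    employees.find(_.name == name)
      .orElse(contractors.find(_.name == name))
      .map(_.email)

  def main(args: Array[String]): Unit = {
    println(emailOf("bob"))   // Some(bob@builder.com)
    println(emailOf("larry")) // Some(larry@contractor.com)
    println(emailOf("moe").getOrElse("nobody@nowhere.com")) // nobody@nowhere.com
  }
}
```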
Chain, chain, chain
The reason
map,
flatMap,
filter and
getOrElse are so useful is that they can be chained together, intermixed and the results can be passed around to other methods.
Let's shift to high gear and put it all together:
occupations.get("builder").
  flatMap { _ find ( _.name == "bob" ) }.
  map (_.email).
  filter { _ endsWith "builder.com"}.
  getOrElse("nobody@nowhere.com")
// bob@builder.com
If we're not yet interested in which step processing has failed, this is a clear way to express the process flow. It's also similar to the Fantom example Cedric described.
There is an even shorter syntax for this using
for expressions (or
for comprehensions).
{for (builders <- occupations.get("builder");
      bobTheBuilder <- builders find (_.name == "bob");
      email = bobTheBuilder.email if email endsWith "builder.com"
 ) yield email
} getOrElse "nobody@nowhere.com"
// bob@builder.com
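There is no magic in the for expression: the compiler rewrites it into calls to flatMap, map and withFilter. The sketch below puts a for expression next to a hand-written chain that gives the same result. The hand-written version is a simplification for illustration, not the exact code the compiler generates.

```scala
object ForDesugared {
  case class Person(name: String, email: String)
  val employees = List(Person("bob", "bob@builder.com"))
  val occupations = Map("builder" -> employees)

  val viaFor: String =
    (for {
      builders <- occupations.get("builder")
      bob      <- builders.find(_.name == "bob")
      email = bob.email
      if email.endsWith("builder.com")
    } yield email).getOrElse("nobody@nowhere.com")

  // Roughly what the for expression stands for: each generator (<-) becomes
  // a flatMap, the last one a map, and the guard a withFilter.
  val viaMethods: String =
    occupations.get("builder")
      .flatMap(builders =>
        builders.find(_.name == "bob")
          .map(bob => bob.email)
          .withFilter(email => email.endsWith("builder.com"))
          .map(email => email))
      .getOrElse("nobody@nowhere.com")

  def main(args: Array[String]): Unit = {
    println(viaFor)     // bob@builder.com
    println(viaMethods) // bob@builder.com
  }
}
```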
But wait, weren't
for expressions a way to loop over stuff? Well, yes, this too. More generally,
for expressions work with collections. And the beauty of it all is that we can use collections together with
Option and do nested invocations. Let's modify the example a bit and suppose that there might be more than one person named Bob and we want them all.
for (builders <- occupations.get("builder").toList;
     bobTheBuilder <- builders if bobTheBuilder.name == "bob";
     email = bobTheBuilder.email if email endsWith "builder.com"
) yield email
// List(bob@builder.com)
This works because, for all practical purposes,
Option behaves like a specialized collection. By viewing it as one, you reuse the experience of all the programmers using Groovy, Ruby, Python, Google Collections and whatnot, and flatten the learning curve.
Now imagine that the only way to work with a collection is to pattern match it. Would you use it? Yeah, me neither.
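To see the "specialized collection" view side by side, here is a small sketch that runs the same filter and map steps over a List and over an Option. The numbers are arbitrary.

```scala
object SameVocabulary {
  val numbers: List[Int] = List(1, 2, 3, 4)
  val someNumber: Option[Int] = Some(4)
  val noNumber: Option[Int] = None

  // The same two steps, three different containers.
  val fromList = numbers.filter(_ % 2 == 0).map(_ * 2)
  val fromSome = someNumber.filter(_ % 2 == 0).map(_ * 2)
  val fromNone = noNumber.filter(_ % 2 == 0).map(_ * 2)

  def main(args: Array[String]): Unit = {
    println(fromList) // List(4, 8)
    println(fromSome) // Some(8)
    println(fromNone) // None
  }
}
```

An Option behaves here like a list that holds at most one element.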
Safe invoke and composability
Now let's see how
filter works in Fantom:
fansh> list := [1, 2, null]
fansh> list.findAll |v| { v.isEven }
sys::NullErr: java.lang.NullPointerException
Oh crap, then I need to use the safe invoke operator:
fansh> list.findAll |v| { v?.isEven }
ERROR(20): Cannot return 'sys::Bool?' as 'sys::Bool'
But it all results in a compile-time error. It's the same story with the
reduce higher-order function:
fansh> list.reduce(0) |r, v| { v + r }
sys::NullErr: java.lang.NullPointerException
So I can't practically use the safe invoke operator on nullable collections with
filter/
reduce, a case which the Fantom documentation page has conveniently omitted. So we're back to checking for
null the old way. This means that unlike
flatMap, the safe invoke operator works fine when you chain, but not when you compose.
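For contrast, here is a sketch of how the same situation could look in Scala, assuming the list of possibly missing numbers is modelled as a List[Option[Int]]. This is my rough translation of the Fantom session above, not a line-by-line equivalent.

```scala
object OptionalElements {
  val list: List[Option[Int]] = List(Some(1), Some(2), None)

  // Keep the elements that hold an even number; exists is false for None.
  val evens: List[Option[Int]] = list.filter(_.exists(_ % 2 == 0))

  // flatten drops the Nones, so the fold only sees real numbers.
  val sum: Int = list.flatten.foldLeft(0)(_ + _)

  // Or fold over the Options directly and treat None as 0.
  val sumWithDefault: Int =
    list.foldLeft(0)((acc, v) => acc + v.getOrElse(0))

  def main(args: Array[String]): Unit = {
    println(evens)          // List(Some(2))
    println(sum)            // 3
    println(sumWithDefault) // 3
  }
}
```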
Iterator
Let's now see some other advantages of
Option behaving like a collection. For instance, it lets you use
Iterable's API in some
elegant ways:
val noVal: Option[Int] = None
val someVal = Some(4)
List(1,2,3) ++ someVal ++ noVal
// List(1, 2, 3, 4)
Guess what happens here? Only the numbers contained in
Some are added to the list. I think there's no operator for this in Fantom and Groovy, and it would be overkill to include one, too.
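The same idea works with flatMap over a list: if the function returns an Option, the empty results simply vanish. A minimal sketch follows; parse is a helper made up for this example, built on scala.util.Try.

```scala
import scala.util.Try

object OptionAsCollection {
  // Hypothetical helper: Some(number) if the text parses, None otherwise.
  def parse(text: String): Option[Int] = Try(text.trim.toInt).toOption

  // Each element contributes zero or one value to the result.
  val parsed: List[Int] = List("1", "two", "3").flatMap(text => parse(text))

  // An Option converts to a regular collection whenever you need one.
  val asList: List[Int] = Some(4).toList
  val asEmptyList: List[Int] = (None: Option[Int]).toList

  def main(args: Array[String]): Unit = {
    println(parsed)      // List(1, 3)
    println(asList)      // List(4)
    println(asEmptyList) // List()
  }
}
```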
One size doesn't fit all
Wait, if
Option is like a collection, does this mean that there are many types of
Option? Does this mean I can create my
own Option?
Yes, and
yes. Just as there isn't just one type of
List, or one type of
Map, there can also be several types of
Option. For instance,
Lift defines its own, which it currently calls
Box (I think it's a great metaphor). One of the things
Box has in addition to
Option is a type to collect
Failures. It's no longer just a "dunno what happened, something failed along the way". It's a list of error messages which can pinpoint exactly what went wrong. This is invaluable for a web framework, because when a user expects a complex form to be validated, a simple "some of your input is wrong" just won't cut it.
for {
  id <- S.param("id") ?~ "id param missing" ~> 401
  u <- User.find(id) ?~ "User not found"
} yield u.toXml
And guess what, you can also
define your own operators, which also work in
for comprehensions.
for {
  login <- get("/account/verify_credentials.xml", httpClient, Nil)
    !@ "Failed to log in"
  message <- login.post("/statuses/update.xml", "status" -> "test_msg1")
    !@ "Couldn't post message"
  xml <- message.xml
} yield xml
Except that they're not operators. When conventions and patterns evolve, Lift folks can always change the "operator". Or another framework can do it.
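To show that nothing here needs compiler support, below is a heavily simplified sketch of a home-grown Option-like type that carries an error message. The names Outcome, Full, Failed and OptionOps are invented for this illustration; this is not Lift's Box, and the ?~ method only mimics the look of the Lift example above.

```scala
object OwnOption {
  // Hypothetical Option-like type: either a value or a failure message.
  sealed trait Outcome[+A] {
    def map[B](f: A => B): Outcome[B] = this match {
      case Full(value) => Full(f(value))
      case Failed(msg) => Failed(msg)
    }
    def flatMap[B](f: A => Outcome[B]): Outcome[B] = this match {
      case Full(value) => f(value)
      case Failed(msg) => Failed(msg)
    }
  }
  final case class Full[+A](value: A) extends Outcome[A]
  final case class Failed(message: String) extends Outcome[Nothing]

  // "?~" is an ordinary method name, added to Option through an implicit class.
  implicit class OptionOps[A](option: Option[A]) {
    def ?~(message: String): Outcome[A] = option match {
      case Some(value) => Full(value)
      case None        => Failed(message)
    }
  }

  val users = Map("42" -> "bob") // made-up data

  // map and flatMap are all a for expression needs.
  def lookup(params: Map[String, String]): Outcome[String] =
    for {
      id   <- params.get("id") ?~ "id param missing"
      user <- users.get(id) ?~ "User not found"
    } yield user.toUpperCase

  def main(args: Array[String]): Unit = {
    println(lookup(Map("id" -> "42"))) // Full(BOB)
    println(lookup(Map.empty))         // Failed(id param missing)
    println(lookup(Map("id" -> "7")))  // Failed(User not found)
  }
}
```

Unlike a plain None, the failure tells you which step went wrong, and the first failing step short-circuits the rest.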
Another good example of using an enhanced
Option is Josh Suereth's (jsuereth)
Scala ARM library. It collects a list of errors, and the safe resource blocks proposed for Java 7 look primitive in comparison.
Option is not only Scala's to have
Actually, there's nothing specific about
Option that ties it to Scala. The only thing which is Scala specific is the syntax sugar of
for comprehensions. You can use
Option in Java if you want, although some of the examples above wouldn't be as concise and so it might be a bit of a pain. But this doesn't stop people from
trying to recreate Maybe in Java.
Many folks have even
tried to cheat and
use Java's enhanced for expression as
syntax sugar. So if Java can afford some syntax sugar over
Iterator, and according to Joshua Bloch the
for expression is a clear win, why shouldn't Scala do it? The difference is only that Scala's is applicable to a wider set of problems.
Why language syntax won't save you from the future
One advantage which some people don't realize Scala has is that it's a relatively minimal language with a relatively rich library. Apart from
Option, there are other examples where having a library instead of a language feature has brought huge benefits to Scala. One such example is actors.
Let's compare Scala's actors to Erlang's. Undoubtedly Erlang is the daddy of practical actor implementations. Actors in the Erlang virtual machine have some superb characteristics which other runtime implementations will have a hard time catching up with. They're scalable and lightweight. They work across hosts and in the same virtual machine. They can be hot swapped and you can create millions of them in a single virtual machine.
But there's only one type of actor. This means it must deal with all possible cases, and as usually happens, it deals better with some and worse with others. I have no doubt that, with actors as part of the language, choosing only one type of actor is a very sensible decision, but it can still be restrictive sometimes.
Scala deals with this differently. Scala's actors are not part of the language, and the actor message send syntax (which is borrowed from Erlang) is just a method invocation in a library. This means that Scala is free to evolve different actor implementations, and you're free to choose the one which suits your case better. Some are more full-featured, some are lightweight and performant; some are remote, some are local; some use thread pools, some use a single scheduler; some use managed hierarchies, some don't. When it comes to actors, Scala the language is smaller than Erlang, but the Scala libraries are richer.
Which actor library will win? I don't know. And probably neither do you. There might not be one best answer. That's why hardcoding stuff in the language is not a good way to prepare for the future. Only experience will: either the collective experience of the community or the extensive experience of a genius Benevolent Dictator For Life.