What I'll try to prove in this post is not that Scala is not a complicated language. There are certainly many languages which are simpler. The core features which are used reasonably often are indeed a simplification over Java. Scala also has features which are more complicated than what Java has. However, the complicated Scala features are more specialized at extending the language while the complexity of Java is usually imposed on everyone including the beginner.
This post will also not try to describe the language features in exhaustive detail- that's what the language specification tries to achieve, and the blog post is already long enough. I will assume that you know about the core language rules or can easily look them up.
What is complexity? Many conflate it with unreadability, some say it's the opposite to ease of use. Let's start with the following definition of complexity: it's the many special exceptions (pun not intended) to the rules, making the whole system difficult to understand and remember.
Based on that definition, let's run a comparison of language features of Java and Scala.
Keywords
Java has more reserved words than Scala. Even when we remove the words of primitive types (boolean, int, float, etc.) Scala still has less keywords!
- Scala misses some of Java's control structures Yes, continue and break (well, at least until Scala 2.8) are not part of the language, as they are not deemed a best-practice way to solve programming problems.
- if/then/else in Scala returns a value, thus eliminating the need for the only ternary operator in Java, "?:", the use of which has long been discouraged.
- for loop Java folks discovered late in the game that the enhanced for loop is much less complicated to use in cases when you don't need the counter index. And they were right- it's one more detail which you (or at least newbies) can get wrong. But why stop there- Scala has a much more universal for loop, and there aren't two different syntaxes as in Java.
- Scala keywords not in Java one might argue that the override keyword in Scala complicates things as you might do the same thing in Java with an @Override annotation. That's not quite the case, as you still might override a method by accident and forget to put the annotation (as it's not mandatory), and then the compiler will not give as much as a warning! So that's one more special case you need to worry about and keep in your head. When you start using traits, you definitely start to appreciate that override is a keyword.
Access modifiers
Java has four access modifiers: default (package), private, public and protected. Scala has three, but it can combine them with a scope for protection. This flexibility allows you to define access control, for which Java has no analogs. They are also readable because they look self explanatory. For instance, if you have the class org.apache.esme.actor.UserActor, these are the Scala equivalents for Java's access modifiers:
- private[actor]
- Same as package access in Java
- private[UserActor]
- Same as private in Java
Scala's default access is public (yay, one less keyword!).
On the other hand, by defining the scope, Scala allows visibility types, which Java doesn't have:
- private[this]
- the member is accessible only from this instance. Not even other instances from the same class can access it
- private[esme]
- access for a package and subpackages. How many times did you have to use clumsy workarounds in Java because subpackages couldn't access the parent package?
- protected
- only subclasses can access the field. This is more consistent than Java's protected keyword. In Java, you have to remember that both subclasses and classes in the same package can access the field. How's that for remembering arbitrary rules?
- private
- This will not allow you to access inner objects. This is also more consistent than Java's private access, which is perhaps indicative of the fact that inner objects in Java were bolted on after the first version of the language was created
Namespaces
Java has four namespaces (fields, methods, packages, types), Scala has two (one for types, one for everything else). This has important implications when overriding. In Scala, you can start with a parameterless def (method definition) and then override with a val (immutable variable). This helps enforce the uniform access principle: whether you access a field or a method, this doesn't restrict you later on, because the syntax is the same everywhere you access it.
One thing which you cannot do is define a variable and a method with the same method. The "Programming in Scala" book explicitly mentions that allowing this would be a code smell because of possible code duplication and the ambiguity which could arise.
Types and the type system
Scala's type system has been criticized for being too complex, but let's have a look at Java's type system.
Primitive types
There are many exceptions in Java's type system, which make the language hard to use and evolve. Java has primitive types, while in Scala everything is an object, making this valid code:
(1 to 3) foreach println
Java's primitive types make a lot of things harder- before Java 5, manual wrapping in collections, e.g. int in Integer, boolean in Boolean, etc. was a pain. After auto-boxing and unboxing came out, the syntax is cleaner, but the overhead remains, as well as some very subtle bugs. For example, auto-unboxing a null wrapper type will cause a NullPointerException.
There is a lot of code duplication because of primitive types- you always have to specify special cases and can't be truly generic.
Generics
Scala has generics done right. For instance, you can define covariance at the declaration site, whereas Java requires you to do this at the call site. This, combined with Scala's type inference allows one to use generified libraries without having to know or define the complete type signature.
Java has yet another "special" type: arrays. As a result the rules for the underlying array type and the generified ArrayList are quite different and inconsistent. The type of arrays is checked at runtime, while the genericity of ArrayList is checked at compile-time. As a result, inappropriate assignment to an array element results in a ArrayStoreException only at runtime.
Constructors
Java initialization order in object construction is a pain to get right. The puzzlers on the certification exam use the most bizarre mix of static and instance initializer blocks and constructors calling or inheriting other constructors.
In Scala, any code which isn't part of a method or member variable declaration is executed as the primary constructor. You can define auxiliary constructors which call either the primary one or another auxiliary one defined in the same class before it. Can you come up with anything simpler?
Uniform syntax
Scala is sometimes accused of using too many symbols. Whatever you've seen, it's mostly not part of the language, but of the libraries. You can override them and even disable them. What does Java have to say about special symbols?
Arrays
Arrays in Java are accessed via square brackets. In Scala, parentheses are used, because accessing an array index is a method call (called apply). Don't worry though, the compiler optimizes this away.
Collection literals
Java has a lot of special syntax for array instantiation and will soon have ones for instantiating lists, sets and maps. Scala, again, creates these special collections using factory methods in the companion objects: Array(1,2,3), List(1,2,3), Map((1,2)). Lists can also be created using the cons "operator", but here's the trick: it's not actually an operator. It's a method, appending to the list. You can also create a map using the arrow tuple syntax: Map(1->2). And again, this is not "special" syntax, which is part of the language- it's a method, constructing a tuple!
Now someone might smirk and think: "Ha, gotcha! Do you mean to say that simply because you've pushed the complexity out of the language and into the libraries you don't need to deal with it?". True, but let's have a look at Java and its ever growing standard libraries. It has AWT and Swing. It has old I/O and new I/O (gosh, which one do I use?). It has RMI (do you remember RMI?) and OMG, it even has CORBA. These libraries will never die. Methods in Thread have been deprecated for ages. There's also no sign that the ill-conceived Date/Calendar classes will ever be removed, but you still must know JodaTime if you hope to get any job with dates done.
More importantly, extending the language easily helps abstract away the details and evolve the language without creating tons of special cases. As per our definition, special cases add up to increase complexity. We'll explore the topic of extending the language in the followup post.