Speaking my (programming) language?

Sunday, July 31, 2011

Partially unexpected effects of chaining partial functions in Scala

People who learn Scala usually agree that pattern matching is a great feature which helps make your code more expressive. Some time later they also discover that the partial functions used in a match statement can be used separately as well. And since partial functions are full-blown functions, you can combine them in a couple of useful ways:

f1 orElse f2: Combines two partial functions into a new one; if f1 is not defined at its argument, tries f2.
f1 andThen f2: Applies f2 to the result of f1. This means that the output of f1 must be of a type compatible with the input of f2.
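
To make the two combinators concrete, here is a minimal sketch. The names small, even, smallOrEven and shout are made up for illustration and are not part of any library:

```scala
object Combinators {
  // Two hypothetical partial functions over Int
  val small: PartialFunction[Int, String] = { case i if i < 10 => "small " + i }
  val even: PartialFunction[Int, String]  = { case i if i % 2 == 0 => "even " + i }

  // orElse: try `small` first and fall back to `even`
  val smallOrEven: PartialFunction[Int, String] = small orElse even

  // andThen: feed the result of `smallOrEven` into an ordinary function
  val shout: PartialFunction[Int, String] =
    smallOrEven.andThen((s: String) => s.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(smallOrEven(4))            // small 4 (the first match wins)
    println(smallOrEven(12))           // even 12
    println(smallOrEven.isDefinedAt(11)) // false
    println(shout(12))                 // EVEN 12
  }
}
```

Note that 4 is both small and even, but only the first function in the chain gets to handle it.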

orElse

As others have discovered, you can combine a list (TraversableOnce really) of partial functions into one with reduce. What's not so obvious, though, is that the way you combine them can have unexpected performance consequences.

In order to easily create a lot of partial functions to test, we will create a higher-order function to generate them (if you're used to Java, you can call it a factory). The produced partial function prints a short message every time its guard is evaluated, which happens on each isDefinedAt call and once more when the function is actually applied:


type PF = PartialFunction[Int, Int]
def genPF(defined: Int): PF = { case i if {println(defined); defined == i} => i }
val f123 = 1 to 3 map genPF reduceLeft(_ orElse _)

Let's try it:


> f123(1)
1
1
1
> f123(2)
1
2
1
2
> f123(3)
1
2
3

Wait, what? A single call can evaluate the guards up to 4 times, even though only three functions are chained. It gets even worse with a bigger number of composed functions.


val f1to5 = 1 to 5 map genPF reduceLeft(_ orElse _)


> f1to5(2)
1
2
1
2
1
2
1
2
> f1to5(3)
1
2
3
1
2
3
1
2
3
> f1to5(4)
1
2
3
4
1
2
3
4

Let's take a closer look at the definition of isDefinedAt and apply of the function created with orElse:


def orElse[A1 <: A, B1 >: B](that: PartialFunction[A1, B1]): PartialFunction[A1, B1] =
  new PartialFunction[A1, B1] {
    def isDefinedAt(x: A1): Boolean =
      PartialFunction.this.isDefinedAt(x) || that.isDefinedAt(x)
    def apply(x: A1): B1 =
      if (PartialFunction.this.isDefinedAt(x)) PartialFunction.this.apply(x)
      else that.apply(x)
  }

Applying a function created with orElse checks f1.isDefinedAt to decide which branch to call. If the caller has already invoked isDefinedAt on the composite (as an enclosing orElse does), f1 gets checked twice: once to find out whether either f1 or f2 is defined, and again to decide which one to call.

Given this, we can explain what happens here. Each isDefinedAt delegates to the isDefinedAt methods of the functions it composes, and each apply calls isDefinedAt once more before delegating, so with reduceLeft the checks are repeated at every level of nesting. Counting the printed lines shows that the guards are evaluated k * (n - k + 1) times, where n is the number of composed functions and k is the position of the first function that matches. For example, f1to5(3) prints 3 * (5 - 3 + 1) = 9 lines.

Luckily, there is an easy solution to combine partial functions in a more efficient way. We can use reduceRight, where isDefinedAt for each composed function is invoked at most twice. Verifying this and finding out why is left as an exercise for the curious reader (as you undoubtedly are, since you're reading this).
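
For readers who would rather measure than derive, here is a sketch that counts guard evaluations instead of printing them. It re-implements the orElse definition quoted above as a helper called naiveOrElse (my name, not the library's), so the numbers do not depend on how your Scala version implements orElse. The leaf functions are written out by hand to mimic the guarded case from genPF: the guard runs in isDefinedAt and once more in apply.

```scala
object OrElseCost {
  type PF = PartialFunction[Int, Int]

  var guardChecks = 0 // total number of guard evaluations so far

  // Hand-written equivalent of the guarded case used earlier
  def genPF(defined: Int): PF = new PartialFunction[Int, Int] {
    private def guard(x: Int): Boolean = { guardChecks += 1; x == defined }
    def isDefinedAt(x: Int): Boolean = guard(x)
    def apply(x: Int): Int = if (guard(x)) x else throw new MatchError(x)
  }

  // Same logic as the orElse definition quoted above
  def naiveOrElse(f1: PF, f2: PF): PF = new PartialFunction[Int, Int] {
    def isDefinedAt(x: Int): Boolean = f1.isDefinedAt(x) || f2.isDefinedAt(x)
    def apply(x: Int): Int = if (f1.isDefinedAt(x)) f1.apply(x) else f2.apply(x)
  }

  // Applies f to arg and returns how many guard evaluations that took
  def count(f: PF, arg: Int): Int = { guardChecks = 0; f(arg); guardChecks }

  def main(args: Array[String]): Unit = {
    val n = 5
    val left  = (1 to n).map(genPF).reduceLeft(naiveOrElse(_, _))
    val right = (1 to n).map(genPF).reduceRight(naiveOrElse(_, _))
    for (k <- 1 to n)
      println("k=" + k + " left=" + count(left, k) + " right=" + count(right, k))
  }
}
```

With these definitions the left-nested chain costs k * (n - k + 1) guard evaluations, while the right-nested one costs at most k + 1, because every function is asked at most once whether it is defined before one of them is applied.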

andThen

You would think that f1 andThen f2 should be defined only in the cases when the results of f1 are defined in f2:


val doubler: PartialFunction[Int,Int] = { case i if Set(1,2) contains i => i * 2 }
val f = doubler andThen doubler

Of course, that's not how it works. In order to find out if the output of f1 is a valid input for f2, we would need to execute the function, and it's better not to do this in case it has side effects. This means that we cannot rely on calling isDefinedAt for the combined function to avoid MatchErrors:


> f isDefinedAt 2
res3: Boolean = true
> f(2)
scala.MatchError: 4
...
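
If the first step has no side effects, one possible workaround is to give up on isDefinedAt and work with Option instead. The sketch below uses the standard lift method, which turns a partial function into a function returning Option. The name safeTwice is made up for this example:

```scala
object SafeAndThen {
  val doubler: PartialFunction[Int, Int] = { case i if Set(1, 2) contains i => i * 2 }

  // Lift both steps to Int => Option[Int] and chain them with flatMap, so the
  // second step only runs on a value that the first step actually produced.
  // Note that this does execute the first step.
  def safeTwice(x: Int): Option[Int] = doubler.lift(x).flatMap(doubler.lift)

  def main(args: Array[String]): Unit = {
    println(safeTwice(1)) // Some(4)
    println(safeTwice(2)) // None, because 4 is outside the domain of the second step
    println(safeTwice(3)) // None, because 3 is outside the domain of the first step
  }
}
```

The price is that the caller gets an Option back rather than a PartialFunction, so this is a different trade-off, not a drop-in replacement for andThen.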

Conclusion: when you're looking for performance, it always helps to understand how the abstractions you're using decompose into simpler building blocks.

Friday, June 17, 2011

Testing actors in Scala

Probably the most frequent question that people asked at Scala eXchange 2011 was how to test actors. Since I've long planned to write a blog post on this topic, this was an indication that it's high time to get it done.

It seems that the main problem people have is that actors are asynchronous and this introduces non-determinism in their tests. How do you know when to check if the actor has received and processed the message? Do you just wait a certain number of seconds before checking? This would make tests unnecessarily slow.

Another problem I think folks have is that they don't know how to verify that an actor has sent a message to another actor. Developers are familiar with mocking objects to verify that a method has been called, but how do you mock an actor to verify it has received a certain message?

Finally, it is difficult for most to handle the fact that it's not easy (or at least not idiomatic) to check the actor's internal state. This is especially true of Akka actors, which are created using a factory method, so you can't define methods for mucking with the actor's internals.

Let me first address the last one. How do you check that the internal state of an actor has changed? You don't! I find that actors follow the object-oriented principle of encapsulation even better than objects do. Relying on a certain internal state couples your tests unnecessarily to the implementation and makes them brittle. As Viktor Klang has pointed out, if the internal state of an actor cannot be observed outside the actor, does it really matter what it is?

So how do you know when an actor has processed a message which was sent asynchronously? An easy way to eliminate non-determinism is to define a method where the functionality is located and call that upon receiving a message:


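import scala.actors.Actor

object MyActor {
  // Plain method: can be tested without starting an actor
  def computation(arg1: String, arg2: Int) = {
    ...
    result
  }
}

class MyActor extends Actor {
  // Companion members are not in scope automatically
  import MyActor.computation

  def act() {
    loop {
      react {
        case Message(arg1, arg2) =>
          anotherActor ! computation(arg1, arg2)
      }
    }
  }
}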

Then you can test the method the way you are familiar with. Putting the method in the companion object means that you can't test the internal state, but this also means this approach should work with Akka actors as well. We also avoid the problem of checking if the next actor in the chain has received the result.
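
As a tiny illustration of what such a test can look like, here is a sketch with a made-up computation (the object name Greeter and its logic are purely hypothetical). The point is only that no actor needs to be started to check it:

```scala
object Greeter {
  // Hypothetical computation; in real code this is whatever the actor
  // does with the contents of a message
  def computation(name: String, times: Int): String = (name + " ") * times

  def main(args: Array[String]): Unit = {
    // A plain, synchronous assertion: no mailbox, no timeout, no waiting
    assert(computation("hi", 2) == "hi hi ")
    println("computation behaves as expected")
  }
}
```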

Sometimes you cannot test a helper method and sometimes testing the method is not enough. In these cases you want to verify explicitly that an actor has sent a message to another actor as a result of receiving a certain trigger message. Here's another approach we use at Apache ESME, which works very well:


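case object Wait

class ConductorActor extends Actor {
  def act {
    react {
      // Do nothing until the test signals that it has triggered the sender...
      case Wait => reply {
        // ...then block until the expected message shows up and reply with it
        receive {
          case MessageReceived(msg, reason) => msg
        }
      }
    }
  }
}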

What's going on here? We define a helper actor (the recipient) which is the one supposed to receive the message from the actor we want to test (the sender). Usually the sender we want to test doesn't send a message to a hardcoded recipient; it is a good idea to either inject it as a construction parameter at instantiation time or register it via a message representing a subscription request.

This actor uses a fairly rarely used nested syntax, which is only available with the actors in the standard Scala library. The recipient handler would reply synchronously (which is what we want) with the message received only after we give it the signal that we've already sent a message to the sender. This implementation also relies on the fact that unhandled messages are kept in the inbox if there's no handler for them. While this can lead to memory leaks if these messages don't get handled, it is a nice way to process out-of-order messages, which is something we take advantage of here. This is similar to selective receive in Erlang and is a fairly painless way to handle race conditions: it doesn't matter which message has been received first here.

These features are not present in Akka, but you could emulate nested handlers using become and keep unhandled messages using. An even better idea would be to use the built-in TestKit or the akka-expect project, which use the same technique in a not so ad-hoc manner (but AFAIK don't work for non-Akka actors).

So now the only thing we need to do is send the trigger message to the tested sender and then ask the recipient if the resulting message has been sent by the sender:


// wait till the message appears in the timeline
// or fail after 5 seconds
val msgReceived = conductor !? (5000L, Wait)

if (msgReceived.isEmpty) fail("no message received")

If the recipient gets the message within a certain timeframe, the test is successful, otherwise we time out and fail the test. The nice thing about this approach is that in the happy path case, the test can continue immediately without slowing down the test suite. The test is slowed down by the designated timeout only when the test is going to fail, but this should be an exceptional event.
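
The same "return immediately on success, wait for the timeout only on failure" behaviour can be sketched without any actor library at all. The Probe class below is a hypothetical stand-in for the conductor, built on a JDK blocking queue. It is not how ESME does it, just an illustration of the timing behaviour:

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

object ProbeSketch {
  // Hypothetical recipient: it only records what it is sent
  final class Probe[A] {
    private val inbox = new LinkedBlockingQueue[A]()

    def !(msg: A): Unit = inbox.put(msg)

    // Returns as soon as a message is available, or None once the timeout expires
    def expect(timeoutMillis: Long): Option[A] =
      Option(inbox.poll(timeoutMillis, TimeUnit.MILLISECONDS))
  }

  def main(args: Array[String]): Unit = {
    val probe = new Probe[String]

    // Hypothetical sender under test, replying asynchronously from another thread
    val sender = new Thread(new Runnable {
      def run(): Unit = probe ! "hello"
    })
    sender.start()

    println(probe.expect(5000L)) // returns as soon as the message arrives
    println(probe.expect(100L))  // nothing else was sent, so this waits and gives up
  }
}
```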

A minor inconvenience arises when the sender expects the recipient to be not a Scala library actor but, for example, a Lift actor. This is easily overcome with a bridge actor, which acts only as an intermediary and forwards the message to the designated recipient actor:


class BridgeActor(receiver: Actor) extends LiftActor {

  protected def messageHandler = {
    case nm @ MessageReceived(_, _) => receiver ! nm

  }
}

val liftActor = new BridgeActor(conductor)

Distributor ! Listen(theUser.id.is, liftActor)
Distributor ! Listen(followerUser.id.is, liftActor)

Here we're injecting one actor as a construction parameter and registering another via the Listen message.

Further research

If you're using Akka, your best bet is the recommended TestKit. Here's an article on how to use it:

http://roestenburg.agilesquad.com/2011/02/unit-testing-akka-actors-with-testkit_12.html

Another solution would be to use the akka-expect framework:

https://github.com/joda/akka-expect

A more universal library is Awaitility, which uses a similar solution, but with more general applicability to Java threads and Scala actors:

http://code.google.com/p/awaitility/

ScalaTest and Specs also have the conductor actor, which implements a similar idea of using a CountDownLatch to make actors deterministic:

http://www.scalatest.org/scaladoc/doc-1.0/org/scalatest/concurrent/Conductor.html

Thursday, October 21, 2010

Why Scala's Option won't save you from lack of experience

Some time ago Cedric Beust caused some excitement in the Scala community by declaring that he doesn't see the advantages of using Scala's Option type, which is similar to Haskell's Maybe type.

There were a lot of insightful comments which outlined the benefits of using Option. James Iry has written a well-reasoned post called Why Scala's "Option" and Haskell's "Maybe" types will save you from null.

I wanted to approach things differently. I wanted to show people some patterns which usually come with experience and let them decide which is better. That's why this turned into a very long post which looks more like a tutorial. Of course, I'm not adding anything new to the discussion, but just summarizing some of the common wisdom accumulated by the community. I'm sure this is not the last typical blog post showing the wonders of Option, but I hope it can clear up some (Some?) misunderstandings. Or Maybe not.

But more importantly, I wanted to explain why having a solution as a language feature is a premature optimization. It's neither as flexible, nor as powerful as having it as just a type in the library.

NullPointerException

Cedric is definitely not alone: programmers who decide to give Scala a try (and more so Haskell or the ML family) face conceptual differences from popular imperative languages. Just reading that Option is a type wrapper does not mean it's easy to wrap one's head around it.

One advantage of using Scala is that if you are convinced that NullPointerExceptions solve your problem better, you are free to use them. Option is just one option. And of course, you can come back later any time if you make up your mind that Option has Some advantages (e.g. composability). Of course, some might view having too many choices as a disadvantage.

But both Scala and Clojure must live with the design decisions on the JVM which were taken before them, and with interoperating with a wealth of existing libraries. So allowing null is first of all a practical decision.

Getting started

First of all, let's define a simplistic employee type, a list of employees, and a map where the "builder" occupation points to the list of employees.


case class Person(name: String, email: String)
val employees = List(Person("bob", "bob@builder.com"))
val occupations = Map("builder" -> employees)

Compile-time safety and backward compatibility

One of the examples given by Cedric is pattern matching on an Option type. His main point of contention is that this looks very much like testing for null. Except one minor point: where is the example similar to the case when you don't want to test for null?

Exactly. No such valid example exists if you want to use the value inside Option, at least not unless you're explicit about it. As Paul Snively mentions, the compiler will stop you. Cedric has noticed, "the worst part about this example is that it forces me to deal with the null case right here". But this is not the worst part, it's maybe the best part.

Do you remember a feature which was added to Java 5, which was intended to save you from another type of exception, ClassCastException? Of course, that would be generics! The problem is, it gives you type-safety, but only as long as you use classes compiled with the Java 5 compiler and you don't use the escape hatch of raw types. As soon as you start using legacy code, you leave the safety of compiler checked code. You can ignore the warnings at your own risk, because then there's no guarantee you won't get a ClassCastException.

Does this remind you of something? Compile-time safety as long as you don't use legacy code or the escape hatch? These restrictions sound exactly like the ones Option has.

And of course, there is an escape hatch. You can use Option.get or, instead of pattern matching, you can even use your old friend the if statement (Scala veterans, please close your eyes now):


val evangelists = occupations.get("evangelist")
// ugly, ugly, ugly
if (evangelists == None)
  println("No such occupation here!")
else
  println("Found occupation " + evangelists.get)

But, as you'll see later, this doesn't mean that you have to deal with the "no value" case right here. Pattern matching is not the only option.

Simplification through a generalization

Instead of solving the most obvious problem, it always pays to see if it isn't a manifestation of a bigger class of problems. Having a related class of problems solved by a common pattern simplifies things. There are fewer rules to remember. Not only that, but the specific applications of the general solution begin to interact in ways you couldn't have anticipated before. Eventually problems will appear which would be solved by a general solution, problems which you didn't know about when you implemented the solution.

As it turns out, the problem that the safe dereference operator addresses can be solved in Scala. I would say that, given that there is no explicit syntax for it, it's a fairly elegant solution, but this is a subjective opinion.

Handling both value and lack of value and stop processing

This is handled by our well-known pattern match. It seems easy to use and obvious in what it does.

The advantage of pattern matching is that it uses the type system in such a way that forgetting to handle one of the cases explicitly will result in a compile time warning.


occupations.get("builder") match {
        case Some(_) => println("builder occupation exists")
        // oops, forgot to check for None or the catch-all _
}
// warning: match is not exhaustive!
// missing combination           None

The disadvantage is that pattern matching doesn't compose very elegantly. If the result of the pattern match is just an intermediate step, you'll need to add another one, and another, and pattern matching does take some screen real estate.

Pattern matching is a bit like exception handling with try/catch blocks: you usually do it when you're interested in both the normal behaviour and the exceptional behaviour, and that's fairly verbose. On a related note, did you know that you can use pattern matching in Scala's exception handlers?
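
In case you haven't seen it, a catch block in Scala is just a list of cases. Here is a minimal sketch. The helper parsePort and its default value are invented for the example:

```scala
object CatchPatterns {
  // Hypothetical helper: parse a port number, falling back to a default
  def parsePort(raw: String, default: Int = 8080): Int =
    try {
      raw.trim.toInt
    } catch {
      // The handler is an ordinary list of cases, tried from top to bottom
      case _: NumberFormatException => default
    }

  def main(args: Array[String]): Unit = {
    println(parsePort("9000"))       // 9000
    println(parsePort("not a port")) // 8080
  }
}
```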

So let's see what we can do to get more composable data processing.

Transform value

When we're interested in creating a series of steps for processing a value, we can use map. It will transform the value if it's there, but will leave it inside the Option. And map won't change the Option if it's empty (None.map will result in None).


val employee = employees find ( _.name == "bob" )
// Some(Person(bob,bob@builder.com))
employee map ( _.email )
// Some(bob@builder.com)

If you need to "flatten" the result you can use flatMap. This means that instead of an Option nested inside an Option, you get just one Option. The result is Some (a "full" Option) only if flatMap is called on Some and the function passed to it also returns Some:


val builders = occupations.get("builder")
// Some(List(Person(bob,bob@builder.com)))
val bobTheBuilder = builders flatMap { _ find ( _.name == "bob" ) }
// Some(Person(bob,bob@builder.com))

If you were using just map, you would get Some(Some(Person(bob,bob@builder.com))), which is probably a bit too nested for your taste.

Some of you are probably familiar with other languages which have map (like Ruby or Python) and are scratching their heads: "Wait, wasn't map defined only for lists/Enumerables?". Please be patient.

Only get the value if it satisfies a test

If you find only some of the possible values useful, you can weed out what you have by using filter. It will only result in Some for values which satisfy a certain condition (called a predicate).


bobTheBuilder filter { _.email endsWith "builder.com"}
// Some(Person(bob,bob@builder.com))

I'm sure at this point the folks who have used Google Collections have also joined the folks with past Ruby or Python experience screaming: "Hey, but filter is only used for Collections!"

Transform lack of value

That's fine, but eventually you want to get the value out. If there's no value, just assume some default value. We have to use pattern matching again, right?

But there is a shorter solution. getOrElse extracts the value or puts a default value of the same type if there's nothing in the Option container:


val larryWho = employees find ( _.name == "larry" )
// None
val emptyEmail = larryWho map ( _.email )
// None
emptyEmail.getOrElse("nobody@nowhere.com")
// nobody@nowhere.com

Groovy has this in the form of the Elvis operator. The trouble is, you can't get rid of the Elvis operator once it's there: it adds cruft to the language, even though it's useful cruft. It's also somewhat restrictive that this is all the operator can do.

Chain, chain, chain

The reason map, flatMap, filter and getOrElse are so useful is that they can be chained together, intermixed and the results can be passed around to other methods.

Let's shift to high gear and put it all together:


occupations.get("builder").
flatMap { _ find ( _.name == "bob" ) }.
map (_.email).
filter { _ endsWith "builder.com"}.
getOrElse("nobody@nowhere.com")
// bob@builder.com

If we're not yet interested in which step processing has failed, this is a clear way to express the process flow. It's also similar to the Fantom example Cedric described.

There is an even shorter syntax for this using for expressions (or for comprehensions).


{for (builders <- occupations.get("builder");
      bobTheBuilder <- builders find (_.name == "bob");
      email = bobTheBuilder.email if email endsWith "builder.com"
      ) yield email
} getOrElse "nobody@nowhere.com"
// bob@builder.com
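
The two styles are interchangeable because a for expression is, roughly speaking, rewritten by the compiler into calls to flatMap, map and a filtering method. The sketch below puts a for expression next to a hand-written chain so you can compare them. It redefines the sample data so that it can be run on its own:

```scala
object ForDesugar {
  case class Person(name: String, email: String)
  val occupations = Map("builder" -> List(Person("bob", "bob@builder.com")))

  // The for expression...
  val viaFor: Option[String] =
    for {
      builders <- occupations.get("builder")
      bob      <- builders.find(_.name == "bob")
      email     = bob.email
      if email.endsWith("builder.com")
    } yield email

  // ...and a hand-written chain that produces the same result
  val viaCalls: Option[String] =
    occupations.get("builder")
      .flatMap(builders => builders.find(_.name == "bob"))
      .map(_.email)
      .filter(_.endsWith("builder.com"))

  def main(args: Array[String]): Unit = {
    println(viaFor)   // Some(bob@builder.com)
    println(viaCalls) // Some(bob@builder.com)
  }
}
```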

But wait, weren't for expressions a way to loop over stuff? Well, yes, this too. More generally, for expressions work with collections. And the beauty of it all is that we can use collections together with Option and do nested invocations. Let's modify the example a bit and suppose that there might be more than one person named Bob and we want them all.


for (builders <- occupations.get("builder") toList;
     bobTheBuilder <- builders if bobTheBuilder.name == "bob";
     email = bobTheBuilder.email if email endsWith "builder.com"
     ) yield email
// List(bob@builder.com)

This works because, for all practical purposes, Option behaves like a specialized collection. By viewing it as one, you reuse the experience of all the programmers using Groovy, Ruby, Python, Google Collections and whatnot, and flatten the learning curve.

Now imagine that the only way to work with a collection is to pattern match it. Would you use it? Yeah, me neither.

Safe invoke and composability

Now let's see how filter works in Fantom:


fansh> list := [1, 2, null]
fansh> list.findAll |v| { v.isEven }
sys::NullErr: java.lang.NullPointerException

Oh crap, then I need to use the safe invoke operator:


fansh> list.findAll |v| { v?.isEven }
ERROR(20): Cannot return 'sys::Bool?' as 'sys::Bool'

But it all results in a compile-time error. It's the same story with the reduce higher-order function:


fansh> list.reduce(0) |r, v| { v + r }
sys::NullErr: java.lang.NullPointerException

So I can't practically use the safe invoke operator on nullable collections with filter/reduce, a limitation which the Fantom documentation page conveniently omits. So we're back to checking for null the old way. This means that, unlike flatMap, the safe invoke operator works fine when you chain, but not when you compose.

Iterator

Let's now see some other advantages of Option behaving like a collection. For instance, it lets you use Iterable's API in some elegant ways:


val noVal: Option[Int] = None
val someVal = Some(4)
List(1,2,3) ++ someVal ++ noVal
// List(1, 2, 3, 4)

Guess what happens here? Only the numbers contained in Some are added to the list. I think there's no operator for this in Fantom and Groovy, and it would be overkill to include one, too.
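
Two more everyday cases of the same idea, as a small sketch (the helper parse is invented for the example): flatten throws away the empty Options in a list, and flatMap does the same for a function that returns Option.

```scala
object OptionAsCollection {
  val maybeNumbers: List[Option[Int]] = List(Some(1), None, Some(3))

  // flatten drops every None and unwraps every Some
  val present: List[Int] = maybeNumbers.flatten

  // Hypothetical helper: None when the string is not a number
  def parse(s: String): Option[Int] =
    try Some(s.toInt) catch { case _: NumberFormatException => None }

  // flatMap keeps only the inputs for which parse produced a value
  val parsed: List[Int] = List("1", "x", "3").flatMap(s => parse(s))

  def main(args: Array[String]): Unit = {
    println(present) // List(1, 3)
    println(parsed)  // List(1, 3)
  }
}
```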

One size doesn't fit all

Wait, if Option is like a collection, does this mean that there are many types of Option? Does this mean I can create my own Option?

Yes, and yes. Just as there isn't just one type of List, or one type of Map, there can also be several types of Option. For instance, Lift defines its own, which it currently calls Box (I think it's a great metaphor). One of the things Box has in addition to Option is a type to collect Failures. It's no longer just a "dunno what happened, something failed along the way". It's a list of error messages which can pinpoint exactly what went wrong. This is invaluable for a web framework, because when a user expects a complex form to be validated, a simple "some of our input is wrong" just won't cut it.


for {
  id <- S.param("id") ?~ "id param missing" ~> 401
  u <- User.find(id) ?~ "User not found"
} yield u.toXml
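
If you don't use Lift, you can get a feel for the idea with the standard library's Either, which carries a reason on its failure side. This is only a rough analogy, not Lift's Box. It assumes Scala 2.12 or later, where Either can be used directly in for expressions, and the users map and the findUserXml helper are made up:

```scala
object FailureSketch {
  // Hypothetical user store keyed by id
  val users = Map("1" -> "bob")

  // Each step turns "no value" into a Left carrying an explanation
  def findUserXml(params: Map[String, String]): Either[String, String] =
    for {
      id <- params.get("id").toRight("id param missing")
      u  <- users.get(id).toRight("User not found")
    } yield "<user>" + u + "</user>"

  def main(args: Array[String]): Unit = {
    println(findUserXml(Map("id" -> "1"))) // Right(<user>bob</user>)
    println(findUserXml(Map.empty))        // Left(id param missing)
    println(findUserXml(Map("id" -> "2"))) // Left(User not found)
  }
}
```

Unlike a bare None, the Left tells you which step gave up.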

And guess what, you can also define your own operators, which also work in for comprehensions.


for {
  login <- get("/account/verify_credentials.xml", httpClient, Nil)
    !@ "Failed to log in"
  message <- login.post("/statuses/update.xml", "status" -> "test_msg1")
    !@ "Couldn't post message"
  xml <- message.xml
} yield xml

Except that they're not operators. When conventions and patterns evolve, Lift folks can always change the "operator". Or another framework can do it.

Another good example of using an enhanced Option is Josh Suereth's (jsuereth) Scala ARM library. It collects a list of errors, and the safe resource blocks proposed for Java 7 look primitive in comparison.

Option is not only Scala's to have

Actually, there's nothing specific about Option that ties it to Scala. The only thing which is Scala specific is the syntax sugar of for comprehensions. You can use Option in Java if you want, although some of the examples above wouldn't be as concise and so it might be a bit of a pain. But this doesn't stop people from trying to recreate Maybe in Java.

Many folks have even tried to cheat and use Java's enhanced for expression as syntax sugar. So if Java can afford some syntax sugar over Iterator, and according to Joshua Bloch the for expression is a clear win, why shouldn't Scala do it? The difference is only that Scala's is applicable to a wider set of problems.

Why language syntax won't save you from the future

One advantage which some people don't realize Scala has is that it's a relatively minimal language with a relatively rich library. Apart from Option, there are other examples where having a library instead of a language feature has brought huge benefits to Scala. One such example is actors.

Let's compare Scala's actors to Erlang's. Undoubtedly Erlang is the daddy of practical actor implementations. Actors in the Erlang virtual machine have some superb characteristics which other runtime implementations will have a hard time catching up with. They're scalable and lightweight. They work across hosts and in the same virtual machine. They can be hot swapped and you can create millions of them in a single virtual machine.

But there's only one type of actor. This means it must deal with all possible cases, and as it usually happens, it deals better with some and worse with others. I have no doubt that having actors as part of the language, choosing only one type of actor is a very sensible decision, but it can still be restrictive sometimes.

Scala deals with this differently. Scala's actors are not part of the language, and the actor message send syntax (which is borrowed from Erlang) is just a method invocation in a library. This means that Scala is free to evolve different actor implementations, and you're free to choose the one which suits your case better. Some are more full-featured, some are lightweight and performant; some are remote, some are local; some use thread pools, some use a single scheduler; some use managed hierarchies, some don't. Regarding actors, Scala the language is smaller than Erlang, but the Scala libraries are richer.

Which actor library will win? I don't know. And probably neither do you. There might not be one best answer. That's why hardcoding stuff in the language is not a good way to prepare for the future. Only experience will, either the collective experience of the community or the extensive experience of a genius Benevolent Dictator For Life.

Saturday, September 11, 2010

Top 5 underused GNU screen features

Most people use GNU screen mainly for its ability to detach from the terminal and create multiple shell sessions. This makes it ideal in combination with remote terminal connections like ssh. There are some features which, although not very popular, are still useful on a number of occasions.

As a bonus, let's start with a short explanation of the popular features. If you're familiar with detaching and multiplexing, you might want to skip to the sections about more rarely used features.

Detaching

The first thing to know about screen is that all shortcuts start with Ctrl + A by default.

The most common screen workflow is:

1. Connect to a remote system via ssh.

2. Start/attach screen:
```
screen -D -RR
```
This command reattaches to an existing screen session (detaching it from any other terminal first) and creates a new one if none exists.

3. Work.

4. Detach screen by pressing Ctrl + A, then D.

5. Disconnect.

6. Take a break, return to 1 :)

You can find the currently running screen sessions using the command:


screen -ls

Alternative: nohup is also used when you want to detach a program when you disconnect, but it will never attach to the terminal, so you cannot use it if it's interactive.

Multiplexing

When you connect via ssh, one shell is rarely enough. Connecting a second time is often inconvenient and slow. Enter screen: you can create a new window using Ctrl + A, then C. You switch to the next one using Ctrl + A, then space (or n). You switch to the previous one using Ctrl + A, then backspace (or p). Using Ctrl + A and then a digit, you can jump directly to the window with that number (0-9).

Alternative: using ssh's option ControlMaster will reuse an existing connection and open another shell immediately, but this is only relevant for ssh.

Now that you know the basics, let's see what other goodies screen offers.

1. Sharing sessions

Typing in a terminal is usually a lonely experience. Sometimes you wish that you could show someone else what you're doing or even do it together. It might be because you want to teach someone some UNIX tricks, or you must solve an issue together or you're up to the challenge of remote pair programming. Your wish is granted! In screen, you can make it so that your keyboards and monitors are in control of a single session.

The screen host should do the following:

Ctrl + A then type:
```
:multiuser on
```

Ctrl + A then type:
```
:acladd guest
```
where guest is the name of the user you want to let in your session

Then the screen guest can join using the following command (which will work if there's a single multi-user session open):


screen -x username/

Alternative: VNC can be used only for graphical remote connection sharing.

2. Copy and paste

It's not always easy to copy and paste text in a terminal. For one, if you're on a getty console (the black screen with the login prompt and no graphics), you can't even use your mouse (usually). If you have a crappy X terminal, you might have the problem that wrapped lines are cut by newlines or it's hard to select the output if it spans more than a single screen... Whatever it is, with screen you can cut like a pro without even using the mouse.

Ctrl + A, then pressing "[" will enter copy mode. When in copy mode, you navigate around using the vi key shortcuts (you know the vi shortcuts, otherwise you wouldn't be reading about working in terminal sessions, right?).

Press space once to mark the beginning and a second time to mark the end of the snippet you want to cut.

Ctrl + A, then "]" will paste the copied text.

Did I mention screen can also copy rectangular blocks of text? Reading about it in the manual is left as an exercise for the curious reader.

Alternative: X-Server's select/middle-click can be used- only if you're running in an X session, though.

3. Log output

We humans are not good at remembering stuff, that's what computers are for. To record exactly what you have typed and what was the output, start logging using Ctrl + A, then H, and the same sequence to stop logging. The logging session is written to screenlog.0 if you're recording in the first window, screenlog.1 in the second one, etc.

Alternative: The UNIX command script will start a new shell and record all keystrokes and output in the file typescript.

4. Monitor for activity

Let's say you've started a long-running command and you're waiting for it to finish while you're busy typing in another window. Switching the window periodically just to check what's going on quickly becomes annoying, so you type Ctrl + A, then M and you're set: screen is watching the console for you. If anything changes in this window, you will see a notification in the status line at the bottom.

Alternative: Many Linux terminals can monitor for activity, e.g. konsole, but they're graphical and need to run in an X session.

5. Lock screen

It's often useful to lock the screen session without detaching, especially when using a multi-user session. Ctrl + A, then x, and you're done. Type your password and your session is available again.

Alternative: Popular desktop environments use xscreensaver, but again, this only works in X sessions.

Where to go from here?

If you like GNU screen, you might also try tmux. According to the site, "tmux is intended to be a modern, BSD-licensed alternative to programs such as GNU screen". If you're interested, check out this detailed tmux blog post.

If you want to take screen to the next level, you might give tiling window managers a try. They do for the graphical environment what screen does for the terminal. The idea is that you don't resize your windows yourself; the window manager does that automatically by partitioning the desktop area into adjacent windows. Most of these also try to use keyboard shortcuts extensively and obviate the need for a mouse. I'm currently impressed by xmonad, although I've heard nice things about the awesome window manager as well (yes, that's its name).

Tuesday, August 10, 2010

Testing with Lift's TestFramework

Getting started

HTTP testing is sufficiently tedious that some folks don't do it. Even if we do it, in Java it doesn't look as pretty as it could be.

The Scala Lift web framework, apart from the other advantages it has (which are a topic for many a blog post) offers some syntax-sugar wrappers so that testing of our REST APIs can be concise and to the point.

Combined with Jetty, this leads to some seriously short and readable code.

First, we need to:

use the trait "with TestKit" in our test

override the baseUrl property

start our Jetty server

Then if we want to output our own failure message later, we need to provide an implicit value of type ReportFailure, for which we only need to implement the fail method, which predictably takes a String. For example, regardless of whether we use ScalaTest or Specs, we can fall back to the fail method which is inherited by our test class:

This is all we need so far.

Using it

Now we're ready to GET some action.

There's already a lot going on here. First of all, the get and post methods return a TestResponse (you don't see it, because it's inferred). And the nice thing about TestResponse (and Scala) is that since it implements the foreach method, it can be used in for expressions together with the "<-" symbol.
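To see why a foreach method is all a type needs to appear on the right of "<-", here is a minimal sketch. FakeResponse and fakeGet are hypothetical stand-ins made up for illustration; they are not Lift's TestResponse or its get method.

```scala
// Hypothetical stand-in for a response type; this is NOT Lift's TestResponse.
final case class FakeResponse(code: Int, body: String) {
  // Defining foreach is enough for `for (r <- response) { ... }` without yield.
  def foreach(f: FakeResponse => Unit): Unit = f(this)
}

object ForeachSketch {
  // Hypothetical request function that always answers with code 200.
  def fakeGet(path: String): FakeResponse = FakeResponse(200, "body of " + path)

  // Two generators desugar into two nested foreach calls,
  // so the second request runs inside the scope of the first.
  def visitedBodies(): List[String] = {
    val seen = scala.collection.mutable.ListBuffer.empty[String]
    for {
      login <- fakeGet("/login")
      items <- fakeGet("/items")
    } {
      seen += login.body
      seen += items.body
    }
    seen.toList
  }
}
```

The compiler rewrites the for expression into fakeGet("/login").foreach(login => fakeGet("/items").foreach(items => ...)), which is why the inner request can see whatever the outer one extracted.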

Whatever is extracted by the for expression can be used to issue get and post requests, but in the context of the previous request. This means that cookies are preserved, so you can use the same HTTP session. In this contrived example we use an httpClient as an argument to the get method (which is supposed to use some HTTP credentials and log us in). In the following call, we don't need to use the authenticating http client, because we can have our cookies (and eat them, too).

We can also use http parameters as a sequence of tuples, which can be delimited using the "->" operator.

For any TestResponse, we can use the xml property, which is the XML as a scala.xml.Elem wrapped in the (ubiquitous for Lift) Box.

Finally, there's concise syntax to assert certain properties about the response. For example the "!@" operator checks if the response code is 200, otherwise it fails with the error message specified after the operator.
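How an operator like this can pick up the failure reporter implicitly can be sketched without Lift at all. FailureReporter and CheckedResponse below are hypothetical models written for this illustration, based only on the description above; Lift's actual signatures may differ.

```scala
// Hypothetical models based on the description in the text; NOT Lift's own classes.
trait FailureReporter {
  def fail(msg: String): Nothing
}

final case class CheckedResponse(code: Int) {
  // Returns the response unchanged when the code is 200,
  // otherwise hands the message to whichever reporter is implicitly in scope.
  def !@(msg: => String)(implicit reporter: FailureReporter): CheckedResponse =
    if (code == 200) this else reporter.fail(msg)
}

object AssertionSketch {
  // The test class would normally supply this; here failing means throwing.
  implicit val reporter: FailureReporter = new FailureReporter {
    def fail(msg: String): Nothing = throw new AssertionError(msg)
  }

  // Runs the check and reports either "ok" or the failure message.
  def check(code: Int): String =
    try { CheckedResponse(code) !@ "expected 200"; "ok" }
    catch { case e: AssertionError => e.getMessage }
}
```

Because the method returns the response itself on success, further checks can be appended to the same expression.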

If we expect a specific return code other than 200, we can also specify it explicitly:

Customizing

If you want to define an HTTP basic authentication client to be used by default by your requests, you can override the method theHttpClient. Lift's TestFramework provides the buildBasicAuthClient method, which can be reused to quickly create an HTTP client with a set user and password.

If you're like me, you might lose hours trying to find out why the request doesn't work when the server doesn't explicitly request authentication. Then setting the preemptive flag on the client would definitely save some time.

I know some of you Scala geeks are already bored because everything so far is just so simple. There wasn't even any mention of implicits! But rest assured, I will find some use for implicits.

Pimping

The usage snippet in the for expression above is not quite right. If the response contains no valid xml, then the specs matcher will not be executed and the test will not fail, although it should. Here we can use the fact that TestResponse also implements the filter method, which allows us to have if expressions (also known as filters). The post snippet could then be rewritten like this:

We are again using the Specs matcher to test if the response contains xml and fail otherwise. Notice that this expression does not return true; it need not be boolean. This is because the function passed to TestResponse's filter method is intentionally declared to return Unit, not Boolean.

However, that's too much boilerplate, which we have to repeat every time we want to check the contents of an xml response:

Check if the response is valid and has a response code of 200

Check if the xml is not empty

Extract the xml

Only then check the contents of the XML

If only we could make TestResponse do what Specs does with XML... But since this is Scala, we can create a wrapper and then transparently substitute it using our implicit conversion:

An important thing to remember about implicit conversions is that if we want them to feel seamless, the wrapper class' methods should return the original class type (the one before the conversion). Otherwise we could get unexpected object classes, similar to what occurred with the Scala 2.7 RichString, where for instance "aaa".reverse == "aaa" evaluated to false.
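A minimal sketch of that rule, assuming hypothetical stand-in classes XmlResponse and RichXmlResponse and a made-up mustContain method (none of these are Lift's or Specs' own names):

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins; NOT Lift's TestResponse or the Specs matchers.
final case class XmlResponse(code: Int, xml: Option[String])

// The wrapper adds a check and hands back the ORIGINAL type, not itself.
final class RichXmlResponse(underlying: XmlResponse) {
  def mustContain(fragment: String): XmlResponse = {
    require(underlying.code == 200, "unexpected code " + underlying.code)
    val body = underlying.xml.getOrElse(sys.error("response has no XML"))
    require(body.contains(fragment), "missing " + fragment)
    underlying
  }
}

object WrapperSketch {
  // The compiler applies this whenever mustContain is called on an XmlResponse.
  implicit def enrich(r: XmlResponse): RichXmlResponse = new RichXmlResponse(r)

  // Each call returns an XmlResponse, so the conversion kicks in again
  // for the next call and the checks can be chained.
  def demo(): XmlResponse =
    XmlResponse(200, Some("<user><name>ann</name></user>"))
      .mustContain("<name>")
      .mustContain("ann")
}
```

Had mustContain returned RichXmlResponse instead, callers would end up holding the wrapper, and comparisons with plain XmlResponse values would quietly stop working.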

Since we have chosen to use the same operator as Specs uses, we had to disambiguate with the XmlBaseMatchers class name, which is the specs trait containing all the XML matching goodness.

Now we can use our fancy TestResponse operators. Notice that since we return the same class type that we already had, we can chain:

This looks more compact than the original version so you can focus on the API differences and not be distracted by boilerplate preconditions.

Friday, March 12, 2010

Pomodoro- what's in it for me?

Lately I've been trying to use the Pomodoro technique. I have been using time boxing for a longer time and find Pomodoro to be a simple and efficient framework which revolves around time boxing.

Pomodoro being so simple, I was rather surprised to find a critique of Pomodoro recently. The critic ignores the fact that the Pomodoro author doesn't mandate that Pomodoro should be used on every occasion; that Pomodoro may use a different time unit than 25 minutes; or that you can void any Pomodoro if you see fit. But leave aside the fact that the critic misinterprets how strict Pomodoro should be. By focusing on productivity, it's easy to ignore an equally important aspect: Pomodoro also helps relieve stress and prevents burnout.

Let me try to explain how Pomodoro achieves this by describing the basic building blocks of the Pomodoro process, which applies to time-boxing in general.

What is time boxing? Simply put, it's deciding what to do in the next fixed short period of time- and sticking to it. This consists of the following key stages and transitions:

1) at the start of the Pomodoro- committing to one thing only for the next time period

2) during the Pomodoro- trying to focus on this one thing without interruptions

3) at the end of the Pomodoro- wrapping up and detaching from the work, followed by a short break

Now let's enumerate some of the reasons for stress, anxiety and burnout. Among these are: multitasking, procrastination, interruptions and obsessing over the completion of a task.

Committing to one thing is crucial for both a productive and a stress-free state of mind. Multitasking is proven to lead to anxiety and inefficiency time and time again. Why? First of all, because switching contexts is very inefficient, and exhausting at that. Getting into the right frame of mind to do a task takes time and effort, and switching to another task destroys all that mental preparation.

Multitasking also leads to the impression that we're not accomplishing anything. If you have followed Joel's article on the subject, the explanation is simple: if you do 2 tasks which each take 10 minutes sequentially, you will have a finished task after 10 minutes. If you switch tasks every minute, you will only get the first task done after 19 minutes! Multiply this by the number of tasks you're trying to switch between every day and you get the picture.
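If you would rather check the arithmetic than take it on faith, here is a small sketch. It assumes strict round-robin switching, positive durations and slice length, and no cost for the switch itself (which in real life only makes things worse); the names are made up for this illustration.

```scala
object TaskSwitching {
  // Returns the minute at which each task finishes when the tasks are
  // worked on in turn, `slice` minutes at a time.
  // Assumes every duration and the slice are positive.
  def finishTimes(durations: List[Int], slice: Int): List[Int] = {
    val remaining = durations.toArray
    val finished = Array.fill(durations.length)(0)
    var clock = 0
    while (remaining.exists(_ > 0)) {
      for (i <- remaining.indices if remaining(i) > 0) {
        val worked = math.min(slice, remaining(i))
        clock += worked
        remaining(i) -= worked
        if (remaining(i) == 0) finished(i) = clock
      }
    }
    finished.toList
  }
}
```

With two 10-minute tasks, finishTimes(List(10, 10), 10) gives List(10, 20), while finishTimes(List(10, 10), 1) gives List(19, 20): the total is the same, but nothing is finished until the very end.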

The conclusion is that multitasking takes more effort and gets things accomplished slower. If this is not frustrating, I don't know what is.

Furthermore, giving a time limit for a simple task helps you get started. It encourages decomposing the problem into subtasks and makes a huge amorphous blob of work look like something more palatable. More importantly, getting started is the best way to defeat procrastination, and procrastination is also a major factor for anxiety. Now which is more stressful: a heap of work, where you don't know where to start, or neatly organized small tasks, each of which is easier to estimate? Try for yourself.

Interruptions are another very frequent source of frustration. An extremely important but often forgotten type of interruption is the internal one. The desire to chat with someone, check your email or just browse the web for something interesting which has popped into your mind is hard to resist, and can break your flow. It is, however, much easier to control these urges if you know that a break is coming soon, rather than when you feel that it's all work all day long, and a small diversion will just take a couple of seconds. Except that it doesn't.

Interruptions from other people are harder to avoid, especially in a job that revolves around responding to events, like answering a support hotline. However, even in the case of phone calls, you cannot be in two unrelated conversations at once. Besides, the criticism of Pomodoro listed two examples which are notably free from interruptions: a surgeon in an operation and a lawyer defending a case in court. You don't need to control task switching in these scenarios because the situation does not allow any other tasks to be performed. Can you imagine a lawyer in court or an operating surgeon surfing the web or checking their mail? Thought so.

Finally, it is very easy to forget that the slow but steady runner wins the race. We often lose track of how much time and effort we have been spending on a task. The short length of the Pomodoro (or any time boxing technique) helps you take a step away and ask: where am I going with this? Is it taking longer than anticipated? Am I actually doing anything or going in circles? Taking a break and detaching yourself from the task at hand helps you see the bigger picture and backtrack if you've reached a dead end. Besides, taking a break will often let your subconscious find a solution which is otherwise not obvious. Obsessing over completing the task is counterproductive and will leave you unable to take on the next task and burned out in the long run.

There are other benefits of time-boxing, but all in all, it is a very useful technique. Even people who claim they are able to concentrate without taking breaks are often taking them subconsciously, by letting their minds meander from time to time or switching for a couple of minutes to less stressful aspects of the task. But why rely on being a born time-boxer? Once you agree that time-boxing is helpful, it pays off to make it a habit. It will make you more focused and calm.

And remember- if it ever feels like you're more stressed by using time-boxing, don't push yourself too hard and stop it. Take a break; enjoy your own flow.

Sunday, January 17, 2010

Is Scala more complicated than what Java tries to become?

Is Scala more complicated than Java? My last post did not tell the whole truth. I only listed Scala features that have a Java analog. There was a glaring omission of advanced Scala features like implicit conversions, operator overloading, call-by-name parameters and pattern matching. These Scala features are more complicated than what Java has. There, I said it. But then Scala is more complicated in the way a calculator is more complicated than an abacus- sure you can do some of the same stuff with an abacus, but trying to calculate the square root of a number is much more cumbersome.

However, this complexity pays off, because it lets us simplify many day-to-day features. This post will try a different angle by comparing where Java wants to be and where Scala is right now. I hope after reading it you will at least question your assumptions whether this trade-off is worth it.

Upon its creation, Java was a fairly simple language. A major reason it took over from C++ is that it was specifically designed to steer away from multiple inheritance, manual memory management and pointer arithmetic. But it's not a simple language anymore, and it's getting more and more complicated.

Why? Java wasn't designed to be very extensible. Scala, on the other hand, was designed to be scalable, in the sense of having a flexible syntax. The very creators of Java knew very well that a "main goal in designing a language should be to plan for growth" (Guy Steele's famous words from Growing a Language).

We need to expand our definition of language complexity. The language needs to be able to abstract away accidental complexity, or using it will be difficult. Examples of accidental complexity: jumping to a position in your program with goto, and then remembering to go back (before procedural programming); or allocating memory, and then remembering to deallocate it (before garbage collectors). Another example: using a counter to access collections, and remembering to initialize and increment it correctly, not to mention checking when we're done.
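The last example is easy to show side by side. This is only an illustrative sketch with made-up names: the first method manages the counter by hand, the second lets the collection do the walking.

```scala
object CounterSketch {
  // Manual index: initialisation, bound check and increment are all ours to get right.
  def sumWithCounter(xs: Array[Int]): Int = {
    var total = 0
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }

  // The collection does the iteration; there is no counter left to get wrong.
  def sumWithForeach(xs: Array[Int]): Int = {
    var total = 0
    xs.foreach(x => total += x)
    total
  }
}
```

Both return the same result; the difference is how many details the reader has to verify before believing it.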

Creating extensions of the language in order to hide these complexities doesn't happen often. When it does, it offers huge rewards. On the other hand, if a language is rigid, even though it looks simple, this forces you to invent your own arcane workarounds. When the language leaves you to deal with complexity on your own, the resulting code will necessarily be complicated.

Let's look at the special new features Java is trying to add to the language, which Scala can already provide because of its flexibility and extensibility.

Pattern matching

Pattern matching is often compared with Java's switch/case statement. I have listed pattern matching as something which doesn't have an analog in Java, because comparing it to "switch" really doesn't do it justice. Pattern matching can be used for arbitrary types, it can be used to assign variables and check preconditions; the compiler will check if the match is exhaustive and if the types make sense. Meanwhile Java has only recently accepted Strings in switch statements, which is only scratching the surface of Scala's pattern matching.

Furthermore, Scala uses pattern matching throughout the language, from variable assignment to exception handling. By comparison, the proposal for handling multiple exceptions in Java has been postponed yet again.
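A short sketch of the uses mentioned in this section. The Shape hierarchy and the method names are invented for the illustration.

```scala
object PatternSketch {
  sealed trait Shape
  final case class Circle(radius: Int) extends Shape
  final case class Rect(width: Int, height: Int) extends Shape

  // Matches on the type, binds the fields and checks a precondition with a guard.
  // Because Shape is sealed, the compiler can warn about a missing case.
  def describe(shape: Shape): String = shape match {
    case Circle(r) if r <= 0 => "degenerate circle"
    case Circle(r)           => "circle of radius " + r
    case Rect(w, h)          => "rectangle " + w + " by " + h
  }

  // A pattern on the left-hand side of a variable definition.
  val (quotient, remainder) = (17 / 5, 17 % 5)

  // Patterns in exception handling; one case covers two exception types.
  def parse(text: String): Int =
    try text.trim.toInt
    catch { case _: NumberFormatException | _: NullPointerException => -1 }
}
```

Note that the catch block is an ordinary list of cases, which is why handling several exception types in one branch needs no special syntax.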

Case classes

In order to get rid of Java's verbose getters, setters, hashCode and equals, one solution is to muck with the javac compiler, like the folks from Project Lombok have done. Is going to the guts of javac complicated? I'm sure it is.

In Scala, you can do it if you just define your classes as case classes.
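As a minimal sketch (Person is a made-up example class): the fields of a case class are immutable by default, so instead of setters you get a copy method.

```scala
object CaseClassSketch {
  // One line gives accessors, equals, hashCode, toString and copy.
  final case class Person(name: String, age: Int)

  val ann = Person("Ann", 30)

  // copy creates a modified instance and leaves the original untouched.
  val older = ann.copy(age = 31)
}
```

Two instances with the same field values are equal and have the same hash code, so they behave sensibly as map keys and set members without any handwritten code.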

Implicit conversions

In short, implicit conversions help transparently convert one type to another if the original type doesn't support the operations requested.

There are many examples where this is useful.

What in Java is hardcoded in the language as conversions and promotions is defined in Scala using implicit conversions. This is another example where Java can get quite complicated. In most cases where you need to decide how a method argument is converted, for instance, you must keep in mind narrowing and widening conversions, promotions, autoboxing, varargs and overriding (whew!). In Scala, the advantage of having implicit conversions is that you can inspect the code that defines them, so no ambiguity can result. You can analyze the conversions taking place in the interpreter by supplying the "-Xprint:typer" parameter. You can even disable these implicits, if you don't like them, by shadowing the import.

Another example of what implicits can do is adding methods and functionality to existing classes. Dynamic languages already do that easily using open classes and "missing method" handlers. In Java, one way to do this is to use bytecode manipulation trickery via libraries like cglib, bcel, asm or javassist.
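Here is a minimal sketch of adding a method to a class you don't own. Shouting, shout and stringToShouting are hypothetical names chosen for the example.

```scala
import scala.language.implicitConversions

object ImplicitSketch {
  // Hypothetical wrapper that adds a method which String does not have.
  final class Shouting(text: String) {
    def shout: String = text.toUpperCase + "!"
  }

  // String has no shout method, so the compiler looks for a conversion
  // to a type that does, and inserts a call to this one.
  implicit def stringToShouting(text: String): Shouting = new Shouting(text)

  val greeting: String = "hello".shout
}
```

No bytecode is rewritten and String itself is unchanged; the call compiles to stringToShouting("hello").shout, and the conversion only applies where it is in scope.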

Bytecode manipulation in Java is required for popular libraries like Hibernate, Spring and AspectJ. Few "enterprise" Java developers can imagine development without Hibernate and Spring. Although there are many more things you can do with AspectJ, it can be used to emulate implicits with inter-type member declarations. However, even though using AspectJ is a more high-level way to solve the problem, it adds even more complexity, as it defines additional keywords and constructs.

If you're new to Scala, you don't lose much if you don't know how implicit conversions work, just like you don't need to know about the magic that happens behind the scenes when Hibernate persists objects or when Spring creates its proxies. Just as with bytecode generation, you're not advised to use this feature often, as it is difficult to use. Still, you'll be glad it exists, because someone will create a library which will make your life and the life of many developers so much easier.

Operator overloading

The line between operators and methods in Scala is blurred- you can use the symbols +, -, /, *, etc. as method names. In fact, that's exactly how arithmetic operators work in Scala- they are method invocations (relax, everything is optimized by the compiler).
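A small sketch with a made-up Vec2 class shows that an "operator" is just a method with a symbolic name, and that infix notation is sugar for a normal method call.

```scala
object OperatorSketch {
  // Hypothetical 2D vector; + and * are ordinary methods.
  final case class Vec2(x: Int, y: Int) {
    def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
    def *(factor: Int): Vec2 = Vec2(x * factor, y * factor)
  }

  // Infix form; * still binds tighter than +, as in arithmetic.
  val infix = Vec2(1, 2) + Vec2(3, 4) * 2

  // The same expression written as explicit method calls.
  val dotted = Vec2(1, 2).+(Vec2(3, 4).*(2))
}
```

Both values are Vec2(7, 10). Precedence is determined by the first character of the method name, so user-defined operators follow the familiar arithmetic rules.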

Some people object that operator overloading adds unnecessary complexity, because operators can be abused. Still, you can also abuse method naming in much the same way. For instance, some hapless folk can define methods with visually similar symbols, like method1, methodl and methodI. They can use inconsistent capitalization, like addJar or addJAR. One could use meaningless identifiers like ahgsl. Why should operator best practices be different from method naming best practices?

What is complicated is treating numeric types like ints and BigInteger differently. Not only that, but operations with BigInteger are very verbose and barely readable even with simple expressions. To compare, this is what a recursive definition of factorial looks like in Scala with BigInt:


def factorial (x: BigInt): BigInt =
  if (x == 0) 1 else x * factorial(x - 1)

This is how it would look if Scala didn't support operator overloading:


import java.math.BigInteger

def factorial(x: BigInteger): BigInteger =
  if (x == BigInteger.ZERO)
    BigInteger.ONE
  else
    x.multiply(factorial(x.subtract(BigInteger.ONE)))

Call by name

One of the proposals for Java 7 language extension was automatic resource management. This is one more rule added to the language, which you need to remember. Without this feature, code is also unnecessarily complicated, because it forces you to remember to always close resources after using them- if you slip up, subtle bugs with leaking files or connections can result.

In Scala, it's easy to add language constructs like this. Using function blocks, which are evaluated only when they are invoked, one can emulate almost any language construct, including while, if, etc.
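As a rough sketch of both ideas: withResource, repeatWhile and Probe are made-up names, and this is one possible way to write such constructs rather than a standard library API.

```scala
object ControlSketch {
  // Resource management as a plain method: run the body,
  // then close the resource even if the body throws.
  def withResource[R <: AutoCloseable, A](resource: R)(body: R => A): A =
    try body(resource) finally resource.close()

  // A home-made while loop. Both parameters are by-name, so they are
  // re-evaluated on every iteration instead of once at the call site.
  def repeatWhile(condition: => Boolean)(body: => Unit): Unit =
    if (condition) {
      body
      repeatWhile(condition)(body)
    }

  // Hypothetical resource that remembers whether it was closed.
  final class Probe extends AutoCloseable {
    var closed = false
    def close(): Unit = { closed = true }
  }

  // Usage reads like a built-in loop.
  def countTo(limit: Int): Int = {
    var i = 0
    repeatWhile(i < limit) { i += 1 }
    i
  }
}
```

At the call site, withResource(new Probe) { p => ... } and repeatWhile(i < limit) { ... } look like built-in statements, although both are ordinary methods.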

Existential types

Existential types are roughly an alternative to Java wildcards, only more powerful.

Martin Odersky: If Java had reified types and no raw types or wildcards, I don't think we would have that much use for existential types and I doubt they would be in Scala.

If Martin Odersky says that existential types wouldn't be in the language if it wasn't for Java compatibility, why would you even need to know about them? Mostly if you need to interoperate with Java generics.

Conclusion

Scala tries to define fewer language rules, which are, however, more universal. Many of these advanced features are not often used, but they pay off by letting you create constructs that in Java would require specific hardcoded additions to the language. In Scala, they can be defined simply as libraries.

Why does it matter that it's in the libraries, and not hardcoded in the language? You can more easily evolve and adapt these features, you can add your own extensions, and you can even disable some of the library parts or replace them.

The conclusion is that if a language is not designed to be extended, it will eventually develop features which are not well integrated, and the language will collapse under the weight of its own complexity.

Finally, learning something so that you avoid a lot of routine error-prone operations reduces effort by increasing the level of abstraction, at the cost of additional complexity. When you were in school, it was complicated to learn multiplication, but once you got over it, it saved you quite a bit of repetition compared to using only addition.

P.S. I realize it's not possible to resolve once and for all, in a post or two, the question of which language is more complicated, Java or Scala. First of all, keep in mind that simple is not the same as easy to use. There are also many topics which are open for discussion. I haven't touched on Scala traits; I haven't mentioned functions as first-class constructs compared to the Java 7 closure proposal; and there's a lot that can be said about how Scala obviates many Java design patterns. Extending the Scala syntax via compiler plugins is another interesting advanced topic.

I suppose someone could even write a blog post about these topics some day.
