Speaking my (programming) language?: 2009

Monday, December 28, 2009

Bridge-crossing with Clojure

Solving puzzles has become popular among colleagues at the end of this year. One of these was the popular Bridge and torch problem. To summarize: four men have to cross a bridge in the dark with a torch, which can only light the way for two persons. These guys can cross the bridge in 1, 2, 5 and 10 minutes, respectively. When two people cross the bridge, it takes the time needed for the slower one to pass.

I'm not going to explain the optimal solution, but I wanted to show how this is a graph-traversal problem, which is amenable to solving using Dijkstra's algorithm. I also noticed that some clever guy has solved the puzzle using Prolog. So an idea was formed: generate the tree of solutions using a language, which I'm not really comfortable with (so I can practice and learn it better). I settled on Clojure, because I wanted to learn it this year (and the year is almost over).

The solution has a lot to do with sets. The state of each of the banks (which I'll call left and right) is a set of people, which are identified by their crossing speed. If we generate the list of possible combinations between the persons for each next move and recurse into them, then we'll have the tree of all solutions. In order to avoid indefinitely long (and stupid) solutions, we assume that going forward always takes two men, and going back takes only one.

The necessary libraries are set, combinatorics and str-utils. That's where I start:


(use '[clojure.contrib.combinatorics])
(use '[clojure.set])
(use '[clojure.contrib.str-utils])
(use '[clojure.contrib.pprint])

Here are the initial data structures: the two river banks are sets, the right one is empty:


(def left #{1, 2, 5, 10})
(def right #{})

And here's a first version of the solution:


(defn forward [left right steps minutes]
        (map #(back
                (difference left %)
                (union right (set %))
                (cons % steps)
                (+ minutes (reduce max %)) )
            (combinations left 2)) )

(defn back [left right steps minutes]
    (if (empty? left)
        (list steps minutes)
        (map #(forward
                (union left (set %))
                (difference right %)
                (cons % steps)
                (+ minutes (reduce max %)) )
            (combinations right 1)) ) )

(pprint (forward left right nil 0))

This prints some semblance of a tree, constructed by nested lists (hey, that's what Lisp is good for, remember?). What I didn't like about this is that there is too much repetition. There are many parts of these functions, which look the same. In order to evolve the solution, I wanted to extract the common parts in a single function:


(defn solve [from to steps minutes group]
    (if (and (empty? to) (not group))
        (list steps minutes)
        (mapcat #(solve
                (union to (set %))
                (difference from %)
                (cons % steps)
                (+ minutes (reduce max %))
                (not group) )
            (combinations from (if group 2 1))) ) )

(pprint (solve left right nil 0 true))

The function behaves differently depending on whether the group of people crossing consists of two or just one person. Also, the tree is now flattened, so the optimal solution can be easily found like this:


(print (reduce #(if (< (second %1) (second %2)) %1 %2) (apply hash-map (solve left right nil 0 true))))

Next step, I want to be able to collect a lot more data about each state, so I need to stop messing around with lists of lists and use something more organized like a hashmap. There's a reason Clojure includes the function second, but not third or, heaven forbid, fourth. Naming each part of your ad-hoc data structure plays a major role in readability.


(defn solve [from to steps minutes group]
    (if (and (empty? to) (not group))
        {:steps steps, :from from, :time minutes}
        {:steps steps, :from from, :time minutes,
            :solutions (map #(solve
                    (union to (set %))
                    (difference from %)
                    (cons % steps)
                    (+ minutes (reduce max %))
                    (not group) )
                (combinations from (if group 2 1)))} ) )

So now the interesting part can begin- rendering the tree in a nice graphical form using GraphViz. Watch this:


(defn printgraph [solution_tree]
    (let [steps (solution_tree :steps)
        nodename (if (nil? steps) "root" steps)
        from (solution_tree :from)
        label (if (= (count from) (count left)) (solution_tree :time) from)]
        (str "\"" nodename "\" [label=\"" label "\"];\n"
            (if (contains? :solutions solution_tree)
                ""
                (str-join ""
                    (map #(str "\"" nodename "\" -> "
                            "\"" (:steps %) "\""
                            " [label=\"" (intersection from (:from %)) "\"];\n"
                            (printgraph %))
                        (solution_tree :solutions)) ) ) ) ) )

And now you can output the recursive stuff wrapped with the header and footer like this:


(println "digraph G {")
(print (printgraph (solve left right nil 0 true)))
(println "}")

This graph shows each node as the state of alternating river banks, and the edges show the people crossing the bridge. The leaves display the number of minutes which it took for this particular combination of crossings.

I also wanted to generate a mind map for FreeMind, but since it doesn't support edge labels, I used the persons crossing for the nodes:


(defn mindmap [solution_tree]
    (str-join "\n"
        (map #(str
                "<node text=\"" (intersection (:from solution_tree) (:from %)) "\">"
                (mindmap %)
                "</node>")
            (solution_tree :solutions)) ) )

Now generating the complete file according to the FreeMind format:


(println "<map><node text=\"root\">\n")
(print (mindmap (solve left right nil 0 true)))
(println "</node></map>\n")

Finally, finding the optimal solution is just a wee bit more involved, because it requires traversing the tree recursively, but that's no problem for us anymore:


(defn optimal [tree]
    (if (contains? tree :solutions)
        (reduce #(if (< (second %1) (second %2)) %1 %2) (map optimal (tree :solutions)))
        (list (tree :steps) (tree :time)) ) )

(print (optimal (solve left right nil 0 true)))

In conclusion, I knew that one can write very compact code in Clojure (and any Lisp, in general), and this was certainly true here. I didn't use the Clojure-specific concurrency primitives or the lazy sequences, so my example would probably look similar in another Lisp. I did try to make it as functional as possible and Clojure does encourage that.

Some of the difficulties I had could be solved by static type checking, but I guess I'm spoiled by Scala. I've ran into small problems like renaming a variable and forgetting to rename (or misspelling) some instance of it in my code. I also tend to mix up which of the arguments to contains? comes first, and which second. This is probably because you can exchange a symbol as a key in a hashmap, and the hashmap without changing the meaning, e.g. (:from solution_tree) and (solution_tree :from). Of course, this is a problem with dynamic typing, not of Clojure in particular (and could probably be interpreted as a problem of using the language inappropriately).

Whatever the small hurdles, the problem was solved. Having fun? Check. Generating nice graphs and mind maps to impress colleagues? Check. Learning a bit of Lispy Clojure in the process? Check. Mission accomplished.

Thursday, December 3, 2009

String interpolation in Scala

When I tried Scala for the first time, one of the first missing language features I noticed was string interpolation (you guessed it, I was using Ruby at the time). Of course, this is a small convenience rather than a major language feature and usually String.format and java.util.Formatter are pretty good substitutes. Still, string interpolation comes now and then in Scala discussions and one has to wonder if Scala's powerful language extension facilities can emulate a feature like this.

It turns out you can get reasonably close:

trait InterpolationContext {
  class InterpolatedString(val s: String) {

    def i = interpolate(s)
  }

  implicit def str2interp(s: String) = new InterpolatedString(s)

  def interpolate(s: String) = {
    val sb = new StringBuilder
    var prev = 0
    for (m <- "\\$\\{.+?\\}".r.findAllIn(s).matchData) {
      sb.append(s.substring(prev, m.start))
      val matchString = m.matched
      var identifier = matchString.substring(2, matchString.length - 1)
      try {
        val method = this.getClass.getMethod(identifier)
        sb.append(method.invoke(this))
      } catch {
        case _: NoSuchMethodException =>
        sb.append(matchString)
      }
      prev = m.end
    }
    sb.append(s.substring(prev, s.length))
    sb.toString
  }
}

object Test extends InterpolationContext {
  val a = "3"
  def main(args: Array[String]) {
    println("expanded: ${a}; not expanded: ${b}; more text".i)
  }
}

How does this work? If you call the i method on a String, it will force implicit conversion to a InterpolatedString, similar to how the r method works for converting to a Regex. The interpolate method uses reflection to find a method with the name equal to the identifier delimited by "${" and "}". This works both for vals and defs due to Scala's adhering to the uniform access principle. If the delimited string is not a valid identifier or a parameterless method with this name doesn't exist, it is not substituted, but stays as it is.

How do you use it? Just extend or mix in the InterpolationContext trait in the class where you want this to work, and then call i on the Strings where you want interpolation to work. The InterpolationContext trait serves two purposes- first of all, it imports the implicit conversion to the interpolated string, and second it provides access to the methods of the current object via reflection.

The limitations of this method are that interpolation only works on member vals and defs of the current object only. I rather like this, because you know you can't accidentally construct a String from user input in an html form like "Name: ${System.exec(\"echo echo pwned >> .bashrc\")}". Also, interpolation doesn't work for private members as well as local variables. Finally, you have to both mix in or extend the trait and call a method on every String (even though it's a one-letter method). This is not too bad, because you can control the scope where interpolation works and therefore avoid and debug nasty surprises easier.

I don't see this as a pattern, which can be used in real-life projects, and I don't see the use of implicits here worthy of emulation. Nevertheless, trying to copy features from other languages is instructive about the limitations of your current language of choice and the tradeoffs you get for these features. Last but not least- even if it's a gimmick feature, implementing it was fun.

Monday, November 23, 2009

Error detection with pattern matching

Scala's pattern matching is one of the most powerful features in the language. It not only helps write concise and very readable code, but also helps prevent trivial errors.

Let's say you want to save a line or two of code when comparing for some empty data structure like a List. You decide to use comparison for equality:


val items = Nil
if (items == Nil) println("No items")

Then you decide to refactor and turn the items collection into a Set. The Scala compiler is clever and should give you a warning, right? Well, not quite:


val items = Set()
if (items == Nil) println("No items")

This results in nothing printed, as the expression evaluates to false. Of course, given the types it's perfectly clear at compile-time that this will never be true. Can't the compiler give you a hint? Indeed, it will if you use pattern matching:


items match {
  case Nil => println("No items")
  case _ =>
}

This will result in the following error message:


error: pattern type is incompatible with expected type;
 found   : object Nil
 required: scala.collection.immutable.Set[Nothing]
         case Nil => println("No items")
              ^

The examples are contrived (unparameterized Nil and Set?), but you get the point. Yes, Nil can never be a value of the Set type. You might think this is fairly obvious, but pattern matching can be a life saver when implicit conversions are involved. Consider this:


"heh".reverse == "heh"

What does this evaluate to? This should be obvious, right? But the value is false! If you used pattern matching, you would easily see why:


"heh".reverse match {
  case "heh" => "obvious?"
}

This will make the compiler very nervous, and this is the reason why:


error: type mismatch;
 found   : java.lang.String("heh")
 required: scala.runtime.RichString
         case "heh" => "obvious?"
              ^

So reverse converts the String to the wrapper RichString, which is not the same type as String.

I have had similar problems detecting a bug where I was checking for equality with None a variable which was of type net.liftweb.common.Box (a type very similar conceptually to Scala's built-in Option).

This made me adopt a general rule to prefer pattern matching rather than equality comparison. The bugs it catches are sometimes subtle and hard to see, and that's exactly what Scala's rich static type checker tries to avoid. Use it to your advantage.

Since we're talking about bugs caught by pattern matching, there's one subtle bug, which is often (though not always) caught by the compiler (another contrived example follows):


val items = Set()
Set() match {
  case items => println("empty")
  case _ => println("full")
}

This will result in an error, which looks a little bit unusual to the newbie:


error: unreachable code
         case _ => println("full")
                   ^

This error is usually crying out loud: hey, you're inadvertently using variable binding instead of constant matching! The newly bound variable items shadows the existing variable with the same name and all other cases after it will never match.

One way to fix it is to use backticks to prevent the name to be bound to a new variable:


Set() match {
  case `items` => println("empty")
  case _ => println("full")
}

As a rule of thumb it is advised to use CapitalLetters for case classes and constants which you intend to pattern match.

This error wouldn't have occurred if you used equality comparison in the first place, but even in mildly complex cases pattern matching trumps plain equality checking in readability and detecting errors. Apparently, there are cases where pattern matching fails (for instance, matching structural types), so there's still no reason to deprecate good old "==". But there are many more errors, which pattern matching catches, like checking if the match is exhaustive. So there's no point in saving a couple of characters but lose the type safety you expect from Scala.

Wednesday, November 4, 2009

Embedded Scala interpreter

The Scala interpreter is proof that a statically typed language can have features most people only expect from a dynamically-typed language. One of the cool uses of an interpreter is embedding it within your application. This allows you to conveniently experiment with Scala and probably even interact with object instances in your running system. I won't explain how the interpreter works here, but I will try to show you a simple way of embedding the interpreter.

As it usually happens, someone beat me to it. Josh Suereth explains in great detail how to embed an interpreter, but he has done so many customizations that his solution would probably fit on several pages.

I wanted a simpler solution which one could understand at a glance. The code for Lord of the REPLs is much shorter although it doesn't customize much of what the standard interpreter offers.

I tried to come up with the shortest working version you could possibly get away with. Provided I create the settings properly, this is what I could muster:


val out = new java.io.StringWriter()
val interpreter = new scala.tools.nsc.Interpreter(settings, new PrintWriter(out))
interpreter.interpret(code)

Not too much code, is it? (Half of it is probably due to the full package names). Now you could collect your output from the "out" stream and probably convert to String if you need using "out.toString".

Not so fast, though. I said this works if I have the appropriate settings:


val settings = new scala.tools.nsc.Settings()

The problem here is that the interpreter doesn't find two of the crucial jars needed for its proper functioning: scala-compiler.jar and scala-library.jar. When it doesn't it spits out the following error:


scala.tools.nsc.FatalError: object scala not found.

Thanks to the following discussion by Eric Torreborre (author of Specs) I managed to find out that one needs to add to the bootclasspath of the settings object:


val origBootclasspath = settings.bootclasspath.value
val pathList = List(jarPathOfClass(compilerPath),
                         jarPathOfClass(libPath))
settings.bootclasspath.value = (origBootclasspath :: pathList).mkString(java.io.File.separator)

One could hardcode the path to these two jars, but that's not too flexible. If we want to do it right, we might create a function which discovers the path to the jar from the name of a class that's in it:


def jarPathOfClass(className: String) = {
  Class.forName(className).getProtectionDomain.getCodeSource.getLocation
}

Now you could find the paths to these jars like this:


val compilerPath = jarPathOfClass("scala.tools.nsc.Interpreter")
val libPath = jarPathOfClass("scala.ScalaObject")

I've read that getProtectionDomain.getCodeSource returns null in some classloaders and might have problems specifically with OSGi. In that case, one might need to resort to the following hack:


def jarPathOfClass(className: String) = {
  val resource = className.split('.').mkString("/", "/", ".class")
  val path = getClass.getResource(resource).getPath
  val indexOfFile = path.indexOf("file:")
  val indexOfSeparator = path.lastIndexOf('!')
  path.substring(indexOfFile, indexOfSeparator)
}

With the last ugly piece of code creating an interpreter is not so concise anymore, but sometimes you can't be both robust and concise.

If you want to see the above snippets assembled in one piece you can check out Apache ESME's source code for the ScalaInterpreter action.

Warning: interpreting code directly in your application is a huge security risk and might not always be a good idea.

Tuesday, October 13, 2009

Scala closures as poor man's mocks

Groovy has this feature that you can use a closure whenever you need an interface with one method only. A class implementing the interface is automatically created, and the closure provides the implementation of this single method. This process is called closure coercion and is particularly convenient to make tests readable and concise.

I'm not yet sure about the relative advantages of such code everywhere, since there might be ambiguities or the intent might be obscured. Tests usually contain a lot of boilerplate, though, so I'm all for making them more concise. Except for readability, people would be more likely to write a test if it doesn't take much effort to create yet another trivial mock.

Who tests your tests though? You don't need to answer. Still, the page about implementing interfaces in Groovy warns of this trap: "Be careful that you don't accidentally define the map with { }". Type safety could help sometimes when writing tests. Or at least that's a very good excuse to try to emulate this Groovy feature in Scala.

In Scala there are implicit conversions, which help define a controlled way to coerce a value to a different type. We cannot automatically convert all closures to all possible single-method interfaces in scope, but we can choose several we are using especially often (as it happens in tests). For instance, if we want to use a mock of java.lang.Readable often enough, we could define the following implicit conversion:


import java.nio.CharBuffer

implicit def fun2readable(f: Function1[CharBuffer,Int]): Readable =
  new Readable { def read(cb: CharBuffer) = f(cb) }

So now everywhere we need a Readable, we might just drop in the appropriate closure instead (technically, this is not a mock, but a stub):


val readable: Readable = {cb: CharBuffer => cb.put("12 34"); 5}
val s = new java.util.Scanner(readable)
assert (s.nextInt == 12)

Compare this with the Groovy version:


def readable = { it.put("12 34"); 5 } as Readable
def s = new Scanner(readable)
assert s.nextInt() == 12

In the Scala version, we are explicitly defining the type of the variable in order to force the implicit conversion. In methods, where Readable is expected as an argument, explicitly naming the type will not be necessary, whereas in Groovy you always need to coerce using "as Readable":


val s = new java.util.Scanner({cb: CharBuffer => cb.put("12 34"); 5})

So in Scala you're trading some verbosity up front for some conciseness in all of the code where the implicit conversion is visible. Furthermore, the type checker guarantees that only closures with the correct signature are converted to the interface in question (i.e. with an input of CharBuffer and output of Int) and will flag a compile-time error otherwise.

The reader might be tempted to also emulate map coercion, but beware of an unexpected difficulty: it's not easy to define both a closure (with one argument) and map coercion. Since a Map in scala is a function and due to type erasure, the compiler won't know whether to substitute the Map or the Function!

Can you generate these implicits automatically without defining them? Maybe with a Scala compiler plugin, but then you lose some of the transparency of knowing which implicits you've defined.

Saturday, September 26, 2009

Scala interpreter in Firefox' location bar

SimplyScala is awesome for trying out some quick Scala tricks. However, you can go one step further, and execute Scala one-liners in your Firefox location bar!

1. Create a bookmark to URL http://www.simplyscala.com/interp?code=%s
2. Associate a keyword to the bookmark. Mine is "scala", you can try something shorter like "s" or ">".
3. Now when you type "scala <expression>", the expression will be sent to SimlpyScala and evaluated. For instance, "scala List(1,2,3).reverse"

Why does this work? Whenever you have a keyword associated with a bookmark Firefox expands "%s" in the URL to the text following the keyword. Check this Lifehacker article for details.

The nice thing is that your bindings are kept in your session, so you can use variable names defined in previous invocations. There are inconveniences, for instance you cannot easily preview your history and it's hard to review inputs which are more than a couple of lines long.

I also find that in combination with URL-shortening services it's an acceptable alternative to using code snippet sharing sites like gist.github.com or paste.pocoo.org. The code is actually part of the URL, which makes sense for shorter code snippets. You don't have syntax highlighting, but you have the evaluated result readily available.

Wednesday, September 16, 2009

3 things you didn't know Scala pattern matching can(not) do

Since I started learning Scala, pattern matching has become a favorite feature. But powerful as it is, pattern matching has some limitations. On the other hand, there are some unexpected ways to use pattern matching.

Fake multiple assignment of tuples

You already knew you can initialize multiple variables using the tuple syntax:

val (a, b) = (1, 2)

I want to do the same with closure parameters, though. For example, to swap the elements of 2-tuples in a list:


List(1 -> "one") map {
  t => (t._2, t._1)
}

Still, using numbered slots for tuples doesn't seem as readable. It would be great if we could easily assign variables with meaningful names to document the purpose of the elements of the tuple. Parameters are already packed in a tuple, so why can't I do this:


List(1 -> "one") map {
  val (num, str) = _; (str, num)
}

Unfortunately this is not legal syntax. Of course, I can explicitly bind the tuple to a temporary variable and then decompose it:


List(1 -> "one") map {
  t => val (num, str) = t; (str, num)
}

But this seems too verbose for Scala. Fortunately, there's a trick which can help. Pattern matches are actually instances of partial functions! The closures used for map, filter, etc. are functions, too. This means we can use a pattern match to bind variables to the closure parameters:


List(1 -> "one") map {
  case (num, str) => (str, num)
}

Match the last element of a list

Decomposing lists in pattern matching is another power feature of Scala. You can match the first element of a list:


List(1, 2, 3) match {
  case List(1, _*) => "yeah!"
}

The underscore-star (_*) symbol serves as a placeholder for the rest of the elements of the list. You can also do this:


List(1, 2, 3) match {
  case 1 :: _ => "hurray!"
}

Which mimicks construction of the list using the cons operator. So what if I want to match on the last element instead of the first?


(1 to 9).toList match {
  case List(_*, 8, 9) => "sure"
}

Hey, this seems like it's working! Not so fast, though:


(1 to 9).toList match {
  case List(_*, 18) => "yeah, right"
}

In fact, in Scala 2.8 this syntax will trigger an error message: "_* may only come last"

How about using the alternative notation?


(1 to 9).toList match {
  case _ :: 9 :: Nil=> "no way"
}

Unfortunately Scala expects the placeholder to be an element, not a list. The triple colon expects a list, will it work?


(1 to 9).toList match {
  case _ ::: 9 :: Nil => "absolutely not"
}

Nope, this isn't even recognized by the compiler. While you're wondering why there is a ::: operator, but it's unknown to Scala in pattern matching, remember that ::, which is used in pattern matching, is actually a class. Furthermore, the syntax "a :: b" is in fact a more readable expression of "::(a, b)", which in turn is a short form of "call unapply on the matching expression and check if it decomposes to a and b".

And this, in short, is how the Scala black magic, called "extractors", works. But can we also use it to match on the last element of a list? Sure! Just define an object (let's call it "::>") and define its unapply method. The method expects a list and must return a tuple of the "init" part of the list and the "last".


object ::> {def unapply[A] (l: List[A]) = Some( (l.init, l.last) )}

List(1, 2, 3) match {
  case _ ::> last => println(last)
}

(1 to 9).toList match {
  case List(1, 2, 3, 4, 5, 6, 7, 8) ::> 9 => "woah!"
}
(1 to 9).toList match {
  case List(1, 2, 3, 4, 5, 6, 7) ::> 8 ::> 9 => "w00t!"
}

Match structural types

Since you've heard of structural types, didn't you wish you could do this to find out whether a class has a certain method?


List(1) match {
  case t: {def length: Int} => "cool!"
}

Unfortunately, here again we are fooled in thinking method checking happens when matching the class. It's not:


List(1) match {
  case t: {def aoeu: Int} => "really?"
}

The reason for this is that the class type is erased to AnyRef. Unfortunately, there's no clean workaround for this- yet. This looks positively ugly:


List(1) match {
  case t: {def aoeu: Int}
          if t.getClass.getMethods.exists{_.getName == "aoeu"} => "not really!"
}

Do not despair, though- there's a chance that this feature is going to land in Scala sooner or later.

Tuesday, June 23, 2009

The ultimate Twitter client

...is a Feed reader!

OK, not quite, but after some thinking about the top features the ideal Twitter client should have, I found feed readers have most. For the minor drawback of not being able to post messages, you get:

Marking messages as read

Fixed replies

Favorites of friends

others

friends

recommend

Tracking

No follow/unfollow counting

But some of these "features" require you to import the feed of every single one of your users. Noone in their right mind would do that manually, but there's a way to generate a list of feeds in the form of an OPML file (which many readers can import). First of all, get a list of your friends. Then apply the following XSL stylesheet:


<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" encoding="ISO-8859-1" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <opml>
   <body>
    <outline title="twitter">
    <xsl:for-each select="/ids/id">
     <outline>
      <xsl:attribute name="title"><xsl:value-of select="."/></xsl:attribute>
      <xsl:attribute name="xmlUrl">http://twitter.com/statuses/user_timeline/<xsl:value-of select="."/>.atom</xsl:attribute>
     </outline>
    </xsl:for-each>
    </outline>
   </body>
  </opml>
 </xsl:template>

</xsl:stylesheet>

This contains only the most essential elements of a feed, and the name of each feed is the ID, not the name, but Gooogle Reader imports the feed successfully and some readers can rename the feed from information provided in the RSS/Atom format.

For an OPML of your friends' favorites, replace "statuses/user_timeline" with favorites.

Monday, February 9, 2009

3 Scala gotchas

It's been over a year since I bought my "Programming In Scala" book by Martin Odersky. Since then, a lot has changed. There are now 3 more Scala books in the works: a "Programming Scala" by Venkat Subramaniam (Pragmatic Bookshelf), "Beginning Scala" by David Pollak (Apress) and another "Programming Scala" by Alex Payne (of Twitter fame) and Dean Wampler (by O'Reilly). Scala is the key actor in a couple of new software projects (pun intended), including one at Apache (ESME, to which yours truly is a committer; ok, I have no shame).

There's no perfect language, though, and you can't say you know a language until you know its warts. So here's my share of traps, which wasted a lot of my time- I certainly hope you discover them before you waste yours. Of course, these don't mean the language is ill-designed; if only minor corner cases like these can be discovered, this would suggest that a lot of thought has been put in the language.

Main method in companion objects

object Test {
  def main(args: Array[String]) {
    println("Running!")
  }
}

You copy & paste a simple example and compile it. Great, everything works and you're happy. So now you read about companion objects and decide to use your main method so you can test the companion class. You start with a dummy and (if you're cautious), try to compile right away:


class Test {
}

Good, it compiles. Running it, though, results in a NoSuchMethodException for main. Huh? Well, there's a main method only in singleton objects, not in companion objects.

Variables and constants in pattern matching


val Extension = "xml"
"json" match {
  case Extension => "Wrong!"
}

Here, you want to match on the contents of a variable. It works as expected: in the above example a MatchError is thrown, as expected. So suddenly you decide that you want to rename the variable...


val fileExtension = "xml"
"json" match {
  case fileExtension => "Wrong!"
}

Boom! Now pattern matching matches things it shouldn't! Actually, it matches everything you throw at it. What has happened? Well, Scala checks the name in order to decide if it's an existing constant (capitalized first letter) or a new variable name it should bind (lower-case first letter). So in the last example a new variable called "fileExtension" was created, which shadows the existing variable name.

Always returning Unit from methods


def test1 {
  println("test1")
}

def test2 = {
  "test"
}

No surprises here- one method prints a String, the other returns one. So then you suddenly decide that you want only one method to this (contrived) and merge these two:


def test {
  println("test1")
  "test2"
}

println(test)

...and the output is not what is expected: only "test1" is printed. What's wrong here? In Scala, if you omit the equals sign after the method declaration and before the method body, the result is always Unit (same as void in Java).

Wednesday, January 28, 2009

Scala is static- is Scala dynamic?

Despite being statically typed, Scala is not an inflexible language. Smarter folks than me have commented on extending the behavior of classes dynamically using implicits.

I want to focus on another feature of Scala which allows for dynamic behavior. In concurrent programming with actors, the default handling involves pattern matching on the messages received. Pattern matching and actor concurrency have been borrowed from Erlang- a dynamic language by nature- and have been one of the key differentiating concepts compared to languages I've learned before.

When calling a method on a class, you have to know the method signature at compile-time. The types of the parameters you pass must be defined (even through an implicit type conversion). But when you pass a message to an actor (something of an asynchronous method call), you don't need to care what type of class the actor is or what type of messages it can process- at least at compile time:

import scala.actors._
import scala.actors.Actor._

val a = actor {
  loop {
    react {
      case i: Int => println("Got an integer")
      case o: AnyRef => println(o.getClass)
    }
  }
}

a ! 2
a ! 2.8
a ! "what is this?"

There's an unassuming case denoted by the underscore character (_). This is the so-called catch-all case, the moral equivalent of Ruby's method_missing. As you can see, you can send any object to the actor, and all undefined classes can be handled dynamically by this placeholder case.

Of course, the spirit of Scala doesn't encourage handling patterns using the catch-all method. You can send messages to actors only through a typed scala.actors.Channel:

import scala.actors._
import scala.actors.Actor._

val c = new Channel[Int](actor {
  loop {
    react {
      case _ ! _ => println("test")
    }
  }
})

c ! 2

Once you use this Int-typed Channel, you can only send Int messages to it (unless there are any implicit conversions). For instance this would be caught by the compiler and fail with an error message:


c ! "2"
:10: error: type mismatch;
 found   : java.lang.String("2")
 required: Int
       c ! "2"
           ^

The Channel is constructed using an actor as an argument. This actor can then match by the scala.actors.! class (yes, it's the name of the class) and check which messages have arrived through which channels.

This took awhile to discover, since the best documentation is in the source itself. If there are any other (and better) idioms for using scala.actors.Channel, I'd be happy to learn them!

Wednesday, January 21, 2009

Apply search to following

There are still some things that don't make sense in search.twitter.com. For instance, it would be perfectly logical to search in just your followees' timelines. Why, TwitterSpy is already doing it. You could, of course, generate a huge query with "from:" and every possible person you're following, but search.twitter.com restricts the length of the query. Good luck if you have more than 10 friends. Uh, are there people like this after 1 month of using Twitter?

You could also write a console script, except that you don't want to type your password and reauthenticate once per follower. So doing it from the browser might be a better fit for a quick & dirty approach. This way your session would be reused.

Here's an educational bookmarklet. You can paste it in the URL bar of Firefox (didn't bother with other browsers) when you've opened twitter.com (won't work if window has a link open under another domain). And you can bookmark it- you thought it would be called a bookmarklet for nothing?

javascript:query = prompt("Twitter search:");
xmlhttp = new XMLHttpRequest();
xmlhttp.open("GET", "http://twitter.com/statuses/friends.json");
xmlhttp.onload=function (){
        list=eval(xmlhttp.responseText);
        for (i=0; i<list.length; i++)
                window.open("http://search.twitter.com/search?q="
                        + query + "+from:" + list[i].screen_name)
};
xmlhttp.send(null)

This will open one search window per followee- so don't try this if you have loads of'em. You will hit a limit in Firefox' default config, and you can increase it using the property:


browser.tabs.maxOpenBeforeWarn

Still, you will hit a hardware/hardcoded limit if you're not careful with how many people you're following. I told you this only has educational value, right?

Friday, January 9, 2009

SliTaz: smaller than you think, more useful than you think

I started using SliTaz on a USB stick half an year ago just as a workaround to make my laptop with crashed hard-disk usable and try to rescue some of the disk contents. Eventually I ended up keeping the distro as my regular OS. There's a lot to like about it.

First of all, it's unbelievably small- 25 MB. Yes, in times of Vista, there are graphical OS's and software distros that small. Yes, they cheat, but with style- the OS is uncompressed using lzma, one of the most efficient compression algorithms. And it has an amazing amount of software, too- Openbox provides a slick GUI and even graphical effects like transparency, Geany is a powerful programmer's editor, Firefox is there, a mail client, torrent client, CD/DVD burner, PDF viewer, mplayer, even Sudoku. I like the choice of lightweight, but stylish programs. Finally, there's a system information applet, which gives you just what you need, and a Debian-like package manager- with a GUI.

The lack of an office suite is one of the key decisions in SliTaz. These days it is not uncommon to rely on Google Docs as a no-frills office suite. And getting rid of one of the bulkiest pieces of software has a lot of unexpected consequences. The OS is loaded in RAM, which makes it exceptionally fast and responsive, and eliminates a lot of reading/writing to the USB stick, which prolongs its life.

Why is SliTaz significant? Small is fashionable again due to the resurgent interest in mobile devices and ultra-mini PCs. Especially the Asus EEE PC and similar devices would be a perfect fit.

In this age of mobility it becomes increasingly important to have a stable environment so you can maximize your productivity. It becomes annoying if you have to set up a new PC every couple of weeks or switch between two different environments. By carrying around a USB stick with your OS, you have not only your documents and portable programs with you- the window manager is the same, you can use the familiar key bindings, the shortcuts and menus are the same. It feels like you're on the same machine.

Finally, you can regain this lost feeling that you understand what's going on with your OS. You keep it simple, you have just what you need and use everyday, and when you need a rarely used package, just install it from the repository. It won't be there when you reboot (unless you explicitly write RAM to the USB), and security-wise, I like the idea- it's as if you update your software for security vulnerabilities when you need it. It's also harder for malware to set itself up to start on reboot.

But why did I choose this specific distro? I used DamnSmallLinux for a while, and it was really cool for awhile, but showing its age- most of the software is hopelessly old- kernel 2.4 and... only last year switched to Firefox 2? No, that's not a typo- in times when Firefox 3 is stable, lagging one major version behind seems backwards. If you want to be small, get Firefox 3, use Google Docs, get rid of the office suite.

More importantly, in DSL there was a 50 MB core of applications, which could not be easily modified and customized. That makes it harder to upgrade as well. In SliTaz you can change everything- remove or upgrade to your heart's content. I can remove more programs, making SliTaz even slimmer.

The only nitpick I have is that a lot of the configuration utilities assume you have certain base programs. For instance, I wanted to have mrxvt instead of xterm, and get rid of xterm. Eventually wrote a small wrapper for xterm which invokes mrxvt.

#!/bin/sh
while [ $# -gt 0 ]; do
 if [ $1 == "-fa" ]; then
  ARGS="$ARGS -fn"
 elif [ $1 == "-fs" ]; then
  shift
 elif [ $1 == "-e" ]; then
  ARGS="$ARGS -e sh -c '"
  while [ $# -gt 0 ]; do
   shift
   ARGS="$ARGS $1"
  done
  ARGS="$ARGS'"
  break
 else
  ARGS="$ARGS \"$1\""
 fi
 shift
done
eval exec mrxvt $ARGS

Speaking my (programming) language?