Friday, 21 October 2011

Scala blunder: appending to a Seq that is a List

I recently made a mistake in a loop reading lines from a file, doing some string manipulation and adding the result to a collection. A seemingly trivial Scala script just refused to halt. My mistake is illustrated by the following two toy examples, adding integers to a Seq and a Vector, respectively:

var x1 = Vector[Int]()
for(i <- 0 to 100000) { x1 = x1 :+ i }

var x2 = Seq[Int]()
for(i <- 0 to 100000) { x2 = x2 :+ i }
One of the above for loops runs about 38,648 times slower than the other one (according to a single, somewhat sloppy benchmark using Scala 2.9.1). The explanation, I believe, is that the Seq turned out to be backed by a List. Lists hate being appended to (:+), and this hatred manifests itself in bad performance. Good to know if you want a program to be impressingly slow.

By the way, this made me think of another one:
var s1 = ""
for(i <- 0 to 100000) s1 = s1 + i

var s2 = ""
for(i <- 0 to 100000) s2 = s2.concat(i.toString)
I don't know why you'd want to create a string like the above, but the version using + is about four times slower than the one using concat (Scala 2.9.1).

Wednesday, 28 September 2011

Programming Scala without... anything

You don't need all that fancy, modern stuff. A keyboard, a terminal window and the scala command are all you need:

$ scala -e 'println(io.Source.fromFile("freq_list.txt").getLines().map(_.split("\t")(0).toInt).sum)'
71213401
(Prints the result of summing the frequency numbers found in the first tab separated field of file freq_list.txt. The result turned out to be 71213401.)

When the programs get longer, you better stay focused.

Friday, 29 April 2011

Testing Scala 2.9.0 (RC2) parallel collections: four extra key strokes, double speed

We have just tried the new parallel collections that you can find in Scala 2.9.0.RC2.

By adding .par at a few places, the software we tested ran almost twice (1.9 x) as fast on a two core processor. Running the same code on a four core processor was, as expected, quicker (2.7 x), but not four times as fast. That's quite a performance boost, with close to zero programming effort.

The software we've tested validates (electronic) pronunciation dictionaries, where each entry has an orthography, a phonetic transcription and some other stuff. The program runs a large number of quality checks to find problems (faulty transcriptions, inconsistencies, etc) that are hard or impossible for a human lexicographer to find. It runs hundreds or even thousands of validation rules, using regular expressions and other string processing, on a hundred thousand or more dictionary entries.

The software runs a sequence of validation rules on each input entry. The validation rules are independent of each other, suitable for running in parallel. The rules, living in a Seq, are applied in sequence in a call to map(...). By calling .par.map(...) on the Seqs holding the validation rules, a multi-core processor is now able to perform the validation in parallel (par returns a parallel version of a collection).

Apart from using parallel collections at the point where the validation rules are run, we also run the main loop, reading the input lexicon data, using a parallel collection. Adding parallel collections at different places (the outermost loop and inside the validation) seems to add to the performance gain.

An initial problem that we had, was that the Scala 2.9.0.RC2 API documentation fooled us to believe that foldLeft would, just like map, run in parallel. That appears to be incorrect. We had to change calls to foldLeft into calls to map (followed by an additional foldLeft to aggregate the result). I don't know if I've misunderstood the documentation, or if parallel foldLeft is pending.

Anyway, double speed, or more, with zero effort. It sounds too good to be true, but this quick test suggests that it works like a charm.

And now I want more cores.