<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3840687515615686738</id><updated>2011-10-24T10:09:58.969+02:00</updated><category term='list arraybuffer'/><category term='parallel collections'/><category term='simple backup'/><category term='postgresql'/><category term='case class'/><category term='scala.xml.XML.loadFile'/><category term='empty fields'/><category term='epiphany'/><category term='scott rosenberg'/><category term='junit'/><category term='apt-get'/><category term='seq'/><category term='codepoints'/><category term='map'/><category term='upcasing'/><category term='companion object'/><category term='textfield'/><category term='collection'/><category term='richstring'/><category term='jcl'/><category term='netbeans 6.5'/><category term='firebird'/><category term='concatenate'/><category term='string'/><category term='firefox'/><category term='webstart'/><category term='iceweasel'/><category term='toLowerCase'/><category term='tail'/><category term='haskell'/><category term='utf8'/><category term='getlines'/><category term='scooters'/><category term='script'/><category term='debian'/><category term='patternmatching'/><category term='implicip conversion'/><category term='touppercase'/><category term='performance'/><category term='unicode'/><category term='head'/><category term='review'/><category term='laptop'/><category term='Turkish'/><category term='alias'/><category term='ubuntu 8.04'/><category term='System.setErr'/><category term='xml'/><category term='SALDO 1.0'/><category term='case insensitive'/><category term='keypressed'/><category term='scala'/><category term='64-bit'/><category term='mysql'/><category 
term='java'/><category term='vacation'/><category term='usb'/><category term='tellstick'/><category term='debian 4.0'/><category term='squeryl'/><category term='empty elements'/><category term='fall'/><category term='regular expression'/><category term='bash'/><category term='scala plugin'/><category term='book'/><category term='code point'/><category term='equality'/><category term='netbeans'/><category term='shadowing'/><category term='split'/><category term='savecompiled'/><category term='System.setOut'/><category term='tuple'/><category term='interview'/><category term='wireless'/><category term='identical typaces'/><category term='equalsIgnoreCase'/><category term='dreaming in code'/><category term='string.split'/><category term='sbackup'/><category term='pattern'/><category term='auxiliary constructors'/><category term='fromfile'/><category term='par'/><category term='Swedish lexicon'/><category term='ubuntu'/><category term='Svenskt associationslexikon'/><category term='Vector'/><category term='equals'/><category term='real world haskell'/><category term='+='/><category term='flamerobin'/><title type='text'>Nikoloogle Lindbloogle</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>58</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1161913327936097432</id><published>2011-10-21T15:50:00.000+02:00</published><updated>2011-10-21T15:50:32.419+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='seq'/><category scheme='http://www.blogger.com/atom/ns#' term='Vector'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala blunder: appending to a Seq that is a List</title><content type='html'>I recently made a mistake in a loop reading lines from a file, doing some string manipulation and adding the result to a collection. A seemingly trivial Scala script just refused to halt. My mistake is illustrated by the following two toy examples, adding integers to a &lt;code&gt;Vector&lt;/code&gt; and a &lt;code&gt;Seq&lt;/code&gt;, respectively:&lt;pre&gt;var x1 = Vector[Int]()&lt;br /&gt;for(i &lt;- 0 to 100000) { x1 = x1 :+ i }&lt;br /&gt;&lt;br /&gt;var x2 = Seq[Int]()&lt;br /&gt;for(i &lt;- 0 to 100000) { x2 = x2 :+ i }&lt;/pre&gt;The &lt;code&gt;Seq&lt;/code&gt; loop runs about 38,648 times slower than the &lt;code&gt;Vector&lt;/code&gt; loop (according to a single, somewhat sloppy benchmark using Scala 2.9.1). The explanation, I believe, is that the &lt;code&gt;Seq&lt;/code&gt; turned out to be backed by a &lt;code&gt;List&lt;/code&gt;. Lists hate being appended to (&lt;code&gt;:+&lt;/code&gt;), and this hatred manifests itself in bad performance. 
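(A minimal sketch of two ways to sidestep the slow append, not taken from the original script; the names &lt;code&gt;buf&lt;/code&gt;, &lt;code&gt;xs&lt;/code&gt;, &lt;code&gt;acc&lt;/code&gt; and &lt;code&gt;ys&lt;/code&gt; are made up for illustration, and no timings are claimed for either variant:&lt;pre&gt;// Variant A (assumption: a mutable buffer is acceptable while building)&lt;br /&gt;val buf = collection.mutable.ListBuffer[Int]()&lt;br /&gt;for(i &lt;- 0 to 100000) { buf += i }&lt;br /&gt;val xs: List[Int] = buf.toList&lt;br /&gt;&lt;br /&gt;// Variant B: prepend to the List (cheap), then reverse once at the end&lt;br /&gt;var acc = List[Int]()&lt;br /&gt;for(i &lt;- 0 to 100000) { acc = i :: acc }&lt;br /&gt;val ys = acc.reverse&lt;/pre&gt;Appending with &lt;code&gt;:+&lt;/code&gt; to a &lt;code&gt;List&lt;/code&gt;-backed &lt;code&gt;Seq&lt;/code&gt; remains the slow path, since every append has to copy the list built so far.) 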
Good to know if you want a program to be &lt;em&gt;impressively&lt;/em&gt; slow.&lt;p&gt;&lt;/p&gt;By the way, this made me think of another one:&lt;pre&gt;var s1 = ""&lt;br /&gt;for(i &lt;- 0 to 100000) s1 = s1 + i&lt;br /&gt;&lt;br /&gt;var s2 = ""&lt;br /&gt;for(i &lt;- 0 to 100000) s2 = s2.concat(i.toString)&lt;br /&gt;&lt;/pre&gt;I don't know why you'd want to create a string like the above, but the version using &lt;code&gt;+&lt;/code&gt; is about four times slower than the one using &lt;code&gt;concat&lt;/code&gt; (Scala 2.9.1).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1161913327936097432?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1161913327936097432/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1161913327936097432' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1161913327936097432'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1161913327936097432'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2011/10/scala-blunder-appending-to-seq-that-is.html' title='Scala blunder: appending to a Seq that is a List'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8356076246066330934</id><published>2011-09-28T13:51:00.000+02:00</published><updated>2011-09-28T13:57:50.581+02:00</updated><title type='text'>Programming Scala without... anything</title><content type='html'>You don't need all that fancy, modern stuff. A keyboard, a terminal window and the scala command are all you need: &lt;br /&gt;&lt;pre&gt;$ scala -e 'println(io.Source.fromFile("freq_list.txt").getLines().map(_.split("\t")(0).toInt).sum)'&lt;br /&gt;71213401&lt;/pre&gt;(Prints the result of summing the frequency numbers found in the first tab separated field of file &lt;code&gt;freq_list.txt&lt;/code&gt;. The result turned out to be 71213401.)&lt;br /&gt;&lt;br /&gt;When the programs get longer, you better stay focused.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8356076246066330934?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8356076246066330934/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8356076246066330934' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8356076246066330934'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8356076246066330934'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2011/09/programming-scala-without-anything.html' title='Programming Scala without... 
anything'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4856660043313151547</id><published>2011-04-29T19:20:00.003+02:00</published><updated>2011-04-29T19:30:31.244+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='par'/><category scheme='http://www.blogger.com/atom/ns#' term='parallel collections'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Testing Scala 2.9.0 (RC2) parallel collections: four extra key strokes, double speed</title><content type='html'>We have just tried the new parallel collections that you can find in Scala &lt;a href="http://www.scala-lang.org/node/9314"&gt;2.9.0.RC2&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;By adding &lt;code&gt;.par&lt;/code&gt; at a few places, the software we tested ran almost twice (1.9 x) as fast on a two core processor. Running the same code on a four core processor was, as expected, quicker (2.7 x), but not four times as fast. That's quite a performance boost, with close to zero programming effort.&lt;br /&gt;&lt;br /&gt;The software we've tested validates (electronic) pronunciation dictionaries, where each entry has an orthography, a phonetic transcription and some other stuff. The program runs a large number of quality checks to find problems (faulty transcriptions, inconsistencies, etc) that are hard or impossible for a human lexicographer to find. 
It runs hundreds or even thousands of validation rules, using regular expressions and other string processing, on a hundred thousand or more dictionary entries.&lt;br /&gt;&lt;br /&gt;The software runs a sequence of validation rules on each input entry. The validation rules are independent of each other, and therefore suitable for running in parallel. The rules, living in a &lt;code&gt;Seq&lt;/code&gt;, are applied in sequence in a call to &lt;code&gt;map(...)&lt;/code&gt;. By calling &lt;code&gt;.par.map(...)&lt;/code&gt; on the &lt;code&gt;Seq&lt;/code&gt;s holding the validation rules, a multi-core processor is now able to perform the validation in parallel (&lt;code&gt;par&lt;/code&gt; returns a parallel version of a collection).&lt;br /&gt;&lt;br /&gt;Apart from using parallel collections at the point where the validation rules are run, we also run the main loop, reading the input lexicon data, using a parallel collection. Adding parallel collections in different places (the outermost loop and inside the validation) seems to add to the performance gain.&lt;br /&gt;&lt;br /&gt;An initial problem we had was that the Scala 2.9.0.RC2 API documentation fooled us into believing that &lt;code&gt;foldLeft&lt;/code&gt; would, just like &lt;code&gt;map&lt;/code&gt;, run in parallel. That appears to be incorrect. We had to change calls to &lt;code&gt;foldLeft&lt;/code&gt; into calls to &lt;code&gt;map&lt;/code&gt; (followed by an additional &lt;code&gt;foldLeft&lt;/code&gt; to aggregate the result). I don't know if I've misunderstood the documentation, or if parallel &lt;code&gt;foldLeft&lt;/code&gt; is pending.&lt;br /&gt;&lt;br /&gt;Anyway, double speed, or more, with zero effort. 
It sounds too good to be true, but this quick test suggests that it works like a charm.&lt;br /&gt;&lt;br /&gt;And now I want more cores.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4856660043313151547?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4856660043313151547/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4856660043313151547' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4856660043313151547'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4856660043313151547'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2011/04/testing-scala-290-rc2-parallel.html' title='Testing Scala 2.9.0 (RC2) parallel collections: four extra key strokes, double speed'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-965872749255522621</id><published>2010-09-15T10:20:00.000+02:00</published><updated>2010-09-15T10:20:13.549+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='interview'/><category scheme='http://www.blogger.com/atom/ns#' term='squeryl'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Interview with Maxime Lévesque, author of Squeryl</title><content 
type='html'>&lt;div&gt;&lt;a href="http://squeryl.org/"&gt;Squeryl&lt;/a&gt; is a great Scala database API. On its website, it is described like this: "A Scala ORM and DSL for talking with Databases with minimum verbosity and maximum type safety".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Preparing an introduction to Squeryl for a Swedish computer magazine, I sent a number of questions to Maxime Lévesque, the man behind Squeryl. The answers were so interesting that I asked his permission to post them here:&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Could you describe yourself in a few words?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm a dad, a programmer, a hobbyist bass player and percussionist.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm the kind of programmer who prefers writing libraries and frameworks to writing applications. If I was in the construction industry I'd probably be making bricks, mortar and nails rather than houses.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Do you develop Squeryl as part of your work, or is it a hobby?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Squeryl started as a hobby; only later did I start using it in a commercial project.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What are the most important features of Squeryl? Why should you use it?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The main reason to use Squeryl in an application, in my opinion, is to have the data access code validated by the compiler. 
I've seen many projects where the database schema stops evolving after a lot of code has been written against it. Ugly workarounds are sometimes chosen because there isn't enough time to investigate the repercussions of a schema change or conduct all the testing required.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Strongly typed languages are good for "deterministic refactoring". A data access layer needs to be refactorable, as any part of a system does. Perhaps to an even greater extent, because in a sense, bad design decisions get persisted with the data.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A developer needs all the help he can get from tools such as compilers and IDEs. Hard work and discipline don't scale. Why rely on them when you can have automated validation?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Reusability is another big one. Squeryl queries are composable, reusable pieces of code. A query that encodes a particular piece of application logic needs to be written only once, and can be reused anywhere it is needed. I'm a big believer in the DRY principle (Don't Repeat Yourself).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Low verbosity would be another strength. I dislike APIs or frameworks that require you to write more than you should.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What's the story behind Squeryl?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In 2005 I wrote an ORM for dotNet. I was in need of one at the time and I couldn't find a decent one that exploited generics and annotations, so I wrote my own. By the time I considered publishing it, LINQ came out, and instantly obsoleted my ORM (and all other ORMs except HaskellDB in my opinion).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A few years later I started to write a query DSL in Java, and at every step, I got bitten by language limitations. 
Every time I worked around them, the solution became a bit more ugly and verbose. I then discovered Scala, and started experimenting with writing a statically typed query DSL. I was amazed by the expressivity of the language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The fact that it was possible to write Squeryl as a library (i.e., without a compiler plug-in) speaks a lot about the potency of the language. The first two attempts were abandoned when they reached a critical level of inelegance. They were Squeryl's pre-history.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Squeryl is in fact my third attempt at a Scala ORM. When I became confident that a fourth rewrite wouldn't be necessary, I published it on GitHub.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;If Squeryl didn't exist, what would you use?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If Squeryl didn't exist, I'd have a look at ScalaQuery or Circumflex. 
I only have a superficial knowledge of them, but I would surely try them out before going to any of the Java-based ORMs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;If you are to demo Squeryl (e.g., to a Java programmer), do you have a favourite example?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's a one-liner that says a lot:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;pre&gt;&lt;span class="Apple-style-span" style="color: #741b47;"&gt;val&lt;/span&gt; &lt;span class="Apple-style-span" style="color: #bf9000;"&gt;avgHeight&lt;/span&gt;: &lt;span class="Apple-style-span" style="color: #38761d;"&gt;Option[Float]&lt;/span&gt; = &lt;br /&gt;  from(people)(p &lt;span class="Apple-style-span" style="color: #990000;"&gt;=&amp;gt; &lt;/span&gt;compute(avg(p.heightInCentimeters)))&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Apart from the shortness of the code, we can see a few implicit conversions at work. The compiler "knows" that the avg query can translate into a 32-bit floating point value, but it also "knows" that it is an Option[], because the avg aggregate function is not guaranteed to return something (the table can be empty). In fact it won't compile if you try to refer to it as a (non Option[]) Float.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Where has Squeryl turned up? 
Who uses it?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I haven't done a survey (it's on my to-do list), but I've exchanged emails with developers that are building systems with Squeryl in fields ranging from finance to bioinformatics.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;I read something about Lift...?&lt;/b&gt; &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Ross Mellgren from the Lift team has written an integration module that is part of Lift 2.1 (release candidate).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What's on the roadmap?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;High on my priority list is free text search (backed by Lucene). Longer term I'd like to add things like support for sharding and extending the DSL to exploit the geospatial capabilities of databases like Postgres, Oracle and H2.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Is it of any importance that Squeryl was written in Scala? Or was this merely a coincidence?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Without Scala there wouldn't be a way to have strongly typed queries on the JVM without having verbosity that reaches a caricatural level. Not only wouldn't there be Squeryl, but there wouldn't be anything like it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When Java came out I was impressed with all the features it had built in: serialization, RMI, garbage collection, portability. It was in its time a game-changing technology. Today I have the same impression of Scala: the level of static validation that it gives you, all this with minimal verbosity. 
If I could say just one thing to qualify it, I'd have to say: game changing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the answer is yes, Scala made Squeryl possible. I expect a lot of interesting Scala DSLs will get written in many domains in the coming years. I have a few other DSLs I'd like to write myself.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Any particular advice for someone beginning with Squeryl?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I would just copy an example from the Squeryl site, and modify it gradually so that it becomes your own schema. And most importantly, don't hesitate to ask questions in the discussion groups. I'm often impressed by the quality of the answers given by the community.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Thanks a lot for the great answers!&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-965872749255522621?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/965872749255522621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=965872749255522621' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/965872749255522621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/965872749255522621'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2010/09/interview-with-maxime-levesque-author.html' title='Interview with Maxime Lévesque, author of Squeryl'/><author><name>Nikolaj 
Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-678009514712530220</id><published>2010-05-06T15:46:00.013+02:00</published><updated>2010-05-06T16:38:35.852+02:00</updated><title type='text'>Using the Scala REPL to tell the difference between ЕКАТEРИНБУРГ and ЕКАТЕРИНБУРГ</title><content type='html'>Sometimes, one runs into UTF-8 strings with characters from different code blocks. This is problematic in cases where the fonts look the same, but the characters are different. The Scala REPL is handy for finding out what Unicode block each character in a string belongs to. Let's use "ЕКАТEРИНБУРГ" and "ЕКАТЕРИНБУРГ" as examples:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(0, 102, 0);"&gt;"ЕКАТEРИНБУРГ"&lt;/span&gt; == &lt;span style="color: rgb(0, 102, 0);"&gt;"ЕКАТЕРИНБУРГ"&lt;/span&gt;&lt;br /&gt;res0: Boolean = false&lt;br /&gt;&lt;br /&gt;scala&gt; import java.lang.Character.UnicodeBlock&lt;br /&gt;import java.lang.Character.UnicodeBlock&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(0, 102, 0);"&gt;"ЕКАТEРИНБУРГ"&lt;/span&gt;.foreach(c =&gt; println(c +&lt;span style="color: rgb(0, 102, 0);"&gt;"\t"&lt;/span&gt;+ UnicodeBlock.of(c)))&lt;br /&gt;Е CYRILLIC&lt;br /&gt;К CYRILLIC&lt;br /&gt;А CYRILLIC&lt;br /&gt;Т CYRILLIC&lt;br /&gt;E BASIC_LATIN&lt;br /&gt;Р CYRILLIC&lt;br /&gt;И CYRILLIC&lt;br /&gt;Н CYRILLIC&lt;br /&gt;Б CYRILLIC&lt;br /&gt;У CYRILLIC&lt;br /&gt;Р CYRILLIC&lt;br /&gt;Г CYRILLIC&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(0, 102, 0);"&gt;"ЕКАТЕРИНБУРГ"&lt;/span&gt;.foreach(c =&gt; println(c +&lt;span style="color: rgb(0, 102, 0);"&gt;"\t"&lt;/span&gt;+ 
UnicodeBlock.of(c)))&lt;br /&gt;Е CYRILLIC&lt;br /&gt;К CYRILLIC&lt;br /&gt;А CYRILLIC&lt;br /&gt;Т CYRILLIC&lt;br /&gt;Е CYRILLIC&lt;br /&gt;Р CYRILLIC&lt;br /&gt;И CYRILLIC&lt;br /&gt;Н CYRILLIC&lt;br /&gt;Б CYRILLIC&lt;br /&gt;У CYRILLIC&lt;br /&gt;Р CYRILLIC&lt;br /&gt;Г CYRILLIC&lt;br /&gt;&lt;br /&gt;scala&gt;&lt;br /&gt;&lt;/pre&gt;The REPL exposed one of the seemingly identical strings to be an unhealthy mix of Latin and Cyrillic characters. Thanks, REPL.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-678009514712530220?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/678009514712530220/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=678009514712530220' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/678009514712530220'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/678009514712530220'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2010/05/using-scala-repl-to-tell-difference.html' title='Using the Scala REPL to tell the difference between ЕКАТEРИНБУРГ and ЕКАТЕРИНБУРГ'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4524228623276329554</id><published>2010-04-11T00:31:00.053+02:00</published><updated>2010-04-12T10:58:14.422+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='case class'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>A tiny Scala case class to clean up user input</title><content type='html'>We needed to clean up user input entered into a text field. We ended up with a Scala case class that cleans up its constructor string argument a bit, by trimming it and collapsing each run of spaces into a single space. It behaves like this:&lt;br /&gt;&lt;pre&gt;scala&gt; Text("    a       a      ") == Text("a a")&lt;br /&gt;res0: Boolean = true&lt;br /&gt;scala&gt; Text("   a      a       ").text == Text("a a").text&lt;br /&gt;res1: Boolean = true&lt;br /&gt;scala&gt; Text("  a     a    ").text&lt;br /&gt;res2: java.lang.String = a a&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The code looks like this:&lt;br /&gt;&lt;pre&gt;case class Text(private var _text: String) {&lt;br /&gt; val text = _text.trim.replaceAll(" +", " ")&lt;br /&gt; _text = text&lt;br /&gt;}&lt;/pre&gt;Since the input string, &lt;code&gt;var _text&lt;/code&gt;, is private, we can manipulate it a bit, without making it possible for others to tamper with it. 
I'm not sure if this is the obvious way to do it, but it seems to work as intended.&lt;br /&gt;&lt;br /&gt;We tried a similar version that did not work:&lt;pre&gt;// Doesn't work&lt;br /&gt;case class BrokenText(private var _text: String) {&lt;br /&gt;  _text = _text.trim.replaceAll(" +", " ")&lt;br /&gt;  val text = _text&lt;br /&gt;}&lt;/pre&gt;This version does not work since &lt;code&gt;BrokenText.text&lt;/code&gt; will return the original string, not the cleaned-up one:&lt;pre&gt;scala&gt; BrokenText("    a      b      ")&lt;br /&gt;res0: BrokenText = BrokenText(a b)&lt;br /&gt;scala&gt; res0.text&lt;br /&gt;res1: String =     a      b    &lt;br /&gt;scala&gt;&lt;/pre&gt;Why doesn't the second version work? Beats me. (But I'm sure the answer will turn out to be obvious.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; See the two anonymous comments below: one answering my question above, the other one suggesting a neater way of handling it. Thanks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4524228623276329554?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4524228623276329554/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4524228623276329554' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4524228623276329554'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4524228623276329554'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2010/04/tiny-scala-case-class-to-clean-up-user.html' title='A tiny Scala case class to clean up user 
input'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8705929812082920369</id><published>2010-02-04T09:50:00.003+01:00</published><updated>2010-02-04T11:17:16.369+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='seq'/><category scheme='http://www.blogger.com/atom/ns#' term='list arraybuffer'/><category scheme='http://www.blogger.com/atom/ns#' term='tail'/><category scheme='http://www.blogger.com/atom/ns#' term='head'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: Getting into performance trouble, calling head and tail on an ArrayBuffer</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; The performance problem described below will be remedied in the final release of Scala 2.8. 
See martin's comment.&lt;br /&gt;&lt;br /&gt;====================================&lt;br /&gt;&lt;br /&gt;Recently, I wrote the following two different versions for doing the same thing (compute frequencies):&lt;br /&gt;&lt;pre&gt;// Version 1  --- Don't do this, lousy performance&lt;br /&gt;// Scala 2.8&lt;br /&gt;def freq[T](seq: Seq[T]): Map[T, Int] = { &lt;br /&gt; import annotation._&lt;br /&gt; @tailrec&lt;br /&gt; def freq(seq: Seq[T], map: Map[T, Int]): Map[T, Int] = {&lt;br /&gt;   seq match {&lt;br /&gt;     case s if s.isEmpty =&gt; map&lt;br /&gt;     case s =&gt; {&lt;br /&gt;       val elem = s.head&lt;br /&gt;       val n = map.getOrElse(elem, 0) + 1&lt;br /&gt;       freq(s.tail, map + (elem -&gt; n ))&lt;br /&gt;     }&lt;br /&gt;   }&lt;br /&gt; }&lt;br /&gt; freq(seq, Map())&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Version 2 --- 260 times faster than Version 1 on some input&lt;br /&gt;def freq[T](seq: Seq[T]): Map[T, Int] = {&lt;br /&gt; val freqs = collection.mutable.HashMap[T, Int]()&lt;br /&gt; for(elem &lt;- seq) {         &lt;br /&gt;   val n = freqs.getOrElseUpdate(elem, 0)         &lt;br /&gt;   freqs.update(elem, n + 1)     &lt;br /&gt; }     &lt;br /&gt; // Return immutable copy of freqs     &lt;br /&gt; Map() ++ freqs&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;When comparing the two versions, it turned out that for some input, Version 1 was about 260 times slower (after JVM warm-up).  The performance difference surfaced when both versions were called with the following different inputs:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;val linesList = io.Source.fromPath("testfile.txt").getLines().toList&lt;br /&gt;val linesSeq = io.Source.fromPath("testfile.txt").getLines().toSeq&lt;/pre&gt;&lt;br /&gt;Version 1, called with &lt;code&gt;linesSeq&lt;/code&gt; as input, performs horribly compared to when called with &lt;code&gt;linesList&lt;/code&gt;. 
On my own, I couldn't figure out why, but helpful and knowledgeable people at #scala solved my problem in a few seconds. The explanation appears to be that 1) The default implementation of &lt;code&gt;Seq&lt;/code&gt; is an &lt;code&gt;ArrayBuffer&lt;/code&gt;, and 2) Calling head and tail on an &lt;code&gt;ArrayBuffer&lt;/code&gt; is costly. The same operations are cheap on a &lt;code&gt;List&lt;/code&gt;. That's why Version 1 above is a performance trap.&lt;br /&gt;&lt;br /&gt;A possible way of getting better performance is to change the inner, two-argument &lt;code&gt;freq&lt;/code&gt; method to use &lt;code&gt;List&lt;/code&gt; instead of &lt;code&gt;Seq&lt;/code&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;// Version 1.b  --- Somewhat better&lt;br /&gt;// Scala 2.8&lt;br /&gt;def freq[T](seq: Seq[T]): Map[T, Int] = { &lt;br /&gt; import annotation._&lt;br /&gt; @tailrec&lt;br /&gt; def freq(seq: List[T], map: Map[T, Int]): Map[T, Int] = {&lt;br /&gt;   seq match {&lt;br /&gt;     case s if s.isEmpty =&gt; map&lt;br /&gt;     case s =&gt; {&lt;br /&gt;       val elem = s.head&lt;br /&gt;       val n = map.getOrElse(elem, 0) + 1&lt;br /&gt;       freq(s.tail, map + (elem -&gt; n ))&lt;br /&gt;     }&lt;br /&gt;   }&lt;br /&gt; }&lt;br /&gt; freq(seq.toList, Map())&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Better yet --- in Scala 2.8 --- is to scrap the entire method, and call &lt;code&gt;groupBy(identity).mapValues(_.length)&lt;/code&gt; directly on the &lt;code&gt;Seq&lt;/code&gt;...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8705929812082920369?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8705929812082920369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8705929812082920369' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8705929812082920369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8705929812082920369'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2010/02/scala-getting-into-performance-trouble.html' title='Scala: Getting into performance trouble, calling head and tail on an ArrayBuffer'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4878226649778808971</id><published>2010-02-01T22:26:00.041+01:00</published><updated>2010-02-03T15:15:58.418+01:00</updated><title type='text'>Counting Strings and Things in Scala (2.8)</title><content type='html'>I often need to count the frequencies of strings ("words", typically). Below are a few Scala snippets for counting strings and things. 
(Don't miss the last one.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;First try&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Let's start with a method for counting string frequencies in a list:&lt;br /&gt;&lt;pre&gt;// Scala 2.8&lt;br /&gt;def freq(wds: List[String]): Map[String, Int] = { &lt;br /&gt;  import annotation._&lt;br /&gt;  @tailrec&lt;br /&gt;  def freq(wds: List[String], map: Map[String, Int]): Map[String, Int] = {&lt;br /&gt;    wds match {&lt;br /&gt;      case l if l.isEmpty =&gt; map&lt;br /&gt;      case l =&gt; {&lt;br /&gt;        val elem = l.head&lt;br /&gt;        val n = map.getOrElse(elem, 0) + 1&lt;br /&gt;        freq( l.tail, map + (elem -&gt; n ) )&lt;br /&gt;      }&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;  freq(wds, Map())&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;It takes a list of strings, and returns a map (hash table) with a frequency count for each unique string. The one-argument freq method contains an embedded two-argument freq method. The second method recursively consumes elements of the list, incrementing the frequency count of the second accumulator argument. The two-argument method is initialised with an empty map (at the end of the one-argument method, &lt;code&gt;freq(wds, Map())&lt;/code&gt;).&lt;br /&gt;&lt;br /&gt;In each recursion, a new, immutable word frequency map is produced, with the incremented frequency count. The &lt;pre&gt;import annotation._&lt;br /&gt;@tailrec&lt;/pre&gt; part tells the compiler to check whether it can optimize the tail recursive call or not. 
(The Scala compiler can optimize a special case of tail recursion.)&lt;br /&gt;&lt;br /&gt;If the recursion makes you dizzy, you can use a mutable &lt;code&gt;HashMap&lt;/code&gt; instead:&lt;br /&gt;&lt;pre&gt;def freq(wds: List[String]): Map[String, Int] = {&lt;br /&gt;  val freqs = collection.mutable.HashMap[String, Int]()&lt;br /&gt;  for(w &lt;- wds) {     &lt;br /&gt;    val n = freqs.getOrElseUpdate(w, 0)     &lt;br /&gt;    freqs.update(w, n + 1)   &lt;br /&gt;  }   &lt;br /&gt;  // Return immutable copy of freqs   &lt;br /&gt;  Map() ++ freqs &lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Second try&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;You'll soon find out that the above code is limited, since it only accepts &lt;code&gt;List&lt;/code&gt; input. There is a more general concept, &lt;code&gt;Seq&lt;/code&gt;, that will make it possible to call &lt;code&gt;freq&lt;/code&gt; with different kinds of sequences (lists, listbuffers, arrays):&lt;br /&gt;&lt;pre&gt;// Scala 2.8&lt;br /&gt;def freq(wds: Seq[String]): Map[String, Int] = {  &lt;br /&gt;  import annotation._&lt;br /&gt;  @tailrec&lt;br /&gt;  def freq(wds: Seq[String], map: Map[String, Int]): Map[String, Int] = {&lt;br /&gt;    wds match {&lt;br /&gt;      case l if l.isEmpty =&gt; map&lt;br /&gt;      case l =&gt; {&lt;br /&gt;        val elem = l.head&lt;br /&gt;        val n = map.getOrElse(elem, 0) + 1&lt;br /&gt;        freq( l.tail, map + (elem -&gt; n ) )&lt;br /&gt;      }&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;  freq(wds, Map())&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Third try&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;One day you find yourself relocated from the word counting department to the character counting department. A string is a sequence, but of &lt;code&gt;Chars&lt;/code&gt;, not &lt;code&gt;Strings&lt;/code&gt;. The code above will not help you count character frequencies. 
Here is an attempt at generalising the code further, to make it able to count the frequencies of any thing, &lt;code&gt;T&lt;/code&gt;, not just &lt;code&gt;String&lt;/code&gt;:&lt;br /&gt;&lt;pre&gt;// Scala 2.8&lt;br /&gt;def freq[T](seq: Seq[T]): Map[T, Int] = {  &lt;br /&gt;  import annotation._&lt;br /&gt;  @tailrec&lt;br /&gt;  def freq(seq: Seq[T], map: Map[T, Int]): Map[T, Int] = {&lt;br /&gt;    seq match {&lt;br /&gt;      case s if s.isEmpty =&gt; map&lt;br /&gt;      case s =&gt; {&lt;br /&gt;        val elem = s.head&lt;br /&gt;        val n = map.getOrElse(elem, 0) + 1&lt;br /&gt;        freq(s.tail, map + (elem -&gt; n ))&lt;br /&gt;      }&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;  freq(seq, Map())&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here's the more general non-tail-recursive version:&lt;br /&gt;&lt;pre&gt;def freq[T](seq: Seq[T]): Map[T, Int] = {&lt;br /&gt;  val freqs = collection.mutable.HashMap[T, Int]()&lt;br /&gt;  for(elem &lt;- seq) {     &lt;br /&gt;    val n = freqs.getOrElseUpdate(elem, 0)     &lt;br /&gt;    freqs.update(elem, n + 1)   &lt;br /&gt;  }   &lt;br /&gt;  // Return immutable copy of freqs   &lt;br /&gt;  Map() ++ freqs &lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Hooray.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Last try&lt;/span&gt; (shamelessly lifted from someone at #scala)&lt;br /&gt;&lt;br /&gt;But... you can still do better than this. A while ago, someone on the #scala irc channel (unfortunately, I don't remember this person's name) answered a question on how to associate each integer in a sequence with the number of times each integer occurred (or something like that). It turns out that, in Scala 2.8, it is possible to write a frequency counting thing even more compactly:&lt;br /&gt;&lt;pre&gt;def freq[T](seq: Seq[T]) = seq.groupBy(x =&gt; x).mapValues(_.length)&lt;/pre&gt;It's so short that it's almost not worth defining a method/function for it. 
You can simply call &lt;code&gt;.groupBy(x =&gt; x).mapValues(_.length)&lt;/code&gt; directly on your &lt;code&gt;Seq&lt;/code&gt;. (Or &lt;code&gt;groupBy(identity).mapValues(_.length)&lt;/code&gt;, which is the same thing.)&lt;br /&gt;&lt;br /&gt;Double hooray. &lt;br /&gt;&lt;br /&gt;Benchmarking is tricky, but a small test indicates that the last, most beautiful, version is also the quickest, and that the recursive ones using only immutable maps (in some situations) are quite slow.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4878226649778808971?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4878226649778808971/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4878226649778808971' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4878226649778808971'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4878226649778808971'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2010/02/counting-strings-and-things-in-scala-28.html' title='Counting Strings and Things in Scala (2.8)'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-7875762471910700084</id><published>2009-11-22T16:33:00.023+01:00</published><updated>2009-11-24T12:25:09.343+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='keypressed'/><category scheme='http://www.blogger.com/atom/ns#' term='textfield'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Beware! scala.swing.TextField proclaims EditDone when it isn't</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Forget about &lt;code&gt;EditDone&lt;/code&gt;. See &lt;span style="font-weight: bold;"&gt;Update&lt;/span&gt; below!&lt;br /&gt;&lt;br /&gt;&lt;code&gt;scala.swing.TextField&lt;/code&gt; is a basic GUI component that can be used for&lt;br /&gt;letting the user input a line of text. When listening to this component, one can react to an &lt;code&gt;EditDone&lt;/code&gt; event:&lt;br /&gt;&lt;pre&gt;// Inside some GUI component ...&lt;br /&gt;val textField = new TextField(20)&lt;br /&gt;contents += textField&lt;br /&gt;listenTo(textField)&lt;br /&gt;&lt;br /&gt;reactions += {case EditDone(`textField`) =&gt;&lt;br /&gt;  println("Ok, searching DB for input "+ textField.text)&lt;br /&gt;}&lt;br /&gt;//...&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Fine. Whenever the user (me) hits the Enter key, the message, "Ok,&lt;br /&gt;searching DB for input ...", simulating a database search, is printed.&lt;br /&gt;&lt;br /&gt;However, what happens when some unrelated software product suddenly&lt;br /&gt;pops up a window while the user (me) is still inputting text into the&lt;br /&gt;TextField? I'll tell you what: The evil, non-sentient contraption prints&lt;br /&gt;the simulated search message --- just as if I had hit Enter.&lt;br /&gt;&lt;br /&gt;When the TextField loses focus, it emits an EditDone event. 
But I'm&lt;br /&gt;not done editing. I've only typed "a". I was about to type&lt;br /&gt;"abecedarian". Now the silly thing will search the database for all&lt;br /&gt;words containing the letter "a". I never told it to do that. This&lt;br /&gt;happened just because some other, unrelated, ill-behaving program&lt;br /&gt;grabbed the focus.&lt;br /&gt;&lt;br /&gt;Of course, the focus may also be lost because the user voluntarily&lt;br /&gt;changes windows (for instance, in order to Google for "abecedarian").&lt;br /&gt;&lt;br /&gt;As far as I can tell, there is no sane way to tell an EditDone event&lt;br /&gt;produced by the user (me) hitting Enter from an EditDone event&lt;br /&gt;produced because the TextField component lost focus.  This cannot be&lt;br /&gt;right.&lt;br /&gt;&lt;br /&gt;(A while ago, I asked about this on the Scala-user list. Not one single&lt;br /&gt;answer from one single soul in the entire Universe. It feels lonely.)&lt;br /&gt;&lt;br /&gt;(I'm using Scala 2.8.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Forget about &lt;code&gt;EditDone&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;What you should do, is not to listen to the &lt;code&gt;TextField&lt;/code&gt;, but to &lt;code&gt;TextField.keys&lt;/code&gt;. This way, you'll be able to catch a &lt;code&gt;KeyPressed&lt;/code&gt; event, and check if the key pressed was &lt;code&gt;Enter&lt;/code&gt;. Simple.&lt;br /&gt;&lt;br /&gt;It's a bit tricky to figure out, however, since it's not in the &lt;code&gt;TextField&lt;/code&gt; Scala docs (you'll have to find your way to &lt;code&gt;scala.swing.Component&lt;/code&gt;). 
This is how it could look:&lt;br /&gt;&lt;pre&gt;import swing._&lt;br /&gt;import event._&lt;br /&gt;&lt;br /&gt;//...&lt;br /&gt;&lt;br /&gt;// Inside some GUI component ...&lt;br /&gt;val textField = new TextField(20)&lt;br /&gt;contents += textField&lt;br /&gt;&lt;br /&gt;listenTo(textField.keys)&lt;br /&gt;&lt;br /&gt;import Key._&lt;br /&gt;reactions += {case KeyPressed(`textField`, Enter, _, _) =&gt;&lt;br /&gt;  println("Ok, searching DB for input "+ textField.text)&lt;br /&gt;}&lt;br /&gt;//...&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Thanks to Ingo Maier for &lt;a href="http://old.nabble.com/Re%3A-Re%3A-swing.TextField-generates-EditDone-event-when--losing-focus-p26487967.html"&gt;explaining&lt;/a&gt; this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-7875762471910700084?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/7875762471910700084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=7875762471910700084' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7875762471910700084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7875762471910700084'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/11/beware-scalaswingtextfield-proclaims.html' title='Beware! 
scala.swing.TextField proclaims EditDone when it isn&apos;t'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6878507690161151492</id><published>2009-11-16T15:10:00.018+01:00</published><updated>2009-11-16T16:01:19.210+01:00</updated><title type='text'>Source's getLines in Scala 2.8 now strips line end</title><content type='html'>In Scala 2.8 (not yet officially released), &lt;code&gt;scala.io.Source&lt;/code&gt; has been updated.&lt;br /&gt;&lt;br /&gt;When reading lines from a file, you no longer need to trim the lines, since newlines are removed by default. The code to read lines from a file using &lt;code&gt;Source&lt;/code&gt; may now look something like this (where &lt;code&gt;fName&lt;/code&gt; is a file name (a string)):&lt;br /&gt;&lt;pre style="color: rgb(51, 0, 0);"&gt;val lines = io.Source.fromPath(fName) getLines()&lt;/pre&gt;&lt;br /&gt;If you want to specify the input file encoding to be UTF-8, you could try this:&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(51, 0, 0);"&gt;val lines = io.Source.fromPath(fName)(&lt;/span&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;"UTF8"&lt;/span&gt;&lt;span style="color: rgb(51, 0, 0);"&gt;) getLines()&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;When you look at the API documentation, you'll find that &lt;code&gt;fromPath&lt;/code&gt; takes a &lt;code&gt;Codec&lt;/code&gt; as a second implicit parameter. 
Through some mysterious conversion (or "implicit conversion"), you can call it with a string (&lt;span style="color: rgb(0, 102, 0);"&gt;"UTF8"&lt;/span&gt;) instead, as in the example above.&lt;br /&gt;&lt;br /&gt;Anyway, no more &lt;code&gt;Source.fromFile(fName).getLines.map(_.stripLineEnd)&lt;/code&gt;. Someone is improving Scala!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6878507690161151492?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6878507690161151492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6878507690161151492' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6878507690161151492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6878507690161151492'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/11/sources-getlines-in-scala-28-now-strips.html' title='Source&apos;s getLines in Scala 2.8 now strips line end'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2243052234228928865</id><published>2009-08-17T11:05:00.004+02:00</published><updated>2009-08-17T11:32:51.680+02:00</updated><title type='text'>Cracker: "Tired of coding Perl"</title><content type='html'>Perl is doomed.&lt;br /&gt;&lt;br 
/&gt;Even &lt;a href="http://en.wikipedia.org/wiki/Cracker_%28band%29"&gt;Cracker&lt;/a&gt; is tired of it. The evidence is found in a song on their latest album, where they sing "I'm tired of coding Perl, tired of V.B.A."&lt;br /&gt;&lt;br /&gt;Take a look some 30 seconds into the video. The guy browsing the mod_perl Developer's cookbook is not happy. The "Turn on, tune in, drop out with me" video is &lt;a href="http://www.youtube.com/watch?v=HyxIrfjla88"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Perl is doomed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2243052234228928865?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2243052234228928865/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2243052234228928865' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2243052234228928865'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2243052234228928865'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/08/cracker-tired-of-coding-perl.html' title='Cracker: &quot;Tired of coding Perl&quot;'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-7308523068907501064</id><published>2009-07-29T16:00:00.008+02:00</published><updated>2009-07-30T09:45:14.681+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='auxiliary constructors'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala case classes don't have auxiliary constructors?</title><content type='html'>Today's lesson is that Scala case classes don't appear to have auxiliary constructors.&lt;br /&gt;&lt;br /&gt;In Scala, auxiliary constructors may be added to a class by defining a "&lt;code&gt;this&lt;/code&gt;" method:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;scala&gt; class AClass(s1: String, s2: String) {&lt;br /&gt;  def this(s: String) = this(s, "default")&lt;br /&gt;}&lt;br /&gt;defined class AClass&lt;br /&gt;&lt;br /&gt;scala&gt; new AClass("hey")&lt;br /&gt;res0: AClass = AClass@187b5ff&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Look what happens when you try the same trick on a case class:&lt;pre&gt;&lt;br /&gt;scala&gt; case class ACaseClass(s1: String, s2: String) {&lt;br /&gt;  def this(s: String) = this(s, "default")&lt;br /&gt;}&lt;br /&gt;defined class ACaseClass&lt;br /&gt;&lt;br /&gt;scala&gt; ACaseClass("hey")&lt;br /&gt;&amp;lt;console&amp;gt;:7: error: wrong number of arguments for method apply: (String,String)ACaseClass in object ACaseClass&lt;br /&gt;ACaseClass("hey")&lt;br /&gt;^&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The class definition with the auxiliary constructor compiles, but the call &lt;code&gt;ACaseClass("hey")&lt;/code&gt; is rejected with a compile-time error: as the message shows, the &lt;code&gt;apply&lt;/code&gt; method in object &lt;code&gt;ACaseClass&lt;/code&gt; only accepts the two arguments of the primary constructor.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Oops, yes, they can have auxiliary constructors --- see the comment below, by jkriesten, straightening things out!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: 
bold;"&gt;Update:&lt;/span&gt; Paul (see comment below) points to the following discussion on this topic &lt;a href="http://www.scala-lang.org/node/976"&gt;http://www.scala-lang.org/node/976&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-7308523068907501064?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/7308523068907501064/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=7308523068907501064' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7308523068907501064'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7308523068907501064'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/07/scala-case-classes-dont-have-auxiliary.html' title='Scala case classes don&apos;t have auxiliary constructors?'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-3831385114686364957</id><published>2009-06-09T21:53:00.024+02:00</published><updated>2009-06-09T22:38:52.208+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' 
term='code point'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Printing the Unicode code points of UTF8 characters (Scala)</title><content type='html'>Sometimes it is useful to be able to print the Unicode code point of a UTF8 character. (For instance, when you need to check if you mistakenly use a similar looking character instead of the one you're supposed to use.)&lt;br /&gt;&lt;br /&gt;Using Scala's RichString's format method, you can create a string of a zero padded, four digit, hexadecimal Unicode number, for example of the &lt;code&gt;'ä'&lt;/code&gt; character, like this:&lt;br /&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;"%04X"&lt;/span&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;.format('ä'.toInt)&lt;/span&gt;&lt;br /&gt;res0: String = 00E4&lt;br /&gt;&lt;br /&gt;scala&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here's a related example, printing a tab separated list of some IPA (phonetic) characters and their Unicode code points in a format suitable for using in Scala/Java strings:&lt;br /&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;"ɸβfvθðszʃʒʂʐçʝxɣχʁħʕʜ"&lt;/span&gt;\&lt;br /&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;.map(c =&gt;&lt;/span&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;"%s\t\\u%04X"&lt;/span&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;.format(c, c.toInt))&lt;/span&gt;\&lt;br /&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;.foreach(println)&lt;/span&gt;&lt;br /&gt;ɸ \u0278&lt;br /&gt;β \u03B2&lt;br /&gt;f \u0066&lt;br /&gt;v \u0076&lt;br /&gt;θ \u03B8&lt;br /&gt;ð \u00F0&lt;br /&gt;s \u0073&lt;br /&gt;z \u007A&lt;br /&gt;ʃ \u0283&lt;br /&gt;ʒ \u0292&lt;br /&gt;ʂ \u0282&lt;br /&gt;ʐ \u0290&lt;br /&gt;ç \u00E7&lt;br /&gt;ʝ \u029D&lt;br /&gt;x \u0078&lt;br /&gt;ɣ \u0263&lt;br /&gt;χ \u03C7&lt;br /&gt;ʁ \u0281&lt;br /&gt;ħ \u0127&lt;br /&gt;ʕ \u0295&lt;br /&gt;ʜ \u029C&lt;br /&gt;&lt;br /&gt;scala&gt;&lt;/pre&gt;(The line terminating backslashes in the 
Scala code are added to indicate the fact that the above is a one-liner that doesn't fit the page. Remove these and the newlines if you want to run the code in the Scala shell.)&lt;br /&gt;&lt;br /&gt;Knowing the codepoints can be useful, e.g. when you don't want to or can't input non-ASCII characters into your code:&lt;br /&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(0, 51, 0);"&gt;var v = &lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;"\u0278"&lt;/span&gt;&lt;br /&gt;v: java.lang.String = ɸ&lt;br /&gt;&lt;br /&gt;scala&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In Java, it looks similar, but you have to cast your chars to ints:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;String.format(&lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;"%04X"&lt;/span&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;, (int) 'ä')&lt;/span&gt;&lt;/code&gt;, etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-3831385114686364957?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/3831385114686364957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=3831385114686364957' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/3831385114686364957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/3831385114686364957'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/06/printing-unicode-code-points-of-utf8.html' title='Printing the Unicode code points of UTF8 characters (Scala)'/><author><name>Nikolaj 
Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1816113654285693916</id><published>2009-03-24T09:26:00.033+01:00</published><updated>2009-03-24T22:05:53.725+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='touppercase'/><category scheme='http://www.blogger.com/atom/ns#' term='toLowerCase'/><category scheme='http://www.blogger.com/atom/ns#' term='equalsIgnoreCase'/><title type='text'>The perils of changing the case of UTF8 strings</title><content type='html'>Below are a few examples of what happens to some just slightly exotic UTF8 strings when up-cased and then down-cased again. The German ß (Eszett) doesn't have an uppercase variant, and becomes two characters. The Greek Sigma has one uppercase variant, but two different lowercase versions: one word final (ς); one for other positions (σ) (explaining my not-so-very-amusing &lt;a href="http://nikolajlindberg.blogspot.com/2009/03/scala-reversing-string-by-up-and.html"&gt;joke&lt;/a&gt; in an earlier post).&lt;br /&gt;&lt;br /&gt;In the table below, you'll find two other Greek lowercase characters that don't like to be up-cased, ΰ and ΐ. These two characters ultimately become six (see the length columns).&lt;br /&gt;&lt;br /&gt;Last, the Turkish variants of &amp;lt;i&amp;gt;, always trusty when it comes to creating confusion (in a computer). The last but one row is interesting, since the original string is severely damaged. 
In the last row, the proper locale ("tr") is used, and the same string ends up in a much better condition.&lt;br /&gt;&lt;br /&gt;The table was generated using Scala (thus Java) strings. The column &lt;code&gt;EqIgnoreCase&lt;/code&gt; reports the result of comparing the original string and the up-cased and then down-cased version of that string using Scala's/Java's &lt;code&gt;equalsIgnoreCase&lt;/code&gt;. The two rightmost columns present the length of the string before and after changing the case up and down again.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;table style="background-color:black;" border="0" cellpadding="3" cellspacing="1"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="left"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;Orig&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="left"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;UpCase ↑&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="left"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;UpDown ⇅&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="right"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;EqIgnoreCase&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="right"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;OrigLen&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;th align="right"&gt;&lt;span style="color:white;"&gt;&lt;b&gt;NewLen&lt;/b&gt;&lt;/span&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody style="background-color: white;"&gt;&lt;tr&gt;&lt;td&gt;ß&lt;/td&gt;&lt;td&gt;SS&lt;/td&gt;&lt;td&gt;ss&lt;/td&gt;&lt;td&gt;false&lt;/td&gt;&lt;td align="right"&gt;1&lt;/td&gt;&lt;td align="right"&gt;2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ςσ&lt;/td&gt;&lt;td&gt;ΣΣ&lt;/td&gt;&lt;td&gt;σς&lt;/td&gt;&lt;td&gt;true&lt;/td&gt;&lt;td align="right"&gt;2&lt;/td&gt;&lt;td align="right"&gt;2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ΰΐ&lt;/td&gt;&lt;td&gt;Ϋ́Ϊ́&lt;/td&gt;&lt;td&gt;ΰΐ&lt;/td&gt;&lt;td&gt;false&lt;/td&gt;&lt;td align="right"&gt;2&lt;/td&gt;&lt;td 
align="right"&gt;6&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;iİıI&lt;/td&gt;&lt;td&gt;IİII&lt;/td&gt;&lt;td&gt;iiii&lt;/td&gt;&lt;td&gt;true&lt;/td&gt;&lt;td align="right"&gt;4&lt;/td&gt;&lt;td align="right"&gt;4&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;iİıI&lt;/td&gt;&lt;td&gt;İİII&lt;/td&gt;&lt;td&gt;iiıı&lt;/td&gt;&lt;td&gt;true&lt;/td&gt;&lt;td align="right"&gt;4&lt;/td&gt;&lt;td align="right"&gt;4&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="line-height:1.6"&gt;The lesson? Nothing special. That you can do terrible things to strings. That changing the case of strings may be an irreversible operation. That if you are to normalize some text into either lowercase or uppercase, you might need to decide which is most suitable for a given language. That it might be a good idea to keep the original strings after normalization. That using the correct locale might help. That I'm not a graphic designer (the table is hideous).&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1816113654285693916?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1816113654285693916/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1816113654285693916' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1816113654285693916'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1816113654285693916'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/03/perils-of-changing-case-of-utf8-strings.html' title='The perils of changing the case of UTF8 
strings'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-5858550004764451602</id><published>2009-03-08T00:14:00.005+01:00</published><updated>2009-03-08T00:23:21.460+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='touppercase'/><category scheme='http://www.blogger.com/atom/ns#' term='toLowerCase'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: Reversing a string by up- and then downcasing it</title><content type='html'>Did you know that you can reverse a string by merely upcasing it and then downcasing it again? Here's an example:&lt;br /&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;val s =&lt;/span&gt; &lt;span style="color: rgb(0, 102, 0);"&gt;"ςσσ"&lt;/span&gt;&lt;br /&gt;s: java.lang.String = ςσσ&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;s.toUpperCase.toLowerCase == s.reverse.toString&lt;/span&gt;&lt;br /&gt;res0: Boolean = true&lt;br /&gt;&lt;br /&gt;scala&gt;&lt;/pre&gt;&lt;br /&gt;If you don't believe me, just copy and paste the two lines of code above into the Scala interpreter, and see it for yourself.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-5858550004764451602?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/5858550004764451602/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=5858550004764451602' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/5858550004764451602'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/5858550004764451602'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/03/scala-reversing-string-by-up-and.html' title='Scala: Reversing a string by up- and then downcasing it'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1551295609963626600</id><published>2009-03-05T11:26:00.014+01:00</published><updated>2009-03-09T12:19:08.140+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='firebird'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='Turkish'/><title type='text'>The Firebird database: Problem handling UTF8 characters</title><content type='html'>The 'Latin capital letter I with dot above', İ (Unicode 0130), strikes again! This innocent looking Turkish character seems to be reliable when it comes to breaking software that should be able to handle UTF8. 
(See also &lt;a href="http://nikolajlindberg.blogspot.com/2008/03/beware-of-java-comparing-turkish.html"&gt;this post&lt;/a&gt; for a Java example.)&lt;br /&gt;&lt;br /&gt;This time it breaks the &lt;a href="http://www.firebirdsql.org/"&gt;Firebird&lt;/a&gt; database (in my case, v2.1.1 on a 64-bit Debian system). Downcasing some random characters in a database configured to handle UTF8 works fine:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;SELECT LOWER(&lt;span style="color: rgb(0, 102, 0);"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;AӴЁΪΣƓ&lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;'&lt;/span&gt;) FROM RDB$DATABASE&lt;/span&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;returns the expected string, &lt;code style="color: rgb(0, 102, 0);"&gt;aӵёϊσɠ&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;However, when you throw in the trouble-making &lt;code style="color: rgb(0, 102, 0);"&gt;İ&lt;/code&gt;, everything blows up:&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;SELECT LOWER(&lt;/span&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;'AӴЁΪΣƓİ'&lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;) FROM RDB$DATABASE&lt;/span&gt;&lt;br /&gt;*** IBPP::SQLException ***&lt;br /&gt;Context: Statement::Fetch&lt;br /&gt;Message: isc_dsql_fetch failed.&lt;br /&gt;&lt;br /&gt;SQL Message : -104&lt;br /&gt;Invalid token&lt;br /&gt;&lt;br /&gt;Engine Code    : 335544849&lt;br /&gt;Engine Message :&lt;br /&gt;Malformed string&lt;/pre&gt;&lt;br /&gt;Slightly different input generates a different error message:&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;SELECT LOWER(&lt;/span&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;'İA'&lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;) FROM RDB$DATABASE&lt;/span&gt;&lt;br /&gt;*** IBPP::SQLException ***&lt;br /&gt;Context: Statement::Fetch&lt;br /&gt;Message: isc_dsql_fetch failed.&lt;br 
/&gt;&lt;br /&gt;SQL Message : -802&lt;br /&gt;Arithmetic overflow or division by zero has occurred.&lt;br /&gt;&lt;br /&gt;Engine Code    : 335544321&lt;br /&gt;Engine Message :&lt;br /&gt;arithmetic exception, numeric overflow, or string truncation&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;There is an &lt;a href="http://tech.groups.yahoo.com/group/firebird-support/message/100535?var=1&amp;amp;l=1"&gt;item&lt;/a&gt; on the Firebird user list, but without any answers so far.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; As mariuz points out in a comment below, this defect now seems to be fixed in an upcoming version. See &lt;a href="http://tracker.firebirdsql.org/browse/CORE-2355"&gt;this&lt;/a&gt; bug tracker item.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1551295609963626600?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1551295609963626600/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1551295609963626600' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1551295609963626600'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1551295609963626600'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/03/firebird-database-problem-handling-utf8.html' title='The Firebird database: Problem handling UTF8 characters'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8058980771109106022</id><published>2009-01-06T13:06:00.020+01:00</published><updated>2009-01-09T19:48:40.749+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='real world haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='book'/><title type='text'>Book: Real World Haskell (not much real world so far :)</title><content type='html'>I've just started to read &lt;a href="http://www.realworldhaskell.org/"&gt;Real World Haskell&lt;/a&gt; (the paper book). It seems like a nice book (except for a few irritating and confusing typos/mistakes at the start of the book).&lt;br /&gt;&lt;br /&gt;However, I've read more than 100 pages so far, and still not a sign of any of the "real world" stuff promised by the title. I still know little or nothing about IDEs, how to compile the code, scripting, any practical details on how to structure your code into modules, or anything in that direction. So far, mostly (sometimes rather long-winded) discussions on specific (list) functions. One of the examples ends in a conclusion that might be paraphrased as "by the way, don't use the function we've discussed for the last few pages; in real world settings it doesn't work too well".&lt;br /&gt;&lt;br /&gt;In the real world, you occasionally run into both needles and haystacks, but that doesn't help in making sense of&lt;pre style="color: rgb(102, 0, 0);"&gt;isInAny3 needle haystack = any (isInfixOf needle) haystack&lt;/pre&gt;And one more real world example of the kind &lt;code style="color: rgb(102, 0, 0);"&gt;zip3foobar "quux"&lt;/code&gt; and I may start losing interest... 
or just start screaming.&lt;br /&gt;&lt;br /&gt;Well, the upcoming chapters have promising titles, so I guess I just have to keep reading.  And I guess you have to start with the basics. Still, over 100 pages, and mostly foobars so far...&lt;br /&gt;&lt;br /&gt;The book is available &lt;a href="http://book.realworldhaskell.org/read/"&gt;on-line&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8058980771109106022?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8058980771109106022/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8058980771109106022' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8058980771109106022'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8058980771109106022'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2009/01/book-real-world-haskell-no-much-real.html' title='Book: Real World Haskell (not much real world so far :)'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2803042169827784845</id><published>2008-12-18T23:33:00.027+01:00</published><updated>2009-01-06T16:51:27.330+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='script'/><category 
scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala for small throw-away scripting tasks</title><content type='html'>I've come to use Scala for tiny scripts to be thrown away after doing some small task. Typically this involves processing a few files, comparing some textual data, maybe extracting some fields of tab-separated files, etc. The kind of things that Perl used to be the obvious choice for.&lt;br /&gt;&lt;br /&gt;Although lacking Perl's simplified syntax for iterating over all lines in files, Scala works quite nicely for small tasks.&lt;br /&gt;&lt;br /&gt;For example, today I had to extract from a file all lines of four or more characters including only upper-case characters, and capitalize the output:&lt;pre style="color: rgb(102, 0, 0);"&gt;scala.io.Source.fromFile(args(0))&lt;br /&gt;.getLines.map(_.stripLineEnd).filter(_.matches("[A-Z]{4,}"))&lt;br /&gt;.map(_.toLowerCase.capitalize).foreach(println)&lt;/pre&gt;Not exactly a thing of beauty, but it only took a minute and it works. And it reminds me a bit of a classic Unix command line pipeline.&lt;br /&gt;&lt;br /&gt;A few things on my wish-list to make Scala even better for small scripts:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A nicer way of setting the output character encoding (currently you have to do something like &lt;code&gt;Console.setOut(new java.io.PrintStream(Console.out,true,"UTF8"))&lt;/code&gt;)&lt;/li&gt;&lt;li&gt; It would be great if &lt;code&gt;Source.getLines&lt;/code&gt; could remove the new line character of each line&lt;/li&gt;&lt;li&gt;A better name for &lt;code&gt;RichString.stripLineEnd&lt;/code&gt; (for some reason, it is totally impossible for me to remember the name of this method)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Maybe scripting support in the Scala Netbeans plugin? 
(Currently, I think the plugin wants you to put your code in a class/object)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2803042169827784845?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2803042169827784845/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2803042169827784845' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2803042169827784845'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2803042169827784845'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/scala-for-small-throw-away-scripting.html' title='Scala for small throw-away scripting tasks'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1484055138729795057</id><published>2008-12-12T12:01:00.013+01:00</published><updated>2008-12-12T12:37:26.898+01:00</updated><title type='text'>Scala: Reading  a tab separated file into a Map (first attempt)</title><content type='html'>Below is my first attempt, in Scala, at reading a tab separated file into a map, where the first and second fields of the input file make up the key-value pairs.&lt;br /&gt;&lt;br /&gt;There are probably better ways of doing it, but the following 
seems to work:&lt;pre style="color: rgb(102, 0, 0);"&gt;val keyValuePairs = scala.io.Source.fromFile(inputFileName, "UTF8")&lt;br /&gt;       .getLines.map(_.stripLineEnd.split("\t", -1))&lt;br /&gt;       .map(fields =&gt; fields(0) -&amp;gt; fields(1)).toList&lt;br /&gt;&lt;br /&gt;val map = Map(keyValuePairs : _*)&lt;/pre&gt;&lt;br /&gt;The &lt;code&gt;keyValuePairs:_*&lt;/code&gt; stuff is a way to pass a list (&lt;code&gt;keyValuePairs&lt;/code&gt;) to a method that takes a variable number of arguments, here the factory method of (the immutable) &lt;code&gt;Map&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;I'm pretty sure that there are neater ways of doing it. Furthermore, the above snippet does not do any sensible error checking or input validation (such as skipping empty lines).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1484055138729795057?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1484055138729795057/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1484055138729795057' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1484055138729795057'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1484055138729795057'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/scala-reading-tab-separated-file-into.html' title='Scala: Reading  a tab separated file into a Map (first attempt)'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1911021721704956185</id><published>2008-12-11T16:26:00.009+01:00</published><updated>2009-01-11T15:27:25.176+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='netbeans'/><category scheme='http://www.blogger.com/atom/ns#' term='junit'/><title type='text'>Intelligent Software: Netbeans (or JUnit?) can count to three!</title><content type='html'>I just noticed a (very) small detail in Netbeans. I was adding some unit tests when I noticed that Netbeans can count to at least three.&lt;br /&gt;&lt;br /&gt;When running a JUnit test suite of only one test, you get the message "The test passed". After adding another test, the message is "Both tests passed", then "3 tests passed", etc. (Well, of course, given that the tests pass.)&lt;br /&gt;&lt;br /&gt;Now, that's what I call (artificial) intelligence.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.economist.com/science/displaystory.cfm?story_id=12847128"&gt;Here's&lt;/a&gt; an unrelated article on counting to three (and more).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1911021721704956185?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1911021721704956185/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1911021721704956185' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1911021721704956185'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1911021721704956185'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/intelligent-software-netbeans-or-junit.html' title='Intelligent Software: Netbeans (or JUnit?) can count to three!'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6937433438125340363</id><published>2008-12-09T22:01:00.010+01:00</published><updated>2008-12-18T23:33:17.476+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='shadowing'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: Beware of inadvertently shadowing variables</title><content type='html'>I've just spent 15 minutes looking for a stupid mistake in some Scala code. The problem was that I had shadowed a variable.&lt;br /&gt;&lt;br /&gt;In some situations in Scala, you are allowed to &lt;span style="font-style: italic;"&gt;shadow&lt;/span&gt; variables. In other words, it is sometimes legal to give a new variable the same name as an existing one. This can lead to mistakes. The following legal code illustrates how you can shadow a method input variable:&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;def theShadow(list :Array[String]) : Seq[String] = {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;  // Mistake! 
Inadvertently&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;  // shadowing the input parameter:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;  val list = List("Asa", "nisi", "masa")&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;  list&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt; }&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;(The above is a very obvious example. When you make this mistake in real code, it will probably be in a less obvious context.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6937433438125340363?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6937433438125340363/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6937433438125340363' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6937433438125340363'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6937433438125340363'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/scala-beware-of-inadvertently-shadowing.html' title='Scala: Beware of inadvertently shadowing variables'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-5524890750861267962</id><published>2008-12-09T12:48:00.017+01:00</published><updated>2008-12-12T14:58:29.967+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='empty elements'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: XML serializer adds closing elements to empty elements</title><content type='html'>When printing Scala XML nodes/elements, closing tags for empty elements are added, even if there weren't any in the input.&lt;br /&gt;&lt;br /&gt;For example, if you input &lt;code&gt;&amp;lt;childless/&amp;gt;&lt;/code&gt;, the XML processor will add a closing tag like this:&lt;pre&gt;scala&amp;gt; &lt;span style="color: rgb(102, 0, 0);"&gt;val elem = &amp;lt;childless/&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;elem: scala.xml.Elem = &amp;lt;childless&amp;gt;&amp;lt;/childless&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;(The two versions of the XML element are equivalent, but sometimes it is practical to be able to do a simple string comparison of the input and output XML files. 
The added closing tags may make this harder.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;See &lt;a href="http://www.nabble.com/minor-XML-question-td19157873.html#a19157873"&gt;this&lt;/a&gt; thread.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-5524890750861267962?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/5524890750861267962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=5524890750861267962' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/5524890750861267962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/5524890750861267962'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/scala-xml-serializer-adds-closing.html' title='Scala: XML serializer adds closing elements to empty elements'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8130968059457714537</id><published>2008-12-05T11:49:00.009+01:00</published><updated>2008-12-13T10:50:28.517+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='scala.xml.XML.loadFile'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: Problems using the XML 
API</title><content type='html'>I've encountered some problems using the Scala XML API. The first one had to do with &lt;code&gt;scala.xml.XML.loadFile&lt;/code&gt; throwing away comment nodes of the input XML file.&lt;br /&gt;&lt;br /&gt;A helpful person on the scala-user list suggested instead using &lt;code&gt;scala.xml.parsing.ConstructingParser.fromFile&lt;/code&gt;. This worked nicely, keeping the comment nodes of the input file intact. However, when processing larger XML files, this approach did not work well, resulting in out-of-memory exceptions.&lt;br /&gt;&lt;br /&gt;Finally, I got yet another helpful answer on the scala-user list, this time in the form of some code that translates Java XML nodes into their Scala equivalents.&lt;br /&gt;&lt;br /&gt;If you get into the same trouble as I did, you may want to take a look at &lt;a href="http://www.nabble.com/attachment/20821893/0/Xml.scala"&gt;this code snippet&lt;/a&gt; posted on the scala-user list by David Pollak. (You might have to change the code a bit to suit your needs, though.)&lt;br /&gt;&lt;br /&gt;Yet another problem I've encountered: you might be hit by a performance problem when extracting child nodes of a large &lt;code&gt;Elem&lt;/code&gt; using the &lt;code&gt;\\&lt;/code&gt; or &lt;code&gt;\&lt;/code&gt; operators. (The fix seems to be to loop over the child nodes instead.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Summary:&lt;/span&gt; The current Scala XML API may not work flawlessly if you want both to process rather large documents and, at the same time, to keep all the information of the original input XML file... 
but it works fine if you write your own XML file reader (see link above) and are careful with the use of &lt;code&gt;\\&lt;/code&gt; or &lt;code&gt;\&lt;/code&gt; on large &lt;code&gt;Elems&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here's &lt;a href="http://nikolajlindberg.blogspot.com/2008/05/scala-programming-language-and-xml.html"&gt;an earlier post&lt;/a&gt; on Scala XML processing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8130968059457714537?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8130968059457714537/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8130968059457714537' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8130968059457714537'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8130968059457714537'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/12/scala-problems-using-xml-api.html' title='Scala: Problems using the XML API'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4918236914071461402</id><published>2008-11-19T12:56:00.006+01:00</published><updated>2008-11-19T16:06:02.938+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='netbeans 6.5'/><category 
scheme='http://www.blogger.com/atom/ns#' term='netbeans'/><category scheme='http://www.blogger.com/atom/ns#' term='scala plugin'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: New Netbeans 6.5 plugin</title><content type='html'>There is a new version of a &lt;a href="http://www.netbeans.org/"&gt;Netbeans&lt;/a&gt; 6.5 plugin for &lt;a href="http://scala-lang.org/"&gt;Scala&lt;/a&gt; programming.&lt;br /&gt;&lt;br /&gt;The Scala plugin already seems quite useful, and it's getting better and better for each new version.&lt;br /&gt;&lt;br /&gt;Check it out &lt;a href="http://wiki.netbeans.org/Scala"&gt;here&lt;/a&gt;. This is a link to the blog of the &lt;a href="http://blogtrader.net/page/dcaoyuan/"&gt;author&lt;/a&gt; of the plugin.&lt;br /&gt;&lt;br /&gt;By the way, Netbeans 6.5 was just released too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4918236914071461402?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4918236914071461402/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4918236914071461402' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4918236914071461402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4918236914071461402'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/11/scala-new-netbeans-plugin.html' title='Scala: New Netbeans 6.5 plugin'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4264255488137028818</id><published>2008-11-18T12:10:00.015+01:00</published><updated>2008-11-18T23:36:05.761+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tuple'/><category scheme='http://www.blogger.com/atom/ns#' term='map'/><category scheme='http://www.blogger.com/atom/ns#' term='+='/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: The Map += method expects a Tuple: += ((k, v))</title><content type='html'>In &lt;a href="http://scala-lang.org/"&gt;Scala&lt;/a&gt;, you use the &lt;code&gt;+=&lt;/code&gt; method to add a key-value pair to a Map. The key-value pair should be in the form of a Tuple, or a Pair. You can use different syntax for such pairs: &lt;code style="color: rgb(102, 0, 0);"&gt;("year", 2008)&lt;/code&gt;, &lt;code style="color: rgb(102, 0, 0);"&gt;"year" -&gt; 2008&lt;/code&gt;, &lt;code style="color: rgb(102, 0, 0);"&gt;Tuple2("year", 2008)&lt;/code&gt; or &lt;code style="color: rgb(102, 0, 0);"&gt;Pair("year", 2008)&lt;/code&gt;:&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;("year",2008) == "year" -&gt; 2008&lt;/span&gt;&lt;br /&gt;res0: Boolean = true&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;"year" -&gt; 2008 == Pair("year", 2008)&lt;/span&gt;&lt;br /&gt;res1: Boolean = true&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;Pair("year", 2008) == Tuple2("year", 2008)&lt;/span&gt;&lt;br /&gt;res2: Boolean = true&lt;br /&gt;&lt;/pre&gt;  &lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Thus, a few different but equal ways of adding a key-value pair to a &lt;code&gt;Map&lt;/code&gt;: 
&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;val map = new scala.collection.mutable.HashMap[String,Int]&lt;/span&gt;&lt;br /&gt;map: scala.collection.mutable.HashMap[String,Int] = Map()&lt;br /&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;map += (("year",2008))&lt;/span&gt;  &lt;span style="color: rgb(0, 102, 0);"&gt;// Note the double parentheses&lt;/span&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;map += ("year" -&gt; 2008)&lt;/span&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;map += Pair("year",2008)&lt;/span&gt;&lt;br /&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;map += Tuple2("year", 2008)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;  &lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;However, this one fails because the inner parentheses are missing, so Scala treats "year" and 2008 as two separate arguments rather than as one tuple: &lt;/p&gt;&lt;pre&gt;scala&gt; &lt;span style="color: rgb(102, 0, 0);"&gt;map += ("year",2008)&lt;/span&gt;&lt;br /&gt;&amp;lt;console&amp;gt;:6: error: type mismatch;&lt;br /&gt;found   : java.lang.String("year")&lt;br /&gt;required: (String, Int)&lt;br /&gt;map+=("year",2008)&lt;br /&gt;^&lt;br /&gt;&lt;/pre&gt;  &lt;br /&gt;You can check out, e.g., &lt;a href="http://www.nabble.com/scaladoc-is-not-uptodate-with-latest-release--td18015529.html"&gt;this&lt;/a&gt; and &lt;a href="http://www.nabble.com/Tuple2-and-maps-td19875692.html#a19875692"&gt;this&lt;/a&gt; thread on the Scala mailing list.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4264255488137028818?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4264255488137028818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4264255488137028818' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4264255488137028818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4264255488137028818'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/11/scala-map-method-expects-tuple-k-v.html' title='Scala: The Map += method expects a Tuple: += ((k, v))'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1296398393455140267</id><published>2008-11-10T20:51:00.006+01:00</published><updated>2008-11-13T10:16:38.624+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='collection'/><category scheme='http://www.blogger.com/atom/ns#' term='jcl'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: Converting Java collections into their Scala counterparts</title><content type='html'>&lt;p&gt; In the &lt;code&gt;scala.collection.jcl&lt;/code&gt; library, you'll find &lt;a href="http://scala-lang.org/"&gt;Scala&lt;/a&gt; wrappers, adding Scala methods to Java collections.  
This means that a Java collection (e.g., an &lt;code&gt;ArrayList&lt;/code&gt;) will be converted to work as a Scala collection, making it possible to call &lt;code&gt;foreach&lt;/code&gt; on an &lt;code&gt;ArrayList&lt;/code&gt;, etc.:&lt;br /&gt;&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;import scala.collection.jcl.Conversions._&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val a = new java.util.ArrayList[String]&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;a.add("Asa")&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;a.add("nisi")&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;a.add("masa")&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;// foreach now works on a Java List:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;a.foreach(println)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt; Similarly, you can now call &lt;code&gt;.mkString&lt;/code&gt; on a Java list: &lt;pre&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;// Let's use mkString to print the&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;// ArrayList contents as a Prolog spell/3 fact:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;println(a.mkString("spell('", "', '", "')."))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 51, 0);"&gt;// -&gt; spell('Asa', 'nisi', 'masa').&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;  &lt;p&gt; See this Scala mailing list &lt;a href="http://www.nabble.com/Using-scala-constructs-on-Java-collections-td20112193.html#a20112193"&gt;thread&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1296398393455140267?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://nikolajlindberg.blogspot.com/feeds/1296398393455140267/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1296398393455140267' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1296398393455140267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1296398393455140267'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/11/scala-converting-java-collections-into.html' title='Scala: Converting Java collections into their Scala counterparts'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-919463094281217552</id><published>2008-11-10T20:37:00.014+01:00</published><updated>2010-01-27T16:35:59.679+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='companion object'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: You cannot run a companion object as a stand-alone program</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; In Scala 2.8, the below is no longer true. A companion object can now work as the entry point of an application.&lt;br /&gt;&lt;br /&gt;===============================&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://scala-lang.org/"&gt;Scala&lt;/a&gt; programming language, a companion object is an object with the same name as a class in the same source file. 
(Scala's companion objects can be used much like Java's static methods.)  &lt;p&gt; An object definition on its own can function as the entry point for running a Scala program. Compiling and running this object works fine:  &lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;object heyYouTheRocksteadyCrew{&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;    def main(args :Array[String]) {&lt;br /&gt;     println("Make a break!")&lt;br /&gt; }&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;  &lt;p&gt; However, if you try to run the same object when it is the companion object of a class with the same name, this will result in an exception:  &lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;pre style="color: rgb(153, 51, 0);"&gt;class heyYouTheRocksteadyCrew{}&lt;br /&gt;&lt;br /&gt;object heyYouTheRocksteadyCrew{&lt;br /&gt; def main(args :Array[String]) {&lt;br /&gt;     println("Make a move!")&lt;br /&gt; }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;  &lt;p style="color: rgb(153, 51, 0);"&gt; &lt;/p&gt;&lt;pre&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;java.lang.NoSuchMethodException:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;heyYouTheRocksteadyCrew.main([Ljava.lang.String;)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;   &lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt; The above is true of the current release, 2.7.2.final. 
(Until this is fixed,  &lt;a href="http://www.youtube.com/watch?v=h6DJrZEUjjo"&gt;these guys&lt;/a&gt; will not be too happy about any stupid&lt;span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;&lt;span style="font-family:monospace;"&gt; &lt;/span&gt;&lt;span style="color: rgb(51, 0, 51);"&gt;heyYouTheRocksteadyCrew&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;-exception...!)&lt;br /&gt;&lt;/p&gt;&lt;p&gt;There is at least &lt;a href="http://www.nabble.com/main-method-in-file-with-class-and-object--td14907952.html#a14907952"&gt;one thread&lt;/a&gt; about the above on the Scala mailing list.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-919463094281217552?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/919463094281217552/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=919463094281217552' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/919463094281217552'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/919463094281217552'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/11/scala-you-cannot-run-companion-object.html' title='Scala: You cannot run a companion object as a stand-alone program'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2471136547591573519</id><published>2008-10-27T14:30:00.019+01:00</published><updated>2008-11-20T14:30:15.918+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firebird'/><category scheme='http://www.blogger.com/atom/ns#' term='flamerobin'/><title type='text'>New version of FlameRobin, Admin GUI for Firebirdsql</title><content type='html'>&lt;a href="http://flamerobin.org/"&gt;FlameRobin&lt;/a&gt; 0.9.0 is released&lt;br /&gt;&lt;br /&gt;There is a new version of FlameRobin, a GUI for creating and managing &lt;a href="http://firebirdsql.org/"&gt;Firebird&lt;/a&gt; databases. This version introduces tabbed browsing. In earlier versions of FlameRobin, you would end up with an unmanageably large number of open windows after a short while. This problem is now largely gone. (However, not everyone found this to be a problem; see one of the comments below.)&lt;br /&gt;&lt;br /&gt;I've mostly used FlameRobin for inspecting existing databases, for some minor editing of (existing) stored procedures, and for querying the database. The overall judgment is that it is a fine and very useful piece of software.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; There is no pre-compiled version for 64-bit Linux, so if this is what you need, you're on your own... (we gave up on compiling from source after 15 minutes of library dependency hell). This was on an Ubuntu machine.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update of the update:&lt;/span&gt; Apparently, there &lt;span style="font-style: italic;"&gt;was&lt;/span&gt; already a pre-compiled 64-bit Ubuntu version available! 
To find the information about the Ubuntu repository, you should not go to the download page that the announcement of the new version points to,  you should go &lt;a href="http://flamerobin.blogspot.com/2008/10/flamerobin-090-ready-to-install-on.html"&gt;here&lt;/a&gt;. Thanks  mariuz, for pointing this out (see comment below).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2471136547591573519?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2471136547591573519/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2471136547591573519' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2471136547591573519'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2471136547591573519'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/10/new-version-of-flamerobin-admin-gui-for.html' title='New version of FlameRobin, Admin GUI for Firebirdsql'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4924476294884881381</id><published>2008-09-05T18:31:00.015+02:00</published><updated>2010-01-27T16:36:38.575+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='richstring'/><category scheme='http://www.blogger.com/atom/ns#' 
term='equality'/><category scheme='http://www.blogger.com/atom/ns#' term='string'/><category scheme='http://www.blogger.com/atom/ns#' term='equals'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala: String vs RichString oddities</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; In Scala 2.8, the below is no longer true. &lt;code&gt;String.reverse&lt;/code&gt; now returns a &lt;code&gt;String&lt;/code&gt; rather than a &lt;code&gt;RichString&lt;/code&gt;:&lt;pre&gt;&lt;br /&gt;scala&gt; "a".reverse == "a"&lt;br /&gt;res0: Boolean = true&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;=========================================&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt; programming language, there is a class called &lt;code&gt;RichString&lt;/code&gt;, that adds features to the underlying Java &lt;code&gt;String&lt;/code&gt;. In the current version of Scala (2.7.2.final), this leads to some odd behaviour:&lt;br /&gt;&lt;pre style="color: rgb(102, 0, 0);"&gt;"Im a string" == "Im a string".reverse.reverse&lt;/pre&gt; returns &lt;code&gt;false&lt;/code&gt;, while&lt;pre style="color: rgb(102, 0, 0);"&gt;"Im a string" == "Im a string".reverse.reverse.toString&lt;/pre&gt;returns &lt;code&gt;true&lt;/code&gt;!&lt;br /&gt;&lt;br /&gt;Just to make your head spin, the following code does indeed work as expected:&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val str :String = "Im a string".reverse.reverse&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;println(str == "Im a string")&lt;/span&gt; &lt;span style="color: rgb(0, 102, 0);"&gt;// prints "true"&lt;/span&gt;&lt;/pre&gt;while&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val str = "Im a string".reverse.reverse&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;println(str == "Im a string")&lt;/span&gt; &lt;span style="color: rgb(0, 102, 
0);"&gt;// prints "false"&lt;/span&gt;&lt;/pre&gt; does not.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The explanation is that &lt;code&gt;String.reverse&lt;/code&gt; returns a &lt;code&gt;RichString&lt;/code&gt;, and that &lt;code&gt;==&lt;/code&gt; returns &lt;code&gt;false&lt;/code&gt; when comparing a &lt;code&gt;String&lt;/code&gt; and a &lt;code&gt;RichString&lt;/code&gt;, even though it is the "same" string (as in the example above).&lt;br /&gt;&lt;br /&gt;If I understand it correctly, this oddity will be fixed in future releases of Scala.&lt;br /&gt;&lt;br /&gt;(And no, Scala's &lt;code&gt;==&lt;/code&gt; is not the same as Java's ditto. It means "equal objects" rather than "refers to the same instance of an object".)&lt;br /&gt;&lt;br /&gt;Scala mailing list item &lt;a href="http://www.nabble.com/Problem-with-RichString-%28error-%29-td16850336.html"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4924476294884881381?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4924476294884881381/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4924476294884881381' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4924476294884881381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4924476294884881381'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/09/scala-string-vs-richstring-oddities.html' title='Scala: String vs RichString oddities'/><author><name>Nikolaj 
Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2178309326385291573</id><published>2008-09-05T16:17:00.009+02:00</published><updated>2008-09-11T15:06:26.614+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern'/><category scheme='http://www.blogger.com/atom/ns#' term='case insensitive'/><category scheme='http://www.blogger.com/atom/ns#' term='patternmatching'/><title type='text'>Case insensitive pattern matching of Unicode strings in Java</title><content type='html'>To do case-insensitive pattern matching of Unicode strings in Java, you can pass flags as the second argument to &lt;code&gt;Pattern.compile&lt;/code&gt;, like this: &lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Pattern p = &lt;br /&gt;Pattern.compile(patternString, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;(This is useful when dealing with non-ASCII/non-Latin1 text, such as Cyrillic. 
However, it may not work flawlessly for the Turkish Unicode characters.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; I just learned that there is a nicer way of doing this: start the patternString above with "(?iu)":&lt;pre&gt;Pattern p = &lt;br /&gt;Pattern.compile("(?iu)"+ patternString);&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2178309326385291573?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2178309326385291573/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2178309326385291573' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2178309326385291573'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2178309326385291573'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/09/case-insensitive-pattern-matching-of.html' title='Case insensitive pattern matching of Unicode strings in Java'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2007531200636632187</id><published>2008-08-30T00:38:00.042+02:00</published><updated>2008-09-02T16:43:30.595+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='implicip conversion'/><category 
scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>Scala and implicit conversion: Turning a string into pure Weirdness</title><content type='html'>In the &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt; programming language, you can turn water into wine, or vice versa, using implicit conversion.&lt;br /&gt;&lt;br /&gt;Imagine that you have a class called &lt;code style="color: rgb(0, 0, 153);"&gt;Weird&lt;/code&gt;:&lt;br /&gt;&lt;pre style="color: rgb(0, 0, 153);"&gt;&lt;br /&gt;class Weird(s :String) {&lt;br /&gt;  def imWeird :String = {&lt;br /&gt;    "I'm "+ s +" and I'm weird!"&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;It consists of merely a string, &lt;code style="color: rgb(0, 0, 153);"&gt;s&lt;/code&gt;, and a method, &lt;code style="color: rgb(0, 0, 153);"&gt;imWeird&lt;/code&gt;, that returns a jolly message containing the very same string. (Thus, the code&lt;pre&gt;&lt;span style="color: rgb(0, 0, 153);"&gt;val freak = new Weird("a freak")&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 153);"&gt;println(freak.imWeird)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt; outputs &lt;code style="color: rgb(153, 0, 0);"&gt;I'm a freak and I'm weird!&lt;/code&gt;.)&lt;br /&gt;&lt;br /&gt;Now, Scala allows you to create an implicit conversion that adds the method(s) of &lt;code style="color: rgb(0, 0, 153);"&gt;Weird&lt;/code&gt; to any other class. Or rather, turns  an object into a &lt;code style="color: rgb(0, 0, 153);"&gt;Weird&lt;/code&gt; whenever one calls &lt;code style="color: rgb(0, 0, 153);"&gt;Weird&lt;/code&gt;'s methods (functions) on the given object.&lt;br /&gt;&lt;br /&gt;For example, the following implicit conversion&lt;pre style="color: rgb(0, 0, 153);"&gt;  implicit def string2Weird(s: String) = new Weird(s)&lt;/pre&gt;makes it possible to call &lt;code style="color: rgb(0, 0, 153);"&gt;Weird&lt;/code&gt;'s method(s) on a &lt;code style="color: rgb(0, 0, 153);"&gt;String&lt;/code&gt;. 
This code&lt;br /&gt;&lt;pre style="color: rgb(0, 0, 153);"&gt;val happy = "Happy"&lt;br /&gt;println(happy.imWeird)&lt;/pre&gt;will now output &lt;pre style="color: rgb(153, 0, 0);"&gt;  I'm Happy and I'm weird!&lt;/pre&gt;&lt;br /&gt;The name of the implicit conversion method, &lt;code style="color: rgb(0, 0, 153);"&gt;string2Weird&lt;/code&gt;, is arbitrary.&lt;br /&gt;&lt;br /&gt;&lt;!--When the going gets weird, the weird go computing.--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2007531200636632187?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2007531200636632187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2007531200636632187' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2007531200636632187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2007531200636632187'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/08/scala-and-implicit-conversion-turning.html' title='Scala and implicit conversion: Turning a string into pure Weirdness'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6500619881708404356</id><published>2008-07-18T17:16:00.019+02:00</published><updated>2008-07-19T14:34:02.812+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tellstick'/><category scheme='http://www.blogger.com/atom/ns#' term='wireless'/><category scheme='http://www.blogger.com/atom/ns#' term='usb'/><title type='text'>Turning the lights on (and off)  from your computer</title><content type='html'>I just bought a Telldus Technologies &lt;a href="http://telldus.se/"&gt;TellStick&lt;/a&gt;. It is a wireless USB-device that can be used for, e.g., turning on and off the light (if it is equipped with a suitable receiver).&lt;br /&gt;&lt;br /&gt;The TellStick software is free, and it works under Linux. It even comes with a Java API, and a NetBeans project with some sample code. 
However, to get the Java binding to work, you have to install some strange libraries (&lt;a href="http://www.rxtx.org/"&gt;rxtx&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;You can control any number of receivers using your TellStick.&lt;br /&gt;&lt;br /&gt;Next time a program throws an exception, it will be able to turn on (or off) a lamp somewhere in the office...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6500619881708404356?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6500619881708404356/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6500619881708404356' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6500619881708404356'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6500619881708404356'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/07/turning-on-and-off-lights-from-your.html' title='Turning the lights on (and off)  from your computer'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-864297585439938449</id><published>2008-05-21T20:35:00.012+02:00</published><updated>2008-07-19T14:29:05.515+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Svenskt associationslexikon'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Swedish lexicon'/><category scheme='http://www.blogger.com/atom/ns#' term='SALDO 1.0'/><title type='text'>Saldo 1.0: Large, freely available Swedish morphologic and semantic lexicon</title><content type='html'>&lt;a href="http://spraakbanken.gu.se/"&gt;Språkbanken&lt;/a&gt; has published a large, freely available Swedish lexicon, &lt;a href="http://spraakbanken.gu.se/sal/saldo_dist_v1.0.tgz"&gt;Saldo&lt;/a&gt;, "a Swedish basic language resource". The release appears to include some 68,000 uninflected lemma forms as well as more than 740,000 expanded (full) word forms. There is morphologic and semantic information.&lt;br /&gt;&lt;br /&gt;This resource should be valuable for part-of-speech tagging, lemmatizers, spell-checking, (semantic) analysis of Swedish text, etc, etc.&lt;br /&gt;&lt;br /&gt;The release includes software for interfacing with the lexicon. (Those of you into functional programming might be interested in the fact that the lexicon software is written in Haskell.)&lt;br /&gt;&lt;br /&gt;It is released under the LGPL license.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-864297585439938449?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/864297585439938449/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=864297585439938449' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/864297585439938449'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/864297585439938449'/><link rel='alternate' type='text/html' 
href='http://nikolajlindberg.blogspot.com/2008/05/saldo-10-large-freely-available-swedish.html' title='Saldo 1.0: Large, freely available Swedish morphologic and semantic lexicon'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1977657510247690855</id><published>2008-05-16T11:26:00.023+02:00</published><updated>2008-08-29T12:44:32.722+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='upcasing'/><category scheme='http://www.blogger.com/atom/ns#' term='fromfile'/><category scheme='http://www.blogger.com/atom/ns#' term='touppercase'/><category scheme='http://www.blogger.com/atom/ns#' term='script'/><category scheme='http://www.blogger.com/atom/ns#' term='savecompiled'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><category scheme='http://www.blogger.com/atom/ns#' term='getlines'/><title type='text'>Scala one-liner for upcasing lines of text</title><content type='html'>The following is a &lt;a href="http://scala-lang.org/"&gt;Scala&lt;/a&gt; script that up-cases each line of an UTF8 encoded input file (&lt;code&gt;args(0)&lt;/code&gt;) and prints the result to standard output:&lt;br /&gt;&lt;pre&gt;import scala.io.Source&lt;br /&gt;&lt;br /&gt;Console.setOut(new java.io.PrintStream(Console.out,true,"UTF8"))&lt;br /&gt;&lt;br /&gt;Source.fromFile(args(0), "UTF8").getLines.foreach(line =&amp;gt; print(line.toUpperCase))&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If you're trusting the default character encoding to work for you, you may reduce it to:&lt;pre&gt;&lt;br /&gt;import scala.io.Source&lt;br /&gt;&lt;br 
/&gt;Source.fromFile(args(0)).getLines.foreach(line =&gt; print(line.toUpperCase))&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Another way to do it is to read the lines into an iterator, using the iterator's &lt;code&gt;.map&lt;/code&gt; method to upcase each line:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import scala.io.Source&lt;br /&gt;&lt;br /&gt;val lines = Source.fromFile(args(0)).getLines.map(_.toUpperCase)&lt;br /&gt;&lt;br /&gt;lines.foreach(print)&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;A Java programmer may be relieved (or horrified) to learn that Scala does not have any checked exceptions. There are only runtime exceptions, and you don't need to add any try/catch statements if you don't want to.&lt;br /&gt;&lt;br /&gt;When you run a Scala script, you can instruct the Scala interpreter to compile the script, and use the compiled version (a jar file) if it's newer than the source file. This gives better performance (shorter start-up, etc). You use the &lt;code&gt;-savecompiled&lt;/code&gt; command line argument.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1977657510247690855?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1977657510247690855/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1977657510247690855' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1977657510247690855'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1977657510247690855'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/05/scala-one-liner-for-upcasing-lines-of.html' title='Scala 
one-liner for upcasing lines of text'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1565942973413551447</id><published>2008-05-10T13:29:00.040+02:00</published><updated>2008-12-11T10:29:25.788+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>The Scala programming language and XML</title><content type='html'>The &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt; programming language is a combined scripting and "proper" language that sits on top of the Java VM. You can either run scripts similar to how you run a Ruby or Perl script, or compile your Scala classes to Java bytecode. You run a Scala application similar to how you run a Java application. You can also run a Scala application using the Java VM (but you have to add the Scala library jar file to your class path). You can mix Java and Scala programs, calling Scala objects from Java, and vice versa.&lt;br /&gt;&lt;br /&gt;Scala has a feature that I have never seen in a language suitable for general programming: XML (processing) as a feature of the language. The people behind Scala have added XML to the syntax of the language itself. You do not have to load some library or use some special API for processing XML, since it's already part of the language.&lt;br /&gt;&lt;br /&gt;It is not only that XML is valid in Scala code, but XML has its own built-in data types. 
For instance, &lt;pre style="color: rgb(102, 0, 0);"&gt;val xml = &amp;lt;vegetable&amp;gt;potato&amp;lt;/vegetable&amp;gt;&lt;/pre&gt; is a valid Scala statement. In other words, XML elements written in a Scala program are not merely strings. The &lt;code&gt;xml&lt;/code&gt; object can now be manipulated in various ways, much like a &lt;code&gt;DOM&lt;/code&gt; object in Java (but with less hassle than in Java).&lt;br /&gt;&lt;br /&gt;You can refer to variables in your XML:&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val veg = "potato"&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val col = "white"&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;val xml = &amp;lt;vegetable colour={col}&amp;gt;{veg}&amp;lt;/vegetable&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;// The value of xml now corresponds to&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;// &amp;lt;vegetable colour="white"&amp;gt;potato&amp;lt;/vegetable&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;You can also embed function/method calls into XML elements. 
Imagine that you have a method that returns a sequence of &lt;code&gt;n&lt;/code&gt; XML elements, like this (you'll need to import &lt;code&gt;scala.xml.NodeSeq&lt;/code&gt; and &lt;code&gt;scala.xml.NodeBuffer&lt;/code&gt;):&lt;br /&gt;&lt;br /&gt;&lt;pre style="color: rgb(102, 0, 0);"&gt;def genNumElems(n :Int) :NodeSeq = {&lt;br /&gt;val result = new NodeBuffer&lt;br /&gt;for(i &amp;lt;- 1 to n) {&lt;br /&gt;    result &amp;amp;+ (&amp;lt;number value={i.toString}/&amp;gt;)&lt;br /&gt;  }&lt;br /&gt;  result&lt;br /&gt;}&lt;/pre&gt;(The odd-looking &lt;code&gt;&amp;amp;+&lt;/code&gt; operator means "add".)&lt;br /&gt;&lt;br /&gt;You can now embed a call to &lt;code&gt;genNumElems&lt;/code&gt; in an XML element, e.g., like this:&lt;br /&gt;&lt;pre style="color: rgb(102, 0, 0);"&gt;val numList = &amp;lt;number_list&amp;gt;{genNumElems(4)}&amp;lt;/number_list&amp;gt;&lt;/pre&gt;&lt;br /&gt;Printing &lt;code&gt;numList&lt;/code&gt; produces:&lt;pre style="color: rgb(102, 0, 0);"&gt;&amp;lt;number_list&amp;gt;&lt;br /&gt;  &amp;lt;number value="1"&amp;gt;&amp;lt;/number&amp;gt;&lt;br /&gt;  &amp;lt;number value="2"&amp;gt;&amp;lt;/number&amp;gt;&lt;br /&gt;  &amp;lt;number value="3"&amp;gt;&amp;lt;/number&amp;gt;&lt;br /&gt;  &amp;lt;number value="4"&amp;gt;&amp;lt;/number&amp;gt;&lt;br /&gt;&amp;lt;/number_list&amp;gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If you want nicer output, you can use a &lt;code&gt;PrettyPrinter&lt;/code&gt; (that you import from &lt;span style="font-family:times new roman;"&gt;scala.xml._&lt;/span&gt;):&lt;br /&gt;&lt;pre style="color: rgb(102, 0, 0);"&gt;val pp = new PrettyPrinter(100,2) &lt;span style="color: rgb(0, 102, 0);"&gt;// width and indentation&lt;/span&gt;&lt;br /&gt;println(pp.format(xml))&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Reading and writing XML data/files to and from Scala is easy. 
The following is a one-liner that reads an XML file given as a command line argument (&lt;code&gt;args(0)&lt;/code&gt;) and returns a list of all elements named "tr" that are child elements of any element called "table" in the XML file:&lt;br /&gt;&lt;pre style="color: rgb(102, 0, 0);"&gt;val trNodes = scala.xml.XML.loadFile(args(0)) \\ "table" \ "tr"&lt;/pre&gt;You may print the &lt;code&gt;&amp;lt;tr&amp;gt;&lt;/code&gt; elements (with an empty line between each element) thus:&lt;pre style="color: rgb(102, 0, 0);"&gt;trNodes.foreach(tr =&amp;gt; println(tr + "\n"))&lt;/pre&gt;&lt;br /&gt;The built-in XML support in Scala's syntax and basic libraries is not the most important or interesting feature of Scala, but it sure seems to be very useful.&lt;br /&gt;&lt;br /&gt;(Incidentally, the &lt;code&gt;table&lt;/code&gt; and &lt;code&gt;tr&lt;/code&gt; elements above are present in Oocalc's (OpenOffice.org) XML format for spreadsheets.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; It appears that it is not always good advice to use &lt;code&gt;scala.xml.XML.loadFile&lt;/code&gt; to read an XML document. One reason is that comment elements are lost. For more advanced XML processing, one should turn to &lt;code&gt;scala.xml.parsing.ConstructingParser.fromFile&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; You may run into trouble when processing larger XML documents using the second approach. 
See &lt;a href="http://nikolajlindberg.blogspot.com/2008/12/scala-problems-using-xml-api.html"&gt;this comment&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1565942973413551447?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1565942973413551447/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1565942973413551447' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1565942973413551447'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1565942973413551447'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/05/scala-programming-language-and-xml.html' title='The Scala programming language and XML'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8599112601776999944</id><published>2008-04-27T08:38:00.024+02:00</published><updated>2008-05-24T17:55:39.921+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='debian'/><category scheme='http://www.blogger.com/atom/ns#' term='debian 4.0'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu 8.04'/><category scheme='http://www.blogger.com/atom/ns#' term='iceweasel'/><category 
scheme='http://www.blogger.com/atom/ns#' term='epiphany'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>Bye bye, Ubuntu, Hello Debian</title><content type='html'>Sadly, the new version of Ubuntu, 8.04, didn't accept my laptop (a few-years-old Acer TravelMate 290, without any strange hardware). I couldn't find any information on how to resolve &lt;a href="http://nikolajlindberg.blogspot.com/2008/04/ubuntu-804-revitalised-my-laptop.html"&gt;the problems&lt;/a&gt; I ran into, so I had to ditch Ubuntu, and replace it with Debian 4.0.&lt;br /&gt;&lt;br /&gt;This was a pity, since the new Ubuntu looked quite promising. The install is incredibly easy and rather quick. Apart from the desktop background image, the new system looks and feels good. They appear to have made good choices when it comes to the pre-installed software. But this doesn't help when Ubuntu fails to shut down the computer properly.&lt;br /&gt;&lt;br /&gt;Installing Debian is not as straightforward, but still not very hard. It took a little longer, mostly because I used the net installer that grabs the software packages from the internet and not from the installation CD-ROM.&lt;br /&gt;&lt;br /&gt;However, compared to Ubuntu, it takes some more fixing after the installation to get a system that you are comfortable with. For instance, the Debian people appear to think that you should prefer a web browser called Epiphany to Firefox... They don't even offer you the standard Firefox browser, but their own version, "Iceweasel". 
(There seems to be &lt;a href="http://en.wikipedia.org/wiki/Iceweasel"&gt;a totally silly reason&lt;/a&gt; as to why Firefox is not called Firefox.)&lt;br /&gt;&lt;br /&gt;Worse, the default fonts did not look good on my laptop, so I had to install new fonts (by running &lt;code&gt;apt-get install msttcorefonts&lt;/code&gt;, I think?).&lt;br /&gt;&lt;br /&gt;A bit surprisingly, Debian supports playing mp3 files without installing additional libraries. &lt;br /&gt;&lt;br /&gt;After a bit of tweaking, Debian feels nice. Still, I would prefer a working version of Ubuntu. It would be interesting to know what went wrong in the relationship between Ubuntu 8.04 and my laptop.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8599112601776999944?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8599112601776999944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8599112601776999944' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8599112601776999944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8599112601776999944'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/04/bye-bye-ubuntu.html' title='Bye bye, Ubuntu, Hello Debian'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6159704453449624707</id><published>2008-04-25T18:27:00.022+02:00</published><updated>2008-05-02T12:41:49.559+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='laptop'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu 8.04'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>Ubuntu 8.04 revitalised my laptop, but I'm still not happy...</title><content type='html'>Ah, I just did a clean install of Ubuntu 8.04 (aka "Gniffly Gnaffly") on my Acer laptop. The net update failed, so I had to burn an install CD. I didn't mind, however, since &lt;a href="http://nikolajlindberg.blogspot.com/2007/11/ubuntu-710-sluggified-my-laptop.html"&gt;last time&lt;/a&gt; I did an upgrade of the laptop, something strange happened, and it became incredibly slow. &lt;br /&gt;&lt;br /&gt;With Ubuntu 8.04 installed, the laptop is back on track. It starts fast, and everything (that I care about) seems to work. &lt;br /&gt;&lt;br /&gt;The first things to do after install, are to change the desktop background image (the default depicts an oil-drenched dead bird?), turn off the system sounds including the beep and turn off all visual effects.&lt;br /&gt;&lt;!--&lt;br /&gt;(My biggest problem now is that &lt;em&gt;someone&lt;/em&gt; recently spilled a gin-drink into my keyboard, making the keys slower (and stickier) than they used to be. When pressing the "g" key, the "t" key follows (the two keys are stuck to each other)... Quit handy when you want to press both the "g" and "t" keys at the same time, but most of the time I guess you don't. 
I also guess that I cannot blame this on Ubuntu.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update 1:&lt;/span&gt; It wasn't a gin drink, it was a refreshing mixture of Apfelkorn and Indian Tonic Water.&lt;br /&gt;--&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; No! Ubuntu still doesn't behave well. It turns out, that sometimes when I turn off the laptop, it isn't turned off correctly! Ubuntu goes down, the screen gets black, but still, the laptop is not properly turned off (both the indicator that the computer is on, and the indicator that the hard drive is working keep glowing...). Gah.&lt;br /&gt;&lt;br /&gt;Furthermore, I've noticed some instances of ill-boding flickering of the screen.&lt;br /&gt;&lt;br /&gt;Maybe it is &lt;a href="http://nikolajlindberg.blogspot.com/2008/04/bye-bye-ubuntu.html"&gt;time to go back to Debian&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6159704453449624707?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6159704453449624707/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6159704453449624707' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6159704453449624707'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6159704453449624707'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/04/ubuntu-804-revitalised-my-laptop.html' title='Ubuntu 8.04 revitalised my laptop, but I&apos;m still not happy...'/><author><name>Nikolaj 
Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8695023473517643316</id><published>2008-04-22T15:08:00.012+02:00</published><updated>2008-04-30T20:28:24.310+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='split'/><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='string.split'/><category scheme='http://www.blogger.com/atom/ns#' term='regular expression'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern'/><category scheme='http://www.blogger.com/atom/ns#' term='empty fields'/><title type='text'>Keeping empty fields when splitting tab separated lines in Java</title><content type='html'>Frequently, I process text files containing tab separated data. Sometimes these have empty columns, i.e., two or more tabs without any data between them. More often than not, I want to keep the empty fields. 
However, Java's &lt;code&gt;String.split&lt;/code&gt; defaults to removing empty fields.&lt;br /&gt;&lt;br /&gt;This is what you do to keep the empty fields: &lt;br /&gt;&lt;pre&gt;String[] fields = string.split("\t", -1);&lt;/pre&gt;&lt;br /&gt;In the following example, the test string &lt;code&gt;tst&lt;/code&gt; will be split into zero parts (&lt;code&gt;result1&lt;/code&gt;) and four parts (&lt;code&gt;result2&lt;/code&gt;) respectively: &lt;br /&gt;&lt;pre&gt;String tst = "\t\t\t";&lt;br /&gt;String[] result1 = tst.split("\t");       //result1.length == 0&lt;br /&gt;String[] result2 = tst.split("\t", -1);   //result2.length == 4&lt;/pre&gt;&lt;code&gt;result2&lt;/code&gt; will contain four instances of the empty string (&lt;code&gt;""&lt;/code&gt;).&lt;br /&gt;&lt;br /&gt;The same thing goes when you split a string using a pre-compiled regular expression:&lt;br /&gt;&lt;pre&gt;Pattern pattern = Pattern.compile("\t");&lt;br /&gt;String[] result3 = pattern.split(tst);     //result3.length == 0&lt;br /&gt;String[] result4 = pattern.split(tst, -1); //result4.length == 4&lt;/pre&gt;&lt;br /&gt;By the way, I compared the performance of the two variants above (String's split and a pre-compiled pattern matching a tab). Luckily, the difference in performance was negligible, the compiled pattern winning by a small margin. When the split pattern is more complicated, I would expect bigger performance differences between compiled and uncompiled regular expressions. (Running Sun's &lt;code&gt;java&lt;/code&gt; command with and without the &lt;code&gt;server&lt;/code&gt; argument made a big difference, however. 
The default &lt;code&gt;client&lt;/code&gt; was significantly slower.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8695023473517643316?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8695023473517643316/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8695023473517643316' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8695023473517643316'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8695023473517643316'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/04/keeping-empty-fields-when-splitting-tab.html' title='Keeping empty fields when splitting tab separated lines in Java'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4670400336656475590</id><published>2008-04-08T14:38:00.031+02:00</published><updated>2008-05-21T21:11:54.209+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='scott rosenberg'/><category scheme='http://www.blogger.com/atom/ns#' term='dreaming in code'/><category scheme='http://www.blogger.com/atom/ns#' term='review'/><category scheme='http://www.blogger.com/atom/ns#' term='book'/><title type='text'>Book: Dreaming in Code by Scott Rosenberg</title><content type='html'>Title: 
Dreaming in Code - Two dozen programmers, three years, 4,732 bugs, and one quest for transcendent software&lt;br /&gt;Author: Scott Rosenberg&lt;br /&gt;ISBN: 978-1-4000-8247-6&lt;br /&gt;URL: &lt;a href="http://www.dreamingincode.com/"&gt;www.dreamingincode.com&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This book reports how a well-funded, ambitious software project failed. For several years, the author followed the work of a group of developers employed to produce a revolutionary piece of software. The non-profit organisation set up for the task did not have any customers, but was funded by an idealist with deep pockets. Many of the people involved appear to be quite experienced and knowledgeable. The project is called Chandler, and the open source organisation is called &lt;a href="http://www.osafoundation.org/"&gt;OSAF&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The theme of the book is that producing software is complicated and that it is hard to predict whether a project will succeed or fail, and that there is no obvious solution to this problem. The book is aimed at non-programmers. The first 80 or so pages give a background on software development intertwined with the story of the software project. It takes a while for the book to get going. If you are familiar with programming and different programming concepts, you might find some of these pages less interesting. However, the author manages to explain things such as object oriented programming, open source, scripting languages, the halting problem, etc, for a non-programmer and without simplifying too much.&lt;br /&gt;&lt;br /&gt;The author appears to be interested in his subject, and understands it well. He is also a good writer. The author does a good job of explaining and exemplifying how hard software development can be. (Still, I cannot keep from thinking, that with a tenth of the funding of the project he describes, &lt;em&gt;my&lt;/em&gt; company could do wonders...) 
But the text is too long, and the problems of the project he is describing are in an exaggerated way generalised into problems of all software development.&lt;br /&gt;&lt;br /&gt;This may have been a hard book to write, since I suspect that at the outset, the author figured that he would be describing what should turn out to be a successful project. Instead he had to describe and explain a failure. This is probably why the book to a large extent discusses different software failures. The author has done quite a lot of research, and describes different methodologies for software development meant to reduce the risk and to ensure the quality of software development.&lt;br /&gt;&lt;br /&gt;On the whole the book is interesting, but rather pessimistic concerning the state of software development. While perhaps not an inspiring text, you should be able to pick up a few things to stay clear of, though.&lt;br /&gt;&lt;br /&gt;With the help of some more editing, maybe the book could have been a little shorter (and better).&lt;br /&gt;&lt;br /&gt;A side note: The book discusses the problem of producing reusable code. This is perhaps not the most central theme of the book, but every time I hear about the failure of software producers to create reusable code, I cannot help but reflect that almost every day, I reuse code. If you program in Java, for instance, there is a huge set of reusable libraries for almost everything: XML processing, GUI building, cryptography, email, sound, sorting, hash tables, databases... the list of components you do not have to implement yourself, but can use as building blocks for new applications, goes on and on. You can even find software for automatically producing code (some machine learning approaches, for instance). There are programming languages and environments that sit on top of other such software: programming languages reusing other programming languages...! 
(See for example the Scala or JRuby languages.)&lt;br /&gt;From this perspective, there is a breathtaking amount of (freely) available, high quality software about. &lt;br /&gt;&lt;!--&lt;br /&gt;To continue this less pessimistic view of software development: even if 2/3 of all software projects fail in some respect, the remaining 1/3 successful projects may be enough. The book compares coding to writing. How many books can be be considered as successful projects? I would guess less than 1/3 of the books published, and considerably less counting all book projects initiated but never published.&lt;br /&gt;--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4670400336656475590?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4670400336656475590/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4670400336656475590' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4670400336656475590'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4670400336656475590'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/04/dreaming-in-code-by-scott-rosenberg.html' title='Book: Dreaming in Code by Scott Rosenberg'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-330677903712961164</id><published>2008-04-05T13:29:00.008+02:00</published><updated>2008-11-20T16:17:33.609+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='simple backup'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><category scheme='http://www.blogger.com/atom/ns#' term='sbackup'/><title type='text'>Automatic Simple backups: SBackup</title><content type='html'>A colleague told me about a simple backup utility called... Simple Backup (or SBackup). With the help of Simple Backup, you can very easily do hard-drive backups (and restore the backups if needed). If you're using Ubuntu, you will find it with the help of &lt;code&gt;apt-get&lt;/code&gt;, &lt;code&gt;Synaptic&lt;/code&gt; or under &lt;code&gt;Applications&gt;Add/remove...&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;At a small office, SBackup may be suitable for doing daily, automatic PC-backups. You configure it to do incremental backups with a frequency of your own choice. You can tell SBackup to put the backups on a remote server through &lt;code&gt;ssh&lt;/code&gt;. Notice that the connection settings are in clear text, i.e., your password for &lt;code&gt;ssh&lt;/code&gt;-ing will be readable for anyone with access to your computer! (Thus, you should be a bit careful with how you use SBackup.)&lt;br /&gt;&lt;br /&gt;A nice feature is that the backups are in &lt;code&gt;tar.gz&lt;/code&gt; format, i.e., you can use standard tools to read the backed up files. You can also tell SBackup what local directories to include or exclude from the backups. Once configured, the only thing you may need to care about is that you have enough disk space on the machine that holds the backups.  If you do incremental backups of a number of computers, the backups may grow quite large. 
(However, they will not grow infinitely large, since SBackup can take care of removing old/redundant backups.)&lt;br /&gt;&lt;br /&gt;More info &lt;a href="http://sourceforge.net/projects/sbackup/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt;  Due to a reboot of the target server, SBackup silently stopped doing its backups. It had to do with obsolete ssh keys, most likely. On the Ubuntu clients, the problem can be fixed by removing the &lt;code&gt;known_hosts&lt;/code&gt; file from the root home directory &lt;pre&gt;sudo rm /root/.ssh/known_hosts&lt;/pre&gt;or by removing the same file from the user's home directory&lt;pre&gt;rm ${HOME}/.ssh/known_hosts&lt;/pre&gt;(It appears to work differently on different versions of Ubuntu.)&lt;br /&gt;&lt;br /&gt;After this, we started the SBackup configuration GUI and tested the destination (you need to answer a question before it works again).&lt;br /&gt;&lt;br /&gt;The same goes for moving the backup destination to a different server. SBackup silently stops working. 
You have to delete the &lt;span style="font-family:times new roman;"&gt;known_hosts&lt;/span&gt; file, as above.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-330677903712961164?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/330677903712961164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=330677903712961164' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/330677903712961164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/330677903712961164'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/04/automatic-simple-backups-sbackup.html' title='Automatic Simple backups: SBackup'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8623993034796215544</id><published>2008-03-28T11:16:00.011+01:00</published><updated>2008-12-13T10:10:23.847+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='webstart'/><category scheme='http://www.blogger.com/atom/ns#' term='64-bit'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>No Web Start for 64-bit Sun Java</title><content type='html'>Sun does not include &lt;a 
href="http://en.wikipedia.org/wiki/Java_Web_Start"&gt;Java Web Start&lt;/a&gt; in its 64-bit version of Java. It appears that Sun thinks that you are not supposed to run Web Start on 64-bit machines, since these mostly are servers (?), and... eh... sorry, I cannot follow their reasoning. Let's hope they change their minds.&lt;br /&gt;&lt;br /&gt;I haven't tried it myself, but &lt;a href="http://dmartin.org/weblog/running-java-web-start-apps-on-ubuntu-linux-for-amd-64"&gt;here&lt;/a&gt; is a description of how to run 32-bit Java Web Start on 64-bit Ubuntu.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; At the time of writing this, an AMD64 version of Java Web Start is at the top of Sun's &lt;a href="http://bugs.sun.com/bugdatabase/top25_rfes.do"&gt;Request for Enhancements&lt;/a&gt; list.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update&lt;/span&gt;: There will be support for 64-bit Java Web Start in an upcoming release, &lt;a href="http://bugs.sun.com/view_bug.do?bug_id=4802695"&gt;1.6.0_12&lt;/a&gt; (I think)&lt;a name="160_11"&gt;&lt;/a&gt;. Ismael Juma points out that an early access release is available. 
See his comment below.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8623993034796215544?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8623993034796215544/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8623993034796215544' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8623993034796215544'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8623993034796215544'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/no-web-start-for-64-bit-sun-java.html' title='No Web Start for 64-bit Sun Java'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1419085280787824913</id><published>2008-03-26T10:40:00.001+01:00</published><updated>2008-03-26T16:45:30.874+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bash'/><title type='text'>Frequency list bash function</title><content type='html'>In addition to command aliases (see an &lt;a href="http://nikolajlindberg.blogspot.com/2008/03/favourite-bash-command-line-aliases.html"&gt;earlier post&lt;/a&gt;), you can add your own functions to the bash shell. 
Here is a simple but useful command line sequence:&lt;br /&gt;&lt;pre&gt;# Count identical lines and list them, most frequent first.&lt;br /&gt;function freq() {&lt;br /&gt;    # "$@" keeps file names containing spaces intact&lt;br /&gt;    sort "$@" | uniq -c | sort -rn;&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Put it in &lt;code&gt;~/.bashrc&lt;/code&gt; and you will have a &lt;code&gt;freq&lt;/code&gt; command for creating frequency lists:&lt;pre&gt;freq &amp;lt;FILES&amp;gt;&lt;/pre&gt; will sort and count all identical lines of the input file(s), and present them in order of descending frequency. Useful in many situations, not least for checking that files that are supposed to contain only unique lines actually do so.&lt;br /&gt;&lt;br /&gt;(I'm not too sure about bash function syntax, but the function above seems to do its work.)&lt;br /&gt;&lt;br /&gt;If you're not familiar with the different commands of the pipeline above, there is plenty to read (e.g., &lt;a href="http://stts.se/index.php?lang_id=en_uk&amp;page=egrep"&gt;egrep for linguists&lt;/a&gt;).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1419085280787824913?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1419085280787824913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1419085280787824913' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1419085280787824913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1419085280787824913'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/frequency-list-bash-function.html' title='Frequency list &lt;code&gt;bash&lt;/code&gt; function'/><author><name>Nikolaj 
Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-348758838716614197</id><published>2008-03-25T14:04:00.006+01:00</published><updated>2009-01-08T14:12:50.633+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bash'/><category scheme='http://www.blogger.com/atom/ns#' term='alias'/><title type='text'>Favourite bash command line aliases</title><content type='html'>My favourite &lt;code&gt;bash&lt;/code&gt; aliases currently are&lt;br /&gt;&lt;pre&gt;alias hist='history|egrep'&lt;/pre&gt;&lt;br /&gt;and &lt;pre&gt;alias ös='ls'&lt;/pre&gt;&lt;br /&gt;The second one exists because 'ö' sits next to 'l' on my Swedish keyboard, and when I intend to type 'ls' I type 'ös' more often than not. The one I use the most, however, is &lt;code&gt;alias m='more'&lt;/code&gt; (I also have the classic &lt;code&gt;mroe='more'&lt;/code&gt; and &lt;code&gt;moer='more'&lt;/code&gt;  to catch some frequent typos).&lt;br /&gt;&lt;br /&gt;The first one, &lt;code&gt;hist&lt;/code&gt;, makes it possible to use regular expressions to search the history of earlier shell commands. 
This is useful when you cannot remember some tricky command line sequence, or are too lazy to type some long command that you know you issued the other day.&lt;br /&gt;&lt;br /&gt;For instance,&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;hist 'java|ruby'&lt;/pre&gt;&lt;br /&gt;will print any previous command (in bash's history) containing either of the two strings.&lt;br /&gt;&lt;br /&gt;(Well, I think you can accomplish the same thing using the original &lt;code&gt;history&lt;/code&gt; command, but to paraphrase Morrissey, &lt;span style="font-style: italic;"&gt;now my head is full&lt;/span&gt;, and my brain doesn't have room for more cryptic command line arguments.)&lt;br /&gt;&lt;br /&gt;You can put your bash aliases in &lt;code&gt;~/.bashrc&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;(Thanks to Chris for spotting a (now corrected) mistake in the first example. See the comment below.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Hey, check out the comment by Anonymous below: &lt;code&gt;Ctrl-r&lt;/code&gt; seems useful for searching the Bash history!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-348758838716614197?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/348758838716614197/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=348758838716614197' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/348758838716614197'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/348758838716614197'/><link rel='alternate' type='text/html' 
href='http://nikolajlindberg.blogspot.com/2008/03/favourite-bash-command-line-aliases.html' title='Favourite &lt;code&gt;bash&lt;/code&gt; command line aliases'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4489073536920500577</id><published>2008-03-20T15:27:00.007+01:00</published><updated>2008-03-28T11:46:45.386+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='toLowerCase'/><category scheme='http://www.blogger.com/atom/ns#' term='equalsIgnoreCase'/><category scheme='http://www.blogger.com/atom/ns#' term='Turkish'/><title type='text'>Beware of Sun's Java equalsIgnoreCase --- Turkish example</title><content type='html'>There appears to be a mistake in the implementation of &lt;code&gt;String.equalsIgnoreCase&lt;/code&gt; in Sun's Java.&lt;br /&gt;&lt;br /&gt;Look what a colleague sent me (and see &lt;a href="http://nikolajlindberg.blogspot.com/2007/11/correct-case-with-javas-locale.html"&gt;an earlier post&lt;/a&gt; on Turkish characters below):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;import java.io.PrintStream;&lt;br /&gt;import java.util.Locale;&lt;br /&gt;&lt;br /&gt;public class TestTur&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;public static final void main(final String[] args) throws Exception&lt;br /&gt;&amp;nbsp;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;Locale.setDefault(new Locale("tr"));&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.setOut(new PrintStream(System.out,true,"UTF8"));&lt;br /&gt;&lt;br 
/&gt;&amp;nbsp;&amp;nbsp;String s1 = "I";&lt;br /&gt;&amp;nbsp;&amp;nbsp;String s2 = "ı";&lt;br /&gt;&amp;nbsp;&amp;nbsp;String s3 = "i";&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.out.println(s1+"=="+s2+"? "+s1.equalsIgnoreCase(s2));&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.out.println(s1+"=="+s2+"? "+s1.toLowerCase().equals(s2.toLowerCase()));&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.out.println();&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.out.println(s1+"=="+s3+"? "+s1.equalsIgnoreCase(s3));&lt;br /&gt;&amp;nbsp;&amp;nbsp;System.out.println(s1+"=="+s3+"? "+s1.toLowerCase().equals(s3.toLowerCase()));&lt;br /&gt;&amp;nbsp;}&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, what do you think the above code prints? You would expect that&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;span style="color: rgb(51, 0, 153);"&gt;string1&lt;/span&gt;&lt;span style="color: rgb(153, 0, 0);"&gt;.equalsIgnoreCase(&lt;/span&gt;&lt;span style="color: rgb(51, 0, 153);"&gt;string2&lt;/span&gt;&lt;span style="color: rgb(204, 0, 0);"&gt;)&lt;/span&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;is exactly the same as&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;span style="color: rgb(51, 0, 153);"&gt;string1&lt;/span&gt;&lt;span style="color: rgb(153, 0, 0);"&gt;.toLowerCase().equals(&lt;/span&gt;&lt;span style="color: rgb(51, 0, 153);"&gt;string2&lt;/span&gt;&lt;span style="color: rgb(153, 0, 0);"&gt;.toLowerCase())&lt;/span&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;wouldn't you...?&lt;br /&gt;&lt;br /&gt;Surprise, surprise. This is what the above code prints:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;I==ı? true&lt;br /&gt;I==ı? true&lt;br /&gt;&lt;br /&gt;I==i? true&lt;br /&gt;I==i? 
false&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I bet &lt;a href="http://en.wikipedia.org/wiki/Mustafa_Kemal_Atat%C3%BCrk"&gt;Mustafa Kemal Atatürk&lt;/a&gt; didn't see that one coming!&lt;br /&gt;&lt;br /&gt;The above peculiarity did actually lead to some problems for us, so this is a practical problem rather than an academic one.&lt;br /&gt;&lt;br /&gt;Part of the problem when dealing with Turkish text (apart from the mistake in how Java's &lt;code&gt;equalsIgnoreCase&lt;/code&gt; works) is that "Latin" 'i' and Turkish 'i', as well as "Latin" 'I' and Turkish 'I', share the same Unicode code points. Maybe they should have been different characters. A little late for that now.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4489073536920500577?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4489073536920500577/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4489073536920500577' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4489073536920500577'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4489073536920500577'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/beware-of-java-comparing-turkish.html' title='Beware of Sun&apos;s Java &lt;code&gt;equalsIgnoreCase&lt;/code&gt; --- Turkish example'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6331847885854431202</id><published>2008-03-18T12:14:00.017+01:00</published><updated>2009-01-21T15:28:52.819+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='firebird'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='postgresql'/><title type='text'>Firebird vs Postgresql</title><content type='html'>We have similar databases running on MySql, Postgresql and Firebird. One of the reasons for moving away from MySql was the fact that the UTF8 support didn't work properly. I cannot remember the details, but it had to do with non-Latin-1 data, such as text in Czech or Russian. In some situations MySql refused to correctly identify equal UTF8 strings. You put in some word that you cannot retrieve again, bleh!&lt;br /&gt;&lt;br /&gt;Furthermore, we've never understood how the user permissions are supposed to work in MySql (we always end up frantically running all possible variants of the GRANT ALL command).&lt;br /&gt;&lt;br /&gt;We moved to Postgresql, which worked a lot better. Now we've started using &lt;a href="http://firebirdsql.org/"&gt;Firebird&lt;/a&gt;, which also seems like a very nice piece of software.&lt;br /&gt;&lt;br /&gt;Here is a list of a few things I've noticed when moving from Postgresql to Firebird:&lt;br /&gt;&lt;br /&gt;* Firebird lacks built-in support for regular expressions. (We make heavy use of complex string searches of natural language data. 
If we hadn't got help from &lt;a href="http://dany.se/"&gt;an expert&lt;/a&gt;, who helped us compile some user defined functions, UDF:s, for this purpose, this would have been a show-stopper.)&lt;br /&gt;&lt;br /&gt;* Postgres' psql command line tool is better than Firebird's isql(-fb). (If you are a Windows user, see Carlos' comment below)&lt;br /&gt;&lt;br /&gt;* Firebird database files grow and grow. This is true even if you delete data. You have to manually back-up and restore a database to reclaim disk space. Maybe this is not a great problem in normal usage, but I noticed that the databases I use for running test suites against keep growing, though the test database itself is quite small (and the data are cleared out between test runs). [&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Please notice that long-time users of Firebird insist that this is not a problem. See Carlos', Sergio Marcelo's and also Michal's comments below.]&lt;br /&gt;&lt;br /&gt;* I've never had any luck installing Firebird from a Debian package. I have had to do a manual install to get it to work.&lt;br /&gt;&lt;br /&gt;* Firebird has a useful GUI, FlameRobin, that lets you inspect and change your databases. FlameRobin comes with an editor useful for writing/editing stored procedures. The editor has code completion that helps you with suggestions of table and column names and the like as you type.&lt;br /&gt;&lt;br /&gt;* Firebird has a nice way to manage database files: all tables of a database end up in a single file, which you can name whatever you like, and put wherever you like.&lt;br /&gt;&lt;br /&gt;* It appears to be easier to find useful documentation for Postgres than for Firebird (but Firebird does have a nice &lt;a href="http://www.firebirdfaq.org/"&gt;FAQ site&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Answer to Darius Damalakas' comment below: I'm not the right person to comment on the performance of the different DBMSs. 
However, we haven't noticed any significant difference in performance between MySql, Postgresql and Firebird. Currently, the bottlenecks in our software are to be found outside of the databases, so the performance of the individual DBMSs has not been a big concern. They're all fast enough.&lt;br /&gt;&lt;br /&gt;Firebird does seem to be a snappy system, and I would be surprised to find that it performs worse than Postgres.&lt;br /&gt;&lt;br /&gt;So far, the only difference in features that has mattered to us is the lack of built-in support for regular expressions in Firebird (see above). In all other respects (of importance to us), the functionality of Postgres and Firebird seems equivalent.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; Support for regular expressions is scheduled for the upcoming 2.5.0 release of Firebird.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update:&lt;/span&gt; In response to an anonymous (and rather critical) comment, &lt;a href="http://www.blogger.com/profile/09862886782232467681"&gt;mariuz&lt;/a&gt; has added some useful links in a comment below.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Update&lt;/span&gt;: In a comment below, Michal has posted some information on DatabaseGrowthIncrement, taken from the release notes of Firebird 2.1.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6331847885854431202?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6331847885854431202/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6331847885854431202' title='11 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6331847885854431202'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6331847885854431202'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/firebird-vs-postgresql.html' title='Firebird vs Postgresql'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-4924012370081516566</id><published>2008-03-15T15:48:00.002+01:00</published><updated>2008-10-17T11:38:43.502+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firebird'/><category scheme='http://www.blogger.com/atom/ns#' term='debian'/><category scheme='http://www.blogger.com/atom/ns#' term='apt-get'/><title type='text'>Beware of Firebird 2.0 Debian package</title><content type='html'>We are migrating an application to the Firebird 2.0 database manager (&lt;a href="http://www.firebirdsql.org"&gt;firebirdsql.org&lt;/a&gt;). Our server runs Debian (AMD64), and we used the Firebird 2.0 (superserver) Debian package as suggested in the Firebird site's FAQ section. However, when the package was installed, it appears to have silently overlooked a dependency, missing a library necessary for getting the "user defined functions", UDF:s, to work correctly. 
(Firebird didn't find the UDF:s, resulting in runtime errors when calls to the functions were issued from a Firebird database.)&lt;br /&gt;&lt;br /&gt;We made sure that Firebird as well as the UDF:s were all compiled for AMD64.&lt;br /&gt;&lt;br /&gt;After we uninstalled the apt-get Firebird package and manually installed Firebird 2.0.3 from the standard .tar.gz file, the missing dependency was spotted, and the database could be properly installed. Unfortunately, I didn't keep a record, but it might have been the correct version of libstdc++ that was missing.&lt;br /&gt;&lt;br /&gt;As far as I can remember, this is the only time a Debian apt-get package has failed me. In addition to the fact that the apt-get install of Firebird might be broken, you have to be careful not to apt-get install "Firebird 2", since this will give you Firebird 1.5! Peculiar. (But see the comment from &lt;a href="http://www.blogger.com/profile/09862886782232467681"&gt;mariuz&lt;/a&gt; below).&lt;br /&gt;&lt;br /&gt;I had a similar experience the first time I tried to install Firebird from a Debian package. This was Firebird 1.5 (the Firebird 2 Debian package), before Firebird 2.0 was released. I never got that one to run either, but had to install the tar.gz version obtained from the official Firebird webserver. I can't remember exactly what went wrong, but it was impossible to get the Debian package that we tried at that time to work. 
The manual install worked perfectly, just as it did this time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; The Debian Firebird2 package (containing Firebird 1.5) appears to be &lt;a href="http://lists.debian.org/debian-security-announce/2008/msg00098.html"&gt;discontinued&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-4924012370081516566?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/4924012370081516566/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=4924012370081516566' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4924012370081516566'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/4924012370081516566'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/beware-of-firebird-20-debian-package.html' title='Beware of Firebird 2.0 Debian package'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6430288361919742300</id><published>2008-03-15T15:04:00.003+01:00</published><updated>2008-10-17T11:36:20.650+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='string'/><category 
scheme='http://www.blogger.com/atom/ns#' term='+='/><category scheme='http://www.blogger.com/atom/ns#' term='concatenate'/><title type='text'>Don't concatenate Java strings using +=</title><content type='html'>The other day, I ran into a Java performance problem. It was an extremely simple Scanner loop, reading a file of some 20,000 lines of text, concatenating the lines into one single string:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Scanner sc = new Scanner(new File(fName), "UTF8");&lt;br /&gt;String result = "";&lt;br /&gt;while(sc.hasNextLine())&lt;br /&gt;{&lt;br /&gt;  result += sc.nextLine(); // Avoid this: each += copies the whole string built so far&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Do something with result&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The above loop took an incredibly long time to finish, and I had no clue what could possibly be wrong. A colleague glanced at the code and said "StringBuilder". I had forgotten about the poor performance of string concatenation using &lt;code&gt;+=&lt;/code&gt; (or &lt;code&gt;+&lt;/code&gt;) in a loop: every iteration creates a new string and copies everything built so far. I must have thought that this was a problem of the past.&lt;br /&gt;&lt;br /&gt;Replacing the &lt;code&gt;+=&lt;/code&gt; with a &lt;code&gt;StringBuilder&lt;/code&gt; resulted in excellent performance:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Scanner sc = new Scanner(new File(fName), "UTF8");&lt;br /&gt;StringBuilder result = new StringBuilder();&lt;br /&gt;while (sc.hasNextLine()) &lt;br /&gt;{&lt;br /&gt;  result.append(sc.nextLine()); // appends in place, no full copy per line&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Do something with result.toString()&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update:&lt;/span&gt; ttaveira points out that you may gain some additional speed by initializing the &lt;code&gt;StringBuilder&lt;/code&gt; to a suitable capacity. 
See the comment below.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6430288361919742300?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6430288361919742300/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6430288361919742300' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6430288361919742300'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6430288361919742300'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/03/dont-concatenate-java-strings-using.html' title='Don&apos;t concatenate Java strings using +='/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6780549217391086219</id><published>2008-02-15T11:42:00.001+01:00</published><updated>2008-04-07T13:34:17.478+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='System.setErr'/><category scheme='http://www.blogger.com/atom/ns#' term='System.setOut'/><title type='text'>Reading/writing non-default character encoded data in Java</title><content 
type='html'>When in an environment where the default (system) character encoding differs from the desired character encoding of the output data, you can use &lt;code&gt;System.setOut&lt;/code&gt; and &lt;code&gt;System.setErr&lt;/code&gt;. For reading data of a different character encoding than the default encoding, you can tell e.g. the &lt;code&gt;Scanner&lt;/code&gt; class what character encoding to expect.&lt;br /&gt;&lt;br /&gt;The following could be used for reading and writing UTF8 data on a system where the default character encoding may be different from UTF8:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;System.setOut(new PrintStream(System.out,true,"UTF8"));&lt;br /&gt;System.setErr(new PrintStream(System.err,true,"UTF8"));&lt;br /&gt;&lt;br /&gt;Scanner scanner = new Scanner(new File(fileName), "UTF8");&lt;br /&gt;&lt;br /&gt;while(scanner.hasNextLine())&lt;br /&gt;   {&lt;br /&gt;    // Read input lines,&lt;br /&gt;    String line = scanner.nextLine();&lt;br /&gt;    line = doSomething(line);&lt;br /&gt;    // Write some output to STDOUT/STDERR&lt;br /&gt;    System.out.println(line);&lt;br /&gt;    ...&lt;br /&gt;   }&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The boolean flag of the second constructor argument of &lt;code&gt;PrintStream&lt;/code&gt; activates autoflush, but one does not need to use this argument.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6780549217391086219?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6780549217391086219/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6780549217391086219' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6780549217391086219'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6780549217391086219'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2008/02/input-and-output-of-non-default.html' title='Reading/writing non-default character encoded data in Java'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-7499819184719767619</id><published>2007-11-25T21:07:00.003+01:00</published><updated>2008-05-24T12:11:00.088+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='touppercase'/><title type='text'>Correct case with Java's Locale</title><content type='html'>In Turkish, the uppercase version of 'i' is 'İ' (not 'I'). The problem is that the Turkish and the "ordinary" Latin 'i' are the same character (the same Unicode code point). 
If you upcase the 'i' in a Turkish context using the default settings, you might get the wrong letter.&lt;br /&gt;&lt;br /&gt;In Java, you can use the Locale class to get this right:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;Locale tr = new Locale("tr"); //Turkish&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;String trI = "i".toUpperCase(tr);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;System.out.println(trI);&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The above code outputs&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;İ&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;(and not 'I').&lt;br /&gt;&lt;br /&gt;Be aware that &lt;span style="font-style: italic;"&gt;comparing&lt;/span&gt; Turkish strings &lt;a href="http://cafe.elharo.com/java/turkish/"&gt;may not work flawlessly&lt;/a&gt;.  See also &lt;a href="http://nikolajlindberg.blogspot.com/2008/03/beware-of-java-comparing-turkish.html"&gt;this post&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;You should also notice that changing the &lt;code&gt;Locale&lt;/code&gt; does other things as well. 
For instance you might  end up getting error messages in Turkish...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-7499819184719767619?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/7499819184719767619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=7499819184719767619' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7499819184719767619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/7499819184719767619'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/correct-case-with-javas-locale.html' title='Correct case with Java&apos;s Locale'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8418654883098537279</id><published>2007-11-23T10:53:00.001+01:00</published><updated>2008-09-10T15:23:32.515+02:00</updated><title type='text'>Sun's javac and source file character encoding</title><content type='html'>Sometimes you may need to tell the Java compiler about the character encoding of the source files. 
For example, if an ISO-8859-1 encoded source file is compiled in a UTF-8 environment, any funny (non-ASCII) characters may become a problem.&lt;br /&gt;&lt;br /&gt;This is an example of how you can tell Sun's javac about the source file encoding:&lt;br /&gt;&lt;pre&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;javac -encoding iso8859-1 &amp;lt;FILE PATH(s)&amp;gt;&lt;/span&gt;&lt;span style="color: rgb(102, 0, 0);font-family:Georgia,serif;" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/pre&gt;(Without the encoding switch, javac uses your system's default encoding.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This way, you can tell Sun's JVM to expect UTF-8 IO:&lt;pre&gt;java -Dfile.encoding=utf8&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8418654883098537279?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8418654883098537279/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8418654883098537279' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8418654883098537279'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8418654883098537279'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/telling-suns-javac-about-source-file.html' title='Sun&apos;s javac and source file character encoding'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-8218524282940546198</id><published>2007-11-22T10:39:00.000+01:00</published><updated>2007-11-24T22:45:18.813+01:00</updated><title type='text'>Identifying Unicode code blocks in Java</title><content type='html'>With the help of Java's Character class, one can identify which code block a Unicode character belongs to. This may be useful when, for example, validating a string in order to find peculiar mixtures of character code blocks (see an example in a &lt;a href="http://nikolajlindberg.blogspot.com/2007/11/unicode-mystery.html"&gt;previous post&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;The following code&lt;br /&gt;&lt;pre style="color: rgb(153, 0, 0);"&gt;Character.UnicodeBlock ub = null;&lt;br /&gt;&lt;br /&gt;ub = Character.UnicodeBlock.of('\u042F');&lt;br /&gt;System.out.println(ub);&lt;br /&gt;&lt;br /&gt;ub = Character.UnicodeBlock.of('۲');&lt;br /&gt;System.out.println(ub);&lt;br /&gt;&lt;/pre&gt;outputs&lt;br /&gt;&lt;pre&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;CYRILLIC&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;ARABIC&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is a method returning all code blocks for the characters of a string:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;Set&amp;lt;UnicodeBlock&amp;gt; getUnicodeCodeBlocks(final String s)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;      Set&amp;lt;UnicodeBlock&amp;gt; result = new HashSet&amp;lt;UnicodeBlock&amp;gt;();&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;      for(char c : s.toCharArray())&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;      {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: 
rgb(102, 0, 0);"&gt;        result.add(Character.UnicodeBlock.of(c));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;      }&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;      return result;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-8218524282940546198?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/8218524282940546198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=8218524282940546198' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8218524282940546198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/8218524282940546198'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/identifying-unicode-code-blocks-in-java.html' title='Identifying Unicode code blocks in Java'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-664643783513713987</id><published>2007-11-20T17:26:00.000+01:00</published><updated>2008-03-19T21:29:00.853+01:00</updated><title type='text'>The ancient art of bashing the Danish language</title><content type='html'>This is what a Swede (Dr Hemming Gadh) 
is said to have said about the Danish language in the year 1510 (translation below):&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;‘Der till medh: så wærdas de icke heller att talla som annat folck, utan tryckia ordhen fram lika som the willia hosta, och synas endeles medh flitt forwendhe ordhen i strupan, for æn de komma fram, sammaledes wanskapa the munnen, då the talla, wridhan och wrengan, så att the draga then offwra leppen till then wenstra sidon och den nedra till then högra sidon, menandes dett wara sig en besynnerlighe prydning och wellståndh.’&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The Swedish of the section above is quite weird in itself. The following English translation is lifted from &lt;a href="http://www.humaniora.sdu.dk/boernesprog/DK/forsk/doc/E-print_3_2006.pdf"&gt;Syllabic and morphological structure: what can be learnt from their interaction in Danish?&lt;/a&gt; (Hans Basbøll, 2006):&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(102, 0, 0);"&gt;‘Also this: nor do they [the Danes] stoop (‘worthy themselves’) to speak like other people, but press the words forward as if they will cough, and appear partly to deliberately turn the words around in the throat, before they come forward (i.e. out of the mouth), partly they misshape the mouth when they speak, twist it and sneer it, so that they pull the upper lip to the left side and the lower to the right side, thinking this to be a particular ornament and well-standing.’&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The first section of the paper referred to above is "Can spoken Danish be understood?". The paper cites studies pointing to the fact "[...] that Danish is found (more or less) difficult by everybody" (‘everybody’ meaning other Scandinavians).&lt;br /&gt;&lt;br /&gt;The old Danish-bashing quotation is also found in The Phonology of Danish, Hans Basbøll, Oxford University Press, 2005, p. 
83.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-664643783513713987?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/664643783513713987/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=664643783513713987' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/664643783513713987'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/664643783513713987'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/ancient-art-of-bashing-danish-language.html' title='The ancient art of bashing the Danish language'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1494791116080915517</id><published>2007-11-20T12:37:00.001+01:00</published><updated>2007-11-23T00:28:37.212+01:00</updated><title type='text'>The byte-order mark</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Things you didn't know that you had to know about #42:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The &lt;span style="color: rgb(0, 153, 0);"&gt;BOM,&lt;/span&gt; or the &lt;a href="http://en.wikipedia.org/wiki/Byte_Order_Mark"&gt;byte-order mark&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The &lt;span style="color: rgb(0, 102, 0);"&gt;BOM&lt;/span&gt; is a hateful creature, 
that sits at the beginning of some of your files. It's there only to mess things up.&lt;br /&gt;&lt;br /&gt;You cannot see the &lt;span style="color: rgb(0, 51, 0);"&gt;BOM,&lt;/span&gt; but when you do, it hurts your eye:&lt;span style="color: rgb(255, 0, 0);font-size:180%;" &gt; ï»¿&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The most rational action when running into the &lt;span style="color: rgb(0, 51, 51);"&gt;BOM&lt;/span&gt; in a UTF-8 file, is to blame the nearest Windows user, and then delete it (the &lt;span style="color: rgb(0, 0, 102);"&gt;BOM,&lt;/span&gt; not the user).&lt;br /&gt;&lt;br /&gt;Don't ask me why the &lt;span style="color: rgb(51, 0, 153);"&gt;BOM&lt;/span&gt; made me remember this old tune:&lt;br /&gt;&lt;object width="425" height="355"&gt;&lt;param name="movie" value="http://www.youtube.com/v/ZuL2kvEaWrw&amp;rel=1"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/ZuL2kvEaWrw&amp;rel=1" type="application/x-shockwave-flash" wmode="transparent" width="425" height="355"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1494791116080915517?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1494791116080915517/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1494791116080915517' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1494791116080915517'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1494791116080915517'/><link rel='alternate' type='text/html' 
href='http://nikolajlindberg.blogspot.com/2007/11/da-bom-in-da-data-byte-order-mark.html' title='The byte-order mark'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-2475796362988759586</id><published>2007-11-07T10:50:00.001+01:00</published><updated>2007-11-25T00:01:04.640+01:00</updated><title type='text'>The file and iconv commands</title><content type='html'>&lt;span style="font-family:times new roman;"&gt;file&lt;/span&gt; and &lt;span style="font-family:times new roman;"&gt;iconv&lt;/span&gt; are two simple but useful commands, handy when dealing with files of different character encodings.&lt;br /&gt;&lt;span style="color: rgb(153, 0, 0);font-family:times new roman;" &gt;&lt;br /&gt;file &amp;lt;file path&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-family:times new roman;"&gt;file&lt;/span&gt; command makes a guess of what kind of file a file is. If you are lucky it may, for instance, help you to find out that a text file is encoded using UTF16, or ISO-8859-1, etc.&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(153, 0, 0);font-family:times new roman;" &gt;iconv -f  &amp;lt;current encoding&amp;gt; -t  &amp;lt;target encoding&amp;gt; &amp;lt;file path&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-family:times new roman;"&gt;iconv&lt;/span&gt; command is useful for converting between different character encodings. 
For instance, if you have noticed (with the help of the &lt;span style="font-family:times new roman;"&gt;file&lt;/span&gt; command above) that a Unicode file is in UTF16, but you want it to be UTF8, you may use the &lt;span style="font-family:times new roman;"&gt;iconv&lt;/span&gt; command:&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(153, 0, 0);font-family:times new roman;" &gt;iconv -f utf16 -t utf8 &amp;lt;file path&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;(There are other, similar, commands, but for some reason &lt;span style="font-family:times new roman;"&gt;iconv&lt;/span&gt; appears to be the only one that I can remember the name of.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-2475796362988759586?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/2475796362988759586/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=2475796362988759586' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2475796362988759586'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/2475796362988759586'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/file-and-iconv-commands.html' title='The file and iconv commands'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1521624864107651790</id><published>2007-11-06T23:55:00.000+01:00</published><updated>2007-11-10T13:51:54.905+01:00</updated><title type='text'>The unicode command</title><content type='html'>I was just told about the  &lt;span style="font-family:times new roman;"&gt;unicode&lt;/span&gt; command.&lt;br /&gt;&lt;br /&gt;On a Debian based system, try &lt;span style="font-family:times new roman;"&gt;sudo apt-get install unicode&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Running the &lt;tt&gt;unicode&lt;/tt&gt; command on the different characters  'y' and 'у' (see earlier post) looks like this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;nikolaj@fon:~$ unicode yу&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;U+0079 LATIN SMALL LETTER Y&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;UTF-8: 79  UTF-16BE: 0079  Decimal: y&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;y (Y)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Uppercase: U+0059&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Category: Ll (Letter, Lowercase)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Bidi: L (Left-to-Right)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;U+0443 CYRILLIC SMALL LETTER U&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;UTF-8: d1 83  UTF-16BE: 0443  Decimal: у&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;у (У)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Uppercase: U+0423&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Category: Ll (Letter, 
Lowercase)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:times new roman;"&gt;Bidi: L (Left-to-Right)&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1521624864107651790?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1521624864107651790/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1521624864107651790' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1521624864107651790'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1521624864107651790'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/unicode-command.html' title='The &lt;tt&gt;unicode&lt;/tt&gt; command'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-3274619238054182629</id><published>2007-11-06T22:36:00.006+01:00</published><updated>2008-05-13T20:28:45.177+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='codepoints'/><category scheme='http://www.blogger.com/atom/ns#' term='utf8'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='identical typaces'/><title type='text'>Unicode mystery: Identical typeface, different 
characters</title><content type='html'>Do the following two strings look similar to you?&lt;br /&gt;&lt;br /&gt;уeЕoОxХaАM&lt;br /&gt;yеEоOхXаAМ&lt;br /&gt;&lt;br /&gt;Sorry, but they aren't. Not one of the corresponding characters of those strings is equal, except to the eye. For example, 'y' and 'у' have different Unicode code points (U+0079 and U+0443).&lt;br /&gt;&lt;br /&gt;The fact is that both strings contain a mix of Latin and Cyrillic characters, but not the same mix. There are a number of Cyrillic and Latin characters that look the same, but aren't. Comparing these strings using, for example, Java's String.equals method will return false. Sorting strings of mixed scripts (but with identical-looking typefaces) will produce odd results. Latin P is quite different from the Russian Р, etc.&lt;br /&gt;&lt;br /&gt;Giving it a bit of thought, it is not strange at all, but the first time you run into the problem, it can be quite tough to figure it out. It's like the first time you accidentally activated the "Insert" key on your keyboard, and thought that your computer was broken. (By the way: "Insert", "Scroll Lock", "Pause/Break"... 
What on earth are these keys doing on my keyboard...?!)&lt;br /&gt;&lt;br /&gt;If you are dealing with (language) data of different character encodings, it's wise to validate your strings, to ensure that, e.g., a Russian string contains only Cyrillic characters, and that the Western European ones contain only Latinos (see &lt;a href="http://nikolajlindberg.blogspot.com/2007/11/identifying-unicode-code-blocks-in-java.html"&gt;this&lt;/a&gt;  related post).&lt;br /&gt;&lt;br /&gt;And don't miss the 'A' example in the comment below!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-3274619238054182629?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/3274619238054182629/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=3274619238054182629' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/3274619238054182629'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/3274619238054182629'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/unicode-mystery.html' title='Unicode mystery: Identical typeface, different characters'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-6865896473257461661</id><published>2007-11-05T22:04:00.000+01:00</published><updated>2007-11-25T21:27:39.701+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vacation'/><category scheme='http://www.blogger.com/atom/ns#' term='fall'/><category scheme='http://www.blogger.com/atom/ns#' term='scooters'/><title type='text'>Ubuntu 7.10 sluggified my laptop</title><content type='html'>Upgrading my Acer laptop from Ubuntu 7.04 (aka "Bibbly Bobbly") to 7.10 (aka "Sniggly Snoggly") significantly downgraded its GUI performance. It now takes twice as long to boot the laptop. It takes five times as long to bring up the desktop after log-in. Starting Firefox takes a millennium. I didn't even get the promised 3D Desktop jingle-jangle. All I got were some lousy desktop icons.&lt;br /&gt;&lt;br /&gt;Furthermore, something happened to the networking, and it now takes quite a while for the poor thing to find the Internets. Altogether, this means that I have to get up no less than 2.5 minutes earlier each day, to meet my busy schedule.&lt;br /&gt;&lt;br /&gt;Unsnappiness 2.0, and I want my money back!&lt;br /&gt;&lt;br /&gt;However, when I nikoloogle for people with a similar experience, all I find is people going "Wooah, I got better performance with Ubuntu 7.10!". 
One wonders, am I alone in this?&lt;br /&gt;&lt;br /&gt;Well, maybe not everyone praised its superior performance:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.flickr.com/photos/xmoln/1810682660/" title="Photo Sharing"&gt;&lt;img src="http://farm3.static.flickr.com/2025/1810682660_3b7f72833b_m.jpg" alt="ERROR" height="180" width="240" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-6865896473257461661?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/6865896473257461661/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=6865896473257461661' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6865896473257461661'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/6865896473257461661'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/ubuntu-710-sluggified-my-laptop.html' title='Ubuntu 7.10 sluggified my laptop'/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm3.static.flickr.com/2025/1810682660_3b7f72833b_t.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3840687515615686738.post-1361314985470822713</id><published>2007-11-05T17:37:00.000+01:00</published><updated>2007-11-05T21:51:22.026+01:00</updated><title type='text'></title><content type='html'>Welcome to the least visited page of the Internets!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3840687515615686738-1361314985470822713?l=nikolajlindberg.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nikolajlindberg.blogspot.com/feeds/1361314985470822713/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3840687515615686738&amp;postID=1361314985470822713' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1361314985470822713'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3840687515615686738/posts/default/1361314985470822713'/><link rel='alternate' type='text/html' href='http://nikolajlindberg.blogspot.com/2007/11/vlkommen-till-vrldens-nst-mest-sllan.html' title=''/><author><name>Nikolaj Lindberg</name><uri>http://www.blogger.com/profile/12153448128671603936</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp0.blogger.com/_jIEZFmoaVas/SACEmcTFbGI/AAAAAAAAAC0/V4MWnAHIp64/S220/tegel.jpg'/></author><thr:total>0</thr:total></entry></feed>
