Monday, 16 November 2009

Source's getLines in Scala 2.8 now strips line end

In Scala 2.8 (not yet officially released), scala.io.Source has been updated.

When reading lines from a file, you no longer need to trim the lines, since line terminators are removed by default. The code to read lines from a file using Source may now look something like this (where fName is a string holding a file name):

val lines = io.Source.fromPath(fName) getLines()
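
To check the behaviour for yourself, something along these lines might do. It is only a sketch: it writes a small temporary file with mixed line endings and reads it back. It assumes a Scala version where the method is called Source.fromFile (which, as far as I know, is the name used in released versions), and the names LineEndDemo and readLines are made up for this example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source

object LineEndDemo {
  // Hypothetical helper: reads all lines into a List and closes the source afterwards.
  def readLines(fName: String): List[String] = {
    val src = Source.fromFile(fName, "UTF-8")
    try src.getLines().toList
    finally src.close()
  }

  def main(args: Array[String]): Unit = {
    // Temporary file with a Unix line end, a Windows line end, and no final line end.
    val tmp = Files.createTempFile("lines", ".txt")
    Files.write(tmp, "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8))

    val lines = readLines(tmp.toString)

    // Checks that stripping the line end changes nothing, i.e. it is already gone.
    println(lines.forall(l => l == l.stripLineEnd))

    Files.delete(tmp)
  }
}
```

The helper copies the lines into a List before closing the source, since getLines() returns a lazy iterator over the underlying file.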

If you want to specify the input file encoding to be UTF-8, you could try this:

val lines = io.Source.fromPath(fName)("UTF8") getLines()

When you look at the API documentation, you'll find that fromPath takes a Codec as an implicit parameter in a second parameter list. You can nevertheless call it with a string ("UTF8") instead, as in the example above, because an implicit conversion from String to Codec applies (as far as I can tell, it is defined in the Codec companion object).
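
To make the conversion a bit less mysterious, here is a small sketch that builds the Codec in two ways: explicitly with Codec("UTF-8"), and by assigning a plain string where a Codec is expected. It assumes a Scala version where the method is called Source.fromFile and where the String to Codec conversion is available without an extra import; CodecDemo and readWith are hypothetical names used only here.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.{Codec, Source}

object CodecDemo {
  // Hypothetical helper: the codec is passed explicitly in the second parameter list.
  def readWith(fName: String, codec: Codec): List[String] = {
    val src = Source.fromFile(fName)(codec)
    try src.getLines().toList
    finally src.close()
  }

  def main(args: Array[String]): Unit = {
    // Temporary UTF-8 file containing non-ASCII characters.
    val tmp = Files.createTempFile("codec", ".txt")
    Files.write(tmp, "sm\u00f6rg\u00e5s\n".getBytes(StandardCharsets.UTF_8))

    // Codec built explicitly from a charset name.
    val explicitCodec: Codec = Codec("UTF-8")
    // Codec obtained from a String through the implicit conversion.
    val convertedCodec: Codec = "UTF-8"

    // Compares the lines read with each of the two codecs.
    println(readWith(tmp.toString, explicitCodec) == readWith(tmp.toString, convertedCodec))

    Files.delete(tmp)
  }
}
```

Writing Codec("UTF-8") spells out what the string version does behind the scenes, which may be the clearer choice in code that others have to read.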

Anyway, no more Source.fromFile(fName).getLines.map(_.stripLineEnd). Someone is improving Scala!

4 comments:

Tibi said...

Not sure it’s good: it’s a breaking change. If you have some code relying on the \n you’re screwed.

Daniel said...

If you rely heavily on scala.io you are screwed anyhow, really. :-(

The scala.io library exists simply because scala.xml needed something. It was written as a stopgap solution, and it has serious deficiencies. For instance, have you considered when the resource opened by Source is released?

Anyway, there is an on-going effort to write a new Scala I/O library, heavily inspired by the lessons learned from JDK7.

Nikolaj Lindberg said...

Tibi,

I find a breaking change every now and then refreshing :)

(If I'm not mistaken, one should expect Scala 2.8 to break some 2.7 code.)


Daniel,

I've noticed that people seem to dislike scala.io a bit... My experience is that it works very well for scripting (smaller text processing tasks) at least. Maybe I'd change my mind if I were to use it for more demanding stuff.

Ed Staub said...

I just had to write something that reads 100000 lines of input, about 45 chars apiece, all 8-bit stuff.

BufferedReader(FileReader): 270 ms
Source.fromFile.getLines: 1689 ms

(Folks who don't think that performance matters need not repeat themselves here...)