Showing posts with label string. Show all posts

Friday, 5 September 2008

Scala: String vs RichString oddities

Update: In Scala 2.8, the below is no longer true. String.reverse now returns a String rather than a RichString:


scala> "a".reverse == "a"
res0: Boolean = true

=========================================

In the Scala programming language, there is a class called RichString that adds features to the underlying Java String. In the current version of Scala (2.7.2.final), this leads to some odd behaviour:
"Im a string" == "Im a string".reverse.reverse
returns false, while
"Im a string" == "Im a string".reverse.reverse.toString
returns true!

Just to make your head spin, the following code does indeed work as expected:
val str :String = "Im a string".reverse.reverse
println(str == "Im a string") // prints "true"
while
val str = "Im a string".reverse.reverse
println(str == "Im a string") // prints "false"
does not.



The explanation is that String.reverse returns a RichString, and that == returns false when comparing a String and a RichString, even though it is the "same" string (as in the example above).

If I understand it correctly, this oddity will be fixed in future releases of Scala.

(And no, Scala's == is not the same as Java's ==. In Scala it means "equal objects" rather than "refers to the same instance of an object".)
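
For comparison, here is a minimal Java sketch of the two notions of equality mentioned above. The class and method names (StringEqualityDemo, sameInstance, sameContent) are made up for illustration; only the standard String and StringBuilder calls are real.

```java
public class StringEqualityDemo {
    // Java's ==: true only if both variables refer to the same object.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    // equals(): true if both strings contain the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String original = "Im a string";
        // Reversing twice yields the same characters in a newly built String.
        String roundTrip = new StringBuilder(original).reverse().reverse().toString();

        System.out.println(sameInstance(original, roundTrip)); // prints "false"
        System.out.println(sameContent(original, roundTrip));  // prints "true"
    }
}
```

In Java the surprise is the reference comparison. In Scala 2.7 it was the other way round: == already compared values, but a String and a RichString were never considered equal.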

Scala mailing list item here.

Saturday, 15 March 2008

Don't concatenate Java strings using +=

The other day, I ran into a Java performance problem. It was an extremely simple Scanner loop, reading a file of some 20,000 lines of text, concatenating the lines into one single string:

Scanner sc = new Scanner(new File(fName), "UTF8");
String result = "";
while (sc.hasNextLine())
{
    result += sc.nextLine(); // Avoid this!
}

// Do something with result


The above loop took an incredibly long time to finish, and I had no clue what could possibly be wrong. A colleague glanced at the code and said "StringBuilder". I had forgotten about the poor performance of string concatenation using += (or +) in a loop. I must have thought that this was a problem of the past.
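
To get a feeling for why the loop is so slow, here is a rough cost model. It assumes that every += builds a brand-new string from everything accumulated so far plus the new line (Java strings are immutable), and it ignores the builder's occasional buffer regrowth. The class name, the method names, and the line length of 80 characters are assumptions made for the illustration, not measurements from the original file.

```java
public class ConcatCostModel {
    // Characters copied when each += copies the whole accumulated
    // result into a new String.
    static long charsCopiedWithPlusEquals(int lines, int lineLength) {
        long copied = 0;
        long currentLength = 0;
        for (int i = 0; i < lines; i++) {
            currentLength += lineLength;
            copied += currentLength; // the entire result is copied again
        }
        return copied;
    }

    // Idealised builder: each character is appended exactly once
    // (buffer regrowth is ignored in this model).
    static long charsCopiedWithBuilder(int lines, int lineLength) {
        return (long) lines * lineLength;
    }

    public static void main(String[] args) {
        // 20,000 lines of an assumed 80 characters each.
        System.out.println(charsCopiedWithPlusEquals(20000, 80)); // prints "16000800000"
        System.out.println(charsCopiedWithBuilder(20000, 80));    // prints "1600000"
    }
}
```

Under these assumptions the += loop copies about ten thousand times as many characters, and the gap grows quadratically with the number of lines.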

Replacing the += with a StringBuilder resulted in excellent performance:

Scanner sc = new Scanner(new File(fName), "UTF8");
StringBuilder result = new StringBuilder();
while (sc.hasNextLine())
{
    result.append(sc.nextLine());
}

// Do something with result.toString()

Update: ttaveira points out that you may gain some additional speed by initializing the StringBuilder to a suitable capacity. See the comment below.
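
One possible way to pick such a capacity is to use the file size as an estimate. This is only a sketch: the class PresizedRead and its methods are hypothetical names, and using the size in bytes is an assumption that works because a UTF-8 file never contains more chars than bytes. For very large files you would want a smaller cap than the one used here.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class PresizedRead {
    // The byte count is an upper bound on the number of chars in a
    // UTF-8 file; clamp it so that it fits in an int capacity.
    static int estimateCapacity(long fileSizeInBytes) {
        return (int) Math.min(fileSizeInBytes, Integer.MAX_VALUE - 8);
    }

    // Reads all lines into one string (line terminators are dropped,
    // as in the loops above).
    static String readAll(File file) throws FileNotFoundException {
        StringBuilder result = new StringBuilder(estimateCapacity(file.length()));
        try (Scanner sc = new Scanner(file, "UTF-8")) {
            while (sc.hasNextLine()) {
                result.append(sc.nextLine());
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Expects the path of a text file as the first argument.
        System.out.println(readAll(new File(args[0])).length());
    }
}
```

With the capacity set up front, the builder should not need to grow its internal buffer while the file is being read.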