Saturday 15 March 2008

Don't concatenate Java strings using +=

The other day, I ran into a Java performance problem. It was an extremely simple Scanner loop, reading a file of some 20,000 lines of text, concatenating the lines into one single string:

Scanner sc = new Scanner(new File(fName), "UTF8");
String result = "";
while(sc.hasNextLine())
{
result += sc.nextLine(); //Avoid this!
}

// Do something with result


The above loop took incredible long time to finish, and I had no clue of what could possibly be wrong. A colleague glanced at the code and said "StringBuilder". I had forgotten about the poor performance of string concatenation using += (or +). I must have thought that this was a problem of the past.

Removing the += part for a StringBuilder resulted in excellent performance:

Scanner sc = new Scanner(new File(fName), "UTF8");
StringBuilder result = new StringBuilder();
while (sc.hasNextLine())
{
result.append(sc.nextLine());
}

// Do something with result.toString

Update: ttaveira points out that you may gain some additional speed by initializing the StringBuilder to a suitable capacity. See the comment below.

1 comment:

Anonymous said...

Was reading about this just now, don't forget in your last example to set a approximate capacity to what you'll need in StringBuilder().

http://www.precisejava.com/javaperf/j2se/StringAndStringBuffer.htm