Friday, February 17, 2017

Memory usage in C#: StringBuilder versus string concatenation, The WRONG WAY

Do not use this as an example of how to do benchmark testing!  I am leaving it here as an example of how NOT to do it.  Look at the next post for the RIGHT way (or at least more right than this).  You have been warned.

For fun I built a program that is a memory hog.  I used recursion and appended strings using the '+' operator and built a bunch of objects.  My thought was that I would stress out the garbage collector with the recursion and string nastiness.  Then I went back and tried the same without the string mess by using StringBuilder.  I left the object creation and recursion the same.  What do you think happened?  Here is the code, and my results follow.
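As a rough illustration of the two approaches (a minimal sketch, not the program that produced the numbers below): the method names are borrowed from the output, but the bodies, the choice of file attributes, the error handling, and the command-line arguments are all assumptions on my part. Each mode is meant to run in its own process, because the peak working set is a process-wide high-water mark.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

static class ConcatVersusAppend
{
    // Recursively walks a folder tree and grows one string with '+'.
    // Every '+=' allocates a new string and copies the old contents into it.
    public static string ReadAllFilesAttributesAndConcat(string dir)
    {
        string result = "";
        try
        {
            foreach (string file in Directory.GetFiles(dir))
            {
                FileAttributes attrs = File.GetAttributes(file);
                result += file + " " + attrs + Environment.NewLine;
            }
            foreach (string sub in Directory.GetDirectories(dir))
                result += ReadAllFilesAttributesAndConcat(sub);
        }
        catch (UnauthorizedAccessException) { /* assumption: skip unreadable folders */ }
        catch (IOException) { /* assumption: skip entries that vanish mid-walk */ }
        return result;
    }

    // Same walk, but every level of recursion appends to one shared StringBuilder.
    public static void ReadAllFilesAttributesAndAppend(string dir, StringBuilder sb)
    {
        try
        {
            foreach (string file in Directory.GetFiles(dir))
            {
                FileAttributes attrs = File.GetAttributes(file);
                sb.Append(file).Append(' ').Append(attrs).AppendLine();
            }
            foreach (string sub in Directory.GetDirectories(dir))
                ReadAllFilesAttributesAndAppend(sub, sb);
        }
        catch (UnauthorizedAccessException) { /* assumption: skip unreadable folders */ }
        catch (IOException) { /* assumption: skip entries that vanish mid-walk */ }
    }

    static void Main(string[] args)
    {
        // Hypothetical usage: ConcatVersusAppend.exe <concat|append> <rootFolder>
        string mode = args.Length > 0 ? args[0] : "append";
        string root = args.Length > 1 ? args[1] : Directory.GetCurrentDirectory();

        Stopwatch timer = Stopwatch.StartNew();
        int length;
        if (mode == "concat")
        {
            length = ReadAllFilesAttributesAndConcat(root).Length;
        }
        else
        {
            StringBuilder sb = new StringBuilder();
            ReadAllFilesAttributesAndAppend(root, sb);
            length = sb.Length;
        }
        timer.Stop();

        Console.WriteLine("string length = {0:N0}", length);
        Console.WriteLine("max memory = {0:N0}", Process.GetCurrentProcess().PeakWorkingSet64);
        Console.WriteLine("Duration ={0:mm\\:ss}", timer.Elapsed);
    }
}
```

Reading the attributes into a local before touching the string keeps the two methods in step: if `File.GetAttributes` throws, neither version has written half a line yet.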

The results are:

ReadAllFilesAttributesAndConcat string length = 117,158,296
ReadAllFilesAttributesAndConcat max memory = 1,756,577,792
Duration =21:56
ReadAllFilesAttributesAndAppend string length = 24,809,733
ReadAllFilesAttributesAndAppend max memory = 720,510,976
Duration =18:53

Second fun run:
ReadAllFilesAttributesAndConcat string length = 131,089,243
ReadAllFilesAttributesAndConcat max memory = 1,755,693,056
Duration =25:37
ReadAllFilesAttributesAndAppend string length = 38,738,692
ReadAllFilesAttributesAndAppend max memory = 795,697,152
Duration =20:53

Why are the string lengths different?  Could it be temp files were created and destroyed while it was running?  Nope, the problem was in my code.  I updated it to this:

and ran it again against my D: and the results are as follows:

string length = 13,362,511
max memory = 304,152,576
Duration =01:50
string length = 13,362,511
max memory = 192,352,256
Duration =01:37

In the first runs, concatenating with '+' peaked at about 1.6 GB of memory, while StringBuilder peaked at only about 0.7 GB.  In the corrected run, the peaks were about 0.3 GB for '+' and 0.2 GB for StringBuilder.  The resulting string lengths were the same, and StringBuilder was faster (01:37 versus 01:50).

The moral of the story?  Test your code!  Also, StringBuilder for the win!  Other things to note: I should not have run it against the main drive, as that introduces variability because of swap files and such.  Also, it took way too long for this to run.  I am running it on an SSD, and it still took over 40 minutes to finish each run.  I might try something smaller than my C: next time, but I wanted to make sure I fully exercised the methods without having to be smart about building a test string generator. That'll learn me. :-)  Last, my strings grew to 117 million characters.  That is a lot of stuff!  Would the memory usage difference have been smaller if I hadn't pushed it with such long strings?  Probably.  Would the performance difference have been smaller?  Again, probably.
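For what it's worth, a test string generator does not have to be very smart.  Here is a minimal sketch, assuming made-up path strings and a hypothetical line count (none of this is from the program above); it feeds both approaches identical input on every run, so swap files and temp files cannot change the string length.

```csharp
using System;
using System.Text;

static class SyntheticWorkload
{
    // Returns the same fake "path plus attributes" line for a given index every time.
    // The path layout and the "Archive" attribute are invented for illustration.
    public static string MakeLine(int i)
    {
        return @"D:\fake\folder" + (i % 100) + @"\file" + i + ".txt Archive" + Environment.NewLine;
    }

    // Grows the result with '+', copying the whole string on every iteration.
    public static string BuildWithConcat(int lineCount)
    {
        string result = "";
        for (int i = 0; i < lineCount; i++)
            result += MakeLine(i);
        return result;
    }

    // Appends into a StringBuilder's internal buffer and converts once at the end.
    public static string BuildWithStringBuilder(int lineCount)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lineCount; i++)
            sb.Append(MakeLine(i));
        return sb.ToString();
    }

    static void Main()
    {
        const int lineCount = 10000; // hypothetical size, small enough to finish quickly
        string a = BuildWithConcat(lineCount);
        string b = BuildWithStringBuilder(lineCount);
        Console.WriteLine("concat length = {0:N0}", a.Length);
        Console.WriteLine("append length = {0:N0}", b.Length);
        Console.WriteLine("identical = {0}", a == b);
    }
}
```

If the two lengths ever disagree with input like this, the bug is in the code and not on the disk.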
Keep your code clean!
