Friday, October 22, 2004

StringReplace isn't a speed deamon...

Currently, the Win32 version of StringReplace in the SysUtils unit isn't exactly a speed deamon . It has several problems:

1. It grows the result string bit-by-bit. If many replacements are being done, many reallocations and memory copy operations are going on.

2. It calls AnsiPos which calls AnsiStrPos, which at least calls StrLen twice and StrPos once. StrPos internally recalculates the string lengths again (by scanning for #0). So for every AnsiPos call, the entire source string is scanned twice to find the length. This length is constant and known by StringReplace.

3. It copies the 'rest' part of the source string to a separate SearchStr variable. This is a lot of overhead just to know where in the string you are.

4. AnsiPos/AnsiStrPos also have the overhead of multi-byte support even for locales that don't need it. I have written a suggestion for replacement that includes versions of StrPos and StrScan that take length parameters:

function StrLPos(const Str1, Str2: PChar; Len1, Len2: integer): PChar; 

function StrLScan(const Str: PChar; Chr: Char; Len: integer): PChar;

Next I wrote a new AnsiStrLenPos that takes lengths as parameters:

function AnsiStrLenPos(Str, SubStr: PChar; L1, L2: Cardinal): PChar; 

This new version uses StrLPos to quickly find the next position without scanning for the length. It also optimizes the fairly common case when the substring length is 1 by calling StrLScan instead. By explicitly checking for FarEast (implying a multibyte character set), we can speed up the normal case. There are no case-sensitve code here (we're upping the strings for that case), so the plain StrLScan and StrLPos should do fine.


I have uploaded my faster version of StringReplace to CodeCentral: http://codecentral.borland.com/codecentral/ccWeb.exe/listing?id=22677


An earlier version of WebBroker on Steroids used this faster StringReplace, markedly speeding up the template processing (while reducing memory consumption and fragmentation). A later version dropped using StringReplace for a faster algorithm using custom parsing.


Notice that the dedicated assembly hackers over in the borland.public.delphi.langauge.basm newsgroup have a FastCode competition going (one amongst many) to write the fastest implementation of StringReplace. The winner here will probably be orders of magnitude faster than my version ;).


One day these optimized versions of the Delphi RTL routines might end up in the next version of your favorite developement tool.

1 comment:

Hallvards New Blog said...

Yes, I've seen that. However, it looks like it falls well behind the best entries in the FastCode contest (I don't think it passed validation either).

The results are in here:

http://dennishomepage.gugs-cats.dk/AnsiStrReplaceChallenge.htm

The fastest entry is some 7-9 times faster than the RTL version (depending on the CPU). My own contribution scored less than 3 times faster.

Note that real-world speed gain will vary considerably - as always how you benchmark determines the results.

When that is said, Eric Grange's winning entry has some really neat tricks. The most important trick is to search for and store the position of all hits in the source string before allocating the result string and copying over the string data. Another is to use an upper-case character table to avoid pre-upping strings in the case-insensitive mode.



Copyright © 2004-2007 by Hallvard Vassbotn