Tuesday, October 26, 2004

The FastCode AnsiStringReplace results are in

The results of the FastCode AnsiStringReplace competition are in:


The fastest entry is some 7-9 times faster than the RTL reference version (depending on the CPU). My own contribution scored less than 3 times faster... ;). Note that real-world speed gain will vary considerably - as always what you benchmark determines the results.

When that is said, Eric Grange's winning entry has some really neat tricks. The most important trick is to search for and store the position of all hits in the source string before allocating the result string and copying over the string data. Another is to use an upper-case character table to avoid pre-upping strings in the case-insensitive mode.

A much faster StringReplace can be nice, but before you jump in, consider if StringReplace really is the right tool for the job at hand. As Martin James put it in the delphi.non-tech newsgroup:

"In anything other than trivial cases with small strings, I always use a char-by-char state machine anyway, certainly with HTML text parsing/tag handling. In such cases, any 'stringReplace' is hugely inefficient."

My response was:

"Yes, you are right. If you want performance, using StringReplace is most probably the wrong solution (no matter how fast it is). Most of the time you don't want to replace a single string, you want to replace a number of strings/patterns. A custom parser using a state-machine is often the best solution, performance wise. However, such optimizations might be prohibitively expensive to develop for simple problems where StringReplace will suffice."

And on the FastStrings comment from my last blog entry, here are the results on one CPU:

P4 1600A Northwood
StringReplace_EG_MMX_1 8 671 1165 1836 // Eric Grange = 1836 ms
StringReplace_HV_IA32_1 4 2846 5018 7864 // Mine = 7864 ms
StringReplace_DKC_Pas_1 8 4565 4990 9555
FastAnsiReplace 0 4382 9248 13630 // FastStrings version = 13630 ms
StringReplaceRef 0 5523 9514 15037 // Bug-fixed RTL version = 15037 ms
StringReplaceRTL 8 6848 13336 20184 // RTL version = 20184 ms

So even my version kicked FastStrings ReplaceString... <bg>. And even as we speak, those delphi.language.basm folks are now cooperating to improve the blind-test winners few notches more...

Friday, October 22, 2004

StringReplace isn't a speed deamon...

Currently, the Win32 version of StringReplace in the SysUtils unit isn't exactly a speed deamon . It has several problems:

1. It grows the result string bit-by-bit. If many replacements are being done, many reallocations and memory copy operations are going on.

2. It calls AnsiPos which calls AnsiStrPos, which at least calls StrLen twice and StrPos once. StrPos internally recalculates the string lengths again (by scanning for #0). So for every AnsiPos call, the entire source string is scanned twice to find the length. This length is constant and known by StringReplace.

3. It copies the 'rest' part of the source string to a separate SearchStr variable. This is a lot of overhead just to know where in the string you are.

4. AnsiPos/AnsiStrPos also have the overhead of multi-byte support even for locales that don't need it. I have written a suggestion for replacement that includes versions of StrPos and StrScan that take length parameters:

function StrLPos(const Str1, Str2: PChar; Len1, Len2: integer): PChar; 

function StrLScan(const Str: PChar; Chr: Char; Len: integer): PChar;

Next I wrote a new AnsiStrLenPos that takes lengths as parameters:

function AnsiStrLenPos(Str, SubStr: PChar; L1, L2: Cardinal): PChar; 

This new version uses StrLPos to quickly find the next position without scanning for the length. It also optimizes the fairly common case when the substring length is 1 by calling StrLScan instead. By explicitly checking for FarEast (implying a multibyte character set), we can speed up the normal case. There are no case-sensitve code here (we're upping the strings for that case), so the plain StrLScan and StrLPos should do fine.

I have uploaded my faster version of StringReplace to CodeCentral: http://codecentral.borland.com/codecentral/ccWeb.exe/listing?id=22677

An earlier version of WebBroker on Steroids used this faster StringReplace, markedly speeding up the template processing (while reducing memory consumption and fragmentation). A later version dropped using StringReplace for a faster algorithm using custom parsing.

Notice that the dedicated assembly hackers over in the borland.public.delphi.langauge.basm newsgroup have a FastCode competition going (one amongst many) to write the fastest implementation of StringReplace. The winner here will probably be orders of magnitude faster than my version ;).

One day these optimized versions of the Delphi RTL routines might end up in the next version of your favorite developement tool.

Wednesday, October 13, 2004

The Delphi 2005 launch in Oslo was a blast!

More than 140 people showed up for a breath-taking demonstration of some of the great new features of Delphi 2005 (yes, that is the official product name). Jason Vokes covered the IDE, Unit tests integration, Error Insight, Help Insight, SyncEdit, Refactoring, ASP.NET, Borland Data Providers, and much more. Henrik Jondell demonstrated ECO II, Database reversing, Model driven development and more.

Yours truly got 15 minutes to show some slides on me, Infront AS, how we use Delphi, the most productivity enhancing new Delphi features and what's new in the Delphi language for D8, D2005 Win32, D2005 .NET and in the future.

You can download my slides and some of the demo code I didn't have time to show here: http://cc.borland.com/ccWeb.exe/listing?id=22630.

John Kaster has uploaded his BorCon Delphi 2005 slides here: http://cc.borland.com/ccWeb.exe/listing?id=22324

They contain much more detail and cover much more ground (this is were I got some of the future language stuff) - recommended!

Copyright © 2004-2007 by Hallvard Vassbotn