Wednesday, January 30, 2008

TDM#4: Delphi 4 Bugs and Fixes

Delphi 4, released in the summer of 1998, was one of the most notorious Delphi releases ever. The initial release contained  a large number of serious bugs, and it later became clear that the release date had been pushed by management and/or marketing and not sanctioned by technical and R&D.

While all of this is water under the bridge, one very visible bug, the so-called TListBox ItemIndex bug, called for an interesting patching technique that has since been employed numerous times (by me and others) to fix RTL and VCL issues at runtime.

Dave Jewell and myself both contributed to the TDM article named Delphi 4 Bugs and Fixes, published in the September 1998 issue. I have Dave's permission to publish our joint effort here [Thanks, Dave!].

The gist of the problem was:

"Cause
With the release of Delphi4, Inprise changed the implementation of TCustomListBox’s ItemIndex property. The implementation of the access methods in Delphi 3 was [like this]:

function TCustomListBox.GetItemIndex: Integer;
begin
Result := SendMessage(Handle, LB_GETCURSEL, 0, 0);
end;

function TCustomListBox.GetSelCount: Integer;
begin
Result := SendMessage(Handle, LB_GETSELCOUNT, 0, 0);
end;

This was changed in Delphi 4 build 5.3 into [this]code:

function TCustomListBox.GetItemIndex: Integer;
begin
if not MultiSelect then
Result := SendMessage(Handle, LB_GETCARETINDEX, 0, 0) else
Result := SendMessage(Handle, LB_GETCURSEL, 0, 0);
end;

procedure TCustomListBox.SetItemIndex(Value: Integer);
begin
if GetItemIndex <> Value then
if MultiSelect then
SendMessage(Handle, LB_SETCARETINDEX, Value, 0) else
SendMessage(Handle, LB_SETCURSEL, Value, 0);
end;

The change has an obvious bug, in TCustomListBox.GetItemIndex the if statement should have been:

  if MultiSelect then 

instead of:

  if not MultiSelect

"


And indeed, that is how the method looks like in current Delphis. The solution to the problem was:


"a small unit that patches the code in TCustomListBox.GetItemIndex at runtime. It can handle both statically linked and packaged VCL.

unit FixupLB;
// Simple unit that patches the StdCtrls.TCustomListBox.GetItemIndex
// method in Delphi 4.0 (build 5.37)
//
// July 1998, by Hallvard Vassbotn (hallvard@falcon.no)

interface

implementation

uses
Windows,
TypInfo,
StdCtrls;

type
PPGetItemIndex = ^PGetItemIndex;
PGetItemIndex = ^TGetItemIndex;
TGetItemIndex = packed record
PUSH_EBX : byte;
MOV_EBX_EAX : word;
CMP_BYTE_PTR : word;
ADDR_Offset : Cardinal;
False_ZERO : byte;
JNZ : byte;
ELSE_REL_ADDR : byte;
PUSH_LParam : word;
PUSH_WParam : word;
PUSH_Msg : byte;
Msg_Const : Cardinal;
MOV_EAX_EBX : word;
CALL : byte;
end;
PGetItemIndexInPackage = ^TGetItemIndexInPackage;
TGetItemIndexInPackage = packed record
JMP_DWORD: word;
ActualGetItemIndex : PPGetItemIndex;
end;
TBytes = array[0..3] of byte;

const
OpCode_JNZ = $75;
OpCode_JZ = $74;
OpCode_JMP_DWORD = $25FF;

function IsBuggyCode(var GetItemIndex: PGetItemIndex): boolean;
var
GetItemIndexInPackage: PGetItemIndexInPackage absolute GetItemIndex;
begin
// Verify that this is a static method
Result := (TBytes(GetItemIndex)[3] < $FE);
if Result then
begin
// Handle the case when TCustomList.GetItemIndex is in a package
if (GetItemIndexInPackage^.JMP_DWORD = OpCode_JMP_DWORD) then
GetItemIndex := GetItemIndexInPackage^.ActualGetItemIndex^;
with GetItemIndex^ do
begin
Result :=
// The buggy instruction, should have been OpCode_JZ
(JNZ = OpCode_JNZ) and
// Check the other instructions as well, just to be sure
(PUSH_EBX = $53) and
(MOV_EBX_EAX = $D88B) and
(CMP_BYTE_PTR = $BB80) and
(False_ZERO = $00) and
(ELSE_REL_ADDR= $18) and
(PUSH_LParam = $006A) and
(PUSH_WParam = $006A) and
(PUSH_Msg = $68) and
(Msg_Const = $19f) and
(MOV_EAX_EBX = $C38B) and
(CALL = $E8) ;
end;
end;
end;

procedure WriteCodeByte(CodeAddress: pointer; Value: byte);
var
WrittenBytes: Cardinal;
begin
// Must use WriteProcessMemory or VirtualProtect to write to code segment
WriteProcessMemory(GetCurrentProcess, CodeAddress, @Value, SizeOf(Value), WrittenBytes);
end;

type
// Publish ItemIndex to easily get address of GetItemIndex method
TPublishedListBox = class(TCustomListBox)
published
property ItemIndex;
end;

procedure Fix_TCustomListBox_GetItemIndex_Bug;
var
PropInfo : PPropInfo;
GetItemIndex: PGetItemIndex;
begin
// Get the property information for the newly published ItemIndex property
PropInfo := TypInfo.GetPropInfo(TPublishedListBox.ClassInfo, 'ItemIndex');
if Assigned(PropInfo) then
begin
// Get the get-property method address
GetItemIndex := PropInfo^.GetProc;
// Now check that the buggy code is there
if IsBuggyCode(GetItemIndex) then
// Patch in the correction, voilĂ !
// Same as GetItemIndex^.JNZ := OpCode_JZ, without the AV...
WriteCodeByte(@GetItemIndex^.JNZ, OpCode_JZ);
end;
end;

initialization
Fix_TCustomListBox_GetItemIndex_Bug;
end.

...


When you get hold of and install the Delphi 4 update, the fix unit will silently find out that the code in GetItemIndex has changed and cause no harm. However,at that time you should remove the fix unit from the project anyway: it is better not to have self-modifying code in your project if you can avoid it."

The full PDF article and code are available for download.

Saturday, January 26, 2008

TDM#3: The Rise and Fall of TObject

"Consider yourself an Delphi expert? Here is a test question for you: Can you name the 8 methods of TObject that are part of the process of constructing and destroying objects in Object Pascal"

That was the introductionary question of my The Rise and Fall of TObject article published in The Delphi Magazine, July 1998. Now we're in the era of Delphi 3, with Delphi 4 just being released.

The question above is in retrospect a trick-question, because there are not 8 but 10 TObject methods related to construction and destroying objects ;). The article only covers the situation in Delphi 3, but Delphi 4 added two more methods (AfterConstruction and BeforeDestruction) to the mix - this was covered in a separate .txt file included with the code.

The updated list of TObject methods involved in construction and destruction is:

  TObject = class
constructor Create;
procedure Free;
class function InitInstance(Instance: Pointer): TObject;
procedure CleanupInstance;
class function InstanceSize: Longint;
procedure AfterConstruction; virtual;
procedure BeforeDestruction; virtual;
class function NewInstance: TObject; virtual;
procedure FreeInstance; virtual;
destructor Destroy; virtual;
end;

This article was slightly different than most of my other TDM articles (and for that matter, my blog posts) in that it mainly covered basic information that was already available in the documentation. Still, it did dig a little deeper into the implementation details that most other sources.


Here are a couple of excerpts from the article:


"Constructor implementation
Any constructor compiles as if it had been declared something like this:

class function TMyObject.Create(SelfClass: TClass; CalledWithClass: boolean): TMyObject;
var
Self: TMyObject;
begin
if CalledWithClass
then Self := _ClassCreate(SelfClass)
else Self := TObject(SelfClass);

// Actual constructor code goes here

if CalledWithClass then
// Remove exception frame
Result := Self;
end;

The actual compiled code delegates much of the work to a magic System routine called _ClassCreate. This routine is written in assembly, but the logic corresponds to something like this:

function _ClassCreate(aClass: TClass): TObject;
label
ExHandler;
begin
Result := aClass.NewInstance;
Setup special exception frame, exception handler = ExHandler;
Exit;
ExHandler:
except
Result.Free;
raise;
end;
end;

This procedure first creates an object instance by calling the NewInstance class method. Then it manually sets up an exception frame and exception handler. This takes care of calling the Free method and re-raising the exception if something goes wrong in the constructor. Borland probably chose to call the _ClassCreate function instead of compiling the code inline to avoid code bloat."


"The big picture
So now we have a better mental picture of what is going on when objects are created and destroyed. To give you a quick call-diagram of both processes [Updated to show include the Delphi 4 methods - Ed]:

constructor
_CreateClass
NewInstance
GetMem
InitInstance
_AfterConstruction
AfterConstruction

destructor
_BeforeDestruction
BeforeDestruction
_DestroyClass
FreeInstance
CleanupInstance
FreeMem

It easy to see the symmetry between the two operations here. We know that we can override the virtual class methods NewInstance and FreeInstance. This capability can be exploited in many different ways. For example, in Issue 24, Cyril Jandia showed how to use these as a debugging aid in the article “ClassTraps for Delphi and C++ Builder”. In the following section we will see how we can use these methods to take over the actual memory allocation for our object instances."


The full article, the Delphi 4 updates and code are available for download.

Thursday, January 24, 2008

TDM#2: Hooking Heapcheck

Inspired by the apparent popularity if my YAST article, a few months later, in July 1996, I published my second Delphi Magazine article - called Hooking HeapCheck. While the article is mainly irrelevant to 32-bit Delphis, it does show what is possible to achieve with a little hacking.

The gist of the article is to expand a mostly useless, parameterless Delphi 1 memory manager callback-function, HeapCheck, into a general purpose heap statistics tool. The missing parameter values are obtained by peeking into the stack and register contents.

"By checking values on the stack and in the registers, I'm able to determine whether the call is an allocation or a deallocation. For deallocations, I can determine both the pointer value and the size of the block. For allocations, only the block size is available. This is because the HeapCheck callback is called prior to the actual allocation, so the pointer value has not been determined yet."

The technique to peek at the stack contents without resorting to assembly code, is also interesting and is still relevant today:

"We could use some dirty tricks with Ptr, SSeg and SPtr or even used
assembly to look at the stack without POPing any parameters from the
stack. However, there is a cleaner and more elegant solution.

In Delphi, we can declare a procedure as "cdecl", which means that the
so-called C-calling convention should be used when calling and sending
parameters to the routine. This implies that the caller of the routine
PUSHes the parameters on the stack in reversed order and that the caller
is also responsible for cleaning up the stack after the call. The net
effect is that the routine accesses its parameters without changing
the layout of the stack."

BTW, I was wrong in my previous blog post - Delphi 1 *did* have (16-bit) BASM - I use it in the HeapCheck hooking code... :).

unit HeapStat;

{ Heap statistics unit. Supports both non-OOP callbacks and OOP events.

Implemented by using the scarcely documented HeapCheck callback in
the System unit. Delphi 1.0 only!

Written by Hallvard Vassbotn, December 1995 (hallvard@falcon.no) }

interface

{$IFNDEF VER80}
Can only be used with Delphi 1.0
{$ENDIF}

type
THeapOp = (hoAlloc, hoDealloc, hoBigAlloc);
THeapStat = record
ThisOp : THeapOp;
ThisSize : longint;
FreePtr : pointer;
AllocBytes : longint;
AllocCount : longint;
DeallocBytes : longint;
DeallocCount : longint;
DiffBytes : longint;
DiffCount : longint;
BigBytes : longint;
BigCount : longint;
Recursive : boolean;
IsInCallBack : boolean;
end;
THeapStatEvent = procedure(const HeapStat: THeapStat) of object;
THeapStatProc = procedure(const HeapStat: THeapStat);

procedure HeapStatInit(aHeapStatEvent: THeapStatEvent;
aHeapStatProc : THeapStatProc);

function HeapStatDone: boolean;

implementation

uses
SysUtils;

type
PtrRec = record
Ofs, Seg: word;
end;
TProcedure = procedure;

const
HeapStatEvent : THeapStatEvent = nil;
HeapStatProc : THeapStatProc = nil;
SaveHeapCheck : pointer = nil;
HeapCheckMagic : word = 0;
var
HeapStatistics: THeapStat;
LocalHS : THeapStat;

{$IFOPT S+} {$DEFINE STACKCHK} {$S-} {$ENDIF}

{ Note: Stack-checking cannot be turned on here. It would overwrite AX!

The HeapCheck callback doesn't really take any parameters, so we have
to do some checks of registers and stack contents to find out were
we are called from (allocation or deallocation) and what the
block size is.

Use Cdecl calling convention to look at the stack without changing
the stack pointer. }

procedure HeapCheckProc(AllocSize, FreeSeg, FreeSize: word); cdecl; far;
var
FreeOfs : word absolute AllocSize;
AX_, BX_, CX_, ES_, SP_, BP_: Word;
Allocate : boolean;
Deallocate: boolean;
begin
{ Save register values to variables }
asm
MOV AX_, AX
MOV BX_, BX
MOV CX_, CX
MOV ES_, ES
MOV BP_, BP
end;
{ Do some magic checks to see if we are called from NewMemory or DisMemory in WMEM.ASM }
Deallocate := (FreeSeg = ES_) and (FreeOfs = BX_) and (HeapCheckMagic = CX_);
Allocate := (not Deallocate) and (AllocSize = AX_);

with HeapStatistics do
begin
if Allocate then
begin
FreePtr := nil;
ThisSize := AllocSize;
if ThisSize < System.HeapLimit then
begin
ThisOp := hoAlloc;
Inc(AllocBytes, ThisSize);
Inc(AllocCount);
end
else
begin
ThisOp := hoBigAlloc;
Inc(BigBytes, ThisSize);
Inc(BigCount);
end;
end
else
begin
PtrRec(FreePtr).Seg := FreeSeg;
PtrRec(FreePtr).Ofs := FreeOfs;
ThisSize := FreeSize;
ThisOp := hoDealloc;
Inc(DeallocBytes, ThisSize);
Inc(DeallocCount);
end;
DiffBytes := AllocBytes - DeallocBytes;
DiffCount := AllocCount - DeallocCount;

{ Call the callback/event handler, but look out for recursiveness }
if not IsInCallBack then
begin
IsInCallBack := true;
LocalHS := HeapStatistics;
if Assigned(HeapStatProc) then
HeapStatProc(LocalHS);
if Assigned(HeapStatEvent) then
HeapStatEvent(LocalHS);
IsInCallBack := false;
end
else
Recursive := true;
end;

{ Call any previous HeapCheck callback that might be installed.}
if Assigned(SaveHeapCheck) then
TProcedure(SaveHeapCheck);
end;

{$IFDEF STACKCHK} {$UNDEF STACKCHK} {$S+} {$ENDIF}

procedure HeapStatInit(aHeapStatEvent: THeapStatEvent;
aHeapStatProc : THeapStatProc);
begin
if System.HeapCheck <> @HeapCheckProc then
begin
FillChar(HeapStatistics, SizeOf(HeapStatistics), 0);
SaveHeapCheck := System.HeapCheck;
System.HeapCheck := @HeapCheckProc;
HeapCheckMagic := LongRec(HeapCheck).Hi or LongRec(HeapCheck).Lo;
end;
HeapStatEvent := aHeapStatEvent;
HeapStatProc := aHeapStatProc;
end;

function HeapStatDone: boolean;
begin
System.HeapCheck := SaveHeapCheck;
Result := not HeapStatistics.Recursive;
end;

end.

There are not many references to the Delphi 1 HeapCheck variable online, but Ray Lischner mentions it in a newsgroup posting and in his Secrets of Delphi 2 book.


I found the original article script in my archives - it was simply a plain text file, edited with Notepad, as I didn't have access to a proper word processor at the time ;). Of course, Chris Frizelle edited my raw text and produced this beautifully formatted PDF - the code is also available.

Tuesday, January 22, 2008

TDM#1: Yet Another Stack Tracer

"Have you ever had any really hard-to-find bugs in your code? If not, you can skip this article, otherwise you’d better keep on reading!"

The above quote was the enticing introduction to my first full-feature Delphi Magazine article with the ironic title Yet Another Stack Tracer (or YAST for short). It was published in the seventh TDM issue, March 1996. The contents page said:

"YAST: Yet Another Stack Tracer!
Hallvard Vassbotn delves deeply into your programs’ operation to show how stack tracing can help in finding those really nasty bugs and presents a code unit you can plug straight in"

Prior to this I had written and got published a number of smaller Delphi tips, but this (almost) four page article on low-level stack probing to help diagnose run-time problems at customer sites without a debugger present, was my first real article. And as far as I know it was the first article published on stack tracing from a Delphi perspective.

Delphi 1 Nostalgia

When the article was published, Delphi 1 was the latest and greatest, but Borland Pascal 7.0 was still alive and kicking (at least some people were still using it). In those days it was possible to write DOS-based programs in real-mode and protected mode and Windows programs in protected mode. Protected mode meant 16-bit 80286-level protection with segment registers, 16-bit offsets and segment protection descriptors. Routines could be 'near' or 'far' and the size of the generated code of a single unit was limited to 64 kB.

16-bit Stack Tracer

The article and code presents a technique to walk the stack frames on the stack to dump a trace that shows the calls, parameters and local variables that led up to a specific leaf routine where typically an error condition had occurred. As the conclusion in the article said:

"With a stack tracer tool such as the one I’ve presented here, you are better equipped to track down errors and bugs that would otherwise be very difficult to find. A real debugger (like Turbo Debugger) is of course better to use when you are in the development phase, but
for error-reporting at user-sites, automatic logging with a stack trace facility could save your day (and possibly your contract!)."

(Note: Turbo Debugger was a separate purchase at the time - it could do stuff the D1 integrated debugger couldn't).

The central piece of code, the YAST unit, cross-compiled to BP 7.0 (DOS real- and protected-mode) and Delphi1 (16-bit Windows).

unit YAST;

{ Yet-Another-Stack-Tracer

Description: A general call-back based stack-trace utility.

Compiles with Delphi 1.0 and BP 7.0 in real-mode, DPMI and Windows mode.
Requires the ValidPtr unit if compiled for DOS.

Written by Hallvard Vassbotn, January 1996 (hallvard@falcon.no) }

interface

type
PBytes = ^TBytes;
TBytes = array[0..(High(Word)-$f) div sizeof(byte)] of byte;
PWords = ^TWords;
TWords = array[0..(High(Word)-$f) div sizeof(word)] of word;
TStackInfo = record
CallersBP : word;
DumpSize : word;
ParamSize : word;
IsFar : boolean;
ReturnLog : word;
case integer of
1: (CallerAdr : pointer;
ReturnAdr : pointer;
DumpPtr : PBytes;
ParamPtr : PWords);
2: (CallerOfs : word;
CallerSeg : word;
ReturnOfs : word;
ReturnSeg : word;
DumpOfs : word;
DumpSeg : word;
ParamOfs : word;
ParamSeg : word)
end;
TReportStackFrame = function(var StackInfo: TStackInfo; PrivateData: Pointer): boolean;

procedure TraceStack(ReportStackFrame: TReportStackFrame; PrivateData: Pointer);

implementation

uses
{$IFDEF WINDOWS}
{$IFDEF VER80}
WinProcs;
{$ELSE}
Win31;
{$ENDIF}
{$ELSE}
ValidPtr;
{$ENDIF}

type
PtrRec = record
Ofs, Seg : Word;
end;
TFarStackFrame = record
CallersBP : word;
case integer of
1: (CallerAdr : pointer);
2: (CallerOfs : word;
CallerSeg : word)
end;
TNearStackFrame = record
CallersBP : word;
CallerOfs : word;
end;
PStackFrame = ^TStackFrame;
TStackFrame = TFarStackFrame;

function GetSSBPPtr: pointer; inline
($8C/$D2 { MOV DX, SS }
/$89/$E8); { MOV AX, BP }

function LogSeg(Seg: word): word;
begin
{$IFDEF MSDOS}
LogSeg := Seg;
{$ELSE}
if Seg <> 0 then
LogSeg := Word(Ptr(Seg, 0)^)
else
LogSeg := Seg;
{$ENDIF}
end;

procedure CorrectBP(var BP: word);
{ Handle Windows stack frames (i.e. Inc BP in far prolog code) }
begin
if Odd(BP) then Dec(BP);
end;

function IsFarCode(Addr: pointer): boolean;
begin
{$IFDEF WINDOWS}
IsFarCode := not IsBadCodePtr(Addr);
{$ELSE}
IsFarCode := ValidCodePointer(Addr, 1);
{$ENDIF}
end;

function NextStackFrame(var StackFrame: PStackFrame;
var StackInfo : TStackInfo): boolean;
var
More: boolean;
begin
More := (StackFrame^.CallersBP <> 0) and (StackFrame^.CallerAdr <> nil);
if More then
with StackInfo do
begin
CallersBP := StackFrame^.CallersBP;
CorrectBP(CallersBP);
CallerAdr := StackFrame^.CallerAdr;
DumpPtr := Pointer(StackFrame);
DumpSize := (CallersBP - PtrRec(StackFrame).Ofs);
ParamPtr := Pointer(DumpPtr);
ParamSize := DumpSize div 2;
IsFar := IsFarCode(CallerAdr);
if IsFar then
begin
ReturnAdr := CallerAdr;
Dec(ParamSize, SizeOf(TFarStackFrame) div 2);
Inc(ParamOfs , SizeOf(TFarStackFrame));
end
else
begin
ReturnOfs := CallerOfs;
Dec(ParamSize, SizeOf(TNearStackFrame) div 2);
Inc(ParamOfs , SizeOf(TNearStackFrame));
end;
ReturnLog := LogSeg(ReturnSeg);

PtrRec(StackFrame).Ofs := StackFrame^.CallersBP;
CorrectBP(PtrRec(StackFrame).Ofs);
end;
NextStackFrame := More;
end;

procedure TraceStack(ReportStackFrame: TReportStackFrame; PrivateData: Pointer);
var
StackFrame : PStackFrame;
StackInfo : TStackInfo;
begin
FillChar(StackInfo, SizeOf(StackInfo), 0);
StackInfo.ReturnSeg := CSeg;
StackFrame := GetSSBPPtr;
while NextStackFrame(StackFrame, StackInfo) and
ReportStackFrame(StackInfo, PrivateData) do
{Loop};
end;

end.

Notice the kewl inline machine code instructions encoded in hex (the GetSSBPPtr function). When encountering a call to this function, the compiler would just inject the bytes encoded there directly into the instruction streaming of the "calling" routine. You really had to know what you were doing back then. There was no BASM and no inlining of normal routines. [OTOH, it is not possible to write inlined  BASM today (Delphi 2007)... ;)] The inline machine code bytes was the only way to get at the current contents of the SS and BP registers without resorting to an external assembler like TASM (Turbo Assembler, another add-on).


Some three years later, this article and code would form the basics for a slightly more feature-complete 32-bit exception stack tracer, HVEST. In that sense the YAST code provided one of the humble starting seeds for the current JclDebug unit, the open source project that competes with commercial alternatives such as madExcept, Exceptional Magic and EurekaLog.


We'll come back to the HVEST (or Exceptional Stack Tracing) article later. I have made both the whole original YAST article in PDF format and the full code available from my Google Pages storage area.

Republishing my old The Delphi Magazine articles

As you probably know The Delphi Magazine is no longer in publication. As their web-site indicates the articles and source code repository will only be online for a short while longer, but it is still possible to order the excellent value-for-money 1GB USB stick with all magazine issues and code.

TDM was generally accepted as the best source of technical articles about Delphi and development practices in general. Over the years I submitted and got published almost twenty full articles, in addition to a large number of smaller Delphi tips.

I have now been given permission to republish some of my old articles (thanks, Chris [Frizelle]!). In general I will provide download links to the original PDF files with the article and ZIP files with the code. In addition I will try to write a little about the topic, provide the context of the time, updates and comments.

The list of candidate articles to republish is:

  • YAST: Yet Another Stack Tracer!
  • Hooking Heapcheck
  • The Rise And Fall Of TObject
  • Delphi 4 Bugs And Fixes
  • Slimming The Fat Off Your Apps
  • Knitting Your Own Threads
  • Design Patterns
  • DelayLoading of DLLs
  • Exceptional Stack Tracing
  • BorDebug: Return Of The Giant
  • Speeding Up A Mathematical Expression Parser
  • Tightening The Strings
  • WebBroker On Steroids
  • Building Strings With An Attitude
  • Converting From Interface To Object

Let me know if there is any of these you would like me to prioritize.

Hope you will enjoy them! :-)



Copyright © 2004-2007 by Hallvard Vassbotn