Friday, March 24, 2006

Hack #8: Explicit VMT calls

To accommodate the COM binary protocol in the pre-interface Delphi 2 era, all user-defined virtual methods have zero or positive VMT offsets. This also means that TObject-defined virtual methods have negative VMT offsets. In addition, the VMT contains a number of “magic” fields that support features such as the parent class link, instance size, class name, dynamic method table, published methods table, published fields table, RTTI table, initialization table for magic fields, the deprecated OLE Automation dispatch table and the implemented interfaces table.

There are a number of integer offset vmtXXX constants in System.pas (many of which have been marked deprecated in favour of the BASM VMTOFFSET directive) that document how the compiler lays out the VMT in memory. If we want to write code that accesses these fields directly (as opposed to using the documented APIs consisting of TObject methods and TypInfo routines), it is probably more useful to define a record structure that matches the fixed part of the VMT. Ray Lischner wrote such a record for his Secrets of Delphi 2 and Delphi in a Nutshell books – here is my quickly hacked up version:

type
  PClass = ^TClass;
  PSafeCallException = function (Self: TObject; ExceptObject:
    TObject; ExceptAddr: Pointer): HResult;
  PAfterConstruction = procedure (Self: TObject);
  PBeforeDestruction = procedure (Self: TObject);
  PDispatch = procedure (Self: TObject; var Message);
  PDefaultHandler = procedure (Self: TObject; var Message);
  PNewInstance = function (Self: TClass) : TObject;
  PFreeInstance = procedure (Self: TObject);
  PDestroy = procedure (Self: TObject; OuterMost: ShortInt);
  PVmt = ^TVmt;
  TVmt = packed record
    SelfPtr           : TClass;
    IntfTable         : Pointer;
    AutoTable         : Pointer;
    InitTable         : Pointer;
    TypeInfo          : Pointer;
    FieldTable        : Pointer;
    MethodTable       : Pointer;
    DynamicTable      : Pointer;
    ClassName         : PShortString;
    InstanceSize      : Longint; // the size itself, not a pointer to it
    Parent            : PClass;
    SafeCallException : PSafeCallException;
    AfterConstruction : PAfterConstruction;
    BeforeDestruction : PBeforeDestruction;
    Dispatch          : PDispatch;
    DefaultHandler    : PDefaultHandler;
    NewInstance       : PNewInstance;
    FreeInstance      : PFreeInstance;
    Destroy           : PDestroy;
    {UserDefinedVirtuals: array[0..999] of procedure;}
  end;
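As a cross-check, the same magic fields can also be read through the vmtXXX constants from System.pas, without any record at all. The following is only a sketch, assuming a 32-bit Win32 Delphi compiler of the D7 to D2007 vintage; TSample and VmtSlot are hypothetical names introduced for the illustration.

```pascal
program VmtConstants;
{$APPTYPE CONSOLE}
{$Q-,R-} // the raw address arithmetic below relies on wrap-around Integer math

type
  // Hypothetical sample class, only here so there is something to inspect
  TSample = class
    FValue: Integer;
  end;

// Returns the address of the VMT field at the given (negative) offset.
// Assumes 32-bit pointers, so this is Win32-only.
function VmtSlot(AClass: TClass; Offset: Integer): Pointer;
begin
  Result := Pointer(Integer(AClass) + Offset);
end;

var
  ClsName: PShortString;
  Size: Longint;
  Parent: TClass;
begin
  ClsName := PShortString(PPointer(VmtSlot(TSample, vmtClassName))^);
  Size := PLongint(VmtSlot(TSample, vmtInstanceSize))^;
  // vmtParent holds a pointer to the parent's class reference (nil for TObject)
  Parent := TClass(PPointer(PPointer(VmtSlot(TSample, vmtParent))^)^);

  // Each raw read should agree with the documented TObject class method
  Writeln(ClsName^, ' = ', TSample.ClassName);
  Writeln(Size, ' = ', TSample.InstanceSize);
  Writeln(Parent.ClassName, ' = ', TSample.ClassParent.ClassName);
end.
```

If the two columns ever disagree, the layout assumptions no longer hold for the compiler in use, and a hand-written record built on the same assumptions should not be trusted either.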

Given this definition of the VMT, we can write the following functions to obtain a PVmt from a class or instance reference:

function GetVmt(AClass: TClass): PVmt; overload;
begin
  Result := PVmt(AClass);
  Dec(Result);
end;

function GetVmt(Instance: TObject): PVmt; overload;
begin
  Result := GetVmt(Instance.ClassType);
end;

Very simple. Let's write some test code to exercise these functions and the TVmt record. First we define a simple class that overrides all TObject virtuals and adds a couple of user-defined virtual methods:

type
  TMyClass = class
    function SafeCallException(ExceptObject: TObject;
      ExceptAddr: Pointer): HResult; override;
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
    procedure Dispatch(var Message); override;
    procedure DefaultHandler(var Message); override;
    class function NewInstance: TObject; override;
    procedure FreeInstance; override;
    destructor Destroy; override;
    procedure MethodA(var A: integer); virtual;
    procedure Method; virtual;
  end;

The implementation of these methods simply writeln the ClassName and method name before calling the inherited implementation, and is not included here. Now we can write a test method that calls all the virtual methods explicitly through the obtained VMT pointer.

procedure Test;
var
  Instance: TMyClass;
  Instance2: TMyClass;
  Vmt: PVmt;
  Msg: Word;
begin
  Instance := TMyClass.Create;
  Vmt := GetVmt(Instance);
  Writeln('Calling virtual methods explicitly through an obtained'+
    ' VMT pointer (playing the compiler):');
  writeln(Vmt.Classname^);
  Vmt^.SafeCallException(Instance, nil, nil);
  Vmt^.AfterConstruction(Instance);
  Vmt^.BeforeDestruction(Instance);
  Msg := 0;
  Vmt^.Dispatch(Instance, Msg);
  Vmt^.DefaultHandler(Instance, Msg);
  Instance2 := Vmt^.NewInstance(TMyClass) as TMyClass;
  Instance.Destroy;
  Vmt^.Destroy(Instance2, 1);
  readln;
end;

Running this test code produces the following output:
TMyClass.NewInstance
TMyClass.AfterConstruction
Calling virtual methods explicitly through an obtained VMT pointer (playing the compiler):
TMyClass
TMyClass.SafeCallException
TMyClass.AfterConstruction
TMyClass.BeforeDestruction
TMyClass.DefaultHandler
TMyClass.Dispatch
TMyClass.DefaultHandler
TMyClass.NewInstance
TMyClass.BeforeDestruction
TMyClass.Destroy
TMyClass.FreeInstance
TMyClass.BeforeDestruction
TMyClass.Destroy
TMyClass.FreeInstance

It is interesting to note that explicitly calling through the obtained VMT pointer is actually slightly smaller and faster than the code the compiler generates. The reason is that we’re able to cache the VMT pointer (potentially in a register). For instance, the last two calls to Destroy compile into the following code:

Instance.Destroy;
00408781 B201 mov dl,$01
00408783 8BC6 mov eax,esi
00408785 8B08 mov ecx,[eax]
00408787 FF51FC call dword ptr [ecx-$04]
Vmt^.Destroy(Instance2, 1);
0040878A B201 mov dl,$01
0040878C 8BC7 mov eax,edi
0040878E FF5348 call dword ptr [ebx+$48]

As you can see, the compiler must retrieve the VMT pointer (mov ecx, [eax]) for each virtual method call, while for the explicit Vmt call we have already cached this pointer, so the latter is smaller and faster. In extreme cases you might be able to speed up a loop that contains virtual method calls by using this VMT caching technique.

A cleaner approach is probably to use a procedure pointer variable – this can be done if the virtual method call is on the same instance each time through the loop. If the instance varies through the loop (for instance you need to call the virtual method of all instances in a list), the call must go through each instance’s VMT to dispatch correctly. However, in the special case where you have a guarantee that the collection is homogeneous (all the instances it contains are of exactly the same type), you could use the Vmt pointer caching technique. The minimal performance gains, the increased complexity and the compiler-version-specific hacks it relies on make this technique not very practical in real-world projects, though.
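One way to read the procedure pointer idea is as a method pointer (procedure of object) that is assigned once, outside the loop. The sketch below assumes that interpretation; TCounter, TDoubleCounter and CachedStep are hypothetical names. Assigning Counter.Step to the variable resolves the virtual method at that point, so the loop body no longer loads the VMT.

```pascal
program CachedMethodPointer;
{$APPTYPE CONSOLE}

type
  // Hypothetical classes used only for this illustration
  TCounter = class
    FCount: Integer;
    procedure Step; virtual;
  end;

  TDoubleCounter = class(TCounter)
    procedure Step; override;
  end;

procedure TCounter.Step;
begin
  Inc(FCount);
end;

procedure TDoubleCounter.Step;
begin
  Inc(FCount, 2);
end;

var
  Counter: TCounter;
  CachedStep: procedure of object;
  I: Integer;
begin
  Counter := TDoubleCounter.Create;
  try
    // The virtual lookup happens once, here, when the method pointer is assigned
    CachedStep := Counter.Step;
    for I := 1 to 1000 do
      CachedStep; // indirect call through the stored code address
    Writeln(Counter.FCount); // 2000: the override was the one that got cached
  finally
    Counter.Free;
  end;
end.
```

This only works because the instance is the same on every iteration; the method pointer carries both the code address and that one Self.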

But, nevertheless, it's fun to spelunk in the magic data structures and code generation that the compiler uses to implement our favourite language – don’t you think? :-)

[Updated: Delphi syntax highlighting provided by DelphiDabbler PasH]

Method calls compiler implementation

To support its many language features, the Delphi compiler uses a number of different data structures and machine code generation patterns. To help in understanding how the Delphi language features are implemented, to make it easier to recognize these patterns when debugging at the assembly level, and to make it possible to know what you're doing if you want to run-time patch code, I'll discuss some of these patterns here.

The discussion follows the code patterns used by the D7, D2005 and D2006 [update: and D2007] Win32 compilers, but it should be more-or-less identical for other Win32 versions of Delphi.

Call to non-virtual method

The simplest case is when the compiler generates code for a call to a normal, non-virtual method or global routine. It uses the x86 CALL assembly instruction that takes an immediate relative offset as part of the instruction stream. This means that there is no overhead to calculate or fetch the target address from external memory. The address is part of the instruction encoding and the CPU will perform a perfect target address prediction. For instance:

type
  TFoo = class
    procedure Bar;
  end;

var
  Foo: TFoo;
begin
  Foo := TFoo.Create;
  Foo.Bar;


The generated code for the Foo.Bar call is:


MOV EAX, [Foo]
CALL <Relative offset of TFoo.Bar>


The first instruction loads the implicit self pointer into the EAX register (by default all Delphi code uses the register calling convention). The offset in the CALL instruction is relative to the current IP (Instruction Pointer) register. This means that calls to the same routine from different call-sites are encoded with different relative offsets. The main reason to use offsets instead of the more obvious absolute address is to reduce the need for patching if a module (for instance a DLL) is loaded at another address than its predefined base address.
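The arithmetic behind the relative offset can be sketched in a few lines. On 32-bit x86 the near relative CALL is opcode $E8 followed by a 32-bit displacement, five bytes in all, and the displacement is counted from the address of the following instruction. The program below only builds and decodes those bytes in ordinary variables and never executes them; TNearCall, EncodeCall and CallTarget are hypothetical helpers, and the pointer casts assume a Win32 compiler.

```pascal
program RelativeCall;
{$APPTYPE CONSOLE}
{$Q-,R-} // address arithmetic relies on wrap-around Integer math

type
  // Encoding of a near relative CALL: opcode $E8 plus a 32-bit displacement
  TNearCall = packed record
    OpCode: Byte;
    Displacement: Integer;
  end;
  PNearCall = ^TNearCall;

procedure Target;
begin
end;

// Builds the bytes of "CALL Dest" as they would look if placed at address Site
procedure EncodeCall(Site: PNearCall; Dest: Pointer);
begin
  Site^.OpCode := $E8;
  Site^.Displacement := Integer(Dest) - (Integer(Site) + SizeOf(TNearCall));
end;

// Target = address of the next instruction + displacement
function CallTarget(Site: PNearCall): Pointer;
begin
  Result := Pointer(Integer(Site) + SizeOf(TNearCall) + Site^.Displacement);
end;

var
  SiteA, SiteB: TNearCall;
begin
  EncodeCall(@SiteA, @Target);
  EncodeCall(@SiteB, @Target);
  // Same destination, different call-sites: the encoded offsets differ
  Writeln(SiteA.Displacement <> SiteB.Displacement);
  // Both still decode to the same absolute address
  Writeln(CallTarget(@SiteA) = @Target);
  Writeln(CallTarget(@SiteB) = @Target);
end.
```

The same decoding step is what a run-time patcher has to perform before it can redirect an existing call-site.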



Call to virtual method
Virtual methods are the basis of polymorphism in Delphi. The whole point of polymorphism is that descendent classes may override the method from a base class, implementing specific behaviour. At the implementation level, this means that the compiler can no longer hard-code a specific target address and the CPU no longer has the luxury of being able to perfectly predict the target address.



At the Win32 compiler implementation level, virtual methods are dispatched using a CALL instruction that indirects via the object’s VMT table. Each virtual method has at compile time a unique, static index or VMT offset associated with it. You cannot get at this offset directly using Pascal code, but BASM now has a VMTOFFSET directive to be able to call virtual methods in a clean way. Here is an example of calling a virtual method from BASM:

type
  TMyClass = class
    procedure Method; virtual;
  end;

procedure TMyClass.Method;
begin
  writeln(ClassName, '.Method');
end;

procedure CallMyMethod(Instance: TMyClass);
asm
  MOV ECX, [EAX]
  CALL [ECX + VMTOFFSET TMyClass.Method]
end;

var
  Instance: TMyClass;
begin
  Instance := TMyClass.Create;
  CallMyMethod(Instance);
  readln;
end.

Here is an example of the machine code instructions involved in a virtual dispatch:


 // EAX contains the object instance
MOV ECX, [EAX]    // Get VMT pointer into ECX
CALL [ECX+0x014]  // Virtual dispatch via VMT method slot


In the next blog entry we will delve further into the details of the virtual method table, including a hack that shows how to call through the VMT explicitly. [Updated: Delphi syntax highlighting provided by DelphiDabbler PasH]

Monday, March 20, 2006

Virtual methods and inherited

Delphi has unusually rich language support for polymorphic behaviour. The most straightforward and the one that most programmers will associate with polymorphism, is the virtual method. A virtual method is declared in a base class using the virtual directive:

TShape = class
  procedure Draw(Canvas: TCanvas); virtual;
end;

The base class may or may not have a default implementation for the virtual method. If it doesn’t you mark it abstract, forcing all instantiated descendent classes to override the method.

TRectangle = class(TShape)
  procedure Draw(Canvas: TCanvas); override;
end;

All this is basic stuff that all Delphi programmers know. Depending on the implementation (and documentation!) of the base class, the descendent class may decide to call the base class method before, (more rarely) in the middle of, or after its own implementation (or not at all). There are two ways to call the inherited method, with subtle differences:

procedure TRectangle.Draw(Canvas: TCanvas);
begin
  inherited Draw(Canvas);
  Canvas.Rectangle(FRect);
end;

This will unconditionally call the inherited Draw method in the base class. If the base class method is abstract, this will fail at run-time with an EAbstractError exception, or with RTE (run-time error) 210 if you don't use (the exception system in) SysUtils.

The alternative syntax is to call just "inherited;", like this:

procedure TRectangle.Draw(Canvas: TCanvas);
begin
  inherited;
  Canvas.Rectangle(FRect);
end;

When the parent class method is non-abstract, this works identically to the call above, passing the same parameters that the current routine was passed. This is also the calling pattern inserted by code completion when implementing an overridden method. It is also used by the IDE when inserting event handlers in forms that use visual inheritance.

If the base class method is abstract, or if the base class does not contain the method at all (for non-virtual methods), the “inherited” call becomes a noop (No-Operation). The compiler generates no code for it (and thus you cannot set a breakpoint on it). This mechanism is part of the Delphi language’s excellent version resiliency.
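The no-op behaviour can be tried out with a small console program. This is only a sketch; TBase, TDerived and Run are hypothetical names, and the output goes to the console to keep the example self-contained.

```pascal
program InheritedNoop;
{$APPTYPE CONSOLE}

type
  // Hypothetical base class whose method has no implementation at all
  TBase = class
    procedure Run; virtual; abstract;
  end;

  TDerived = class(TBase)
    procedure Run; override;
  end;

procedure TDerived.Run;
begin
  inherited; // parent method is abstract, so no call is generated here
  Writeln('TDerived.Run');
end;

var
  Obj: TBase;
begin
  Obj := TDerived.Create;
  try
    Obj.Run; // dispatches to TDerived.Run through the VMT
  finally
    Obj.Free;
  end;
end.
```

With the call compiled away as described, the program should print only TDerived.Run. Changing the line to the explicit form, inherited Run, would instead call the abstract method and fail at run-time.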

One caveat with the "inherited;" syntax is that it is not supported for functions. For functions you must use the explicit syntax including the method name and any arguments. For instance:

type
  TMyClass = class
    function MethodC: integer; virtual;
  end;
  TMyDescendent = class(TMyClass)
    function MethodC: integer; override;
  end;

function TMyClass.MethodC: integer;
begin
  Result := 43;
end;

function TMyDescendent.MethodC: integer;
begin
  // inherited; // Error
  // Result := inherited; // Error
  Result := inherited MethodC; // Ok
end;

This might look like an oversight in the Delphi language design, but I think it is deliberate. The rationale behind it is probably that if TMyClass.MethodC is abstract (or made abstract in the future), the Result assignment in the descendent class would be removed, and Result would suddenly have an undefined value. This would certainly cause subtle bugs.

However, I think there is a small hole in the inherited call syntax. In many ways a procedure that takes an out parameter (and in some cases var parameter) behaves like a function returning a result. So in my opinion the inherited; syntax should be prohibited when calling methods with out (and maybe var) parameters. But it isn’t.

procedure TMyDescendent.MethodB(out A: integer);
begin
  inherited;
  Writeln(A);
end;

This means that if the parent class’ method is abstract (or just missing in the case of non-virtual methods), the value of the out parameter will be undefined. In my opinion, the compiler should forbid such inherited; calls, requiring the explicit inherited MethodB(A) syntax. The cat is already out, however, and making this a compile-time error will surely break existing code, so it should probably be made a warning instead.

[Updated: Delphi syntax highlighting provided by DelphiDabbler PasH]

Friday, March 17, 2006

Compiling Delphi code for .NET 2.0

Daniel Wischnewski on how to compile Delphi code for .NET 2.0 using the dccil command line compiler and the --clrversion parameter (and a couple of other tricks).

See the full story here:
Daniel Wischnewski - My Delphi Blog

David I.'s updated Delphi roadmap

David I. recently visited a developer conference in Japan and the opening keynote he held has now been published. It includes an updated roadmap for both BDS and Interbase:

http://bdn.borland.com/article/images/33457/DevConJapanOpeningKeynote.pdf

It also covers information about spinning off DevCo from Borland ALM.

David I.'s sales pitch

I just received this from Borland - it's a Flash video featuring David I. demonstrating the best features of Borland Developer Studio 2006.

http://www.borland.com/media/en/edm/delphi_demo/delphi.html

I even got my name in there somewhere... ;)

Wednesday, March 15, 2006

Polymorphism ad nauseam

Like all object-oriented languages, Delphi (née Object Pascal) supports the concept of polymorphism. Polymorphism is a Greek word that literally means many-shapes or many-forms. In programming-speak it refers to the idea that invoking a conceptual operation is detached from the actual implementation of that operation. In fact, at run-time the actual implemented operation may change radically depending on the type of the object being used to perform the operation.

One of the driving factors for OOP was graphical user-interfaces (GUIs). The archetypical schoolbook example of polymorphism follows the translation slavishly by using an abstract TShape base class with TRectangle and TEllipse descendants that override a virtual Draw routine. The main drawing routine loops over a list of TShape instances, calling the virtual Draw method - each call will be handled by a descendant and draw a rectangle or ellipse (or any other) shape.
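For reference, the schoolbook example looks roughly like this. It is a minimal sketch that writes to the console instead of drawing on a canvas, so the parameterless Draw and the ShapeList array are simplifying assumptions.

```pascal
program ShapesDemo;
{$APPTYPE CONSOLE}

type
  TShape = class
    procedure Draw; virtual; abstract;
  end;

  TRectangle = class(TShape)
    procedure Draw; override;
  end;

  TEllipse = class(TShape)
    procedure Draw; override;
  end;

procedure TRectangle.Draw;
begin
  Writeln('Drawing a rectangle');
end;

procedure TEllipse.Draw;
begin
  Writeln('Drawing an ellipse');
end;

var
  ShapeList: array[0..1] of TShape;
  I: Integer;
begin
  ShapeList[0] := TRectangle.Create;
  ShapeList[1] := TEllipse.Create;
  // The loop only knows about TShape; each call lands in the descendant's Draw
  for I := Low(ShapeList) to High(ShapeList) do
    ShapeList[I].Draw;
  for I := Low(ShapeList) to High(ShapeList) do
    ShapeList[I].Free;
end.
```

The drawing loop never mentions TRectangle or TEllipse, which is what allows new shapes to be added without touching it.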

At its most basic, polymorphism is about changing the target address of a CALL instruction dynamically at runtime. Seen from this angle, there are many mechanisms that can provide polymorphism.

At the Delphi language level:
  • virtual
  • dynamic
  • message
  • procedure pointer
  • event (procedure of object)
  • interface
  • IDispatch

By using dirty hacks:
  • overwrite virtual method pointer
  • overwrite VMT pointer
  • overwrite message/dynamic slot
  • overwrite call-site address
  • overwrite callee instructions
  • overwrite DLL imports slots
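As a small taste of one of the less familiar language-level mechanisms, here is a sketch of message dispatch. TObject.Dispatch reads the message ID from the first two bytes of the record it is given and looks for a method declared with a matching message directive; if none is found, DefaultHandler is called. MSG_PING, TPingMsg and TReceiver are hypothetical names made up for the illustration.

```pascal
program MessageDispatch;
{$APPTYPE CONSOLE}

const
  MSG_PING = 1000; // hypothetical message ID

type
  TPingMsg = packed record
    MsgID: Word;    // Dispatch reads the ID from the first two bytes
    Count: Integer;
  end;

  TReceiver = class
    procedure HandlePing(var Msg: TPingMsg); message MSG_PING;
    procedure DefaultHandler(var Message); override;
  end;

procedure TReceiver.HandlePing(var Msg: TPingMsg);
begin
  Writeln('HandlePing, Count = ', Msg.Count);
end;

procedure TReceiver.DefaultHandler(var Message);
begin
  Writeln('No handler for message ', Word(Message));
end;

var
  Receiver: TReceiver;
  Msg: TPingMsg;
begin
  Receiver := TReceiver.Create;
  try
    Msg.MsgID := MSG_PING;
    Msg.Count := 3;
    Receiver.Dispatch(Msg); // ID matches a message method: HandlePing runs

    Msg.MsgID := MSG_PING + 1;
    Receiver.Dispatch(Msg); // no match: falls through to DefaultHandler
  finally
    Receiver.Free;
  end;
end.
```

The caller never names HandlePing; the target is chosen at run-time from the number stored in the record.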

I'll try to cover some of these in upcoming articles, so stay tuned!

Thursday, March 09, 2006

Verity Stob strikes again

Verity Stob is a legend in satirical computer writing. She has written columns for many magazines, including .EXE, Dr Dobb’s Journal and more recently The Register. Covering all aspects of computing and programming with stinging, half-serious satirical wit, she has also written some (IMO) hysterically funny old-testament style pieces of the more dramatic events in Borland and Delphi’s history.

The first piece appeared in the (now defunct) .EXE magazine in 1996(?) and was republished by The Register here. It covers the rise of Delphi/VCL as a VB-killer and descendent of Turbo Pascal and OWL, and finally Anders Hejlsberg’s move to Microsoft.

The second piece appeared only a few days ago in The Register. Predictably, it mocks the divesting of Delphi, but also covers such events as Dale Fuller’s departure, Tod Nielsen’s management speak, and Delphi 2005 quality issues. Verity also includes nuggets from the community’s non-tech discussions such as "Skype is written in Delphi" and improving the help system by following the PHP’s wiki-lead. She even throws in a Ballmer-inspired David I. for good measure.

The King-James-Bible style is hard to read, but probably fitting considering the religious-like cult of the Delphi community. The British angle and Monty Pythonesque humor are not for the faint-of-heart, but I think we should have enough self-irony to get a good laugh out of it.

I certainly did.

Update: David I. has written a calming and funny email to Ms. Stob, and got a reply!

Update II - New scrolls found
Thanks to my diligent readers, three additional Borland scrolls have been found. Here is the currently known list:

Wednesday, March 08, 2006

Comment moderation turned on

I'm sorry for the inconvenience, but in an attempt to eliminate comment spam I've had to turn on comment moderation for this blog (in addition to the word verification that was already there). This means that whenever you enter a comment, I have to manually approve and publish it. When you click on the "Post a comment" link below, you will be informed about this new policy:

Comment moderation has been enabled. All comments must be approved by the blog author.

Please don't post duplicate comments - your comment will be published soon, as long as it is not spam.

Tuesday, March 07, 2006

Classic Delphi and .NET book in the making

As I have mentioned earlier, I've been working until recently with tech-editing Jon Shemitz' upcoming book .NET 2.0 for Delphi programmers. The book has taken a long time to complete, spanning the timeframe from the .NET DCCIL preview compiler to BDS 2006, from .NET 1.0 to .NET 2.0, and from VS 2002 to VS 2005. Due to print-schedule issues, the first books should be available in June 2006.

I may be biased, having worked on and off with tech editing the book the last two years. In the end, I even contributed Chapter 10 on the new and improved features of the Delphi language since Delphi 7. Nevertheless, I'm one of the few that have read the book's chapters in their entirety (twice!), so I feel entitled to write the first review of it ;).

In general, I very much like Jon's candid way of writing. He doesn't just describe how a system or API is working, he wants to figure out and explain why it works that way and how it compares with other technologies (Win32 and Java for instance). He doesn't talk down to us; reading the book rather feels like discussing the issues with a peer programmer.

I think the book will be very relevant for a large number of programmers. It will be useful to beginner, intermediate and advanced programmers alike. Clearly, the book is primarily targeted at Win32 Delphi programmers that have taken or will take the step into the .NET world, or that just want to have a better understanding of .NET from a Delphi programmer's perspective.

Like .NET itself, the book is primarily language agnostic with example code in C# and Delphi. While the author admits that C# is now his primary .NET language, the right-tool-for-the-job rule still applies. For a quick overview of the sections and chapters in the book, go to the author's book page here.

Conclusion
In my opinion this will be a classic Delphi book, one that should be on almost every Delphi programmer's bookshelf. In many ways it is a more readable, mature and Delphi-friendly version of Jeff Richter's Applied .NET Framework Programming. Note that the book does not cover specific IDE features - instead it gives you a comprehensive and practical understanding of .NET fundamentals, the CLR, the JITer, the memory and threading model, the FCL and the C# and Delphi programming languages.

Highly recommended!

Click here to order your copy at Amazon.

Update:
There have been some misunderstandings about the book's coverage of .NET 2.0 vs Delphi.

While it is true that Delphi does not fully support .NET 2.0 yet, the book is about .NET 2.0 nonetheless. And it is primarily targeted at existing Delphi programmers.

Delphi programmers can still use Delphi to program against .NET 1.1, or they can use C# to program against .NET 2.0. Or they can even use the Delphi command line compiler to compile and run against the .NET 2.0 framework, even though they cannot directly use 2.0 specific features such as generics. (I admit this is a little tricky.)

And when Delphi for .NET 2.0 becomes available, the book will still be highly relevant.

The book is mostly about .NET 2.0 (and it explicitly mentions what is new in 2.0 compared to 1.1), but it is (AFAIK) the only .NET book written from the perspective of a Delphi programmer.

It doesn't contain much IDE specific information (BDS or VS), concentrating more on the .NET foundations, FCL and at the language level of C# *and* Delphi.

The book contains a great deal of Delphi code, both in inline samples and as complete online examples, along with explanations of how to do things in Delphi, how things compare with Win32 Delphi, etc. One chapter (10) is only about the Delphi language changes, for instance.

It will be very relevant for users that want to:

  • Learn about .NET
  • Learn about Delphi for .NET (on a language level)
  • Learn about how to target .NET with Delphi
  • Learn about how to target .NET with C#
  • Use the right tool for the job

So it is a very pragmatic book, describing the world as it is, letting the programmer decide what tools he wants to use in each specific situation.



Copyright © 2004-2007 by Hallvard Vassbotn