Sunday, April 30, 2006

Pure interfaces in Delphi

In a comment on the recent interface-list blog post Huseyn asks:

In Delphi we expect all interfaces be descendants of IUnknown [or IInterface - HV], but in [the] C world there are many even more basic interfaces which are not inherited from IUnknown. I came across one of them in a DLL which I need to use. But I couldn't do it using Delphi.

It is true that a Delphi interface declaration always implicitly inherits from IUnknown or IInterface (to distinguish between COM interfaces and Delphi language interfaces). This means that all interfaces have the three methods QueryInterface, _AddRef and _Release. This makes it hard (or impossible) to implement a pure interface from a non-COM DLL using a Delphi interface declaration.

Back in the dark ages of Delphi 2, there were no explicit interfaces. COM support was built on the fact that the VMT layout matched COM’s binary contract, and “interfaces” were declared using pure abstract base classes. This is how interfaces are still declared in C++, for instance. The class that wanted to implement such an “interface” would simply inherit directly from the pure abstract class, overriding and implementing all the abstract methods. This was a very restrictive way of implementing interfaces because Delphi does not support multiple inheritance of classes: if you wanted to support multiple interfaces, you had to write one class per interface and manually write the QueryInterface method to return the correct interface/object reference. C++ does not have that restriction, which is probably why it doesn’t have a separate language construct for interfaces, in contrast to languages such as Java, Delphi and C# that all have a single-inheritance model.

This little history lesson should give us a clue of how we can avoid the IUnknown methods of normal interfaces – we can simply declare and implement a pure abstract class instead.

For instance:

program TestPureInterface;
{$APPTYPE CONSOLE}
type
TMyPureInterface = class
procedure FirstMethod stdcall; virtual; abstract;
procedure SecondMethod stdcall; virtual; abstract;
end;

TMyImplementation = class(TMyPureInterface)
public
procedure FirstMethod; override;
procedure SecondMethod; override;
end;

procedure TMyImplementation.FirstMethod;
begin
Writeln('TMyImplementation.FirstMethod');
end;

procedure TMyImplementation.SecondMethod;
begin
Writeln('TMyImplementation.SecondMethod');
end;

procedure TestClient(PureInterface: TMyPureInterface);
begin
PureInterface.FirstMethod;
PureInterface.SecondMethod;
end;

var
MyImplementation: TMyImplementation;
begin
MyImplementation := TMyImplementation.Create;
TestClient(MyImplementation);
readln;
end.

This little sample code first declares a pure abstract class with two virtual abstract methods. This is a declaration of the COM-less interface the C-based DLL in question expects to talk to. Note that I assume the calling convention is stdcall, and note the specific ordering and semicolon use the compiler insists upon (which disagrees with Error Insight in Delphi 2006):

    procedure FirstMethod stdcall; virtual; abstract;

Then we implement the “interface” by writing a class that inherits from the abstract base class, implementing the required methods. Notice that the compiler does not require that we repeat the calling convention for the overrides:

    procedure FirstMethod; override;

Finally I’ve written some test code that exercises calling the implemented methods through the “interface” reference, typed as the abstract base class. This corresponds to the C code in the DLL we’re providing the interface implementation for.

Saturday, April 29, 2006

Getting a list of implemented interfaces

Frenk asked in the non-tech newsgroup:

"Is there some way to find out which interfaces (interface list) a particular component implements (I don't know [the interfaces], so querying is not possible)?"

Yes, there is.

Call TObject.GetInterfaceTable to get a pointer to the list of interfaces a specific class implements - see System.pas for details. Note that this does not include interfaces that you implement by overriding QueryInterface manually - but that is not usual for Delphi code.

For example, this code demonstrates how to dump all implemented interfaces of a class:

program TestIntfTable;

{$APPTYPE CONSOLE}

uses
Classes,
SysUtils,
TypInfo,
ComObj;

procedure DumpInterfaces(AClass: TClass);
var
i : integer;
InterfaceTable: PInterfaceTable;
InterfaceEntry: PInterfaceEntry;
begin
while Assigned(AClass) do
begin
InterfaceTable := AClass.GetInterfaceTable;
if Assigned(InterfaceTable) then
begin
writeln('Implemented interfaces in ', AClass.ClassName);
for i := 0 to InterfaceTable.EntryCount-1 do
begin
InterfaceEntry := @InterfaceTable.Entries[i];
writeln(Format('%d. GUID = %s',
[i, GUIDToString(InterfaceEntry.IID)]));
end;
end;
AClass := AClass.ClassParent;
end;
writeln;
end;

begin
DumpInterfaces(TComponent);
DumpInterfaces(TComObject);
DumpInterfaces(TComObjectFactory);
readln;
end.
Output:
Implemented interfaces in TComponent
0. GUID = {E28B1858-EC86-4559-8FCD-6B4F824151ED}
1. GUID = {00000000-0000-0000-C000-000000000046}

Implemented interfaces in TComObject
0. GUID = {DF0B3D60-548F-101B-8E65-08002B2BD119}
1. GUID = {00000000-0000-0000-C000-000000000046}

Implemented interfaces in TComObjectFactory
0. GUID = {B196B28F-BAB4-101A-B69C-00AA00341D07}
1. GUID = {00000001-0000-0000-C000-000000000046}
2. GUID = {00000000-0000-0000-C000-000000000046}

Thursday, April 27, 2006

Published methods

Normally not thought of (or used) as an object-oriented feature, published methods rely on RTTI to enable runtime lookup of methods by using a string with the method name. This is used extensively by the IDE and VCL when you are writing event handlers at design time.

When you create a new event handler (by double-clicking on an empty event value in the Object Inspector) or when you associate an event property with an existing method (by using the drop down or even manually typing in the method name), the IDE ensures that the method’s parameters match the parameters of the event type. Likewise, when you assign an event property in code to a method, the compiler performs a compile-time check that the parameters and calling conventions agree.

At run-time there are no such parameter checks. All design-time assigned events are stored in the .DFM file simply by using the method name string. When a .DFM is loaded at run-time, the method is looked up using the TObject.MethodAddress function – see TReader.FindMethod in Classes.pas for the streaming details.

The TObject.MethodAddress function works its magic by scanning through some compiler magic tables known as a Method Table (or Published Method Table, as I prefer – reducing the possible confusion with the virtual and dynamic method tables).
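
As a minimal sketch of that lookup, the following program asks a class for the address of a published method by name and then maps the address back to its name with TObject.MethodName. The class TDemo and its method Hello are hypothetical names invented for this illustration; only MethodAddress and MethodName come from the RTL.

```pascal
program TestMethodAddress;
{$APPTYPE CONSOLE}

type
  {$M+}
  // Hypothetical demo class; any class with published methods would do
  TDemo = class
  published
    procedure Hello;
  end;
  {$M-}

procedure TDemo.Hello;
begin
  Writeln('TDemo.Hello');
end;

var
  Addr: Pointer;
begin
  // The method is published, so its name is in the published method table
  Addr := TDemo.MethodAddress('Hello');
  Writeln('Hello found: ', Addr <> nil);
  // Reverse lookup: from the code address back to the published name
  Writeln('Name: ', TDemo.MethodName(Addr));
  // A name that is not in the table simply yields nil
  Writeln('Missing found: ', TDemo.MethodAddress('Missing') <> nil);
end.
```

Both functions are class methods, so no instance is needed to query the table.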

Enabling RTTI
By default extended run-time type information (RTTI) for a class is disabled. In contrast to .NET where metadata is generated for all members, in Delphi RTTI is only generated for published members when a class is compiled with the compiler directive {$M+} enabled, or when it inherits from a class that was compiled in $M+ mode (such as TPersistent, TComponent etc). The long-name alternative to {$M+} is {$TYPEINFO ON}. For the purposes of this discussion, I’ll call such classes MPlus classes – all other classes are MMinus classes.

In addition to explicit published members, all members at the top of the class declaration of an MPlus class that have no explicit visibility specifier are treated as published. For MMinus classes, these members are public. This is why all the component field and event handler declarations at the top of form units are published (TForm is an MPlus class).

The compiler allows publishing object and interface reference fields, properties of most types and methods. We’ll mainly focus on published methods in this article. Let’s write a little test program to exercise published members and MPlus and MMinus classes.

program TestMPlus;
{$APPTYPE CONSOLE}
uses Classes, SysUtils, TypInfo;

type
{$M-}
TMMinus = class
DefField: TObject;
property DefProp: TObject read DefField write DefField;
procedure DefMethod;
published
PubField: TObject;
property PubProp: TObject read PubField write PubField;
procedure PubMethod;
end;
{$M+}
TMPlus = class
DefField: TObject;
property DefProp: TObject read DefField write DefField;
procedure DefMethod;
published
PubField: TObject;
property PubProp: TObject read PubField write PubField;
procedure PubMethod;
end;

procedure TMMinus.DefMethod; begin end;
procedure TMMinus.PubMethod; begin end;
procedure TMPlus.DefMethod; begin end;
procedure TMPlus.PubMethod; begin end;

procedure DumpMClass(AClass: TClass);
begin
Writeln(Format('Testing %s:', [AClass.Classname]));
Writeln(Format('DefField=%p', [AClass.Create.FieldAddress('DefField')]));
Writeln(Format('DefProp=%p', [TypInfo.GetPropInfo(AClass, 'DefProp')]));
Writeln(Format('DefMethod=%p', [AClass.MethodAddress('DefMethod')]));
Writeln(Format('PubField=%p', [AClass.Create.FieldAddress('PubField')]));
Writeln(Format('PubProp=%p', [TypInfo.GetPropInfo(AClass, 'PubProp')]));
Writeln(Format('PubMethod=%p', [AClass.MethodAddress('PubMethod')]));
Writeln;
end;

begin
DumpMClass(TMMinus);
DumpMClass(TMPlus);
readln;
end.

A compiler quirk
The purpose of this test program is to verify that RTTI is generated for default and published visibility for MPlus classes and that RTTI is not generated for MMinus classes. We have two classes TMMinus and TMPlus that have identical members but are compiled in different $M modes. We would expect TMMinus to have no RTTI for its members, and TMPlus to have RTTI for all its members.

The DumpMClass routine writes out raw pointer values for the RTTI of the fields, properties and methods in each class. When we run this program, we get this surprising result:

Testing TMMinus:
DefField=00000000
DefProp=00000000
DefMethod=00000000
PubField=008C0A78
PubProp=00000000
PubMethod=00412898

Testing TMPlus:
DefField=008C0AA0
DefProp=00412852
DefMethod=0041289C
PubField=008C0AF4
PubProp=00412874
PubMethod=004128A0

As expected the TMPlus class has RTTI for all of its six members, proving that $M+ enables RTTI and that the default visibility for MPlus classes is published. The strange thing about this result is that the TMMinus class declared with TYPEINFO disabled still has RTTI for two of its members, the explicitly published field and method. This reality contradicts the documentation, which says:

A class cannot have published members unless it is compiled in the {$M+} state or descends from a class compiled in the {$M+} state. Most classes with published members derive from TPersistent, which is compiled in the {$M+} state, so it is seldom necessary to use the $M directive.

This is probably a compiler bug. Notice that the published property didn’t get any RTTI. From the docs (“A class cannot have published members”) it sounds like one should expect a compile-time error (or at least warning) if you try to compile a MMinus class with a published section. But we don’t.

The default visibility of class members is documented like this:

Members at the beginning of a class declaration that don't have a specified visibility are by default published, provided the class is compiled in the {$M+} state or is derived from a class compiled in the {$M+} state; otherwise, such members are public.

This matches what we saw in our experiment. Luckily MMinus class methods and fields with no visibility specifier don’t generate spurious RTTI. In weird cases, you might accidentally have members (fields and methods) in a MMinus class in a published section – these will have RTTI generated for them even if you never intended to use it for anything.

Using published methods polymorphically
While it should probably be viewed as a hack, you can use published methods to implement a very flexible, late-bound polymorphic dispatch mechanism. It is very flexible because the caller and the callee do not have to know about each other or use a common interface. The caller needs to know the name, parameters and calling convention of the method it wants to call, and the callee has to implement this as a published method with the correct name, parameters and calling convention.

To override an existing published method, a descendent class needs to define a new published method with the same name. Because dynamic method lookup first searches the most derived class, this works like a polymorphic lookup.

Let’s look at a simple example:

program TestPolyPub;
{$APPTYPE CONSOLE}
uses Classes, SysUtils, TypInfo, Contnrs;

type
{$M+}
TParent = class
published
procedure Polymorphic(const S: string);
end;
TChild = class(TParent)
published
procedure Polymorphic(const S: string);
end;
TOther = class
published
procedure Polymorphic(const S: string);
end;

procedure TParent.Polymorphic(const S: string);
begin
Writeln('TParent.Polymorphic: ', S);
end;

procedure TChild.Polymorphic(const S: string);
begin
Writeln('TChild.Polymorphic: ', S);
end;

procedure TOther.Polymorphic(const S: string);
begin
Writeln('TOther.Polymorphic: ', S);
end;

function BuildList: TObjectList;
begin
Result := TObjectList.Create;
Result.Add(TParent.Create);
Result.Add(TChild.Create);
Result.Add(TOther.Create);
end;

type
TPolymorphic = procedure (Self: TObject; const S: string);
procedure CallList(List: TObjectList);
var
i: integer;
Instance: TObject;
Polymorphic: procedure (Self: TObject; const S: string);
begin
for i := 0 to List.Count-1 do
begin
Instance := List[i];
// Separate assign-and-call
Polymorphic := Instance.MethodAddress('Polymorphic');
if Assigned(Polymorphic) then
begin
Polymorphic(Instance, IntToStr(i));
// Alternative syntax:
TPolymorphic(Instance.MethodAddress('Polymorphic'))(Instance, IntToStr(i));
end;
end;
end;

begin
CallList(BuildList);
readln;
end.

Here we first define three classes – each with a published method named ‘Polymorphic’ that takes a single string parameter (in addition to the implicit Self parameter) and that uses the default register calling convention. Two of the classes inherit from each other and the TChild class in practice overrides the Polymorphic it inherits from TParent. The TOther class is totally unrelated to the two other classes (well, they all inherit from TObject), but its Polymorphic method can be called “virtually” anyway.

Then we build a heterogeneous list of objects containing instances of each of the three classes. This list is passed to CallList that finds and calls the published Polymorphic method of each instance in the list. The Delphi language has no built-in syntax to call a published method through a name string, so we must manually assign the result of Instance.MethodAddress to a procedural variable and then call through the variable. Alternatively we can combine the operations into a single statement that type casts the MethodAddress result into the correct procedural type and calls through the result. Both syntaxes are demonstrated above.

An interesting feature of calling published methods is you can check at runtime if a specific method is available for an instance or not. This way you can use published methods to implement optional behavior or callbacks. For instance, a generic streaming system could optionally call published BeginStreaming and EndStreaming methods before and after streaming an object instance. Only classes that need to perform special actions would actually implement the methods. Published methods could even be used as a kind of poor-man’s attributes.
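
The optional-callback idea could be sketched roughly like this. The classes TWithHooks and TWithoutHooks, the helper CallIfPublished and the procedural type TStreamingHook are all hypothetical names made up for this illustration, and the sketch assumes the hooks take no parameters and use the default register calling convention.

```pascal
program TestOptionalCallback;
{$APPTYPE CONSOLE}

type
  // Signature the caller assumes for the optional hooks (an assumption)
  TStreamingHook = procedure (Self: TObject);

  {$M+}
  TWithHooks = class
  published
    procedure BeginStreaming;
    procedure EndStreaming;
  end;
  TWithoutHooks = class
  end;
  {$M-}

procedure TWithHooks.BeginStreaming;
begin
  Writeln('  BeginStreaming');
end;

procedure TWithHooks.EndStreaming;
begin
  Writeln('  EndStreaming');
end;

// Calls the named published method if the instance has one.
// Returns False when the class does not publish such a method.
function CallIfPublished(Instance: TObject; const Name: ShortString): Boolean;
var
  Hook: TStreamingHook;
begin
  @Hook := Instance.MethodAddress(Name);
  Result := Assigned(Hook);
  if Result then
    Hook(Instance);
end;

procedure StreamObject(Instance: TObject);
begin
  Writeln('Streaming ', Instance.ClassName);
  CallIfPublished(Instance, 'BeginStreaming');
  // ... the actual streaming work would go here ...
  CallIfPublished(Instance, 'EndStreaming');
end;

var
  A, B: TObject;
begin
  A := TWithHooks.Create;
  B := TWithoutHooks.Create;
  try
    StreamObject(A); // both hooks are found and called
    StreamObject(B); // no hooks published, nothing extra happens
  finally
    A.Free;
    B.Free;
  end;
end.
```

The caller never references either class by type, so a class opts in simply by publishing methods with the agreed names.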

The main disadvantage of this technique is that there is no compile time or runtime checking of method signatures. If you are calling a method with a different calling convention or types and number of parameters, “interesting” things (crashes, corruption) can happen at run-time.

Monday, April 10, 2006

Hack #9: Dynamic method table structure

One of the compiler magic slots in a class’ virtual method table (VMT) is a pointer to that class’ dynamic method table (DMT). A class only has a DMT if it declares or overrides one or more dynamic (or message) methods. The DMT contains a 16-bit (word) Count followed by an array[0..Count-1] of Smallint indices and an array[0..Count-1] of pointers containing the code addresses of the dynamic methods’ implementations. Note that the arrays are “inline” in the DMT structure (there are no pointers to the arrays). One approximate way of representing this structure in Pascal would be:

type
TDMTIndex = Smallint;
PDmtIndices = ^TDmtIndices;
TDmtIndices = array[0..High(Word)-1] of TDMTIndex;
PDmtMethods = ^TDmtMethods;
TDmtMethods = array[0..High(Word)-1] of Pointer;
PDmt = ^TDmt;
TDmt = packed record
Count: word;
Indicies: TDmtIndices; // really [0..Count-1]
Methods : TDmtMethods; // really [0..Count-1]
end;

Because Pascal does not support declaring static array types that vary in size depending on a field, we have to perform some pointer tricks to get at the Methods array. We can now update the declaration of our VMT record structure – we change the DynamicTable field from a generic Pointer to our specific PDmt type:

type
PVmt = ^TVmt;
TVmt = packed record
SelfPtr : TClass;
IntfTable : Pointer;
AutoTable : Pointer;
InitTable : Pointer;
TypeInfo : Pointer;
FieldTable : Pointer;
MethodTable : Pointer;
DynamicTable : PDmt;
ClassName : PShortString;
InstanceSize : PLongint;
Parent : PClass;
SafeCallException : PSafeCallException;
AfterConstruction : PAfterConstruction;
BeforeDestruction : PBeforeDestruction;
Dispatch : PDispatch;
DefaultHandler : PDefaultHandler;
NewInstance : PNewInstance;
FreeInstance : PFreeInstance;
Destroy : PDestroy;
{UserDefinedVirtuals: array[0..999] of procedure;}
end;

Compiler magic routines
The System unit contains a number of RTL magic routines. Btw, I’m not making up the phrases “magic routines” and “compiler magic”. Above the declaration of a number of special routines with names that start with an underscore (which maps to an at sign, @, when compiled) in the System.pas unit you’ll find this comment:

{ Procedures and functions that need compiler magic }

The compiler is hard-coded to find and use these as it is generating code for language features such as strings, dynamic arrays and dynamic methods. They cannot be called explicitly from Pascal – only implicitly by using the language features they implement or explicitly from BASM. As we have seen in a couple of cases, to call a compiler magic routine from BASM you use the syntax CALL System.@MagicName.

The interfaced magic routines that deal with dynamic method dispatching and lookup are:

procedure _CallDynaInst;
procedure _CallDynaClass;
procedure _FindDynaInst;
procedure _FindDynaClass;

There are separate Call and Find routines for instance and class dynamic methods (yes, you can have class level dynamic methods too). The CallDyna routines take a Self (TObject or TClass) parameter in the EAX register and a 16-bit signed Smallint Index parameter in the SI register. Both the CallDyna routines will JMP directly to the dynamic method implementation after finding it. Any parameters the dynamic method in question takes must be assigned to EDX, ECX and pushed to the stack as appropriate. That’s why SI, a register that is not normally used for parameter passing, is used to pass the index.

The two FindDyna routines have no such parameter preserving constraints, so they take a Self (TObject or TClass) in EAX and the Index in EDX, as any normal Register calling convention routine.

All these routines use a common, internal worker routine (GetDynaMethod) that does the actual scanning of the DMT, iterating to scan the parent classes as needed. I was able to reconstruct the TDmt record above by analyzing this code. The implementation uses the fairly efficient REPNE SCASW instruction to quickly scan the array of Smallints for the DMT index.

A debugging tip
If you compile your application with the debug RTL (Project Options | Compiler | [X] Use debug DCUs) – a good idea if you want to get good stack traces from exception stack tracers (such as madExcept or JclDebug) – you might find yourself inside the _CallDynaInst routine if you press F7 to step into the call of a dynamic method. Now you should know why this happens.

procedure       _CallDynaInst;
asm
...
CALL GetDynaMethod
...
JMP ESI
...
end;

To quickly get on to the dynamic method code, you should move the cursor down to the JMP ESI statement, press F4 (Run to Cursor), then press F7 (Step into). Now you’re in the dynamic method proper.

Accessing the DMT from Pascal code
While the compiler and RTL supply all the DMT dispatching and lookup functionality we need, it could be fun to write our own routines that access these arrays. Given the type definitions above, we can write a few worker routines.

function GetDmt(AClass: TClass): PDmt;
var
Vmt: PVmt;
begin
Vmt := GetVmt(AClass);
if Assigned(Vmt)
then Result := Vmt.DynamicTable
else Result := nil;
end;

function GetDynamicMethodCount(AClass: TClass): integer;
var
Dmt: PDmt;
begin
Dmt := GetDmt(AClass);
if Assigned(Dmt)
then Result := Dmt.Count
else Result := 0;
end;

function GetDynamicMethodIndex(AClass: TClass; Slot: integer): integer;
var
Dmt: PDmt;
begin
Dmt := GetDmt(AClass);
if Assigned(Dmt) and (Slot < Dmt.Count)
then Result := Dmt.Indicies[Slot]
else Result := 0; // Or raise exception
end;

function GetDynamicMethodProc(AClass: TClass; Slot: integer): Pointer;
var
Dmt: PDmt;
DmtMethods: PDmtMethods;
begin
Dmt := GetDmt(AClass);
if Assigned(Dmt) and (Slot < Dmt.Count) then
begin
DmtMethods := @Dmt.Indicies[Dmt.Count];
Result := DmtMethods[Slot];
end
else
Result := nil; // Or raise exception
end;

The GetDmt routine returns a pointer to the DMT given a class reference (such as Instance.ClassType). The three other routines return the number of dynamic methods in a class and let us iterate through all the DMT indices and method pointers. Given these we can now write a routine that will dump information about all the dynamic (and message) methods of a class and all its parent classes.

procedure DumpDynamicMethods(AClass: TClass);
var
i : integer;
Index: integer;
MethodAddr: Pointer;
begin
while Assigned(AClass) do
begin
writeln('Dynamic methods in ', AClass.ClassName);
for i := 0 to GetDynamicMethodCount(AClass)-1 do
begin
Index := GetDynamicMethodIndex(AClass, i);
MethodAddr := GetDynamicMethodProc(AClass, i);
writeln(Format('%d. Index = %2d, MethodAddr = %p',
[i, Index, MethodAddr]));
end;
AClass := AClass.ClassParent;
end;
end;

We can also write the Pascal equivalent of System’s BASM GetDynaMethod to find a dynamic method given its DMT index.

function FindDynamicMethod(AClass: TClass; DMTIndex: TDMTIndex): Pointer;
// Pascal variant of the faster BASM version in System.GetDynaMethod
var
Dmt: PDmt;
DmtMethods: PDmtMethods;
i: integer;
begin
while Assigned(AClass) do
begin
Dmt := GetDmt(AClass);
if Assigned(Dmt) then
for i := 0 to Dmt.Count-1 do
if DMTIndex = Dmt.Indicies[i] then
begin
DmtMethods := @Dmt.Indicies[Dmt.Count];
Result := DmtMethods[i];
Exit;
end;
// Not in this class, try the parent class
AClass := AClass.ClassParent;
end;
Result := nil;
end;

Are we having fun yet? ;)


As a silly example we could use this routine to check if a class has any dynamic methods with a specific (negative) index or any message methods that handle a specific message id.

procedure DumpFoundDynamicMethods(AClass: TClass);
procedure Dump(DMTIndex: TDMTIndex);
var
Proc: Pointer;
begin
Proc := FindDynamicMethod(AClass, DMTIndex);
writeln(Format('Dynamic Method Index = %2d, Method = %p',
[DMTIndex, Proc]));
end;
begin
Dump(-1);
Dump(1);
Dump(13);
Dump(42);
end;

Conclusion
While message methods are a very elegant solution to the problem of handling arbitrary Windows messages without having to maintain an unwieldy case-statement, dynamic methods should be shunned. Now you should have a firm grasp of what dynamic methods are, how they work and why you should avoid them.

[Delphi syntax highlighting provided by DelphiDabbler PasH]

Saturday, April 08, 2006

Dynamic methods compiler implementation

In a previous article, we covered how the compiler implements non-virtual and virtual method calls. We have also discussed the rationale and semantics of dynamic methods. You’ll recall that dynamic methods work just like virtual methods, only slower. In this article we’ll dig down into the compiler magic and RTL support that is used to support dynamic methods. Note that most of the mechanics used for dynamic methods are also used by message methods – the only difference is that message methods let the programmer decide the index of the method (the message number, a positive 16-bit number).
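
To make the message-method side of that comparison concrete, here is a small sketch in which the programmer picks the index (42) with the message directive and the method is found through TObject.Dispatch. The names MSG_PING, TPingMessage and TReceiver are hypothetical and exist only for this illustration; the only requirement the sketch relies on is that the message id sits at the start of the record passed to Dispatch.

```pascal
program TestMessageMethod;
{$APPTYPE CONSOLE}

const
  MSG_PING = 42; // hypothetical message number chosen for this sketch

type
  // Dispatch reads the message id from the start of the record
  TPingMessage = record
    MsgID: Cardinal;
    Text: ShortString;
  end;

  TReceiver = class
  public
    // The programmer, not the compiler, decides this method's index
    procedure HandlePing(var Msg: TPingMessage); message MSG_PING;
    procedure DefaultHandler(var Message); override;
  end;

procedure TReceiver.HandlePing(var Msg: TPingMessage);
begin
  Writeln('HandlePing: ', Msg.Text);
end;

procedure TReceiver.DefaultHandler(var Message);
begin
  // Reached when no message method matches the id
  Writeln('No handler for message ', Cardinal(Message));
end;

var
  Receiver: TReceiver;
  Msg: TPingMessage;
begin
  Receiver := TReceiver.Create;
  try
    Msg.MsgID := MSG_PING;
    Msg.Text := 'hello';
    Receiver.Dispatch(Msg); // finds HandlePing through the index 42
    Msg.MsgID := 43;
    Receiver.Dispatch(Msg); // no match: falls through to DefaultHandler
  finally
    Receiver.Free;
  end;
end.
```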

Call to dynamic method
While a non-virtual method encodes the address of the target method directly in the CPU instruction, and a virtual method looks up the address in the VMT using a fixed offset, calling a dynamic method is very different. All dynamic method calls target the same routine – a compiler magic RTL routine in the System unit called _CallDynaInst. This routine takes two parameters – the instance pointer (in EAX) and a 16-bit Smallint selector (in SI).

For instance:

type
TMyClass = class
procedure FirstDynamic; dynamic;
procedure SecondDynamic; dynamic;
end;
// …
var
Instance: TMyClass;
begin
Instance := TMyDescendent.Create;
Instance.FirstDynamic;
Instance.SecondDynamic;
end.

Generates the following code for the two dynamic method calls:

TestDmt.dpr.334: Instance.FirstDynamic;
004096D6 8BC3             mov eax,ebx
004096D8 66BEFFFF         mov si,$ffff
004096DC E8E7A2FFFF       call @CallDynaInst
TestDmt.dpr.335: Instance.SecondDynamic;
004096E1 8BC3             mov eax,ebx
004096E3 66BEFEFF         mov si,$fffe
004096E7 E8DCA2FFFF       call @CallDynaInst

Notice the two different constants loaded into the SI register (the low word of the ESI register): $ffff and $fffe. As it happens these are the binary (or hex) representations of the Smallint values -1 and -2, respectively. So the effect of calling different dynamic methods is to pass different numeric constants to the magic _CallDynaInst helper routine. At compile time the compiler assigns a unique negative number to each dynamic method in a class – this means that you can have no more than 32768 dynamic methods in a class – more than enough for most cases, I should think(!)

When the compiler assigns numeric values to the dynamic methods of a class, it also populates a dynamic method table (DMT), associating the value (aka. selector or DMT index) with the address of the method. The _CallDynaInst routine scans this table at runtime, trying to find a match for the DMT index it was given in the SI register. If it succeeds, it JMPs to the correct address. If not it continues the scan in the parent class’ DMT. If there are no matches it triggers a Run Time Error 210 (which SysUtils converts into an EAbstractError exception).

Calling a dynamic method from BASM
If you find yourself in the (unlikely?) event of having to call a dynamic method from an assembly language routine, you can use the relatively new DMTINDEX directive to get the dynamic method index for a specific method. Let’s first just retrieve the index for a gentle start:

function MyDynamicMethodIndex: integer;
asm
MOV EAX, DMTIndex TMyClass.FirstDynamic
end;



procedure Test;
begin
Writeln(MyDynamicMethodIndex);
end;

Provided we have the TMyClass definition from above this code snippet will output the number -1. Very useful and interesting, right? :-P Let’s go one step further and actually call the method from the assembly code:

procedure CallFirstDynamicMethod(Self: TMyClass);
asm
PUSH ESI // ESI must be preserved for our caller
MOV ESI, DMTIndex TMyClass.FirstDynamic;
CALL System.@CallDynaInst
POP ESI
end;

So the way to call a dynamic method from BASM is to first load the index into ESI using the DMTIndex directive with the full name of the class and method, and then call the magic routine System.@CallDynaInst (the compiler maps the _-prefix of the compiler magic RTL routines into a @-prefix, making them impossible to call explicitly from Pascal code). Notice that _CallDynaInst (and its friend _CallDynaClass) uses the unconventional parameter passing register (E)SI - the reason is that it cannot use the registers that the dynamic method itself may be using for passing parameters (ECX and EDX). In all cases EAX contains the Self pointer.

Note that BASM also supports calling a dynamic method statically, without polymorphic dispatch:

procedure StaticCallFirstDynamicMethod(Self: TMyClass);
asm
CALL TMyClass.FirstDynamic // Static call
end;

But that is normally not what you want.

Speeding up calls to dynamic methods
If you have a time sensitive routine that needs to call a dynamic method inside a long-running loop, you can use a trick to speed it up. Instead of incurring the expensive dynamic dispatch lookup in each iteration, move the loop-invariant lookup outside the loop by assigning the method address to a procedure pointer.

If the instance reference stays the same throughout the loop, you can use a procedure of object variable.

procedure SlowDynamicLoop(Instance: TMyClass);
var
i: integer;
begin
for i := 0 to 1000000 do
Instance.FirstDynamic;
end;

procedure FasterDynamicLoop(Instance: TMyClass);
var
i: integer;
FirstDynamic: procedure of object;
begin
FirstDynamic := Instance.FirstDynamic;
for i := 0 to 1000000 do
FirstDynamic;
end;

Here we have optimized the loop by moving the dynamic method lookup outside the loop. If the algorithm runs through a list of different instances, and you can guarantee that the list is homogeneous, you can use a procedure variable and explicitly pass the Self instance pointer:

procedure SlowDynamicListLoop(Instances: TList);
var
i: integer;
Instance: TMyClass;
begin
for i := 0 to Instances.Count-1 do
begin
Instance := Instances.List[i];
Instance.FirstDynamic;
end;
end;

procedure FasterDynamicListLoop(Instances: TList);
var
i: integer;
Instance: TMyClass;
FirstDynamic: procedure(Self: TObject);
begin
FirstDynamic := @TMyClass.FirstDynamic;
for i := 0 to Instances.Count-1 do
begin
Instance := Instances.List[i];
Assert(TObject(Instance) is TMyClass);
FirstDynamic(Instance);
end;
end;

In assert-mode we check that our assumption holds. In fact, this optimization would work even in cases where you have a heterogeneous list of TMyClass objects, as long as none of the subclasses override the dynamic method. We can check that like this:

function TMyClassFirstDynamicNotOverridden
(Instance: TMyClass): boolean;
var
FirstDynamic: procedure of object;
begin
FirstDynamic := Instance.FirstDynamic;
Result := TMethod(FirstDynamic).Code = @TMyClass.FirstDynamic;
end;

procedure FasterDynamicListLoop2(Instances: TList);
type
PMethod = TMethod;
var
i: integer;
Instance: TMyClass;
FirstDynamic: procedure (Self: TObject);
begin
FirstDynamic := @TMyClass.FirstDynamic;
for i := 0 to Instances.Count-1 do
begin
Instance := Instances.List[i];
Assert(TObject(Instance) is TMyClass);
Assert(TMyClassFirstDynamicNotOverridden(Instance));
FirstDynamic(Instance);
end;
end;

Note that these optimizations are normally not very useful. Well-designed software should probably not use dynamic methods in the first place, and it should definitely not use dynamic methods for time critical operations. Nevertheless, in the rare case where you must call a 3rd party dynamic method in a loop, you now know how you can optimize such loops.

In a later article we’ll dig even further, exposing the structure of the DMT.


Tuesday, April 04, 2006

Dynamic methods and inherited

In an earlier blog post we covered how virtual methods and inherited calls work. In Delphi there is another kind of polymorphic method, the dynamic method. Note that this polymorphism series targets only the native Win32 platform, but it is worth mentioning that in Delphi for .NET, dynamic methods are actually identical to virtual methods. In Win32, message methods use the same underlying compiler structures and dispatch mechanism as dynamic methods, while in .NET an attribute- and reflection-based solution is used. We’ll cover message methods in a later article.

While message methods can be very useful, dynamic methods were originally created to work around data segment size issues in 16-bit Windows and DOS. They first appeared in Turbo Pascal for Windows and later in Borland Pascal 7.0 (which targeted real-mode DOS, protected-mode DOS and 16-bit Windows). The problem they solved was the total size of all classes’ VMTs. In those days the VMT structures were crammed together in the 64 kB global data segment. If you have a base class with a large number of virtual methods, and a large number of descendant classes that only override a few of these methods, there will be some “wasted” space in the VMTs. This is because the non-overridden methods will still have a slot in each descendant’s VMT – and all those slots will point to the base class methods. Now, with dynamic methods, the compiler instead builds a kind of sparse array for each class – the dynamic method table (DMT) – that is referenced from the VMT. Only newly introduced or overridden dynamic methods take up space in each class’ DMT. In a large class hierarchy like Turbo Vision, OWL or VCL with many descendants (think TComponent and TControl), using dynamic methods can save some space. In the days of the 64 kB data segment limit, this was crucial.

Nowadays, their usefulness is much more limited. In Win32 Delphi, the VMT and DMT structures are stored in the code segment, not in the data segment, and there are no size limits (well, 2 GB). Calling a dynamic method can be significantly slower than calling a virtual method, and the space savings are minuscule compared with the typical size of a VCL application. In fact, if you don’t have a large class hierarchy or if most methods are overridden, dynamic methods will create larger and slower code, not smaller. So the general advice is to avoid declaring dynamic methods in your classes – use virtual methods instead.
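As a rough illustration of the trade-off, the sketch below declares one virtual and one dynamic method side by side. TBase, TDescendant and the method names are hypothetical, and the comments on table layout and lookup summarize the description above and my understanding of the Win32 dispatch; the program does not measure them.

```pascal
program VirtualVersusDynamic;
{$APPTYPE CONSOLE}

type
  // Hypothetical base class: both methods are polymorphic,
  // only the dispatch mechanism differs
  TBase = class
    // virtual: occupies a slot in the VMT of TBase and of every descendant
    function Paint: string; virtual;
    // dynamic: occupies a DMT entry only in classes that introduce
    // or override it
    function Clicked: string; dynamic;
  end;

  TDescendant = class(TBase)
    function Paint: string; override;
    function Clicked: string; override;
  end;

function TBase.Paint: string;
begin
  Result := 'TBase.Paint';
end;

function TBase.Clicked: string;
begin
  Result := 'TBase.Clicked';
end;

function TDescendant.Paint: string;
begin
  Result := 'TDescendant.Paint';
end;

function TDescendant.Clicked: string;
begin
  Result := 'TDescendant.Clicked';
end;

var
  Obj: TBase;
begin
  Obj := TDescendant.Create;
  try
    // Virtual call: an indexed lookup in the VMT
    Writeln(Obj.Paint);
    // Dynamic call: a run-time search of the DMT of the class
    // and, if needed, of its ancestors
    Writeln(Obj.Clicked);
  finally
    Obj.Free;
  end;
end.
```

From the caller's point of view the two calls look and behave the same, so switching a method from dynamic to virtual is normally a one-word change in the declaration.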

OK, with that little history lesson under our belts, we’re ready to dive into the dynamic method semantics. Well, it’s not much of a dive, actually. The semantics are identical to those of virtual methods, both with regard to declaring and overriding methods and to using the implicit “inherited;” syntax vs. the explicit “inherited MethodName;” syntax.

But for completeness – here is the short story:

  TShape = class
    procedure Clicked; dynamic;
  end;

This declares a new dynamic method. As indicated above, you should think twice before doing this. At least make sure that the method you make dynamic is not used in a performance-sensitive routine. A routine called as part of UI handling is probably OK – in this case handling a mouse-click on the shape object.

  TRectangle = class(TShape)
    procedure Clicked; override;
  end;

  procedure TRectangle.Clicked;
  begin
    inherited Clicked;
    ShowMessage('Ouch!');
  end;

This will unconditionally call the inherited Clicked method in the base class. If the base class method is abstract, this will fail at run-time with an EAbstractError exception.

The alternative syntax is to call just "inherited;" – like this:

  procedure TRectangle.Clicked;
  begin
    inherited;
    ShowMessage('Ouch!');
  end;

When the parent method is non-abstract, this works identically to the explicit form above, passing along any parameters that the current routine was passed. If the base class method is abstract, the “inherited” call becomes a no-op. The compiler generates no code for it (and thus you cannot set a breakpoint on it).
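The difference between the two forms only shows up when the base method is abstract. The following console sketch contrasts them; the class names TExplicitRect and TImplicitRect and the helper ClickRaisesAbstractError are made up for this example, and Writeln stands in for ShowMessage. It assumes the Delphi compiler behaviour described above; other Object Pascal compilers may reject the explicit call to an abstract method at compile time.

```pascal
program InheritedAbstractDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils; // EAbstractError

type
  TShape = class
    procedure Clicked; dynamic; abstract;
  end;

  // Hypothetical descendant using the explicit syntax
  TExplicitRect = class(TShape)
    procedure Clicked; override;
  end;

  // Hypothetical descendant using the bare syntax
  TImplicitRect = class(TShape)
    procedure Clicked; override;
  end;

procedure TExplicitRect.Clicked;
begin
  inherited Clicked; // calls the abstract base method
  Writeln('TExplicitRect: Ouch!');
end;

procedure TImplicitRect.Clicked;
begin
  inherited; // no code is generated for an abstract base method
  Writeln('TImplicitRect: Ouch!');
end;

// Calls Clicked, reports whether EAbstractError was raised,
// and frees the shape afterwards
function ClickRaisesAbstractError(Shape: TShape): Boolean;
begin
  Result := False;
  try
    try
      Shape.Clicked;
    except
      on EAbstractError do
        Result := True;
    end;
  finally
    Shape.Free;
  end;
end;

begin
  Writeln('inherited Clicked; raises EAbstractError: ',
    ClickRaisesAbstractError(TExplicitRect.Create));
  Writeln('inherited; raises EAbstractError: ',
    ClickRaisesAbstractError(TImplicitRect.Create));
end.
```

Under those assumptions the first call should report an abstract error before its own message is written, while the second simply runs the rest of the method.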

In an upcoming article we’ll dig deeper into the inner workings of dynamic methods and the DMT, including a tip on how you can speed up code that needs to call a dynamic method in a performance-sensitive loop.


Saturday, April 01, 2006

Book review: The Pragmatic Programmer

After finishing tech editing and reviewing Jon Shemitz’ excellent upcoming .NET 2.0 for Delphi programmers, I’ve got some more spare time to write blog articles and to read. Thanks to David Cummins and Pearson Publishing Group, we at Infront now have a small library of high-quality tech books, ranging from C# and .NET, to C++, security, hacking, methodology, science and more (the Delphi side we already had covered).

I often find myself reading two or three books in parallel, but one of the first books I finished was Andrew Hunt and David Thomas’ already legendary book The Pragmatic Programmer. As the title indicates, this is a very pragmatic book, for programmers who want to keep improving their work. In contrast to so many other methodology books that want to force you to follow a specific model of requirements gathering, specifications, design, coding, testing and deployment, this book drops all the self-righteous fluff. In many ways the advice in the book follows the now popular trends of eXtreme Programming and an Agile development process.

A few of the key concepts and pieces of advice in the book are:

  • DRY – Don’t Repeat Yourself – don’t duplicate code, logic, structure, documentation etc. Use code generation techniques where needed to mechanically generate files from a common master-template (for instance, generate a C++ header file, a Delphi import unit and HTML documentation from a single common template).
  • Understand the lower-level subsystems you talk to and depend on. Don’t use wizards that generate code unless you know how that code works. Don’t depend on mysterious magic. Know what you are doing.
  • Focus on algorithmic performance, not micro-optimizations. Know how to estimate Big-Oh performance of an algorithm.
  • Use and know productive tools such as IDEs, source control, bug and requirements tracking, testing frameworks etc.

The book covers much more ground than this, keeping an easy-to-read, approachable and down-to-earth style. It reinforced some of our existing work habits, reminded us about things we could do better and gave us a few tips for future exploration and experimentation to try to improve our development process.

This book is highly recommended!



Copyright © 2004-2007 by Hallvard Vassbotn