Friday, March 24, 2006

Method calls compiler implementation

To support its many language features, the Delphi compiler uses a number of different data structures and machine code generation patterns. I'll discuss some of these patterns here, to help in understanding how the Delphi language features are implemented, to make the patterns easier to recognize when debugging at the assembly level, and to let you know what you're doing if you want to patch code at run time.

The discussion follows the code patterns used by the D7, D2005 and D2006 [update: and D2007] Win32 compilers, but it should be more-or-less identical for other Win32 versions of Delphi.

Call to non-virtual method

The simplest case is when the compiler generates code for a call to a normal, non-virtual method or global routine. It uses the x86 CALL assembly instruction that takes an immediate relative offset as part of the instruction stream. This means that there is no overhead to calculate or fetch the target address from external memory. The address is part of the instruction encoding and the CPU will perform a perfect target address prediction. For instance:

type
  TFoo = class
    procedure Bar;
  end;

var
  Foo: TFoo;
begin
  Foo := TFoo.Create;
  Foo.Bar;


The generated code for the Foo.Bar call is:


MOV EAX, [Foo]
CALL <Relative offset of TFoo.Bar>


The first instruction loads the implicit Self pointer into the EAX register (by default all Delphi code uses the register calling convention). The offset in the CALL instruction is relative to the IP (Instruction Pointer) register, which at that point holds the address of the instruction following the CALL. This means that calls to the same routine from different call-sites are encoded with different relative offsets. The main reason for using offsets instead of the more obvious absolute addresses is to reduce the need for patching when a module (for instance a DLL) is loaded at an address other than its predefined base address.
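To make the relative encoding concrete, here is a minimal sketch of how the absolute target can be recovered from a call-site. It assumes the 5-byte near CALL form (opcode $E8 followed by a 32-bit little-endian offset) and Win32 pointer sizes; NearCallTarget and FakeCall are hypothetical names used only for illustration, and the byte array stands in for real code.

```pascal
program CallTargetSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: given the address of a 5-byte near CALL,
// return the absolute address it calls, or nil if the opcode is not $E8.
// Assumes Win32, where a Pointer fits in an Integer.
function NearCallTarget(CallSite: Pointer): Pointer;
var
  Rel: Integer;
begin
  Result := nil;
  if Byte(CallSite^) <> $E8 then
    Exit; // not a CALL with a 32-bit relative offset
  // The signed 32-bit offset follows the opcode byte
  Rel := Integer(Pointer(Integer(CallSite) + 1)^);
  // The offset is relative to the instruction after the CALL (call-site + 5)
  Result := Pointer(Integer(CallSite) + 5 + Rel);
end;

const
  // Stand-in for real code: CALL with relative offset $00000010
  FakeCall: array[0..4] of Byte = ($E8, $10, $00, $00, $00);

begin
  // Distance from the call-site to the target: 5 + $10 = 21
  Writeln(Integer(NearCallTarget(@FakeCall)) - Integer(@FakeCall));
end.
```

The same arithmetic run in reverse (target minus call-site minus 5) gives the offset you would have to write when patching a call-site at run time.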



Call to virtual method

Virtual methods are the basis of polymorphism in Delphi. The whole point of polymorphism is that descendant classes may override a method from a base class to implement specific behaviour. At the implementation level, this means that the compiler can no longer hard-code a specific target address, and the CPU no longer has the luxury of being able to perfectly predict the target address.
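As a small illustration of why the target cannot be hard-coded, the sketch below routes two different objects through one and the same call-site. TBase, TDerived, Speak and CallSpeak are hypothetical names chosen for this example only.

```pascal
program VirtualSketch;
{$APPTYPE CONSOLE}

type
  TBase = class
    procedure Speak; virtual;
  end;

  TDerived = class(TBase)
    procedure Speak; override;
  end;

procedure TBase.Speak;
begin
  Writeln('TBase.Speak');
end;

procedure TDerived.Speak;
begin
  Writeln('TDerived.Speak');
end;

procedure CallSpeak(Obj: TBase);
begin
  // A single call-site: the method that runs depends on the
  // run-time class of Obj, not on the declared type TBase.
  Obj.Speak;
end;

var
  A, B: TBase;
begin
  A := TBase.Create;
  B := TDerived.Create;
  try
    CallSpeak(A); // prints TBase.Speak
    CallSpeak(B); // prints TDerived.Speak
  finally
    B.Free;
    A.Free;
  end;
end.
```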



At the Win32 compiler implementation level, virtual methods are dispatched using a CALL instruction that indirects via the object’s VMT (Virtual Method Table). Each virtual method has a unique, static index or VMT offset assigned to it at compile time. You cannot get at this offset directly from Pascal code, but BASM now has a VMTOFFSET directive that lets you call virtual methods in a clean way. Here is an example of calling a virtual method from BASM:

type
  TMyClass = class
    procedure Method; virtual;
  end;

procedure TMyClass.Method;
begin
  writeln(ClassName, '.Method');
end;

procedure CallMyMethod(Instance: TMyClass);
asm
  MOV ECX, [EAX]
  CALL [ECX + VMTOFFSET TMyClass.Method]
end;

var
  Instance: TMyClass;
begin
  Instance := TMyClass.Create;
  CallMyMethod(Instance);
  readln;
end.

Here is an example of the machine code instructions involved in a virtual dispatch:


// EAX contains the object instance
MOV ECX, [EAX]    // Get VMT pointer into ECX
CALL [ECX+0x014]  // Virtual dispatch via VMT method slot
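The two memory reads behind those instructions can also be written out in Pascal. This is only a sketch for Win32: MethodVmtOffset and VirtualTarget are hypothetical helpers, and the offset is obtained through a tiny BASM function on the assumption that VMTOFFSET can be used as an immediate operand. It looks up the slot without calling through it.

```pascal
program VmtSlotSketch;
{$APPTYPE CONSOLE}

type
  TMyClass = class
    procedure Method; virtual;
  end;

procedure TMyClass.Method;
begin
  Writeln(ClassName, '.Method');
end;

// Hypothetical helper: VMTOFFSET is resolved at compile time,
// so this simply returns that constant in EAX.
function MethodVmtOffset: Integer;
asm
  MOV EAX, VMTOFFSET TMyClass.Method
end;

// Hypothetical helper mirroring the dispatch sequence in Pascal.
// Assumes Win32, where a Pointer fits in an Integer.
function VirtualTarget(Instance: TObject; SlotOffset: Integer): Pointer;
var
  Vmt: Pointer;
begin
  Vmt := PPointer(Instance)^;                     // MOV ECX, [EAX]
  Result := PPointer(Integer(Vmt) + SlotOffset)^; // read [ECX + offset]
end;

var
  Instance: TMyClass;
begin
  Instance := TMyClass.Create;
  try
    // For an instance of TMyClass itself, the slot is expected to hold
    // the address of TMyClass.Method.
    if VirtualTarget(Instance, MethodVmtOffset) = @TMyClass.Method then
      Writeln('VMT slot points at TMyClass.Method');
  finally
    Instance.Free;
  end;
end.
```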


In the next blog entry we will delve further into the details of the virtual method table, including a hack for calling through the VMT explicitly. [Updated: Delphi syntax highlighting provided by DelphiDabbler PasH]

5 comments:

BlackTigerX said...

welcome back!
very good stuff as always, keep it coming

Anonymous said...

Can you write a post on how I can intercept methods in memory and call something before and after the real method?

And show how we can call virtual methods with parameters?

Anonymous said...

Hi there

I'm attempting to use the CreateDispTypeInfo() (ActiveX.pas) in order
to create, at runtime, a COM type library.

Having a type library for a class, makes it easy to convert an object of
that class to an IDispatch, for scripting purposes.

After that creation it's easy to use the DispGetIDsOfNames() and DispInvoke()
of the ActiveX.pas to carry out the needed IDispatch calls for the object.

The first parameter of the CreateDispTypeInfo() is a TInterfaceData (ActiveX.pas again)
that should be filled by the caller. That is, the caller has to provide information
regarding the methods and properties of the class.

Here are the relevant declarations taken from the ActiveX.pas.

TParamData = record
  szName: POleStr;
  vt: TVarType;
end;

PParamDataList = ^TParamDataList;
TParamDataList = array[0..65535] of TParamData;

TMethodData = record
  szName: POleStr;
  ppdata: PParamDataList;
  dispid: TDispID;
  iMeth: Integer;
  cc: TCallConv;
  cArgs: Integer;
  wFlags: Word;
  vtReturn: TVarType;
end;

PMethodDataList = ^TMethodDataList;
TMethodDataList = array[0..65535] of TMethodData;

TInterfaceData = record
  pmethdata: PMethodDataList;
  cMembers: Integer;
end;


Those methods of the object exposed to automation, should be virtual.
The TMethodData.iMeth is the Index of the method in the virtual method
table. 0, 1, 2 and so on.

Given the above it's relatively easy to provide by hand all the needed
information and pack it to a TInterfaceData record. But of course it is
error prone.

I still use Delphi 7 and I thought I could use the METHOD_INFO directive
in order to instruct the compiler to produce type information for the
class and then use RTTI to get the information I need to fill the TMethodData
structures.

But how do I know if a method is virtual or not?

Here are the structures and the code I use to get a pointer to a method table.

PMethod = ^TMethodRec;
TMethodRec = packed record
  Size : Word;
  Address : Pointer;
  Name : ShortString;
end;

PMethodTable = ^TMethodTableRec;
TMethodTableRec = packed record
  Count : Word;
  FirstEntry : TMethodRec;
end;

pMT := PPointer(Integer(AClassRef) + vmtMethodTable)^;

I see no way to find out if a method is virtual.
Is there any such a way?

Please excuse the long post.

Best regards

Theo Bebekis
Thessaloniki, Greece
teo.bebekis@gmail.com

PS. There is the TObjectDispatch (in ObjComAuto.pas) which converts
a plain object to an IDispatch one. But the ObjectInvoke() (ObjAuto.pas),
the TObjectDispatch uses to carry out the IDispatch.Invoke(),
is far from being perfect. It can not handle many things.
So I'm trying to find an alternative.

Hallvards New Blog said...

> But how do I know if a method is virtual or not?

Generally, you don't ;).

It is the call-site that calls the method that determines if the method will be called virtually or statically. The contents of the method itself do not change.

That said, there is a possible hack that could be used to determine if a specific method address matches a virtual method in the class' VMT or not.

This blog article lists a Delphi definition of a VMT - with the UserDefinedVirtuals field commented out:

{UserDefinedVirtuals: array[0..999] of procedure;}

Quote from another blog post:
"The compiler keeps track of the number of virtual methods in each class (as part of the compile-time class information stored in the .dcu), but the code it generates does not need it, and thus there is no VirtualMethodCount field in the VMT."

But it should be possible to iterate until you get to a value that is outside the code address space of the module.

Anonymous said...

Hi again

thanks for the answer.

I'm an application developer, mostly database applications.
I hardly understand what is going on at a low level.
Anyway I came up with these functions.


function GetObjectVMT(Obj: TObject): Cardinal;
asm
  MOV EAX, Obj
  MOV EDX, [EAX]
  MOV Result, EDX
end;

function IsVirtual(pProc: Pointer; VMTOfs: Cardinal; var MethodIndex: Integer): Boolean;
var
  i : Word;
  SlotAddr : Cardinal;
begin
  Result := False;
  MethodIndex := -1;

  for i := 0 to 92 do { hope we don't exceed any limit here, is that 92 safe? }
  begin
    { slot i lives at VMTOfs + i * 4; VMTOfs itself is left unchanged, }
    { because adding to it on every pass would skip slots from i = 2 on }
    SlotAddr := VMTOfs + Cardinal(i) * 4;
    if pProc = PPointer(SlotAddr)^ then
    begin
      Result := True;
      MethodIndex := i;
      Exit; //==>
    end;
  end;

end;

So, provided that we have, in hand, a list of method addresses
(and we could collect them easily using RTTI)
then it's easy to distinguish which is virtual and which is not
by using the GetObjectVMT() and IsVirtual().

At least they work fine in my tests.

Thanks again for your time.
My best regards.


Theo Bebekis
Thessaloniki, Greece
teo.bebekis@gmail.com



Copyright © 2004-2007 by Hallvard Vassbotn