Thursday, May 11, 2006

Hack #10: Getting the parameters of published methods

This hack is not normally very useful, but inspired by a comment on the first published methods article, I started to investigate how it could be done. Recall that the compiler currently does not encode the method signature when generating RTTI for published methods – only the code address and name string are stored.

So initially, it seems impossible to obtain this information. But let’s backtrack and think about how the IDE handles events and published methods at design-time. If you already have a number of event handlers defined – implemented in a number of published methods on the form – the Object Inspector will filter and show the assignment compatible methods in a drop down list of each component’s events. How does the IDE know what methods it should display in this list – and what methods to filter out?

Well, since each design-time component is compiled into a package and registered with the IDE, the IDE has full access to the RTTI of the component. As we will (probably) see in an upcoming article, each published event property (OnClick, OnSelected etc) is described by the compiler with RTTI that includes information about the parameters of the event. From this information the IDE knows the number and types of parameters an assignment compatible method must have. It also uses the event’s RTTI to build a correct signature when you double click the event to declare and assign a new method to it.

But it still doesn’t have access to any parameter RTTI for the published methods of the form. In fact, it doesn’t have access to a compiled representation of the form at all. It does have a trump card up its sleeve, though; it has full access to the source code of the form. The IDE “simply” parses the form source code and finds methods that have the correct number and types of parameters. This parsing is not perfect and it will not always be able to evaluate alias type declarations, so typically the parameter types used must be a verbatim copy of the types used in the event type declaration.

That doesn’t help us very much – we don’t have access to the form’s source code at runtime. As the omnipresent Anonymous pointed out in his comment to the Published Methods article, there are Delphi decompilers that are able to determine the parameter types of published methods in a form declaration. How do they accomplish that? Well, there are two clues – the form typically contains a list of published fields; these are the component references that the streaming system automatically assigns as it loads a .DFM. These fields have RTTI that includes the class type of the component. In addition the form has a Components array property containing references to all the components and controls owned by the form. By using either of these, we get access to all the components associated with the form.

These components will typically have one or more event properties assigned to methods of the form. If these events were assigned at design time, the methods they point to will be published. All component event properties that can be assigned at design time must also be published. The compiler provides RTTI for such events – including information about the parameters of the event – and thus the parameters of the assignment compatible published method that is assigned to the event.
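
The link described above can be observed in isolation. The following is a minimal sketch, not part of the original unit: TDemo, OnPing and HandlePing are hypothetical names made up for the illustration, and the program assumes a Delphi 6 or later RTL (for the GetPropList overload that takes an instance). It assigns a published method to a published event and then finds the event again by comparing code addresses.

```pascal
program EventLinkSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, TypInfo;

type
  // Hypothetical component, used only for this illustration.
  // TComponent is compiled with $M+, so the published section gets RTTI.
  TDemo = class(TComponent)
  private
    FOnPing: TNotifyEvent;
  published
    procedure HandlePing(Sender: TObject);
    property OnPing: TNotifyEvent read FOnPing write FOnPing;
  end;

procedure TDemo.HandlePing(Sender: TObject);
begin
end;

var
  Demo: TDemo;
  PropList: PPropList;
  Count, i: Integer;
  Handler: TMethod;
begin
  Demo := TDemo.Create(nil);
  try
    Demo.OnPing := Demo.HandlePing;
    // All published properties, including Name and Tag from TComponent.
    Count := GetPropList(Demo, PropList);
    if Count > 0 then
    try
      for i := 0 to Count - 1 do
        // Only event properties (method pointers) are of interest.
        if PropList^[i]^.PropType^^.Kind = tkMethod then
        begin
          Handler := GetMethodProp(Demo, PropList^[i]);
          // Same code address means the event points at the published method.
          if Handler.Code = Demo.MethodAddress('HandlePing') then
            WriteLn(string(PropList^[i]^.Name), ' -> HandlePing');
        end;
    finally
      FreeMem(PropList);
    end;
  finally
    Demo.Free;
  end;
end.
```

Once such a match is found, the event's type information describes the parameters of the method as well, which is the idea the rest of this article builds on.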

Things will be slightly more complicated for a static Delphi decompiler, but the basic chain of information that must be back-tracked is the same. Writing a decompiler is outside the scope of this article (it is left as an exercise for the reader <eg>), but let’s try to write some simple code that can figure out the parameters of all published methods that have been assigned to a published event of an owned component.

The basic algorithm would be something like this:

  • We take an Instance and a TStrings as parameters
  • Loop through all the published methods of the object
  • For each published method
      • Loop through all published events
      • Get the value of each event property
      • If the published method address equals the Code value of the event, we have a link
      • Return the parameter RTTI of the event type – this is the parameters also used for the published method
  • Repeat the above steps for each owned component

That sounds straight-forward enough. Let’s try to turn it into code.

procedure GetPublishedMethodsWithParameters(Instance: TObject;
  List: TStrings);
var
  i : integer;
  Method: PPublishedMethod;
  AClass: TClass;
  Count: integer;
begin
  List.BeginUpdate;
  try
    List.Clear;
    AClass := Instance.ClassType;
    while Assigned(AClass) do
    begin
      Count := GetPublishedMethodCount(AClass);
      if Count > 0 then
      begin
        List.Add(Format('Published methods in %s', [AClass.ClassName]));
        Method := GetFirstPublishedMethod(AClass);
        for i := 0 to Count-1 do
        begin
          List.Add(PublishedMethodToString(Instance, Method));
          Method := GetNextPublishedMethod(AClass, Method);
        end;
      end;
      AClass := AClass.ClassParent;
    end;
  finally
    List.EndUpdate;
  end;
end;

GetPublishedMethodsWithParameters is the top level method that uses the utility routines from the previous article to iterate through all published methods of the instance. It adds a string representation of each method to a TStrings list. The conversion from a published method to a string is delegated to the PublishedMethodToString function.

function PublishedMethodToString(Instance: TObject;
  Method: PPublishedMethod): string;
var
  MethodSignature: TMethodSignature;
begin
  if FindPublishedMethodSignature(Instance,
    Method.Address, MethodSignature) then
    Result := MethodSignatureToString(Method.Name, MethodSignature)
  else
    Result := Format('procedure %s(???);', [Method.Name]);
end;

This function first tries to obtain the signature of the method using FindPublishedMethodSignature and if it succeeds it translates the method signature into a string representation using MethodSignatureToString. We’ll look at these routines shortly, but let’s first look at the definition for the method signature record.

type
  PMethodParam = ^TMethodParam;
  TMethodParam = record
    Flags: TParamFlags;
    ParamName: PShortString;
    TypeName: PShortString;
  end;
  TMethodParamList = array of TMethodParam;
  PMethodSignature = ^TMethodSignature;
  TMethodSignature = record
    MethodKind: TMethodKind;
    ParamCount: Byte;
    ParamList: TMethodParamList;
    ResultType: PShortString;
  end;

These definitions are my own structures to make it easier to access event type RTTI without struggling with variable length records due to packed shortstring fields. My records are a copy of and point to the raw RTTI structures generated by the compiler and exposed by the TypInfo unit. Here are the relevant definitions from TypInfo.

type
  TMethodKind = (mkProcedure, mkFunction, mkConstructor,
    mkDestructor, mkClassProcedure, mkClassFunction,
    { Obsolete }
    mkSafeProcedure, mkSafeFunction);
  TParamFlag = (pfVar, pfConst, pfArray, pfAddress, pfReference, pfOut);
  TParamFlags = set of TParamFlag;
  TTypeData = packed record
    case TTypeKind of
      /// ...
      tkMethod: (
        MethodKind: TMethodKind;
        ParamCount: Byte;
        ParamList: array[0..1023] of Char
       {ParamList: array[1..ParamCount] of
          record
            Flags: TParamFlags;
            ParamName: ShortString;
            TypeName: ShortString;
          end;
        ResultType: ShortString});
  end;

Ok. The TTypeData record encodes an event type (a method pointer property) in the following way. The MethodKind field indicates what kind of method this is – AFAICT only two values are currently used – mkProcedure and mkFunction – corresponding to procedure … of object and function … of object declarations, respectively. Then there is a byte containing the number of parameters the method has – limiting the number of parameters in an event type to 255 :-). Then there is a packed array of packed records with information about each parameter; parameter kind (var, const, out, array of), parameter name and type. Following all the parameters is a string with the name of the type that the method returns, if MethodKind was mkFunction.

Since the ParamName, TypeName and ResultType fields are all encoded as packed shortstrings, which are very awkward to deal with, I declared the TMethodParam and TMethodSignature records above. Here is the GetMethodSignature function that converts from a PPropInfo of an event to the easier-to-use TMethodSignature record.

function PackedShortString(Value: PShortstring;
  var NextField{: Pointer}): PShortString; overload;
begin
  Result := Value;
  PShortString(NextField) := Value;
  Inc(PChar(NextField), SizeOf(Result^[0]) + Length(Result^));
end;

function PackedShortString(var NextField{: Pointer}): PShortString; overload;
begin
  Result := PShortString(NextField);
  Inc(PChar(NextField), SizeOf(Result^[0]) + Length(Result^));
end;

function GetMethodSignature(Event: PPropInfo): TMethodSignature;
type
  PParamListRecord = ^TParamListRecord;
  TParamListRecord = packed record
    Flags: TParamFlags;
    ParamName: {packed} ShortString; // Really string[Length(ParamName)]
    TypeName: {packed} ShortString; // Really string[Length(TypeName)]
  end;
var
  EventData: PTypeData;
  i: integer;
  MethodParam: PMethodParam;
  ParamListRecord: PParamListRecord;
begin
  Assert(Assigned(Event) and Assigned(Event.PropType));
  Assert(Event.PropType^.Kind = tkMethod);
  EventData := GetTypeData(Event.PropType^);
  Result.MethodKind := EventData.MethodKind;
  Result.ParamCount := EventData.ParamCount;
  SetLength(Result.ParamList, Result.ParamCount);
  ParamListRecord := @EventData.ParamList;
  for i := 0 to Result.ParamCount-1 do
  begin
    MethodParam := @Result.ParamList[i];
    MethodParam.Flags := ParamListRecord.Flags;
    MethodParam.ParamName := PackedShortString(
      @ParamListRecord.ParamName, ParamListRecord);
    MethodParam.TypeName := PackedShortString(ParamListRecord);
  end;
  Result.ResultType := PackedShortString(ParamListRecord);
end;

It uses a couple of overloaded helper routines to get at the packed shortstrings and to advance the current record pointer accordingly. I also had to re-declare the packed TParamListRecord, as the version in TypInfo is commented out. We’ll probably scrutinize the PPropInfo structures later – in this context it suffices to say that we are able to get at the interesting information about the event type’s method signature, and return it in an edible and useful format.
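
To see the packed layout on its own, here is a minimal sketch that walks the tkMethod type data of an event type directly, without going through a property. It is not part of the original unit: ReadPackedString, DumpEventType and TDemoQuery are hypothetical names, and the code assumes the layout described above (flags, then two length-prefixed strings per parameter, then the result type name for functions). Later compiler versions may append further data after these fields, which this sketch simply ignores.

```pascal
program WalkEventTypeData;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, TypInfo;

type
  // Hypothetical event type, declared only for this illustration.
  TDemoQuery = function(Sender: TObject; var Accept: Boolean): string of object;

// Reads the length-prefixed string at P and moves P to the byte after it.
function ReadPackedString(var P: PAnsiChar): string;
var
  Len: Byte;
  S: AnsiString;
begin
  Len := Byte(P^);            // first byte holds the length
  SetString(S, P + 1, Len);   // the characters follow immediately
  Inc(P, Len + 1);
  Result := string(S);
end;

procedure DumpEventType(EventType: PTypeInfo);
var
  Data: PTypeData;
  P: PAnsiChar;
  i: Integer;
  Flags: TParamFlags;
  ParamName, TypeName: string;
begin
  Assert(EventType^.Kind = tkMethod);
  Data := GetTypeData(EventType);
  WriteLn(string(EventType^.Name), ': ', Data^.ParamCount, ' parameter(s)');
  P := @Data^.ParamList;
  for i := 1 to Data^.ParamCount do
  begin
    Move(P^, Flags, SizeOf(Flags));    // fixed-size part of each entry
    Inc(P, SizeOf(Flags));
    ParamName := ReadPackedString(P);  // variable-size parts
    TypeName := ReadPackedString(P);
    if pfVar in Flags then Write('  var ')
    else if pfConst in Flags then Write('  const ')
    else if pfOut in Flags then Write('  out ')
    else Write('  ');
    WriteLn(ParamName, ': ', TypeName);
  end;
  // The result type name is only read for functions.
  if Data^.MethodKind = mkFunction then
    WriteLn('  returns ', ReadPackedString(P));
end;

begin
  DumpEventType(TypeInfo(TNotifyEvent));
  DumpEventType(TypeInfo(TDemoQuery));
end.
```

The sketch advances a byte pointer (PAnsiChar) rather than PChar, so the arithmetic stays byte-based even on compilers where Char is wider than one byte.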

Right, now we have two disconnected pieces of code – we have code that loops through all published methods, trying to convert them into describing strings – and we have code to get at the method signature of an event property. Now we have to connect the two pieces of logic to perform something “useful”. There are two missing links; finding an event property that points to a given published method – and converting a method signature record into a readable string.

Looking at the high-level algorithm we defined above, we need to loop through all published events. Here is some code for that:

function FindEventProperty(Instance: TObject; Code: Pointer): PPropInfo;
var
  Count: integer;
  PropList: PPropList;
  i: integer;
  Method: TMethod;
begin
  Assert(Assigned(Instance));
  Count := GetPropList(Instance, PropList);
  if Count > 0 then
  try
    for i := 0 to Count-1 do
    begin
      Result := PropList^[i];
      if Result.PropType^.Kind = tkMethod then
      begin
        Method := GetMethodProp(Instance, Result);
        if Method.Code = Code then
          Exit;
      end;
    end;
  finally
    FreeMem(PropList);
  end;
  Result := nil;
end;

This gets a list of all published properties, picks out the event properties (tkMethod), reads the current value of each event and checks whether it points to the given code address. If it does, we return the PPropInfo of the event property, otherwise we return nil. This code will only check a single instance, but we need to check all owned components (if the instance happens to be a TComponent) – so let’s write a routine to do that recursively.

function FindEventFor(Instance: TObject; Code: Pointer): PPropInfo;
var
  i: integer;
  Component: TComponent;
begin
  Result := FindEventProperty(Instance, Code);
  if Assigned(Result) then Exit;
  if Instance is TComponent then
  begin
    Component := TComponent(Instance);
    for i:= 0 to Component.ComponentCount-1 do
    begin
      Result := FindEventFor(Component.Components[i], Code);
      if Assigned(Result) then Exit;
    end;
  end;
  Result := nil;
  // TODO: Check published fields system
end;

This function tries to find an event property that is assigned to a specific code address. It searches in this instance, then in all its owned components (if the instance is a component).

Here we use the Components array that all components and controls have to check if any of those might have an event property that points to the specific code address of interest. As the comment indicates we could also (or instead) have checked the instances referenced by any published fields. Since the RTL does not have any easy to use routines to iterate through all published fields and we haven’t got that far in our VMT-digging series yet, I’ve skipped this for now. Besides, the published field references and the Components array references are (mostly) duplicates of each other.
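
Although there is no ready-made routine for iterating published fields, a single published field can be looked up by name with TObject.FieldAddress. The following is only a sketch of that lookup, not a replacement for the Components loop: TDemoOwner and its Child field are hypothetical names, and the field name must be known in advance.

```pascal
program PublishedFieldLookup;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical owner with one published component reference,
  // similar to the fields the form designer adds to a form class.
  TDemoOwner = class(TComponent)
  published
    Child: TComponent;
  end;

var
  DemoOwner: TDemoOwner;
  FieldPtr: ^TComponent;
begin
  DemoOwner := TDemoOwner.Create(nil);
  try
    DemoOwner.Child := TComponent.Create(DemoOwner);
    DemoOwner.Child.Name := 'Child';
    // FieldAddress returns a pointer to the published field, or nil.
    FieldPtr := DemoOwner.FieldAddress('Child');
    if Assigned(FieldPtr) and (FieldPtr^ = DemoOwner.Child) then
      WriteLn('Published field "Child" references ', FieldPtr^.Name);
  finally
    DemoOwner.Free; // also frees the owned Child component
  end;
end.
```

The instance found this way is the same object that the Components array returns, which is why checking both would mostly repeat the same work.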

Now we have enough plumbing code to write the final link between the published methods loop and the event searching logic. Here is the FindPublishedMethodSignature function that PublishedMethodToString calls above.

function FindPublishedMethodSignature(Instance: TObject; Code: Pointer;
  var MethodSignature: TMethodSignature): boolean;
var
  Event: PPropInfo;
begin
  Assert(Assigned(Code));
  Event := FindEventFor(Instance, Code);
  Result := Assigned(Event);
  if Result then
    MethodSignature := GetMethodSignature(Event);
end;

This routine first uses the recursive FindEventFor to try and find an event’s PPropInfo that describes the method and if it finds one, it converts the hard-to-use PPropInfo to an easy-to-use TMethodSignature. Finally we only have to write some boilerplate code to convert the binary TMethodSignature record into a human readable string.

function MethodKindString(MethodKind: TMethodKind): string;
begin
  case MethodKind of
    mkSafeProcedure,
    mkProcedure     : Result := 'procedure';
    mkSafeFunction,
    mkFunction      : Result := 'function';
    mkConstructor   : Result := 'constructor';
    mkDestructor    : Result := 'destructor';
    mkClassProcedure: Result := 'class procedure';
    mkClassFunction : Result := 'class function';
  else
    Result := ''; // unknown method kind
  end;
end;

function MethodParamString(const MethodParam: TMethodParam;
  ExoticFlags: boolean = False): string;
begin
  if pfVar in MethodParam.Flags then Result := 'var '
  else if pfConst in MethodParam.Flags then Result := 'const '
  else if pfOut in MethodParam.Flags then Result := 'out '
  else Result := '';
  if ExoticFlags then
  begin
    if pfAddress in MethodParam.Flags then Result := '{addr} ' + Result;
    if pfReference in MethodParam.Flags then Result := '{ref} ' + Result;
  end;
  Result := Result + MethodParam.ParamName^ + ': ';
  if pfArray in MethodParam.Flags then
    Result := Result + 'array of ';
  Result := Result + MethodParam.TypeName^;
end;

function MethodParametersString(const MethodSignature:
  TMethodSignature): string;
var
  i: integer;
  MethodParam: PMethodParam;
begin
  Result := '';
  for i := 0 to MethodSignature.ParamCount-1 do
  begin
    MethodParam := @MethodSignature.ParamList[i];
    Result := Result + MethodParamString(MethodParam^);
    if i < MethodSignature.ParamCount-1 then
      Result := Result + '; ';
  end;
end;

function MethodSignatureToString(const Name: string;
  const MethodSignature: TMethodSignature): string;
begin
  Result := Format('%s %s(%s)',
    [MethodKindString(MethodSignature.MethodKind),
     Name,
     MethodParametersString(MethodSignature)]);
  if Length(MethodSignature.ResultType^) > 0 then
    Result := Result + ': ' + MethodSignature.ResultType^;
  Result := Result + ';';
end;

Phew! This article is getting long and with a lot of code! But now we have some serious (but pretty useless) reverse engineering code to dig out the parameters of a published method. Note that this only works if the instance (or one of its components) also has a published property that points to the published method. The good news is that this is the case for most existing published methods – such as the event handlers on a TForm instance. The bad news is that this would not be the case for any published methods we would like to call dynamically at runtime (and thus would not be assigned to any events).

If you’re still hanging in there, we can now write some test code to see if this thing works or not.

type
  {$M+}
  TMyClass = class;
  TOnFour = function (A: array of byte; const B: array of byte;
    var C: array of byte; out D: array of byte): TComponent of object;
  TOnFive = procedure (Component1: TComponent;
    var Component2: TComponent;
    out Component3: TComponent;
    const Component4: TComponent) of object;
  TOnSix = function (const A: string; var Two: integer;
    out Three: TMyClass; Four: PInteger; Five: array of Byte;
    Six: integer): string of object;
  TMyClass = class
  private
    FOnFour: TOnFour;
    FOnFive: TOnFive;
    FOnSix: TOnSix;
  published
    function FourthPublished(A: array of byte; const B: array of byte;
      var C: array of byte; out D: array of byte): TComponent;
    procedure FifthPublished(Component1: TComponent;
      var Component2: TComponent;
      out Component3: TComponent;
      const Component4: TComponent);
    function SixthPublished(const A: string; var Two: integer;
      out Three: TMyClass; Four: PInteger;
      Five: array of Byte; Six: integer): string;
    property OnFour: TOnFour read FOnFour write FOnFour;
    property OnFive: TOnFive read FOnFive write FOnFive;
    property OnSix: TOnSix read FOnSix write FOnSix;
  end;

function TMyClass.FourthPublished;
begin
  Result := nil;
end;

procedure TMyClass.FifthPublished;
begin
end;

function TMyClass.SixthPublished;
begin
  Result := '';
end;

procedure DumpPublishedMethodsParameters(Instance: TObject);
var
  i : integer;
  List: TStringList;
begin
  List := TStringList.Create;
  try
    GetPublishedMethodsWithParameters(Instance, List);
    for i := 0 to List.Count-1 do
      writeln(List[i]);
  finally
    List.Free;
  end;
end;

procedure Test;
var
  MyClass: TMyClass;
begin
  MyClass := TMyClass.Create;
  try
    MyClass.OnFour := MyClass.FourthPublished;
    MyClass.OnFive := MyClass.FifthPublished;
    MyClass.OnSix := MyClass.SixthPublished;
    DumpPublishedMethodsParameters(MyClass);
  finally
    MyClass.Free;
  end;
end;

begin
  Test;
  readln;
end.

When we run this we get:
Published methods in TMyClass

function FourthPublished(A: array of Byte; const B: array of Byte; var C: array of Byte; out D: array of Byte): TComponent;
procedure FifthPublished(Component1: TComponent; var Component2: TComponent; out Component3: TComponent; const Component4: TComponent);
function SixthPublished(const A: String; var Two: Integer; out Three: TMyClass; Four: PInteger; Five: array of Byte; Six: Integer): String;

Looks pretty accurate to me! The test code above is a little contrived – an instance would not assign its event properties to its own methods. A more realistic test case would be a form with numerous published events hooked up at design time. I loaded up the \Demos\RichEdit\RichEdit.dpr project shipped with Delphi 7 (in Delphi 2006 the path is \Demos\DelphiWin32\VCLWin32\RichEdit\RichEdit.bdsproj). On the main form in the remain.pas unit, I added my HVPublishedMethodParams unit to the uses clause and changed the Help | About event handler like this:

procedure TMainForm.HelpAbout(Sender: TObject);
begin
  GetPublishedMethodsWithParameters(Self, Editor.Lines);
{ with TAboutBox.Create(Self) do
  try
    ShowModal;
  finally
    Free;
  end;}
end;

This will dump all published methods of the form to the edit control – trying to match them up to events with RTTI to find parameter information. When I ran the demo app and selected Help | About, the edit control was filled with this text:
Published methods in TMainForm

procedure SelectionChange(Sender: TObject);
procedure FormCreate(Sender: TObject);
procedure ShowHint(???);
procedure FileNew(Sender: TObject);
procedure FileOpen(Sender: TObject);
procedure FileSave(Sender: TObject);
procedure FileSaveAs(Sender: TObject);
procedure FilePrint(Sender: TObject);
procedure FileExit(Sender: TObject);
procedure EditUndo(Sender: TObject);
procedure EditCut(Sender: TObject);
procedure EditCopy(Sender: TObject);
procedure EditPaste(Sender: TObject);
procedure HelpAbout(Sender: TObject);
procedure SelectFont(Sender: TObject);
procedure RulerResize(Sender: TObject);
procedure FormResize(Sender: TObject);
procedure FormPaint(Sender: TObject);
procedure BoldButtonClick(Sender: TObject);
procedure ItalicButtonClick(Sender: TObject);
procedure FontSizeChange(Sender: TObject);
procedure AlignButtonClick(Sender: TObject);
procedure FontNameChange(Sender: TObject);
procedure UnderlineButtonClick(Sender: TObject);
procedure BulletsButtonClick(Sender: TObject);
procedure FormCloseQuery(Sender: TObject; var CanClose: Boolean);
procedure RulerItemMouseDown(Sender: TObject; Button: TMouseButton;
Shift: TShiftState; X: Integer; Y: Integer);
procedure RulerItemMouseMove(Sender: TObject; Shift: TShiftState;
X: Integer; Y: Integer);
procedure FirstIndMouseUp(Sender: TObject; Button: TMouseButton;
Shift: TShiftState; X: Integer; Y: Integer);
procedure LeftIndMouseUp(Sender: TObject; Button: TMouseButton;
Shift: TShiftState; X: Integer; Y: Integer);
procedure RightIndMouseUp(Sender: TObject; Button: TMouseButton;
Shift: TShiftState; X: Integer; Y: Integer);
procedure FormShow(Sender: TObject);
procedure RichEditChange(Sender: TObject);
procedure SwitchLanguage(Sender: TObject);
procedure ActionList2Update(Action: TBasicAction; var Handled: Boolean);

This is pretty much a verbatim copy of the published methods in the interface section of the form unit. We are only missing parameters for the ShowHint method. This is because this published method does not have any design-time event properties pointing to it. Instead it is assigned at runtime to one of Application’s events.

procedure ShowHint(Sender: TObject);
///…
procedure TMainForm.FormCreate(Sender: TObject);
begin
  Application.OnHint := ShowHint;
  //…
end;

The TApplication object does not publish any of its event properties, so there is no straightforward way of obtaining the ShowHint parameters. In fact, having the ShowHint method as published is a minor flaw – it should be made private instead.

That concludes this intriguing, but AFAICS, useless hack. Now you should have a better understanding of how published methods and published events are interconnected at runtime and how Delphi decompilers can perform some of their magic. We have also illustrated just how much information about your program is stored in the EXE file – you better make sure you don’t include any sensitive information in your published method names or event parameter names and types :-).

Hope you have enjoyed the ride!

9 comments:

Anonymous said...

That's a long post! I agree it would be hard to find something useful to do with this hack, but it's very interesting nonetheless.

I wonder if DevCo could be persuaded to add a compiler switch to add method signature RTTI to published methods, or in fact to add RTTI to everything. (Somewhat like a bigger version of {$K+} if I remember the flag correctly.) That would allow things like this to be done for private or protected methods / properties as well.

I think that could actually be quite useful for allowing more metaprogramming than Delphi currently does. The RTTI system was a leading feature a few years ago, so personally I think it would be worth expanding now.

Hallvards New Blog said...

>add a compiler switch to add method
>signature RTTI to published methods

Maybe. Danny Thorpe responded to this question in Delphi 5 live chat in 1999:

http://community.borland.com/article/1,1410,20384,00.html

"dthorpe
Encoding parameter type info in RTTI increases the size of the RTTI. Since 99% of the uses of RTTI do not require dynamic invocation of arbitrary methods, it's hard to justify this additional overhead and code bloat. If you want to implement late-bound OLE automation, use IDispatch."

My guess is that if Borland doesn't need it for something, it will most probably not be added.

> bigger version of {$K+}

It's $M, actually :).

Feel free to QC your suggestions!

Atorian said...

Awesome - just what I was looking for.

Anonymous said...

I found a practical use for this. I'm building a program that uses scripting support with RemObjects PascalScript, and scripts can be used as event handlers. With this and PascalScript's comparable method of extracting function signatures, I can verify that an event handler in the script is compatible with the Delphi event the user wants to hook it to without having to hard-code an analysis routine for every different event type.

Hallvards New Blog said...

> I found a practical use for this.

That's great to hear Mason - thanks for informing us!

Anonymous said...

I have a use for this as well... to check if my unit tester is testing all the published methods in a specific class (this would help me determine which overload is or isn't being tested).

However, I cannot get this to work, the line:
then Result := Pmt.Count
in GetPublishedMethodCount returns a value of 30696, which seems to be wrong.

Also do you have a .pas file, or even just a webpage, with all the code working so that someone could easily download it and try it themselves?

Thanks

btw I'm using Delphi 7

Hallvards New Blog said...

> check if my unit tester is testing all the published methods in a specific class

Nice! ;)

> in GetPublishedMethodCount returns a value of 30696, which seems to be wrong.

That does sound a little excessive, yes. I'm not sure what is wrong, without seeing all the code.

> Also do you have a .pas file, or even just a webpage, with all the code working so that someone could easily download it and try it themselves?

Yes, I've uploaded it to codecentral, here:
http://cc.codegear.com/Item/24074

André said...

I get the following exception in Delphi 2007, when I use HVMethodInfoClasses.GetClassInfo:

---------------------------
Debugger Exception Notification
---------------------------
Project TestPubMethodParams.exe raised exception class Exception with message 'RTTI for the published method "TestEventProc" of class "TTest" has 34 extra bytes of unknown data!'.
---------------------------
Break Continue Help
---------------------------

If I disable this exception, everything works perfect! Thanks for the very nice code!



Copyright © 2004-2007 by Hallvard Vassbotn