Friday, November 02, 2007

DN4DP: The Delphi Language Chapter

We have finally come to the end of the long-running series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

All the chapter excerpts that I have posted can be seen by clicking on the DN4DP blog label. As a service to our readers, I'm also including a full list of all the post links here.

Classic Delphi and .NET book in the making

Come get a free sample chapter!

.NET 2.0 for Delphi programmers available now

DN4DP#1: Getting classy

DN4DP#2: Protecting your privates

DN4DP#3: Nesting habits

DN4DP#4: Setting new records

DN4DP#5: Redefining the operators

DN4DP#6: Enumerating collections

DN4DP#7: Inlined routines

DN4DP#8: Unicode identifiers

DN4DP#9: Escaping keywords

DN4DP#10: With a little help from your friends

DN4DP#11: The try-finally-Free pattern

DN4DP#12: Record Helpers

DN4DP#13: Overloaded default array properties

DN4DP#14: .NET platform support: Boxing

DN4DP#15: .NET only: Attributes support

DN4DP#16: .NET only: Floating-point semantics

DN4DP#17: .NET only: Multi-unit namespaces

DN4DP#18: .NET only: New array syntax

DN4DP#19: .NET only: Unsafe code

DN4DP#20: .NET only: Multi-cast events

DN4DP#21: .NET only: Undocumented corner

DN4DP#22: .NET only: P/Invoke magic

DN4DP#23: .NET only: Obsolete features

DN4DP#24: .NET vs Win32: Untyped parameters

DN4DP#25: .NET vs Win32: Casting

DN4DP#26: .NET vs Win32: Initialization and finalization

DN4DP#27: .NET vs Win32: Abstract classes

DN4DP#28: .NET vs Win32: Class references

DN4DP#29: .NET vs Win32: Constructors

DN4DP#30: Delphi vs C#

 

Hope you have enjoyed the series - and that you have bought (or will buy) the book! ;) Jon's book is stuffed with good stuff and is generally a much better read than "my" chapter.


Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

Thursday, November 01, 2007

DN4DP#30: Delphi vs C#

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at .NET and Win32 constructors. This is the final post in this long running series and it covers the main differences between Delphi and C#.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Delphi vs C#

Very similar, but different in the details

If you are faced with the task of porting code from native Delphi to C#, or converting C# code snippets to Delphi, or making an informed decision about which language to use, it is useful to know the unique strengths and features of each language. In this section we quickly iterate over the highlights of each language and include some tips on porting between them.

Note that we compare Delphi for .NET to C# version 1.0, not version 2.0. It wouldn’t be fair to compare a .NET 1.1 product like Delphi for .NET 2006 to C# 2.0 with its support for generics, nullable types and iterators.

Delphi language highlights

Of the features that we have already covered, class helpers, unmanaged exports and virtual library interfaces are unique features that Delphi has. In one sense class helpers are very similar to the extension methods of the upcoming C# 3.0 standard to support the Linq (Language integrated query) technology. The main difference is that class helpers cannot help interfaces (at least not yet) and class helpers are more structured.
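To make the class helper idea concrete, here is a minimal sketch of a helper that adds a method to an existing class without subclassing it. The helper name TStringsHelper and its FirstOrDefault method are hypothetical names invented for this illustration; they are not taken from the book or from the RTL.

```pascal
program ClassHelperDemo;

{$APPTYPE CONSOLE}

uses
  Classes;

type
  // Hypothetical helper: adds a method to every TStrings descendant
  TStringsHelper = class helper for TStrings
    function FirstOrDefault(const ADefault: string): string;
  end;

function TStringsHelper.FirstOrDefault(const ADefault: string): string;
begin
  // Inside a helper, Self is the helped TStrings instance
  if Count > 0 then
    Result := Strings[0]
  else
    Result := ADefault;
end;

var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    Writeln(List.FirstOrDefault('(empty)'));
    List.Add('one');
    // The call site reads as if FirstOrDefault were declared in TStrings
    Writeln(List.FirstOrDefault('(empty)'));
  finally
    List.Free;
  end;
end.
```

The call site looks the same as a C# 3.0 extension method call would, which is the similarity the text refers to.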

The following table lists the most important Delphi language features that C# does not have and what their alternative is when porting code.

Delphi feature | Comment | C# alternatives
class helpers | Platform-leveling compiler magic | Explicit static methods, C# 3.0 extension methods
Unmanaged exports | a.k.a. reverse P/Invoke | Use C++, hacks
Virtual Library Interfaces | a.k.a. dynamic P/Invoke | Use unmanaged C++, hacks
sets | Limited to ordinal types with <= 256 elements | enum flags, BitArray, int bit-fiddling
class of references | Meta classes | System.Type
virtual class methods | Meta class polymorphism | System.Type, reflection
virtual constructors | Class factories | Activator, reflection
Type-less var and out parameters | Poor-man’s generics | C# 2.0 Generics, System.Object function
type aliases, typed types | Logical vs. actual types | Explicit typing
Default parameters | Simpler than overloading | Overloading
resourcestrings | Simplified internationalization | Resources and ResourceManager.GetString
Named constructors | Simulated using overloading | Overloading
message methods | Dispatching windows messages | WndProc switch
Variants | One-type fits all | System.Object boxing
Global routines | Non-OOP code | Static methods of a class
Global variables | Non-OOP data | Static fields of a class
Named array properties | Multiple array properties | Overloaded this indexer, nested class with this indexer
Local (nested) procedures | Implementation hiding, automatic access to outer variables | private method, anonymous method (C# 2.0)
variant records (case) | Structure overlaying (union) | [StructLayout(LayoutKind.Explicit)], [FieldOffset()]
Text files, Writeln, etc | Easy input/output | Console and Stream classes
Supports Win32, Linux | Cross-platform capabilities | Use C/C++, Mono for Linux

One difference that is important to be aware of is that hard-casts and safe-casts have opposite syntax in Delphi and C#. The safe exception-raising cast is (O as TargetType) in Delphi and (TargetType)O in C#. The nil/null-returning cast is TargetType(O) in Delphi and (O as TargetType) in C#.
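As a small illustration of the Delphi side of this difference, the sketch below shows the exception-raising as-cast, followed by an is-guarded hard-cast that gives the nil-on-mismatch result on any platform. The classes TAnimal, TDog and TCat are hypothetical and exist only for this example.

```pascal
program CastSyntaxDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical classes, used only for illustration
  TAnimal = class
  end;
  TDog = class(TAnimal)
  end;
  TCat = class(TAnimal)
  end;

var
  Animal: TAnimal;
  Dog: TDog;
begin
  Animal := TCat.Create;
  try
    // Delphi safe cast, the counterpart of (TDog)animal in C#:
    // it raises EInvalidCast when the object is not a TDog
    try
      Dog := Animal as TDog;
      Writeln('Unexpected: cast succeeded');
    except
      on E: EInvalidCast do
        Writeln('as-cast raised EInvalidCast');
    end;

    // Platform-neutral way to get the nil-on-mismatch result of the
    // C# expression "animal as TDog": test with "is" before hard-casting
    if Animal is TDog then
      Dog := TDog(Animal)
    else
      Dog := nil;
    Writeln('Dog assigned: ', Assigned(Dog));
  finally
    Animal.Free;
  end;
end.
```

The is-guard is used here because an unguarded hard-cast only returns nil on .NET; on Win32 it is an unchecked reinterpretation.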

C# language highlights

The following C# 1.0 features are not directly available in Delphi, but have alternative ways of achieving the same goal.

C# feature | Comment | Delphi alternatives
lock | Thread synchronization | Monitor.Enter(O); try { .. } finally Monitor.Exit(O); end;
fixed | Garbage collection object pinning | H := GCHandle.Alloc(..); try P := H.AddrOfPinnedObject; finally H.&Free; end;
using | Deterministic releasing of unmanaged resources | O := TMy.Create; try { .. } finally O.Free; end;
C# destructor ~ClassName | Garbage collection deallocate notification | override Finalize method
stackalloc | Unsafe code temporary allocations | GCHandle.Alloc dynamic array
checked/unchecked | Integer arithmetic overflow checking | {$OVERFLOWCHECKS ON/OFF}, {$Q+/-}
readonly field | Read-only fields initialized in a constructor | const, read-only property, normal field
return | Set function result and return to caller | Result := O; Exit;
volatile field | May be modified outside current thread | Explicit locks, Thread.VolatileRead/Write
internal access | Per-assembly cross-class implementation details | public, protected with cracker-cast
ternary ? : operator | Inline test and return result | if .. then .. else, IfThen routines
switch (string) | Multi-case testing of strings | nested if..then..else, TStringList.IndexOf


In Delphi an overridden Destroy destructor maps to an implementation of IDisposable, while in C# you must implement IDisposable explicitly. In C# a ~ClassName destructor maps to an overridden Object.Finalize method, while in Delphi Finalize must be overridden manually. In most cases, application-level code needs to implement IDisposable, but should not override Finalize – that should be left to low-level leaf-classes in the FCL.

"

Wednesday, October 31, 2007

DN4DP#29: .NET vs Win32: Constructors

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at class references. This post covers .NET vs Win32 constructors.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Constructors

While it is a good rule in Win32 to have all constructors call an inherited or peer constructor, the compiler does not enforce it. In .NET the runtime refuses to load types that break this rule. In addition you cannot access inherited fields or call any methods until you have called an inherited constructor.

type
  TBar = class
  protected
    FInheritedField: integer;
  end;
  TFoo = class(TBar)
  private
    FField: integer;
    procedure Method;
  public
    constructor Create;
  end;

constructor TFoo.Create;
begin
  FField := 42;
  {$IFNDEF CLR}
  Method;
  FInheritedField := 13;
  {$ENDIF}
  inherited Create;
  FInheritedField := 13;
  Method;
end;

Note that unlike C#, in Delphi you can still modify the fields of the current instance before calling the inherited constructor."

Tuesday, October 30, 2007

DN4DP#28: .NET vs Win32: Class references

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at differences in abstract class behavior. Here we look at class references.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Class references 

For the most part, the semantics of using class of references to create late-bound types of classes is unchanged in .NET. The only noticeable difference is that in .NET the constructor you call through the class reference must be declared virtual. In Win32 it doesn’t strictly have to be, but normally it should be declared virtual."
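A minimal sketch of the pattern described above: an object is created through a class of reference, with the constructor declared virtual so that the same code is valid on both platforms. The names TShape, TCircle, TShapeClass and Describe are hypothetical and used only for this illustration.

```pascal
program ClassRefDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical classes, used only for illustration
  TShape = class
  public
    constructor Create; virtual;   // virtual: required on .NET, advisable on Win32
    function Describe: string; virtual;
  end;
  TShapeClass = class of TShape;

  TCircle = class(TShape)
  public
    constructor Create; override;
    function Describe: string; override;
  end;

constructor TShape.Create;
begin
  inherited Create;
end;

function TShape.Describe: string;
begin
  Result := 'shape';
end;

constructor TCircle.Create;
begin
  inherited Create;
end;

function TCircle.Describe: string;
begin
  Result := 'circle';
end;

var
  ShapeClass: TShapeClass;
  Shape: TShape;
begin
  // The concrete class is only known at run time
  ShapeClass := TCircle;
  // Because Create is virtual, TCircle.Create runs here
  Shape := ShapeClass.Create;
  try
    Writeln(Shape.Describe);
  finally
    Shape.Free;
  end;
end.
```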

Sunday, October 28, 2007

DN4DP#27: .NET vs Win32: Abstract classes

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at initialization and finalization sections. This post covers some minor differences in abstract class behavior.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Abstract classes

The concept of abstract classes is strictly enforced by the CLR runtime – it will not allow you to instantiate instances of abstract classes or classes containing abstract methods. In Win32 you can create instances of classes containing abstract methods. Normally you will get a compiler warning, though – this warning has been turned into a compiler error in .NET.

When creating instances through a class of reference, the compiler’s static checking cannot prevent you from compiling code that could potentially instantiate abstract classes. In Win32 this will go undetected at runtime, unless you actually call an abstract method – then you will get an EAbstractError exception. In .NET, the runtime will raise an exception if you try to call the constructor of an abstract class. The AbstractClasses project demonstrates these differences.

Code Sample

unit AbstractClassesU;

interface

type
  TFoo = class
    procedure Bar; virtual; abstract;
    constructor Create; virtual;
    constructor Create2;
  end;
  TFooClass = class of TFoo;
  TBar = class(TFoo)
    procedure Bar; override;
    constructor Create; override;
  end;

procedure Test;

implementation

var
  Foo: TFoo;
  FooClass: TFooClass;

{ TFoo }

constructor TFoo.Create;
begin
  inherited Create;
  writeln('TFoo.Create');
end;

constructor TFoo.Create2;
begin
  inherited Create;
  writeln('TFoo.Create2');
end;

{ TBar }

constructor TBar.Create;
begin
  inherited Create;
  writeln('TBar.Create');
end;

procedure TBar.Bar;
begin
  writeln('TBar.Bar');
end;

{$DEFINE TEST_ERRORS}
{$IFDEF TEST_ERRORS}
procedure TestErrors;
begin
  // Direct creation of class with abstract method
  Foo := TFoo.Create; // Win32 Warning / .NET error

  // Call of abstract method
  Foo.Bar; // Win32 run-time exception

  // Calling non-virtual constructor through class reference
  FooClass := TBar;
  Foo := FooClass.Create2; // .NET compile-time error

  // Creation of abstract class via class reference
  FooClass := TFoo;
  Foo := FooClass.Create; // .NET run-time error

  // Call of abstract method
  Foo.Bar; // Win32 run-time error
end;
{$ENDIF}

procedure TestOK;
begin
  // Creation of concrete class via class reference
  FooClass := TBar;
  Foo := FooClass.Create;
  Foo.Bar;
  writeln('OK');
end;

procedure Test;
begin
  TestOK;
  {$IFDEF TEST_ERRORS}
  TestErrors;
  {$ENDIF}
end;

end.

"

Friday, October 26, 2007

DN4DP#26: .NET vs Win32: Initialization and finalization

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at the .NET and Win32 casting issues. Here we quickly cover some potential gotchas related to initialization and finalization sections.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

 "Initialization and finalization

On the native Win32 platform, the Delphi unit initialization and finalization sections have exact (static) execution order and execution time semantics. On the .NET platform, the Delphi compiler implements initialization sections in terms of class constructors – this means that the order of execution is dependent on what types are used in what order at runtime. Likewise, running of an assembly’s unit finalization sections is triggered by a global object’s Finalize method – this implies that the code runs at an unspecified time on the CLR’s finalizer thread (and it is not always guaranteed to occur).

While most simple initialization and finalization code should work as is, you should be careful with code that relies on the execution order of these sections, code that touches types from other units and code that closes physical resources such as files. Old code that only frees memory in the finalization section can typically be IFDEFed out in .NET code.
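As a rough sketch of the last point, a unit whose finalization section only frees memory can wrap that section in a conditional so that it is compiled for Win32 only. The unit name LookupCacheU and the Lookup function are hypothetical. The lazy creation inside Lookup is an assumption added for this example, as one way to avoid depending on initialization order. This is a unit rather than a program, so it has to be used from a project to run.

```pascal
unit LookupCacheU; // hypothetical unit name

interface

uses
  Classes;

// Returns a shared list, created on first use (not thread-safe)
function Lookup: TStringList;

implementation

var
  FLookup: TStringList;

function Lookup: TStringList;
begin
  // Create lazily instead of relying on the order in which
  // initialization sections happen to run
  if not Assigned(FLookup) then
    FLookup := TStringList.Create;
  Result := FLookup;
end;

initialization
  // Nothing to do: the list is created on demand

{$IFNDEF CLR}
finalization
  // Win32 only: free the memory explicitly.
  // On .NET the garbage collector reclaims the list.
  FLookup.Free;
{$ENDIF}

end.
```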

unit InitializatonAndFinalizationU;

interface

type
  TFoo = class
    constructor Create;
  end;

procedure Test;

implementation

uses
  InitializatonAndFinalizationU2;

procedure Test;
begin

end;

{ TFoo }

constructor TFoo.Create;
begin
  inherited Create;
  writeln('TFoo.Create');
end;

initialization
  TBar.Create;
  writeln('InitializatonAndFinalizationU.initialization');

finalization
  writeln('InitializatonAndFinalizationU.finalization');

end.

unit InitializatonAndFinalizationU2;

interface

type
  TBar = class
    constructor Create;
  end;

procedure Test;

implementation

uses
  InitializatonAndFinalizationU;

procedure Test;
begin

end;

{ TBar }

constructor TBar.Create;
begin
  inherited Create;
  writeln('TBar.Create');
end;

initialization
  TFoo.Create;
  writeln('InitializatonAndFinalizationU2.initialization');

finalization
  writeln('InitializatonAndFinalizationU2.finalization');

end.

program InitializatonAndFinalization;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  InitializatonAndFinalizationU in 'InitializatonAndFinalizationU.pas',
  InitializatonAndFinalizationU2 in 'InitializatonAndFinalizationU2.pas';

begin
  InitializatonAndFinalizationU.Test;
  readln;
end.

Tuesday, October 23, 2007

Sergey Antonov implements Yield for Delphi!

The Russian Delphi programmer Sergey Antonov (or Антонов Сергей - aka. 0xffff) is a real hacker in the positive sense. He approached me with some intriguing assembly code that implements the equivalent of the C# yield statement!

Yield makes it easier to implement enumerators (you know, the simple classes or records with methods like GetCurrent and MoveNext that enable the for-in statement). Normally you have to implement a kind of state machine to write an enumerator. With the yield statement this is turned around, allowing you to express the iteration using easier-to-write loops (while, repeat-until or even a for-in loop).
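For comparison, here is a minimal sketch of the conventional state-machine style, in which the loop state (a counter and the running result) has to live in the enumerator's fields between MoveNext calls. The classes TPowers and TPowerEnumerator are hypothetical names made up for this illustration; they are not part of Sergey's code.

```pascal
program StateMachineEnumeratorDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical enumerator producing Number^1 .. Number^Exponent
  TPowerEnumerator = class
  private
    FNumber, FExponent: Integer;
    FCounter, FCurrent: Integer;   // loop state kept between calls
  public
    constructor Create(ANumber, AExponent: Integer);
    function MoveNext: Boolean;
    function GetCurrent: Integer;
    property Current: Integer read GetCurrent;
  end;

  // Hypothetical "collection" that only exists to hand out the enumerator
  TPowers = class
  private
    FNumber, FExponent: Integer;
  public
    constructor Create(ANumber, AExponent: Integer);
    function GetEnumerator: TPowerEnumerator;
  end;

constructor TPowerEnumerator.Create(ANumber, AExponent: Integer);
begin
  inherited Create;
  FNumber := ANumber;
  FExponent := AExponent;
  FCounter := 0;
  FCurrent := 1;
end;

function TPowerEnumerator.MoveNext: Boolean;
begin
  // One step of what would be a simple while loop with yield
  Result := FCounter < FExponent;
  if Result then
  begin
    FCurrent := FCurrent * FNumber;
    Inc(FCounter);
  end;
end;

function TPowerEnumerator.GetCurrent: Integer;
begin
  Result := FCurrent;
end;

constructor TPowers.Create(ANumber, AExponent: Integer);
begin
  inherited Create;
  FNumber := ANumber;
  FExponent := AExponent;
end;

function TPowers.GetEnumerator: TPowerEnumerator;
begin
  Result := TPowerEnumerator.Create(FNumber, FExponent);
end;

var
  Powers: TPowers;
  Value: Integer;
begin
  Powers := TPowers.Create(2, 8);
  try
    // The compiler frees the class-based enumerator after the loop
    for Value in Powers do
      Writeln(Value);
  finally
    Powers.Free;
  end;
end.
```

With a yield facility, the body of MoveNext could instead be written as an ordinary loop that yields each intermediate result.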

Sergey has pulled off the impressive feat of implementing a proof-of-concept version of a yield infrastructure and mechanics - without help from the compiler!! It may have some limitations, but it is most interesting anyway. Without further ado, here is Sergey's article and code. Make sure you also read the follow-up article on Sergey's blog.

Despite some minor language barriers ;), this will be a most interesting blog to follow!

Guest article, by Sergey Antonov

"C# Yield implementation in Delphi.

The C# yield keyword is used to provide a value to the enumerator object or to signal the end of iteration. The main idea of yield construction is to generate a collection item on request and return it to the enumerator consumer immediately. You may find it useful in some cases.

As you know the Enumerator has two methods MoveNext and GetCurrent.

But how does yield work?

Technical details of the implementation

When I saw this construction I asked myself: where are MoveNext and GetCurrent?

The GetEnumerator function returns the enumerator object or interface, but the enumerator is not explicitly constructed anywhere. So there must be some secret mechanism that makes it possible.

How does it really work? After spending some time in the debugger, the answer appeared.

In short, the compiler generates a special type of object that of course has some magic MoveNext and GetCurrent functions.

And because this construction may be useful to our Delphi community, I asked myself what I could do to get yield support in Delphi without calling any special methods, while keeping the form of usage as in C#.

I first wanted to retain the C# yield syntax, but later I changed the syntax a little and used a delegate implementation to an external procedure, almost like in C# but with an additional parameter: a yield wrapper object. In the first version it was a virtual procedure.

But of course I had to generalize the implementation for all types.

And of course I had an additional question for myself. Could I improve on the C# yield implementation? Maybe.

I started from the programmer’s viewpoint. Something like this:

var
  number, exponent, counter, Res: integer;
begin
  // ...
  Res := 1;
  while counter < exponent do
  begin
    Res := Res * number;
    Yield(Res);
    Inc(counter);
  end;
end;

I had to implement some class that implemented the magic MoveNext and GetCurrent functions.

And if you use local vars (that are placed on the stack), I had to implement some mechanism that guarantees no memory leaks for finalized types, and some mechanism that guarantees that I use the valid local vars when the actual address of the local vars has changed since the last yield call due to external reasons (e.g. the enumerator is passed as a parameter to another procedure, so the location on the stack becomes different).

So after each yield call I have to preserve the state of local vars and processor registers, clean up the stack and return a value to the enumerator consumer.

And after next call to MoveNext I must allocate stack space, restore the state of local vars and processor registers, i.e. emulate that nothing has happened.

And of course I must provide a normal procedure for exiting at the end.

So let’s begin

First of all we declare some types:

type
  TYieldObject = class;
  TYieldProc = procedure (YieldObject: TYieldObject);

  TYieldObject = class
  protected
    IsYield: boolean;
    NextItemEntryPoint: pointer;
    BESP: pointer;
    REAX, REBX, RECX, REDX, RESI, REDI, REBP: pointer;
    StackFrameSize: DWORD;
    StackFrame: array[1..128] of DWORD;
    procedure SaveYieldedValue(const Value); virtual; abstract;
  public
    constructor Create(YieldProc: TYieldProc);
    function MoveNext: boolean;
    procedure Yield(const Value);
  end;

And the implementation

constructor TYieldObject.Create(YieldProc:TYieldProc);
asm
mov eax.TYieldObject.NextItemEntryPoint,ecx;
mov eax.TYieldObject.REAX,EAX;
end;

function TYieldObject.MoveNext: boolean;
asm
{ Save the value of following registers.
We must preserve EBP, EBX, EDI, ESI, EAX for some circumstances.
Because there is no guarantee that the state of registers will
be the same after an iteration }
push ebp;
push ebx;
push edi;
push esi;
push eax;

mov eax.TYieldObject.IsYield,0
push offset @a1
xor edx,edx;
cmp eax.TYieldObject.BESP,edx;
jz @AfterEBPAdjust;

{ Here is the correction of EBP. Some need of optimization still exists. }
mov edx,esp;
sub edx,eax.TYieldObject.BESP;
add [eax.TYieldObject.REBP],edx
@AfterEBPAdjust:
mov eax.TYieldObject.BESP,esp;

{ Is there any local frame? }
cmp eax.TYieldObject.StackFrameSize,0
jz @JumpIn;

{ Restore the local stack frame }
mov ecx,eax.TYieldObject.StackFrameSize;
sub esp,ecx;
mov edi,esp;
lea esi,eax.TYieldObject.StackFrame;

{ Some need of optimization still exists. Like movsd}
rep movsb;
@JumpIn:

{ Restore the content of processor registers }
mov ebx,eax.TYieldObject.REBX;
mov ecx,eax.TYieldObject.RECX;
mov edx,eax.TYieldObject.REDX;
mov esi,eax.TYieldObject.RESI;
mov edi,eax.TYieldObject.REDI;
mov ebp,eax.TYieldObject.REBP;
push [eax.TYieldObject.NextItemEntryPoint];
mov eax,eax.TYieldObject.REAX;

{ Here is the jump to next iteration }
ret;

{ And we return here after next iteration in all cases, except exception of course. }
@a1:;

{ Restore the preserved EBP, EBX, EDI, ESI, EAX registers }
pop eax;
pop esi;
pop edi;
pop ebx;
pop ebp;
{ This Flag indicates the occurrence or no occurrence of Yield }
mov al,eax.TYieldObject.IsYield;
end;

procedure TYieldObject.Yield(const Value);
asm
{ Preserve EBP, EAX,EBX,ECX,EDX,ESI,EDI }
mov eax.TYieldObject.REBP,ebp;
mov eax.TYieldObject.REAX,eax;
mov eax.TYieldObject.REBX,ebx;
mov eax.TYieldObject.RECX,ecx;
mov eax.TYieldObject.REDX,edx; // This is the Ref to const param
mov eax.TYieldObject.RESI,ESI;
mov eax.TYieldObject.REDI,EDI;
pop ecx;
mov eax.TYieldObject.NextItemEntryPoint,ecx;

//We must do it first for valid const reference
push eax;
mov ecx,[eax];
CALL DWORD PTR [ecx+VMTOFFSET TYieldObject.SaveYieldedValue];
pop eax;

{ Calculate the current local stack frame size }
mov ecx,eax.TYieldObject.BESP;
sub ecx,esp;
mov eax.TYieldObject.StackFrameSize,ecx;
jz @AfterSaveStack;

{ Preserve the local stack frame }
lea esi,[esp];
lea edi,[eax.TYieldObject.StackFrame];

{ Some need of optimization still exists. Like movsd }
rep movsb;
mov esp,eax.TYieldObject.BESP;
@AfterSaveStack:

{ Set flag of Yield occurrence }
mov eax.TYieldObject.IsYield,1;
end;

And what about my improvements

As for improvements, I am still thinking about unwinding the local SEH (Structured Exception Handling) frames on yielding and restoring them with any needed correction after return.

And how do you use it?

type
  TYieldInteger = class(TYieldObject)
  protected
    Value: integer;
    function GetCurrent: integer;
    procedure SaveYieldedValue(const Value); override;
  public
    property Current: integer read GetCurrent;
  end;

{ TYieldInteger }

function TYieldInteger.GetCurrent: integer;
begin
  Result := Value;
end;

procedure TYieldInteger.SaveYieldedValue(const Value);
begin
  Self.Value := integer(Value);
end;

So now there is full support for integer.

type
  TYieldString = class(TYieldObject)
  protected
    Value: string;
    function GetCurrent: string;
    procedure SaveYieldedValue(const Value); override;
  public
    property Current: string read GetCurrent;
  end;

{ TYieldString }

function TYieldString.GetCurrent: string;
begin
  Result := Value;
end;

procedure TYieldString.SaveYieldedValue(const Value);
begin
  Self.Value := string(Value);
end;

And now there is full support for string.

Sample of using a string Enumerator

procedure StringYieldProc(YieldObj: TYieldObject);
var
  YieldValue: string;
  i: integer;
begin
  YieldValue := 'None';
  YieldObj.Yield(YieldValue);
  for i := 1 to 10 do
  begin
    YieldValue := YieldValue + IntToStr(i);
    YieldObj.Yield(YieldValue);
  end;
end;

function TForm1.GetEnumerator: TYieldString;
begin
  Result := TYieldString.Create(StringYieldProc);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  a: string;
begin
  for a in self do
    Memo1.Lines.Add(a);
end;

From Russia with love

Sergey Antonov aka oxffff (Russia, Ukhta)

References:

ECMA 334

ECMA 335

MSDN




"


Sergey's next article is here.

Saturday, October 20, 2007

DN4DP#25: .NET vs Win32: Casting

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

Last time we looked at the .NET and Win32 differences for untyped var and out parameters. Here we look at casting issues.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Casting

There are also some casting differences[1] between the two platforms. In Win32, hard-casts are unsafe, because the compiler will not complain if you perform obviously illegal casts. A hard-cast is a way of telling the Win32 compiler: "Relax, I know what I'm doing. Just close your eyes and reinterpret these bits as the type I'm telling you it is". So there are no checks and no conversions going on – it is just a binary reinterpretation.

In .NET even hard-casts are safe, in the sense that the compiler and runtime will check that the cast is valid. The CLR will check that the source is compatible with the target type - if not, nil is returned instead. Conceptually, .NET hard-casts like this

  Target := TTargetClass(Source);

work like this

  if Source is TTargetClass then
    Target := Source
  else
    Target := nil;

This means that a typical native Win32 pattern of

  if O is TMyObject then
    TMyObject(O).Foo;

has the same semantics in .NET, but a slightly more efficient .NET-only alternative is

  MyObject := TMyObject(O);
  if Assigned(MyObject) then
    MyObject.Foo;

As you know, casting between TObject and value types in .NET performs boxing and unboxing operations (see Chapter 5 for the details). In Win32 you can only cast value types of size 4 bytes or less to and from TObject – and the cast will only store the binary bits of the value type in the reference storage itself.
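The Win32 behaviour can be sketched as follows. This assumes a 32-bit Win32 target, where Integer and an object reference are both 4 bytes. The program name is hypothetical, and the reference must never be dereferenced because it does not point to a real object.

```pascal
program CastBitsDemo;

{$APPTYPE CONSOLE}

var
  O: TObject;
  I: Integer;
begin
  I := 42;
  // No boxing on Win32: the reference variable itself now holds
  // the bits of the integer. O does NOT point to a real object.
  O := TObject(I);
  // Casting back simply reinterprets the same 4 bytes again
  Writeln(Integer(O));
  // Never call methods on O here, e.g. O.ClassName would crash
end.
```

This is the same trick that old Win32 code uses to store small values in the Objects[] slot of a TStrings or in a TList.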

unit CastingDifferencesU;

interface

type
  TMyObject = class
    procedure Foo;
  end;

procedure Test;

implementation

{ TMyObject }

procedure TMyObject.Foo;
begin
  writeln('TMyObject.Foo');
end;

procedure Test;
var
  O: TObject;
  {$IFDEF CLR}
  MyObject: TMyObject;
  {$ENDIF}
begin
  // Both Win32 and .NET support is-checks, as-casts and hard-casts
  O := TMyObject.Create;
  if O is TMyObject then
    TMyObject(O).Foo;
  {$IFDEF CLR}
  // In .NET a failing hard-cast returns nil -
  // in Win32 the result is undefined (you typically break the type system)
  MyObject := TMyObject(O);
  if Assigned(MyObject) then
    MyObject.Foo;
  {$ENDIF}
end;

end.




[1] For more details about how casting works in Win32 and .NET see my blog posts at http://hallvards.blogspot.com/"

Wednesday, October 17, 2007

More fun with Enumerators

As part of the new language syntax inherited from Delphi.NET, native Delphi now (since Delphi 2005) supports for-in loops (known as foreach in C#). The new syntax is easy to read, and it reduces the clutter of maintaining a loop index variable, checking boundary conditions (typically 0 and Count-1) and indexing into the array or list.  

While Delphi has special built-in support for for-in for (got it? ;) arrays, strings, and set types, the RTL and VCL also implement support for for-in by implementing an enumerator pattern. You can do the same with your own collection classes. There are a number of ways to implement such enumerators. We will look at a couple of variants, studying the code that the compiler generates for them. We will mainly focus on getting as efficient code as possible.

The Enumerator Pattern

Primoz Gabrijelcic has already posted several excellent blog posts about how to write enumerators in Delphi - please read them first if you haven't already.

Basically, when writing support for the for-in loop in one of your own classes or records (you can have records with methods these days, you know) you need to provide a single function called GetEnumerator. This function needs to return an instance of a class or a record that needs to have a public MoveNext function and a public Current property.

Here is a complete example that closely mimics the way TList implements its enumerator.

type
  TMyObject = class(TObject)
    procedure Foo;
  end;
  TMyList = class;
  TMyListEnumerator = class
  private
    FIndex: Integer;
    FMyList: TMyList;
  public
    constructor Create(AMyList: TMyList);
    function MoveNext: Boolean;
    function GetCurrent: TMyObject;
    property Current: TMyObject read GetCurrent;
  end;
  TMyList = class(TList)
  public
    procedure Add(AMyObject: TMyObject);
    function GetEnumerator: TMyListEnumerator;
  end;

implementation

{ TMyObject }

procedure TMyObject.Foo;
begin
end;

{ TMyList }

procedure TMyList.Add(AMyObject: TMyObject);
begin
  inherited Add(AMyObject);
end;

function TMyList.GetEnumerator: TMyListEnumerator;
begin
  Result := TMyListEnumerator.Create(Self);
end;

{ TMyListEnumerator }

constructor TMyListEnumerator.Create(AMyList: TMyList);
begin
  inherited Create;
  FIndex := -1;
  FMyList := AMyList;
end;

function TMyListEnumerator.GetCurrent: TMyObject;
begin
  Result := FMyList.List[FIndex];
end;

function TMyListEnumerator.MoveNext: Boolean;
begin
  Result := FIndex < FMyList.Count - 1;
  if Result then
    Inc(FIndex);
end;

With this code in place we can now create instances of TMyList and run the shiny new for-in loops on them.

procedure Test;
var
  MyList: TMyList;
  MyObject: TMyObject;
begin
  MyList := TMyList.Create;
  for MyObject in MyList do
    MyObject.Foo;
end;

The Generated Code


This is all pretty straightforward code, but let's go one level deeper and look at the assembly code that the compiler generates from it. This is as easy as setting a breakpoint on the for-in loop, hitting run (F5) and then pressing Ctrl+Alt+C to open the CPU window. The code we see looks like this (from Delphi 2007).

for MyObject in MyList do
call TMyList.GetEnumerator
mov [ebp-$04],eax
xor eax,eax
push ebp
push $0040ff8c
push dword ptr fs:[eax]
mov fs:[eax],esp
jmp $0040ff62
mov eax,[ebp-$04]
call TMyListEnumerator.GetCurrent
mov ebx,eax
MyObject.Foo;
mov eax,ebx
call TMyObject.Foo
for MyObject in MyList do
mov eax,[ebp-$04]
call TMyListEnumerator.MoveNext
test al,al
jnz $0040ff51
xor eax,eax
pop edx
pop ecx
pop ecx
mov fs:[eax],edx
push $0040ff93
MyObject.Foo;
cmp dword ptr [ebp-$04],$00
jz $0040ff8b
mov dl,$01
mov eax,[ebp-$04]
call TObject.Destroy
ret
jmp @HandleFinally
jmp $0040ff7b

Wow! That sure was a lot of machine code from two innocent looking lines of Pascal!


Without digging into the semantics of each assembly instruction, we can quickly see that there are calls to four methods (GetEnumerator, GetCurrent, Foo and MoveNext) and one destructor (TObject.Destroy). There are also some magic gyrations involving FS:[xx] segment overrides and a jump instruction to HandleFinally - this is the hallmark of a try-finally implementation.


If we try and expand the for-in loop into Pascal code, it would look something like this.

procedure TestImpl;
var
  MyList: TMyList;
  MyObject: TMyObject;
  MyListEnumerator: TMyListEnumerator;
begin
  MyList := TMyList.Create;
  MyListEnumerator := MyList.GetEnumerator;
  try
    while MyListEnumerator.MoveNext do
    begin
      MyObject := MyListEnumerator.Current;
      MyObject.Foo;
    end;
  finally
    if Assigned(MyListEnumerator) then
      MyListEnumerator.Destroy;
  end;
end;

Is all of this overhead really necessary? Not really. While all of the enumerators in the Delphi RTL and VCL (and even in Primoz' sample code) are implemented as classes (forcing the compiler to implicitly free the enumerator instance after the loop), there is no rule that says that all enumerators must be implemented by classes.


Enumerator Records


No, light-weight records with methods can implement enumerators, too - and they are allocated directly on the stack and have no need for a destructor call. So changing the enumerator from a class to a record will make the code smaller and faster. Most enumerators can favorably be implemented as records. In fact, all the existing enumerators in the Delphi RTL and VCL could easily be re-implemented as records - producing smaller and faster code for all for-in loops that use them. I can see only two reasons to use classes for enumerators: if the class needs to free something in an overridden destructor, or if the class needs to inherit functionality from another class. In all other cases, enumerators should be implemented as records.


Another observation is that the implementations for GetCurrent and MoveNext (and even GetEnumerator) are typically very short and thus perfect candidates for inlining. The two first methods will be called once for every item iterated over in the collection, so it makes sense to inline them at the call site.


So, let's change the sample code above with these two optimizations. To better see the effect on the generated code, we'll make one change at a time. First we replace the class with a record - yielding the following simple changes in the code.

type
  TMyListEnumerator = record
  private
    FIndex: Integer;
    FMyList: TMyList;
  public
    constructor Create(AMyList: TMyList);
    function MoveNext: Boolean;
    function GetCurrent: TMyObject;
    property Current: TMyObject read GetCurrent;
  end;

constructor TMyListEnumerator.Create(AMyList: TMyList);
begin
  // inherited Create;
  FIndex := -1;
  FMyList := AMyList;
end;

The only changes we made were to change "class" to "record" and comment out the call to the inherited constructor (as records do not inherit anything - and cannot have parameterless constructors). Without the need to free the enumerator instance, the code the compiler generates for the for-in loop is now much simpler.

  for MyObject in MyList do
mov edx,esp
mov eax,ebx
call TMyList.GetEnumerator
jmp $0041010d
mov eax,esp
call TMyListEnumerator.GetCurrent
mov ebx,eax
MyObject.Foo;
mov eax,ebx
call TMyObject.Foo
for MyObject in MyList do
mov eax,esp
call TMyListEnumerator.MoveNext
test al,al
jnz $004100fd

We can still see the four method calls to GetEnumerator, GetCurrent, Foo and MoveNext, but the heavy try-finally code and the Destroy call are gone. If we try to write this in Pascal again, it would be something like this.

procedure TestImpl;
var
  MyList: TMyList;
  MyObject: TMyObject;
  MyListEnumerator: TMyListEnumerator;
begin
  MyList := TMyList.Create;
  MyListEnumerator := MyList.GetEnumerator;
  while MyListEnumerator.MoveNext do
  begin
    MyObject := MyListEnumerator.Current;
    MyObject.Foo;
  end;
end;

That is much better, don't you think!? ;)
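
For readers who want to try the record-based enumerator on its own, here is a minimal self-contained sketch. The enumerator record follows the declaration above; everything else is an assumption made for illustration: the FItems dynamic array, the Add method, the FId field and the body of Foo are hypothetical stand-ins and not the original sample code.

```pascal
program RecordEnumeratorSketch;

{$APPTYPE CONSOLE}

type
  TMyObject = class
  private
    FId: Integer; // hypothetical field, only used to make the output readable
  public
    constructor Create(AId: Integer);
    procedure Foo;
  end;

  TMyList = class; // forward declaration: the record below refers to the list

  // Enumerator as a record: no heap allocation and no destructor call
  TMyListEnumerator = record
  private
    FIndex: Integer;
    FMyList: TMyList;
  public
    constructor Create(AMyList: TMyList);
    function MoveNext: Boolean;
    function GetCurrent: TMyObject;
    property Current: TMyObject read GetCurrent;
  end;

  TMyList = class
  private
    FItems: array of TMyObject; // hypothetical storage, owned by the list
  public
    destructor Destroy; override;
    procedure Add(AItem: TMyObject);
    function GetEnumerator: TMyListEnumerator;
  end;

constructor TMyObject.Create(AId: Integer);
begin
  inherited Create;
  FId := AId;
end;

procedure TMyObject.Foo;
begin
  Writeln('Foo called on object ', FId);
end;

constructor TMyListEnumerator.Create(AMyList: TMyList);
begin
  FIndex := -1; // positioned before the first item
  FMyList := AMyList;
end;

function TMyListEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < Length(FMyList.FItems);
end;

function TMyListEnumerator.GetCurrent: TMyObject;
begin
  Result := FMyList.FItems[FIndex];
end;

destructor TMyList.Destroy;
var
  I: Integer;
begin
  for I := 0 to High(FItems) do
    FItems[I].Free;
  inherited Destroy;
end;

procedure TMyList.Add(AItem: TMyObject);
begin
  SetLength(FItems, Length(FItems) + 1);
  FItems[High(FItems)] := AItem;
end;

function TMyList.GetEnumerator: TMyListEnumerator;
begin
  Result := TMyListEnumerator.Create(Self);
end;

var
  MyList: TMyList;
  MyObject: TMyObject;
begin
  MyList := TMyList.Create;
  try
    MyList.Add(TMyObject.Create(1));
    MyList.Add(TMyObject.Create(2));
    // The compiler expands this into GetEnumerator/MoveNext/Current calls
    for MyObject in MyList do
      MyObject.Foo;
  finally
    MyList.Free;
  end;
end.
```

The forward declaration of TMyList is what lets the record and the class refer to each other within a single type section. Setting a break-point on the for-in line and opening the CPU view should show whether your compiler version generates the simpler code discussed above.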


Inlining the Loop Calls


While the class-to-record optimization gave some nice code size savings, for long running loops it doesn't really affect the overall running time, because it only affects what happens before and after the loop.


The next step is to turn on inlining of the small and simple enumerator functions. This is as easy as adding the inline directive to the declaration of the methods, like this.

  TMyListEnumerator = record
  private
    FIndex: Integer;
    FMyList: TMyList;
  public
    constructor Create(AMyList: TMyList);
    function MoveNext: Boolean; inline;
    function GetCurrent: TMyObject; inline;
    property Current: TMyObject read GetCurrent;
  end;

Here we have just added "inline;" to the MoveNext and GetCurrent methods. Recompiling, running and hitting the break-point again, we can inspect the assembly code once again.

  for MyObject in MyList do
mov edx,esi
mov eax,ebx
call TMyList.GetEnumerator
jmp $00410222
mov eax,[esi+$04]
mov eax,[eax+$04]
mov edx,[esi]
mov ebx,[eax+edx*4]
MyObject.Foo;
mov eax,ebx
call TMyObject.Foo
for MyObject in MyList do
mov eax,esi
call TMyListEnumerator.MoveNext
test al,al
jnz $00410210

Notice that the GetCurrent call is now gone - instead the assembly instructions for its implementation have been merged into our loop. This is good as excessive branching inside a loop brings down performance. But notice that the MoveNext call is still there. For some reason the compiler is not heeding our inline directive for the MoveNext function.


While Expressions Not Inlined


As discussed in my DN4DP piece on Delphi inlining, the inline directive is just a hint that the compiler should try to inline the call if possible - there are a number of documented and undocumented cases where it will not be inlined. It looks like we have stumbled onto one of the undocumented cases here - the MoveNext function of an enumerator will not currently (D2007) be inlined in the code that the compiler generates for a for-in loop.


I've done some more testing and digging, and it seems like no functions are inlined if they appear inside a while loop control expression. Given some mock-up test code:

procedure Foo;
begin
end;

function Inlined(var I: integer): boolean; inline;
begin
  Dec(I);
  Result := I <> 0;
end;

We can now use this inlined function in a while loop.

procedure Test1;
var
  I: integer;
begin
  I := 100;
  while Inlined(I) do
    Foo;
end;

If the inlining worked we should see no call to the Inlined function in the generated assembly code.

  I := 100;
mov [esp],$00000064
jmp $0040fdc7
Foo;
call Foo
while Inlined(I) do
mov eax,esp
call Inlined
test al,al
jnz $0040fdc2

Alas, there is clearly a "call Inlined" in there :-(. 


Transforming a While Loop


Let's experiment a little. Testing shows that the Inlined routine is properly inlined for a simple if-test. It is possible to rewrite a while loop with an expression into a while-true loop with an if-statement on the negated expression and a Break. We can rewrite the non-inlining while loop in Test1 with a semantically equivalent loop.

procedure Test2;
var
  I: integer;
begin
  I := 100;
  while True do
  begin
    if not Inlined(I) then
      Break;
    Foo;
  end;
end;

Looking at the generated code again (we're really getting the hang of this, right?;)), we can see that the inlining does work just fine in this loop.

  I := 100;
mov [esp],$00000064
if not Inlined(I) then
dec dword ptr [esp]
cmp dword ptr [esp],$00
setnz bl
test bl,bl
jz $0040fdf2
Foo;
call Foo
while True do
jmp $0040fddd

This shows that the compiler is able to inline loop control like this - it just needs a little assistance ;).
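
Until the compiler does this by itself, the same transformation can be applied by hand to an enumerator loop. The sketch below is only an illustration under stated assumptions: TRangeEnumerator and Test3 are hypothetical names that do not appear in the sample project, and whether MoveNext really ends up inlined should be checked in the CPU view for your compiler version. The Test2 result above merely suggests that it should be.

```pascal
program ManualEnumLoopSketch;

{$APPTYPE CONSOLE}

type
  // Hypothetical record enumerator over the integers AFirst..ALast
  TRangeEnumerator = record
  private
    FCurrent: Integer;
    FLast: Integer;
  public
    constructor Create(AFirst, ALast: Integer);
    function MoveNext: Boolean; inline;
    function GetCurrent: Integer; inline;
    property Current: Integer read GetCurrent;
  end;

constructor TRangeEnumerator.Create(AFirst, ALast: Integer);
begin
  FCurrent := AFirst - 1; // positioned before the first value
  FLast := ALast;
end;

// The inlined methods are implemented before their first use below
function TRangeEnumerator.MoveNext: Boolean;
begin
  Inc(FCurrent);
  Result := FCurrent <= FLast;
end;

function TRangeEnumerator.GetCurrent: Integer;
begin
  Result := FCurrent;
end;

procedure Test3;
var
  Enumerator: TRangeEnumerator;
  Sum: Integer;
begin
  Sum := 0;
  Enumerator := TRangeEnumerator.Create(1, 10);
  // Hand-written expansion of a for-in loop, using the while-true
  // plus if-not-Break shape from Test2 instead of "while MoveNext do"
  while True do
  begin
    if not Enumerator.MoveNext then
      Break;
    Inc(Sum, Enumerator.Current);
  end;
  Writeln('Sum = ', Sum); // 1 + 2 + ... + 10 = 55
end;

begin
  Test3;
end.
```

This gives up the compact for-in syntax, so it is probably only worth doing in loops where profiling shows that the MoveNext call actually matters.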


Quality Central Reports


CodeGear could fix this in two ways:



  • Fix the compiler so that while-loop expressions can be inlined (this is the best solution). This should then automatically also inline the MoveNext call that the compiler generates for for-in loops.

  • If this is hard or impossible for some reason, at least the for-in loop could be changed to generate a while-true loop with if not MoveNext then Break; logic. This would ensure that for-in loops can become more efficient than today.

While they are at it, they could also:



  • Refactor all existing RTL and VCL enumerators from class to record

  • Inline all GetCurrent and MoveNext functions in all enumerators

Yes, I know I should probably log these issues in Quality Central. And I will - eventually ;).


Update: I have now taken the time to report these issues and suggestions in Quality Central. Note: I think the first QC report ended up as a Beta report and may thus not be publicly viewable - but it seems CodeGear have noticed anyway ;).


QC#53623 - Function calls inside while expressions are not inlined


QC#53737 - For-in codegen: Enumerator MoveNext calls are not inlined


QC#53738 - Change all RTL and VCL enumerators from class to record


QC#53739 - Inline GetCurrent and MoveNext in all RTL and VCL enumerators

Sunday, October 07, 2007

DN4DP#24: .NET vs Win32: Untyped parameters

This post continues the series of The Delphi Language Chapter teasers from Jon Shemitz’ .NET 2.0 for Delphi Programmers book.

The previous post listed the Win32 specific language and RTL features. The next few posts will focus on minor differences in implementation between Win32 and .NET - starting with differences in the detailed semantics of untyped var and out parameters.

Note that I do not get any royalties from the book and I highly recommend that you get your own copy – for instance at Amazon.

"Win32 and .NET differences

In addition to the new language features and the obsolete features we have already mentioned, there are some implementation details that are different between the Win32 and .NET versions of the Delphi language. These differences are minor, but it is useful to know about them.

Untyped var and out parameters

It is interesting to note that Delphi supports type-less var and out parameters in a strictly typed and managed environment like .NET.

var
GlobalInt: integer;

procedure FooVar(var Bar);
var
BarValue: integer;
begin
BarValue := Integer(Bar);
Inc(BarValue);
Bar := BarValue;
end;

procedure TestFooVar;
begin
GlobalInt := 1;
FooVar(GlobalInt);
end;

The implementation relies on boxing the actual argument to and from System.Object before and after the method call. The compiler compiles the code above like this:

procedure FooVarImpl(var Bar: TObject);
var
BarValue: integer;
begin
BarValue := Integer(Bar); // Unbox
Inc(BarValue);
Bar := TObject(BarValue); // Autobox
end;

procedure TestFooVarImpl;
var
Temp: TObject;
begin
GlobalInt := 1;
Temp := TObject(GlobalInt); // Autobox
FooVarImpl(Temp);
GlobalInt := Integer(Temp); // Unbox - after the routine returns
end;

Because of this implementation, you will not see intermediate modifications of the actual argument until the call returns. The compiler allows direct assignments to the untyped parameter in .NET (as if $AUTOBOX is turned ON just for that TObject parameter) – this is not allowed in Win32. In addition, left-hand-side casts of an untyped parameter are not allowed in .NET – only in Win32. This can make it hard to write single-source routines using untyped var and out parameters without resorting to IFDEFs. The UntypedParameters project demonstrates these differences.

unit UntypedParametersU;

interface

procedure Test;

implementation

{$DEFINE CHECK_SIDEFFECTS}

type
TMyObject = class
end;

var
GlobalInt: integer;
GlobalRef: TMyObject;

procedure FooVar(var Bar);
var
BarValue: integer;
begin
BarValue := Integer(Bar);
{$IFDEF CHECK_SIDEFFECTS}
writeln('1. FooVar Bar = ', BarValue);
writeln('1. FooVar GlobalInt = ', GlobalInt);
{$ENDIF}
Inc(BarValue);
{$IFDEF CIL}
Bar := BarValue;
{$ELSE}
Integer(Bar) := BarValue;
{$ENDIF}
{$IFDEF CHECK_SIDEFFECTS}
writeln('2. FooVar Bar = ', Integer(Bar));
writeln('2. FooVar GlobalInt = ', GlobalInt);
if GlobalInt = Integer(Bar)
then writeln('Win32 untyped var semantics')
else writeln('.NET untyped var semantics');
{$ENDIF}
end;

procedure FooVarImpl(var Bar: TObject);
var
BarValue: integer;
begin
BarValue := Integer(Bar);
Inc(BarValue);
Bar := TObject(BarValue);
end;

procedure FooOut(out Bar);
begin
{$IFDEF CIL}
Bar := Integer(42);
{$ELSE}
Integer(Bar) := 42;
{$ENDIF}
{$IFDEF CHECK_SIDEFFECTS}
writeln('1. FooOut Bar = ', Integer(Bar));
writeln('1. FooOut GlobalInt = ', GlobalInt);
if GlobalInt = Integer(Bar)
then writeln('Win32 untyped out semantics')
else writeln('.NET untyped out semantics');
{$ENDIF}
end;

procedure FooConst(const Bar);
var
Temp: integer;
begin
writeln('1. FooConst Bar = ', Integer(Bar));
writeln('1. FooConst GlobalInt = ', GlobalInt);
Temp := Integer(Bar);
writeln('1. FooConst Temp = ', Temp);
end;

procedure NilRef(var Obj);
begin
{$IFDEF CIL}
Obj := nil;
{$ELSE}
TObject(Obj) := nil;
{$ENDIF}
{$IFDEF CHECK_SIDEFFECTS}
if GlobalRef = nil
then writeln('Win32 untyped var semantics')
else writeln('.NET untyped var semantics');
{$ENDIF}
end;

procedure NilRefImpl(var Obj: TObject);
begin
Obj := nil;
{$IFDEF CHECK_SIDEFFECTS}
if GlobalRef = nil
then writeln('Win32 untyped var semantics')
else writeln('.NET untyped var semantics');
{$ENDIF}
end;

procedure TestNilRef;
begin
GlobalRef := TMyObject.Create;
NilRef(GlobalRef);
end;

procedure TestNilRefImpl;
var
Temp: TObject;
begin
GlobalRef := TMyObject.Create;
Temp := GlobalRef;
NilRefImpl(Temp);
GlobalRef := TMyObject(Temp);
end;

procedure TestFooVar;
begin
GlobalInt := 1;
FooVar(GlobalInt);
end;

procedure TestFooVarImpl;
var
Temp: TObject;
begin
GlobalInt := 1;
Temp := TObject(GlobalInt);
FooVarImpl(Temp);
GlobalInt := Integer(Temp);
end;

procedure Test;
begin
TestNilRef;
{$IFDEF CIL}
TestNilRefImpl;
{$ENDIF}
TestFooVar;
{$IFDEF CIL}
TestFooVarImpl;
{$ENDIF}
Writeln('2. GlobalInt = ', GlobalInt);
FooOut(GlobalInt);
Writeln('3. GlobalInt = ', GlobalInt);
FooConst(GlobalInt);
end;

end.

"



Copyright © 2004-2007 by Hallvard Vassbotn