Tuesday, March 11, 2008

TDM#8: DelayLoading Of DLLs

"I don’t miss many features from Microsoft’s Visual C++ 6.0 when working in Delphi, but the new /DELAYLOAD option of the linker is one of them. This option lets you turn normal, implicit DLL import libraries into so-called delayload import libraries. This means that the DLL will not be loaded by the operating system (OS) during start-up of the EXE file, but rather on an as-needed basis when you actually call the routines. The first time a specific DLL routine is called, the DLL is loaded with LoadLibrary and the routine address is retrieved with GetProcAddress. This is accomplished simply by turning on the /DELAYLOAD option of the linker, specifying what DLLs you want to be delayloaded. In this article, we will show how we can implement a framework for easily accomplishing similar behavior in our Delphi applications."

H.Vassbotn, The Delphi Magazine, March 1999
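
For comparison, here is a minimal sketch of what delay loading amounts to when done by hand for a single routine: the wrapper calls LoadLibrary and GetProcAddress the first time it is used and caches the result. The names LazyMessageBeep, RealMessageBeep and User32Handle are made up for this illustration and are not part of the article's HVDll unit; user32.dll and MessageBeep simply serve as a DLL routine that exists on every Windows system.

```pascal
program LazyLoadSketch;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

type
  TMessageBeep = function(uType: UINT): BOOL; stdcall;

var
  User32Handle: HMODULE = 0;            // hypothetical name, illustration only
  RealMessageBeep: TMessageBeep = nil;  // cached routine address

// Wrapper that resolves the DLL routine the first time it is called.
function LazyMessageBeep(uType: UINT): BOOL;
begin
  if not Assigned(RealMessageBeep) then
  begin
    User32Handle := LoadLibrary('user32.dll');
    if User32Handle = 0 then
      RaiseLastOSError;
    @RealMessageBeep := GetProcAddress(User32Handle, 'MessageBeep');
    if not Assigned(RealMessageBeep) then
      RaiseLastOSError;
  end;
  Result := RealMessageBeep(uType);
end;

begin
  LazyMessageBeep(MB_OK);  // the DLL is loaded and the routine bound here
  LazyMessageBeep(MB_OK);  // later calls reuse the cached address
  if User32Handle <> 0 then
    FreeLibrary(User32Handle);
end.
```

Writing such a wrapper for every imported routine quickly becomes tedious, which is the motivation for a reusable framework.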

This is one of the TDM articles I'm most satisfied with. It covers a fairly useful subject and technique and it contains some tricky, hacky code that is challenging to explain and get your head around. It is a neat hack that feels a little bit like magic ;).

When writing the article I was inspired by MSJ articles about DELAYLOAD by Matt Pietrek and Jeffrey Richter. In addition I looked at similar code for 16-bit Delphi written by Peter Sawatzki.

A couple more excerpts from the article and code:

"Effortless Explicit Loading

The goal we should set is to write a support unit that will enable us to do dynamic explicit linking just as easily as implicit linking. For the solution that we will go through, we will actually achieve all the stated benefits while being able to write simple import units like the one [below]."

unit DynLinkTest;

interface

var
  Routine1 : procedure (A, B, C, D: integer); register;
  Routine2 : procedure (A, B, C, D: integer); pascal;
  Routine3 : procedure (A, B, C, D: integer); cdecl;
  Routine4 : procedure (A, B, C, D: integer); stdcall;

implementation

uses HVDll;

var
  TestDll: TDll;
  Entries : array[1..4] of HVDll.TEntry =
    ((Proc: @@Routine1; Name: 'Routine1'),
     (Proc: @@Routine2; Name: 'Routine2'),
     (Proc: @@Routine3; ID : 3),
     (Proc: @@Routine4; ID : 4));

initialization
  TestDll := TDll.Create('Testdll.dll', Entries);
end.

"Generating Code on the Fly

To allow us to use the procedural variables without more code than we saw in Listing 5, we must somehow dynamically compile code thunks similar to the ones we saw back in Listing 3. This requires acting like a mini-compiler and generating executable code on the fly. To complicate matters, we have to preserve the stack-layout and the contents of parameter passing registers (EAX, EDX and ECX). A single set of code must handle all cases of calling conventions and parameters.

As the first step, the CreateThunks method is responsible for dynamically creating these code thunks. Essentially, it allocates a block of memory and then fills it with CPU instruction op-codes, see [the code below]."

procedure TDll.CreateThunks;
const
  CallInstruction = $E8;
  PushInstruction = $68;
  JumpInstruction = $E9;
var
  i : integer;
begin
  Dlls.CodeHeap.GetMem(FThunkingCode, SizeOf(TThunkHeader) +
    SizeOf(TThunk) * Count);
  with FThunkingCode^, ThunkHeader do
  begin
    PUSH := PushInstruction;
    VALUE := Self;
    JMP := JumpInstruction;
    OFFSET := PChar(@ThunkingTarget) - PChar(@Thunks[0]);
    for i := 0 to Count-1 do
      with Thunks[i] do
      begin
        CALL := CallInstruction;
        OFFSET := PChar(@ThunkHeader) - PChar(@Thunks[i+1]);
      end;
  end;
end;
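
The offset arithmetic in CreateThunks follows from how the x86 CALL ($E8) and JMP ($E9) instructions encode their targets: the 32-bit operand is relative to the address of the next instruction. The sketch below shows that calculation in isolation. The record layouts are an assumption inferred from the field names used in the listing, not copied from the HVDll source, and the program only computes the offsets; it never executes the generated bytes. It assumes a 32-bit target, where Pointer is four bytes, and uses PAnsiChar so that pointer subtraction counts bytes in both older and Unicode-enabled Delphi versions.

```pascal
program ThunkLayoutSketch;
{$APPTYPE CONSOLE}

type
  // Assumed layouts, inferred from the fields assigned in CreateThunks.
  TThunkHeader = packed record
    PUSH  : Byte;     // $68 = PUSH imm32
    VALUE : Pointer;  // the TDll instance (Self)
    JMP   : Byte;     // $E9 = JMP rel32
    OFFSET: Integer;  // distance to ThunkingTarget
  end;
  TThunk = packed record
    CALL  : Byte;     // $E8 = CALL rel32
    OFFSET: Integer;  // distance back to the header
  end;
  TThunkingCode = packed record
    ThunkHeader: TThunkHeader;
    Thunks: array[0..2] of TThunk;  // three entries, for illustration
  end;

var
  Code: TThunkingCode;
  NextInstr: PAnsiChar;
  i: Integer;
begin
  Writeln('SizeOf(TThunkHeader) = ', SizeOf(TThunkHeader));
  Writeln('SizeOf(TThunk)       = ', SizeOf(TThunk));
  for i := Low(Code.Thunks) to High(Code.Thunks) do
  begin
    // A rel32 operand is measured from the end of the CALL instruction,
    // which is where the next thunk (or the end of the block) begins.
    NextInstr := PAnsiChar(@Code.Thunks[i]) + SizeOf(TThunk);
    Code.Thunks[i].CALL := $E8;
    Code.Thunks[i].OFFSET := PAnsiChar(@Code.ThunkHeader) - NextInstr;
    Writeln('Thunk ', i, ': CALL offset = ', Code.Thunks[i].OFFSET);
  end;
end.
```

With these packed layouts in a 32-bit build, the header occupies 10 bytes and each thunk 5, so every CALL offset is negative: each thunk jumps backwards to the shared per-DLL header.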

"The Inner Workings of ThunkingTarget

When one of the procedural variables is called through, control will be transferred to the corresponding thunk. From here it calls back up to the per-DLL header, which pushes the Self-pointer and jumps on to the ThunkingTarget procedure. This procedure is a bit tricky and it has to be written in assembly to allow us to save the contents of certain registers, see [the code below]."

procedure ThunkingTarget;
asm
  MOV EAX, [ESP+12] // Self
  MOV EDX, [ESP+16] // Thunk
  CALL TDll.DelayLoadFromThunk
  // "RETurn" to the DLL!
end;

"Using the Classes

We have now been through the inner workings of the HVDll unit. What might be more useful in the long run, is to know how the classes can be used for everyday work. I have included the public interface of the TDll class [below]."

TDll = class(TObject)
public
  constructor Create(const DllName: string; const Entries: array of TEntry);
  destructor Destroy; override;
  procedure Load;
  procedure Unload;
  function HasRoutine(Proc: PPointer): boolean;
  function HookRoutine(Proc: PPointer; HookProc: Pointer; var OrgProc): boolean;
  function UnHookRoutine(Proc: PPointer; var OrgProc): boolean;
  property FullPath: string read FFullPath write SetFullPath;
  property Handle: HMODULE read GetHandle;
  property Loaded: boolean read GetLoaded;
  property Available: boolean read GetAvailable;
  property Count: integer read FCount;
  property EntryName[Index: integer]: string read GetEntryName;
end;

In addition to the main article, there are also a number of sidebars of varying length, relevance and interest:

  • Proper [run-time] Code Generation

  • The Case of the Broken Breakpoints

  • Calling Performance and Package Overhead

  • Gotcha! Using SizeOf in BASM

We might revisit a couple of these in the blog later.

Go ahead to download and read the full article (PDF) and code.


David Heffernan said...


My app uses this code - it's just fantastic. One of the great features about it is that you can change the DLL name at runtime based on any criteria you like, and then bind the functions.

I just wanted to take time to write this post and say thank you for giving us all this wonderful tool.

David Heffernan

Anonymous said...

I have not looked at the code in depth, but I was wondering if you have a variation of the code that works for delay loading of BPL packages. I am constructing a plug-in application framework, and would like to be able to delay load packages.

Hallvards New Blog said...

David: Thanks!

You're actually the first I've heard back from about using the HVDLL code. If anyone else wants to come forward, it would be nice! ;)

No, this code is not really suited to loading BPLs. Packages should be loaded using LoadPackage. You can find info and code for plugin architectures with BPLs by searching:

Anonymous said...

BCB6 and BCB2007 have an option for "DelayLoading Of DLLs":

Project->Options->Advanced Linker->Dlls to delay-load

Jon A said...


You might wonder if this information has become obsolete. It has not! I am writing some code where final executable size is critical. While I could use Delphi 2010 (and later) for the 'delayed' directive, executables produced by later versions of Delphi are extremely bloated compared to what is possible with earlier (<2009) versions. I have implemented your code/approach in my D2007 app and could not be happier!

Hallvards New Blog said...

Hi Jon,

Thanks for the feedback - good to know it is still of use to someone :)

I know the code isn't ready for 64-bit yet, but I think it should be fairly easy to port the assembly code.


Copyright © 2004-2007 by Hallvard Vassbotn