Sunday, March 02, 2008

TDM#6: Knitting Your Own Threads

One of the key reasons that computers have conquered the world is that they have been following Moore's Law with faster, smaller and cheaper CPUs (and similar "laws" and improvements of memory, hard disks, graphics cards, etc) coming out every year.

Until recently, all programs have just become faster and faster due to improved hardware. This has been dubbed "the free lunch" and has given sloppy programmers the "hardware-will-catch-up" excuse to write slow and bloated software.

Well, no more (this DDJ piece by Herb Sutter is recommended reading). The speed of normal run-of-the-mill single-threaded code are not getting any speedups any more. The main reason for this is that CPU clock speeds have hit the ~3 Ghz speed limit. On the other hand, CPUs are getting better at doing things in parallel - by packing multiple cores in the same die, multiple threads of execution can run at the same time. To exploit this parallelism, programs have to somehow execute multiple threads - so called multithreaded programming.

Even in the classical case of a single CPU core, there can often be substantial benefits of dividing the work of a program into multiple threads. You can improve user interface responsiveness by off-loading some of the heavy lifting to a background thread, for instance, while the main thread is still happily updating the GUI.

Delphi has always (well, since Delphi 2) anyway had some level of support for writing multithreaded applications. There is the TThread class that encapsulates the concept of a separate thread of execution and there are wrappers for critical sections, mutexes, semaphores and events. Allen Bauer is doing an admirable job of thinking, designing, (re-)implementing and writing about an upcoming Delphi Parallel Library (DPL) in his blog.

Still, one of the major weaknesses of the current Delphi VCL thread support is that there is no (easy) way for the main thread to wait for a signal in a non-polling, non-blocking fashion. Another related issue is that there is no clean way for threads to communicate between themselves or the main thread. For performance and blocking reasons, the dreaded TThread.Synchronize method should be avoided.

These are the main points I raised in The Delphi Magazine article Knitting Your Own Threads, published in December 1998 (time sure is flying, that is almost 10 years ago now! :)). It also goes into some detail the various threading support in the compiler, RTL and VCL (we are still in the Delphi 4 era here). It talks about and digs into the implementation and shortcomings of:

  • threadvar
  • MainThreadID
  • IsMultiThread
  • BeginThread / EndThread
  • TThreadFunc
  • TThread (Execute, WaitFor, Synchronize)
  • TThreadList
  • TEvent
  • TCriticalSection

Then it goes on to try and fill in some of the shortcomings, giving implementations and wrappers for (some of these have been covered in later Delphi releases):

  • TSynchroObject
  • THandleObject
  • TNamedObject
  • TMutex
  • TSemaphore
  • TWaitableThreadList
  • TMultiThreadedMainLoop
  • MsgWaitForMultipleObjects (hooking Applicaton.OnIdle)
  • TSignalList

A few excerpts of the article and code:

"[The TWaitableThreadList] class can be used to administer a background working thread. The thread class will keep two TWaitableThreadLists, one InBox for work to be done and one OutBox for worked finished by the thread ready to be picked up by the main thread (or even another work thread for further processing),see example work flow in Figure 2. "

constructor TSignalList.Create;
begin
inherited Create;
FList := TList.Create;
// See MsgWaitForMultipleObjects in help for list of possible values
FMsgWakeupMask := QS_AllInput;
end;

destructor TSignalList.Destroy;
begin
FreeOwningTList(FList);
inherited Destroy;
end;

procedure TSignalList.AddSignal(aSignal: TCustomSignal);
begin
// Check that we are not passing any limits (currently 64!)
if FList.Count >= MAXIMUM_WAIT_OBJECTS then
raise Exception.Create('Too many wait-objects!');

// Update the low-level array with this new handle
FObjs[FList.Count] := aSignal.Handle;

// Add the thread event to the list
FList.Add(aSignal);
end;

procedure TSignalList.TriggeredIndex(Index: integer);
begin
// Use assertions to guarantee correct code while debugging and fast release code
Assert((Index >= 0) and (Index < FList.Count));
Assert(TObject(FList[Index]) is TCustomSignal);
Assert(FObjs[Index] = TCustomSignal(FList[Index]).Handle);

// Get the Signal associated with this index and trigger the event
TCustomSignal(FList.List^[Index]).Trigger;
end;

function TSignalList.WaitOne(WaitTime: DWORD; var Index: integer): TWaitResult;
// We use the blocking function MsgWaitForMultipleObjects to wait for any
// message in the message queue or any signaled object from any of the
// other running threads in this process. See WINAPI32.HLP for details.
var
WaitResult: DWORD;
begin
// This call will block and use 0% CPU time until:
// - A message arrives in the message queue, or
// - Any of the object handles in the Objs array become signaled
if IgnoreMessages
then WaitResult := WaitForMultipleObjects(FList.Count, @FObjs, WaitForAll, WaitTime)
else WaitResult := MsgWaitForMultipleObjects(FList.Count, FObjs, WaitForAll, WaitTime, MsgWakeupMask);

// Index is only valid when Result = wrSignaled
Index := WaitResult - WAIT_OBJECT_0;

// Convert from WAIT_ returncode to TWaitResult
case WaitResult of
WAIT_ABANDONED: Result := wrAbandoned;
WAIT_TIMEOUT : Result := wrTimeout;
WAIT_FAILED : Result := wrError;
else
if WaitResult = DWORD(WAIT_OBJECT_0 + FList.Count)
then Result := wrMessage
else Result := wrSignaled // WAIT_OBJECT_0 .. WAIT_OBJECT_0+(FList.Count-1)
end;
end;

function TSignalList.WaitOneAndTrigger(WaitTime: DWORD): TWaitResult;
var
Index: integer;
begin
Result := WaitOne(WaitTime, Index);
if Result = wrSignaled then
TriggeredIndex(Index);
end;

function TSignalList.WaitUntil(WaitTime: DWORD; WaitResultStop: TWaitResults): TWaitResult;
begin
repeat
Result := WaitOneAndTrigger(WaitTime);
until (Result in WaitResultStop);
end;


"Now that we have a nice encapsulation of MsgWaitForMultipleObjects to work with, lets continue with adding the improved threading capability to the main thread of the application (and thus the VCL). As we discussed, this is accomplished by hooking the OnIdle event of TApplication. See the source code of the TMultiThreadedMainLoop class in the HVMultiThreadMain unit."

procedure TMultiThreadedMainLoop.AppIdle(Sender: TObject; var Done: boolean);
// Whenever the application becomes idle, i.e. there are no messages in the
// message queue, this procedure is entered.
begin
// The default case for the old idle event
// handler should be that it is done processing
Done := true;

// Call any old idle event handler
// - this could be extended with an idle hook chain
if Assigned(FOldAppIdle) then
FOldAppIdle(Sender, Done);

// WaitUntil handles all signaled objects for the main thread
if Done then
// If the old idle event handler is done,
// wait until there is a message for us (blocking)
FSignalList.WaitUntil(INFINITE, [wrMessage])
else
// If the old idle event handler is not done yet,
// just check for signaled objects or messages (non-blocking)
FSignalList.WaitUntil(0 , [wrMessage, wrTimeOut]);

// Tell the timer-loop that we have actully been idle
FHasBeenIdle := true;

// Now return to the message loop in TApplication and
// let it have a look at the message for us
// Note that we will normally return with Done = true.
// This will call WaitMessage in TApplication.Idle, but it will not
// block because we already know there is a message in the message queue
end;

Continue to read the full article (PDF) and download the code.

No comments:



Copyright © 2004-2007 by Hallvard Vassbotn