Creating UDFs in Delphi

By: Borland Staff

Abstract: This paper will provide step by step instructions for creating a UDF in Delphi.

Problem:
Creating UDFs in Delphi

Solution:
This paper will provide step by step instructions for creating a UDF in Delphi.  It will lead
you through the process of creating a DLL in Delphi and declaring your functions to a
database so that they can be used. 
  
-------------------------------------
Create the DLL in Delphi
-------------------------------------

What is a UDF?
An InterBase UDF is a function that resides in a shared library.  Windows calls its shared
libraries DLLs (Dynamic Link Libaries).  Thus, to create a UDF to use with InterBase you
must create a DLL.  Further, InterBase is written in C so when it deals with UDFs it
requires them to behave in a certain fashion.  All functions pass their parameters on the
stack.  However, not every language pushes those parameters on the stack in the same
fashion.  InterBase requires all UDFs to adhere to the C calling convention.  Fortunately,
Delphi can create DLLs that use the C calling convention. 

How to create a DLL in Delphi
Delphi provides a wizard to get you started in creating your DLL.  You use this wizard by
bringing up the New Item dialog (menu item: File | New).  You click on the DLL icon and
it generates a new DLL project for you.  Delphi differentiates a DLL project from a normal
executable project via the keyword 'Library'.  You will notice the Library keyword at the
top of your project's file.  I will list the steps required to create the DLL first, then explain
each of them in greater detail. 

  1.Create the DLL project 
  2.Save project with desired project name 
  3.Create a new unit for your functions 
  4.Export the functions 
  5.Build the library 

I have already talked about creating a DLL project via the Delphi wizard, so I'll move on to
the next step.  Before you do anything else save your project.  I strongly recommend that
you save often when working in Delphi.   So, save your project and give it a name that
represents what the project does.  In the case of a DLL project, the DLL by default will be
named the same as the name of the project.  So... name wisely. 

Next, you should create a unit in which to develop your DLLs functions.  It is also possible
to create all your functions in the project file, but it is adviseable to create units in which
to write your functions.  Using units allows you to seperate your functions into logical
groups.   For example, FreeUDFLib is comprised of several units.  There is a unit for
string functions, as well as a unit for blob functions, just to name a few.  It makes it easier
to find and work with code that is logically separated into smaller groups (in this case
Delphi units).  It is also easier to reuse code if it is broken down into smaller units.  So...
create your unit to house your functions. 

To create your unit you again use the New Item dialog (File | New).  Select the unit icon
and it will create a new unit and add it to your project.  You should then save your project,
which will require you to give the new unit a filename.  As an side note, I suggest that
when naming projects and units that you give them descriptive names.  You don't need to
stick to the 8.3 filename convention. 

At this point you are set to start coding your functions.  The details of programming and
coding are outside the scope of this paper.  I will assume that you are successful in your
function creation and will move on. 

At this point you have a project, saved I hope, with a set of functions.  You next need to
tell Delphi that you want to export the set of functions that you have just created. 
Exporting a function allows other applications to link with your DLL and use your
exported functions.  If a function is not exported then it cannot be used by applications that
link with your DLL.  To export functions in Delphi you must use the keyword 'exports'
along with a list of functions that you are exporting.  The following Library file shows the
use of the 'exports' keyword to export three functions from the library DelphiUDF.  The
functions sysDate, sysTime, and returnFive are available for applications to use when they
link with this DLL, called DelphiUDF.dll. 

--------------------- begin source code snippet ------------------------------

library DelphiUDF; 

uses 
  SysUtils, 
  Classes, 
  brettFuncs in 'brettFuncs.pas'; 

exports 
  sysDate, 
  sysTime, 
  returnFive; 

begin 
end. 

--------------------- end source code snippet ------------------------------

So far you have created your project, coded the functions, and listed all the functions that
are to be exported from the DLL.  The only thing left is to build the DLL.  To build a
project in Delphi you just select the menu item to build your project (Project | Build
projectname).  Barring any coding errors your DLL should exist in the directory you
saved your project in. 

You are now done with the Delphi part of this exercise.  You have created a DLL that
exports your functions.  Congratulations! 

-----------------------------------------------------------
Declare the Functions to the Database
-----------------------------------------------------------

Move DLL to directory where it can be loaded 
Now that you have a working DLL you need to declare the functions to a database so that
you can use them with InterBase.  Before you create your function declaration script you
must make sure that InterBase can find the DLL.  InterBase uses the Win32 API function
LoadLibrary to load DLLs and execute functions.  LoadLibrary has a set order for finding
DLLs on a system.  Here is the search order that LoadLibrary performs when trying to
load a library. 

  1.Search in the directory from which the calling application was loaded.  In the case of
    InterBase we are talking about the ibserver.exe process.  This process is started from
    the interbase bin directory (c:Program FilesInterBase CorpInterBasebin by
    default). 
  2.Search in the current directory 
  3.Search in the Windows system directory 
        For Win95 this is the windowssystem directory 
        For NT this is the winntsystem32 directory and the winntsystem (16-bit
        system directory) 
  4.Search in the windows directory 
        For Win95 this is the windows directory 
        For NT this is the winnt directory 
  5.Search all directories specified in the PATH environment variable 

Declare UDF functions to database 
The UDF definitions are stored in an InterBase database.  This means that you have to
declare the UDFs to every database that you want to use them in.   Here is the syntax for
declaring a function to a database. 

DECLARE EXTERNAL FUNCTION name [ datatype | CSTRING ( int) 
    [, datatype | CSTRING ( int) ]] 
RETURNS { datatype [BY VALUE] | CSTRING ( int)} [FREE_IT] 
ENTRY_POINT ' entryname' 
MODULE_NAME ' modulename'; 

I have provided an example function definition: 

declare external function getSysDate 
returns CSTRING(10) FREE_IT 
entry_point "sysDate" 
module_name "DelphiUDF.dll"; 

This example does not have any input parameters, but does have one return parameter.  The
return parameter is defined as a CSTRING() which is a null-terminated string.  Consult the
InterBase Language Guide for more details on declaring functions to a database. 

------------------------------
Test the Functions
------------------------------

Perform preliminary tests to ensure the UDFs work 
Now that the functions have been built and declared to the database they should be ready to
use.  You can test your functions by using isql and a select statement involving the
functions.  So... an example query that would test the getSysDate function, which is
defined above, would be: 

select getSysDate() from rdb$database; 

This query uses the rdb$database table so that only 1 row will be returned.  This query will
execute the UDF getSysDate and display the string that it returns. 

Always test in an environment that mimics the production environment 
This is only a first pass at testing your UDFs.  Further testing should be done to ensure that
your UDF will work in a concurrent multi-user environment.   As always, I recommend
setting up your test environment to copy the environment where your database will be put
into production. 

---------------------------------------------------------------
Issues to Be Aware of When Creating UDFs
---------------------------------------------------------------

I will next discuss a few of the issues that arise when developing UDFs for InterBase.  The
issues that will be discussed should be fully understood BEFORE UDFs are deployed to a
production environment.  A lot of these issues only show up in multi-user environments,
which is why I suggest that your test environment mimic the production environment of
the database. 
  

Multithreading Issues
-------------------------------

The most common set of problems occur because InterBase v4.21 and later employ the
superserver architecture.  This architecture is a single-process, multi-threaded server. 
Because of this UDFs must be thread-safe.  I will discuss a few of the major
multi-threading issues. 

Calling other functions 
For a UDF to be thread-safe its code must be thread-safe as well as all functions that it
uses.   The UDF becomes not thread-safe if any function it calls is not thread-safe.  For
example, if I had a thread-safe UDF, but then added a call to a non-thread-safe library
(non-multithreaded C runtime library) then my function is then not thread-safe and may
cause the ibserver.exe process to abnormally terminate with an access violation.  So, every
function that you call directly and every function that is called indirectly must be
thread-safe as well. 

Synchronization - avoiding deadlocks 
When sharing resources in a multithreaded environment, those resources must have a
means to control access to them.  In a multithreaded environment multiple instances of
your UDF may be executed concurrently.  If your UDF uses some sort of shared resource,
anything from a global variable to a memory-mapped file, you must setup some sort of
protection so that only one instance of your UDF can access the shared resource at any
given time.  If you don't control access then that resource may get corrupted.  Every
platform InterBase runs on has a set of synchronization techniques (mutexes, semaphores,
...) that can be used to control access to shared resources.  The discussion of
synchronization techniques is beyond the scope of this paper.   Just let it be known that
there are several ways to synchronize access to shared resources and it is up to the
developer to decide which one is the correct choice for the situation. 

When you start dealing with more than one shared resource then you open your UDF up
for deadlock situations.  A deadlock situation is basically where one function, lets say
funcA, has a locked resource, but needs another locked resource.  Additionally, there is a
second function, lets say funcB, that holds the locked resource needed by funcA.  FuncB
itself needs the locked resource held by funcA.  Each function has a resource, and each
function needs a resource held by the other function.  Neither function will be able to
complete its task, because it will never get the second resource that it requires.  This is a
deadlock situation.  The only way to resolve a deadlock situation is to terminate one of the
functions so that the other may finish its task.  This is not a desired course of action. 

When you are writting UDFs that use more than one shared resource you need to be wary
of deadlock situations.  There is a general rule to follow when dealing with multiple
shared resources.  Code all your functions to acquire the shared resources in the same
order.  So, every function would have the same order for acquiring the set of shared
resources required.  What this does is prevent the sort of deadlock situation described
above.  Only one function can have the first shared resource on the list, and no other shared
resource can be held until the first one is acquired.  There are many other issues involved
in deadlock avoidance, but setting up a resource acquisition strategy such as this will
reduce the chances of a deadlock occurring. 

Non-synchronized variables may cause corruption 
When using variables in your application it is wise to understand the scope of where each
variable is visible and who can access those variables.  As a rule: global variables are a
no-no.  Global variables are available to all threads in the process.  This means that every
instance of your UDF that is executed has access to the global variable.  You must ensure
that the global variable is protected so that concurrent executions of your UDF don't
corrupt the variable. 

Likewise, when using static local variables you must ensure that the variable is
protected.  static local variables are variables local to a function, but don't go "out of
scope" when the function ends.  They exist for the life of the application.  Unlike normal
local variables, there is only one copy of the variable, not one for each execution of the
function.  You must ensure that these static local variables are accessed simultaneously,
which in most cases will corrupt the variable. 

Thread-local storage 
Thread-local storage (TLS) is allocated memory that is associated with a particular thread. 
TLS allows you to create a variable that is local to a specific thread, a memory location
which can be viewed only by the thread that it is created for.  Windows defines two types
of TLS; dynamic and static.   Right to the point; dynamic TLS will work, but static TLS
will NOT. 

Dynamic TLS is named as such because you dynamically allocate the local storage at
runtime.  Windows defines a set of functions to manipulate dynamic TLS.  I list them here:

    TlsAlloc 
    TlsFree 
    TlsSetValue 
    TlsGetValue 

All access to dynamic TLS is through these functions.  Developers just need to call these
functions to manipulate the TLS for each thread.  The functions work in the context of the
current thread.  For example, if you execute the TlsSetValue function it will set the value
for the storage allocated for the current thread.  The developer doesn't have to worry about
how to access the current thread's storage or how to synchronized access to the storage
area.  Windows controls the low level access and synchronization to the TLS.   For those
of you that are familiar with FreeUDFLib, this is the mechanism that Greg Deatz uses. 

I won't go into great detail into why static TLS will NOT work, but will delve into a brief
explanation.  See Advanced Windows for more details on static TLS.  The short of it is that
storage for all static TLS variables is allocated for each thread when it is created. 
However, and here lies the problem, when DLLs are loaded they can contain static TLS
variables.  The dynamically loaded DLL (your UDF) will be mapped into the process and
the static TLS variables will be mapped into all existing threads.  The key point here is that
it will be mapped only into the existing threads.  InterBase dynamically creates new threads
when necessary.  So, all threads that are created after the UDF is loaded will not have this
static TLS variable mapped.  For example, say the first connection executes the UDF
function.  The DLL is dynamically loaded by ibserver.exe and the DLL's static TLS
variables are mapped for each of the existing threads.  Now, the server gets bombarded by
attachments.  There are so many attachments that the server creates a number of new
worker threads to handle the attachments.  One of these attachments executes the UDF that
is already loaded.  The UDF relies on the assumption that the static TLS variable exists and
can be accessed.  Depending on what the static TLS variable is used for the function can
return bad data (corruption) to InterBase or worse: try to write to the location and either
corrupt InterBase's process space or try to write outside of InterBase's process space. 
Either one will frequently result in an Access Violation and the server will crash. 
  

Memory Management Issues
--------------------------------------------

Memory management issues deal more with memory leaks than server crash problems. 
However, as we all know InterBase doesn't do very well when the system is low on
memory.  In fact, InterBase will often crash when the operating system says no more
memory.  So, it is advantageous to understand the memory management issues when
developing UDFs for InterBase. 

Dynamic memory allocation 
On Windows platforms there isn't a standard memory allocation function.  Each compiler
implements its own memory management functions.  There also isn't any standard for how
to implement memory management.  Because of this you cannot mix and match your
memory management calls.  If you allocate memory with one compiler's allocation
function, you must free the memory with that same compiler's deallocation function.  You
can't, for instance, allocate memory with MicroSoft Visual C++'s allocation function and
free it with a Borland compiler's deallocation function. 

InterBase v5.x is compiled with MSVC++ (don't ask), and we make use of the memory
management functions provided by MSVC++.  This means that all InterBase memory
allocation and deallocations must be with the MSVC++ memory management functions. 
This causes a problem for certain types of UDFs that are written with a Borland compiler
(BC++, Delphi, or C++ Builder).   UDFs that dynamically allocate memory for return
parameters and specify the FREE_IT keyword are subject to memory leaks.  The
FREE_IT keyword specifies that the memory associated with the return parameter should
be freed by InterBase when it is done with it.  This causes a problem for InterBase when the
memory is dynamically allocated using a Borland compiler.  InterBase cannot free the
memory, because it wasn't allocated by MSVC++'s memory allocation function.  What it
boils down to is the fact that InterBase and the UDF are using different pools to draw
memory from.  InterBase, using MSVC++ memory management functions, doesn't have
access to the pool of memory being used by the UDF.   So, everytime the UDF is called it
will allocate a chunk of memory that is never freed, but all references to it are released. 
This causes a memory leak and bloat the ibserver.exe process.  This memory will not be
freed until the ibserver.exe process is stopped and restarted. 
  
------------------
References
------------------

Writing clean and safe UDFs for InterBase in Delphi by Greg Deatz 
Advanced Windows by Jeffrey Richter 

Server Response from: ETNASC04