Maximizing Performance of Delphi/C++Builder/InterBase Applictions

By: Conference Speaker

Abstract: This paper will present some tips to help you with getting better performance from your Delphi/C++Builder/InterBase system.

This is the technical paper from a talk given at the 12th Annual Borland Developer's Conference
By Robert Schieck – MER Systems Inc.

Robert Schieck is the president of MER Systems Inc., a Toronto, Canada-based company that provides custom client/server software development and training. Robert is a Borland Certified Delphi, C++Builder, and JBuilder instructor, a member of TeamB, a Certified Netware Engineer, and founder and past president of the Toronto Delphi Users Group. Robert has a B.S. in mechanical engineering from the University of Toronto in Toronto, Ontario, Canada. The MER Systems Web site (http://www.mers.com) is the host for the InterBase List and News Servers, which contain the largest body of public InterBase information in the world.

Note: No portion of this paper may be distributed or reproduced by any means, or in any form, without the author's and/or Borland's prior written permission.

Introduction

1) Fully Populate Your Database Before You Build the Front End

2) Use the SQL Monitor to Educate Yourself on What Your Front End is Asking of InterBase

3) BDE vs Native Access Components like IBX, the Speed Difference is Only 40%

4) Avoid Holding Transactions Open for Long Periods of Time.

5) Do Not Use Large Varchars

6) Always Build Your Front End Application Using a Remote Connection

7) Use a 2Kb or 4Kb Page size for Your Database

8) The Only Way to Set Your Database Cache Buffers is Using Gfix

9) Do Not Parameterize Queries That Contains the Verb "Like"

10) Primary Keys and Foreign keys, I Don't Use Them

11) Parameterize and Prepare Your Queries for Maximum Performance.

12) Use Single Processor Machines with InterBase

13) Rob's Rules of Left Outer Joins

14) Avoid Fetchalls Like the Plague

15) Large System, Lots of Users, Cache Your Lookup Tables for Speed.

16) Need Speed, Turn Async Writes Off But Beware of the Risks.

Summary


Introduction

Delphi and CBuilder are wonderful tools. They allow developers with little or no data base experience to build database applications by bolting components together. It is very easy for inexperienced Developers to build small database systems that work reasonably well. Unfortunately as the number of users grow or the size of the database gets large the developer needs to have a better understanding of how Delphi/CBuilder works with InterBase to create reasonably performing systems. This paper will present some tips to help you with getting better performance from your Delphi/CBuilder/InterBase system.

1) Fully Populate Your Database Before You Build the Front End

Most developers don't do this. What most developers do is create their meta data ( create their tables and indexes) , populate each table with six records and then build their application.

We all make mistakes and with only six records in each table you won't know during the development process whether you have made an error or not because no matter how you retrieve six records from the database it is always fast. If on the other hand, you had put a hundred thousand records into your customer database and during the development process you found it took 15 seconds to retrieve a customer, you would know immediately that you have a problem that needs to be fixed.

It's sort of a pay me now or pay me later scenario. If you fully populate your database before you start development you'll find your errors and problems during development or you can put six records in your tables and find your errors during production when the heat is really on.

2) Use the SQL Monitor to Educate Yourself on What Your Front End is asking of InterBase

Give a man of fish and feed him for a day, teach him to fish and feed him for life. The SQL Monitor is your fishing pole for doing database development. It allows you to see the conversations that go on between the client and the server.

The SQL Monitor allows you to compare how application changes affect the conversation between your application and the server. It allows you to compare how different components sets access the server differently.

With the BDE SQL Monitor I have several favorite options I use:

Connect/Disconnect - this option shows when connections are made to the database. You can use this option to determine if you have more than one connection running to the database or if you are opening and closing connections to the database frequently.

Prepare Query Statement - this option allows you to see how often your queries are being prepared. In a properly built application queries should be prepared once and use multiple times to minimize traffic between the client and server.

Execute Query Statement - this option allows you to see the SQL query that is being sent to the server.

Statement Operations - this option allows you to see records being retrieved from the server. Each "Fetch " statement in the SQL Monitor represents one row being retrieved from the server.

Transactions - this option allows you to see how often you are committing your data to the database.

Invest some time in the SQL Monitor and it will give you an excellent education on how Delphi and CBuilder interact with InterBase, which will allow you to build highly optimized applications

3) BDE vs Native Access Components like IBX, the Speed Difference is Only 40%

One way to pick up performance in your application is to use native access components such as IBX. Over all, what you will see is approximately a 40% increase in performance. In addition you will pick up better access to some of the internal features of InterBase.

However, what you gain in speed and accessibility you lose in terms of portability. If you're planning to make your application work with InterBase and other SQL servers, then using native data access components will increase your conversion costs significantly. Please let me remind you that the "S" in SQL stands for structured not standard. Even if you used the BDE for your application and you wish to support Oracle , you will most likely have to rewrite all your SQL statements to get acceptable performance from Oracle.

4) Avoid Holding Transactions Open for Long Periods of Time.

Quite often in the newsgroups you will hear comments to the effective don't hold transactions open for a long period of time. Is never qualified as to what a long period of time is. The duration of the transaction isn't necessarily as important as to what is done with the transaction. If you're going to hold a transaction open for a day and do nothing with it, that is not nearly as severe as holding transactions open all day and running a query against a every minute and a half. The former will prevent InterBase from doing garbage collection. The latter will also prevent InterBase from doing garbage collection and in addition will cause InterBase to consume more and more memory to keep track of what is going on in the transaction. The latter can cause significant slowdowns in performance. For example one company would open their application in the morning and InterBase would be consuming roughly 180 to 200 megs of memory on the server. Because transactions were being held open all day and select queries being run against them, by the end of the day, InterBase was consuming 950 megs of memory. The difference in performance was readily visible when the backups were run. With InterBase consuming 950 megs of memory the backup would take an hour to run. If you shut down InterBase and restarted it to free all the memory, the backup would take six minutes.

So, holding transactions open for several minutes isn't a major problem. Holding a transaction open all day and running lots of queries against that transaction can cause problems.

5) Do Not Use Large Varchars

InterBase stores chars and varchars internally essentially the same way. Externally, chars are returned to your Delphi application padded with blanks to the length of the field. On the other hand, varchars are presented to your application with no trailing blanks. What most people don't realize is a InterBase passes chars and varchars over the network the same way, fully padded out. To help you visualize this, think of a varchar(32000) with just the letter 'a' in the field. When this field is passed over the network it will be passed as the letter 'a' and 31,999 filler characters.

When you are designing your database remember that chars and varchars are passed between the client server fully padded out. If you wish to have a long variable length string consider using a blob.

6) Always Build Your Front End Application Using a Remote Connection

There are two types of connections that you can make, local and remote. A local connection string looks like:

Connect c:pathtomydatabase.gdb

It uses memory mapped files to do the communication between your application and the server and is deceptively fast. This type of connection is significantly faster than what you can expect when you use a remote server.

A remote connection looks like:

Connect localhost:c:pathtomydatabase.gdb

Or

Connect myserver:c:pathtomydatabase.gdb

A remote connection goes through the network to gain access to InterBase and gives performance that one would expect from a client/server application.

The extra speed that a local connection has, easily hides serious performance/design problems. These problems will surface when you finally connect to a remote server. On the newsgroups the message that is posted by someone who built there application with a local connection and has this problem goes something like:

I developed my application using local InterBase and it ran great. When I move to a remote InterBase Server it runs horribly, what is the problem.

7) Use a 2Kb or 4Kb Page size for Your Database

The default page size in InterBase 5.x is 1Kb which is too small. Memory is cheap and so is disk space, so you should increase the page size to 2Kb or 4Kb. I use 4Kb by default and 8Kb on systems with large tables. The larger page size results in reduced index depth and the ability to hold larger blobs and records on a single page all of which will increase performance, especially the indexes if you have long tables.

The only way to change the page size of an existing database is to back it up and restore it with a new page size.

8) The Only Way to Set Your Database Cache Buffers is Using Gfix

There are lots of ways to set the number of cache buffers on an InterBase server. The best way is to set number of cache buffer allocated on a per database basis. There is no way to set this value using a GUI interface. You must edit the ibconfig file by hand or use gfix. The command to set the buffers is :

Gfix buffers 10000 user sysdba password masterkey pathtomydatabase.gdb

This will set the cache buffers for mydatabase.gdb to 10,000 database pages. This means that the first connection to this database will cause InterBase to add 10,000 database pages of memory to its cache. The second and subsequent users will add zero buffers.

The range for buffers is between 0 and 64k. Empirical testing that I have done shows a negligible performance increase going from 10,000 buffers to 20,000 so I typically use 10,000.

9) Do Not Parameterize Queries That Contains the Verb "Like".

The verb "like" is somewhat tricky to use correctly. Assuming you have an index on the lastname field, the following SQL will cause InterBase to use the index:

Select * from customer where lastname like 'SCH%'; // find all the customers who's last name begins with 'SCH'

However, the following SQL will be slow and cause excessive load on the InterBase server as it will not use an index:

Select * from customer where lastname like '%SCH%; // find all the customers who's last name contains 'SCH'

The only way InterBase can solve the above SQL statement is to do a table scan. This means that InterBase will retrieve every record from disk for the customer table, examine it to see if it the lastname field contains an 'SCH'. Table scans are slow and should be avoided.!

The following queries will cause table scans:

Select * from customer where lastname like :aparam ;

This causes a table scan because InterBase doesn't know if you are going to send "SCH%" or a "%SCH%" as the parameter, so it optimized for the lowest common denominator which results in a table scan.

Select * from customer where lastname like upper('something') ;

It doesn't matter what you submit to the function or what the function is, InterBase cannot tell what the output of the function is ahead of time so it optimizes this query down to a table scan

Select * from customer where upper(lastname) like '%SCH';

The only way for InterBase to determine the output of the Upper(lastname) function is to run the lastname field through the function which means a table scan will happen.

Generally, table scans are slow, indexes are fast. Use "like" carefully and you application will fly.

10) Primary Keys and Foreign keys, I Don't Use Them

InterBase has the ability to declare primary keys and foreign key in your table definitions. While the declarative aspects of these feature make them easy to use, I don't use either.

For primary keys, I use a unique index. This allows me to name the index so when I try to post a duplicate value to the table I get an exception with an Index name I assigned instead of rdb$primary?? which is not intuitive. In addition unique indexes can be deactivated and reactivated to rebuild the indexes where primary keys cannot.

Foreign keys are a different story. Lets assume you have a table with a field called "Address_State" in it and you want to limit what is accepted in that field to one of the 50 States, so you create a foreign key to a table called States. When you make the foreign key declaration, InterBase will create an index on the "Address_State" field. The purpose of this index is for when you try to delete a row from the States table. It will allow InterBase to quickly check the "Address_State" field to ensure the value contained in the States table record that you are trying to delete, is not present in the Address_state field.

The performance problem is with the in index that is created on the "Address_State" field. If you have 1,000,000 records in your table and there 50 states then there are roughly 20,000 records per state in the "Address_State" field index, making it a very very poor index. If the InterBase optimizer gets a hold of it this index, there can be problems.

One site that had this problem was running a report that never finished. An examination of the problem revealed that one column had 95,000 rows with the number 3 in it and 5,000 nulls. There was an index on this field created by a foreign key. The InterBase optimizer found this index and the index caused the optimize to solve the problem backwards. The result was an report/query that could not finish in 10 hours. By dropping the foreign key and its index and replacing it with triggers, the optimizer solve the problem in a couple of minutes and the reports came out.

11) Parameterize and Prepare Your Queries for Maximum Performance.

Every SQL statement that is sent to InterBase is prepared, either by you or by the tools you use. Preparing a query is a reasonably expensive process, especially over slow speed lines or the Internet. For example, in one project which ran over 28.8kb dial up lines, the first child took 30 seconds to be returned because of the prepares that had to be done, while subsequent children took 4 seconds each.

In the case of the BDE, if you don't prepare your statements, the BDE will prepare it for you when you open your TQuery Unfortunately it will also unprepare your statement when you close the TQuery, which defeats the purpose of preparing the query. However, if you prepare your TQuery (TQuery.prepare), it will stay prepared until you explicitly unprepare it or you change the contents of the SQL property.

For you IBX fans, IBX will prepare your queries for you automatically and will keep them prepared so long as you do not change the SQL property.

12) Use Single Processor Machines with InterBase

InterBase (SuperServer Architecture) does not know how to use multiple processors correctly. On NT, if you have multiple processors, the InterBase process will get flipped from processor to processor. It will look like InterBase is using multiple processors but it is not. Empirical evidence points to as much as a 30% decrease in performance due to the flipping of the InterBase process.

If you have a multi processor machine and are running InterBase on it, you can run InterBase as an application and set its affinity to one processor to prevent the flipping of InterBase from processor to processor.

13) Rob's Rules of Left Outer Joins

I have never been a fan of left outer joins. They tend to be slow and InterBase will only use an index on the the first left outer join in a query. I have several ways of avoiding left outer joins.

For the rest of the discussion, here is some sample data

SQL> select * from children;

  CHILD_KEY NAME                            SCHOOL_KEY
=========== ============================== ===========

          1 Robert Schieck                           0
          2 Jon Schieck                              1
          3 Diane Schieck                            2
          4 Megan Schieck                            3
          5 Robyn Schieck                            4
          6 Emma Schieck                             5
SQL> select * from schools;

 SCHOOL_KEY SCHOOL_NAME
=========== ================================

          1
          2 Sir Winston Churchill
          3 Oakridges Public School
          4 Lady Churchill Senior Public


I have four rules for left outer joins to share with you :


1) Design out the Left Outer Join

We have a table(children) with children in it and a field called School_key which is a foreign key (no not declarative) into the Schools table which has two fields School_Key and School_Name; Since children under 5 do not go to school, they would have a null value in their School_key field and you would have to do a left outer join to get all the children with their associated school_name. In my school table there is a record with the following values:

School_key 0
School_Name ''

Yes the school_name field is blank or an empty string. In the children table, the School_key field is defaulted to 0. Now all the children have a school. Just the ones under 5 years of age have a school with no name. When I need to see all the children with their associated schools I can use an inner join instead of an outer join.

SQL> select c.name, s.school_name from children c, schools s where c.school_key = s.school_key;

NAME                           SCHOOL_NAME
============================== ================================

Jon Schieck
Diane Schieck                  Sir Winston Churchill
Megan Schieck                  Oakridges Public School
Robyn Schieck                  Lady Churchill Senior Public

Please notice that Jon Schieck doesn't go to a school.

2) Use a Correlated Sub Query

Not all SQL servers can do this ( I won't mention the MS SQL Server 6.5 can do one of these but two of them in the same query will cause wrong answers, not an error message, just wrong answers). Using the above example the query to retrieve ALL the children and their associate schools would look like:

Select c.Name, (select s.school_name from schools s where s.school_key = c.school_key) from children c;

This query will run the subquery once for each row returned.

SQL> Select c.Name,  (select s.school_name from schools s where s.school_key = c.school_key) from children c;

NAME
============================== ================================

Robert Schieck                 
Jon Schieck
Diane Schieck                  Sir Winston Churchill
Megan Schieck                  Oakridges Public School
Robyn Schieck                  Lady Churchill Senior Public
Emma Schieck                   

The two nulls show up because they have a school_key but no corresponding entry in the schools table.


3) Use a Stored Procedure

Stored procedures that return a result set are very powerful. In this case they allow you to create a virtual left outer join:

create procedure select_childSchool
returns (name varchar(30), school_name varchar(30))
as
declare variable school_key integer;
begin
  for select name, school_key from children into :name, :school_key do 
  begin
     school_name = null;
     select school_name from schools where school_key = :school_key into :school_name ; 

     suspend;
  end
end


The output looks like :

SQL> select * from select_childschool;

NAME                           SCHOOL_NAME
============================== ==============================

Robert Schieck                 
Jon Schieck
Diane Schieck                  Sir Winston Churchill
Megan Schieck                  Oakridges Public School
Robyn Schieck                  Lady Churchill Senior Public
Emma Schieck                   

4) Use a left outer join!

Sometimes there is no substitute for a left outer join!
SQL> select c.name, s.school_name from children c left join schools s  on C.SCHOOL_KEY = S.SCHOOL_KEY;


NAME                           SCHOOL_NAME
============================== ================================

Robert Schieck                 
Jon Schieck
Diane Schieck                  Sir Winston Churchill
Megan Schieck                  Oakridges Public School
Robyn Schieck                  Lady Churchill Senior Public
Emma Schieck                   

 

14) Avoid Fetchalls Like the Plague

A fetchall is when you do something in Delphi or CBuilder that causes your application to fetch all the records for a select statement from InterBase.

For example if you have 100,000 people in a database and do a "select * from people where lastname like 'A%'" you will be selecting approximately 10,000 of those records.

If you issue the select from a TQuery with nothing connected to it, then only one row will be returned from the query until you call the next method for the TQuery.

If you have a grid attached to the TQuery when you open it, then it will fetch one row for each row that shows up in the grid. If you have 14 rows showing in the grid, then it will fetch only 14 rows from the server. It will fetch more rows from the server as need when you scroll past the bottom of the grid.

If you do a TQuery.recordcount, no mater what you have connected to the TQuery, the BDE will fetch each row from the server and count it. When the last row has been sent , in this case row 10,000, then the BDE will give you the number of records that it fetched from the server. Fetching 10,000 records can take a while.

A filter on a TQuery, causes a fetchall and then it applies the filter to all the records.

A locate on TQuery causes a fetchall and then it positions the record pointer to the appropriate record.

If you use the SQL monitor, you can see the fetches from the server and it is very easy to spot a "fetchall" happening.

15) Large System, Lots of Users, Cache Your Lookup Tables for Speed.

As the number of users on your system increase, you need to decrease the load on your server. One way to do that is to start caching your lookup tables. For example, there are 50 states in the United States and the odds of a new one being added on any given day is quite remote, so it is a good candidate to be cached.

A ClientDataSet is an excellent component for caching the 'states" table. With the "States" table in a clientdataset, you no longer need to join the "States" table in your queries on the InterBase server but can simulate the join using a calculated field on the client side.

This will reduce the load on the server and allow you to service more users.

16) Need Speed, Turn Async Writes Off But Beware of the Risks.

By default InterBase 5.x on NT, forced write or sync writes is on. This means when InterBase tells NT to write something to disk, NT is not allowed to put it into the OS cache, it must write it to disk immediately.

On Unix platforms, the default is to have force writes off. This means when InterBase tells the the UNIX operating system to write something to disk, UNIX puts it into its cache and writes it out to disk when it is ready.

What is the difference in performance? Inserting 13,000 records via a stored procedure into a table, with forced writes on it took 15 minutes, with forced writes off it took 45 seconds.

As with everything there is a price to be paid and in this case the speed comes with risk. If the InterBase server or the Operating system crashes while InterBase is writing to disk, you stand a good chance of corrupting you database, potentially to the point of making it unrepairable.

If you choose to turn off force writes to improve performance, make sure you computer is on a UPS and make lots of backups.

Summary

Hopefully some of the techniques present here will help your Delphi/CBuilder/InterBase Systems perform better.


Server Response from: ETNASC04