This is the technical paper from a talk given at the 10th Annual Inprise & Borland Developer's Conference
By Dave Schnepper - Brookstone Solutions
Mr. David Schnepper is a software consultant and trainer specializing in InterBase applications. Previously he was a member of the InterBase engineering staff from 1991 to 1999. From 1997 he was the senior staff engineer responsible for leading the evolution of the product architecture and implementation. Prior to assuming the Staff Engineer role, he designed and implemented InterBase's internationalization features. Before joining InterBase, Mr. Schnepper filled several engineering roles for Ashton-Tate, including architect for international software, and international development manager for Framework I, II, III, & IV. He has served on the ANSI SQL standardization committee and the Unicode Committee. Previous employers include Stanford University, Xerox PARC, and several Silicon Valley startups (Integrated Office Systems, Artelonics, and Forefront). At Stanford he taught Introduction to Computer Programming. He was the first Stanford instructor to use Microprocessor-based systems in an introductory course. Mr. Schnepper has an M.S. degree in computer science and a B.S. degree in mathematical sciences from Stanford University. He has a U.S. Patent pending on the tagging of database data with locale information. He resides in Los Gatos, California with his wife Sasha and children Mark and Alex.
Introduction
Talking about performance is always a "fast" way to heat up a database
discussion. Have you ever had your database run too fast?
This paper discusses performance of the InterBase v6.0 database server;
the features added to make your applications run hotter than ever before.
InterBase Express
While the InterBase database server has always been fast -- customer usage
has been restricted due to general-purpose front-ends. InterBase
6.0 makes available the InterBase Express - custom components for data
access at high speed.
Note: Mark Duquette's services API talk will show examples of using
IBX with Delphi 5.
Internal Server Features
Several internal architecture changes have been made to improve server
performance. These features are automatically available in Version
6.0. After you migrate your database server to version 6.0 you should
realize the performance improvements without any application changes.
GARBAGE collection thread
Prior to Version 6, every database request cooperated in garbage collection.
This could result in decreased performance as all pages "dirtied" during
a request have to be written at COMMIT time. In InterBase V6, back
version garbage collection is performed by a central thread. Worker
threads notify the garbage thread when a page containing garbage is noticed
-- but, generally, do not clean up the page themselves. COMMITs then
only have to write pages actually modified by the request's actions.
The garbage collector thread keeps a list of "messy" pages -- fetching
them, cleaning, and writing back.
Databases may stay active after the last user attachment, as the garbage
collector thread will continue to clean even while no users are in the
database.
SWEEP thread
Every so often the database reaches a state where it must be "Swept"
to ensure that all rolled-back transactions have been removed from the
database. In both V5 & V6 there is a specific thread that performs
this action.
Thread Scheduling
In V6 both the Garbage Collection thread & Sweep thread run at lower
internal priority than user request threads. In general, request
threads will now release engine locks more frequently to enable other requests
to run. (One intensive request will no longer prevent other requests
from getting useful work completed). Both threads cooperate
-- the SWEEP thread will also perform Garbage collection when sweeping
isn't required.
File Format Change
An internal change in how back-version differences records are stored
enables V6 to store more records on a data page.
Cache Management
Sequential scans of large objects will release pages to the top
of the LRU queue. A BLOB or relation being scanned is unlikely to
be needed by another request, so it's page is freed for re-use, rather
than cause "useful" pages to be thrown out of the cache.
Optimizer Changes
Mapping of DISTINCT operations to Indices will not always occur - sometimes
it is faster to just do a sort.
The pruning algorithm for determining a join order for queries with
more than 6 tables joined has been improved so it won't "give up" to easily.
TCP/IP Packet Buffer size
(This feature was available in the v5.5 release)
You are able to specify various sizes for TCP buffer in the isc_config
file. The exact syntax is as shown below
"TCP_REMOTE_BUFFER
NNNN"
where NNNN is a value between 1448 - 32768
The client and server both read the entry from
isc_config file and sets up the remote send and receive buffer to the value
specified. If the
value specified is not with in the range then
the buffer size defaults to 8K i.e. 8192.
Our internal testing revealed 8K as the most "useful" size on a range
of database operations. Your application may be tuned as well.
Note that the CLIENT and SERVER can each have different settings for the
buffer size.
Sort Memory utilization
To improve overall performance, changes are being done to the sort
algorithm to write to disk when a large equivalence group was encountered.
Database Performance Options
Read Only Databases
In V6, a "read-only" attribute can be applied to a database.
With the read-only attribute a Database cannot be modified (INSERT, UPDATE,
DELETE). A read-only database has several usages:
-
Store a database on read-only media, such as CD-ROM
-
Historical Data protection (Keep Q1 data on-line in an archive database)
-
Additional protection against unauthorized updates (set read-only from
9 PM to 8 AM)
-
Fast backups: set read-only - make file system copy - set read-write
Example:
gfix -read_only database.gdb
copy database.gdb backup.gdb
gfix -read_write database.gdb
As the database is in read-only mode, a backup copy made with the file
system copy command will be consistent.
GFIX will wait until it can obtain exclusive access to the database
(and set the read-only flag). If necessary, use the shutdown
feature to force other users out of the database.
Utilities running on Server
The client-server model can often cause performance problems - due to the
inherent slowness of networks performing interactive operations.
In InterBase V6.0, several client side utility functions have been migrated
to the server for higher performance.
Server Side backups/restores
InterBase users have known for some time that Backups (& restores)
go faster when you use the following procedure:
Login to your server with a command window
Use the server's GBAK command to backup the database to a server backup
file (on disk)
After the backup completes, transport the backup file to a safe location
via FTP (binary mode!)
In version 6.0, this process has been somewhat automated with the -service
switch to gbak.
The V6 GBAK backup utility can now perform a backup as a SERVER operation,
rather than a CLIENT-SERVER. The backup file is created on the server
and will not transport over the network to the client. Additionally,
as the backup file remains on the server it can be protected by OS file
systems from "hijacking".
Example
gbak -backup -service service_name /data/database.gdb
/backup/database.gbk
The service_name specifies the name of the host (server machine)
to connect to and the protocol as follows:
| TCP/IP |
hostname:service_mgr |
| SPX |
hostname@service_mgr |
| Named Pipes |
hostnameservice_mgr |
| Local access |
service_mgr |
The location of the backup file is relative to YourServer's file system.
By using the -service switch you can use automated scripts on your client
to start backups, without having manual intervention with a terminal window.
Server Services
InterBase version 6.0 provides additional services on the Server.
These services are available via IBX and directly to the InterBase API.
Many of these services were impossible to do from a client under prior
versions.
Services exist to perform the following:
-
Return information on the configuration of a server.
-
Return information on the current state of a server (databases in use,
user information, etc.)
-
Database Backup & Restore
-
Security Database administration (add user, drop user, update user, etc.)
-
Retrieve InterBase.log file.
Applications which used earlier methods of performing these maintenance
tasks will see a performance improvement using the V6 Services API.
New Server Data types
InterBase version 6.0 supports several new data types: 64-bit integer
numerics, a TIME only data type, and a DATE only data type
64-bit numerics
Previously, DECIMAL(n,m) & NUMERIC(n,m) data columns were represented
by floating point numbers when the precision was larger than 9 digits.
In version 6.0 InterBase uses 64-bit integers to support up to 18 digits
of precision. While this is primarily useful for the increased
accuracy (and lack of rounding error) there is also a platform specific
performance improvement by using 64-bit arithmetic instead of floating
point arithmetic.
New applications should use 64-bit numerics whenever possible -- existing
applications should closely examine the InterBase v6.0 migration guidelines.
DATE only data type
Previously, InterBase used 8-bytes to store a TIMESTAMP value.
Often, applications were only interested in the DATE portion - and would
implement special coding to manipulate the (unwanted) TIME portion of a
TIMESTAMP.
InterBase version 6.0 supports a SQL standard DATE only data type --
for data values that only need a DATE portion this data type has advantages.
-
Uses 4 bytes instead of 8
-
Has smaller index keys (more index entries per page)
-
Less data to be transported over the network.
Schema Migration
InterBase has always had performance advantages due to its ability
to perform on-line schema updates. In version 6.0 the ability update
a schema from SQL has been enhanced. These features were always
available in InterBase -- now they are available use SQL-like syntax.
Column Data type changes
Previously, user's wanting to use pure SQL could only change a column's
data type via the cumbersome process of:
ALTER TABLE aTable ADD newColumn newType;
UPDATE aTABLE SET newColumn = oldColumn;
ALTER TABLE aTable DROP oldColumn;
This was cumbersome -- it forced a table scan (to copy the data),
and had the undesirable side effect of "renaming" oldColumn as newColumn.
It was even a lot of typing!
In version 6.0, SQL has been extended to allow access to the engine's
schema modification ability:
ALTER TABLE aTable ALTER COLUMN oldColumn TYPE newType;
Easier to type, MUCH faster to execute.
DOMAIN Data type changes
If you thought a Column data type modification was cumbersome - imagine
having to do it for a domain!
CREATE DOMAIN newDomain AS ...;
for each usage of oldDomain ...
ALTER TABLE aTable ADD newColumn newDomain;
UPDATE aTABLE SET newColumn = oldColumn;
ALTER TABLE aTable DROP oldColumn;
DROP DOMAIN oldDomain;
In version 6.0,
ALTER DOMAIN oldDomain TYPE newType;
Not a performance feature
Like programmers everywhere, the V6 staff couldn't resist putting
in some new features that had nothing to do with Performance of the database
server.
SQL Warnings
Version 6.0 can now return warning information from the server to
the client. Warnings have the implication of "your operation succeeded,
and you should take a look at the following". (Errors tell you -
"your operation failed - here's why"). To see warnings, you
need both a V6 server, a V6 client library, and a V6 application that is
warning aware (such as V6 ISQL).
Typical uses for warnings:
DELETE FROM xTable WHERE A < A;
Warning: Operation affected no rows.
Warnings are also used to flag SQL statements that have different
results between version 5 SQL and version 6 SQL.
Example:
SELECT SUM(size) / COUNT(size) FROM ...
In version 5 a floating point division was used, and a floating
point number returned.
In version 6, an integer division is used, and a scaled integer
value returned.
The semantic difference is flagged to the developer using a Warning.
Summary
This paper has outlined several of the performance features added in InterBase
version 6.0. Many of these features are available without any schema
or application changes. Additional performance can be gained by tuning
your application for version 6.0 and examining your schema for datatype
modifications.