sweep may cause database server to crash - InterBase 5.5

By: Borland Staff

Abstract: Resolved in InterBase 5.6
Problem:
An automatic or manual invocation of a sweep operation under certain conditions leads to an abnormal abort of
the database server and possible corruption of the database. The most common indication that this bug has been
encountered is when the clients recieve the error: 'Connection lost to database'. This is logged as bug #10101 in 
the InterBase bug  tracking system. It was reported against InterBase 5.1.1 and 5.5.0, and is generic across 
platforms. It has not been  reported to affect versions of InterBase prior to 5.1.1. This bug is fixed in the InterBase 
6.0.0 product, which is in  pre-Beta development at the time of this writing. The fix for this defect will also be 
released in a mid-year maintenance release, InterBase 5.6.0 for Windows 95/98/NT.

Solution:
The bug exhibits itself in databases that have backversions of records with NULL states, where the column with
NULL state has an index on it. It is the attempt at index garbage collection that causes the problem. For example, 
foreign keys are defined as indexes on nullable columns. If any field of the record with an indexed nullable column 
is updated, InterBase creates a new version of the record, and the previous state is stored as a backversion. 
Eventually, InterBase attempts to garbage collect this backversion to reclaim space. When a sweep operation 
finds a backversion to garbage collect, it removes the record version and also attempts to garbage collect any 
obsolete entries in indexes that point to values in the record. If it finds that one of the obsolete entries in an index 
is a NULL, the sweeper tries to scan the relation block to determine a default value to garbage collect from the index. 
The sweeper has not yet scanned the relation, so the relation block pointer is NULL and ibserver crashes.
The reason this doesn't crash for gbak or user requests is due to the fact that they scan the relation prior to accessing
the data. A sweeper does not scan the relation prior to performing its garbage collection.

Repairing a damaged database

To repair an affected database, perform all steps below:
1. Make a copy of the problem database.
copy D:tns.gdb C:tnsbad.gdb (File names are examples only.)
2. Run gfix to set sweep interval to zero.
gfix -h 0 -user SYSDBA -password masterkey C:tnsbad.gdb
3. Run gbak to backup the copy of the database
gbak -b -g -z -v -l -user SYSDBA -password masterkey
C:tnsbad.gdb C:tns.gbk
4. Run gbak to restore the database
gbak -c -v -z -user SYSDBA -password masterkey
C:tns.gbk C:tnsfixed.gdb
5. Copy the restored database back in place.
copy C:tnsfixed.gdb D:tns.gdb

If these steps fail to repair your damaged database, you must restore the most recent reliable backup that you
made of the database, and any changes made to the database since you made that backup are not recoverable.

Preventing the sweep bug

There are several techniques you can use to work around this bug. You should use these techniques only
temporarily until you upgrade to a corrected version of InterBase, either 5.6 or 6.0.

Preventing sweeps from occurring

You can prevent future encounters with this bug by leaving the sweep interval set to zero: 
gfix -h 0 D:tns.gdb
This inhibits automatic sweeping, and reduces the risk that index garbage collection crashes ibserver or corrupts
a database.

Removing garbage record versions without sweeping

You can clean a database of obsolete backversions of records without invoking a sweep operation, by backing
up the database and restoring it. Backups do not back up obsolete backversions, and a restore creates the
database anew with only one version of each record. You should make regular backups with the gbak -g command 
or the InterBase Server Manager for Windows. It is recommended that you make backups on a daily basis. A 
backup process performs garbage collection unless you specify the -g option. You can run a backup while other 
users are working on the database, but any changes made to the database after the backup begins are not 
recorded in the backup output.

Daily restores are not necessary, but somewhat regular restores help to keep the database optimized. For example,
it is adequate in typical databases to perform daily backups and weekly restores. You should not restore a database
while others are working on the database.

To reduce the performance impact on other users during the backup/restore procedure, perform these operations
at night, or when the workload is likely to be the lightest. If you perform the backup with no other users attached,
you maximize the effectiveness of the backup since all back versions are accessible.

Structuring a database to avoid NULLs

The sweep bug only affects nullable indexed columns. In many cases, you can structure your database such that
all indexed columns are constrained as NOT NULL. Your individual database design determines whether it is feasible
to restrict indexed columns in this way. The methods below describe ways that you can alter an existing populated
database to add NOT NULL constraints to columns without disturbing the data.

To change one existing column, for example EMPLOYEE.LAST_NAME, from nullable to not nullable, issue the
following statement:

UPDATE RDB$RELATION_FIELDS
SET RDB$NULL_FLAG = 1
WHERE RDB$RELATION_NAME = 'EMPLOYEE'
AND RDB$FIELD_NAME = 'LAST_NAME';To change all indexed columns in all user tables from nullable to not nullable, use the following stored procedure:
CREATE PROCEDURE INDEXED_COLUMNS_NOT_NULL
AS
DECLARE VARIABLE RELATION_NAME CHAR(31);
DECLARE VARIABLE FIELD_NAME CHAR(31);
BEGIN
FOR SELECT F.RDB$RELATION_NAME, F.RDB$FIELD_NAME
FROM RDB$RELATION_FIELDS F, RDB$INDICES I, RDB$INDEX_SEGMENTS S
WHERE F.RDB$RELATION_NAME = I.RDB$RELATION_NAME
AND F.RDB$FIELD_NAME = S.RDB$FIELD_NAME
AND I.RDB$INDEX_NAME = S.RDB$INDEX_NAME
AND (F.RDB$NULL_FLAG IS NULL OR F.RDB$NULL_FLAG = 0)
AND (F.RDB$SYSTEM_FLAG IS NULL OR F.RDB$SYSTEM_FLAG = 0)
INTO :RELATION_NAME, :FIELD_NAME
DO BEGIN
UPDATE RDB$RELATION_FIELDS
SET RDB$NULL_FLAG = 1
WHERE RDB$RELATION_NAME = :RELATION_NAME
AND RDB$FIELD_NAME = :FIELD_NAME;
END
END
The change does not take effect until you execute this procedure and then COMMIT.

IMPORTANT After performing these changes, you must do the following:

1. Execute UPDATE statements to change all NULLs in indexed columns to non-NULL values
2. Back up the database with gbak -g and restore it

Published on: 7/24/2000 12:00:00 AM

Server Response from: ETNASC01

Copyright© 1994 - 2013 Embarcadero Technologies, Inc. All rights reserved.