How do I find the amount of space used by each table in the database?

By: Borland Staff

Abstract: Server Manager or gstat can give you an estimate

Problem:
My database has been growing very rapidly, and I am trying to figure out 
which parts are growing fastest.


Solution:
This is not an easy question to answer.  With the tools that are 
available, computing an upper bound on the space utilized by table data is 
possible.  Getting an exact figure is not currently possible, and even a 
close estimate is not practical to compute.  The next paragraphs explain 
why.

When a record in a table is stored, it is stored with references to a 
trail of all the back versions of that record.  The record is also 
run-length encoded (RLE) to save space, so it is stored with "mild" 
compression applied to it.  Depending on the transaction in use, one may 
be looking at the current record or at a back version of it.  Take the 
friendliest case, a database that has just been backed up and restored, so 
that only current versions of records exist.  Even then, going through 
each record and totaling the space taken by each compressed record costs 
real time and expense (it is not an easy global calculation), and, while 
this is a judgment call, the result is not likely to repay the investment 
of calculating that disk usage.  Add UPDATEs and DELETEs, which produce 
back versions of records, plus READ_COMMITTED and REPEATABLE_READ 
transactions, and this space usage calculation becomes much more complex.
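
As a rough illustration of the run-length encoding (the exact on-disk 
format is version specific, so treat this as a sketch): a CHAR(15) column 
holding "Smith" padded with ten trailing blanks can be stored as the five 
significant bytes plus a short marker meaning "ten repeated blanks," 
rather than fifteen literal bytes.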

When a record is stored on a data page it may occupy only a portion of 
the page, or, if it is large, it may spill over onto a second or third 
page and occupy all or part of those pages.  A record may also be placed 
on a page that contains data belonging to other records.  If a record has 
back versions (or delta records), those versions may likewise be stored 
on data pages that contain data from other records, and a back version 
can itself spill over onto additional data pages.

In maintaining data pages, each data page carries some management 
overhead that reduces the storage capacity of the page.  A restore 
typically fills each page only to about 80% capacity, leaving space on 
the page for UPDATEs and DELETEs and for delta records.
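
To put a number on that (assuming a 4,096-byte page size; check the page 
size your database actually uses in the gstat header output): an 80% fill 
leaves roughly 4,096 x 0.20, or about 819 bytes, free on each data page 
for future record versions, before the page-management overhead is 
subtracted.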

There is one final caveat on how data pages are used, and that is what 
happens when new data is INSERTed.  New records are inserted first into 
empty space on existing data pages; then unallocated pages are converted 
to data pages; only after that are brand-new data pages allocated.

As the preceding paragraphs show, obtaining the exact space usage of a 
table is next to impossible.  What is possible is to obtain an upper 
bound on the space that is occupied.  To get this data, run the gstat 
command line utility, or run the Database Statistics process in the 
Server Manager utility.  When running Database Statistics, go to the View 
menu and choose Database Analysis.  The output shows the number of data 
pages currently occupied by each table, along with the average fill 
percentage.  When the average fill percentage gets relatively low 
compared to the number of data pages used, that can be taken as a sign 
that it is time to back up and restore the database.  The same applies to 
indexes.
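
For example, from a command prompt (this assumes a local database file 
named employee.gdb and that gstat is on the path; switch spellings vary a 
little between versions, so check gstat's usage screen, and add the user 
and password switches if your setup requires them):

    gstat -a employee.gdb > employee_stats.txt

The -a switch analyzes both data and index pages; redirecting the output 
to a file makes the per-table sections easier to search.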

Some other things can be done as well.  Check that the OIT (oldest 
interesting transaction) is moving forward; if it is stuck, tables may be 
proliferating back versions of records.  Another tactic, where 
appropriate, is to use the GBAK command line executable: it has a 
"use_all_space" parameter that causes the data pages to be restored at 
100% fill capacity (if possible).
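
A minimal sketch of both steps, again assuming a local employee.gdb and 
that default credentials are available (add user and password switches as 
your setup requires):

    gstat -h employee.gdb

Look at the "Oldest transaction" value in the header output; if it stops 
advancing while "Next transaction" keeps climbing, back versions are 
likely piling up.  To rebuild the database with fully packed pages:

    gbak -b employee.gdb employee.gbk
    gbak -c -use_all_space employee.gbk employee_new.gdb

Restoring to a new file name (employee_new.gdb here) avoids overwriting 
the original before the restore has been verified.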

Posted below is an excerpted portion of the Database Statistics output.  
These are the sections that can be used to determine the upper bound on 
the data pages used by a table and on the index pages used by each of its 
indexes:

EMPLOYEE (131)
    Primary pointer page: 270, Index root page: 271
    Data pages: 5, data page slots: 5, average fill: 71%
    Fill distribution:
         0 - 19% = 0
        20 - 39% = 0
        40 - 59% = 1
        60 - 79% = 4
        80 - 99% = 0

    Index NAMEX (3)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 15.00, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 1

    Index RDB$FOREIGN8 (1)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 0.00, total dup: 23, max dup: 4
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 1
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 0

    Index RDB$FOREIGN9 (2)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 6.00, total dup: 15, max dup: 4
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 1
            60 - 79% = 0
            80 - 99% = 0

    Index RDB$PRIMARY7 (0)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 1.00, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 1
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 0
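
Reading the excerpt above, and assuming a 4,096-byte page size (substitute 
the Page size value reported in your own gstat header output): the upper 
bound for EMPLOYEE table data is 5 data pages x 4,096 bytes = 20,480 
bytes.  Scaling by the 71% average fill gives roughly 14,540 bytes 
actually holding data, though, per the caveats above, that is still only 
an estimate.  Each index shows a depth of 1 and a single leaf bucket, so 
each is bounded by roughly one page, about 4,096 bytes.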
