ClientDataSet Aggregates and GroupState

By: Cary Jensen

Abstract: This article describes how to use aggregates to calculate simple statistics, as well as how to use group state to improve your user interfaces.

One of the advantages to using a ClientDataSet in your applications is the large number of features it enables. In this article I continue my series on using ClientDataSets with a look at both aggregates and group state. Aggregates are objects that can perform the automatic calculation of basic descriptive statistics based on the data stored in a ClientDataSet. Group state, by comparison, is information that identifies the relative position of a record within a group of records, based on an index. Together, these two features permit you to add easy-to-maintain capabilities to your applications.

If you are unfamiliar with either aggregates or group state, you might be wondering why I am covering these two features together in this article. The answer is simple. Both are associated with grouping level, which is an index-related feature. Because the discussion of aggregates necessarily involves grouping level, the coverage of group state is a natural addition. This article begins with a look at aggregates. Group state is covered in a later section.

Understanding Aggregates

An aggregate is an object that can automatically perform simple descriptive statistical calculations across one or more records in a ClientDataSet. For example, imagine that you have a ClientDataSet that contains a list of all purchases by your customers. If each record contains a field that identifies the customer, the number of items purchased, and the total value of the purchase, an aggregate can calculate the sum of all purchases across all records in the table. Yet another aggregate can calculate the average number of items purchased by each customer. In all, a total of five statistics are supported by aggregates. These are count, minimum, maximum, sum, and average.

There are two types of objects that you can use to create aggregates: TAggregate objects and TAggregateField objects. A TAggregate is a TCollectionItem descendant, and a TAggregateField is a descendent of the TField class. While these two aggregate types are similar in how you configure them, they differ in their use. Specifically, a TAggregateField, because it is a TField descendent, can be associated with a data-aware control, permitting the value of the aggregate to be displayed automatically. By comparison, a TAggregate is an object whose value you must explicitly read at runtime.

One characteristic shared by both types of aggregates is that they require quite a few specific steps to configure. If you have never used aggregates in the past, be patient. If your aggregates do not appear to work at first, you probably missed one or more steps. However, after you get comfortable configuring aggregates, you will find that they are relatively simple to use. 

Because TAggregateField instances are somewhat easier to use, they will be discussed in the following section. Using TAggregates is discussed later in this article.

Creating Aggregate Fields

Aggregate fields are virtual, persistent fields. While they are similar to other virtual, persistent fields, such as calculated and lookup fields, there is one very important difference. Specifically, introducing one or more aggregate fields does not preclude the automatic, runtime creation of dynamic fields. By comparison, creating at least one other type of persistent field, such as a data field, lookup field, or calculated field, prevents the ClientDataSet from creating other TFields for that ClientDataSet at runtime. As a result, it is always safe to create aggregate fields at design-time, whether or not you intend to instantiate any other TField instances at design-time.

As mentioned earlier, adding an aggregate field requires a number of specific steps in order to configure it correctly. These are:

  • Add the aggregate field to a ClientDataSet. This can be done at design-time using the Fields Editor, or at runtime using the TAggregateField's constructor.
  • Set the aggregate field's Expression property to define the calculation that the aggregate will perform
  • Set the aggregate field's IndexName property to identify the index to base grouping level on
  • Set the aggregate field's GroupingLevel property to identify which records to perform the aggregation across
  • Set the aggregate field's Active property to True to activate it
  • Set the aggregate field's Visible property to True
  • Set the AggregatesActive property of the ClientDataSet to which the aggregate is associated to True

Because there are so many steps here, it is best to discuss creating aggregate fields using an example. Use the following steps in Delphi or Kylix to create a simple project to which an aggregate field will be added.

  1. Create a new project.
  2. Add to your main form a DBNavigator, a DBGrid, a ClientDataSet, and a DataSource.
  3. Set the Align property of the DBNavigator to alTop, and the Align property of the DBGrid to Client. Next, set the DataSource property of both the DBNavigator and the DBGrid to DataSource1. Now set the DataSet property of the DataSource to ClientDataSet1. 
  4. Set the Filename property of the ClientDataSet to the Orders.cds table (or Orders.xml), located in Borland's shared data directory. If you installed your software using the default installation paths in Windows, you will find this file in c:Program FilesCommon FilesBorland SharedData.

Your main form should now look something like the following:

Adding the Aggregate Field

At design-time, you add aggregate fields using the ClientDataSet's Fields Editor. Use the following steps to add an aggregate field

  1. Right-click the ClientDataSet and select Fields Editor.


  2. Right-click the Fields Editor and select New Field (or press Ctrl-N). Delphi displays the New Field dialog box.


  3. Set Name to CustomerTotal and select the Aggregate radio button in the Field type area. Your New Field dialog box should now look like the following


  4. Click OK to close the New Field dialog box. You will now see the newly added aggregate field in the Fields Editor, as shown here.

Defining the Aggregate Expression

The Expression property of an aggregate defines the calculation the aggregate will perform. This expression can consist of constants, field values, and aggregate functions. The aggregate functions are AVG, MIN, MAX, SUM, and COUNT. For example, to define a calculation that will total the AmountPaid field in the Orders.cds table, you use the following expression:

SUM(AmountPaid)

The argument of the aggregate function can include two or more fields in an expression, if you like. For example, if you have two fields in your table, one named Quantity and the other named Price, you can use the following expression:

SUM(Quantity * Price)

The expression can also include constants. For example, if the tax rate is 8.25%, you can create an aggregate that calculates total plus tax, using something similar to this:

SUM(Total * 1.0825)

You can also set the Expression property to perform an operation on two aggregate functions, as shown here

MIN(SaleDate) - MIN(ShipDate)

as well as perform an operation between an expression function and a constant, as in the following

MAX(ShipDate) + 30

You cannot, however, include an aggregate function as the expression of another aggregate function. For example, the following is illegal:

SUM(AVG(AmountPaid)) //illegal

Nor can you use a calculation between an aggregate function and a field. For example, if Quantity is the name of a field, the following expression is illegal:

SUM(Price) * Quantity //illegal

In this particular case, we want to calculate the total of the AmountPaid field. To do this, use the following steps:

  1. Select the aggregate field in the Fields Editor.
  2. Using the Object Inspector, set the Expression property to SUM(AmountPaid) and its Currency property to True.

Setting Aggregate Index and Grouping Level

An aggregate needs to know across which records it will perform the calculation. This is done using the IndexName and GroupingLevel properties of the aggregate. Actually, if you want to perform a calculation across all records in a ClientDataSet, you can leave IndexName blank, and GroupingLevel set to 0.

If you want the aggregate to perform its calculation across groups of records, you must have a persistent index whose initial fields define the group. For example, if you want to calculate the sum of the AmountPaid field separately for each customer, and a customer is identified by a field name CustNo, you must set IndexName to the name of a persistent index whose first field is CustNo. If you want to perform the calculation for each customer for each purchase date, and you have fields named CustNo and SaleDate, you must set IndexName to the name of a persistent index that has CustNo and SaleDate as its first two fields.

The persistent index whose name you assign to the IndexName property can have more fields than the number of fields you want to group on. This is where GroupingLevel comes in. You set GroupingLevel to the number of fields of the index that you want to treat as a group. For example, imagine that you set IndexName to an index based on the CustNo, SaleDate, and PurchaseType fields. If you set GroupingLevel to 0, the aggregate calculation will be performed across all records in the ClientDataSet. Setting GroupingLevel to 1 performs the calculation for each customer (since CustNo is the first field in the index). Setting GroupingLevel to 2 will perform the calculation for each customer for each sale date (since these are the first two fields in the index).

It is interesting to note that the TIndexDef class, the class used to define a persistent index, also has a GroupingLevel property. If you set this property for the index, the index will contain additional information about record grouping. So long as you are setting an aggregate's GroupingLevel to a value greater than 0, you can improve the performance of the aggregate by setting the persistent index's GroupingLevel to a value at least as high as the aggregate's GroupingLevel. Note, however, that a persistent index whose GroupingLevel property is set to a value greater than 0 takes a little longer to generate and update, since it must also produce the grouping information. This overhead is minimal, but should be considered if the speed of index generation and maintenance is a concern.

The following steps walk you through the process of creating a persistent index on the CustNo field, and then setting the aggregate field to use this index with a grouping level of 1.

  1. Select the ClientDataSet in the Object Inspector and select its IndexDefs property. Click the ellipsis button of the IndexDefs property to display the IndexDefs collection editor.


  2. Click the Add New button in the IndexDefs collection editor toolbar to add a new persistent index.
  3. Select the newly added index in the IndexDefs collection editor. Using the Object Inspector, set the Name property of this IndexDef to CustIdx, its Fields property to CustNo, and its GroupingLevel property to 1. Close the IndexDefs collection editor.
  4. With the ClientDataSet still selected, set its IndexName property to CustIdx.
  5. Next, using the Fields Editor, once again select the aggregate field. Set its IndexName property to CustIdx, and its GroupingLevel property to 1. The Object Inspector should look something like the following

Making the Aggregate Field Available

 The aggregate field is almost ready. In order for it to work, you must set the aggregate field's Active property and its Visible property to True. In addition, you must set the ClientDataSet's AggregatesActive property to True. After doing this, the aggregate will be automatically calculated when the ClientDataSet is made active.

With aggregate fields, there is one more step, which is associating the aggregate with a data-aware control (if this is what you want to do). The following steps demonstrate how to activate the aggregate, as well as make it visible in the DBGrid.

  1. With the aggregate field selected in the Object Inspector, set its Active property to True and its Visible property to True.
  2. Next, select the ClientDataSet and set its AggregatesActive property to True and its Active property to True.
  3. Now, right-click the DBGrid and select Columns. This causes the Columns collection editor to be displayed.
  4. Click the Add All button on the Columns collection editor toolbar to add persistent columns for each dynamic field in the ClientDataSet. 


  5. Now click the Add New button on the Columns collection editor toolbar to add one more TColumn.


  6. With this new TColumn selected, set its FieldName property to CustomerTotal. In order to see this calculated field easily, drag the new column to a higher position in the Columns collection editor. For example, move this new column to the third position within the Columns collection editor.


  7. That's it. If you have followed all of these steps, your newly added aggregate field should be visible in the third column of your DBGrid, as shown in the following figure.

A couple of additional comments about active aggregates are in order here. First, the ClientDataSet's AggregatesActive property is one that you might find yourself turning on and off at runtime. Setting AggregatesActive to False is extremely useful when you must add, remove, or change a number of records at runtime. If you make changes to a ClientDataSet's data, and these changes affect the aggregate calculation, these changes will be much slower if AggregatesActive is True, since the aggregate calculations will be updated with each and every change. After making your changes, setting AggregatesActive to True will cause the aggregates to be recalculated.

Rather than turning all aggregates off or on, the Active property of individual aggregates can be manipulated at runtime. This can be useful if you have many aggregates, but only need one or two to be updated during changes to the ClientDataSet. Subsequently turning other aggregates back on will immediately trigger their recalculation. At runtime you can read the ClientDataSet's ActiveAggs TList property to see which aggregates are currently active for a given grouping level. 

Creating Aggregate Collection Items

Aggregate collection items, like aggregate fields, perform the automatic calculation of simple descriptive statistics. However, unlike aggregate fields, they must be read at runtime in order to use their value. Aggregate collection items cannot be hooked up to data-aware controls. But with that exception in mind, nearly all other aspects of the configuration of aggregate collection items is the same as for aggregate fields.

The following steps demonstrate how to add and use aggregate collection items in a project. These steps assume that you have been following along with the steps provided earlier in this article to define the aggregate field.

  1. Select the ClientDataSet in the Object Inspector and select its Aggregates property. Click the ellipsis button for the Aggregates property to display the Aggregates collection editor.


  2. Click the Add New button on the Aggregates collection editor's toolbar to add two aggregates to your ClientDataSet.
  3. Select the first aggregate in the Aggregates collection editor. Using the Object Inspector, set the aggregate's Expression property to AVG(AmountPaid), its AggregateName property to CustAvg, its IndexName property to CustIdx, its GroupingLevel property to 1, its Active property to True, and its Visible property to True.
  4. Select the second aggregate in the Aggregates collection editor. Using the Object Inspector, set its Expression property to MIN(SaleDate), its AggregateName property to FirstSale, its IndexName property to CustIdx, its GroupingLevel property to 1, its Active property to True, and its Visible property to True.
  5. Add a PopupMenu from the Standard page of the component palette to your project. Using the Menu Designer (double-click the PopupMenu to display this editor), add a single MenuItem, setting its caption to About this customer.
  6. Set the PopupMenu property of the DBGrid to PopUpMenu1.
  7. Finally, add the following event handler to the Add this customer MenuItem:

    procedure TForm1.Aboutthiscustomer1Click(Sender: TObject);
    begin
    ShowMessage('The average sale to this customer is ' +
      Format('%.2m', [StrToFloat(ClientDataSet1.Aggregates[0].Value)]) +
      '. The first sale to this customer was on '+
      DateToStr(ClientDataSet1.Aggregates[1].Value));
    end;
  8. If you now run this project, your main form should look something like that shown in the follow figure.


  9. To see the values calculated by the aggregate collection items, right-click a record and select About this customer. The displayed dialog box should look something like the following figure:

Understanding Group State

Group state refers to the relative position of a given record within its group. Using group state you can learn whether a given record is the first record in its group (given the current index), the last record in a group, neither the last nor the first record in the group, or the only record in the group. You determine group state for a particular record by calling the ClientDataSet's GetGroupState method. This method has the following syntax:

function GetGroupState(Level: Integer): TGroupPosInds;

When you call GetGroupState, you pass an integer indicating grouping level. Passing a value of 0 (zero) to GetGroupState will return the current record's relative position within the entire dataset. Passing a value of 1 will return the current record's group state with respect to the first field of the current index, passing a value of 2 will return the current record's group state with respect to the first two fields of the current index, and so on. 

GetGroupState returns a set of TGroupPosInd flags. TGroupPosInd is declared as follows:

TGroupPosInd = (gbFirst, gbMiddle, gbLast);

As should be obvious, if the current record is the first record in the group, GetGroupState will return a set containing the gbFirst flag. If the record is the last record in the group, this set will contain gbLast. When GetGroupState is called for a record somewhere in the middle of a group, the gbMiddle flag is returned. Finally, if the current record is the only record in the group, GetGroupState returns a set containing both the gbFirst and gbLast flags.

GetGroupState can be particularly useful for suppressing redundant information when displaying a ClientDataSet's data in a multi-record view, like that provided by the DBGrid component. For example, consider the preceding figure of the main form. Notice that the CustomerTotal aggregate field value is displayed for each and every record, even though it is being calculated on a customer-by-customer basis. Not only is the redundant aggregate data unnecessary, it makes reading the data more difficult.

Using GetGroupState you can test whether or not a particular record is the first record for the group, and if so, display the value for CustomerTotal field. For records that are not the first record in their group (based on the CustIdx index), you simply skip printing. Determining group state and suppressing or displaying the data can be achieved by adding an OnGetText event handler to the CustomerTotal aggregate field. The following is an example of how this event handler might look:  

procedure TForm1.ClientDataSet1CustomerTotalGetText(Sender: TField;
  var Text: String; DisplayText: Boolean);
begin
if gbFirst in ClientDataSet1.GetGroupState(1) then
  Text := Format('%.2m',[StrToFloat(Sender.Value)])
else
  Text := '';
end;

If you want to also suppress the CustNo field, you must add persistent fields for all of the fields in the ClientDataSet that you want to appear in the grid, and then add the following event handler to the CustNo field's OnGetText event handler:

procedure TForm1.ClientDataSet1CustNoGetText(Sender: TField;
  var Text: String; DisplayText: Boolean);
begin
if gbFirst in ClientDataSet1.GetGroupState(1) then
  Text := Sender.Value
else
  Text := '';
end;

The following figure shows the main form from the running CDSAggs project, which demonstrates the techniques described is the article. Notice that the CustNo and CustomerTotals fields are displayed only for the first record in each group (when grouped on CustNo). You can download this sample project from Code Central by clicking here.

About the Author

Cary Jensen is President of Jensen Data Systems, Inc., a Texas-based training and consulting company that won the 2002 Delphi Informant Magazine Readers Choice award for Best Training. He is the author and presenter for Delphi Developer Days (www.DelphiDeveloperDays.com), an information-packed Delphi (TM) seminar series that tours North America and Europe, and Delphi Developer Days Power Workshops, focused Delphi (TM) training. Cary is also an award-winning, best-selling co-author of eighteen books, including Building Kylix Applications (2001, Osborne/McGraw-Hill), Oracle JDeveloper (1999, Oracle Press), JBuilder Essentials (1998, Osborne/McGraw-Hill), and Delphi In Depth (1996, Osborne/McGraw-Hill). For information about onsite training and consulting you can contact Cary at cjensen@jensendatasystems.com, or visit his Web site at www.JensenDataSystems.com.

Click here for a listing of upcoming seminars, workshops, and conferences where Cary Jensen is presenting.

Breaking News: Get hands-on training with Cary Jensen. Jensen Data Systems, Inc. is proud to announce Delphi Developer Days Power Workshops, focused Delphi (TM) training. These intense, two-day workshops give you the opportunity to explore and implement a variety of Delphi techniques with Cary Jensen, one of the world's leading Delphi experts. Workshop topics include ClientDataSet, IntraWeb, and more. Due to the hands-on nature of these workshops, class size is very limited. Reserve your seat now. Click here for more information about Delphi Developer Days Power Workshops, or visit http://www.DelphiDeveloperDays.com.

   

Copyright ) 2002 Cary Jensen, Jensen Data Systems, Inc.
ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT CAN BE COPIED IN ANY FORM WITHOUT THE EXPRESS, WRITTEN CONSENT OF THE AUTHOR.



Server Response from: ETNASC04