Filtering ClientDataSets

By: Cary Jensen

Abstract: When applied to a dataset, a filter limits the records that are accessible. This article explores the ins and outs of filtering ClientDataSets.

This article is part of an extended series exploring the ClientDataSet in detail. In case you are new to this series, a ClientDataSet is a component that provides an in-memory table that can be manipulated easily and efficiently. Previous articles in this series have provided a broad overview of ClientDataSet usage, but in the past two installments I have been covering the essential, basic operations involving ClientDataSet. In this article I am completing the discussion of foundation issues with a look at dataset filtering.

When you filter a dataset, you restrict access to a subset of records contained in the ClientDataSet's in-memory store. For example, imagine that you have a ClientDataSet that includes one record for every one of your company's customers, world-wide. Without filtering, all customer records are accessible in the dataset. That is, it is possible to navigate, view, and edit any customer in the dataset. Through filtering you can make the ClientDataSet appear to include only those customers who live in the United States, or in London, England, or who live on a street named Enterprise Way. This example, of course, assumes that there is a field in the ClientDataSet that contains country names, or fields containing City and Country names, or a field holding street names. In other words, a filter limits the accessible records based on data that is stored in the ClientDataSet, and is effective the the extent that the data in the ClientDataSet can be used to limit which records are accessible.

A ClientDataSet supports two fundamentally different mechanisms for creating filters. The first of these involves a range, which is an index-based filtering mechanism. The second, called a filter, is more flexible than ranges, but is slower to apply and cancel. Both of these approaches to filtering are covered in this article.

But before addressing filtering directly, there are a couple of important points that need to be made. The first is that filtering is a client-side operation. Specifically, the filters discussed in this article are applied to the data loaded into a ClientDataSet's in-memory store. For example, you may load 10,000 records into a ClientDataSet (every customer record, for instance), and then apply a filter that limits access to only those customers located in New York City. Once applied, the filter may make the ClientDataSet to appear to contain only 300 records (given that 300 of your customer's are located in New York City). Although the filtered ClientDataSet provides access only to these 300 records, all 10,000 records remain in memory. In other words, a filter does not reduce the overhead of your ClientDataSet, it simply restricts access to a subset of the ClientDataSet's records.

The second point is that instead of using a filter, you may be better off limiting how many records you load into the ClientDataSet in the first place. Consider the 10,000 customer records once again. Instead of loading all 10,000 records into memory, and then filtering on the City field, it might be better to load only a subset of the customer records into the ClientDataSet. While partial loading not available when a ClientDataSet is loaded from the local file system using MyBase, it is an option when loading a ClientDataSet through a DataSetProvider. 

For example, imagine that your DataSetProvider points to a SQLDataSet whose CommandText contains the following SQL query:

SELECT * FROM CUSTOMER WHERE CITY = 'New York City'

When the ClientDataSet's Open method is called, this SQL select statement is executed, and only those 300 or so records from your New York City-based customers are loaded into the ClientDataSet. This approach greatly reduces the memory overhead of the ClientDataSet, since fewer records need to be stored in memory.

Actually, there are a number of  techniques that permit you to load selected records from a dataset through a DataSetProvider into a ClientDataSet, including the use of parameterized queries, nested datasets, dynamic SQL, among others. An thorough examination of these techniques will appear in a future article in this series. Nonetheless, from the perspective of this article, these techniques are not technically filtering, since they do not limit access within the ClientDataSet to a subset of its loaded records..

So when do you use filtering as opposed to loading only selected records into a ClientDataSet? The answer boils down to three basic issues: bandwidth, source of data, and client-side features.

When loading a ClientDataSet from DataSetProvider, and bandwidth is low, as is often the case in distributed applications, it is normally best to load only selected records. In this situation, loading records that are not going to be displayed consumes bandwidth unnecessarily, affecting the performance of your application as well as that of others that share the bandwidth. On the other hand, if bandwidth is plentiful and the entire dataset is relatively small, it is often easier to load all data and filter on those records you want displayed.

The second consideration is data location. If you are loading data from a previously saved ClientDataSet (in either Borland's proprietary binary format or in XML format), you have no choice. Filtering is the only option for showing just a subset of records. Only when you are loading data through a DataSetProvider do you have a choice to use a filter or selective loading of data.

The final consideration is related to client-side features, the most common of which is speed. Once data is loaded into a ClientDataSet, most filters are applied very quickly, even when a large amount of data needs to be filtered. As a result, filtering permits you to rapidly alter which subset of records are displayed. A simple click of a button or menu selection can almost instantly switch your ClientDataSet from displaying customers from New York City to displaying customers from Frankfurt, Germany, without a network round-trip.

As mentioned earlier, there are two basic approaches to filtering: ranges and filters. Let's start by looking at ranges.

    Setting a Range

Ranges, while less flexible than filters, provide the fastest option for displaying a subset of records from a ClientDataSet. In short, a range is an index-based mechanism for defining the low and high values of records to be displayed in the ClientDataSet. For example, if the current index is based on customer's last name, a range can be used to display all customer's whose last name is 'Jones.' Or, a range can be used to display only customer's whose last name begins with the letter 'J'. Similarly, if a ClientDataSet is indexed on an integer field called Credit Limit, a range can be used to display only those customers whose credit limit is greater than (US) $1,000, or between $0 and $1000.

There are two ways to set a range. The first, and easiest, is to use the SetRange method. SetRange defines a range using a single method invocation. The second mechanism is to enter the dsSetKey state, which requires a minimum of three method calls, and often four. 

In Delphi and Kylix, SetRange has the following syntax:

procedure SetRange(const StartValues, EndValues: array of const);

As you can see from this syntax, you pass two constant arrays when you call SetRange. The first array contains the low values of the range values for the fields of the index, with the first element in the array being the low end of the range for the first field in the index, the second element being the low end of the range for the second field in the index, and so on. The second array contains the high end values for the index fields, with the first element in the second array being the high end of the range on the first field of the index, the second element being the high end on the second field of the index, and so forth. These arrays can contain fewer elements than the number of fields in the current index, but cannot contain more.

Consider again our example of a ClientDataSet that holds all customer records. Given that there is a field in this dataset named 'City,' and you want to display only records for customers who live in New York City, you can use the following statements:

ClientDataSet1.IndexFieldNames := 'City';
ClientDataSet1.SetRange(['New York City'], ['New York City']);

The first statement creates a temporary index on the City field, while the second sets the range. Of course, if the ClientDataSet was already using an index where the first field of the index was the City field, you would omit the first statement in the preceding code segment.

The preceding example set the range on a single field, but it is often possible to set a range on two or more fields of the current index. For example, imagine that you want to display only those customers whose last name is Walker and who live in San Antonio, Texas. The following statements show you how:

ClientDataSet1.IndexFieldNames := 'LastName;City;State';
ClientDataSet1.SetRange(['Walker', 'San Antonio', 'TX'], ['Walker', 'San Antonio', 'TX']);

In both of these preceding examples the beginning and ending ranges contained the same values. But this is not always the case. For example, imagine that you want to set a range to include only those customers whose credit limit is greater than (US) $1,000. This can be accomplished using statements similar to the following:

ClientDataSet1.IndexFieldNames := 'CreditLimit';
ClientDataSet1.SetRange([1000], [MaxInt]);

    Using ApplyRange

In a previous article in this series you learned that there are two index-based methods for locating a record based on an exact match. One, FindKey, is a self-contained statement for locating a record based on fields of the current index. By comparison, GotoKey is more involved, requiring you to first call SetKey to enter the dsSetKey state, during which you define your search criteria, and then complete the operation with a call to GotoKey. SetRange is similar to FindKey, where a single statement defines the range as well as sets it. ApplyRange, by comparison, is similar to GotoKey. 

To use ApplyRange you begin by calling SetRangeStart (or EditRangeStart). Doing so places the ClientDataSet in the dsSetKey state. While in this state you assign values to one or more of the TFields involved in the current index to define the low values of the range. As is the case with SetRange, if you define a single low value, it must be to the first field of the current index. If you define a low range value for two fields, they must necessarily be the first two fields of the index.

After setting the low range values, you call SetRangeEnd (or EditRangeEnd). You now assign values to one or more fields of the current index to define the high values for the range. Once both the start values and end values have been set, you call ApplyRange to filter the ClientDataSet on the defined range.

For example, the following statements use ApplyRange to display only customers who live in New York City in the customer table.

ClientDataSet1.IndexFieldNames := 'City';
ClientDataSet1.SetRangeStart;
ClientDataSet1.FieldByName('City').Value := 'New York City';
ClientDataSet1.SetRangeEnd;
ClientDataSet1.FieldByName('City').Value := 'New York City';
ClientDataSet1.ApplyRange;

Just like SetRange, ApplyRange can be used to set a range on more than one field of the index, as shown in the following example.

ClientDataSet1.IndexFieldNames := 'LastName;City;State';
ClientDataSet1.SetRangeStart;
ClientDataSet1.FieldByName('LastName').Value := 'Walker';
ClientDataSet1.FieldByName('City').Value := 'San Antonio';
ClientDataSet1.FieldByName('State').Value := 'TX';
ClientDataSet1.SetRangeEnd;
ClientDataSet1.FieldByName('LastName').Value := 'Walker';
ClientDataSet1.FieldByName('City').Value := 'San Antonio';
ClientDataSet1.FieldByName('State').Value := 'TX';
ClientDataSet1.ApplyRange;

Both of the preceding examples made use of SetRangeStart and SetRangeEnd. In some cases, you can use EditRangeStart and/or EditRangeEnd instead. In short, if you have already set low and high values for a range, and want to modify some, but not all, values, you can use EditRangeStart and EditRangeEnd. Calling SetRangeStart clears any previous values in the range. By comparison, if you call EditRangeStart, the previously defined low values remain in the range fields. If you want to change some, but not all, of the low range values, call EditRangeStart and modify only those fields whose low values you want to change. Likewise, if you want to change some, but not all, of the high range values, do so by calling EditRangeEnd.

For example, the following code segment will display all records where the customer's credit limit is between (US) $1,000 and (US) $5,000.

ClientDataSet1.IndexFieldNames := 'CreditLimit';
ClientDataSet1.SetRange([1000],[5000]);

If you then want to set a range between $1,000 and $10,000, you can do so using the following statements:

ClientDataSet1.EditRangeEnd;
ClientDataSet1.FieldByName('CreditLimit').Value := 10000;
ClientDataSet1.ApplyRange;

    Canceling a Range

Whether you have created a range using SetRange or ApplyRange, you cancel that range by calling the ClientDataSet's CancelRange method. The following example demonstrates how a call to CancelRange looks in code:

ClientDataSet1.CancelRange;

    A Comment About Ranges

Earlier in this article I mentioned that it is 'sometimes' possible to set a range on two or more fields. The implication of this statement is that sometimes it is not, which is true. When setting a range on two or more fields, only the last field of the range can specify a range of values, all other fields must have the same value for both the low and high ends of the range. For example, the following range will display all records where the credit limit is between $1,000 and $5,000 for customers living in New York City.

ClientDataSet1.IndexFieldNames := 'City;CreditLimit';
ClientDataSet1.SetRange(['New York City', 1000], ['New York City', 5000]);

By comparison, the following statement will display all records for customers whose credit limit is between $1,000 and $5,000, regardless of which city they live in.

ClientDataSet1.IndexFieldNames := 'CreditLimit;City';
ClientDataSet1.SetRange([1000, 'New York City'], [5000, 'New York City']);

The difference between these two ranges is that in the first range, the low and high value in the first field of the range was a constant value, New York City. In the second, a range appears (1000-5000). In this case, the second field of the range is ignored.

There is another aspect of ranges that is rather odd when working with ClientDataSets. This is related to the KeyExclusive property inherited by the ClientDataSet from TDataSet. Normally, this property can be used to define how ranges are applied. When KeyExclusive if False (its default value), the range includes both the low and high values of the range. For example, if you set a range on CreditLimit to 1000 and 5000, records where the credit limit is 1000 or 5000 will appear in the range. If KeyExclusive is set to True, only customer records where the credit limit is greater than 1000 but less than 5000 would appear in the range. Customers with credit limits of exactly 1000 or 5000 will not.

Maybe its me, but when I try to programmatically set the KeyExclusive property on a ClientDataSet it raises an exception. I have concluded from this that KeyExclusive does not apply to ClientDataSets. If you can get KeyExclusive to work with ClientDataSets in Delphi 6 or Delphi 7, I'd like to know.

    Using Filters

Because ranges rely on indexes, they are applied very quickly. For example, on a 100,000 record table, with an index on the FirstName field, setting a range to show only records for customers where the first name is Scarlett was applied in less than 10 milliseconds on a 850 MHz Pentium III with 512 MB RAM (the resulting view contained only 133 records).

Filters, by comparison, do not use indexes. Instead, they operate by evaluating the records of the ClientDataSet, displaying only those records that pass the filter. Since filters do not use indexes, they are not as fast (filtering on the first name Scarlett took just under 500 milliseconds on the same database). However, they are much more flexible.

A ClientDataSet has four properties that apply to filters. These are Filter, Filtered, FilterOptions, and OnFilterRecord (an event property). In its simplest case, a filter requires the use of two of these properties:  Filter and Filtered. Filtered is a Boolean property that you use to turn on and off the filter. If you want to filter records, set Filtered to True, otherwise set Filtered to False (the default value).

Hide image

When Filtered is set to True, the ClientDataSet uses the value of the Filter property to identify which records will be displayed. You assign to this property a Boolean expression containing at least one comparison operation involving at least one field in the dataset. You can use any comparison operators, include =, >, <, >=, <=, and <>. As long as the field name does not include any spaces, you include the field name directly in the comparison without delimiters. For example, if your ClientDataSet includes a field named City, you can set the Filter property to the following expression to display only customers living in New York City:

City = 'New York City'

Note that the single quotes are required here, since New York City is a string. If you want to assign a value to the Filter property at runtime, you must include the single quotes in the string that you assign to the property. The following is one example of how to do this:

ClientDataSet1.Filter := 'City = ' + QuotedStr('New York City');

The preceding code segment used the QuotedStr function, which is located in the SysUtils unit. The alternative is to use something like the following. Personally, I prefer using QuotedStr, as it is much easier to debug and maintain.

ClientDataSet1.Filter := 'City = ''Freeport''';

In the preceding examples the field name of the field in the filter did not include spaces. If one or more fields that you want to use in a filter include spaces in their field names, enclose those field names in square braces. (Square braces can also be used around field names that do not include spaces.) For example, if your ClientDataSet contains a field named 'Last Name,' you can use a statement similar to the following to create a filter.

ClientDataSet1.Filter := '[Last Name] = ' + QuotedStr('Williams');

These examples have demonstrated only simple expressions. However, complex expressions can be used. Specifically, you can combine two or more comparisons using the AND, OR, and NOT logical operators. Furthermore, more than one field can be involved in the comparison. For example, you can use the following Filter to limit records to those where the City field is San Francisco, and the last name is Martinez:

ClientDataSet1.Filter := '[City] = '+ QuotedStr('San Francisco') +
'and [Last Name] = ' + QuotedStr('Martinez');

Assigning a value to the Filter property does not automatically mean that records will be filtered. Only when the Filtered property is set to True does the Filter property actually produce a filtered dataset. Furthermore, if the Filter property contains an empty string, setting Filtered to True has no effect.

By default, filters are case sensitive and perform a partial match to the filter criteria. You can influence this behavior using the FilterOptions property. This property is a set property that can contain zero or more of the following two flags:  foCaseInsensitive and foNoPartialMatch. When foCaseInsensitive is included in the set, the filter is not case sensitive. 

When foNoPartialMatch is included in the set, partial matches are excluded from the filtered DataSet. When foNoPartialCompare is absent from the FilterOptions property, partial matches are identified by an asterisk ('*') in the last character of your filter criteria. All fields whose contents match the characters to the left of the asterisk are included in the filter. For example, consider the following filter:

ClientDataSet1.Filter := 'City = '+ QuotedStr('San *');

This so long as foNoPartialCompare is absent from the FilterOptions property, this filter will include any city whose name begins 'San ,' such as San Francisco or San Antonio.

Partial matches can also be used with compound Boolean expressions. For example, the following filter will display all customer's whose names begin with the letter M, and who live in a city whose name begins with 'New,' such as Newcastle or New York City.

ClientDataSet1.Filter := 'City = '+ QuotedStr('New*') +
'and [Last Name] = ' + QuotedStr('M*');

    Using the OnFilterRecord Event Handler

There is a second, somewhat more flexible way to define a filter. Instead of using the Filter property, you can attach code to the OnFilterRecord event handler. When Filtered is set to True, this event handler triggers for every record in the dataset. When called, this event handler is passed a Boolean parameter by reference, named Accept, that you use to indicate whether or not the current record should be included in the filtered view. From within this event handler, you can perform almost any test you can imagine. For example, you can verify that the current record is associated with a record in another table. If, based on this test, you wish to exclude the current record from the view, you set the value of the Accept formal parameter to False. This parameter is True by default.

The Filter property normally consists of one or more comparisons involving values in fields of the ClientDataSet. OnFilterRecord event handlers, however, can include any comparison you want. And there lies the danger. Specifically, if the comparison that you perform in the OnFilterRecord event handler is time consuming, the filter will be slow. In other words, you should try to optimize any code that you place in an OnFilterRecord event handler, especially if you need to filter a lot of records.

The following is a simple example of an OnFilterRecord event handler.

procedure TForm1.ClientDataSet1FilterRecord(DataSet: TDataSet;
var Accept: Boolean);
begin
Accept := ClientDataSet1.Fields[1].AsString = 'Scarlett';
end;

    Navigating Using a Filter

Whether you have set Filtered to True or not, you can still use a Filter for the purpose of navigating selected records. For example, although you may want to view all records in a database, you may want to quickly move between records that meet specific criteria. For example, you may want to be able to quickly navigate between those records where the customer has a credit limit in excess of (US) $5,000.

A ClientDataSet exposes four methods for navigating using a filter. These methods are FindFirst, FindLast, FindNext, and FindPrior. When you execute one of these methods, the ClientDataSet will locate the requested record based on the current Filter property, or OnFilterRecord event handler. This navigation, however, does not require that the Filtered property be set to True. In other words, while all records of the ClientDataSet may be visible, the filter can be used to quickly navigate between those records that match the filter.

When you execute the methods FindNext or FindPrior, the ClientDataSet sets a property named Found. If Found is True, a next record or a prior record was located, and is now the current record. If Found returns False, the attempt to navigate failed.

    Using Ranges and Filter Together

Ranges make use of indexes, and are very fast. Filters are slower, but are more flexible. Fortunately, both ranges and filters can be used together. Using ranges with filters is especially helpful when a you cannot use a range alone, and your filter is a complicated one that would otherwise take a long time to apply. In those situations, it is best to first set a range, limiting the number of records that need to be filtered to the smallest possible set that includes all records of the filter. The filter can be applied on the resulting range. Since fewer records need to be evaluated for the filter, the combined operations will be faster than using a Filter alone. 

    An Example

The various filter-related techniques discussed in this article, with the exception of the OnFilterRecord event handler, are demonstrated in the CDSFilter project, which can be downloaded from Code Central by clicking here. The main form of this project is shown in the following figure.

Hide image
Click to see full-sized image

In addition to demonstrating the use of SetRange, ApplyRange, Filter, Filtered, and FilterOptions, this project also provides you with feedback concerning filter performance. The following figure shows a large dataset that has not yet had its range set.

Hide image
Click to see full-sized image

In the following figure, a range has been applied, and only one record appears (out of 100,000 records). In this case, the range was applied in less than 10 milliseconds.

Hide image
Click to see full-sized image

    About the Author

Cary Jensen is President of Jensen Data Systems, Inc., a Texas-based training and consulting company that won the 2002 Delphi Informant Magazine Readers Choice award for Best Training. He is the author and presenter for Delphi Developer Days (www.DelphiDeveloperDays.com), an information-packed Delphi (TM) seminar series that tours North America and Europe, and Delphi Developer Days Power Workshops, focused Delphi (TM) training. Cary is also an award-winning, best-selling co-author of eighteen books, including Building Kylix Applications (2001, Osborne/McGraw-Hill), Oracle JDeveloper (1999, Oracle Press), JBuilder Essentials (1998, Osborne/McGraw-Hill), and Delphi In Depth (1996, Osborne/McGraw-Hill). For information about onsite training and consulting you can contact Cary at cjensen@jensendatasystems.com, or visit his Web site at www.JensenDataSystems.com.

Click here for a listing of upcoming seminars, workshops, and conferences where Cary Jensen is presenting.

Breaking News: Get hands-on training with Cary Jensen. Jensen Data Systems, Inc. is proud to announce Delphi Developer Days Power Workshops, focused Delphi (TM) training. These intense, two-day workshops give you the opportunity to explore and implement a variety of Delphi techniques with Cary Jensen, one of the world's leading Delphi experts. Workshop topics include ClientDataSet, IntraWeb, and more. Due to the hands-on nature of these workshops, class size is very limited. Reserve your seat now. Click here for more information about Delphi Developer Days Power Workshops, or visit http://www.DelphiDeveloperDays.com.

Hide image
   Hide image

Copyright ) 2002 Cary Jensen, Jensen Data Systems, Inc.
ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT CAN BE COPIED IN ANY FORM WITHOUT THE EXPRESS, WRITTEN CONSENT OF THE AUTHOR.

Server Response from: ETNASC01