The Coad Letter: Modeling and Design Edition, Issue 88, Lists Considered Harmful

By: Coad Letter Modeling Editor

Abstract: In this issue, we look at the use of lists in Java applications.

Short intro note from Peter Coad
Hi There,
I wanted to send you a sample issue, for your consideration. Here it is: the first issue from the all-new site (see below).

Oh what fun we are going to have! I hope you'll join me in this great adventure.

Very best,
Peter Coad
Editor-in-Chief, The Coad Letter
Chairman, Founder, and Chief Strategy Officer, TogetherSoft Corporation
 

Lists Considered Harmful
Issue 88, March 2002


Dear Friend,

Welcome to the first issue of the Modeling and Design Edition of the newly reorganized The Coad Letter. Over the coming months, I'll be sharing with you insights, strategies, patterns and techniques for creating better software.

We kick off this month with a look at the impact of lists on the usability and scalability of software systems. How many times have you had to wait forever for a system to display a long list of things that you then had to scroll through looking for the particular item you want? We look at three strategies for better handling of large lists.

Have fun

Steve

Stephen Palmer (stephen@thecoadletter.com)

PS #1. Yes, you may have heard: I have returned to independent consulting ... specializing in object-oriented analysis, design, programming and process. Contact me via http://www.step-10.com for short or longer term assignments. Peter Coad says I am one of the best designers he has ever worked with. Give me the chance to show you why. Let me help your team build better object-oriented software. More details at http://www.step-10.com.

PS #2. Together ControlCenter 6.0 is officially released! UI builder, 10 new refactorings (total 12), testing, and more. Download today from http://www.togethersoft.com/downloads/


Lists Considered Harmful

User-extensible lists or large fixed-size lists can seriously affect the performance of a software system. Loading the entries of a large list from disk or over a network can be a time-consuming operation. Keeping the contents of large lists in memory can increase the amount of memory required to run an application or cause the operating system to swap memory pages to disk (again slowing performance). If the contents of a list are objects more complex than simple character strings, the problem is worse.

You might think that such performance problems would be identified early in a system's development and appropriate solutions found. However, most developers use small sets of data for unit and integration testing. This means that a performance problem with a large list might not be noticed until formal system testing. By then project deadlines may mean there is not enough time to correct the problem. When a list's contents can be extended by users of the system, a performance problem may not be noticed until the system has been installed and running for months or until a project of large enough size is attempted.

Therefore, when it comes to working with lists, it is definitely a case of think first before coding the simplest solution that comes to mind. In this situation the simplest solution may actually be too simple and affect the ability of the software to scale to truly large sets of users and large projects. For software product vendors, problems like this can seriously affect the ability to sell a software product into large companies.

The following strategies can be used to provide better designs for manipulating larger and growing lists of things. None of the strategies are new or earth shattering; the point is to consider them when first designing the manipulation of a large list and pick one or a combination.


Strategy 2001-03-01: Categorize list entries and present as a tree structure

Place the entries of a large list into a set of mutually exclusive categories. Instead of presenting the user with a single long list of items, ask the user to select a category first and then display only those entries within that category.

Notes:

  • Most graphical user interface (GUI) toolkits provide a tree control that can be used to do this concisely; categories are parent nodes in the tree and the items are leaf nodes.
  • Most GUI tree controls provide a means of showing and hiding the children of a node. Only those entries under nodes that a user 'expands' need be loaded, often reducing the amount of memory used significantly.
  • Enabling the user to redefine the categories and move items from one category to another allows a user to organize the items to better meet their own specific needs.
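The lazy loading described in the second note can be sketched in a few lines of Java. This is a minimal illustration rather than a definitive design: the class name LazyCategoryTree, its methods and the sample pattern names are all assumptions made for the example, and the loader function stands in for whatever disk or network call a real system would make. A GUI toolkit's tree control would call expand and collapse from its own node events.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a two-level tree of categories whose items are
// fetched only when the user expands a category node.
class LazyCategoryTree {
    private final List<String> categories;
    // Stands in for an expensive load from disk or over a network.
    private final Function<String, List<String>> itemLoader;
    // Items of the categories that are currently expanded.
    private final Map<String, List<String>> loaded = new LinkedHashMap<>();

    LazyCategoryTree(List<String> categories, Function<String, List<String>> itemLoader) {
        this.categories = new ArrayList<>(categories);
        this.itemLoader = itemLoader;
    }

    List<String> categories() {
        return Collections.unmodifiableList(categories);
    }

    // Called when the user expands a node; the loader runs once per expansion.
    List<String> expand(String category) {
        if (!categories.contains(category)) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        return loaded.computeIfAbsent(category, itemLoader);
    }

    // Called when the user collapses a node; releases the items held in memory.
    void collapse(String category) {
        loaded.remove(category);
    }

    // Number of items currently held in memory.
    int loadedItemCount() {
        int count = 0;
        for (List<String> items : loaded.values()) {
            count += items.size();
        }
        return count;
    }
}

class LazyCategoryTreeDemo {
    public static void main(String[] args) {
        // Illustrative data only.
        Map<String, List<String>> store = new LinkedHashMap<>();
        store.put("J2EE", Arrays.asList("Session Facade", "Value Object"));
        store.put("Gang of Four", Arrays.asList("Observer", "Visitor", "Composite"));

        LazyCategoryTree tree = new LazyCategoryTree(new ArrayList<>(store.keySet()), store::get);
        System.out.println(tree.loadedItemCount());       // prints 0
        System.out.println(tree.expand("Gang of Four"));  // prints [Observer, Visitor, Composite]
        System.out.println(tree.loadedItemCount());       // prints 3
    }
}
```

Releasing items on collapse trades memory for a repeat load if the user expands the same node again; whether that is the right trade depends on how expensive the load is.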

Examples:

  1. Filesystems on many computers (e.g. Unix, Mac and Windows) use directories or folders to present large numbers of individual files within a tree structure.
  2. Many email clients use folders to organize a large number of messages as a tree.
  3. TogetherSoft's Together ControlCenter presents its list of automated design patterns as a tree control categorized by the pattern's applicability or origin (e.g. J2EE patterns, Gang of Four design patterns, user interface patterns, Coad class archetypes, etc.).

Advantages:

  • This strategy usually offers a reasonably straightforward refactoring: replace a list control with a tree control. However, it is definitely more efficient to decide to use a tree first than to code a list and replace it later.
  • Provides an easy way for a user to locate a particular item if the category of the item is known.

Disadvantages:

  • Makes it much harder to find a particular item if the category is not known.
  • As more and more items are added, the categories need to be reorganized and extra levels added.
  • A tree structure only provides a single categorization scheme; an item belongs to one and only one category. Extensions such as links and shortcuts to other categories can be used to relieve this constraint but at the cost of significant additional complexity.

This strategy does not really solve the underlying problem of a growing list; it only helps ease the pain for a while. Imagine a system that lists all employees. In a rapidly growing company a simple list might be sufficient for the first couple of years. When the company has a few hundred employees a tree control might provide a good enough mechanism for locating a particular employee. However, for a multi-national company neither a simple list nor a tree is going to suffice.


Strategy 2001-03-02: Replace a simple list with a search operation

Instead of presenting all the items in one long, alphabetically sorted list, provide the user with the ability to search for a small subset of items using a set of criteria.

Notes:

  • In business systems it is important to select a useful set of search criteria; talk to the user representatives and, if possible, watch how they currently locate these items in their daily work.
  • If search criteria can be saved, a user can build up a set of useful searches so that the criteria do not have to be remembered and re-entered each time.
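As a rough illustration of saved search criteria, the sketch below keeps criteria as plain data under a user-chosen label and re-runs them on demand. Everything here is an assumption made for the example: the Employee, SearchCriteria and SavedSearches classes, the two criteria fields, and the in-memory filtering. A real business system would more likely translate the criteria into a query against its data store and persist the saved searches somewhere durable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical item type used only for this sketch.
class Employee {
    final String name;
    final String department;

    Employee(String name, String department) {
        this.name = name;
        this.department = department;
    }
}

// Criteria are plain data, so they can be stored and reused later.
class SearchCriteria {
    final String namePrefix;  // null means "any name"
    final String department;  // null means "any department"

    SearchCriteria(String namePrefix, String department) {
        this.namePrefix = namePrefix;
        this.department = department;
    }

    boolean matches(Employee e) {
        return (namePrefix == null || e.name.startsWith(namePrefix))
            && (department == null || e.department.equals(department));
    }
}

class SavedSearches {
    private final Map<String, SearchCriteria> saved = new LinkedHashMap<>();

    void save(String label, SearchCriteria criteria) {
        saved.put(label, criteria);
    }

    // Re-runs a saved search, capping the result so a loose set of
    // criteria cannot hand the user another enormous list.
    List<Employee> run(String label, Collection<Employee> all, int maxResults) {
        SearchCriteria criteria = saved.get(label);
        if (criteria == null) {
            throw new NoSuchElementException("No saved search named " + label);
        }
        List<Employee> hits = new ArrayList<>();
        for (Employee e : all) {
            if (hits.size() >= maxResults) {
                break;
            }
            if (criteria.matches(e)) {
                hits.add(e);
            }
        }
        return hits;
    }
}

class SavedSearchesDemo {
    public static void main(String[] args) {
        List<Employee> staff = Arrays.asList(
            new Employee("Ann Lee", "Sales"),
            new Employee("Anna Kim", "Engineering"),
            new Employee("Bob Ray", "Sales"));

        SavedSearches searches = new SavedSearches();
        searches.save("Sales people starting with An", new SearchCriteria("An", "Sales"));

        for (Employee e : searches.run("Sales people starting with An", staff, 50)) {
            System.out.println(e.name);  // prints Ann Lee
        }
    }
}
```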

Examples:

  1. Filesystems on many computers (e.g. Unix, Mac and Windows) provide a file search capability in addition to the hierarchical categorization of directories or folders.
  2. Many business systems provide a search facility for locating customer details.
  3. Many e-commerce sites provide a search facility for quickly locating a particular product.

Advantages:

  • Users can use multiple criteria to try to locate a particular item.
  • When backed by indexing techniques a search can be a much faster way to locate a particular item.
  • Searching is one of the most established branches of computer science so an efficient algorithm is usually readily available.
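One simple form of indexing is to keep the searchable key in sorted order so that a prefix search can jump straight to the first candidate instead of scanning every entry. The sketch below uses the standard java.util.TreeSet for this; the NameIndex class and its method names are hypothetical, the matching is case-sensitive, and a production system would more likely rely on a database index or a dedicated search engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of an in-memory sorted index for prefix search.
class NameIndex {
    private final TreeSet<String> sorted = new TreeSet<>();

    NameIndex(Collection<String> names) {
        sorted.addAll(names);
    }

    void add(String name) {
        sorted.add(name);
    }

    // All names sharing a prefix are adjacent in sorted order, so we seek
    // to the first name >= prefix and stop at the first non-match.
    List<String> findByPrefix(String prefix, int maxResults) {
        List<String> hits = new ArrayList<>();
        for (String name : sorted.tailSet(prefix, true)) {
            if (hits.size() >= maxResults || !name.startsWith(prefix)) {
                break;
            }
            hits.add(name);
        }
        return hits;
    }
}

class NameIndexDemo {
    public static void main(String[] args) {
        NameIndex index = new NameIndex(
            Arrays.asList("Clark", "Baker", "Adams", "Barnes", "Bailey"));
        System.out.println(index.findByPrefix("Ba", 10));  // prints [Bailey, Baker, Barnes]
        System.out.println(index.findByPrefix("Ba", 2));   // prints [Bailey, Baker]
    }
}
```

The seek costs time proportional to the logarithm of the index size, after which only the matching entries are visited, whereas an unindexed scan touches every entry regardless of how few match.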

Disadvantages:

  • Poor selection of search criteria can still result in a large list of items being presented to the user or the particular item not being found.
  • Poor implementation can make searching a very expensive and time-consuming operation.

Strategy 2001-03-03: Enable a user to define, name, save and load subsets

Enable a user to specify a subset of the whole list, give that subset a name, save it and then load it as and when desired.

Notes:

  • In a distributed system, designers need to choose between storing the subsets on the client or on the server. Storing on the server usually requires more work but subsets can be shared between users.
  • Storing on the client often means faster retrieval but useful subsets cannot be shared between users as easily.
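A named subset can be as little as a label mapped to the keys of the items it contains, with the items themselves fetched only when the subset is loaded. The following is a minimal sketch under that assumption; the NamedSubsets class, its methods and the account keys are invented for the example, and the map is held in memory here where a real system would write it to client or server storage as discussed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: subsets are stored as names mapped to item keys.
class NamedSubsets {
    private final Map<String, Set<String>> keysBySubset = new LinkedHashMap<>();

    // Saves (or overwrites) a subset under the given name.
    void save(String subsetName, Collection<String> itemKeys) {
        keysBySubset.put(subsetName, new LinkedHashSet<>(itemKeys));
    }

    Set<String> subsetNames() {
        return Collections.unmodifiableSet(keysBySubset.keySet());
    }

    // Loads only the items in the subset; fetchItem stands in for a
    // lookup by key against the full list on disk or on a server.
    <T> List<T> load(String subsetName, Function<String, T> fetchItem) {
        Set<String> keys = keysBySubset.get(subsetName);
        if (keys == null) {
            throw new NoSuchElementException("No subset named " + subsetName);
        }
        List<T> items = new ArrayList<>();
        for (String key : keys) {
            T item = fetchItem.apply(key);
            if (item != null) {  // skip keys whose item no longer exists
                items.add(item);
            }
        }
        return items;
    }
}

class NamedSubsetsDemo {
    public static void main(String[] args) {
        // Illustrative data only: account key -> account holder.
        Map<String, String> accounts = new LinkedHashMap<>();
        accounts.put("A-1", "Holder One");
        accounts.put("A-2", "Holder Two");
        accounts.put("A-3", "Holder Three");

        NamedSubsets subsets = new NamedSubsets();
        subsets.save("Key accounts", Arrays.asList("A-3", "A-1"));

        System.out.println(subsets.load("Key accounts", accounts::get));
        // prints [Holder Three, Holder One]
    }
}
```

Because the subset stores keys rather than a query, loading it never touches items outside the subset. The flip side is that an item added to the full list later does not appear in any saved subset until a user adds it.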

Examples:

  1. The list of possible stereotypes in UML is growing daily. For a tool vendor to present in a list all the possible stereotypes an element can take is rapidly becoming impractical. UML Profiles provide a mechanism where a subset of stereotypes can be named, saved and loaded so that a user only loads the subset of stereotypes relevant to the task on which they are working.
  2. A bank manager might save a subset of his most important account holders.
  3. A planner might save a subset of possible tasks representing a particular process.

Advantages:

  • Users only load the items they need to work with, reducing the number of items that need to be loaded from disk or across a network into memory.
  • Unlike saving a set of search criteria, there is no potentially expensive operation to be performed before the set of items is presented to the user.
  • If there are many lists to which this strategy can be applied, then these sets can themselves be collected together into named themes or profiles or project templates.

Disadvantages:

  • A newly added item may not be noticed by someone working with statically defined subsets.

Combining Strategies

We have already mentioned that computer filesystems tend to combine a tree structure and a search facility to help users locate files. Other combinations can be very useful too. A search could return its results in a tree form, helping the user learn the categories used. The results of a search could be used to form a named subset of items. Larger named subsets could be presented as a small tree structure. And so on.
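To make the first of these combinations concrete, here is a small sketch of a search that returns its hits grouped by category, ready to be shown as a two-level tree. The PatternEntry and SearchResultTree names, the substring matching and the sample entries are assumptions made for the illustration, not a description of how any particular tool does it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical item type: a name plus the single category it belongs to.
class PatternEntry {
    final String name;
    final String category;

    PatternEntry(String name, String category) {
        this.name = name;
        this.category = category;
    }
}

class SearchResultTree {
    // Returns matching names grouped under their category, with the
    // categories in alphabetical order; categories with no hits are omitted.
    static Map<String, List<String>> search(List<PatternEntry> all, String nameFragment) {
        String needle = nameFragment.toLowerCase(Locale.ROOT);
        Map<String, List<String>> tree = new TreeMap<>();
        for (PatternEntry entry : all) {
            if (entry.name.toLowerCase(Locale.ROOT).contains(needle)) {
                tree.computeIfAbsent(entry.category, k -> new ArrayList<>()).add(entry.name);
            }
        }
        return tree;
    }
}

class SearchResultTreeDemo {
    public static void main(String[] args) {
        // Illustrative data only.
        List<PatternEntry> patterns = Arrays.asList(
            new PatternEntry("Abstract Factory", "Gang of Four"),
            new PatternEntry("Factory Method", "Gang of Four"),
            new PatternEntry("Session Facade", "J2EE"),
            new PatternEntry("Observer", "Gang of Four"));

        System.out.println(SearchResultTree.search(patterns, "fac"));
        // prints {Gang of Four=[Abstract Factory, Factory Method], J2EE=[Session Facade]}
    }
}
```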

Summary

Working with a list? Find out how large it is or could become. If it could grow to hundreds or thousands of entries consider each of the above strategies (preferably with a user representative). Pick one or a combination and provide your system with the ability to scale well to use by large organizations.


Published on: 3/26/2002 12:00:00 AM

Copyright© 1994 - 2013 Embarcadero Technologies, Inc. All rights reserved.