CodeGear - The Drive for Quality Products

By: Chris Pattinson

Abstract: Insight from CodeGear QA Manager Chris Pattinson on Testing Automation Improvements in 2006


By Chris Pattinson, CodeGear QA Manager. January 13, 2007.

Test early. Test often. This is the CodeGear QA mantra, and we’re working to prove it by action – not just words. I’ve been with Borland for over six years now, starting as an International QA Engineer and working my way up the ranks through QA Lead, International QA Manager and International Manager to my current role as the team’s QA Manager.

Engineering quality into the product is both a feature and a development effort. It takes significant effort not only to test your product, but to understand the results of that testing. Once you understand those results, you need to act on them to improve the product. This cycle is more complex in the world of technology, where platforms change, development requirements change, and operating systems can be updated so that code that worked last year no longer behaves as expected. For example, I worked on all three Kylix projects, and Linux was one heck of a moving target to hit. Just when you fix a feature so it works on one operating system, a new one is released that may invalidate a portion of the recent work.

In 2006, the engineering group focused on improving our automated test system. There are numerous components to this – in the past we ran a number of automated tests in a ‘semi-automatic’ fashion, where individual engineers had to build and run the test executables themselves. This made for difficult maintenance, unpredictable results, and more questions than answers.

To improve the system, we first targeted our framework. Internally we call this framework ‘Zombie’, and Steve Trefethen (of R&D) led the charge to dramatically improve its reliability. He implemented a number of improvements, including an automation model generation system. Models are Delphi classes containing the component information that automated tests use to connect to and reference parts of the Galileo IDE. (The Galileo IDE is used for Delphi, C++Builder, C#Builder, Turbo Delphi, Turbo C++ and Turbo C#.) A model is required for every part of the IDE that is tested. Previously, changes to the IDE required hand-editing these models to keep tests from breaking. Now we reference automatically generated models, and the testing world at CodeGear is a MUCH happier place.
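
To make the idea of a model more concrete, here is a minimal sketch of what one generated model class might look like. The real Zombie framework is internal to CodeGear, so the unit name, base class and component path below are illustrative assumptions only, not actual CodeGear code.

unit ProjectManagerModel;

{ A minimal sketch of what one auto-generated automation model might look
  like.  The real Zombie framework is internal to CodeGear, so the unit name,
  base class and component path below are illustrative assumptions only. }

interface

uses
  Classes;

type
  // Base class assumed to be supplied by the automation framework, which
  // also provides the actual implementation of FindIDEComponent.
  TAutomationModel = class
  protected
    // Resolves a dotted component path inside the running Galileo IDE.
    function FindIDEComponent(const APath: string): TComponent; virtual; abstract;
  end;

  // One generated class per IDE area under test.  Regenerating it whenever
  // the IDE changes keeps the component references in sync without hand edits.
  TProjectManagerModel = class(TAutomationModel)
  public
    function ProjectTree: TComponent;
  end;

implementation

function TProjectManagerModel.ProjectTree: TComponent;
begin
  Result := FindIDEComponent('ProjectManagerForm.ProjectTree');
end;

end.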

In addition, Justin Swett developed and implemented a centralized automation reporting system. In the past it was difficult to summarize our testing position and effort. How many tests were running? On how many builds? How often? What was the summary and history of results? Were trends up or down?

Now we have a system that provides a huge amount of summarized information, as well as charting of history and views of previous test runs. This allows engineering to see at a glance how the product is performing today compared to yesterday.

Image 1: Summary Report

The Summary Report shows the results of tests run against a build on 1/13/2007 at 2:35:56 AM. The top section lists each test suite name, followed by the number of tests that ran and a color code representing the percentage of passing tests. Dark blue means 100% success, a red dot marks a suite that completely failed, and the other colors represent the levels in between. QA is responsible for reviewing the results of these tests after each build and providing feedback to R&D on any changes in behavior.

The bottom section shows a summary of results such as build number, date run, area, OS, SKU, duration of tests, number of tests, and on the right side a summary of results from the previous run of the same test for easy comparison.
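
As a rough illustration only – the actual reporting schema is internal – those fields could be captured per run in a Delphi record along these lines:

type
  { Hypothetical shape of one row in the automation summary report; the real
    reporting system is internal, so these field names are assumptions. }
  TTestRunSummary = record
    BuildNumber: string;      // integration build identifier
    DateRun: TDateTime;       // when the run started
    Area: string;             // product area covered by the suite
    OS: string;               // operating system of the test VM
    SKU: string;              // product edition under test
    DurationSecs: Integer;    // elapsed time of the run
    TestCount: Integer;       // number of tests executed
    PassCount: Integer;       // number of tests that passed
    PrevPassCount: Integer;   // previous run of the same suite, for comparison
  end;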

Image 2: Historical Charting

That was the beginning. The QA team then leveraged existing VMware knowledge and work to run test automation against every build. First we started with a product ‘smoketest’: a suite that currently includes 75 basic functional tests exercising common user operations in the IDE, such as creating a new application, editing the source code, building, debugging, using Code Insight, and closing. This ‘smoketest’ runs on every build.
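
For illustration, a single smoketest step might look something like the sketch below. DUnit’s TestFramework unit is real and ships with the IDE, but the IIDEDriver interface and its methods stand in for the internal Zombie automation API and are assumptions, not the real framework classes.

unit SmokeTests;

{ A minimal sketch of what one smoketest case could look like.  DUnit's
  TestFramework unit is real; IIDEDriver and its methods are stand-ins for
  the internal Zombie automation API and are assumptions only. }

interface

uses
  TestFramework;

type
  // Assumed driver interface; the real framework exposes richer services.
  IIDEDriver = interface
    procedure SelectMenu(const APath: string);
    procedure TypeInEditor(const AText: string);
    function WaitForMessage(const AText: string): Boolean;
  end;

  TIDESmokeTest = class(TTestCase)
  private
    FIDE: IIDEDriver;   // assumed to be injected by the harness in SetUp (not shown)
  published
    procedure TestNewVCLApplication;
  end;

implementation

procedure TIDESmokeTest.TestNewVCLApplication;
begin
  // Exercise a common user path: new project, edit, build, close.
  FIDE.SelectMenu('File|New|VCL Forms Application');
  FIDE.TypeInEditor('// smoke test edit');
  FIDE.SelectMenu('Project|Build');
  CheckTrue(FIDE.WaitForMessage('Build succeeded'), 'Build did not succeed');
  FIDE.SelectMenu('File|Close All');
end;

initialization
  RegisterTest(TIDESmokeTest.Suite);
end.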

In addition, QA engineers use VMware to run a set of functional tests on every build. This part of the system is 100% automated and kicks off when a new build is delivered from integration – usually daily, but it can be more often. We combine a mix of VMware Player, VMware Workstation, GSX Server (now called ‘VMware Server’) and ESX Server to support the testing architecture. The team has been very pleased with the performance and capabilities of the ESX servers, especially with the fully automated test system.

That is a good beginning, and a lot of energy is being spent adding test suites and test cases to this system. However, we wanted to go further and catch errors during the build, at the source. For this we use CruiseControl to run a number of basic tests as part of the regular build process, ensuring that the code does more than just compile and build. An example of that type of testing is another smoketest, which opens and compiles each type of application in each personality. If there is a fundamental failure preventing an application from being created and compiled, the build is flagged ‘not good enough’ for QA and email notifications are sent out.
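
As a hedged sketch of that kind of build-verification step, the console program below compiles one hypothetical template project per personality with the shipping command-line compilers and fails the build if any of them breaks. dcc32, dccil and bcc32 are the real compilers, but the template projects, paths and exact command lines are assumptions.

program BuildSmokeTest;

{ Sketch of a build-verification step: compile one template project per
  personality and fail if any of them breaks.  The compilers are real, but
  the template projects, paths and command lines are assumptions. }

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Run a command line, wait for it to finish and return its exit code.
function RunAndWait(const ACommandLine: string): DWORD;
var
  SI: TStartupInfo;
  PI: TProcessInformation;
  Cmd: string;
begin
  Cmd := ACommandLine;
  UniqueString(Cmd);               // writable buffer for CreateProcess
  FillChar(SI, SizeOf(SI), 0);
  SI.cb := SizeOf(SI);
  if not CreateProcess(nil, PChar(Cmd), nil, nil, False, 0, nil, nil, SI, PI) then
    RaiseLastOSError;
  WaitForSingleObject(PI.hProcess, INFINITE);
  GetExitCodeProcess(PI.hProcess, Result);
  CloseHandle(PI.hThread);
  CloseHandle(PI.hProcess);
end;

const
  // One hypothetical template project per personality.
  Projects: array[0..2] of string = (
    'dcc32 Templates\DelphiWin32\SmokeW32.dpr',
    'dccil Templates\DelphiDotNet\SmokeNet.dpr',
    'bcc32 Templates\CppBuilder\SmokeCpp.cpp');

var
  I: Integer;
  Failed: Boolean;
begin
  Failed := False;
  for I := Low(Projects) to High(Projects) do
    if RunAndWait(Projects[I]) <> 0 then
    begin
      Writeln('FAILED: ', Projects[I]);
      Failed := True;
    end;
  if Failed then
    Halt(1);   // non-zero exit marks the build 'not good enough' for QA
  Writeln('Build smoketest passed.');
end.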

This additional enhancement has kept the flow of testable builds to the QA group in great shape. In the past three weeks, we’ve only had ONE build not testable by QA, which is a huge improvement!

To improve product performance, we’ve hired a QA engineer with a focus on performance and stability testing. Claire Rouchy has implemented the following set of performance tests, which are graphed as part of the automated test system (a sketch of how one such measurement might be taken follows the list):

  • TimeStartup: time to load the IDE
  • TimeGalleryInvoke: Open | New | Other
  • TimeNewWin32VCLFormsApplication: Open | New | Delphi Win32 Form
  • TimeUnitToFormSwitch: 2 tests, F12 toggles between form and unit (QC 22372)
  • DelphiW32ProjectOptionsInvoke: Project | Options
  • TimeLargeUnitLoad: 2 tests, measures the time to open/close Classes.pas
  • TimeNewBCBProject: Open | New | Other | BCB
  • BCBProjectMenuOptionsInvoke: Project | Options
  • TimeNewCSharpWFApplication: Open | New | Other | C# WinForm
  • TimeNewVCLDotNetFormsApplication: Open | New | Other | VCL.NET Form
  • TimeNewDelphiWinFormsApplication: Open | New | Other | Delphi WinForm
  • ToolsMenuOptionsInvoke: Tools | Options
  • OpenLargeProject: open a large project (ProjectGroup2A.bdsgroup)
  • IDEShutdown: time to shut down the IDE

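As mentioned above, here is a sketch of how a single measurement such as TimeStartup might be taken: launch the IDE, wait until its main window appears, and record the elapsed time with the Windows high-resolution counter. The bds.exe path, the -pDelphi switch and the window title are assumptions, and a real test would post the number to the central reporting system rather than print it.

program TimeStartupSketch;

{ Sketch of a TimeStartup-style measurement.  The bds.exe path, command-line
  switch and window title are assumptions; only the timing approach matters. }

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

var
  Freq, T0, T1: Int64;
  SI: TStartupInfo;
  PI: TProcessInformation;
  CmdLine: string;
begin
  QueryPerformanceFrequency(Freq);
  QueryPerformanceCounter(T0);

  // Launch the IDE (path and switch are hypothetical).
  CmdLine := '"C:\Program Files\Borland\BDS\4.0\Bin\bds.exe" -pDelphi';
  UniqueString(CmdLine);           // writable buffer for CreateProcess
  FillChar(SI, SizeOf(SI), 0);
  SI.cb := SizeOf(SI);
  if not CreateProcess(nil, PChar(CmdLine), nil, nil, False, 0, nil, nil, SI, PI) then
    RaiseLastOSError;

  // Poll until the IDE main window is visible; the title is an assumption.
  while FindWindow(nil, 'Borland Developer Studio') = 0 do
    Sleep(50);

  QueryPerformanceCounter(T1);
  Writeln(Format('TimeStartup: %.2f seconds', [(T1 - T0) / Freq]));

  CloseHandle(PI.hThread);
  CloseHandle(PI.hProcess);
end.
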
R&D has visibility into the effect of their performance tweaks and changes through these tests, and Claire has been scouring QualityCentral for use cases and reports to include as part of the testing. Here are some examples:

  • QC 22038 / RAID 237942: slowdown when projects build/run in IDE
  • QC 37152: IDE slow down to switch between debug and default layouts
  • QC 34162 / RAID 242030: Memory usage climbs after compiling/switching project
  • QC 29604 / RAID 240286: Noticeable delay when compiling in IDE
  • QC 26145 / RAID 239235: Code insight performance needs to be improved

In addition to performance, Claire is working on stability testing. The concept is a suite of semi-randomized tests that run over and over until the product fails. For example: create a new Delphi application, add a unit, implement a class, remove a unit, add two units, close all, change personality and create a new C++ application, and so on. With enough data points, a trend can be reliably established for IDE stability. This is a useful metric for R&D to help determine where those hard-to-find or hard-to-reproduce crash bugs exist. A detailed log of actions and the ability to debug will be part of these tests.
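
In code terms, the heart of such a suite is just a loop that keeps choosing random actions, logs each one, and stops when the product fails. In the sketch below, TIDEAction and PerformAction are placeholders for the internal automation framework and are assumptions only.

program StabilitySketch;

{ Sketch of a semi-randomized stability run: keep choosing actions at random,
  log each one, and stop when the product fails.  TIDEAction and PerformAction
  are placeholders for the internal automation framework. }

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TIDEAction = (aNewDelphiApp, aAddUnit, aImplementClass, aRemoveUnit,
    aCloseAll, aSwitchToCpp, aNewCppApp);

const
  ActionNames: array[TIDEAction] of string = (
    'New Delphi application', 'Add unit', 'Implement class', 'Remove unit',
    'Close all', 'Switch personality to C++', 'New C++ application');
  MaxSteps = 100000;   // safety cap for the sketch; a real run goes until failure

// Placeholder: the real framework would drive the IDE and return False
// when the product misbehaves.  Here it always succeeds.
function PerformAction(AAction: TIDEAction): Boolean;
begin
  Result := True;
end;

var
  Step: Integer;
  Action: TIDEAction;
  Log: TextFile;
begin
  Randomize;
  AssignFile(Log, 'stability.log');
  Rewrite(Log);
  try
    Step := 0;
    repeat
      Inc(Step);
      Action := TIDEAction(Random(Ord(High(TIDEAction)) + 1));
      // Log before acting, so the last line names the action that failed.
      Writeln(Log, Format('%6d  %s  %s',
        [Step, DateTimeToStr(Now), ActionNames[Action]]));
      Flush(Log);
    until (not PerformAction(Action)) or (Step >= MaxSteps);
    Writeln(Log, Format('Run ended after %d actions', [Step]));
  finally
    CloseFile(Log);
  end;
end.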

These efforts are just the beginning! As a result of the work done in 2006, we now have a wealth of raw data to process, and our awareness of product quality is much improved. For 2007 we’re seeking to improve how we act on these results. We are also working in other directions, such as improving the format of our field tests and our channels for customer feedback. 2006 was a good year for CodeGear product quality, and in 2007 and beyond we expect our customers will enjoy the benefits of all this effort.

