Calculating the community point spread

By: John Kaster

Abstract: John K describes the factors used to score items in the upcoming BugCentral project, and asks for feedback on ways to calculate the score.

Food for thought

I was having breakfast with David I a few days ago, and we were talking about the upcoming launch of the public beta for "BugCentral" (BC). For the first article about this project, see "The BugCentral Project". We are in the process of choosing the final name for this project right now. If you would like to see the choices and cast a vote, please use the "Choose a name for 'BugCentral'" community pulse.

The choice has now been made, and the official name for the project is "QualityCentral." QualityCentral (QC) is a place where you can report suspected bugs, submit requests for product enhancements, let us know what you think of additions in recent releases, make marketing suggestions, and request articles for the community site or report issues with it. More generally, it lets you participate in discussions with other Borland customers about Borland-related issues you care about, without having to worry about the conversation "scrolling off" after a few days.

At this meeting, we were going over the rating system QC will be using. I explained that four factors were involved. David grimaced. David likes to keep things simple. He claims it's because he's old and a VP now, and he can't handle complicated tasks. I think it's just because he likes to keep things simple. Anyway, I assured him that I could explain the rating system and the reasons for it. He answered, "OK. Find out what the community thinks about it."

Thus an article was born.

But first, a little summary of my perspective on this project is probably in order. I haven't written anything about this project in a while, so lots of thoughts have been going around in my head rather than out to the community.

Using the community process

As with CodeCentral, QC enables a "community process." In CodeCentral's case, that process provides a resource for sharing code, components, projects, and other downloadable items of interest for specific Borland products. The QualityCentral community process will allow the entire community to provide meaningful feedback to Borland about a wide variety of issues the community considers important. It is designed to be an interactive surveying system with issue management built into the application.

There are almost half a million people using CodeCentral (for details, look here). I expect that even more people will be using QC within a year. This means that thousands of people will be entering suggestions, bug reports, and requests on various issues, and additional thousands will be rating, voting, or commenting on these issues. Any community member will be able to instantly create a "survey" for the rest of the community to answer, prioritize, or comment on. The community will be able to create an "escalation" of an issue they consider important, simply by casting a vote on that issue or rating it highly.

As the data (running in InterBase, naturally) maintained in QC grows more extensive, there are bound to be issues that a clear majority of the community feels are important. Borland can then use this information to help prioritize bug fixes and new feature work.

The point spread

This, of course, brings us to the reason I'm writing this article. I'd like to use our current community process, which is the far more manual one of reading comments in the community newsgroup or on this article, to discuss the current rating system in QC. Let me know whether this rating system makes sense and whether you think it will be useful, and propose alternative scoring approaches or formulas if you have them.

There are several ways to classify an issue, but to calculate a score for an item, each classification must ultimately be representable as a numeric value. The available classifications are type, severity, rating, and vote.

Type

The type of issue is one of the following:

  • A (Crash/Data loss/Total failure)
  • B (Basic functionality failure)
  • C (Minor failure/Design problem)
  • D (Documentation problem)
  • E (Suggestion/Enhancement request)
  • F (Issue)

The person entering a report will select its type. Because that person may have a high degree of interest in the report, it may not be accurately typed. In that case, a QC sysop may reclassify the type. Only sysops and the person entering the report can modify this classification.

For scoring purposes, let's assign "A" a value of 6, counting down to "F" with a value of 1.
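
To make the mapping concrete, here's a minimal Python sketch (the names are mine, not QC's) that assigns each type code its scoring value:

# Map each issue type code to its numeric scoring value:
# "A" (Crash/Data loss/Total failure) is worth the most, "F" (Issue) the least.
TYPE_VALUES = {
    "A": 6,  # Crash/Data loss/Total failure
    "B": 5,  # Basic functionality failure
    "C": 4,  # Minor failure/Design problem
    "D": 3,  # Documentation problem
    "E": 2,  # Suggestion/Enhancement request
    "F": 1,  # Issue
}

def type_value(type_code):
    # Return the numeric scoring value for a type code such as "A".
    return TYPE_VALUES[type_code.upper()]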

Severity

The severity of an issue is one of the following:

  • 4 (Extreme corner case)
  • 3 (Infrequently encountered problem)
  • 2 (Commonly encountered problem)
  • 1 (Serious/highly visible problem)
  • 0 (Critical/Show stopper)

The person entering the report will select the severity as well. As with the type of the report, a sysop may reclassify it. Only sysops and the person entering the report can modify this classification.

Rating

The rating of a report is one of the following:

  • 5 (well written, clear description and steps)
  • 4 (has a description and steps but perhaps a little ambiguous)
  • 3 (missing either steps or description)
  • 2 (missing either steps or description, and the one that's there is ambiguous)
  • 1 (can't even tell what the person is talking about)

The ratings are used by every community member to indicate how accurate, clear, and detailed they think a report is. A rating of 5 means "very clear" and a rating of 1 means "totally confusing." You will always be able to see how many ratings an item has, as well as the average score of that item.
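
For example, here is a small Python sketch (the variable names are mine) showing how an item's rating count and average would be derived from the individual ratings it receives:

# Individual 1-5 ratings cast by community members for one item.
ratings = [5, 4, 4, 3, 5]

rating_count = len(ratings)                    # how many ratings the item has
rating_average = sum(ratings) / rating_count   # the item's average rating

print(rating_count)    # 5
print(rating_average)  # 4.2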

Vote

Votes are your "silver bullets" for indicating what you want done. You get five votes (this number is subject to change) per "Product". Since QC covers more than just the IDE products we ship, those "products" include things like CodeCentral, this community web site, and this project itself.

Votes differ from ratings in one key way: you are limited in the number of votes you can cast. That limit naturally means you will spend your votes only on items you really care about. For instance, someone may put a bug report into QC for an area of a product you never use. Suppose it's a database access bug and you never use that database, but the problem is serious, resulting in data loss or crashing the application. In that case, you might rate the report highly because it clearly documents a bad bug. However, since the bug doesn't affect you and probably never will (unless you suddenly have to start supporting that database), you wouldn't spend one of your votes to get it fixed. That is how voting differs from rating.

The next two paragraphs are included because I know some of you will ask for details on the voting process. You can skip them if you're not interested in the details.

If you have used up all your votes for a product, you can change your vote to a new item by removing one of your existing votes first, then re-casting your vote on the new item. Once an item is marked closed, as a bug report would be when fixed, you will automatically have that vote restored to you.

In some rare cases, you might actually have more than five votes active for a product. Suppose you already have used four votes for a product. You then cast your fifth and final vote on another item for that product (let's call it Item A). Item A gets closed, and you are now able to vote for an item in that product area again. So you vote on Item B. If Item A is re-opened for whatever reason, you now have six (6) votes active. If you decide you want to change your votes for that product, you will have to clear two votes before you can recast your fifth vote. It is unlikely that this situation will occur very often, but QC will allow you to handle it gracefully. Deciding which votes to remove will probably be the hard part!
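
For readers who like to see the bookkeeping spelled out, here is a hypothetical Python sketch of the per-product vote accounting described above; the class and method names are my own invention, not QC's actual interface:

# Hypothetical sketch of per-product vote bookkeeping (not QC's real API).
class VoteLedger:
    MAX_VOTES = 5  # per product; subject to change

    def __init__(self):
        self.active_votes = set()  # ids of items this member has voted on

    def cast_vote(self, item_id):
        # New votes are only allowed while the member is under the cap, so
        # a member with six active votes must clear two before recasting.
        if len(self.active_votes) >= self.MAX_VOTES:
            raise ValueError("No votes left; remove an existing vote first")
        self.active_votes.add(item_id)

    def remove_vote(self, item_id):
        self.active_votes.discard(item_id)

    def item_closed(self, item_id):
        # Closing an item (e.g. a fixed bug) automatically restores the vote.
        self.active_votes.discard(item_id)

    def item_reopened(self, item_id):
        # Reopening reinstates the original vote, which can temporarily push
        # the member over the cap (the six-vote case described above).
        self.active_votes.add(item_id)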

Creating the secret formula

Now that I've explained the various rating factors, let's consider some possible formulas for calculating the score.

We'll use the following variables in the formulas:

Variable        Description
TypeValue       Numeric value of the type code
Severity        Severity classification as currently defined
RatingCount     Total number of ratings an item has received
RatingAverage   Average of all ratings for an item
VoteCount       Total number of votes for an item

Favoring votes

One technique that will be useful with these formulas is including secondary scoring calculations that serve as tie breakers. For example, if you cared primarily about which items received the most votes, you would still want some other factor to differentiate them, because multiple items are likely to have the same number of votes. You want as many unique scores as possible, so a formula like the following:

Score = VoteCount

would return far too many items with the same value. A formula like:

Score = (VoteCount * 10) + TypeValue * (5 - Severity)

might give you a score that includes both the popularity and the seriousness of the bug.
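
As a quick sanity check, here is that formula as a Python function, using the variable names from the table above:

def vote_weighted_score(vote_count, type_value, severity):
    # Each vote is worth 10 points, so votes dominate; the type/severity
    # product serves as a tie breaker among items with equal vote counts.
    return (vote_count * 10) + type_value * (5 - severity)

# Two items with 3 votes each: an "A" crash at severity 0 outscores
# a "D" documentation problem at severity 3.
print(vote_weighted_score(3, 6, 0))  # 30 + 6 * 5 = 60
print(vote_weighted_score(3, 3, 3))  # 30 + 3 * 2 = 36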

Composite Score

Here's a suggestion for a "composite" score that may end up considering all classification factors:

Score = (VoteCount + 1) * (TypeValue * (5 - Severity) * RatingAverage)

As you can see, this still heavily favors votes, but I think that's appropriate.
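
In Python, the composite might look like this (again just a sketch built from the variables defined earlier):

def composite_score(vote_count, type_value, severity, rating_average):
    # (VoteCount + 1) keeps unvoted items from scoring zero, and each
    # additional vote multiplies the whole type/severity/rating product.
    return (vote_count + 1) * (type_value * (5 - severity) * rating_average)

# A clearly written "A" crash (TypeValue 6, Severity 0, RatingAverage 4.2)
# with 3 votes:
print(composite_score(3, 6, 0, 4.2))  # (3 + 1) * (6 * 5 * 4.2) = 504.0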

We hope to include an expression evaluator in a future version of QC's web service that you can use to get your own customized top scores for items. This isn't in the current beta version, but it may be available by the time we have the public beta. The public beta will be available before the end of March 2002, because the project is far enough along that we need to start getting community feedback on it. Fortunately, you can use the application to report feedback on the application!
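
To illustrate the idea, here is a hypothetical Python sketch of evaluating a user-supplied scoring expression; the expression syntax and variable names are my assumptions, since the evaluator is not in the current beta:

# Hypothetical sketch of a user-defined scoring expression (this evaluator
# is not part of the current QC beta).
item = {
    "TypeValue": 6,
    "Severity": 0,
    "RatingCount": 5,
    "RatingAverage": 4.2,
    "VoteCount": 3,
}

expression = "(VoteCount + 1) * (TypeValue * (5 - Severity) * RatingAverage)"

# Evaluate the expression with only the item's fields in scope.
score = eval(expression, {"__builtins__": {}}, dict(item))
print(score)  # 504.0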

Refactoring

If you have any comments on what I've presented here, please post them as a comment to this article, or in the community newsgroup. Browser access to the newsgroup is at http://newsgroups.borland.com.

