By: John Kaster
Abstract: Use our new search engine to find everything on the CodeGear websites
On Wednesday, September 17, 2008, we launched a new search engine based on an open source search technology called Lucene.
Our new search engine supports searching for results on all of our major web sites. Because it is based on Lucene, there are now advanced free text search options available.
You can see the new search engine right now by typing some text into the little search box on this browser page and clicking the Search link.
There are major new improvements in our search functionality:
Search across all major web properties (CodeCentral, QualityCentral, Blogs, all GetPublished sites)
Faster search (although we always have more performance tweaks to make).
Search in any supported human language.
Phrase search, like "this is my phrase".
Full indexing of all words and numbers (number indexing was turned off in our old search engine).
Search all source code in CodeCentral and appropriately marked documents. (QualityCentral attachments will also be indexed soon.)
Consolidated display of content mapped to multiple locations on our web sites.
SearchInsight™ provides tips based on the content of our repository for strings that match the text you are typing.
(For those who enjoy puns like I do, feel free to call this SearchInSite instead!)
As you type text in the small search box at the top of one of our web pages, an AJAX call is made to our search hints web service, matching the first "word" of what you type against the keywords in our search index. The following screenshot shows SearchInsight™ for datasnap:
Figure 1: Search Insight for "datasnap"
Our search is case-insensitive, so you can type in DataSnap, datasnap or any other character casing combination, and still match the search results.
The Approx hits value reports the approximate number of unique entries matching what you have currently typed. The actual results you get back can vary based on the visibility rules of the content matched, and your access rights.
The search hints retrieval logic is conceptually simple (but somewhat complex to implement). It returns the first 3 keyword combinations that match the pattern of the "first word" you type.
For example, with datasnap as the value, we request the first three (3) matches for the following patterns, providing 9 potential "quick search" values:
Search Index Matches
datasnap* * *
+datasnap +server +using
+datasnap +2009 +overview
+datasnap +area +has
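The hint lookup described above can be sketched roughly as follows. This is only an illustration, not the actual web service: the in-memory INDEXED_COMBINATIONS list, the first_word and hints functions, and the idea that the patterns differ by the number of keywords in each combination are all assumptions made for the example.

```python
from itertools import islice

# Hypothetical stand-in for the keyword combinations held in the search index.
INDEXED_COMBINATIONS = [
    ("datasnap",),
    ("datasnap", "server"),
    ("datasnap", "server", "using"),
    ("datasnap", "2009", "overview"),
    ("datasnap", "area", "has"),
    ("dataset", "provider"),
]


def first_word(text):
    """Return the lower-cased first whitespace-delimited word, or ''."""
    parts = text.strip().lower().split()
    return parts[0] if parts else ""


def hints(text, word_count, limit=3):
    """Return up to `limit` combinations of `word_count` keywords whose
    first keyword starts with the first word of `text`.

    Lower-casing the input makes the match case-insensitive, and only the
    first word typed is used, as in the description above.
    """
    prefix = first_word(text)
    if not prefix:
        return []
    matches = (
        " ".join("+" + word for word in combo)
        for combo in INDEXED_COMBINATIONS
        if len(combo) == word_count and combo[0].startswith(prefix)
    )
    # islice stops after the first `limit` matches instead of scanning on.
    return list(islice(matches, limit))
```

With this sample data, hints("DataSnap", 3) returns the three combinations listed in the table above. Calling it once per pattern with limit=3 would give at most nine quick search values.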
After you begin typing in the next "word" of your search criteria, we provide only the estimate of matching items in the index, as this screenshot shows:
If the standard search doesn't provide enough control over the results, you can use the Advanced search dialog to fine-tune your search criteria.
You can search for a specific author by name. If you want to limit your keyword searching to a specific part of the indexed content, you can search only content titles, abstracts (short descriptions or summaries), or the body (full description).
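As a rough illustration of how such criteria could be expressed as a Lucene query string, here is a minimal sketch. It assumes the classic Lucene query parser syntax (a leading + for required clauses, double quotes for phrases, and field:term for field restrictions). The field names title and author, and the quote_phrase and build_query helpers, are hypothetical and may not match what the Advanced search dialog actually sends.

```python
def quote_phrase(text):
    """Wrap multi-word text in double quotes so it is matched as a phrase.

    Embedded double quotes are backslash-escaped, as the classic Lucene
    query parser expects.
    """
    text = text.strip().replace('"', '\\"')
    return '"%s"' % text if " " in text else text


def build_query(keywords, field=None, author=None):
    """Build a query string in which every clause is required (+).

    `field` restricts the keywords to one part of the content; `author`
    adds a second required clause on a hypothetical `author` field.
    """
    term = quote_phrase(keywords)
    clauses = ["+%s:%s" % (field, term) if field else "+" + term]
    if author:
        clauses.append("+author:" + quote_phrase(author))
    return " ".join(clauses)
```

For example, build_query("datasnap", field="title", author="John Kaster") produces +title:datasnap +author:"John Kaster", which would match only items whose title contains datasnap and whose author is John Kaster.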
When searching source code, you can choose to search for source code in any of our supported languages or all of them, and also specify the sections of the source code you want to search.
The following options are available for searching source code:
Source code search matches found in CodeCentral are displayed with the name of the source file as the title. Clicking on the link will go directly to the submission. For example, searching for C++ code that calls TClientDataSet currently returns 59 matches. Each of these matches shows the file containing the TClientDataSet reference via the CodeCentral archive explorer, which uses YAPP to syntax highlight the source code. You can click on the link above to try it for yourself.
The "code only", "comments", and "strings" options are only available if you specify a source code language for your search.
If you filter by language, only content in the selected language will be shown. If you select "My Preferred Language", content will be filtered using the following rules:
The "community" drop down at the bottom of the screenshot will only contain communities if you're searching the current site, a single selected site, or all sites when only a single site exists. Otherwise, it will always be set to "Any Community".
We sometimes have content in GetPublished that is mapped to multiple locations.
In the previous search interface, users were often confused by the same content being displayed in separate links. Now only one list item per mapped site is shown for the link, as this screenshot shows:
Item number 2 in the list above is a "welcome" article, which is landing page content for an area of a site. It is mapped to the Delphi, C++, Java, InterBase, BlackfishSQL, and PHP communities, so each one is displayed in a separate link below the title. Only welcome articles have multiple links below the title, because they appear in multiple URLs on the site.
We have some content that could be mapped to appear on our support, www, and developer network sites. Only one list item per site will be displayed with our new search results list.
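The consolidation can be pictured with a small sketch like the one below. It is an assumption-laden illustration rather than the real implementation: the consolidate function and the hit keys content_id, site, title, and url are hypothetical names chosen for the example.

```python
def consolidate(hits):
    """Collapse raw hits into one list item per (content, site) pair.

    Each hit is a dict with the hypothetical keys 'content_id', 'site',
    'title', and 'url'. URLs for the same content on the same site are
    gathered under a single item, in first-seen order, so a welcome
    article mapped to several communities shows one title with several
    links below it.
    """
    items = {}  # dicts keep insertion order in Python 3.7+
    for hit in hits:
        key = (hit["content_id"], hit["site"])
        item = items.setdefault(
            key, {"title": hit["title"], "site": hit["site"], "links": []}
        )
        if hit["url"] not in item["links"]:
            item["links"].append(hit["url"])
    return list(items.values())
```

Grouping on the content and site together, rather than on the content alone, keeps a separate list item for each site the content is mapped to, which matches the "one list item per site" behavior described above.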
One very handy thing I'd like to mention is something that was available in our previous search engine as well, but I think most people didn't notice it. On the search results page, we have RSS and Atom icons that allow you to subscribe to the search criteria you entered. This way, you can have completely customized automatic notifications of content updates on all our web sites.
There are many other features we want to add to the search engine, like saving user search preferences, adding result highlighting in matching content links, and so on. Please use the search area in QualityCentral to report bugs and request any features you'd like to see for the new search.
I really hope you find what you're looking for!
Internet Services Architect