Using The WebBrowser Component

By: Vino Rodrigues

Abstract: A paper on using the TWebBrowser component, including some nifty tricks.

A white paper and tutorial by Vino Rodrigues
vinorodrigues@yahoo.com
www.geocities.com/vinorodrigues

What will be covered in this paper:

The TWebBrowser component

The TWebBrowser component (in the Internet palette) is a Microsoft ActiveX® control that you can use on your application's forms to browse Web sites, view Web pages and other documents, and download data located on the Internet. The TWebBrowser component is useful in situations where you don't want to disrupt the work flow in your application by switching from your application to a Web browser or other document-viewing application.

The TWebBrowser component can display any Web page that Microsoft Internet Explorer version 3.0 or later (i.e. 4.0, 4.01, 5, 5.5, 6 ...) can display. For example, the TWebBrowser component can display pages that include any of the following features:

  • Standard HTML and HTML enhancements, such as floating frames and cascading style sheets

  • Other ActiveX controls

  • Most Netscape plug-ins

  • Scripting, such as Microsoft Visual Basic Scripting Edition (VBScript) or JavaScript

  • Java™ applets

  • Multimedia content, such as video and audio playback

  • Three-dimensional virtual worlds created with Virtual Reality Modeling Language (VRML)

In addition to opening Web pages, the TWebBrowser component can open any ActiveX document, which includes most Microsoft Office documents. For example, if Microsoft Office is installed on a user's computer, an application that uses the TWebBrowser component can open and edit Microsoft Excel spreadsheets, Microsoft Word documents, and Microsoft PowerPoint presentations from within the control. Similarly, if Microsoft Excel Viewer, Microsoft Word Viewer, or Microsoft PowerPoint Viewer is installed, users can view those documents within the TWebBrowser component.

With the TWebBrowser component, users of your application can browse sites on the World Wide Web, as well as folders on a local hard disk and on a local area network. Users can follow hyperlinks by clicking them or by typing a URL into a text box. Also, the TWebBrowser component maintains a history list that users can browse through to view previously browsed sites, folders, and documents.

Adding the TWebBrowser component to a Form

Before you can add the TWebBrowser component to a form, you must have Microsoft Internet Explorer version 3.0 or later installed.

If you purchased Microsoft Windows or Office products you may have it already installed or can do so from the original Microsoft media. You can also download Microsoft Internet Explorer for free from the Microsoft web site: http://www.microsoft.com/ie/download/.

To add the TWebBrowser component to a form

  1. Open the form in Design view.

  2. Select the Internet component palette tab.

  3. Select the TWebBrowser component.

  4. On the form, click where you want to place the component.

  5. Move and size the control to the area you want to display.

Tip: If the TWebBrowser component can't display the full width or height of a Web page or document, it automatically displays scroll bars. However, in most cases, you should make the control wide enough to display the full width of a typical Web page so that users of your application don't have to scroll horizontally.

Displaying Web Pages or Documents in the TWebBrowser component

To display a Web page or document in the TWebBrowser component, we need to call the Navigate method programmatically. The syntax for the Navigate method is:

procedure Navigate(const URL: WideString);
  overload;

procedure Navigate(const URL: WideString;
  var Flags: OleVariant); overload;

procedure Navigate(const URL: WideString;
  var Flags: OleVariant;
  var TargetFrameName: OleVariant); overload;

procedure Navigate(const URL: WideString;
  var Flags: OleVariant;
  var TargetFrameName: OleVariant;
  var PostData: OleVariant); overload;

procedure Navigate(const URL: WideString;
  var Flags: OleVariant;
  var TargetFrameName: OleVariant;
  var PostData: OleVariant;
  var Headers: OleVariant); overload;

Where:

  • URL specifies the UNC path name of a file or the Uniform Resource Locator (URL) of an Internet resource that the Web browser should display.

    If URL refers to an Internet protocol and a location on the Internet, your application must establish a connection before it can display the document. If the computer running your application is connected to a proxy server (a secure connection to the Internet through a LAN), or if it has a direct connection to the Internet, the TWebBrowser component downloads and displays the Web page or other Internet content immediately. If the computer running your application uses a modem and dial-up connection to the Internet, and that connection hasn't been established beforehand, the TWebBrowser component initiates the connection.

    If URL refers to an Internet protocol and a location on an intranet server, the computer running your application must be connected to the intranet and have permission to access that server.

    If URL refers to a standard file system path on a local hard drive or intranet, the TWebBrowser component opens the document and displays it immediately. The TWebBrowser component can open Microsoft Office documents, text files, and HTML documents that don't require features supported only by an Internet server. For example, the TWebBrowser component can't open HTML documents that use IDC/HTX files or Active Server Pages (ASP) files from the standard file system, but it can open HTML documents that contain only the HTML tags supported by Microsoft Internet Explorer.

    Note: If URL refers to a path in the standard file system that doesn't refer to a file name (for example, C:\Windows\System), the TWebBrowser component displays the file system itself, much like My Computer.

  • Flags is a set of values that specify whether to add the resource to the history list, whether to read from or write to the cache, and whether to display the resource in a new window. It can be a sum of zero or more of the following:

    Constant            Value  Meaning
    NavOpenInNewWindow  $01    Open the resource or file in a new window.
    NavNoHistory        $02    Do not add the resource or file to the history list. The new page replaces the current page in the list.
    NavNoReadFromCache  $04    Do not read from the disk cache for this navigation.
    NavNoWriteToCache   $08    Do not write the results of this navigation to the disk cache.
    NavAllowAutosearch  $10    If the navigation fails, the Web browser attempts to navigate common root domains (.com, .org, and so on). If this still fails, the URL is passed to a search engine.

  • TargetFrameName is the name of the frame in which the resource will be displayed, or nil if the resource should not be displayed in a named frame.

  • PostData contains the data sent to the server when using Navigate to generate an HTTP POST message. If PostData is nil, Navigate generates an HTTP GET message. PostData is ignored if URL does not specify an HTTP URL.

  • Headers contains any headers sent to the servers when the URL represents an HTTP URL. HTTP headers specify such things as the intended action required of the server, the type of data, and so on.
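As a rough sketch of how these parameters fit together, the fragment below shows one call that bypasses the disk cache and one that sends form data with an HTTP POST. It assumes a form class TMainForm with a TWebBrowser named WebBrowser1; the method names NavigateFresh and PostForm and the FLAG_ constants are made up for this illustration (the constants simply repeat values from the table above). Both methods would also need to be declared in the form class.

```pascal
uses
  Variants,  // VarArrayCreate, EmptyParam (Delphi 6 and later; older versions keep them in System)
  SHDocVw;   // TWebBrowser

const
  // Hypothetical local names; the values are taken from the Flags table.
  FLAG_NO_READ_FROM_CACHE = $04;
  FLAG_NO_WRITE_TO_CACHE  = $08;

// Load a page without reading from or writing to the disk cache.
procedure TMainForm.NavigateFresh(const URL: WideString);
var
  Flags: OleVariant;
begin
  Flags := FLAG_NO_READ_FROM_CACHE or FLAG_NO_WRITE_TO_CACHE;
  WebBrowser1.Navigate(URL, Flags);
end;

// Send url-encoded form data with an HTTP POST.
// Assumes FormData is not empty.
procedure TMainForm.PostForm(const URL: WideString;
  const FormData: AnsiString);
var
  Flags, Target, PostData, Headers: OleVariant;
  I: Integer;
begin
  // PostData is passed as a variant array of bytes, not as a string.
  PostData := VarArrayCreate([0, Length(FormData) - 1], varByte);
  for I := 1 to Length(FormData) do
    PostData[I - 1] := Ord(FormData[I]);

  Headers := 'Content-Type: application/x-www-form-urlencoded'#13#10;
  Flags := 0;
  Target := EmptyParam;  // no named target frame
  WebBrowser1.Navigate(URL, Flags, Target, PostData, Headers);
end;
```

Local variables are used for every optional argument because the overloads listed above take them as var parameters.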

Displaying a Document in the TWebBrowser component by Using an Address in a Text Box

Using the TWebBrowser component, you can create a form that performs most of the functions of Microsoft Internet Explorer. For example, the following illustration shows the Cool Web Browser (/demos/Coolstuf/webbrows.dpr).

When a user types a valid URL in the combo box at the top of the form (URLs) and presses ENTER, the TWebBrowser component (WebBrowser1) displays the Web page or document. Pressing ENTER triggers the FindAddress procedure (via the URLsKeyDown event handler); FindAddress contains the following code, which navigates to the URL entered by the user:

procedure TMainForm.FindAddress;
var
  Flags: OleVariant;
begin
  Flags := 0;
  UpdateCombo := True;
  // The same zero-valued variant is passed for Flags, TargetFrameName,
  // PostData and Headers.
  WebBrowser1.Navigate(WideString(Urls.Text),
    Flags, Flags, Flags, Flags);
end;

If you prefer to start navigation by clicking a speed button instead of pressing ENTER, you can use similar code in the button's OnClick event.
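The two triggers might be wired up roughly as follows. This is a sketch rather than the demo's exact source: it assumes the combo box is named URLs as described above, and GoBtn is a hypothetical speed button added for the illustration.

```pascal
// Needs Windows (VK_RETURN) and Classes (TShiftState) in the uses clause.

// OnKeyDown handler of the URLs combo box: navigate when ENTER is pressed.
procedure TMainForm.URLsKeyDown(Sender: TObject; var Key: Word;
  Shift: TShiftState);
begin
  if Key = VK_RETURN then
    FindAddress;
end;

// OnClick handler of a hypothetical speed button named GoBtn.
procedure TMainForm.GoBtnClick(Sender: TObject);
begin
  FindAddress;
end;
```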

The Home, Back, Forward, Refresh, and Search buttons on the Custom Browse form use the corresponding GoHome, GoBack, GoForward, Refresh, and GoSearch methods of the TWebBrowser component.

The example uses an animation to show that the download is in progress (just like Microsoft Internet Explorer or Netscape do) and this can be controlled through the OnDownloadBegin and OnDownloadComplete events.

A download in progress can be stopped at any time with the Stop method.
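The button and event wiring described in the last three paragraphs could look something like the sketch below. The control names (BackBtn, ForwardBtn, StopBtn, Animate1) are assumptions, not necessarily the names used in the demo. The OnCommandStateChange handler is an addition that the text above does not mention: calling GoBack or GoForward when there is nothing to go to can fail, so it is common to let the browser tell you when those buttons should be enabled.

```pascal
const
  // Values of the browser's CommandStateChange constants,
  // redeclared locally so the sketch is self-contained.
  CSC_NAVIGATEFORWARD = 1;
  CSC_NAVIGATEBACK    = 2;

procedure TMainForm.BackBtnClick(Sender: TObject);
begin
  WebBrowser1.GoBack;
end;

procedure TMainForm.ForwardBtnClick(Sender: TObject);
begin
  WebBrowser1.GoForward;
end;

// Keep the Back and Forward buttons in step with the history list.
procedure TMainForm.WebBrowser1CommandStateChange(Sender: TObject;
  Command: Integer; Enable: WordBool);
begin
  case Command of
    CSC_NAVIGATEBACK:    BackBtn.Enabled := Enable;
    CSC_NAVIGATEFORWARD: ForwardBtn.Enabled := Enable;
  end;
end;

// Start the animation when a download begins ...
procedure TMainForm.WebBrowser1DownloadBegin(Sender: TObject);
begin
  Animate1.Active := True;
  StopBtn.Enabled := True;
end;

// ... and stop it again when the download is complete.
procedure TMainForm.WebBrowser1DownloadComplete(Sender: TObject);
begin
  Animate1.Active := False;
  StopBtn.Enabled := False;
end;

// Cancel a download that is still in progress.
procedure TMainForm.StopBtnClick(Sender: TObject);
begin
  WebBrowser1.Stop;
end;
```

The Home, Refresh and Search buttons follow the same pattern as BackBtnClick, calling GoHome, Refresh and GoSearch respectively.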

Distributing the TWebBrowser component with your application

Unlike most other ActiveX controls, you can't install the TWebBrowser component by itself. For an application that uses the TWebBrowser component to work, Microsoft Internet Explorer version 3.0 or later must also be installed on the computer.

Microsoft Internet Explorer can (for now) be distributed freely, and doesn't require the payment of royalties or other licensing fees.

Pluggable Protocols

Browsers and Protocols

Although HTTP is the most well-known and widely used protocol on the Internet, browsers generally support a variety of different protocols. This list just scratches the surface:

Protocol  Description
http:     HTTP is a stateless and transaction-oriented client/server protocol used to access data on the Web. It relies on TCP/IP for low-level connections.
ftp:      The protocol used for copying files to and from remote computer systems on a network using TCP/IP. This protocol also allows users to use FTP commands to work with files, such as listing files and directories on the remote system.
mailto:   Used to write and drop an email message through a related program.
gopher:   A client/server application that allows the user to browse large amounts of information. It presents the information to the user in a menu format.
file:     Allows you to access and browse the local file system as if it were a network resource.
nntp:     The name stands for Network News Transfer Protocol and is an application protocol used in TCP/IP networks. Enables clients to read and post information to USENET newsgroups.
about:    Intended to let you output text directly on the page with the aim of providing information about what happened.

HTTP, FTP, and Gopher are probably the three most common protocols implemented by Web servers. In addition to Internet Explorer, all the protocols listed in Figure 9-1 are also supported by Netscape Communicator and most other vendors' browsers.

It's important to always remember that a browser is in no way tied only to HTTP. A browser is simply a piece of software that performs some actions by following a given protocol. Ultimately, a protocol is implemented by a piece of software, resident on the client machine, that is invoked when a browser encounters the prefix used as the protocol identifier. In this way, when the browser finds an address that begins with 'http:', it relies on the functions exposed by the module that handles HTTP. When the browser encounters an 'ftp:' link, it calls the module that handles FTP protocol conversations.

Once the interface of such a module is formalized, you have a generic layer of code that acts as a conduit to transfer data between the browser and the server. In this case, the server can be anything that can provide the requested information. It could be a Web server for 'http:', an email program for 'mailto:', or the local file system if the protocol is 'file:'. Interestingly, the server will be a file if the protocol is 'res:', as I hinted above.

By generalizing the structure of this protocol-handling layer and implementing it via a component object model like COM, the browser now has a far more modular architecture. At the same time, it is more extensible, since it's not dependent upon a fixed number of protocols.

The 'about:' Protocol

Have you ever wondered about the about:NavigationCanceled URL that appears when you try to access unavailable resources with Internet Explorer? Well, 'about:' is an IE pluggable protocol. Its role is to display either raw text or predefined pages using a short moniker. In one sense, 'about:' is the Web equivalent of MessageBox. It is meant to help you display messages in an HTML page.

The syntax for this protocol is:

about:{some text}

The text portion can be raw HTML text or a kind of pointer to an HTML page. The browser first tries to find a matching page for the specified text. If it fails, it next considers the text portion to be plain text to display. The 'about:' protocol is implemented in shdocvw.dll. Under the hood, the protocol's implementation ends up writing text in the document body. If you type the following in the Internet Explorer 4.0 (or above) address bar:

about:Hello, MIND

the string "Hello, MIND" will appear on a blank page as if you'd loaded a page with this source code:

<HTML>Hello, MIND</HTML>

You can also enter more complex text such as:

about:Hello, <a href="c:">MY DRIVE</a>

The result is shown here:

The 'about:' protocol is also supported by Netscape Communicator 4.05, but it doesn't support any conversion tables in its implementation, and it's limited to outputting text in the document's body.

Monikers

What's cool with 'about:' is that you can define monikers to address specific HTML pages instead of plain text. For example, the content displayed by about:NavigationCanceled actually comes from an HTML resource, res://shdocvw.dll/navcancl.htm, that's stored in shdocvw.dll, as shown in Figure 9-4.

But how does the browser know how to associate the NavigationCanceled moniker with the resource navcancl.htm? It's all stored in a table within the system registry, under this easy-to-remember key:

HKEY_LOCAL_MACHINE
  Software
    Microsoft 
      Internet Explorer
        AboutURLs

Adding a new moniker is as easy as writing a new entry in the registry. If you have resprot.dll installed in your system directory and add this association to the table, then about:mind will be a command that'll be recognized by Internet Explorer 4.0 or above.
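Writing such an entry from Delphi might look like the sketch below, which uses the TRegistry class. The function name RegisterAboutMoniker is invented for the illustration, and the target in the usage line is only a placeholder, since the actual association is not reproduced in this paper. Writing under HKEY_LOCAL_MACHINE normally requires administrative rights.

```pascal
uses
  Windows,   // HKEY_LOCAL_MACHINE
  Registry;  // TRegistry

// Adds (or overwrites) a moniker under the AboutURLs key.
// Returns False if the key could not be opened.
function RegisterAboutMoniker(const Moniker, TargetUrl: string): Boolean;
var
  Reg: TRegistry;
begin
  Result := False;
  Reg := TRegistry.Create;
  try
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    // The second argument is False: do not create the key if it is missing.
    if Reg.OpenKey('\Software\Microsoft\Internet Explorer\AboutURLs', False) then
    begin
      try
        Reg.WriteString(Moniker, TargetUrl);
        Result := True;
      finally
        Reg.CloseKey;
      end;
    end;
  finally
    Reg.Free;
  end;
end;
```

A call could then look like this (placeholder target):

```pascal
RegisterAboutMoniker('mind', 'res://resprot.dll/mind.htm');
```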

Complex Text

You can also use the 'about:' protocol for complex strings - like the entire Content of a TPageProducer. Try it...
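A minimal sketch of that idea follows. It assumes a TPageProducer named PageProducer1 on the same form as WebBrowser1; the helper AboutUrl and the method ShowProducedPage are hypothetical names. Very long strings may run into the browser's URL length limit, so this is best suited to small pages.

```pascal
// Needs HTTPProd (TPageProducer) in the uses clause.

// Builds an 'about:' URL from a piece of text or HTML.
function AboutUrl(const Html: string): string;
begin
  Result := 'about:' + Html;
end;

// Shows the output of the page producer in the browser.
procedure TMainForm.ShowProducedPage;
begin
  WebBrowser1.Navigate(AboutUrl(PageProducer1.Content));
end;
```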

The 'res:' Protocol

The 'res:' protocol lets you extract a resource from a compiled module like an EXE or DLL. While this protocol has been introduced to work with HTML pages, you can use it to work with other types of resources as well, including custom resources. A URL based on this protocol looks like this:

res://resource_file.ext[/resource_type]/{res_id}

where resource_file.ext is the name of the executable module. If the file is in the search path (for instance, in the Windows directory), it may be specified by file name alone.

The second chunk of information, resource_type, is optional. The 'res:' protocol supports numbers for each of its predefined resource types, and allows you to use literal strings to identify custom resources. The complete list of the resource types is declared in windows.pas.

Here is a list of the entries most commonly used via the 'res:' protocol.

Resource         windows.pas const  ID
Cursor           RT_CURSOR          1
Bitmap           RT_BITMAP          2
Icon             RT_ICON            3
String           RT_STRING          6
Animated Cursor  RT_ANICURSOR       21
Animated Icon    RT_ANIICON         22
HTML             {not defined!}     23

Those are the same IDs required by some API functions like FindResource. If you don't provide a resource type, it defaults to HTML (type 23). This means that:

res://ie4tour.dll/23/welcome.htm

and

res://ie4tour.dll/welcome.htm

are equivalent and will access the same page in the specified executable. If you want to refer to a custom resource, namely one whose type is not defined in windows.pas, then you have to use the name of the type. For example, if you have a file called resProt.dll with this line in its .rc file:

MindLogo  GIF  "mind.gif"

then

res://resProt.dll/gif/MindLogo

is the correct way to reference the image.
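The URL forms seen so far can be assembled with a small helper. BuildResUrl is a hypothetical function written for this paper, not part of the VCL; it is just string concatenation following the syntax given above.

```pascal
// Builds a 'res:' URL. Pass an empty ResType to omit the optional
// resource type, in which case the browser assumes HTML (type 23).
function BuildResUrl(const ModuleName, ResType, ResId: string): string;
begin
  Result := 'res://' + ModuleName;
  if ResType <> '' then
    Result := Result + '/' + ResType;
  Result := Result + '/' + ResId;
end;
```

For example, BuildResUrl('ie4tour.dll', '', 'welcome.htm') gives res://ie4tour.dll/welcome.htm, and BuildResUrl('resProt.dll', 'gif', 'MindLogo') gives res://resProt.dll/gif/MindLogo.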

The final piece of information in the URL is the resource id (res_id) or name. A resource name can be a number or a string. You can use any string to identify a resource, but if it evaluates to an external file name that's been embedded in the executable file, using the file name as an identifier makes sense. For example, if you have the following line in your .rc file:

mind.gif  GIF  "mind.gif"

you can invoke it within an HTML page like this:

<img src="res://resProt.dll/gif/mind.gif">

Note: Neither BRCC32.EXE (Borland Resource Compiler Command Line) nor the Delphi IDE resource compiler supports a '.' (dot) in the resource identifier. If you wish to bind a resource with filename-like resource identifiers you will need to use Microsoft's RC.EXE, which ships with most Microsoft development tools, like Microsoft Visual C++ or Visual Studio.

The 'res:' protocol allows you to compile HTML pages within your application so that there's just one file to distribute: the EXE. Notice that all the internal references are based on the 'res:' protocol. This lets you embed an entire HTML-based application within a compiled module.

The 'res:' protocol isn't the only possible solution for embedding HTML resources into an application. A lower-level approach might be to embed your resources as shown above, then extract and recreate them as separate files at startup using FindResource and other related APIs. This solution might be worth consideration if your target browser isn't Internet Explorer. The 'res:' protocol is the most elegant solution, but browsers other than Internet Explorer 4.0 and above don't support it.
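One way to sketch the lower-level approach is with the VCL's TResourceStream class, which wraps the resource-loading APIs, rather than calling FindResource directly. The procedure name ExtractHtmlResource is made up for this example, and it assumes the resource was compiled with type 23 (HTML) into the running executable.

```pascal
uses
  Windows,  // MakeIntResource
  Classes;  // TResourceStream

// Copies an HTML resource (type 23) of the running module to a file.
// TResourceStream raises an exception if the resource is not found.
procedure ExtractHtmlResource(const ResName, FileName: string);
var
  Stream: TResourceStream;
begin
  Stream := TResourceStream.Create(HInstance, ResName, MakeIntResource(23));
  try
    Stream.SaveToFile(FileName);
  finally
    Stream.Free;
  end;
end;
```

The extracted file can then be opened by any browser through an ordinary file path.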

Creating a HTML resource

To add the HTML as a resource, you include a file that contains nothing but the actual HTML in your application's resource script. This file is included as an HTML type resource.

Microsoft's RC.EXE compiler

The following example shows how to include a file called mind.htm and an image, themind.gif, as an HTML resource into a .rc file called mymind.rc:

mind.htm  HTML  "mind.htm"

themind.gif  HTML  "themind.gif"

In this example, mind.htm is the identifier of the resource, and it can be either a string or a numerical identifier. HTML is the resource type. Microsoft's RC.EXE will interpret this as the numeric value 23 and will substitute 23 for HTML when the resource file is opened for editing.

mind.htm is the file that contains the HTML source that will be added. The resource compiler adds this file as is and will not attempt to interpret the contents of the file.

Run the resource compiler with the following command line:

RC MYMIND.RC

A compiled resource file, mymind.res, will be generated and this file can be included into your program or library:

{$R 'MYMIND.RES'}

You would then call this page up within your application using TWebBrowser's 'Navigate' procedure:

WebBrowser1.Navigate('res://' + Application.ExeName + '/mind.htm');

Delphi's IDE resource compiler

The following example shows how to include the same files, this time in a file called mymind2.rc:

mind  23  "mind2.htm"

themind  23  "themind.gif"

Note: Borland's RC compilers do not understand the '.' (dot) in the resource identifiers, so we need to strip that. Also remember that the <IMG> tags in the HTML must also strip the '.gif' from their SRC's.

Note: Also, Borland's RC compilers do not understand the 'HTML' type, so we need to use that type's numerical value.

You don't need to compile the .rc file; just add it to your project with the "Add file to project" short-cut. This will add the following line to your project source:

{$R 'mymind2.res' 'mymind2.rc'}

Note: This is some Borland magic: the file will be listed as a project file and the IDE will auto-magically compile it when you change it!

You would then call this page up within your application using TWebBrowser's 'Navigate' procedure:

WebBrowser1.Navigate('res://' + Application.ExeName + '/mind');
