Other People's Objects

By: Mark Betz

Abstract: Mark Betz is director of object technologies at Block Financial Corp.'s Technology Center in Upper Montclair, New Jersey, where he works on distributed multimedia information systems using C++ and Orbix.

    NOTE: The views and information expressed in this document represent those of its author, who is solely responsible for its content. Inprise does not make or give any representation or warranty with respect to such content.

Component-based development is one of the hottest concepts in the Windows programming world, and programmers are realizing its advantages. They've discovered how much easier it is to build complex things from smaller, simpler objects than it is to build them whole from one large block of material. In fact, this concept, sometimes called encapsulation, modularity, or information-hiding, has been central to every advance in development practices since toggle switches were a user-input device. It's also been central to object-oriented programming, which adds to it the concepts of identity, classification, and object self-awareness. These concepts make modules easier to understand, create, and reuse. Well, maybe not reuse: the reuse aspect has yet to meet a lot of people's expectations. And as a proponent of object-oriented approaches, I'm here to offer a mea culpa on behalf of all of us who collected consulting fees while we were going on and on about the reusability of C++ classes. It turns out they're a lot harder to reuse than many of us expected.

It's not that designing reusable interfaces is difficult. The problem is implementing them so that they can be included, in a binary linkage model, in a wide variety of applications. If everyone uses the same C++ compiler and linker, nobody has stepped on anyone else's names, and everyone agrees on which utility class libraries to use, you have a decent chance of producing reusable stuff at a relatively fine level of granularity. DLLs may be a great facility, but even they don't solve the problem: wait until you get a new version of a DLL that changes the size of an exported class or alters the layout of the virtual function table.

Breaking the bond of binary dependence between objects that make up an application is a big issue with components. If encapsulation at the syntax level within a particular language is object-oriented, it's much more object-oriented to encapsulate even these details. You need to link to objects dynamically at runtime and use them, regardless of the language they were implemented in, the operating system they run on, and even the machine that happens to be running them. These last two characteristics elevate components into the realm of distributed-object computing, and this is where the real payoff in the component architecture lies.

Components simplify development by letting you make big things out of little things. It's even simpler when you don't have to build the little things. Just borrow them. In technical terms, rather than building monolithic applications from components that have to be linked every time they are used, construct the applications out of simpler, independently executing objects, each in its own protected address space and designed and implemented by those who best understand what it has to do. You can put these objects anywhere on a LAN or WAN (or the Internet, for that matter) for applications to find and use.

Such a model provides enormous potential for system integration and business process-level reuse, but before you can reap those benefits organizationally, you'll need to pay careful attention to architectural issues. Let's take a look at a highly reusable architecture for distributed Windows applications. It uses common Windows facilities, such as DLLs and object-oriented ideas of abstraction, to present application developers with simple interfaces to complex behavior. This is an essential design characteristic that allows an organization to structure its development teams for reuse. I chose Orbix, a popular implementation of the CORBA specification from Iona, to support the architecture in the examples.

Let's begin with a picture of the entire system. Three separate processes are involved, running on two machines connected by a network. Two of the processes are on the server side: the object implementation, which I'll refer to as an object server, and a daemon process that handles server management. There can be many object servers on a host, but there is usually only one daemon. In a system with multiple object-server hosts, each will have a daemon that cooperates with all the others. Together, these daemons make up an information bus that helps connect clients with the objects they need. Its services are independent of the kind of objects being implemented.

The object server provides an implementation of a distributed object. It links libraries or uses DLLs to communicate with the daemon. The daemon manages the object: it can start and stop the server process and provide clients with information about it. The particular implementation of these services, called the ORB core, is specific to Orbix. Other vendors take similar approaches, though some are completely peer-to-peer, a model that raises more design difficulties than technical ones.

The client is similar to the server in that it also links libraries or DLLs to communicate with the ORB core and remote servers. But unlike servers, clients do not normally need a local daemon: the runtime libraries can communicate with any accessible daemon on the net. And whereas the server implements a distributed object on top of the ORB runtime system, the client layers in a set of proxy classes. In the simplest model, these proxies are linked at build time, and then created and called at runtime. The ORB performs the magic that transmits the call and its arguments to the implementation on the remote server, and that treats the results to a similar trip back.

Accomplishing this is no mean feat, as developers of earlier procedural facilities, RPCs, found out. First, there must be a way to define types that is independent of any implementation language, and those definitions must be translatable into languages implemented on various operating systems and machine architectures. The runtime system, the ORB, must also be able to transmit data types between these operating systems and architectures, a translation process known as marshaling. Though the requirements for marshaling are complex, it is a system-level service whose functioning can be taken for granted.
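
As a concrete illustration of what marshaling involves, the following sketch packs a string into a portable byte stream and recovers it on the other side. The format here (a length prefix written in an explicit byte order) is invented for illustration; a real ORB uses a standardized encoding so that both ends agree on the layout regardless of their native architectures.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy marshaler: writes a string as a 4-byte little-endian length
// followed by the raw characters. Because the byte order is spelled
// out explicitly, a big-endian and a little-endian machine would
// both read the buffer the same way.
std::vector<uint8_t> marshal_string( const std::string& s )
{
    std::vector<uint8_t> buf;
    uint32_t len = static_cast<uint32_t>( s.size() );
    for ( int i = 0; i < 4; ++i )
        buf.push_back( static_cast<uint8_t>( ( len >> ( 8 * i ) ) & 0xFF ) );
    buf.insert( buf.end(), s.begin(), s.end() );
    return buf;
}

// The receiving side reverses the process to recover the value.
std::string unmarshal_string( const std::vector<uint8_t>& buf )
{
    uint32_t len = 0;
    for ( int i = 0; i < 4; ++i )
        len |= static_cast<uint32_t>( buf[i] ) << ( 8 * i );
    return std::string( buf.begin() + 4, buf.begin() + 4 + len );
}
```

Real marshaling must also handle structs, arrays, floating-point formats, and object references, but the principle is the same: reduce every type to an agreed-upon wire layout.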

The way objects are described affects the way you use them in applications. CORBA defines an implementation-neutral declarative syntax for describing types, known as the Interface Definition Language. IDL is solely about declaring types; it has no implementation concepts such as storage, control of flow, or addresses, and no syntax for them. In IDL, you declare objects as interfaces: named collections of methods and attributes, similar in appearance to C++ classes. You also declare any supporting types, such as enumerations, structs, typedefs, exceptions, and the like. Once the IDL is run through a compiler, it emerges as source code translated into an implementation language, often C++. The result is a header file that's included on the client and server plus implementation files specific to each.

The translation of an IDL interface into a C++ class is a standard mapping included in the CORBA specification, and most vendors are accepting the standard. In fact, the Orbix compiler produces a hierarchy of classes familiar in shape to any developer who has managed to add common systemic behavior to a number of distinct class hierarchies.

The interface contains a root class called CORBA::Object that is inherited by all the client- and server-side classes. It is how the ORB gets into the act when an object is created or destroyed and when calls are made to or through it. The IDL interface itself is translated into a single class, derived from CORBA::Object, that declares a set of virtual methods that correspond to those in the IDL declaration. Because it is the direct translation of the IDL interface into C++, this class is referred to as an IDL C++ class.

For example, imagine an IDL interface called IBankAccount, which compiles to a C++ class of the same name. The class is used on the client side as a proxy, and it is the only one of the classes generated from the interface that the client knows or cares about. For the server, a class called IBankAccountBOAImpl is generated. It is derived from IBankAccount and simply redeclares the virtual functions inherited from IBankAccount as pure-virtual. IBankAccountBOAImpl serves as a parent for a derived implementation class called IBankAccountImpl. The redeclaration of the functions as pure-virtual is necessary to force developers to implement all of them. The functions could not be declared pure-virtual in IBankAccount because IBankAccount requires implementations of them for client-side use. They could not be left unimplemented because the server side needs something to call when an invocation comes in.
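
The shape of that generated hierarchy can be sketched with mock classes. The class names follow the article; the bodies are stand-ins for illustration, not actual IDL-compiler output:

```cpp
#include <cassert>
#include <string>

// Mock of the CORBA root; the real class carries ORB bookkeeping
// for object creation, destruction, and invocation.
namespace CORBA { class Object { public: virtual ~Object() {} }; }

// IDL C++ class: on the client this acts as the proxy, so its
// virtual functions have implementations (which forward to the ORB).
class IBankAccount : public CORBA::Object {
public:
    virtual std::string Name() { return "<proxy forwards to server>"; }
};

// BOAImpl class: redeclares the methods pure-virtual, forcing the
// server developer to implement every one of them.
class IBankAccountBOAImpl : public IBankAccount {
public:
    virtual std::string Name() = 0;
};

// Server-side implementation class: supplies the real behavior.
// A production server would retrieve this from a database.
class IBankAccountImpl : public IBankAccountBOAImpl {
public:
    virtual std::string Name() { return "J. Q. Saver"; }
};
```

The client compiles against IBankAccount alone; the server derives from the BOAImpl class and never sees the client's proxy implementations.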

The BOA in BOAImpl stands for Basic Object Adapter, a CORBA-standard facility that provides a virtual socket where an implementation can plug into the ORB. In Orbix, it is no more than a derivation point for the implementation class and some simple, though important, constructor behavior. Once the impl class, IBankAccountImpl, is derived from the BOAImpl class, all that's required is an implementation of its virtual functions in a way consistent with the interface's intent. In the example interface, these functions would probably connect to a database to retrieve account data when the client makes a request.

The client-side proxy IBankAccount has implementations of its virtual functions. These implementations call the ORB client runtime system, which packs up the arguments and ships everything off to the server. To use a distributed object, the client has to instantiate a proxy object and then make standard synchronous C++ function calls on it. Naturally, there are some other considerations. The server process has to be registered with the ORB daemon where it is located, and the client has to know the name of that host or a central name server host that can point it in the right direction. In practice, these configuration issues are minimal and can be automated to a large extent by developing distributed system services within the ORB's framework. For the most part, application developers use the distributed objects as if they were local; however, the component developers still haven't achieved true language independence, nor a completely natural usage model for application assemblers. We need to focus more closely on the architecture of the client.

Now all the pieces of a simple distributed-object application are in the picture: The ORB Core finds and manages the object servers; the ORB runtimes connect clients and servers with the ORB Core and each other; the IDL interfaces describe the objects available; and the translations of these interfaces become the parents of implementation classes on the server, and are also used as proxies, or call-stubs, on the client. It all works robustly, quickly, and on all of the Windows platforms. So what's the problem?

The problem is, if you give application developers IDL proxy classes to use, you are still presenting them with some subtle dependencies that they could do without. For example, because of wide variation in C++ implementation among various compilers, the generated classes and code are tool-specific. And because of the nonstandard way in which C++ function names are mangled to ensure link-time type-safety, even the compiled, linkable object modules are tool-specific. They're also C++, which is a great language for system-level development, but not necessarily the best platform for all development. To top it off, the IDL translation to C++ is standard, but nobody is completely compliant yet. How can you present the impressive distributed technology embodied in an ORB to application developers in a way that is truly portable and reusable across various languages and development environments?

The answer is to use tried and true techniques of layering. In broad terms, a layer is code that encapsulates other code, uses it, and presents it to the rest of the system in a certain way. Whereas classes increase horizontal modularity, layers increase vertical modularity. Classes break up a particular domain, a user interface, for example, into manageable pieces. Layers isolate interface classes from business classes, and business classes from database classes. Like a database API, ORB technology is middleware, and it makes sense to encapsulate its peculiarities. Also, Windows DLLs are great for storing several layers of code that provide useful interfaces to your distributed objects. They are easy to write, easy for application developers to use and, in Win32 at least, as efficient as statically linked code. You can link the ORB runtimes (or the import libraries for the ORB DLLs), the generated client-side proxy code, and your interface layers to the DLL and hand it over to a developer in one neat package.

In an incremental approach, the first thing to do is make the C++ version available in a self-contained and naturally usable way. The best way is to create a wrapper class for the client-side proxy class. Why a wrapper? Basically, because you don't control the generation of the C++ proxy class from IDL, nor are the compilers compliant with the standard. It's worth the effort simply to isolate applications from changes, but there's another compelling reason: The life cycle of a distributed object is not exactly that of a locally instantiated class. Take a look at the process of object creation, manipulation, and destruction in Orbix:

IBankAccount* pAccount;
char* pszServer = "AccountServer";
char* pszHost = "mariner.mars.com";
char* pszName;

pAccount = IBankAccount::_bind( pszServer, pszHost );
pszName = pAccount->Name();
MessageBox( 0, pszName, "Account", MB_OK );
pAccount->_release();

Looks easy enough, doesn't it? After all, it's only a few lines of code. A pointer to the proxy class is declared, and then assigned the return value from a static method called _bind that associates it with a server. A call is made through it as through any other class pointer, and then _release is called on it to free the server binding. Anyone using C++ to get at the distributed objects should have no problem following this model. However, this model doesn't account for errors, and there will be errors. The ORB gives a nice, class-based interface to distributed resources, but it can't completely hide all of the stuff in between. Errors that occur in the network layers, for example, have to propagate back to the highest layers of the app in some form.

Fortunately, CORBA defines an exception model; Orbix implements it. Though it's not based on C++ exceptions (it will be eventually), a wrapper lets you turn ORB exceptions into C++ exceptions and decide what information goes back up the call stack. Orbix provides macros that approximate the C++ exception syntax. They don't emulate C++ exception behavior very well, but that won't bother the users of your DLL.

My sample code shows an approach you might take in a wrapper layer to bind a proxy to a server with full error handling. The Orbix macros catch the CORBA errors. Following the CATCH macros, _release is called to clean up and a C++ exception is thrown out to the client application. The IT_X parameter passed to the _bind call and the Name method invocation is called a context object; context objects are the mechanism by which information on server activities, including exceptions, is passed back to the client. Every ORB call takes a context object as a parameter, which defaults to a global context. The Orbix macros simply expand to code that declares one of these, named IT_X, and examines it when the calls return. If it contains an exception, the appropriate catch block is entered.

Although the life-cycle model might not be difficult to understand, it's tedious and cumbersome. But a wrapper class can completely encapsulate the code and let application developers work with C++ classes as usual. In the sample code, you'll find the declaration of a client-side wrapper for IBankAccount. The constructor for this class simply executes the _bind with full exception handling. The destructor calls _release. The wrapper declares member functions that correspond to the methods on the proxy. Each of these functions calls the corresponding method on the proxy, again handling exceptions as needed.
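
A minimal sketch of such a wrapper follows. The proxy here is a mock so the example is self-contained; in real code, IBankAccount would be the IDL-generated proxy, and _bind and _release would go through the Orbix runtime rather than plain new and delete:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Mock proxy standing in for the IDL-generated IBankAccount.
class IBankAccount {
public:
    static IBankAccount* _bind( const char*, const char* )
        { return new IBankAccount; }        // real code: ORB binding
    void _release() { delete this; }        // real code: frees the binding
    std::string Name() { return "J. Q. Saver"; }
};

// Client-side wrapper: the constructor binds, the destructor
// releases, and each method forwards to the proxy, translating any
// lower-level failure into an ordinary C++ exception.
class BankAccount {
public:
    BankAccount( const char* server, const char* host )
    {
        m_pProxy = IBankAccount::_bind( server, host );
        if ( !m_pProxy )
            throw std::runtime_error( "bind failed" );
    }
    ~BankAccount() { if ( m_pProxy ) m_pProxy->_release(); }

    std::string Name()
    {
        try { return m_pProxy->Name(); }
        catch ( ... ) { throw std::runtime_error( "Name() failed" ); }
    }
private:
    IBankAccount* m_pProxy;
};
```

The application sees only construction, ordinary method calls, and C++ exceptions; the bind/release life cycle is hidden entirely.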

The wrapper class can be exported from the DLL and used directly by anyone working in the same compiler that the IDL was translated for. It leaves C++ applications completely independent of changes to the underlying proxy, as long as these do not affect the interface, and also provides a way to translate and abstract lower-level exceptions for client use.

C++ developers now have a clean interface to the distributed BankAccount object, but what about developers working in other environments? One of the lamentable things about C++ is that at link time, it's a lot less reusable than C. C++ mangles names and uses a special function-call protocol that passes an object pointer to a member function. But many languages and tools, from Delphi to Visual Basic and PowerBuilder, can call C functions in a DLL, so a good step is to export a C interface to your wrapper class. The only problem is that this model is a lot harder to design and implement than the wrapper class. A typical method used throughout the Windows API is to replace objects with handles and member functions with C functions that take handles. Two management chores that crop up are keeping track of handles on a list and associating the last received exception with the handle to the object that caused it. When compiled for C, handles aren't typesafe, so you'd need to check the handle on entry to each function to ensure that it is valid. The C API can be placed in the same DLL that exports the wrapper class. Depending on the needs of the app, it can also be placed in its own DLL.
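
A sketch of that handle-based C layer might look like the following. The function names, the integer handle type, and the simple handle table are invented for illustration; inside the DLL, the stand-in class would be replaced by the real C++ wrapper:

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Stand-in for the C++ wrapper class from the previous layer.
class BankAccount {
public:
    explicit BankAccount( const char* id ) : m_id( id ) {}
    std::string Name() { return "Owner of " + m_id; }
private:
    std::string m_id;
};

typedef int HACCOUNT;                 // opaque handle handed to C clients

static std::map<HACCOUNT, BankAccount*> g_accounts;  // live-handle table
static HACCOUNT g_nextHandle = 1;

extern "C" HACCOUNT OpenAccount( const char* id )
{
    try {
        g_accounts[g_nextHandle] = new BankAccount( id );
        return g_nextHandle++;
    } catch ( ... ) { return 0; }     // 0 signals failure to a C caller
}

// C callers get no type-safety, so the handle is validated on entry.
extern "C" int GetAccountName( HACCOUNT h, char* buf, int len )
{
    std::map<HACCOUNT, BankAccount*>::iterator it = g_accounts.find( h );
    if ( it == g_accounts.end() || !buf || len <= 0 ) return 0;
    std::string name = it->second->Name();
    std::strncpy( buf, name.c_str(), len - 1 );
    buf[len - 1] = '\0';
    return 1;
}

extern "C" void CloseAccount( HACCOUNT h )
{
    std::map<HACCOUNT, BankAccount*>::iterator it = g_accounts.find( h );
    if ( it != g_accounts.end() ) { delete it->second; g_accounts.erase( it ); }
}
```

The extern "C" linkage suppresses name mangling, so any tool that can call a flat C DLL export can reach the distributed object through these three functions.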

Using either the C++ or C access strategy, this model has a lot of potential. Imagine a client application in Boston and an account object server hanging off an Internet node in Chicago. What would an application developer normally have to do to obtain and print out the name of a particular account owner in such a case?

try {
   pBankAccount = new BankAccount( "0112143" );
   MessageBox( 0, pBankAccount->Name(),
       "Name", MB_OK );
   delete pBankAccount;
}
catch( ... ) {
   MessageBox( 0, "error", "Name", MB_OK );
}

I don't mean to minimize issues of naming, version control, database fault-tolerance, or any of a host of other technical issues that pop up in distributed systems. But the ORB reduces the level of detail at which the application has to deal with these issues and also provides an excellent framework for innovative ways to deal with them. In my experience, all of these solutions have involved high-level programming at the ORB interface. Programmers have not had to deal with networks at the level of the distributed object interface, much less in the application.

But what about performance? First, there's the ORB issue. Orbix has proven to be as efficient at transporting data across the network as the protocols and physical media allow. The protocol for Orbix, and most other ORBs, is TCP/IP by default. Second, the designer of an interface can greatly affect performance by deciding how the state of the remote object is retrieved into the client address space. The examples I chose illustrate a call-level interface, but it would be as easy to define a structure and stream the structure back in one chunk.
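
The difference between the two retrieval styles can be sketched as follows. The remote object here is a mock, and the struct and method names are invented for illustration; in a real interface, the struct would be declared in IDL and streamed back by the ORB:

```cpp
#include <cassert>
#include <string>

// State of the remote account gathered into one structure, so a
// single remote call can stream it all back at once instead of
// paying one network round trip per field.
struct AccountState {
    std::string name;
    std::string number;
    double      balance;
};

class RemoteAccount {        // mock; imagine each method = one network trip
public:
    std::string Name()    { return "J. Q. Saver"; }
    std::string Number()  { return "0112143"; }
    double      Balance() { return 412.75; }

    // One round trip retrieves everything; the client-side wrapper
    // can then serve Name()/Balance() calls from the cached copy.
    AccountState GetState()
    {
        AccountState s;
        s.name    = Name();
        s.number  = Number();
        s.balance = Balance();
        return s;
    }
};
```

Over a slow link, collapsing N fine-grained calls into one coarse-grained state transfer is usually the single biggest performance lever the interface designer has.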

On the client, the wrapper class can access the structure when member functions are called. One performance test involved setting up an account object client with a C++ wrapper on a Pentium 90 running NT. The client was connected over a 28.8 kilobit PPP serial line to the server running on another P90. The server in turn accessed Microsoft SQL Server running on a high-performance Compaq with two P60 processors. The test made 10 calls to the remote object, each of which returned a string varying in length from 10 to 50 characters. Assuming the server was already running, it required approximately 1,100 milliseconds to bind, retrieve the data, and release the proxy. If the ORB had to start the server, that time could extend to 4,000 milliseconds. Fortunately, servers are usually persistent. Given our slow link and call-level access model, the performance is clearly acceptable, and it's pretty cool to imagine all this happening over the TCP/IP Internet.

Where can you go from here? The DLL-based client architecture I've discussed here forms a nice foundation for interfacing an application to distributed objects at even higher levels of abstraction. The DLL could form the core of an OLE object, such as an OCX, a Delphi VCL component, or a VBX. Application developers could paste your distributed account object into their spreadsheets. And Iona Technologies, the developers of Orbix, have implemented an OLE automation compiler for IDL, opening yet another avenue of access. In addition, distributed object computing has another advantage just as important as its implementation benefits: It enables an organization to structure its software teams into mutually dependent groups of component developers and assemblers.

With all the capabilities of this model, there's no reason not to go ahead and use other people's objects. After all, everybody's doing it.

Mark Betz is director of object technologies at Block Financial Corp.'s Technology Center in Upper Montclair, New Jersey, where he works on distributed multimedia information systems using C++ and Orbix.
