Other People's Objects
Mark Betz
Director of object technologies at Block Financial Corp.'s Technology Center in Upper Montclair, New Jersey, where he works on distributed multimedia information systems using C++ and Orbix.
NOTE: The views and information expressed in this document represent those of its author(s), who are solely responsible for its content. Inprise does not make or give any representation or warranty with respect to such content.
Component-based development is one of the hottest
concepts in the Windows programming world, and programmers are
realizing its advantages. They've discovered how much easier it
is to build complex things from smaller, simpler objects than
it is to build them whole from one large block of material. In
fact, this concept, sometimes called encapsulation, modularity,
or information-hiding, has been central to every advance in development
practices since toggle switches were a user-input device. It's
also been central to object-oriented programming, which adds to
it concepts of identity, classification, and object self-awareness.
These functions make modules easier to understand, create, and
reuse. Well, maybe not reuse: the reuse aspect has yet to meet
a lot of people's expectations. And as a proponent of object-oriented
approaches, I'm here to offer a mea culpa on behalf of all of
us who collected consulting fees while we were going on and on
about the reusability of C++ classes. It turns out they're a lot
harder to reuse than many of us expected.
It's not that designing reusable interfaces is difficult. The
problem is implementing them so that they can be included, in
a binary linkage model, in a wide variety of applications. If
everyone uses the same C++ compiler and linker, nobody has stepped
on anyone else's names, and everyone agrees on which utility class
libraries to use, you have a decent chance of producing reusable
stuff at a relatively fine level of granularity. DLLs may be a
great facility, but even they don't solve the problem: wait until
you get a new version of a DLL that changes the size of an exported
class or alters the layout of the virtual function table.
Breaking the bond of binary dependence between objects that make
up an application is a big issue with components. If encapsulation
at the syntax level within a particular language is object-oriented,
it's much more object-oriented to encapsulate even these details.
You need to link to objects dynamically at runtime and use them,
regardless of the language they were implemented in, the operating
system they run on, and even the machine that happens to be running
them. These last two characteristics elevate components into the
realm of distributed-object computing, and this is where the real
payoff in the component architecture lies.
Components simplify development by letting you make big things
out of little things. It's even simpler when you don't have to
build the little things. Just borrow them. In technical terms,
rather than building monolithic applications from components that
have to be linked every time they are used, construct the applications
out of simpler, independently executing objects, each in its own
protected address space and designed and implemented by those
who best understand what it has to do. You can put these objects
anywhere on a LAN or WAN (or the Internet, for that matter) for
applications to find and use.
Such a model provides enormous potential for system integration
and business process-level reuse, but before you can reap those
benefits organizationally, you'll need to pay careful attention
to architectural issues. Let's take a look at a highly reusable
architecture for distributed Windows applications. It uses common
Windows facilities, such as DLLs and object-oriented ideas of
abstraction, to present application developers with simple interfaces
to complex behavior. This is an essential design characteristic
that allows an organization to structure its development teams
for reuse. I chose Orbix, a popular implementation of the CORBA
specification from Iona, to support the architecture in the examples.
STICKY-FINGERED SERVERS
Let's begin with a picture of the entire system. Three separate
processes are involved, running on two machines connected by a
network. Two of the processes are on the server side: the object
implementation, which I'll refer to as an object server, and a
daemon process that handles server management. There can be many
object servers on a host, but there is usually only one daemon.
In a system with multiple object-server hosts, each will have
a daemon that cooperates with all the others. Together, these
daemons make up an information bus that helps connect clients
with the objects they need. Its services are independent of the
kind of objects being implemented.
The object server provides an implementation of a distributed
object. It links libraries or uses DLLs to communicate with the
daemon. The daemon manages the object: it can start and stop the
server process and provide clients with information about it.
The particular implementation of these services, called the ORB
core, is specific to Orbix. Other vendors take similar approaches,
though some are completely peer-to-peer, a model that raises more
design difficulties than technical ones.
The client is similar to the server in that it also links libraries
or DLLs to communicate with the ORB core and remote servers. But
unlike servers, clients do not normally need a local daemon: the
runtime libraries can communicate with any accessible daemon on
the net. And whereas the server implements a distributed object
on top of the ORB runtime system, the client layers in a set of
proxy classes. In the simplest model, these proxies are linked
at build time, and then created and called at runtime. The ORB
performs the magic that transmits the call and its arguments to
the implementation on the remote server, and that treats the results
to a similar trip back.
Accomplishing this is no mean feat, as developers of earlier procedural
facilities such as RPCs found out. The developer must first provide
a way to define types that is independent of any implementation
language. These type definitions must be translatable to languages implemented
on various operating systems and machine architectures. The runtime
system, the ORB, must also be able to transmit data types between
these operating systems and architectures. This requires a translation
process known as marshaling. Though requirements for marshaling
are very complex, marshaling is a system-level service whose functioning
can be taken for granted.
CAN YOU DESCRIBE THE OBJECT?
The way objects are described affects the way you use them in
applications. CORBA defines an implementation-neutral declarative
syntax for describing types. The language is known as Interface
Definition Language. IDL is solely about declaring types. It has
no implementation concepts such as storage, control of flow, addresses,
or any syntax for them. In IDL, you declare objects as interfaces:
named collections of methods and attributes, similar in appearance
to C++ classes. You also declare any supporting types, such as
enumerations, structs, typedefs, exceptions, and the like. Once
the IDL is run through a compiler, it emerges as source code translated
into an implementation language, often C++. The result is a header
file that's included on the client and server plus implementation
files specific to each.
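For reference, a minimal IDL declaration for the IBankAccount interface used later in this article might look like the following; everything here other than the IBankAccount name is my own illustration, not the article's actual listing.

```idl
// Illustrative IDL only; all names except IBankAccount are assumed.
interface IBankAccount {
    readonly attribute string Name;    // maps to a Name() accessor in C++
    exception InvalidAccount {};       // user-defined exceptions are declared too
    float Balance() raises (InvalidAccount);
};
```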
The translation of an IDL interface into a C++ class is a standard
mapping included in the CORBA specification, and most vendors
are accepting the standard. In fact, the Orbix compiler produces
a hierarchy of classes familiar in shape to any developer who
has managed to add common systemic behavior to a number of distinct
class hierarchies.
The generated hierarchy is rooted in a class called CORBA::Object that
is inherited by all the client- and server-side classes. It is
how the ORB gets into the act when an object is created or destroyed
and when calls are made to or through it. The IDL interface itself
is translated into a single class, derived from CORBA::Object,
that declares a set of virtual methods that correspond to those
in the IDL declaration. Because it is the direct translation of
the IDL interface into C++, this class is referred to as an IDL
C++ class.
For example, imagine an IDL interface called IBankAccount, which
compiles to a C++ class of the same name. The class is used on
the client side as a proxy, and it is the only one of the classes
generated from the interface that the client knows or cares about.
For the server, a class called IBankAccountBOAImpl is generated.
It is derived from IBankAccount and simply redeclares the virtual
functions inherited from IBankAccount as pure-virtual. IBankAccountBOAImpl
serves as a parent for a derived implementation class called IBankAccountImpl.
The redeclaration of the functions as pure-virtual is necessary
to force developers to implement all of them. The functions could
not be declared pure-virtual in IBankAccount because IBankAccount
requires implementations of them for client-side use. They could
not be left unimplemented because the server-side needs something
to call when an invocation comes in.
The BOA in BOAImpl stands for Basic Object Adapter, a CORBA-standard
facility that provides a virtual socket where an implementation
can plug into the ORB. In Orbix, it is no more than a derivation
point for the implementation class and some simple, though important,
constructor behavior. Once the impl class, IBankAccountImpl, is
derived from the BOAImpl class, all that's required is an implementation
of its virtual functions in a way consistent with the interface's
intent. In the example interface, these functions would probably
connect to a database to retrieve account data when the client
makes a request.
The client-side proxy IBankAccount has implementations of its
virtual functions. These implementations call the ORB client runtime
system, which packs up the arguments and ships everything off
to the server. To use a distributed object, the client has to
instantiate a proxy object and then make standard synchronous
C++ function calls on it. Naturally, there are some other considerations.
The server process has to be registered with the ORB daemon where
it is located, and the client has to know the name of that host
or a central name server host that can point it in the right direction.
In practice, these configuration issues are minimal and can be
automated to a large extent by developing distributed system services
within the ORB's framework. For the most part, application developers
use the distributed objects as if they were local; however, the
component developers still haven't achieved true language independence,
nor a completely natural usage model for application assemblers.
We need to focus more closely on the architecture of the client.
IDL HANDS ARE THE DAEMON'S WORKSHOP
Now all the pieces of a simple distributed-object application
are in the picture: The ORB Core finds and manages the object
servers; the ORB runtimes connect clients and servers with the
ORB Core and each other; the IDL interfaces describe the objects
available; and the translations of these interfaces become the
parents of implementation classes on the server, and are also
used as proxies, or call-stubs, on the client. It all works robustly,
quickly, and on all of the Windows platforms. So what's the problem?
The problem is, if you give application developers IDL proxy classes
to use, you are still presenting them with some subtle dependencies
that they could do without. For example, because of wide variation
in C++ implementation among various compilers, the generated classes
and code are tool-specific. And because of the nonstandard way
in which C++ function names are mangled to ensure link-time type-safety,
even the compiled, linkable object modules are tool-specific.
They're also C++, which is a great language for system-level development,
but not necessarily the best platform for all development. To
top it off, the IDL translation to C++ is standard, but nobody
is completely compliant yet. How can you present the impressive
distributed technology embodied in an ORB to application developers
in a way that is truly portable and reusable across various languages
and development environments?
The answer is to use tried and true techniques of layering. In
broad terms, a layer is code that encapsulates other code, uses
it, and presents it to the rest of the system in a certain way.
Whereas classes increase horizontal modularity, layers increase
vertical modularity. Classes break up a particular domain, a user
interface, for example, into manageable pieces. Layers isolate
interface classes from business classes, and business classes
from database classes. Like a database API, ORB technology is
middleware, and it makes sense to encapsulate its peculiarities.
Also, Windows DLLs are great for storing several layers of code
that provide useful interfaces to your distributed objects. They
are easy to write, easy for application developers to use and,
in Win32 at least, as efficient as statically linked code. You
can link the ORB runtimes (or the import libraries for the ORB
DLLs), the generated client-side proxy code, and your interface
layers to the DLL and hand it over to a developer in one neat
package.
In an incremental approach, the first thing to do is make the
C++ version available in a self-contained and naturally usable
way. The best way is to create a wrapper class for the client-side
proxy class. Why a wrapper? Basically, because you don't control
the generation of the C++ proxy class from IDL, nor are the compilers
compliant with the standard. It's worth the effort simply to isolate
applications from changes, but there's another compelling reason:
The life cycle of a distributed object is not exactly that of
a locally instantiated class. Take a look at the process of object
creation, manipulation, and destruction in Orbix:
#include <IBANKACCOUNT.HH>

IBankAccount* pAccount;
char* pszServer = "AccountServer";
char* pszHost = "mariner.mars.com";
char* pszName;

pAccount = IBankAccount::_bind( pszServer, pszHost );
pszName = pAccount->Name();
MessageBox( 0, pszName, "Account", MB_OK );
pAccount->_release();
Looks easy enough, doesn't it? After all, it's only eight lines
of code. A pointer to the proxy class is created, and then assigned
the return value from a static method called _bind that associates
it with a server. A call is made on it as through any other class
pointer, and then _release is called on it to free the server
binding. Anyone using C++ to get at the distributed objects should
have no problem following this model. However, this model doesn't
account for errors, and there will be errors. The ORB gives a
nice, class-based interface to distributed resources, but it can't
completely hide all of the stuff in between. Errors that occur
in the network layers, for example, have to propagate back to
the highest layers of the app in some form.
Fortunately, CORBA defines an exception model; Orbix implements
it. Though it's not based on C++ exceptions (it will be eventually),
a wrapper lets you turn ORB exceptions into C++ exceptions and
decide what information goes back up the call stack. Orbix provides
macros that approximate the C++ exception syntax. They don't emulate
C++ exception behavior very well, but that won't bother the users
of your DLL.
My sample code shows an approach you might take in a wrapper layer
to bind a proxy to a server with full error-handling. The Orbix
macros catch the CORBA errors. Following the CATCH macros, _release
is called to clean up and a C++ exception is thrown out to the
client application. The IT_X parameter passed to the _bind call
and to the Name method invocation is called a context object; context
objects are the mechanism by which information on server activities, including
exceptions, is passed back to the client. Every ORB call takes
a context object as a parameter, which defaults to a global context.
The Orbix macros simply expand to code that declares one of these
objects, named IT_X, and examines it when the calls return. If it contains
an exception, the appropriate catch block is entered.
Although the life-cycle model might not be difficult to understand,
it's tedious and cumbersome. But a wrapper class can completely
encapsulate the code and let application developers work with
C++ classes as usual. In the sample code, you'll find the declaration
of a client-side wrapper for IBankAccount. The constructor for
this class simply executes the _bind with full exception handling.
The destructor calls _release. The wrapper declares member functions
that correspond to the methods on the proxy. Each of these functions
calls the corresponding method on the proxy, again handling exceptions
as needed.
The wrapper class can be exported from the DLL and used directly
by anyone working in the same compiler that the IDL was translated
for. It leaves C++ applications completely independent of changes
to the underlying proxy, as long as these do not affect the interface,
and also provides a way to translate and abstract lower-level
exceptions for client use.
HEY! WHERE'D YOU GET THOSE PLUSSES?
C++ developers now have a clean interface to the distributed BankAccount
object, but what about developers working in other environments?
One of the lamentable things about C++ is that at link-time, it's
a lot less reusable than C. C++ mangles names and uses a special
function-call protocol that passes an object pointer to a member
function. But many languages and tools, from Delphi to Visual
Basic and PowerBuilder, can call C functions in a DLL, so a good
step is to export a C interface to your wrapper class. The only
problem is that this model is a lot harder to design and implement
than the wrapper class. A typical method used throughout the Windows
API is to replace objects with handles and member functions with
C functions that take handles. Two management chores that crop
up are keeping track of handles on a list and associating the
last received exception with the handle to the object that caused
it. When compiled for C, handles aren't type-safe, so you'd need
to validate the handle on entry to each function.
The C API can be placed in the same DLL that exports
the wrapper class. Depending on the needs of the app, it can also
be placed in its own DLL.
Using either the C++ or C access strategy, this model has a lot
of potential. Imagine a client application in Boston and an account
object server hanging off an Internet node in Chicago. What would
an application developer normally have to do to obtain and print
out the name of a particular account owner in such a case?
try {
    pBankAccount = new BankAccount( "0112143" );
    MessageBox( 0, pBankAccount->Name(), "Name", MB_OK );
    delete pBankAccount;
}
catch( ... ) {
    MessageBox( 0, "error", "Name", MB_OK );
}
I don't mean to minimize issues of naming, version control, database
fault-tolerance, or any of a host of other technical issues that
pop up in distributed systems. But the ORB reduces the level of
detail at which the application has to deal with these issues
and also provides an excellent framework for innovative ways to
deal with them. In my experience, all of these solutions have
involved high-level programming at the ORB interface. Programmers
have not had to deal with networks at the level of the distributed
object interface, much less in the application.
MONEY FOR NOTHING, OBJECTS FOR FREE
But what about performance? First, there's the ORB issue. Orbix
has proven to be as efficient at transporting data across the
network as the protocols and physical media allow. The protocol
for Orbix, and most other ORBs, is TCP/IP by default. Second,
the designer of an interface can greatly affect performance by
deciding how the state of the remote object is retrieved into
the client address space. The examples I chose illustrate a call-level
interface, but it would be as easy to define a structure and stream
the structure back in one chunk.
On the client, the wrapper class can access the structure when
member functions are called. One performance test involved setting
up an account object client with a C++ wrapper on a Pentium 90
running NT. The client was connected over a 28.8 kilobit PPP serial
line to the server running on another P90. The server in turn
accessed Microsoft SQL Server running on a high-performance Compaq
with two P60 processors. The test made 10 calls to the remote
object, each of which returned a string varying in length from
10 to 50 characters. Assuming the server was already running,
it required approximately 1,100 milliseconds to bind, retrieve
the data, and release the proxy. If the ORB had to start the server,
that time could extend to 4,000 milliseconds. Fortunately, servers
are usually persistent. Given our slow link and call-level access
model, the performance is clearly acceptable, and it's pretty
cool to imagine all this happening over the TCP/IP Internet.
Where can you go from here? The DLL-based client architecture
I've discussed here forms a nice foundation for interfacing an
application to distributed objects at even higher levels of abstraction.
The DLL could form the core of an OLE object, such as an OCX,
a Delphi VCL component, or a VBX. Application developers could
paste your distributed account object into their spreadsheets.
And Iona Technologies, the developers of Orbix, have implemented
an OLE automation compiler for IDL, opening yet another avenue
of access. In addition, distributed object computing has another
advantage just as important as its implementation benefits: It
enables an organization to structure its software teams into mutually
dependent groups of component developers and assemblers.
With all the capabilities of this model, there's no reason not
to go ahead and use other people's objects. After all, everybody's
doing it.