Design Considerations for MIDAS/DataSnap

By: Vino Rodrigues

Abstract: A white paper on things you should think about when you choose to use MIDAS/DataSnap.

Technology Considerations

vinorodrigues@yahoo.com
www.geocities.com/vinorodrigues

What will be covered in this paper:

  • The Evolution of Three-Tier
  • Why write distributed applications?
  • What is DataSnap?
  • The DataSnap Protocols
  • DataModule Threading Models

The Evolution of Three-Tier

Before we can even begin to understand the three-tier model, we need to know its evolution.

Tier Location

1-Tier Application Model

In the beginning there was a single tier, originating from the mainframe era. So what was this tier about? In essence, one machine handled both the application and the database. This was easily achieved on the mainframe because it provided simple screen scraping to dumb terminals. The mainframe did all the work, which was acceptable because a) it was usually a powerful machine, and b) the amount of data in that era was small in comparison to today's.

This model was quickly adapted to the PC era with the advent of hard drive storage.

2-Tier Application Model

When people adapted the single tier to the PC, they came to the realization that the data was no longer shareable, as it once was with the mainframe. And thus the client/server model came into existence. In essence the client/server model was meant to bring the database back into a central location. One of the benefits of this model was that the application ran on the PC while the database was hosted on a "server". This left the application doing all the work, so the server could be a smaller machine.

The client/server model worked for many a year and, in its day, was a brilliant model. Its shortfall came when the amount of data became too large for even the largest of server machines. Now why was this? Looking back at the mainframe, it could usually handle the numbers of users and amounts of data that the client/server model was now failing to handle. This can be attributed to the network. Why? The mainframe usually only transported screen dumps, i.e. just the information that was needed, at the time that it was needed. Most client/server developers got used to fetching all the data that was required to process a screen, i.e. the application would query the database to receive a collection of results, which would then be processed by the application. The application, in turn, would then only show portions of that result. This practice may have worked twenty years ago when data was comparatively small; with today's huge data loads it means the death of the application.

This brings us to the two "client/server schools": a) business at client, and b) business at server (a.k.a. the stored procedure model).

The first school, carried over from the old one-tier model, placed all the business rules in the client application. The second school, realizing that the first was flawed due to the amount of network traffic, placed all the business rules on the server as stored procedures. The second school's methodology worked for several years, but as more and more data accumulated, we found that even the most powerful servers could not handle the workload. This was later addressed by creating replication servers that duplicated databases, but even that solution had serious drawbacks.

3-Tier Application Model

The three-tier model was then thought out to resolve the client/server problem. How did it achieve this? The answer was a mixture of the best features of the two previous models: a) the client would only receive the data that it needed, when it needed it (also called a thin client); b) the middle tier would then handle the workload. Unfortunately, that is the principle that most distributed application developers have forgotten! In client/server the work was done by many applications, and thus the middle tier should consist of several servers that collectively handle the huge number of users and amount of data.

This middle tier centralizes the logic that governs your database interactions so there is centralized control over data relationships. This allows different client applications to use the same data, while ensuring consistent data logic. It also allows for smaller client applications because much of the processing is off-loaded onto the middle tier. These smaller client applications are easier to install, configure, and maintain. Multi-tiered applications can also improve performance by spreading data-processing over several systems.

This brings us to a concept called thin client, a term which, unfortunately, is widely abused. Thin client does not mean "no data access libraries". On the other hand, thin client also does not mean "web-based front-end". What thin client does mean is: a client application that is responsible only for the presentation layer of the application.

Tier Functionality

Another way to look at this is to view what each tier does. An application can be divided into three working entities: a) a data layer, b) a business layer, and c) a presentation layer.

In single-tier applications the tier handles all three aspects of the application.

In a client/server, or two-tier, application the server handles the data, while the client application handles the business as well as the presentation.

A three-tier application should once again bring the presentation layer (and only that layer) to the front end. The middle tier should handle the business and the back end the data.

Why write distributed applications?

Distributing an application is not an end in itself. Distributed applications introduce a whole new set of design and deployment issues. For this added complexity to be worthwhile, there has to be a significant payback.

Some applications are inherently distributed: multi-user games, chat and teleconferencing applications are examples of such applications. For these, the benefits of a robust infrastructure for distributed computing are obvious.

Many other applications are also distributed, in the sense that they have at least two components running on different machines, but are limited in scalability and ease of deployment because they were not designed to be distributed. Any kind of workflow or groupware application, most client/server applications, and even some desktop productivity applications essentially control the way their users communicate and cooperate. Thinking of these applications as distributed applications and running the right components in the right places benefits the user and optimises the use of network and computer resources. An application designed with distribution in mind can accommodate different clients with different capabilities by running components on the client side when possible and running them on the server side when necessary.

Designing applications for distribution gives the system manager a great deal of flexibility in deployment.

Distributed applications are also much more scalable than their monolithic counterparts. If all the logic of a complex application is contained in a single module, there is only one way to increase the throughput without tuning the application itself: faster hardware. Today's servers and operating systems scale very well but it is often cheaper to buy another identical machine than to upgrade to a server that is twice as fast. With a properly designed distributed application, a single server can start out running all the components. When the load increases, some of the components can be deployed to additional lower-cost machines.

What is DataSnap?

Contrary to popular belief, DataSnap is not COM+, CORBA, TCP/IP, HTTP, or even SOAP. DataSnap is also not a three-tier model. DataSnap is a proprietary Borland technology that enables data (in packets) to be sent across a medium, over a distributed network or a file system. Yes, it is true that DataSnap may use a protocol to achieve this, but in essence all that DataSnap does for you is package (and store) data.

DataSnap Technology

The core of the DataSnap technology lies in two components: the TDataSetProvider and the TClientDataSet.

The connection components, and their corresponding data modules, in turn provide the medium for these data packets to move in. According to the Delphi help file, under the topic "Deploying multi-tiered database applications (DataSnap)":

"DataSnap provides multi-tier database capability to Delphi applications by allowing client applications to connect to providers in an application server.

Delphi's support for developing multi-tiered applications is an extension of the way client datasets communicate with a provider component using transportable data packets. Once you understand how to create and manage a three-tiered application, you can create and add additional service layers based on your needs."

Understanding Provider-Based Multi-Tiered Applications

Delphi's support for multi-tiered applications uses the components on the DataSnap page and the Data Access page of the component palette, plus a remote data module that is created by a wizard on the "Multitier" page of the New Items dialog. It is based on the ability of provider components to package data into transportable data packets and handle updates received as transportable delta packets.

The components needed for a multi-tiered application are described in the following table:

Component: Description
Remote data modules: Specialized data modules that can act as a COM Automation object, SOAP server, or CORBA object to give client applications access to any providers they contain. Used on the application server.
Provider component: A data broker that provides data by creating data packets and resolves client updates. Used on the application server.
Client dataset component: A specialized dataset that uses midas.dll or midaslib.dcu to manage data stored in data packets. The client dataset is used in the client application. It caches updates locally, and applies them in delta packets to the application server.
Connection components: A family of components that locate the server, form connections, and make the IAppServer interface available to client datasets. Each connection component is specialized to use a particular communications protocol.
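To make the table concrete, here is a minimal sketch of how these components are wired together. The provider, connection, and field names are hypothetical, and error handling is kept to a bare minimum.

```pascal
uses SysUtils, DBClient, MConnect;

// On the application server, a remote data module would contain a
// TDataSetProvider (here assumed to be named 'ProviderCustomers')
// whose DataSet property points at the dataset to publish.

// On the client, a TClientDataSet is pointed at that provider through
// a connection component; edits are cached locally and sent back to
// the application server as a delta packet.
procedure FetchAndUpdate(Conn: TDCOMConnection; cds: TClientDataSet);
begin
  cds.RemoteServer := Conn;                  // any DataSnap connection component
  cds.ProviderName := 'ProviderCustomers';   // hypothetical provider name
  cds.Open;                                  // pulls a transportable data packet

  cds.Edit;
  cds.FieldByName('NAME').AsString := 'New name';
  cds.Post;                                  // the change is only cached locally

  // Send the delta packet to the application server; 0 = tolerate no errors.
  if cds.ApplyUpdates(0) > 0 then
    raise Exception.Create('Some updates failed to apply');
end;
```

Note that nothing in the client code names a protocol; swapping the TDCOMConnection for another connection component leaves the dataset logic unchanged.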

The DataSnap Protocols

There are several types of connection components that can connect a client dataset to an application server. They are all descendants of TCustomRemoteServer, and differ primarily in the communication protocol they use (DCOM, CORBA, TCP/IP, HTTP, or SOAP).

The connection component establishes a connection to the application server and returns an IAppServer interface that the client dataset uses to call the provider.

This is where we first start discussing design considerations in a DataSnap application. The first consideration for a multi-tiered application is the underlying technology that will provide our remote procedure call abilities. Remote procedure call (RPC) is the term used in our industry for technology that is capable of calling procedures or functions remotely.

COM/COM+

We can create COM/COM+ application servers by creating a "Remote Data Module" or a "Transactional Data Module". We then connect to these application servers by using the "COM connection" component. This methodology uses the Windows underlying COM and COM+ technologies to invoke calls on a Windows NT domain.
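As a sketch, connecting a client to such a server takes little more than a TDCOMConnection; the machine name and server name below are hypothetical (the ServerName is the ProgID registered by the remote data module).

```pascal
uses MConnect;

procedure ConnectViaDCOM(Conn: TDCOMConnection);
begin
  Conn.ComputerName := 'APPSERVER';                  // empty = local machine
  Conn.ServerName   := 'MyApp.MyRemoteDataModule';   // registered ProgID
  Conn.Connected    := True;   // (D)COM locates or launches the server
end;
```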

Advantages:

  • Fastest DataSnap Protocol on Windows platform
  • Can be used with other MS and 3rd-party COM applications
  • Can be used as Transaction Server (MTS/COM+) objects

Disadvantages:

  • Location dependent
  • DCOM has proven to be unstable
  • Relies on Windows NT Domain
  • Can be difficult to set up
  • Cannot port to other environments; MS-only technology

When to use:

  • Same machine application server and client application
  • A department-sized implementation of ±30 connected clients
  • When there is a need to integrate with other MS or COM-based applications
  • Closed NT domains (i.e. No WAN or internet access required)
  • Pure speed is at a premium
  • Reliability is not at a premium

CORBA (VisiBroker)

We can create CORBA application servers by creating a "CORBA Data Module". We then connect to these application servers by using the "CORBA connection" component. Delphi's CORBA DataSnap implementation uses VisiBroker 3.x's DII (Dynamic Invocation Interface) technology. By using VisiBroker's OSAgent (ORB Smart Agent) technology, our application will automatically have a load-balancing and fail-over implementation.
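A hedged sketch of the client side follows; the repository id and object name are hypothetical, and the property names are those of the Delphi 5 TCorbaConnection, so check them against your version.

```pascal
uses CorbaCon;

procedure ConnectViaCorba(Conn: TCorbaConnection);
begin
  // No host name is required: the OSAgent locates an available server,
  // which is what gives us fail-over and load balancing for free.
  Conn.RepositoryId := 'IDL:MyApp/MyCorbaDataModuleFactory:1.0';
  Conn.ObjectName   := 'MyCorbaDataModule';
  Conn.Connected    := True;
end;
```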

Advantages:

  • Location Independent
  • Automatic Fail-Over
  • Automatic Load-Balancing, Scales easily
  • Easy to set up
  • Easily ported to other systems, CORBA is a standard
  • Proven technology - reliable, robust
  • Highly scalable

Disadvantages:

  • Relies on TCP/IP Subnet
  • Costly (?) (Outweighed by advantages!)
  • Cannot "talk" to MS® software

When to use:

  • Large implementations
  • Reliability is at a premium
  • Scalability (potential for growth) is required
  • When implementation requires connectivity to:
    • Non-Windows legacy systems
    • state-of-the-art systems (EJB's and JSP's)
    • heterogeneous external parties (clients and suppliers)

TCP/IP (Sockets)

The "Socket connection" uses an existing COM/COM+ implementation through a proxy known as the "Socket Server". The socket server is Borland's solution to COM's almost-impossible-to-configure limitations. It works on the principle that a client can connect to a remote server using a TCP/IP port (known as a socket). That server then has a daemon that listens on that port and proxies (or re-marshals) those requests into the COM layer.

It should be remembered that this protocol was intended as a last resort. Its main usage is in an implementation that requires remote access. By remote access I mean that the client will be dialling into the network. The nature of this scenario means that we will never have more than five or ten clients connected concurrently. The fact that this connection protocol uses the same application server as the COM connection protocol means that we can provide one solution for both our internal clients and our remote clients.
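A minimal client-side sketch (the address and server name are hypothetical; port 211 is the default port the Borland socket server listens on):

```pascal
uses SConnect;

procedure ConnectViaSocket(Conn: TSocketConnection);
begin
  Conn.Address    := '192.168.0.10';   // or set Host to use a DNS name
  Conn.Port       := 211;              // the socket server's default port
  Conn.ServerName := 'MyApp.MyRemoteDataModule';
  Conn.Connected  := True;   // the socket server proxies this into COM
end;
```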

Advantages:

  • Uses COM, can be used as add-on to existing COM implementations
  • Easy to connect to over remote connections or the internet
  • Easy to configure
  • Can be secured (with interceptor)

Disadvantages:

  • Dependent on COM
  • Single point of entry; a busy socket will bottleneck!
  • Slow; the single socket must marshal all incoming calls.
  • Does not work over DCOM (no remote COM objects)
  • Location dependent

When to use:

  • Remote or dial-up access is required
  • Tiny implementations where COM is too difficult to configure

HTTP (Web)

The "Web connection" uses a superb marriage of an existing COM/COM+ implementation, a Web server and a Web browser. It works on the same "proxy" principle as the socket connection, but instead of having an additional exposed TCP/IP port it uses the IIS Web server and an ISAPI extension to "listen" on a URL. The client uses Windows' Internet Explorer libraries to marshal the HTTP GET request; this is done for two reasons: a) we can use IE's proxy settings, and b) we can use a secure sockets layer (SSL) call across the Internet.

This protocol should also be used as a last resort. It is, however, far more secure and flexible than the pure socket connection. Clients do not need to dial in to your network but can instead use any Internet connection. This means that the client may be halfway across the world and still be able to work on your implementation.
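A minimal client-side sketch (the URL and server name are hypothetical; httpsrvr.dll is the ISAPI extension shipped with Delphi):

```pascal
uses WConnect;

procedure ConnectViaHTTP(Conn: TWebConnection);
begin
  Conn.URL        := 'https://example.com/scripts/httpsrvr.dll';
  Conn.ServerName := 'MyApp.MyRemoteDataModule';
  Conn.UserName   := 'someuser';   // optional, for HTTP authentication
  Conn.Password   := 'secret';
  Conn.Connected  := True;   // wininet supplies proxy and SSL handling
end;
```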

Advantages:

  • Uses COM, can be used as add-on to existing COM implementations
  • Very easy to connect to over the internet
  • Easy to configure
  • Very secure (with IE to IIS SSL)

Disadvantages:

  • Dependent on COM
  • Single point of entry; a busy web server will bottleneck!
  • Slow; the web server must marshal all incoming calls.
  • Does not work over DCOM (no remote COM objects)
  • Web-site dependent

When to use:

  • Pure Internet connection is required
  • Tiny implementations that use the Internet as backbone.

DataModule Threading Models

There are only two variations of data modules that can be created with DataSnap. On one hand we have the COM-based "Remote Data Module" and "Transactional Data Module", and on the other hand we have the "CORBA Data Module". These modules exist within a host: in COM it will be the COM server, and in CORBA it will be the CORBA server. Clients make requests to these hosts through their object factories, which will either create a new object or connect to an existing object for the client making the request. The threading model of the host defines how the factory behaves. (The factory, in this case, is just software that receives all incoming requests to the object host, keeps track of objects, and dispatches all calls to the appropriate objects.)

The COM Threading Models

Process

Process indicates which process space the object server (or host) runs in. This is not a setting that you can define; it depends on which type of host you create.

In-process server

The object server (or host) is a dynamic link library (DLL) or an ActiveX library. The server runs in the client's process space, and is loaded by the client on demand.

When to use:

  • Objects with a user interface (visual controls).
  • MTS or Transactional objects managed by MTS or COM+.
  • Objects that do not often fail.

Out-of-process server

The object server (or host) is a stand-alone executable, usually with an EXE extension. The server runs in its own process space.

When to use:

  • Objects running as stand-alone on a remote machine. (This is the only way that stand-alone DCOM functions.)
  • Objects that may be unstable. (Separate hosts will protect other clients from failing.)
  • Object servers that perform other tasks too, such as a GUI application.

Instancing

Instancing indicates how your object server is launched.

Internal Instance

The object is created in an in-process server. Choose this option when creating a remote data module as part of an ActiveX library (DLL/OCX).

When to use:

  • The object will only be used by another object in the same host.

Single Instance

Only a single instance of the object is created for each executable. Each client connection launches its own instance of the executable. The object instance is therefore dedicated to a single client.

Multiple Instance

A single instance of the server (process) instantiates all remote data modules created for clients. Each object is dedicated to a single client connection, but they all share the same process space.

Threading Model

The threading model indicates how client calls are passed to your objects' interface. By adding thread support to your COM object, you can improve its performance, because multiple clients can access your application at the same time.

Single Threading Model

The object only receives one client request at a time. You don't need to deal with threading issues.


The sequence of events:

1. The first client method call creates the object.
2. A second client's method call is queued until the other client's method calls are done.
3. The second client connects to the same (single) object when no other client calls are in the queue.

Pros and Cons:

  • COM serializes client requests so that the application receives one request at a time.
  • Clients are handled one at a time so no threading support is needed.
  • No performance benefit.

When to use:

  • Objects that clients will use briefly and in rapid succession, like:
    • A logon and authentication provider.
    • A background process trigger, like a batch process invoker.

Apartment Threading Model
a.k.a. Single-Threaded Apartment

Each instance of the object services one request at a time.

The term "apartment" comes from a metaphor in which a process is conceived as a totally discrete entity, such as a "building" that is subdivided into a set of related but different "locales" called "apartments." An apartment is a "logical container" that creates an association between objects and, in some cases, threads. Objects are not apartments, although every object is associated with one and only one apartment. But apartments are more than just a logical construct; their rules describe the behaviour of the COM system.


The sequence of events:

1. The first client method call creates an object.
2. A second client method call creates a new object.
3. Objects are destroyed once their clients disconnect; objects "live" in their client connections.

Pros and Cons:

  • All client calls use the thread in which the object was created.
  • Objects can safely access their own instance data, but global data must be protected.
  • The thread's local variables are reliable across multiple calls. (i.e. the object is stateful.)
  • Some performance benefits.

When to use:

  • Objects with a user interface (visual controls).
  • Objects that demand stateful connections, like batch processing.
  • Any other objects that will contain properties.
  • Objects where clients use it for a long time each time they call it.
  • (Un-pooled) BDE dataset implementations.

Free Threading Model
a.k.a. Multi-Threaded Apartment

Object instances can receive simultaneous client requests on several threads.


The sequence of events:

1. The first client method call creates an object.
2. Objects stay in memory when method calls are done.
3. Subsequent client method calls connect to "free" objects...
4. ... but will create a new object if the other objects are busy with other method calls.
5. Client method calls will always connect to the first "free" object.

Pros and Cons:

  • Objects must protect all instance and global data.
  • Thread local variables are not reliable across multiple calls.

When to use:

  • Stateless data application servers.
  • Objects that do not contain properties.
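Because several client calls can execute at once under this model, shared data must be guarded explicitly. A minimal sketch, using a critical section around a hypothetical global counter:

```pascal
uses SyncObjs;

var
  GlobalLock: TCriticalSection;   // created once at server start-up
  RequestCount: Integer;          // shared across threads, so it must be protected

procedure CountRequest;
begin
  GlobalLock.Acquire;
  try
    Inc(RequestCount);   // only one thread at a time gets here
  finally
    GlobalLock.Release;
  end;
end;
```

Instance data that is touched by only one method at a time needs no such guard; it is the globally shared state that the free-threaded model leaves unprotected.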

Both Threading Model

This is the same as the Free-threaded model except that outgoing calls (for example, call-backs) are guaranteed to execute in the same thread.


The sequence of events:

1. Clients connect at method-call level, like the free model (i.e. a client will connect to the first "free" object)...
2. ... but a client will queue to reconnect to its originally created or connected object (i.e. the client will assume the object has "Apartment" behaviour).

Pros and Cons:

  • Maximum performance and flexibility.
  • Does not require the server to provide thread support for parameters supplied to outgoing calls.

When to use:

  • Stateless, high performance, data application servers.
  • ADO dataset implementations.

Neutral Threading Model

Multiple clients can call the object on different threads at the same time, but COM ensures that no two calls conflict.


The sequence of events:

1. Objects are created on demand at method level.
2. Clients reuse pooled objects if they are free at method level.
3. Clients will create new objects if there is no other object that has the requested method "free"...
4. ... but will use an existing "in use" object if the requested method is "free".

Pros and Cons:

  • You must guard against thread conflicts involving global data and any instance data that is accessed by multiple methods.
  • Should not be used with objects that have a user interface (visual controls).
  • Only available under COM+; under COM it is mapped to the Apartment model.
  • The only model that supports COM+ object pooling implementations.

When to use:

  • Stateless, high load, data application servers.
  • When you wish to use COM+ built in pooling implementations.

More on COM/COM+

Local variables (except those in call-backs) are always safe, regardless of the threading model. This is because local variables are stored on the stack and each thread has its own stack. Local variables may not be safe in callbacks when using free threading.

Under COM+, the serialization of calls to your object is also influenced by how it participates in activities. This can be configured using the COM+ page of the type library editor or the COM+ Component Manager.

Activities are recorded in an object's context, and the association between an object and an activity cannot be changed. An activity includes the transactional object created by the base client, as well as any transactional objects created by that object and its descendants. These objects can be distributed across one or more processes, executing on one or more computers.

For example, a physician's medical application may have a transactional object to add, update, and remove records in various medical databases, each represented by a different object. This record object may use other objects as well, such as a receipt object to record the transaction. This results in several transactional objects that are either directly or indirectly under the control of a base client. These objects all belong to the same activity.

MTS or COM+ tracks the flow of execution through each activity, preventing inadvertent parallelism from corrupting the application state. This feature results in a single logical thread of execution throughout a potentially distributed collection of objects. By having one logical thread, applications are significantly easier to write.

When a transactional object is created from an existing context, using either a transaction context object or an object context, the new object becomes a member of the same activity. In other words, the new context inherits the activity identifier of the context used to create it.

Only a single logical thread of execution is allowed within an activity. This is similar in behaviour to a COM apartment-threading model, except that the objects can be distributed across multiple processes. When a base client calls into an activity, all other requests for work in the activity (such as from another client thread) are blocked until after the initial thread of execution returns back to the client.

Under MTS, every transactional object belongs to one activity. Under COM+, you can configure the way the object participates in activities by setting the call synchronization. The following options are available:

Option: Meaning
Disabled: COM+ does not assign activities to the object, but it may inherit them from the caller's context. If the caller has no transaction or object context, the object is not assigned to an activity. The result is the same as if the object was not installed in a COM+ application. This option should not be used if any object in the application uses a resource manager or if the object supports transactions or just-in-time activation.
Not Supported: COM+ never assigns the object to an activity, regardless of the status of its caller. This option should not be used if any object in the application uses a resource manager or if the object supports transactions or just-in-time activation.
Supported: COM+ assigns the object to the same activity as its caller. If the caller does not belong to an activity, the object does not either. This option should not be used if any object in the application uses a resource manager or if the object supports transactions or just-in-time activation.
Required: COM+ always assigns the object to an activity, creating one if necessary. This option must be used if the transaction attribute is Supported or Required.
Requires New: COM+ always assigns the object to a new activity, which is distinct from its caller's.

The CORBA Threading Models

Instancing

Instancing indicates how your CORBA server application creates instances of the CORBA object.

Instance-per-client

A new CORBA object instance is created for each client connection. The instance persists until a timeout period elapses with no requests from the client. This allows the server to free instances when they are no longer used by clients, but runs the risk that the CORBA object may be freed prematurely if the client does not use its interface often enough.

Shared Instance

A single instance of the CORBA object handles all client requests.

Threading

Threading indicates how client calls invoke your CORBA object's interface.

Single-threaded

Each CORBA object instance is guaranteed to receive only one client request at a time. Instance data is safe from thread conflicts, but global memory must be explicitly protected.

Multithreaded

Each client connection has its own dedicated thread. However, the CORBA object may receive multiple client calls simultaneously, each on a separate thread. Both global memory and instance data must be explicitly protected against thread conflicts.
