The JDataStore Advantage

By: Steven Shaughnessy

Abstract: Ease of deployment and portability make JDataStore a simple database to use

JDataStore, a database written entirely in Java, provides many critical advantages over databases with native components in terms of deployment, portability, reliability, performance, memory footprint and ease of use. This article describes the characteristics of JDataStore that support these claims. I'll also show that many of these characteristics give JDataStore a significant advantage over other databases written entirely in Java.

The information contained in this article is based on the most recent version of JDataStore, 3.51. This software can be downloaded free of charge from our Web site at www.inprise.com/jdatastore/. A developer license is included with this download.

Defining the Pure Java Database

For true portability a pure Java database must meet the following conditions:
1. It must run under a generic JVM such as Sun's 1.1 or 1.2 JVM.
2. All components that access data in the database must be written entirely in Java.

If a vendor can't provide this support, you'll need to deploy different native components for each platform that you want to support. Beware: when a vendor claims to have Java support, there can be varying dependencies on native components.

Some native database vendors such as Oracle have built or integrated a Java Virtual Machine (JVM) into their native database. This is not a pure Java solution. It's a "lock-in" strategy for these database vendors: using such a solution locks you into their JVM platform and requires database companies to port this native solution to all customer platforms, a huge resource commitment. Since Java is evolving all the time, porting such a solution across multiple platforms is expensive. As a customer you become hostage to that vendor's porting schedule.

There are many vendors that provide only Java JDBC drivers. These drivers have separate native components for each platform supported. Sun classifies the various types of JDBC drivers into four driver types based largely on how native components are used. The drivers least dependent on native components are the Type 4. To provide the best portability, a Type 4 JDBC driver for a native database requires some net protocol to communicate with that database. This way the native database may or may not be on the same machine as the client that's using the Type 4 JDBC driver.

Some vendors with legacy database systems built in C or C++ will port some of their database subsystems to Java, but not their base storage and concurrency system. A good example of this would be an Object database that adds support for serializing Java Object trees, but the storage/concurrency subsystem is still native. If such a vendor provides JDBC access, application deployment will still include platform-specific native components.

JDataStore provides two Type 4 JDBC drivers: local and remote. The local one is used when the application and the JDataStore database are running in the same JVM process. Better performance can be achieved for local and server-side applications because JDBC requests are processed directly through in-process method calls. We also provide a remote JDBC driver that communicates via TCP/IP. The purpose of this driver is to provide access from more than one JVM process (which may or may not run on the same machine) to the same database. There is a performance penalty for using the remote driver since it must communicate requests and results via TCP/IP. Again the benefit for the remote driver is multiprocess support; it also has a small 52K footprint.
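As a rough sketch of how an application might choose between the two drivers, the helper below builds a JDBC URL for each case and opens the connection through the standard `java.sql.DriverManager`. The class name `JdsConnections`, the driver class name and the two URL prefixes are assumptions made for illustration; check the JDataStore documentation for the exact values.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/**
 * Sketch: choosing between an in-process and a remote JDBC connection.
 * The driver class name and URL prefixes are assumptions for illustration.
 */
class JdsConnections {
    // Assumed driver class name (not taken from the article).
    static final String DRIVER_CLASS = "com.borland.datastore.jdbc.DataStoreDriver";

    /** URL for the local driver: the database file is opened in this JVM. */
    static String localUrl(String fileName) {
        return "jdbc:borland:dslocal:" + fileName;
    }

    /** URL for the remote driver: requests travel over TCP/IP to a server. */
    static String remoteUrl(String host, String fileName) {
        return "jdbc:borland:dsremote://" + host + "/" + fileName;
    }

    /** Opens a connection through standard JDBC; fails if the driver is absent. */
    static Connection open(String url, String user, String password)
            throws ClassNotFoundException, SQLException {
        Class.forName(DRIVER_CLASS); // loads the driver so it can register itself
        return DriverManager.getConnection(url, user, password);
    }
}
```

Because only the URL differs, an application written this way could move from single-process to multiprocess access without touching its data access code.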

Deployment and Portability

A major benefit of using JDataStore is ease of deployment and portability. It's a slam dunk: there's no database software install needed! You just add one or two JAR files to your classpath, depending on whether you're using the remote or local driver.

Another critical element of deployment is file format compatibility across different hardware/software platforms. Some database vendors have a different file format for the different platforms supported. A utility must be run to transfer the database from one platform type to another. There are also many international issues for native databases with respect to character sets, collation, etc.

JDataStore database files are directly transferable to any platform. There's no conversion process needed when deploying or transferring a database from one machine to another. String data is stored as Unicode; sorting and indexing uses the standard Java collation support; and numeric and time data types are stored identically for all platforms.
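To illustrate why these properties come almost for free in Java, here is a minimal sketch. It is not JDataStore's actual file format: the big-endian layout, the choice of UTF-16BE and the class name `PortableEncoding` are assumptions made for the example. The point is that the byte layout and the sort order are defined by the Java platform rather than by the host CPU or operating system.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Sketch of platform-independent encoding and collation (illustrative only). */
class PortableEncoding {
    /** Encodes an int and a long; ByteBuffer is big-endian by default on every host. */
    static byte[] encodeNumbers(int i, long l) {
        return ByteBuffer.allocate(12).putInt(i).putLong(l).array();
    }

    /** Encodes text as UTF-16BE, a fixed Unicode encoding on every platform. */
    static byte[] encodeText(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    /** Sorts with locale-aware collation rather than raw character order. */
    static String[] sortForLocale(String[] values, Locale locale) {
        String[] copy = values.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }
}
```

With `Locale.ENGLISH`, for example, "apple" sorts before "Banana", whereas plain `String.compareTo` would put the capitalized word first.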

JDataStore runs on any stable JVM, version 1.1 and above. Some VMs that we've tried aren't stable. For example, under Solaris we can complete our certification suites using HotSpot or the classic JVM only with the Sun JIT disabled. The Sun JIT has some fundamental bugs that keep us from running any test suite of significance (particularly in the area of exception handling). Table 1 shows some of the platform/JVM combinations that we can run our certification suites on.

Table 1: Platform/JVM Combination

Just because a platform/JVM combination isn't listed doesn't mean that JDataStore won't run on it. In our experience we've found that if the JVM is stable, our tests complete successfully. Database technology is incredibly portable when written in Java due to the powerful and efficient multithreading support in Java, and the fact that the only other platform dependencies relied on are basic I/O APIs.

On a side note, we like testing the IBM JVM under Linux because its JIT performs well (the best we've seen for Linux) and it supports native threads.

We also tested the HotSpot 1.2 JVM available on HP-UX. Most tests complete, but it can't handle some of our more rigorous multithreaded transactional stress tests. We'll continue to run against newer versions of their JVM, but for now it's not as stable as most of the other platforms we run against.

Reliability

We have a fair amount of experience with a variety of JDBC drivers due to our DataExpress technology. DataExpress is a collection of JavaBean components that provide powerful data access and a visual component data binding capability. DataExpress can work with a variety of data sources but the most common one used is a JDBC driver. DataExpress is designed to generically work with any JDBC driver including JDataStore.

What we've found is that the quality of JDBC drivers varies quite a bit from vendor to vendor. Many developers initially attempt to use the JDBC-ODBC bridge to access data via an ODBC driver, and many of these attempts fail. The bridge is nonportable since it relies on native code to interact with an ODBC driver. Failures are particularly common when accessing Microsoft data sources such as Access and MS-SQL. In fact, Sun recommends that the bridge not be used for deploying applications! The bridge also can't be any more reliable than the ODBC driver you're using it with.

In building benchmark tests using JDBC drivers, we've often caused other JDBC drivers to choke under the stress of high-volume transactional benchmarks. Thread deadlocks and strange exceptions from some of these drivers are not uncommon. In some cases we believe the database server is capable of handling the load; it's the JDBC driver that can't.

JDBC is the primary interface to JDataStore. It's what we primarily test against, and it has to work for our customers. Failure is not an option. If you're using a native or pure Java database you really need to know that the JDBC driver is reliable and well maintained by the vendor.

Performance

When we were first developing JDataStore, many questioned whether Java performed well enough to write a database in. The answer is, "Most definitely." In fact, there are some performance benefits because JDataStore is written in Java. Key performance benefits over native databases include:

  • High-performance thread synchronization built into the Java language: HotSpot has exceptionally fast thread synchronization support. With or without HotSpot, I've yet to see where thread synchronization causes any significant performance penalty in JDataStore benchmarks.

  • In-process execution: The local JDBC driver for JDataStore runs in the same process as the Java application. The JDBC drivers for native databases that most developers use are almost always Type 4 or Type 3. This means the JDBC driver must communicate via a networking protocol with a native database. Many high-volume transactional applications send many short-duration requests to a database. The network overhead (even if the database is on the same machine as the application) is significant.

    To be fair, there are some aspects of Java that I see as potentially slower than what you could do by writing the database natively. We made important architectural decisions to ensure that potential Java performance bottlenecks didn't bite us. For example, byte array processing is slower in Java than C++ since array bound checks need to be made. The leaf nodes of JDataStore's secondary indexes are prefix compressed and of variable length. This minimizes the overhead of byte array processing in JDataStore's secondary index manipulations.
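For readers unfamiliar with prefix compression, here is a minimal sketch of the general technique, not JDataStore's actual index code. Each key in a sorted run is stored as the number of leading bytes it shares with the previous key plus the remaining suffix, so fewer bytes have to be stored, copied and compared. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of prefix compression over sorted keys (illustrative only). */
class PrefixCompression {
    /** One compressed entry: bytes shared with the previous key, plus the rest. */
    static final class Entry {
        final int sharedLength;
        final byte[] suffix;

        Entry(int sharedLength, byte[] suffix) {
            this.sharedLength = sharedLength;
            this.suffix = suffix;
        }
    }

    /** Compresses keys that are already in sorted order. */
    static List<Entry> compress(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] previous = new byte[0];
        for (byte[] key : sortedKeys) {
            int limit = Math.min(previous.length, key.length);
            int shared = 0;
            while (shared < limit && previous[shared] == key[shared]) {
                shared++;
            }
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            previous = key;
        }
        return out;
    }

    /** Rebuilds the full keys by replaying each entry against its predecessor. */
    static List<byte[]> decompress(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] previous = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedLength + e.suffix.length];
            System.arraycopy(previous, 0, key, 0, e.sharedLength);
            System.arraycopy(e.suffix, 0, key, e.sharedLength, e.suffix.length);
            out.add(key);
            previous = key;
        }
        return out;
    }
}
```

For the keys "smith", "smithson" and "smyth", the second entry stores only a shared length of 5 and the suffix "son", and the third a shared length of 2 and the suffix "yth".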

    The JIT compiler used with JDataStore can make a big difference in performance. In our experience the fastest JIT compiler for CPU-intensive operations is the Symantec JIT that ships with the classic WinTel JVM 1.2.2. In my experience HotSpot is faster only when garbage collection is a serious issue. On Linux, the IBM JIT is about the fastest I've seen for that platform, but it's still not as fast for CPU-intensive operations as the Symantec JIT.

    Ninety percent of all performance issues we encounter are resolved with architectural or algorithmic improvements.

Memory Footprint

    You can obtain full JDataStore functionality from about 700K of JAR files. This footprint can be reduced by eliminating classes needed for unused features. Even without pruning class files, this is still a very small footprint. The code size of many native databases is much larger than this.

    Another big footprint win: the local JDataStore database shares the same memory heap as the application. If you're using a native database, there's extra overhead for maintaining heaps in the native process and the JVM. Multiple memory heaps can waste memory since each heap typically preallocates memory in large chunks from the operating system. JDataStore also shares the same Java runtime libraries as the application. Native databases will most likely be using a large C runtime library.

Ease of Use

    I've already discussed how ease of deployment and portability make this a simple database to use. These are really big wins. The Inprise Application Server (IAS) team has incorporated JDataStore into their product for Java naming service, Java messaging service and as an EJB container persistence option. What they like about JDataStore is its ease of use and deployment. They also like the performance characteristics of JDataStore. Who likes to install Oracle servers on every platform deployed to?

    JDBC has data type support for all Java primitive types, Java date/time/timestamp, BigDecimal, binary, String, Object, etc. However, not all JDBC drivers support all of these data types. In fact, many don't even support all the Java primitive types. Another issue with native databases is that they convert some of their native data types to Java types for JDBC access. These conversions can result in precision loss for numerics. I don't know of any native database that has the same definition of time values as Java (i.e., time ranges supported). JDataStore supports the Java primitive types, BigDecimal, Time, Timestamp, Date, binary and String without any data conversion loss.
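The kind of precision loss described above is easy to reproduce in plain Java. The sketch below uses no database at all; it simply contrasts exact decimal arithmetic with the same values after conversion to binary floating point, which is roughly what happens when a driver maps a decimal column to `double`. The class name `NumericPrecision` is hypothetical.

```java
import java.math.BigDecimal;

/** Sketch: exact decimal arithmetic versus a lossy floating-point conversion. */
class NumericPrecision {
    /** Adds two decimal strings exactly, keeping every digit and the scale. */
    static BigDecimal exactSum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    /** Adds the same values after converting them to binary floating point. */
    static double lossySum(String a, String b) {
        return Double.parseDouble(a) + Double.parseDouble(b);
    }
}
```

Here `exactSum("0.10", "0.20")` yields exactly 0.30, while `lossySum("0.1", "0.2")` yields 0.30000000000000004.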

    JDataStore is a zero administration database. Log files are used for transaction support and crash recovery, but as soon as a log file is no longer needed for an active transaction it'll be deleted automatically.

    There are several tools provided for JDataStore:

    • The DataStore Explorer provides a nice viewer for browsing and editing the contents of a JDataStore database. It also provides a variety of utility functions including database creation, transaction management options, executing interactive queries, import/export from files or other JDBC data sources, table copy and data verification. It can be launched as a stand-alone application or from a JBuilder menu.

    • JDBC Explorer provides generic data and metadata browsing capability and interactive query support for any JDBC driver including JDataStore.

    • JBuilder's visual component designer is nicely integrated and provides RAD support for JDataStore.

Mobile and Local Store Application Support

    The JDBC API encourages a set-oriented approach to data access. In this approach, data access is performed on subsets of the data in relatively short-duration transactions. If this discipline isn't followed, the database server can have difficulty scaling to large numbers of concurrent users.
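A minimal sketch of that discipline, using only the standard `java.sql.Connection` methods, might look like the following. The `ShortTransactions` class and its `Work` interface are hypothetical helpers, not part of JDBC or JDataStore. The idea is to do one bounded piece of work, then commit or roll back immediately so locks are held as briefly as possible.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Sketch: running one short unit of work per transaction (illustrative only). */
class ShortTransactions {
    /** A unit of set-oriented work to run inside a single transaction. */
    interface Work {
        void run(Connection connection) throws SQLException;
    }

    /** Runs the work, commits on success, rolls back on failure. */
    static void inTransaction(Connection connection, Work work) throws SQLException {
        connection.setAutoCommit(false); // start an explicit transaction
        try {
            work.run(connection);
            connection.commit();
        } catch (SQLException | RuntimeException e) {
            connection.rollback(); // undo partial work before rethrowing
            throw e;
        } finally {
            connection.setAutoCommit(true); // restore the usual default mode
        }
    }
}
```

The work passed in would typically execute one or two statements against a small subset of rows rather than holding a cursor open across user interaction.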

    For mobile and local store computing models it's often desirable to have direct navigational access to the data. With this kind of access a table can be opened in a visual component where it can be browsed and edited directly. The DataExpress components provide a second API for accessing JDataStore data. By setting two properties on a DataExpress TableDataSet, large JDataStore tables can be directly navigated and edited. Since we also provide dbSwing GUI components that can bind directly to a DataExpress DataSet, it's trivial to create powerful applications quickly, using the RAD tools built into JBuilder. The following code shows how simple it is to bind a JDataStore table to a visual component.

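    import com.borland.datastore.DataStore;
    import com.borland.dbswing.JdbTable;
    import com.borland.dx.dataset.TableDataSet;

    JdbTable grid = new JdbTable(); // dbSwing grid control.
    TableDataSet customer = new TableDataSet();
    DataStore dataStore = new DataStore();
    dataStore.setUserName("sshaughnessy");
    dataStore.setFileName("/usr/sshaughnessy/Customers.jds"); // Database file.
    customer.setStoreName("Customer"); // Table name within the database.
    customer.setStore(dataStore);
    grid.setDataSet(customer); // Bind the table to the grid.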

    This could have been generated directly from designer interactions in the JBuilder visual component designer just by dropping three components and setting four properties. When the grid is shown it'll automatically be populated with the Customer table in the Customers.jds database. Note that if the Customer table contained a billion rows, you'd be able to navigate instantly from top to middle to end of table. The customer DataSet is providing a live cursor into the table. You can also edit the Customer table directly from the grid control.

    No native or pure Java database provides an equivalent capability.

Conclusion

    JDataStore is easier and more natural to use in a Java application than a native database. It makes portability and deployment nonissues. Because it was written in Java and designed with the Java platform in mind, it's much easier to develop applications with. JDataStore performance and memory footprint characteristics are excellent. Performance is actually enhanced because JDataStore is written in Java. The support for the JDBC API allows JDataStore to be a strong contender in server-side and local applications. The support for the DataExpress JavaBean components allows for high-powered RAD development of mobile/local store applications that can't be achieved with any other native or pure Java database.

About the Author

    Steve Shaughnessy is a senior staff engineer at Borland and a member of the JBuilder team that is developing data access components and the JDataStore embeddable database. He can be contacted at sshaughnessy@borland.com.

