Base One's Database Library and terminology

.NET database and distributed computing tools

Base One's Database Library and Terminology

Contents
Database terminology Comparison with relational Transaction processing

Database Terminology

Base One Rich Client database terminology differs from that of the different back-ends (DBMSs) supported by the Database Library. The aim is to be easier to understand than relational terminology, unambiguous and consistent. You can easily convert Rich Client terminology into relational SQL terminology (as described below).

A Database is an organized collection of related information. This data is broken down into Records, each of which represents information about some particular entity. Each record is further subdivided into Database Fields, which contain individual data values. A database field can hold a number, a character string, a date, a time, or an Attached Object, such as a picture, a word processor document, a spreadsheet, a text memo, or even music or a video clip.

Records are grouped according to their Record Types. Each record type has a distinctive field layout, which is described in the Data Dictionary contained in the database. There is a data dictionary record for each record type (table) and each field (column) in that record type's layout.

Record types are themselves grouped into database Regions, with each region containing at most one record type plus any number of subregions. Every record type belongs to one and only one region, but since a region can contain subregions which also contain record types, a region can be thought of as containing a collection of record types. For example, the security region consists of all the different record types used to manage database security. Regions without record types can be created for the sole purpose of organizing the record types contained in the subregions. There is a data dictionary record for each database region.

For rapid retrieval and ordering of records, a database uses indexing. An Index is a sorted collection of Keys. Each key is associated with a single record and has one or more Key Fields of information about that record. For any given record type, one index is required for each type of key needed for fast access. There is a data dictionary record for each index.

Comparison with Relational Architecture

A Rich Client "Record Type" is like a "table" in relational terms and a Rich Client "Record" is approximately equivalent to a "row". (In the world of SQL and spreadsheets, records are called "rows".) A "Database Field" is similar to a "column" in relational terminology. For example, a Base One database may be said to contain 1000 client records, with each record having the fields defined by the client record type, whereas in relational terms the same database has a client table with 1000 rows, whose columns are specified by the client table definition. In Rich Client architecture, a "Record Type" is what it sounds like, that is, a type of record, such as a personnel record. In relational terminology, the term "table" is ambiguous because database views and result sets are also called "tables". (To avoid ambiguity, some database vendors call a collection of records having the same record type a "base table".)

Every Rich Client database has a "Database Name", which is its logical name. If the database is being accessed through Microsoft ODBC, the database name should be the ODBC Datasource Name. For Oracle DirectAPI, you can choose any DbNam because it is used for informational and error reporting purposes only (but is still required). A database usually has a "Database Server ID", which is a back-end and communication specific string which locates the database. For example, the Database Server ID might be the string "(local)", the NT Server name, an SQLNet string, or a firewall IP Address.

A comparison with Oracle, for example, is useful since a Rich Client database is a little different from an Oracle database. In the Rich Client model, a database is a collection of related information. In Oracle, data in different schemas are all part of the same database, even if unrelated. For example, you might have a production schema and a schema used for training, both in the same Oracle database. In Rich Client terminology, these would be considered to be two different databases; the training database is not related to the production database in the sense that the data has different financial value, rules for backup, security considerations and overall importance. For these reasons, in Rich Client architecture, the database must always exist within a single DBMS schema. However, this does not prevent applications from running against multiple databases at the same time.

The INSERT command is used to add records in SQL, whereas the Rich Client functions are AddRec(), AddSet() and LoadRecs(). In SQL, you use UPDATE to change the contents of a record. The Rich Client function is ChgRec(). DELETE is used in SQL, and DelRec() and DelSet() functions exist in Rich Client Architecture for deleting records. In addition to providing such easy-to-use functions, the Database subsystem provides direct access to SQL INSERT, UPDATE and DELETE commands through its Command Processor.

Transaction processing in RDBMSs requires explicit COMMIT and ROLLBACK commands. Not only are these supported, but also most Base One data modification functions, such as AddRec(), ChgRec(), and DelRec(), provide transaction processing options to simplify programming and improve performance. For example, AddRec() can be called with no arguments, and the default will be to commit on success and rollback on failure. This is important for client/ server programming and three-tier Internet architecture where the middle tier can do the commits or rollbacks without going back to the client workstation for instructions. Explicit COMMIT or ROLLBACK commands are also supported.

The SELECT command is used to retrieve data in SQL, whereas the Rich Client functions are GetRec(), FindRec() and GetRecSet(). In Rich Client Architecture, a "Set" is a collection of one or more records or rows of data either returned from a query or to be used for a bulk operation, such as adding multiple records. Thus, a Rich Client "set" is similar to relational database terms such as "RowSet" or "RecordSet". All of these types of "sets" can be thought of blocks of results (and are to be distinguished from "Result Set", which is the entire query results that can span many blocks).

When a Rich Client query is first executed, data is always returned without an explicit command, such as FETCH, which is required for most relational database implementations. Similarly, as a program moves through a result set, rows of data are also automatically moved into memory without the programmer keeping track of the buffer size and explicitly fetching the next block of results.

Rich Client architecture differs greatly from relational database architecture in its attitude towards indexing. Rich Client assumes that indexing is of primary concern to programmers, and the orderings that can be handled efficiently are an essential property of the database. The functions for data retrieval provide the programmer with a unique way to automatically ensure that the retrieval is done through an appropriate index key rather than permitting the database to sort large numbers of records. This simplifies programming, helps guarantee good performance during scrolling, and greatly reduces the need for custom tuning to achieve acceptable response time.

Rich Client attempts to simplify the confusion between nulls and blanks in SQL. In the traditional relational model, a value is null if it is "unknown", and this must be distinguished, for example, from a value consisting of all blank characters.

In the Rich Client model, null means "blank": a null field is a blank field. Nulls and blanks fields are treated as equivalent (to the extent possible given the underlying RDBMS). The relational definition of null as "unknown" causes a number of hard to understand consequences. For example, two fields both having null values are not considered to be equal. In fact, in SQL a value of NULL is not considered to be greater than, less than, different from, or equivalent to any other value, including another value of NULL.

The Database Field Dictionary specifies whether a field can be null. The check for null is made on the client side whenever possible. If the field is not allowed to be null, then it is also not allowed to be blank.

Transaction Processing

In the world of large, multi-user databases, performance can make or break a project just as surely as any bug. The Database Library samples and documentation provide guidance in designing efficient transaction processing systems, which prevent the all-too-easy mistake of building a working prototype that becomes a total failure in a full-scale production environment. The Database Library was designed not just to create reliable database applications quickly, but also to make it next to impossible to fall into the usual performance and deadlock pitfalls.

Transaction processing plays a key role in Base One's database-centric architecture because it is crucial in coordinating multiple users doing data entry, and in reliable recovery from hardware and software failures. Transaction processing's most important purpose is to insure that all database changes that are part of a single logical (business) transaction appear completely or not at all. Users must be prevented from doing a "dirty read" and seeing another process' in-flight transaction, which may subsequently fail.

Rich Client architecture uses optimistic concurrency control to coordinate multi-user access to the same records. Two users can retrieve the same record, but only one user at a time can alter it. If they both try to update it, the changes of the first user to save the record take effect. The second user's changes are rejected, and that user is notified that some other process has modified the record.

A transaction defines a set of events that are guaranteed to be committed, making all the changes permanent, or rolled back (aborted) as a unit. The need to aggregate individual actions into indivisible transactions is a core requirement of most business systems. For example, transferring $1000 from your savings account to your checking account requires both subtracting the money from your savings account and adding it to your checking account. Either both operations should occur or neither.

A related objective of transaction processing is to make it possible to gather data into a single, consistent snapshot, which means, for example, that all data on a given report or screen actually existed together at the same point in time. Local storage (caching) of data significantly speeds up response time, but must not conflict with the need to both know and control exactly how old some of that data is (Last Refresh Date/Time).

You may have the experience of using an application in which it takes some time to realize you really need to do a manual REFRESH because you're looking at old data or a strange mix of old data and new data. Transaction processing is a core requirement for building database applications that give end users the sense that they can rely on the consistency and timeliness of the data they are seeing on the screen. Of course, out of sync data can not only be disconcerting but dangerous to database integrity (correctness). For example, it can be annoying when there might be 10,000 newly added records in front of what the local application stubbornly insists is the first record. Or there could be serious repercussions if a cash machine authorizes a cash withdrawal based on a local copy of a bank account record, shortly after that record was deleted for security reasons from the central database.

Base One's Rich Client database architecture follows the formal concepts of transaction processing, sometimes called the ACID test:

Atomicity: All state changes within a transaction are atomic - either all changes happen or no changes happen.
Consistency: A transaction is a correct transformation of state. The aggregate changes represented by a single transaction do not leave a corrupt or inconsistent state.
Isolation: While two or more transactions may execute concurrently within the system, the net change is as if all transactions executed sequentially.
Durability: Once a transaction successfully commits, the resulting changes will survive a system failure.

To continue the cash machine example: a bank debit transaction is atomic if it both dispenses cash and updates your account. It is consistent if the money dispensed is the same as the debit to the account. It is isolated if the transaction can take place even though other programs may be reading and writing your account concurrently (for example, your spouse making a deposit at the same time). And it is durable (stable, persistent) if, once the transaction is complete, the account balance is sure to reflect the withdrawal (even if, for example, the cash machine should crash immediately after the withdrawal).

Programmers need transaction processing facilities to make sure they can easily and efficiently defend against the strange coincidences that take place when a number of users bang on a database. Scroll Cache is an example of a Base/1 database architecture feature that makes it simple to combine local data caching with transaction processing rigor. Another example is the Last Modification Key mechanism used to efficiently handle multiple users simultaneously trying to change or delete the same record. This use of transaction processing facilities is sometimes called "optimistic" concurrency control, because it is based on an assumption that, although important, such collisions are rare, and it is therefore much more efficient to spot conflicts as they occur, rather than "pessimistically" locking records for the entire time a user is making changes.

The true costs to performance of using transaction processing to maintain data integrity are not usually apparent until large numbers of users and/or large amounts of data are involved. Too often, applications that work perfectly in prototype have dismal performance once put into production. The reason is that great care and clear awareness of the pitfalls are required in programming large scale, multi-user systems. Otherwise, the database locking protocols used for transaction processing cause users to seriously interfere with each other.

The Base/1 architecture automatically handles many of the details needed for efficient transaction processing. RDBMSs require an explicit COMMIT command to complete a transaction and an explicit ROLLBACK command to abort a failed transaction. Most Base/1 data modification functions, such as AddRec(), ChgRec(), DelRec(), AddSet(), LoadRecs(), provide automated handling of transaction processing. Some database functions include options for custom tailoring of transaction handling, and others automatically handle releasing locks. The objective is to improve performance, simplify programming, and reduce the chance of transaction processing errors that may not show up until the application is put into production.

More details about Base One's Database classes ...


Home	Products	Consulting	Case Studies	Order	Contents	Contact	About Us

Copyright © 2012, Base One International Corporation