DBMS Concepts:
This chapter discusses some necessary and some desirable features to look for in a DBMS. Each should be implemented as an available service in the DBMS, even if you have no need for all of them.Update and retrieval are necessary features. A system that does not allow you to change, add, or delete data would not be useful. Update includes the ability to change, add, and delete records.
Catalog services are used to describe a DBMS. A catalog would include descriptions of all tables, of the relations between them, of the data types used in the tables, and of the constraints used for each field. Some systems refer to this kind of reference as a data dictionary. The text refers to the concept as metadata about the DBMS.
Concurrent update problems occur when multiple users input data into tables in a way that causes the loss of some of that data. A term used here is transaction, which is defined as the steps needed to complete a single task. Some classic scenarios are offered for overcoming this problem:
- batch processing - have all transactions stored in a queue, then processed in turn by an update program. The text describes a system that might store transactions and process all updates once a day. This would be fine, as long as the next user did not need to know the effects of the last user's updates. A system that provides real time access to inventory data, or one that performs registrations for classes are examples of systems that cannot wait very long for the update process to run.
- locking - a DBMS may allow for locking single or multiple records in a table, or it may allow for locking entire tables. Data is locked so that other users cannot access that data until it is unlocked. This provides for a clean environment in which you are not allowed to access data until you have the freedom to change it. The text describes a two-phase approach to locking. The first phase is the growing phase, in which locks are applied to as many records or tables as necessary to accomplish the user's transaction uninterrupted. The second phase is the shrinking phase, in which locks are released as the changes are finished and the locks are no longer needed.
- timestamping - this approach assigns a time to each request, and it processes them in chronological order. This is like batch processing, in that an order of events is assigned, but unlike it in that we do not wait an arbitrary time for the processing.
Data recovery is necessary on any computer system that saves important data. A DBMS system may use standard backup tape systems, but it may have some additional features. Journaling allows the system to make a record of transactions, a log file that can be used in case a restoration is necessary. The concept leads to a couple of scenarios:
- forward recovery - Assume that there is a disaster causing us to restore the DBMS to the state is was in two days ago, as that is the date of our last reliable backup. Forward recovery would then allow us to process the journal of transactions to bring the DBMS back to the present. Obviously, this requires that journals be kept around for some time to make this possible.
- backward recovery - also called rollback, this is the process of backing away from an error condition by restoring the system to an earlier state. You can do this by simply performing a restore, or by undoing the transactions in a journal in reverse order.
- encryption - storing and transmitting files in an encrypted mode so that they are not readable unless you are running the DBMS
- authentication - authentication typically involves a user ID and a password to gain access to the system. Biometric authentication is used in some systems, which may require a fingerprint or other physical ID in addition to an ID and password.
- authorization - authorization is the granting of rights to parts of the system to particular users. This can include allowing groups of users to see only parts of the database (views).
A DBMS should support data independence in that it should allow for changes to the data and changes to the tables without having to rebuild the entire system. Change examples listed include changing the length of a field, adding fields, and creating indexes.
Data replication is the process of keeping multiple copies of the database in multiple places. This may be done for fault tolerance or it may be done to make the data more accessible to users in distant locations. Multiple copies of the data require methods to synchronize the copies. Updates may be allowed in one or more copies, but all changes must be propagated to all other copies.
Utility services are services to allow you to perform actions associated with the operating system in use, like file copying, running other programs, and writing and running script files.