Database Architecture Notes
Database Architecture
The Three-Level ANSI-SPARC Architecture
- In 1978, the ANSI/SPARC committee proposed a generalized framework for database systems, distinguishing between the logical and physical representation of data.
- This framework introduced a three-level architecture, representing three levels of abstraction for viewing the database.
Need for Abstraction
- The primary objective of the three-level architecture is to separate each user’s view of the database from its physical representation.
- Reasons for this separation:
- Users should be able to access the same data but have different customized views.
- Users should be able to change their view of the data without affecting other users.
- Users should not have to deal directly with physical database storage details; interaction should be independent of storage considerations.
- The Database Administrator (DBA) should be able to change database storage structures without affecting users’ views.
- Changes to physical storage aspects (e.g., new storage device) should not affect the internal structure of the database.
- The DBA should be able to change the conceptual structure without affecting all users.
External Level
- Represents the users’ view of the database.
- Describes the part of the database relevant to each user.
- Consists of various external views; each user has a view of the ‘real world’ in a familiar format.
- Includes only entities, attributes, and relationships of interest to the user.
- Different views can have different representations of the same data.
- May include derived or calculated data, which is not stored but created when needed.
Conceptual Level
- Represents the community view of the database.
- Describes what data is stored and the relationships among the data.
- The middle level in the three-level architecture.
- Contains the logical structure of the entire database, as seen by the DBA.
- Provides a complete view of the organization's data requirements, independent of storage considerations.
- The conceptual level represents:
- All entities, their attributes, and relationships.
- Constraints on the data.
- Semantic information about the data.
- Security and integrity information.
- Supports each external view; any data available to a user must be contained in or derivable from this level.
- Must not contain any storage-dependent details.
Internal Level
- Represents the physical representation of the database on the computer.
- Describes how the data is stored in the database.
- Covers the physical implementation of the database to achieve optimal runtime performance and storage space utilization.
- Covers the data structures and file organizations used to store data on storage devices.
- Interfaces with the operating system access methods (file management techniques) to place and retrieve data.
- Concerned with:
- Storage space allocation for data and indexes.
- Record descriptions for storage (with stored sizes for data items).
- Record placement.
- Data compression and data encryption techniques.
Data Independence
- A major objective of the three-level architecture is data independence, meaning upper levels are unaffected by changes to lower levels.
- Two kinds of data independence:
- Logical data independence
- Physical data independence
Logical Data Independence
- Refers to the immunity of external schemas to changes in the conceptual schema.
- Changes to the conceptual schema (e.g., addition/removal of entities, attributes, or relationships) should be possible without changing existing external schemas or rewriting application programs.
- Users for whom changes are made need to be aware, but other users should not be affected.
Physical Data Independence
- Refers to the immunity of the conceptual schema to changes in the internal schema.
- Changes to the internal schema (e.g., different file organizations, storage structures, devices, modifying indexes, or hashing algorithms) should be possible without changing the conceptual or external schemas.
- From the users’ viewpoint, the only noticeable effect may be a change in performance.
Data Models
- A model is a representation of ‘real world’ objects, events, and their associations.
- A data model represents the organization itself.
- Should provide basic concepts and notations for database designers and end-users to communicate their understanding of organizational data.
- Components:
- A structural part, consisting of a set of rules for database construction.
- A manipulative part, defining allowed operations on the data (updating, retrieving, changing structure).
- A set of integrity constraints to ensure data accuracy.
- Data models fall into three broad categories:
- Object-based
- Record-based
- Physical data models
- The first two are used at the conceptual and external levels; the latter is used at the internal level.
Object-Based Logical Model
- Contains concepts such as entities, attributes, and relationships.
- First developed by Peter Chen in 1976.
- Common types:
- Entity–Relationship
- Functional
- Object-Oriented
- An entity is a distinct object (person, place, thing, concept, event) in the organization to be represented in the database.
- An attribute is a property describing some aspect of the object we wish to record.
- A relationship is an association between entities.
- The object-oriented data model extends the definition of an entity to include both attributes (state) and actions (behavior).
- The object encapsulates both state and behavior.
- Object-based logical models are then transformed into record-based logical models.
Record-Based Model
Record-Based Model or Logical Data Model or simply Database Model is logical conversion of Entity Relationship Model
Visualizes the logical model, which in turn is the basis for the physical implementation (database tables, indexes, etc.).
In a record-based model, the database consists of fixed-format records, possibly of differing types.
Three principal types:
- Relational data model
- Network data model
- Hierarchical data model
Choice of data model depends on the relationship between various pieces of data
Each model is best suited for a particular class of problems depending on:
- Nature of the data relationships
- The need for flexibility
- The volume of changes to the database
The hierarchical and network data models were developed before the relational data model, so their links to traditional file processing concepts are more evident.
Hierarchical Model
- The hierarchical DBMS is used to model one-to-many relationships, presenting data in a treelike structure.
- Within each record, data elements are organized into segments.
- To the user, each record looks like an organizational chart with one top-level segment called the root.
- An upper segment is connected logically to a lower segment in a parent–child relationship.
- A parent segment can have more than one child, but a child can have only one parent.
- Hierarchical structures depict one-to-many relationships.
Network Model
- In the network model, data is represented as collections of records, and relationships are represented by sets.
- In network model parents can have multiple children, and a child can have more than one parent.
- Network DBMS depict data logically as many-to-many relationships.
- A 1:N, or one-to-many, relationship relates one instance of a record to many record instances using some pointer linking mechanism in these model.
Relational Database Model
- The relational data model is based on the concept of mathematical relations.
- In the relational model, data and relationships are represented as tables, each of which has a number of columns with a unique name.
- Relational databases permit more flexible combination and easier retrieval of information than other models.
- Data in one table can be related to any piece of data in another table as long as both tables share a common element.
Physical Database Model
- Physical data models describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths.
- Its just Data Definition Language(DDL)
- DDL provides the semantics for administering all the physical objects In your database that is stored in the data dictionary
- The result of compilation of DDL Operation is a set of tables, views, roles, users and other objects as physical database