Database Concepts

Basic Database Concepts

Data

Data refers to known facts that can be stored on a computer. This includes:

  • Customer names, addresses, and telephone numbers.

  • Documents, images, sound, and video segments.

  • Definition: Facts, text, images, sound, and video segments meaningful in the user’s environment.

Data vs. Information

Information is data that has been processed to increase the knowledge of the person using it.

  • Data Example: Raw list of names and numbers.

  • Information Example: Class roster with names, IDs, majors, and GPAs.

Metadata

Metadata are data that describes the properties or characteristics of other data, and includes data types, field sizes and allowable values.

File Processing System

A collection of application programs performing services for end-users, where each program defines and manages its own data.

  • Disadvantages: Program-data dependence, data duplication (redundancy), limited data sharing, lengthy development times, excessive program maintenance, and vulnerability to inconsistency.

  • Problems with Data Dependency: Each programmer maintains their own data and metadata, leading to non-standard file formats.

  • Problems with Data Redundancy: Waste of space, maintenance headaches, and inconsistencies when data changes in one file.

Database Approach

A central repository of shared data managed by a controlling agent, stored in a standardized form, which requires a Database and Database Management System (DBMS).

What is a Database?

An organized collection of logically related data arranged for easy storage, manipulation, and retrieval, describing a domain of interest to a group of users.

Terms and Concepts

  • Bit: Smallest unit of data; binary digit (0,1)

  • Byte: Group of bits representing a single character.

  • Column/Field: Group of words or a complete number.

  • Row/Record: Group of related fields.

  • File/Table: Group of records of the same type.

  • Database: Group of related files.

  • Entity: Person, place, thing, event about which information is maintained.

  • Attribute: Description of a particular entity.

  • Key field: Identifier field used to retrieve, update, sort a record

Data Hierarchy

Bit -> Byte -> Field -> Record -> File -> Database.

Database Approach vs. File Based System

Feature

File-based Approach

Database Approach

Data Redundancy

Data duplicated in different files.

Data appears only once.

Data Consistency

Inconsistencies may arise.

Data is always consistent.

Data Retrieval

Difficult to retrieve, stored separately.

Easily retrieved via linked tables.

Data Sharing

Data distributed, not easily shared.

Data stored centrally, easily shared.

Data Integrity/Security

Difficult to apply data integrity checks.

Provides many constraints for data security.

Security

Poor security, data is widely spread.

Provides many procedures to maintain security.

The Database Environment

Encourages data sharing, controls redundancy, improves accuracy, permits storage of vast data volumes with acceptable access. Offers database queries and tools for data security, privacy, backup, and recovery.

Types of Databases and Database Applications

  • Traditional: Numeric and Textual.

  • More Recent: Multimedia, Geographic Information Systems (GIS), Biological and Genome, Data Warehouses, Mobile.

  • Recent Developments: Social Networks, Search Engines, Big Data, NOSQL systems, cloud storage.

Database Management System (DBMS)

Software for creating and maintaining databases, managing data for the firm, acting as an interface between applications and physical data files, and separating logical and design views.

  • Advantages: Data sharing, data independence, controlled redundancy, better data integrity.

Schemas versus Instances

  • Database Schema: Description of a database including structure, data types, and constraints.

  • Schema Diagram: Illustrative display of a database schema.

  • Schema Construct: A component of the schema.

  • Database State: Actual data stored in a database at a particular moment (also called database instance).

  • Initial Database State: Database state initially loaded.

  • Valid State: State that satisfies the database structure and constraints.

  • Distinction: Schema changes infrequently, state changes with every update. Schema is also called intension, the state is also called extension.

Three-Level Schema Architecture

  • External Schema (View Level): How individual users view the data, hiding unnecessary details.

  • Conceptual Schema (Logical Level): Logical structure of the entire database, combining all user views.

  • Internal Schema (Physical Level): How and where the data is stored and accessed, including physical storage format and indexing.

Data Independence

  • Logical Data Independence: Capacity to change the conceptual schema without changing external schemas.

  • Physical Data Independence: Capacity to change the internal schema without changing the conceptual schema.