Database Technique - Lecture 5

DB113H: Database Technique - Lecture 5 Study Notes

Overview of Database Models

  • Old Models
    • Earlier models relied on file systems.
    • Applications required detailed knowledge of file structures to function.
    • This posed challenges during database reorganizations.
    • Lack of physical data independence was a significant drawback.

Relational Model

  • Introduction

    • Codd's Relational Model, introduced in 1969 by Edgar F. Codd, characterized by mathematical notation of relations.
  • Components of the Relational Model

    • A relation variable is defined as a table (denoted by R).
    • Attributes correspond to columns in a table and can take non-ordered formats (Attribute 1, …, Attribute n).
    • Tuples are the rows in a table, also non-ordered, and can be referred to as records.

Central Idea of Relational Model

  • The relational model describes a database as a collection of predicates over a finite set of predicate variables, which outline constraints on valid values and value combinations.
  • At any moment, the database content is a finite logical model consisting of a set of relations corresponding to predicate variables, fulfilling all predicates.
  • Database queries function as predicates as well.

Representation and Structure

  • Tables as Relations
    • The relation concept is physically presented as a table, resembling a two-dimensional array.
    • Columns correspond to attributes while the domain of an attribute is defined by its permissible values.
    • Each row (tuple) represents an individual record or entity instance.

Integrity Constraints

  • Definition
    • Integrity constraints (ICs) are rules or restrictions applied to the database to ensure data validity.
  • Role of DBMS
    • The Database Management System (DBMS) enforces these rules, ensuring that entered data constitutes a legal instance.
  • Importance
    • Preserving integrity is crucial for correctness and consistency of data in the database, preventing the entry of inconsistent data.

ER-model to Relational Schema Mapping

  • Conversion Process
    • Implement simple ER-diagrams into logical models using DBMS tools like MySQL.
    • Each entity set translates into a table.
Naming Conventions for Tables
  • Consistency for developers is vital. Different developers might have varied naming standards; adhering to one convention simplifies collaborative efforts.

    • E.g., considering singular vs plural naming: object-oriented programming favors singular nouns (e.g., Bike, Car).
  • A well-designed database could outlast its applications. Carefully established naming conventions save time in future developments.

  • Singular vs. Plural

    • While tables can represent multiple entries (zero, one, many), naming should reflect the singular form to avoid confusion regarding the number of records.
    • Easier implementation of singular names over plural alternatives.
  • Using Underscores vs. Camel Case

    • It is suggested to utilize underscores (_) for word separation instead of camel case for better readability. SQL is case-insensitive; thus, maintaining consistent use of lower or upper case is recommended.
    • E.g., Select Statement Examples:
      • SELECT * FROM ActingApplicationScheduleManager;
      • SELECT * FROM acting_application_schedule_manager;
    • Adapt conventions based on user preferences or customer requirements.

Assigning Attributes in Relational Schema

  • While defining tables like Patient, start by assigning attributes alongside their datatype.
    • Example SQL Syntax:
      sql CREATE TABLE patient ( patient_id INT, name VARCHAR(45), address VARCHAR(45) );
  • Adding Primary Key
    • It is common to assign a primary key during table creation. For our Patient table, the patient_id will serve as the primary key:
      sql ALTER TABLE patient ADD PRIMARY KEY (patient_id);
  • A similar process applies for other entity sets like visit, sometimes necessitating surrogate keys if natural keys aren't definable:
    sql ALTER TABLE visit ADD PRIMARY KEY (visit_id);

Mapping Relationships and Keys

  • Relationships are handled within relational schemas, notably for entities like Visit.

  • Foreign Keys

    • When a relationship has no attributes, it doesn’t necessitate a separate table. Otherwise, foreign keys should be established. E.g., for the visit table:
      sql ALTER TABLE visit ADD FOREIGN KEY(patient_patient_id) REFERENCES patient(patient_id);
  • Cardinality

    • Cardinality explains the relationship complexity between tables:
      • One-to-One (1:1): One user profile correlates with one person.
      • One-to-Many (1:M): One patient can attend multiple visits.
      • Many-to-Many (M:N): Represents more complex associations requiring extra tables; e.g., enrollment in multiple courses.

Handling Multiple Values in Attributes

  • Some attributes can support multiple values. For instance, a teacher may hold several phone numbers.
  • Best Practices
    • Instead of combining multiple entries in one column, create a separate table to accommodate such entries more efficiently.
      sql CREATE TABLE phone_nbr ( phone_nbr VARCHAR(10), teacher_teacher_id INT );

Engineering Concepts

  • Reverse Engineering: Extracting data structures from existing databases to visualize schemas.
  • Forward Engineering: Transforming database structures from design models into actual databases in a systematic way. E.g., creating Class, Student, and Enroll tables with defined primary and foreign keys.

Data Types in SQL

  • Definition and Importance

    • SQL data types define data nature stored in database objects like tables. Appropriate data types help maintain data integrity, optimize storage efficiency, and enhance query performance.
  • Examples of SQL Data Types:

    • Exact Numeric Types: bigint, int, smallint, tinyint with defined ranges.
    • Approximate Numeric Types: e.g., float(n), where n defines precision.
  • Data Type Characteristics:

    • Data Integrity: Prevent non-appropriate types, e.g., inserting non-integer values into an integer column leads to errors.
    • Storage Efficiency: Utilizing smaller data types optimizes resources, reducing storage.
    • Query Performance: Numeric data types facilitate faster processing compared to strings, yielding quicker query execution.
  • Common SQL Data Types Examples:

    • Fixed-length using char: 8000 characters max.
    • Variable-length using varchar: 8000 characters or up to 2GB.
    • Date: stores in the YYYY-MM-DD format (example: "date 2020-06-12").