Note

0.0(0)

Take a practice test

Chat with Kai

undefined Flashcards

Knowt Play

Explore Top Notes

Acids Bases and Salts

Studied by 60 people

Introduction to Cells and Organelles

Studied by 99 people

Ch 22 - Economic inequality and poverty

Studied by 43 people

Studied by 81 people

Chapter 19: Computer Basics

Studied by 17 people

AQA GCSE Biology Trilogy: Food Tests

Studied by 77 people

Lecture2 - Relational Models and Database Schema

Data Models Overview

A data model is a collection of high-level description concepts for describing data. It hides low-level storage details.
It describes the structure of a database and constraints the database should obey.
A schema is a description of a particular collection of data using a given data model.

Data Model Categories

Conceptual (high-level, semantic) data models
- Close to how users perceive data.
- Concepts: entities, attributes, relationships.
- Examples:
- Entity: real-world object like student, lecturer.
- Attribute: property of an entity.
- Relationship: association between entities (e.g., teaches).
Physical (low-level, internal) data models
- Describe how data is stored in the computer.
Implementation (representational) data models
- Between conceptual and physical, balancing user view with storage details.
- Includes relational, network, and hierarchical models.

Data Models in Practice

ER model is commonly used for database design.
Database design in the ER model is usually converted to the relational model for storage and processing.

Entity-Relationship (ER) Model

Core concepts:
- Entities (objects): e.g., customers, accounts, bank branches.
- Relationships: associations among entities (e.g., a depositor relationship between customers and accounts).
Relationship sets illustrate how entities interact (e.g., Depositor links customers to accounts).
Widely used for database design; a typical workflow converts ER designs to relational schemas for implementation.

ER Model Example (from the transcript)

Example schema components (illustrative):
- Entities: Customer, Account
- Attributes: customer-id, customer-name, customer-street, customer-city, account-number, balance
- Relationship: Depositor between Customer and Account
Visual representation shows entities with attributes and lines for relationships.

The Relational Data Model

Key idea: data is stored in relations (tables).
Schema:
\text{RelationName}(field1 : type1 , \dots , fieldn : typen )
Query language: SQL (declarative). Example:

  SELECT balance
  FROM account
  WHERE branch = 'Springfield';

Characteristics:
- Programmer specifies what results are needed, not how to execute.
- DBMS selects execution strategy.
- Provides physical data independence: applications don’t need to know how data is stored.

Relational Model: Structural Independence and Query Power

Structural independence: one can implement various associations (one-to-one, one-to-many, many-to-many) easily in the relational model.
Relational model has powerful and flexible query capabilities.

Relational Model Example 2

A sample relational database includes relations such as:
- Students(sid: string, name: string, age: integer, gpa: real)
An instance of the relation can be represented as a table with rows (tuples) and columns (attributes).
Example data excerpt (simplified):
- customer-id, customer-name, customer-street, customer-city, phone, etc. (illustrative population from the transcript)

Relational Model Terminology

Relation: a table with columns (attributes) and rows (tuples).
Attribute: a named column of a relation.
Domain: set of allowable values for an attribute.
Tuple: a row of a relation.
Degree (arity): number of attributes in a relation.
Cardinality: number of tuples in a relation.
Relational Database: a collection of normalized relations with distinct relation names.
Relation name: unique among relation names in the schema.
Atomicity: each cell contains exactly one value.
Attribute names: distinct.
Domain-consistency: values of an attribute come from the same domain.
Tuple distinctness: no duplicate tuples.
Order irrelevance: order of attributes and tuples is not significant (theoretically).

Relational Keys

Superkey: a set of attributes that uniquely identifies a tuple within a relation.
Candidate Key: a minimal superkey (no proper subset is a superkey).
Primary Key: chosen candidate key to identify tuples uniquely.
Alternate Keys: candidate keys not chosen as primary key.
Foreign Key: attribute(s) in one relation that reference a candidate key of another relation.

Nulls and Integrity

NULL values represent missing or inapplicable data.
- Meanings:
- Value unknown.
- Value exists but is not available.
- Attribute does not apply to this tuple (value undefined).
NULLs are not the same as zero or spaces.

Database Constraints

Entity Integrity (primary key): no primary key value can be NULL.
Referential Integrity (foreign key): FK values must reference an existing PK value in the referenced relation, or be NULL (with caveats).
Domain constraints: data types, ranges, and formats enforced by the domain.
Semantic (business) integrity constraints: application-specific rules that may require a constraint language to express (e.g., max hours per employee on a project is 56 hours per week).

Examples of Schema and Relations

Example schema (COMPANY): includes EMPLOYEE, DEPARTMENT, PROJECT, WORKS_ON, DEPENDENT, etc., with keys and relationships depicted.
Visuals in the transcript show underlined primary keys and arrows for referential integrity constraints.
Example attributes and keys include SSN (primary key for EMPLOYEE), DNUMBER (for DEPARTMENT), PNUMBER (for PROJECT), ESSN (FK in WORKS_ON referencing EMPLOYEE.SSN), etc.

Updating Relations and Integrity

Update operations: INSERT, DELETE, MODIFY a tuple.
Integrity constraints must not be violated by updates.
Updates may be grouped; changes may cascade to maintain integrity (CASCADE, SET NULL) or trigger additional updates.

Update Scenarios

If an integrity violation would occur, possible actions include:
- Reject the operation.
- Proceed with the operation but raise a violation warning.
- Trigger cascading updates to restore consistency.
- Execute a user-defined error-correction routine.

Schema, Schema Diagrams, and Instances

Database Schema (intension): descriptions of the database structure and constraints.
Schema Diagram: a diagrammatic display of aspects of a database schema, including names of relations and some constraints.
Database Instance/State (extension): the actual data at a moment in time.
A new database starts empty (no data); initial state includes initial data.
Schema changes infrequently; instance/state changes on updates.

Three-Schema Architecture

Levels:
- External level (Views): multiple views tailored to user groups.
- Conceptual level (Logical): defines the community-wide structure and constraints.
- Internal level (Physical): describes storage structures and access paths.
A single conceptual schema, multiple external schemas, and potentially multiple internal schemas exist.

External, Conceptual, and Internal Schemas

External schemas provide customized access per user group (views).
Conceptual schema describes the entire database structure for a community of users.
Internal schema describes how data is stored physically, including files and indexes.

Conceptual Schema Design and Internal Details

Conceptual schema design involves determining relations (entities), attributes, data types, relationships, and constraints.
Internal schema specifies physical storage, indexes, and access paths; decisions depend on data access patterns and DBMS capabilities.
The external schema refines the conceptual schema to meet user needs and security requirements.

Data Independence

Data Independence:
- Logical data independence: ability to change the conceptual schema without changing external schemas or applications.
- Physical data independence: ability to change the internal schema without changing the conceptual schema.
When a lower-level schema changes, mappings to higher-level schemas change, while higher-level schemas remain unchanged.

DBMS Languages and Processing

High-level (non-procedural) languages: e.g., SQL. Set-oriented; specify what data to retrieve, not how.
Low-level (procedural) languages: record-at-a-time; specify how to retrieve data; include loops and other control structures.
DBMS components:
- Storage Manager: interface between DBMS and physical data stored in files.
- Query Processor: compiles and evaluates queries; includes DDL interpreter, DML compiler, and the query evaluation engine.

Storage Manager Details

Responsibilities:
- Interface between low-level data storage and applications/queries.
- Translates DML statements into file operations.
- Stores, retrieves, and updates data.
Data structures managed: data files, data dictionary (metadata), indices for fast access.

Query Processor Details

Components:
- DDL Interpreter: processes DDL statements and updates the data dictionary.
- DML Compiler: translates DML into an evaluation plan understood by the query engine.
- Query Evaluation Engine: executes the low-level instructions.

Client-Server Architectures

Two-tier (client-server) architecture:
- Tier 1: Client – manages user interface and applications.
- Tier 2: Database server – holds database and DBMS; performs server-side validation and data access.
Advantages of two-tier:
- Wider access to databases.
- Increased performance and potential cost reductions.
- Reduced client hardware costs and communication costs.
- Improved consistency.

Three-Tier Client Architecture

Evolution to three layers to address scalability:
- Tier 1: Client – user interface.
- Tier 2: Application server – business logic and data processing logic.
- Tier 3: Database server – data validation and database access.
Advantages of three-tier:
- Client can have lighter hardware.
- Centralized application maintenance.
- Easier modification or replacement of one tier with minimal impact on others.
- Better support for load balancing and web environments.

Practical Connections and Real-World Relevance

Relational model provides a flexible, scalable basis for modern database systems used across industries.
Separation of concerns (schema vs data, external vs conceptual vs internal) enables evolution: you can change how data is stored without breaking applications.
Integrity constraints and normalization principles protect data quality and consistency in real-world systems.
Understanding keys, constraints, and update propagation is essential for reliable database design and maintenance.

Quick Reference Concepts (Recap)

Data model, schema, instance/state, intension vs extension.
ER model, entities, relationships, relationship sets.
Relational model: relations, attributes, domains, tuples, keys, and constraints.
Keys: superkey, candidate key, primary key, alternate keys, foreign key.
NULL values and their meanings.
Integrity constraints: entity, referential, domain, semantic.
Update operations and cascading effects.
Schema diagrams and example schemas (e.g., COMPANY).
Three-schema architecture (external/conceptual/internal) and data independence.
DBMS architecture: storage manager, query processor, and their subcomponents.
Client-server architectures: two-tier and three-tier, with associated advantages.
Declarative vs procedural languages; SQL as an example of a high-level language.
Examples of relational schemas and relations from the transcript (e.g., EMPLOYEE, DEPARTMENT, PROJECT, WORKS_ON) to illustrate keys and constraints.

Note: An explicit example mentioned in the material is the constraint: the max hours per employee on all projects is 56 hours per week, illustrating semantic integrity constraints.

Note

0.0(0)

Take a practice test

Chat with Kai

undefined Flashcards

Knowt Play

Explore Top Notes

Acids Bases and Salts

Studied by 60 people

Introduction to Cells and Organelles

Studied by 99 people

Ch 22 - Economic inequality and poverty

Studied by 43 people

Studied by 81 people

Chapter 19: Computer Basics

Studied by 17 people

AQA GCSE Biology Trilogy: Food Tests

Studied by 77 people