Chap01-04eng_forStud

Database Definition

Instructor: Asst. Prof. Khantharat Anekboon, Ph.D.Institution: CIS, KMUTNB

Learning Objectives

  • Distinguish between data and information.

  • Understand the evolution from databases to file systems.

  • Explore the database system environment.

  • Identify the main functions of a Database Management System (DBMS).

Importance of Data

  • Operating a business without data leads to unstructured and unquantifiable operations.

  • Key Components to Know:

    • Customers

    • Products

    • Employees

Data and Information

  • Good decision-making requires qualitative information derived from raw data, referred to as data.

  • Efficient data management involves the use of databases, which have evolved from traditional file systems.

Data vs. Information

  • Data are raw, unprocessed facts and figures.

  • Example: Admission forms of students contain raw facts such as names, addresses, etc.

  • Examination data shows raw scores without context.

Using Surveys for Data Collection

  • Surveys can assess user perceptions, utilizing web forms for data collection.

  • This step gathers raw data about user experience.

Data Repositories

  • Raw data is stored in repositories but is not useful in raw forms; finding insight from purely numerical data is difficult.

Transforming Data

  • Transforming raw data into visual summaries makes it more usable and understandable.

Representation of Data

  • Information can be effectively represented graphically to enhance understanding.

Evolution of Databases

  • Databases are developments arising from earlier file systems.

  • Types of File Systems:

    • Manual File Systems

    • Computerized File Systems

Manual File Systems

  • Defined as paper-and-pencil systems, effective for small data collections.

  • Works adequately for organizations with minimal reporting needs.

Challenges of Manual Systems

  • As organizations expand, generating reports from manual systems becomes slow and challenging, leading to a need for computerized solutions.

Computerized File Systems

  • Initially similar to manual files, computerized systems began to manage data more effectively as organizations evolved.

Comparison of Systems

  • Which system is better? Comparison of file systems and databases.

Disadvantages of File Systems

  • Data separated leads to isolation, complicating access and synthesis.

  • Difficult to obtain quick answers due to lack of integrated data access.

More Disadvantages

  • Issues such as data duplication, inconsistency, and lack of security are prevalent in file systems.

  • Limited data sharing due to segregation also poses a challenge.

Advantages of Databases

  • Databases improve:

    • Control of data redundancy

    • Data consistency

    • Security and data sharing

    • Data integrity and backup/recovery services

Disadvantages of Databases

  • Complexity of systems, and costs associated with hardware, software, and personnel are significant factors.

Definition of a Database

  • A database integrates collections of data (raw facts) and metadata (data about data).

Understanding Metadata

  • Metadata encompasses descriptions of data characteristics (like data types and nullability) and relationships linking database entities.

Components of Database System

  • Five major components:

    • Hardware

    • Software

    • Data

    • People

    • Procedures

Hardware

  • Refers to physical devices such as computers, storage devices, and network components needed for database functionality.

Software

  • Software components include:

    • Operating system

    • Database application programs

    • DBMS software

Operating System

  • Manages computer hardware and software resources; examples include Windows and Unix.

Database Application Programs

  • Software that interacts with databases to facilitate data requests (usually through SQL).

Database Management System (DBMS)

  • Serves as a bridge between users and databases, enabling efficient data definition and access.

What is DBMS?

  • Defines, creates, maintains, and controls database access.

Examples of DBMS

  • Oracle, IBM DB2, Microsoft SQL Server, etc.

Functions of DBMS

  • DBMS performs key functions including:

    • Data dictionary management

    • Data storage and presentation management

    • Security, integrity, and backup management.

Data Dictionary Management

  • Stores definitions and relationships of data components within a data dictionary.

Data Storage Management

  • DBMS manages data storage structures, relieving users of defining physical data characteristics.

Data Transformation and Presentation

  • Ensures entered data conforms to required structures and formats across different locales.

Security Management

  • Enforces a security system to manage user access and data privacy.

Multiuser Access Control

  • Manages concurrent access to ensure data integrity in multiuser scenarios.

Backup and Recovery Management

  • Focuses on recovery of databases post-failure or corruption incidents.

Data Integrity Management

  • Ensures that data integrity is maintained, guaranteeing data validity and consistency.

Database Access Languages

  • The DBMS supports data access through a query language enabling user specifications of data manipulation without procedural context.

Programming Interfaces

  • Additionally offers APIs for programming languages like Java and C++ for custom development and integrations.

Database Communication Interfaces

  • Accepts user requests via varied network environments, including web browsers.

Understanding Data

  • Data encompasses the facts stored in the database, forming the core content of the system.

The Role of Users

  • Identifies five types of users including administrators, designers, analysts, programmers, and end-users.

User Responsibilities

  • Responsibilities include overseeing operations, managing the DBMS, and ensuring its effective function.

Designing the Database

  • Systems analysts/programmers design application programs for user interaction with database data.

End Users

  • Engage directly with application programs for daily operations within the organization.

Procedures

  • Procedures govern rules and instructions for using database systems, ensuring consistent operation and data monitoring.

Spreadsheet Limitations

  • Not a substitute for databases; lacks features necessary for proper data management, such as type enforcement and relationship definition.

Database Architectures

Introduction to Database Architectures.

Learning Objectives on Architectures

  • Understanding levels of database architecture and related concepts.

Viewing Data

  • Recognizes that different stakeholders may view the same data in various ways.

ANSI/SPARC

  • Framework defining four levels of data abstraction from the 1970s.

  • Levels are: external, conceptual, internal, and physical model.

The External Model

  • Represents end-users' views and interaction with data.

External Model Details

  • Shows specific entity associations and user views.

Shared Entities in External Model

  • Presents CLASS and COURSE entities shared across business units.

The Conceptual Model

  • Offers a unified global view of the entire database.

Conceptual Model Relations

  • Integrates external views for a comprehensive understanding of enterprise data.

ER Model

  • Entity-Relationship (ER) model is commonly used, remaining independent from software and hardware specifications.

Internal Model

  • Maps the conceptual model to specific DBMS usage, reflecting software-specific configurations.

The Physical Model

  • Lowest abstraction level, detailing how data is physically stored on media.

Mappings Between Models

  • Explains logical connections between external/conceptual and conceptual/internal models.

Data Independence

  • Major objective is to achieve independence among data levels; includes logical and physical types.

Types of Data Independence

  • Logical independence: Changes to internal models don’t affect conceptual models.

  • Physical independence: Modifications in the physical model don’t alter internal models.

Transition to Data Models

  • Introduction to data models and their importance.

Significance of Data Models

  • Essential for communication among users, analysts, and designers.

Data Modeling

  • The process defining application-specific models for data, foundational for database design.

Components of Problem Domain

  • Identifying inputs, outputs, and stakeholders crucial for effective data model development.

Basic Building Blocks

  • Core components of data models:

    • Entities

    • attributes

    • relationships

    • constraints

Entities

  • Defined as objects of interest for data collection.

More on Entities

  • Identifies and defines various characteristics of entities, such as customers and products.

Attributes

  • Characteristics that define an entity, akin to fields in tables.

Relationships

  • Relationships represent associations between entities, vital for understanding their interactions.

Types of Relationships

  • Defines relationship degrees:

    • One-to-One

    • One-to-Many

    • Many-to-Many

Types of Data Models

  • Classification of data models including hierarchical, network, and relational models.

Relational Model

  • Most popular and recent model aiming to mitigate earlier complexities in database design.

Core Concept of Relational Model

  • Basis in relations (tables) storing all database data efficiently.

Terminology in Relational Databases

  • Focus on superkeys, candidate keys, primary and foreign keys definitions.

Relations

  • Explained as tables consisting of columns (attributes) and rows (records).

Attributes in Relations

  • Defined as patterned columns within relations requiring unique names.

Domain in Database

  • Sets allowable values for attributes, essential for maintaining data consistency.

Tuples in Relations

  • A tuple signifies a row in a relation, emphasizing the distinctiveness of each record.

Duplicate Tuples

  • Reiterates that duplicate records are disallowed in relations, ensuring uniqueness.

Degree and Cardinality

  • Definitions of degree (number of attributes) and cardinality (number of tuples) outlined.

Relational Database Definition

  • Defined as a collection of distinct relations, normalization to be discussed further.

The Concept of Relational Keys

  • Significance in uniquely identifying tuples within a relation.

Types of Relational Keys

  • Overview of superkeys, candidate keys, primary keys, and foreign keys.

Understanding Superkeys

  • Superkey describes attributes uniquely identifying rows within tables.

More on Superkeys

  • Examples reinforcing the definition and application of superkeys in databases.

Primary Key Concept

  • A candidate key that uniquely identifies other attribute values, cannot have null entries.

Primary Key Reiteration

  • Further exploration of examples of primary keys' function in databases.

Foreign Key Definition

  • Defined as attributes that either match primary keys from other tables or can be null.

Conclusion on Foreign Keys

  • Reinforces the importance of foreign keys in database integrity.

Integrity Constraints

  • Definitions encapsulating entity integrity and referential integrity within databases.

Entity Integrity Defined

  • Stipulates that no primary key attributes can be null to ensure uniqueness.

Importance of Entity Integrity

  • Emphasizes the requirements for entity integrity across tables.

Referential Integrity Explained

  • Reinforces that foreign keys must correspond with primary keys or remain null in their relations.

Referential Integrity Continued

  • Further reiterates the requirements needed for referential integrity within database structures.