N

Chapter 2: Database System Concepts and Architecture - VOCABULARY flashcards

2.1 Data Models, Schemas, and Instances

  • Characteristics of the database approach versus traditional file processing
    • Database approach uses a single repository of data accessed by multiple users and applications through queries, transactions, and programs
    • Traditional file processing builds separate files and programs for each application, causing data redundancy and maintenance inefficiencies
  • Traditional File Processing Example
    • A grade reporting office maintains files on students and grades; programs to print transcripts and enter new grades are part of the application
    • An accounting office maintains files on student fees and payments; separate files and programs exist for these needs
    • Consequence: data redundancy leads to wasted storage and duplicated maintenance work to keep data up to date
  • Database Approach Example
    • A single repository stores data defined once and accessed by many users via queries and programs
    • Key advantages include shared data, reduced redundancy, and centralized management
  • Main characteristics of the database approach (vs. file processing)
    • Self-describing nature of a database system
    • Insulation between programs and data; data abstraction
    • Support for multiple views of the data
    • Data sharing and multiuser transaction processing
  • Data Models and Schemas
    • Data abstraction: suppression of details about data organization/storage to highlight essential features for understanding
    • Different users perceive data at different levels of detail through abstraction
  • Data Model
    • A collection of concepts that describe the structure of a database
    • Basic operations: specify retrievals and updates on the database
    • Dynamic aspect or behavior: allows the database designer to specify a set of valid operations on database objects
  • Categories of Data Models
    • High-level (conceptual) data models: close to how users perceive data
    • Low-level (physical) data models: describe details of how data is stored on storage media
    • Representational data models: easily understood by end users; similar to how data is organized on storage; hides many storage details but can be implemented directly
  • Conceptual Data Models
    • Conceptual types include relational data model (most common in traditional commercial DBMSs), object data model (newer family of higher-level implementation models, closer to conceptual models), and semantic data model (e.g., Semantic Web)
  • Conceptual Features
    • Entity: represents a real-world object or concept
    • Relationship: an association among two or more entities
    • Entity-Relationship (ER) model: models entities and their relationships
    • Attribute: property of an entity; provides further description of an entity
  • Physical Data Models
    • Physical: describes how data is stored as files in the computer
    • Access path: structure that makes search for particular records more efficient
    • Index: example of an access path; allows direct access to data using an index term or keyword
  • Schemas, Instances, and Database State
    • Distinguish between the description of the database (schema) and the database itself (state)
  • Schema Diagram Example (Figure 2.1 in Figure 1.2)
    • Entities/tables and their attributes:
    • STUDENT(Name, Student_number, Class, Major)
    • COURSE(Coursename, Coursenumber, Credit_hours, Department)
    • PREREQUISITE(Coursenumber, Prerequisitenumber)
    • SECTION(Sectionidentifier, Coursenumber, Semester, Year, Instructor)
    • GRADEREPORT(Studentnumber, Section_identifier, Grade)
  • Schema changes and evolution
    • Schema changes are usually needed as requirements change
    • Newer DBMS provide operations to allow schema changes, though the process is more involved than simple data updates
  • Notational notes about terminology
    • It is customary to use schemas as the plural for schema, though schemata is the proper plural form
    • The word scheme is sometimes used to refer to a schema
  • Define key concepts
    • Define a new database
    • Specify database schema to the DBMS
    • Initial state: populated or loaded with initial data
    • Valid state: satisfies the structure and constraints specified by the schema
    • Schema evolution: changes applied to the schema as application requirements change
  • Figure 2.1 (Schema diagram) and column/field labeling visuals (as shown in slides)
  • Practical takeaway
    • Understanding schemas vs. instances helps in reasoning about data definitions vs. actual data at a point in time

2.2 Three-Schema Architecture and Data Independence

  • Three-schema architecture overview
    • Internal level: describes physical storage structure of the database
    • Conceptual level: describes the structure of the whole database for a community of users
    • External (view) level: describes part of the database that a particular user group is interested in
  • Architecture mapping
    • External level -> external/view schemas
    • Conceptual level -> conceptual/internal mappings
    • Internal level -> internal/storage schema
  • External, conceptual, and internal levels (visual form)
    • External Level: External View(s) tailored to user needs
    • Conceptual Level: Conceptual Schema representing the overall database structure for all users
    • Internal Level: Internal Schema describing physical storage details
    • Mappings: External/Conceptual Mapping and Conceptual/Internal Mapping connect levels
  • Data independence
    • Capacity to change the schema at one level without changing the higher levels
    • Types of data independence
    • Logical data independence: ability to change the conceptual schema without changing external schemas or applications
    • Physical data independence: ability to change the internal/physical schema without changing the conceptual schema
    • Practical implication: allows evolving DBMS technology or storage hardware without rewriting applications
  • Example implications
    • Changing DBMS or disk drive can be done with logical/physical independence, preserving application interfaces and external views

2.3 Database Languages and Interfaces

  • DBMS languages and their roles
    • Data Definition Language (DDL): defines data and storage schemas
    • Storage Definition Language (SDL): specifies the internal schema (how data is stored on disk)
    • View Definition Language (VDL): specifies user views and views-to-conceptual-schema mappings
    • Data Manipulation Language (DML): retrieval, insertion, deletion, modification of data
    • Query Definition Language (QDL): defines queries (often encompassed within DML in modern systems)
  • Additional language families
    • Data Control Language (DCL): controls access permissions and security
    • Transaction Control Language (TCL): controls transactions (commit, rollback, etc.)
    • Open/Standard interfaces: ODBC (Open Database Connectivity) and JDBC (Java Database Connectivity)
    • VDL and QDL are often integrated into higher-level language features in modern systems
  • SDL vs. SQL and other DBMS languages
    • SDL: internal schema details; multiple DBMS implementations may support different SDLs
    • SQL: widely adopted master language that unifies many subsystems’ capabilities
    • The textbook highlights a subset of languages; emphasis on SQL as a unifying standard
  • Common SDL/DDL/DML/VL tools and examples
    • MySQL example: InnoDB as a default storage engine; support for multiple SDLs
  • DBMS user interfaces (UIs)
    • Menu-based interfaces for Web clients or browsing
    • Forms-based interfaces
    • Graphical user interfaces (GUIs) such as SQL Server Management Studio (SSMS)
    • Natural language interfaces
    • Speech input/output interfaces
    • Interfaces for parametric users (e.g., SQL parameterization for repeated queries)
    • Interfaces for DBAs (administrative tooling and dashboards)
  • SQL injection example (security note)
    • Demonstrates how unsanitized inputs can be exploited to alter or delete data
    • Example dialogue shows an attacker injecting: "Robert'); DROP TABLE Students;--"
    • Lesson: sanitize inputs and use prepared statements/parameterized queries; validate and escape inputs; use least-privilege database accounts
  • SQL Insertion Hack visuals (additional demonstrations)
    • A tollbooth camera OCR example illustrating how unsanitized input can be used to delete data; reinforces input sanitization and secure coding practices

2.5 Centralized and Client/Server Architectures for DBMSs

  • Centralized DBMS architecture
    • All DBMS functionality, application execution, and user interface processing occur on a single machine
    • Historically associated with mainframes and dumb terminals; still widely used in some environments
  • Basic Client/Server Architectures
    • Server roles include: file server, printer server, web server, email server, database server
    • Client roles: provide user interfaces and local processing power; access server services
  • Two-tier client/server architectures
    • Server handles query and transaction functionality related to SQL processing
    • Client handles user interfaces and application programming
  • Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC)
    • ODBC: API that allows client applications to call the DBMS; both client and server need appropriate software
    • JDBC: Java-specific API for accessing one or more DBMSs through a standard interface
  • Three-tier and n-tier architectures for Web-based access
    • Application server or Web server: intermediate layer between client and database server; runs application programs and enforces business rules
    • N-tier architectures: further divide layers into finer components to separate concerns
  • Service-Oriented Architecture (SOA)
    • Systems composed of many services that can reside on the same or different machines
    • Database service, business process services, calendar service, etc.
  • Three-tier architecture (example structure)
    • Presentation layer: client GUI or web interface
    • Business logic layer: application server or Web server
    • Data layer: database server and DBMS
    • Visual depiction often includes an application layer between the presentation layer and the database server

2.6 Classification of Database Management Systems

  • Classification by data model
    • Relational data model
    • Object data model
    • Hierarchical and network models (legacy systems)
    • Native XML DBMS
  • Classification by number of users
    • Single-user
    • Multiuser
  • Classification by number of sites
    • Centralized
    • Distributed
    • Homogeneous (same DBMS across sites)
    • Heterogeneous (different DBMSs or data sources across sites)
  • Classification by cost/licensing
    • Open source
    • Commercial/licensing variations
  • Classification by access path options
    • General-purpose vs. special-purpose access paths
  • Summary highlights
    • DBMS classifications consider data model, user count, site distribution, access paths, and cost

Summary (Key Takeaways)

  • The database approach offers data abstraction, multiple views, data sharing, and multiuser transactions, reducing redundancy compared to traditional file systems
  • Three-schema architecture supports data independence and allows evolving storage/implementation details without disrupting user views or applications
  • A range of languages and interfaces (DDL, SDL, DML, VDL, QDL, DCL, TCL, SQL, ODBC/JDBC) enable definition, manipulation, and control of data and access
  • Architecture choices (centralized vs. client/server vs. multi-tier) impact scalability, maintenance, and deployment strategies in real-world systems
  • Classification criteria (data model, users, sites, access paths, cost) help in selecting appropriate DBMS for a given domain