Database Systems and Data Management

Data Hierarchy

The data hierarchy progresses from the smallest to the largest as follows: Data Item, Attribute, Entity, Record, File, Database.

  • Data Item: Specific values of an attribute (e.g., 'Blue' for color).
  • Attribute: A characteristic or property of an entity (e.g., customer’s name).
  • Entity: A person, place, thing, or event about which information is stored (e.g., a customer).
  • Record: A collection of attributes about a specific entity (e.g., all info about one customer).
  • File: A collection of related records (e.g., a file with all customer data).
  • Database: An organized collection of related data.
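As a rough illustration of how these levels nest, the hierarchy can be sketched with plain Python objects. The class name CustomerRecord, its fields, and the sample values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """One record: the attributes stored about a single customer entity."""
    customer_id: int      # attribute
    name: str             # attribute
    favorite_color: str   # attribute


# A data item is one specific value of an attribute, e.g. 'Blue'.
record = CustomerRecord(customer_id=1, name="Ada", favorite_color="Blue")

# A file is a collection of related records.
customer_file = [
    record,
    CustomerRecord(customer_id=2, name="Grace", favorite_color="Green"),
]

# A database is an organized collection of related data, here keyed by file name.
database = {"customers": customer_file}

print(database["customers"][0].favorite_color)
```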

Keys and Attributes

  • Primary Key: An attribute (or set of attributes) that uniquely identifies each record in a table.
  • Foreign Key: An attribute in one table that links to the primary key in another.
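The two key types can be seen at work in a small sketch using Python's built-in sqlite3 module. The customers and orders tables and the helper try_insert are hypothetical names for this example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other database systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (order_id, customer_id, item) VALUES (100, 1, 'Keyboard')")


def try_insert(sql):
    """Return True if the statement succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with the same primary key is rejected.
duplicate_ok = try_insert("INSERT INTO customers (customer_id, name) VALUES (1, 'Bob')")
# An order pointing at a customer that does not exist is rejected.
orphan_ok = try_insert("INSERT INTO orders (order_id, customer_id, item) VALUES (101, 99, 'Mouse')")
print(duplicate_ok, orphan_ok)
```

Both inserts fail: the first violates primary key uniqueness, and the second violates the foreign key link to customers.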

Databases in Action

Various sectors utilize databases:

  • Offshore Leaks Database: Assists in identifying tax evaders.
  • National Integrated Ballistic Information Network (NIBIN): Links crime scenes with firearms.
  • Global Terrorism Database (GTD): Provides insights into terrorist events.
  • Leads Online: Tracks pawn transactions for law enforcement.

Database Management Systems (DBMS)

A DBMS is a collection of programs that manage databases and facilitate access for users. A DBMS is essential for:

  • Data Integrity: Ensuring accuracy and trustworthiness of data.
  • Security Management: Protecting data from unauthorized access and breaches.
  • Backup and Recovery: Restoring database integrity after failures.

A DBMS provides user access, data manipulation, and reporting capabilities. It uses schemas and data dictionaries to define the structure and properties of the data.
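To make the idea of a schema and a data dictionary concrete, the sketch below defines one table in SQLite and then asks the DBMS to describe it. The customers table is hypothetical. PRAGMA table_info is specific to SQLite; other systems expose similar metadata through their own catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: a formal definition of the table's structure.
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
data_dictionary = [
    {"column": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(customers)")
]

for entry in data_dictionary:
    print(entry)
```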

Data Cleansing and Validation

Improves data quality through:

  • Data Cleansing: Detecting and correcting inaccuracies.
  • Data Validation: Preventing entry of bad data.
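The difference between the two can be sketched as follows: validation checks a row before it is stored, while cleansing repairs rows that are already stored. The function names, field names, and the email pattern are assumptions made for this example; the pattern is deliberately simple and not a complete email check.

```python
import re

# Deliberately simple pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_customer(row):
    """Validation: return a list of problems; an empty list means the row may be stored."""
    problems = []
    if not row.get("name", "").strip():
        problems.append("name is required")
    if not EMAIL_PATTERN.match(row.get("email", "")):
        problems.append("email is malformed")
    return problems


def cleanse_customers(rows):
    """Cleansing: normalize existing rows and drop duplicates by email."""
    seen = set()
    cleaned = []
    for row in rows:
        fixed = {
            "name": row["name"].strip().title(),
            "email": row["email"].strip().lower(),
        }
        if fixed["email"] not in seen:
            seen.add(fixed["email"])
            cleaned.append(fixed)
    return cleaned


print(validate_customer({"name": " ", "email": "nope"}))
print(cleanse_customers([
    {"name": " ada lovelace ", "email": "ADA@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
]))
```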

Database Design Considerations

Key aspects include:

  • Content: What data to collect.
  • Access: Who can view or modify the data.
  • Logical Structure: How data is organized.
  • Physical Organization: Where the data is physically stored.
  • Response Time: How quickly data must be retrieved or updated.
  • Security: Protecting against unauthorized access.

Data Modeling

Data modeling involves creating representations of data structures, often using Entity-Relationship (ER) diagrams to visualize the relationships between entities.

Relational Database Features

A relational database organizes data in tables with:

  • Unique primary keys for each record.
  • Relationships defined by primary and foreign keys.
  • Support for SQL for data manipulation and querying.
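The features above can be combined in one short sketch: two tables linked by a primary and foreign key, queried with a SQL join. The tables, columns, and sample rows are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Join orders to customers through the key relationship and total each customer's orders.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.name
""").fetchall()

print(totals)
```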

ACID Properties

The ACID properties ensure reliable transactions:

  • Atomicity: Each transaction is all-or-nothing.
  • Consistency: Each transaction moves the database from one valid state to another.
  • Isolation: Concurrent transactions do not interfere with each other.
  • Durability: Once committed, transactions remain so, even in case of failures.
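Atomicity and consistency can be illustrated with a funds transfer in SQLite. This is a minimal sketch: the accounts table, the transfer and balances helpers, and the amounts are hypothetical. The example uses an in-memory database, so it does not demonstrate durability or isolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(sender, receiver, amount):
    """Move funds in one transaction; return True if committed, False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer("alice", "bob", 30)       # both updates are committed together
failed = transfer("alice", "bob", 500)  # the debit violates the CHECK, so the credit is undone too
print(ok, failed, balances())
```

In the failed transfer, the credit to bob is applied first and then rolled back when the debit breaks the balance rule, so neither half of the transaction survives.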

Database as a Service (DaaS)

DaaS provides cloud-based access to databases, relieving organizations of direct management and improving scalability and flexibility.

Database Governance and Roles

Strong governance structures ensure data quality and compliance:

  • Data Stewards: Take responsibility for critical data elements and maintain their definitions.
  • Database Administrators (DBAs): Focus on database creation, monitoring, and security.