Lecture Revision Part 4
Title: BSc (Hons) ORYX UNIVERSAL COLLEGE
Program: Computer Science
Partnership: Liverpool John Moores University
Presenter: Dr. Ali Baydoun
Data is organized into tables (relations) consisting of rows and columns.
Rows: Identified by a unique key (primary key).
Columns: Represent attributes that provide values for the rows.
Each table typically represents one entity type (e.g., customer, product).
Rows are instances of the entity type, while columns store specific details (e.g., name, price).
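As a minimal sketch of the table, row, and column ideas above, the following uses Python's built-in sqlite3 module with an in-memory database. The product table, its column names, and the sample values are illustrative assumptions, not taken from the lecture.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# One table represents one entity type (product); columns are its attributes.
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"  # primary key: uniquely identifies a row
    " name TEXT NOT NULL,"
    " price REAL NOT NULL)"
)

# Each row is one instance of the entity type.
conn.executemany(
    "INSERT INTO product (product_id, name, price) VALUES (?, ?, ?)",
    [(1, "Keyboard", 25.0), (2, "Monitor", 140.0)],
)

rows = conn.execute(
    "SELECT product_id, name, price FROM product ORDER BY product_id"
).fetchall()

# A second row with an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO product VALUES (1, 'Duplicate', 1.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows)
print("duplicate key rejected:", duplicate_rejected)
```

The primary key is what lets the database refuse the duplicate row, which is the sense in which rows are "identified by a unique key".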
Data extraction is performed using Structured Query Language (SQL).
SQL is designed to manage structured data that includes relationships among entities and variables.
Relational Constraints: Primary Keys (PKs) and Foreign Keys (FKs).
Important principles include ACID properties (Atomicity, Consistency, Isolation, Durability).
Changes can be cascaded across related tables (e.g., cascading updates and deletes through foreign keys).
Queries are often relational and JOIN-based.
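The characteristics listed above (foreign keys, JOIN-based queries, atomic transactions, cascading changes) can be sketched together with sqlite3. This is an illustrative example only: the customer and purchase tables and their contents are hypothetical, and SQLite is just one relational engine among many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
        REFERENCES customer(customer_id) ON DELETE CASCADE,
    item TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Aisha'), (2, 'Omar');
INSERT INTO purchase VALUES (10, 1, 'Laptop'), (11, 1, 'Mouse'), (12, 2, 'Desk');
""")

# JOIN: combine rows from both tables through the foreign key.
joined = conn.execute("""
    SELECT c.name, p.item
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.customer_id
    ORDER BY p.purchase_id
""").fetchall()

# Atomicity: the second insert violates the foreign key, so the whole
# transaction is rolled back, including the first (valid) insert.
try:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO purchase VALUES (13, 2, 'Chair')")
        conn.execute("INSERT INTO purchase VALUES (14, 99, 'Lamp')")  # no customer 99
except sqlite3.IntegrityError:
    pass
count_after_failure = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]

# Cascade: deleting a customer also deletes that customer's purchases.
with conn:
    conn.execute("DELETE FROM customer WHERE customer_id = 1")
remaining = conn.execute("SELECT item FROM purchase ORDER BY purchase_id").fetchall()

print(joined)
print(count_after_failure)
print(remaining)
```

The rollback shows the "all or nothing" meaning of atomicity, and the final delete shows a change to one table cascading into a related table.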
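Limitations of relational databases:
Scalability: Performance tends to degrade when data is spread across multiple servers.
Resource elasticity: Adding or removing capacity on demand is difficult.
Modeling: Certain kinds of data are hard to model without major restructuring (normalization is required).
As a result, relational databases may be ineffective or inefficient for such workloads.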
Relational approaches are becoming limited due to the massive increase in data production.
Recent advances aim to address these limitations, but challenges remain with large datasets needing distributed databases.
The emergence of diverse data types (tweets, video, etc.) makes structured schemas inadequate.
This has led to the development of NoSQL database approaches.
NoSQL stands for "Not SQL" or "Not Only SQL."
Characterized by a flexible approach to data storage suitable for handling "big data."
Various types of NoSQL databases:
Key-Value
Graph
Column
Document
Flexible Structure: Key-value databases are an alternative to relational databases, lacking predefined schemas.
Values: Can be a wide range of data types (numbers, strings, JSON, etc.).
Functionality: Values are accessed through their unique keys; key-value stores typically offer no rich query language comparable to SQL.
Operations: Basic functions include:
Put: Adds/updates key-value pairs.
Get: Retrieves values for given keys.
Delete: Removes key-value pairs.
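The three operations above can be sketched with a small in-memory class backed by a Python dictionary. The class name KeyValueStore, its method names, and the sample keys are hypothetical and do not correspond to any particular product's API.

```python
from typing import Any, Dict


class KeyValueStore:
    """Toy in-memory key-value store: values are opaque to the store."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Adds the pair, or overwrites the value if the key already exists.
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        # Looks the value up directly by key; no query over the value's contents.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Removes the pair; returns True if the key was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1", {"name": "Aisha", "age": 21})  # JSON-like value
store.put("session:abc", b"\x00\x01\x02")           # binary value (BLOB-like)
store.put("user:1", {"name": "Aisha", "age": 22})  # same key: update

print(store.get("user:1"))
print(store.get("missing-key"))
```

Because lookups go straight from key to value, access cost does not depend on scanning the rest of the data, which is the basis of the fast retrieval these stores are known for.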
Data Types: Can store any data type, including binary large objects (BLOBs).
Similar to a dictionary with unique keys indexing values.
Enables rapid retrieval of values regardless of database size.
Key-value databases treat values as single opaque collections with varying fields.
Provides flexibility similar to object-oriented programming.
Memory-efficient storage leading to performance gains.
Graph Databases: Use graph theory for storing and querying relationships.
Composed of nodes (entities) and edges (connections).
Unique identifiers and properties define each node.
Suited for analyzing connections, especially in social media data mining.
Useful in business data that involves complex relationships (e.g., supply chain management).
The underlying graph theory is attributed to mathematician Leonhard Euler.
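To make the node and edge model concrete, here is a minimal sketch of a tiny social graph held in plain Python structures, with a breadth-first search that finds everyone within a given number of connections. The node identifiers, properties, and the within_hops helper are hypothetical; a real graph database would expose this through its own query language.

```python
from collections import deque

# Nodes: unique identifier -> properties.
nodes = {
    "alice": {"name": "Alice"},
    "bob": {"name": "Bob"},
    "carol": {"name": "Carol"},
    "dave": {"name": "Dave"},
    "erin": {"name": "Erin"},
}

# Edges: undirected "knows" connections between node identifiers.
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]

# Build an adjacency index so each node's neighbours can be read directly.
adjacency = {node_id: set() for node_id in nodes}
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)


def within_hops(start, max_hops):
    """Return the ids of all nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    seen.discard(start)
    return seen


print(sorted(within_hops("alice", 1)))
print(sorted(within_hops("alice", 2)))
```

Answering "friends of friends" here is a walk along edges rather than a series of table joins, which is why this model suits connection-heavy data.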
Document Databases: Store data in JSON-like documents.
The storage format mirrors the objects used in application code and is flexible and semi-structured.
Adaptable to application evolution, supporting catalogs, user profiles, etc.
Ideal for Content Management Systems (e.g., blogs, video platforms).
Allows for easy updates and changes without schema updates or downtime.
Efficient for storing product catalogs; handles diverse attributes without relational inefficiency.
Each product can be described in a single document, enabling easier management and reading.
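The product catalog case can be sketched with a list of JSON-like documents, each carrying its own set of attributes. The field names, sample products, and the find helper are hypothetical, chosen only for illustration; a real document database would provide its own query interface.

```python
import json

# Each product is one self-contained document; fields differ per product type.
catalog = [
    {"_id": "p1", "type": "book", "title": "Database Basics",
     "price": 30.0, "pages": 320},
    {"_id": "p2", "type": "laptop", "title": "UltraBook 14",
     "price": 900.0, "specs": {"ram_gb": 16, "storage_gb": 512}},
    {"_id": "p3", "type": "shirt", "title": "Cotton Tee",
     "price": 15.0, "sizes": ["S", "M", "L"]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# A new attribute is added to one document only; no schema change is needed
# and the other documents are unaffected.
catalog[0]["edition"] = 2

laptops = find(catalog, type="laptop")
serialized = json.dumps(laptops[0], indent=2)  # a document maps directly to JSON
print(serialized)
```

In a relational design the book, laptop, and shirt attributes would need either many nullable columns or several extra tables; here each product is read and written as a single unit.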
Flexible Design: NoSQL databases reduce up-front schema design requirements.
They place fewer constraints on how data is queried, which simplifies access.
Massive Web Applications: Sites such as Amazon serve very large user bases.
Big Data Applications: Platforms like Twitter and Facebook manage high-frequency updates and data distribution.
Content Management Systems: NoSQL databases provide flexibility for managing diverse content types and structures, enabling rapid development and deployment.