ch05 - 6th Edition

Chapter 5

Relational Algebra and Relational Calculus

Relational algebra and relational calculus are two foundational concepts associated with the relational model in databases. They are essential for understanding how to interact with and manipulate relational data effectively.

Key Differences

  • Relational Algebra: This is a procedural language, meaning it requires the user to specify a sequence of operations to obtain a result. It operates on relations (tables) and allows users to construct new relations by applying various operations without altering the original data.

  • Relational Calculus: In contrast, relational calculus is non-procedural, meaning that it focuses on the desired results rather than the specific steps to achieve them. Users describe what they want rather than how to compute it, which makes it more abstract than relational algebra. Both languages are logically equivalent, meaning they can produce the same results in terms of the relations they derive.

Relational Completeness

A programming language is said to be relationally complete if it can express all queries that can be expressed in relational calculus. This concept is crucial for understanding the capabilities of query languages in relational database management systems (RDBMS).

Objectives

  • Understand and define relational completeness.

  • Construct queries using relational algebra.

  • Formulate queries in tuple relational calculus.

  • Utilize domain relational calculus for queries.

  • Identify and categorize different types of relational Data Manipulation Language (DML).

Relational Algebra: Overview

Relational algebra consists of a set of operations that transform one or more relations into a new relation without modifying the original data sets. One fundamental aspect is the closure property, which allows the results of one operation to be used as input for another, thereby enabling nested expressions.

Basic Operations of Relational Algebra

  1. Selection (σ): Extracts tuples satisfying a specific condition (predicate). For example, listing all staff members with a salary greater than £10,000.

  2. Projection (π): Outputs a vertical subset of attributes, removing any duplicate tuples. E.g., retrieving only the staff numbers, first names, last names, and salaries of all staff members.

  3. Cartesian Product (×): Combines every tuple of relation R with every tuple of relation S, resulting in a relation that contains all possible pairs.

  4. Union (∪): Merges tuples from two relations into a single relation, while eliminating duplicates. For instance, listing all cities that have either branch offices or rental properties.

  5. Set Difference (−): Returns tuples that are in relation R but not in relation S. An example could be listing cities with branches but without rental properties.

  6. Intersection (∩): Returns tuples found in both relations. For example, cities that have both a branch office and rental properties.

  7. Join Operations: This involves combining two relations based on a common attribute. Various types of joins exist, such as natural join, outer join, theta join, etc. Implementing joins efficiently is often one of the largest challenges in RDBMS.

  8. Division (÷): Returns those tuples from relation R that match every tuple in relation S. An example could be clients who have viewed all properties with three rooms.

Aggregate Operations

This set of operations applies aggregate functions such as COUNT, SUM, AVG, etc., to produce relations. An example could be counting how many properties exceed a rental value of £350.

Grouping Operation

The grouping operation clusters tuples into distinct groups and applies aggregate functions across these groups. For example, tallying the number of staff members per branch and calculating total salaries within each group.

Relational Calculus

Relational calculus involves specifying the results to be retrieved without detailing how to retrieve them. There are two main forms:

  1. Tuple Relational Calculus: This form identifies tuples for which a condition is true, using tuple variables and predicate functions. For instance, finding staff whose earnings exceed £10,000.

  2. Domain Relational Calculus: In this form, variables take values from attribute domains instead of entire tuples. An example would be identifying managers earning more than £25,000.

Safe Expressions

Safe expressions are an essential concept in domain calculus, ensuring that they yield finite results similar to those returned by tuple calculus and relational algebra. This ensures that queries remain practical and can be successfully executed in a real-world setting.

Other Languages

In addition to relational algebra and calculus, there are other languages for querying databases, including:

  • Transform-oriented languages (e.g., SQL), which focus on specifying transformations on data.

  • Graphical languages (e.g., QBE - Query By Example), which provide a visual interface for querying databases.

  • Fourth-generation languages (4GLs) designed for developing custom applications; and Fifth-generation languages (5GLs), still in early development stages, focused on advanced problem-solving and AI-driven queries.