CS4416 Lecture 15 - Query Tuning - Introducing SQL Indexes1

Lecture Overview

  • Subject: Query Tuning and SQL Indexes

  • Course: CS4416

  • Recording Notice: Lecture is recorded for academic purposes and retained on Sulis for 13 months.

What is an Index

  • Definition: A data structure that enables quick data row retrieval in a table without full table scans.

  • Benefits: Significantly speeds up query execution.

  • Creation Example:

    • CREATE INDEX product_maker ON products(maker);

  • Functionality: Helps locate specific data rows based on criteria without scanning the entire table.

Database Tuning

  • Challenges: Deciding on which indexes to create for optimal performance.

  • Pros of Indexes: Speed up applicable queries.

  • Cons of Indexes: Slow down modifications since the index must also be updated.

  • Tool Suggestion: Use the EXPLAIN statement for identifying performance bottlenecks and determining necessary indexes.

Implementation of Indexes

  • Data Structures Used:

    • B+ Trees

    • B-Trees

    • Hash Tables

Hash Tables

  • Concept: Maps keys to array positions via a hash function.

  • Hash Function: Takes a key and outputs a hash value that points to an index in an array.

  • Collision Handling: Designs account for collisions (different keys producing the same hash value).

B-Trees: Example

  • Query Example: Retrieve rows where quantity equals 12:

    • SELECT * FROM sales WHERE quantity = 12;

B-Trees: Definition

  • Node Structure:

    • Stores up to k key-pointer pairs per node.

    • Each internal node can hold k+1 pointers.

  • Characteristics: Generalizes binary search trees (nodes can have multiple children).

  • Origin: Developed by Bayer & McCreight at Boeing Research Labs.

B+ Trees

  • Differences from B-Trees:

    • Internal nodes only store keys, not key-pointer pairs.

    • Leaf nodes contain key-pointer pairs and sibling pointers.

    • All keys in leaf nodes; some keys in internal nodes.

Comparison of B-Trees and B+ Trees

  • Branching Factor:

    • B+ Trees maximize branching factor which minimizes tree height and disk I/O, potentially improving performance.

  • Performance: B+ Trees are faster for sequential access compared to B-Trees.

Indexes in Practice

  • Popularity: B-trees and B+ trees dominate indexing in database systems.

  • Supports:

    • MySQL (InnoDB and MyISAM support only B-trees)

    • B+ trees used in several systems like IBM DB2, Microsoft SQL Server, etc.

MySQL Indexes

  • Automatic Indexing: Indexes created for PRIMARY KEY or UNIQUE attributes.

  • Default Structure: MySQL primarily uses B-trees, with exceptions for spatial types (R-trees) and MEMORY tables (hash tables).

  • Terminology: "Key" can refer to an index in ALTER TABLE statements.

Additional Resources

  • Tuning Advisor: MySQL Query Analyzer demo link for practical insights.