CS4416 Lecture 15 - Query Tuning - Introducing SQL Indexes1
Lecture Overview
Subject: Query Tuning and SQL Indexes
Course: CS4416
Recording Notice: Lecture is recorded for academic purposes and retained on Sulis for 13 months.
What is an Index
Definition: A data structure that enables quick data row retrieval in a table without full table scans.
Benefits: Significantly speeds up query execution.
Creation Example:
CREATE INDEX product_maker ON products(maker);
Functionality: Helps locate specific data rows based on criteria without scanning the entire table.
Database Tuning
Challenges: Deciding on which indexes to create for optimal performance.
Pros of Indexes: Speed up applicable queries.
Cons of Indexes: Slow down modifications since the index must also be updated.
Tool Suggestion: Use the
EXPLAINstatement for identifying performance bottlenecks and determining necessary indexes.
Implementation of Indexes
Data Structures Used:
B+ Trees
B-Trees
Hash Tables
Hash Tables
Concept: Maps keys to array positions via a hash function.
Hash Function: Takes a key and outputs a hash value that points to an index in an array.
Collision Handling: Designs account for collisions (different keys producing the same hash value).
B-Trees: Example
Query Example: Retrieve rows where quantity equals 12:
SELECT * FROM sales WHERE quantity = 12;
B-Trees: Definition
Node Structure:
Stores up to k key-pointer pairs per node.
Each internal node can hold k+1 pointers.
Characteristics: Generalizes binary search trees (nodes can have multiple children).
Origin: Developed by Bayer & McCreight at Boeing Research Labs.
B+ Trees
Differences from B-Trees:
Internal nodes only store keys, not key-pointer pairs.
Leaf nodes contain key-pointer pairs and sibling pointers.
All keys in leaf nodes; some keys in internal nodes.
Comparison of B-Trees and B+ Trees
Branching Factor:
B+ Trees maximize branching factor which minimizes tree height and disk I/O, potentially improving performance.
Performance: B+ Trees are faster for sequential access compared to B-Trees.
Indexes in Practice
Popularity: B-trees and B+ trees dominate indexing in database systems.
Supports:
MySQL (InnoDB and MyISAM support only B-trees)
B+ trees used in several systems like IBM DB2, Microsoft SQL Server, etc.
MySQL Indexes
Automatic Indexing: Indexes created for PRIMARY KEY or UNIQUE attributes.
Default Structure: MySQL primarily uses B-trees, with exceptions for spatial types (R-trees) and MEMORY tables (hash tables).
Terminology: "Key" can refer to an index in ALTER TABLE statements.
Additional Resources
Tuning Advisor: MySQL Query Analyzer demo link for practical insights.