1/55
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
|---|
No study sessions yet.
What is Hashing?
It’s the process of converting a key into another value (a hash value) using a hash function to allow fast data lookup.
Hash Function
A mathematical function that takes a key and produces a fixed-size hash value (an integer).
Hash Table / Hash Map
A data structure that stores key-value pairs and uses a hash function to map each key to an index for quick access.
What does a Hash Table do?
It maps keys to values using a hash function, allowing fast insertion, search, and deletion.
Why use Hash Tables over Arrays or Linked Lists?
Because hash tables allow direct access to data using a key, making operations faster than searching through arrays or lists.
Real-life analogy for Hashing
Like a magical library system where a “machine” (hash function) tells you exactly which shelf (index) a book (value) is on.
Key-Value Pair
Each item in a hash table consists of a key and a value (e.g., key: “name”, value: “Kimmy”).
Formula to get the index
index = hash(key) % array_size or h(k) = k mod m.
Bucket
A storage location (array slot) in the hash table that holds key-value pairs.
What happens during Retrieval?
The hash function is used again on the key to locate its bucket and return the stored value.
What Java class represents a hash table?
HashMap — a built-in Java class implementing the Map interface.
Insertion operation
put(key, value) — adds a new key-value pair.
Retrieval operation
get(key) — returns the value of a given key.
Checking for a key
containsKey(key) — checks if the key exists.
Deletion operation
remove(key) — removes a key-value pair.
What is Table Design?
The way the hash table is structured, sized, and organized to minimize collisions and improve efficiency.
Why is proper table design important?
To ensure even key distribution, minimal collisions, and efficient memory use.
If the table size (m) is too small…
Too many collisions occur, slowing performance.
If the table size (m) is too large…
Wastes memory.
Why use a prime number for table size?
Prime numbers reduce clustering when using h(k) = k mod m.
What is resizing (rehashing)?
Expanding the table when it becomes too full and recomputing all hash values.
What is a load factor?
The ratio of stored entries to the table size (e.g., 0.75). It helps decide when to resize.
What makes a good hash function?
It must be deterministic, distribute keys evenly, and compute quickly.
What happens if the hash function clusters keys?
Collisions increase, causing slower lookups.
Bucket
Each position in the hash table used to store key-value pairs.
Two main methods of storage
Chaining and Open Addressing.
Chaining definition
Each bucket contains a list (or tree) of all key-value pairs that share the same hash index.
Advantages of Chaining
Simple to implement and handles many collisions well.
What data structures can chaining use?
Linked lists or balanced trees (like in Java HashMap).
Open Addressing definition
Stores all elements directly in the table; if a collision happens, it searches for the next empty slot.
Why is Open Addressing used?
When there are space restrictions (no extra linked lists).
Linear Probing
Checks the next slot sequentially until an empty one is found.
Quadratic Probing
Moves through slots in increasing quadratic steps to avoid clustering.
Double Hashing
Uses a second hash function to determine the step size for searching.
Rehashing process
Creates a new, larger table, recomputes hashes, and reinserts all keys.
What does load factor measure?
How full a hash table is (stored elements / number of buckets).
If load factor is high…
The table is crowded; collisions increase → need rehashing.
If load factor is low…
The table is underused but performs efficiently.
What is a Collision?
When two different keys map to the same bucket index.
How are collisions handled?
Using Chaining or Open Addressing.
Why are collisions unavoidable?
Because multiple keys can generate the same hash value.
Set (Data Structure)
Stores a collection of unique elements; no duplicates allowed.
Unordered Set
Implemented with a hash table; elements have no order and operations are fast (constant time on average).
Ordered Set
Implemented with a balanced binary search tree (BST); keeps elements sorted.
What happens when you add a duplicate to a Set?
It’s ignored — only unique elements are kept.
HashSet
Uses hashing; elements are stored based on hash codes, not in order.
TreeSet
Uses a binary search tree; keeps elements sorted (ascending order).
When to use Sets
When you need to store unique elements efficiently, like in algorithms or data analysis.
Map (Data Structure)
Stores key–value pairs, where each key is unique and maps to exactly one value.
Why use a Map instead of an array or list?
You can retrieve data directly using a key instead of searching by index.
Key Features of Maps
Common Map Operations
Insert (put), Retrieve (get), Check (containsKey), Delete (remove).
HashMap
Uses a hash table; stores key-value pairs with no guaranteed order.
TreeMap
Uses a binary search tree; stores keys in sorted order.
LinkedHashMap
Uses hashing + linked list; preserves insertion order.
What does re-inserting an existing key do in a Map?
Updates the value of that key.