Hashing

0.0(0)

Studied by 0 people

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Card Sorting

1/22

There's no tags or description

Looks like no tags are added yet.

Study Analytics

Name	Mastery	Learn	Test	Matching	Spaced

No study sessions yet.

23 Terms

New cards

Hashing Function

Converts a key (often a string) to an integer

Integer is used for indexing a storage array

The same deterministic function is used for storing and retrieving data

Index integer is not just restricted to 0 to 25

Hence, data can be spread across a much longer array

New cards

How do we Deal with Collisions?

chaining

New cards

Hash Table

data structure that is designed to be fast to work with

New cards

Hash Function

takes the key of an element to generate a hash code

maps keys to array indices

New cards

Hash Code

says what bucket the element belongs to, so we can now find it assuming no collisions

New cards

Buckets

array indices

where the elements are stored

New cards

What is the Unique Part of an Element?

the key

New cards

How does Hashing Work? (diagram)

New cards

Increasing Range of Indexes

For more objects (more keys), need longer array to minimise collisions
Hashing function needs to produce a large range of integers
Use more information from each key
- e.g. all letters, and perhaps length
But, just adding these together, a 15-letter key will rarely give an index, say, over 300

New cards

Designing a Good Hashing Function

minimise collisions
get a wide range of index values

New cards

Minimising Collisions

a wide range of index values and
unform hashing (each key is equally likely to map to an array location)

New cards

Getting a Wide Range of Index Values

use multiplication as well as addition

New cards

Dealing with Large Indices

If we want to use an array of length size, we can use:

index = rawIndex%size:

where % is the integer remainder operation (modulo)

e.g. to store 50 or so words, choose size = 100:
- key = "pace", rawIndex = 71671
  - index = 71671%100 = 71
- key = "cart", rawIndex = 345438
  - index = 345438%100 = 38

New cards

The Hashing Pitfall

Note that hash retrieval has two stages:

Use hashing function to compute index
Search chain in array element to see if there is a match

With a good hashing function and a large array, Stage 2 can be close to O(1), but, Stage 1 should also be fast

New cards

Real-World Hashing Functions…

avoid slow multiplication and division operations
use fast logical shift and ‘OR’ operations

New cards

Multiplication vs Bit Shifting

Multiplication is a slow operation
Where the multiplier is a power of 2, we can left shift the bits in the multiplicand (X) by the power of 2 instead, and get exactly the same answer

<ul><li><p>Multiplication is a slow operation</p></li><li><p>Where the multiplier is a power of 2, we can left shift the bits in the multiplicand (X) by the power of 2 instead, and get exactly the same answer</p></li></ul><p></p>

New cards

Why do we use Key-Value Pairs?

to distinguish between key-value pairs with the same hash

New cards

Key

unique input to hash function that is translated by hash function

New cards

Value

what we put in the bucket associated with the key

New cards

Hash Table ADT

Stores key-value pairs
Keys are unique and a handle to the associated value
Operations
- put(t, k) stores k in the table t
- get(t, k) returns if k is in t else nil
- delete(t, k) removes k from t

New cards

Hash Table ADT Pseudocode

//chains are lists; A is hash table array

put(t, <k,v>) {
	chain = t.A[hash(k)]
	//C doesn't have to be sorted
	add(chain, <k,v>)
}

get(t, k) {
	chain = t.A[hash(k)]
	return search(chain,k)
}

delete(t, k) {
	chain = t.A[hash(k)]
	remove(chain,k)
}

New cards

What Happens if Chains Grow too Long