Chapter 1. Proximity Service

0.0(0)
studied byStudied by 0 people
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/22

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

23 Terms

1
New cards

Functional requirements

  1. Return all nearby business

  2. Business owner can add, update, delete info

  3. User can read business info

2
New cards

Non functional requirements

  1. Low latency

  2. Data privacy for user location

  3. High availability and scalability for peak hours

3
New cards

What is the read write pattern for this system and what’s the solution?

It is read heavy, use leader follower pattern.

4
New cards

What are 2 real life geospacial database?

  1. Postgres with GIS extension

  2. Redis geohash

5
New cards

How do we use geospacial index?

  1. Convert location to index

  2. Get index of nearby area

  3. Search for business belong to the area index

6
New cards

What are 3 types of geospatial index?

  1. Geohash

  2. Quadtree

  3. Google S2

7
New cards

How does geohash work?

  1. Uses base32 representation

  2. 6 chars is around 1 square mile and 10 chars is 1 square meter

  3. You can query business using LIKE ‘9q8zn%’

8
New cards

What are some issues and solutions with geohash?

  1. Business close enough but not match prefix

  2. Not enough business with current prefix

Solution: expand search range to get 8 neighbor cells

9
New cards

How to build quadtree?

  1. Given list of business with coordinates

  2. Create a parent node representing the coordinates it covers

  3. If has more than 100 business, subdivide coordinates into 4 areas, making them the child nodes

<ol><li><p>Given list of business with coordinates</p></li><li><p>Create a parent node representing the coordinates it covers</p></li><li><p>If has more than 100 business, subdivide coordinates into 4 areas, making them the child nodes</p></li></ol>
10
New cards

Quadtree vs Geohash

  1. Quadtree is a tree structure, geohash is stored as table (though it’s index can be trie)

  2. For quadtree, the leaf node siblings have 100+ business, but geohash sublings might not. (Dynamic grid size)

  3. Easier to update geohash index than a tree

11
New cards

How do we store quadtree?

We can store it in memory, because given a billion business, there would be only 1M leaf nodes if each node has 1000 business

12
New cards

Time complexity of building quadtree

  1. For each business, it takes logn time to send it to leaf node, so it’s O(nlogn)

  2. For 200M business it would take a few minutes

13
New cards

How to find nearby business with quadtree

  1. Given coordinate, find the node for it

  2. Traverse up until gathered enough business

14
New cards

How does quadtree handle updates?

  1. Real time: insert business to the tree

  2. Batch job: re-build tree weekly

15
New cards

Operational considerations of quadtree

  1. Keep multiple replica for availability and scalability

  2. Blue green or canary deployment

16
New cards

What is google S2

  1. It maps a location to 1D index based on hilbert curve

  2. 1 points that have similar value are close in 2D space

17
New cards

Advantages of google S2

  1. Great with geofencing because it can cover arbitrary areas

  2. More flexible cell size and precision

18
New cards

What do geospacial index have in common?

  1. Represent 2D space with 1D index

  2. A range in 1D index covers an area in 2D space

  3. Nearby locations have similar 1D index value

19
New cards

Which one should I pick during interview?

Geohash

20
New cards

What to cache?

  1. List of business IDs in the grid

  2. Business metadata

21
New cards

How to cache business ID list in grids?

  1. Select 3 geohash precision

  2. 8 bytes x 200 million x 3 precisions = ~5 GB

22
New cards

How to have international support?

  1. Deploy servers in different regions

  2. DNS geo based routing

  3. Add country specific cache that can scale independnetly

23
New cards

Component diagram

  1. Location service: get nearby

  2. Business service: get business info

  3. Cache: business info + geohash

  4. DB: leader follower setup

<ol><li><p>Location service: get nearby</p></li><li><p>Business service: get business info</p></li><li><p>Cache: business info + geohash</p></li><li><p>DB: leader follower setup</p></li></ol>