Chapter 1. Proximity Service

studied byStudied by 0 people
0.0(0)
learn
LearnA personalized and smart learning plan
exam
Practice TestTake a test on your terms and definitions
spaced repetition
Spaced RepetitionScientifically backed study method
heart puzzle
Matching GameHow quick can you match all your cards?
flashcards
FlashcardsStudy terms and definitions

1 / 22

encourage image

There's no tags or description

Looks like no one added any tags here yet for you.

23 Terms

1

Functional requirements

  1. Return all nearby business

  2. Business owner can add, update, delete info

  3. User can read business info

New cards
2

Non functional requirements

  1. Low latency

  2. Data privacy for user location

  3. High availability and scalability for peak hours

New cards
3

What is the read write pattern for this system and what’s the solution?

It is read heavy, use leader follower pattern.

New cards
4

What are 2 real life geospacial database?

  1. Postgres with GIS extension

  2. Redis geohash

New cards
5

How do we use geospacial index?

  1. Convert location to index

  2. Get index of nearby area

  3. Search for business belong to the area index

New cards
6

What are 3 types of geospatial index?

  1. Geohash

  2. Quadtree

  3. Google S2

New cards
7

How does geohash work?

  1. Uses base32 representation

  2. 6 chars is around 1 square mile and 10 chars is 1 square meter

  3. You can query business using LIKE ‘9q8zn%’

New cards
8

What are some issues and solutions with geohash?

  1. Business close enough but not match prefix

  2. Not enough business with current prefix

Solution: expand search range to get 8 neighbor cells

New cards
9

How to build quadtree?

  1. Given list of business with coordinates

  2. Create a parent node representing the coordinates it covers

  3. If has more than 100 business, subdivide coordinates into 4 areas, making them the child nodes

<ol><li><p>Given list of business with coordinates</p></li><li><p>Create a parent node representing the coordinates it covers</p></li><li><p>If has more than 100 business, subdivide coordinates into 4 areas, making them the child nodes</p></li></ol>
New cards
10

Quadtree vs Geohash

  1. Quadtree is a tree structure, geohash is stored as table (though it’s index can be trie)

  2. For quadtree, the leaf node siblings have 100+ business, but geohash sublings might not. (Dynamic grid size)

  3. Easier to update geohash index than a tree

New cards
11

How do we store quadtree?

We can store it in memory, because given a billion business, there would be only 1M leaf nodes if each node has 1000 business

New cards
12

Time complexity of building quadtree

  1. For each business, it takes logn time to send it to leaf node, so it’s O(nlogn)

  2. For 200M business it would take a few minutes

New cards
13

How to find nearby business with quadtree

  1. Given coordinate, find the node for it

  2. Traverse up until gathered enough business

New cards
14

How does quadtree handle updates?

  1. Real time: insert business to the tree

  2. Batch job: re-build tree weekly

New cards
15

Operational considerations of quadtree

  1. Keep multiple replica for availability and scalability

  2. Blue green or canary deployment

New cards
16

What is google S2

  1. It maps a location to 1D index based on hilbert curve

  2. 1 points that have similar value are close in 2D space

New cards
17

Advantages of google S2

  1. Great with geofencing because it can cover arbitrary areas

  2. More flexible cell size and precision

New cards
18

What do geospacial index have in common?

  1. Represent 2D space with 1D index

  2. A range in 1D index covers an area in 2D space

  3. Nearby locations have similar 1D index value

New cards
19

Which one should I pick during interview?

Geohash

New cards
20

What to cache?

  1. List of business IDs in the grid

  2. Business metadata

New cards
21

How to cache business ID list in grids?

  1. Select 3 geohash precision

  2. 8 bytes x 200 million x 3 precisions = ~5 GB

New cards
22

How to have international support?

  1. Deploy servers in different regions

  2. DNS geo based routing

  3. Add country specific cache that can scale independnetly

New cards
23

Component diagram

  1. Location service: get nearby

  2. Business service: get business info

  3. Cache: business info + geohash

  4. DB: leader follower setup

<ol><li><p>Location service: get nearby</p></li><li><p>Business service: get business info</p></li><li><p>Cache: business info + geohash</p></li><li><p>DB: leader follower setup</p></li></ol>
New cards
robot