Internet reading

What is SEO -

The process of improving the position at which your website appears in the organic search results of sites like Google.

Spiders - software that crawls the web to identify the actual copy written on the page, along with things like the use of keywords and phrases.
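
A rough sketch of the "identify the copy and keywords" part of a spider's job, using only the Python standard library. The names CopyExtractor, keyword_counts and SAMPLE_PAGE are made up for illustration, and the page is a hard-coded string rather than something fetched from the web. Real spiders are far more involved; this only shows the idea of pulling text out of a page and counting words.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class CopyExtractor(HTMLParser):
    """Collects the text of a page, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style block.
        if self._skip_depth == 0:
            self.chunks.append(data)


def keyword_counts(html):
    """Return a Counter of lowercase words found in the page text."""
    parser = CopyExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    return Counter(words)


# Hypothetical page, hard-coded so the example needs no network access.
SAMPLE_PAGE = """
<html><head><title>Cheap bikes</title><style>p {color: red}</style></head>
<body><h1>Bikes for sale</h1><p>We sell road bikes and mountain bikes.</p></body></html>
"""

counts = keyword_counts(SAMPLE_PAGE)
print(counts["bikes"])  # 4
```

The style rule is ignored, so words like "color" never show up in the counts.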

PageRank - shows how trustworthy your site is

Determines the usefulness of a page by how many pages link to it

Google bots - check how many pages link to a website
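
The link-counting idea can be sketched in a few lines. The link map LINKS and the function names below are invented for illustration. inbound_counts just counts links pointing at each page, as in the notes. simple_pagerank is a simplified version of the published PageRank idea (a page passes its score on to the pages it links to, with a damping factor of 0.85), not Google's actual production ranking.

```python
# Hypothetical link map: page -> pages it links to.
# Assumption: every linked page also appears as a key.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}


def inbound_counts(links):
    """Count how many pages link to each page."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def simple_pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: repeatedly share each page's score with its targets."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for page in pages:
                    new_rank[page] += damping * rank[source] / n
        rank = new_rank
    return rank


inbound = inbound_counts(LINKS)
ranks = simple_pagerank(LINKS)
best = max(ranks, key=ranks.get)
print(inbound["c.example"], best)  # 3 c.example
```

In this made-up example c.example has the most inbound links and also ends up with the highest score, while d.example, which nothing links to, keeps only the base score.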

MapReduce - uses clusters of computers working in parallel to process data faster than a single supercomputer could

Beware of link farms (groups of sites that link to each other only to push up their rankings)

Big data - large amounts of information that grow at increasing rates

Big data is managed by:

  • Data storage

  • Data mining

  • Data analytics

  • Data visualisation

Process -

  • Crawler goes to a website and collects data

  • Crawler sends the data to a cluster

  • Cluster sends multiple copies of the data to its nodes

  • Master node gives the smaller (worker) nodes parts of the list to process

  • Nodes shuffle the data

  • Duplicates get removed

  • Data gets sent to a central disk to update the Google search engine
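
The process above can be sketched as a toy MapReduce run in plain Python. Everything here is an assumption made for illustration: the sample data in CRAWLED, the function names, and the choice to build a word-to-pages index. The "nodes" are just list chunks processed one after another on one machine, and the step where the cluster keeps multiple copies of the data is left out.

```python
from collections import defaultdict

# Hypothetical crawler output: (page, text) records, one of them a duplicate.
CRAWLED = [
    ("a.example", "bikes for sale"),
    ("b.example", "road bikes"),
    ("a.example", "bikes for sale"),
]


def split_for_nodes(records, node_count):
    """Master node step: deal the list out into one chunk per worker node."""
    return [records[i::node_count] for i in range(node_count)]


def map_phase(chunk):
    """Worker step: turn each record into (word, page) pairs."""
    pairs = []
    for url, text in chunk:
        for word in text.split():
            pairs.append((word, url))
    return pairs


def shuffle_phase(mapped_chunks):
    """Shuffle step: bring all pairs for the same word together."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for word, url in pairs:
            grouped[word].append(url)
    return grouped


def reduce_phase(grouped):
    """Reduce step: remove duplicates and produce the final word -> pages index."""
    return {word: sorted(set(urls)) for word, urls in grouped.items()}


def build_index(records, node_count=2):
    chunks = split_for_nodes(records, node_count)
    mapped = [map_phase(chunk) for chunk in chunks]
    grouped = shuffle_phase(mapped)
    return reduce_phase(grouped)


index = build_index(CRAWLED)
print(index["bikes"])  # ['a.example', 'b.example']
```

The duplicate a.example record is seen twice during the map step but appears only once in the final index, which matches the "duplicates get removed" step.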