Revision Guide: Search Engines, Networks, and Encryption

Fundamentals of the Internet and the World Wide Web

The internet is a vast collection of interconnected networks. It is the global infrastructure that allows these otherwise separate networks to communicate with one another. The World Wide Web is an information system that runs on top of this infrastructure. Accessing the web requires a web browser, a software application that lets people search for and retrieve web pages. Web pages are written in a markup language known as HTML (HyperText Markup Language). Google is currently the most visited website in the world.

Search Engine Mechanics and Web Crawling

A search engine is essentially an index that a user consults to find information. It is a common misconception that a search engine searches the entire live internet in real time when a query is entered; in reality, it searches its own pre-compiled index. To build and update this index, search engines use programs known as spiders (also called web crawlers). Spiders follow links from page to page and examine the words written on each website so that the site can be categorized and found. Search engines can also match synonyms of the words the user has typed, so relevant results are returned even when there is no exact keyword match.
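The idea of searching a pre-compiled index rather than the live web can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page addresses and text in PAGES are made up, and real search engines use far more sophisticated indexing and ranking.

```python
from collections import defaultdict

# Hypothetical pages that a spider might have collected (address -> page text).
PAGES = {
    "example.org/cats": "cats are small furry pets",
    "example.org/dogs": "dogs are loyal pets",
    "example.org/fibre": "fibre optic cables carry data as light",
}

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages containing every word of the query, using only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# The index is built once, in advance; each query only looks things up in it.
INDEX = build_index(PAGES)
print(sorted(search(INDEX, "pets")))
print(sorted(search(INDEX, "loyal pets")))
```

Note that search never reads PAGES: once the index exists, answering a query does not require visiting any website.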

Physical Network Infrastructure and Hardware

The main physical links that connect the global network are fibre optic cables. These cables are essential to modern telecommunications because they carry data as pulses of light, which travel through the glass at roughly two-thirds of the speed of light in a vacuum. On a local level, connecting an individual device to a wider network requires a piece of hardware known as a router. The router forwards data packets between the local device and the internet.

Domain Names, IP Addressing, and the DNS

Every device on a network has an IP address, where IP stands for Internet Protocol. An IP address is a numerical identifier for a device; for example, a user can find their own current address by typing "what is my IP" into Google. Because these numbers are difficult for humans to memorize, the Domain Name System (DNS) is used to translate domain names into IP addresses. A domain name is a human-readable name assigned to a website and used in place of its IP address. We use domain names rather than IP addresses because a name is much easier to remember than a string of numbers. A URL (Uniform Resource Locator) is the full address of a specific resource on the web. The last part of the domain name in a URL is the Top Level Domain (TLD). Examples of top-level domains include .com, .net, and .co.
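The relationship between a URL, its domain name, its top-level domain, and an IP address can be sketched as follows. This is a minimal illustration: TOY_DNS is a hand-written table standing in for the real DNS, and the names and addresses in it are assumptions chosen from ranges reserved for documentation, not real records.

```python
from urllib.parse import urlparse

# Toy stand-in for DNS: a hand-written lookup table, not real DNS data.
# The 192.0.2.x and 198.51.100.x ranges are reserved for documentation examples.
TOY_DNS = {
    "www.example.com": "192.0.2.10",
    "shop.example.net": "198.51.100.7",
}

def describe(url):
    """Split a URL into its domain name, top-level domain, and looked-up IP address."""
    host = urlparse(url).hostname          # the domain name part of the URL
    tld = "." + host.rsplit(".", 1)[-1]    # everything after the final dot
    ip = TOY_DNS.get(host)                 # None if the name is not in the table
    return host, tld, ip

print(describe("https://www.example.com/index.html"))
# prints ('www.example.com', '.com', '192.0.2.10')
```

A real lookup would ask a DNS server instead of a dictionary, but the principle is the same: the name goes in and the numerical address comes out.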

Network Topologies

Network topology refers to the arrangement or layout of the various elements (links, nodes, etc.) of a computer network. Three primary types of network topology are discussed: the bus network, the ring network, and the star network. Each of these structures has distinct characteristics. A bus network has a single central cable to which all devices are connected; its advantage is low cost, and its disadvantage is that a failure of the central cable takes down the whole network. A ring network connects devices in a circle around which data travels in one direction, while a star network has a central hub or switch to which all other devices are connected. The star topology is the configuration most commonly used in homes. Note that .com, .net, and .co are sometimes incorrectly grouped with topologies; they are top-level domains, not topologies.
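The three layouts can be sketched as lists of links between named nodes. The device names, the node names "cable" and "switch", and the connectivity check below are all assumptions made for illustration; the check treats links as two-way, so it is only applied to the bus and star cases here.

```python
def bus(devices):
    """Every device attaches to one shared cable, modelled as a node named 'cable'."""
    return [(d, "cable") for d in devices]

def ring(devices):
    """Each device links to the next one, and the last links back to the first."""
    return [(devices[i], devices[(i + 1) % len(devices)]) for i in range(len(devices))]

def star(devices, hub="switch"):
    """Every device links only to the central hub or switch."""
    return [(d, hub) for d in devices]

def connected_after_failure(links, failed):
    """True if all surviving nodes can still reach each other once `failed` is removed."""
    alive = [(a, b) for a, b in links if failed not in (a, b)]
    nodes = {n for link in links for n in link} - {failed}
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in alive:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == nodes

DEVICES = ["pc", "laptop", "phone", "printer"]
print(ring(DEVICES))
print(connected_after_failure(star(DEVICES), "phone"))   # one device fails
print(connected_after_failure(star(DEVICES), "switch"))  # the central switch fails
print(connected_after_failure(bus(DEVICES), "cable"))    # the shared cable fails
```

In this model a star network survives the loss of any single device but not the loss of its central switch, and a bus network does not survive the loss of its shared cable.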

Encryption, Ciphers, and Cybersecurity

Encryption is the process of protecting information by transforming it into an unreadable format. Plaintext is a message in its original, human-readable form. Once the message has been transformed, it becomes encrypted text, also called ciphertext. Decrypting or deciphering is the process of converting encrypted text back into its original plaintext so that it can be understood once more. One of the most famous historical examples is the Caesar Cipher, a substitution cipher in which every letter is shifted by a fixed number of positions in the alphabet. In the modern digital world, the Caesar Cipher has significant limitations: with a 26-letter alphabet there are only 25 possible shifts to try, so it is easily broken.
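A minimal Python sketch of the Caesar Cipher is shown below. It assumes messages use the 26 uppercase English letters and leaves spaces and punctuation unchanged; the function name caesar and the sample message are chosen purely for illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar(text, shift):
    """Shift every letter by `shift` positions, wrapping around from Z back to A."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # spaces and punctuation are left as they are
    return "".join(result)

secret = caesar("MEET AT NOON", 3)   # encrypt with a key (shift) of 3
print(secret)                        # prints PHHW DW QRRQ
print(caesar(secret, -3))            # decrypt by shifting back: MEET AT NOON
```

Decryption is simply encryption with the opposite shift, which is why anyone who knows or guesses the key can read the message.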

One method used to break encryption is a brute-force attack, which involves systematically checking all possible keys until the correct one is found. The security of an encryption system is directly affected by the length of the key; longer keys increase security because they exponentially increase the number of possible combinations a brute-force attack would need to test. During WWII, codebreakers leveraged the knowledge of common phrases within messages to help decipher codes. In terms of web security, HTTPS is considered more secure than HTTP due to its use of encryption protocols to protect data in transit.
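A brute-force attack on a Caesar-style shift cipher, guided by a known phrase, can be sketched as follows. The message, the guessed word "WEATHER", and the function names are hypothetical examples; the last lines simply count how many keys exist for a few key lengths measured in bits.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_text(text, shift):
    """Shift each letter by `shift` positions, leaving other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )

def brute_force(ciphertext, known_phrase):
    """Try every possible key; return (key, plaintext) for the first result
    that contains the phrase we expect to see, or None if no key works."""
    for key in range(26):
        candidate = shift_text(ciphertext, -key)
        if known_phrase in candidate:
            return key, candidate
    return None

# Hypothetical intercepted message, encrypted here with a key of 3.
intercepted = shift_text("WEATHER REPORT CLEAR SKIES", 3)
print(brute_force(intercepted, "WEATHER"))

# Each extra bit of key length doubles the number of keys an attacker must try.
for bits in (8, 16, 128):
    print(bits, "bit key:", 2 ** bits, "possible keys")
```

With only 26 keys the loop finishes instantly, whereas the key counts printed at the end show why long keys make the same approach impractical.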

Regional Context and Revision Details

This study material was compiled on Tuesday, 28 June 2022 at 3:09 PM as part of the L67 Revision. A specific point of inquiry within these notes involves the legal and privacy landscape of the United Arab Emirates (UAE), specifically asking whether the internet activity of individuals is monitored within the country. This highlights the importance of understanding the ethical and practical implications of network activity and state surveillance.