Comprehensive Study Guide for Computer Information Processing and Systems

The Theoretical Foundations of Information and Systems

Computer information processing is predominantly associated with the data level of information. Information itself can be understood as a message or communication that reduces the entropy of a system. According to the concepts established by Norbert Wiener, information arises in the content of what we exchange with the external world as we interact with it and adapt to it. However, a common misconception that contradicts Wiener's view is the assertion that information is merely the content of exchange, rather than the result of the interaction and the adaptation process itself. Interpretation of this information is inherently subjective and possesses an individual character, depending largely on the knowledge and relevant context of the receiver. Without relevant knowledge, information remains merely data without inherent value or content. The reduction of information entropy within a system is directly linked to the removal of the receiver's ignorance or lack of knowledge regarding a specific phenomenon.

From a hierarchical perspective, the correct sequence of fundamental terms progresses from signals to signs and finally to data. Signals represent the lowest level of information and are physical quantities, usually with a low energy level, that reflect the state of a system. At the signal level, the actual value of the transmitted information cannot be distinguished, but this is the level where all data transfers are physically realized. Data consists of sequences of signs, which are permissible signals recognized by humans or systems that have been assigned a specific meaning. Signs and their combinations are governed by the syntactic aspect of information, which deals with encoding, writing, and the rules for merging characters. Content-wise, the semantic level of information is where meaning is attributed to data by the receiver or user. Finally, the pragmatic level of information assigns sense to messages and relates them to specific situations, problems, or intended goals, reflecting a highly subjective character.

Technological advancement has led to specific barriers in satisfying information needs. These include cognitive barriers, where an individual is unable to correctly interpret or understand information; dislocation barriers, which involve an inability to navigate the content or location of data sources; and psychological barriers, where a potential user may have moral or mental inhibitions about acquiring specific information. Furthermore, the term "information overload" characterizes a modern situation where the volume and scope of available information are not matched by an adequate development of human ability to process and work with that information.

Information Coding, Algorithms, and Data Representation

Codes are defined in information theory as rules for the unambiguous assignment between characters and their meanings. A code alphabet is fundamentally a set of these elementary characters. One classic example is the ASCII code, which serves as a primary standard for encoding in the personal computer sector. Redundant codes are those that contain extra characters not strictly necessary for expressing the information, often used for error detection or security. For instance, even parity is a method where a bit string is supplemented with an additional bit to ensure the total number of ones in the string is even. Calculations for data capacity are precise; in a memory with a capacity of 1 kB (which is 1024 bytes), one can record up to 8192 bits of data. Similarly, because one hexadecimal digit occupies four bits, a hexadecimal code allows the storage of two hexadecimal digits within one byte.
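The even-parity rule and the capacity arithmetic above can be illustrated with a minimal Python sketch. The function names `add_even_parity`, `check_even_parity`, and `capacity_bits` are illustrative choices rather than part of any standard, and bit strings are represented as text purely for readability.

```python
BITS_PER_BYTE = 8
BYTES_PER_KB = 1024  # binary kilobyte, as used in the text


def add_even_parity(bits: str) -> str:
    """Append one parity bit so that the total number of ones is even."""
    parity = bits.count("1") % 2  # 1 when the data holds an odd number of ones
    return bits + str(parity)


def check_even_parity(word: str) -> bool:
    """Return True when data plus parity bit contain an even number of ones."""
    return word.count("1") % 2 == 0


def capacity_bits(kilobytes: int) -> int:
    """Number of bits that fit into the given number of kilobytes."""
    return kilobytes * BYTES_PER_KB * BITS_PER_BYTE


word = add_even_parity("1011001")      # four ones, so the parity bit is 0
print(word)                            # 10110010
print(check_even_parity(word))         # True
corrupted = "0" + word[1:]             # simulate a single-bit error
print(check_even_parity(corrupted))    # False
print(capacity_bits(1))                # 8192
```

A single parity bit detects any odd number of flipped bits but cannot locate or correct the error, and an even number of flips goes unnoticed.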

The capacity of different bit-depth codes to distinguish system states is determined by power-of-two relationships. A six-bit code can distinguish $2^{6} = 64$ different states. A seven-bit code can distinguish $2^{7} = 128$ states, and an eight-bit code can distinguish $2^{8} = 256$ states. These mathematical principles underpin the formal representation of information required for machine processing. Machine data processing is characterized by formalized information expression, exceptionally high operation speeds, and the handling of large data volumes.
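The power-of-two relationship can be checked in both directions with a short sketch: how many states an n-bit code distinguishes, and how many bits are needed for a given number of states. The helper names below are hypothetical and chosen only for this illustration.

```python
def distinguishable_states(bits: int) -> int:
    """An n-bit code can distinguish 2**n different states."""
    return 2 ** bits


def bits_needed(n_states: int) -> int:
    """Smallest code length (at least 1 bit) that distinguishes n_states states."""
    if n_states < 1:
        raise ValueError("n_states must be positive")
    # (n - 1).bit_length() is an exact integer form of ceil(log2(n)).
    return max(1, (n_states - 1).bit_length())


for n in (6, 7, 8):
    print(n, distinguishable_states(n))   # 6 64, then 7 128, then 8 256
print(bits_needed(100))                   # 7
```

The integer `bit_length` call is used instead of a floating-point logarithm so that exact powers of two are never rounded the wrong way.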

In the realm of logic and programming, George Boole is credited as the creator of mathematical logic, which is essential for computer operations. Algorithms are precisely determined procedures expressed as a sequence of operations. When an algorithm is written in a computer language, it is referred to as a source program. A computer program is more broadly an algorithm recorded in a language that is "understandable" by the computer, effectively a sequence of instructions written in binary code with a direct link to specific hardware. Positional number systems are also foundational, where the value of a character depends specifically on its position within the number string.
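The idea that a character's value depends on its position can be made concrete with a small sketch that evaluates a digit string in a given base. The function name `positional_value` and the digit table are assumptions made for this example; Python's built-in `int(text, base)` performs the same conversion.

```python
DIGITS = "0123456789ABCDEF"


def positional_value(number: str, base: int) -> int:
    """Evaluate a digit string in a positional system with the given base (2..16)."""
    value = 0
    for ch in number.upper():
        digit = DIGITS.index(ch)          # raises ValueError for unknown characters
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        # Moving one position to the left multiplies the weight by the base.
        value = value * base + digit
    return value


print(positional_value("1011", 2))    # 11
print(positional_value("777", 8))     # 511
print(positional_value("FF", 16))     # 255
```

The same digit "7" contributes 7, 56, or 448 to the octal number 777 depending on where it stands, which is exactly the positional principle.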

Computer Architecture and Hardware Components

The fundamental architecture of a computer, often following the von Neumann schema, includes basic blocks like the controller (control unit) and the arithmetic-logic unit (ALU). When these two components are integrated into a single circuit, they form a microprocessor. The first commercial microprocessor, the Intel 4004, was introduced in 1971, following development work that began at the end of the 1960s. Modern processors, such as the Intel Pentium IV, contain tens of millions of transistors (over 100 million in some variants). Processor performance is often characterized by "word length," which is the number of bits the processor can process simultaneously. There are different architectural philosophies, such as RISC (Reduced Instruction Set Computer), which features fewer instructions compared to CISC (Complex Instruction Set Computer). Executing a single processor instruction usually requires anywhere from a few to tens of machine cycles. Furthermore, personal computers utilize a system bus consisting of address, data, and control buses; the control bus is specifically responsible for synchronization and cooperation between different computer parts. A local bus may be used to divide the system bus into a fast "local" section and a slower "external" section.

Memory types are categorized by their functionality and volatility. RAM (Random Access Memory) is volatile memory intended for continuous reading and writing; if power is disconnected, data is lost. Static RAM typically has a shorter access time than dynamic RAM. Cache memory is a form of fast static RAM placed between the processor and the main memory to increase the overall speed of operations. ROM (Read Only Memory) is non-volatile and intended only for reading. Specific types include PROM (Programmable ROM), which allows for a single write and repeated reads. The ROM-BIOS is a fundamental computer program stored in ROM that runs every time the computer is turned on, handling settings and initial tests. Another hardware feature is the interrupt vector, which points to the memory address where the interrupt service routine is stored.

Peripheral connectivity is managed through various interfaces. The USB (Universal Serial Bus) is a serial interface that allows for the connection of peripherals while the computer is running (hot-swapping). The SCSI interface allows for the connection of fast peripherals in a cascade, supporting up to eight devices. Parallel interfaces allow for the simultaneous transmission of multiple data signals, whereas serial interfaces transmit bits one after another. Printing quality is often measured in DPI (dots per inch); a lower DPI value results in lower print quality. For external communication, a modem serves as a device that transforms digital signals for transmission over analog lines.

Data Storage and Media Technologies

Magnetic and optical media provide long-term data storage. A hard disk uses a thin ferromagnetic layer for recording and reading data based on magnetization. The surface is logically divided into tracks and sectors. Under the Microsoft operating system, a single sector can record 571 bytes of unformatted data, of which 512 bytes typically remain for user data after formatting. An allocation unit, or cluster, is the smallest logical data unit on a disk. The File Allocation Table (FAT) is used for the physical addressing and organization of files on the disk. Disk access time is calculated as the sum of the head seek time and the rotational latency of the media. For increased performance or reliability, a RAID (Redundant Array of Independent Disks) can be used, which appears to the user as a single logical disk.
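The access-time formula and the role of the cluster as the smallest allocation unit can be sketched as follows. The 9 ms seek time, the 7200 rpm spindle speed, the 512-byte formatted sector, and the eight sectors per cluster are assumed example values, not figures from the text, and the function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumed formatted sector size for this example


def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time of half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time as the sum of head seek time and rotational latency."""
    return seek_ms + rotational_latency_ms(rpm)


def clusters_needed(file_bytes: int, sectors_per_cluster: int) -> int:
    """Whole clusters occupied by a file; a partly used cluster still counts."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    return -(-file_bytes // cluster_bytes)   # ceiling division on integers


print(round(rotational_latency_ms(7200), 2))   # 4.17
print(round(access_time_ms(9.0, 7200), 2))     # 13.17
print(clusters_needed(4097, 8))                # 2
```

The last line shows why small clusters waste less space: a file one byte larger than a 4096-byte cluster already occupies two clusters.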

Optical storage includes CDs and DVDs, where data is recorded in a spiral track. Unlike magnetic media, optical discs are insensitive to magnetic field changes. These discs generally operate based on Constant Linear Velocity (CLV). Blu-ray Disc technology achieves higher recording density by employing a laser with a shorter wavelength. Previous generations of computers, such as the second generation, were characterized by the use of transistors and operated at speeds on the order of 1000 operations/s, while the first generation utilized vacuum tubes and performed approximately 100 operations/s.
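Constant Linear Velocity, mentioned above for optical discs, means the track passes under the laser at a fixed speed, so the disc must spin faster near the centre than near the edge. The sketch below shows this relationship; the linear velocity of 1.3 m/s and the radii of 25 mm and 58 mm are assumed values roughly typical of an audio CD, and `rpm_for_clv` is a hypothetical helper name.

```python
import math


def rpm_for_clv(linear_velocity_m_s: float, radius_m: float) -> float:
    """Rotation speed (rev/min) needed to keep a constant linear velocity."""
    circumference_m = 2.0 * math.pi * radius_m
    return 60.0 * linear_velocity_m_s / circumference_m


inner = rpm_for_clv(1.3, 0.025)   # near the start of the spiral track
outer = rpm_for_clv(1.3, 0.058)   # near the outer edge
print(round(inner), round(outer)) # roughly 497 and 214 rev/min
```

The drive therefore slows the spindle continuously as the read head moves outward along the spiral.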

Networking Structures and Topologies

Computer networks are classified by their geographic scope and management. LANs (Local Area Networks) operate in limited geographic areas and usually have a single owner or administrator. MANs (Metropolitan Area Networks) are characterized by fast backbone networks connecting end users. WANs (Wide Area Networks) like the Internet are global. Network topologies determine how nodes are arranged: a bus topology allows signals to travel in both directions; a ring topology is vulnerable because the failure of a single link disables the entire network; and a star topology becomes non-functional if the central active device (like a hub or switch) fails. Structured cabling is a unified system used for both telephone and data transmissions.

Transmission media include twisted pair, which is inexpensive but susceptible to mechanical damage, and optical cables, which transmit light signals through glass fibers. Single-mode optical fibers offer a longer reach for signals compared to multi-mode fibers. Terrestrial microwave links require a direct line of sight between the transmitter and receiver for data transfer. Modulation techniques for these signals include Amplitude Modulation (AM), where the amplitude varies while the frequency remains constant, and Frequency Modulation (FM), which involves frequency changes at a constant amplitude. WDM (Wavelength Division Multiplexing) allows the transmission of several independent signals over a single optical fiber by assigning each signal its own wavelength.
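The difference between AM and FM described above can be seen by generating a few samples of each. This is a minimal sketch under assumed parameters: a 5 Hz test tone as the message, a 100 Hz carrier, an 8000 Hz sample rate, a modulation depth of 0.5, and a frequency deviation of 20 Hz. All names are hypothetical.

```python
import math


def message(t: float) -> float:
    """Hypothetical 5 Hz test tone standing in for the transmitted signal."""
    return math.sin(2 * math.pi * 5.0 * t)


def am_samples(n: int, sample_rate: float, carrier_hz: float, depth: float) -> list:
    """AM: the amplitude follows the message, the carrier frequency stays fixed."""
    out = []
    for i in range(n):
        t = i / sample_rate
        envelope = 1.0 + depth * message(t)
        out.append(envelope * math.cos(2 * math.pi * carrier_hz * t))
    return out


def fm_samples(n: int, sample_rate: float, carrier_hz: float, deviation_hz: float) -> list:
    """FM: the instantaneous frequency follows the message, the amplitude stays 1."""
    out, phase = [], 0.0
    for i in range(n):
        t = i / sample_rate
        out.append(math.cos(phase))
        instantaneous_hz = carrier_hz + deviation_hz * message(t)
        phase += 2 * math.pi * instantaneous_hz / sample_rate
    return out


am = am_samples(8000, 8000.0, 100.0, 0.5)
fm = fm_samples(8000, 8000.0, 100.0, 20.0)
print(max(abs(s) for s in am))   # peaks near 1.5: the envelope carries the message
print(max(abs(s) for s in fm))   # never exceeds 1.0: the amplitude is constant
```

In the FM case the message is recoverable only from how quickly the phase advances, which is why the amplitude can stay constant.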

Communication protocols like TCP/IP are structured into layers; TCP/IP specifically has four communication layers, whereas the ISO/OSI model has seven. The physical layer defines the physical realization of signal transmission across the medium. Network interconnection devices include repeaters, which connect networks using identical technologies and speeds; routers, which compare packet addresses to direct traffic; and gateways, which can connect networks regardless of differing technologies or speeds. A firewall is a set of measures designed to protect a network from unauthorized external access.

The Internet, Databases, and the Information Society

The Internet is a global decentralized network that is not centrally controlled or coordinated. It is funded through the commercialization of network infrastructure rather than direct state contributions. Internet services are largely platform-independent and standardized. The DNS (Domain Name System) provides a hierarchical addressing structure, while URLs (Uniform Resource Locators) are used to address specific documents. Common services include FTP (File Transfer Protocol) for file exchange, TELNET for connecting to remote computers (utilizing the host computer's capacity), and WWW, which is based on hypertext documents. Full-text search engine databases allow users to search for specific text strings. When communicating via geostationary satellites, which orbit at the same angular velocity as Earth, there is a signal delay often cited as roughly 0.2 seconds or up to 2 seconds.
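The lower bound on the geostationary delay follows directly from the orbit height and the speed of light. The sketch below assumes a geostationary altitude of about 35,786 km and a ground station directly beneath the satellite; real links add slant range and processing time, so measured delays are higher. The constant and function names are chosen for this example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786.0   # assumed height above the equator


def one_hop_delay_s(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Minimum ground -> satellite -> ground propagation delay in seconds."""
    return 2.0 * altitude_km / SPEED_OF_LIGHT_KM_S


def request_reply_delay_s(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """A request and its reply each cross the satellite hop once."""
    return 2.0 * one_hop_delay_s(altitude_km)


print(round(one_hop_delay_s(), 3))        # 0.239
print(round(request_reply_delay_s(), 3))  # 0.477
```

This puts the physical minimum in the tenths-of-a-second range, consistent with the lower figure quoted above.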

Data management involves structured entities and attributes. An entity is a significant object of interest, and an attribute is a significant property of that entity. A data record (data sentence) is a reflection of a single occurrence of an entity. A primary key is used to uniquely distinguish between different occurrences of the same entity type within a data file, which can contain only one such primary key. Databases (data foundations) represent the total collection of data and data files within a system.

Technologies like ADSL allow for high-speed Internet access over standard telephone lines, while Bluetooth provides short-range wireless radio connections between digital devices.

Encryption methods include symmetric cryptography, where both parties use the same key, and asymmetric cryptography, which uses a public and private key pair and forms the basis for electronic signatures. These signatures provide "non-repudiation," ensuring that a sender cannot later deny having sent a signed message; a recipient's denial of receipt can be ruled out in the same way only if the recipient returns a signed acknowledgment.
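How a public and private key pair supports an electronic signature can be shown with a deliberately tiny RSA example. This is a teaching sketch only: the modulus 3233 (61 × 53) with exponents 17 and 2753 is a well-known textbook toy key and offers no security, and reducing a SHA-256 hash modulo such a small number is an assumption made purely to keep the arithmetic visible. Real systems use vetted cryptographic libraries and keys thousands of bits long.

```python
import hashlib

# Toy key: p = 61, q = 53, so N = 3233 and E * D = 1 (mod 3120).
N, E, D = 3233, 17, 2753   # (N, E) is public, D is private


def digest(message: str) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


sig = sign("pay 100 to Alice")
print(verify("pay 100 to Alice", sig))             # True
print(verify("pay 100 to Alice", (sig + 1) % N))   # False: altered signature
```

Because only the private key can produce a value that the public key verifies, a valid signature ties the message to its signer, which is the basis of the non-repudiation property described above.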

In a broader social context, the Information Society is characterized by information itself becoming a commodity. The shift from mass production to customized production in industries like automotive manufacturing is a direct result of IT capabilities. Technological progress is often mapped through Kondratieff cycles—long-term economic cycles linked to stages of fundamental technological innovation. Legal and criminal responsibility for data published on the Internet remains an issue that is not yet satisfactorily resolved on a global scale. Additionally, the use of modern IT brings inherent risks regarding the potential misuse of personal data.

Questions & Discussion

What is ROM-BIOS? ROM-BIOS is the Basic Input/Output System. It is the fundamental program of a PC, stored in ROM memory, used for testing and configuring the computer upon startup and providing instructions to load the operating system.

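What is an IP address? An IP address is a unique address of a computer within the Internet network. A Class B IP address specifically allows for the addressing of $2^{16}$ (65,536) connected devices; in practice 65,534 of these are usable for hosts, because the all-zeros and all-ones host addresses are reserved for the network and broadcast addresses. Tables containing the system of assigned IP addresses must be available on the relevant routers.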
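The Class B figure can be confirmed with Python's standard `ipaddress` module. The network 172.16.0.0/16 is used here only as an example of a Class B network, and `is_class_b` is a hypothetical helper based on the classful rule that Class B addresses have a first octet between 128 and 191.

```python
import ipaddress


def is_class_b(address: str) -> bool:
    """Classful rule: Class B addresses have a first octet of 128..191."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    return 128 <= first_octet <= 191


network = ipaddress.ip_network("172.16.0.0/16")   # 16 network bits, 16 host bits
print(network.num_addresses)                      # 65536
print(network.num_addresses - 2)                  # 65534 usable host addresses
print(is_class_b("172.16.5.9"))                   # True
print(ipaddress.ip_address("172.16.5.9") in network)   # True
```

The two subtracted addresses are the network address and the broadcast address of the block.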

How does a distributed computer system appear to a user? It behaves and presents itself to the user as a single computer.

What are the attributes of the Information Age? One significant attribute is the dependence between economic dynamics and the volume of transmitted data.

What is the function of a Certification Authority? A Certification Authority issues certificates and is obliged to publish lists of revoked or invalidated certificates to maintain the security and trust of electronic signatures.

What is the delay for geostationary satellite communication? The minimal signal delay is on the order of tenths of a second (often rounded to 2 s in the practical exam contexts provided in the transcript).