AP Computer Science Principles Vocabulary
Unit 1: Digital Information
Binary: A foundational concept representing information using two distinct states, typically denoted as 0 and 1. This is the basis for all digital computation.
Bit: (Binary Digit) The smallest unit of data in computing, representing a single binary value (0 or 1). Bits are the building blocks of all digital data.
Byte: A standard unit of digital information that consists of 8 bits. Bytes are commonly used to measure the size of files and data.
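The three terms above can be illustrated with a short Python sketch (the variable names are illustrative, not from the AP exam):

```python
# One byte written out as its 8 individual bits
value = 0b11001010      # binary literal: 8 bits = 1 byte

print(value)                  # the same pattern as a decimal number: 202
print(format(value, '08b'))   # back to the 8-bit pattern: '11001010'
print(value.bit_length())     # minimum bits needed to represent 202: 8
```

Each `0` or `1` in the literal is one bit; the eight of them together form one byte.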
Overflow Error: A critical error that occurs when the result of an arithmetic calculation exceeds the maximum value that the data type can store. This can lead to incorrect results or program crashes. For example, if an 8-bit system tries to represent the number 256 (which requires 9 bits), an overflow error will occur.
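A minimal Python sketch of the 8-bit example above (Python integers do not actually overflow, so the masking here simulates what an 8-bit register would do; `add_8bit` is a hypothetical helper, not a real library function):

```python
def add_8bit(a, b):
    # Keep only the low 8 bits, as an 8-bit register would:
    # any result of 256 or more wraps around.
    return (a + b) & 0xFF

print(add_8bit(255, 1))    # 255 + 1 = 256 needs 9 bits, so it wraps to 0
print(add_8bit(200, 100))  # 300 wraps to 300 - 256 = 44
```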
Round-off Error: An error that arises when a decimal number is approximated due to the limited precision of computer representations. This is particularly common in floating-point arithmetic and can accumulate over multiple calculations, leading to significant discrepancies.
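Round-off error is easy to see with standard floating-point arithmetic in Python:

```python
# 0.1 and 0.2 cannot be represented exactly in binary floating point,
# so their sum is not exactly 0.3.
total = 0.1 + 0.2

print(total == 0.3)               # False: the sum is 0.30000000000000004
print(abs(total - 0.3) < 1e-9)    # True: compare with a tolerance instead
```

This is why equality comparisons on floating-point values should generally use a tolerance rather than `==`.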
Analog Data: Data represented by continuous, smoothly varying values. Examples include temperature, sound, and voltage. Analog data is continuous in nature, meaning it can take on an infinite number of values within a range.
Digital Data: Data represented by discrete, separate values, typically in binary format. Digital data is used in computers because it is easier to store, process, and transmit accurately. Examples include text, images, and audio files.
Sampling: The process of converting analog data into digital data by measuring the analog signal at regular intervals. The frequency of these intervals is known as the sampling rate and affects the accuracy of the digital representation.
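A minimal sketch of sampling, assuming the analog signal can be modeled as a Python function of time (here a 440 Hz sine wave; the names `sample` and `rate` are illustrative):

```python
import math

def sample(signal, duration, rate):
    """Measure `signal` at `rate` evenly spaced points per second."""
    n = int(duration * rate)
    return [signal(i / rate) for i in range(n)]

# Sample 0.01 seconds of a 440 Hz tone at 8000 samples per second.
samples = sample(lambda t: math.sin(2 * math.pi * 440 * t), 0.01, 8000)
print(len(samples))  # 80 discrete measurements of the continuous signal
```

A higher sampling rate produces more measurements and a more faithful digital representation.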
Lossless Compression: A data compression technique that allows the original data to be perfectly reconstructed from the compressed data. No information is lost during compression, making it ideal for files where data integrity is crucial, such as text documents and certain image formats (e.g., PNG, GIF).
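One simple lossless technique is run-length encoding, which stores each run of repeated characters as a (character, count) pair. A sketch in Python (real formats like PNG use more sophisticated algorithms; this only illustrates the lossless property):

```python
def rle_encode(s):
    """Compress a string into (character, run length) pairs."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append((s[i], j - i))
        i = j
    return out

def rle_decode(pairs):
    """Perfectly reconstruct the original string."""
    return "".join(ch * n for ch, n in pairs)

print(rle_encode("AAABBC"))                      # [('A', 3), ('B', 2), ('C', 1)]
print(rle_decode(rle_encode("AAABBC")))          # 'AAABBC' -- nothing lost
```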
Lossy Compression: A data compression technique that reduces file size by removing some data, resulting in a loss of quality. This method is commonly used for media files like images (e.g., JPEG) and audio (e.g., MP3) where some loss of fidelity is acceptable in exchange for smaller file sizes.
Intellectual Property: Creations of the mind that are protected by law, including inventions, literary and artistic works, designs, symbols, names, and images used in commerce. Intellectual property rights include patents, copyrights, trademarks, and trade secrets.
Creative Commons: A licensing system that allows creators to specify which rights they reserve and which they waive, providing a flexible range of permissions for recipients or other creators. Creative Commons licenses enable the sharing and reuse of creative works under certain conditions.
Protocol: A set of rules and procedures that govern the exchange or transmission of data between devices. Protocols ensure that data is correctly formatted, transmitted, and interpreted.
Lowest Level of Abstraction: The most detailed level of representation in a computer system, closest to the actual hardware or physical data storage. At this level, data is represented in its rawest form, such as binary code or electrical signals.
ASCII: (American Standard Code for Information Interchange) A character encoding standard for electronic communication that represents text in computers, communication equipment, and other devices. ASCII codes represent letters, numbers, punctuation marks, and control characters using 7-bit binary codes.
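Python's built-in `ord` and `chr` expose ASCII codes directly:

```python
code = ord('A')               # the ASCII code for 'A'
bits = format(code, '07b')    # the same code as a 7-bit binary pattern
letter = chr(97)              # the character with ASCII code 97

print(code)    # 65
print(bits)    # '1000001'
print(letter)  # 'a'
```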
RGB Color Model: An additive color model in which red, green, and blue light are combined in various ways to reproduce a broad array of colors. The RGB model is used in electronic displays such as computer monitors and televisions. Each color component (red, green, and blue) is typically represented by a number between 0 and 255, where 0 represents the absence of that color and 255 represents the maximum intensity.
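Since each component ranges from 0 to 255, it fits in one byte, and a color is often written as three two-digit hexadecimal bytes. A small sketch (the helper name is illustrative):

```python
def rgb_to_hex(r, g, b):
    """Format an (r, g, b) triple as a #RRGGBB hex color string."""
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

print(rgb_to_hex(255, 0, 0))      # '#FF0000' -- pure red at maximum intensity
print(rgb_to_hex(0, 0, 0))        # '#000000' -- all components absent: black
print(rgb_to_hex(255, 255, 255))  # '#FFFFFF' -- all at maximum: white
```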
Unit 2: The Internet
Computing Device: Any machine that can run a program. This includes a wide range of devices such as desktop computers, laptops, tablets, smartphones, servers, routers, and smart sensors.
Computing System: A collection of computing devices and programs working together to achieve a common purpose. Examples include a network of servers providing cloud services, or a set of sensors and computers monitoring environmental conditions.
Computing Network: A group of interconnected computing devices that are capable of sending and receiving data. Networks can range from small local area networks (LANs) to the global Internet.
Path: The sequence of connections between computing devices on a network, starting with a sender and ending with a receiver. Data is transmitted along this path in the form of packets.
Bandwidth: The maximum rate at which data can be transmitted over a network connection, typically measured in bits per second (bps). Higher bandwidth allows for faster data transfer rates.
IP Address: (Internet Protocol Address) A unique numerical label assigned to each device participating in a computer network that uses the Internet Protocol for communication. IP addresses enable devices to be located and identified on the network. There are two versions of IP addresses: IPv4 (32-bit) and IPv6 (128-bit).
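Python's standard `ipaddress` module shows both versions and makes the underlying numeric nature of an address visible:

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")   # dotted-decimal IPv4 notation
v6 = ipaddress.ip_address("2001:db8::1")   # colon-hex IPv6 notation

print(v4.version)  # 4  -- a 32-bit address
print(int(v4))     # 3232235777: the same address as one 32-bit number
print(v6.version)  # 6  -- a 128-bit address
```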
Internet Protocol (IP): A set of rules and procedures that govern the transmission of data across the Internet. The IP protocol is responsible for addressing and routing data packets between devices on the network.
Router: A networking device that forwards data packets between networks. Routers examine the destination IP address of each packet and determine the best path to send it toward its destination.
Packet: A small segment of data that is transmitted over a network. Data is typically divided into packets before being sent over the Internet, and each packet contains a header with addressing information.
Redundancy: The duplication of critical components or functions of a system with the intention of increasing reliability. Redundancy is implemented to improve the overall availability and fault tolerance of a system.
Fault Tolerant: A characteristic of a system that enables it to continue operating properly in the event of the failure of some of its components. Fault-tolerant systems are designed to minimize the impact of failures and maintain availability.
HTTP: (HyperText Transfer Protocol) The protocol used for transmitting web pages and other content over the Internet. HTTP defines how messages are formatted and transmitted between web browsers and web servers.
Internet: A global network of interconnected computer networks that use standardized, open communication protocols, primarily the Internet Protocol Suite (TCP/IP). The Internet enables communication and data exchange between billions of devices worldwide.
HTTPS: (HyperText Transfer Protocol Secure) A secure version of HTTP that uses encryption to protect the privacy and integrity of data transmitted between web browsers and web servers. HTTPS uses SSL/TLS to establish a secure connection.
World Wide Web: A system of interlinked hypertext documents and multimedia content that is accessed via the Internet. The World Wide Web is built on top of the Internet and utilizes HTTP for communication.
DNS: (Domain Name System) A hierarchical and distributed naming system for computers, services, or any resource connected to the Internet or a private network. DNS translates domain names (e.g., www.example.com) into IP addresses, allowing users to access online resources using human-readable names.
Datastream: A continuous flow of data transmitted over a network. Datastreams are commonly used in applications such as video streaming, audio streaming, and real-time data analysis.
www vs Internet: The World Wide Web (WWW) is a collection of web pages and other resources that are accessed via the Internet. The Internet is the underlying network infrastructure that enables communication between computers and devices worldwide. The WWW is just one of many services that run on the Internet.
Transmission Control Protocol (TCP): A reliable, connection-oriented protocol that manages the sending and receiving of data as packets over a network. TCP provides error detection and correction mechanisms to ensure that data is delivered accurately and in the correct order.
Metadata: Data that provides information about other data. Metadata can describe various aspects of data, such as its format, creation date, author, and access rights. Metadata is used to organize, manage, and discover data.
Network Node: A connection point in a network that can receive, create, store, or send data along distributed network routes. Network nodes can be computers, routers, switches, or other devices that participate in the network.
Unit 5: Big Data and Privacy
Google Trends: A tool provided by Google that analyzes the popularity of top search queries in Google Search across various regions and languages. Google Trends provides insights into trending topics and search patterns.
Citizen Science: Scientific research conducted, in whole or in part, by amateur or nonprofessional scientists. Citizen science projects often involve collecting and analyzing data to contribute to scientific knowledge.
Data Visualization: The graphical representation of information and data. Data visualization tools and techniques are used to communicate data insights effectively.
Histograms and Bar Charts: Graphs used to represent data distributions. Histograms are used for numerical data, while bar charts are used for categorical data. These charts provide a visual summary of the frequency or distribution of data values.
Scatter Plot: A type of graph in which the values of two variables are plotted along two axes, revealing any correlation present between the variables. Scatter plots are used to identify patterns and relationships in data.
Cleaning Data: The process of detecting and correcting (or removing) corrupt or inaccurate records from a dataset. Data cleaning is an essential step in data analysis to ensure the quality and reliability of the results.
Crowdsourcing: The practice of obtaining input or information from a large number of people via the Internet. Crowdsourcing is used to gather ideas, feedback, and solutions from a diverse group of contributors.
Data Bias: A systematic error in data collection or analysis that can lead to misleading results. Data bias can arise from various sources, such as sampling bias, measurement bias, or algorithmic bias.
Data Filtering: The process of choosing a smaller subset of a dataset to use for analysis. Data filtering is used to focus on relevant data and remove noise or irrelevant information.
Open Data: Data that is freely available for everyone to use and republish as they wish, without restrictions. Open data initiatives promote transparency and collaboration by making data accessible to the public.
Big Data: Extremely large and complex datasets that may be analyzed computationally to reveal patterns, trends, and associations. Big data is characterized by its volume, velocity, variety, and veracity.
Scalability and Big Data: The capability of a system to handle a growing amount of work or its potential to accommodate growth. Scalability is a critical requirement for systems that process big data.
Parallel Systems: Computing systems that carry out multiple operations simultaneously. Parallel systems are used to improve performance and efficiency by dividing tasks among multiple processors or cores.
Distributed Computing: A computing model in which components of a software system are shared among multiple computers to improve efficiency and performance. Distributed computing is used to process large datasets and handle complex workloads.
Machine Learning: A method of data analysis that automates analytical model building. Machine learning algorithms learn from data and can make predictions or decisions without being explicitly programmed.
Open Source: Software for which the original source code is made freely available and may be redistributed and modified. Open-source software promotes collaboration, transparency, and innovation.
Open Access: Free, immediate, online availability of research articles, coupled with the rights to use these articles fully in the digital environment. Open access publishing makes research findings more accessible to the public.
Unit 8: Cybersecurity and Global Impacts
Computing Innovation: An innovation that includes a computer or program code as an integral part of its functionality. Computing innovations have transformed various aspects of society, including communication, transportation, healthcare, and entertainment.
Digital Divide: The gap between those who have easy access to computers and the Internet and those who do not. The digital divide can exacerbate social and economic inequalities.
Personally Identifiable Information (PII): Information that can be used to identify an individual, such as name, address, or social security number. Protecting PII is essential for maintaining privacy and preventing identity theft.
Run-time Error: An error that occurs while the program is running. Run-time errors can cause a program to crash or behave unexpectedly.
Logic Error: An error in a program that causes it to operate incorrectly but does not prevent the program from running. Logic errors can be difficult to detect and debug.
Keylogging: A method of capturing and recording user keystrokes. Keylogging is often used by malicious actors to steal passwords and other sensitive information.
Malware: Software that is intended to damage or disable computers and computer systems. Malware includes viruses, worms, Trojans, and ransomware.
Phishing: A method of trying to gather personal information using deceptive e-mails and websites. Phishing attacks often impersonate legitimate organizations to trick users into providing sensitive information.
Computer / Software Virus: A type of malicious software program that, when executed, replicates by inserting copies of itself into other computer programs. Viruses can spread quickly and cause significant damage.
DDoS (Distributed Denial of Service): An attack in which multiple compromised systems attack a single target, causing denial of service for users of the targeted system. DDoS attacks can overwhelm a target system with traffic, making it unavailable.
Encryption: The process of encoding messages or information so that only authorized parties can read it. Encryption is used to protect the confidentiality of data.
Decryption: The process of converting encrypted data back into its original form. Decryption requires a key or password.
Caesar's Cipher: A simple substitution cipher where each letter in the plaintext is shifted a certain number of places down or up the alphabet. Caesar's cipher is one of the earliest known encryption techniques.
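A minimal Python implementation of the shift described above (non-letter characters are passed through unchanged; a negative shift decrypts):

```python
def caesar(text, shift):
    """Shift each letter `shift` places through the alphabet, wrapping around."""
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)

print(caesar("HELLO", 3))   # 'KHOOR' -- each letter shifted 3 places forward
print(caesar("KHOOR", -3))  # 'HELLO' -- shifting back by 3 decrypts
```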
Symmetric Key Encryption: An encryption method where the same key is used for both encryption and decryption. Symmetric key encryption is faster than public key encryption but requires the secure exchange of the key.
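A toy illustration of the symmetric property, using XOR (real symmetric ciphers such as AES are far stronger; this only shows that the same key both encrypts and decrypts):

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with the repeating `key`.
    Applying the same key twice restores the original data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = b"secret"
ciphertext = xor_cipher(message, b"key")       # encrypt with the shared key
recovered = xor_cipher(ciphertext, b"key")     # decrypt with the SAME key

print(recovered == message)  # True
```

The weakness the definition mentions is visible here: both parties must somehow share `b"key"` securely before they can communicate.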
Public Key Encryption: An encryption technique that uses a pair of keys, one public and one private. The public key is used to encrypt data, and the private key is used to decrypt it. Public key encryption is used for secure communication and digital signatures.
Multi-factor Authentication: A security system that requires more than one method of authentication from independent categories of credentials to verify the user's identity. Multi-factor authentication adds an extra layer of security to protect against unauthorized access.
Unit 10: Algorithms
Problem: A general description of a task that can (or cannot) be solved with an algorithm. Problems can range from simple tasks to complex challenges.
Algorithm: A finite set of instructions that accomplish a specific task. Algorithms are precise and unambiguous, and they must terminate after a finite number of steps.
Sequencing: The application of each step of an algorithm in the order in which the statements are given. Sequencing is fundamental to the execution of algorithms.
Selection: A programming construct where a decision is made using a condition (e.g., if-else statements), allowing an algorithm to choose different paths. Selection enables algorithms to handle different scenarios and make decisions based on input data.
Iteration: The repetition of part of an algorithm until a condition is met or for a certain number of times (e.g., for loops or while loops). Iteration allows algorithms to perform repetitive tasks efficiently.
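The three constructs above (sequencing, selection, iteration) can be seen together in one short function (the example is illustrative, not from the exam):

```python
def count_evens(numbers):
    count = 0                 # sequencing: statements run in order
    for n in numbers:         # iteration: repeat for each item
        if n % 2 == 0:        # selection: choose a path based on a condition
            count += 1
    return count

print(count_evens([1, 2, 3, 4, 5, 6]))  # 3
```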
Efficiency: A measure of how many resources (time, space, etc.) an algorithm uses as the input size grows. Efficiency is a critical factor in the design and analysis of algorithms.
Linear Search: A search algorithm that checks each item in a list one by one until the desired item is found or the list ends. Linear search is simple but inefficient for large datasets.
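A straightforward linear search in Python, returning the index of the target or -1 if it is absent:

```python
def linear_search(items, target):
    """Check each item in order; worst case examines every element."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

print(linear_search([5, 3, 8, 1], 8))  # 2
print(linear_search([5, 3, 8, 1], 9))  # -1: not found after checking all items
```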
Binary Search: A search algorithm that repeatedly divides a sorted list in half to find a target value. Binary search is more efficient than linear search for sorted data because it eliminates half of the remaining elements in each step.
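A sketch of binary search; note that the input list must already be sorted:

```python
def binary_search(sorted_items, target):
    """Repeatedly halve the search range of a SORTED list."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1       # target is in the upper half
        else:
            hi = mid - 1       # target is in the lower half
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
print(binary_search([1, 3, 5, 7, 9], 4))  # -1: not present
```

Each pass discards half the remaining elements, so a list of a million items needs at most about 20 comparisons.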
Reasonable Time: A problem is solvable in reasonable time if the number of steps required grows at a polynomial rate (like n, n^2, etc.) as the input size increases. Problems that can be solved in reasonable time are considered tractable.
Unreasonable Time: A problem runs in unreasonable time if the number of steps required grows at an exponential or factorial rate (like 2^n or n!) as the input size increases, making it practically unsolvable for large inputs. Problems that require unreasonable time are considered intractable.
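The gap between reasonable and unreasonable growth is stark even for modest inputs; a quick illustration in Python (the step counts are schematic, not measured run times):

```python
n = 30  # a modest input size

polynomial_steps = n ** 2    # reasonable growth: n^2
exponential_steps = 2 ** n   # unreasonable growth: 2^n

print(polynomial_steps)      # 900
print(exponential_steps)     # 1073741824 -- over a billion steps at n = 30
```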
Heuristic: An approach to problem-solving that produces a solution close to the best one when an exact solution is impractical or too slow to compute. Heuristics are often used to find approximate solutions to complex problems.
Decision Problem: A problem that can be posed as a yes/no question for each input (e.g., “Is this number prime?”). Decision problems are fundamental in theoretical computer science.
Optimization Problem: A problem that asks for the best solution among many possible ones (e.g., “What’s the shortest route?”). Optimization problems are common in various fields, including engineering, economics, and logistics.
Undecidable Problem: A problem for which no algorithm can be written that always leads to a correct yes/no answer. Undecidable problems are beyond the capabilities of computers to solve.
Sequential Computing: A computing model where instructions are executed one at a time in order. Sequential computing is the traditional model of computation.
Parallel Computing: A model in which programs are broken into smaller pieces and run simultaneously on multiple processors. Parallel computing can significantly improve performance for certain types of problems.
Distributed Computing: A model where computing processes are spread across multiple devices, often connected over a network, to solve a problem. Distributed computing is used to process large datasets and handle complex workloads.
Speedup: A measure of the improvement in speed of a parallel or distributed algorithm compared to a sequential one, calculated as the time it takes to complete the task sequentially divided by the time it takes in parallel.