AP Computer Science Principles - Big Idea 2: Data

Overview

  • Big Idea 2 focuses on data and its importance in computer science.
  • It constitutes 17-22% of the AP exam weighting.
  • Students should understand how computers handle data and how to use data to solve problems.

Topic Questions

  • Topic Questions are available in AP Classroom.
  • They can be used to check student understanding and identify areas to emphasize.
  • Multiple-choice format with approximately 20 questions.

Developing Understanding

  • Everything we do with a computer is broken down into some form of data.
  • Students should understand how computers handle data to solve problems like:
    • Raising awareness for a cause.
    • Determining which state will gain seats in the House of Representatives using census data.
    • Determining the ideal location for prom using traffic and cost data.
  • Information is stored in binary and translated for display on screens or speakers.
  • Data are processed to learn something new.
  • This big idea is paired with Big Idea 3 (Algorithms and Programming) and Big Idea 5 (Impact of Computing).

Computational Thinking Practices

  • 2.B, 3.C, 5.B

Preparing for the AP Exam

  • Students will be presented with a data representation for text or media and asked to convert values from binary to decimal or vice versa.
  • Understanding number systems beyond decimal is important.
  • Real-world problems involve large data sets that require programming solutions.
  • Data abstraction is used to write programs that can handle changes in data entries.
  • Documentation within the program explains the solution.
  • Data compression algorithms reduce the storage space that data occupy or the time needed to transmit them over the Internet, sometimes at a cost to data quality.
  • Students need to compare data compression algorithms to determine the best one to use in a given situation.
  • Students should be able to distinguish between compression algorithms by understanding how data might be restored or be unable to be restored.
  • When presented with scenarios that describe data and metadata for analysis, students will be asked to determine what information can be found, as well as a potential programming process that can be used to extract information or modify the existing data.
  • Students should practice identifying a problem they could solve using data (such as finding the best route to take to school), gathering the data needed for the analysis, and implementing a program that manipulates the data to find an answer.

Essential Questions

  • DAT-1: How can we use 1s and 0s to represent something complex like a video of the marching band playing a song?
  • DAT-2: How can you predict the attendance at a school event using data gathered from social media?
  • When is it more appropriate to use a computer to analyze data than to complete the analysis by hand?

Big Idea at a Glance

Learning Objective         | Topic                                | Skills        | Unit/Module
DAT-1.A, DAT-1.B, DAT-1.C  | 2.1 Binary Numbers                   | 1.D, 2.B, 3.C |
DAT-1.D                    | 2.2 Data Compression                 | 1.D           |
DAT-2.A, DAT-2.B, DAT-2.C  | 2.3 Extracting Information from Data | 5.B, 5.D      |
DAT-2.D, DAT-2.E           | 2.4 Using Programs with Data         | 2.B, 5.B      |

Sample Instructional Activities

  • These are optional and provide ways to incorporate instructional approaches into the classroom.

Activity 1: Look for a Pattern (Topic 2.2)

  • Provide students with losslessly compressed text and a key.
  • Have them look for patterns in retrieving the original text and evaluate the compression algorithm.
  • Have them write down the patterns they see along with their evaluation and share these in a large group.
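
One possible way to set up this activity is sketched below, assuming a simple substitution scheme in which each symbol in the key stands for a repeated phrase. The function name `expand`, the key, and the sample text are all invented for illustration; they are not part of the course materials.

```python
def expand(compressed, key):
    """Restore the original text by replacing each symbol with its phrase.

    Assumes no symbol appears inside any phrase or in the original text.
    """
    for symbol, phrase in key.items():
        compressed = compressed.replace(symbol, phrase)
    return compressed


# Hypothetical classroom example: two symbols stand for repeated phrases.
key = {"@": "row", "#": " your boat"}
compressed = "@, @, @#"

original = expand(compressed, key)
print(original)                         # row, row, row your boat
print(len(compressed), len(original))   # 8 23
```

Because the key restores every character, nothing is lost. When students evaluate the algorithm, they can also count the characters needed to store the key itself, which matters for short texts.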

Activity 2: Diagramming (Topic 2.4)

  • Give students a question and a list of data.
  • Have them diagram a process to answer the question using the data.
  • Include the input(s) of information and the output of the transformed data.
  • Have students include an explanation of how the process represented in their diagram would work to find the solution.

Topic 2.1: Binary Numbers

Enduring Understanding

  • DAT-1: The way a computer represents data internally is different from the way the data are interpreted and displayed for the user. Programs translate data into a human-understandable representation.

Learning Objective DAT-1.A

  • Explain how data can be represented using bits.
Essential Knowledge
  • DAT-1.A.1: Data values can be stored in variables, lists of items, or standalone constants and can be passed as input to (or output from) procedures.
  • DAT-1.A.2: Computing devices represent data digitally, meaning the lowest-level components of any value are bits.
  • DAT-1.A.3: Bit is shorthand for binary digit and is either 0 or 1.
  • DAT-1.A.4: A byte is 8 bits.
  • DAT-1.A.5: Abstraction reduces complexity by focusing on the main idea, hiding irrelevant details, and bringing together useful details.
  • DAT-1.A.6: Bits are grouped to represent abstractions (numbers, characters, color).
  • DAT-1.A.7: The same sequence of bits may represent different types of data in different contexts.
  • DAT-1.A.8: Analog data have values that change smoothly over time (pitch, volume, colors).
  • DAT-1.A.9: The use of digital data to approximate real-world analog data is an example of abstraction.
  • DAT-1.A.10: Analog data can be closely approximated digitally using a sampling technique, which means measuring values of the analog signal at regular intervals called samples. The samples are measured to figure out the exact bits required to store each sample.
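
Two of the ideas above can be sketched in a few lines of Python. The first part reads one bit pattern as both a number and a character (DAT-1.A.7). The second part samples a sine wave at regular intervals and stores each sample in a fixed number of bits (DAT-1.A.10). The helper `sample_wave` and its parameters are assumptions made for this illustration, not a standard API.

```python
import math

# DAT-1.A.7: one bit pattern, two interpretations.
bits = "01000001"
as_number = int(bits, 2)   # read the bits as a base-2 integer
as_char = chr(as_number)   # read the same value as a character code
print(as_number, as_char)  # 65 A


def sample_wave(frequency_hz, sample_rate_hz, num_samples, bit_depth):
    """Sample a sine wave at regular intervals (hypothetical helper).

    Each sample is stored as an integer from 0 to 2**bit_depth - 1,
    so the digital values only approximate the analog signal.
    """
    levels = 2 ** bit_depth
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                             # time of sample n, in seconds
        value = math.sin(2 * math.pi * frequency_hz * t)   # analog value in [-1, 1]
        level = round((value + 1) / 2 * (levels - 1))      # map onto 0 .. levels - 1
        samples.append(level)
    return samples


print(sample_wave(frequency_hz=1, sample_rate_hz=8, num_samples=8, bit_depth=3))
```

Raising `sample_rate_hz` or `bit_depth` gives a closer approximation at the cost of more bits per second of signal.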

Learning Objective DAT-1.B

  • Explain the consequences of using bits to represent data.
Essential Knowledge
  • DAT-1.B.1: In many programming languages, integers are represented by a fixed number of bits, which limits the range of integer values and mathematical operations on those values. This limitation can result in overflow or other errors.
  • DAT-1.B.2: Other programming languages provide an abstraction through which the size of representable integers is limited only by the size of the computer’s memory.
  • DAT-1.B.3: In programming languages, the fixed number of bits used to represent real numbers limits the range and mathematical operations on these values; this limitation can result in round-off and other errors. Some real numbers are represented as approximations in computer storage.
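
The sketch below illustrates these three points in Python. Python's own integers grow as needed (the situation described in DAT-1.B.2), so the fixed-width case is simulated with a hypothetical helper, `add_fixed_width`, that assumes an unsigned register that wraps around on overflow. Real languages differ in how they handle overflow.

```python
def add_fixed_width(a, b, bits=8):
    """Add two non-negative integers as if stored in an unsigned
    register that is `bits` wide and wraps around on overflow."""
    return (a + b) % (2 ** bits)


# DAT-1.B.1: 300 does not fit in 8 bits (maximum 255), so the result wraps.
print(add_fixed_width(200, 100))   # 44

# DAT-1.B.2: Python integers are limited only by available memory.
print(2 ** 100)                    # 1267650600228229401496703205376

# DAT-1.B.3: 0.1 and 0.2 are stored as approximations, so round-off appears.
print(0.1 + 0.2)                   # 0.30000000000000004
print(0.1 + 0.2 == 0.3)            # False
```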

Learning Objective DAT-1.C

  • For binary numbers:
    • Calculate the binary (base 2) equivalent of a positive integer (base 10) and vice versa.
    • Compare and order binary numbers.
Essential Knowledge
  • DAT-1.C.1: Number bases, including binary and decimal, are used to represent data.
  • DAT-1.C.2: Binary (base 2) uses only combinations of the digits zero and one.
  • DAT-1.C.3: Decimal (base 10) uses only combinations of the digits 0–9.
  • DAT-1.C.4: As with decimal, a digit’s position in the binary sequence determines its numeric value. The numeric value is equal to the bit’s value (0 or 1) multiplied by the place value of its position.
  • DAT-1.C.5: The place value of each position is determined by the base raised to the power of the position. Positions are numbered starting at the rightmost position with 0 and increasing by 1 for each subsequent position to the left.
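
The place-value rule in DAT-1.C.4 and DAT-1.C.5 can be written out directly. The following is a minimal sketch; the function names are chosen for this example, and Python's built-ins `int(bits, 2)` and `bin(n)` perform the same conversions.

```python
def binary_to_decimal(bits):
    """Sum each bit times 2 raised to its position (rightmost position is 0)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def decimal_to_binary(n):
    """Repeatedly divide by 2; the remainders are the bits, right to left."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))
        n //= 2
    return "".join(reversed(digits))


print(binary_to_decimal("1011"))   # 1*8 + 0*4 + 1*2 + 1*1 = 11
print(decimal_to_binary(11))       # 1011

# Comparing and ordering binary numbers: compare their values.
print(sorted(["1001", "111", "1010"], key=binary_to_decimal))   # ['111', '1001', '1010']
```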

Topic 2.2: Data Compression

Enduring Understanding

  • DAT-1: The way a computer represents data internally is different from the way the data are interpreted and displayed for the user. Programs are used to translate data into a representation more easily understood by people.

Learning Objective DAT-1.D

  • Compare data compression algorithms to determine which is best in a particular context.
Essential Knowledge
  • DAT-1.D.1: Data compression can reduce the size (number of bits) of transmitted or stored data.
  • DAT-1.D.2: Fewer bits does not necessarily mean less information.
  • DAT-1.D.3: The amount of size reduction from compression depends on both the amount of redundancy in the original data representation and the compression algorithm applied.
  • DAT-1.D.4: Lossless data compression algorithms can usually reduce the number of bits stored or transmitted while guaranteeing complete reconstruction of the original data.
  • DAT-1.D.5: Lossy data compression algorithms can significantly reduce the number of bits stored or transmitted but only allow reconstruction of an approximation of the original data.
  • DAT-1.D.6: Lossy data compression algorithms can usually reduce the number of bits stored or transmitted more than lossless compression algorithms.
  • DAT-1.D.7: In situations where quality or ability to reconstruct the original is maximally important, lossless compression algorithms are typically chosen.
  • DAT-1.D.8: In situations where minimizing data size or transmission time is maximally important, lossy compression algorithms are typically chosen.
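
To make the lossless and lossy distinction concrete, here is a small sketch. Run-length encoding stands in for a lossless algorithm, and rounding values to a coarser step stands in for a lossy one. Both are simplified examples chosen for this illustration, not the algorithms used by any particular file format.

```python
def rle_encode(text):
    """Lossless: store each run of repeated characters as (character, count)."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Rebuild the exact original text from the runs."""
    return "".join(ch * count for ch, count in runs)


def lossy_round(values, step):
    """Lossy: keep only the nearest multiple of `step` for each value."""
    return [round(v / step) * step for v in values]


text = "AAAABBBCCD"
runs = rle_encode(text)
print(runs)                       # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(runs) == text)   # True: the original is fully restored

print(lossy_round([12, 17, 23, 38], step=10))   # [10, 20, 20, 40]
```

After `lossy_round`, the values 17 and 23 both become 20, so the originals cannot be recovered. The run-length version only saves space when the data contain long runs, which reflects the point about redundancy in DAT-1.D.3.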

Topic 2.3: Extracting Information from Data

Enduring Understanding

  • DAT-2: Programs can be used to process data, which allows users to discover information and create new knowledge.

Learning Objective DAT-2.A

  • Describe what information can be extracted from data.
Essential Knowledge
  • DAT-2.A.1: Information is the collection of facts and patterns extracted from data.
  • DAT-2.A.2: Data provide opportunities for identifying trends, making connections, and addressing problems.
  • DAT-2.A.3: Digitally processed data may show correlation between variables. A correlation found in data does not necessarily indicate that a causal relationship exists. Additional research is needed to understand the exact nature of the relationship.
  • DAT-2.A.4: Often, a single source does not contain the data needed to draw a conclusion. It may be necessary to combine data from a variety of sources to formulate a conclusion.
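
As a rough illustration of DAT-2.A.3, the sketch below computes a Pearson correlation coefficient by hand. The function name `pearson` and the monthly figures are made up for this example. A value near 1 or -1 shows that two variables move together; it does not show that one causes the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented monthly figures, for illustration only.
ice_cream_sales = [20, 35, 50, 80, 95]
sunburn_cases = [2, 4, 5, 9, 10]

r = pearson(ice_cream_sales, sunburn_cases)
print(round(r, 2))
```

Both invented series rise together, so the coefficient is close to 1, yet a third factor such as hot weather could explain both. Deciding that requires the additional research the objective describes.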

Learning Objective DAT-2.B

  • Describe what information can be extracted from metadata.
Essential Knowledge
  • DAT-2.B.1: Metadata are data about data. For example, the piece of data may be an image, while the metadata may include the date of creation or the file size of the image.
  • DAT-2.B.2: Changes and deletions made to metadata do not change the primary data.
  • DAT-2.B.3: Metadata are used for finding, organizing, and managing information.
  • DAT-2.B.4: Metadata can increase the effective use of data or data sets by providing additional information.
  • DAT-2.B.5: Metadata allow data to be structured and organized.
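
A small sketch of these points follows, assuming each file is modeled as a dictionary with separate `data` and `metadata` entries. The file names, dates, sizes, and the helper `newest_first` are hypothetical.

```python
# Each record keeps the primary data (bytes) apart from its metadata.
files = [
    {"data": b"\x01\x02", "metadata": {"name": "band.jpg", "created": "2024-03-02", "size_bytes": 2048}},
    {"data": b"\x03\x04", "metadata": {"name": "prom.jpg", "created": "2024-05-18", "size_bytes": 4096}},
    {"data": b"\x05\x06", "metadata": {"name": "game.jpg", "created": "2024-01-20", "size_bytes": 1024}},
]


def newest_first(records):
    """Organize records using only their metadata (ISO dates sort as text)."""
    return sorted(records, key=lambda f: f["metadata"]["created"], reverse=True)


print([f["metadata"]["name"] for f in newest_first(files)])
# ['prom.jpg', 'band.jpg', 'game.jpg']

# DAT-2.B.2: changing metadata leaves the primary data untouched.
before = files[0]["data"]
files[0]["metadata"]["name"] = "marching_band.jpg"
print(files[0]["data"] == before)   # True
```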

Learning Objective DAT-2.C

  • Identify the challenges associated with processing data.
Essential Knowledge
  • DAT-2.C.1: The ability to process data depends on the capabilities of the users and their tools.
  • DAT-2.C.2: Data sets pose challenges regardless of size, such as:
    • The need to clean data
    • Incomplete data
    • Invalid data
    • The need to combine data sources
  • DAT-2.C.3: Depending on how data were collected, they may not be uniform. For example, if users enter data into an open field, the way they choose to abbreviate, spell, or capitalize something may vary from user to user.
  • DAT-2.C.4: Cleaning data is a process that makes the data uniform without changing their meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word).
  • DAT-2.C.5: Problems of bias are often created by the type or source of data being collected. Bias is not eliminated by simply collecting more data.
  • DAT-2.C.6: The size of a data set affects the amount of information that can be extracted from it.
  • DAT-2.C.7: Large data sets are difficult to process using a single computer and may require parallel systems. Scalability of systems is an important consideration when working with data sets, as the computational capacity of a system affects how data sets can be processed and stored.
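
The cleaning step described in DAT-2.C.3 and DAT-2.C.4 might look like the sketch below, assuming survey responses typed into an open text field. The lookup table `CANONICAL`, the helper `clean_state`, and the sample entries are invented for illustration.

```python
# Map every known spelling or abbreviation to one canonical word.
CANONICAL = {
    "ca": "California",
    "calif.": "California",
    "california": "California",
    "ny": "New York",
    "new york": "New York",
}


def clean_state(entry):
    """Return the canonical name, or None if the entry is missing or invalid."""
    key = entry.strip().lower()
    return CANONICAL.get(key)


raw = [" CA", "calif.", "California", "ny", "", "N/A"]
cleaned = [clean_state(e) for e in raw]
print(cleaned)
# ['California', 'California', 'California', 'New York', None, None]

valid = [c for c in cleaned if c is not None]
print(len(valid), "of", len(raw), "entries usable")   # 4 of 6 entries usable
```

The two `None` results correspond to the incomplete and invalid data mentioned in DAT-2.C.2. How to handle them (drop, flag, or ask again) is a judgment call that depends on the question being asked.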

Topic 2.4: Using Programs with Data

Enduring Understanding

  • DAT-2: Programs can be used to process data, which allows users to discover information and create new knowledge.

Learning Objective DAT-2.D

  • Extract information from data using a program.
Essential Knowledge
  • DAT-2.D.1: Programs can be used to process data to acquire information.
  • DAT-2.D.2: Tables, diagrams, text, and other visual tools can be used to communicate insight and knowledge gained from data.
  • DAT-2.D.3: Search tools are useful for efficiently finding information.
  • DAT-2.D.4: Data filtering systems are important tools for finding information and recognizing patterns in data.
  • DAT-2.D.5: Programs such as spreadsheets help efficiently organize and find trends in information.
  • DAT-2.D.6: Some processes that can be used to extract or modify information from data include the following:
    • Transforming every element of a data set
    • Filtering a data set
    • Combining or comparing data in some way
    • Visualizing a data set through a chart, graph, or other visual representation
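
The four processes listed in DAT-2.D.6 can each be shown in a line or two of Python. The sketch below uses invented daily temperatures; the function names are chosen for this example.

```python
def to_celsius(temps_f):
    """Transform every element: Fahrenheit to Celsius, one decimal place."""
    return {day: round((f - 32) * 5 / 9, 1) for day, f in temps_f.items()}


def warm_days(temps_f, threshold):
    """Filter: keep only the days at or above the threshold."""
    return [day for day, f in temps_f.items() if f >= threshold]


def average(temps_f):
    """Combine: reduce all the values to a single summary number."""
    return sum(temps_f.values()) / len(temps_f)


def bar_chart(temps_f):
    """Visualize: one '#' per 10 degrees, as a plain-text chart."""
    return "\n".join(f"{day}: {'#' * (f // 10)} {f}" for day, f in temps_f.items())


temps_f = {"Mon": 68, "Tue": 75, "Wed": 59, "Thu": 81}   # invented data

print(to_celsius(temps_f))      # {'Mon': 20.0, 'Tue': 23.9, 'Wed': 15.0, 'Thu': 27.2}
print(warm_days(temps_f, 70))   # ['Tue', 'Thu']
print(average(temps_f))         # 70.75
print(bar_chart(temps_f))
```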

Learning Objective DAT-2.E

  • Explain how programs can be used to gain insight and knowledge from data.
Essential Knowledge
  • DAT-2.E.1: Programs are used in an iterative and interactive way when processing information to allow users to gain insight and knowledge about data.
  • DAT-2.E.2: Programmers can use programs to filter and clean digital data, thereby gaining insight and knowledge.
  • DAT-2.E.3: Combining data sources, clustering data, and classifying data are parts of the process of using programs to gain insight and knowledge from data.
  • DAT-2.E.4: Insight and knowledge can be obtained from translating and transforming digitally represented information.
  • DAT-2.E.5: Patterns can emerge when data are transformed using programs.
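
As one hedged example of DAT-2.E.3 and DAT-2.E.5, the sketch below classifies invented commute times into categories and counts each category, so that a pattern becomes visible. The category boundaries and the helper `classify` are assumptions made for illustration.

```python
from collections import Counter


def classify(minutes):
    """Assign a commute time to a category (boundaries chosen arbitrarily)."""
    if minutes < 15:
        return "short"
    if minutes < 30:
        return "medium"
    return "long"


commutes = [8, 12, 35, 22, 14, 41, 9, 27]   # invented data, in minutes

counts = Counter(classify(m) for m in commutes)
print(counts["short"], counts["medium"], counts["long"])   # 4 2 2
```

The raw list shows no obvious structure, but after classification it is clear that half of the commutes are short. Trying different boundaries and rerunning the program is an instance of the iterative, interactive use described in DAT-2.E.1.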