Computer Science Principles - Big Idea 2

Binary Numbers

  • Binary Number System: A number system that uses two digits, 0 and 1.
    • One bit is either 1 or 0.
    • One byte is 8 bits.
  • Binary Digit (Bit): The smallest unit of data in computing.
  • Byte: A group of 8 bits.
  • Representing Electrical Signals:
    • 1 (ON): Represents an electrical signal in the computer being on.
    • 0 (OFF): Represents an electrical signal in the computer being off.
  • Transistors: Circuits in a computer's processor are made up of billions of transistors.
  • Binary and Electrical Signals: The digits 1 and 0 used in binary reflect the electrical signal being on and off.
  • Information Storage: All software, music, documents, and any other information processed by a computer is stored in sequences of binary 0s and 1s.
  • Interpretation of Binary Sequences: How a binary sequence is interpreted depends on how it will be used.
    • A byte of information, like 0100 1001, can represent an instruction to the computer, the number 73, or the letter I, depending on the context.
  • Representation of Information: The 1s and 0s can represent anything, including instructions, pictures, letters, and videos.
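
As a small illustration of how the same byte can be read in more than one way, the Python sketch below interprets the byte 0100 1001 from the notes first as an unsigned integer and then as an ASCII character. The variable names are illustrative only.

```python
# The byte from the notes, written as a string of bits.
bits = "01001001"

# Interpretation 1: an unsigned integer written in base 2.
as_number = int(bits, 2)

# Interpretation 2: the ASCII character with that code.
as_character = chr(as_number)

print(as_number)     # 73
print(as_character)  # I
```

The bits never change; only the interpretation chosen by the program does.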

Digital Data and Abstractions

  • Abstraction: Reduces complexity and allows focusing on the main idea or larger problem.
  • Example:
    • ASCII representation of "Hi!"
    • Binary: 01001000 01101001 00100001
    • Decimal: 72 105 33
  • Analog Data: Data represented by measurement of a continuous physical variable, so its values change smoothly rather than in discrete steps.
  • Abstraction Example: The use of digital data to approximate real-world analog data is an example of abstraction.
  • Sampling: Measuring the values of the analog signal at regular intervals.
    • Each sample is measured and converted to bits so the analog data can be stored in an approximate digital form.
  • Essential Knowledge:
    • DAT-1.A.1: Data values can be stored in variables, lists of items, or standalone constants and can be passed as input to (or output from) procedures (return value).
    • DAT-1.A.2: Computing devices represent data digitally, meaning that the lowest-level components of any value are bits.
    • DAT-1.A.3: Bit is shorthand for binary digit and is either 0 or 1.
    • DAT-1.A.4: A byte is 8 bits.
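
The sampling idea above can be sketched in Python. This is a minimal illustration, not a real audio pipeline: the function name, the parameters (`sample_rate`, `bits`), and the sine wave standing in for an analog signal are all assumptions made for the example.

```python
import math

def sample_signal(signal, duration, sample_rate, bits):
    """Sample `signal` (a function of time) at regular intervals.

    Each sample is rounded down to one of 2**bits levels, so the
    result is a list of integers from 0 to 2**bits - 1.
    Assumes the signal's values stay between -1.0 and 1.0.
    """
    levels = 2 ** bits
    samples = []
    for n in range(int(duration * sample_rate)):
        t = n / sample_rate              # time of the n-th sample
        value = signal(t)                # continuous (analog) value
        scaled = (value + 1.0) / 2.0     # map -1..1 onto 0..1
        samples.append(min(levels - 1, int(scaled * levels)))
    return samples

def wave(t):
    # Hypothetical analog signal: one sine cycle per second.
    return math.sin(2 * math.pi * t)

digital = sample_signal(wave, duration=1.0, sample_rate=8, bits=4)
print(digital)
```

More samples per second and more bits per sample give a closer approximation, at the cost of more storage.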

Representing Integers with Fixed Number of Bits

  • All data is represented by 1s and 0s arranged in groups called bytes.
    • This includes integers (whole numbers, even and odd, including 0).
  • Integers are represented in computers by a fixed number of bits.
    • Example: Some programming languages store data values in up to 32 bits (or 4 bytes).
    • 4 bytes can represent 2^32 different values, which is a little over 4 billion different values total.
  • DAT-1.B.1: In many programming languages, integers are represented by a fixed number of bits, which limits the range of integer values and mathematical operations on those values. This limitation can result in overflow or other errors.
  • DAT-1.B.2: Other programming languages provide an abstraction through which the size of representable integers is limited only by the size of the computer's memory; this is the case for the language defined in the exam reference sheet.
  • DAT-1.B.3: In programming languages, the fixed number of bits used to represent real numbers limits the range and mathematical operations on these values; this limitation can result in round-off and other errors. Some real numbers are represented as approximations in computer storage.
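
To make the effect of a fixed number of bits concrete, here is a short Python sketch that counts how many values fit in a given width. The helper names are made up for this example. Python's built-in integers are not fixed-width, so Python itself behaves like the languages described in DAT-1.B.2.

```python
def count_values(bits):
    """Number of distinct bit patterns that `bits` bits can hold."""
    return 2 ** bits

def unsigned_max(bits):
    """Largest unsigned integer that fits in `bits` bits."""
    return 2 ** bits - 1

print(count_values(8))    # 256
print(count_values(32))   # 4294967296
print(unsigned_max(32))   # 4294967295

# Python's int grows as needed, so adding 1 does not overflow here.
print(unsigned_max(32) + 1)  # 4294967296
```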

Overflow Error

  • If a program encounters a calculation that requires a number larger than what its memory will allow to be stored, this can result in an overflow error.

Round-off Error

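  • Real numbers like pi have infinitely many digits, so a fixed number of bits can only store an approximation of them; the small difference between the stored value and the true value is a round-off error.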
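
A quick way to see round-off error is to add two decimal fractions in Python, which stores real numbers in a fixed number of bits. This is only a small demonstration; the variable name is illustrative.

```python
import math

total = 0.1 + 0.2
print(total)                     # 0.30000000000000004
print(total == 0.3)              # False

# Comparing with a tolerance is the usual workaround.
print(math.isclose(total, 0.3))  # True

# Pi is stored as an approximation with a limited number of digits.
print(math.pi)                   # 3.141592653589793
```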

Computer's Available Memory

  • Ideal situation: the range of numbers a computer can work with would only be limited by the computer's available memory.
  • Real world: this is not always possible.
    • If a number stretches towards infinity, it would require an infinite amount of computer memory in order to store and calculate, which is not possible.

4-Bit Computer Example

  • Computer uses only 4 bits to represent integers.
    • First bit = sign of integer (positive or negative).
    • Other 3 bits for the absolute value.
  • Largest number this system could represent:
    • Binary: 0111
    • Positive number 7 since 2^2 + 2^1 + 2^0 = 4 + 2 + 1 = 7
  • What would happen if we ran a program like this on the 4-bit computer, where the largest positive integer is 7?
x <- 7
y <- x + 1
  • Overflow error or number too large.
  • Could possibly wrap the number around like an odometer that has reached its maximum and rolls back over to 0.
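
The two possible outcomes above can be simulated in Python. This is a sketch of the 4-bit sign-and-magnitude scheme described in the notes, not a model of any real processor; `encode`, `decode`, and the constants are hypothetical names.

```python
SIGN_BIT = 0b1000        # leftmost bit: 1 means negative
MAGNITUDE_MASK = 0b0111  # other 3 bits: absolute value, 0 through 7

def encode(value):
    """Return the 4-bit pattern for `value`, or raise OverflowError."""
    if abs(value) > MAGNITUDE_MASK:
        raise OverflowError(f"{value} does not fit in 3 magnitude bits")
    sign = SIGN_BIT if value < 0 else 0
    return sign | abs(value)

def decode(pattern):
    """Read a 4-bit pattern back as a sign-and-magnitude integer."""
    magnitude = pattern & MAGNITUDE_MASK
    return -magnitude if pattern & SIGN_BIT else magnitude

x = 7
print(format(encode(x), "04b"))  # 0111

# Outcome 1: the system reports an overflow error.
try:
    y = encode(x + 1)
except OverflowError as err:
    print("overflow:", err)

# Outcome 2: the system silently keeps only the lowest 4 bits.
wrapped = (encode(x) + 1) & 0b1111
print(format(wrapped, "04b"), decode(wrapped))  # 1000 0
```

In the second outcome the pattern 1000 has the sign bit set and a magnitude of 0, so the program sees 0 instead of 8.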

Binary Numbers: Base 2 and Base 10 Conversions

  • DAT-1.C: For binary numbers:
    • Calculate the binary (base 2) equivalent of a positive integer (base 10) and vice versa.
    • Compare and order binary numbers.
  • DAT-1.C.1: Number bases, including binary and decimal, are used to represent data.
  • DAT-1.C.2: Binary (base 2) uses only combinations of the digits zero and one.
  • DAT-1.C.3: Decimal (base 10) uses only combinations of the digits 0-9.
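
Comparing and ordering binary numbers can be sketched in Python by converting each one to its integer value first. The sample values are made up for the example. Comparing the raw strings would be misleading when the lengths differ, since "111" sorts after "1010" as text even though 7 is less than 10.

```python
values = ["1010", "111", "1101", "10"]

# Order by numeric value, not by text.
ordered = sorted(values, key=lambda bits: int(bits, 2))
print(ordered)  # ['10', '111', '1010', '1101']

# Compare two binary numbers directly.
print(int("1010", 2) > int("111", 2))  # True
```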

Decimal Number System Place Values

  • Example: 5012
  • Place Values: … Powers of 10
  • 5 * 10^3 + 0 * 10^2 + 1 * 10^1 + 2 * 10^0

Binary Number System Place Values

  • Example: 1101
  • Place Values: … Powers of 2
  • 1 * 2^3 + 1 * 2^2 + 0 * 2^1 + 1 * 2^0

Converting Binary to Decimal

  • Example: 00101001
  • Deconstructing a binary number means adding up the powers of 2 that are "turned on."
  • 2^0 + 2^3 + 2^5 = 1 + 8 + 32 = 41
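
The same "add up the powers of 2 that are turned on" procedure can be written as a short Python function. The function name is illustrative; Python's built-in `int(bits, 2)` gives the same result.

```python
def binary_to_decimal(bits):
    """Add up the powers of 2 for every position that holds a 1."""
    total = 0
    # Walk from the rightmost digit (position 0) to the leftmost.
    for position, digit in enumerate(reversed(bits)):
        if digit == "1":
            total += 2 ** position
    return total

print(binary_to_decimal("00101001"))  # 41
print(binary_to_decimal("1101"))      # 13
```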

Constructing a Binary Number

  • Figuring out which powers of 2 add up to the number you want.
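
One way to find which powers of 2 add up to a number is to start with the largest power and work down, turning on each power that still fits. The Python sketch below follows that approach; the function name and the default `width` of 8 bits are assumptions for the example.

```python
def decimal_to_binary(number, width=8):
    """Build a `width`-bit binary string for a non-negative integer."""
    if number < 0 or number >= 2 ** width:
        raise ValueError(f"{number} does not fit in {width} bits")
    bits = ""
    remaining = number
    # Try each power of 2 from largest to smallest.
    for position in range(width - 1, -1, -1):
        power = 2 ** position
        if power <= remaining:
            bits += "1"         # this power is "turned on"
            remaining -= power
        else:
            bits += "0"
    return bits

print(decimal_to_binary(41))           # 00101001
print(decimal_to_binary(13, width=4))  # 1101
```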

Data Compression

  • Data compression is a reduction in the number of bits needed to represent data.
  • Data compression is used to save transmission time and storage space.

How Compression Works

  • When data is compressed, you are looking for repeated patterns and predictability.
  • The larger the data file, the more repeated patterns can typically be found and pulled out.

Text Compression

  • Remove repeated characters or patterns and insert a single character or symbol in their place.
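
One simple technique in this spirit is run-length encoding, which stores each run of repeated characters as the character plus a count. The notes do not name a specific method, so treat this Python sketch as one possible example; the function names are hypothetical.

```python
def rle_encode(text):
    """Return a list of (character, run_length) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: extend the run.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # New character: start a new run.
            runs.append((ch, 1))
    return runs

def rle_decode(runs):
    """Rebuild the original text from (character, run_length) pairs."""
    return "".join(ch * count for ch, count in runs)

original = "AAAABBBCCD"
encoded = rle_encode(original)
print(encoded)                          # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded) == original)  # True
```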

Data Compression Methods: Lossless vs Lossy

  • Lossless: Reduces the number of bits stored or transmitted while guaranteeing complete reconstruction of the original data.
  • Lossy: Significantly reduces the number of bits stored or transmitted but only allows reconstruction of an approximation of the original data.
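
The difference between the two methods can be sketched in Python. The lossless half uses the standard library's `zlib` module; the lossy half is a deliberately simple stand-in (rounding readings to the nearest 10) chosen for illustration, with made-up sample data.

```python
import zlib

# Lossless: the original bytes come back exactly.
original = b"to be or not to be, " * 50
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed))
print(restored == original)  # True

# Lossy: keep only an approximation of each reading.
readings = [12, 17, 23, 48, 51]
lossy = [round(r / 10) * 10 for r in readings]
print(lossy)  # [10, 20, 20, 50, 50]
```

After the lossy step there is no way to recover whether a stored 20 was originally 17 or 23.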

Lossy vs Lossless

  • Lossless:
    • The typical approach where the loss of words or numbers would change the information.
    • Examples: Executable files, text, spreadsheet files.
  • Lossy:
    • The typical approach where the removal of some data has little or no discernible effect on the representation of the content since the data removed are redundant, unimportant, or imperceptible.
    • Examples: Graphics, audio, video, images.

Lossy Example

  • .jpg compression algorithm is used on images.
    • Divides the picture up into small square blocks of pixels.
    • Uses approximation to average out the pixel color data.
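
The block-and-average idea can be illustrated with a tiny grayscale image in Python. This is a simplified stand-in, not the actual JPEG algorithm (which applies a mathematical transform to each block before discarding detail); the function name and the sample pixel values are made up.

```python
def average_blocks(pixels, block_size):
    """Replace every block_size x block_size block with its average.

    `pixels` is a list of rows of grayscale values. The image
    dimensions are assumed to be multiples of `block_size`.
    """
    height = len(pixels)
    width = len(pixels[0])
    result = [row[:] for row in pixels]  # copy so the input is unchanged
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [pixels[r][c]
                     for r in range(top, top + block_size)
                     for c in range(left, left + block_size)]
            average = sum(block) // len(block)
            for r in range(top, top + block_size):
                for c in range(left, left + block_size):
                    result[r][c] = average
    return result

image = [
    [10, 12, 200, 202],
    [11, 13, 201, 203],
    [50, 52, 90, 92],
    [51, 53, 91, 93],
]
blurred = average_blocks(image, 2)
for row in blurred:
    print(row)
```

Each 2x2 block now needs only one stored value instead of four, but the small differences between neighboring pixels are gone for good.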

Extracting Information from Data

  • DAT-2.C.6: The size of a data set affects the amount of information that can be extracted from it.
  • DAT-2.C.7: Large data sets are difficult to process using a single computer and may require parallel systems.
  • DAT-2.C.8: Scalability of systems is an important consideration when working with data sets, as the computational capacity of a system affects how data sets can be processed and stored.

Where to Start with Data

  • Collecting Data
  • Issues to consider:
    • Source: Do you need more sources?
    • Potential Bias:
      • Intentional: Who collected the data? Do they have an agenda?
      • Unintentional: How is the data collected? Who collected the data?

Data Cleaning

  • Identifying incomplete, corrupt, duplicate, or inaccurate records.
  • Replacing, modifying, or deleting the "dirty" data.
  • Be careful about modifying or deleting!
  • Be sure there is a mistake!
  • Keep records of what data is modified/deleted and WHY.
  • Invalid data may need to be modified - keep form consistent.

Metadata

  • Prefix meta: behind, among, between
  • Metadata – data about data
  • Some data has information about itself.
    • Author
    • Date
    • Length/Size
  • Why?
    • Identify
    • Track/organize

Example of Metadata

  • Data - Photo/Image
    • Date
    • Time
    • Location
    • Height
    • Width
    • Pixels
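
Files of every kind carry metadata, not just photos. As a small sketch, the Python below creates a temporary text file and reads back its name, size, and modification time using the standard library. The file contents and the dictionary keys are made up for the example.

```python
import os
import tempfile
from datetime import datetime

# Create a small file so there is something to inspect.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as handle:
    handle.write("Hello, metadata!")
    path = handle.name

# The file's contents are the data; os.stat returns data about that data.
info = os.stat(path)
metadata = {
    "name": os.path.basename(path),
    "size_in_bytes": info.st_size,
    "last_modified": datetime.fromtimestamp(info.st_mtime).isoformat(),
}
print(metadata)

os.remove(path)  # clean up the temporary file
```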

Using Programs with Data

  • DAT-2.E: Explain how programs can be used to gain insight and knowledge from data.
  • Using programs, the data can be stored in types of lists to be processed.
  • After filtering and cleaning the data, users can utilize the program to interact with the data to gain insight and knowledge.
  • Users can interact with the data by filtering, sorting, combining, transforming, clustering or classifying.
  • Each iteration leads to more knowledge and insight!
  • Spreadsheets are very powerful with Selection and Iteration.
  • Lists in programming languages give the flexibility to do anything the programmer wants.
  • Lists are revisited in Big Idea 3.
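
Filtering, sorting, transforming, and classifying a list can each be done in a line or two of Python. The temperature readings and the 85-degree cutoff below are invented for the example.

```python
temperatures_f = [72, 95, 68, 101, 88, 59]

# Filter: keep only the hot days.
hot_days = [t for t in temperatures_f if t >= 85]
print(hot_days)      # [95, 101, 88]

# Sort: order the filtered values.
sorted_hot = sorted(hot_days)
print(sorted_hot)    # [88, 95, 101]

# Transform: convert Fahrenheit to Celsius.
celsius = [round((t - 32) * 5 / 9, 1) for t in sorted_hot]
print(celsius)       # [31.1, 35.0, 38.3]

# Classify: label every reading.
labels = ["hot" if t >= 85 else "mild" for t in temperatures_f]
print(labels)        # ['mild', 'hot', 'mild', 'hot', 'hot', 'mild']
```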