AP Computer Science Principles Unit 1

Digital Information

Understanding Bits

Information in computers is fundamentally represented by electrical signals in wires, which can either be on or off. This binary system forms the basis for all data processing within a computer, with each wire representing a simple yes or no choice. Although we seldom interact with these binary numbers directly, they are crucial for understanding how computers function.

01:21

Binary Number System

A bit, the smallest piece of information in a computer, represents the state of a single wire as either on (1) or off (0). While the decimal system uses ten digits, the binary system relies on just two digits, zero and one, to represent all numbers. Each position in a binary number is worth twice the position to its right (1, 2, 4, 8, and so on), which allows larger numbers to be stored. Nine, for example, is written 1001 in binary: one 8 plus one 1.

01:59

Binary Counting

The binary number system uses only two digits, 0 and 1, to represent numbers, allowing counting to any value. Similar to the decimal system where each digit's position holds a different value, each position in binary represents a power of 2 instead of 10, with values increasing by factors of 2.
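
The place-value idea can be sketched in a few lines of JavaScript. This is only an illustration: the function name binaryToDecimal is made up for these notes, and it assumes the input is a string containing only the characters 0 and 1.

```javascript
// Convert a string of binary digits to its decimal value by adding up
// the place value (a power of 2) of every position that holds a 1.
function binaryToDecimal(bits) {
  let total = 0;
  let placeValue = 1; // the rightmost position is worth 2^0 = 1
  for (let i = bits.length - 1; i >= 0; i--) {
    if (bits[i] === "1") {
      total += placeValue;
    }
    placeValue *= 2; // each position to the left is worth twice as much
  }
  return total;
}

console.log(binaryToDecimal("1001")); // 9 (8 + 0 + 0 + 1)
console.log(binaryToDecimal("111"));  // 7 (4 + 2 + 1)
```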

02:34

Storing Numbers in Binary

Numbers can be represented in binary using only ones and zeros, which allows for efficient data storage via electrical signals. Each letter, image, and sound can also be translated into numerical values, enabling computers to process all forms of information. For instance, images consist of pixels, while sounds can be visualized as waveforms, both of which generate substantial amounts of data.

03:14

Representing Information

Using the binary number system, eight wires can store numbers from 0 to 255, and with 32 wires, the capacity expands to over four billion. Additionally, various forms of information, such as text, images, and sound, can also be represented as numbers. For example, each letter in the alphabet can be assigned a unique numerical value.
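
The 255 and "over four billion" figures both come from the same rule: n bits can hold values from 0 up to 2^n - 1. A minimal sketch, where largestValue is a hypothetical helper name:

```javascript
// Largest unsigned number that fits in a given number of bits (wires).
// With n bits there are 2^n patterns, and the first one stands for 0,
// so the largest value is 2^n - 1.
function largestValue(bitCount) {
  return 2 ** bitCount - 1;
}

console.log(largestValue(8));  // 255
console.log(largestValue(32)); // 4294967295
```

JavaScript numbers stop being exact for very large integers, so this sketch is only reliable up to 53 bits.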

03:55

Text Representation

Words are represented as sequences of numbers, which can be stored as on or off electrical signals. Similarly, images, videos, and graphics consist of tiny dots called pixels, each of which can be assigned a color through numerical representation.
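
To make "words as sequences of numbers" concrete, here is a small sketch. It assumes the character numbering built into JavaScript (which matches ASCII for plain English letters) and pads each number to 8 bits; the notes themselves do not say which numbering a given computer uses, and textToBits is an illustrative name.

```javascript
// Turn each character of a word into its numeric code, then write that
// code as 8 binary digits (the on/off signals that would be stored).
function textToBits(text) {
  return Array.from(text, (ch) =>
    ch.charCodeAt(0).toString(2).padStart(8, "0")
  );
}

console.log(textToBits("Hi")); // ["01001000", "01101001"]  (72 and 105)
```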

Sequences of bits

Computers use multiple bits to represent data that is more complex than a simple on/off value.

A sequence of two bits can represent four (2^2) distinct values:

00, 01, 10, 11

A sequence of three bits can represent eight (2^3) different values:

000, 001, 010, 011, 100, 101, 110, 111
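
Listing sequences by hand gets error-prone as the bit count grows, so here is a sketch that generates every sequence for a given length by counting from 0 to 2^n - 1. The name allBitSequences is made up for this example.

```javascript
// Generate every possible sequence of the given number of bits by
// counting upward and writing each count in binary, padded with zeros.
function allBitSequences(bitCount) {
  const sequences = [];
  for (let n = 0; n < 2 ** bitCount; n++) {
    sequences.push(n.toString(2).padStart(bitCount, "0"));
  }
  return sequences;
}

console.log(allBitSequences(2).join(", ")); // 00, 01, 10, 11
console.log(allBitSequences(3).length);     // 8
```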

A sequence can represent many things: a number, a character, a pixel. Plus, the same sequence can represent different types of data in different contexts. The sequence 1000011 could represent 67 in a calculator application while also representing the letter "C" in a text file.
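
The 1000011 example can be checked directly. This sketch reads the same bits once as a number and once as a character, assuming JavaScript's built-in character numbering (which agrees with ASCII here):

```javascript
// One bit sequence, two interpretations.
const bits = "1000011";

const asNumber = parseInt(bits, 2);             // read the bits as a base-2 number
const asLetter = String.fromCharCode(asNumber); // read that number as a character code

console.log(asNumber); // 67
console.log(asLetter); // C
```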

04:37

Image and Video Data

Images consist of millions of pixels, and videos display 30 frames per second, generating a vast amount of data. Sound waves can be graphically represented as waveforms, with sound quality increasing from 8-bit to 32-bit audio due to a higher range of numerical representation. Understanding computers involves recognizing that they process information as combinations of ones and zeros, grounded in electrical signals within their circuits.
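
A rough calculation shows how quickly video data adds up. Only the 30 frames per second comes from the notes; the 1920 x 1080 frame size, the 3 bytes per pixel, and the absence of compression are assumptions chosen for illustration.

```javascript
// Estimate the raw (uncompressed) data produced by one second of video.
const width = 1920;          // assumed pixels per row
const height = 1080;         // assumed rows per frame
const bytesPerPixel = 3;     // assumed: one byte each for red, green, blue
const framesPerSecond = 30;  // from the notes

const bytesPerFrame = width * height * bytesPerPixel;
const bytesPerSecond = bytesPerFrame * framesPerSecond;

console.log(bytesPerFrame);  // 6220800
console.log(bytesPerSecond); // 186624000
```

Under these assumptions a single second of video is already about 187 million bytes, which is one reason compression matters.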

Bytes

A bit is the smallest piece of information in a computer, a single value storing either 0 or 1. A byte is a unit of digital information that consists of 8 of those bits.

Here's a single byte of information:

11110110

Here are three more bytes of information:

00001010 01010100 11011100

Conversion between bits and bytes is a simple calculation: divide by 8 to convert from bits to bytes, or multiply by 8 to convert from bytes to bits.
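
The conversion, plus splitting a long run of bits into bytes, can be sketched like this. The function names are illustrative, and splitIntoBytes assumes the number of bits is a multiple of 8.

```javascript
// Convert between a count of bits and a count of bytes (8 bits each).
function bitsToBytes(bitCount) {
  return bitCount / 8;
}

function bytesToBits(byteCount) {
  return byteCount * 8;
}

// Cut a string of bits into groups of 8, one group per byte.
function splitIntoBytes(bits) {
  return bits.match(/.{8}/g) || [];
}

console.log(bitsToBytes(24)); // 3
console.log(bytesToBits(3));  // 24
console.log(splitIntoBytes("000010100101010011011100").join(" "));
// 00001010 01010100 11011100
```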

Number Limits, Overflow, and Roundoff

A higher number of bits allows for a greater range of numbers, which is crucial when coding or creating apps as users engage with images, sound, or video. Understanding how computers function internally relies on binary, represented by ones and zeros, and the electrical signals within their circuits. This binary system is fundamental to how computers input, store, process, and output information.

When computer programs store numbers in variables, the computer needs to find a way to represent that number in computer memory. Computers use different strategies based on whether a number is an integer or not. Due to limitations in computer memory, programs sometimes encounter issues with roundoff, overflow, or precision of numeric variables.

Integer Representation

An integer is any number that can be written without a fractional component. The same term is used in both programming and in math, so hopefully it's familiar to you.

All of these numbers are integers: 120, 12, 24, -26

How can a programming language represent those integers in computer memory? Well, computers represent all data with bits, so we know that ultimately, each of those numbers is a sequence of 0s and 1s.

To start simple, let's imagine a computer that uses only 4 bits to represent integers. It can use the first bit to represent the sign of the integer, positive or negative, and the other 3 bits for the absolute value.

In that system, the number 1 would be represented like this:

0      0      0      1
+/-    4      2      1
sign   2^2    2^1    2^0

The 0 in the sign bit represents a positive number, and the 1 in the rightmost bit represents the 2^0 (ones) place of the value.

What's the largest number this system could represent? Let's fill all the value bits with 1 and see:

0      1      1      1
+/-    4      2      1
sign   2^2    2^1    2^0

That's the positive number 7, since 2^2 + 2^1 + 2^0 = (4 + 2 + 1) = 7.
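
Here is a sketch of how the imaginary 4-bit system reads its bits: the first bit is the sign and the remaining three are the absolute value. The function name is made up for these notes. Most real computers store negative integers with a different scheme (two's complement), so this follows only the simplified system described here.

```javascript
// Decode a 4-bit pattern where the first bit is the sign
// (0 = positive, 1 = negative) and the last three bits are the value.
function decodeSignMagnitude4(bits) {
  const sign = bits[0] === "1" ? -1 : 1;
  const magnitude = parseInt(bits.slice(1), 2); // 3 value bits: 4s, 2s, 1s
  return sign * magnitude;
}

console.log(decodeSignMagnitude4("0001")); // 1
console.log(decodeSignMagnitude4("0111")); // 7
console.log(decodeSignMagnitude4("1111")); // -7
```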

Overflow

What would happen if we ran a program like this on the 4-bit computer, where the largest positive integer is 7?

var x = 7;
var y = x + 1;

The computer can store the variable x just fine, but y is one greater than the largest integer it can represent with the 4 bits. In a case like this, the computer might report an "overflow error" or display a message like "number is too large". It might also truncate the number (capping all results to 7) or wrap the number around (so that 8 becomes 0, because only the three value bits are kept).

We don't want to end up in any of those situations, so it's important we know the limitations of our language and environment when writing programs.
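
The possible outcomes can be simulated. This is a sketch of the imaginary 4-bit computer, not of how JavaScript itself behaves; add4Bit and its onOverflow parameter are hypothetical, only positive numbers are handled, and the wrap rule shown (keep only the three value bits, so 8 becomes 0) is one possible rule among several.

```javascript
const MAX_VALUE = 7; // largest positive integer with 3 value bits

// Add two small positive integers the way the 4-bit computer might,
// choosing what happens when the result does not fit.
function add4Bit(a, b, onOverflow) {
  const result = a + b;
  if (result <= MAX_VALUE) {
    return result; // fits, nothing special happens
  }
  if (onOverflow === "clamp") {
    return MAX_VALUE; // truncate: cap every result at 7
  }
  if (onOverflow === "wrap") {
    return result % (MAX_VALUE + 1); // keep only the three value bits
  }
  throw new RangeError("overflow error: number is too large");
}

console.log(add4Bit(7, 1, "clamp")); // 7
console.log(add4Bit(7, 1, "wrap"));  // 0
```

Real JavaScript numbers have far more than 4 bits, so the original two-line program would simply produce 8 there.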

The Need for Compression

Modern computers can store increasingly large numbers of files, but file size still matters. The smaller our files are, the more files we can store.

We use compression algorithms to reduce the amount of space needed to represent a file. There are two types of compression: lossless and lossy.

Lossless compression algorithms reduce the size of files without losing any information in the file, which means that we can reconstruct the original data from the compressed file. Lossy compression algorithms reduce the size of files by discarding the less important information in a file, which can significantly reduce file size but also affect file quality.

Compression Algorithm

Computers can compress text by finding repeated sequences and replacing them with shorter representations, much as people shorten words with abbreviations. Unlike people, computers don't need the end result to sound the same, so they can compress even further.

Let's try it with this quote from William Shakespeare:

to be or not to be, that is the question

The most obvious repeated sequences are "to" and "be", so the computer could represent them with a character that isn't part of the original text, like:

⊜ ⬗ or not ⊜ ⬗, that is the question

Any repeated sequence can be replaced, even if it's not a whole word, so the computer can also replace "th":

⊜ ⬗ or not ⊜ ⬗, ⟡at is ⟡e question

The computer also needs to store the table of replacements that it made, so that it can reconstruct the original.

replacement   original
⊜             to
⬗             be
⟡             th
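
The substitution steps above can be sketched as a tiny lossless compressor. The replacement table is fixed by hand to match the example rather than discovered automatically, and the names compress and decompress are illustrative. Because the three symbols never appear in the original text, reversing the table always recovers the original.

```javascript
// Replacement table from the example: [original sequence, symbol].
const replacements = [
  ["to", "⊜"],
  ["be", "⬗"],
  ["th", "⟡"],
];

// Replace every occurrence of each original sequence with its symbol.
function compress(text) {
  let result = text;
  for (const [original, symbol] of replacements) {
    result = result.split(original).join(symbol);
  }
  return result;
}

// Reverse the table lookup to rebuild the original text.
function decompress(text) {
  let result = text;
  for (const [original, symbol] of replacements) {
    result = result.split(symbol).join(original);
  }
  return result;
}

const quote = "to be or not to be, that is the question";
const compressed = compress(quote);

console.log(compressed);                       // ⊜ ⬗ or not ⊜ ⬗, ⟡at is ⟡e question
console.log(decompress(compressed) === quote); // true
```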

Creative Distribution Options

RZA's insights highlight the ongoing struggle to treat music with the same regard as traditional art, emphasizing its devaluation in a culture where it is often given away for free. The Wu-Tang Clan's unique choice to sell an album as a collector's item for two million dollars, restricted from exploitation for 88 years, showcases how copyright enables artists to experiment with their music distribution creatively. Such control through copyright allows artists to decide how they share their music, whether on mainstream platforms like YouTube and Spotify or through innovative methods of their choosing.

03:12

Value of Art and Music

The Wu-Tang Clan exemplifies a unique approach to music by treating it as high art, having sold an exclusive album to a private collector under strict conditions that prevent profit exploitation for 88 years. This highlights the power of copyright, allowing artists to choose how to distribute and profit from their music, unlike many bands who might not pursue such unconventional strategies. Ultimately, copyright provides artists the flexibility to innovate in their release methods, empowering them to control their creative output.

Open Access Definition

Open Access allows for free, immediate online access to research articles, enabling anyone worldwide to read, reuse, and innovate with the content. This model emerged from the necessity to distribute complex scientific manuscripts widely, overcoming the previous high costs associated with publishing such detailed research.

00:36

Traditional Publishing Process

The traditional publishing process for scientific work involves submitting papers to journals, which manage reviews, revisions, typesetting, printing, and distribution. This system initially facilitated scientific progress effectively. However, with the advent of digitization, it has transitioned to an electronic format, fundamentally changing how scientific work is published and disseminated.

01:11

Rising Journal Subscription Costs

Journal subscription costs have skyrocketed, with prices rising over 250% beyond inflation in the last 30 years. This surge has led to a crisis for libraries, making access to many academic disciplines increasingly unaffordable and highlighting the absurdity of the situation.

01:45

High Subscription Prices

Subscription fees for academic journals are exorbitantly high, often exceeding $1,000 a year, with averages of $4,225 for chemistry and $3,649 for physics. Even fields like agriculture and geology see costs over $1,000, and some journals, such as Ketcher Hedrin, charge as much as $440,000. Despite these high prices, the journals themselves do not produce the academic content.

02:13

Inequity in Access to Research

The funding structure of scientific research is illogical, as those who write papers are not the ones who review them, leading to limited access to critical research findings. Many individuals, including students, often face barriers when trying to obtain full research papers, leaving them unaware of the systemic issues that hinder knowledge dissemination. This inequity in access undermines the core purpose of science, which is to discover and share new information.

03:08

Impact on Education and Research

Access to journal literature is crucial for student education and research, particularly in low and middle-income countries where barriers hinder academic contributions. Personal experiences, such as the inability to access medical literature during a family emergency, highlight the profound impact of restricted access on crucial knowledge and decision-making.

04:03

Personal Experience with Access Issues

Faced with decisions on medical treatment, the narrator, a trained scientist, grapples with the challenges of accessing relevant research articles. Despite purchasing numerous papers, the ambiguity of abstracts leaves them uncertain about the content's applicability, resulting in financial strain and frustration. This highlights the critical issue of lack of access to clear and accessible scientific information.

05:00

Need for Open Access Models

Publishing should be free since taxpayers fund both research and subscriptions, suggesting a need for open access models that promote broad distribution of knowledge instead of restricting it. This aligns with the understanding that the government pays for various aspects of scholarly publishing.

05:28

Full Reuse Rights Importance

Open Access allows free access to articles, eliminating paywalls and granting full reuse rights, enabling scientists to develop advanced tools that analyze and mine data from numerous articles. This capability allows for the discovery of connections and snippets of information across different fields, which would be impossible for individual researchers to achieve without negotiating rights with numerous publishers. An Open Access framework ensures that valuable research information is universally accessible on the internet.

06:17

Cultural Resistance to Change

Despite a general support for openness in scientific practices, there remains a cultural resistance to change due to the slow adoption of new norms. Many scientists prioritize traditional metrics, like impact factors and journal prestige, which hinders a more responsive approach to community demands for openness. This conservatism in practice limits the advancement of data mining capabilities.

06:53

Future of Scientific Publishing

The current scientific publishing model lacks evidence of being optimal, suggesting a need for experimentation with diverse publishing systems. Corporations may innovate faster than governments, and those scientists and publishers slow to adapt may fall behind as openness becomes the future of publishing. Creative solutions will determine which entities thrive in this evolving landscape.

07:28

Benefits of Open Access for Researchers

Graduate students should actively engage in conversations about open access with their research teams and PIs to express their commitment to the importance of accessible scientific work. Greater visibility of their research allows for collaborative advancements, benefiting not only the audience but also the researchers themselves. Open access fosters an environment where scientific knowledge can flourish without restrictions.

07:54

Alternative Access Options

The acceleration of scientific discovery is enhanced through openness, as it allows for quicker sharing of knowledge. Researchers have the option to make their articles freely accessible, even when published in subscription journals, enabling broader access to their findings.