AP Computer Science Principles Review
Compression: is also an important consideration when it comes to backing up and archiving your important files, particularly for uploading over the Internet.
Compression is a two-way process: a compression algorithm can be used to make a data package smaller, but it can also run the other way, to decompress the package into its original form.
Data compression: is useful in computing to save disk space, or to reduce the bandwidth used when sending data (e.g., over the Internet).
Data compression deals with taking a string of bytes and compressing it down to a smaller set of bytes, so that it takes less bandwidth to transmit or less space to store on disk.
Lossless algorithms: are those that can reconstruct the original message exactly from the compressed message, while lossy algorithms can only reconstruct an approximation of the original message.
Lossless algorithms are typically used for text, and lossy algorithms for images and sound, where a little bit of loss in resolution is often undetectable, or at least acceptable.
Lossless compression: packs data in such a way that the compressed package can be decompressed, and the data can be pulled out exactly the same as it went in.
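The round trip described above can be sketched with run-length encoding, one simple lossless technique. It is only an illustration chosen for these notes; the function names rle_encode and rle_decode are made up for the example.

```python
def rle_encode(text):
    """Collapse each run of repeated characters into a (character, count) pair."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the previous run: bump its count.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # A new character starts a new run.
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Expand the (character, count) pairs back into the original text."""
    return "".join(ch * count for ch, count in runs)


original = "AAAABBBCCD"
packed = rle_encode(original)
print(packed)                          # the compressed form
print(rle_decode(packed) == original)  # the data comes out exactly as it went in
```

Run-length encoding only saves space when the data has long runs of repeated values, so it is a sketch of the idea, not a general-purpose compressor.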
Text compression: is another important area for lossless compression.
It is very important that the reconstruction is identical to the original text, as very small differences can result in statements with very different meanings.
Lossy compression is a technique that does not decompress digital data back to 100% of the original.
Lossy methods can provide high degrees of compression and result in smaller compressed files, but some number of the original pixels, sound waves, or video frames are removed forever.
Lossy is used in an abstract sense, however, and does not mean random lost pixels, but instead means loss of a quantity such as a frequency component, or perhaps loss of noise.
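A minimal sketch of the lossy idea, assuming the data is a list of hypothetical sound sample values: each sample is rounded to the nearest multiple of a step size, so fine detail is dropped and can never be recovered. Rounding like this (quantization) is just one simple way to lose a quantity in a controlled manner; the names lossy_compress and lossy_decompress are made up for the example.

```python
def lossy_compress(samples, step=10):
    """Store only which multiple of `step` each sample is closest to."""
    return [round(sample / step) for sample in samples]


def lossy_decompress(levels, step=10):
    """Rebuild approximate samples; the detail dropped by rounding stays lost."""
    return [level * step for level in levels]


samples = [12, 18, 23, 27, 31]            # hypothetical sample values
restored = lossy_decompress(lossy_compress(samples))
print(restored)                           # close to the original values
print(restored == samples)                # but not identical
```

A larger step gives smaller numbers to store but a rougher approximation, which is the usual trade-off with lossy methods.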
A popular way of defining abstraction is information hiding.
Just as related program statements are bundled together, related program variables can be bundled together.
Such abstractions allow us to think of the data within a program hierarchically.
A list is an example of data abstraction.
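As a small illustration of a list as data abstraction, the sketch below compares three separate variables with one list. The score values and variable names are made up for the example.

```python
# Without a list: one variable per score, and the code changes if a score is added.
score1, score2, score3 = 88, 92, 79
average_without_list = (score1 + score2 + score3) / 3

# With a list: one name stands for the whole collection, however long it is.
scores = [88, 92, 79]
average_with_list = sum(scores) / len(scores)

print(average_without_list == average_with_list)
```

The list version keeps working unchanged if more scores are appended, because the details of how many values there are stay hidden behind the single name.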
The use of parameters can make procedures more flexible.
Parameters: allow the calling program to send values to the procedure.
They are passed to the procedure as arguments when the procedure is called.
The values sent to the procedure can be different each time, making the procedure more flexible through the ability for multiple calls to the same section of code.
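A short sketch of parameters and arguments, using a hypothetical procedure named rectangle_area. The same section of code is called twice with different values.

```python
def rectangle_area(width, height):
    """width and height are the parameters of this procedure."""
    return width * height


# The values in parentheses are the arguments sent on each call.
print(rectangle_area(3, 4))
print(rectangle_area(10, 2.5))
```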
For each programming language, there are prewritten programs that provide commonly needed functionality, and these programs are stored in libraries, which are collections of such programs.
API stands for “Application Programming Interface.”
The API documentation provides the information needed to set up the interface and use the newly connected software.
Generating random numbers is a frequently needed feature in programs.
Most programming languages have a library of prewritten code for a variety of random number generators.
RANDOM needs two values passed to it as arguments: the beginning and the end of the range from which the random number is selected.
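A sketch of a RANDOM-style procedure built on Python's random library. It assumes both ends of the range can be selected, which is how random.randint behaves; the wrapper name RANDOM simply mirrors the notes.

```python
import random


def RANDOM(a, b):
    """Return a random integer from a to b (assumption: both ends included)."""
    return random.randint(a, b)


# Simulate rolling a six-sided die 1000 times.
rolls = [RANDOM(1, 6) for _ in range(1000)]
print(min(rolls) >= 1 and max(rolls) <= 6)
```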
Simulations: are designed to represent and mirror the real world for testing.
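A minimal simulation sketch, assuming we want to model flipping a fair coin many times. The function name simulate_coin_flips and the optional seed parameter are made up for the example; the seed only makes a run repeatable.

```python
import random


def simulate_coin_flips(trials, seed=None):
    """Flip a simulated fair coin `trials` times and return the fraction of heads."""
    rng = random.Random(seed)  # a separate generator, so a seed gives a repeatable run
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials


print(simulate_coin_flips(10000))
```

The result is usually close to 0.5 but rarely exactly 0.5, which is typical of simulations: they approximate the real world and do not reproduce it perfectly.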
An instance of a problem is a specific example.
A decision problem has a yes or no answer.
An optimization problem is one that should find the best solution for the problem.
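To make the difference concrete, here is a sketch using a made-up list of route lengths. The list itself is one instance of the problem, the first function answers a decision problem, and the second answers an optimization problem. All names and values are hypothetical.

```python
route_lengths = [42, 17, 63, 25]  # one instance: hypothetical route lengths in miles


def exists_route_under(lengths, limit):
    """Decision problem: is there a route shorter than the limit? (yes or no)"""
    return any(length < limit for length in lengths)


def shortest_route(lengths):
    """Optimization problem: what is the best (shortest) route length?"""
    return min(lengths)


print(exists_route_under(route_lengths, 20))
print(shortest_route(route_lengths))
```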
The efficiency of an algorithm deals with the resources needed to run it, in terms of how long it will take and how much memory it will need.
This becomes especially important with extremely large datasets, and efficiency is usually stated in terms of the size of the input.
Efficiency: can be determined formally by mathematical proof, and measured informally by running the algorithm on datasets of different sizes and recording how long it takes and how much memory it needs.
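An informal measurement can be sketched by counting steps on inputs of different sizes. The two search procedures below are chosen only for illustration (they are not named in these notes); counting comparisons stands in for measuring running time.

```python
def linear_search_steps(items, target):
    """Count how many items are checked when scanning from the start."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count how many items are checked when halving a sorted list each time."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


# Search for the last element in sorted lists of growing size.
for n in (10, 1000, 100000):
    data = list(range(n))
    print(n, linear_search_steps(data, n - 1), binary_search_steps(data, n - 1))
```

The linear count grows in step with the input size, while the binary count grows far more slowly, which is why efficiency is stated in terms of the size of the input.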
Algorithms have limits: for some problems, we do not have algorithms efficient enough to solve them.
The algorithms we do have for these problems can’t run in a reasonable amount of time with our current technology.
Heuristic approach: is an approach that produces a solution that may not be optimal but is close enough to use.
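A sketch of a heuristic, using coin change as a made-up example: always take the largest coin that still fits. The function name greedy_change and the coin values are assumptions for illustration.

```python
def greedy_change(amount, coins):
    """Heuristic: repeatedly take the largest coin that does not exceed the amount left."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used


print(greedy_change(63, [1, 5, 10, 25]))  # works well for these coin values
print(greedy_change(6, [1, 3, 4]))        # uses 3 coins, although 3 + 3 needs only 2
```

The second call shows the trade-off: the heuristic is fast and its answer is usable, but it is not guaranteed to be the best one.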
Decidable problem: is one where an algorithm can be written that results in a correct “yes” or “no” answer for all inputs.
Determining if a number is prime is an example of a decidable problem.
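The prime check is decidable because a procedure like the sketch below always finishes with a correct yes or no. Trial division is used here only because it is easy to read; the name is_prime is made up for the example.

```python
def is_prime(n):
    """Return True if n is prime, False otherwise, for any integer n."""
    if n < 2:
        return False
    divisor = 2
    # Only divisors up to the square root of n need to be tried.
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


print([n for n in range(20) if is_prime(n)])
```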
Undecidable problem: does not have an algorithm that can give a correct “yes” or “no” for all cases of the problem.
The Internet: is a network of networks.
The Internet is very hardware driven with wires, cables, and devices such as routers and servers.
Routers are computing devices along a path that send the information along to the next stop on the path.
Routing: is the process of finding a path from sender to receiver.
Bandwidth: is a measure of the maximum amount of data that can be transferred through a channel or network connection.
It is measured in bits per second, and it determines how quickly you can download files from, and upload files to, the Internet.
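A rough worked example of what bits per second means for a transfer. The function name transfer_seconds and the file size and bandwidth values are made up; the calculation ignores delays and protocol overhead, so it gives a best-case time.

```python
def transfer_seconds(file_size_bytes, bandwidth_bits_per_second):
    """Best-case transfer time: total bits divided by bits per second."""
    total_bits = file_size_bytes * 8  # 8 bits in each byte
    return total_bits / bandwidth_bits_per_second


# A 25,000,000-byte file over a 100,000,000 bits-per-second connection.
print(transfer_seconds(25_000_000, 100_000_000))
```

Note that file sizes are usually given in bytes while bandwidth is given in bits, so the factor of 8 is easy to forget.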
Internet protocol (IP): is responsible for addressing and routing your online requests.
Transmission control protocol (TCP): is a protocol that defines how computers send packets of data to each other, including checking for missing packets and retransmitting them.
User datagram protocol (UDP): is a protocol that allows computer applications to send messages without checking for missing packets, which saves the time needed to retransmit them.
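The difference can be sketched with a toy model of packets. This is not how real TCP or UDP is implemented; it only assumes that each packet carries a sequence number, so a TCP-style receiver can reorder packets and notice which ones are missing. All function names are made up for the example.

```python
import random


def to_packets(message, size=4):
    """Split a message into (sequence_number, chunk) packets."""
    return [(number, message[start:start + size])
            for number, start in enumerate(range(0, len(message), size))]


def reassemble(packets):
    """Put packets back in order by sequence number and join the chunks."""
    return "".join(chunk for _, chunk in sorted(packets))


def missing_sequence_numbers(packets, expected_count):
    """List the packets a TCP-style receiver would ask to have retransmitted."""
    received = {number for number, _ in packets}
    return [number for number in range(expected_count) if number not in received]


message = "HELLO, WORLD!"
packets = to_packets(message)

# Packets may arrive out of order; sequence numbers let the receiver fix that.
arrived = packets[:]
random.shuffle(arrived)
print(reassemble(arrived) == message)

# If packet 2 is lost, a TCP-style receiver detects the gap.
# A UDP-style receiver would just use whatever arrived.
lossy_arrival = [packet for packet in packets if packet[0] != 2]
print(missing_sequence_numbers(lossy_arrival, len(packets)))
```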
The Internet has cables and wires spanning the world that connect computers.
Natural disasters: could destroy this hardware, bringing network activity to a halt.
Solar flare: is an intense burst of radiation released from the sun.
That radiation can also damage or disrupt network hardware.
Parallel computing: a solution can consist of a sequential portion and a parallel portion.
The parallel portion takes as long as the longest of the tasks done in parallel.
The whole parallel computing solution takes as long as its sequential tasks plus the longest of its parallel tasks.
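The timing rule can be worked through with made-up task times. The sketch assumes every parallel task gets its own processor; the speedup calculation (sequential time divided by parallel time) is added for illustration, and all names are hypothetical.

```python
def parallel_time(sequential_tasks, parallel_tasks):
    """Sequential tasks add up; parallel tasks cost only as much as the longest one."""
    return sum(sequential_tasks) + max(parallel_tasks, default=0)


def all_sequential_time(sequential_tasks, parallel_tasks):
    """Time if every task is run one after another on a single processor."""
    return sum(sequential_tasks) + sum(parallel_tasks)


def speedup(sequential_tasks, parallel_tasks):
    """How many times faster the parallel solution is than the sequential one."""
    return (all_sequential_time(sequential_tasks, parallel_tasks)
            / parallel_time(sequential_tasks, parallel_tasks))


sequential_tasks = [10]        # hypothetical times in seconds
parallel_tasks = [30, 20, 40]  # run at the same time on separate processors

print(all_sequential_time(sequential_tasks, parallel_tasks))  # 10 + 30 + 20 + 40
print(parallel_time(sequential_tasks, parallel_tasks))        # 10 + 40
print(speedup(sequential_tasks, parallel_tasks))
```

Because the sequential portion cannot be sped up, adding more processors eventually stops helping.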
Cloud computing: offers new ways for people to communicate, making collaboration easier and more efficient.
Storing documents in the “cloud” simply means they are stored on a computer server at a location different from where the owner of the files is located.
Technology has had a major impact on the world, enabling innovation through the sharing of resources and computational artifacts.
It also allows us to virtually meet with people from anywhere.
It is helping us on the path of becoming a true global society.
The digital divide affects people’s access to information, knowledge, markets, and different cultures.
Bias: is intentional or unintentional prejudice for or against certain groups of people, and it shows up in computing innovations too.
Humans write the algorithms, and our biases can make their way into the algorithms and the data used by innovations without us realizing it.
Artificial intelligence programs: are used more and more in ways such as screening job applications, determining if a person merits credit to purchase a house, and identifying which areas have more crime.
Crowdsourcing: allows people to share information and ask the “crowd” (anyone who accesses the site) for feedback, help solving problems, employment, or funding.
Another use of crowdsourcing is when scientists share data and ask nonscientists, or “citizen scientists,” to look for and report on patterns or other interesting features of the data, or to “donate” computer time when their machines are inactive.
This helps to “scale up” processing capability at little to no cost to the organization seeking the resources.
Anything a person creates, including any computational artifacts created with a computer, is the intellectual property of that person.
Material created by someone else that you use in any way should always be cited.
Peer-to-peer networks exist that are used to illegally share files of all types.
Devices that continually monitor and collect data, such as a voice-activated device we install or video cameras used for facial recognition posted in our communities, can raise legal or ethical issues.
Creative Commons: provides a way for creators of software, images, music, videos, and any computational artifact to share their creations with stipulations for sharing and permission from the author clearly indicated.
Digital data: is easy to find, copy, and paste, so ensuring you have written permission from the creator or owner is important.
Not only do we have access to data but to software as well.
Open source software: is software that is freely shared, updated, and supported by anyone who wants to do so.
The availability of this software for everyone has greatly expanded people’s abilities to participate in a variety of tasks that many would not have been able to participate in otherwise.
The sharing of huge amounts of public data by organizations, such as the U.S. government, provides the opportunity for anyone to search for information or to help solve problems.
In addition, the availability of open databases in a variety of fields—including science, entertainment, sports, and business—has benefited people everywhere.
Social media sites, as well as search engines, publish what the most frequent searches and posts are about.
Our browsers keep a list of our most frequently visited sites on their home page to help us out.
The search engines are able to identify when more people than usual are watching a video or searching for a topic.
Analytics: identify trends for marketing purposes and help businesses determine what and where customers are searching for their products and their competitors’ products, how long an item sits in a virtual shopping cart, and when people buy.
Data mining: is a field of study that analyzes large datasets.
Machine learning: is a subset of data mining.
Machine learning uses algorithms to analyze data and predict behavior and is used in Artificial Intelligence (AI).
Any information that identifies you is considered Personally Identifiable Information (PII).
It includes data such as your address, age, or social security number.
It also includes data about you, such as your medical or financial information.
Our PII is also used by websites to show us certain information or related topics based on our prior visits.
Digital footprints and fingerprints: are the trail of little pieces of data we leave behind as a sign of our presence as we go through our daily lives.
Many people willingly provide personal information to sites to gain access or privileges, whether it’s through sports teams, shopping, or restaurants.
Their data is stored and may be sold with or without their knowledge or permission.
Many web browsers now have “incognito” or “private” modes so that web searches and file downloads are not recorded in the web history.
Some web browsers attest that they do not track and retain your search data.
Many aspects of our lives are much easier today because of the easy access to all sorts of sites and information that the Internet provides.
This can range from shopping, entertainment, and sports sites to price comparisons.
Cybersecurity: has a global impact because now anyone from anywhere can attempt to gain unauthorized entry to someone else’s computer, data, servers, or network.
The security of our data deals with the ability to prevent unauthorized individuals from gaining access to it and to prevent those who can view our data from changing it.
Strong passwords: help block those trying to gain unauthorized access.
Multifactor authentication: is another layer that is increasingly used.
Cybersecurity: protects our electronic devices and networks from attacks and unauthorized use.
These attacks can come in many forms and can have a major impact on those affected.
Different types of attacks cause different problems.
Data may be damaged, or the device may be used to further spread the malware.
Phishing: attacks create e-mails and/or websites that look legitimate, hoping to induce a person to click on a malicious link.
Computer viruses: are like human viruses.
They attach themselves to, or are part of, an infected file.
Keylogging software: is a form of malware that captures every keystroke and transmits it to whoever planted it.
Cryptography: is the writing of secret codes.
Encryption: is converting a message to a coded format.
Decryption: is deciphering the encrypted message.
Security also relates to encrypting data before it is transmitted to ensure it remains secure if it is intercepted during transmission.
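Encryption and decryption can be sketched with a Caesar cipher, a very old and easily broken scheme used here only to show the round trip. The key is the number of letters to shift; the function names are made up for the example, and the sketch converts everything to uppercase.

```python
import string

ALPHABET = string.ascii_uppercase


def encrypt(message, key):
    """Shift each letter forward by `key` places; leave other characters alone."""
    result = []
    for ch in message.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)
    return "".join(result)


def decrypt(ciphertext, key):
    """Undo the shift by moving each letter back by `key` places."""
    return encrypt(ciphertext, -key)


coded = encrypt("MEET AT NOON", 3)
print(coded)
print(decrypt(coded, 3))
```

Anyone who intercepts the coded message without knowing the key sees only the shifted letters, although with just 25 possible shifts this particular cipher is trivial to crack.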
Public key encryption: uses open standards, meaning the algorithms used are published, available to everyone, and discussed by experts and interested parties.
The key is what keeps information secret until the person it is intended for decrypts it.
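The idea that the algorithm is public while a key stays private can be sketched with a toy version of RSA, one well-known public key algorithm (these notes do not name a specific one). The primes here are far too small to be secure and the message is just a number; everything below is for illustration only.

```python
# Toy RSA key pair built from two tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: anyone may know this
d = 2753                   # private exponent: chosen so that (e * d) % phi == 1

public_key = (e, n)        # published for everyone
private_key = (d, n)       # kept secret by the intended receiver


def encrypt(message_number, key):
    """Anyone can encrypt with the receiver's public key."""
    exponent, modulus = key
    return pow(message_number, exponent, modulus)


def decrypt(cipher_number, key):
    """Only the holder of the private key can undo the encryption."""
    exponent, modulus = key
    return pow(cipher_number, exponent, modulus)


secret = 65
cipher = encrypt(secret, public_key)
print(cipher != secret)
print(decrypt(cipher, private_key) == secret)
```

The encrypt and decrypt steps are the same published calculation; only the private key differs, which is what keeps the message secret.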
The Internet is based on a “trust” model.
This means that digital certificates can be purchased from Certificate Authorities (CAs), which identify trusted sites.
They issue certificates that businesses, organizations, and individuals load to their websites.
The certificates verify to web browsers that the encryption keys belong to the business, thereby enabling online purchases and the sending and receiving of secure documents.