* Capability of a system to change in size & scale to meet new demands
* An important consideration when working with data sets, as the computational capacity of a system affects how data sets can be processed & stored.
2
program
* Processes information to allow users to gain insight & knowledge about data
* Used to filter & clean digital data
* Combines data sources, clusters data, & classifies data
* Allows patterns to emerge
3
API (application programming interface)
* Defines how other programs can communicate with it & use it
* Useful for real-time data that changes frequently
4
data
* Provides opportunities for identifying trends, making connections, & addressing problems
* Digitally processed data may show correlation between variables
5
metadata
* Data about data
* Changes & deletions made to metadata do not change the primary data.
* Allows data to be structured & organized
* Used for finding, organizing, & managing information
* Increases the effective use of data or data sets by providing additional information
6
data abstraction
* Manages complexity by giving a collection of data a name without referencing the specific details of the representation.
* Provides a separation between the abstract properties of a data type & the concrete details of its representation
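A minimal Python sketch of this idea. The list name `scores` and the helper `average` are made up for illustration; the point is only that one name stands for the whole collection, and the code using it does not depend on how the values are stored.

```python
# One name, `scores`, refers to the whole collection of values.
scores = [88, 92, 79, 95]

def average(values):
    """Return the mean of a collection of numbers."""
    return sum(values) / len(values)

# The caller works with the named collection, not with individual variables
# such as score1, score2, score3, ...
class_average = average(scores)
print(class_average)
```

Adding or removing a score changes only the list, not the code that computes the average.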
7
big data
* Data sets that are too large to fit on a typical computer or to be processed by a standard spreadsheet or database program
* May require parallel systems
* Size of a data set affects the amount of information that can be extracted from it
8
data science
Manipulation of large data sets to extract information & visualize results
9
cleaning data
* Makes data uniform without changing its meaning
* Data translation & transformation may be necessary to convert data from one format to another
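As a rough illustration, the sketch below makes several spellings of the same value uniform. The sample values and the `CANONICAL` lookup table are assumptions invented for this example, not part of any particular tool.

```python
# Raw entries that all mean the same thing but are written differently.
raw_states = ["  california", "CA", "Calif.", "California "]

# Hypothetical lookup table mapping known variants to one canonical spelling.
CANONICAL = {
    "ca": "California",
    "calif.": "California",
    "california": "California",
}

def clean(value):
    """Trim whitespace, ignore case, and map known variants to one spelling."""
    key = value.strip().lower()
    # Unknown values are kept (trimmed) rather than guessed at.
    return CANONICAL.get(key, value.strip())

cleaned = [clean(v) for v in raw_states]
print(cleaned)
```

The meaning of each entry is unchanged; only its form is made consistent.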
10
Data-Information-Knowledge-Wisdom pyramid
* Data is basic facts or figures
* Information is data that has been organized or visualized
* Knowledge extracts generalizations from information
* Wisdom applies knowledge to make sound decisions
11
correlation
* Statistical measure that indicates that two or more variables fluctuate together
* Does not equal causation (cause & effect relationship)
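One common way to measure how strongly two variables fluctuate together is the Pearson correlation coefficient, which ranges from -1 to 1. The card does not name a specific measure, so treat the choice of Pearson, the function name `pearson`, and the sample data as assumptions for this sketch.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Made-up data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
test_scores = [52, 60, 71, 80, 88]

r = pearson(hours, test_scores)
print(round(r, 3))  # close to 1: the two variables rise together
```

A value near 1 here says the variables move together; on its own it does not show that studying caused the higher scores.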
12
spreadsheet
* Document where the data is arranged in rows & columns
* Helps organize & find trends in information
* Allows formulas to be used for calculations & provides charting capabilities
* Programs can also be used to filter & clean digital data
13
cell
Box in a spreadsheet identifiable by its column letter & row number
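To make the column-letter and row-number addressing concrete, here is a small sketch that converts a reference such as `B3` into zero-based row and column indices. The function name `cell_to_indices` is hypothetical, and handling multi-letter columns like `AA` is an assumption based on how common spreadsheet programs label columns past `Z`.

```python
def cell_to_indices(ref):
    """Convert a reference such as 'B3' to zero-based (row, column) indices."""
    letters = "".join(ch for ch in ref if ch.isalpha()).upper()
    digits = "".join(ch for ch in ref if ch.isdigit())

    # Column letters work like a base-26 number where A=1, B=2, ..., Z=26.
    column = 0
    for ch in letters:
        column = column * 26 + (ord(ch) - ord("A") + 1)

    # Spreadsheet rows and columns start at 1; list indices start at 0.
    return int(digits) - 1, column - 1

print(cell_to_indices("B3"))  # (2, 1): third row, second column
```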
14
JSON (JavaScript Object Notation)
* Standard data file format used widely on the web
* Data objects are represented using lists & attribute-value pairs
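A short sketch using Python's standard `json` module shows the lists and attribute-value pairs in practice. The record contents are made up for illustration.

```python
import json

# A JSON object: attribute-value pairs, one of whose values is a list.
text = '{"name": "Ada", "languages": ["Python", "Java"], "age": 36}'

# Parsing turns the text into Python dictionaries and lists.
record = json.loads(text)
print(record["languages"][0])

# Serializing turns the Python object back into JSON text.
round_trip = json.dumps(record)
```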
15
GeoJSON
Standard agreed-upon format for geographical information used on the web & in data files
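For illustration, a single GeoJSON point feature can be built as ordinary JSON. The place name and coordinates are sample values chosen for this sketch; note that GeoJSON lists longitude before latitude.

```python
import json

# A GeoJSON Feature: a geometry plus free-form properties.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        # GeoJSON coordinate order is [longitude, latitude].
        "coordinates": [-122.4194, 37.7749],
    },
    "properties": {"name": "San Francisco"},
}

longitude, latitude = feature["geometry"]["coordinates"]
geojson_text = json.dumps(feature)
print(geojson_text)
```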
16
CSV (comma separated values)
Simple text format for data files that puts each row on a separate line, with the columns separated by commas
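The sketch below reads a tiny CSV text with Python's standard `csv` module. The column names and values are invented, and the text is held in a string rather than a file so the example is self-contained.

```python
import csv
import io

# Each row is on its own line; columns are separated by commas.
text = "name,score\nAda,95\nGrace,88\n"

# DictReader uses the first line as column names.
rows = list(csv.DictReader(io.StringIO(text)))

for row in rows:
    # Every value is read as text, so numbers need an explicit conversion.
    print(row["name"], int(row["score"]))
```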
17
CloudDB
* Web-based database service used to store & retrieve data values located on the web
* Data can easily be shared with other devices & users
* Data persists between uses of the app
18
synchronous event
* Performed instantaneously
* Access to data is immediate
* Good for sharing data between uses of the app on the same device
* Not good for sharing data among users on different devices
19
asynchronous event
* Not performed instantaneously
* Storing & retrieving data is not immediate
* Program requests the data operation, & the CloudDB signals the program when it is completed
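The request-then-signal pattern can be sketched in Python with a callback. `FakeCloudDB`, `get_value`, and `got_value` are hypothetical names invented here; this is not the real CloudDB interface, only a stand-in that simulates a network delay with a background thread.

```python
import threading
import time

class FakeCloudDB:
    """Hypothetical stand-in for a web database; not the real CloudDB API."""

    def __init__(self):
        self._store = {"high_score": 42}

    def get_value(self, tag, on_got_value):
        """Start the request and return at once; call on_got_value later."""
        def worker():
            time.sleep(0.1)  # simulated network delay
            on_got_value(tag, self._store.get(tag))

        thread = threading.Thread(target=worker)
        thread.start()
        return thread

results = []

def got_value(tag, value):
    """Runs only when the database signals that the value has arrived."""
    results.append((tag, value))

db = FakeCloudDB()
request = db.get_value("high_score", got_value)
# The program keeps running here; the value is typically not available yet.
request.join()  # wait only so this example finishes in a predictable state
print(results)
```

A synchronous read, by contrast, would simply return the value from the call itself with no callback.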
20
machine learning
* Algorithms that learn intelligent behavior from training data
* Allows computer programs to learn & improve on their own by being given examples of correct & incorrect solutions
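As one very simple illustration of learning from labeled examples, the sketch below uses a nearest-neighbor rule: a new point gets the label of the closest training example. This is just one basic technique chosen for brevity, and the training data, labels, and function name `predict` are made up.

```python
# Training data: (features, label) pairs supplied as examples.
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

def predict(point):
    """Label a new point with the label of its nearest training example."""
    def distance_sq(example):
        features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, point))

    _features, label = min(training, key=distance_sq)
    return label

print(predict((2.0, 1.0)))
print(predict((7.5, 8.0)))
```

Adding more labeled examples to `training` changes the predictions without changing the code, which is the sense in which the behavior comes from the data.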
21
deep learning
uses neural networks with many layers to learn data representations on its own from massive amounts of data.
22
neural network
computer system modeled on simple neurons in the brain
23
AI
computer program that simulates human intelligence
24
algorithmic bias
* Systematic errors in a computer system that create unfair outcomes
* Caused by algorithm design or how the data used by the program is collected/used to train the algorithm.
25
feedback loop
amplification of what happened in the past, regardless of whether it is negative
26
copyright
* Grants the creator of an original work exclusive rights for its use & distribution
* Public gets the benefit of having & using the work without restriction after the monopoly has expired.
27
secondary infringement
* Idea that even if the accused party is not directly liable for copyright infringement, they can still be held liable for playing an indirect role in allowing others to infringe
28
contributory infringement
Service consciously supplies the means for others to infringe.
29
vicarious infringement
Service profits from the known infringement of its users.
30
inducement
Service encourages its users to infringe copyright.
31
DMCA (Digital Millennium Copyright Act)
Criminalizes production & dissemination of technology or services intended to circumvent measures that control access to copyrighted works
32
anti-circumvention provisions (DMCA)
prohibits individuals from circumventing digital blocks placed by copyright holders to restrain the use of their work.
33
safe harbor provisions (DMCA)
* Protect online service providers from liability for the actions of their users
* A provider cannot be held accountable for the behavior of a user who is participating in infringement, provided it meets the Act's conditions (such as acting on takedown notices).
34
Creative Commons
* Enables the free distribution of copyrighted works
* Used when the content creator wants to give others the right to share, use, & build upon the work they have created
35
DRM (Digital Rights Management)
Various access control technologies used to restrict usage of proprietary hardware & copyrighted work
36
Copyright Alert System
Employs third party companies to monitor peer-to-peer networks for infringing activity, reporting their findings to the ISP.
37
Fair Use
limited use of copyrighted material without having permission from the copyright holder
38
Content ID (YouTube)
* Collects the media from the copyright holders & puts the reference files in an enormous database.
* When a video is uploaded to YouTube, it is compared against every file in that database to see if there is a match
* If there is a match, it is up to the copyright holder to decide what to do with the video.
39
open access
unrestricted access & unrestricted reuse
40
open access materials
* Online research output free of restrictions on access & use, such as copyright or license barriers
41
open source software
programs that are made freely available & may be redistributed & modified
42
open source license
* Allows free & “open” access to source code or blueprint/design
* Original work can then be modified & redistributed as long as the original author’s integrity is still intact under the same license.
43
centralized network
resources & workload are coordinated/managed by a centralized computer (server)
44
decentralized network
allocation of resources & workload are distributed to individual devices on a network
45
peer-to-peer
* Computers act as both clients & servers, requesting resources & providing them.
* Server needs to communicate only a tiny amount of directory information
* Large network load for transmitting the files is distributed over the Internet connections of all the users.