NumPy
A library for arithmetic on data arranged in a grid (an array). Every element of an array shares one data type, so an array holds either all numbers or all words (strings).
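As a minimal sketch of the grid idea, the example below builds a small array, applies arithmetic to every element at once, and shows what happens when numbers and words are mixed. The sample values are invented for illustration.

```python
import numpy as np

# A 2 x 3 grid of numbers; every element shares one data type.
grid = np.array([[1, 2, 3], [4, 5, 6]])
doubled = grid * 2  # arithmetic is applied element by element

# An array of words is allowed, but it then holds only strings.
words = np.array(["red", "green", "blue"])

# Mixing numbers and words does not produce a mixed array:
# NumPy converts every element to a string.
mixed = np.array([1, "two", 3])

print(grid.shape)
print(doubled)
print(words.dtype.kind, mixed.dtype.kind)  # "U" means Unicode string
```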
Numplot
A tool used to create charts similar to those in spreadsheet applications.
Pandas
A library for panel data; it provides data structures that resemble spreadsheet sheets and, unlike NumPy, allows different columns to hold different data types.
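A small sketch of a sheet-like structure whose columns hold different data types. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: one text column, one numeric, one boolean.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [91.5, 78.0, 84.0],
    "passed": [True, True, False],
})

# Each column keeps its own data type.
print(df.dtypes)
```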
Series
A one-dimensional array or column of data in pandas.
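To illustrate, selecting one column from a DataFrame gives back a Series. The data below is hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [91.5, 78.0, 84.0],
})

# Selecting a single column returns a Series.
scores = df["score"]

print(type(scores).__name__)
print(scores.ndim)    # number of dimensions
print(scores.mean())  # a Series supports column-wise summaries
```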
Row
Represents one individual observation in a dataset, such as a single test subject.
Conditionals
True/false conditions, also referred to as masks, used to filter data: only the entries where the condition is true are kept.
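A minimal sketch of filtering with a mask in pandas. The data and the cutoff of 80 are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [91.5, 78.0, 84.0],
})

# The condition produces one True/False value per row: the mask.
mask = df["score"] > 80

# Indexing with the mask keeps only the rows marked True.
high = df[mask]

print(mask.tolist())
print(high)
```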
Five C’s of data ethics
Framework for responsible data collection: Consent, Clarity, Consistency/Trust, Control/Transparency, and Consequences.
Consent
An agreement between a service user and the data provider about how data may be used. It is often binary (accept or decline), and data is sometimes sold without the user's consent.
Clarity
The necessity of providing clear information about data usage to users.
Consistency/Trust
The requirement for companies to maintain a reliable adherence to their data use policies.
Control/Transparency
The ability of users to track the use of their data and understand their control over it.
Consequences
Laws and regulations designed to protect individuals' rights regarding their online data.
Big Data
Refers to the massive volume of data generated daily, estimated at 2.5 quintillion bytes.
Signal and noise
The concept that data requires interpretation; it's not self-explanatory and needs context from humans.
Data-driven prediction
The use of data to make informed predictions about future outcomes.
Scatterplot
A type of chart that plots each item in a dataset as a point, positioned by its values for (typically) two variables.
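A short sketch of a scatterplot, assuming Matplotlib as the plotting library (these notes do not say which one the course uses). The two variables and their values are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data: two variables measured for five individuals.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 75, 88]

fig, ax = plt.subplots()
points = ax.scatter(hours, scores)  # one point per (hours, score) pair
ax.set_xlabel("hours studied")
ax.set_ylabel("test score")
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` would display the chart.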