Digital Data
Various types of information including text, numbers, images, and audio
Big Data
Extremely large and complex sets of data that are beyond the ability of traditional data processing methods to handle
Name an example of Big Data
Social Media: User Posts, Comments, Likes, Shares, and Interactions
E-commerce: Customer Browsing Behavior, Purchase History, Reviews, Preferences
Healthcare: Electronic Health Records, Medical Imaging, Genomic Data, Patient Monitoring Systems
Structured Data
Type of data that follows a specific organization or format and has a clear and well-defined structure that makes it easy to understand and process
Unstructured Data
Information that doesn’t have a specific organization or format
Semi-Structured Data
Information that falls somewhere between structured and unstructured data; it has some organizational elements but doesn’t adhere to a strict structure like structured data
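As a rough illustration of the three categories, the sketch below holds similar information in a structured, a semi-structured, and an unstructured form. The sample names, fields, and sentence are made up for this example.

```python
import csv
import io
import json

# Structured: every row follows the same fixed columns, like a table.
structured_text = "name,age\nAva,16\nBen,17\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: JSON has keys and nesting, but records need not share the same fields.
semi_structured_text = '[{"name": "Ava", "tags": ["music"]}, {"name": "Ben", "age": 17}]'
records = json.loads(semi_structured_text)

# Unstructured: free text with no predefined fields.
unstructured_text = "Ava posted a photo from the concert last night."

print(rows[0]["age"])                   # value looked up by column name
print(sorted(records[1]))               # keys present in the second record
print(len(unstructured_text.split()))   # only generic operations, such as counting words
```

Note that the first JSON record has no "age" key at all, which a table with fixed columns would not allow.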
Distributed File Systems
Similar to a virtual storage system that spreads your files across multiple computers or servers; Files are also divided into smaller pieces and stored on different machines
Node
Each computer that stores a piece of your data
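The idea of dividing a file into pieces and spreading them across nodes can be sketched as follows. This is a simplified illustration, not how any particular distributed file system works: the node names, the chunk size, and the round-robin placement are assumptions, and real systems typically also keep extra copies of each piece.

```python
def split_into_chunks(data, chunk_size):
    """Divide the data into pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def distribute(chunks, node_names):
    """Assign chunks to nodes in round-robin order, keeping each chunk's index."""
    placement = {name: [] for name in node_names}
    for index, chunk in enumerate(chunks):
        node = node_names[index % len(node_names)]
        placement[node].append((index, chunk))
    return placement


def reassemble(placement):
    """Collect every chunk from every node and join them in index order."""
    all_chunks = [pair for stored in placement.values() for pair in stored]
    return b"".join(chunk for _, chunk in sorted(all_chunks))


file_data = b"big data is stored in pieces"
chunks = split_into_chunks(file_data, 8)
placement = distribute(chunks, ["node-a", "node-b", "node-c"])  # hypothetical node names
print(reassemble(placement) == file_data)
```

Each node only holds part of the file, so the chunk index is stored alongside every piece to allow the original order to be rebuilt.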
Data Warehouse
Like a library where you organize and store all your data in a structured way (tables), which makes it easier to search, filter, and compare; it is designed to handle lots of data from different sources and make it easier to analyze
Cloud Storage
A virtual storage space that exists on the internet; It’s not a physical place you can touch but rather a network of powerful computers located in different places around the world
Why do we process big data?
To extract meaningful insights and valuable information
Data Collection
Gathering big data from various sources, including social media platforms, sensors, transactions, publicly available data, and crowdsourcing
Crowdsourcing
Concept that involves obtaining ideas, services, or contributions from a large group of people
Data Cleaning
Getting rid of errors, noise, or inconsistencies in big data
What are the five cleaning steps?
Removing Mistakes
Fixing Missing Parts
Getting Rid of Duplicates
Standardizing Formats
Checking for Inconsistencies
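The five steps above can be sketched on a tiny made-up dataset. The records, the field names, and the rule that an age cannot be negative are assumptions chosen for illustration; real cleaning rules depend on the data.

```python
raw_records = [
    {"name": " Ava ", "age": "16", "city": "austin"},
    {"name": "Ben", "age": "", "city": "Dallas"},      # missing age
    {"name": "Ava", "age": "16", "city": "Austin"},    # duplicate once formats match
    {"name": "Cy", "age": "-4", "city": "Houston"},    # impossible age
]


def clean(records):
    cleaned, seen = [], set()
    for record in records:
        # Standardizing formats: trim spaces and use consistent capitalization.
        name = record["name"].strip().title()
        city = record["city"].strip().title()
        # Fixing missing parts: mark a blank age explicitly as None.
        age = int(record["age"]) if record["age"].strip() else None
        # Removing mistakes and checking for inconsistencies: drop impossible ages.
        if age is not None and age < 0:
            continue
        # Getting rid of duplicates: skip records that were already kept.
        key = (name, age, city)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age, "city": city})
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)
```

Standardizing formats happens first here so that " Ava " from "austin" and "Ava" from "Austin" are recognized as the same record.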
Data Storage
Efficiently storing big data
Data Processing
Applying various techniques and algorithms to extract insights and patterns
What are some examples of data processing?
Statistical Analysis
Machine Learning
Data Mining
Descriptive Statistics
Help us understand the basic characteristics of big data
Correlation Analysis
Helps us understand how two or more variables are related
Regression Analysis
Helps us predict one variable based on another variable
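A minimal sketch of descriptive statistics, correlation analysis, and regression analysis on a small made-up dataset. The numbers and variable names are invented for illustration, and the helper functions are hand-written versions of the Pearson correlation and a least-squares line rather than any specific library's API.

```python
import statistics

hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 60, 71, 80, 87]

# Descriptive statistics: basic characteristics of the data.
mean_score = statistics.mean(test_scores)
median_score = statistics.median(test_scores)


def pearson_correlation(xs, ys):
    """Measure how strongly two variables move together (between -1 and 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return co_movement / (spread_x * spread_y)


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


r = pearson_correlation(hours_studied, test_scores)
slope, intercept = fit_line(hours_studied, test_scores)
predicted_score = slope * 6 + intercept  # regression: predict the score for 6 hours

print(mean_score, median_score, round(r, 3), predicted_score)
```

A correlation close to 1 only says the two variables rise together in this sample; it does not show that one causes the other.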
Hypothesis Testing
Determine if there is a significant difference or relationship between groups or variables
Machine Learning
Teaching computers to learn and make decisions on their own based on the patterns found in the data
What could go wrong with machine learning?
Biased Results
Limited Generalization
Data Dependency
Ethical Concerns
Data Mining
Finding valuable patterns or insights from large amounts of data
What could go wrong with data mining?
Privacy Concerns
Bias and Discrimination
False Discoveries and Misinterpretation
Data Quality and Integrity
Ethical Considerations
Data Visualization
Presenting insights derived from big data visually using charts, graphs, or dashboards
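As a small illustration of presenting data visually, the sketch below draws a text-only bar chart. The interaction counts are made up, and the function name is a hypothetical helper; real dashboards would normally use a charting library instead.

```python
def text_bar_chart(counts, width=20):
    """Return one line per label, with a bar scaled to the largest value."""
    largest = max(counts.values())
    lines = []
    for label, value in counts.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<10} {bar} {value}")
    return "\n".join(lines)


interaction_counts = {"Likes": 40, "Comments": 20, "Shares": 10}  # invented sample data
chart = text_bar_chart(interaction_counts)
print(chart)
```

Even in this simple form, the relative sizes are easier to compare at a glance than the raw numbers.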
Decision Making
Utilizing the insights gained from big data to make informed decisions
Metadata
Data about data
How is metadata useful?
Organization
Searchability
Data Integrity
Data Sharing
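To make "data about data" concrete, the sketch below writes a small temporary file and then reads a few pieces of its metadata from the operating system. The file name and contents are made up; the metadata keys in the dictionary are labels chosen for this example.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Big data study notes")  # this is the data itself

    info = os.stat(path)  # this describes the data
    metadata = {
        "file_name": os.path.basename(path),
        "size_in_bytes": info.st_size,
        "last_modified": time.ctime(info.st_mtime),
    }

print(metadata)
```

None of these values are part of the file's contents, yet they help with organizing files, searching for them, and checking that a file has not changed unexpectedly.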
Data security is crucial for protecting information from… and it ensures data…
unauthorized access, breaches, or misuse
confidentiality, integrity, and availability
Privacy refers to an individual’s right to control their what?
Personal Information
What does PII stand for?
Personally Identifiable Information
Computing Innovation
Includes hardware or software innovations, networking, user interfaces, or analytical innovations
What kind of biases can exist in computing innovations?
Biased Data
Algorithmic Bias
Human Bias
Lack of Diversity
What efforts can help control and reduce bias in computing innovations?
Diverse and Inclusive Teams
Data Scrutiny
Algorithmic Transparency
Ongoing Evaluation and Testing
Name some of the beneficial effects of computing innovations.
Speed and Efficiency
Information Access
Automation and Productivity
Connectivity
Name some of the harmful effects of computing innovations.
Security Risks
Dependency and Addiction
Job Displacement
Privacy Concerns
Name some of the legal concerns in computing.
Data Privacy
Intellectual Property
Cybersecurity
Name some of the ethical concerns in computing.
Bias and Discrimination
AI Ethics
Autonomy and Consent
How can you legally use the work of others?
Creative Commons
Open Source
Open Access
Creative Commons
Public copyright license that enables the free distribution of an otherwise copyrighted work
Cybersecurity
Practice of protecting computer systems, networks, and data from unauthorized access, attacks, and damage
Phishing
Attempt to trick a user into divulging their private information
What warning signs can help you protect yourself from phishing attacks?
Suspicious email address
Suspicious URL
Non-secured HTTP connections
Requests for sensitive information
Urgency and scare tactics
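The checks listed above can be sketched as a simple rule-based function. This is only an illustration under stated assumptions: the word lists, the function name, and the `.test` addresses are all made up, and real phishing filters are far more sophisticated.

```python
from urllib.parse import urlparse

# Illustrative word lists; not taken from any real filter.
URGENT_WORDS = ("urgent", "immediately", "suspended")
SENSITIVE_WORDS = ("password", "social security", "credit card")


def phishing_warning_signs(sender, url, message, trusted_domain):
    """Return the warning signs found in one message."""
    signs = []
    text = message.lower()
    if not sender.lower().endswith("@" + trusted_domain):
        signs.append("suspicious email address")
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if host != trusted_domain and not host.endswith("." + trusted_domain):
        signs.append("suspicious URL")
    if parsed.scheme != "https":
        signs.append("non-secured HTTP connection")
    if any(word in text for word in SENSITIVE_WORDS):
        signs.append("request for sensitive information")
    if any(word in text for word in URGENT_WORDS):
        signs.append("urgency or scare tactics")
    return signs


suspicious = phishing_warning_signs(
    "support@examp1e-bank.test",
    "http://examp1e-bank.test/login",
    "URGENT: confirm your password immediately",
    "example-bank.test",
)
print(suspicious)
```

The look-alike domain ("examp1e" with the digit 1) shows why the sender address and URL are compared against the domain you actually trust rather than judged by appearance.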
Rogue Access Points
Unauthorized wireless access point set up by attackers to intercept network traffic and gain unauthorized access to your data
Why are rogue access points bad?
Hackers can use a rogue access point to capture sensitive data that is flowing through a network
How can you protect yourself from rogue access points?
Use secure and trusted networks
Verify network names (SSIDs)
Be cautious with sensitive activities
Malware
Malicious software that is installed on a computer without the user’s knowledge
Trojan Malware
Malware disguised as legitimate software with the purpose of tricking you into executing malicious software on your computer
Spyware
Type of malicious software that is designed to covertly collect information from a user’s device without their knowledge or consent
Adware
Software designed to display advertisements on a user’s device
Ransomware
Type of malware that encrypts a victim’s files or locks their entire computing system, making them inaccessible until a ransom is paid
Keylogger
Keeps track of your keystrokes on your keyboard in an attempt to capture sensitive information like usernames and passwords
How can you protect yourself from malware attacks?
Antivirus Software
Update Software
Beware of Phishing Attacks
Enable a Pop-Up Blocker
Enable Your Firewall
What does DDoS stand for?
Distributed Denial of Service
Botnet
Network of infected computers, created when users download malware that is programmed to attack a server by sending fake traffic at a coordinated time
Why is cybersecurity important?
Protecting personal information
Preventing theft or fraud
Maintaining privacy online
What are some of the potential consequences of cyber attacks?
Identity Theft
Financial Loss
Reputational Damage
Disruption of Services
Data Breaches
Encryption helps safeguard sensitive data from data breaches
Encrypted data is useless without the…
Decryption Key
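The role of the key can be illustrated with a toy XOR cipher. This is a teaching sketch only and is not secure; the message and keys are made up, and real systems rely on vetted algorithms such as AES rather than anything like this.

```python
def xor_bytes(data, key):
    """XOR each data byte with the matching key byte (the key repeats if shorter)."""
    return bytes(byte ^ key[i % len(key)] for i, byte in enumerate(data))


message = b"meet at noon"
key = b"k3y!"          # the secret shared by sender and receiver
wrong_key = b"nope"    # what an attacker without the key might try

ciphertext = xor_bytes(message, key)      # encrypt
recovered = xor_bytes(ciphertext, key)    # decrypt with the right key
garbled = xor_bytes(ciphertext, wrong_key)  # decrypt with the wrong key

print(ciphertext)
print(recovered)
print(garbled)
```

Applying the same key twice returns the original message, while the wrong key produces unreadable bytes, which is the sense in which stolen encrypted data is useless without the decryption key.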
What is the first line of defense against unauthorized access to your computer and personal information?
Passwords
Brute Force Attack
Hackers use automated software that systematically tries all possible combinations of characters until the correct password is found. This method is time-consuming.
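A minimal sketch of why trying every combination is time-consuming. It searches lowercase-only passwords of up to four letters and counts the attempts; the secret, the alphabet, and the length limit are assumptions for illustration, and the guess is compared directly to the secret for simplicity, whereas a real attacker would test guesses against a login prompt or a stolen hash.

```python
import itertools
import string


def brute_force(secret, alphabet=string.ascii_lowercase, max_length=4):
    """Try every combination in order; return the match and the number of attempts."""
    attempts = 0
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            attempts += 1
            guess = "".join(combo)
            if guess == secret:
                return guess, attempts
    return None, attempts


def keyspace(alphabet_size, length):
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


found, attempts = brute_force("cab")
print(found, attempts)
print(keyspace(26, 3), keyspace(26, 8), keyspace(94, 8))
```

The number of combinations multiplies with every extra character and every extra kind of symbol allowed, which is why longer passwords with mixed character types take far longer to brute-force.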
Man-in-the-Middle Attacks
When an attacker intercepts and alters communication between two parties
Dictionary Attacks
Hackers use software that systematically checks common words, phrases, and dictionary terms as passwords.
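A small sketch of the idea, assuming the attacker has obtained a SHA-256 hash of a password and checks it against a list of common passwords. The five-word list and both passwords are made up for illustration; real word lists contain millions of entries, and well-designed systems store passwords with salted, deliberately slow hashes rather than plain SHA-256.

```python
import hashlib

# Tiny illustrative word list.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein", "dragon"]


def sha256_hex(text):
    """Hash a string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, wordlist):
    """Hash each candidate word and compare it with the target hash."""
    for attempts, word in enumerate(wordlist, start=1):
        if sha256_hex(word) == target_hash:
            return word, attempts
    return None, len(wordlist)


weak_result = dictionary_attack(sha256_hex("letmein"), COMMON_PASSWORDS)
strong_result = dictionary_attack(sha256_hex("T7#vq!x92Lm"), COMMON_PASSWORDS)
print(weak_result)
print(strong_result)
```

A common word is found within a handful of guesses, while a password that is not on the list is never found this way, which is the main argument for avoiding dictionary words.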