This set of flashcards covers vocabulary and key technical concepts from the 'Data Systems and Risk' lecture notes, including data types, analytics, database management, emerging technologies, and risk management frameworks.
Data
Raw facts and figures collected from observations or measurements, serving as the fundamental input for processing and analysis to generate useful insights.
Qualitative Data
Non-numerical information that describes qualities or characteristics and is used for categorical analysis.
Nominal Data
A type of qualitative data representing categories without any inherent order, such as gender or blood type.
Ordinal Data
A type of qualitative data representing categories with a meaningful order or rank but without defined intervals, such as satisfaction levels.
Quantitative Data
Numerical information that quantifies something and allows for mathematical operations.
Discrete Data
Countable quantitative data consisting of whole numbers that cannot be meaningfully divided into fractions.
Continuous Data
Measurable quantitative data that can take any value within a range and can be divided into fractions including decimals.
Structured Data
Data organized in a fixed format, usually in rows and columns like a table, and typically stored in databases.
Unstructured Data
Information that does not have a fixed format or table, such as text, images, videos, or audio files, making it harder to process with regular tools.
Sentiment Analysis
A process of using computational tools to identify and classify the emotional tone of text, such as positive, negative, or neutral feedback.
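As a rough illustration of the idea, the sketch below classifies text by counting hits against two tiny word lists. The lexicons and the function name `classify_sentiment` are hypothetical; practical tools rely on much larger vocabularies or trained models.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "fast"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Split on whitespace, strip common punctuation, and lowercase each word.
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A simple count like this ignores negation and sarcasm, which is one reason real sentiment analysis usually builds on NLP models.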
Natural Language Processing (NLP)
A field of artificial intelligence that enables computers to understand, interpret, and manipulate human language.
Semi-structured Data
Data that is partly organized with tags or markers rather than fitting neatly into database tables; examples include emails and web pages.
XML (Extensible Markup Language)
A markup language that uses tags to define data in a flexible, hierarchical form instead of using tables.
JSON (JavaScript Object Notation)
A lightweight data format used to store and exchange data that is easy for humans to read and computers to process.
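A minimal sketch of storing and exchanging data as JSON, using Python's standard `json` module. The customer record is a made-up example.

```python
import json

# Hypothetical customer record used only for illustration.
record = {"id": 101, "name": "Asha", "tags": ["retail", "priority"], "active": True}

text = json.dumps(record)    # Python dict -> JSON string that can be stored or sent
restored = json.loads(text)  # JSON string -> Python dict on the receiving side
```

The nested list under `tags` shows why JSON counts as semi-structured: the record carries its own field names and nesting instead of relying on fixed table columns.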
Primary Data
Original information collected directly for a specific research purpose using direct methods such as surveys or focus groups.
Secondary Data
Information already collected by someone else for a different purpose, such as government statistics or industry reports.
Internal Data
Information generated within a company from its own operations and systems, like sales records or employee HR files.
External Data
Information that comes from outside the organization, often from third parties or public sources like social media or market reports.
API (Application Programming Interface)
A set of rules and protocols that allows different software applications to communicate and enables automated data collection from external platforms.
Data Governance
A system of rules, processes, and responsibilities to manage data throughout its life from creation to disposal.
Data Owners
Senior leaders who make strategic decisions about specific data domains.
Data Stewards
Operational custodians who ensure data accuracy, availability, and compliance with policies.
Metadata
Described as 'data about data', it provides context that improves discovery, understanding, and integration through standardized definitions.
Data Integrity
The practice of keeping data accurate, consistent, and reliable throughout its life cycle.
Primary Key
A unique identifier for each row in a database table which must contain unique values and cannot be NULL.
Foreign Key
A field in one table that refers to the primary key in another table to enforce a link between the data.
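The two key types can be seen working together in a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and the helper `try_insert` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL
    )
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (10, 1, 250.0)")  # valid: customer 1 exists


def try_insert(sql: str) -> bool:
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Rejected: primary key 1 is already in use.
duplicate_key_ok = try_insert("INSERT INTO customers VALUES (1, 'Ben')")
# Rejected: customer 99 does not exist, so the foreign key has nothing to point to.
orphan_order_ok = try_insert("INSERT INTO orders VALUES (11, 99, 80.0)")
```

Both rejected inserts leave the tables unchanged, which is how key constraints support data integrity.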
Zero Trust Architecture
A security approach following the principle of 'never trust, always verify' where users get access only after verifying identity and context.
Homomorphic Encryption
A cryptographic method enabling direct computation on encrypted data, yielding an encrypted result that, once decrypted, matches the result of the same operation performed on the raw data.
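The idea of computing on ciphertexts can be illustrated with unpadded textbook RSA, which happens to be homomorphic for multiplication only. This is a toy sketch under stated assumptions: the parameters are tiny and insecure, and it is not a fully homomorphic scheme of the kind used in practice.

```python
# Toy textbook-RSA parameters (p = 61, q = 53); far too small and unpadded for real use.
N = 61 * 53  # modulus 3233
E = 17       # public exponent
D = 2753     # private exponent, since E * D = 46801 = 15 * 3120 + 1


def encrypt(m: int) -> int:
    return pow(m, E, N)


def decrypt(c: int) -> int:
    return pow(c, D, N)


a, b = 7, 6
# Multiply the two ciphertexts without ever decrypting the inputs.
product_ciphertext = (encrypt(a) * encrypt(b)) % N
result = decrypt(product_ciphertext)  # equals a * b, as long as a * b < N
```

Whoever performs the multiplication never sees 7 or 6 in the clear; only the key holder can decrypt the product.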
Data Analytics
The process of examining datasets to find trends, patterns, and insights to guide decision-making using summaries or machine learning.
Descriptive Analytics
Analysis that tells 'what happened' in the past by summarizing historical data using averages, percentages, and charts.
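A small sketch of descriptive analytics using Python's standard `statistics` module. The monthly sales figures are invented for illustration.

```python
from statistics import mean, median

# Hypothetical monthly sales figures (units sold).
monthly_sales = [120, 135, 150, 110, 160, 145]

summary = {
    "mean": mean(monthly_sales),
    "median": median(monthly_sales),
    "max": max(monthly_sales),
    # Percentage of months with sales above 130 units.
    "pct_above_130": sum(s > 130 for s in monthly_sales) / len(monthly_sales) * 100,
}
```

Each value in `summary` only describes what already happened; explaining or forecasting the figures belongs to the other analytics types.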
Diagnostic Analytics
Analysis that answers 'why did it happen' through techniques like data mining, correlation, and root cause analysis.
Predictive Analytics
Analysis that forecasts future outcomes using machine learning models and statistical algorithms.
Prescriptive Analytics
Analysis that recommends the best action to take using optimization techniques and decision-support systems.
Big Data
Massive, ever-growing amounts of data characterized by the 5 Vs: Volume, Velocity, Variety, Veracity, and Value.
Apache Hadoop
A distributed system used for the storage and processing of big data where traditional databases are inadequate.
DBMS (Database Management System)
Software that enables the creation, management, and utilization of databases, ensuring data remains secure and organized.
ACID Model
A set of properties (Atomicity, Consistency, Isolation, Durability) that ensures database transactions are processed reliably.
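Atomicity, the first ACID property, can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are hypothetical; the point is that a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above was rolled back too.
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok_small = transfer("alice", "bob", 30)   # succeeds
ok_large = transfer("alice", "bob", 500)  # fails and is fully undone
final = balances()
```

In the failed transfer the credit to `bob` runs first, yet it does not survive, because the whole transaction is rolled back together.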
Logical Data Independence
The ability to change the logical structure of a database (like tables or relationships) without affecting user views or application programs.
Physical Data Independence
The ability to modify physical storage mechanisms (like file structures or indexing) without affecting the logical schema or apps.
Database Normalization
A systematic process of organizing data in a relational database to minimize redundancy and ensure data integrity.
OLTP (Online Transaction Processing)
Systems designed for real-time daily transactions such as sales orders or banking withdrawals.
ETL Process
The multi-stage process of Extracting data from sources, Transforming it through cleaning/standardizing, and Loading it into a warehouse.
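The three stages can be sketched in a few lines of Python. The CSV content, the `sales` table, and the function names are assumptions made for the example, with an in-memory SQLite database standing in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with messy spacing, inconsistent case, and a missing amount.
RAW_CSV = """order_id,customer,amount
1, Asha ,250.50
2,BEN,
3,chen,99.9
"""


def extract(text: str) -> list:
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list) -> list:
    """Transform: drop incomplete rows and standardize names and types."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        clean.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(row["amount"]))
        )
    return clean


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM sales ORDER BY order_id"
).fetchall()
```

Keeping the stages as separate functions mirrors the definition: each stage can be changed or tested without touching the others.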
Scalability
The capability of a system to handle more work by adding resources, either extra servers (horizontal scaling) or more CPU/RAM on an existing server (vertical scaling).
ERP (Enterprise Resource Planning)
An integrated software platform that connects key departments into one unified system to manage a company's main business activities.
Artificial Intelligence (AI)
A branch of computer science focused on creating machines capable of performing tasks normally requiring human intelligence.
IoT (Internet of Things)
A network of physical objects with sensors and software that connect and share data over the internet.
Blockchain
A secure, decentralized digital ledger that records transactions across a network, making the records virtually impossible to alter.
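Why altering a record is so hard can be sketched with a simple hash chain, where each block stores the hash of the previous one. This is a simplified illustration with hypothetical function names; it covers only the hash linking, not decentralization or consensus.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def build_chain(entries: list) -> list:
    chain, prev = [], GENESIS_PREV
    for i, data in enumerate(entries):
        h = block_hash(i, data, prev)
        chain.append({"index": i, "data": data, "prev_hash": prev, "hash": h})
        prev = h
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every hash and check that each block links to its predecessor."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
valid_before = is_valid(chain)
chain[1]["data"] = "B pays C 200"  # tamper with a recorded transaction
valid_after = is_valid(chain)
```

Changing one block breaks its stored hash, and fixing that hash would break the link from the next block, so an edit would have to be repeated through the rest of the chain on every copy of the ledger.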
Quantum Computing
Computing that leverages quantum-mechanical principles like superposition and entanglement to perform certain computations far faster than traditional computers.
Qubit
The basic unit of information in a quantum computer, which can exist in a state of 0, 1, or a superposition of both.
Edge Computing
A distributed paradigm that brings data processing and storage closer to where it is generated to reduce latency.
Robotic Process Automation (RPA)
Technology enabling software robots (bots) to emulate human actions for executing rule-based, repetitive tasks.
Cloud Computing
The on-demand delivery of IT resources like servers, storage, and databases over the internet.
IaaS (Infrastructure as a Service)
Cloud service providing basic computing resources where the user manages the OS and applications.
PaaS (Platform as a Service)
Cloud service providing a platform for building apps; the provider manages the OS and infrastructure while the user manages the code.
SaaS (Software as a Service)
Cloud service delivering ready-to-use software over the internet where the provider handles maintenance.
Consensus Mechanisms
Rules that help participants agree on the correct version of a blockchain, such as Proof of Work (PoW) or Proof of Stake (PoS).
Fintech
The use of technology to improve and automate financial services, making them faster and more accessible.
RegTech (Regulatory Technology)
A sub-branch of fintech that helps financial institutions follow regulations and manage risks through automation.
Digital Disruption
Occurs when new technologies or business models completely change how an industry works, making old ways obsolete.
Risk Management
A systematic process used to identify, assess, and control threats to protect an organization's assets.
SIEM (Security Information and Event Management)
Technology providing centralized collection, correlation, and analysis of security logs from across a network.
ITGCs (IT General Controls)
Fundamental, broad IT controls that provide a secure and reliable foundation for all IT systems across the entire environment.