80 question-and-answer flashcards covering the entire lecture: data growth, definitions, file vs. database systems, DBMS services, data modeling, development life cycle, architectures, advantages, costs, and the Pine Valley Furniture case.
How much new data did global enterprises store in 2010?
Over 7 exabytes.
What strategic benefit can U.S. retail firms gain from data, according to estimates?
Up to a 60 % increase in net margin.
Which cancer center uses IBM Watson for evidence-based oncology recommendations?
Memorial Sloan-Kettering Cancer Center.
What database-driven tactic does a leading fast-food chain use at its drive-thru?
It uses video data on drive-thru line length to adjust what its digital menu boards display.
Define a database.
An organized collection of logically related data.
Differentiate data and information.
Data are raw facts; information is processed data that increases knowledge.
What is metadata?
Data that describe the properties, context, and characteristics of end-user data.
Give two examples of unstructured data.
Tweets and video clips (other valid examples: e-mails, images, GPS).
What are the three Vs of big data?
Volume, Variety, and Velocity.
Name two shortcomings of traditional file processing systems.
Program-data dependence and extensive data duplication.
What is program-data independence?
The separation of data descriptions (metadata) from application programs.
Which file system drawback causes heavy maintenance loads consuming up to 80 % of IS budgets?
Excessive program maintenance due to data duplication and dependence.
What is an entity in a data model?
An object about which information is stored, such as CUSTOMER or ORDER.
Define an attribute in database terminology.
A specific piece of information captured about an entity (e.g., CustomerName).
What type of relationship exists when a customer can place many orders?
One-to-Many (1:M).
How does a relational database link entities?
By storing common fields (keys) in related tables.
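A minimal sketch of the two cards above, using Python's built-in `sqlite3` module. The table names, column names, and sample rows are invented for illustration. The `customer_id` column appears in both tables, and that shared key is what lets one customer be linked to many orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables: customer_order.customer_id is the common field (foreign key).
conn.executescript("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Customer A'), (2, 'Customer B');
    INSERT INTO customer_order VALUES
        (1001, 1, '2024-01-05'),
        (1002, 1, '2024-02-11'),
        (1003, 2, '2024-02-12');
""")

# Join on the shared key and count the "many" side for each customer.
orders_per_customer = conn.execute("""
    SELECT c.customer_name, COUNT(o.order_id)
    FROM customer AS c
    JOIN customer_order AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(orders_per_customer)
```

Customer A has two orders and Customer B has one, which is the 1:M pattern: the key value 1 is stored once in `customer` and repeated in each related `customer_order` row.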
Give four key services provided by a DBMS.
Data access control, integrity enforcement, concurrency control, and backup/restore.
List two advantages of the database approach over file systems.
Improved data consistency and better data sharing.
What planned technique controls redundancy in databases?
Planned data redundancy: each fact is ideally recorded once within an integrated structure, and any remaining duplication is deliberate and controlled.
Why can poor planning negate database benefits?
Without sound design, even good DBMS software cannot prevent redundancy, inconsistency, or conflict.
Mention one cost of adopting the database approach.
Need for specialized personnel such as database administrators.
What are conversion costs in database projects?
Expenses involved in moving legacy systems to modern database technology.
Contrast schema-on-write with schema-on-read.
Schema-on-write fixes structure before storage (relational/data warehouse); schema-on-read applies structure when data are accessed (big data).
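The contrast can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular warehouse or big data platform works: the `reading` table, the sensor records, and the field names are all made up. The relational table rejects a record that breaks its declared structure at write time, while the raw store accepts anything and structure is applied only when the data are read.

```python
import json
import sqlite3

# Schema-on-write: structure is declared before any data are stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor_id TEXT NOT NULL, temp_c REAL NOT NULL)")
conn.execute("INSERT INTO reading VALUES (?, ?)", ("s1", 21.5))
try:
    # This record has no temperature, so it violates the declared schema.
    conn.execute("INSERT INTO reading VALUES (?, ?)", ("s2", None))
    rejected_on_write = False
except sqlite3.IntegrityError:
    rejected_on_write = True

# Schema-on-read: raw records are stored as-is, whatever their shape.
raw_store = [
    '{"sensor_id": "s1", "temp_c": 21.5}',
    '{"sensor_id": "s2", "humidity": 40}',
]
# Structure is imposed at access time: keep only records that carry temp_c.
temps = [rec["temp_c"] for rec in map(json.loads, raw_store) if "temp_c" in rec]

print(rejected_on_write, temps)
```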
Which book chapters focus on transactional systems according to the framework?
Chapters 2–8.
What repository component stores extended metadata?
The centralized repository.
Who are typical data administrators?
Professionals responsible for overall data resource planning and policy.
Purpose of enterprise data modeling?
Define the scope and content of organizational data at a high level.
Name the five major SDLC phases.
Planning, Analysis, Design, Implementation, and Maintenance.
What is logical database design?
Transforming the conceptual schema into a technology-specific logical schema (e.g., relational tables).
Describe Rapid Application Development (RAD).
Iterative cycles of analysis, design, and implementation to deliver systems quickly.
How does prototyping support database development?
By building working models that users review and refine iteratively.
State one principle of Agile development.
Responding to change over following a rigid plan.
What are the three levels of ANSI/SPARC architecture?
External, Conceptual, and Internal schemas.
Which schema represents user-specific views?
External schema.
What is stored in a physical schema?
Details of how data are stored on secondary storage in a given DBMS.
Role of a database architect?
Set data standards and ensure data quality across projects.
Which 1970 innovation made relational databases possible?
E. F. Codd’s relational model.
Give one disadvantage of hierarchical and network DBMS models.
Limited data independence.
What language is standard for relational data retrieval?
SQL (Structured Query Language).
Why did object-oriented databases see limited adoption?
Complexity and lack of clear advantages over relational systems for most tasks.
Name two popular NoSQL systems.
MongoDB and Apache Cassandra.
How do in-memory databases improve performance?
By storing data entirely in RAM, reducing disk I/O latency.
Define ad hoc querying.
Users issue spontaneous database queries using the DBMS interface.
Difference between client and database server in client/server architecture?
Client runs the user interface; server runs the DBMS and stores data.
What is a personal database’s main limitation?
Difficulty sharing data with other users.
Why use multi-tier architecture?
To separate presentation, business logic, and data layers for scalability and maintainability.
What are data warehouses primarily used for?
Historical data analysis and decision support.
Describe a data lake.
A repository storing vast amounts of raw, heterogeneous data without predefined schema.
What two phases did Pine Valley Furniture follow to add Internet tech?
First an intranet, then a web interface for certain applications.
Which prototyping tool did Pine Valley Furniture choose for personal databases?
Microsoft Access.
Why did Chris create indexes on attributes with >10 distinct values?
To improve query performance for commonly filtered columns.
How often are Pine Valley’s marketing support tables rebuilt?
Weekly, every Sunday night.
Give one decision-support question current Pine Valley databases cannot answer easily.
“Who are our 10 largest customers and what are their buying patterns?”
What technology might Pine Valley implement for better decision support?
A data warehouse or data lake.
Define operational (transactional) systems.
Databases supporting day-to-day business transactions.
What is an informational system?
A database environment (e.g., warehouse/big data) used for analytics and decision making.
What does OLAP stand for?
Online Analytical Processing.
List two integrity services of a DBMS.
Enforcing data accuracy constraints and handling concurrency control.
What is planned data redundancy’s performance trade-off?
Deliberately duplicating data can improve access speed, but the duplication must be controlled centrally.
Why is database recovery important?
To restore data to a correct state after hardware, software, or human failure.
Explain the ‘duplication of data’ problem in file systems.
Separate programs store the same data in multiple files, wasting space and risking inconsistency.
What are indexes in databases?
Auxiliary structures that speed up data retrieval based on key columns.
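One way to see an index at work is to compare SQLite's query plan before and after creating one. The sketch below assumes a hypothetical `customer` table with generated sample rows. The exact wording of the plan text differs between SQLite versions, so the code only inspects it loosely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT, name TEXT)"
)
# 1000 made-up customers spread over 50 made-up cities.
conn.executemany(
    "INSERT INTO customer (city, name) VALUES (?, ?)",
    [(f"City{i % 50}", f"Customer{i}") for i in range(1000)],
)

QUERY = "SELECT name FROM customer WHERE city = 'City7'"

def plan(sql):
    """Return the detail text of SQLite's query plan for the given statement."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(QUERY)  # without an index, the whole table is scanned
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
plan_after = plan(QUERY)   # with the index, matching rows are located through it

matches = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

The query returns the same rows either way. The index changes only how the DBMS finds them.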
Name three popular Agile methodologies.
Scrum, Extreme Programming (XP), and Feature-Driven Development (FDD).
What is a conceptual schema’s key characteristic?
It is technology-independent, integrating all external views into one model.
How does data scrubbing improve data quality?
By detecting and correcting or removing erroneous or inconsistent data.
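A small sketch of what scrubbing can look like in practice. The record layout and the validity rules (a numeric ID, an "@" in the e-mail address) are assumptions chosen for illustration; real scrubbing rules depend on the organization's data.

```python
# Made-up raw records with typical problems: stray spaces, inconsistent case,
# a duplicate, an invalid e-mail address, and a missing ID.
raw_records = [
    {"customer_id": "1", "email": " A@Example.com "},
    {"customer_id": "1", "email": "a@example.com"},
    {"customer_id": "2", "email": "not-an-email"},
    {"customer_id": "", "email": "c@example.com"},
]

def scrub(records):
    """Correct what can be corrected; drop invalid and duplicate records."""
    seen = set()
    clean = []
    for rec in records:
        customer_id = rec["customer_id"].strip()
        email = rec["email"].strip().lower()      # correct: normalize format
        if not customer_id.isdigit() or "@" not in email:
            continue                              # remove: erroneous record
        key = (customer_id, email)
        if key in seen:
            continue                              # remove: duplicate record
        seen.add(key)
        clean.append({"customer_id": int(customer_id), "email": email})
    return clean

clean_records = scrub(raw_records)
print(clean_records)
```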
Define concurrency control.
DBMS mechanisms ensuring simultaneous users can safely access and update data without conflicts.
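Locking is one common concurrency control mechanism, and SQLite makes it easy to observe. In this sketch (the `account` table and the amounts are invented), two connections play two users. While user A holds the write lock, user B's attempt to start a write transaction is refused; once A commits, B proceeds and neither update is lost. Other DBMSs use different mechanisms, so treat this as one example rather than the general rule.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")
    # isolation_level=None lets us issue BEGIN and COMMIT explicitly.
    user_a = sqlite3.connect(path, isolation_level=None)
    # timeout=0 makes user B fail immediately instead of waiting for the lock.
    user_b = sqlite3.connect(path, isolation_level=None, timeout=0)

    user_a.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    user_a.execute("INSERT INTO account VALUES (1, 100)")

    user_a.execute("BEGIN IMMEDIATE")  # user A acquires the write lock
    user_a.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")

    try:
        user_b.execute("BEGIN IMMEDIATE")  # refused while A holds the lock
        b_was_blocked = False
    except sqlite3.OperationalError:
        b_was_blocked = True

    user_a.execute("COMMIT")  # releases the lock

    user_b.execute("BEGIN IMMEDIATE")  # now succeeds
    user_b.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
    user_b.execute("COMMIT")

    final_balance = user_b.execute("SELECT balance FROM account").fetchone()[0]
    user_a.close()
    user_b.close()

print(b_was_blocked, final_balance)
```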
Give an example of multimedia data.
A sound clip attached to a customer support ticket.
Why might organizational conflict arise in shared databases?
Departments must agree on data definitions, ownership, and stewardship.
What is the primary key?
An attribute (or set of attributes) that uniquely identifies each row in a table.
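A short sketch of how a DBMS enforces a primary key, using SQLite and a hypothetical `product` table. A second row with an existing key value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, description TEXT NOT NULL)"
)
conn.execute("INSERT INTO product VALUES (1, 'End table')")

try:
    # Same product_id as the existing row, so the DBMS refuses the insert.
    conn.execute("INSERT INTO product VALUES (1, 'Coffee table')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(duplicate_rejected, row_count)
```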
Difference between enterprise and departmental databases?
Enterprise databases span the whole organization; departmental (multi-tier) databases serve a specific unit.
What does ‘data governance’ encompass?
Policies and procedures for managing data quality, security, and ownership enterprise-wide.
Which SDLC phase loads data and trains users?
Implementation phase.
Why are Excel files compared to traditional file systems?
They store data in isolated files prone to duplication and lack enforced relationships.
What is normalization?
A process for organizing relational tables to reduce redundancy and improve integrity.
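The idea can be illustrated without a DBMS. In this sketch the flat records and their field names are made up: the customer name is repeated on every order, so the data are split into two structures where each name is stored once. A name change then touches one place instead of every order row.

```python
# Unnormalized: the customer name is repeated on every order.
flat_orders = [
    {"order_id": 1001, "customer_id": 1, "customer_name": "Customer A"},
    {"order_id": 1002, "customer_id": 1, "customer_name": "Customer A"},
    {"order_id": 1003, "customer_id": 2, "customer_name": "Customer B"},
]

# Normalized: each customer fact is stored once, keyed by customer_id ...
customers = {row["customer_id"]: row["customer_name"] for row in flat_orders}
# ... and each order keeps only the key that points to its customer.
orders = [
    {"order_id": row["order_id"], "customer_id": row["customer_id"]}
    for row in flat_orders
]

# A single update now changes the name for every related order.
customers[1] = "Customer A Ltd"
names_on_orders = [customers[order["customer_id"]] for order in orders]
print(names_on_orders)
```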
State one reason cloud computing influences database use.
It offers managed database services, reducing on-premise infrastructure needs.
What is a data mart?
A subset of a data warehouse focused on a particular business area.
Define enterprise resource planning (ERP).
An integrated system managing core business processes, heavily reliant on databases.
Explain ‘schema’ in database context.
The structural design of a database, including tables, attributes, and relationships.
Why is metadata crucial for avoiding data misinterpretation?
It clarifies meaning, format, source, and usage, preventing confusion among similar data items.
List two graphical data modeling tools.
Entity-Relationship (ER) diagrams and Unified Modeling Language (UML) class diagrams.
What is client/server’s benefit for maintenance?
Changes in business logic can be made at the server without altering each client.
Which model organizes data as cubes suited for OLAP?
Multidimensional database model.
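A cube can be pictured as measures addressed by coordinates along several dimensions. The sketch below is a toy model in plain Python, not an OLAP engine: the dimensions, the sales figures, and the `roll_up` helper are all invented for illustration. Rolling up means summing the measure over the dimensions that are left out.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Each cell: one coordinate per dimension, then the sales measure.
cells = [
    ("Desk", "East", "Q1", 100),
    ("Desk", "West", "Q1", 150),
    ("Chair", "East", "Q1", 80),
    ("Desk", "East", "Q2", 120),
]

def roll_up(cube_cells, keep):
    """Sum the measure over every dimension not named in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for *coords, sales in cube_cells:
        totals[tuple(coords[p] for p in positions)] += sales
    return dict(totals)

by_product = roll_up(cells, ["product"])
by_region_quarter = roll_up(cells, ["region", "quarter"])
print(by_product)
print(by_region_quarter)
```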
Purpose of business rules in data modeling?
To capture constraints governing relationships and data validity.
What does ‘data independence’ allow developers to do?
Change the database structure without altering application programs.
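Views are one common way to get this effect. In the sketch below (hypothetical table, view, and column names), the application only ever runs `APP_QUERY` against a view. The underlying table is then restructured and the view redefined, and the application query keeps working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The only SQL the "application" knows about.
APP_QUERY = "SELECT customer_name FROM customer_v ORDER BY customer_id"

conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    INSERT INTO customer VALUES (1, 'Ada Lovelace'), (2, 'Alan Turing');
    CREATE VIEW customer_v AS SELECT customer_id, customer_name FROM customer;
""")
before = conn.execute(APP_QUERY).fetchall()

# Restructure the stored data: the name is split into two columns.
conn.executescript("""
    DROP VIEW customer_v;
    CREATE TABLE customer_new (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT,
        last_name   TEXT
    );
    INSERT INTO customer_new VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
    DROP TABLE customer;
    CREATE VIEW customer_v AS
        SELECT customer_id, first_name || ' ' || last_name AS customer_name
        FROM customer_new;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before == after)
```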
How does a backup differ from recovery?
Backup is the copy; recovery is the process of restoring from the copy after failure.
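The distinction shows up directly in code. This sketch uses `Connection.backup` from Python's `sqlite3` module (available since Python 3.7) with in-memory databases and an invented `product` table. The first call makes the copy; the second, run after a simulated failure, restores from it.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO product VALUES (1, 'End table')")
live.commit()

# Backup: take a copy of the live database.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulated human error: all rows are deleted from the live database.
live.execute("DELETE FROM product")
live.commit()
rows_after_failure = live.execute("SELECT COUNT(*) FROM product").fetchone()[0]

# Recovery: restore the live database from the copy.
backup_copy.backup(live)
restored = live.execute("SELECT name FROM product").fetchall()

print(rows_after_failure, restored)
```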
Why are NoSQL databases favored for IoT data?
They scale horizontally and handle diverse, rapidly arriving data.
What is the function of a repository in CASE tools?
Stores models, definitions, and code fragments for reuse and consistency.