Introduction to Databases – Lecture 1-2 Review

Description and Tags

87 question-and-answer flashcards covering the entire lecture: data growth, definitions, file vs. database systems, DBMS services, data modeling, development life cycle, architectures, advantages, costs, and the Pine Valley Furniture case.

87 Terms

1
How much new data did global enterprises store in 2010?

Over 7 exabytes.

2
What strategic benefit can U.S. retail firms gain from data, according to estimates?

Up to a 60 % increase in net margin.

3
Which cancer center uses IBM Watson for evidence-based oncology recommendations?

Memorial Sloan-Kettering Cancer Center.

4
What database-driven tactic does a leading fast-food chain use at its drive-thru?

Video data on the length of the drive-thru line are used to adjust what the digital menu boards display.

5
Define a database.

An organized collection of logically related data.

6
Differentiate data and information.

Data are raw facts; information is processed data that increases knowledge.

7
What is metadata?

Data that describe the properties, context, and characteristics of end-user data.
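
A minimal sketch of the difference between end-user data and metadata, assuming Python's built-in sqlite3 module. The customer table and its contents are hypothetical and only illustrate the idea; they are not taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " customer_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Example Customer')")

# End-user data: the stored facts themselves.
data_rows = conn.execute("SELECT * FROM customer").fetchall()

# Metadata: PRAGMA table_info describes each column as
# (cid, name, type, notnull, default, pk).
metadata_rows = conn.execute("PRAGMA table_info(customer)").fetchall()
column_info = {
    row[1]: (row[2], bool(row[3]), bool(row[5])) for row in metadata_rows
}
```

Here data_rows holds the facts a user cares about, while column_info holds the name, type, and key properties that describe those facts.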

8
Give two examples of unstructured data.

Tweets and video clips (other valid examples: e-mails, images, GPS data).

9
What are the three Vs of big data?

Volume, Variety, and Velocity.

10
Name two shortcomings of traditional file processing systems.

Program-data dependence and extensive data duplication.

11
What is program-data independence?

The separation of data descriptions (metadata) from application programs.
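
One way to picture program-data dependence versus independence is a fixed-width file reader. The sketch below is only an illustration in plain Python; the record layout, field names, and both parser functions are hypothetical.

```python
# Program-data dependence: the record layout is buried in the program.
def parse_hard_coded(line):
    return {"id": line[0:4], "name": line[4:19].strip(), "state": line[19:21]}

# Program-data independence: the layout is metadata kept outside the program.
def parse_with_metadata(line, fields):
    return {name: line[start:end].strip() for name, start, end in fields}

layout_v1 = [("id", 0, 4), ("name", 4, 19), ("state", 19, 21)]
record_v1 = "0001" + "Customer A".ljust(15) + "TX"

# The file format changes: the name field grows from 15 to 20 characters.
layout_v2 = [("id", 0, 4), ("name", 4, 24), ("state", 24, 26)]
record_v2 = "0001" + "Customer A".ljust(20) + "TX"

old_program_result = parse_hard_coded(record_v2)                # reads the wrong columns
new_program_result = parse_with_metadata(record_v2, layout_v2)  # code unchanged
```

Only the layout description changed for parse_with_metadata, whereas parse_hard_coded would have to be edited and retested, which is the maintenance burden the file-processing cards describe.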

12
Which file system drawback causes heavy maintenance loads consuming up to 80 % of IS budgets?

Excessive program maintenance due to data duplication and dependence.

13
What is an entity in a data model?

An object about which information is stored, such as CUSTOMER or ORDER.

14
Define an attribute in database terminology.

A specific piece of information captured about an entity (e.g., CustomerName).

15
What type of relationship exists when a customer can place many orders?

One-to-Many (1:M).

16
How does a relational database link entities?

By storing common fields (keys) in related tables.
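
The entity, attribute, and one-to-many ideas in the last few cards can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows below are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    -- The common field: each order stores the key of its customer.
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Customer A'), (2, 'Customer B');
INSERT INTO customer_order VALUES
    (100, '2024-01-05', 1),
    (101, '2024-01-09', 1),
    (102, '2024-02-01', 2);
""")

# One customer, many orders: join the tables on the shared key.
orders_per_customer = conn.execute(
    "SELECT c.customer_name, COUNT(o.order_id)"
    " FROM customer AS c"
    " JOIN customer_order AS o ON o.customer_id = c.customer_id"
    " GROUP BY c.customer_id"
    " ORDER BY c.customer_id"
).fetchall()
```

The customer_id column appears in both tables; that shared key is the only thing linking an order to its customer.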

17
Give four key services provided by a DBMS.

Data access control, integrity enforcement, concurrency control, and backup/restore.

18
List two advantages of the database approach over file systems.

Improved data consistency and better data sharing.

19
What planned technique controls redundancy in databases?

Planned data redundancy: each fact is ideally recorded once within an integrated structure, and any remaining duplication is deliberate and controlled.

20
Why can poor planning negate database benefits?

Without sound design, even good DBMS software cannot prevent redundancy, inconsistency, or conflict.

21
Mention one cost of adopting the database approach.

Need for specialized personnel such as database administrators.

22
What are conversion costs in database projects?

Expenses involved in moving legacy systems to modern database technology.

23
Contrast schema-on-write with schema-on-read.

Schema-on-write fixes structure before storage (relational/data warehouse); schema-on-read applies structure when data are accessed (big data).
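
A rough sketch of the contrast, assuming Python's sqlite3 module for the schema-on-write side and a plain list of JSON strings standing in for a raw data store on the schema-on-read side. The sensor records and the read_temperatures helper are hypothetical.

```python
import json
import sqlite3

# Schema-on-write: structure is declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor_id TEXT NOT NULL, temp_c REAL NOT NULL)")
conn.execute("INSERT INTO reading VALUES (?, ?)", ("s1", 21.5))
try:
    conn.execute("INSERT INTO reading VALUES (?, ?)", ("s2", None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the row violates the declared NOT NULL rule

# Schema-on-read: raw records are kept as-is; structure is applied when read.
raw_store = [
    '{"sensor_id": "s1", "temp_c": 21.5}',
    '{"sensor_id": "s2", "humidity": 40}',
]

def read_temperatures(lines):
    """Interpret raw lines at read time, skipping records without temp_c."""
    result = []
    for line in lines:
        record = json.loads(line)
        if "temp_c" in record:
            result.append((record["sensor_id"], float(record["temp_c"])))
    return result

temps = read_temperatures(raw_store)
```

The table refuses the nonconforming row at write time, while the raw store accepts everything and leaves it to each reader to decide which structure to apply.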

24
Which book chapters focus on transactional systems according to the framework?

Chapters 2–8.

25
Which component of the database environment stores extended metadata?

The centralized repository.

26
Who are typical data administrators?

Professionals responsible for overall data resource planning and policy.

27
Purpose of enterprise data modeling?

Define the scope and content of organizational data at a high level.

28
Name the five major SDLC phases.

Planning, Analysis, Design, Implementation, and Maintenance.

29
What is logical database design?

Transforming the conceptual schema into a logical schema expressed in a specific data management technology (e.g., relational tables).

30
Describe Rapid Application Development (RAD).

Iterative cycles of analysis, design, and implementation to deliver systems quickly.

31
How does prototyping support database development?

By building working models that users review and refine iteratively.

32
State one principle of Agile development.

Responding to change over following a rigid plan.

33
What are the three levels of ANSI/SPARC architecture?

External, Conceptual, and Internal schemas.

34
Which schema represents user-specific views?

External schema.

35
What is stored in a physical schema?

Details of how data are stored on secondary storage in a given DBMS.
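
The three schema levels can be loosely illustrated with Python's sqlite3 module. Mapping the levels onto SQL objects is an assumption of this sketch: the base table stands in for the conceptual level, the view for one external schema, and SQLite's own storage engine for the internal level. The employee table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    department  TEXT NOT NULL,
    salary      REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'Ana', 'Sales', 52000), (2, 'Ben', 'Sales', 48000);

-- External schema for a directory application: salary is hidden.
CREATE VIEW employee_directory AS
    SELECT employee_id, name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY employee_id")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```

A user of employee_directory sees only the columns relevant to that view, and never needs to know how or where the rows are physically stored.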

36
Role of a database architect?

Set data standards and ensure data quality across projects.

37
Which 1970 innovation made relational databases possible?

E. F. Codd’s relational model.

38
Give one disadvantage of hierarchical and network DBMS models.

Limited data independence.

39
What language is standard for relational data retrieval?

SQL (Structured Query Language).

40
Why did object-oriented databases see limited adoption?

Complexity and lack of clear advantages over relational systems for most tasks.

41
Name two popular NoSQL systems.

MongoDB and Apache Cassandra.

42
How do in-memory databases improve performance?

By storing data entirely in RAM, reducing disk I/O latency.

43
Define ad hoc querying.

Users issue spontaneous database queries using the DBMS interface.

44
Difference between client and database server in client/server architecture?

Client runs the user interface; server runs the DBMS and stores data.

45
What is a personal database’s main limitation?

Difficulty sharing data with other users.

46
Why use multi-tier architecture?

To separate presentation, business logic, and data layers for scalability and maintainability.

47
What are data warehouses primarily used for?

Historical data analysis and decision support.

48
Describe a data lake.

A repository storing vast amounts of raw, heterogeneous data without predefined schema.

49
What two phases did Pine Valley Furniture follow to add Internet technology?

First an intranet, then a web interface for certain applications.

50
Which prototyping tool did Pine Valley Furniture choose for personal databases?

Microsoft Access.

51
Why did Chris create indexes on attributes with >10 distinct values?

To improve query performance for commonly filtered columns.

52
How often are Pine Valley’s marketing support tables rebuilt?

Weekly, every Sunday night.

53
Give one decision-support question current Pine Valley databases cannot answer easily.

“Who are our 10 largest customers and what are their buying patterns?”

54
What technology might Pine Valley implement for better decision support?

A data warehouse or data lake.

55
Define operational (transactional) systems.

Databases supporting day-to-day business transactions.

56
What is an informational system?

A database environment (e.g., warehouse/big data) used for analytics and decision making.

57
What does OLAP stand for?

Online Analytical Processing.

58
List two integrity services of a DBMS.

Enforcing data accuracy constraints and handling concurrency control.

59
What is planned data redundancy’s performance trade-off?

Duplicating some data can improve access speed, but the duplicates must be controlled centrally so they stay consistent.

60
Why is database recovery important?

To restore data to a correct state after hardware, software, or human failure.

61
Explain the ‘duplication of data’ problem in file systems.

Separate programs store the same data in multiple files, wasting space and risking inconsistency.

62
What are indexes in databases?

Auxiliary structures that speed up data retrieval based on key columns.
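
A small sketch of an index at work, assuming Python's sqlite3 module. The product table, the idx_product_finish index name, and the generated rows are hypothetical; the exact wording of the query plan varies between SQLite versions, so the code only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, finish TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product (finish, price) VALUES (?, ?)",
    [(f"finish-{i % 50}", 100.0 + i) for i in range(1000)],
)

query = "SELECT product_id FROM product WHERE finish = 'finish-7'"

def plan(sql):
    """Return SQLite's textual query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_product_finish ON product (finish)")
plan_after = plan(query)   # the plan now refers to idx_product_finish

matches = conn.execute(query).fetchall()
```

The query text and its result are identical before and after; only the access path chosen by the DBMS changes.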

63
Name three popular Agile methodologies.

Scrum, Extreme Programming (XP), and Feature-Driven Development (FDD).

64
What is a conceptual schema’s key characteristic?

It is technology-independent, integrating all external views into one model.

65
How does data scrubbing improve data quality?

By detecting and correcting or removing erroneous or inconsistent data.
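
As a toy illustration of scrubbing in plain Python, the sketch below standardizes formatting, rejects a record with an invalid ZIP code, and drops a duplicate. The records, the five-digit ZIP rule, and the scrub function are all assumptions made for the example.

```python
raw_customers = [
    {"name": "  Customer A ", "state": "tx", "zip": "78701"},
    {"name": "Customer B", "state": "MA", "zip": "0210"},   # ZIP too short
    {"name": "customer a", "state": "TX", "zip": "78701"},  # duplicate of the first
]

def scrub(records):
    """Return (cleaned, rejected): corrected records and ones that fail validation."""
    cleaned, rejected, seen = [], [], set()
    for rec in records:
        # Correct: normalize whitespace and capitalization.
        name = " ".join(rec["name"].split()).title()
        state = rec["state"].strip().upper()
        zip_code = rec["zip"].strip()
        # Detect: a ZIP code must be exactly five digits in this sketch.
        if not (zip_code.isdigit() and len(zip_code) == 5):
            rejected.append(rec)
            continue
        # Remove: skip records that duplicate one already kept.
        key = (name.lower(), zip_code)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "state": state, "zip": zip_code})
    return cleaned, rejected

cleaned, rejected = scrub(raw_customers)
```

Rejected records are returned rather than silently discarded so that someone can review and repair them.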

66
Define concurrency control.

DBMS mechanisms ensuring simultaneous users can safely access and update data without conflicts.
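
One concrete form of concurrency control is write locking. The sketch below assumes Python's sqlite3 module with a temporary database file and two connections acting as two users; the account table and amounts are hypothetical, and other DBMS products use different locking schemes.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
# isolation_level=None lets the code issue BEGIN/COMMIT explicitly;
# timeout=0 makes a blocked connection fail at once instead of waiting.
writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)

writer_a.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
writer_a.execute("INSERT INTO account VALUES (1, 100)")

writer_a.execute("BEGIN IMMEDIATE")  # user A takes the write lock
writer_a.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")

try:
    writer_b.execute("BEGIN IMMEDIATE")  # user B cannot write yet
    blocked = False
except sqlite3.OperationalError:
    blocked = True

writer_a.execute("COMMIT")  # user A releases the lock

writer_b.execute("BEGIN IMMEDIATE")  # now user B can proceed
writer_b.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
writer_b.execute("COMMIT")

final_balance = writer_b.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
writer_a.close()
writer_b.close()
```

Because the second update waits for the first to commit, both withdrawals are applied and neither overwrites the other.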

67
Give an example of multimedia data.

A sound clip attached to a customer support ticket.

68
Why might organizational conflict arise in shared databases?

Departments must agree on data definitions, ownership, and stewardship.

69
What is the primary key?

An attribute (or set of attributes) that uniquely identifies each row in a table.
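
A short sketch of primary key enforcement, assuming Python's sqlite3 module; the product table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, description TEXT NOT NULL)"
)
conn.execute("INSERT INTO product VALUES (1, 'End table')")

try:
    # A second row with the same key value would make rows indistinguishable.
    conn.execute("INSERT INTO product VALUES (1, 'Coffee table')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

The DBMS, not the application program, rejects the duplicate, which is one of the integrity services listed in the DBMS cards.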

70
Difference between enterprise and departmental databases?

Enterprise databases span the whole organization; departmental (multitier client/server) databases serve a specific unit or workgroup.

71
What does ‘data governance’ encompass?

Policies and procedures for managing data quality, security, and ownership enterprise-wide.

72
Which SDLC phase loads data and trains users?

Implementation phase.

73
Why are Excel files compared to traditional file systems?

They store data in isolated files prone to duplication and lack enforced relationships.

74
What is normalization?

A process for organizing relational tables to reduce redundancy and improve integrity.
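
The effect of normalization can be sketched in plain Python by splitting a flat, repetitive list of rows into two structures that each record a fact once. The sample rows and field names are hypothetical, and the sketch skips the formal normal forms.

```python
# Unnormalized: customer name and city repeat on every order row.
flat_orders = [
    {"order_id": 100, "customer_id": 1, "customer_name": "Customer A", "city": "Austin"},
    {"order_id": 101, "customer_id": 1, "customer_name": "Customer A", "city": "Austin"},
    {"order_id": 102, "customer_id": 2, "customer_name": "Customer B", "city": "Boston"},
]

# Normalized: customer facts stored once, orders refer to them by key.
customers = {}
orders = []
for row in flat_orders:
    customers[row["customer_id"]] = {
        "customer_name": row["customer_name"],
        "city": row["city"],
    }
    orders.append({"order_id": row["order_id"], "customer_id": row["customer_id"]})

# A change of city is now a single update instead of one per order row.
customers[1]["city"] = "Dallas"

# Joining on the key reproduces the flat view when it is needed.
rejoined = [{**order, **customers[order["customer_id"]]} for order in orders]
```

After the update, every order for customer 1 shows the new city, so the inconsistency that repeated copies invite cannot occur.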

75
State one reason cloud computing influences database use.

It offers managed database services, reducing on-premise infrastructure needs.

76
What is a data mart?

A subset of a data warehouse focused on a particular business area.

77
Define enterprise resource planning (ERP).

An integrated system managing core business processes, heavily reliant on databases.

78
Explain ‘schema’ in database context.

The structural design of a database, including tables, attributes, and relationships.

79
Why is metadata crucial for avoiding data misinterpretation?

It clarifies meaning, format, source, and usage, preventing confusion among similar data items.

80
List two graphical data modeling tools.

Entity-Relationship (ER) diagrams and Unified Modeling Language (UML) class diagrams.

81
What is client/server’s benefit for maintenance?

Changes in business logic can be made at the server without altering each client.

82
Which model organizes data as cubes suited for OLAP?

Multidimensional database model.
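
A very small sketch of the cube idea in plain Python: each cell is addressed by a combination of dimension values, and a roll-up aggregates away the dimensions that are not of interest. The sales facts, the three dimensions, and the roll_up helper are hypothetical; real OLAP servers use specialized storage.

```python
from collections import defaultdict

# Facts: (quarter, region, product, sales amount).
sales_facts = [
    ("2024-Q1", "East", "Desk", 1200.0),
    ("2024-Q1", "West", "Desk", 800.0),
    ("2024-Q2", "East", "Chair", 300.0),
    ("2024-Q2", "East", "Desk", 500.0),
]

# The cube: one cell per (quarter, region, product) combination.
cube = defaultdict(float)
for quarter, region, product, amount in sales_facts:
    cube[(quarter, region, product)] += amount

def roll_up(cells, keep):
    """Aggregate the cube down to the dimension positions listed in keep."""
    result = defaultdict(float)
    for dims, amount in cells.items():
        result[tuple(dims[i] for i in keep)] += amount
    return dict(result)

by_region = roll_up(cube, keep=(1,))               # total sales per region
by_quarter_product = roll_up(cube, keep=(0, 2))    # quarter by product
```

Each call answers a different analytical question from the same cells, which is what makes the cube layout convenient for OLAP.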

83
Purpose of business rules in data modeling?

To capture constraints governing relationships and data validity.

84
What does ‘data independence’ allow developers to do?

Change the database structure without altering application programs.
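
A minimal sketch of data independence, assuming Python's sqlite3 module. The customer table, the credit_limit column, and the customer_names function are hypothetical; the point is only that the application query keeps working after the structure changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Customer A')")

def customer_names(connection):
    """Application code: asks for data by name, not by physical layout."""
    rows = connection.execute(
        "SELECT customer_name FROM customer ORDER BY customer_id"
    )
    return [row[0] for row in rows]

names_before = customer_names(conn)

# The database structure changes: a new attribute is added.
conn.execute("ALTER TABLE customer ADD COLUMN credit_limit REAL DEFAULT 0")

names_after = customer_names(conn)  # same code, same result
```

The function was not edited between the two calls, which is the practical meaning of the card above.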

85
How does a backup differ from recovery?

Backup is the copy; recovery is the process of restoring from the copy after failure.
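
The two steps can be sketched with Python's sqlite3 module, whose Connection.backup method copies one database into another. In-memory databases, the orders table, and the simulated accidental delete are assumptions chosen to keep the example self-contained.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
live.execute("INSERT INTO orders VALUES (1, 250.0)")
live.commit()

# Backup: make a copy of the database while it is in a correct state.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulated human failure: every order is deleted by mistake.
live.execute("DELETE FROM orders")
live.commit()
rows_after_failure = live.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Recovery: restore the live database from the copy.
backup_copy.backup(live)
rows_after_recovery = live.execute("SELECT * FROM orders").fetchall()
```

The backup is just the stored copy; recovery is the separate act of using it to return the live database to a correct state.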

86
Why are NoSQL databases favored for IoT data?

They scale horizontally and handle diverse, rapidly arriving data.

87
What is the function of a repository in CASE tools?

Stores models, definitions, and code fragments for reuse and consistency.