Module 6 continued


27 Terms

1

SeqScan

A sequential (full table) scan that reads every row. Used when most of the table is needed in the result.

2

Index Scan

Scans the index and uses the index entries to look up the matching tuples in the table. Used when only a small part of the table is needed in the result.

3

Bitmap Scan

A hybrid approach that sorts the row locations identified by the index into physical order before reading them to minimize the cost.
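
The effect of sorting row locations into physical order can be illustrated with a small toy model. This is only a sketch: the `(page, slot)` tuples, the sample data, and the `page_switches` cost measure are assumptions made for illustration and are not how PostgreSQL itself accounts for cost.

```python
# Toy model: each index match is a (page, slot) row location, listed in index order.
index_matches = [(7, 2), (1, 0), (7, 0), (3, 5), (1, 3), (3, 1)]


def page_switches(locations):
    """Count how many times the reader has to move to a different page."""
    switches = 0
    current_page = None
    for page, _slot in locations:
        if page != current_page:
            switches += 1
            current_page = page
    return switches


# Following the index order hops back and forth between pages.
index_order_cost = page_switches(index_matches)

# Sorting the locations first means each page is visited once, in physical order.
physical_order = sorted(index_matches)
bitmap_order_cost = page_switches(physical_order)

print(index_order_cost, bitmap_order_cost)
```

In this toy data the sorted order touches each of the three pages once, while the index order keeps returning to pages it has already left.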

4

VACUUM

Reclaims storage occupied by dead tuples, which are not physically removed when rows are deleted or updated. VACUUM must be run periodically.
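
Why dead tuples pile up can be shown with a toy table. This is a simplified sketch, not PostgreSQL's storage engine: the `ToyTable` class and its methods are hypothetical, and the toy `vacuum` simply drops dead slots, whereas the real command has more nuance.

```python
class ToyTable:
    """Toy table where delete and update only mark old row versions as dead."""

    def __init__(self):
        self.slots = []  # each slot is [value, is_dead]

    def insert(self, value):
        self.slots.append([value, False])

    def delete(self, value):
        # The row is only flagged as dead; its slot still takes up space.
        for slot in self.slots:
            if slot[0] == value and not slot[1]:
                slot[1] = True
                return True
        return False

    def update(self, old, new):
        # An update leaves the old version behind as a dead tuple.
        if self.delete(old):
            self.insert(new)

    def live_rows(self):
        return [value for value, is_dead in self.slots if not is_dead]

    def vacuum(self):
        # Drop the dead slots and report how many were reclaimed.
        before = len(self.slots)
        self.slots = [slot for slot in self.slots if not slot[1]]
        return before - len(self.slots)


table = ToyTable()
for name in ["a", "b", "c"]:
    table.insert(name)
table.delete("b")
table.update("c", "c2")

slots_before = len(table.slots)
reclaimed = table.vacuum()
slots_after = len(table.slots)
print(slots_before, reclaimed, slots_after)
```

Only two rows are live after the delete and the update, yet four slots are occupied until the toy `vacuum` runs.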

5

Logical Backups

Contain copies of the database schema and data, exported as a SQL script file. They are architecture independent and more portable, but slower (e.g., pg_dump in PostgreSQL).

6

Physical Backups

Copies of the actual data files. They are architecture dependent and less portable, but faster.
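
The difference between the two backup styles can be tried out with Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: SQLite stands in for PostgreSQL, and the `shop.db` file and `items` table are made up for the example.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "shop.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("pen",), ("ink",)])
conn.commit()

# Logical backup: schema and data exported as a SQL script.
sql_script = "\n".join(conn.iterdump())
conn.close()

# Physical backup: a byte-for-byte copy of the data file, taken while the database is closed.
physical_copy = os.path.join(workdir, "shop_copy.db")
shutil.copyfile(db_path, physical_copy)

# Restore the logical backup by replaying the script into a brand-new database.
restored = sqlite3.connect(":memory:")
restored.executescript(sql_script)
restored_names = [row[0] for row in restored.execute("SELECT name FROM items ORDER BY id")]
restored.close()

# The physical copy can be opened directly, with no replay step.
copied = sqlite3.connect(physical_copy)
copied_names = [row[0] for row in copied.execute("SELECT name FROM items ORDER BY id")]
copied.close()

print(restored_names, copied_names)
```

The logical script is plain SQL text that could be replayed elsewhere, while the physical copy is tied to the file format of the engine that wrote it.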

7

Cold (offline) backup

Database is offline; "Easiest to implement."

8

Hot (online) backup

Database remains online; "Crucial for applications that require high availability." (e.g., pg_basebackup in PostgreSQL, which combined with WAL archiving supports "Point-in-Time Recovery" (PITR)).

9

Privilege Management

"Databases implement access control by setting access privileges to objects for individual users and/or groups of users."

10

GRANT and REVOKE

PostgreSQL commands to assign and remove privileges (e.g., SELECT, INSERT) on various object types (e.g., tables).
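
The idea behind granting and revoking can be sketched as a lookup table of privileges per role and object. This is a toy model, not PostgreSQL's implementation: the `ToyAccessControl` class, the `alice` role, and the `orders` table are all hypothetical.

```python
from collections import defaultdict


class ToyAccessControl:
    """Toy privilege store: (role, object) -> set of granted privileges."""

    def __init__(self):
        self._privileges = defaultdict(set)

    def grant(self, privilege, obj, role):
        # Loosely mirrors: GRANT privilege ON obj TO role
        self._privileges[(role, obj)].add(privilege)

    def revoke(self, privilege, obj, role):
        # Loosely mirrors: REVOKE privilege ON obj FROM role
        self._privileges[(role, obj)].discard(privilege)

    def is_allowed(self, role, privilege, obj):
        return privilege in self._privileges.get((role, obj), set())


acl = ToyAccessControl()
acl.grant("SELECT", "orders", "alice")
acl.grant("INSERT", "orders", "alice")
acl.revoke("INSERT", "orders", "alice")

can_select = acl.is_allowed("alice", "SELECT", "orders")
can_insert = acl.is_allowed("alice", "INSERT", "orders")
print(can_select, can_insert)
```

After the revoke, the role keeps its read access but can no longer insert rows.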

11

Advantages of Distributed Databases

Data Redundancy, High Availability (HA), Scalability, Monitoring & Automation.

12

Data Redundancy

"Multiple nodes work together to store/process data amongst each other providing data redundancy." All nodes are "synchronized (replication)."

13

High Availability (HA)

"In case a server shut[s] down the data/database will still be available." "More nodes provide[] higher availability."

14

Scalability

"Adding more nodes scales the system horizontally (more storage, processing/serving power)." Load balancing facilitates this.

15

Monitoring & Automation

"Indispensable to administer clusters," often managed by a "designated machine" running automated scripts.

16

Homogeneous

"All the sites use [the] same software" and are "aware of each other and agree to cooperate." Appears to the user "as a single system."

17

Heterogeneous

"Different sites may use different schemas and software," leading to "major problem[s] for query processing" and "transaction processing." Sites "may not be aware of each other."

18

Business Intelligence (BI)

"Combination of strategies that use data, software, and company information to provide business owners with an overview of business operations at past and present levels in order to adjust and take business decisions."

19

Business Analytics (BA)

"Process by which compan[ies] use historical data combined with software technologies to make predictions and support business decisions."

20

data warehouse

A specialized data store designed for analytical purposes that uses OLAP technology. It can be used for making consolidated reports, finding relationships and correlations, and data mining.

21

OLAP (On-line Analytical Processing)

A method for data analysis that uses various databases, complex queries, and de-normalized table structures to support planning and decision making.

22

MOLAP (Multidimensional OLAP)

The earliest OLAP systems, which used multidimensional arrays in memory to store data cubes.
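
A data cube held in a multidimensional array can be sketched with nested Python lists. This is a minimal illustration: the dimensions (product, region, quarter), the sales figures, and the helper names are invented for the example and do not come from any particular OLAP product.

```python
products = ["pen", "ink"]
regions = ["north", "south"]
quarters = ["Q1", "Q2"]

# cube[p][r][q] holds the sales figure for one (product, region, quarter) cell.
cube = [
    [[10, 12], [7, 9]],  # pen: north (Q1, Q2), south (Q1, Q2)
    [[4, 6], [3, 5]],    # ink: north (Q1, Q2), south (Q1, Q2)
]


def cell(product, region, quarter):
    """Look up a single cell by dimension names."""
    return cube[products.index(product)][regions.index(region)][quarters.index(quarter)]


def roll_up_by_product():
    """Aggregate away region and quarter, leaving one total per product."""
    return {
        product: sum(sum(row) for row in cube[p])
        for p, product in enumerate(products)
    }


def slice_quarter(quarter):
    """Fix one quarter and return the remaining product-by-region table."""
    q = quarters.index(quarter)
    return {
        product: {region: cube[p][r][q] for r, region in enumerate(regions)}
        for p, product in enumerate(products)
    }


print(cell("pen", "south", "Q2"))
print(roll_up_by_product())
print(slice_quarter("Q1"))
```

Each cell is reached by array position, so lookups, roll-ups, and slices need no joins. The trade-off is that the whole cube has to fit in the array.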

23

ROLAP (Relational OLAP)

OLAP facilities were integrated into relational systems, with data stored in a relational database.

24

HOLAP (Hybrid OLAP)

Stores some summaries in memory and keeps the base data and other summaries in a relational database.

25

Backend

Any relational database can be used as the backend or data source for an OLAP implementation.

26

OLTP (On-line Transaction Processing)

A fast, efficient, and standardized system for managing transaction-oriented operations. It focuses on handling large numbers of short transactions and on minimizing space requirements.
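
The contrast between a short OLTP transaction and an OLAP-style analytical query can be sketched with Python's built-in `sqlite3` module. The `accounts` table, its rows, and the `transfer` helper are assumptions made for the example; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, region TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts (id, region, balance) VALUES (?, ?, ?)",
    [(1, "north", 100), (2, "north", 50), (3, "south", 70)],
)
conn.commit()


def transfer(conn, source, target, amount):
    """OLTP-style work: one short transaction that touches two rows."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, source))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, target))


transfer(conn, 1, 3, 30)

# OLAP-style work: one query that reads and aggregates across the whole table.
totals = dict(conn.execute("SELECT region, SUM(balance) FROM accounts GROUP BY region"))
print(totals)
conn.close()
```

The transfer changes two rows and finishes quickly, while the aggregate reads every row to produce a summary.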

27

Data Mining

A semiautomatic process of analyzing large databases to identify patterns that support business decisions and predictions; often linked to machine learning.
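
One very simple form of pattern identification is counting which items occur together. The sketch below is a hedged illustration only: the shopping baskets, the `frequent_pairs` helper, and the `min_support` threshold are invented for the example, and real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def frequent_pairs(baskets, min_support):
    """Return the item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort first so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(baskets, min_support=3)
print(patterns)
```

A pair that clears the threshold is a candidate pattern; a person still has to judge whether it is meaningful, which is why the process is described as semiautomatic.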