Databases


396 Terms

1
New cards

What do ACID properties do?

They ensure that a transaction executes reliably and that its effects are permanently stored in the database.

2
New cards

• If the transaction is rolled back, it must return the database to its last consistent state before the update.

3
New cards

Describe situations suitable for data warehousing

Strategic Planning

4
New cards
5
New cards

• Strategic planning is a review and planning process that is undertaken to make thoughtful decisions about an organization's future in order to ensure its success.

6
New cards

By following a strategic planning process, an organization can improve business outcomes and avoid taking on unanticipated risks due to lack of foresight.

7
New cards

• One key item the organization would need to plan is data.

8
New cards

• A data warehouse provides the data needed for this planning.

9
New cards

Business Modelling

10
New cards
11
New cards

At its simplest, a business model is a specification describing how an organization fulfills its purpose. All business processes and policies are part of that model.

• A business model answers the following questions: Who is your customer, what does the customer value, and how do you deliver value at an appropriate cost?

12
New cards

• The data in a data warehouse largely influences how a business is modelled.

13
New cards

Explain the need for ETL processes in data warehousing.

• ETL (extract, transform, load) is ideal when:

14
New cards

• data has to be integrated from different source systems

15
New cards

• the source systems hold data in different formats

16
New cards

• the process has to be repeated many times
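
The three ETL steps can be sketched as follows. This is a minimal illustration, assuming two hypothetical source systems that export customer data in different formats (CSV and JSON); the field names, sample values and the list standing in for the warehouse are all made up for the example.

```python
import csv
import io
import json

# Hypothetical exports from two source systems (sample data for illustration).
CRM_CSV = "id,full_name,joined\n1,Ada Obi,2021-03-04\n2,Ben Ito,2022-11-20\n"
SHOP_JSON = '[{"customerId": 3, "name": "Cy Lee", "signup": "20/01/2023"}]'


def extract():
    """Extract: read raw rows from each source in its native format."""
    crm_rows = list(csv.DictReader(io.StringIO(CRM_CSV)))
    shop_rows = json.loads(SHOP_JSON)
    return crm_rows, shop_rows


def transform(crm_rows, shop_rows):
    """Transform: map both formats onto one common schema."""
    unified = []
    for row in crm_rows:
        unified.append(
            {"id": int(row["id"]), "name": row["full_name"], "joined": row["joined"]}
        )
    for row in shop_rows:
        # The second source stores dates as DD/MM/YYYY; convert to YYYY-MM-DD.
        day, month, year = row["signup"].split("/")
        unified.append(
            {"id": row["customerId"], "name": row["name"], "joined": f"{year}-{month}-{day}"}
        )
    return unified


def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse table (a list here)."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(*extract()), [])
for record in warehouse:
    print(record)
```

Because the steps are plain functions, the same pipeline can be re-run each time the sources are refreshed, which is the "repeated many times" case.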

17
New cards

Associations

• In association, a pattern is discovered based on a relationship between items in the same transaction.

18
New cards

• That is why the association technique is also known as the relation technique.

19
New cards

• The association technique is used in market basket analysis to identify a set of products that customers frequently purchase together
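
A small sketch of the idea behind market basket analysis, assuming a handful of hypothetical transactions. It only counts how often each pair of items appears in the same basket (the pair's support); real association-rule mining goes further than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

pair_counts = Counter()
for basket in transactions:
    # Sort so that each pair is always counted under one canonical ordering.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of transactions that contain both items of the pair.
support = {pair: count / len(transactions) for pair, count in pair_counts.items()}
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])
```

In this sample data, bread and butter appear together in three of the four baskets, so that pair has the highest support.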

20
New cards

Describe situations that benefit from data mining.

• Database analysis and decision support

• Market analysis and management

22
New cards

• Target marketing, customer relationship management, market basket analysis, cross selling, market segmentation

23
New cards

• Risk analysis and management

24
New cards

• Forecasting, customer retention, improved underwriting, quality control, competitive analysis

25
New cards

• Fraud detection and management

26
New cards

Difference between clustering and classification

• Supervised learning (classification)

27
New cards

• Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations

28
New cards

• New data is classified based on the training set

29
New cards
30
New cards

• Unsupervised learning (clustering)

31
New cards

• The class labels of the training data are unknown

32
New cards

• Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
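
The contrast can be sketched on one-dimensional data. This is an illustrative sketch only: the values and the labels "low" and "high" are hypothetical, the classifier is a simple nearest-centroid rule, and the clustering is a basic two-cluster k-means loop, chosen here for brevity rather than taken from the cards.

```python
from statistics import mean

# --- Supervised (classification): training data comes with class labels. ---
labelled = [(1.0, "low"), (1.5, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]


def train_centroids(samples):
    """Compute the mean value of each labelled class."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(value, centroids):
    """Assign a new value to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


centroids = train_centroids(labelled)
print(classify(7.2, centroids))

# --- Unsupervised (clustering): no labels, the groups are discovered. ---
unlabelled = [1.0, 1.5, 2.0, 8.0, 9.0]


def two_means(values, iterations=10):
    """Split values into two clusters with a basic k-means loop (k = 2)."""
    centres = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters


print(two_means(unlabelled))
```

The classifier needs the labels to learn what "low" and "high" mean, while the clustering step finds two groups without being told what they represent.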

33
New cards

Forecasting

• Forecasting discovers relationships between independent variables and the predicted (dependent) variables from past occurrences, and exploits them to predict an unknown outcome.

34
New cards

• For instance, prediction analysis can be used in sales to predict future profit: if sales is treated as the independent variable, profit is the dependent variable.
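
The sales and profit example can be sketched with a straight-line fit. This is a minimal sketch, assuming a linear relationship and using made-up historical figures; ordinary least squares is one common choice, not necessarily the technique the cards have in mind.

```python
# Hypothetical past observations: (sales, profit) per period.
history = [(100, 20), (200, 45), (300, 70), (400, 95)]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(history)

# Use the learned relationship to predict profit for an unseen sales figure.
predicted_profit = slope * 500 + intercept
print(round(predicted_profit, 2))
```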

35
New cards

Sequential Patterns

• Sequential patterns analysis seeks to discover or identify similar patterns, regular events or trends in transaction data over a business period.

36
New cards

• In sales, with historical transaction data, businesses can identify a set of items that customers buy together at different times in a year. Businesses can then use this information to offer customers better deals on those items, based on their past purchasing frequency.

37
New cards

Classifications

• Classification is a classic data mining technique based on machine learning (computer systems that can learn from data).

38
New cards

• Basically, classification is used to assign each item in a set of data to one of a predefined set of classes or groups.

39
New cards

• Classification methods make use of mathematical techniques such as decision trees, linear programming, neural networks and statistics.

40
New cards

• In classification, the software developed can learn how to classify the data items into groups.

41
New cards

Data

Collection of raw facts and figures

42
New cards

Information

This is processed data within a context. Processing may involve sorting, selection, arithmetic manipulations, interpretation, summarizing

43
New cards

Differences between data and information with examples

Data:

44
New cards
  1. Usually meaningless and difficult to understand because of lack of context
45
New cards
  1. Usually serves as input to processing systems
46
New cards
  1. Almost useless in decision making
47
New cards
  1. Example: Statistics, numbers, characters, images
48
New cards

Information:

49
New cards
  1. Usually meaningful and easy to understand and interpret since there is context
50
New cards
  1. Usually is the output of some processing
51
New cards
  1. Useful in decision making
52
New cards
  1. Examples: reports, pay slips, bills
53
New cards

Database

a collection of data and information that is organized so that it can easily be accessed, managed, and updated.

54
New cards

Information System

collection of technical and human resources providing storage, computing, distribution, and communication for the information an organization needs.

55
New cards

Database state

the data in a database at a particular time

56
New cards

Why are databases beneficial

When designed properly, databases help to:

57
New cards
  1. Find information quickly, because searching for information, or for relationships between pieces of information, is much faster using a database. Thanks to features such as search queries, using a database saves more time than using a book.
58
New cards
  1. More than one user can have access to the data at once. This is great for organizations with several employees who need to access the same data at once.
59
New cards
  1. With a well designed database, data redundancy is avoided
60
New cards
  1. Flexibility, as in using different views, or custom data representations so that the user can view the data that benefits them most. It can be referenced by many different applications.
61
New cards
  1. Longevity, as in the data can still be viewed as new versions of the DBMS software are released.
62
New cards

Database transaction

A logical unit of work, performed by a DBMS, that reads or updates the contents of a database.

63
New cards

Aggregation

A database operation that summarizes multiple rows into a single row.

64
New cards

E.g. counting the number of rows with the same name is a "count" aggregation.
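
A minimal sketch of the "count" aggregation, using Python's built-in sqlite3 module with an in-memory database. The table name, column name and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Ann",), ("Bob",), ("Ann",), ("Ann",)],
)

# GROUP BY collapses all rows sharing a name into a single summary row.
counts = conn.execute(
    "SELECT name, COUNT(*) FROM people GROUP BY name ORDER BY name"
).fetchall()
print(counts)
conn.close()
```

Four input rows are summarized into two output rows, one per distinct name.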

65
New cards

What does an information system consist of?

Software, hardware, people, data, and procedures. A database is therefore a component of an information system.

66
New cards

How do transactions maintain data consistency

• During the transaction there would be instances where certain values would be in an inconsistent state.

67
New cards

• For example, when the money to be paid has been deducted but not yet given to the customer, the customer's balance is temporarily incorrect.

68
New cards

• To ensure integrity and consistency, COMMIT and ROLLBACK commands are used

69
New cards

• If a transaction completes successfully, it is said to have committed, i.e. the necessary updates are made.

70
New cards

• The database reaches a new consistent state.

71
New cards

• On the other hand, if the transaction does not execute successfully, the transaction is aborted.

72
New cards

• If a transaction is aborted, the database must be restored to the consistent state it was in before the transaction started. Such a transaction is rolled back or undone.

73
New cards

Process of transactions maintaining data consistency

The general way transactions are executed is shown in the pseudocode below:

74
New cards

BEGIN TRANSACTION

75
New cards

do task 1

76
New cards

do task 2

77
New cards

do task 3

78
New cards

COMMIT

79
New cards

ON ERROR ROLLBACK

80
New cards

END TRANSACTION

Changes are committed if tasks 1, 2 and 3 all succeed. The transaction is rolled back if any of the tasks fails. This ensures the database is always left in a consistent state.
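
The commit-or-rollback pattern above can be sketched with Python's built-in sqlite3 module. This is a hedged illustration: the `accounts` table, the `transfer` and `balances` helpers, and the sample balances are hypothetical, and a CHECK constraint stands in for "a task fails".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between accounts as a single all-or-nothing transaction."""
    try:
        # Task 1: credit the receiver.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, receiver)
        )
        # Task 2: debit the sender (violates the CHECK constraint if funds are short).
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, sender)
        )
        conn.commit()  # All tasks succeeded: make the changes permanent.
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # A task failed: undo everything done so far.
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


first = transfer(conn, "alice", "bob", 30)    # succeeds and is committed
second = transfer(conn, "alice", "bob", 500)  # fails and is rolled back
final = balances(conn)
print(first, second, final)
conn.close()
```

After the failed transfer, the credit that had already been applied to the receiver is undone as well, so the database returns to the state left by the last successful commit.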

81
New cards

Data integrity

maintenance of, and the assurance of the accuracy and consistency of, data over its entire life-cycle, and is a critical aspect to the design, implementation and usage of any system which stores, processes, or retrieves data.

82
New cards

Data concurrency

The ability of a database to let multiple users carry out transactions on it at the same time.

83
New cards

E.g. while one user is editing a file, like a spreadsheet, another user can view it but not edit the data.

84
New cards

ACID properties of a transaction

Atomicity, Consistency, Isolation, Durability

85
New cards

Atomicity

A transaction is an indivisible unit that is either performed in its entirety or is not performed at all. All tasks must succeed together or fail together.

86
New cards

Consistency

The results of a transaction must conform to existing constraints in the database. The database must always be left in a valid state after a transaction.

87
New cards

• It is the responsibility of both the DBMS and the application developers to ensure consistency.

88
New cards

Isolation

• Transactions execute independently of one another. The partial effects of incomplete transactions should not be visible to other transactions.

89
New cards

• It is the responsibility of the concurrency control subsystem to ensure isolation.
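
A small sketch of isolation, using two connections to the same SQLite database file. It assumes SQLite's default settings in Python's built-in sqlite3 module; other DBMSs offer different isolation levels, so this shows one behaviour rather than a general rule. The `orders` table and file name are hypothetical.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.db")

    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE orders (item TEXT)")
    writer.commit()

    reader = sqlite3.connect(path)

    # The writer's INSERT starts a transaction that is not yet committed.
    writer.execute("INSERT INTO orders VALUES ('book')")

    # The reader does not see the partial effect of the open transaction.
    seen_before_commit = reader.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

    writer.commit()

    # Once committed, the change becomes visible to other connections.
    seen_after_commit = reader.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

    reader.close()
    writer.close()

print(seen_before_commit, seen_after_commit)
```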

90
New cards

Durability

The effects of a successfully completed transaction must be permanently recorded in the database and must not be lost because of a subsequent failure.

91
New cards

• It is the responsibility of the recovery subsystem to ensure durability

92
New cards

Database operations are…

Retrievals (queries) and updates (modifications)

93
New cards

Retrievals/queries

This involves selecting fields and records that satisfy the needs of a particular user.

94
New cards

Updates/modifications

This changes the state of the database. There are three operations which result in changes to a database:

95
New cards

• Insert: used to add one or more tuples (records) to a relation (table)

96
New cards

• Update: used to change the values of some attributes in existing tuples.

97
New cards

• Delete: used to remove/delete records from relations.
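
The three update operations, followed by a retrieval, can be sketched with Python's built-in sqlite3 module. The `students` table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)")

# Insert: add tuples (records) to the relation (table).
conn.executemany(
    "INSERT INTO students (id, name, grade) VALUES (?, ?, ?)",
    [(1, "Ann", 70), (2, "Bob", 55), (3, "Cy", 40)],
)

# Update: change the values of some attributes in existing tuples.
conn.execute("UPDATE students SET grade = ? WHERE id = ?", (60, 2))

# Delete: remove records from the relation.
conn.execute("DELETE FROM students WHERE id = ?", (3,))
conn.commit()

# Retrieval: select only the fields and records of interest.
rows = conn.execute("SELECT name, grade FROM students ORDER BY id").fetchall()
print(rows)
conn.close()
```

The retrieval leaves the database state unchanged, while each of the three update operations produces a new state.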

98
New cards

Purposes of database transactions

• To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations on the database remain uncompleted, with unclear status.

99
New cards

• To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes may be erroneous.

100
New cards

What is the role of data validation?

It checks input data to ensure that it is reasonable before the data is accepted.
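
A minimal sketch of what such a check might look like for an age field. The function name `validate_age`, the limits of 0 and 120, and the messages are assumptions made for illustration; real systems choose rules that suit their own data.

```python
def validate_age(raw_value, minimum=0, maximum=120):
    """Return (is_valid, message) for a raw age entered by a user."""
    text = raw_value.strip()
    if not text:
        return False, "presence check failed: a value is required"
    if not (text.isascii() and text.isdigit()):
        return False, "type check failed: value must be a whole number"
    age = int(text)
    if not minimum <= age <= maximum:
        return False, f"range check failed: value must be between {minimum} and {maximum}"
    return True, "ok"


for candidate in ["34", "abc", "200", "  "]:
    print(repr(candidate), validate_age(candidate))
```

Validation only confirms that a value is reasonable; it cannot confirm that the value is actually correct, since "34" passes even if the person is really 43.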