Databases Final

80 Terms

1

Transaction

Logical unit of work; from a list of things you want to happen, either everything happens or nothing happens

2

Properties that the relational databases support

ACID - atomicity, consistency, isolation, durability

3

Atomicity

All operations of a transaction happen or none do
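
A minimal sketch of all-or-nothing behavior, using Python's built-in sqlite3 module. The accounts table, the transfer helper, and the balances are made up for illustration; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are kept
failed = transfer(conn, "alice", "bob", 500)  # both updates are undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the second call, neither account shows the 500 transfer, even though both UPDATE statements ran before the check failed.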

4

Consistency

Consistent before and after; the transaction leaves the data in a correct (valid) state

5

Isolation

Transaction executes as if it were by itself

6

Durability

Once a transaction commits, its changes are permanent and survive crashes or power loss

7

Session

Connection + username, password, db name, hostname

8

Result set

Rows returned from a query to the driver, as a list of lists (one inner list of column values per row)

9

Dynamic web pages

Load information that is stored somewhere else (e.g. a database) and is continually updated

10

Static Web Pages

Information on the pages does not change

11

Stateless

The web server (WS) remembers nothing between requests, so every request has to carry your information again

12

Cookies

Little pieces of information that WS sends to browser for identification purposes

13

SQL Injection Attack

Injection of a separate SQL query via input data from client to application

14

Prepared statement

Database pre-compiles the SQL code and keeps it separate from the data (parameters), so client input cannot change the query itself
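
A small sketch of why keeping SQL code separate from data blocks an injection attack, using Python's built-in sqlite3 module. The users table and the malicious string are invented for the example; the `?` placeholder is sqlite3's parameter syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")])

# Hypothetical client input that tries to smuggle in extra SQL.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it rewrites the WHERE clause.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the placeholder treats the whole input as one plain string value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()
```

The unsafe query matches every user because the injected `OR '1'='1'` is always true; the parameterized query matches none, since no user has that literal name.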

15

Object oriented database

Stores data as objects; the user follows references between objects to find information

16

ORM (Object Relational Mapping)

Aligns code with database structures, simplifies interaction between relational databases and OOP languages

17

4 pieces of information to connect to the database

1. hostname of server

2. username

3. password

4. name of database

18

COALESCE

Returns the first non-null value in a list (e.g. COALESCE(x, 0) turns NULLs into 0)
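
A quick illustration with Python's built-in sqlite3 module; the orders table and its discount column are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, discount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 5), (2, None), (3, 0)])

# COALESCE returns discount when it is not NULL, otherwise the fallback 0.
rows = conn.execute(
    "SELECT id, COALESCE(discount, 0) FROM orders ORDER BY id"
).fetchall()
```

Only order 2 is changed: its NULL discount comes back as 0, while the real values 5 and 0 pass through untouched.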

19

Tier 2 Architecture

Two tiers: a combined web server and application (business logic) that connects directly to a database

20

Tier 3 Architecture

Three tiers: web server, application (business logic), and database are separate; the web server does not connect to the database directly

21

Process of sending signal to server

1. App to WIFI router

2. WIFI router to ISP; the ISP analyzes the destination and passes the signal on if it can't get there directly

3. ISP to SIP Trunk

4. SIP Trunk to backbone

5. Backbone to ISP closer to App server

6. ISP to App server

22

Man in the middle attack

Hacker positions themselves in the middle of the internet conversation between user and application

23

Cross Site Scripting Attack (XSS)

Hacker injects malicious executable scripts into a web page, often via an insecure link, so they run in the victim's browser

24

HTTPS

Hypertext Transfer Protocol Secure; HTTP sent over an encrypted (TLS) connection

25

Normalization

Organizing a database so that each piece of data is stored only once, to reduce redundant data

26

Bad smell

A sign that code or a design is "off" in certain areas, hinting at a deeper problem

27

Redundant data

The same piece of information stored, without need, in different areas of the database

28

Steps to determine amount of redundant data

1. Identify functional dependencies

2. Calculate closure

3. Categorize closure
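
The steps above can be sketched as follows. The relation R(A, B, C, D), its two functional dependencies, and the three category labels are assumptions chosen for illustration; the labels are one reading of "categorize" (closure is everything, closure is only itself, or something in between).

```python
def closure(attrs, fds):
    """Return every attribute reachable from `attrs` via the dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def categorize(attrs, fds, all_attrs):
    """Label a closure: 'key', 'trivial', or 'other' (hypothetical labels)."""
    reach = closure(attrs, fds)
    if reach == all_attrs:
        return "key"      # determines every attribute
    if reach == set(attrs):
        return "trivial"  # determines nothing beyond itself
    return "other"        # determines some attributes: a hint of redundancy

# Step 1: functional dependencies of a made-up relation R(A, B, C, D).
all_attrs = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

# Steps 2 and 3: compute and categorize a few closures.
a_closure = closure({"A"}, fds)
labels = {
    "A": categorize({"A"}, fds, all_attrs),
    "AD": categorize({"A", "D"}, fds, all_attrs),
    "C": categorize({"C"}, fds, all_attrs),
}
```

Here A reaches B and C but not D, so it lands in the "other" category, while A and D together reach everything.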

29

1st Normal Form

All attributes are of atomic type (values can only be compared with = or !=); does not eliminate redundant data

30

2nd Normal Form

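1NF and no non-key attribute depends on only part of a candidate key; treated as useless in practice, since 3NF is the usual target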

31

3rd Normal Form

For every functional dependency X -> Y, at least one answer must be yes:

1. Is the closure trivial (Y is a subset of X)?

2. Is the closure a key (X+ is every attribute)?

3. Is every attribute in Y - X part of a candidate key?

32

Candidate key closure

Smallest set of attributes whose closure is a key closure (every attribute in the table)

33

Boyce-Codd Normal Form

3NF and X should be a superkey for every nontrivial X->Y
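
A rough sketch of the superkey test, assuming a made-up relation R(A, B, C) with A -> BC and B -> C. The function names are hypothetical; the check simply reports each nontrivial dependency whose left side does not determine every attribute.

```python
def closure(attrs, fds):
    """Return every attribute reachable from `attrs` via the dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the nontrivial dependencies X -> Y where X is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) == all_attrs
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations

# Made-up relation R(A, B, C): A determines everything, B only determines C.
all_attrs = {"A", "B", "C"}
fds = [({"A"}, {"B", "C"}), ({"B"}, {"C"})]
violations = bcnf_violations(all_attrs, fds)
```

Only B -> C is reported, because B's closure is {B, C} and so B is not a superkey.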

34

Superkey

An attribute or attributes that uniquely identify each entity in a table.

35

4th Normal Form

Guaranteed when the table is in BCNF & there exists a simple candidate key; focus is on multi-value dependencies

36

Simple candidate key

Candidate key that is a set of 1 attribute; that single attribute determines every other attribute

37

5th Normal Form

Guaranteed when the table is in 3NF & all candidate keys are simple; focus is on join dependencies

38

Domain Key Normal Form

ultimate NF

39

ER Diagrams

Entity Relationship Diagram; model/design databases to display to the customer

40

Entities

objects or things in our enterprise; they have attributes

41

Relationships

measure of interaction between entities

42

Cardinalities

maximum # of times entities can relate to other entities

43

Primary key

Field that uniquely identifies a given entity in a table, represented by an underline (____)

44

Multi-value attribute

multiple values for specific attribute, represented by [ ]

45

Derived attribute

An attribute whose values can be calculated from related attribute values, represented by ( )

46

Composite attribute

An attribute that can be further subdivided into additional attributes, represented by an indent

47

Generalization

2 or more entities that have commonalities are combined; the common attributes go to a superclass

48

Specialization

Entity divided into sub-entities based on its characteristics

49

Total

Everything in the superclass must also be in a subclass (the superclass is abstract)

50

Partial

The superclass can represent entities that belong to none of the subclasses

51

Disjoint

Either or: an entity belongs to one subclass or another, never both

52

Overlapping

Can be both: an entity may belong to more than one subclass at once

53

Weak entity

Depends upon another entity to exist in database

54

Discriminator

Attribute (underlined with a dotted line) that tells apart the weak entities that depend on the same owner entity

55

Aggregation

Created when we want a relationship between relationships; a relationship is treated as an entity

56

Notes

Any clarification materials and primary keys for combined entities

57

Data mining

Analyzing large databases in order to discover patterns (AI)

58

kNN

Make predictions about a data point based on k closest data points

59

Classification problem

We know some info about a data point but not the category we are supposed to predict; predict it from the neighbors (e.g. majority vote)

60

Regression problem

Predict a numeric value by taking the average of the closest neighbors' values
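
A minimal kNN sketch covering both problem types, assuming Euclidean distance and a simple majority vote; the data points, labels, and function names are invented for the example.

```python
import math
from collections import Counter

def k_nearest(points, query, k):
    """Return the k (features, target) pairs closest to `query`."""
    return sorted(points, key=lambda p: math.dist(p[0], query))[:k]

def knn_classify(points, query, k):
    """Classification: majority vote over the neighbors' labels."""
    labels = [target for _, target in k_nearest(points, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(points, query, k):
    """Regression: average of the neighbors' numeric values."""
    values = [target for _, target in k_nearest(points, query, k)]
    return sum(values) / len(values)

# Made-up labeled points for classification.
labeled = [((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
           ((5, 5), "blue"), ((6, 5), "blue")]
predicted_label = knn_classify(labeled, (1.5, 1.5), k=3)

# Made-up numeric targets for regression.
priced = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]
predicted_value = knn_regress(priced, (2.2,), k=3)
```

The same neighbor search serves both cases; only the last step (vote versus average) differs.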

61

Leave-One-Out Cross Validation (LOOCV)

Purposely leave out one data point, train the model on the rest, then test it on the point left out; repeat for every point

62

Mean squared error

The average of the squared differences between the forecasted and observed values

63

Root mean squared error (RMSE)

Square root of the mean squared error; gives an indication of how good the predictions are for a given k over n data points
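
A sketch that ties LOOCV, mean squared error, and RMSE together, assuming a 1-D kNN regression as the model being scored. The data set and helper names are hypothetical.

```python
import math

def knn_predict(train, x, k):
    """Average the y values of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def loocv_rmse(data, k):
    """Leave each point out once, predict it from the rest, return the RMSE."""
    squared_errors = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # everything except point i
        squared_errors.append((knn_predict(train, x, k) - y) ** 2)
    mse = sum(squared_errors) / len(squared_errors)  # mean squared error
    return math.sqrt(mse)                            # root mean squared error

# Made-up (x, y) observations.
data = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
scores = {k: loocv_rmse(data, k) for k in (1, 2)}
```

Comparing the scores for different k is one way to choose k: the lower RMSE marks the better setting on this data.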

64

Clustering

Cluster points that belong together

65

k-means process

1. pick k centers (randomly)

2. assign each data point to its nearest center

3. compute new center (average of all points assigned to it)

repeat 2./3. until nothing changes
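
The steps above can be sketched as follows, assuming Euclidean distance and points given as tuples. The seed parameter and the sample points are made up so the run is repeatable.

```python
import math
import random

def k_means(points, k, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: pick k centers at random
    while True:
        # Step 2: assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: move each center to the average of its points
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing changed: done
            return centers, clusters
        centers = new_centers

# Two made-up groups of points, one near (1, 1) and one near (8, 8).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, k=2)
```

On this data the loop settles on the two visible groups; on less clearly separated data the result can depend on the random starting centers.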

66

Association rules

1. support - fraction of transactions that contain the item set; must reach a minimum level for the occurrence to be valid

2. confidence - # transactions with the combination / # transactions with the antecedent
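
A short worked example of both measures, using a made-up list of shopping baskets; the function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of the full combination divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Made-up transactions (one set of items per basket).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support(baskets, {"bread", "milk"})         # 2 of 4 baskets
bread_to_milk_conf = confidence(baskets, {"bread"}, {"milk"})    # 2 of 3 bread baskets
```

Dividing the two supports gives the same ratio as # combination / # antecedent, because the number of transactions cancels out.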

67

a priori

Algorithm that limits the # of candidate large item sets to check: an item set can only be large if all of its subsets are large

68

2 factors for a document's relevance to keywords

1. Term frequency - how many times the document contains the word

2. Inverse document frequency - weight measuring how rare or common a term is across a collection of documents; rarer terms weigh more
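
One common way to combine the two factors is to multiply them (TF-IDF). The sketch below assumes the plain log(N / df) form of inverse document frequency, which is only one of several variants; the three tiny documents are invented.

```python
import math

def tf(term, doc):
    """Term frequency: share of the document's words that are `term`."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency: log(total docs / docs containing term)."""
    containing = sum(term in doc for doc in docs)
    return math.log(len(docs) / containing) if containing else 0.0

def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# Made-up collection; each document is a list of words.
docs = [
    "the database stores the data".split(),
    "the index speeds up the query".split(),
    "the query reads the database".split(),
]

score_the = tf_idf("the", docs[0], docs)      # frequent, but in every document
score_index = tf_idf("index", docs[1], docs)  # appears in one document only
score_query = tf_idf("query", docs[1], docs)  # appears in two documents
```

A word like "the" scores zero despite its high term frequency, because a term found in every document carries no weight.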

69

Page rank algorithm

Provides a ranking to web pages that should be returned from a web search. Based in large part on how often other web pages link to a given page. (more links = better rank)

70

Random walks

Determines the probability of landing on each site when following links at random
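
A simplified sketch of the random-walk idea behind page rank, assuming the usual damped "random surfer" model with a damping factor of 0.85 (a conventional choice, not something stated in these cards). The four-page link graph is made up, and every linked page must appear as a key.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Approximate the probability that a random surfer is on each page."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}  # start with equal probability
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            targets = outgoing or pages  # a page with no links jumps anywhere
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share  # otherwise follow a link at random
        rank = new_rank
    return rank

# Made-up web: page C is linked to by A, B, and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
rank = page_rank(links)
best = max(rank, key=rank.get)
```

The ranks always sum to 1, so each value can be read as the probability of finding the surfer on that page.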

71

Precision

How accurate retrieval is, # relevant docs found / # docs found

72

Recall

Found vs. relevant, # relevant docs found / # relevant docs
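
A small worked example of precision and recall side by side; the document ids and the helper name are invented.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant docs that were actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Made-up search: 4 docs returned, 3 docs truly relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved, relevant)
```

Both measures share the same numerator; precision divides by what was returned, recall by what should have been returned.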

73

Search engine optimization (SEO)

Companies tune their pages (or pay specialists to do so) to get priority in search engine results

74

Big Data properties

1. Volume

2. Velocity

3. Variety

75

Big Table properties

1. Sparse - only stores values that exist; NULL values are not stored

2. Dynamic Schema - each row can have its own set of attributes

76

Map Reduce

uses a parallel distributed algorithm to process large amounts of data
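
A single-machine sketch of the map, shuffle, and reduce phases, using word counting as the example task. A real framework would run the map and reduce calls in parallel across many machines; here they run one after another, and the function names and input lines are made up.

```python
from collections import defaultdict
from functools import reduce

def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine one key's values into a single result."""
    return key, reduce(lambda a, b: a + b, values)

lines = ["big data big table", "map reduce big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each line is mapped independently and each key is reduced independently, both phases can be split across workers without changing the result.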

77

Spark

An open-source, distributed processing system for big data workloads

78

Resilient distributed dataset (RDD)

Collection of data elements that are partitioned across nodes in a cluster

79

Dataframe

Distributed collection of data organized into named columns, built on RDDs; can't be changed (immutable)

80

Why use database instead of Spark?

Spark does not support ACID principles (no FK, PK, etc.)
