Task 6: databases (CS)

0.0(0)
studied byStudied by 1 person
GameKnowt Play
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/57

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

58 Terms

1
New cards

Database

Structured collection of data, stored electronically in a computer system

Controlled by a database management system (DBSM)

2
New cards

Flat file database

Contains individual, unrelated tables (collections of data)

e.g., excel

3
New cards

Relational database

Contains multiple tables that share related fields (ID, customer)

Use structured query language (SQL) to store, manipulate and retrieve data

e.g., SQLite

4
New cards

Data v

Raw, unorganised facts that need to be processed

5
New cards

Info

Data that is processed, organised, structured or presented in a given context to make it useful

6
New cards

SQL data types

INTEGER - whole

REAL - decimal

Boolean - true/false

TEXT - string

date

  • TEXT as ISO8601 (“yyyy-mm-dd hh:mm:ss.sss”)

  • INTEGER as Unix Time (number of seconds since 1970-01-01 00:00:00 UTC

7
New cards

Relationships

Associations or connections between tables in a relational database

Help organise and structure data, allowing for efficient data retrieval and maintaining data integrity

8
New cards

Related table

Tables that share a relationship

Established using foreign keys, which are columns in a table that refer to the primary key in another table

9
New cards

One-to-one

Each record in Table A is associated with one and only one record in Table B, and vice versa

e.g., Tables ‘users’ and ‘user_profiles’, where each user has a single corresponding profile

10
New cards

One-to-many

Each record in Table A can be associated with multiple records in Table B, but each record in Table B is associated with only one record in Table A

e.g., Tables ‘departments’ and ‘employees’, where each department can have multiple employees, but each employee belongs to one department

11
New cards

Many-to-many

Each record in Table A can be associated with multiple records in Table B, and vice versa

e.g., Tables ‘students’ and ‘courses’, where each student can enroll in multiple courses, and each course can have multiple students

12
New cards

Data anomalies

Inconsistency in data resulting from an operation like an update, insertion or deletion

13
New cards

Normalisation

To reduce the chance of creating data anomalies when modifying database

Minimise or eliminate redundancy

14
New cards

Redundancy

Several, unrelated copies of the same data in the database

15
New cards

Update anomaly

Results from data redundancy and partial update

16
New cards

Delete anomaly

Unintended loss of data due to deletion of other data

17
New cards

Insert anomaly

Inability to add data to the database due to the absence of other data

18
New cards

Composite keys

Creating a primary key by combining more than one attribute

19
New cards

Constraints

Used to limit the type of data that can go into a table

Ensures accuracy and reliability of data

Specified when table is created

e.g., primary key, autoincrement, not null, unique

20
New cards

Data integrity/quality

Accuracy, completeness and quality of data as it’s maintained over time and across formats

e.g., currency, authenticity, relevance, accuracy, outliers

21
New cards

Entity integrity

Feature of relational DB

Each entity (row) must have a unique and non-null PK

Ensures that data can’t be listed multiple times

22
New cards

Domain integrity

Enforced by types and constraints

Ensures the accuracy of values of data within a domain (entity)

23
New cards

Referential integrity

Relationship between tables

Each table in a DB must have a PK which can appear in other tables

Relationship between FK leads to hierarchy of dependencies

Updating or deleting rows can break this

To prevent use ON UPDATE/DELETE CASCADE/SET NULL

24
New cards

Currency

Accurate, up to date and relevant

25
New cards

Authenticity

Trustworthy info collected from real life events

26
New cards

Relevance

Degree to which data provides insight into the problem or purpose being addressed

27
New cards

Accuracy

Percentage of errors across the total number of records

28
New cards

Outliers

Data point that differs significantly from other observations that can distort statistical results

29
New cards

Accuracy vs validity

Both seek to describe the quality of usability of a data set

Accuracy - how well the data corresponds to the real world or true value

Validity - how well data values are consistent, based on defined rules

30
New cards

Referential integrity

Making sure that the relationships between tables remain valid, even if data is modified or deleted

Foreign keys join tables and establish dependencies - if change or delete row in one table, meaning of rows in other table is destroyed

31
New cards

Aggregate function

Performs a calculation on a set of values and returns a single value

e.g., COUNT, SUM AVG, MAX, MIN

32
New cards

Aliases

Give table or column in table a temporary name, making them more readable

Use AS

33
New cards

Concatenated fields

Created by combining the values of multiple fields using the | | operator, often with additional characters for formatting

34
New cards

Calculated fiedls

Columns of a table whose values are a function of other columns in the same row

35
New cards

Open database connectivity (ODBC)

Open standard API used for accessing a database

Allows programs to use SQL requests that access databases without knowing the proprietary interfaces to the databases

Enables a developer to cover and effectively work with many database systems whereby for relational database it enables a developer to make connections with tabular data in ODBC server

36
New cards

ACID

Ensures reliable and consistent database transactions

Atomicity, consistency, isolation, durability

37
New cards

Role of ACID

Provide a mechanism to ensure the correctness and consistency of a database in a way such that each transaction is a group of operations that acts as a single unit, produces consistent results, acts in isolation from other operations, and updates that it makes are durably stored

38
New cards

Atomicity

Entire transaction takes place at once or doesn’t happen at all

39
New cards

Consistency

Database must be consistent before and after the transaction

40
New cards

Isolation

Multiple transactions occur independently without interference

41
New cards

Durability

The changes of a successful transaction occurs even if the system failure occurs

42
New cards

Normalisation

Process of organising data in a database

Creating tables and establishing relationships between them according to rules designed to protect data and make it more flexible

Eliminates redundancy and inconsistent dependency

43
New cards

Form of a relation

No repeated attributes

All cells must be atomic (only contain single piece of data)

44
New cards

1NF

In the form of a relation

Each record must have a primary or composite key

45
New cards

2NF

In 1NF

Have no partial dependencies - when a non-key attribute is only dependent on part of the composite key

46
New cards

3NF

In 2NF

Have no transitive dependencies - when a non-key field is dependent on a field other than the primary key

47
New cards

Data quality

Assesses whether information can serve its purpose in a particular context

48
New cards

Cleaning outliers

Deleting the outlier values, transforming the data or treating them as missing values

49
New cards

Ethics in data collection

Entities in DBs are people which contain their personal info

Important to obtained informed consent for collection, collect only necessary info and ensure data is secured

50
New cards

Privacy concerns

As DBs store personal info, they are attractive targets for cyberattacks

Protecting data privacy is crucial to maintain trust and comply with regs

51
New cards

Appropriate use of data

DBs enable data analysis and decision making based on store info (can lead to discrimination)

Personal info datasets are often ‘on sold’ without knowledge or consent of people whose info that’s sold

52
New cards

Reliability of data sources

DBs often aggregate data from various sources

Ensure the reliability of these sources is crucial for data quality and validity of subsequent analyses or decisions

53
New cards

Acknowledgement of data sources

DBs often contain data complied or curated by others

Acknowledging data sources is important for respecting IP rights, ensuring transparency and enabling reproducibility of research

54
New cards

Use of data mining

DBs are the raw material for data mining algorithms, which extract patterns and insights from large datasets

Can be used for various purposes (e.g., market research, scientific discovery)

55
New cards

Keep personal data private practices

Store personal data in encrypted form

Collect only min necessary data

Implement strict access controls based on roles and responsibilities

Define clear policies for how long it’s retained and how to dispose of it

56
New cards

Backups of organisational data practices

Schedule regular backups

Store it in multiple, separate locations

Regularly test the restoration process to ensure they’re usable and valid

Maintain multiple versions

57
New cards

Restricting access to data practices

Role-based access control (RBAS): assign permissions based on job roles and responsibilities

Principle of least privilege (POLP): grant users only min necessary access to perform their tasks

Use strong authentication

Log in and monitor data base access to detect unauthorised activities

58
New cards

Ownership and control of data practices

Policies and procedures to define data ownership, roles and responsibilities and processes for handling data

Compliance with data protection regs

Establish legal contracts that specify data ownership, usage, restrictions and liabilities when dealing with third parties