ACC 3303 chp 4, 5, 6, & 7

Last updated 6:21 AM on 5/12/26

88 Terms

1
New cards

what is a database

  • a related group of files that efficiently and centrally coordinates information

2
New cards

a file is

  • a related group of records

3
New cards

a record is

  • a related group of fields

4
New cards

a field is

  • a specific attribute of interest for the entity

5
New cards

advantages of database

  • data is integrated

  • data sharing

  • minimize data redundancy and inconsistencies

  • data is independent of the programs that use the data

  • data is easily accessed for reporting and cross-functional analysis

6
New cards

relational database

  • represents the conceptual and external schema as if that “data view” were truly stored in one table

  • to the user, the conceptual view makes the information appear to be in one big table, but it is really a set of tables that relate to one another

7
New cards

update anomaly

  • if a database isn’t normalized, a piece of information that is stored in several places has to be updated everywhere it exists in the database

  • one of those occurrences might be missed, leaving inconsistent data

8
New cards

insert anomaly

  • when we can’t add a new record because related information it depends on doesn’t exist yet

9
New cards

delete anomaly

  • if we delete a record, other information stored in the same row can be lost unintentionally, and copies stored elsewhere might not be removed
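
The update anomaly can be illustrated with a small sketch. The customer name, addresses, and field names below are hypothetical sample data, not from the course material. The first half stores the address in every order row, so a partial update leaves conflicting values; the second half stores it once, so one update is enough.

```python
# Unnormalized: the customer's address is repeated in every order row.
orders = [
    {"order": 1, "customer": "Acme", "address": "1 Main St"},
    {"order": 2, "customer": "Acme", "address": "1 Main St"},
]
orders[0]["address"] = "9 Oak Ave"  # the second row is missed
addresses_seen = {o["address"] for o in orders if o["customer"] == "Acme"}
inconsistent = len(addresses_seen) > 1  # two different addresses for one customer

# Normalized: the address lives in one place and orders only reference the customer.
customers = {"Acme": {"address": "1 Main St"}}
norm_orders = [
    {"order": 1, "customer": "Acme"},
    {"order": 2, "customer": "Acme"},
]
customers["Acme"]["address"] = "9 Oak Ave"  # a single update
norm_addresses = {customers[o["customer"]]["address"] for o in norm_orders}
```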

10
New cards

relational database design rules

  • every column in a row must be single valued

  • the primary key cannot be null (empty); this is known as entity integrity

  • if a foreign key is not null, it must have a value that corresponds to the value of a primary key in another table (referential integrity)

  • all other attributes in the table must describe characteristics of the object identified by the primary key
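
A minimal sketch of the entity and referential integrity rules, using Python's built-in sqlite3 module. The customer and sale tables and their values are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE customer (customer_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute(
    "CREATE TABLE sale ("
    " sale_id TEXT PRIMARY KEY NOT NULL,"
    " customer_id TEXT REFERENCES customer(customer_id),"
    " amount REAL)"
)
conn.execute("INSERT INTO customer VALUES ('C1', 'Acme Co.')")
conn.execute("INSERT INTO sale VALUES ('S1', 'C1', 250.0)")  # foreign key matches a customer
conn.execute("INSERT INTO sale VALUES ('S2', NULL, 75.0)")   # a null foreign key is allowed


def try_insert(sql):
    """Return True if the insert succeeds, False if an integrity rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Referential integrity: customer 'C9' does not exist, so the row is rejected.
bad_fk = try_insert("INSERT INTO sale VALUES ('S3', 'C9', 10.0)")
# Entity integrity: a null primary key is rejected.
null_pk = try_insert("INSERT INTO customer VALUES (NULL, 'No key')")
sale_count = conn.execute("SELECT COUNT(*) FROM sale").fetchone()[0]
```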

11
New cards

data presentation

  • visualized data is processed faster than written or tabular information

  • easier to use; users need less guidance to find information with visualized data

  • supports the dominant learning style of the population because most learners are visual learners

12
New cards

visualization: comparison

  • used for comparing data across categories or groups; require numeric and categorical data values

13
New cards

visualization: correlation

  • used to compare how two numeric variables fluctuate with each other

14
New cards

visualization: distribution

  • used to show spread in numeric values

15
New cards

visualization: trend evaluation

  • used to show changes over an ordered variable, usually time

16
New cards

visualization: part to whole

  • used to show which items make up parts of a total

17
New cards

high-quality visualization: simplification

  • refers to making a visualization easy to interpret and understand

    • distance: how far apart related data is presented

    • orientation: change the direction of the entire chart

18
New cards

visualization: emphasis

  • assuring the most important message is easily identifiable

    • highlighting: using color, contrasts, labels, arrows, fonts, etc. to draw attention to an item

    • weighting: amount of attention an element attracts

    • ordering: intentional arranging to produce emphasis

19
New cards

high-quality visualization: ethical presentation

  • refers to avoiding the intentional or unintentional use of deceptive practices that can alter the user’s understanding of the data being presented

    • data deception: graphical depiction of information, designed with or without intent to deceive, that may create a message that varies from the actual message

20
New cards

alternative hypothesis

  • statement of inequality, suggesting that one concept, idea, or group is related to another concept, idea, or group

21
New cards

categorical data

  • data that take a limited number of assigned values to represent different groups; numeric data, by contrast, are continuous or near continuous

22
New cards

classification analyses

  • techniques that identify key characteristics of groups or populations and use them to classify new observations into one of those groups

23
New cards

confirmatory data analysis

  • testing a hypothesis and providing statistical measures of the likelihood that the evidence refutes or supports a hypothesis

24
New cards

data ordering

  • refers to the intentional arrangement of visualization items to produce emphasis

25
New cards

data overfitting

  • the model fits the training data very well but does not predict well when applied to other datasets

26
New cards

effect size

  • quantitative measure of the magnitude of the effect, provides insight into the importance of the relationship

27
New cards

exploratory data analysis

  • an approach that studies data without testing formal models or hypotheses

28
New cards

extrapolation beyond the range

  • the process of estimating a value that is beyond the data used to create the model

29
New cards

machine learning

  • an application of AI that allows computer systems to improve and update prediction by using algorithms and statistical models to analyze and draw inferences from patterns in data

30
New cards

outlier

  • a data point that lies far from the other values in the data
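
One common way to flag outliers, shown here as a sketch, is the 1.5 × IQR rule. The rule and the sample values are assumptions for illustration; the card itself does not prescribe a method.

```python
from statistics import quantiles

values = [10, 12, 11, 13, 12, 95]  # hypothetical sample data

# Quartiles of the data; the middle value (the median) is not needed here.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1

# Assumed rule of thumb: anything more than 1.5 * IQR outside the quartiles is an outlier.
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < low or v > high]
```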

31
New cards

simplification

  • making a visualization easy to interpret and understand

32
New cards

test dataset

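  • data not used to create the model; used to test how well the model predicts (the training dataset is the one used to create the model for future prediction)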

33
New cards

analytics mindset

  • ask the right questions and form some predictions

  • also perform ETL (extract, transform, load)

34
New cards

descriptive analytics

  • what happened

  • understanding how the data has behaved

  • performance ratios: profitability, turnover

  • exploratory analysis includes

    • finding any mistakes

    • understand the structure of the data

    • check assumptions for higher level analytics

    • determine existing relationships in the data

35
New cards

four categories of data analysis

  • descriptive - what happened

  • diagnostic - why did this happen

  • predictive - what is likely to happen in the future

  • prescriptive - what should be done

36
New cards

what can go wrong with data analytics?

  • poor data leading to inappropriate conclusions

  • overfitting the data

  • extrapolating beyond the range

  • failing to appreciate the level of variation

37
New cards

automation

  • refers to the use of machines to perform tasks previously carried out by humans

38
New cards

bot

  • an autonomous computer program that users create with RPA software to perform specific tasks

39
New cards

dark data

  • data that the organization has collected and stored but is not analyzed and is therefore generally ignored

40
New cards

data lake

  • collection of structured, semi-structured, and unstructured data stored in a single location

41
New cards

data mart

  • a smaller data repository holding structured data; it is often more efficient to process data in these smaller repositories

42
New cards

data storytelling

  • process of translating complex data analyses into simpler terms to aid in better decision making

43
New cards

data swamps

  • data repositories that are not accurately documented, so their stored data cannot be properly identified and analyzed

44
New cards

data variety

  • different forms data can take

45
New cards

data velocity

  • pace at which data is created and stored

46
New cards

data veracity

  • quality or trustworthiness of data

47
New cards

delimiter

  • a character, or series of characters, that separates one field from another

48
New cards

ETL process

  • process of extracting, transforming, and loading data

49
New cards

flat file

  • text file that consolidates data from multiple tables or sources into a single row

50
New cards

metadata

  • data that describes other data, such as the number of characters allowed in different fields, the type of characters allowed, and the format of data in a particular field

51
New cards

robotic process automation

  • software that allows users to create autonomous computer programs, called bots, to perform specific tasks across different applications

52
New cards

structured data

  • data that are highly organized and fit into fixed fields

53
New cards

text qualifier

  • two characters that indicate the start and end of a field and tell the program to ignore any delimiters contained between the qualifiers
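
A short sketch of how a delimiter and a text qualifier work together, using Python's built-in csv module. The sample rows are hypothetical. The comma is the delimiter and the double quote is the text qualifier, so the comma inside "Smith, Jane" is not treated as a field break.

```python
import csv
import io

# Hypothetical delimited text: comma delimiter, double-quote text qualifier.
raw = 'id,name,city\n1,"Smith, Jane",Tulsa\n2,"Lee, Sam",Dallas\n'

# The csv reader honors the qualifier and keeps each name as one field.
rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar='"'))

# A naive split ignores the qualifier and breaks the name into two pieces.
naive = raw.splitlines()[1].split(",")
```

Here `rows[1]` has three fields, while `naive` has four because the name was split at its embedded comma.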

54
New cards

unstructured data

  • have no uniform structure and include items such as images, audio files, documents, tweets, e-mails, videos, and presentations

55
New cards

analytics mindset

  • the ability to visualize, articulate, conceptualize, or solve both complex and simple problems by making decisions that are sensible given the available information, and the ability to identify trends through analysis of the data/information

    • ask the right questions

    • extract, transform and load relevant data

    • apply appropriate data analytics techniques

    • interpret and share the results with stakeholders

56
New cards

data analytics is _____

  • forward looking

  • what will happen next that can be improved

57
New cards

extracting data - 3 step process

  • understand the data needs and available data

  • perform the data extraction

  • verify the data extraction and document what you’ve done

58
New cards

data dictionary includes

  • “metadata,” which is data about data

  • examine the data dictionary before you start to process your data

  • make sure you understand what is in the data

59
New cards
60
New cards

ETL process

  • most people say this takes up to 80% of the time when performing data analysis

  • extracting, transforming, loading

61
New cards

data structuring

  • process of changing the organization and relationship among data fields; rearranging for analysis

    • aggregate

    • joining

    • pivoting

62
New cards

aggregate

  • summary with fewer details than original

63
New cards

joining

  • bringing data from different tables together

64
New cards

pivoting

  • rotating data from rows to columns
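
The three data structuring techniques above can be sketched in plain Python. The sales rows, region lookup, and field names are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical sample data.
sales = [
    {"region_id": 1, "quarter": "Q1", "amount": 100},
    {"region_id": 1, "quarter": "Q2", "amount": 150},
    {"region_id": 2, "quarter": "Q1", "amount": 80},
    {"region_id": 2, "quarter": "Q1", "amount": 20},
]
regions = {1: "East", 2: "West"}

# Joining: bring the region name from the lookup table into each sales row.
joined = [{**s, "region": regions[s["region_id"]]} for s in sales]

# Aggregating: summarize to one total per region and quarter (fewer details).
totals = defaultdict(int)
for row in joined:
    totals[(row["region"], row["quarter"])] += row["amount"]

# Pivoting: rotate the quarter values from rows into columns, one row per region.
pivot = defaultdict(dict)
for (region, quarter), amount in totals.items():
    pivot[region][quarter] = amount
```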

65
New cards

data standardization

  • standardizing structure and meaning of each data element so it can be analyzed and used

  • ensures data uses consistent syntax throughout

    • parsing

    • concatenation

    • cryptic data

    • misfielded data

66
New cards

parsing

  • separating data from a single field into multiple fields

67
New cards

concatenation

  • combining data from multiple fields into one
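
Parsing and concatenation are opposites, as this small sketch with hypothetical values shows.

```python
# Parsing: separate one field into multiple fields.
full_name = "Jane Smith"
first_name, last_name = full_name.split(" ", 1)

# Concatenation: combine multiple fields into one.
city, state = "Tulsa", "OK"
location = f"{city}, {state}"
```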

68
New cards

cryptic data

  • data items that have no apparent meaning without knowledge of the coding scheme

69
New cards

dummy variables

  • contains only 2 responses, usually 0 or 1
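
A minimal sketch of creating a dummy variable from a categorical field. The status values are hypothetical.

```python
# Hypothetical categorical field with two possible responses.
statuses = ["paid", "unpaid", "paid"]

# Dummy variable: 1 if the invoice is paid, 0 otherwise.
is_paid = [1 if s == "paid" else 0 for s in statuses]
```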

70
New cards

misfielded data

  • correctly formatted but in wrong field

71
New cards

data cleaning

  • process of updating data to be consistent, accurate and complete

72
New cards

data de-duplication

  • process of analyzing data and removing two or more records that contain identical information
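
De-duplication can be sketched as keeping the first copy of each record and dropping exact repeats. The records below are hypothetical.

```python
# Hypothetical records; the third is an exact duplicate of the first.
records = [("C001", "Acme"), ("C002", "Bolt"), ("C001", "Acme")]

seen = set()
unique = []
for record in records:
    if record not in seen:  # keep only the first occurrence
        seen.add(record)
        unique.append(record)
```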

73
New cards

data filtering

  • process of removing records or fields of information from a data source

74
New cards

data imputation

  • process of replacing a null or missing value with a substituted value

  • only works with numeric data
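
A sketch of one simple imputation choice: replacing a missing value with the mean of the observed values. Using the mean is an assumption made for illustration, and the amounts are hypothetical.

```python
from statistics import mean

# Hypothetical numeric field with one missing value.
amounts = [100.0, None, 140.0, 120.0]

# Substitute value: the mean of the values that are present.
observed = [a for a in amounts if a is not None]
fill_value = mean(observed)

imputed = [fill_value if a is None else a for a in amounts]
```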

75
New cards

data contradiction errors

  • errors that exist when the same entity is described in two conflicting ways

  • need to be investigated and resolved appropriately

76
New cards

data threshold violations

  • data errors that occur when a data value falls outside an allowable level
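
A threshold check can be sketched as a range test. The employee IDs, hours, and the allowable range of 0 to 168 hours per week are hypothetical.

```python
# Hypothetical weekly hours per employee.
hours = {"E1": 40, "E2": 172, "E3": -5}

# Assumed allowable level: a week has 168 hours, and hours cannot be negative.
LOW, HIGH = 0, 168

violations = {emp: h for emp, h in hours.items() if not LOW <= h <= HIGH}
```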

77
New cards

violated attribute dependencies

  • errors that occur when a secondary attribute in a row of data does not match the primary attribute

78
New cards

data entry errors

  • are all types of errors that come from inputting data incorrectly

  • often occur in human data entry and can also be introduced by the computer system

79
New cards

data validation

  • process of analyzing data to make certain the data has the properties of high-quality data

  • it should happen throughout data transformation

80
New cards

visual inspection

  • the process of examining data using human vision to see if there are problems

81
New cards

basic statistical tests

  • can be performed to validate the data

82
New cards

audit a sample

  • one of the best techniques for assuring data quality

83
New cards

advanced testing techniques

  • possible with a deeper understanding of the content of data

84
New cards
85
New cards
86
New cards

data consistency

  • every value in a field should be stored in the same way
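
As a sketch, dates stored in mixed formats can be converted to one consistent format. The sample dates and the two accepted input formats are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical date field stored in two different ways.
raw_dates = ["2024-01-05", "01/05/2024"]
FORMATS = ["%Y-%m-%d", "%m/%d/%Y"]  # assumed input formats


def to_iso(value):
    """Try each known format and return the date as YYYY-MM-DD."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")


consistent = [to_iso(d) for d in raw_dates]
```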

87
New cards

88
New cards

regular expression (regex)

  • sequence of characters that specify a search pattern
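
A short sketch using Python's built-in re module. The invoice number format (INV- followed by four digits) is hypothetical.

```python
import re

# Hypothetical pattern: "INV-" followed by exactly four digits.
INVOICE = re.compile(r"INV-\d{4}")

note = "Paid INV-1001 and INV-1002; inv-7 is malformed."
found = INVOICE.findall(note)  # search: every substring that fits the pattern


def is_valid(code):
    """Validate: the whole string must fit the pattern."""
    return INVOICE.fullmatch(code) is not None
```

`findall` searches inside a longer string, while `fullmatch` is the stricter check for validating a single field value.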