Accounting Information Systems Exam 2

0.0(0)
Studied by 0 people
call kaiCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/74

encourage image

There's no tags or description

Looks like no tags are added yet.

Last updated 2:51 AM on 3/27/26
Name
Mastery
Learn
Test
Matching
Spaced
Call with Kai

No analytics yet

Send a link to your students to track their progress

75 Terms

1
New cards

Database Management System (DBMS)

a software program that manages and regulates the data, as well as the communication between the data and the application programs that access the data stored in the database.

2
New cards

Database System

The database, the DBMS, and the application programs that access the database through the DBMS

3
New cards

Data Warehouse

One or more very large databases containing both detailed and summarized data from various sources that are used for analytical processing rather than transaction processing

  • allows the online transaction processing database to process 1 million+ transactions per minute

4
New cards

Business Intelligence

Analyzing large amounts of data for strategic decision making
- Two main business intelligence techniques: Online analytical processing and data mining

5
New cards

Online Analytical Processing

Using queries to investigate hypothesized relationships among data

  • one of the two main techniques used in business intelligence

    • Ex: manager may review supplier purhcases for past 3 years, followed by additional queries that drill down to see prucases by item number and by fiscal period

6
New cards

Data mining

uses advanced statistical analysis, including AI techniques such as neural networks, to identify non hypothesized relationships in the data

  • ex: credit card companies use data mining to identify usage patterns that may indicate fraud, and to identify relationships in sales data that can be sued for future promotions

7
New cards

Advantages of database systems

  • Data integration - previously separate application files are combines into large pools of data that can be accessed by multiple application programs, such as an employee database that combines payroll, personnel, and job skills files

  • Data sharing - integrated data can be easily shared with authorized users through centralized data storage, making it easy to research a problem or obtain detailed information from a report

  • Minimal data redundancy and data inconsistencies - by storing data items only once, data redundancy and data inconsistencies are minimized

  • Data independence - programs and data are independent of each other, allowing each to be changed without affecting the other, which simplifies programming and data management

  • Cross-functinoal analysis - relationships, such as the connection between selling costs and promotional campaigns, can be defined and used in management reports

8
New cards

Logical View vs. physical view

  • logical view = how people actually conceptually organize and understand the relationships among data items

    • ex: a sales manager views all customer information as being stored in a table

  • physical view = refers to the way data are physically arranged and stored in the computer system

9
New cards

Schemas definition

  • a description of the data elements in a database, the relationships among them, and the logical model used to organize and describe the data

  • Think of a schema as the architectural drawing for a building: it shows where the walls (tables), rooms (records), and plumbing (connections) go, but it isn't the furniture inside.

10
New cards

The three levels of schemas

  • the external-level schema = a user’s logical view of their portion of a database as noted, each of which is referred to as a subschema

  • conceptual-level schema = the organization-wide view of the entire database, lists all data elements and the relationships among them

  • Internal-level schema = includes detailed view of the database, describes how the data are stored and accessed, including record layouts, definitions, addresses, and indexes

<ul><li><p>the external-level schema = a user’s logical view of their portion of a database as noted, each of which is referred to as a subschema</p></li><li><p>conceptual-level schema = the organization-wide view of the entire database, lists all data elements and the relationships among them</p></li><li><p>Internal-level schema = includes detailed view of the database, describes how the data are stored and accessed, including record layouts, definitions, addresses, and indexes</p></li></ul><p></p>
11
New cards

How are schemas used by the database management system

  • the DBMS uses the mappings (schemas) to translate a user’s or program’s request for data (expressed in terms of logical names and relationships) into the indexes and addresses needed to physically access the data

12
New cards

The five languages of the DBMS

  • Data definition language (DDL): builds the data dictionary, creates the database, describes logical views for each user, and specifies record or field security constraints

    • CREATE, DROP, ALTER

  • Structured Query Language (SQL): standardized commercial programming language designed for managing data in relational database systems. Though it is standardized, variations exist among different database systems

  • Data manipulation language (DML): changes database content, including data, updates, insertions, and deletions

    • Ex: INSERT, UPDATE, TRUNCATE, DROP

  • Data Query Language (DQL): a high-level, English-like languages that contains powerful, east-to-use commands that enable users to retrieve, sort, order, and display data

    • Ex: SELECT

  • Report writer - uses standard English to explain how a query is fulfilled

13
New cards

Relational Databases

A collection of two-dimensional tables with each table representing an object about which we wish to collect and store information

  • each row is called a tuple, contains data about a specific occurrence of an entity

  • each column contains data about an attribute of that entity

Relational data model represents conceptual and external-level schemas as if they were represented in tables, though they are actually represented as if they were internal-level schemas

14
New cards

Primary key vs. Foreign key - relational databases

  • primary key = the database attribute, or combination of attributes, that uniquely identifies a specific row in a table

    • The one piece of data that is guaranteed to be unique for every single record

    • Can never be empty (null)

    • Ex: Student ID number - even if two students are named Alex Smith and were born on the same day, they’ll still have different student IDs

  • foreign key = an attribute in one table that is a primary key in another table AND is used to link the two tables

    • does NOT have to be unique

    • creates a relationship between two tables

    • Ex: The student ID in the Class Enrollment table links back to the Student table. This allows us to track which student is in which class, rather than having to type in the student name, date of birth, address, major, etc.

15
New cards

The 4 Vs of big data

  • Data Volume = refers to the amount of data created and stored by an organization

  • Data Variety = refers to the different forms data can take

  • Data Velocity = refers to the speed at which data are created and stored by an organization

  • Data Veracity = refers to the quality and trustworthiness of data

16
New cards

EY definition of analytics mindset

the ability to ask the right questions, extract and transform relevant data (ETL), apply appropriate analytical techniques, and interpret/share results with stakeholders

17
New cards

SMART objectives when asking a question

Specific - direct and focused to produce a meaningful answer

Measurable - amenable to data analysis. The inputs to answering the question must be measurable with data

Achievable - able to be answered, and the answer should result in a decision maker taking action

Relevant - related to the objectives of the organization or the situation under consideration

Timely - have a defined time horizon for answering

18
New cards

ETL

Extract

Transform

Load

19
New cards

Steps to extract data

Extract - 3 steps

  • understanding the data needs and the data that is available

  • performing the data extraction

  • verifying the data extraction quality and documenting what was done

20
New cards

Steps to transform data

Transform - 4 steps

  • understand the data and the desired outcome

  • standardize, structure, clean the data

  • validate data quality and verify that data meet data requirements

  • document the transformation process

21
New cards

Steps to load data

Load -

  • transformed data must be stored in a format and structure compatible with the receiving software

  • save data as a text file using delimiters, put them in relational table form

  • strip out, standardize most formatting (remove commas, use standard date format, use minus sign to indicate negative)

22
New cards

Structured data

Data that are highly organized and fit into fixed fields. Data warehouses typically only store structured data

  • accounting data

  • relational data

  • most spreadsheet data

23
New cards

Unstructured Data

Have no uniform structure and include items such as images, audio files, documents, tweets, e-mails, videos, presentations

24
New cards

Semi-structured data

Data that are organized to some extent but not enough to be inserted into a relational database

25
New cards

Data Mart

smaller (than data warehouses) data repositories holding structured data

  • ex: company creates data mart for all data in geographic regions (N. America, S. America)

  • Ex: company creates data mart for each division (sales, marketing)

26
New cards

Data Lake

A collection of structured, semi-structured, and unstructured data stored in a single location

  • can cause problems if it becomes too large and if important data gets lost (dark data)

27
New cards

Dark data

Data the organization has collected and stored that would be useful for analysis but is not analyzed or organized and is thus generally ignored

28
New cards

Data Swamp

Form when data lakes are not accurately documented, making it so that their stored data cannot be identified and analyzed

  • disorganized, impossible to use

29
New cards

Delimiters and text qualifiers in flat files

  • flat file = text file that consolidates data from multiple tables or sources into a single row

    • make it easier to find and analyze data as opposed to separate files

    • A DELIMITER (aka field separator) is used to distinguish between fields in a single line

      • A character/series of characters that separates one field from another

        • often the “ | “ symbol

30
New cards

Categories of Data analytics techniques (4)

  • descriptive

  • diagnostic

  • predictive

  • prescriptive

31
New cards

Descriptive Analytics

Answers the question of “what happened” or “what is happening”

  • ex: financial ratios (ROE, Debt to Equity)

32
New cards

Diagnostic analytics

Answers the question of “why did this happen”

  • identifies cause/effect relationships

  • ex: determine if increasing the IT budget in an organization increases employee efficiency and effectiveness

33
New cards

Predictive Analytics

Answers the question of “what might happen in the future”

  • used to forecast events like stock prices or currency exchange rates

34
New cards

Prescriptive Analytics

Answers the question of "what should be done”

  • ex: can provide recommendations like whether or not to grant a loan

35
New cards

Correlation vs. Causation

Correlation - indicates that two things occur at the same time

Causation - implies that one event causes another

36
New cards

How to share results

Data Storytelling = process of transforming complex data analyses into simpler terms to aid in better decision making

  • consider the objectives of the stakeholders, design the story to address those objectives

  • consider the audience’s needs and preferences

Data visualization = the use of a graphical representation of data to convey meaning

  • usually displayed on a data dashboard

37
New cards

5 Attributes of High-Quality Data

  • Accurate = correct, free from error, accurately represent events and activities

  • Complete = do not omit aspects of events or activities

  • Consistent = presented in the same format over time

  • Timely = provided in time for decision makers to make decisions

  • Valid = data measure what it is intended to measure

38
New cards

Data Structure

Data structure = how data are organized and stored, including the relationships between different fields

Data structuring = the process of organization to prepare data for analysis

  • 3 ways data can be changed: Aggregate data, Data joining, Data Pivoting

39
New cards

Aggregate Data (one of the ways data can be structured)

A summary of the data with fewer details than in the originals set

Ex: remove the detail of each sale, but provides a summary of sales data by manager

40
New cards

Data Joining (a way data can be structured)

Joining data from different tables into one single table for analysis

41
New cards

Data Pivoting (a way data can be structured)

Rotating data from rows to columns so as to make certain aspects of the data easier to interpret

42
New cards

Data Standardization

The process of standardizing the structure and meaning of each data element so it can be analyzed and used in decision making.

  • made up of: parsing, concatenation, cryptic data values, misfielded data values, data formatting, data consistency

43
New cards

Data parsing

Separating data from a single field into multiple fields

  • Ex: Splitting into separate fields something like N-0504-2002

44
New cards

Data Concatenation

Process of combining data from two or more fields into a single field

  • Ex: Taking columns with first, middle, last names and pulling them all into one field

45
New cards

Cryptic Data Values

Data items that have no apparent meaning without knowledge of the underlying coding scheme

  • Ex: consulting firm may use the numbers 1, 2, 3 to represent partner, senior consultant, and research analyst in its database

46
New cards

Misfielded Data Values

Data values that are correctly formatted but not listed in the correct field

  • Ex: City field contains “Germany”

47
New cards

Data Formatting

Data values should have the same format as all other values in the same field and match the data type specified in the data dictionary

48
New cards

Data Consistency

The principle that every value in a field should be stored in the same way

49
New cards

Data Cleaning

Involves updating data to ensure they are consistent, accurate, and complete

50
New cards

Data De-Duplication

The process of analyzing data and removing two or more records that contain identical information

51
New cards

Data Filtering

The process of removing records or fields of information from a data source

  • Ex: company periodically removes data about potential vendors that they’ve decided they are not going to purchase from (inactive)

52
New cards

Data imputation

The process of replacing a null or missing value with a substituted value

  • only works with numeric data

53
New cards

Data Contradiction Errors

Occur when the same entity is described in two conflicting ways

  • a data file contains information about a manufacturing plant in two different records, but the physical address of the manufacturing plant is different in each record

54
New cards

Data Threshold Violations

Occurs when a data value falls outside an allowable level

55
New cards

Data Entry Errors

All types of errors that come from inputting data incorrectly

  • misspelling words, transposing digits in numeric strings, failing to enter data

56
New cards

Data Validation - Four methods

The process of analyzing data to ensure that they have the properties of high-quality data before, during, and after the transformation process. Can be formal and informal

  • visual inspection

  • basic statistical tests

  • auditing a sample

  • advanced testing techniques

57
New cards

Visual Inspection

A simple technique that involves examining data using human vision to detect potential issues

58
New cards

Basic Statistical Tests

  • calculate minimum, maximum, mean, median, sum for each field

    • examine for unreasonable values

59
New cards

Audit a sample - a method of data validation

one of the best ways to ensure data quality

  • set up a sample of rows, columns and carefully analyze each value to ensure its validity

  • trace back to original source documents

60
New cards

Advanced Testing Techniques

With a deep understanding of the data, one can leverage that knowledge to confirm that the transformed data adhere to accounting rules, which provides evidence of the quality of the data

61
New cards

Four Categories of Data Analysis

Descriptive analytics, Diagnostic analytics, Predictive analytics, Prescriptive analytics

62
New cards

Descriptive Analytics - 4 items important to undeststand

  • identify mistakes in the data

  • understand the structure of the data

  • check assumptions required by more formal statistical modeling techniques

  • determine the size, direction, and strength of relationships between variables

63
New cards

5 Whys Principle

It often requires asking “why” five times to uncover the true reason why something happened

64
New cards

Diagnostic analysis - basic process of testing hypothesis

1- state a null and alternative hypothesis

2- select a level of significance for refuting the null hypothesis

3- collect a sample of data to compute the probability value

4- compare the computed probability against the level of significance and determine if the evidence refutes the null. Failing to refute the null hypothesis is seen as support of the alternative hypothesis

65
New cards

Null hypothesis vs alternative hypothesis

Null = a statement of equality- suggesting there is no relationship among the ideas

  • in stats, we try to find enough evidence to reject the null (we dont try to prove the null)

  • Ex: The new layout of the store had NO effect on spending.

Alternative = a statement of inequality - suggests that one idea is related to another idea

  • Ex: The new layout increased customer spending

66
New cards

Predictive analysis steps

Definition: utilizes historical data to identify patterns that are likely to repeat themselves in the future

1- define the desired outcome to predict

2- identify and prepare the relevant data for analysis

3- develop and validate a predictive model

67
New cards

Training Data vs. Testing Data

training = used to create the model for future prediction

test = used to assess how well the model predicts the target outcome

  • these data sets cannot be the same

  • data overfitting = when the model fits the training data very well but does not predict well when applied to other datasets

68
New cards

Machine Learning Types

Machine learning = an application of artificial intelligence that allows computer systems to improve and update predictions by using algorithms and statistical models to analyze and draw inferences from data

3 types:

1- supervised

2- unsupervised

3- reinforcement

69
New cards

Supervised machine learning

A type of machine learning algorithm in which the model is trained using a labeled dataset

  • This helps the model to classify and predict outcomes accurately

  • Helps the computer learn general rules that maps inputs into the correct outputs using the labeled training data

70
New cards

Unsupervised machine learning

A type of machine learning algorithm that uses unlabeled datasets to analyze datasets

71
New cards

Reinforcement machine learning

A type of machine learning algorithm that is trained by only rewarding or punishing the algorithm as it tries to achieve a desired outcome

72
New cards

Common problems in data analytics

  • Garbage in, garbage out = you must start with high quality data if you are to have useful analysis

  • Extrapolation = a process of estimating a value beyond the range of the data used to create the model

    • Data overfitting = when a model is too closely matched to a specific set of data (training data) and is unable to actually predict future observations

73
New cards

5 Main Purposes for Visualization

1- comparison (bar chart)

2- Correlation (scatterplot)

3- Distribution (histogram)

4- Trend Evaluation (line chart)

5- part-to-whole (pie chart)

<p>1- comparison (bar chart)</p><p>2- Correlation (scatterplot)</p><p>3- Distribution (histogram)</p><p>4- Trend Evaluation (line chart)</p><p>5- part-to-whole (pie chart)</p>
74
New cards

3 design principles for high quality visualizations

Simplification - in design, making a visualization easy to interpret and understand

Emphasis - assuring the most important message is easily identifiable

Ethical data presentation - avoiding the intentional or unintentional use of deceptive practices that can alter the user’s understanding of the data being presented

75
New cards

Explore top notes

note
Unit 4 - Chapter 6
Updated 909d ago
0.0(0)
note
2.5: air pollution
Updated 1206d ago
0.0(0)
note
Chapter 14: Postmortem Examination
Updated 1083d ago
0.0(0)
note
The Krebs Cycle
Updated 1200d ago
0.0(0)
note
🦅 APUSH Unit 2 Notes
Updated 179d ago
0.0(0)
note
Chapter 16: Cytoskeleton
Updated 990d ago
0.0(0)
note
AP World History - Ultimate Guide
Updated 325d ago
0.0(0)
note
Chapter 12: Social Psychology
Updated 1283d ago
0.0(0)
note
Unit 4 - Chapter 6
Updated 909d ago
0.0(0)
note
2.5: air pollution
Updated 1206d ago
0.0(0)
note
Chapter 14: Postmortem Examination
Updated 1083d ago
0.0(0)
note
The Krebs Cycle
Updated 1200d ago
0.0(0)
note
🦅 APUSH Unit 2 Notes
Updated 179d ago
0.0(0)
note
Chapter 16: Cytoskeleton
Updated 990d ago
0.0(0)
note
AP World History - Ultimate Guide
Updated 325d ago
0.0(0)
note
Chapter 12: Social Psychology
Updated 1283d ago
0.0(0)

Explore top flashcards

flashcards
AP Lit American Year Vocab
166
Updated 1193d ago
0.0(0)
flashcards
Japanese Year 11 ATAR
182
Updated 1021d ago
0.0(0)
flashcards
CCHS Meteorology Brown
41
Updated 869d ago
0.0(0)
flashcards
Semester 1 Final: Names
37
Updated 1199d ago
0.0(0)
flashcards
G8 U4
22
Updated 471d ago
0.0(0)
flashcards
Unit 9: Protein Synthesis
32
Updated 377d ago
0.0(0)
flashcards
year 11 biology
297
Updated 422d ago
0.0(0)
flashcards
AP Lit American Year Vocab
166
Updated 1193d ago
0.0(0)
flashcards
Japanese Year 11 ATAR
182
Updated 1021d ago
0.0(0)
flashcards
CCHS Meteorology Brown
41
Updated 869d ago
0.0(0)
flashcards
Semester 1 Final: Names
37
Updated 1199d ago
0.0(0)
flashcards
G8 U4
22
Updated 471d ago
0.0(0)
flashcards
Unit 9: Protein Synthesis
32
Updated 377d ago
0.0(0)
flashcards
year 11 biology
297
Updated 422d ago
0.0(0)