Accounting Information Systems Exam 2

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/74

There's no tags or description

Looks like no tags are added yet.

Last updated 2:51 AM on 3/27/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

75 Terms

New cards

Database Management System (DBMS)

a software program that manages and regulates the data, as well as the communication between the data and the application programs that access the data stored in the database.

New cards

Database System

The database, the DBMS, and the application programs that access the database through the DBMS

New cards

Data Warehouse

One or more very large databases containing both detailed and summarized data from various sources that are used for analytical processing rather than transaction processing

allows the online transaction processing database to process 1 million+ transactions per minute

New cards

Business Intelligence

Analyzing large amounts of data for strategic decision making
- Two main business intelligence techniques: Online analytical processing and data mining

New cards

Online Analytical Processing

Using queries to investigate hypothesized relationships among data

one of the two main techniques used in business intelligence
- Ex: manager may review supplier purhcases for past 3 years, followed by additional queries that drill down to see prucases by item number and by fiscal period

New cards

Data mining

uses advanced statistical analysis, including AI techniques such as neural networks, to identify non hypothesized relationships in the data

ex: credit card companies use data mining to identify usage patterns that may indicate fraud, and to identify relationships in sales data that can be sued for future promotions

New cards

Advantages of database systems

Data integration - previously separate application files are combines into large pools of data that can be accessed by multiple application programs, such as an employee database that combines payroll, personnel, and job skills files
Data sharing - integrated data can be easily shared with authorized users through centralized data storage, making it easy to research a problem or obtain detailed information from a report
Minimal data redundancy and data inconsistencies - by storing data items only once, data redundancy and data inconsistencies are minimized
Data independence - programs and data are independent of each other, allowing each to be changed without affecting the other, which simplifies programming and data management
Cross-functinoal analysis - relationships, such as the connection between selling costs and promotional campaigns, can be defined and used in management reports

New cards

Logical View vs. physical view

logical view = how people actually conceptually organize and understand the relationships among data items
- ex: a sales manager views all customer information as being stored in a table
physical view = refers to the way data are physically arranged and stored in the computer system

New cards

Schemas definition

a description of the data elements in a database, the relationships among them, and the logical model used to organize and describe the data
Think of a schema as the architectural drawing for a building: it shows where the walls (tables), rooms (records), and plumbing (connections) go, but it isn't the furniture inside.

New cards

The three levels of schemas

the external-level schema = a user’s logical view of their portion of a database as noted, each of which is referred to as a subschema
conceptual-level schema = the organization-wide view of the entire database, lists all data elements and the relationships among them
Internal-level schema = includes detailed view of the database, describes how the data are stored and accessed, including record layouts, definitions, addresses, and indexes

<ul><li><p>the external-level schema = a user’s logical view of their portion of a database as noted, each of which is referred to as a subschema</p></li><li><p>conceptual-level schema = the organization-wide view of the entire database, lists all data elements and the relationships among them</p></li><li><p>Internal-level schema = includes detailed view of the database, describes how the data are stored and accessed, including record layouts, definitions, addresses, and indexes</p></li></ul><p></p>

New cards

How are schemas used by the database management system

the DBMS uses the mappings (schemas) to translate a user’s or program’s request for data (expressed in terms of logical names and relationships) into the indexes and addresses needed to physically access the data

New cards

The five languages of the DBMS

Data definition language (DDL): builds the data dictionary, creates the database, describes logical views for each user, and specifies record or field security constraints
- CREATE, DROP, ALTER
Structured Query Language (SQL): standardized commercial programming language designed for managing data in relational database systems. Though it is standardized, variations exist among different database systems
Data manipulation language (DML): changes database content, including data, updates, insertions, and deletions
- Ex: INSERT, UPDATE, TRUNCATE, DROP
Data Query Language (DQL): a high-level, English-like languages that contains powerful, east-to-use commands that enable users to retrieve, sort, order, and display data
- Ex: SELECT
Report writer - uses standard English to explain how a query is fulfilled

New cards

Relational Databases

A collection of two-dimensional tables with each table representing an object about which we wish to collect and store information

each row is called a tuple, contains data about a specific occurrence of an entity
each column contains data about an attribute of that entity

Relational data model represents conceptual and external-level schemas as if they were represented in tables, though they are actually represented as if they were internal-level schemas

New cards

Primary key vs. Foreign key - relational databases

primary key = the database attribute, or combination of attributes, that uniquely identifies a specific row in a table
- The one piece of data that is guaranteed to be unique for every single record
- Can never be empty (null)
- Ex: Student ID number - even if two students are named Alex Smith and were born on the same day, they’ll still have different student IDs
foreign key = an attribute in one table that is a primary key in another table AND is used to link the two tables
- does NOT have to be unique
- creates a relationship between two tables
- Ex: The student ID in the Class Enrollment table links back to the Student table. This allows us to track which student is in which class, rather than having to type in the student name, date of birth, address, major, etc.

New cards

The 4 Vs of big data

Data Volume = refers to the amount of data created and stored by an organization
Data Variety = refers to the different forms data can take
Data Velocity = refers to the speed at which data are created and stored by an organization
Data Veracity = refers to the quality and trustworthiness of data

New cards

EY definition of analytics mindset

the ability to ask the right questions, extract and transform relevant data (ETL), apply appropriate analytical techniques, and interpret/share results with stakeholders

New cards

SMART objectives when asking a question

Specific - direct and focused to produce a meaningful answer

Measurable - amenable to data analysis. The inputs to answering the question must be measurable with data

Achievable - able to be answered, and the answer should result in a decision maker taking action

Relevant - related to the objectives of the organization or the situation under consideration

Timely - have a defined time horizon for answering

New cards

ETL

Extract

Transform

Load

New cards

Steps to extract data

Extract - 3 steps

understanding the data needs and the data that is available
performing the data extraction
verifying the data extraction quality and documenting what was done

New cards

Steps to transform data

Transform - 4 steps

understand the data and the desired outcome
standardize, structure, clean the data
validate data quality and verify that data meet data requirements
document the transformation process

New cards

Steps to load data

Load -

transformed data must be stored in a format and structure compatible with the receiving software
save data as a text file using delimiters, put them in relational table form
strip out, standardize most formatting (remove commas, use standard date format, use minus sign to indicate negative)

New cards

Structured data

Data that are highly organized and fit into fixed fields. Data warehouses typically only store structured data

accounting data
relational data
most spreadsheet data

New cards

Unstructured Data

Have no uniform structure and include items such as images, audio files, documents, tweets, e-mails, videos, presentations

New cards

Semi-structured data

Data that are organized to some extent but not enough to be inserted into a relational database

New cards

Data Mart

smaller (than data warehouses) data repositories holding structured data

ex: company creates data mart for all data in geographic regions (N. America, S. America)
Ex: company creates data mart for each division (sales, marketing)

New cards

Data Lake

A collection of structured, semi-structured, and unstructured data stored in a single location

can cause problems if it becomes too large and if important data gets lost (dark data)

New cards

Dark data

Data the organization has collected and stored that would be useful for analysis but is not analyzed or organized and is thus generally ignored

New cards

Data Swamp

Form when data lakes are not accurately documented, making it so that their stored data cannot be identified and analyzed

disorganized, impossible to use

New cards

Delimiters and text qualifiers in flat files

flat file = text file that consolidates data from multiple tables or sources into a single row
- make it easier to find and analyze data as opposed to separate files
- A DELIMITER (aka field separator) is used to distinguish between fields in a single line
  - A character/series of characters that separates one field from another
    - often the “ | “ symbol

New cards

Categories of Data analytics techniques (4)

descriptive
diagnostic
predictive
prescriptive

New cards

Descriptive Analytics

Answers the question of “what happened” or “what is happening”

ex: financial ratios (ROE, Debt to Equity)

New cards

Diagnostic analytics

Answers the question of “why did this happen”

identifies cause/effect relationships
ex: determine if increasing the IT budget in an organization increases employee efficiency and effectiveness

New cards

Predictive Analytics

Answers the question of “what might happen in the future”

used to forecast events like stock prices or currency exchange rates

New cards

Prescriptive Analytics

Answers the question of "what should be done”

ex: can provide recommendations like whether or not to grant a loan

New cards

Correlation vs. Causation

Correlation - indicates that two things occur at the same time

Causation - implies that one event causes another

New cards

How to share results

Data Storytelling = process of transforming complex data analyses into simpler terms to aid in better decision making

consider the objectives of the stakeholders, design the story to address those objectives
consider the audience’s needs and preferences

Data visualization = the use of a graphical representation of data to convey meaning

usually displayed on a data dashboard

New cards

5 Attributes of High-Quality Data

Accurate = correct, free from error, accurately represent events and activities
Complete = do not omit aspects of events or activities
Consistent = presented in the same format over time
Timely = provided in time for decision makers to make decisions
Valid = data measure what it is intended to measure

New cards

Data Structure

Data structure = how data are organized and stored, including the relationships between different fields

Data structuring = the process of organization to prepare data for analysis

3 ways data can be changed: Aggregate data, Data joining, Data Pivoting

New cards

Aggregate Data (one of the ways data can be structured)

A summary of the data with fewer details than in the originals set

Ex: remove the detail of each sale, but provides a summary of sales data by manager

New cards

Data Joining (a way data can be structured)

Joining data from different tables into one single table for analysis

New cards

Data Pivoting (a way data can be structured)

Rotating data from rows to columns so as to make certain aspects of the data easier to interpret

New cards

Data Standardization

The process of standardizing the structure and meaning of each data element so it can be analyzed and used in decision making.

made up of: parsing, concatenation, cryptic data values, misfielded data values, data formatting, data consistency

New cards

Data parsing

Separating data from a single field into multiple fields

Ex: Splitting into separate fields something like N-0504-2002

New cards

Data Concatenation

Process of combining data from two or more fields into a single field

Ex: Taking columns with first, middle, last names and pulling them all into one field

New cards

Cryptic Data Values

Data items that have no apparent meaning without knowledge of the underlying coding scheme

Ex: consulting firm may use the numbers 1, 2, 3 to represent partner, senior consultant, and research analyst in its database

New cards

Misfielded Data Values

Data values that are correctly formatted but not listed in the correct field

Ex: City field contains “Germany”

New cards

Data Formatting

Data values should have the same format as all other values in the same field and match the data type specified in the data dictionary

New cards

Data Consistency

The principle that every value in a field should be stored in the same way

New cards

Data Cleaning

Involves updating data to ensure they are consistent, accurate, and complete

New cards

Data De-Duplication

The process of analyzing data and removing two or more records that contain identical information

New cards

Data Filtering

The process of removing records or fields of information from a data source

Ex: company periodically removes data about potential vendors that they’ve decided they are not going to purchase from (inactive)

New cards

Data imputation

The process of replacing a null or missing value with a substituted value

only works with numeric data

New cards

Data Contradiction Errors

Occur when the same entity is described in two conflicting ways

a data file contains information about a manufacturing plant in two different records, but the physical address of the manufacturing plant is different in each record

New cards

Data Threshold Violations

Occurs when a data value falls outside an allowable level

New cards

Data Entry Errors

All types of errors that come from inputting data incorrectly

misspelling words, transposing digits in numeric strings, failing to enter data

New cards

Data Validation - Four methods

The process of analyzing data to ensure that they have the properties of high-quality data before, during, and after the transformation process. Can be formal and informal

visual inspection
basic statistical tests
auditing a sample
advanced testing techniques

New cards

Visual Inspection

A simple technique that involves examining data using human vision to detect potential issues

New cards

Basic Statistical Tests

calculate minimum, maximum, mean, median, sum for each field
- examine for unreasonable values

New cards

Audit a sample - a method of data validation

one of the best ways to ensure data quality

set up a sample of rows, columns and carefully analyze each value to ensure its validity
trace back to original source documents

New cards

Advanced Testing Techniques

With a deep understanding of the data, one can leverage that knowledge to confirm that the transformed data adhere to accounting rules, which provides evidence of the quality of the data

New cards

Four Categories of Data Analysis

Descriptive analytics, Diagnostic analytics, Predictive analytics, Prescriptive analytics

New cards

Descriptive Analytics - 4 items important to undeststand

identify mistakes in the data
understand the structure of the data
check assumptions required by more formal statistical modeling techniques
determine the size, direction, and strength of relationships between variables

New cards

5 Whys Principle

It often requires asking “why” five times to uncover the true reason why something happened

New cards

Diagnostic analysis - basic process of testing hypothesis

1- state a null and alternative hypothesis

2- select a level of significance for refuting the null hypothesis

3- collect a sample of data to compute the probability value

4- compare the computed probability against the level of significance and determine if the evidence refutes the null. Failing to refute the null hypothesis is seen as support of the alternative hypothesis

New cards

Null hypothesis vs alternative hypothesis

Null = a statement of equality- suggesting there is no relationship among the ideas

in stats, we try to find enough evidence to reject the null (we dont try to prove the null)
Ex: The new layout of the store had NO effect on spending.

Alternative = a statement of inequality - suggests that one idea is related to another idea

Ex: The new layout increased customer spending

New cards

Predictive analysis steps

Definition: utilizes historical data to identify patterns that are likely to repeat themselves in the future

1- define the desired outcome to predict

2- identify and prepare the relevant data for analysis

3- develop and validate a predictive model

New cards

Training Data vs. Testing Data

training = used to create the model for future prediction

test = used to assess how well the model predicts the target outcome

these data sets cannot be the same
data overfitting = when the model fits the training data very well but does not predict well when applied to other datasets

New cards

Machine Learning Types

Machine learning = an application of artificial intelligence that allows computer systems to improve and update predictions by using algorithms and statistical models to analyze and draw inferences from data

3 types:

1- supervised

2- unsupervised

3- reinforcement

New cards

Supervised machine learning

A type of machine learning algorithm in which the model is trained using a labeled dataset

This helps the model to classify and predict outcomes accurately
Helps the computer learn general rules that maps inputs into the correct outputs using the labeled training data

New cards

Unsupervised machine learning

A type of machine learning algorithm that uses unlabeled datasets to analyze datasets

New cards

Reinforcement machine learning

A type of machine learning algorithm that is trained by only rewarding or punishing the algorithm as it tries to achieve a desired outcome

New cards

Common problems in data analytics

Garbage in, garbage out = you must start with high quality data if you are to have useful analysis
Extrapolation = a process of estimating a value beyond the range of the data used to create the model
- Data overfitting = when a model is too closely matched to a specific set of data (training data) and is unable to actually predict future observations

New cards

5 Main Purposes for Visualization

1- comparison (bar chart)

2- Correlation (scatterplot)

3- Distribution (histogram)

4- Trend Evaluation (line chart)

5- part-to-whole (pie chart)

New cards

3 design principles for high quality visualizations

Simplification - in design, making a visualization easy to interpret and understand

Emphasis - assuring the most important message is easily identifiable

Ethical data presentation - avoiding the intentional or unintentional use of deceptive practices that can alter the user’s understanding of the data being presented

New cards