Preattentive Attributes
features that can be used in a data visualization to reduce the cognitive load required to interpret it
Examples include color, shape, size, and length
Data Ink
the ink used in a table or chart that is necessary to convey the meaning of the data to the audience
Data-Ink Ratio
the proportion of ink used for data to the total amount of ink in a table or chart (high ratio is good)
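Written as a formula, the definition above looks like this (the wording inside the fraction is only a restatement of the definition, not notation from the source):

```latex
\[
\text{data-ink ratio} = \frac{\text{data ink}}{\text{total ink used in the table or chart}}
\]
```

A ratio closer to 1 means less ink is spent on elements that do not convey data.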
Decluttering
the process of increasing the data-ink ratio in a chart
Table Design Principles
Keep the data-ink ratio high
Use lines only to separate labels from data and calculated fields
Labels should be left-aligned
Values should be right-aligned
Center vertical labels
Crosstabulation
a tabular summary of data for two variables
Pivot Table
a crosstabulation in Excel
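A minimal sketch of a crosstabulation outside Excel, using only the Python standard library. The `sales` records and the region and product values are made up for illustration; the idea is simply to count how often each pair of values of the two variables occurs.

```python
from collections import Counter

# Hypothetical records: one (region, product) pair per sale.
sales = [
    ("East", "Widget"),
    ("East", "Gadget"),
    ("West", "Widget"),
    ("East", "Widget"),
    ("West", "Gadget"),
    ("West", "Widget"),
]

# Count how often each (row value, column value) pair occurs.
counts = Counter(sales)

# Distinct values of each variable become the row and column labels.
rows = sorted({region for region, _ in sales})
cols = sorted({product for _, product in sales})

# Crosstabulation as a nested dict: table[row][col] -> frequency.
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```

Each cell of `table` holds the number of records that share that row value and that column value, which is the same summary a pivot table of counts would show.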
Scatter Charts
a graphical presentation of the relationships between two quantitative variables
Multiple Line Charts
Line Chart
Bar Charts
Stacked Column Chart
Clustered Column Chart
Pie Charts
usually a poor choice for data visualization because they make inefficient use of the preattentive attributes of color and size
Data Dashboards
a data visualization tool that illustrates multiple KPIs and automatically updates as new data becomes available
KPIs
key performance indicators: metrics that indicate current operating characteristics
Raw Data
data that has NOT been processed or prepared for analysis; also called source data or primary data
Data Wrangling
cleaning, transforming, and managing raw data so that it is more reliable and can be more easily accessed and used for analysis
Data Wrangling Objective
to produce a final dataset that is accurate, reliable, and accessible
Data Wrangling Activities
Merging multiple data sources
Identifying missing values
Identifying erroneous values
Deleting duplicate observations
Identifying extreme values (outliers)
Subsetting data
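Several of the activities listed above can be sketched in plain Python. The records, the field names, and the 0 to 120 plausible-age rule are all hypothetical and chosen only for illustration; merging sources and outlier detection are left out to keep the sketch short.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "age": 34, "city": "Austin"},
    {"id": 2, "age": None, "city": "Boston"},
    {"id": 3, "age": 29, "city": "Austin"},
    {"id": 3, "age": 29, "city": "Austin"},   # duplicate observation
    {"id": 4, "age": 31, "city": "Denver"},
    {"id": 5, "age": 240, "city": "Boston"},  # implausible value
    {"id": 6, "age": 36, "city": "Denver"},
]

# Deleting duplicate observations: keep the first copy of each record.
seen = set()
deduped = []
for rec in raw:
    key = tuple(sorted(rec.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(rec)

# Identifying missing values: records with no age recorded.
missing_ids = [rec["id"] for rec in deduped if rec["age"] is None]

# Identifying erroneous values: ages outside an assumed plausible range.
erroneous_ids = [
    rec["id"]
    for rec in deduped
    if rec["age"] is not None and not (0 <= rec["age"] <= 120)
]

# Subsetting data: usable records for one city only.
flagged = set(missing_ids) | set(erroneous_ids)
austin = [
    rec for rec in deduped
    if rec["city"] == "Austin" and rec["id"] not in flagged
]

print(missing_ids, erroneous_ids, [rec["id"] for rec in austin])
```

Flagging problem records by id, rather than deleting them immediately, leaves the decision about how to handle them to a later step.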
Data Wrangling 6 steps
Discovery, structuring, cleaning, enriching, validating, publishing
Structured Data
data arranged in a predetermined pattern to make them easy to manage and search (a flat file is the most common pattern)
Unstructured Data
data not arranged in a predetermined pattern
Semi-structured Data
data not organized as structured data but containing elements that allow some raw data elements to be isolated
Stacked Data
organized so that the values for each variable are stored in a single field
Unstacked Data
organized so that the values for one variable correspond to separate fields
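The difference between the stacked and unstacked layouts is easier to see with a small conversion. This is a sketch with made-up group labels and scores: in the stacked form all scores sit in one field, and in the unstacked form each group gets its own field.

```python
# Stacked: every score is stored in a single "score" field, and a
# second field records which group the score belongs to.
stacked = [
    ("A", 10),
    ("A", 12),
    ("B", 7),
    ("B", 9),
]

# Unstacked: the same scores, split into one field per group.
unstacked = {}
for group, score in stacked:
    unstacked.setdefault(group, []).append(score)

# Stacking again restores the original layout.
restacked = [
    (group, score)
    for group, scores in unstacked.items()
    for score in scores
]

print(unstacked)
print(restacked == stacked)
```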
Legitimately Missing Data
a value of a field is missing because of an appropriate reason
Illegitimately missing data and how to handle them
1. Discard records (rows) with any missing values
2. Discard any field with missing values
3. Fill in missing entries with estimated values
4. Apply a data-mining algorithm that can handle missing values
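The first three options above can be sketched as follows. The records and field names are hypothetical, and filling with the mean of the observed values is only one possible way to estimate a missing entry.

```python
from statistics import mean

# Hypothetical records; None marks an illegitimately missing value.
records = [
    {"height": 170.0, "weight": 65.0},
    {"height": None, "weight": 72.0},
    {"height": 180.0, "weight": 80.0},
]
fields = ["height", "weight"]

# Option 1: discard records (rows) with any missing values.
complete_rows = [
    r for r in records if all(r[f] is not None for f in fields)
]

# Option 2: discard any field that has a missing value.
complete_fields = [
    f for f in fields if all(r[f] is not None for r in records)
]
trimmed = [{f: r[f] for f in complete_fields} for r in records]

# Option 3: fill in missing entries with an estimate
# (here, the mean of the observed values in the same field).
observed = [r["height"] for r in records if r["height"] is not None]
estimate = mean(observed)
imputed = [
    {**r, "height": estimate if r["height"] is None else r["height"]}
    for r in records
]

print(len(complete_rows), complete_fields, imputed[1]["height"])
```

Option 1 loses a whole record, option 2 loses a whole field, and option 3 keeps everything at the cost of introducing an estimated value.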
Data Dictionary
documentation that should include several attributes: the names, definitions, and units of measure used in the fields; the raw data sources and relationship(s) with other data; and miscellaneous attributes
Internal Controls Definition
a process, implemented and managed by an entity’s board of directors, management, and other personnel, designed to provide reasonable assurance regarding the achievement of objectives in the following categories:
Effectiveness and efficiency of operations
Reliability of financial reporting
Compliance with applicable laws and regulations
Risk
the potential for loss, damage or destruction of an asset as a result of a threat exploiting a vulnerability
Assessment Criteria of Risk
Likelihood (probability)
Impact (severity)
Likelihood (probability)
How likely is the event to happen?
Impact (severity)
Assuming it happens, how bad is it?
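One common way to combine the two criteria is to rate each on a small scale and multiply them into a single score. The source does not prescribe this; the multiplication rule, the 1 to 5 scales, and the example risks below are all assumptions made for illustration.

```python
# Hypothetical risks, each rated as (likelihood, impact) on a 1-5 scale.
risks = {
    "server outage": (2, 5),
    "data entry error": (4, 2),
    "vendor default": (1, 4),
}

# Assumed scoring rule: score = likelihood * impact.
scores = {
    name: likelihood * impact
    for name, (likelihood, impact) in risks.items()
}

# Rank risks from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, scores[name])
```

A ranking like this only orders risks for attention; the ratings themselves remain judgment calls.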
4 Purposes of Internal Controls
Safeguard assets
Ensure reliable financial reporting
Promote operating efficiency
Encourage compliance with management directives (and the law)
Brown's Taxonomy: 4 Categories of Risk (a way to think about risk)
Financial risk
Operational risk
Strategic risk
Hazard risk
A risk can fit into multiple categories
Examples of Controls
Adequate documentation, background checks, firewalls, internal audits, password policies, job rotation, video surveillance, data encryption, computer backup
rows, records, observations
columns
field names
flat file
table
Discovery
become familiar with the raw data and conceptualize its use
Structuring
arrange the raw data to be more readily analyzed
Cleaning
find and correct errors in the raw data that might distort the ensuing analyses
Enriching
incorporate values from other data sets and/or apply transformations to portions of the existing data
Validating
verify that the wrangled data are accurate, reliable and ready for the ensuing analysis
Publishing
create a file containing the wrangled data and documentation and make it available to its intended users
Financial Risks
market risk, credit risk, liquidity risk
Operational Risks
systems risk, human error risk
Strategic Risks
legal and regulatory risks, business strategy risk
Hazard Risks
directors and officers liability risk