General Rules About Pie Charts
1. appropriate for non technical audiences
2. widely used
3. few categories
4. when the categories add up to 100%
5. add labels for percentage
6. Keep it simple
Time
goal is to see the evolution of one or several quantities over time.
- often referred to as time-series data
how to visualize discrete points in time?
- bar graph
- stacked bar graph
- points
Continuous time point
- time series chart
- step chart
why visualize data?
1. Find patterns and see data in context
2. Expand memory
3. Make data accessible to everyone
4. Answer questions (or discover them)
5. Make decisions / Persuade others to make decisions
How to see distributions?
- histogram
- continuous density plot
- box plot
Microsoft Excel Pros
Supports processing of data
Compatible with Word and PowerPoint
Relatively easy to learn
Widely used
Excel Cons
Good for basic visualization - not interactive
Requires customization to adhere to design standards
May not process large datasets (~1 GB)
GEOSPATIAL VISUALIZATION TOOLS
ArcGIS
- Built for desktop mapping
- User interface, no coding required
- Used by professional cartographers, graphics departments
Google, Yahoo, and Microsoft Maps
- Easiest online solution, requires some programming
Modest Maps
- Flash and ActionScript library for tile-based maps, coding required
Other tools such as Tableau and R
PROGRAMMING TOOLS - R
Pros
Free open-source statistical programming language
Built and maintained for statisticians by statisticians.
Capable of both data analysis and data graphics
Libraries used for visualization in R: Graphics, ggplot2, car, lattice, ndtv, plotly
Can write your own functions and packages to make graphics the way you want
Cons
Default chart outputs require design refinements: lack of titles, undefined scales for axes
Use R to create graphs, then edit and refine them using design software: Adobe Illustrator, Inkscape
R is great for exploratory data visualization (analysis) but may not be the best tool for explanatory data visualization (presenting results and storytelling)
PROGRAMMING TOOLS - PYTHON
Pros
Can handle large amounts of data without crashing.
Useful for analyses and heavy computation
Clean and easy to read syntax
Some of Python's data visualization libraries: Matplotlib, seaborn, geoplotlib, ggplot
Cons
Great starting point for data exploration, not very good aesthetically
PROGRAMMING TOOLS - JAVASCRIPT
PROS
- web-based scripting language
- some JavaScript libraries: D3, rCharts, Highcharts, Chart.js, dimple.js
- freely available and allow users to create sophisticated web-based visualizations
CONS
- steep learning curve
- require skills in working with HTML and JSON
CHECKLIST FOR DATA VISUALIZATION TOOLS
- preparing data before visualization
- integration
- ease of use
- ease of collaboration
- visualization types
- communication
- performance
- privacy
- price
categorical data
Data that consists of names, labels, or other nonnumerical values
ordinal data
data exists in categories that are ordered but differences cannot be determined or they are meaningless. (Example: 1st, 2nd, 3rd)
Quantitative data collection
Measurable, using only factual content
Exploratory
testing a hypothesis (visual confirmation) and mining for patterns, trends, and anomalies (visual exploration)
Explanatory
usually simple everyday visualizations (line charts, bar charts, pies, and scatter plots) conveying a single message
NINE VISUAL CUES
position, length, angle, direction, shapes, area, volume, color saturation, color hue
VISUAL CUES - POSITION
-commonly used on scatter plots
-you compare values based on where they are placed relative to others in the coordinate system
-easy to notice outliers and clustering
VISUAL CUES - LENGTH
- Commonly used on bar charts
- Length of bars in bar graph provides visual cues
- The longer the bar, the greater the absolute value
- Start the axis at zero as people visually compare the distance from 0 to the end of the bar
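A minimal sketch of why the axis should start at zero. The values (50 and 55) and the helper name `drawn_length_ratio` are made up for illustration: it compares how much longer one bar is drawn than the other when the axis starts at zero versus at 45.

```python
def drawn_length_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar lengths for values a and b on an axis that starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Illustrative values: the real difference between 50 and 55 is 10%.
honest = drawn_length_ratio(50, 55)                     # axis starts at zero
truncated = drawn_length_ratio(50, 55, axis_start=45)   # axis starts at 45

print(honest, truncated)
```

With the truncated axis the second bar is drawn twice as long as the first, even though the underlying value is only 10% larger.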
VISUAL CUES - ANGLES
- Commonly used for pie charts
- Commonly used to represent parts of a whole
- Donut charts do not use angles since the center of the circle is cut out; arc lengths are used as the visual cue
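A small sketch of how parts of a whole map to angles, assuming each wedge gets a share of the full 360 degrees proportional to its value. The category counts and the function name `wedge_angles` are hypothetical.

```python
def wedge_angles(counts):
    """Map each category to its wedge angle in degrees (shares of 360)."""
    total = sum(counts.values())
    return {name: 360.0 * value / total for name, value in counts.items()}

# Made-up category counts that add up to a whole.
counts = {"A": 50, "B": 30, "C": 20}
angles = wedge_angles(counts)

for name, angle in angles.items():
    print(name, angle)
```

The angles always sum to 360 degrees, which is why a pie chart only makes sense when the categories add up to 100%.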
VISUAL CUES - DIRECTION
-Commonly noticed in line graphs
-Direction provides one basic visual cue
-Direction helps with noticing trends
-Slope can be used to signal sharp/drastic changes in direction
VISUAL CUES - AREA AND VOLUME
bigger objects represent greater values
VISUAL CUES - SHAPES
-shapes can be used to denote categories and objects
-visually shapes are readily recognized
VISUAL CUES - COLOR
Hue - refers to the different colors
Saturation - refers to the density of a given color, e.g. gradients
Color can be used to show categories
Color can be used to highlight certain aspects of your data visualization
steps to preparing data
Structuring data
Cleaning data
Aggregating data from different sources
Validating data
What is special about a bar chart?
- intuitive
- appropriate for non-technical audience
- useful to visualize discrete data
- start axis at zero
wedge
each portion of the pie represents a category of value.
what are the kinds of coordinates?
Cartesian, polar, geographic
Types of scales
linear, logarithmic, categorical, ordinal, percent, time
what is a histogram
- encodes data using height as the visual cue
- density is on the vertical axis
- horizontal axis has values
Histogram bin size
- the right bin size depends on the dataset
- you want the bins small enough that you can see variability in the data
- but not so small that the histogram is too noisy to interpret
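A rough sketch of the bin-size trade-off, using plain Python and made-up values. The helper `histogram_counts` is a hypothetical name, not a library function; it counts how many values fall into each fixed-width bin.

```python
def histogram_counts(values, bin_width, start=0):
    """Count values per bin [left, left + bin_width), keyed by the bin's left edge."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        left_edge = start + index * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

# Made-up data for illustration.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7, 8, 8, 9]

wide = histogram_counts(values, bin_width=5)    # few bins: shape is hidden
narrow = histogram_counts(values, bin_width=1)  # many bins: more detail, more noise

print(wide)
print(narrow)
```

With a width of 5 the data collapses into two bars; with a width of 1 the peak around 3 becomes visible, at the cost of more, sparser bars.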
Continuous density plot
Like a histogram, but continuous instead of divided into bins
what are the three types of distributions?
symmetric distribution, left skewed distribution, right skewed distribution
Box Plot
Shows range, median and quartiles of the data
Uses position and height/length visual cues
You can use multiple box plots to compare distributions
Less specific than histograms or density plots
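A minimal sketch of the numbers a box plot draws (range, median and quartiles), using Python's standard `statistics` module. The data is made up, and the choice of the "inclusive" quartile method is an assumption; other tools may compute quartiles slightly differently.

```python
import statistics

def five_number_summary(data):
    """Return min, first quartile, median, third quartile and max of the data."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": min(data), "q1": q1, "median": median, "q3": q3, "max": max(data)}

# Made-up data for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(data)

print(summary)
```

The box spans q1 to q3, the line inside the box marks the median, and the whiskers reach toward the min and max.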
Union
Merging data that is spread across several files, sheets, or tables
what are the kinds of joins?
- inner join
- left join
- right join
- full outer join
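A plain-Python sketch of how the four join types listed above differ in which keys they keep. The tables, the key values and the `join` helper are hypothetical; real tools (Tableau, SQL, pandas) have their own join implementations.

```python
def join(left, right, how="inner"):
    """Join two {key: value} tables on their keys; a missing side becomes None."""
    if how == "inner":
        keys = left.keys() & right.keys()   # keys present in both tables
    elif how == "left":
        keys = left.keys()                  # every key from the left table
    elif how == "right":
        keys = right.keys()                 # every key from the right table
    elif how == "outer":
        keys = left.keys() | right.keys()   # every key from either table
    else:
        raise ValueError(f"unknown join type: {how}")
    return {k: (left.get(k), right.get(k)) for k in sorted(keys)}

# Made-up tables keyed by customer id.
customers = {1: "Ana", 2: "Ben", 3: "Cy"}
orders = {2: "book", 3: "pen", 4: "lamp"}

for how in ("inner", "left", "right", "outer"):
    print(how, join(customers, orders, how))
```

Only the inner join drops unmatched rows from both sides; the other three keep unmatched rows and fill the missing side with an empty value.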
What does the pill color indicate?
continuous or discrete
what color are discrete pills?
blue
what color are continuous pills?
green
How do dimensions come out?
Dimensions come out onto the view as themselves
How do measures come out?
Measures come out onto the view as aggregates
What is a scatter plot used for?
Often used to visualize the relationship between two variables
Scatter plots use position as the visual cue
Each dot has X- and Y-coordinates that match the axes
correlation
means one thing tends to change a certain way as another thing changes.
Direction
positive or negative correlation
magnitude
strong or weak correlation
coefficient of correlation
quantifies how tightly coupled the values of two variables are with respect to each other
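A short sketch of one common correlation coefficient, Pearson's r. This choice is an assumption, since the card does not name a specific coefficient. The data and the function name `pearson_r` are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data: one series rises with x, the other falls.
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]
score_down = [10, 8, 6, 4, 2]

r_pos = pearson_r(hours, score_up)
r_neg = pearson_r(hours, score_down)
print(r_pos, r_neg)
```

The sign gives the direction (positive or negative) and the distance from zero gives the magnitude (strong or weak).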
Scatterplot Matrix
Scatter plot matrix is useful to see relationships among multiple variables
Allows comparison across multiple dimensions
Plot every variable pair and look for correlation
bubble chart
Allows you to compare 3 variables at once: x variable, y variable, and area variable
Bubbles should be sized based on area, not radius, diameter or circumference
Example - Hans Rosling's Tool
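A minimal sketch of sizing bubbles by area, under the assumption that each bubble's area should be proportional to its value. The values and the `area_per_unit` parameter are illustrative.

```python
from math import pi, sqrt

def bubble_radius(value, area_per_unit=1.0):
    """Radius of a circle whose area equals value * area_per_unit."""
    return sqrt(value * area_per_unit / pi)

# Made-up values: the second is 4 times the first.
r_small = bubble_radius(10)
r_large = bubble_radius(40)

print(r_large / r_small)                      # ratio of the radii
print((pi * r_large**2) / (pi * r_small**2))  # ratio of the areas
```

A value 4 times larger gets a radius only 2 times larger, so the area grows 4 times. Scaling the radius directly by the value would make the area 16 times larger and overstate the difference.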
symbol maps
Specific geographic locations are marked with circles, squares, or custom shapes
Form, size or color of these marks can vary according to a measure or dimension
choropleth maps
Geographic areas are shaded according to a measure or dimension
density maps
Areas of relative concentration are colored intensely, while those with sparse occurrences of the dimension are colored lightly