Data 200 Lect. 2, 3, 4
- Data is represented in Binary
- strings of 1s and 0s
- smallest unit of info = bit
- Binary Basics
- 0 → 0
- 1 → 1
- 2 → 10
- 3 → 11
- 4 → 100
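- Base 2: each place value is a power of 2
- To convert a decimal number, keep dividing by 2 and read the remainders from last to first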
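The divide-by-2 rule can be sketched in Python. This is a minimal sketch, not something from the lecture; the function name `to_binary` is made up for illustration.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        # The quotient carries on to the next step; the remainder is the next bit.
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    # Remainders come out least-significant bit first, so reverse them.
    return "".join(reversed(digits))


for value in range(5):
    print(value, "->", to_binary(value))
```

Running it prints the same table as above (0 -> 0, 1 -> 1, 2 -> 10, 3 -> 11, 4 -> 100).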
- Characters and More
- ASCII table - maps numeric codes (written in decimal or hexadecimal) to characters and control commands
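A quick way to look up ASCII codes is Python's built-in `ord()` and `chr()`. This is only an illustrative sketch; the helper name `ascii_codes` is made up here.

```python
def ascii_codes(text):
    """Return the numeric code of each character in text."""
    return [ord(ch) for ch in text]


# Show each character with its decimal, hexadecimal, and 8-bit binary code.
for ch in "Aa0":
    code = ord(ch)
    print(repr(ch), code, hex(code), format(code, "08b"))

print(ascii_codes("Hi"))  # [72, 105]
print(ord("\n"))          # 10, the "line feed" control command
print(chr(65))            # A
```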
- Colors
- How to represent them?
- RGB (###, ###, ###)
- Hexadecimal (#RRGGBB)
- each two-digit color code is in the range 00…FF hexadecimal (0 to 255 in decimal)
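The two color notations carry the same three numbers, so converting between them is mechanical. A minimal sketch, assuming each channel is an integer from 0 to 255; the function names `rgb_to_hex` and `hex_to_rgb` are hypothetical.

```python
def rgb_to_hex(r: int, g: int, b: int) -> str:
    """Format three 0-255 channel values as #RRGGBB."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    # 02X = two uppercase hexadecimal digits, zero-padded.
    return "#{:02X}{:02X}{:02X}".format(r, g, b)


def hex_to_rgb(code: str) -> tuple:
    """Parse #RRGGBB back into an (r, g, b) tuple."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Read two hex digits at a time: RR, GG, BB.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(255, 0, 128))  # #FF0080
print(hex_to_rgb("#FF0080"))    # (255, 0, 128)
```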
- Data Groups
- Structured Data
- everything has a name and a type
- relationship is defined between value and name
- eg. spreadsheets
- Unstructured
- no implied relationship between values
- Semi-structured Data
- some portions are structured and some are not
- Files
- CSV (Comma Separated Values)
- text values are in quotes, numbers are not
- values are separated from each other by commas
- XML (eXtensible Markup Language)
- XML uses HTML-style tags to provide structure for exchanging non-document information
- eg. webpage
- TSV (Tab Separated Values)
- same as CSV, but values are separated by tabs
- JSON (JavaScript Object Notation)
- RSS (Really Simple Syndication)
- dialect of XML
- There are many different ways to store data; in class we mainly use CSV when bringing in data
- How to find datasets?
- data.gov
- Google Dataset Search (datasetsearch.research.google.com)
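To make the file formats above concrete, here is a small sketch that reads CSV text and rewrites it as TSV and JSON using only Python's standard library. The sample rows and the helper name `to_tsv` are invented for illustration; a real dataset would come from a file.

```python
import csv
import io
import json

# Made-up sample: text values are quoted, numbers are not.
csv_text = 'name,age\n"Ada",36\n"Grace",45\n'

# DictReader uses the first row as column names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# The csv module reads every value as a string, so convert numbers explicitly.
for row in rows:
    row["age"] = int(row["age"])


def to_tsv(records):
    """Write a list of dicts as tab-separated text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


tsv_text = to_tsv(rows)
json_text = json.dumps(rows)

print(rows)
print(tsv_text)
print(json_text)
```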
Lect. 3
- Center of data
- Mean
- Median
- Mode
- Spread
- Standard Deviation
- Range (max - min)
- Range in standard deviations
- range / SD
- Z-score
- (observed value - mean) / SD
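The center and spread measures above can be checked with Python's `statistics` module. This is a sketch under assumptions: the data is made up, the helper name `z_score` is hypothetical, and it uses the population standard deviation (`pstdev`) because the notes do not say whether the sample version (`stdev`) is intended.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values

mean = statistics.mean(data)       # 5
median = statistics.median(data)   # 4.5
mode = statistics.mode(data)       # 4

# Assumption: population SD; swap in statistics.stdev for the sample SD.
sd = statistics.pstdev(data)       # 2.0

value_range = max(data) - min(data)  # 7
range_in_sd = value_range / sd       # 3.5


def z_score(observed, mean, sd):
    """How many standard deviations an observation lies from the mean."""
    return (observed - mean) / sd


print(mean, median, mode, sd, value_range, range_in_sd)
print(z_score(9, mean, sd))  # 2.0
```

A z-score of 2.0 means the value 9 sits two standard deviations above the mean of 5.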
- Visualizing Data/Finding relationships
- Histogram
- if data is continuous, bars touch
- Density plot
- smoothed curve; can show the shape of the distribution in more detail than a histogram
- Bar plot
- bars don’t touch
- can arrange in any order
- Scatter plot
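The difference between a histogram and a bar plot comes down to what is being counted. A histogram counts continuous values in adjacent intervals, which is why its bars touch. A bar plot counts separate categories. Below is a rough standard-library sketch with made-up data; the function name `histogram_counts` and the text "bars" are for illustration only, and a plotting library would normally draw these.

```python
from collections import Counter


def histogram_counts(values, start, bin_width, n_bins):
    """Count values in adjacent bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // bin_width)
        if 0 <= k < n_bins:  # ignore values outside the covered range
            counts[k] += 1
    return counts


# Continuous data (made-up heights in cm): bins share edges, so bars touch.
heights = [150.5, 152.0, 158.3, 161.2, 163.7, 164.9, 171.0, 178.4]
bin_counts = histogram_counts(heights, start=150, bin_width=10, n_bins=3)
for k, count in enumerate(bin_counts):
    low = 150 + 10 * k
    print(f"[{low}, {low + 10})", "#" * count)

# Categorical data: each category is separate, so the order is arbitrary.
colors = ["red", "blue", "red", "green", "red", "blue"]
category_counts = Counter(colors)
for category, count in category_counts.items():
    print(category.ljust(6), "#" * count)
```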