Data & Data Analysis Notes
3.1 Understanding Data and Data Analysis
- By the end of this chapter, you should understand:
- Different types, uses, and representations of data.
- The role of big data and data analytics in extracting and processing information.
- Opportunities and dilemmas concerning data in the digital society.
Key Concepts
- Data:
- Raw, unorganized facts and figures (numbers, letters, images).
- Individual values carry no inherent meaning on their own; their storage size is measured in bits and bytes.
- Example: the bare values 50 and 75, with nothing to say what they measure.
- Information:
- Data that has been processed and organized so it is ready for visualization or analysis.
- Adds context by answering questions such as who, what, where, and when.
- Example: 50°C is a temperature reading; 75% is a test score.
- Data: Unstructured facts.
- Information: Structured data offering context.
- Knowledge: Application of information to make decisions.
- Wisdom: Using knowledge and judgment to anticipate outcomes and choose the best course of action.
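The progression from data to wisdom can be sketched in a few lines of Python. This is only an illustration: the readings, the greenhouse scenario, the `SAFE_LIMIT_C` threshold, and the wording of the decision are all assumptions made up for the example, not part of the chapter.

```python
from statistics import mean

# Data: raw values with no context (hypothetical readings).
raw_readings = [48, 50, 52, 50]

# Information: the same values organized with context (where, when, unit).
information = [
    {"sensor": "greenhouse", "hour": hour, "temp_c": value}
    for hour, value in enumerate(raw_readings, start=12)
]

# Knowledge: applying the information to reach a conclusion.
average_temp = mean(reading["temp_c"] for reading in information)
SAFE_LIMIT_C = 45  # assumed threshold, for illustration only
too_hot = average_temp > SAFE_LIMIT_C

# Wisdom: using that knowledge to decide what to do next.
decision = "open vents before noon tomorrow" if too_hot else "no action needed"
print(average_temp, too_hot, decision)
```

The same four numbers appear at every stage; what changes is how much context and judgment has been layered on top of them.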
Stages of Gaining Wisdom
- Steps for moving from information to wisdom (discussion points):
- Recognize the need for information.
- Process information for clarity.
- Apply information to formulate knowledge.
- Implement knowledge to make informed decisions.
Types of Data
- Financial Data: Quantitative information related to business finances (cash flow, balance sheets).
- Medical Data: Collected during patient care (electronic health records, clinical trial registrations).
- Meteorological Data: Weather and climate data collected via instruments and technology.
- Geographical Data: Data indicating an object's position in geographic space (e.g., coordinates captured by GPS).
Examples of Data Collection and Usage
- Citizen Science Example: Increased bird-watching during the pandemic led to a spike in data sharing about bird behaviors.
Data Collection Methods
- Primary Data: Original data collected for the first time for a specific purpose (e.g., surveys, interviews).
- Secondary Data: Pre-existing data that was originally collected by someone else or for a different purpose (e.g., research articles).
Data Lifecycle
- Creation: Data can be generated manually or automatically.
- Storage: Data must be stored securely with appropriate access permissions.
- Usage: Data is processed or analyzed for various applications.
- Preservation: Data is kept available and readable for future use.
- Destruction: Data is permanently removed in compliance with regulations.
Ways to Organize Data
- Databases store data in tables (columns for attributes, rows for records).
- Examples of common data types in databases:
- Strings (text), Integers (whole numbers), Dates, Booleans (true/false).
- Relational Databases: Store multiple tables. A primary key uniquely identifies each record in a table, and foreign keys link records in one table to records in another.
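As a minimal sketch of these ideas, the example below uses Python's built-in `sqlite3` module with an in-memory database. The table names (`students`, `scores`), column names, and sample rows are hypothetical, chosen only to show the common data types, a primary key, and a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Columns are attributes; each inserted row is one record.
conn.execute("""
    CREATE TABLE students (
        student_id  INTEGER PRIMARY KEY,  -- unique identifier for each record
        name        TEXT NOT NULL,        -- string
        enrolled_on TEXT NOT NULL,        -- date, stored as ISO 8601 text
        is_active   INTEGER NOT NULL      -- boolean, stored as 0 or 1
    )
""")
conn.execute("""
    CREATE TABLE scores (
        score_id   INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES students(student_id),  -- foreign key
        test_score INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO students VALUES (1, 'Ada', '2024-09-01', 1)")
conn.execute("INSERT INTO scores VALUES (1, 1, 75)")

# The foreign key lets us combine the two tables in one query.
rows = conn.execute("""
    SELECT s.name, sc.test_score
    FROM students AS s
    JOIN scores AS sc ON sc.student_id = s.student_id
""").fetchall()

def is_rejected(sql: str) -> bool:
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second student with the same primary key is refused (uniqueness).
duplicate_rejected = is_rejected("INSERT INTO students VALUES (1, 'Grace', '2024-09-02', 1)")
# A score that points at a non-existent student is refused (valid link).
orphan_rejected = is_rejected("INSERT INTO scores VALUES (2, 99, 60)")
print(rows, duplicate_rejected, orphan_rejected)
```

SQLite has no dedicated date or boolean column types, which is why the sketch stores them as text and as 0/1; other database systems offer native types for both.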
Data Verification and Validation
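- Validation: Checks that data is sensible and follows defined rules (e.g., type, range, format) before it enters the database. It cannot tell whether the data is actually correct.
- Verification: Confirms that the entered data matches the original source (e.g., by double entry or proofreading).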
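The difference between the two checks can be illustrated with a short sketch. The function names `validate_score` and `verify_entry`, and the 0 to 100 range rule, are assumptions chosen for the example; real systems define their own rules.

```python
def validate_score(value: str) -> int:
    """Validation: accept only whole numbers from 0 to 100."""
    text = value.strip()
    if not text.isdecimal():
        raise ValueError(f"not a whole number: {value!r}")
    score = int(text)
    if not 0 <= score <= 100:
        raise ValueError(f"out of range 0-100: {score}")
    return score

def verify_entry(first_entry: str, second_entry: str) -> bool:
    """Verification by double entry: both typed values must match."""
    return first_entry.strip() == second_entry.strip()

# The paper form says 75, but the operator typed 57 the first time.
typed_once, typed_twice = "57", "75"

passes_validation = validate_score(typed_once) == 57      # 57 is a plausible score
passes_verification = verify_entry(typed_once, typed_twice)  # the two entries differ
print(passes_validation, passes_verification)
```

Here the mistyped value passes validation because 57 is a perfectly reasonable score, while verification flags the mismatch. This is why the two checks are usually used together.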
Data Presentation
- Effective data presentation uses charts, tables, and other visualizations chosen to suit the data and the audience.
- Examples include:
- Bar charts for comparing categories (e.g., total rainfall per month).
- Line graphs for continuous data that changes over time (e.g., temperature through the day).
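To show how a bar chart maps each category to a bar whose length reflects its value, here is a small text-only sketch. The rainfall figures, the `text_bar_chart` name, and the scale of one `#` per 20 mm are all made up for illustration; a real report would normally use a charting tool instead.

```python
# Hypothetical monthly rainfall totals in millimetres.
rainfall_mm = {"Jan": 80, "Feb": 60, "Mar": 100}

def text_bar_chart(values: dict[str, int], mm_per_mark: int = 20) -> list[str]:
    """Build one text bar per category; bar length is proportional to the value."""
    lines = []
    for label, value in values.items():
        bar = "#" * (value // mm_per_mark)
        lines.append(f"{label} | {bar} {value}")
    return lines

chart = text_bar_chart(rainfall_mm)
print("\n".join(chart))
```

Each month is a separate category, so separate bars are appropriate; joining the values with a line would suggest measurements between the months that were never taken.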
Data Security
- Data must be kept secure both in storage and in transmission.
- Encryption: Converts readable data (plaintext) into an unreadable form (ciphertext) so that only someone with the correct key can read it. Types include:
- Symmetric Key: The same key is used for encryption and decryption.
- Public Key: A pair of keys is used; the public key encrypts and the matching private key decrypts.
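The two approaches can be contrasted with a deliberately simplified sketch. Neither part is secure: the symmetric half is a toy XOR cipher and the public-key half is textbook RSA with tiny primes, both chosen here only to make the key roles visible. All names and values (`xor_cipher`, `shared_key`, the primes 61 and 53) are assumptions for the example; real systems rely on vetted cryptographic libraries.

```python
from itertools import cycle

# --- Symmetric key: the SAME key encrypts and decrypts (toy XOR cipher). ---
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with a repeating key. Applying it twice restores the data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"k3y"
ciphertext = xor_cipher(b"score=75", shared_key)
plaintext = xor_cipher(ciphertext, shared_key)

# --- Public key: DIFFERENT keys encrypt and decrypt (textbook RSA, tiny primes). ---
p, q = 61, 53
n = p * q                          # 3233, shared by both keys
e = 17                             # public exponent: anyone may use it to encrypt
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

message = 75                       # must be smaller than n
encrypted = pow(message, e, n)     # encrypt with the public key (e, n)
decrypted = pow(encrypted, d, n)   # decrypt with the private key (d, n)

print(plaintext, decrypted)
```

With a symmetric key, both parties must first share the secret key safely. With a public key, the encryption key can be published openly because only the private key can reverse it.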
Data Dilemmas
- Ethics of Data Collection: Data must be collected and used in line with ethical standards and privacy regulations.
- Discussion point: the implications of anonymity and the potential misuse of data (e.g., cyberbullying).
Conclusion
- Understanding the relationships and processes involving data, information, knowledge, and wisdom is crucial in navigating the digital landscape effectively.