Comprehensive Notes: Data Science & Data Visualisation

Quantitative and Qualitative Data

Quantitative Data: Numerical data that can be measured or counted.
- Discrete Data: Countable, finite values, often whole numbers.
- Example: Number of students in a class (you can't have 25.5 students).
- Continuous Data: Measurable values that can take any value within a range, often involving decimals.
- Example: A height of 8.5 m, a temperature of 23.7 °C, or a speed measured in km/h.
Qualitative Data: Descriptive data based on observations, characteristics, or attributes. It describes qualities or categories.
- Often gathered through observation using the five senses (touch, smell, taste, sight, sound).
- Example: Colours (red, blue), textures (soft, rough), opinions (good, bad).

Data Types

Data types classify the kind of data a variable can hold (e.g., text, numbers, true/false).
- Essential for program interpretation and memory allocation.
String: A sequence of characters (letters, numbers, symbols). Used for names, addresses, descriptions.
DateTime: Stores information in a date and/or time format. Regional settings can vary.
Boolean: Represents a binary state: True or False (1 or 0). Often used for checkboxes or conditional logic.
Numeric Data Types: For calculations.
- Integer: Whole positive or negative numbers (e.g., 3, -2). No decimal points.
- Floating-point (Float): Numbers with decimal values (e.g., 0.5, -75.9). Used when fractional values are needed; values are stored approximately, so small rounding errors can occur.
- Real Numbers: Encompasses both rational and irrational numbers (e.g., 2, π). Necessary for complex scientific calculations.
Why Data Types Matter: Without explicit declaration, a program might misinterpret data (e.g., "123" as text instead of a number), preventing calculations or leading to errors.
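The point about "123" being read as text can be shown in a few lines. This is a minimal sketch in Python; the variable names and values are made up for illustration.

```python
# Values as they might arrive from a form or CSV file: everything is text.
raw_quantity = "123"
raw_price = "4.50"

# As strings, "+" joins text instead of adding numbers.
joined = raw_quantity + raw_quantity      # "123123"

# Converting to numeric types makes calculation possible.
quantity = int(raw_quantity)              # Integer
price = float(raw_price)                  # Floating-point
total = quantity * price                  # 553.5

# A Boolean records a true/false state, e.g. a ticked checkbox.
is_large_order = quantity > 100           # True

print(joined, total, is_large_order)
```

The same characters give a different result depending on the data type, which is why a field's type needs to be declared or converted before calculating.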

Levels of Measurement Applied to Data

These scales dictate what statistical analyses can be performed on data.
Qualitative Data Scales:
- Nominal Scale: Categorical data without any order or ranking. Only allows for classification.
- Example: Gender (Male, Female), Marital Status (Single, Married, Divorced), Eye Colour.
- Ordinal Scale: Categorical data with a meaningful order or ranking, but the intervals between categories are not uniform or measurable.
- Example: Education Level (High School, Bachelor’s, Master’s, PhD), Satisfaction Rating (Very Dissatisfied, Dissatisfied, Neutral, Satisfied, Very Satisfied).
Quantitative Data Scales:
- Interval Scale: Numerical data where the order and exact differences between values are meaningful, but there is no true zero point. Ratios are not meaningful.
- Example: Temperature in Celsius or Fahrenheit (0°C doesn’t mean no temperature), IQ scores.
- Ratio Scale: Numerical data with a meaningful order, exact differences, and a true zero point. Ratios are meaningful.
- Example: Height, Weight, Age, Income, Number of products sold (0 products means no products).
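A short worked example of why ratios fail on an interval scale but work on a ratio scale. The temperatures and weights are invented, and the Kelvin conversion is brought in only as an illustration of a temperature scale with a true zero.

```python
# Interval scale: Celsius has no true zero, so ratios are misleading.
celsius_a, celsius_b = 10.0, 20.0
difference = celsius_b - celsius_a        # 10.0 degrees: meaningful
naive_ratio = celsius_b / celsius_a       # 2.0, but 20 °C is NOT "twice as hot"

# Kelvin has a true zero; the same two temperatures give a different ratio.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
kelvin_ratio = kelvin_b / kelvin_a        # roughly 1.035

# Ratio scale: 0 kg means no weight, so "twice as heavy" is a valid statement.
weight_a, weight_b = 40.0, 80.0
weight_ratio = weight_b / weight_a        # 2.0

print(difference, naive_ratio, round(kelvin_ratio, 3), weight_ratio)
```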

Data Sampling

Definition: The process of selecting a subset (sample) of data from a larger dataset (population) for analysis. Conclusions drawn from the sample are then generalised to the entire population.
Purpose: Makes data analysis more manageable, quicker, and more cost-effective. The sample must be representative of the population to minimise bias and give accurate results.
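One common way to draw a sample is simple random sampling, where every member of the population has an equal chance of being chosen. The sketch below assumes a made-up population of ages and uses Python's standard `random` module; it is an illustration rather than a prescribed method.

```python
import random
import statistics

random.seed(42)  # fixed seed so reruns draw the same sample

# Hypothetical population: ages of 10,000 people (made-up values).
population = [random.randint(18, 90) for _ in range(10_000)]

# Simple random sample: every member has an equal chance of selection.
sample = random.sample(population, k=200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
print(f"population mean: {population_mean:.1f}, sample mean: {sample_mean:.1f}")
```

If the sample is representative, its mean should sit close to the population mean while needing only 2% of the data.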
Active Data Collection: Requires direct interaction or deliberate action to gather data.
- Manual Active: Direct human interaction.
- Examples: Surveys (paper-based), Interviews, Direct Observations.
- Computerised Active: User input via digital means.
- Examples: Online forms, Interactive applications (e.g., voting apps), Sensors requiring explicit activation.
Passive Data Collection: Data is gathered in the background, without the subject taking deliberate action to provide it.
- Manual Passive: Non-digital logging by humans.
- Examples: Security entry logs (manual sign-in), Tallying (e.g., counting cars passing a point).
- Computerised Passive: Automated digital tracking.
- Examples: Website cookies (track user behaviour), IoT devices (smart home data), Social media impression data (automatic tracking of views, likes).

Relevance, Accuracy, Validity, and Reliability of Data

Crucial for ensuring the quality and trustworthiness of data.
Data Sources:
- Primary Data: Collected directly by the user/researcher for their specific purpose.
- Pros: Highly relevant, tailored to exact needs, more control over collection.
- Cons: Can be time-consuming, expensive to collect.
- Examples: Surveys conducted by a company, results from a specific experiment.
- Secondary Data: Data collected by someone else for a different purpose, then used by another researcher.
- Pros: Easier and faster to obtain, often cheaper.
- Cons: May not be perfectly relevant, source reliability can be an issue, may be outdated.
- Examples: Government reports, academic journals, company annual reports.
Relevance: How applicable and pertinent the data is to a specific project or question.
- Example: Sales data is relevant for understanding customer preferences, but customer’s favourite colour (unless linked to product choice) might not be relevant.
Accuracy: The extent to which data is free from errors and represents the real-world context precisely.
- Example: Financial reports with precise numbers. Inaccurate data leads to incorrect conclusions. Achieved through cross-referencing and validation.
Validity: Data accurately represents the scenario being analysed in the correct context; it measures what it intends to measure.
- Example: A survey to measure customer satisfaction is valid if it asks customers directly about their satisfaction with product features they've used.
Reliability: The consistency of results when data collection is repeated under similar conditions over time.
- Example: A scientific experiment producing the same results when replicated.

Informatics

Definition: The study and practice of creating, storing, finding, manipulating, and using data to generate meaningful information. It combines computer science, information technology, and data science.
Support for Deeper Understanding: Informatics facilitates the transformation of raw data into valuable insights, helping users make informed decisions and solve problems more efficiently by managing and analysing vast amounts of data.

Impact of Errors, Uncertainty, and Limitations in Data

These factors can significantly impact data quality, introduce bias, distort insights, and lead to incorrect conclusions if not properly identified and addressed.
Errors: Mistakes or inaccuracies in data collection, entry, or processing.
- Effect: Skew results, lead to misleading insights.
- Example: A typo in a dataset (e.g., age 180 instead of 18) can severely affect statistical analysis.
Uncertainty: Variability or lack of confidence in data, often due to incomplete or ambiguous information.
- Effect: Insights become less reliable and difficult to interpret.
- Example: A weather forecast predicting rain has a degree of uncertainty (e.g., 60% chance of rain).
Limitations: Constraints in data, such as small sample sizes, outdated information, or narrow scope.
- Effect: Reduces the generalisability or depth of findings.
- Example: A survey conducted only among urban populations limits the relevance of findings to rural areas.
Factors Leading to Issues:
- Data Source: Unreliable or outdated sources introduce errors.
- Raw vs. Processed Data: Raw data is unfiltered and prone to errors; processed data has been organised and refined, reducing uncertainties.
- Data Bias: Skewed results from over/under-representing specific groups (see "Bias" section below).
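The effect of the age typo mentioned above (180 instead of 18) can be checked directly. This sketch uses a small invented list of ages; the 0 to 120 range check is an assumed validation rule, not one taken from these notes.

```python
import statistics

ages_correct = [17, 18, 18, 19, 18]
ages_with_typo = [17, 18, 180, 19, 18]    # one 18 mistyped as 180

mean_correct = statistics.mean(ages_correct)       # 18
mean_typo = statistics.mean(ages_with_typo)        # 50.4
median_typo = statistics.median(ages_with_typo)    # 18

# A simple range check (assumed rule) that flags implausible ages.
flagged = [age for age in ages_with_typo if not 0 <= age <= 120]

print(mean_correct, mean_typo, median_typo, flagged)
```

A single entry error moves the mean from 18 to 50.4, while the median is unaffected. This is why outliers should be checked before results are summarised.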

Blockchain Technology

Definition: A decentralised and distributed ledger system that securely records, stores, and shares data across multiple nodes in a network. Each "block" contains a list of transactions, cryptographically linked to the previous block, creating an immutable chain of records.
Key Characteristics: Decentralisation, immutability (recorded data is extremely difficult to alter), transparency (visible to all participants), security (cryptography).
Uses for Data Management and Verification:
- Online Voting: Enhances security, transparency, and integrity by providing an immutable record of votes, preventing alteration or tampering.
- Online Identities: Allows individuals to manage and verify their identity across platforms securely without repeatedly sharing personal information, reducing reliance on centralised authorities.
- Tracking Items: Provides an immutable record of every transaction, enabling verifiable tracking of goods and assets across supply chains, ensuring proof of ownership and origin.
- Recordkeeping: Makes records extremely difficult to alter or delete once written, supporting long-term data retention and tamper-evident records for critical fields like legal contracts and medical history.
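The idea of blocks being "cryptographically linked" can be sketched with a hash chain. This is a simplified illustration only: it has no network, nodes, or consensus, and the function names and transactions are hypothetical.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 hash of a block's contents (keys sorted for stable output)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def add_block(chain, transactions):
    """Append a block that stores the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain),
                  "transactions": transactions,
                  "previous_hash": previous})

def is_valid(chain):
    """Check each block still points at the true hash of its predecessor."""
    return all(chain[i]["previous_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
add_block(chain, ["Alice pays Bob 5"])
add_block(chain, ["Bob pays Carol 2"])
add_block(chain, ["Carol pays Dan 1"])
valid_before = is_valid(chain)

# Tamper with an earlier block: the next block's stored hash no longer matches.
chain[1]["transactions"] = ["Bob pays Carol 200"]
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

Changing any earlier block changes its hash, which breaks the link stored in the next block. In a real blockchain, many nodes hold copies of the chain, so a tampered copy is rejected by the others.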

Data Quality (Cross-cutting theme)

Refers to the overall fitness of data for its intended use. Encompasses accuracy, completeness, consistency, validity, timeliness, and integrity. High data quality is crucial for reliable analysis and decision-making.

Social, Ethical, and Legal Issues Associated with Using Data

Bias
- Definition: Occurs when data does not accurately reflect the full demographic or context it represents, leading to skewed or inaccurate results; may over-represent or under-represent specific groups.
- Unconscious Bias: Unintentional biases in data collection, sampling, or interpretation, often reflecting societal prejudices.
- Example: A hiring algorithm trained on historical data that disproportionately hired certain demographics might perpetuate racial or gender bias in new hiring decisions, even if unintentional.
- Impact: Affects real-world accuracy, leading to unfair or discriminatory outcomes.
Accuracy of Collected Data
- Definition: The correctness of data, free from errors.
- Achieved Through: Cross-referencing, Data Validation, Data Verification.
- Impact of Inaccurate Data: Leads to incorrect information and unreliable system outputs, impacting decision-making.
Metadata
- Definition: "Data that describes data." Provides context and structure to information.
- Examples: Timestamp, author, file size, HTML tags describing webpage content, a data dictionary describing fields in a database.
- Ethical Considerations: Metadata can reveal patterns of behaviour, locations, or personal details, raising privacy concerns. Balancing utility with user protection is crucial.
Copyright & Acknowledgement of Data Sources
- Copyright Act 1968 (Australia): Protects original works, including software code, databases, and unique datasets.
- Importance: Ensures acknowledgment and rewards creators.
- Ethical Implication: Unauthorised sharing of usernames, passwords, or proprietary datasets can breach copyright/licensing.
- Acknowledgement: Proper citation and attribution of data sources are essential for academic integrity and avoiding plagiarism.
Intellectual Property (IP) and Indigenous Cultural and Intellectual Property (ICIP)
- IP: Creations of the mind, such as inventions; literary and artistic works; designs; and symbols used in commerce. Protected by patents, trademarks, copyright.
- ICIP: Rights of Indigenous peoples to their traditional knowledge, cultural expressions, and heritage. Covers stories, languages, ceremonies, designs, and more.
Interpersonal Relationships: Reliance on digital communication changes how people interact. Where ICIP data is involved, special consideration is needed to ensure respectful and appropriate use, benefit sharing, and protection against misappropriation.
Privacy and Security of Data
- Privacy Act 1988 (Australia): Regulates how personal information is handled; includes principles like collection limits, data quality, data security, access and correction.
- Security: Measures to protect data from unauthorised access, use, disclosure, disruption, modification, or destruction.
- Cybersecurity: Protecting computer systems and networks from digital attacks.
- Data Backup: Creating copies of data to ensure recovery in case of data loss.
- Specific Features/Concepts:
- Autofill: Convenience feature, but stores personal data, raising security concerns if compromised.
- Public/Private Connections: Risks of insecure public Wi-Fi vs. secure private networks for data transfer.
- Checkbox: Often used for user consent (e.g., "I agree to the terms," "Remember me"), with privacy implications.
- Terms of Agreement: Legal documents outlining data collection, use, and storage; users often accept without fully reading.
- Responsible Authorities: Government bodies (e.g., Office of the Australian Information Commissioner - OAIC) responsible for enforcing data protection legislation.
Impact of Data Scale (Big Data Implications)
- The sheer volume of raw data in Big Data creates both challenges and opportunities.
- Storage and Streaming: Storing and handling continuous, high-volume data flows.
- Machine Learning (ML): Relies on massive datasets to learn and identify patterns.
- Human Behaviour: Big Data enables deep analysis of collective human behaviour, leading to targeted advertising, predictive policing, etc.
- Ethical Implications: Who owns the data? How is it used? Potential for surveillance, discrimination, and manipulation.
Social Issues
- Changing Nature of Work
- Working from Home (WFH): Enabled by technology, affecting work-life balance, real estate, and team dynamics.
- Job Losses: Automation and AI replacing human jobs (e.g., self-checkout vs. cashiers, ATMs vs. bank tellers).
- Virtual Communities: Online spaces where individuals interact.
- Pros: Build friendships, facilitate niche interests, offer financial opportunities (e.g., Second Life).
- Cons: Time-consuming, costly, exposure to negative influences, addiction, blurring of reality.
Legal Aspects of Data Collection and Handling
- Legislation: Laws governing data protection, privacy, and copyright (e.g., Privacy Act 1988, Copyright Act 1968, Work Health and Safety Act 2011 (NSW)).
- Data Sovereignty of Indigenous People: Recognises Indigenous peoples’ right to own, control, access, and possess their data, knowledge, and traditional practices; requires special ethical and legal frameworks for data collection and use.

Field Name, Data Format, and Data Stewardship

Field Name: Name of the column (e.g., FirstName, Student_ID).
Data Format: Specific format within the data type (e.g., YYYY-MM-DD for Date/Time).
Curated and Communicated Data on Social Behaviour: Ethical guidelines for how social data is collected, analysed, and presented to avoid misrepresentation or harm.

Processing and Presenting Data

1. Flat File Database
- Definition: A database stored in a single file or table. Simple, but limited in handling complex relationships.
- Organisation Hierarchy:
- File: The entire collection of records (e.g., "Students").
- Record: A single row of related data about one entity (e.g., all information for one student).
- Field: A single piece of information, a column in the table (e.g., "First_Name," "DOB," "Address").
- Character: The smallest unit of data (a letter, number, or symbol).
- Data Dictionary: A document that defines the structure of a database or table.
- Data Type: Type of data stored (e.g., Text, Integer, Boolean, Date/Time, Real/Floating Point).
- Field Size: Maximum length or number of characters/digits allowed.
- Description: Explains the purpose of the field.
- Example: A sample valid entry.
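A data dictionary can also be written in a form a program can check records against. The sketch below assumes a hypothetical "Students" file; the field names, sizes, and the `check_record` helper are illustrative only.

```python
from datetime import date

# Hypothetical data dictionary for a "Students" flat file.
data_dictionary = {
    "Student_ID": {"type": int,  "size": 6,  "description": "Unique student number"},
    "First_Name": {"type": str,  "size": 20, "description": "Given name"},
    "DOB":        {"type": date, "size": 10, "description": "Date of birth (YYYY-MM-DD)"},
    "Enrolled":   {"type": bool, "size": 5,  "description": "Currently enrolled (True/False)"},
}

def check_record(record):
    """Return a list of problems found when a record is compared to the dictionary."""
    problems = []
    for field, rule in data_dictionary.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
        elif len(str(value)) > rule["size"]:
            problems.append(f"{field}: longer than {rule['size']} characters")
    return problems

good = {"Student_ID": 104233, "First_Name": "Mia",
        "DOB": date(2007, 3, 14), "Enrolled": True}
bad = {"Student_ID": "104233", "First_Name": "Mia", "DOB": date(2007, 3, 14)}

print(check_record(good))
print(check_record(bad))
```

The second record is flagged because its Student_ID was stored as text and the Enrolled field is absent.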
2. Spreadsheet Summaries and Collation of Information
- Filter, Group, and Sort Data: Organising data to view specific subsets, combine similar items, or arrange in a particular order.
- Linking Sheets: Referencing data across multiple worksheets within a single workbook.
- Conditional Formatting: Automatically applying formatting (colours, fonts) to cells based on their values, highlighting patterns or conditions (e.g., red for numbers below a threshold).
- Data Comparisons: Using formulas or functions to compare different datasets or values.
- Forms and Reports: Creating user-friendly interfaces for data entry (forms) and structured outputs for viewing/printing (reports).
- Spreadsheet Dashboards: Interactive visual displays of key performance indicators (KPIs) and data trends, built using charts, tables, and conditional formatting.
- Graphs: Visual representation of data (bar, pie, line charts).
- Pivot Tables/Slicers: Powerful tools to summarise, analyse, explore, and present large datasets. Slicers provide interactive filtering for pivot tables/charts.
- VLOOKUP: A function that searches for a value in the first (leftmost) column of a table range and returns the corresponding value from a specified column in the same row.
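The spreadsheet operations listed above have direct equivalents in code. This sketch mirrors an exact-match lookup, a filter, a pivot-style grouping, and a sort in plain Python; the sales rows and product names are made up.

```python
from collections import defaultdict

# Hypothetical sales rows, as they might appear in a worksheet.
sales = [
    {"product_id": "P01", "region": "North", "units": 12},
    {"product_id": "P02", "region": "South", "units": 7},
    {"product_id": "P01", "region": "South", "units": 5},
    {"product_id": "P03", "region": "North", "units": 9},
]

# Lookup table: key column (product_id) -> name, like a VLOOKUP range.
product_names = {"P01": "Notebook", "P02": "Pen", "P03": "Ruler"}

# VLOOKUP-style exact match: find the key, return the value from its row.
for row in sales:
    row["product"] = product_names.get(row["product_id"], "#N/A")

# Filter: keep only the rows for one region.
north_only = [row for row in sales if row["region"] == "North"]

# Group and summarise (what a pivot table does): total units per product.
units_by_product = defaultdict(int)
for row in sales:
    units_by_product[row["product"]] += row["units"]

# Sort: largest total first.
ranked = sorted(units_by_product.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```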
3. Design a Relational Database
- Relational Database Model: Organises data into multiple tables, linked by common fields (keys). Reduces data redundancy and improves data integrity.
- Flat File Databases vs. Relational Database Models: Flat files are single tables, prone to redundancy and inconsistency. Relational databases are multiple interconnected tables, ensuring data integrity and flexibility.
- Entity Relationship Diagram (ERD): A visual model that illustrates the relationships between different entities (tables) in a database. Shows how data is structured and connected.
- Use Key Fields for Linking Tables:
- Primary Key: Uniquely identifies each record.
- Foreign Key: A field in one table that refers to the primary key in another table, establishing a link between them.
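Primary and foreign keys can be tried out with Python's built-in `sqlite3` module. The table and column names below are hypothetical, and the example uses SQLite's dialect of SQL; other database systems differ in small ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")         # temporary in-memory database
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE Student (
        Student_ID INTEGER PRIMARY KEY,    -- primary key: unique for each record
        First_Name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE Enrolment (
        Enrolment_ID INTEGER PRIMARY KEY,
        Student_ID INTEGER NOT NULL REFERENCES Student(Student_ID),  -- foreign key
        Course TEXT NOT NULL
    )""")

conn.execute("INSERT INTO Student VALUES (1, 'Mia'), (2, 'Arjun')")
conn.execute("""INSERT INTO Enrolment VALUES
    (10, 1, 'Enterprise Computing'), (11, 1, 'Mathematics'), (12, 2, 'Physics')""")

# JOIN follows the foreign key to combine the two tables.
rows = conn.execute("""
    SELECT s.First_Name, e.Course
    FROM Student AS s JOIN Enrolment AS e ON e.Student_ID = s.Student_ID
    ORDER BY s.First_Name, e.Course""").fetchall()

# An enrolment for a student who does not exist is rejected.
try:
    conn.execute("INSERT INTO Enrolment VALUES (13, 99, 'History')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```

Each student's name is stored once and referenced by ID, which is how the relational model reduces redundancy. The rejected insert shows the foreign key protecting data integrity.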
4. SQL (Structured Query Language) Introduction
- Definition: A standard language for accessing and manipulating databases.
- Capabilities: Query, insert, update, and delete records; create new databases, tables, stored procedures, and views; set permissions on tables, procedures, and views.
- Basic Operators/Clauses: SELECT, FROM, WHERE, NOT, LIKE, BETWEEN, IN, and comparison operators (e.g., >=, <=, =, <>).
- Primary Key and Foreign Key concepts described above.
- User Views: Customised subsets of data from a database that present only the information relevant to a specific user or role, enhancing security and usability.
- Search and Sort - Including SQL: Querying databases to find specific information and order results.
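The clauses and operators listed above can be tried out against a small table. This sketch runs the queries through Python's `sqlite3` module; the Product table, its rows, and the `names` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (Name TEXT, Category TEXT, Price REAL)")
conn.executemany("INSERT INTO Product VALUES (?, ?, ?)", [
    ("Notebook", "Stationery", 4.50),
    ("Pen", "Stationery", 1.20),
    ("Headphones", "Electronics", 59.00),
    ("Phone case", "Electronics", 15.00),
    ("Water bottle", "Homeware", 12.00),
])

def names(sql):
    """Run a query and return just the first column of each row."""
    return [row[0] for row in conn.execute(sql)]

# Comparison operator with sorting (search and sort).
at_least_12 = names("SELECT Name FROM Product WHERE Price >= 12 ORDER BY Price")
# LIKE with % wildcard: names beginning with P.
starts_with_p = names("SELECT Name FROM Product WHERE Name LIKE 'P%' ORDER BY Name")
# BETWEEN is inclusive at both ends.
mid_priced = names("SELECT Name FROM Product WHERE Price BETWEEN 4 AND 15 ORDER BY Name")
# IN matches any value in a list.
in_categories = names(
    "SELECT Name FROM Product WHERE Category IN ('Homeware', 'Electronics') ORDER BY Name")
# NOT reverses a condition.
not_stationery = names(
    "SELECT Name FROM Product WHERE NOT Category = 'Stationery' ORDER BY Price DESC")

# A view stores a query, giving a role a restricted "user view" of the data.
conn.execute("CREATE VIEW Cheap_Items AS SELECT Name, Price FROM Product WHERE Price < 5")
cheap = names("SELECT Name FROM Cheap_Items ORDER BY Name")

print(at_least_12, starts_with_p, mid_priced, in_categories, not_stationery, cheap)
```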

Machine Learning (ML) and Statistical Modelling

Machine Learning: A subset of AI that enables systems to learn from data without being explicitly programmed for each task. Algorithms identify patterns and make predictions.
Statistical Modelling: Uses mathematical equations to model relationships between variables in data, often for prediction or inference.
Relevance: Both are core to advanced data analysis, enabling predictive analytics, classification, and deeper insights from large datasets.
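As a small illustration of modelling a relationship between two variables, the sketch below fits a straight line by ordinary least squares and uses it to predict an unseen value. The study-hours data is invented, and a straight line is only one of many possible models.

```python
# Hypothetical training data: hours studied -> test mark (made-up values).
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 61, 67, 72]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Ordinary least squares for a straight line: mark = intercept + slope * hours
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
         / sum((x - mean_x) ** 2 for x in hours))
intercept = mean_y - slope * mean_x

def predict(x):
    """Predict a mark for a number of study hours the model has not seen."""
    return intercept + slope * x

print(f"slope={slope:.1f}, intercept={intercept:.1f}, predict(6)={predict(6):.1f}")
```

The slope and intercept are "learned" from the data rather than typed in by a programmer, which is the core idea shared by statistical modelling and machine learning.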
Current Technology (Brief Mention):
- Quantum Computing: Emerging technology with potential to revolutionise data processing power far beyond classical computers, impacting cryptography, complex simulations, and AI.
- AI (Artificial Intelligence) Prompt Refining: The process of improving the input given to AI models to elicit more accurate, relevant, and useful outputs. Essential for effective human-AI interaction.

Module: Data Visualisation

1. Purposes of Data Visualisation

Definition: Transforming raw data into a visual format (charts, graphs, infographics) to provide clarity, insights, and enable efficient interpretation of large datasets.
Simplify Information: Breaks down complex datasets into easily understandable visual formats (e.g., bar graphs, pie charts) to identify patterns, relationships, and trends, making insights accessible to a broader audience.
Tell a Story: Acts as a narrative tool. Visualisations can reveal changes over time, highlight causal relationships, and influence viewers' opinions by explaining "why" certain events or trends occurred. The choice of visual can shape the narrative.
Highlight Results: Draws immediate attention to significant results. Visual emphasis through colours, conditional formatting, or specific chart types (e.g., a line chart showing a sharp increase/decrease) helps the audience focus on key data points, successes, concerns, and predictive outcomes.

2. Software Features Supporting Data Visualisation

Software features enhance understanding through aggregation, filtering, and interactive visualisation, enabling data to be explored from multiple perspectives.
Spreadsheets (e.g., Excel, Google Sheets):
- Features: Charts (bar, line, pie), pivot tables, conditional formatting, formulas, data analysis tools.
- Contribution: Summarise large datasets, identify trends, highlight key metrics, build simple dashboards.
Creative Design Applications (e.g., Adobe Illustrator, Canva):
- Features: Templates, colour schemes, brand integration, customisation.
- Contribution: Make infographics visually appealing, engaging, and accessible for broad communication.
Combining Applications:
- Integration: Connect diverse data sources and allow data to pass between programs (e.g., data from a database analysed in Excel, then visualised in Tableau, and presented in PowerPoint).
- Contribution: Track trends and forecast outcomes by leveraging the strengths of specific applications for analysis, design, and presentation.

3. Identifying Patterns by Interpreting and Comparing Datasets

This process involves uncovering trends, relationships, and anomalies to transform raw data into meaningful information.
Enterprise Issues: Analysis uncovers performance trends, growth opportunities, underperforming areas, seasonal fluctuations. Informs inventory, marketing, resource allocation, consumer demand forecasting, supply chain optimisation, and cost reduction.
Social Issues: Identifying patterns in society reveals trends or disparities (e.g., inequity, public health crises, resource shortages like water). Governments use this for policy-making.
Ethical Issues: Awareness of patterns can reveal biases (e.g., in loan applications), necessitating ethical considerations in data collection and interpretation to avoid discrimination.
Predictive Data Analytics: Using historical data patterns to forecast future outcomes. While powerful, predictions are never 100% certain.

4. Impact of the Evolution of Hardware and Software on Data Analytics

Technological advancements have transformed data analytics, enabling faster and more efficient processing and interpretation of massive datasets.
Processing Devices:
- CPU (Central Processing Unit): The "brain" of the computer. It performs rapid calculations and executes instructions by repeating the Fetch-Decode-Execute-Store cycle. More powerful CPUs (e.g., Intel Core i5/i7/i9, AMD Ryzen, Apple M series) and multiple cores enable faster performance and multitasking.
- GPU (Graphics Processing Unit): Specialised for parallel processing, originally of graphical tasks. Crucial for data visualisation, video editing, 3D rendering, and machine learning, where many computations are performed simultaneously.
Storage Devices:
- Magnetic Disks (HDDs): Traditional hard drives; slower access; used for bulk, less frequently accessed storage.
- Optical Disks (CDs, DVDs, Blu-ray): Archival storage; slower access.
- Network Storage: Data stored on remote servers or devices accessible over a network (e.g., NAS, often built on RAID arrays for redundancy).
- Flash Memory (USB, SD cards, SSDs): Non-volatile, high-speed storage; SSDs are significantly faster than HDDs because they have no moving parts.
- RAM (Random Access Memory): Volatile primary storage; holds data/instructions actively used by the CPU.
- Trend: Evolution towards faster access and greater use of flash memory and Cloud Storage (e.g., Google Drive, iCloud, Dropbox).
Communication Devices/Media: Faster network speeds (fibre optic, 5G), improved protocols, and interconnected devices enable rapid data transfer and real-time analytics across distributed systems and cloud platforms.

5. Online Analytical Processing (OLAP)

Definition: A category of software tools that provides rapid, interactive analysis of multi-dimensional data from various perspectives; used for complex analytical queries.
Purpose: Allows users to slice and dice data, drill down into details, or roll up to summaries, facilitating business intelligence and decision-making by quickly identifying trends and patterns in large datasets. Often used with data warehouses.
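The OLAP operations named above (roll up, drill down, slice, dice) can be imitated on a tiny dataset. This is a sketch in plain Python with invented sales figures; real OLAP tools work on far larger, pre-aggregated data cubes.

```python
from collections import defaultdict

# Hypothetical sales "cube" with three dimensions: year, region, product.
facts = [
    {"year": 2023, "region": "North", "product": "Pen",   "sales": 100},
    {"year": 2023, "region": "South", "product": "Pen",   "sales": 80},
    {"year": 2023, "region": "North", "product": "Ruler", "sales": 40},
    {"year": 2024, "region": "North", "product": "Pen",   "sales": 120},
    {"year": 2024, "region": "South", "product": "Ruler", "sales": 60},
]

def roll_up(rows, *dimensions):
    """Total sales by the chosen dimensions (fewer dimensions = higher-level summary)."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dimensions)] += row["sales"]
    return dict(totals)

# Roll up: totals per year only.
by_year = roll_up(facts, "year")

# Drill down: break each year into regions.
by_year_region = roll_up(facts, "year", "region")

# Slice: fix one dimension to a single value.
slice_2023 = [r for r in facts if r["year"] == 2023]

# Dice: restrict several dimensions at once.
dice = [r for r in facts if r["region"] == "North" and r["product"] == "Pen"]

print(by_year)
print(by_year_region)
```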

6. Assess Data Integrity in the Development of a Data Visualisation

Data Integrity: The overall accuracy, completeness, consistency, and reliability of data throughout its lifecycle. Crucial for trustworthy visualisations.
Components to Assess:
- Ownership: Who is responsible for the data? Is it clearly defined?
- Source: Is the data from a credible and trustworthy origin? (Primary vs. Secondary data considerations).
- Validation: Has the data been checked for errors, inconsistencies, and adherence to rules? (e.g., through data validation tools, input masks).
- Risk: What are potential risks to data integrity (e.g., data breaches, human error, system failures)? How are these mitigated?
Importance: Visualisations based on compromised data integrity are misleading and can lead to incorrect decisions. (Refer to examples like real-world data breach lists to highlight risks.)

7. Impact of Enterprise Data Warehousing on Data Visualisation

Data Warehousing: A system that extracts, cleans, transforms, and loads historical and current data from various operational systems into a central, consolidated repository for reporting and analysis.
Impact:
- Analysis of Historical Data Trends and Patterns: Provides a stable, long-term view of data, enabling consistent trend analysis over extended periods.
- Correlation with Current Data: Allows for the integration of real-time data with historical context, providing richer insights.
- Data Refinement/Optimisation: Data in a warehouse is cleaned and optimised for analytical queries, ensuring high data quality for visualisations.
- Benefit for Visualisation: Provides a reliable, consistent, and structured source of high-quality data, making the creation of accurate and comprehensive visualisations more efficient.

8. How Big Data Affects the Design and Development of Data Visualisation

Big Data (Volume, Velocity, Variety, Veracity) introduces challenges and requirements for visualisation.
Scope of Information: Visualisations must handle massive volumes of data, often requiring aggregation or sampling to remain interpretable.
Types and Depth of Insight: Visualisations need to reveal complex relationships and granular insights from diverse data types (structured, unstructured, semi-structured).
Bias: Visualisation design must be conscious of inherent biases in Big Data sources and avoid perpetuating them.
Accuracy: Ensuring the accuracy of visualisations despite volume and potential noise.
Audience: Visualisations must be tailored to the audience’s ability to comprehend complex, large-scale information, often requiring interactive elements.
Data Source: Reliance on diverse and often raw data sources (e.g., IoT, social media feeds) requires robust data cleaning and integration before visualisation.
Unconscious Bias: Designers must be aware of their own unconscious biases that could influence chart choices, colour schemes, or filtering decisions, leading to misrepresentation.

9. Interpreting Data Visualisations

Software Tools:
- Spreadsheet Software: Basic charts and pivot tables for direct interpretation.
- Dashboards: Interactive, consolidated visual displays for real-time monitoring and quick insights.
- Presentations for Data Analysis: Static or animated visualisations embedded in slides for a structured narrative.
- Business Analyst Services: Experts who interpret complex visualisations and provide strategic insights.
- Custom Software: Bespoke tools for highly specific, complex visualisations and interactive analysis.
Interrogating Data from Data Visualisations:
- Interpreting what you see: Draw initial conclusions from visual patterns.
- Aggregation: Understand how data has been summarised (e.g., averages, sums) and its impact on the view.
- Filtering: Recognise which data has been included/excluded and how it affects trends.
- The effect of outliers: Identify extreme values and assess their impact on overall trends.
- Reasoning: Apply critical thinking to deduce underlying causes, relationships, and implications from the visual data.

10. Designing for User Experience (UX)

UX Influence: Focus on overall satisfaction and usability of a system.
- Relevance to the audience: Visualisations must be meaningful and directly applicable to user needs.
- Audience interpretation: Design should ensure clarity and avoid ambiguity.
- Customisation: Allow users to tailor views, filters, or parameters.
- Live analysis: Real-time updates and interactive capabilities for immediate exploration of data.
Graphic Design Tools: Software (e.g., Adobe Illustrator, Tableau, Power BI) used to create visually appealing and effective charts, graphs, and infographics.
Criteria for Evaluating UX: Usability, efficiency, satisfaction, accessibility, error handling.
Emerging Hardware and Software Technologies on UI/UX:
- Hardware: Touchscreens, gesture control, VR/AR headsets, larger/high-resolution displays.
- Software: AI-driven interfaces, natural language processing (NLP) for querying data, voice commands, advanced animation and interactivity, cloud-based collaborative tools.
- Goal: Make data interaction more intuitive, engaging, and powerful.

11. Creating Data Visualisations

Process:
- Research, Source, Organise and Store Data: Meticulous data management is foundational.
- Design and develop a data visualisation for a specific scenario: Represent trends, patterns, and relationships.
- Illustrate predictive analysis incorporating big data: Requires robust models and careful visualisation of forecasts and confidence intervals.
Methods to maintain data security:
- Cybersecurity: Firewalls, encryption, access controls, intrusion detection systems.
- Data Backup: Regularly creating copies of data and storing securely to prevent loss due to hardware failure, cyberattacks, or accidental deletion.
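A backup is only useful if the copy matches the original. The sketch below copies a file and compares SHA-256 checksums to confirm the two are identical; the file name and contents are hypothetical, and checksum verification is an added illustration rather than a method named in these notes.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Work inside a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "survey_results.csv"
    original.write_text("id,score\n1,7\n2,9\n")

    # Create the backup copy.
    backup = Path(folder) / "survey_results.backup.csv"
    shutil.copy2(original, backup)

    # Identical checksums mean the backup is a faithful copy.
    backup_matches = sha256_of(original) == sha256_of(backup)

print(backup_matches)
```

In practice the backup would be stored on a separate device or in cloud storage, so that a single hardware failure cannot destroy both copies.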

12. General Reminders for Trials (NSW Enterprise Computing)

Interconnections: Always look for links between modules. For instance, Data Quality from Data Science is directly relevant to Data Integrity in Data Visualisation and the success of Enterprise Projects.
Problem-Solving Focus: Enterprise Computing is about solving business problems. When discussing concepts, always link them back to how they contribute to effective solutions.
Impact on Enterprise: Think about how each concept (e.g., Big Data, Remote Work, Iterative Approach) affects businesses in terms of efficiency, cost, decision-making, and competitive advantage.
Ethical Lens: For all data-related topics, be prepared to discuss ethical implications, social consequences, and legal responsibilities. Use Bias, Privacy, Copyright, and ICIP notes extensively.
Technological Trends: Understand the why behind technological evolution (e.g., faster CPUs enable faster data analytics).
Practical Examples: Use clear, simple real-world examples to illustrate points.
Directive Verbs: Be ready for prompts like "Explain how," "Assess the impact," "Justify". Structure answers with topic sentences and supporting details.
Terminology Precision: Use correct terminology accurately and consistently. A good glossary understanding is vital.
Diagrams (Mental or Drawn): For concepts like ERDs or Gantt Charts, have a mental picture of what they look like and their purpose.