Cambridge International AS Level Information Technology
Data and Information Notes
Introduction to Data and Information
Data Definition: Data refers to raw facts and figures that have no meaning on their own. Strictly, "data" is the plural of "datum", but it is commonly used in both singular and plural contexts.
Information Definition: When data is processed and context is added, it becomes information. Information is data that has been given meaning, often through processing by a computer.
Examples: Sets of data, such as postal codes or area codes, become information when context is added, for example stating that they are the postal codes of a particular country or the dialling codes used for telecommunication.
Data Processing
How Data is Stored: Data is stored in binary digits, represented as 1s and 0s. Various storage media include hard drives, solid-state drives, and memory cards.
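As a minimal illustration of the point above, this Python sketch shows how each character of a piece of text is ultimately held as a pattern of binary digits:

```python
# Each character is stored as a numeric code, which the hardware
# holds as a pattern of binary digits (bits).
text = "Hi"
bits = [format(ord(ch), "08b") for ch in text]  # 8-bit binary per character
print(bits)  # ['01001000', '01101001']
```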
Processing Purposes: Data is processed to analyse it and to produce new data or information whose results can be interpreted and acted upon. For instance, opening a .csv file and applying spreadsheet formulas turns the raw data into usable information.
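The .csv idea can be sketched in Python (the column names and figures here are hypothetical): raw rows are read in and processed into summary figures, turning data into information.

```python
import csv
import io

# Raw data: the numbers alone carry no meaning.
raw = "month,sales\nJan,100\nFeb,150\nMar,125\n"

reader = csv.DictReader(io.StringIO(raw))
sales = [int(row["sales"]) for row in reader]

# Processing adds context: the figures become information.
total = sum(sales)
average = total / len(sales)
print(f"Total sales: {total}, average per month: {average:.1f}")
```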
Types of Data
Direct and Indirect Data
Direct Data: Collected specifically for a particular purpose, such as through questionnaires, interviews, or logging data in real-time scenarios.
Indirect Data: Obtained from third-party sources and used for different purposes than originally intended, such as electoral registers or business databases.
Data Sources and Collection Methods
Direct Data Collection Sources
Questionnaires: Consist of sets of questions used to gather specific information from respondents. They can include both closed and open-ended questions and can be distributed on paper or digitally.
Interviews: Formal discussions that gather detailed information about a subject from interviewees, which can be structured or unstructured.
Observation: Collecting data by watching events as they occur.
Data Logging: Involves automatic collection of data via sensors and computer systems, commonly used in scientific monitoring.
Indirect Data Sources
Weather Data: Collected and used by services such as the Met Office for forecasting, which can later be sourced and used by other entities like construction companies.
Electoral Registers: Lists of eligible voters that can be used for various practical purposes but are collected for a specific governmental purpose.
Research Sources: Textbooks, journal articles, and related online resources that provide indirect information for research purposes.
Census Data: Collected nationwide by governments to build a picture of the population; originally gathered for governmental planning, it is often used indirectly for other analyses.
Data Quality
Quality of Information: The quality of information refers to how accurate, relevant, and complete the data is when processed into information. Poor quality data can result in incorrect conclusions and decisions.
Characteristics Affecting Quality
Accuracy: Information should be error-free, and accuracy checks should be implemented during data collection and processing.
Relevance: The data being collected must be pertinent to the purpose of its intended use. Collecting irrelevant data wastes resources.
Age: Information should be current; outdated data can lead to poor decisions based on inaccurate predictions or trends.
Level of Detail: The right amount of detail is crucial; too much can overwhelm the user, and too little can lead to insufficient understanding.
Completeness: High-quality information must cover all relevant aspects of a particular problem to avoid gaps that might mislead users.
Encryption
Importance of Encryption
Need for Encryption: To protect personal information from being intercepted during transmission over the Internet, various encryption techniques are implemented. Encryption scrambles data such that only authorized users can decode it, thus ensuring information confidentiality.
Methods of Encryption
Symmetric Encryption: Both sender and recipient use a common key for both encryption and decryption, making it faster, but it presents security risks if the key is intercepted.
Asymmetric Encryption: Involves a pair of keys (public and private) for encryption and decryption, commonly used in many secure online transactions today.
Encryption Protocols: Specific sets of algorithms and rules (such as SSL/TLS and IPsec) guide how encryption is applied to secure communications.
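The shared-key idea behind symmetric encryption can be sketched with a toy XOR cipher in Python. This is for illustration only: real systems use proper algorithms such as AES, but the principle is the same in that one shared key both scrambles and unscrambles the data.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with the repeating key; applying the same key
    # again reverses the operation, so one key both encrypts and decrypts.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"                            # shared by sender and recipient
message = b"Meet at noon"
ciphertext = xor_cipher(message, key)      # scrambled, unreadable bytes
plaintext = xor_cipher(ciphertext, key)    # the same key recovers the message
print(plaintext)  # b'Meet at noon'
```

This also shows the risk noted above: anyone who intercepts the key can decrypt everything.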
Validation and Verification
Importance of Validation and Verification
Validation is used to ensure that the data entered is reasonable and sensible. For example, limiting numeric fields to realistic ranges reduces errors.
Verification confirms that data has been entered accurately and matches the source document, mitigating the risk of human error during data entry processes.
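Both checks can be sketched in Python. The range limits and field names here are assumptions for illustration: a range check validates that an age is sensible, and double-entry verification confirms that two typed copies of the same value match.

```python
def validate_age(value: str) -> bool:
    # Validation (range check): accept only ages in a realistic
    # range (assumed here to be 0-120).
    return value.isdigit() and 0 <= int(value) <= 120

def verify_double_entry(first: str, second: str) -> bool:
    # Verification (double entry): both typed copies must match exactly.
    return first == second

print(validate_age("34"))    # True  - sensible value
print(validate_age("250"))   # False - outside the realistic range
print(verify_double_entry("AB12 3CD", "AB12 3CD"))  # True - entries match
```

Note the difference: validation can only say the data is reasonable; verification checks it against the source, but neither guarantees the data is actually correct.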
Types of Processing
Batch Processing
Definition: In batch processing, large volumes of data are collected and processed as a single batch without user interaction. Common applications include payroll systems and utility billing.
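A payroll run can be sketched as follows (the records and pay rates are hypothetical): all the timesheets collected over a period are processed in a single run with no user interaction.

```python
# Hypothetical timesheet records collected during the week.
timesheets = [
    {"name": "Ali", "hours": 38, "rate": 12.0},
    {"name": "Bea", "hours": 40, "rate": 15.5},
    {"name": "Che", "hours": 35, "rate": 11.0},
]

# The whole batch is processed in one run, with no user interaction.
payslips = [
    {"name": t["name"], "pay": t["hours"] * t["rate"]} for t in timesheets
]
for slip in payslips:
    print(f"{slip['name']}: {slip['pay']:.2f}")
```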
Online Processing
Definition: Online processing allows users to input data that is processed almost immediately. It is key in applications requiring up-to-date transactions, such as banking systems.
Real-time Processing
Definition: Real-time processing handles data continuously, as soon as it is generated, and is often used in critical monitoring and control systems (e.g. temperature or pressure control).
Conclusion
Understanding the distinctions between data and information, direct and indirect data sources, processing methods, quality factors, and security measures is essential in the field of information technology and data management.