Introduction to Information Science

Page 1: The Information Science Discipline

Nature of Information Science

Information science is an academic discipline and profession dealing with the collection, storage, retrieval, and use of recordable information.

Four Conceptions of Information Science (Buckland; Limberg):
  1. Computing and Data Science: Focused on algorithms and data handling.
  2. ICT Focus: Centered on information and communication technologies.
  3. Physical/Biological Entity: Information as an entity within physical or biological sciences.
  4. Meaning and Knowledge: Concern with information recorded in documents (growing from librarianship and documentation).

Formal Definitions
  • Saracevic (2010): Information science deals with the effective collection, storage, retrieval, and use of information. It is concerned with recordable knowledge and the technologies that facilitate management and use.
  • Paul Hirst’s "Field of Study" Perspective: Information science is a multidisciplinary field focused on recorded information, using various forms of knowledge (sociological, mathematical, philosophical) to study its central concept.

The Information Communication Chain (Robinson, 2009)

The uniqueness of the discipline lies in its focus on the entirety of the communication chain of recorded information:

  • Stages: Creation $\rightarrow$ Dissemination $\rightarrow$ Collection $\rightarrow$ Storage $\rightarrow$ Organisation $\rightarrow$ Description $\rightarrow$ Retrieval $\rightarrow$ Use $\rightarrow$ Preservation/Disposal.
  • Other disciplines (journalism, publishing, computer science) focus on specific components; information science addresses the totality and the interactions between them.

Overlaps with Other Disciplines

  1. Collection Disciplines: Librarianship and archive science (often grouped with information science as Library and Information Science, LIS, or Library and Information Management).
  2. Digital Humanities (DH): Integrating digital computing with humanities research; both focus on data curation and documentation.
  3. Technology Disciplines: Computer science overlaps in areas like Information Retrieval (IR) and Human-Computer Interaction (HCI).
  4. Data Science: Focused on extracting meaning from large datasets via data curation and management.
  5. Social Sciences: Overlap in studies of "Information Society," behavior, and policy.
  6. Communication and Media: Writing, abstracts, publishing, and information design.
  7. Management and Policy: Knowledge management (KM) and information governance.

History and Professionalization

  • 1808: Martin Schrettinger coins Bibliothekswissenschaft (Library Science).
  • 1895: Paul Otlet and Henri La Fontaine establish the International Institute of Bibliography, which later becomes the International Federation for Information and Documentation (FID).
  • 1937: American Documentation Institute (later ASIS&T) founded.
  • 1948: Royal Society conference addresses the "information explosion."
  • 1961: Jason Farradane pioneers modern training in the UK and coins the terms "information science" and "information scientist."
  • Current Trend: The iSchools movement (established 2005) aligns information departments with computing and technology.

Page 2: History of Information: The Story of Documents

Defining Document Evolution

History is divided into periods of evolving communication systems:

  • Prehistory: No information systems.
  • History: Systems support society.
  • Hyperhistory: Society is dependent upon and defined by information systems (Floridi, 2014).

Chronological Timeline

Ancient World
  • Proto-writing: 7,000 years ago (Tartaria tablets). Bone tags in Abydos (3320 BCE).
  • Cuneiform: Created in Uruk (3300 BCE) on clay tablets. The libraries at Sippar and of Assurbanipal (Nineveh) housed large, subject-classified archives.
  • The Alphabet: Phoenician alphabet (1000 BCE) influenced Greek and Aramaic scripts. Led to the first alphabetical document ordering.
  • Library of Alexandria: Aimed to house all human knowledge; developed the Pinakes (bibliographic tool).
Classical to Medieval
  • The Codex: Forerunner of the book; replaced the scroll (100–500 CE) due to convenience.
  • House of Wisdom (Baghdad): Ninth-century center for translation and scholarship.
  • Medieval Scholasticism: Focus on illuminated manuscripts in monasteries (Lindisfarne Gospels, Beowulf).
  • Paper: Reached Europe in the 12th century (from China via the Arabic world).
The Age of Print (Renaissance - 1789)
  • Johannes Gutenberg (1447): Movable-type printing press.
  • Effects: Identifiable bibliographies (Gesner’s Bibliotheca Universalis, 1545), wider dissemination, and scientific journals (1665).
Mass Communication (The Long 19th Century)
  • Industrial Revolution: Steam-powered presses (Koenig, 1814) increased output from 250 to 20,000 sheets/hour.
  • Technological Advances: Telegraph (1830s), Telephone (1870s), Photography (1840s).
  • Library Growth: Dewey Decimal Classification (1876); Library of Congress Subject Headings (1898).
The Documentation and Digital Age
  • 20th Century: Paul Otlet’s Traité de documentation (1934) and the emergence of microforms, punched cards, and Boolean logic.
  • Digital Shift: The World Wide Web and social media create a dramatic shift analogous to the invention of printing.

Page 3: Philosophies and Theories of Information

Philosophy and Ethics

  • Ontology: Studying the nature of information and existing entities.
  • Epistemology: Defining knowledge. Fundamental view: knowledge = justified, true belief.
  • Realist Perspective: Reality is objective; underlies the systems paradigm.
  • Constructivist Perspective: Reality is subjective; underlies the cognitive paradigm.

Key Philosophical Figures

  • Karl Popper’s Three Worlds:
      - World 1: Physical world.
      - World 2: Subjective mental world.
      - World 3: Objective communicable knowledge (libraries, books).
  • Luciano Floridi’s Philosophy of Information (PI):
      - Defines Data as difference/lack of uniformity.
      - Defines Information as well-formed, meaningful, and truthful data.
      - Concepts: Fourth Revolution, Infosphere, and Onlife.
  • Social Epistemology (Egan & Shera): Focus on how society as a group holds and creates knowledge.

Paradigms and Turns

  1. Systems Paradigm (Cranfield): Quantitative, experimental evaluation of retrieval systems.
  2. Cognitive Paradigm: Focuses on the user’s thought structure (Anomalous State of Knowledge - ASK).
        - Brookes' Equation: $K(S) + \Delta I = K(S + \Delta S)$
  3. Socio-Cognitive Paradigm: Knowledge is constructed within social/disciplinary contexts (Domain Analysis).
  4. Neo-documentary Paradigm: Focus on "Information-as-thing" (Buckland).
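
The Systems Paradigm above evaluates retrieval systems quantitatively. These notes do not name specific measures; precision and recall are the ones usually associated with Cranfield-style tests, so the sketch below assumes them. The function name and the document identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical test collection: the system returned four documents,
# and assessors judged three documents relevant to the query.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d7"],
)
```

In this example two of the four retrieved documents are relevant, and two of the three relevant documents were found, which shows how the two measures can differ for the same result set.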

Theory Classification (Gregor, 2006)

  • Type 1: Analyzing.
  • Type 2: Explaining.
  • Type 3: Predicting.
  • Type 4: Explaining and Predicting ("Grand Theories").
  • Type 5: Design and Action.

Page 4: The Nature of Information and Data

Shannon’s Information Theory

First formal quantitative measure of information, ignoring meaning and focusing on transmission efficiency.

  • Formula for H (Measure of Information):
    $H = -K \sum p_i \log p_i$
  • Information is equated to choice, surprise, and uncertainty.
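
A minimal sketch of the H formula above, assuming K = 1 and base-2 logarithms so that the result is in bits. The function name and the example distributions are illustrative assumptions, not part of the source.

```python
import math


def shannon_entropy(probabilities, k=1.0, base=2):
    """Compute H = -K * sum(p_i * log(p_i)).

    Outcomes with probability 0 are skipped, since they contribute
    nothing to the sum.
    """
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -k * sum(p * math.log(p, base) for p in probabilities if p > 0)


fair_coin = shannon_entropy([0.5, 0.5])    # maximum uncertainty for two outcomes
biased_coin = shannon_entropy([0.9, 0.1])  # less surprise, so less information
certain = shannon_entropy([1.0])           # no choice, so no information
```

The fair coin gives the largest value of the three, which matches the reading of information as choice, surprise, and uncertainty: the more predictable the source, the lower H.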

Biological and Physical Information

  • Information is Physical (Landauer): Must be instantiated in a physical system.
  • Negentropy: Information as the opposite of physical entropy (order vs. disorder).
  • Biology: Genetic information in DNA codes and biological signaling.

The Social World: The DIKW Hierarchy

Data $\rightarrow$ Information $\rightarrow$ Knowledge $\rightarrow$ Wisdom.

  • Information: Truthful semantic content (distinguished from misinformation/disinformation).
  • Relevance: Essential for retrieval (utility, topicality, pertinence).
  • Buckland’s Three Forms: Information-as-thing, Information-as-process, Information-as-knowledge.

Page 5: Documents and Documentation

Document Theory

  • Briet (1951): An object is a document if it is cataloged and shown as evidence (e.g., a photo of a star, an animal in a zoo).
  • Attributes of a Document:
      - Indexicality: Is about something.
      - Documentality: Has social power/agency.
      - Complementarity: Material, informational, and social properties.
      - Fixity: Stability over time.

The FRBR / LRM Model (Bibliographic Levels)

  1. Work: Intellectual creation (e.g., Hamlet).
  2. Expression: Combination of signs (e.g., the English text).
  3. Manifestation: Physical carrier (e.g., a specific 1980 edition).
  4. Item: Single exemplar (e.g., your specific copy).
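
The four levels above nest inside one another, which can be sketched as a small data structure. This is an illustrative simplification, not the full FRBR or LRM specification; all class and field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    owner: str  # a single exemplar, e.g. one person's copy


@dataclass
class Manifestation:
    carrier: str  # e.g. a particular printed edition
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    form: str  # e.g. the text in a given language
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str  # the abstract intellectual creation
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every physical exemplar that ultimately realises a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


hamlet = Work(
    "Hamlet",
    [Expression(
        "English text",
        [Manifestation("1980 edition", [Item("your copy"), Item("library copy")])],
    )],
)
```

Here `count_items(hamlet)` walks from the work down to the items, mirroring the way a catalogue can gather every copy of every edition under one work.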

Resource Levels

  • Primary: Original info (articles, tweets).
  • Secondary: Summarized (textbooks, reviews).
  • Tertiary: Pointers (bibliographies, directories).

Page 6: Domain Analysis and Information Organisation

Domain Analysis (Hjørland)

A socio-cognitive approach to understanding subject areas (disciplines).

  • 11 Approaches: Producing literature guides, special classifications, user studies, bibliometric studies, historical studies, terminology studies, etc.
  • Subject Specialism: Information professionals need to understand the "logic and language" of a domain (e.g., Chemistry vs. Law).

Knowledge Organisation Systems (KOS)

  • Controlled Vocabulary: Fixed terms to control language redundancy.
  • Classification Types:
      - Enumerative: Lists all known subjects (Dewey, Library of Congress).
      - Faceted: Uses facets (aspects) to build complex notations (Ranganathan’s Colon Classification).
  • Taxonomy: Hierarchical arrangement for specific environments.
  • Ontology: Formal description of entities and their relationships (e.g., OWL language).
  • Thesaurus: Includes Equivalence (Synonyms), Hierarchy (Broader/Narrower), and Association (Related terms).
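
The three thesaurus relationships above can be sketched as a lookup table. The relationship codes (UF for "used for", BT for broader term, NT for narrower term, RT for related term) follow common thesaurus practice; the entries and function names are hypothetical.

```python
# One preferred term with its equivalence (UF), hierarchy (BT/NT),
# and association (RT) relationships. Entries are made up for illustration.
THESAURUS = {
    "libraries": {
        "UF": ["book collections"],
        "BT": ["information services"],
        "NT": ["public libraries", "academic libraries"],
        "RT": ["archives"],
    },
}


def preferred_term(term):
    """Map a term to its preferred form, or None if it is not in the thesaurus."""
    term = term.lower()
    if term in THESAURUS:
        return term
    for preferred, relations in THESAURUS.items():
        if term in relations["UF"]:
            return preferred
    return None


def expand_query(term):
    """Replace a term with its preferred form plus all narrower terms."""
    preferred = preferred_term(term)
    if preferred is None:
        return [term.lower()]  # unknown terms pass through unchanged
    return [preferred] + THESAURUS[preferred]["NT"]
```

Calling `expand_query("book collections")` first resolves the synonym to the controlled term and then adds the narrower terms, which is one way a controlled vocabulary reduces language redundancy at search time.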

Metadata Types

  • Descriptive: Finding (Title, Author).
  • Administrative: Management (Acquisition date).
  • Rights: Legal (Copyright).
  • Structural: Linking units (Chapters to books).

Page 7: Technologies, Systems, and Data Science

Computer Architecture (Von Neumann)

  • CPU: Processor (Arithmetic and Logic Unit + Control Unit).
  • Memory: Working storage.
  • File Storage: Magnetic/Digital long-term storage.
  • Bus: Circuits linking components.
  • The Fetch-Execute Cycle: Core operational loop.
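
The fetch-execute cycle can be illustrated with a toy simulator. The four-instruction set (LOAD, ADD, STORE, HALT) and the single accumulator register are assumptions made for this sketch and do not correspond to any real processor. Instructions and data share one memory, as in the Von Neumann design.

```python
def run(memory):
    """Run the program stored in memory until a HALT instruction is reached."""
    accumulator = 0      # register holding the working value
    program_counter = 0  # address of the next instruction

    while True:
        opcode, operand = memory[program_counter]  # fetch
        program_counter += 1

        # decode and execute
        if opcode == "LOAD":
            accumulator = memory[operand]
        elif opcode == "ADD":
            accumulator += memory[operand]
        elif opcode == "STORE":
            memory[operand] = accumulator
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Addresses 0-3 hold instructions; addresses 4-6 hold data.
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
run(memory)
result = memory[6]  # sum of the values at addresses 4 and 5
```

Each pass through the loop is one cycle: the control unit fetches the instruction at the program counter, and the arithmetic and logic work happens in the branch that executes it.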

Software and Networks

  • Languages: Low-level (machine code), High-level (Python, Java), Scripting (JavaScript), Markup (HTML, XML).
  • Open Source (FOSS): Collaborative, free modification (GitHub, EPrints).
  • Artificial Intelligence (AI):
      - GOFAI: Logic-based expert systems.
      - Machine Learning: Statistical algorithms learning from patterns (Deep Learning/Neural Nets).

Data Wrangling

  • Web Scraping: Extracting info from websites via code or APIs.
  • Cleaning: Transforming "messy data" into "tidy data" using tools like OpenRefine or Regular Expressions (RegEx).
  • Visualisation: NodeXL, VOSviewer, Word Clouds.
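
As a small illustration of the cleaning step, the sketch below uses Python's standard `re` module to turn inconsistently formatted name-and-year strings into uniform records. The input strings and the `tidy` function are hypothetical, and real data would need more rules than these.

```python
import re

YEAR = re.compile(r"\b(\d{4})\b")  # any standalone four-digit number


def tidy(raw):
    """Normalise one messy 'name + year' string into a dict."""
    text = re.sub(r"\s+", " ", raw).strip()  # collapse runs of whitespace
    match = YEAR.search(text)
    year = int(match.group(1)) if match else None
    name = YEAR.sub("", text)                     # drop the year
    name = re.sub(r"[()\[\]]", "", name).strip()  # drop leftover brackets
    return {"name": name.title(), "year": year}


messy = ["  Otlet,  Paul (1934) ", "GESNER, conrad [1545]", "Dewey, Melvil 1876"]
records = [tidy(entry) for entry in messy]
```

After cleaning, every record has the same two fields in the same form, which is the property that makes data "tidy" enough to sort, count, or visualise.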

Page 8: Management, Policy, and Information Behavior

Information Behavior Models

  • Wilson’s Model: Contextual nature of seeking, searching, and use.
  • Ellis’s Features: Starting, Chaining, Browsing, Monitoring, Differentiating, Extracting, Verifying, Ending.
  • Kuhlthau’s Information Search Process: Cognitive/Affective stages: initiation, selection, exploration, formulation, collection, presentation.

Information Management (IM) and KM

  • Knowledge Management (KM): Handling "Tacit Knowledge" (know-how).
  • Records Management: Managing records as evidence of business processes.
  • Information Governance: Ensuring legal/ethical compliance (GDPR).
  • Evaluations:
      - ROI (Return on Investment): Demonstrating financial value.
      - Impact Analysis: Showing real change in user state or life (ISO 16439).

Information Law and Ethics

  • Key Laws: Intellectual Property (Copyright), Freedom of Information (FOI), Data Protection.
  • Ethical Frameworks:
      - Utilitarianism: Outcome-based.
      - Deontology: Duty-based.
      - Communitarianism: Community-based (Social Justice).
  • Privacy: Managing "Informational Friction" and protecting personal identity.

Digital Literacies

  • Gilster’s View: Mastery of concepts/ideas over technical keystrokes.
  • Information Literacy: Recognizing needs, locating, and evaluating info.
  • Evaluating Mis- and Disinformation: Lateral reading (checking outside the source) vs. checklists.

Page 9: Research Methods

Common Research Styles

  1. Desk Research: Literature reviews, Meta-analysis, Bibliometrics.
  2. Experiment: Controlled tests of system interfaces/algorithms.
  3. Surveys: Questionnaires (Quantitative) or Semi-structured interviews (Qualitative).
  4. Sampling: Random, Purposive (Judgement), or Convenience (Snowball).
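
The sampling approaches listed above differ in how participants are chosen, which a short sketch can make concrete. The population, the role labels, and the function names are invented for illustration; a fixed seed is used only so the random draw is repeatable.

```python
import random

# Hypothetical population of 30 potential participants.
population = [
    {"id": i, "role": "librarian" if i % 3 == 0 else "student"}
    for i in range(30)
]


def random_sample(pop, n, seed=0):
    """Every member has an equal chance of selection."""
    return random.Random(seed).sample(pop, n)


def purposive_sample(pop, criterion):
    """The researcher deliberately selects members who meet a criterion."""
    return [person for person in pop if criterion(person)]


def convenience_sample(pop, n):
    """Take whoever is easiest to reach; here, simply the first n listed."""
    return pop[:n]


chosen_at_random = random_sample(population, 5)
librarians_only = purposive_sample(population, lambda p: p["role"] == "librarian")
first_available = convenience_sample(population, 5)
```

Only the random sample supports statistical generalisation to the whole population; the other two trade that for relevance or ease of access.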

Page 10: The Future of Information Science

Predictions

  • Curators of the Infosphere: The role of the information professional is evolving from "keeper"/"gatekeeper" to the curator of semantic capital.
  • Onlife Sustainability: Adapting to the merger of physical and digital realities.
  • Interdisciplinarity: Closer alignments with Media studies, Data Science, and Digital Humanities.