Bioinformatics flashcards

Page 1: Functional Variant Analysis in Genomics

Key Components of Genomic Analysis

  • UCL Ensembl GWAS Locus: Identification of genomic regions linked to diseases through Genome-Wide Association Studies (GWAS).

  • Causal Variants: Variants likely responsible for observed associations in GWAS data.

  • Signal Peptide: Short peptide sequence, usually at the N-terminus, that directs the transport of a newly made protein.

  • Genomic Protein Domain Annotation: Characterizing functional domains within proteins based on genomic data.

  • Cellular Localisation: Identification of where proteins are located within a cell.

  • Variant Severity and Annotation: Use of tools such as SIFT, PolyPhen-2, and CADD to predict the impact of variants on protein function.

  • Functional Annotations: Using resources such as UniProt and GRAIL to analyze the implications of variants for gene function.

Page 2: Introduction to Bioinformatics

Objectives

  • Provide a brief introduction to Bioinformatics.

  • Define Functional Annotation in a genomic context and in other domains.

  • Describe resources and workflows involved in Functional Annotation.

Page 3: Types of Annotation

Structural vs Functional Annotation

Structural Annotation

  • Refers to identifying the location of various genomic elements.

Functional Annotation

  • Involves assigning biological functions to genomic features identified in structural annotations.

Page 4: Context of Bioinformatics

What is Bioinformatics?

  • An interdisciplinary field combining biology, computer science, and statistics for analyzing biological data.

Types of Annotation

  • Nucleotide-Level Annotation: Details on specific sequences.

  • Protein-Level Annotation: Investigating protein sequences and functions.

  • Process-Level Annotation: Involves understanding biological processes.

Page 5: Molecular Biology Focus

Investigating Cellular Functions

  • Molecular biology aims to understand cellular processes.

  • Utilizing bioinformatics to study the role of proteins in cells.

Page 6: Bioinformatics Resources

Limitations of Search Engines

  • Why not use standard search engines?

  • Need for specialized resources, such as the NCBI databases and PubMed, for in-depth genetic information.

COL4A3 Example

  • The COL4A3 gene provides instructions for making one component (the alpha-3 chain) of type IV collagen, which is important for structural integrity.

Page 7: Importance of Biological Databases

Features of Biological Databases

  • Manually curated and standardized data for ease of computation.

  • Searchable and transparent linking to other resources.

Page 8: Types of Data in Genomics

Data Types Necessary for Investigation

  • Genomic and mRNA Sequences of COL4A3.

  • Review of protein sequences and variants related to disease.

Page 9: NCBI Gene Information

Exploring Gene Context

  • Provides essential details about COL4A3, including gene ID and variations.

  • Summarizes genomic context, including its roles in diseases like Alport syndrome.

Page 10: Published Information on COL4A3

Literature and Data Availability

  • Identification of papers describing functions and variants of COL4A3.

  • Comparison with similar proteins for functional insight.

Page 11: Further Gene Information

Summary of Gene Data

  • Contains details on mutations relevant to health conditions.

  • Discusses interactions and pathways related to COL4A3.

Page 12: PubMed Search for COL4A3

Finding Relevant Literature

  • Summarizing various studies and findings associated with COL4A3.

  • Clinical implications and genetic studies featured.

Page 13: Gene Expression Analysis

Investigating Tissue Expression of COL4A3

  • Analyzing where COL4A3 is expressed and its relevance to tissue-specific studies.

Page 14: GTEx Portal and Gene Expression

GTEx Data for COL4A3

  • Accessing tissue expression data and exploring expression patterns in various tissues.

Page 15: Expression Statistics of COL4A3

Summary of Expression Findings

  • Comparison of expression levels across different tissues, particularly kidney tissues important for COL4A3 studies.

Page 16: Protein Information

Protein Analysis of COL4A3

  • Investigating the sequence and functional domains of COL4A3 protein.

  • Relating its biological roles to structural insights.

Page 17: UniProt for Protein Information

Utilizing UniProt for COL4A3

  • Accessing sequence and function information through the UniProt database, highlighting features of type IV collagen.

Page 18: Linking UniProt to Enzymes and Pathways

Connecting to Other Resources

  • Exploring links from the COL4A3 entry to pathway and interaction databases that can be used to validate findings.

Page 19: Detailed UniProt Information

Insights on Isoforms and Variants

  • Summarization of variants and their links to diseases with detailed background on structure and function of COL4A3 sequences.

Page 20: Understanding Protein Families

COL4A3 and Protein Family Members

  • Exploring the classification of COL4A3 within collagen families and how it fits into broader protein interactions.

Page 21: Utilization of Protein Domains

Resource Utilization for Protein Domains

  • Investigating protein features associated with COL4A3 and implications in health.

Page 22: InterPro as a Resource

Using InterPro for Protein Family Analysis

  • Understanding the structural diversity in collagen proteins and their various functions.

Page 23: Detailed Protein and Domain Information

Exploring Domain Structures

  • Utilizing databases like Pfam to obtain structural and functional annotations relevant to COL4A3.

Page 24: Structural Insights into COL4A3

3D Structures

  • Investigating crystal structures and predictions provided by various structural databases for insights into molecular functions.

Page 25: Swiss-Model Repository and Protein Structure

Using Swiss-Model for Structural Information

  • Analyzing comparative models of protein structures, emphasizing structure-function relationships.

Page 26: Viewing Different 3D Models in Detail

3D Structural Models Overview

  • Overview and availability of 3D protein models, exploring implications for understanding structure related to function.

Page 27: Introduction to Chemical Entities

Understanding Chemical Biology

  • Overview of chemical entities associated with biological data and relationships in COL4A3 studies.

Page 28: Disease Associations with COL4A3

Studying related phenotypes

  • Identifying diseases linked to variants in COL4A3 using various databases like OMIM and ClinVar.

Page 29: OMIM and Disease Exploration

Interpreting Gene-Phenotype Relationships

  • Using OMIM to categorize genetic diseases and conditions linked to COL4A3 genetic information.

Page 30: Phenotypic Associations in Genetics

Investigating Genetic Variants

  • Detailed relationships between variants in COL4A3 and genetic conditions such as forms of Alport syndrome.

Page 31: Utilizing Open Targets for Genetics

Exploring Targeted Disease Approaches

  • Understanding how databases like Open Targets help identify disease pathways associated with COL4A3 expression.

Page 32: Protein Interaction Analysis

Interactions of COL4A3 in Biological Processes

  • Investigating interactions of COL4A3 protein with other proteins using databases like IntAct and STRING.

Page 33: Building Protein Interactions Network

Networking with Bioinformatics Tools

  • Presenting interaction networks for COL4A3 and discussing findings relevant to protein interactions.

Page 34: Collaboration Among Related Protein Studies

Utilizing STRING Database

  • Exploring data about various interactor proteins engaged in the network with COL4A3.

Page 35: Data Reliability in Bioinformatics

Assessing Resource Credibility

  • Evaluation criteria for the reliability of data derived from bioinformatics resources, including curation practices.

Page 36: Understanding Object Names and Identifiers

Clarity in Biological Naming Conventions

  • Importance of object names and identifiers in biological databases.

Page 37: Gene Name Confusions

Resolving Ambiguities in Gene Naming

  • Discussion on various aliases attributed to genes and how to manage them for clarity.

Page 38: Unique Gene Names

Addressing Duplicate Naming Issues

  • Exploration of distinct genes that share similar aliases and the importance of distinct nomenclature.
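
The alias problem above can be sketched as a lookup table. This is a minimal illustration, assuming a small hand-built table; the gene symbols and aliases are hypothetical placeholders, not real HGNC data.

```python
# Hypothetical alias table: approved symbol -> known aliases (placeholders only).
ALIASES = {
    "GENE1": ["ALIAS_A", "ALIAS_B"],
    "GENE2": ["ALIAS_C"],
    "GENE3": ["ALIAS_A"],  # shares an alias with GENE1 to show an ambiguous case
}


def build_alias_index(aliases):
    """Map every name (approved or alias, upper-cased) to the approved symbols using it."""
    index = {}
    for approved, names in aliases.items():
        for name in [approved, *names]:
            index.setdefault(name.upper(), set()).add(approved)
    return index


def resolve(name, index):
    """Return the sorted approved symbols a name could refer to (empty list if unknown)."""
    return sorted(index.get(name.upper(), set()))


INDEX = build_alias_index(ALIASES)
print(resolve("alias_c", INDEX))  # one approved symbol
print(resolve("ALIAS_A", INDEX))  # two approved symbols: the alias is ambiguous
```

A name that resolves to more than one approved symbol is exactly the case where the approved symbol or a database identifier should be used instead of the alias.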

Page 39: The Role of HGNC in Nomenclature

Ensuring Biological Accuracy

  • How the HUGO Gene Nomenclature Committee (HGNC) assigns and maintains a unique approved symbol and name for each human gene.

Page 40: UniProt ID Activity

Engaging with UniProt Database

  • Instructions to search UniProt IDs for relevant protein information through specified letter mappings.
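
Before searching, it can help to check that a string at least looks like a UniProt accession. The sketch below uses the accession pattern documented by UniProt; the function name `is_uniprot_accession` is an assumption for illustration, and the check covers format only, not whether the entry exists.

```python
import re

# Accession format as documented by UniProt: 6 or 10 alphanumeric characters.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text):
    """Return True if text has the shape of a UniProt accession (format check only)."""
    return ACCESSION_RE.fullmatch(text.strip().upper()) is not None


print(is_uniprot_accession("Q01955"))  # accession-shaped string
print(is_uniprot_accession("COL4A3"))  # a gene symbol, not an accession
```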

Page 41: Quiz Format on UniProt IDs

Interactive Learning with Gene Names

  • Engaging users in an anagram game to familiarize them with UniProt IDs related to gene names.

Page 42: Searching Gene Names in UniProt

Practice Searching for Gene Information

  • Challenges learners to search for gene names from given UniProt IDs.

Page 43: Trying Different UniProt IDs

Clarification on Protein Species

  • Importance of recognizing protein species for accurate data interpretation.
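
UniProtKB entry names carry a species code after the underscore (for example `CO4A3_HUMAN`), so the species can be read straight from the name. A minimal sketch, with `split_entry_name` as an assumed helper name:

```python
def split_entry_name(entry_name):
    """Split an entry name such as 'CO4A3_HUMAN' into (protein code, species code)."""
    protein, sep, species = entry_name.strip().upper().partition("_")
    if not sep or not protein or not species:
        raise ValueError(f"not an entry name: {entry_name!r}")
    return protein, species


for name in ["CO4A3_HUMAN", "CO4A3_MOUSE"]:
    protein, species = split_entry_name(name)
    print(f"{name}: protein code {protein}, species code {species}")
```

Two entries with the same protein code but different species codes are orthologous records, not the same protein, which is why the species should be checked before data are interpreted.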

Page 44: Utilization of Approved Nomenclature

Best Practices in Scientific Communication

  • Guidelines on using approved nomenclature when discussing proteins and genes in literature.

Page 45: Cellular Location of COL4A3

Questions on Cellular Functions

  • Inquiries into the localization and roles of COL4A3 protein within cellular contexts.

Page 46: Gene Ontology Overview

Understanding GO Resources

  • Using Gene Ontology to explore the roles, locations, and functions of genes.

Page 47: Defining an Ontology

Structuring Biological Knowledge

  • Explanation of how ontologies organize biological terms and definitions.

Page 48: Components of an Ontology

Building Blocks of Ontological Structures

  • Key elements include vocabulary terms, defined relationships, and structured definitions.

Page 49: Purpose of Ontologies

Capturing Biological Knowledge

  • Designed to convey biological knowledge in a computable format for gene analysis.

Page 50: Scope of Gene Ontology

Comprehensive Description of Genes

  • Framework employed by GO to describe gene product attributes, including functions and locations.

Page 51: Goals of Gene Ontology

Objectives for Gene Annotation

  • Aims to compile controlled vocabularies of molecular biology terms for annotating gene products.

Page 52: Main Domains of Gene Ontology

Breakdown of Gene Attributes

  • Three key aspects: Molecular Function, Biological Process, and Cellular Component.

Page 53: Molecular Function Description

Task-Based Activity of Proteins

  • Examples include enzyme activities, protein-protein interactions, and transcription functions.

Page 54: Biological Processes Overview

Series of Biochemical Events

  • Describing the sequences of events that constitute cellular processes.

Page 55: Organ Processes and Development

Specific Biological Examples

  • Breakdown of developmental processes through recognized events related to organ systems.

Page 56: Cellular Component Locations

Where Gene Products Reside

  • Identifying specific cellular locations of gene products through ontology descriptions.

Page 57: GO Terms and Definitions

Overarching GO Structure

  • Insight into how terms in GO assist in understanding specific biological operations.

Page 58: Basement Membrane Identity

Example of Specific GO Terms

  • Highlighting definitions and synonyms associated with important biological terms.

Page 59: Nature of GO as an Ontology

Structuring Biological Terms

  • Discussing the hierarchical and relational aspects of biological terms within ontologies.

Page 60: Graph Structure of Ontologies

Directed Acyclic Graph Usage

  • Explanation of how GO is structured as a directed acyclic graph: terms are nodes, relationships are directed edges, and a term may have more than one parent.
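
A directed acyclic graph differs from a simple tree in that a term can have several parents. The sketch below walks such a graph to collect all ancestors of a term; the term names are hypothetical placeholders, not real GO terms.

```python
# Hypothetical child -> parents edges; term_d has two parents, so this is a DAG, not a tree.
PARENTS = {
    "term_d": ["term_b", "term_c"],
    "term_b": ["term_a"],
    "term_c": ["term_a"],
    "term_a": [],
}


def ancestors(term, parents):
    """Return every term reachable by following parent edges upward from term."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("term_d", PARENTS)))
```

Because term_a is reached through both term_b and term_c, the `seen` set is what keeps it from being visited twice.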

Page 61: Necessity for Ontologies

Addressing Biological Data Challenges

  • Importance of ontologies in clarifying inconsistencies in biological terminology and data.

Page 62: Application of Ontologies

Language Clarity in Biology

  • Acknowledging various interpretations of terms and their implications within biological contexts.

Page 63: Need for Ontological Structures

Navigating Biological Information

  • Challenges faced by researchers in organizing and sharing biological data effectively.

Page 64: Scientific Publishing and Ontologies

Navigating Data Presentation

  • Conventions in publishing related to genetic and protein information to ensure clarity.

Page 65: Research Utilization of Genetic Databases

Advancements in Searching Literature

  • Exploring the functionalities of databases in enhancing research quality.

Page 66: Standardizing Scientific Communication

Addressing Biological Data Uncertainty

  • Importance of structured vocabulary use in ensuring best practices in reporting genetic data.

Page 67: GO Objectives for Gene Annotations

Comprehensive Approach to Annotations

  • Utilizing organized vocabularies to clearly annotate gene functions.

Page 68: Linking Disease Data with GO Terms

Enhancing Relevance of Ontologies

  • Providing pathways connecting genetic symptoms, diseases, and their relationships.

Page 69: Descriptive Statement Usage in GO

Specificity in Gene Functional Claims

  • Importance of clear assertions regarding gene product functionalities within analyses.

Page 70: Annotations of Specific Genes

Annotational Examples in GO

  • Presentation of different gene annotations indicating their roles and relationships.

Page 71: GO Annotation Methodology

Providing Context of Gene Product Interactions

  • Detailing methods employed to create and support gene ontology annotations based on studies.

Page 72: Coding Annotations and Tips

Types of Annotations Used

  • Description of the different evidence codes (e.g. IDA, IEA) that record the kind of support behind each functional annotation.
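
GO annotations carry an evidence code, and a common first step is to separate experimentally supported annotations from electronically inferred ones. A minimal sketch, assuming annotations are held as simple tuples; the gene and term names are hypothetical placeholders.

```python
# GO experimental evidence codes; IEA marks electronically inferred annotations.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def evidence_category(code):
    """Classify an evidence code as 'experimental', 'electronic', or 'other'."""
    if code in EXPERIMENTAL:
        return "experimental"
    if code == "IEA":
        return "electronic"
    return "other"


# Hypothetical records: (gene, GO term label, evidence code).
annotations = [
    ("GENE1", "term_x", "IDA"),
    ("GENE1", "term_y", "IEA"),
    ("GENE2", "term_x", "ISS"),
]


def experimental_only(records):
    """Keep only the records backed by an experimental evidence code."""
    return [r for r in records if evidence_category(r[2]) == "experimental"]


print(experimental_only(annotations))
```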

Page 73: Practical Applications of GO

Experimental Results in GO

  • Outline of key studies demonstrating the functional relevance of gene annotations in research.

Page 74: Principles of Grouping Genes in GO

Hierarchical Grouping of Gene Functions

  • Illustrating connections between genes through child and parent term associations.

Page 75: Gene Grouping Approaches

Utilizing Directed Hierarchies

  • Applying hierarchical organization to ascertain gene relationships across different biological categories.

Page 76: Gene Group Analysis in GO

Functional Categories and Distribution

  • Analysis of gene populations based on GO annotations provides broad insights into function distributions.
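
Grouping genes by GO term usually counts a gene under each term it is annotated to and under every ancestor of that term. The sketch below shows that propagation and the resulting counts; all gene and term names are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical child -> parents edges and direct gene annotations.
PARENTS = {"child_term": ["parent_term"], "parent_term": ["root_term"], "root_term": []}
DIRECT = {"GENE1": {"child_term"}, "GENE2": {"parent_term"}, "GENE3": {"child_term"}}


def with_ancestors(term, parents):
    """Return the term itself plus every ancestor reachable through parent edges."""
    result = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def genes_per_term(direct, parents):
    """Count distinct genes per term after propagating annotations to ancestors."""
    genes = defaultdict(set)
    for gene, terms in direct.items():
        for term in terms:
            for t in with_ancestors(term, parents):
                genes[t].add(gene)
    return {term: len(members) for term, members in genes.items()}


print(genes_per_term(DIRECT, PARENTS))
```

Broader terms collect more genes than specific ones, which is why the choice of level in the hierarchy changes how a function distribution looks.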

Page 77: Role of Specific Protein Associations

Pathological Correlations in Genetic Data

  • Discussing the physiological relevance of identified proteins in diseases such as polycystic kidney disease (PKD).

Page 78: Cross-Species GO Term Utilization

Universality Across Species

  • Insights into gene annotations across diverse biological species facilitating shared understanding.

Page 79: Utilizing GO in Research

Applications in Dataset Evaluation

  • Importance of GO in validating histological and functional data through high-throughput approaches.

Page 80: Access to Gene Product Functional Information

Summarizing Biological Roles

  • GO gives researchers access to information on gene product functions and processes.

Page 81: Flow of Gene Annotations

Processing High-Throughput Data

  • GO provides a structure that facilitates the analysis and interpretation of large genomic data sets.

Page 82: Discovering Gene Activity in Various Contexts

Tools for Transcriptomic Studies

  • Overview of transcriptomic analyses and their significance in evaluating gene behavior under different conditions.

Page 83: Proteomic Results Validation

Ensuring Accuracy in Results

  • Practical strategies and studies confirming proteomic results with proper citation and data references.

Page 84: Identifying Dysregulated Biological Processes

Investigating Clinical Phenomena

  • Correlation of data patterns and biological insights derived from gene studies in clinical contexts.
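
One widely used way to flag a dysregulated process is an over-representation test: does a GO term appear in a gene list more often than expected by chance? The notes do not name a specific method, so the hypergeometric test below is an assumption, shown as a minimal sketch with assumed parameter names.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    when K of the N background genes carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Illustrative numbers only: 100 background genes, 10 annotated to the term,
# a list of 5 genes of which 3 are annotated.
print(hypergeom_upper_tail(3, 5, 10, 100))
```

A small probability suggests the term is over-represented in the list; when many terms are tested, the values need correcting for multiple testing.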

Page 85: Gene Expression Studies

Data Representation in Gene Studies

  • Visualization of gene expression and its variation between clinical samples to derive insights.

Page 86: Correlational Analysis of Gene Expressions

High Throughput Data Analysis Outcomes

  • Summarizing findings on gene expression across differing conditions and physiological states.

Page 87: Utilizing Network Associations

Exploring Gene Associations

  • Analysis of interactions demonstrating how genes and proteins operate within functional networks.

Page 88: Accessing Gene Functional Information Resources

Informational Tools for Genetic Analysis

  • Overview of various databases and platforms available for accessing gene and protein information.

Page 89: Finding GO Annotations in Databases

Multiple Resource Availability

  • Outlining how to efficiently locate GO annotations across different genomic databases.

Page 90: Accessing Comprehensive Gene Reports

Importance of Gene Data Presentation

  • Procedures for navigating through extensive gene information provided by NCBI and other databases.

Page 91: Exploring High Throughput GO Data

Organizational Queries Across GO Terms

  • Demonstrating how GO terms can be used in exploring related genes and their functions.

Page 92: Necessity of Accurate Annotation Practices

Importance of Curation

  • Highlighting the pivotal role of accurate annotations in interpreting genomic data correctly.

Page 93: Human Phenotype Ontology Use

Applications in Genetic Analysis

  • How phenotype ontologies assist in genetic research for clear associations with disorders.

Page 94: Investigative Uses of HPO

Linking Genes to Phenotypes

  • Exploratory tools for examining the relationships between genetic elements and their physiological manifestations.

Page 95: Mapping Genes to Disease Phenotypes

Annotations of Associative Phenotypes

  • Efforts to connect gene information with clinical representations for comprehensive understanding.

Page 96: Continued Exploration of Genes and Phenotypes

Restrictions in Current Data

  • Current state of data related to gene-phenotype relationships for enhancing research knowledge.

Page 97: Questions and Support Contact Information

Inquiries and Assistance

  • Providing contact information for questions and collaborations in genomic research.