Bioinformatics Techniques for Protein Identification and Analysis
Overview of Bioinformatics in Protein Identification
- Aim: Identify an unknown protein (LAB1) involved in cancer survival.
- Techniques: Sequence-specific DNA affinity chromatography and Edman degradation peptide sequencing.
- Outcome: Obtained amino acid sequence of potential DNA binding protein (Query1).
Query Sequence Details
- Amino Acid Sequence (Query1):
TQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFVLVHPTTNLQPAQFIISQTPQ…
- Sequence length: 50+ amino acids.
Part 1: Performing a BLAST Search
- BLAST (Basic Local Alignment Search Tool): A bioinformatics tool for comparing an input sequence against a database to identify homologous sequences.
- Steps for BLASTP search:
- Open the NCBI BLAST website in a web browser.
- Select 'Human' as the organism.
- Choose the BLASTP option (for protein sequences).
- Paste Query1 into the provided input box.
- Click on the BLAST button to start the search.
- After the search completes, review the Graphic Summary and Descriptions tabs for results.
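The same search can also be described programmatically. The sketch below only builds the form data for a BLASTP submission to NCBI's public URL interface and sends nothing. It assumes the 'nr' protein database and a 'Homo sapiens[Organism]' Entrez filter as stand-ins for the web form choices above; the function name is hypothetical, and only the visible part of Query1 is used because the sequence in these notes is truncated.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST URL interface (shown for reference; no request is sent here).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"

# Visible portion of Query1; the notes truncate the sequence with an ellipsis,
# so the missing residues are deliberately not guessed.
QUERY1_VISIBLE = "TQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFVLVHPTTNLQPAQFIISQTPQ"


def build_blastp_submission(sequence, organism="Homo sapiens", database="nr"):
    """Return URL-encoded form data for a BLASTP 'Put' (submit) request.

    The database and organism defaults are assumptions, not values from the notes.
    """
    params = {
        "CMD": "Put",                              # submit a new search
        "PROGRAM": "blastp",                       # protein query vs protein database
        "DATABASE": database,
        "ENTREZ_QUERY": f"{organism}[Organism]",   # restrict hits to one organism
        "QUERY": sequence,
    }
    return urlencode(params)


form_data = build_blastp_submission(QUERY1_VISIBLE)
print(form_data)
```

Posting this form data to the endpoint would return a request ID that is then polled for results; for a one-off identification such as LAB1, the web form is usually simpler.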
Analyzing Results
Key Metrics Explained:
- Query Cover: The percentage of the query sequence length that is included in the alignment with a database sequence.
- E-value: The number of matches with an equal or better score expected by chance in a database of this size; lower values indicate a more significant match.
- Bit Score: Reflects alignment quality; higher scores mean better alignment.
- Accession Number: Unique identifier for each entry in the database.
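To make the metrics concrete, here is a minimal sketch of how Query Cover and the E-value relate to the underlying numbers. It uses the standard relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length. BLAST itself uses edge-corrected "effective" lengths, so values from this sketch will not exactly match a real report; the function names are illustrative.

```python
def query_cover(aligned_ranges, query_length):
    """Percent of query positions covered by at least one aligned range.

    Ranges are 1-based and inclusive, as in a BLAST report.
    """
    covered = set()
    for start, end in aligned_ranges:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / query_length


def expected_hits(bit_score, query_length, database_length):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Two overlapping alignments that together span a 52-residue query.
print(query_cover([(1, 40), (31, 52)], 52))

# A higher bit score gives a smaller (more significant) E-value.
print(expected_hits(50, 52, 1e9), expected_hits(100, 52, 1e9))
```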
Questions to Answer:
- Identify the protein matched by the search.
- Determine if Query1 sequence aligns with a known protein and explore differences.
- Collect essential information about the identified protein, including:
- Name of the protein
- Function
- Interacting partners
- Transcript variations
Part 2: DNA Sequence Manipulation
- Retrieve Full-Length Nucleotide Sequence:
- Search the nucleotide database for the mRNA that codes for the protein and select the correct entry from the results.
- Convert the mRNA sequence into FASTA format for further bioinformatics work.
- Confirm the start codon and design the necessary PCR primers (use ATG as the reference for the start codon).
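The sequence handling in this part can be sketched with plain string operations. The example assumes the mRNA is written as cDNA (T rather than U), that the standard ATG start codon applies, and that primers are simply the first and last bases of the coding region. Real primer design also considers melting temperature, GC content and any restriction sites to be added. The toy sequence and all function names are hypothetical.

```python
def to_fasta(header, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    seq = "".join(sequence.split()).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_start_codon(dna):
    """0-based index of the first ATG, or -1 if none is present.

    The first ATG is not always the true start; the CDS annotation in the
    database record should be treated as authoritative.
    """
    return dna.upper().find("ATG")


def simple_primers(cds, length=20):
    """Forward primer = first bases of the CDS; reverse primer = reverse
    complement of its last bases."""
    return cds[:length], reverse_complement(cds[-length:])


toy_mrna = "GGCATGGCTGACCTGAAGTAACC"   # made-up sequence, not the real transcript
start = find_start_codon(toy_mrna)
print(to_fasta("toy_mRNA (hypothetical)", toy_mrna))
print(start, simple_primers(toy_mrna[start:], length=6))
```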
Part 3: Domain Identification
- Isolate DNA Binding Domain:
- Use Prosite to identify conserved domains in Oct-1.
- Highlight the two identified domains in the amino acid sequence and document their corresponding DNA sequences in the results.
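Once Prosite reports the amino acid positions of a domain, the corresponding DNA can be located by simple codon arithmetic: residue k (1-based) is encoded by nucleotides 3(k-1)+1 to 3k of the coding sequence. The sketch below assumes the coding sequence starts at the start codon; the sequence and positions are made up for illustration and are not the Oct-1 domain boundaries.

```python
def domain_to_dna(cds, aa_start, aa_end):
    """Return the DNA that encodes residues aa_start..aa_end (1-based, inclusive).

    `cds` must begin at the start codon so that codon 1 is cds[0:3].
    """
    nt_start = 3 * (aa_start - 1)   # 0-based index of the first base of the first codon
    nt_end = 3 * aa_end             # exclusive end, one past the last base of the last codon
    return cds[nt_start:nt_end]


toy_cds = "ATGGCTGACCTGAAG"          # hypothetical 5-codon sequence: M A D L K
print(domain_to_dna(toy_cds, 2, 3))  # codons for residues 2 and 3
```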
Structural Information of POU Domain
- Use PyMOL to visualize the POU domain structure.
- The 'fetch 1POU' command retrieves the structure.
- Use display options to analyze structural components such as helix-turn-helix formations.
- Provide screenshots of structural analysis for documentation.
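For repeatable screenshots, the PyMOL steps can be collected into a small command script. The sketch below only generates the script text and does not require PyMOL to run. The coloring scheme and the image file name are assumptions chosen for illustration, and the function name is hypothetical.

```python
def build_pymol_script(pdb_id="1POU", image_name="pou_domain.png"):
    """Return the text of a PyMOL command script for a basic cartoon view."""
    lines = [
        f"fetch {pdb_id}",             # download the structure by its PDB ID
        "hide everything",
        "show cartoon",                # cartoon view makes helices easy to see
        "color red, ss h",             # color helical residues
        "color yellow, ss s",          # color strand residues, if any
        "orient",                      # center and align the view on the molecule
        f"png {image_name}, dpi=300",  # save an image for the report
    ]
    return "\n".join(lines) + "\n"


script = build_pymol_script()
print(script)
```

The printed text could be saved to a file such as view.pml and loaded from the PyMOL command line with @view.pml.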
Conclusion
- The sequence analysis identifies the unknown protein as the Oct-1 transcription factor, which modulates gene expression and is relevant to cancer research.