Bioinformatics Techniques for Protein Identification and Analysis

Aim: Identify an unknown protein (LAB1) involved in cancer survival.
Techniques: Sequence-specific DNA affinity chromatography and Edman degradation peptide sequencing.
Outcome: Obtained the amino acid sequence of a potential DNA-binding protein (Query1).

TQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFVLVHPTTNLQPAQFIISQTPQ…
Sequence length: 50+ amino acids.

BLAST (Basic Local Alignment Search Tool): A bioinformatics tool for comparing an input sequence against a database to identify homologous sequences.
Steps for BLASTP search:
1. Open the NCBI BLAST home page in a web browser.
2. Select 'Human' as the organism.
3. Choose the BLASTP option (for protein sequences).
4. Paste Query1 into the provided input box.
5. Click on the BLAST button to start the search.
6. When the search finishes, review the Graphic Summary and Descriptions tabs for the results.
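
The same search can also be prepared in a script. The sketch below only assembles the form parameters for a BLASTP submission through the NCBI BLAST URL API and does not send anything. The helper name build_blastp_form, the choice of the nr database, and the use of taxonomy ID 9606 to restrict the search to human are assumptions made for illustration; only the residues of Query1 visible above are used, because the sequence is truncated in these notes.

```python
from urllib.parse import urlencode

# Query1 as shown in these notes; the trailing "…" is not filled in.
QUERY1 = "TQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFVLVHPTTNLQPAQFIISQTPQ"

# Endpoint of the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_form(sequence, organism_taxid=9606, database="nr"):
    """Return the URL-encoded form body for a BLASTP submission.

    build_blastp_form is a hypothetical helper; the database and the
    organism filter are assumptions, not values taken from the protocol.
    """
    params = {
        "CMD": "Put",                                   # submit a new search
        "PROGRAM": "blastp",                            # protein vs protein
        "DATABASE": database,
        "ENTREZ_QUERY": f"txid{organism_taxid}[ORGN]",  # restrict to one organism
        "QUERY": sequence,
    }
    return urlencode(params)


form_body = build_blastp_form(QUERY1)
print(BLAST_URL)
print(form_body)
```

Submitting this body to the endpoint would return a request ID that is then polled for results; for a single lab query the web form described in the steps above is usually simpler.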

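Query Cover: The percentage of the query sequence's length that is covered by the alignment to the matched sequence.
E-value: The number of alignments with an equal or better score that would be expected by chance in a database of this size; values close to zero indicate a significant match.
Bit Score: A normalized alignment score that does not depend on database size; higher scores indicate better alignments.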
Accession Number: Unique identifier for each entry in the database.
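
To make the relationship between these values concrete, the sketch below uses the standard relation E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the database length. BLAST itself works with effective (edge-corrected) lengths, so reported E-values will differ somewhat from this approximation. The function names and the database size of one billion residues are hypothetical values chosen for illustration.

```python
def expected_hits(bit_score, query_length, database_length):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


def query_cover(query_start, query_end, query_length):
    """Percent of the query covered by one alignment (1-based, inclusive)."""
    return 100.0 * (query_end - query_start + 1) / query_length


# Hypothetical numbers: a 52-residue query against 1e9 database residues.
for score in (30, 60, 90):
    print(score, expected_hits(score, 52, 1_000_000_000))

# An alignment spanning query residues 1-52 of a 52-residue query.
print(query_cover(1, 52, 52))
```

Each additional bit halves the expected number of chance hits, which is why a high bit score and a low E-value go together.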

Identify the protein matched by the search.
Determine whether the Query1 sequence aligns with a known protein and note any differences between the two sequences.
Collect essential information about the identified protein, including:
- Name of the protein
- Function
- Interacting partners
- Transcript variations

Retrieve Full-Length Nucleotide Sequence:
- Find the nucleotide entry for the mRNA that codes for the protein, selecting the correct record from the database results.
- Save the mRNA sequence in FASTA format for further bioinformatics work.
- Confirm the start codon (ATG in the cDNA sequence, AUG in the mRNA) and design the PCR primers needed to amplify the coding sequence.
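
The steps above can be sketched in a few lines of Python. The example assumes the standard ATG start codon and uses a short made-up sequence (TOY_MRNA) instead of the real transcript, which is not reproduced in these notes. The 6-nt primers are only there to keep the example readable; real primers are typically around 18 to 25 nt and need melting-temperature checks that this sketch does not do.

```python
def to_fasta(header, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    cleaned = "".join(sequence.split()).upper()
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


def find_start_codon(sequence, codon="ATG"):
    """Return the 0-based index of the first start codon, or -1 if absent."""
    return sequence.upper().replace("U", "T").find(codon)


def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical toy transcript: 8 nt of 5' UTR followed by a short ORF.
TOY_MRNA = "CCGTTACCATGGCTGACCCTTAA"

start = find_start_codon(TOY_MRNA)
cds = TOY_MRNA[start:]

forward_primer = cds[:6]                        # same strand as the mRNA
reverse_primer = reverse_complement(cds[-6:])   # anneals to the 3' end

print(to_fasta("toy_mRNA (hypothetical)", TOY_MRNA), end="")
print("start codon index:", start)
print("forward primer:", forward_primer)
print("reverse primer:", reverse_primer)
```

Taking the first ATG is a simplification: in a real transcript the annotated coding region in the database record should be used to confirm which ATG starts translation.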

Isolate DNA Binding Domain:
- Use Prosite to identify the conserved domains in Oct-1.
- Highlight the two identified domains in the amino acid sequence and record their corresponding DNA sequences in the results.
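
Finding the DNA sequence that corresponds to a domain is a coordinate conversion: residue i of the protein is encoded by nucleotides 3(i-1)+1 to 3i of the coding sequence. The sketch below shows that mapping on a made-up coding sequence; the function name domain_to_cds_range and the example positions are hypothetical, and the real domain boundaries must come from the Prosite output.

```python
def domain_to_cds_range(aa_start, aa_end, cds_offset=0):
    """Map 1-based inclusive residue positions to a 0-based, half-open
    nucleotide slice. cds_offset is the index of the start codon's first
    base within the full mRNA (0 if the sequence starts at ATG)."""
    nt_start = cds_offset + 3 * (aa_start - 1)
    nt_end = cds_offset + 3 * aa_end
    return nt_start, nt_end


# Hypothetical coding sequence: ATG GCT GAC CCT TAA -> M A D P stop
TOY_CDS = "ATGGCTGACCCTTAA"

# Pretend a domain spans residues 2-3 (A, D) of the toy protein.
nt_start, nt_end = domain_to_cds_range(2, 3)
domain_dna = TOY_CDS[nt_start:nt_end]

print(nt_start, nt_end, domain_dna)
```

When working from the full mRNA rather than the coding sequence alone, pass the start codon's index as cds_offset so the 5' untranslated region is skipped.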

Use PyMOL to visualize the POU domain structure:
1. Run the command fetch 1POU to retrieve the structure from the Protein Data Bank.
2. Use the display options to examine structural features such as the helix-turn-helix motif.
3. Provide screenshots of structural analysis for documentation.
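
For a reproducible figure, the same steps can be kept as a short list of PyMOL commands and replayed from Python. This is a sketch under assumptions: the colour choices, the output file name pou_domain.png, and the helper run_in_pymol are illustrative, and the helper does nothing unless the pymol module is installed.

```python
# PyMOL commands, in the order they would be typed at the PyMOL prompt.
POU_VIEW_COMMANDS = [
    "fetch 1pou",                    # download the structure from the PDB
    "hide everything",
    "show cartoon",                  # ribbon view makes the helices visible
    "color white",
    "color red, ss h",               # highlight residues assigned as helix
    "orient",
    "png pou_domain.png, dpi=300",   # hypothetical output file name
]


def run_in_pymol(commands):
    """Replay the commands through PyMOL's Python API, if it is available.

    Returns False without doing anything when pymol cannot be imported.
    """
    try:
        from pymol import cmd
    except ImportError:
        return False
    for command in commands:
        cmd.do(command)
    return True


# Printing the list gives a script that can also be pasted into PyMOL.
print("\n".join(POU_VIEW_COMMANDS))
```

Calling run_in_pymol(POU_VIEW_COMMANDS) from an environment with PyMOL installed and network access should reproduce the view; the saved image can then serve as one of the screenshots for the report.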

The sequence analysis identifies the protein as the Oct-1 transcription factor, which regulates gene expression and is therefore of interest in cancer research.