Gene Expression Analysis and Machine Learning in Bioinformatics

100 Terms

Gene expression analysis

Study of gene activity in various conditions.

Differential expression

Comparison of gene expression levels across conditions.

Transcriptome

Complete set of RNA transcripts expressed in a cell or tissue under given conditions.

Unsupervised learning

Learning without labeled output data.

Over-represented terms

Terms occurring more frequently than expected.

Supervised learning

Learning with labeled output data.

K-nearest neighbor

Classification based on closest training examples.
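
As a rough illustration of classifying by the closest training examples, here is a minimal sketch. The function name `knn_predict`, the toy expression values, and the "healthy"/"disease" labels are all made up for the example; Euclidean distance and majority voting are one common choice, not the only one.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical two-gene expression profiles with known labels.
train = [
    ((1.0, 1.2), "healthy"), ((0.9, 1.0), "healthy"), ((1.1, 0.8), "healthy"),
    ((5.0, 4.8), "disease"), ((5.2, 5.1), "disease"), ((4.9, 5.3), "disease"),
]
print(knn_predict(train, (4.7, 5.0)))  # disease
```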

Logistic regression

Statistical method for binary classification.

Neural networks

Computational models inspired by brain structure.

Random forest

Ensemble method using multiple decision trees.

Support vector machine

Classification method maximizing margin between classes.

Gene expression variability

Differences in gene expression across samples.

Technical replicates

Repeated measurements of the same biological sample, used to estimate technical (measurement) variation.

Biological replicates

Independent samples to account for biological variation.

RNA-Seq

Sequencing method for analyzing RNA expression.

Differentially expressed genes (DEG)

Genes with significant expression changes between conditions.

Log fold change (logFC)

Logarithm (usually base 2) of the ratio of a gene's expression between two conditions.
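
A small sketch of how a log fold change can be computed. The function name and the pseudocount of 1 are assumptions for illustration; tools differ in how (or whether) they add a pseudocount.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated to control expression.
    The pseudocount (an assumption here) avoids division by zero and log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))

print(log2_fold_change(7, 1))  # (7+1)/(1+1) = 4, so 2.0
print(log2_fold_change(1, 7))  # (1+1)/(7+1) = 0.25, so -2.0
```

A value of +1 corresponds to a doubling and -1 to a halving, which is why up- and down-regulation appear symmetric on the log scale.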

P-value

Probability of observing a result at least as extreme as the one measured, assuming the null hypothesis is true.

False discovery rate (FDR)

Expected proportion of false positives among the results called significant.
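
One widely used way to control the FDR is the Benjamini-Hochberg procedure. The card does not say which procedure is meant, so the sketch below is an assumption; the function name and example p-values are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print([round(a, 4) for a in adj])
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a common convention) are then reported as differentially expressed.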

De novo transcriptome assembly

Building transcriptomes without a reference genome.

Burrows-Wheeler transform

Reversible text transformation that underlies the compressed indexes used for fast read mapping (for example in BWA and Bowtie).
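
To make the transform concrete, here is a naive sketch that sorts all rotations of the text and reads off the last column. Real read mappers build the transform from a suffix array rather than materializing rotations; the function name and the `$` terminator are conventions assumed for the example.

```python
def bwt(text, terminator="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, take the last column.
    Assumes the terminator does not occur in text and sorts before every other symbol."""
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

print(bwt("banana"))  # annb$aa
```

The output tends to group identical characters together, which is what makes it compressible and searchable.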

Housekeeping genes

Genes consistently expressed across all conditions.

Transcripts per million (TPM)

RNA-Seq normalization that scales reads by gene length first, then by sequencing depth, so values in each sample sum to one million.

Reads per kilobase of gene per million reads (RPKM)

Normalization accounting for gene length and sequencing depth.
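
The difference between RPKM and TPM is the order of the two scaling steps. The sketch below uses made-up counts and gene lengths; the function names are illustrative.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene per million reads in the sample."""
    per_million = sum(counts) / 1e6
    return [c / (length / 1e3) / per_million for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to sum to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates) / 1e6
    return [r / scale for r in rates]

counts = [100, 200, 700]        # hypothetical read counts for three genes
lengths = [1000, 2000, 7000]    # hypothetical gene lengths in base pairs
print(rpkm(counts, lengths))
print(tpm(counts, lengths))
```

Because TPM values always sum to one million within a sample, they are easier to compare across samples than RPKM values.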

Batch effects

Variability introduced by processing batches of samples.

Condition variability

Differences caused by environmental factors.

Single cell approaches

Techniques analyzing gene expression at single-cell level.

Diurnal variation

Daily fluctuations in gene expression.

Seasonal variation

Changes in gene expression across seasons.

Quality control (QC)

Processes ensuring data accuracy and reliability.

Microarrays

Hybridization-based technique that measures the expression of many genes at once using probes fixed on a chip.

Biological Replicates

Samples to control for biological variation.

Internal Standards

Used to determine absolute RNA levels.

Housekeeping Genes

Implicit internal standards for RNA quantification.

Spiked RNA

Explicit internal standards added for measurement.

RNA Library Preparation

Process to isolate desired RNA types.

PolyA+ RNA

Messenger RNA enriched for sequencing.

Ribosomal RNA

Most abundant RNA type; usually depleted during library preparation.

Next-Generation Sequencing

High-throughput sequencing technology for RNA-Seq.

Read Mapping

Aligning sequenced reads to a reference genome.

Gene Expression Variation

Expression levels can vary over 5 orders of magnitude.

Fast Mapping Methods

Efficient algorithms for read alignment.

Burrows-Wheeler Transform

Algorithm used for fast read mapping.

Differentially Expressed Genes (DEG)

Genes with significant expression differences.

Log Fold Change (logFC)

Biological effect size metric for gene expression.

False Discovery Rate (FDR)

Proportion of false positives among significant DEGs.

Volcano Plots

Visual representation of fold change and significance.

T-test

Statistical test for comparing gene expression means.

Wald Test

Tests whether an estimated log fold change differs significantly from zero (used, for example, by DESeq2).

RNA Isoforms

Transcripts differing due to splicing variations.

Quantitative reverse transcription PCR (qRT-PCR)

Method for confirming gene expression results.

Noise in Gene Expression

Variability affecting measurement accuracy.

Library Size

Total number of reads sequenced (or mapped) for a sample.

Reads per Kilobase per Million (RPKM)

Normalization for gene length and sequencing depth.

Contaminants in RNA-Seq

Unwanted RNA affecting expression analysis.

Isoform Estimation Challenges

Difficulties in accurately measuring RNA isoforms.

Single Cell RNA-Seq

Technique to analyze gene expression in individual cells.

FACS

Fluorescence-activated cell sorting for cell separation.

Drop-seq

Droplet-based method that pairs single cells with barcoded beads for single-cell RNA sequencing.

ChIP-seq

Chromatin immunoprecipitation followed by sequencing.

Methylation Sequencing

Identifies methylated DNA regions using bisulfite treatment.

Hi-C

Technique for studying chromosome conformation.

ATAC-seq

Assesses chromatin accessibility using transposase.

MNase-seq

Uses micrococcal nuclease to study nucleosome positioning.

Co-expressed Genes

Genes responding similarly to treatments or conditions.

Clustering

Grouping similar data points based on characteristics.

Hierarchical Clustering

Creates a tree of clusters based on distance.

Euclidean Distance

Measures straight-line distance between two points.

Correlation Distance

Dissimilarity defined as 1 minus the correlation between two expression profiles, so genes with similar response patterns are close.
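
The two distance measures can disagree sharply. In the sketch below (function names and expression values are invented for illustration, and Pearson correlation is assumed), two genes with the same response pattern but very different magnitudes are far apart in Euclidean terms yet have a correlation distance of essentially zero.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def correlation_distance(x, y):
    """1 minus the Pearson correlation of two profiles (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

gene_a = [1.0, 2.0, 3.0, 4.0]       # hypothetical expression across four conditions
gene_b = [10.0, 20.0, 30.0, 40.0]   # same pattern, ten times the magnitude
print(euclidean(gene_a, gene_b))
print(correlation_distance(gene_a, gene_b))
```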

UPGMA

Average linkage clustering method for hierarchical trees.

Principal Component Analysis (PCA)

Reduces dimensionality by transforming data axes.

Eigenvectors

New axes representing directions of maximum variance.

Eigenvalues

Weights indicating variance explained by eigenvectors.

Principal Coordinate Analysis (PCoA)

Visualizes data based on distance metrics.

Heatmaps

Visual representation of gene expression data.

K-means Clustering

Partitions data into K groups by assigning each point to the nearest cluster mean (centroid).
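
A minimal sketch of the usual iterative procedure (Lloyd's algorithm): assign points to the nearest centroid, recompute centroids as cluster means, repeat. Initializing from the first K points and running a fixed number of iterations are simplifications assumed here; practical implementations use random or k-means++ initialization and a convergence check.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Return (labels, centroids) after a fixed number of Lloyd iterations."""
    centroids = [list(p) for p in points[:k]]  # naive initialization (assumption)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
    return labels, centroids

points = [(0, 0), (0, 1), (10, 10), (10, 11)]  # toy data with two obvious groups
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [[0.0, 0.5], [10.0, 10.5]]
```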

Transcriptional Regulatory Networks

Networks of genes regulated by transcription factors.

Histone Modifications

Chemical changes to histones affecting gene expression.

Biomarkers

Biological indicators of disease or condition.

Diagnostics

Methods for identifying diseases or conditions.

Log-transformed Values

Data transformation for scale normalization in analysis.

Sjögren's Syndrome

Autoimmune disease affecting moisture-producing glands.

Acute Myeloid Leukemia (AML)

Type of cancer affecting blood and bone marrow.

Huntington's Disease

Genetic disorder causing progressive brain degeneration.

K-means clustering

A method to partition data into K clusters.

Choosing K

Determining the optimal number of clusters.

Euclidean distance

Distance metric for measuring point separation.

Within group error

Sum of (typically squared) distances from each point to the centroid of its cluster.

Between group distance

Distance between different clusters.

Silhouette statistic

Measure of how similar an object is to its own cluster compared with the nearest other cluster; ranges from -1 to 1.
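
For a point with mean distance a to its own cluster and smallest mean distance b to any other cluster, the silhouette value is (b - a) / max(a, b). The sketch below averages this over all points; the function name and toy data are illustrative, at least two clusters are assumed, and singleton clusters are given a score of 0 by convention.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette value over all points (labels must be a list)."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(mean_silhouette(points, [0, 0, 1, 1]))  # well separated: close to 1
print(mean_silhouette(points, [0, 1, 0, 1]))  # poor grouping: negative
```

Comparing the average silhouette across several values of K is one common way to choose the number of clusters.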

Self-organizing map (SOM)

Neural network for clustering and visualization.

Grid connectivity

Arrangement of centroids in self-organizing maps.

Biclustering

Simultaneous clustering of genes and samples to find gene subsets that behave coherently across a subset of conditions.

Mean squared residue score (H)

Metric for evaluating bicluster homogeneity; a lower H means the submatrix is more coherent.
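
Assuming the card refers to the Cheng and Church score, H is the mean over the submatrix of (a_ij - rowmean_i - colmean_j + overallmean)^2. The sketch below computes it for a small matrix; the function name and the example values are illustrative.

```python
def mean_squared_residue(matrix):
    """Mean squared residue H of a bicluster given as a list of rows (genes x conditions)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(col) / n_rows for col in zip(*matrix)]
    overall = sum(row_means) / n_rows
    total = sum(
        (matrix[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows) for j in range(n_cols)
    )
    return total / (n_rows * n_cols)

coherent = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # rows differ only by a constant shift
noisy = [[1, 9, 3], [7, 3, 4], [5, 6, 0]]
print(mean_squared_residue(coherent))  # zero up to floating-point rounding
print(mean_squared_residue(noisy))     # clearly positive
```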

Brute force algorithm

Exhaustive search for optimal clustering.

Gene Ontology (GO)

Framework for classifying gene functions.

Fisher's exact test

Exact test on a 2x2 contingency table, used to assess whether a gene subset is enriched for an annotation.

Hypergeometric distribution

Probability distribution for sampling without replacement.
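
In enrichment analysis the hypergeometric distribution answers: if n genes are drawn without replacement from N genes, K of which carry an annotation, how likely is it to see k or more annotated genes? The sketch below computes that upper-tail probability, which matches a one-sided Fisher's exact test; the function and parameter names are chosen for illustration.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) where X counts annotated genes among n drawn without replacement
    from a population of N genes, K of which are annotated."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Toy numbers: 10 genes, 5 annotated, a cluster of 5 genes that are all annotated.
print(hypergeom_enrichment_p(N=10, K=5, n=5, k=5))  # 1/252, about 0.004
```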

Gene ontology enrichment

Testing whether GO terms are over-represented in a gene cluster relative to a background gene set.

Pathway enrichment

Assessing gene association with biological pathways.