Data Science

104 Terms

1

What does it mean that biology is now a “data science”?

Modern biology relies heavily on collecting, analyzing, and interpreting large datasets using computational tools

2

How are computers involved in modern biological and medical research?

Computers are used in every step, including experimental design, data collection, data analysis, visualization, and interpretation

3

What is a computer, according to the lecture definition?

A machine that stores information, manipulates information according to rules, and produces new information as output

4

What are the three basic steps everything a computer does reduces to?

Input, processing, and output

5

What are binary digits (bits)?

The smallest units of information in a computer, represented as 0 or 1

6

Why do computers use binary instead of continuous values?

Binary is robust to noise and easy to represent physically as on/off or high/low voltage states

7

What is the role of the central processing unit (CPU)?

The CPU executes instructions, including arithmetic, logical comparisons, and data movement

8

What does it mean that modern CPUs have multiple cores?

They can perform multiple operations in parallel, increasing computational speed

9

What is random access memory (RAM)?

Temporary, fast-access memory that stores active data, variables, and intermediate results

10

Why is RAM considered volatile?

Its contents are lost when the computer is powered off

11

What is long-term storage used for?

Storing data and programs permanently, even when the computer is powered off

12

How does storage differ from RAM?

Storage is much larger and non-volatile but significantly slower to access than RAM

13

What is the operating system (OS)?

Software that manages hardware resources, schedules programs, allocates memory, and handles files and errors

14

What is code in the context of computing?

A precise, human-readable set of instructions that tells a computer exactly what to do

15

What is R and how does it compare to other languages?

R is a programming language designed for statistics, data analysis, and visualization. It excels at reproducible analysis and scientific graphics, whereas Python is more general-purpose and Excel workflows are hard to reproduce because the steps are not recorded as code

16

What is swirl in R?

An interactive R package used to learn R by typing responses directly into the console

17

Where is swirl used in RStudio?

Only in the Console pane, not the Script pane

18

What do file extensions define?

The type of data contained in a file and which programs can open it

19

What are the four fundamental data types in R?

Numeric, character, logical, and missing values (NA); strictly speaking, NA is a special value that marks missing data rather than a separate type

20

What is an object (variable) in R?

A container that stores data and has a name, value, and data type

21

How are objects created in R?

Using the assignment operator <-

22

What does the class() function do?

It returns the class (data type) of an object, such as "numeric", "character", or "logical"
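
As a small illustration of creating objects and checking their type, the sketch here assigns three values with <- and inspects each with class(). The object names (gene_count, species, is_expressed) are made up for this example and do not come from the lecture.

```r
# Hypothetical object names, chosen only for illustration
gene_count <- 42             # a number
species <- "Homo sapiens"    # text, so it needs quotes
is_expressed <- TRUE         # a logical value

class(gene_count)     # "numeric"
class(species)        # "character"
class(is_expressed)   # "logical"
```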

23

What is a vector in R?

An object that stores a sequence of values of the same data type

24

How are vectors created in R?

Using the c() function

25

What is indexing in R?

Accessing elements of a vector using square brackets

26

What types of values can be used for indexing?

Numeric indices or logical values
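
The vector and indexing cards can be sketched together. The example builds a vector with c() and then pulls out elements by position and by a logical test; the vector name expr and its values are invented for illustration.

```r
# Hypothetical expression values, for illustration only
expr <- c(2.5, 0.1, 7.8, 3.3)

expr[1]          # numeric index; R counts from 1, so this is 2.5
expr[c(2, 4)]    # several positions at once: 0.1 3.3
expr[expr > 3]   # logical index keeps elements where the test is TRUE: 7.8 3.3
```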

27

What is a function in R?

A small program that takes inputs called arguments and returns output

28

How can ranges of values be created in R?

Using the colon operator or the seq() function
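
A brief sketch of both approaches: the colon operator steps by 1, while seq() lets you choose the step size. The variable names are only for this example.

```r
one_to_five <- 1:5                          # 1 2 3 4 5
quarters <- seq(from = 0, to = 1, by = 0.25) # 0.00 0.25 0.50 0.75 1.00
countdown <- seq(10, 2, by = -4)            # 10 6 2
```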

29

What are error messages in R?

Messages produced when R cannot run a command due to syntax or logical problems

30

How do warnings differ from errors in R?

Warnings do not stop the program but indicate something unusual may have occurred
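
To make the difference concrete, here is a minimal sketch with one call that warns and one that errors. The tryCatch() wrapper is an addition for this example only, so the script can continue past the error; it is not part of the lecture material.

```r
# Warning: as.numeric() cannot convert "abc", so it returns NA and warns,
# but the script keeps running
value <- as.numeric("abc")
is.na(value)   # TRUE

# Error: adding a number to a character string stops the command.
# tryCatch() is used here only so this example can continue past the error.
outcome <- tryCatch(
  1 + "a",
  error = function(e) "stopped with an error"
)
outcome   # "stopped with an error"
```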

31

What does the error “object not found” mean?

R cannot find an object with that name in memory

32

What causes errors from using the wrong data type?

Providing a function with a type of data it cannot operate on

33

Why does forgetting quotes around strings cause an error?

R interprets unquoted text as an object name instead of character data

34

What causes missing or extra parentheses errors?

An unmatched opening or closing parenthesis in a command

35

What are the two key functions for using packages in R?

install.packages() to install a package and library() to load it into memory

36

What are SNP tables used for in biology?

To organize genotype data with rows as samples and columns as genomic loci

37

What does geographic occurrence data represent in a matrix?

Locations arranged in rows and columns corresponding to geographic points or samples

38

What is gene expression data commonly stored as in R?

A matrix with rows as samples and columns as genes

39

Define a matrix in R.

A 2D array where all elements must be of the same data type
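
A short sketch of the same-type rule, using made-up genotype counts (2 samples as rows, 3 loci as columns). The names geno and mixed are hypothetical.

```r
# Hypothetical genotype counts: 2 samples (rows) x 3 loci (columns)
geno <- matrix(c(0, 1, 2, 2, 1, 0), nrow = 2, byrow = TRUE)
dim(geno)            # 2 3
class(geno[1, 1])    # "numeric"

# Mixing numbers and text forces every element to one type (character)
mixed <- matrix(c(1, "A", 2, "B"), nrow = 2)
class(mixed[1, 1])   # "character"
```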

40

What does SNP stand for and mean?

Single Nucleotide Polymorphism; a position in the genome with variation among individuals

41

What is polymorphism in genetics?

The existence of multiple alleles or genetic variants within a population

42

What distinguishes gene expression matrices from other data?

They quantify gene activity levels across samples, arranged in matrix form

43

What is a data frame in R?

A 2D data structure that can hold different data types in each column

44

How do you create a data frame in R?

Using the data.frame() function with vectors as columns

45

What is the purpose of the $ operator in R?

To access a specific named column in a data frame

46

How do you index elements in matrices or data frames?

Using square brackets [row, column]
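
The data frame cards can be tied together in one sketch: build a table with data.frame(), take a column with $, and pick cells with [row, column]. The table name samples and its contents are invented; stringsAsFactors = FALSE is set so the text column stays character in older R versions too.

```r
# Hypothetical sample table, for illustration only
samples <- data.frame(
  id = c("s1", "s2", "s3"),
  expression = c(5.2, 0.8, 3.1),
  treated = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

samples$expression   # the whole expression column: 5.2 0.8 3.1
samples[2, "id"]     # row 2, column "id": "s2"
samples[1, ]         # leaving the column blank returns all columns of row 1
samples[, 2]         # leaving the row blank returns all rows of column 2
```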

47

What are logical operators used for in R?

To evaluate expressions and return TRUE or FALSE values

48

What is a logical expression?

A statement in R that evaluates to TRUE or FALSE

49

Name some common comparison operators in R.

== (equal), != (not equal), <, >, <=, >=

50

What does the logical AND operator (&) do?

Returns TRUE only if both sides of the expression are TRUE

51

What does the logical OR operator (|) do?

Returns TRUE if either or both sides of the expression are TRUE
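
A compact truth-table sketch for & and |, plus two comparisons. Both operators work element by element on vectors; the names a and b are arbitrary.

```r
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

both <- a & b     # TRUE FALSE FALSE FALSE
either <- a | b   # TRUE TRUE TRUE FALSE

3 != 4   # TRUE
5 <= 2   # FALSE
```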

52

Why should you avoid recycling vectors in logical operations?

Because recycling shorter vectors can cause unintended comparisons and warnings
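
A sketch of what recycling looks like in practice, assuming two vectors of unequal length (the names long_vec and short_vec are made up). R silently repeats the shorter vector, so later elements are compared against values you may not have intended.

```r
long_vec <- c(1, 2, 3, 4, 5)
short_vec <- c(1, 2)

# short_vec is recycled to 1 2 1 2 1; R also warns here because
# 5 is not a multiple of 2
cmp <- long_vec == short_vec
cmp   # TRUE TRUE FALSE FALSE FALSE
```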

53

How can logical expressions be useful when working with biological data?

For filtering datasets based on conditions, such as genotypes or expression thresholds
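
As one possible illustration of such filtering, the sketch here keeps rows of a small, invented table that pass an expression threshold, then combines that threshold with a genotype condition. The table expr_df, its column names, and the cutoff of 5 are all assumptions for this example.

```r
# Hypothetical data, for illustration only
expr_df <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  level = c(12.4, 0.3, 8.9),
  genotype = c("AA", "AG", "GG"),
  stringsAsFactors = FALSE
)

# Keep rows whose expression level is above the (arbitrary) threshold
high <- expr_df[expr_df$level > 5, ]
high$gene        # "geneA" "geneC"

# Combine two conditions with &
high_gg <- expr_df[expr_df$level > 5 & expr_df$genotype == "GG", ]
high_gg$gene     # "geneC"
```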
