Cogs9 Midterm Review

14 Terms

1
Donoho argues that a “data science” framework is needed beyond classical statistics primarily because:

The real work of learning from data includes many essential activities outside modeling, such as data cleaning, transforming, computing, and communicating.

2
Donoho proposes bringing the Common Task Framework (CTF) into today’s statistical and data science training. Which of the following would be an example of that?

Designing a course project where all students work to answer the same question using the same data set.

3
Which pairing best describes one benefit and one limitation of the Common Task Framework (CTF)?

Benefit: provides objective performance measurement and enables cumulative progress in the field.

Limitation: can lead to overemphasis on leaderboard rankings rather than understanding the underlying problem or developing generalizable methods.
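
As a rough illustration of the CTF described above, the sketch below assumes a toy setup: a shared training set, a held-out test set that only a scoring referee sees, and competitors who submit prediction rules. The data values, team names, and function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (feature, label); toy data, purely illustrative

# Shared training data that every competitor can see (hypothetical values).
TRAIN: List[Example] = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

# Held-out test data that only the scoring referee can access.
_HIDDEN_TEST: List[Example] = [(0.2, 0), (0.3, 0), (0.55, 1), (0.7, 1), (0.45, 1)]


def referee_score(rule: Callable[[float], int]) -> float:
    """Return the accuracy of a submitted prediction rule on the hidden test set."""
    correct = sum(1 for x, y in _HIDDEN_TEST if rule(x) == y)
    return correct / len(_HIDDEN_TEST)


def threshold_rule(cutoff: float) -> Callable[[float], int]:
    """Build a simple rule: predict 1 when the feature is at or above the cutoff."""
    return lambda x: 1 if x >= cutoff else 0


# Two hypothetical teams submit rules; the referee scores both the same way.
submissions = {"team_a": threshold_rule(0.5), "team_b": threshold_rule(0.4)}
leaderboard = sorted(
    ((referee_score(rule), name) for name, rule in submissions.items()),
    reverse=True,
)
print(leaderboard)
```

Because every team is scored by the same referee on the same hidden data, the scores are directly comparable (the benefit). The leaderboard says nothing about why one cutoff works better than another (the limitation).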

4
Data Exploration and Preparation

Doing detective-like exploratory work to identify anomalies and artifacts in the data, then cleaning the data to address them

5
Data Representation and Transformation

Restructuring the data or variables within the dataset to suit the needs of the project

6
Computing with Data

Knowledge of a number of programming languages/packages as well as approaches to computational efficiency

7
Data Modeling

Using generative (inferential) and predictive modeling to answer interesting questions

8
Data Visualization and Presentation

Generating plots that help you explore your data as well as plots that help you effectively communicate your findings

9
Science about Data Science

Studying how data science is actually done by others, and evaluating the tools, workflows, and practices that data scientists use

10
Donoho discusses the concepts of “reproducibility” and “replicability” in data science. Why does he argue that computational reproducibility (being able to re-run someone’s code and get the same results) is important for data science as a science?

It allows others to verify results, understand exactly what was done, and build upon previous work
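
A minimal sketch of what "re-run the code and get the same results" can look like in practice, assuming a toy bootstrap analysis. The data, the function name `bootstrap_mean`, and the seed value are illustrative assumptions, not taken from Donoho.

```python
import random
import statistics


def bootstrap_mean(data, n_resamples, seed):
    """Average of bootstrap-resample means, using a seeded local generator."""
    rng = random.Random(seed)  # local generator: the result depends only on the inputs
    means = []
    for _ in range(n_resamples):
        # Resample the data with replacement, keeping the original sample size.
        sample = [rng.choice(data) for _ in data]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means)


data = [2.0, 3.5, 4.1, 5.0, 7.2]  # hypothetical measurements

# Running the same code with the same data and seed gives the same number,
# so another analyst can verify the reported result exactly.
first = bootstrap_mean(data, 200, seed=42)
second = bootstrap_mean(data, 200, seed=42)
print(first == second)
```

Fixing the seed and sharing the code and data lets others check the result and build on it, rather than taking the reported number on trust.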

11
Donoho says we should study the “science of data science,” or how data science is actually done (tools, workflows, habits, error sources). Why does that matter for trust in results?

It measures which practices actually improve reliability (e.g., reproducibility, fewer mistakes), so we can build evidence-based standards and be more confident in conclusions.

12
Donoho discusses the “Two Cultures” of data science: the generative/inferential culture, which focuses on modeling for explanation and interpretation, and the predictive culture, which focuses on predicting outcomes.

Which of the following is an example of the generative culture of modeling for explanation/interpretation?

A study examining which factors (like income, education, and location) are most strongly associated with voter turnout.

13
Donoho discusses the “Two Cultures” of data science: the generative/inferential culture, which focuses on modeling for explanation and interpretation, and the predictive culture, which focuses on predicting outcomes.

Which of the following is an example of modeling for prediction?

A hospital using a model to predict which patients need surgery within 30 days.
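
One way to see the difference between the two cultures is to use the same fitted model for two different purposes. The sketch below assumes a toy least-squares line fit to made-up data (years of education vs. turnout percentage). All values and names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: years of education and turnout rate (%).
train_x = [8, 10, 12, 14, 16]
train_y = [40, 45, 50, 55, 60]
slope, intercept = fit_line(train_x, train_y)

# Generative/inferential use: interpret the coefficient itself.
print(f"Each extra year of education is associated with {slope:.1f} more points of turnout")

# Predictive use: judge the model only by its error on a new, held-out case.
new_x, new_y = 18, 63  # hypothetical held-out observation
prediction = intercept + slope * new_x
abs_error = abs(prediction - new_y)
print(f"Predicted {prediction:.1f}, observed {new_y}, absolute error {abs_error:.1f}")
```

The inferential reading asks what the slope says about the relationship. The predictive reading only asks how close the forecast is to the observed outcome.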

14
What is Donoho’s main concern about the dominance of predictive culture in modern data science?

Predictive culture ignores the importance of understanding mechanisms and interpretability.