Sections 1.1 and 1.2

16 Terms

1

Cases:

  • The objects described by a set of data

  • They can be customers, companies, subjects in a study,
    units in an experiment, or other objects.

2

Variable:

  • is a characteristic of a case

3

Values:

  • Different cases can have different values of a variable

4

Label:

  • is a special variable used in some data sets to
    distinguish among the different cases

5

Categorical Variable:

  • places each case into one of several groups, or categories

6

Quantitative Variable:

  • takes numerical values for which arithmetic operations such as adding and averaging make sense
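A minimal sketch of these definitions in Python, using a made-up data set. The student records, the variable names, and the `classify` helper are all hypothetical. The rule in `classify` is only a rough check: a numeric code such as a zip code would pass it even though averaging zip codes makes no sense, so the final call still depends on what the variable means.

```python
# Hypothetical data set: each dict is one case (a student).
# "name" is the label; "major" and "credits" are variables.
cases = [
    {"name": "Ana", "major": "Biology", "credits": 15},
    {"name": "Ben", "major": "History", "credits": 12},
    {"name": "Cy", "major": "Biology", "credits": 18},
]

def classify(values):
    """Rough check: quantitative if every value is a number, else categorical."""
    is_number = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_number else "categorical"

# Classify every variable except the label.
variable_types = {
    variable: classify([case[variable] for case in cases])
    for variable in cases[0]
    if variable != "name"
}

# Averaging makes sense only for the quantitative variable.
mean_credits = sum(case["credits"] for case in cases) / len(cases)

print(variable_types)
print(mean_credits)
```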

7

Key Characteristics of a Data Set:

Every data set is accompanied by important background
information. In a statistical study, always ask the following
questions:

  • Who? What cases do the data describe? How many
    cases does a data set have?

  • What? How many variables does the data set have? What
    are the exact definitions of these variables? What are the
    units of measurement for each quantitative variable?

  • Why? What purpose do the data have? Do the data
    contain the information needed to answer the questions of
    interest?

8

Exploratory Data Analysis:

  • Begin by examining each variable by itself. Then, move
    on to study the relationships among the variables

  • Begin with a graph or graphs. Then, add numerical
    summaries of specific aspects of the data.

9

Variables:

  • We construct a set of data by first deciding which cases or units we want to study. For each case, we record information about characteristics that we call variables

  • Characteristics of the individual

10

Distribution of a Variable:

  • To examine a single variable, we graphically display its distribution

  • The distribution of a variable tells us what values it takes and how often it takes these values

  • Distributions can be displayed using a variety of graphical tools. The proper choice of graph depends on the nature of the variable

  • Categorical Variable:

    • Pie chart

    • Bar Graph

  • Quantitative Variable:

    • Histogram

    • Stemplot

11

The Distribution of a Categorical Variable:

  • lists the categories and gives either the count or the percent of individuals who fall into each category

  • Pie charts: show the distribution of a categorical variable as a “pie” whose slices’ sizes reflect the counts or percents for the categories

  • Bar graphs: represent categories as bars whose heights show the category counts or percents
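The counts and percents behind a pie chart or bar graph can be computed directly. The sketch below assumes a made-up list of majors, and the function name `categorical_distribution` is hypothetical. It prints a rough text bar graph in which each bar's length is the category count.

```python
from collections import Counter

def categorical_distribution(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, 100 * count / total)
        for category, count in counts.items()
    }

# Hypothetical observations of one categorical variable.
majors = ["Biology", "History", "Biology", "Art"]
dist = categorical_distribution(majors)

# Text bar graph: one '#' per case in the category.
for category, (count, percent) in dist.items():
    print(f"{category:<8} {'#' * count} {count} ({percent:.0f}%)")
```

The percents add up to 100 because every case falls into exactly one category.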

12

The Distribution of a Quantitative Variable:

  • The distribution of a quantitative variable tells us what values the variable takes on and how often it takes those values

  • Stemplots: separate each observation into a stem and a leaf that are then plotted to display the distribution while maintaining the original values of the variable

  • Histograms: show the distribution of a quantitative variable by
    using bars. The height of a bar represents the number of
    individuals whose values fall within the corresponding class
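As an illustration of how bar heights are obtained, the sketch below counts how many observations fall in each class. The scores, the class width of 10, and the function name `histogram_counts` are assumptions made for the example. Each class here includes its lower boundary and excludes its upper boundary, which is one common convention rather than the only one.

```python
def histogram_counts(values, start, width, n_classes):
    """Count the values falling in each class [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:  # ignore values outside the covered range
            counts[index] += 1
    return counts

# Hypothetical exam scores (one quantitative variable).
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 95]
counts = histogram_counts(scores, start=50, width=10, n_classes=5)

# Text histogram: the bar for each class has one '#' per individual.
for k, count in enumerate(counts):
    low = 50 + 10 * k
    print(f"{low}-{low + 9}: {'#' * count}")
```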

13

To Construct Stemplots:

1.) Separate each observation into a stem (all but the rightmost digit)
and a leaf (the remaining digit)

2.) Write the stems in a vertical column; draw a vertical line to the right
of the stems

3.) Write each leaf in the row to the right of its stem; order the leaves,
if desired

4.) If there are very few stems (when the data cover only a very small range
of values), then you might want to create more stems by splitting the
original stems
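Steps 1 through 3 above can be sketched in a few lines of Python. This version assumes non-negative whole-number observations, so the leaf is the ones digit and the stem is everything before it. The function name `stemplot` and the scores are hypothetical, and splitting stems (step 4) is not implemented.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stemplot for non-negative integers."""
    rows = defaultdict(list)
    for v in sorted(values):          # sorting orders the leaves within each stem
        stem, leaf = divmod(v, 10)    # step 1: stem = all but the last digit
        rows[stem].append(leaf)
    lines = []
    # Step 2: list every stem from smallest to largest, including empty ones,
    # so gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())  # step 3: leaves to the right
    return lines

# Hypothetical exam scores.
scores = [83, 52, 71, 95, 78, 67, 88, 74, 81, 85]
for line in stemplot(scores):
    print(line)
```

Because the leaves are the actual final digits, the original values can be read back from the plot, which a histogram does not allow.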

14

Examining Distributions:

  • In any graph of data, look for the overall pattern and for striking deviations from that pattern

  • You can describe the overall pattern by its shape, center, and
    spread

  • An important kind of deviation is an outlier, an individual that falls
    outside the overall pattern

  • Extreme values of a distribution are in a tail of the distribution

  • A peak in a distribution is called a mode. A distribution with two peaks is called a bimodal distribution

  • A distribution is symmetric if the right and left sides of the graph are approximately mirror images of each other

  • A distribution is skewed to the right (right-skewed) if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side

  • It is skewed to the left (left-skewed) if the left side of the graph is
    much longer than the right side

15

Outliers:

  • An important kind of deviation is an outlier

  • Outliers are observations that lie outside the overall pattern of a distribution

  • Always look for outliers and try to explain them

16

Time Plots:

  • A time plot shows behavior over time

  • Time is always on the horizontal axis, and the variable being measured is on the vertical axis

  • Look for an overall pattern (trend) and deviations from this trend. Connecting the data points by lines may emphasize this trend

  • Look for patterns that repeat at known regular intervals (seasonal variations)
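One rough way to look for trend and seasonal variation numerically, alongside the plot itself, is sketched below. The quarterly sales figures and both helper names are made up for illustration, and the sketch assumes the series covers whole cycles (here, two full years of four quarters).

```python
def cycle_averages(series, period):
    """Average each full cycle (e.g. each year) to expose the overall trend."""
    return [
        sum(series[i:i + period]) / period
        for i in range(0, len(series), period)
    ]

def seasonal_averages(series, period):
    """Average the observations sharing a position in the cycle (e.g. every Q1)."""
    return [
        sum(series[pos::period]) / len(series[pos::period])
        for pos in range(period)
    ]

# Hypothetical quarterly sales for two years, in time order.
sales = [10, 14, 20, 12, 12, 16, 22, 14]

yearly = cycle_averages(sales, period=4)        # rising values suggest an upward trend
by_quarter = seasonal_averages(sales, period=4)  # a repeating peak suggests seasonality

print(yearly)
print(by_quarter)
```

In this made-up series the yearly averages rise from one year to the next, and the third quarter is highest in both years, which is the kind of repeating pattern a time plot would show as a regular peak.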