Graphs


21 Terms

1
New cards

Which graphs can be used to describe a single variable?

1.- Pie graph

2.- Bar graph

3.- Cumulative frequency graph

4.- Histogram

5.- Box-plot

2
New cards

Which command loads example data files from the internet?

webuse

Downloads a sample dataset from the internet (by default, the Stata Press website) and loads it into memory.

e.g., webuse lbw

lbw is a well-known example dataset (often labeled “Hosmer & Lemeshow data”) used for logistic regression examples.
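
As a minimal sketch of the card above, the following loads the dataset and lists its contents. The `clear` option and the `describe` and `tabulate` calls are additions for illustration; `race` is assumed to be the categorical variable used in the later pie-chart cards.

```stata
* Load the low-birth-weight example dataset, replacing any data in memory
webuse lbw, clear

* List the variables and their labels
describe

* Count the observations in each race category
tabulate race
```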

3
New cards

What does the following command do?

graph pie, over(race)

This draws a pie where each slice = the number of observations in each race category.

over(varname) names the categorical variable that defines the slices.

4
New cards

What do the plabel() and legend() options do in the following command?

graph pie, over(race) plabel(_all percent) legend(on)

plabel(_all percent) labels every slice of the pie with its percentage, i.e. the share of each race category. legend(on) displays the legend.

5
New cards

What does the following command do?

It sets the color of each slice.

e.g., pie(1, color(gs15)) makes the first slice light gray (gs15)
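
The command on this card is not shown, so the following is only a sketch of how the pie() option might be combined with the options from the previous cards. It assumes the lbw dataset, whose `race` variable has three categories; the gray shades other than gs15 are arbitrary choices for illustration.

```stata
webuse lbw, clear

* One pie() option per slice; gs0 is black and gs16 is white,
* so gs15 is a very light gray and gs5 a dark gray
graph pie, over(race) plabel(_all percent) legend(on) ///
    pie(1, color(gs15)) pie(2, color(gs10)) pie(3, color(gs5))
```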

6
New cards

When is a bar graph used?

. To represent a qualitative ordinal variable

. Each bar represents a category

. Height is proportional to frequency

7
New cards

What does the command graph bar, over(age_cat) do?

Makes a bar chart of the distribution of age_cat.

There is one bar per category of age_cat, and the bar height (y-axis) shows that category's relative frequency as a percentage.

8
New cards

hist age_cat, discrete freq gap(20) xlabel(, valuelabel) addlabel ///

fcolor(navy) lcolor(none) xtitle("Age of the mother")

Make a histogram of age_cat as separate bars, show counts, add gaps, label the x-axis with the category names, and print the counts on each bar.

What each part does

  • hist age_cat
    Draws a histogram of age_cat.

  • , discrete
    Treats age_cat as discrete categories (one bar per integer/category), not continuous bins.

  • freq
    Y-axis shows frequency (counts), not density/percent.

  • gap(20)
    Adds space between bars (bigger number = bigger gaps). Makes it look more like separated category bars.

  • xlabel(, valuelabel)
    Uses the value labels of age_cat on the x-axis (e.g., “18–24”, “25–34”) instead of showing just codes (1,2,3…).

  • addlabel
    Prints the count on top of each bar.

  • fcolor(navy)
    Sets the fill color of the bars to navy.

  • lcolor(none)
    Removes the outline around bars (no border line).

  • xtitle("Age of the mother")
    Sets the x-axis title.

  • ///
    Line continuation in Stata: lets you split a long command across lines.

9
New cards

cumul age, gen(c_age) equal

twoway scatter c_age age, connect(l)

1) cumul age, gen(c_age) equal

  • cumul computes the cumulative distribution function (CDF) of age.

  • gen(c_age) saves the cumulative values into a new variable called c_age.

  • After this, each observation gets a number between 0 and 1:

    • c_age = proportion of observations with age ≤ that observation’s age (a cumulative proportion).

  • equal tells Stata to give observations with equal values of age the same cumulative value in c_age. Without it, tied observations receive different, successively increasing values.

2) twoway scatter c_age age, connect(l)

  • Plots c_age (y-axis) against age (x-axis).

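  • connect(l) draws lines between the points, so it looks like a CDF curve rather than separate dots. The points are connected in the current data order, so the data should be sorted by age (or the sort option added) to avoid a zigzag line.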
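
The two steps above can be sketched as follows. This assumes the lbw example dataset (which contains an `age` variable) and adds the `sort` option, which is not in the original command, so that the points are connected in order of age.

```stata
webuse lbw, clear

* Empirical CDF of age; tied ages share one cumulative value
cumul age, gen(c_age) equal

* sort orders the points by age before the line is drawn
twoway scatter c_age age, connect(l) sort
```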

10
New cards

graph box partial1, ylab(0(1)10)

Draws a box-and-whisker plot for the variable partial1 (one box showing its distribution).

ylab(0(1)10) (y-axis labels)

This controls the tick marks on the y-axis:

  • 0(1)10 means: label 0, 1, 2, …, 10 (step = 1)

So it forces a y-axis scale that’s nice for something like a score from 0 to 10.
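
A comparable sketch on a public dataset, since the data behind `partial1` is not available. It assumes the lbw dataset, where `bwt` is birth weight in grams and `smoke` is a smoking indicator; the label range 0 to 5000 in steps of 1000 is chosen to fit that variable.

```stata
webuse lbw, clear

* One box for birth weight, with y-axis labels at 0, 1000, ..., 5000
graph box bwt, ylabel(0(1000)5000)

* One box per smoking group, for comparing a quantitative
* variable across the categories of a qualitative one
graph box bwt, over(smoke) ylabel(0(1000)5000)
```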

11
New cards

What graphs can be used to describe the relationship between two variables?

1.- Two or more pie graphs

2.- Two or more bar graphs

3.- Two or more box-plots

4.- Scatter plot

12
New cards

When to use two or more pie graphs?

Two qualitative nominal variables

13
New cards

When to use two or more bar graphs?

One qualitative nominal, one qualitative ordinal

14
New cards

When to use two or more box-plots?

One qualitative and one quantitative

15
New cards

When to use a scatter plot?

Two quantitative variables

16
New cards

tw sc SBP age

Two-way scatter plot (tw sc is short for twoway scatter)

  • Y-axis: SBP (systolic blood pressure)

  • X-axis: age

17
New cards

sc SBP weight0 if smk==1 || lfit SBP weight0

  • sc SBP weight0
    Scatter plot with:

    • Y = SBP (systolic blood pressure)

    • X = weight0 (baseline weight)

  • if smk==1
    Only plot the observations where smk equals 1 (e.g., smokers).

  • ||
    Combines multiple “twoway” plots in the same graph (adds another layer).

  • lfit SBP weight0
    Adds a linear fit line (OLS regression line) of SBP on weight0.
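
Note that in the command above the `if smk==1` condition belongs to the scatter layer only, so the fitted line uses all observations. A minimal sketch of restricting both layers, using Stata's built-in auto dataset as a stand-in (its variables `mpg`, `weight`, and `foreign` are not part of the original example):

```stata
sysuse auto, clear

* Each layer takes its own if condition; repeating it on lfit
* makes the line fit only the points that are plotted
twoway scatter mpg weight if foreign==1 || ///
    lfit mpg weight if foreign==1
```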

18
New cards

sc SBP weight0 if smk==1 || qfit SBP weight0 ///

, xlabel(40(5)120) ylab(80(5)155, angle(0))

  • sc SBP weight0 if smk==1

    • sc = scatter plot

    • Y-axis: SBP (systolic blood pressure)

    • X-axis: weight0

    • if smk==1 = only plot observations where smk equals 1 (smokers)

  • ||
    Adds another plot layer on the same graph.

  • qfit SBP weight0
    Adds a quadratic fit curve (a 2nd-degree polynomial regression of SBP on weight0).

    • It fits: $SBP = a + b \cdot weight0 + c \cdot weight0^2$

    • Useful if the relationship looks curved rather than straight.

  • , xlabel(40(5)120) ylab(80(5)155, angle(0))
    Labels the x-axis from 40 to 120 and the y-axis from 80 to 155, both in steps of 5. angle(0) prints the y-axis labels horizontally.

19
New cards

Kaplan-Meier graph

Plot of cumulative survival over time
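
For reference, the curve is built from the standard Kaplan–Meier (product-limit) estimator. The notation below is not from the card: $t_i$ are the distinct event times, $d_i$ is the number of events at $t_i$, and $n_i$ is the number of subjects still at risk just before $t_i$.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

The estimate starts at 1 and steps down at each event time; censored observations do not cause a step but reduce $n_i$ for later times.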

20
New cards

stset followup, fail(death)

This tells Stata: “I’m doing survival/time-to-event analysis.”

  • followup = the time variable (how long each person was followed, e.g., years/months/days).

  • fail(death) defines the event indicator:

    • death==1 → the event happened (failure)

    • death==0 → censored (no event during follow-up)

After stset, Stata creates internal survival variables (_t0, _t, _d, _st), and you can then use sts, stcox, etc.

21
New cards

sts graph, xlab(0(1)6)

This draws the Kaplan–Meier survival curve based on the stset data.

  • sts graph = plot the estimated survival function $S(t)$.

  • xlab(0(1)6) puts x-axis tick labels at 0, 1, 2, 3, 4, 5, 6 (in the same time units as followup).
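
The stset and sts graph steps from the last two cards can be sketched end to end on a public dataset. This assumes Stata's drugtr example data, in which `studytime` is follow-up time in months and `died` is the event indicator; the label range 0(10)40 is chosen to fit that dataset rather than taken from the card.

```stata
webuse drugtr, clear

* Declare the survival structure: time variable and event indicator
stset studytime, failure(died)

* Kaplan-Meier survival curve with x-axis labels at 0, 10, 20, 30, 40
sts graph, xlabel(0(10)40)
```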
