Graphs

21 Terms

New cards

Which graphical procedures are used to describe one variable?

1.- Pie graph

2.- Bar graph

3.- Cumulative frequency graph

4.- Histogram

5.- Box-plot

New cards

Which command loads example data files from the internet?

webuse

Downloads a sample dataset from the internet and loads it into memory.

e.g. webuse lbw

lbw is a well-known example dataset (often labeled “Hosmer & Lemeshow data”) used for logistic regression examples.

New cards

What does the following command do?

graph pie, over(race)

It draws a pie chart in which the size of each slice is proportional to the number of observations in each race category.

over(variable) names the categorical variable that defines the slices.

New cards

What do the options plabel() and legend() do in the following command?

graph pie, over(race) plabel(_all percent) legend(on)

plabel(_all percent) labels every slice with its percentage, i.e. the share of each race category; legend(on) displays the legend.
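To make the slice percentages concrete, here is a minimal Python sketch (not Stata) of the arithmetic behind them. The function name `slice_percentages` and the `race` values are made up for illustration.

```python
from collections import Counter

def slice_percentages(values):
    """Return {category: percent of observations}, the share a pie slice displays."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

# Hypothetical data: a made-up race variable with three categories
race = ["white", "white", "black", "other", "white", "black", "white", "other"]
shares = slice_percentages(race)
for category, pct in shares.items():
    print(f"{category}: {pct:.1f}%")
```

The percentages always add up to 100, which is why a pie chart suits shares of a whole.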


New cards

What does the following command do?

It sets the colour of each slice.

e.g. pie(1, color(gs15)) colours the first slice light gray (gs15)

New cards

When is a bar graph used?

- To represent a qualitative ordinal variable

- Each bar represents a category

- The height of each bar is proportional to the frequency

New cards

What does the command graph bar, over(age_cat) do?

It makes a bar chart of the distribution of age_cat.

There is one bar per category of age_cat, and the bar height (y-axis) shows the relative frequency as a percentage.


New cards

hist age_cat, discrete freq gap(20) xlabel(, valuelabel) addlabel ///
    fcolor(navy) lcolor(none) xtitle("Age of the mother")

Makes a histogram of age_cat with one separate bar per category, shows counts on the y-axis, adds gaps between the bars, labels the x-axis with the category names, and prints the count on each bar.

What each part does

- hist age_cat: draws a histogram of age_cat.
- discrete: treats age_cat as discrete categories (one bar per integer/category), not continuous bins.
- freq: the y-axis shows frequency (counts), not density/percent.
- gap(20): narrows each bar by 20%, leaving space between bars (bigger number = bigger gaps). Makes it look more like separated category bars.
- xlabel(, valuelabel): uses the value labels of age_cat on the x-axis (e.g., “18–24”, “25–34”) instead of showing just codes (1, 2, 3, …).
- addlabel: prints the count on top of each bar.
- fcolor(navy): sets the fill color of the bars to navy.
- lcolor(none): removes the outline around bars (no border line).
- xtitle("Age of the mother"): sets the x-axis title.
- ///: line continuation in Stata; lets you split a long command across lines.


New cards

cumul age, gen(c_age) equal

twoway scatter c_age age, connect(l)

1) cumul age, gen(c_age) equal

- cumul computes the empirical cumulative distribution function (CDF) of age.
- gen(c_age) saves the cumulative values in a new variable called c_age.
- After this, each observation gets a number between 0 and 1: c_age = proportion of observations with age ≤ that observation’s age (a cumulative proportion).
- equal makes observations with the same age receive the same cumulative value. Without it, tied observations get different, successively increasing values.

2) twoway scatter c_age age, connect(l)

- Plots c_age (y-axis) against age (x-axis).
- connect(l) draws lines between the points, so it looks like a CDF curve rather than separate dots. The points are joined in the order of the data, so the data should be sorted by age (or the sort option added) for a clean curve.
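As an illustration of what the cumulative variable contains, here is a minimal Python sketch (not Stata) of an empirical CDF in which tied values share the same cumulative proportion. The function name `cumul_equal` and the ages are hypothetical.

```python
def cumul_equal(values):
    """Empirical CDF: for each value, the proportion of observations <= it.

    Tied observations receive the same result.
    """
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]

age = [19, 25, 25, 31, 40]  # hypothetical ages, with one tie at 25
c_age = cumul_equal(age)

# Sort the (age, c_age) pairs by age before joining them with lines
for a, c in sorted(zip(age, c_age)):
    print(a, c)
```

Both observations with age 25 get 0.6, because three of the five ages are 25 or less.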


New cards

graph box partial1, ylab(0(1)10)

Draws a box-and-whisker plot for the variable partial1 (one box showing its distribution).

ylab(0(1)10) (y-axis labels; ylab is short for ylabel)

This controls the labelled ticks on the y-axis:

0(1)10 means: label 0, 1, 2, …, 10 (from 0 to 10 in steps of 1)

So the y-axis covers at least 0 to 10, which suits something like a score from 0 to 10.
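The start(step)end notation can be unpacked by hand. The following Python sketch (not Stata) expands such a range into the values that get labelled; `expand_numlist` is a hypothetical helper and only handles positive steps.

```python
def expand_numlist(spec):
    """Expand a 'start(step)end' range such as '0(1)10' into a list of numbers.

    Illustrative only: assumes a positive step and this exact notation.
    """
    start_text, rest = spec.split("(")
    step_text, end_text = rest.split(")")
    start, step, end = float(start_text), float(step_text), float(end_text)
    values = []
    k = 0
    # Small tolerance so the end point is kept despite floating-point rounding
    while start + k * step <= end + 1e-9:
        values.append(start + k * step)
        k += 1
    return values

print(expand_numlist("0(1)10"))    # the eleven values 0, 1, ..., 10
print(expand_numlist("80(5)155"))  # 80, 85, ..., 155
```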


New cards

What graphs can be used to describe the relationship between two variables?

1.- Two or more pie graphs

2.- Two or more bar graphs

3.- Two or more box-plots

4.- Scatter plot

New cards

When to use two or more pie graphs?

Two qualitative nominal variables

New cards

When to use two or more bar graphs?

One qualitative nominal, one qualitative ordinal

New cards

When to use two or more box-plots?

One qualitative and one quantitative

New cards

When to use a scatter plot?

Two quantitative variables

New cards

tw sc SBP age

Two-way scatter plot (tw sc is short for twoway scatter)

Y-axis: SBP (systolic blood pressure)
X-axis: age

New cards

sc SBP weight0 if smk==1 || lfit SBP weight0

- sc SBP weight0: scatter plot with Y = SBP (systolic blood pressure) and X = weight0 (baseline weight).
- if smk==1: only plots the observations where smk equals 1 (e.g., smokers).
- ||: combines multiple “twoway” plots in the same graph (adds another layer).
- lfit SBP weight0: adds a linear fit line (OLS regression line) of SBP on weight0. The if qualifier belongs to the scatter layer only, so as written the line is fitted to all observations, not just those with smk==1.
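For intuition about the fitted line, here is a minimal Python sketch (not Stata) of ordinary least squares with one predictor. The function name `linear_fit` and the data values are made up for illustration.

```python
def linear_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical values lying exactly on SBP = 50 + 1 * weight0
weight0 = [60, 70, 80, 90]
sbp = [110, 120, 130, 140]
intercept, slope = linear_fit(weight0, sbp)
print(intercept, slope)
```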


New cards

sc SBP weight0 if smk==1 || qfit SBP weight0 ///
    , xlabel(40(5)120) ylab(80(5)155, angle(0))

- sc SBP weight0 if smk==1: scatter plot (sc) with SBP (systolic blood pressure) on the y-axis and weight0 on the x-axis; if smk==1 plots only observations where smk equals 1 (smokers).
- ||: adds another plot layer on the same graph.
- qfit SBP weight0: adds a quadratic fit curve (a 2nd-degree polynomial regression of SBP on weight0). It fits $\mathrm{SBP} = a + b \cdot \mathrm{weight0} + c \cdot \mathrm{weight0}^2$, which is useful if the relationship looks curved rather than straight.
- xlabel(40(5)120): labels the x-axis from 40 to 120 in steps of 5.
- ylab(80(5)155, angle(0)): labels the y-axis from 80 to 155 in steps of 5, with the labels written horizontally.
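The quadratic fit can be sketched in Python (not Stata) as a least-squares problem solved through the normal equations. The helper names and the small data set are hypothetical, and Stata's internal computation may differ.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def quadratic_fit(x, y):
    """Least-squares [a, b, c] for y = a + b*x + c*x**2."""
    s = [sum(xi ** k for xi in x) for k in range(5)]  # sums of x^0 .. x^4
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    normal_matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_linear_system(normal_matrix, t)

# Hypothetical points lying exactly on y = 2 + 3x + 0.5x^2
x = [0, 1, 2, 3, 4]
y = [2, 5.5, 10, 15.5, 22]
a, b, c = quadratic_fit(x, y)
print(round(a, 6), round(b, 6), round(c, 6))
```

A coefficient c close to 0 would suggest that a straight line describes the data about as well.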


New cards

Kaplan-Meier graph

Plot of the estimated cumulative survival probability against time

New cards

stset followup, fail(death)

This tells Stata: “I’m doing survival/time-to-event analysis.”

- followup = the time variable (how long each person was followed, e.g., years/months/days).
- fail(death) defines the event indicator: death==1 → the event happened (failure); death==0 → censored (no event during follow-up).

After stset, Stata creates internal survival variables (_t, _t0, _d, _st), and you can then use sts, stcox, etc.

New cards

sts graph, xlab(0(1)6)

This draws the Kaplan–Meier survival curve based on the stset data.

- sts graph = plot the estimated survival function $S(t)$.
- xlab(0(1)6) puts x-axis tick labels at 0, 1, 2, 3, 4, 5, 6 (in the same time units as followup).
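To show where the steps of the curve come from, here is a minimal Python sketch (not Stata) of the Kaplan–Meier product-limit estimate. The function name `kaplan_meier` and the follow-up data are made up for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up time per subject; events: 1 = event, 0 = censored.
    """
    survival = 1.0
    curve = []
    event_times = sorted(set(t for t, e in zip(times, events) if e == 1))
    for t in event_times:
        # Subjects still under observation just before time t
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: six subjects, two of them censored
followup = [1, 2, 2, 3, 4, 5]
death = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(followup, death)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects never cause a drop in the curve, but they leave the risk set, which makes later drops larger.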
