Graphing in Stata
Histogram
- A histogram is a graph of the frequency distribution for numerical data.
Constructing a Frequency Distribution
- Determine the range of data: Find the smallest and largest observations to understand the data's spread.
- Select the number of classes: Decide how many bars the histogram should have. Too many or too few classes make the histogram less informative. The number of classes is usually between 5 and 15, depending on the data.
- Compute class intervals: Determine the width of the classes.
- Determine boundaries (limits): Define the boundaries for each class.
- Count observations and assign them to classes: Go through the data set and assign each observation to its appropriate class.
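The steps above can be sketched in a short do file. This is only a minimal sketch: the ten integer observations, the choice of three classes, and the scalar names (data_min, data_max, class_width, first_bound) are assumptions made for illustration, not values from the notes.

```stata
* Hypothetical data: ten integer observations (not from the notes)
clear
input variable1
16
18
22
25
27
31
34
38
41
45
end

* Step 1: range of the data (smallest and largest observation)
summarize variable1
scalar data_min = r(min)
scalar data_max = r(max)

* Steps 2-3: assume 3 classes and round the class width up
scalar n_classes = 3
scalar class_width = ceil((data_max - data_min) / n_classes)

* Step 4: start half a unit below the minimum so no integer falls on a boundary
scalar first_bound = data_min - 0.5
display "class width = " class_width ", first boundary = " first_bound

* Step 5: count the observations that fall in the first class
count if variable1 >= first_bound & variable1 < first_bound + class_width
```

Starting the first class half a unit below the smallest value is one common convention for integer data; other choices of boundaries are equally valid as long as every observation falls in exactly one class.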
Types of Histograms
- Frequency Histogram: Shows the actual frequency of observations in each class.
- Relative Frequency Histogram (Proportion): Shows the proportion of observations in each class (frequency out of the total).
- Calculated as: \(\frac{\text{Frequency}}{\text{Total Observations}}\)
- Percent Histogram: Shows the percentage of observations in each class (proportion multiplied by 100).
- Calculated as: \(\text{Proportion} \times 100\)
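As a worked illustration of the two formulas, assume a hypothetical class that holds 4 of 10 observations (the counts are made up for this example):

```latex
\[
\text{Proportion} = \frac{\text{Frequency}}{\text{Total Observations}} = \frac{4}{10} = 0.4,
\qquad
\text{Percent} = \text{Proportion} \times 100 = 0.4 \times 100 = 40
\]
```

The three histogram types therefore have the same shape; only the scale of the vertical axis changes.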
Graphing the Histogram
- The horizontal axis represents the values or classes.
- The vertical axis represents the frequency, relative frequency, or percentage.
- Bars should touch each other to indicate that there are no gaps between the classes.
Stata - Do File Editor
- The do file editor is a text editor within Stata that allows you to write and execute a program or a series of commands.
- Commands in the do file editor are typically shown in blue.
- You can execute the entire do file or select specific portions of the code to execute.
Useful Commands:
clear
: Clears the data from memory.
log using
: Starts a log file to record your Stata session.
cd
: Changes the current working directory.
input
: Enters data into Stata.
list
: Lists the data in Stata.
generate
: Generates a new variable.
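A do file that strings these commands together might look like the sketch below. The log file name, the commented-out directory path, the data values, and the variable name variable1_sq are all hypothetical; adjust them to your own session.

```stata
* Minimal do-file sketch; file names, path, and data are hypothetical
capture log close                  // close any log left open by an earlier run
clear                              // drop the data currently in memory
* cd "C:/stats/lab1"               // hypothetical path: uncomment and edit
log using lab1_session.log, replace text

* Enter a small data set by hand
input variable1
16
22
27
31
45
end

list                               // show the data just entered
generate variable1_sq = variable1^2    // new variable built from an existing one
list variable1 variable1_sq

log close                          // stop recording the session
```

Closing the log at the end keeps the next run of the do file from failing because a log is still open; the capture prefix at the top serves the same purpose when an earlier run stopped early.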
Conditional Statements
if
: Used to apply a command only to observations that meet a specific condition.
replace
: Replaces the value of a variable based on a specified condition.
- Example:
generate bin1 = 0
: Creates a new variable bin1 and initializes it with zeros.
replace bin1 = 1 if variable1 >= 15.5 & variable1 < 25.5
: Replaces the value in bin1 with 1 if variable1 is greater than or equal to 15.5 and less than 25.5.
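The same generate and replace pattern can assign every observation to a class in a single variable. The sketch below assumes three classes of width 10 starting at 15.5 and six made-up observations; the variable name bin is chosen for illustration.

```stata
* Hypothetical data (not from the notes)
clear
input variable1
16
25
26
35
36
45
end

* Start with missing values so unclassified observations are easy to spot
generate bin = .
replace bin = 1 if variable1 >= 15.5 & variable1 < 25.5
replace bin = 2 if variable1 >= 25.5 & variable1 < 35.5
replace bin = 3 if variable1 >= 35.5 & variable1 < 45.5
list
```

Because each condition includes its lower bound and excludes its upper bound, no observation can satisfy two conditions at once.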
Frequency Tables
- Frequency tables display the number of observations in each category or bin.
- Commands:
tabstat
: Displays a table of summary statistics, such as the number of observations, for a variable.
table
: Creates a table. (The command was not working and still needs to be fixed.)
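The sketch below shows one way to obtain the counts per bin. The data and the bin variable are hypothetical. The tabulate command is not mentioned in the notes; it is included on the assumption that a one-way frequency table is what is wanted.

```stata
* Hypothetical data: each observation already assigned to a bin
clear
input variable1 bin
16 1
18 1
22 1
27 2
31 2
38 3
end

tabulate bin                                // one-way table: frequency, percent, cumulative
tabstat variable1, by(bin) statistics(n)    // number of observations in each bin
table bin                                   // frequency of each bin

* Check one cell by hand
count if bin == 1
```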
Recode Command
- The recode command is used to change the values of a variable based on specified conditions.
- Example:
recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin)
: Recodes values in variable1 from 15.5 to 25.5 as 1, from 25.5 to 35.5 as 2, and from 35.5 to 45.5 as 3, and stores the result in a new variable called bin.
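A self-contained sketch of the recode example follows, using made-up data and writing each rule in parentheses, the form given in Stata's documented syntax for recode. The original variable is left unchanged because generate() stores the result in a new variable.

```stata
* Hypothetical data (not from the notes)
clear
input variable1
16
22
27
31
38
45
end

* One rule per class; the recoded values go into the new variable bin
recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin)
tabulate bin
```

With integer data none of the observations sits on a shared boundary such as 25.5, so the overlap between adjacent rules has no effect here.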
Graph Command
- The graph command is used to generate graphs in Stata.
graph bar
: Generates a bar graph.
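As an illustration, graph bar can plot the number of observations in each class once a class variable exists. The data, the bin variable, and the axis title below are assumptions for the sketch.

```stata
* Hypothetical data: each observation already assigned to a bin
clear
input variable1 bin
16 1
18 1
22 1
27 2
31 2
38 3
end

* One bar per bin; bar height is the count of nonmissing variable1 values
graph bar (count) variable1, over(bin) ytitle("Frequency")
```

By default the bars of a bar graph are drawn with gaps between them, which is one reason the histogram command is the more natural choice for numerical classes.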
Histogram Command
- The histogram command is used to generate histograms in Stata.
histogram variable1, frequency start(15.5) width(10)
: Generates a frequency histogram for variable1, starting at 15.5 with a class width of 10.
- Options:
frequency
: Specifies a frequency histogram.
percent
: Specifies a percent histogram.
start()
: Specifies the starting value for the histogram.
width()
: Specifies the width of each class.
bin()
: Specifies the number of bins.
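The options can be combined as in the sketch below, which draws the same hypothetical data three ways. The data values and the graph names (h_freq, h_pct, h_bins) are made up for illustration; name() is used only so that each graph is kept instead of overwriting the previous one.

```stata
* Hypothetical data (not from the notes)
clear
input variable1
16
18
22
25
27
31
34
38
41
45
end

* Frequency histogram with classes 15.5-25.5, 25.5-35.5, 35.5-45.5
histogram variable1, frequency start(15.5) width(10) name(h_freq, replace)

* Same classes, with percentages on the vertical axis
histogram variable1, percent start(15.5) width(10) name(h_pct, replace)

* Let Stata work out the class width from a requested number of bins
histogram variable1, frequency bin(3) name(h_bins, replace)
```

Note that width() and bin() are two ways of describing the same thing, so a single command uses one or the other.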
Skewness and Kurtosis
- Skewness measures the asymmetry of a distribution.
- A symmetric distribution has a skewness of 0.
- A distribution skewed to the right has a positive skewness.
- A distribution skewed to the left has a negative skewness.
- Kurtosis measures the