Graphing in Stata
Histogram
- A histogram is a graph of the frequency distribution for numerical data.
Constructing a Frequency Distribution
- Determine the range of data: Find the smallest and largest observations to understand the data's spread.
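- Select the number of classes: Decide how many bars the histogram should have. Too few classes hide the shape of the distribution, and too many leave most classes nearly empty. Usually the number of classes is between 5 and 15, but it depends on the data.
- Compute class intervals: Determine the width of the classes, typically the range divided by the number of classes, rounded up to a convenient value.
- Determine boundaries (limits): Define the boundaries for each class so that every observation falls into exactly one class.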
- Count observations and assign them to classes: Go through the data set and assign each observation to its appropriate class.
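The steps above can be sketched in a short do file. This is a minimal illustration, not a prescribed method: the ten values of variable1 are made up, the choice of 3 classes is arbitrary, and the names lo, hi, k, w, and class_id are assumptions introduced here.

```stata
clear
* Hypothetical data, made up for illustration only
input variable1
18
22
24
27
31
33
34
38
41
44
end

* Step 1: range of the data (summarize leaves the results in r(min) and r(max))
summarize variable1
scalar lo = r(min)
scalar hi = r(max)

* Step 2: number of classes (an arbitrary choice for this small data set)
scalar k = 3

* Step 3: class width = range / number of classes, rounded up
scalar w = ceil((scalar(hi) - scalar(lo)) / scalar(k))

* Steps 4 and 5: class i covers [lo + (i-1)*w, lo + i*w); assign each observation
generate class_id = floor((variable1 - scalar(lo)) / scalar(w)) + 1
* Guard for the case where the largest value lands exactly on the last upper boundary
replace class_id = scalar(k) if class_id > scalar(k)

* Frequency distribution of the classes
tabulate class_id
```

Rounding the width up guarantees that k classes starting at the minimum cover the whole range.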
Types of Histograms
- Frequency Histogram: Shows the actual frequency of observations in each class.
- Relative Frequency Histogram (Proportion): Shows the proportion of observations in each class (frequency out of the total).
- Calculated as: $\frac{\text{Frequency}}{\text{Total Observations}}$
- Percent Histogram: Shows the percentage of observations in each class (proportion multiplied by 100).
- Calculated as: $\text{Proportion} \times 100$
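As a small worked illustration of the three scales, the sketch below starts from class frequencies and derives the proportion and percent for each class. The frequencies are made up, and the variable names class_id, freq, n_total, proportion, and percent are assumptions chosen for this example.

```stata
clear
* Hypothetical class frequencies, for illustration only
input class_id freq
1 3
2 4
3 3
end

* Total number of observations, stored on every row
egen n_total = total(freq)

* Relative frequency (proportion) and percent for each class
generate proportion = freq / n_total
generate percent = proportion * 100

list class_id freq proportion percent
```

The three histograms have the same shape; only the scale of the vertical axis changes.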
Graphing the Histogram
- The horizontal axis represents the values or classes.
- The vertical axis represents the frequency, relative frequency, or percentage.
- Bars should touch each other to indicate that there are no gaps between the classes.
Stata - Do File Editor
- The do file editor is a text editor within Stata that allows you to write and execute a program or a series of commands.
- Commands in the do file editor are typically shown in blue.
- You can execute the entire do file or select specific portions of the code to execute.
Useful Commands:
- clear: Clears everything from memory.
- log using: Starts a log file to record your Stata session.
- cd: Changes the current working directory.
- input: Enters data into Stata.
- list: Lists the data in Stata.
- generate: Generates a new variable.
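A minimal do file that strings these commands together might look like the sketch below. The data values are made up, and the folder and log file name in the commented-out lines are hypothetical placeholders; they are left as comments because they depend on your own machine.

```stata
* Start from an empty data set
clear

* cd and log using need a real folder and file name, so they are shown as comments.
* The path and file name here are hypothetical.
* cd "C:\mydata"
* log using session1.log, replace text

* Enter a small data set by hand (values made up for illustration)
input variable1
18
22
24
27
31
33
34
38
41
44
end

* Show what was entered
list

* Create a new variable from an existing one
generate variable1_sq = variable1^2
list in 1/3

* If a log was opened above, close it at the end
* log close
```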
Conditional Statements
- if: Used to apply a command only to observations that meet a specific condition.
- replace: Replaces the value of a variable based on a specified condition.
- Example:
- generate bin1 = 0: Creates a new variable bin1 and initializes it with zeros.
- replace bin1 = 1 if variable1 >= 15.5 & variable1 < 25.5: Replaces the value in bin1 with 1 if variable1 is greater than or equal to 15.5 and less than 25.5.
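The same generate and replace pattern can be extended to assign every observation to a class. The sketch below is one possible extension, not the notes' own code: it uses a single variable, here called bin, coded 1 to 3 instead of a 0/1 indicator, and the data values are made up.

```stata
clear
* Hypothetical data, for illustration only
input variable1
18
22
24
27
31
33
34
38
41
44
end

* Start with 0, meaning "not yet assigned to a class"
generate bin = 0

* Each class includes its lower boundary and excludes its upper boundary
replace bin = 1 if variable1 >= 15.5 & variable1 < 25.5
replace bin = 2 if variable1 >= 25.5 & variable1 < 35.5
replace bin = 3 if variable1 >= 35.5 & variable1 < 45.5

list variable1 bin
```

Because each condition uses >= for the lower boundary and < for the upper one, no value can fall into two classes.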
Frequency Tables
- Frequency tables display the number of observations in each category or bin.
- Commands:
- tabstat: Displays summary statistics for a variable in a table; a one-way frequency table is produced by tabulate.
- table: Command to create table. (The command wasn't working, needs to be fixed.)
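As a rough sketch of how a frequency table can be produced, the example below uses tabulate on a class variable and tabstat for summary statistics within each class. The data and the variable name bin are made up for illustration.

```stata
clear
* Hypothetical data: each value with the class it belongs to
input variable1 bin
18 1
22 1
24 1
27 2
31 2
33 2
34 2
38 3
41 3
44 3
end

* One-way frequency table: count, percent, and cumulative percent per class
tabulate bin

* Summary statistics of variable1 within each class
tabstat variable1, by(bin) statistics(n mean min max)
```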
Recode Command
- The recode command is used to change the values of a variable based on specified conditions.
- Example:
- recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin): Recodes values in variable1 from 15.5 to 25.5 to 1, from 25.5 to 35.5 to 2, and from 35.5 to 45.5 to 3, and generates a new variable called bin. Each rule is written in its own pair of parentheses.
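A self-contained version of the recode example is sketched below, with each rule in parentheses. The data values are made up and none of them sits exactly on a class boundary.

```stata
clear
* Hypothetical data, for illustration only
input variable1
18
22
24
27
31
33
34
38
41
44
end

* Map each range of variable1 to a class number and store it in a new variable
recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin)

* Check the result as a frequency table
tabulate bin
```

Using generate() leaves variable1 unchanged, so the original values remain available for graphing.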
Graph Command
- The graph command is used to generate graphs in Stata.
- graph bar: Generates a bar graph.
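As one possible use of graph bar, the sketch below draws a bar for each class from a table of frequencies that has already been computed. The frequencies and the variable names class_id and freq are made up for illustration.

```stata
clear
* Hypothetical class frequencies, for illustration only
input class_id freq
1 3
2 4
3 3
end

* (asis) plots the stored values as they are instead of computing a mean;
* over() draws one bar per class
graph bar (asis) freq, over(class_id) ytitle("Frequency")
```

Bars drawn by graph bar are separated by gaps, which suits categorical data; for numerical classes, where the bars should touch, the histogram command is the better fit.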
Histogram Command
- The histogram command is used to generate histograms in Stata.
- histogram variable1, frequency start(15.5) width(10): Generates a frequency histogram for variable1, starting at 15.5 with a width of 10.
- Options:
- frequency: Specifies a frequency histogram.
- percent: Specifies a percent histogram.
- start(): Specifies the starting value for the histogram.
- width(): Specifies the width of each class.
- bin(): Specifies the number of bins.
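The sketch below draws the same classes on three different vertical scales. The data values are made up, and two things go beyond the options listed above: the fraction option, which gives a relative frequency histogram, and name(), which is used here only to keep each graph in memory under its own (arbitrary) name.

```stata
clear
* Hypothetical data, for illustration only
input variable1
18
22
24
27
31
33
34
38
41
44
end

* Frequency histogram: classes start at 15.5 and are 10 wide
histogram variable1, frequency start(15.5) width(10) name(hist_freq, replace)

* Percent histogram with the same classes
histogram variable1, percent start(15.5) width(10) name(hist_pct, replace)

* Relative frequency (proportion) histogram with the same classes
histogram variable1, fraction start(15.5) width(10) name(hist_frac, replace)
```

Without name(), each new histogram replaces the previous one in the Graph window.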
Skewness and Kurtosis
- Skewness measures the asymmetry of a distribution.
- A symmetric distribution has a skewness of 0.
- A distribution skewed to the right has a positive skewness.
- A distribution skewed to the left has a negative skewness.
- Kurtosis measures the