Graphing in Stata

Histogram

  • A histogram is a graph of the frequency distribution for numerical data.

Constructing a Frequency Distribution

  • Determine the range of data: Find the smallest and largest observations to understand the data's spread.
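  • Select the number of classes: Decide how many bars the histogram should have. Too few classes hide the shape of the distribution, and too many leave most classes nearly empty. Usually, the number of classes is between 5 and 15, but it depends on the data.
  • Compute class intervals: Determine the width of the classes, roughly the range divided by the number of classes, rounded up to a convenient value.
  • Determine boundaries (limits): Define the boundaries for each class so that every observation falls into exactly one class (for example 15.5, 25.5, 35.5 for whole-number data).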
  • Count observations and assign them to classes: Go through the data set and assign each observation to its appropriate class.
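
The first steps above can be sketched in Stata as follows. The six data values are made up for illustration, variable1 is borrowed from the examples later in these notes, and three classes are used only to keep the example short. The scalar names data_range and class_width are chosen here and are not part of Stata.

```stata
* Hypothetical data: six whole-number observations
clear
input variable1
16
21
24
29
33
44
end

* Step 1: smallest and largest observations (summarize stores them in r(min) and r(max))
summarize variable1
scalar data_range = r(max) - r(min)

* Steps 2-3: pick 3 classes and round the width up to a whole number
scalar class_width = ceil(data_range / 3)
display "range = " data_range
display "class width = " class_width
```

With a width of 10 and a first boundary just below the minimum (15.5), the boundaries become 15.5, 25.5, 35.5, and 45.5, which match the bins used in the later examples.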

Types of Histograms

  • Frequency Histogram: Shows the actual frequency of observations in each class.
  • Relative Frequency Histogram (Proportion): Shows the proportion of observations in each class (frequency out of the total).
    • Calculated as: $\text{Proportion} = \frac{\text{Frequency}}{\text{Total Observations}}$
  • Percent Histogram: Shows the percentage of observations in each class (proportion multiplied by 100).
    • Calculated as: $\text{Percent} = \text{Proportion} \times 100$
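
As a rough illustration of the two formulas, the sketch below turns class frequencies into proportions and percentages. The class counts (3, 2, 1) are hypothetical, and the variable names bin, freq, n_total, proportion, and percent are chosen here for the example.

```stata
* Hypothetical frequency distribution: one row per class
clear
input bin freq
1 3
2 2
3 1
end

* Total number of observations across all classes
egen n_total = total(freq)

* Relative frequency (proportion) and percent for each class
generate proportion = freq / n_total
generate percent = proportion * 100
list
```

The proportions sum to 1 and the percentages to 100, which is a quick check that every observation was assigned to exactly one class.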

Graphing the Histogram

  • The horizontal axis represents the values or classes.
  • The vertical axis represents the frequency, relative frequency, or percentage.
  • Bars should touch each other to indicate that there are no gaps between the classes.

Stata - Do-file Editor

  • The Do-file Editor is a text editor within Stata that allows you to write and execute a program or a series of commands.
  • Commands in the Do-file Editor are typically shown in blue.
  • You can execute the entire do-file or select specific portions of the code to execute.

Useful Commands:

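  • clear: Removes the data and value labels from memory (clear all also removes stored results, programs, and the like).
  • log using: Starts a log file that records your Stata session; log close ends it.
  • cd: Changes the current working directory.
  • input: Enters data typed directly into Stata or a do-file; the data block ends with end.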
  • list: Lists the data in Stata.
  • generate: Generates a new variable.
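
A short do-file that strings these commands together might look like the sketch below. The folder C:/stata_work, the log name session1.log, the data values, and the new variable variable1_sq are all placeholders chosen for this example; the cd line fails unless the folder exists on your machine.

```stata
* Start from an empty data set
clear

* Hypothetical working folder; change it to a folder that exists
cd "C:/stata_work"

* Record the session in a plain-text log, overwriting any earlier copy
log using session1.log, replace text

* Type in a few observations for one variable
input variable1
16
21
24
end

* Show the data, add a new variable, and show the data again
list
generate variable1_sq = variable1^2
list

* Stop recording
log close
```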

Conditional Statements

  • if: A qualifier added to a command so that it applies only to observations that meet a specific condition.
  • replace: Changes the values of an existing variable; combined with if, only the observations that meet the condition are changed.
  • Example:
    • generate bin1 = 0: Creates a new variable bin1 and initializes it with zeros.
    • replace bin1 = 1 if variable1 >= 15.5 & variable1 < 25.5: Replaces the value in bin1 with 1 if variable1 is greater than or equal to 15.5 and less than 25.5.
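
The same pattern can be repeated for each class. The sketch below assumes six made-up observations and extends the bin1 example to two further indicator variables, bin2 and bin3, which are names chosen here.

```stata
* Hypothetical data
clear
input variable1
16
21
24
29
33
44
end

* One 0/1 indicator per class; the lower boundary is included, the upper is not
generate bin1 = 0
replace bin1 = 1 if variable1 >= 15.5 & variable1 < 25.5
generate bin2 = 0
replace bin2 = 1 if variable1 >= 25.5 & variable1 < 35.5
generate bin3 = 0
replace bin3 = 1 if variable1 >= 35.5 & variable1 < 45.5

* Count the observations in each class
count if bin1 == 1
count if bin2 == 1
count if bin3 == 1
list
```

Because each condition uses >= for the lower boundary and < for the upper one, no observation can be counted in two classes.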

Frequency Tables

  • Frequency tables display the number of observations in each category or bin.
  • Commands:
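    • tabulate: Produces a frequency table (counts and percentages) for a variable.
    • tabstat: Displays summary statistics such as the mean and standard deviation; it does not produce a frequency table.
    • table: Creates a table of counts or summary statistics. (The command was not working in this session and still needs to be fixed.)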
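
To see the difference between a frequency table and a table of summary statistics, the sketch below uses the auto data set that ships with Stata. That data set and its variables rep78 and mpg are not part of these notes and are used here only as a convenient example.

```stata
* Load the example data set installed with Stata
sysuse auto, clear

* Frequency table: number and percentage of cars in each repair-record category
tabulate rep78

* Summary statistics for a numerical variable, not a frequency table
tabstat mpg, statistics(n mean sd min max)
```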

Recode Command

  • The recode command is used to change the values of a variable based on specified conditions.
  • Example:
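    • recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin): Recodes values of variable1 from 15.5 to 25.5 as 1, from 25.5 to 35.5 as 2, and from 35.5 to 45.5 as 3, and stores the result in a new variable called bin, leaving variable1 unchanged. Both ends of each range are inclusive, so adjacent rules overlap at 25.5 and 35.5; this is harmless only when no observation equals a boundary.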
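
A minimal sketch of the recode approach, written with each rule in parentheses as in Stata's recode documentation. The six observations are made up for illustration and are whole numbers, so none of them sits on a class boundary.

```stata
* Hypothetical data
clear
input variable1
16
21
24
29
33
44
end

* Assign each observation to class 1, 2, or 3 and keep variable1 unchanged
recode variable1 (15.5/25.5 = 1) (25.5/35.5 = 2) (35.5/45.5 = 3), generate(bin)

* Frequency table of the classes
tabulate bin
```

Compared with a series of generate and replace commands, a single recode line produces one class variable, which can be passed straight to tabulate.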

Graph Command

  • The graph command is used to generate graphs in Stata.
    • graph bar: Generates a bar graph.
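
As a small example of graph bar, the sketch below again assumes the auto data set shipped with Stata, which is not used elsewhere in these notes. Each bar shows the mean of mpg within one category of rep78.

```stata
* Load the example data set installed with Stata
sysuse auto, clear

* One bar per repair-record category; bar height is the mean miles per gallon
graph bar (mean) mpg, over(rep78)
```

Unlike a histogram, the bars here are separated because the horizontal axis shows categories rather than a numerical scale.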

Histogram Command

  • The histogram command is used to generate histograms in Stata.
  • histogram variable1, frequency start(15.5) width(10): Generates a frequency histogram for variable1, starting at 15.5 with a width of 10.
  • Options:
    • frequency: Specifies a frequency histogram.
    • percent: Specifies a percent histogram.
    • start(): Specifies the starting value for the histogram.
    • width(): Specifies the width of each class.
    • bin(): Specifies the number of bins; it is an alternative to width(), so use one or the other.
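
The options can be combined as in the sketch below, which assumes six made-up observations. Each histogram command replaces the previous graph in the Graph window, so run them one at a time to compare.

```stata
* Hypothetical data
clear
input variable1
16
21
24
29
33
44
end

* Frequency histogram with classes 15.5-25.5, 25.5-35.5, 35.5-45.5
histogram variable1, frequency start(15.5) width(10)

* Same classes, with percentages on the vertical axis
histogram variable1, percent start(15.5) width(10)

* Let Stata derive the class width from a requested number of bins
histogram variable1, frequency bin(3)
```

In the last command no start() is given, so the first class begins at the smallest observation and the class boundaries differ from those in the first two graphs.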

Skewness and Kurtosis

  • Skewness measures the asymmetry of a distribution.
    • A symmetric distribution has a skewness of 0.
    • A distribution skewed to the right has a positive skewness.
    • A distribution skewed to the left has a negative skewness.
  • Kurtosis measures the