EPI514 Midterm Exam Flashcards

Flashcards for the EPI514 midterm (midterm date: 10/18/2023), covering Lecture 1 through Lecture 10

185 Terms

1
New cards

Labels

replace the variable name in the output

2
New cards

Adding Labels to Variables

makes the output significantly more readable

3
New cards

LABEL statement info:

  • Use a LABEL statement to assign a label to each variable.

  • After the keyword LABEL, enter a variable name followed by an equal sign and the desired label in quotes.

  • If the label is made in PROC step, it will only be used for that procedure.

  • If the label is made in DATA step, it will be applied permanently.
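
A minimal sketch of the two placements described above. The data set name work.patients, the variable names, and the label text are hypothetical and chosen only for illustration.

```sas
/* Label assigned in a DATA step: stored with the data set */
data work.patients;
    input id age;
    label age = 'Age at enrollment (years)';
    datalines;
1 34
2 51
;
run;

/* Label assigned in a PROC step: used for this procedure only */
proc means data=work.patients;
    var age;
    label age = 'Age (yrs)';
run;
```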

4
New cards

Formats

are used to change how the data is displayed.

5
New cards

PROC FORMAT

allows for creating user-defined formats

6
New cards

VALUE statement

used to define format

7
New cards

PROC FORMAT

use it to regroup values together
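
As a rough illustration of regrouping values with a user-defined format, the sketch below defines a hypothetical format named agegrp with a VALUE statement and applies it in PROC FREQ. The data set, cut points, and labels are assumptions, not taken from the lecture.

```sas
/* Small example data set (hypothetical values) */
data work.ages;
    input age @@;
    datalines;
23 41 67 38 70 55
;
run;

/* VALUE statement defines the format; ranges regroup ages */
proc format;
    value agegrp
        low -< 40 = 'Under 40'
        40  -< 65 = '40 to 64'
        65 - high = '65 and over';
run;

/* The format changes how age is displayed and grouped, not how it is stored */
proc freq data=work.ages;
    tables age;
    format age agegrp.;
run;
```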

8
New cards

Format Library

  • If formats and labels are made in a DATA step, they are permanently associated with the data set

  • If the data set is saved permanently as a SAS data set with user-defined formats, you must store formats permanently as well

9
New cards

LIBNAME statement

use to create a libref that points to a storage location; data sets written to that libref are saved permanently

10
New cards

LIBRARY option in PROC FORMAT

use to save format library

11
New cards

OPTIONS statement

use to change SAS system options

12
New cards

FMTLIB option

use in PROC FORMAT to display the definitions of all formats in the library

13
New cards

SELECT statement

use in PROC FORMAT (together with FMTLIB) to display only the formats you name
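
The cards on LIBNAME, the LIBRARY option, OPTIONS, FMTLIB, and SELECT can be tied together in one sketch. The folder path, the libref myfmts, and the format yesno are hypothetical. The FMTSEARCH system option is not named on the cards; it is shown here as one likely use of the OPTIONS statement, since it tells SAS where to look for stored formats.

```sas
/* Libref pointing to a permanent folder (hypothetical path) */
libname myfmts '/path/to/formats';

/* LIBRARY= saves the format in a permanent format catalog */
proc format library=myfmts;
    value yesno 0 = 'No'
                1 = 'Yes';
run;

/* Tell SAS to search the permanent library for formats */
options fmtsearch=(myfmts);

/* FMTLIB displays format definitions; SELECT limits the display to yesno */
proc format library=myfmts fmtlib;
    select yesno;
run;
```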

14
New cards

Reading Data from Excel into SAS

The IMPORT procedure can be used to convert Excel, STATA, SPSS, JMP, Microsoft Access, and other software’s data files into SAS data sets

15
New cards

PROC IMPORT

PROC IMPORT will scan a worksheet and determine variable types and lengths of character variables automatically

  • DATAFILE option specifies the file you want to read

  • OUT option specifies the name of the SAS data set you are creating

  • DBMS option tells SAS the type of data to import, e.g., XLS, XLSX, JMP, ACCESS, CSV

  • REPLACE tells SAS to replace the SAS data set if the data set named in the OUT option already exists

    • Without REPLACE, PROC IMPORT will not overwrite an existing data set; this differs from a DATA step, which overwrites by default

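
A short sketch of the options listed above. The file path, sheet name, and output data set name are hypothetical placeholders.

```sas
/* Read one worksheet of an Excel file into a SAS data set */
proc import datafile='/path/to/survey.xlsx'   /* file to read (hypothetical) */
            out=work.survey                   /* SAS data set to create */
            dbms=xlsx                         /* type of file being imported */
            replace;                          /* overwrite work.survey if it exists */
    sheet='Sheet1';                           /* worksheet name (assumed) */
    getnames=yes;                             /* first row holds variable names */
run;
```
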
16
New cards

Accessing Excel File Using Engine

  • Can use the XLSX LIBNAME engine to read and write to an Excel file without converting it to a SAS data set

  • Can read an existing worksheet, replace a worksheet, or add a new worksheet

  • Cannot change individual values in a worksheet

17
New cards

XLSX Libname Engine

  • Uses the first row of the worksheet as variable names, determines each variable’s data type, assigns a length to character variables, and recognizes dates and numeric values with dollar signs and commas

  • Uses a LIBNAME statement to access the Excel file

  • As before, a libref remains associated with a SAS library only for the duration of the SAS session or until it is changed with another LIBNAME statement
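
One way the XLSX engine might be used, assuming a workbook at a hypothetical path with a worksheet named Sheet1. The libref xl and the new worksheet name summary are also assumptions.

```sas
/* Assign a libref to the workbook using the XLSX engine */
libname xl xlsx '/path/to/survey.xlsx';

/* Read an existing worksheet as if it were a SAS data set */
proc print data=xl.Sheet1;
run;

/* Writing to the libref adds (or replaces) a worksheet */
data xl.summary;
    set xl.Sheet1;
run;

/* Release the libref when finished */
libname xl clear;
```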

18
New cards

SAS Output Delivery System

Can use the SAS Output Delivery System (ODS) to create a CSV file from a SAS data set
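
A minimal sketch of writing a CSV file through ODS. It uses the sashelp.class sample data set that ships with SAS; the output path is a hypothetical placeholder.

```sas
/* Open the CSV destination, print the data set, then close the destination */
ods csv file='/path/to/class.csv';

proc print data=sashelp.class noobs;
run;

ods csv close;
```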

19
New cards

Conditional Processing

  • Conditional processing allows a program to make logical decisions based on data values

  • We looked at using IF statements previously to print observations that satisfied certain conditions using the PUT statement

  • Useful for creating new variables and subsetting data

20
New cards

IF and ELSE IF Statements

  • The expression following IF is evaluated, and if it is true the statement following THEN is executed

  • An ELSE statement can follow an IF/THEN statement and provides a statement to execute if the IF condition is false

  • ELSE IF follows an IF/THEN statement and provides an additional condition to evaluate if the prior IF condition is false

  • If working with large data sets that require the code to be as efficient as possible, place the IF/ELSE IF conditions most likely to be true first
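
The IF / ELSE IF / ELSE chain can be sketched as follows, using the sashelp.class sample data set. The new variable agegrp and its cut points are hypothetical.

```sas
data work.class_groups;
    set sashelp.class;
    length agegrp $ 8;                        /* set length before first assignment */
    if age < 13 then agegrp = 'Child';        /* checked first */
    else if age < 16 then agegrp = 'Teen';    /* checked only if the first IF is false */
    else agegrp = 'Older';                    /* runs if all prior conditions are false */
run;
```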

21
New cards

Comparison Operators in SAS

  • Many different comparison operators in SAS

  • Can be expressed as a symbol or mnemonic equivalent

  • The alternate spellings => and =< are not supported in WHERE clauses or PROC SQL; >= and <= (or GE and LE) work everywhere

22
New cards

Subsetting with IF Statement

  • We can use an IF statement to subset data

  • Place an IF statement in a DATA step without a THEN statement

  • If the condition is true, the DATA step continues

  • If the condition is false, the DATA step returns to the top and proceeds with the next observation

23
New cards

In Operator

  • IN checks if a value is contained in a list of values

  • Can also use IN with numerics

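
A brief sketch combining a subsetting IF with the IN operator, using the sashelp.class sample data set. The output data set names are hypothetical.

```sas
/* Subsetting IF: no THEN; only observations where the condition is true are kept */
data work.teens;
    set sashelp.class;
    if age in (13, 14, 15);      /* IN with a list of numeric values */
run;

/* IN with a list of character values */
data work.girls;
    set sashelp.class;
    if sex in ('F');
run;
```
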
24
New cards

SELECT Statement (in place of IF/ELSE)

  • SELECT statement is an alternative to a series of IF and ELSE IF statements

  • The select-expression follows the SELECT keyword, in parentheses

  • Each WHEN statement gives a when-expression, in parentheses, followed by a statement

  • The select-expression specifies which variable we want to compare to each of the when-expressions

  • If the select-expression is equal to the value in a when-expression, the statement after that WHEN is executed and we skip to the END statement

  • If none of the comparisons are true, the statement after OTHERWISE is executed

25
New cards

SELECT Statement (in place of IF/ELSE) without a select-expression

  • We can also omit the select-expression in the SELECT statement

  • Then each when-expression is evaluated until one is true, and the statement after it is executed

  • As before, if all are false then the statement after OTHERWISE is executed

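
Both forms of SELECT can be sketched with the sashelp.class sample data set. The new variables sexlabel and agegrp are hypothetical.

```sas
data work.class_select;
    set sashelp.class;
    length sexlabel $ 8 agegrp $ 8;

    /* With a select-expression: sex is compared to each when-expression */
    select (sex);
        when ('F') sexlabel = 'Female';
        when ('M') sexlabel = 'Male';
        otherwise  sexlabel = 'Unknown';
    end;

    /* Without a select-expression: each WHEN condition is evaluated in turn */
    select;
        when (age < 13) agegrp = 'Child';
        when (age < 16) agegrp = 'Teen';
        otherwise       agegrp = 'Older';
    end;
run;
```
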
26
New cards

Boolean Operators

  • Can combine multiple comparisons using logical operators, also called boolean operators

  • NOT evaluated first, then AND, then OR

  • Let X, Y, and Z represent comparison statements

  • IF X AND Y OR Z; equivalent to IF (X AND Y) OR Z;

  • IF X AND NOT Y OR Z; equivalent to IF (X AND (NOT Y)) OR Z;

27
New cards

WHERE Statement

  • WHERE statements are very similar to IF statements, but they can only be used for SAS data sets

28
New cards

Where Statement (List of Operators)

  • WHERE statements have a list of operators that cannot be used with IF statements

29
New cards

Examples of Operators (Part 1)

  • IS MISSING

    • WHERE Age IS MISSING;

  • IS NULL

    • WHERE Age IS NULL;

  • BETWEEN AND

    • WHERE Age BETWEEN 20 AND 40;

  • CONTAINS

    • CONTAINS is case sensitive

    • WHERE Name CONTAINS 'er';

    • Selects observations with name 'Jefferson' and 'Peter', but not 'Eric'

30
New cards

Examples of Operators (Part 2)

  • LIKE

    • LIKE is case sensitive

    • _ matches any single character

    • % matches a string of any length

    • WHERE Name LIKE 'J_s%';

      • Would select Justin, Josh, Jessica

  • =*

    • Uses the Soundex algorithm to compare whether the word sounds like the given value

    • WHERE name =* 'Smith';

    • Selects observations with names Smitt, Smythe, but not Schmitt
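
A few of the WHERE-only operators in use, sketched against the sashelp.class sample data set. The specific conditions are illustrative assumptions.

```sas
/* BETWEEN ... AND combined with LIKE (case sensitive) */
proc print data=sashelp.class;
    where age between 12 and 14 and name like 'J%';
run;

/* CONTAINS (case sensitive) combined with IS MISSING */
proc print data=sashelp.class;
    where name contains 'er' or weight is missing;
run;

/* Sounds-like operator */
proc print data=sashelp.class;
    where name =* 'Jon';
run;
```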

31
New cards

Looping

  • It is common to have a set of SAS statements that we want to execute multiple times

  • This can be accomplished in SAS using DO groups, DO loops, DO WHILE statements, and DO UNTIL statements

32
New cards

DO Groups

  • DO groups are often used with IF/ELSE statements when we have a group of statements that we want executed when the IF condition is TRUE

33
New cards

SUM Statement

  • The SUM statement adds the results of an expression to an accumulator variable

  1. SUM statement has the following form

    1. variable + expression

  2. Variable is set to 0 initially

  3. Variable is retained automatically

  4. Missing values are ignored

  5. Note there is no equal sign

34
New cards

RETAIN Statement

RETAIN statement causes a variable that is created by an INPUT or assignment statement to retain its value from one iteration of the DATA step to the next

35
New cards

SUM Statement (counter)

SUM statement is commonly used for creating counters
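
A small sketch of the SUM statement as a running total and as a counter, using the sashelp.class sample data set. The variable names total_weight and n_14plus are hypothetical.

```sas
data work.running;
    set sashelp.class;
    total_weight + weight;              /* SUM statement: no equal sign, retained, starts at 0 */
    if age >= 14 then n_14plus + 1;     /* counter: adds 1 each time the condition is true */
run;

proc print data=work.running;
    var name age weight total_weight n_14plus;
run;
```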

36
New cards

DO Loops

There are times where we want to execute the same code multiple times

37
New cards

DO Loops (plotting)

  • We can use a DO loop to increment X over a grid of points and calculate Y for each value of X

  • Can use an OUTPUT statement to create a data set that contains all of the (X, Y) pairs

  • PROC SGPLOT can be used to plot the line using the data set
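
The three steps above can be sketched as follows. The function y = x**2, the grid from -3 to 3, and the data set name work.curve are assumptions chosen for illustration.

```sas
/* Increment x over a grid and compute y for each value */
data work.curve;
    do x = -3 to 3 by 0.1;
        y = x**2;
        output;          /* write one (x, y) pair per iteration */
    end;
run;

/* Plot the line from the generated data set */
proc sgplot data=work.curve;
    series x=x y=y;
run;
```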

38
New cards

DO WHILE statements

  • DO WHILE statements execute a block of statements repeatedly while a condition is true

  • For DO WHILE the condition is evaluated at the beginning of the loop

  • For DO WHILE, we need to be sure that the condition will eventually stop being true

  • Otherwise it will create an infinite loop that never stops running

    • Can press cancel to stop the program

39
New cards

DO UNTIL Statements

  • DO UNTIL statements execute a block of statements repeatedly until a condition is met

  • For DO UNTIL the condition, placed in parentheses after UNTIL, is evaluated at the bottom of the loop

    • Therefore the code is always executed at least once

  • Need to be careful when using DO UNTIL statements that the condition eventually becomes true

  • Otherwise it will create an infinite loop that never stops running

    • Can press cancel to stop the program

40
New cards

Combining DO UNTIL/WHILE and DO Loops

  • We can use a DO loop that loops over an index variable while also containing an UNTIL statement that will stop the loop as soon as the condition becomes true

  • Also possible to combine a DO Loop with a DO WHILE statement

  • Then there is no possibility of an infinite loop as the loop will stop when the index variable has iterated through all of its values
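
A sketch of the three loop forms side by side. The starting balance, interest rate, and thresholds are hypothetical values, not taken from the lecture.

```sas
data _null_;
    /* DO WHILE: condition checked at the top of the loop */
    balance = 1000;
    year = 0;
    do while (balance < 2000);
        balance = balance * 1.05;
        year + 1;
    end;
    put year= balance=;

    /* DO UNTIL: condition checked at the bottom, so the body runs at least once */
    x = 0;
    do until (x >= 3);
        x + 1;
    end;
    put x=;

    /* Index loop combined with UNTIL: stops at i = 100 even if the condition never becomes true */
    do i = 1 to 100 until (total > 50);
        total + i;
    end;
    put i= total=;
run;
```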

41
New cards

LEAVE Statement

  • The LEAVE statement inside a DO loop ends the loop and moves to executing the statement after the END statement

  • LEAVE statement can also be used inside a SELECT group

42
New cards

CONTINUE Statement

  • The CONTINUE statement ends the current iteration of the loop and moves on to the next iteration
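
LEAVE and CONTINUE can be contrasted in one short loop. The conditions used here are arbitrary and only meant to show the control flow.

```sas
data _null_;
    do i = 1 to 10;
        if mod(i, 2) = 0 then continue;   /* skip the rest of this iteration for even i */
        if i > 7 then leave;              /* exit the loop entirely */
        put i=;                           /* reached only for odd i up to 7 */
    end;
run;
```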

43
New cards

SAS Dates

Dates are stored as the number of days from January 1, 1960

44
New cards

YRDIF function

returns the number of years between two dates given by the start date (DOB) and end date (Date2)

45
New cards

Date Constant

  • Suppose we want to calculate the age for everyone in the data set at a specific date

  • Can enter dates in a DATA step using a date constant

  • General form is a one- or two-digit day, three-character month, and two- or four-digit year in quotation marks followed by a d

  • For example, a date can be written as

    • '28Sep2022'd

  • This is the only form allowed for a date constant

  • Date constants can be used in any expression involving dates

46
New cards

Date Constant for Today

  • Can use the TODAY function to return today’s date

  • DATE() is identical to TODAY()

47
New cards

DATDIF Function

  • DATDIF function returns the number of days between two dates

  • DATDIF function has a required third argument, basis

    • DATDIF(start-date, end-date, basis)

  • Basis is used to specify how we want to count the number of days

    • '30/360' uses a 30-day month and 360-day year regardless of the actual number of days in a month or year

    • 'ACT/ACT' uses the actual number of days

    • 'ACT/360' uses the actual number of days in a specific month and a 360-day year

    • 'ACT/365' uses the actual number of days in a specific month and a 365-day year

  • YRDIF function has a similar third argument that changes how the difference in years is calculated

    • Default is 'AGE' for calculating a person’s age
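
A minimal sketch of date constants with YRDIF, DATDIF, and TODAY. The date of birth is a hypothetical value; DOB and Date2 follow the names used on the YRDIF card.

```sas
data _null_;
    dob   = '15Mar1990'd;                      /* date constant (hypothetical DOB) */
    date2 = '28Sep2022'd;
    age       = yrdif(dob, date2, 'AGE');      /* years between the two dates */
    days      = datdif(dob, date2, 'ACT/ACT'); /* actual number of days between them */
    age_today = yrdif(dob, today(), 'AGE');    /* age as of the day the step runs */
    put age= days= age_today=;
    put date2= date9.;                         /* display the stored number as a date */
run;
```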

48
New cards

Extracting Day, Month, and Year from a SAS Date

  • WEEKDAY function returns the day of the week with Sunday = 1

  • DAY function returns the day of the month

  • MONTH function returns the month

  • YEAR function returns the four-digit year value

49
New cards

Creating a Date from Month, Day, Year

  • MDY function allows you to create a SAS date from month, day, and year values

  • Any missing value results in a missing date

50
New cards

Date Interval Function

  • INTCK function computes the number of interval boundaries (e.g., months, quarters, years) that are crossed between two dates

  • INTNX function computes a date after a given number of intervals

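
The extraction functions, MDY, INTCK, and INTNX can be tried together in one step. The dates are arbitrary examples.

```sas
data _null_;
    d   = mdy(9, 28, 2022);          /* build a SAS date from month, day, year */
    dow = weekday(d);                /* day of week, Sunday = 1 */
    dd  = day(d);
    mm  = month(d);
    yy  = year(d);

    /* One month boundary (1 October) lies between these dates, so the result is 1 */
    months = intck('month', '28Sep2022'd, '01Oct2022'd);

    /* Advance one month; by default INTNX returns the start of the interval (01OCT2022) */
    next = intnx('month', d, 1);

    put d= date9. dow= dd= mm= yy= months= next= date9.;
run;
```

Note that INTCK counts boundaries crossed rather than elapsed time, which is why two dates only three days apart can still be one month apart.
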
51
New cards

Partial List of Intervals for INTCK/INTNX

A partial list of the intervals used for INTCK and INTNX given by

52
New cards
53
New cards
54
New cards

By default SAS assumes that data values are separated by ( )

one or more blanks

55
New cards

The input statement contains

the variables you want to associate with each data value

56
New cards

A period must be separated from other values by ( )

at least one space

57
New cards

A common way to store data on Windows and UNIX platforms is in ( ), which use commas instead of blanks as data delimiters

comma-separated values (CSV) files

58
New cards

( ) is a sequence of one or more characters that marks the beginning or end of a unit of data

delimiter

59
New cards

( ) after the file name is an option for infile

DSD (delimiter-sensitive data)

60
New cards

As an alternative file reference, you may use ( ) to specify file containing data.

FILENAME statement

61
New cards

( ) overrides DSD when used together

Delimiter specified by DLM overrides DSD when used together

62
New cards

For TAB key delimiter need to use the ( )

hexadecimal equivalent

63
New cards

Hexadecimal is a numeral system with base ( )

Standard numeral system is base ( )

Uses 16 distinct symbols for each digit, commonly with ( ) representing 0-9 and ( ) representing 10-15

Used by software developers because ( ) binary values, which range from 00000000 to 11111111, can be written as hexadecimal numbers from ( ) to ( )

16

10

0-9

A-F

8-bit

00 to FF

64
New cards

Hexadecimal values can be used in a SAS statement by placing ( ) followed immediately by ( )

value in quotes followed immediately by x (no spaces)

65
New cards

Can use ( ) to read data directly into SAS without using an external file

datalines

66
New cards

You can use DATALINES with ( ) and use DATALINES as file reference in ( )

INFILE options and INFILE statement

67
New cards

A method for reading data in fixed columns is called

column input

68
New cards

Column input cannot read numeric data that contains ( ) or ( ), and can only read dates as ( )

commas or dollar signs; dates can only be read as character values

69
New cards

Formatted input reads data from fixed columns similarly to column input, but allows for ( ) (e.g. containing dollar signs or commas) and ( ) in a variety of formats

nonstandard numerical values and dates in a variety of formats

70
New cards

( ) sign is called a column pointer and tells which column the variable starts at

@ sign is called a column pointer and tells which column the variable starts at

71
New cards

( ) tells SAS that the value is character and w columns wide

mmddyy10. tells SAS the date is in the ( ) format

The date is stored as a numeric value: the number of days from January 1, 1960

$w. informat tells SAS that the value is character and w columns wide; mmddyy10. tells SAS the date is in the mm/dd/yyyy format

72
New cards

Can use ( ) in PROC PRINT to change how values are displayed

FORMAT statement

73
New cards

What does dollar11.2 indicate?

use dollar format, with up to 11 columns and 2 decimal places
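
Formatted input, column pointers, and the FORMAT statement can be seen together in a small sketch. The data values, column positions, and variable names are hypothetical.

```sas
data work.sales;
    /* @n moves to column n; the informat after each name tells SAS how to read the value */
    input @1  name     $10.
          @11 saledate mmddyy10.
          @22 amount   dollar11.2;
    datalines;
Smith     09/28/2022 $1,234.50
Jones     10/01/2022   $987.00
;
run;

/* FORMAT changes how values are displayed, not how they are stored */
proc print data=work.sales;
    format saledate mmddyy10. amount dollar11.2;
run;
```

Without the FORMAT statement, saledate would print as the raw number of days since January 1, 1960.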

74
New cards

For informats a ( ) tells SAS to use specified informat but to stop reading value at delimiter

colon

75
New cards

May place INFORMAT statement before ( ) statement in the DATA step

Can apply informat to ( ) by listing multiple variables with a ( )

Can also place INFORMAT statement before the INPUT statement in the DATA step

Can apply informat to multiple variables by listing multiple variables with a single informat after

76
New cards

( ) statement tells SAS where to find the data

INFILE statement

77
New cards

The order of the values in input match the order of the values in the file

T or F

True

78
New cards

A missing numeric value is represented by a ( ) in SAS

single period

79
New cards

In a CSV file each line of the file gives ( ) observation, each unit of data is ( ) and character values may ( )

one observation, each unit of data is separated by a comma, and character values may or may not be in quotes

80
New cards

DSD changes the ( ) and assumes ( ) delimiters and removes ( )

It changes the default delimiter to a comma, assumes two delimiters in a row indicate a missing value, and removes quotes (single or double) from character values

81
New cards

( ) allows you to specify the delimiter used in the file; you may also do this ( )

DLM option to INFILE statement; you may also spell out DELIMITER instead of DLM

82
New cards

Example of a hexadecimal in SAS

infile 'file name' dlm='09'x;
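
For context, a sketch of complete DATA steps that read delimited files. The file paths and variable names are hypothetical; the colon modifier tells SAS to use the informat but stop reading at the delimiter.

```sas
/* Comma-separated file: DSD sets comma as the delimiter, treats two
   delimiters in a row as a missing value, and strips quotes */
data work.csvdata;
    infile '/path/to/data.csv' dsd;
    input name :$20. age weight;
run;

/* Tab-delimited file: '09'x is the hexadecimal code for the tab character */
data work.tabdata;
    infile '/path/to/data.txt' dlm='09'x;
    input name :$20. age weight;
run;
```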

83
New cards

You can read data directly into SAS without using an external file, True or False?

True

84
New cards

( ) statement is equivalent to DATALINES

CARDS statement

85
New cards

( ) must be last statement in DATA step

DATALINES

86
New cards

Column input can read ( ) data and ( )

Column input can read character data and standard numeric values

87
New cards

After variable name is SAS ( ) which tells SAS how to read a data value

After variable name is SAS informat which tells SAS how to read a data value

88
New cards

( ) is used for numerics

( ) value tells SAS how many columns to read

( ) is optional and tells SAS the number of digits to the right of the decimal point

If a decimal point is already in it (decimal counts as ( ) ) then ( ) is ignored

w.d informat is used for numerics

w value tells SAS how many columns to read

d is optional and tells SAS the number of digits to the right of the decimal point

If decimal point already in it (decimal counts as column) then d is ignored

89
New cards

All SAS formats end in either a ( ) or a ( ) followed by a ( )

All SAS formats end in either a period or a period followed by a number

90
New cards

The period in a format name allows SAS to ( ) from variable or data set names

distinguish format names

91
New cards

FORMAT STATEMENT changes how the values are displayed in output and how they are stored. T or F?

F, only changes how the values in output are displayed but does not change how the variables are stored

92
New cards

Can also place format statement in the ( a ) step instead of PROC

If placed in ( a ) step, that format will be ( b) associated with the variable

Can override format in a particular PROC by including a ( c) statement

Can write FORMAT statement with ( ) to remove a format (format variablename;)

Can also place format statement in the DATA step instead of PROC

If placed in DATA step, that format will be permanently associated with the variable

Can override format in a particular PROC by including a FORMAT statement

Can write FORMAT statement with no format specified to remove a format (format variablename;)

93
New cards

Can also specify informats when reading data using ( )

Can also specify informats when reading data using list input

94
New cards

Need informat for a name if it is over ( )

8 bytes

95
New cards

Need informat for date if you want it stored as ( ) and not a ( )

numeric and not a character

96
New cards

list.txt contains the same data, ( ) as delimiters and ( ) around character values

list.txt contains the same data, but with blank spaces as delimiters and no quotes around character values

97
New cards

SAS is a collection of modules

collection of modules that are used to process and analyze data

98
New cards

SAS is the main software used in which industry

pharmaceutical industry

99
New cards

SAS was developed at which university? And what was it made for

North Carolina State University to create a statistical analysis system to analyze agricultural data in the late 60s

100
New cards

DATA Steps do what?

Read and modify data, create SAS data set