Psychological Assessment Midterm Exam Reviewer

Psychological Testing and Assessment

I. HISTORY

Ancient Roots - psychological testing dates back to ancient Chinese civilization, where examinations were used to determine who would serve as government workers (present counterpart: the civil service examination).

The early emphasis of assessment was on determining human similarities, but the focus later transitioned to individual differences.

Francis Galton- Father of Psychometrics; he established the testing movement. He wanted to determine whether people would behave differently when presented with the same stimulus.

Early Experimental Psychologists- they focused on standardizing/controlling experiments in laboratory setups.

Intelligence Testing (+ Theories) - psychologists were concerned about “feeble-mindedness,” so they began to create structured intelligence tests. This continued until World War I.

Henry Goddard - first to translate the Binet Intelligence Test into English (1908); also known for introducing the term “moron.”

During World War I, the Army Alpha and Army Beta were constructed. This marked the start of group testing.

Robert Yerkes - implemented intelligence tests for recruits to the army.

Army Alpha - administered to literate individuals.

Army Beta - administered as a non-verbal intelligence test.

Arthur Otis - introduced the multiple-choice format and other test characteristics.

Robert S. Woodworth - devised the very first personality test, known as the “Personal Data Sheet” or “Woodworth Personal Data Sheet,” also known as the “Woodworth Psychoneurotic Inventory.”

Intelligence tests were introduced during World War I, and more tests were developed afterwards.

PERSONALITY TESTING

Hermann Rorschach - Swiss psychiatrist and psychoanalyst who developed the Rorschach Inkblot Test (RIBT) to measure unconscious parts of the subject’s personality.

Edwards Personal Preference Schedule - objective counterpart of the Rorschach Inkblot Test.

PSYCHOLOGICAL TESTING IN THE PH

Virgilio Enriquez - developed Panukat ng Ugali at Pagkatao (PUP)

BASIC CONCEPTS:

Objectives of Psychometrics

1. To measure behavior (both overt and covert)

2. To describe and predict behavior and personality (starts with the quantified aspects of behavior, then moves to the qualitative aspects and description of behavior)

3. To determine signs and symptoms of dysfunction

Testing vs. Assessment- testing and assessment differ in the following:

1. Objective

a. Testing- to obtain some measure or gauge, usually numerical in nature; what it yields are scores; to quantify

b. Assessment- to arrive at a decision or solve a problem or referral question; we are looking for answers here. This includes other methods, not just testing.

2. Focus

a. Testing- the main focus is for the examiner to identify how one individual compares with others in a homogeneous group; comes from the premise of nomothetic interpretation (from the standardization sample/norm).

b. Assessment- focus is on the uniqueness of an individual. We do not care much about his score with reference to the group; follows the idiographic approach/interpretation, which appreciates the individual characteristics of a person.

3. Process

a. Testing- faster and easier because it can be administered to a group or individually. Simply administering the test, scoring, and interpreting.

b. Assessment- focus is on individual processes so this is more individualized. Looks at the various psychological underpinnings of behavior.

4. Role of evaluator- evaluator/test user/proctor/assessor

a. Testing- the role of the evaluator is not that critical in the process because he does not have any direct impact on the result; therefore, the proctor can be substituted.

b. Assessment- the assessor is the most important key to the process: (1) he selects the tools to be administered to confirm the hypothesis, and (2) he makes the interpretation of the results.

5. Skill of Evaluator

a. Testing- technician-like skill is required (“technical know-how”)

b. Assessment- intensified and advanced skills; educated selection of tools, integration of data; there is a need for intensive training for assessors to conduct assessment.

6. Outcome

a. Testing- a series of test scores is yielded. Concrete outcome: the psychometric report- the result of the test and its interpretation.

b. Assessment- the outcome is the answer to the referral question. Concrete outcome: the psychological report- includes the identifiers and the test results (psychometric report), which are later verified against other sources of data such as interviews, third-party sources, case studies, etc.

7. Duration

a. Testing- shorter in terms of duration.

b. Assessment- longer; usually takes a few days to accomplish.

8. Sources of data

a. Testing- the source of data is the test taker or examinee himself/herself.

b. Assessment- data from the examinee himself is not enough; we need other people, such as family and teachers, to corroborate our findings/results. They help us answer the referral question.

9. Qualification for use

a. Testing- knowledge of test and testing procedures

b. Assessment- apart from knowledge, there is also a need for more in-depth or specialized knowledge: for projective techniques, a psychiatric focus or psychiatric tests; in I/O psychology, knowledge of job requirements.

10. Cost

a. Testing- cheaper in the context of clients because you only pay for the tests.

b. Assessment- you will be charged not just for the tests but also for professional fees and session costs.

Psychometricians are allowed to be involved in the assessment process for as long as their scope of work is limited only to what the law dictates. The psychometrician should also be guided by the psychologist.

The PRB is currently working on changing the nomenclature of “psychometrician” to “psychometrist.” In the Western world, psychometricians are doctors in the field of psychology, whereas psychometrists’ work is limited to test administration and other related activities only.

7 ASSUMPTIONS ABOUT ASSESSMENT

1. Psychological traits and states exist - we should be able to differentiate these two so we would know the right materials to administer. Sometimes as psychometricians we are expected to measure traits, sometimes states.

a. Trait- characteristic behaviors and feelings that are consistent and long lasting; characteristic behaviors that are the most perseverative/lasting (the influence of time). Ex. Personality disorder (a psychopathological trait)

b. State- temporary behaviors or feelings that depend on a person’s situation and motives at a particular time (example: when somebody is grieving). Ex. of a state: psychopathological tendencies (depressive tendencies, for example), but not all psychopathological tendencies.

A state can become a trait if not addressed properly; therefore, early detection is important. Ex. Depressive symptoms or “feeling depressed”: the state is temporary, but if not addressed, it can lead to clinical depression.

2. Psychological traits and states can be identified and measured - we cannot measure them 100%; we are only trying to estimate.

3. Test-related behavior predicts non-test-related behavior.

a. Postdict- our ability to estimate something in the past; to form a conjecture about something that occurred beforehand. Ex. Taking a test now to know what happened in the past. Ex. In a guidance setting, non-performing students can be referred for assessment to know what happened that led to their present condition.

b. Predict- future-oriented. You are constructing a test to predict what could possibly happen in the future. Ex. Aptitude testing

4. Tests and other measurement techniques have strengths and weaknesses- some tests are better in one case but not applicable in other cases; not a “one-size-fits-all.”

5. Various sources of error are part of the assessment procedure- errors cannot be removed but can be minimized.

a. Error (systematic and random)- a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test; error has nothing to do with the ability that we are trying to measure.

b. Error variance- statistical amount of error that is present in the test.

Trait error- sources of error that reside within the individual taking the test (such as “I didn’t study enough,” “I felt bad that I missed my blind date,” “I forgot to set the alarm”- excuses).

Method error- sources of error that reside in the testing situation (such as lousy test instructions, a too-warm room, or missing pages).

6. Testing and assessment can be conducted in a fair and unbiased manner. Always take note that tests should be used in a fair manner.

7. Testing and assessment benefit society.

Forms of Psychological Assessment

1. Therapeutic Psychological Assessment- while the client is undergoing the process, the process itself is therapeutic for the client per se.

2. Collaborative Psychological Assessment- the client (who cooperates fully by giving data) and the assessor work collaboratively to arrive at a diagnosis.

3. Dynamic Psychological Assessment- mechanical and highly structured; follows 3 phases: (1) assessment (to arrive at a diagnosis), (2) intervention, and (3) evaluation (did the intervention work? If yes, continue; if not, find another intervention).

Parties in Psychological Assessment- the first four strictly adhere to the ethical principles of use and enterprise:

1. Test authors and developers

2. Test users

3. Test takers

4. Society at large

5. Other parties


Tools for Psychological Assessment

1. Tests- standardized measuring devices to gauge our client’s attitudes, personality, ability, etc.

a. Measurement- quantifying particular occurrences of a particular thing in a person; numerical/score; estimation/quantification; counting the score for a specific construct

b. Assessment- integrating/synthesizing the results of measurement in terms of norms.

c. Evaluation- the process of making judgments and decisions; an extension of assessment (though assessment itself already involves decision-making).

2. Interviews

a. Structured- follows a strict line of questioning/guide questions; follows a rigid way of scoring

b. Unstructured- does not have pre-determined questions; usually used in rapport building or client engagement

c. Semi-structured- you are allowed to ask follow-up questions

3. Documents

a. Portfolio Assessment- documentary analysis that tells us what our clients can do. Ex. Assessment for promotion of employees

b. Case history data

4. Behavioral Observation

All of these tools are used together to provide a substantial psychological assessment. You cannot really say that you have conducted an assessment if you used only one tool.

Ways to Classify Tests

1. According to qualifications and training of test user

a. The three-tier system of psychological tests (required qualifications): Level A (knowledge of tests, tool selection, administration, scoring, and interpretation; e.g., teachers), Level B (intensive knowledge of test properties; e.g., psychometricians), and Level C (training in the use of higher-level psychological tests; e.g., psychologists)

2. According to the number of test takers

a. Individual test

b. Group test

3. According to the variable being measured

a. Ability (achievement, aptitude, intelligence)

b. Typical performance (objective and projective personality tests, interest, values)

Three-Tier System of Psychological Tests

1. Type/Level A

a. Achievement tests- not necessarily psychological tests because these can also be teacher-made tests. Ex. Drills

2. Type/Level B

a. Group intelligence tests

b. Objective personality tests

3. Type/Level C

a. Individual intelligence tests

b. Diagnostic tests

c. Projective tests

Power Test vs. Speed Test (only applicable to Ability Testing; not applicable to test of typical performance)

1. Power test

a. Requires an examinee to exhibit the extent or depth of his understanding or skill

b. Items with varying levels of difficulty (typically follows the spiral omnibus format) because time is not an issue here

Power tests are usually classified under aptitude tests. “Ceiling” principle- after n consecutive errors, that is when we end the test.

2. Speed test

a. Requires the examinee to complete (correctly) as many items as possible (has something to do with the influence of time)

b. Contains items of a uniform and generally simple level of difficulty because time is considered here

Ability Tests vs. Typical Performance

What do they measure?

a. Ability- measures the things that a person can do.

b. Typical performance- things that a person usually does.

What are their subtypes?

a. Ability- achievement, intelligence, aptitude (psych assessment usually focuses on intelligence and aptitude)

b. Typical performance- personality test, interest (preferences), attitude (disposition/judgment regarding a particular issue), values (things you consider important)

Under what conditions are these manifested?

a. Ability- where skills are to be presented

b. Typical performance- manifested anytime, anywhere

What is the nature of the answers?

a. Ability- always has correct and incorrect answers

b. Typical performance- all answers are correct

What is the objective/motivation of the test taker?

a. Ability- exhibit the extent of what they know

b. Typical performance- honesty, self-awareness

Types of Ability Tests- ability refers to the things a person CAN DO.

1. Achievement (least consistent domain)

a. Purpose- measures the extent of what the person has learned.

b. Time orientation- past

c. Underlying assumption (influence of prior learning; related to the concept of reliability)- has total reliance on prior learning

d. Validation process- content validity will suffice (content validation is the representativeness of the domains of the construct we are trying to measure)

e. Mode of scoring and interpretation- the raw score is converted to a standard score; modes of interpretation: normative (for testing), where the score of the examinee is compared to the norms; and ipsative (for assessment), where the focus is on the examinee and how his scores across all domains relate. Probably, the examinee’s highest score reflects his strengths and his lowest scores reflect his weaknesses.

2. Intelligence

a. Purpose - to have an idea of the general or overall mental ability of the examinee.

b. Time orientation- present; what is your current general mental ability at the moment?

c. Underlying assumption (influence of prior learning; related to the concept of reliability)- fluid (innate) and crystallized (acquired) intelligence, 50/50

d. Validation process- apart from content validity, it places high emphasis on construct validity (because of the various theories that govern your test)

e. Mode of scoring and interpretation- same as in achievement: normative (for testing) and ipsative (for assessment).

3. Aptitude (this is the most consistent domain; look at the underlying assumptions)

a. Purpose- measures our potential for learning and performance.

b. Time orientation- future

c. Underlying assumption (influence of prior learning; related to the concept of reliability)- no reliance on prior learning; does not need past learning

d. Validation process- apart from content validity, it places high emphasis on criterion validity (as to what extent our test-taker should be able to achieve)

e. Mode of scoring and interpretation- same as in achievement: normative (for testing) and ipsative (for assessment).

Achievement Tests

a. Stanford Achievement Test in Reading

b. Teacher-made tests

Intelligence Tests

a. Wechsler Scales (level C)

b. Stanford-Binet Scales (level C)

c. CFIT (level B)

d. RPM (level B)

Aptitude Test

a. Multiple Aptitude Test Battery

b. Special Aptitude Test

c. Differential Aptitude Test

USES OF TESTS

Classification involves assigning a person to one category rather than another.

a. Placement- sorting of persons to different programs

b. Screening- refers to quick and simple tests or procedures to identify persons who might have special characteristics or needs.

c. Certification- certification and selection both have a pass/fail quality

d. Selection- similar to certification in that it confers privileges, such as the opportunity to attend a university or gain employment.

Diagnosis- determining the nature and source of a person’s abnormal behavior pattern within an accepted diagnostic system.

Self-knowledge- in some cases, the feedback a person receives from psychological test results is so self-affirming that it can change the entire course of a life.

Program Evaluation- social programs are designed to provide services which improve social conditions and community life.

a. Diagnostic evaluation- refers to evaluation conducted before instruction.

b. Formative evaluation- refers to evaluation conducted during instruction.

c. Summative evaluation- refers to evaluation conducted at the end of a unit or a specified period of time.

Research- to update knowledge and to make sure that we are practicing reflexivity.

TEST DEVELOPMENT

o Test Development – an umbrella term for all that goes into the process of creating a test

o Test Conceptualization – brainstorming of ideas about what kind of test a developer wants to publish

o Questions to ponder on when conceptualizing for new tests:

1. What is the test designed to measure?

2. What is the objective?

3. Is there a need for this kind of test?

4. Who will use the test?

5. Who will take the test?

6. What content will the test cover?

7. How will the test be administered?

8. What is the ideal format of the test?

9. Should more than one form of test be developed?

10. What special training will be required of test users for administering or interpreting the test?

11. What types of responses will be required of test takers?

12. Who benefits from an administration of this test?

13. Is there potential harm?

14. How will meaning be attributed to scores on this test?

o Pilot Work/Pilot Study/Pilot Research – preliminary research surrounding the creation of a prototype of the test

Attempts to determine how best to measure a targeted construct

Entails literature reviews and experimentation, as well as the creation, revision, and deletion of preliminary items

Test Construction

Test Construction – stage in the process that entails writing test items, revisions, formatting, setting scoring rules

o Scaling – process of setting rules for assigning numbers in measurement

Process by which a measuring device is designed and calibrated and by which numbers (scale values) are assigned to different amounts of the trait, attribute, or characteristic being measured

Age-Based – age is of critical interest

Grade-Based – grade is of critical interest

Stanine – if all raw scores of the test are to be transformed into scores that range from 1-9 (see the sketch after this list)

Unidimensional – only one dimension is presumed to underlie the ratings

Multidimensional – more than one dimension

Comparative and Categorical

Rating Scale – grouping of words, statements, or symbols on which judgments of the strength of a particular trait are indicated by the test taker

Summative Scale – final score is obtained by summing the ratings across all the items

Likert Scale – used to scale attitudes; usually reliable

Thurstone Scale - involves the collection of a variety of different statements about a phenomenon which are ranked by an expert panel in order to develop the questionnaire

Method of Paired Comparisons – produces ordinal data by presenting test takers with pairs of stimuli which they are asked to compare

Comparative Scaling – entails judgments of a stimulus in comparison with every other stimulus on the scale

Categorical Scaling – stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum

Guttman Scale – yields ordinal-level measures
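A minimal Python sketch of the stanine transformation named above (not from the reviewer): it ranks hypothetical raw scores and maps them onto the standard 4-7-12-17-20-17-12-7-4 percentage bands.

```python
# Minimal sketch (assumed data): convert raw scores to stanines (1-9)
# using the standard stanine percentage bands.

def to_stanines(raw_scores):
    """Rank-order raw scores and map each to a stanine from 1 to 9."""
    n = len(raw_scores)
    # Cumulative proportions that close each stanine band
    cutoffs = [0.04, 0.11, 0.23, 0.40, 0.60, 0.77, 0.89, 0.96, 1.00]
    order = sorted(range(n), key=lambda i: raw_scores[i])  # lowest first
    stanines = [0] * n
    for rank, i in enumerate(order):
        proportion = (rank + 1) / n  # proportion of scores at or below
        stanines[i] = next(s for s, c in enumerate(cutoffs, start=1)
                           if proportion <= c)
    return stanines

scores = [12, 25, 31, 18, 40, 27, 22, 35, 15, 29]  # hypothetical raw scores
print(to_stanines(scores))
```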

o Item Pool – reservoir or well from which the items will or will not be drawn for the final version of the test

A comprehensive sampling provides a basis for content validity of the final version of the test

The test developer may write a large number of items from personal experience or academic acquaintance with the subject matter, or obtain them from experts

o Item Format – form, plan, structure, arrangement, and layout of individual test items

Selected-Response Format – requires test takers to select a response from a set of alternative responses

Multiple-Choice Format

✓ Has three elements: stem (question), a correct option, and several incorrect alternatives (distractors or foils)

✓ Should have one correct answer and grammatically parallel alternatives of similar length that fit grammatically with the stem; avoid ridiculous distractors, excessively long items, “all of the above,” and “none of the above”

✓ Probability of getting the correct answer by guessing is 25% (given four options)

Matching Item

✓ Test taker is presented with two columns: Premises and Responses

✓ Premises and responses should be fairly short and to the point, and only one premise should match each response

Binary Choice

✓ True-False Item

✓ Usually takes the form of a sentence that requires the test taker to indicate whether the statement is or is not a fact

✓ Contains a single idea and is not subject to debate

✓ Probability of obtaining the correct answer is 50%

Constructed-Response Format – requires test takers to supply or to create the correct answer, not merely selecting it

Completion Item

✓ Requires the examinee to provide a word or phrase that completes a sentence

✓ Should be worded properly so that the correct answer is specific

Short-answer item

✓ Should be written clearly enough that the test taker can respond succinctly, with short answer

Essay Item

✓ Requires the test taker to respond to a question by writing a composition; usually an open-ended format

✓ Allows creative integration and expression of the material

✓ Tends to focus on a more limited area than can be covered in the same amount of time when using a series of selected-response or completion items

✓ Subject to scoring and inter-scorer differences

Test Tryout

o The test should be tried out on people who are similar in critical respects to the people for whom the test was designed

o An informal rule of thumb: no fewer than 5, and preferably as many as 10, subjects for each item (the more, the better)

o Risk of using few subjects = phantom factors emerge

o Should be executed under conditions as identical as possible

o A good test item is one that is answered correctly by high scorers as a whole

Item Analysis

o Statistical procedures used to analyze items

Item Difficulty – defined by the number of people who get a particular item correct

o Item-Difficulty Index – calculated as the proportion of the total number of test takers who answered the item correctly

The larger the index, the easier the item

For achievement testing

Item-Endorsement Index – the counterpart for personality testing

The optimal average item difficulty is approximately 50%, with items on the test ranging in difficulty from about 30% to 80%
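A minimal Python sketch of the item-difficulty index, using a hypothetical response matrix (1 = correct, 0 = incorrect):

```python
# Minimal sketch (assumed data): item difficulty p = proportion of
# test takers who answered the item correctly.

responses = [  # rows = test takers, columns = items
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]

n_takers = len(responses)
for item in range(len(responses[0])):
    p = sum(row[item] for row in responses) / n_takers
    print(f"Item {item + 1}: difficulty index p = {p:.2f}")
# A larger p means an easier item; an average p near .50 is usually optimal.
```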

o Item-Reliability Index – provides an indication of the internal consistency of a test

The higher this index, the greater the test’s internal consistency
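A minimal sketch of one common formulation of this index (assumed here: item standard deviation multiplied by the item-total correlation), with hypothetical data:

```python
# Minimal sketch (assumed): item-reliability index = s_i * r_iT,
# the item standard deviation times the item-total correlation.

import statistics

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

responses = [  # hypothetical: rows = test takers, columns = items
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in responses]
for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    s_j = statistics.pstdev(item)   # item standard deviation
    r_jt = pearson(item, totals)    # item-total correlation
    print(f"Item {j + 1}: reliability index = {s_j * r_jt:.3f}")
```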

o Item-Validity Index – designed to provide an indication of the degree to which a test is measuring what it purports to measure

The higher this index, the greater the test’s criterion-related validity


Item Banks – relatively large and easily accessible collection of test questions

Computerized Adaptive Testing – refers to an interactive, computer administered test-taking process wherein items presented to the test taker are based in part on the test taker’s performance on previous items

The test administered may be different for each test taker, depending on the test performance on the items presented

Reduce the number of test items that need to be administered by 50% while simultaneously reducing measurement error by 50%

Reduces floor and ceiling effects

Floor Effects – occurs when there is some lower limit on a survey or questionnaire and a large percentage of respondents score near this lower limit (test takers have low scores)

Ceiling Effects – occurs when there is some upper limit on a survey or questionnaire and a large percentage of respondents score near this upper limit (test takers have high scores)

Item Branching – ability of the computer to tailor the content and order of presentation of items on the basis of responses to previous items
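A minimal sketch of the item-branching idea, with a hypothetical item bank and a simple branching rule (all names and data assumed, not from the reviewer):

```python
# Minimal sketch (assumed): item branching in a computerized adaptive test;
# the next item's difficulty depends on whether the last response was correct.

item_bank = {  # hypothetical: difficulty level -> item ids
    1: ["easy_1", "easy_2"],
    2: ["medium_1", "medium_2"],
    3: ["hard_1", "hard_2"],
}

def next_difficulty(current, last_correct):
    """Step up after a correct answer, down after an incorrect one."""
    if last_correct:
        return min(current + 1, 3)
    return max(current - 1, 1)

level = 2  # start at medium difficulty
for answer in [True, True, False]:  # simulated responses
    level = next_difficulty(level, answer)
    print(f"Next item drawn from level {level}: {item_bank[level][0]}")
```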

Cumulative Scoring – the higher the score achieved on the test, the higher the test taker is on the ability that the test purports to measure

Class Scoring/Category Scoring – test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way

Ipsative Scoring – comparing a test taker’s score on one scale within a test to another scale within that same test

Semantic Differential Rating Technique – measures an individual’s unique, perceived meaning of an object, a word, or an individual

Item-Discrimination Index – measure of item discrimination

Measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly

Extreme Group Method – compares people who have done well with those who have done poorly on the test

Discrimination Index – the difference between these two proportions

Point-Biserial Correlation – correlation between a dichotomous variable and a continuous variable
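A minimal Python sketch of the extreme group method and the discrimination index d, using hypothetical scores:

```python
# Minimal sketch (assumed data): discrimination index via the extreme
# group method, d = p(upper group correct) - p(lower group correct).

# Hypothetical (total_score, item_correct) pairs for ten test takers
records = [(38, 1), (35, 1), (33, 1), (30, 1), (28, 0),
           (25, 1), (22, 0), (20, 0), (17, 0), (12, 0)]

records.sort(key=lambda r: r[0], reverse=True)
k = len(records) // 3  # top and bottom thirds as the extreme groups
upper, lower = records[:k], records[-k:]

p_upper = sum(item for _, item in upper) / k
p_lower = sum(item for _, item in lower) / k
d = p_upper - p_lower
print(f"d = {p_upper:.2f} - {p_lower:.2f} = {d:.2f}")
# d closer to +1 means the item separates high and low scorers well.
```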

Test Revision

o Characterize each item according to its strengths and weaknesses

o As revision proceeds, the advantage of writing a large item pool becomes more apparent, because some items will be removed and must be replaced by items from the item pool

o Administer the revised test under standardized conditions to a second appropriate sample of examinees

o Cross-Validation – revalidation of a test on a sample of test takers other than those on whom test performance was originally found to be a valid predictor of some criterion

Often results in validity shrinkage

o Validity Shrinkage – decrease in item validities that inevitably occurs after cross-validation

o Co-validation – conducted on two or more tests using the same sample of test takers

STATISTICS REFRESHER

✓ Hypothesis Testing

✓ Descriptive Statistics

✓ Inferential Statistics

RESEARCH HYPOTHESES

✓ Null hypothesis- states no relationship between variables

✓ Alternative hypothesis- gives the predicted relationship

Hypothesis Testing

In our assessment, we start with a hypothesis, and this comes in 2 types:

1. Null

a. Significant- H0 is false- reject H0 (accept the alternative counterpart: alternative significant)

b. Insignificant- H0 is true- accept H0 (accept the alternative counterpart: alternative insignificant)

2. Alternative

a. Significant- H1 is true- accept H1

b. Insignificant- H1 is false- reject H1

Two questions we need to bear in mind when we want to determine the right statistical tool to use:

1. What type of data are we dealing with?

2. What are we looking for?

Descriptive Statistics

Primary Scales of Measurement

✓ Nominal- non-parametric; the data do not necessarily refer to numerical entities

✓ Ordinal- non-parametric; the data do not necessarily refer to numerical entities

✓ Interval- parametric; numerical data here mean actual numbers; possesses magnitude and equal-appearing intervals (there can be values below zero)

✓ Ratio- parametric; numerical data here mean actual numbers; has magnitude, equal-appearing intervals, and an absolute zero

Frequency Distribution- translating nominal data into numbers.

Measures of Central Tendency

✓ Mean- average of a set of scores

✓ Median- midpoint

✓ Mode- most frequently occurring score

Which measure to use for which type of data:

Nominal data- Mode

Ordinal data- Median

Interval/Ratio (not skewed/normal)- Mean

Interval/Ratio (skewed)- Median

Measures of Variability- measures the spread of scores

✓ Range- difference between highest and lowest score

✓ Interquartile range- the difference between quartile 3 and quartile 1 (IQR = Q3 – Q1)

✓ Semi-interquartile range- simply divide the IQR by 2

✓ Standard deviation- the distance of scores from the mean
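A minimal Python sketch of these measures of variability, using hypothetical scores and only the standard library:

```python
# Minimal sketch (assumed data): range, IQR, semi-IQR, and SD.

import statistics

scores = [10, 12, 13, 15, 18, 20, 21, 24, 30]  # hypothetical test scores

score_range = max(scores) - min(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
iqr = q3 - q1
semi_iqr = iqr / 2
sd = statistics.stdev(scores)  # sample standard deviation

print(f"Range = {score_range}, IQR = {iqr}, Semi-IQR = {semi_iqr}, SD = {sd:.2f}")
```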

Measures of Location

✓ Percentile

✓ Quartile

✓ Decile

Percentile Formula- entails the location of a score:

Percentile = (number of examinees beaten / total number of examinees) x 100

Percentage- entails the proportion of a score:

Percentage = (obtained score / total number of items) x 100
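Both formulas in a minimal Python sketch (data are hypothetical):

```python
# Minimal sketch (assumed data) of the percentile and percentage formulas.

def percentile_rank(score, all_scores):
    """Percent of examinees 'beaten' (those scoring below the given score)."""
    beaten = sum(1 for s in all_scores if s < score)
    return beaten / len(all_scores) * 100

def percentage(obtained, total_items):
    """Proportion of items answered correctly, as a percent."""
    return obtained / total_items * 100

scores = [32, 45, 27, 38, 41, 30, 36, 48, 25, 40]  # hypothetical class scores
print(percentile_rank(38, scores))  # 50.0 -> beat 5 of 10 examinees
print(percentage(38, 50))           # 76.0 -> 38 of 50 items correct
```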

SKEWNESS

Positive skew- the tail of the distribution is pointing to the right, the (+) side; Mean > Median > Mode

Negative skew- the tail of the distribution is pointing to the left, the (-) side; Mean < Median < Mode

***Remember that the coefficient of skewness should be less than 3 in order to be accepted.

KURTOSIS- emphasis is on the peak of the distribution. The value of the kurtosis should be less than 10.

✓ Platykurtic- flatter than the normal distribution

✓ Mesokurtic- nearest to the normal distribution

✓ Leptokurtic- more peaked than the normal distribution

Raw Scores

✓ Indicates the number of items correctly answered on a given test. In almost all cases, it is the first score a teacher obtains when interpreting data.

The Normal Curve
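A minimal illustration (not from the reviewer) of the normal curve's standard areas, the familiar 68-95-99.7 benchmarks, computed from the cumulative distribution function:

```python
# Minimal sketch: areas under the standard normal curve within 1, 2, 3 SDs.

from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
for k in (1, 2, 3):
    area = z.cdf(k) - z.cdf(-k)
    print(f"Within +/-{k} SD of the mean: {area:.1%}")
# Prints roughly 68.3%, 95.4%, 99.7%.
```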


Hypothesis Testing

O Statistical method that uses sample data to evaluate a hypothesis about a population

O Alternative Hypothesis – states there is a change, difference, or relationships

O Null Hypothesis – no change, no difference, or no relationship

O Alpha Level or Level of Significance – used to define concept of “very unlikely” in a hypothesis test

O T-Test – used to test hypotheses about an unknown population mean and variance

Can be used in “before and after” type of research

Sample must consist of independent observations; that is, there is no consistent, predictable relationship between the first observation and the second

The population that is sampled must be normal; if the distribution is not normal, use a large sample
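A minimal sketch of a “before and after” comparison with a paired t-test (assuming SciPy is available; the data are hypothetical):

```python
# Minimal sketch (assumed data): paired t-test for a before/after design.

from scipy import stats

before = [12, 15, 11, 18, 14, 16, 13, 17]  # hypothetical pre-test scores
after = [14, 18, 13, 21, 15, 19, 16, 20]   # hypothetical post-test scores

t_stat, p_value = stats.ttest_rel(before, after)

alpha = 0.05  # level of significance
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis (a significant change occurred).")
else:
    print("Fail to reject the null hypothesis.")
```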

Correlation and Inference

O Correlation Coefficient – number that provides us with an index of the strength of the relationship between two things

O Correlation – an expression of the degree and direction of correspondence between two things

+ and - = direction

A number anywhere from -1 to 1 = magnitude

Positive – same direction; either both variables go up or both go down

Negative – inverse direction; either the DV goes up while the IV goes down, or the IV goes up while the DV goes down

0 = no correlation

O Pearson r/Pearson Correlation Coefficient/Pearson Product-Moment Coefficient of Correlation – used when two variables being correlated are continuous and linear

Devised by Karl Pearson

Coefficient of Determination (r²) – an indication of how much variance is shared by the X- and Y-variables

o Spearman Rho/Rank-Order Correlation Coefficient/Rank-Difference Correlation Coefficient – frequently used when the sample size is small and both sets of measurements are ordinal

Developed by Charles Spearman
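A minimal sketch computing Pearson r, the coefficient of determination, and Spearman rho (assuming SciPy; the data are hypothetical):

```python
# Minimal sketch (assumed data): Pearson r, r^2, and Spearman rho.

from scipy import stats

x = [2, 4, 5, 7, 8, 10, 11, 13]       # e.g., hours spent reviewing
y = [50, 55, 60, 64, 70, 72, 75, 80]  # e.g., exam scores

r, p = stats.pearsonr(x, y)  # for continuous, linear variables
print(f"Pearson r = {r:.2f}, r^2 = {r ** 2:.2f}, p = {p:.4f}")

rho, p_rho = stats.spearmanr(x, y)  # for small samples / ordinal data
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.4f}")
```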

O T-Test (Dependent)/Paired T-Test – two nominal groups (either matched or repeated measures) + continuous scales

O One-Way ANOVA – 1 IV with 3 or more groups, 1 DV; comparison of differences

O Two-Way ANOVA – 2 IVs, 1 DV

O Critical Value – reject the null and accept the alternative if [ obtained value > critical value ]

O P-Value (Probability Value) – reject null and accept alternative if [ p-value < alpha level ]

O Norms – refer to the performances by defined groups on a particular test

O Age-Related Norms – certain tests have different normative groups for particular age groups

O Tracking – tendency to stay at about the same level relative to one’s peers

o Norm-Referenced Tests – compares each person with the norm

o Criterion-Referenced Tests – describes specific types of skills, tasks, or knowledge that the test taker can demonstrate
