Assumption 1: Psychological Traits and States Exist
- "any distinguishable, relatively enduring way in which one individual varies from another"
Trait
- distinguishable ways in which one individual varies from another, but relatively less enduring than traits.
States
-an informed, scientific concept developed or constructed to describe or explain behavior.
Construct
Once it's acknowledged that psychological traits and states do exist, the specific traits and states to be measured and quantified need to be carefully defined.
The test developer must clearly define the construct.
Assumption 2: Psychological Traits and States Can Be Quantified and Measured
The objective of the test is to provide some indication of other aspects of the examinee's behavior.
The tasks in some tests mimic the actual behaviors that the test user is attempting to understand.
Predictions about future behavior.
In some forensic (legal) matters, psychological tests may be used not to predict behavior but to postdict it - that is, to aid in understanding behavior that has already taken place.
Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior
Competent test users understand a great deal about the tests they use.
Competent users understand how a test was developed, the circumstances under which it is appropriate to administer the test, how the test should be administered and to whom, and how the test results should be interpreted.
Assumption 4: Tests and Other Measurement Techniques Have Strengths and Weaknesses
Error refers to a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.
An intelligence test score could be subject to debate concerning the degree to which the obtained score truly reflects the person's intelligence.
Assumption 5: Various Sources of Error are Part of the Assessment Process
Strict adherence to the test manual.
Using culturally sensitive tools in testing and assessment.
Assumption 6: Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
In a world without tests:
- People could be hired on the basis of nepotism rather than ability.
- People could be labeled as mentally ill without basis.
- People could present themselves as psychologists, engineers, or pilots regardless of credentials.
Assumption 7: Testing and Assessment Benefit Society
CLASSICAL TEST THEORY Formula
Score = True Score + Random Error
X = T + E
In _, a person's observed or obtained score on a test is the sum of a true score and an error score. X denotes a person's observed score, T is the true score, and E is the random error (which should not be systematic).
Classical Test Theory
Classical test theory is also called classical true score model or classical reliability theory.
• When it comes to a latent construct we are interested in, the true score is always an unknown.
• From the assumptions of CTT, we can derive the following result.
• Var(X) = Var(T) + Var(E)
True Variance
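The additive model above can be checked with a quick simulation. This is a minimal sketch with invented numbers: observed scores are generated as true scores plus independent random error, so the variance of the observed scores should approximately equal the true variance plus the error variance.

```python
import random

# Simulate X = T + E with independent, non-systematic error
# (all distributions and numbers here are invented for illustration).
random.seed(42)
n = 100_000
T = [random.gauss(100, 15) for _ in range(n)]   # latent true scores
E = [random.gauss(0, 5) for _ in range(n)]      # random error, mean 0
X = [t + e for t, e in zip(T, E)]               # observed scores

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Var(X) should be close to Var(T) + Var(E)
print(f"Var(X) = {var(X):.1f}, Var(T) + Var(E) = {var(T) + var(E):.1f}")
```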
- Consistency of the test across time.
- Coefficient of Stability - the estimate of test-retest reliability. Obtained using Pearson-r.
- Not appropriate when we measure states or dynamic characteristics.
Time Sampling: Test-retest
- the estimate of test-retest reliability. Obtained using Pearson-r.
Coefficient of Stability
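As a sketch, the coefficient of stability is simply the Pearson r between scores from two administrations of the same test. The scores below are invented for illustration.

```python
# Hypothetical scores from the same examinees at two points in time.
time1 = [12, 15, 11, 18, 14, 16, 10, 17]
time2 = [13, 14, 10, 19, 15, 17, 11, 16]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

print(f"coefficient of stability = {pearson_r(time1, time2):.2f}")
```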
> Compares two equivalent forms of a test that measure the same attribute.
> Coefficient of equivalence - correlation between the scores obtained using the two forms.
> Parallel forms - both forms have the same mean and SD.
> Pearson R
> Alternate/parallel forms should have
• Same no. of items
• Same format
• Same coverage
• Same difficulty
• Same instructions
• Same time limits
Alternate or Parallel Forms Method
• Item or Content sampling error - fluctuations in performance from one set of items to another.
Alternate-Forms Immediate
• Item or Content sampling error
• Time sampling error
Alternate-Forms Delayed
• Consistency within the test.
• It is the intercorrelations among the items.
• If all items on a test measure the same construct, then the test has good internal consistency.
• Factor Analysis
Internal Consistency
- measures one construct
• Homogeneous (unidimensional)
- multiple constructs
• Heterogeneous (multidimensional)
• In split-half reliability, a test is given and divided into halves that are scored separately.
• The results of one half of the test are then compared with the results of the other.
• The two halves of the test can be created in a variety of ways: odd-even, random.
• Spearman-Brown Correction Formula.
Internal Consistency: Split-half
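A minimal split-half sketch with invented 0/1 responses (rows = people, columns = items): score the odd and even items separately, correlate the halves, then apply the Spearman-Brown correction r_sb = 2r / (1 + r) to estimate the reliability of the full-length test.

```python
# Invented dichotomous responses: rows = examinees, columns = items.
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
]

odd = [sum(row[0::2]) for row in responses]    # items 1, 3, 5, 7
even = [sum(row[1::2]) for row in responses]   # items 2, 4, 6, 8

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

r_half = pearson_r(odd, even)          # reliability of half the test
r_full = 2 * r_half / (1 + r_half)     # Spearman-Brown corrected estimate
print(f"half: {r_half:.3f}, corrected: {r_full:.3f}")
```

Note that the correction is needed because each half is only half as long as the real test, and shorter tests tend to be less reliable.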
• Used in tests with no right or wrong answers.
• Average of all split-halves.
• Disadvantage: affected by the number of items.
Cronbach's Alpha
• The formula for calculating the reliability of a test in which the items are dichotomous, scored 0 or 1 (usually for right or wrong).
KR-20
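Both coefficients can be sketched on the same invented dichotomous data: alpha uses the sum of the item variances, KR-20 uses the sum of p·q (proportion correct times proportion incorrect), and for 0/1 items the two formulas coincide.

```python
# Invented dichotomous responses: rows = examinees, columns = items.
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
]
n = len(responses)       # number of examinees
k = len(responses[0])    # number of items

def pvar(xs):  # population variance
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

items = list(zip(*responses))
totals = [sum(row) for row in responses]

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = k / (k - 1) * (1 - sum(pvar(it) for it in items) / pvar(totals))

# KR-20: k/(k-1) * (1 - sum of p*q / total variance)
p = [sum(it) / n for it in items]   # proportion correct per item
kr20 = k / (k - 1) * (1 - sum(pi * (1 - pi) for pi in p) / pvar(totals))

print(f"alpha = {alpha:.3f}, KR-20 = {kr20:.3f}")
```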
• A family of reliability coefficients (λ1-λ6) estimating internal consistency in different ways.
• λ3 is equivalent to Cronbach's alpha.
• More flexible; can adjust for unequal item variances and covariances. Some lambdas can give higher estimates.
Guttman's Lambda
• A measure used to evaluate the internal consistency of a test that focuses on the degree of difference that exists between item scores.
Internal Consistency: Average Proportional Distance (APD)
• The degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.
• Cohen's Kappa - used to know the agreement among ONLY 2 raters
• Fleiss' Kappa - used to know the agreement among 3 or more raters.
Interrater Reliability
- used to know the agreement among ONLY 2 raters
Cohen's Kappa
- used to know the agreement among 3 or more raters.
Fleiss' Kappa
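A minimal Cohen's kappa sketch for two raters, with invented ratings: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the raters' marginal proportions.

```python
from collections import Counter

# Invented categorical ratings from two raters over the same 10 cases.
rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

n = len(rater1)
p_o = sum(a == b for a, b in zip(rater1, rater2)) / n   # observed agreement

# Chance agreement from each rater's marginal category proportions.
c1, c2 = Counter(rater1), Counter(rater2)
p_e = sum((c1[c] / n) * (c2[c] / n) for c in c1 | c2)

kappa = (p_o - p_e) / (1 - p_e)
print(f"kappa = {kappa:.3f}")
```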
• This model considers the problems created by using a limited number of items to represent a larger and more complicated construct.
• Ex. Getting words from the dictionary to come up with a spelling test.
• As the sample gets larger, it represents the domain more and more accurately.
Domain Sampling Theory
Statistical Treatments
- Test-retest → Pearson r
- Alternate-Forms Delayed → Pearson r
- Alternate-Forms Immediate → Pearson r
- Internal Consistency (Cronbach's Alpha & KR-20) → Cronbach's alpha or KR-20
- Internal Consistency (Split-Half) → Pearson r & Spearman-Brown
- Inter-rater → Kappa statistic
- The extent to which a test measures what it is supposed to measure.
It is the agreement between a test score or measure and the characteristic it is believed to measure.
Validity
(A valid test is reliable.
Not all reliable tests are valid.)
is "the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests"
validity
is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment (Messick, 1989).
Validity
The process of gathering and evaluating evidence about validity.
Validation
- applied when tests are altered in some way, such as format, language, or content
Local validation studies
Types of Validity Evidence
• Face
• Content
• Criterion - concurrent, predictive
• Construct - convergent, divergent
• Face validity is the simplest and least scientific form of validity; it is demonstrated when, at face value, a measure appears to measure what it is supposed to measure.
• Item seems to be reasonably related to the perceived purpose of the test.
• Often used to motivate test takers because they can see that the test is relevant.
Face Validity
evidence tells us just how well a test corresponds with a particular criterion (an external measure; another way of measuring the same construct).
Criterion validity
• - forecasting function of a test. Test scores may be obtained at one time and the criterion measure may be obtained in the future after an intervening event.
Predictive Validity
- If test scores are obtained at about the same time that the criterion measures are obtained.
Concurrent validity evidence
• The construct validity of a test depends on the extent to which it truly reflects the construct that it purports to measure.
• Known as the umbrella validity because the other types of validity evidence fall under it.
A test has a good construct validity if there is an existing psychological theory which can support what the test items are measuring.
Construct Validity
- The test is correlated to another test that measures the same/similar construct(s).
• Ex. A Depression test and a Negative Affect Scale are related; hence, a significant positive relationship is expected.
Convergent validity evidence
- A validity coefficient sharing little or no relationship between two tests measuring unrelated constructs.
• Ex. A Mathematics Achievement test and a Marital Satisfaction test - the relationship should be low or nonsignificant.
Divergent/Discriminant Validity Evidence
The Process of Test Development:
Test Conceptualization
Test Construction
Test Tryout
Item Analysis
Test Revision
It is a measure of the proportion of examinees who answered the item correctly.
Item Difficulty (p) / Item Location
Item difficulty index should range from
0.30 - 0.70
Item Difficulty (p)
Higher than .70 =
too easy
Item Difficulty (p)
Lower than .30 =
too difficult
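Item difficulty is just the proportion of examinees answering the item correctly, flagged against the 0.30-0.70 guideline above. The responses here are invented (rows = examinees, columns = items, 1 = correct).

```python
# Invented response matrix: 5 examinees, 4 items.
responses = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]

for j, item in enumerate(zip(*responses), start=1):
    p = sum(item) / len(item)          # proportion answering correctly
    if p > 0.70:
        verdict = "too easy"
    elif p < 0.30:
        verdict = "too difficult"
    else:
        verdict = "acceptable"
    print(f"Item {j}: p = {p:.2f} ({verdict})")
```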
- a test, usually of achievement or ability, with a time limit; speed tests usually contain items of uniform difficulty level.
Speed Test
- A test, usually of achievement or ability, with (1) either no time limit or such a long time limit that all testtakers can attempt all items and (2) some items so difficult that no testtaker can obtain a perfect score.
Power Test
Measures of item discrimination indicate how adequately an item separates or discriminates between high scorers and low scorers on an entire test.
This estimate of item discrimination, in essence, compares performance on a particular item with performance in the upper and lower regions of a distribution of continuous test scores.
> Upper group (a.k.a. Masters [top 33%]) - possess the target ability.
> Lower group (a.k.a. Non-Masters [bottom 33%]) - do not possess the target ability.
Item Discrimination (d)
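A rough sketch of the discrimination index d = p_upper - p_lower, comparing the proportion answering the item correctly in the top third of scorers ("masters") versus the bottom third ("non-masters"). The (total score, item response) pairs are invented.

```python
# Invented (total test score, item correct? 1/0) pairs for 12 examinees.
data = [
    (38, 1), (35, 1), (33, 0), (30, 1), (27, 0), (25, 1),
    (22, 0), (20, 1), (18, 0), (15, 1), (12, 0), (10, 0),
]
data.sort(key=lambda pair: pair[0], reverse=True)   # highest total first

cut = len(data) // 3          # ~33% in each tail
upper = data[:cut]            # "masters"
lower = data[-cut:]           # "non-masters"

p_upper = sum(item for _, item in upper) / cut
p_lower = sum(item for _, item in lower) / cut
d = p_upper - p_lower         # positive d: item favors high scorers
print(f"d = {d:.2f}")
```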
: Correlation between a dichotomous item (e.g., correct/incorrect) and the total test score.
Point-Biserial Correlation
: Correlation between a polytomous item (e.g., Likert scale) and the total test score.
Point-Polyserial Correlation
: Correlation between an item and the total test score, excluding the item itself to avoid bias.
Deleted Point-Biserial/Polyserial Correlation
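The corrected (deleted) item-total correlation can be sketched as follows: each dichotomous item is correlated with the total score minus that item, so an item cannot inflate its own correlation. The response matrix is invented.

```python
# Invented dichotomous responses: rows = examinees, columns = items.
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rs = []
for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    rest = [sum(row) - row[j] for row in responses]  # total excluding item j
    rs.append(pearson_r(item, rest))

print([round(r, 2) for r in rs])
```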
- the most intelligent persons were those equipped with the best sensory abilities
Francis Galton
- Intelligence is based on multiple components, i.e., reasoning, judgment, memory, and abstraction. These components cannot be separated.
Alfred Binet
defined intelligence in terms of the ability "to resolve genuine problems or difficulties as they are encountered".
Howard Gardner
"Intelligence, operationally defined, is the aggregate or global capacity of the individual to act purposefully, to think rationally and to deal effectively with his environment."
David Wechsler
defined intelligence in terms of "mental activities involved in purposive adaptation to, shaping of, and selection of real-world environments relevant to one's life".
Robert Sternberg
developed the first theory of intelligence and proposed that it is best thought of as a single, general capacity, or ability.
• g factor - general intelligence.
• s factor - ability to excel in certain areas or specific intelligence.
Charles Spearman
• Multiple-factor theory of intelligence
• Different aspects of intelligence are distinct enough to be treated as multiple separate abilities.
• Intelligence consists of 7 primary mental abilities:
> word fluency, verbal comprehension, number facility, spatial visualization, associative memory, perceptual speed, and reasoning.
Louis Thurstone
Fluid and Crystallized Intelligence
Raymond Cattell
• - used when dealing with new problems; not influenced by past learning and culture.
Fluid intelligence (Gf)
• - using already learned skills, experience, and knowledge to solve problems; involves past learning and is more influenced by culture
Crystallized intelligence (Gc)
• Visual processing (Gv)
• Quantitative processing (Gq)
• Auditory processing (Ga)
• Speed of processing (Gs)
• Facility with reading & writing (Grw)
• Short-term memory (Gsm)
• Long-term storage & retrieval (Glr)
Vulnerable abilities
Maintained abilities
John Leonard Horn
- decline with age and tend not to return to preinjury levels following brain damage.
• E.g. Visual processing
Vulnerable abilities
- they tend not to decline with age and may return to preinjury levels following brain damage.
• E.g. Quantitative processing
Maintained abilities
• General intelligence - very similar to Spearman's concept of "g."
• Broad intelligence - includes abilities such as crystallized and fluid intelligence, memory, learning, and processing speed.
• Narrow intelligence - includes many distinct abilities. Consists of 70 distinct abilities.
Cattell-Horn-Carroll's Three Stratum Theory of Cognitive Abilities (CHC Model)
• - very similar to Spearman's concept of "g."
General intelligence
• - includes abilities such as crystallized and fluid intelligence, memory, learning, and processing speed.
Broad intelligence
• - includes many distinct abilities. Consists of 70 distinct abilities.
Narrow intelligence
Did not support the theories about the existence of the g-factor.
Joy Paul Guilford
A key difference between psychological testing and psychological assessment has to do with
A. the number of hours it takes to proctor a test session.
B. the role of the test user in interpreting the results.
C. whether or not the evaluation includes psychological tests.
D. the reliability and validity of the instruments used.
B
When a student comes to you for career counseling and asks if she could take a test to find out what field she would be good at, what kind of test would you recommend?
A. Personality test
B. Achievement test
C. Aptitude test
D. Intelligence test
C
Objective tests are objective because
A scoring is heavily dependent on the judgment of the scorer
B they are scored in a simple, straightforward manner
C They contain items that have right or wrong answers
D they are based on responses to ambiguous stimuli
B
The assessor asked the client to interpret the meaning of "A journey of a thousand miles begins with a single step". This part of the MSE assesses the client's _______.
A. Judgment
B. Orientation
C. Abstract thinking
D. Thought content
C
a comprehensive set of questions and observations used by psychologists to gauge the mental state of a client.
Mental status examination
Mental Status Examination
: How does the client look? What kind of clothing does the client wear? Is the clothing appropriate for the occasion or the weather? What is the personal hygiene of the client?
Appearance
Mental Status Examination
: How does the client behave during the examination? Does the client show unusual verbal and non-verbal behaviour?
Behaviour
Mental Status Examination
: Is the client aware of who or where he is? Does the client know what time (year, month, date, day and time) it is?
Orientation
Mental Status Examination
: Does the client show any problems in immediate, recent and remote memory?
Memory
Mental Status Examination
: Is the client able to attend and concentrate during the examination? Does the client show problems in hearing, vision, touch or smell?
Sensorium
Mental Status Examination
: Does the client display a range of emotions during the examination? What are these emotions and how appropriate are they?
Affect
Mental Status Examination
: What is the general or prevailing emotion displayed by the client during the examination?
Mood
Mental Status Examination
: What does the client want to focus on during the interview? Does the client only want to talk about these things? Is the client able to clearly explain ideas during the interview?
Thought content and thought process
One of the principles of ethics in psychological assessment is to ensure that one test is complemented with another assessment method to arrive at a more comprehensive profile of the client. This ensures that the limitation in the test is being addressed and compensated by another assessment method. What assumption of psychological testing does this refer to?
A. Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
B. Psychological traits and states can be quantified and measured.
C. Test and other measurement techniques have strengths and weaknesses
D. Test-related Behavior Predicts Non-test Related Behavior
C
A student who scores 145 on an IQ test, graduates with a 1.0 GPA, and receives several academic recognitions provides evidence for which assumption about tests and testing?
A. Psychological traits and states exist
B. Psychological constructs can be measured and quantified
C. Tests and other measurement techniques have strengths and weaknesses
D. Test-related behaviors predict non-test-related behaviors
D (predictive)
Before administering a psychological test, the most important thing a psychologist should ensure is that
A the test has local norms
B the test was developed by a local test developer
C the test has been reviewed in the Mental Measurements Yearbook
D the test is appropriate for the client's profile and background
D
Results for a client on a psychological test
A should not be interpreted by a person
B should be interpreted in isolation
C should always be interpreted by a computer
D should not be interpreted in isolation
D
The first theory of intelligence was developed by
A. Binet and Simon
B. Terman and Wechsler
C. Stanford and Binet
D. Spearman
D
If you intend to assess a person's personality and at the same time measure their intelligence it is BEST to use:
A. NEO-Pi-3
B. 16PF
C. BPI
D. PAI
B
C. BPI (psychopathology)
D. PAI (psychopathology)
In test construction
A. random samples from the general population are always employed
B. representative samples from the population of interest are employed
C. accidental or convenience samples have been found to be as good as any other
D. random samples are employed for the initial analysis but not subsequently
B
The Flynn effect refers to the observation that
A. the raw score mean on intelligence tests has remained constant over the years
B. the standard deviation of scores on intelligence tests has remained constant over time
C. the raw score mean on intelligence tests has been increasing over the years in developed countries
D. the raw score mean on intelligence tests has been increasing over the years in the entire world
C
Majority of the immigrants evaluated using the assessment procedures employed by Henry Goddard failed because
A The test was in English and majority of the assessees were not well- versed with the English language.
B Most of the immigrants did not finish schooling, making it difficult for them to pass the assessment.
C The immigrants did not possess the minimum amount of intellectual ability to be allowed to enter the United States.
D All of these
A