Intelligence: What have we learned from Binet?
Personality: How has it expanded?
Statistics: Factor analysis.
Computers and the Internet: Experience sampling and self-monitoring.
Political: The impact of World Wars.
Funding and Policy: Educational testing.
Expanded coverage of intelligence testing and specific cognitive tests.
New aspects of personality not yet examined.
Increased emphasis on positive psychology.
New ways to measure psychological states and personality traits.
Social phenomena growing up digitally.
Political/War
A construct is a hypothetical entity with theoretical links to other hypothesised variables, proposed to relate to a consistent set of observable behaviours, thoughts, or feelings that is the target of a psychological test.
Theoretical advances, such as new constructs emerging in the literature, might give an idea of future tests and procedures likely to be developed.
The Big Five shaped the development of a number of assessment measures.
Intelligence theories have also shaped intelligence testing.
Emotional intelligence refers to a person’s capacity to monitor and manage emotions, understand the emotions of others, and use these insights to function better interpersonally.
Where to locate this in existing theory? Amalgamation of existing personality traits?
Is it a series of learned skills? Can interventions increase emotional intelligence or is it fixed?
Can you trust self-reports about emotional intelligence or must it actually be measured (objective)?
Integrity involves dependability, theft proneness, and counterproductive work behaviour.
It may require specific personality tests or direct measures to assess a job applicant's honesty, trustworthiness, or integrity.
These assessments are strongly prone to social desirability bias and superlative response styles (i.e., claiming extreme virtue).
Unobtrusive measures such as typing speed, mouse clicks, and eye tracking can be utilised.
Current research focuses on attitudes and behaviours towards women and minority groups, and an explosion of new research is expected (e.g., incel subgroups, non-consensual image sharing, deepfaking).
Increasing access to computers and the internet over time has facilitated:
Computer-assisted psychological assessment (CAPA)
Use of smartphones for behavioural assessment
Smart testing techniques
Computerised and multidimensional adaptive testing
Time-parameterised testing
Latent factor-centred designs
Internet testing with non-obtrusive measurements
Potential for virtual reality and artificial intelligence in assessment
1950s: Computers first available for testing and assessment, with CAT conceived and new developments in test theory like item response theory emerging, though costs and skill levels limited mainstream use.
1980s: Proliferation of affordable home computers allowed access to computing power for test developers.
1990s: Growth of the internet presented opportunities for internet testing and rapid proliferation of tests.
2000s: Introduction of smartphones, enabling user access equivalent to desktop computing in a portable format, along with online surveys.
2010s: Widespread adoption of tablets facilitated cheap, accessible information and easier participant recruitment, reducing research costs.
Does computer presentation fundamentally change the construct being measured?
Generally, the answer is no, with cross-mode correlations of .97 (e.g., Mead & Drasgow, 1993 meta-analysis).
Not much difference observed between ticking a box on a questionnaire with a pencil or a mouse, as psychological decision-making processes remain the same.
However, reliability tends to be poorer, especially with attentional distractions from modern lifestyle factors (e.g., Netflix, multitasking).
Less rapport may affect motivation, especially during long surveys.
An exception is speeded tests, which consist of simple, quickly performed tasks: results vary with response modality (i.e., paper and pencil vs computer), with a lower cross-mode correlation (e.g., .72).
Notably, gender differences in fine-motor skills may also impact results.
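As a rough illustration of what a cross-mode correlation is, the sketch below correlates scores from the same people tested in both modes. The scores are made-up values and `pearson_r` is a hypothetical helper name. It computes a raw Pearson correlation only, with no correction for unreliability, so it is not a reproduction of any published meta-analytic method.

```python
from math import sqrt

def pearson_r(x, y):
    """Raw Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six people in each mode (made-up data)
paper_scores = [12, 15, 9, 20, 17, 11]
computer_scores = [13, 14, 10, 19, 18, 11]

r = pearson_r(paper_scores, computer_scores)
print(f"cross-mode correlation: {r:.2f}")
```

A value close to 1 suggests the two modes rank people in nearly the same order. A clearly lower value, as reported for speeded tests, suggests the mode itself is contributing to the scores.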
Kyllonen (1997) speculated about the future of testing with the development of a “smart test” focusing on ability testing, incorporating significant technologies associated with abilities measurement, including:
Computer delivery
Multidimensional adaptive technology
Time-parameterised testing
Latent factor-centred designs
MAT extends Computerised Adaptive Testing (CAT), applying the same adaptive testing principles to a battery of tests rather than a single test.
Recognises correlations between constructs measured, with cross-battery assessments capitalising on correlations across different types of tests.
Performance on each item informs items for every subtest in a battery, adapting simultaneously, significantly reducing overall test time without sacrificing measurement accuracy.
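A minimal sketch of one idea behind MAT follows: an ability estimate from one subtest can inform where to start on a correlated subtest. It assumes a Rasch (1PL) model, standardised abilities that are bivariate normal, and maximum-information item selection. The function names, the correlation of 0.7, and the item difficulties are all hypothetical. A real MAT engine updates the whole ability vector after every item, which this sketch does not attempt.

```python
import math

def rasch_prob(theta, difficulty):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_prob(theta, difficulty)
    return p * (1.0 - p)

def predict_from_correlated_ability(theta_known, rho):
    """Conditional mean and variance of a second standardised ability,
    given the first, assuming bivariate normality with correlation rho."""
    return rho * theta_known, 1.0 - rho ** 2

def pick_next_item(theta, bank):
    """Choose the item that is most informative at the current estimate."""
    return max(bank, key=lambda item: item_information(theta, bank[item]))

# Hypothetical values: a reading estimate, an assumed correlation,
# and a small maths item bank mapping item id -> difficulty
reading_theta = 1.2
rho = 0.7
maths_bank = {"m1": -1.0, "m2": 0.0, "m3": 0.8, "m4": 2.0}

maths_start, maths_var = predict_from_correlated_ability(reading_theta, rho)
first_item = pick_next_item(maths_start, maths_bank)
print(first_item)  # m3: its difficulty is closest to the predicted ability
```

Because the maths subtest starts near the predicted ability rather than at the population mean, fewer items are needed to reach the same precision. This is the sense in which the battery adapts simultaneously.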
The Programme for International Student Assessment (PISA) utilises the MAT technique to measure:
Reading literacy
Mathematics literacy
Science literacy (with financial literacy added recently)
NAPLAN Online aims to incorporate MAT techniques for measuring educational outcomes.
Similar to CAT, requires much effort to develop a sufficiently large item bank, needing hundreds of items with parameter estimation.
Data from large samples of examinees with extensive testing is necessary, with more effort demanded than in CAT.
Users may find it confusing due to potential changes between item types, necessitating recall of instructions across subtests, which may be unrealistic for children.
Examples include Multidimensional Aptitude Battery - II (MAB-II).
A tension exists between speed and accuracy, potentially sacrificing one for the other, which complicates scoring and interpretation.
Computer-administered tests can capture response time, linking to the Implicit Association Test (IAT), an indirect measure of implicit beliefs, prejudices, and biases utilising reaction time.
Some argue for emphasising the constructs being measured rather than focusing solely on the specifics of the tests allowed by traditional methods.
A construct focus may reveal new testing forms, including virtual reality, role play, and games that assess while engaging participants, shifting interest towards the latent factors underlying performance.
The IAT offers indirect measurement advantages, reducing susceptibility to socially desirable response styles.
While showing moderately high reliability, it poses questions about legitimacy for individual assessment, particularly for individuals with impaired motor skills.
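The reaction-time logic of the IAT can be sketched as a standardised difference in response latencies between the two pairing conditions. This is a simplified illustration: the latencies are made-up values in milliseconds, `iat_d_score` is a hypothetical function name, and the published scoring procedures include further steps (such as error penalties and trimming of extreme latencies) that are omitted here.

```python
from statistics import mean, stdev

def iat_d_score(compatible_rts, incompatible_rts):
    """Simplified D-type score: difference in mean latency between the
    two pairing conditions, divided by the SD of all latencies pooled."""
    diff = mean(incompatible_rts) - mean(compatible_rts)
    pooled_sd = stdev(list(compatible_rts) + list(incompatible_rts))
    return diff / pooled_sd

# Hypothetical latencies in milliseconds (made-up data)
compatible = [620, 580, 640, 600, 610]     # pairing the person finds easy
incompatible = [780, 820, 760, 800, 790]   # pairing the person finds hard

d = iat_d_score(compatible, incompatible)
print(f"D = {d:.2f}")
```

A positive score means slower responding in the incompatible condition. Because the score depends entirely on response speed, anything else that slows responses, such as impaired motor skills, can distort it.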
Shirodkar (2019) indicates biases against Indigenous Australians, analysing target concepts (e.g., black-and-white images of faces from in- and out-groups) and attribute concepts (positive/negative terms), using samples in which Caucasian and highly educated individuals are largely overrepresented.
The internet has revolutionised the field, although its impact has been mainly on test distribution rather than test development.
It allows for rapid circulation and updates of questions among psychologists.
Internet tests can be easily modified, facilitating dynamic norming potential.
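One way to picture dynamic norming is a norm table that updates as each new online score arrives, so standard scores always reflect the latest sample. The sketch below assumes norms are summarised by a running mean and standard deviation (updated with Welford's method). The class name `RunningNorms` and the scores are hypothetical, and real norming would also consider sample representativeness.

```python
from math import sqrt

class RunningNorms:
    """Keeps a running mean and SD so norms can update with each new score."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add_score(self, score):
        """Fold one new score into the norms (Welford's online update)."""
        self.n += 1
        delta = score - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (score - self.mean)

    def sd(self):
        """Sample standard deviation of all scores seen so far."""
        if self.n < 2:
            raise ValueError("need at least two scores to estimate the SD")
        return sqrt(self._m2 / (self.n - 1))

    def z_score(self, score):
        """Standard score relative to the current norms."""
        return (score - self.mean) / self.sd()

# Hypothetical raw scores arriving from online test-takers (made-up data)
norms = RunningNorms()
for raw in [10, 12, 14, 16, 18]:
    norms.add_score(raw)

print(f"mean={norms.mean:.1f}, sd={norms.sd():.2f}, z(18)={norms.z_score(18):.2f}")
```

The online update avoids storing every past score, which suits a test that is continuously administered over the internet.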
A digital divide, in which access to the internet is limited by socio-economic status, continues to present major discrimination challenges in testing.
Security concerns surround the highly sensitive information collected.
The integrity of tests may be compromised by rapid dissemination, online security threats, and the prevalence of non-evidence-based assessments in public domains.
Functions of supervision include authenticating test-takers, establishing rapport, ensuring adherence to administration standards, preventing cheating, and securing test integrity.
Open: Unsupervised, published online or in print for personal development (low-stakes testing).
Controlled: Password-protected, suitable for first steps in recruitment.
Supervised: Proctored, ensuring compliance with testing standards.
Managed: Secure conditions with extensive supervision (remote or local).
Future potential for various innovative technologies in assessment includes virtual reality, artificial intelligence, holograms, serious games, eye-tracking, mobile devices, and wearables.
Previously impractical, advancements have made VR more accessible.
Incentives for situational judgement tests, role play, and therapeutic applications (e.g., phobia treatments).
VR's efficacy assumes accessibility across demographics, while the prevalence of cyber sickness poses significant feasibility issues.
More research is necessary to ascertain efficacy in personality assessments and to establish whether online interactions translate to offline behaviour. Furthermore, the uncanny valley effect hampers newer technologies.
Progress made with AI in visual perception tasks, but natural language processing remains problematic, alongside major ethical concerns.
Historical context with AI, notably Eliza, marks the early development of chatbot technology.
Holograms and Augmented Reality: Early-stage feasibility exists, albeit with limitations for those experiencing cyber sickness.
Games designed beyond entertainment serve as assessment tools for promoting personal development and behaviour modification.
Eye-tracking: Offers non-obtrusive ways to assess attention and learning strategies, utilising various formats including a stationary mounted display.
Devices for recording, GPS tracking, and applications facilitate various forms of assessments securely and effectively.
Advancements enable accurate recognition of emotional states through intelligent software aiding in personal adjustment and health monitoring.
The broader social environment influences assessment development, necessitating increased accountability and transparency alongside ethical considerations.
Technology offers both challenges and opportunities, underscoring the importance of ethical conduct and critical reasoning toward future testing methodologies.