What are performance-based assessments?
A type of classroom assessment where students are asked to demonstrate ability or skill by performing in some way or creating a product
What are the two main types of performance-based assessments?
Constructed response items and essays
What are constructed-response items?
These are assessment tasks that ask students to create a complex response, product, or written answer.
What kind of item is constructed-response an example of?
Supply-item
What are constructed response items best used for?
They are ideal for measuring complex student skills and abilities
What are the components of a constructed-response item?
A described task
the stimulus or instructions or prompt that tells the student what they are supposed to do or make
A response (a student’s answer)
How do you design a good constructed response question?
Be clear on what it takes to get all the points
Share your scoring criteria with the test takers
Give good instructions
What forms do essay questions come in?
Open-ended
Closed-ended
What are essay questions?
Items where the test taker writes a multi-sentence response to a question
What is a closed-ended essay question?
An essay question format where the respondent has very little freedom in terms of what content must be in the answer
What is an open-ended essay question?
An item where there are no restrictions on the response, including the amount of time allowed to finish.
What are the guidelines to writing an essay question (to get maximum benefit)?
Allow adequate time to answer the question
Be sure the question is complete and clear
These sorts of questions should be used only to evaluate higher-order outcomes, such as when comparisons, evaluations, analyses, and interpretations are required
What is a scoring rubric?
A written set of scoring rules, often in the form of a table, that identifies the criteria and required parts and pieces for a quality answer or a quality product
How are rubrics beneficial and how do they help teachers?
They help make up for the potential weakness of performance-based assessment- the subjective nature of the task means judgment is necessary when assigning scores.
They also help improve the reliability of performance-based assessments
They allow for quick scoring and quick feedback
They improve teaching
They encourage the growth of student meta-cognitive and critical thinking skills
They allow for meaningful sharing of student growth
What is a portfolio?
A collection of work that shows efforts, progress, and accomplishment
What are the characteristics of portfolios that make them “evidence based”?
A good portfolio is both formative and summative in nature- The evaluation is formative in that it is continuous (efforts and accomplishments are recognized as the portfolio is being created, say, every 3 weeks or every four elements of the portfolio) and summative in that there is a final evaluation.
A portfolio is a product that reflects the multidimensional nature of both the task and the content area- The portfolio invites the student to be expressive and think both differently and big (in size and ideas).
Portfolios allow students to participate directly in their own growth and learning- While being closely monitored (by a teacher or supervisor), the student can participate (with feedback) in the process of creating each element and gets to think and consider the direction in which their work is going and make adjustments as they go.
Portfolios allow teachers to become increasingly involved in the process of designing and implementing curriculum
What are open-ended supply items, especially ones as complex as performance-based items, known for?
High Validity
What are surveys?
An organized set of questions used in research to gather a lot of information from a sample of people
What are scales?
Sets of questions all meant to measure the same construct or concept. By combining responses across all the questions, they allow for a single score to be created that represents a variable
How do survey questionnaires collect data?
They may use a set of unrelated questions to gather demographics and biographical information about people in order to describe them.
They may be interested in measuring some abstract concept or construct like attitudes and feelings and personality traits
What are the steps in scale development?
Determine clearly what it is you want to measure- define the construct. Use theory as an aid to clarity for research variables that are abstract constructs. Use specificity as an aid to clarity to help in writing items
Generate an item pool- Write lots of questions that seem to get at our construct
Determine the format for measurement- what will the items be formatted as? Will the questions be questions or will they be statements that people agree or disagree with?
Have the item pool reviewed- If there are experts on the topic, have them evaluate the questions.
Consider inclusion of validation items-
Administer items to pilot sample- Give your survey to a sample of people similar to those that you are actually researching
Evaluate items- This is your chance to make changes to improve your scales and the entire survey instrument. Can you increase the internal consistency reliability by removing a bad item or two? Or do you need to add some items to better measure your construct? Did the survey take so long that many respondents quit before the end? Did everything work technically with your online data collection?
Produce your final scale- refine the scale, tweak some things, (maybe do an overhaul), but finally use the instrument for a research study
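The "Evaluate items" step asks whether removing a bad item would raise internal consistency reliability. The sketch below shows one way to check that. It assumes Cronbach's alpha as the reliability statistic (the cards do not name one), and the pilot data and function names are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])  # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in rows])
        for i in range(k)
    ]

# Hypothetical pilot data: 5 respondents x 4 items.
# Item 4 (last column) does not track the other three items.
pilot = [
    [1, 1, 2, 4],
    [2, 2, 2, 1],
    [3, 3, 4, 5],
    [4, 4, 4, 2],
    [5, 5, 5, 3],
]

overall = cronbach_alpha(pilot)
without_each = alpha_if_deleted(pilot)
print(round(overall, 3), [round(a, 3) for a in without_each])
```

If alpha rises noticeably when one item is left out, that item is a candidate for removal or rewording before the final scale is produced.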
How do you get valid and reliable responses in your research?
non-threatening questions
Threatening questions
How do you get valid and reliable responses from nonthreatening questions?
Make the topic salient- People retrieve information more easily about something that is important to them
Give reminders of the context of the question- More details are added into the question to help the participants recall information better.
Ask about recent behavior (if you want to know about typical behavior)
How do you get valid and reliable responses from threatening questions?
Use long questions- Respondents can feel that their behavior is acceptable if the researcher is treating the question like it is ok
Use “loaded” questions- These are questions that are worded in a way that influences a certain response. These are questions that get people to admit they engage in certain behavior
What are some strategies for writing leading questions?
Claim that everybody does it
Use nonspecific authorities to justify the behavior (e.g., citing unnamed experts)
Provide good reason for behavior
What is random responding?
A technique where respondents are randomly assigned one question from two possibilities, one of which is a threatening question. Because only the respondent knows which question they are answering, it allows for an additional layer of privacy
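Because the researcher never learns which question an individual answered, the rate of the threatening behavior has to be estimated for the group as a whole. The sketch below shows the arithmetic under two assumptions that the card does not state: the probability of being assigned the threatening question is known, and the "yes" rate for the other (harmless) question is known. All names are hypothetical.

```python
def estimate_sensitive_rate(yes_rate, p_sensitive, innocuous_yes_rate):
    """Estimate the share of people who would answer "yes" to the threatening question.

    yes_rate: observed proportion of "yes" answers across all respondents
    p_sensitive: assumed known chance of being assigned the threatening question
    innocuous_yes_rate: assumed known "yes" rate for the harmless question
    """
    if not 0 < p_sensitive <= 1:
        raise ValueError("p_sensitive must be in (0, 1]")
    # Observed yes rate = p * sensitive_rate + (1 - p) * innocuous_rate,
    # so solve that equation for sensitive_rate.
    return (yes_rate - (1 - p_sensitive) * innocuous_yes_rate) / p_sensitive

# Hypothetical numbers: 35% said yes overall, a coin flip assigned the question,
# and half of all people would say yes to the harmless question.
estimate = estimate_sensitive_rate(0.35, 0.5, 0.5)
print(round(estimate, 2))
```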
Guidelines for writing good attitude questions
Word questions as simply as possible- You want to get to the point quickly and precisely with attitude items.
Keep the question balanced- You don’t want to lead people one way or another.
Avoid double-barreled questions- Poorly written questions sometimes ask about two different things at once
Avoid the use of negatives in questions- A little extra “not” in a statement is easily missed and could lead to invalid responses
What is a Likert format/item?
A popular format for items on attitude surveys where answer options are symmetrical (an equal number of positively and negatively perceived answer options) and balanced (an effort is made to make the “distance” between each answer option about equal)
Who is Rensis Likert?
A psychologist who suggested the Likert format and the idea that one could create a group of such items (the idea of a scale) and combine responses to get a valid and psychometrically sound measure of feelings and attitudes
Who is Louis Thurstone?
A social psychologist who was the first to develop whole theories about what an attitude is and how to measure it.
How to create a Thurstone scale:
Choose an attitudinal object that you can write opinion statements about.
Write dozens of attitudinal statements about the object- try to create a range of attitude levels
Create a panel of “judges”- Smart people who will tell you how strongly worded each of your statements is and whether they are positive or negative
Average the ratings across judges for each statement- these averages become the weights for the statements; they are the point value for each statement
Now you are ready to give your scale to real people
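The averaging step above can be sketched in a few lines. The statements and judge ratings are hypothetical. Scoring a respondent as the mean weight of the statements they agree with is an assumption here; the cards only say that the averages become each statement's point value.

```python
from statistics import mean

# Hypothetical judge ratings of how favorable each statement is (1 = very negative, 11 = very positive).
judge_ratings = {
    "Tests are a waste of time": [2, 3, 1],
    "Tests are sometimes useful": [6, 5, 7],
    "Tests are essential to learning": [10, 11, 9],
}

# The average rating across judges becomes the weight (point value) of each statement.
weights = {statement: mean(ratings) for statement, ratings in judge_ratings.items()}

def thurstone_score(endorsed):
    """Score a respondent as the mean weight of the statements they endorsed (assumed rule)."""
    return mean([weights[statement] for statement in endorsed])

score = thurstone_score(["Tests are sometimes useful", "Tests are essential to learning"])
print(weights, score)
```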
Social Exchange Theory
A theory that states that people will agree on a trade if rewards are high, the costs are low, and they trust each other
How do you include perceived rewards in your request for participation in a survey?
Provide information about the survey
show positive regard
Thank the respondent
Ask for advice
Support group values
Provide social validation
Provide incentives- could be money, gift card, etc
Make the questionnaire interesting
Inform respondents that the time or opportunity to respond is limited
How do you keep the perceived costs low in a survey?
Avoid demanding language
Avoid using language that respondents will not understand
Include a direct link to the online survey in the email
Make the questionnaire appear short and easy to complete
Do not ask for personal information that is not critical to the design of the study
Emphasize similar requests to which participants have already agreed
How do you get participants to trust you to complete a survey?
Providing a token of appreciation
Securing sponsorship
Putting forth enough effort in constructing the survey to make the task appear important
Assuring and ensuring confidentiality and security
How do you calculate response rate?
Response rate = Number of people who filled out your survey/number of people who you asked to fill out your survey
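As a quick worked example of the formula, here is a minimal sketch. The function name and the numbers are made up for illustration.

```python
def response_rate(completed, invited):
    """Number of people who filled out the survey divided by the number asked."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return completed / invited

# Hypothetical survey: 600 people were asked and 150 filled it out.
rate = response_rate(150, 600)
print(f"{rate:.0%}")
```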
How do we look at fairness?
Test bias- Are certain types of tests biased against identifiable groups of people, such as people of different genders or races or people with disabilities?
Equity and universal design- Can we design tests that are equally valid for everyone? Equity, the idea that we should provide access to all, is a fairness concern in testing, as it is, more broadly, in society. How does this goal play out in our measurement world?
Law and ethics- What laws exist to support fairness in testing? What ethical principles have been adopted by the various professional organizations to which measurement folks belong in order to advocate for fair and just testing?
What is test bias?
When test scores vary across different groups because of factors that are unrelated to the purpose of the test
What must be true for a test to be biased?
The test finds average differences in scores between relevant groups of test takers, and
There are not, in fact, average differences between the groups in the level of the trait
What are the views regarding fairness based on the Standards for Educational and Psychological Testing?
To test fairly and without bias, you have to provide opportunities for testing in a secure and controlled environment
The test must have no inherent bias already built into it
Whatever construct is being tested, the test taker must have full access to the test in the first place, with no obstacles related to demographics such as gender and ethnicity
The test scores must be fairly interpreted
What is test fairness?
The degree to which a test fairly assesses an outcome independent of traits and characteristics of the test taker unrelated to the focus of the test
What is consequential validity?
How tests are used and how their results are interpreted
What does FairTest- the National Center for Fair and Open Testing- do?
It is an organization that works to end the misuses and flaws of standardized testing and to ensure that evaluation of students, teachers, and schools is fair, open, valid, and educationally beneficial
What are the basic principles that FairTest espouses?
Assessments should be fair and valid- they should provide equal opportunity to measure what students know and can do, without bias against individuals on the basis of race, ethnicity, gender, income level, learning style
Assessments should be open- The public should have greater access to test and testing data, including evidence of validity and reliability. They should be open to parents, educators, and students
Tests should be used appropriately- Safeguards should be established to ensure that standardized test scores are not the sole criterion by which major educational decisions are made and that curricula are not driven by standardized testing
Evaluation of students and schools should consist of multiple types of assessment conducted over time- No one measure can or should define a person’s knowledge, worth, or academic achievement, nor can it provide for an adequate evaluation of an institution
Alternative assessments should be used- Methods of evaluation that fairly and accurately diagnose the strengths and weaknesses of students and programs need to be designed and implemented with sufficient professional development for educators to use them well
Difference-Difference Bias
This model says that if test scores differ between two groups defined by some factor obviously unrelated to the test, such as gender, racial or ethnic group membership, or disability status, then the test is biased
What is differential item functioning (DIF)?
A type of analysis that can see if two groups of people differ on their item difficulty even after making sure those groups are equal on overall ability.
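The idea of comparing groups only after matching them on overall ability can be sketched as follows. This is a simplified illustration, not a formal DIF statistic: it assumes total test score is the matching variable and simply reports the gap in proportion correct on one item within each score level. The group labels and records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def dif_gaps(records):
    """records: (group, total_score, item_correct) tuples for one item.

    Returns {total_score: proportion correct in "ref" minus proportion correct in "focal"}
    for every score level where both groups are present.
    """
    cells = defaultdict(lambda: {"ref": [], "focal": []})
    for group, total, correct in records:
        cells[total][group].append(correct)
    gaps = {}
    for total, by_group in sorted(cells.items()):
        if by_group["ref"] and by_group["focal"]:
            gaps[total] = mean(by_group["ref"]) - mean(by_group["focal"])
    return gaps

# Hypothetical data: test takers matched on total score still differ on this item.
records = [
    ("ref", 3, 1), ("ref", 3, 1), ("focal", 3, 0), ("focal", 3, 1),
    ("ref", 5, 1), ("ref", 5, 1), ("focal", 5, 1), ("focal", 5, 0),
]
gaps = dif_gaps(records)
print(gaps)
```

Gaps that stay large and in the same direction across score levels suggest the item is harder for one group even at equal overall ability, which is the pattern a DIF analysis looks for.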
What do companies do in terms of analysis?
They do it in the terms of an item-by-item basis where group performance is examined for each item and discrepancies can be studied further.
They use the Item Response Theory information to make bias visible at the item level
What is the Cleary Model?
A regression model (a more advanced form of correlational model) that checks for test bias by asking whether the test predicts an outcome differently for different groups of people
It uses regression as the statistical measurement
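A regression-based check of this kind is usually read as follows: fit a line predicting some criterion (for example, later grades) from test scores separately for each group, then compare the lines. The sketch below assumes that reading. The data, group names, and helper function are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y predicted from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical test scores (x) and criterion outcomes (y) for two groups.
group_a = ([1, 2, 3, 4], [2, 4, 6, 8])
group_b = ([1, 2, 3, 4], [3, 5, 7, 9])

slope_a, intercept_a = fit_line(*group_a)
slope_b, intercept_b = fit_line(*group_b)
print(slope_a, intercept_a, slope_b, intercept_b)
```

Here the slopes match but the intercepts differ, so a single shared line would consistently under-predict the outcome for group B. Under this reading, that kind of systematic gap is evidence of bias.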
What are the steps for making sure that tests created by you and test created by others are not biased?
Try to be as clear as possible in recognizing your own biases or stereotypes
If in doubt, show the test to someone who is part of the group you feel might be slighted
If at any time you find your own test development efforts leaning toward bias, stop and start over, and ask for help from someone more experienced than you
Test takers are different from one another, not only in their abilities or personalities but in the ways they learn and what forms of assessment are most accurate for them
What is universal design?
A test design that does a good job of assessing the differences between people and works well for every person taking the test
A test using this design is more valid for everyone
How do you design a test in terms of the mechanics and formatting?
Text formatting- Text should be flush to the left margin because that is easier to read for most Westerners
Typefaces- Certain typefaces (what we call fonts) work best for those who are visually impaired
White Space- A lot of blank space around questions, images, and other page elements is believed to lower test anxiety and also make information clearer
Contrast- Paper tests should be printed on off-white or light-pastel paper with a nonglossy finish to prevent glare, and the type should be black.
Illustrations- illustrations can be a problem if they cause competition between picture and text. Avoid red/green combinations.
When do the content and wording of a test make it accessible?
Test takers share the same experiences and prior knowledge necessary to figure out what is being asked and what answer options are available
The complexity of the sentences and the vocabulary used should be appropriate for the test taker’s developmental level
Break complex sentences into shorter sentences
What is adaptive testing?
A technique that uses computers and selects which items will be administered based on the estimated difficulty of items and the estimated ability of the test taker.
Overall design adapts the questions based on the skill level of the test taker
This doesn’t match the goal of universal design as the test changes based on the capability of the test taker
Also called computer adaptive testing (CAT)
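The selection loop in adaptive testing can be sketched as below. This is a deliberately crude stand-in: real CAT systems estimate ability with Item Response Theory, while this sketch assumes a fixed step up after a correct answer and down after an incorrect one. The item difficulties and the simulated test taker are hypothetical.

```python
def next_item(difficulties, used, ability):
    """Pick the unused item whose difficulty is closest to the current ability estimate."""
    candidates = [i for i in range(len(difficulties)) if i not in used]
    return min(candidates, key=lambda i: abs(difficulties[i] - ability))

def run_adaptive_test(difficulties, answers_correctly, n_items, start=0.0, step=0.5):
    """Administer n_items adaptively; returns the final ability estimate and the items used."""
    ability, used = start, []
    for _ in range(n_items):
        item = next_item(difficulties, used, ability)
        used.append(item)
        # Assumed update rule: move the estimate up after a correct answer, down otherwise.
        ability += step if answers_correctly(difficulties[item]) else -step
    return ability, used

# Hypothetical item bank and a simulated test taker who can answer items up to difficulty 1.0.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
final_ability, administered = run_adaptive_test(bank, lambda d: d <= 1.0, n_items=3)
print(final_ability, administered)
```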
What is the No Child Left Behind act (NCLB)?
An act signed into law by George W. Bush in 2002 (based on the Elementary and Secondary Education Act of 1965)
It is a federal law that focuses on academic achievement in the elementary grades
What is the purpose of the NCLB?
“to close the achievement gap with accountability, flexibility, and choice, so that no child is left behind.”
How was the mission of the NCLB act meant to be conducted?
A huge amount of testing had to occur on a regular basis but the problem with it was that it was very expensive to implement
Why was the ‘Every Student Succeeds’ Act (ESSA) created?
It was created as a revision to the NCLB Act and was signed in 2015 by Barack Obama
It fixed the problem
What makes the Every Student Succeeds Act different from the No Child Left Behind Act?
It provides more support for states for testing and other requirements and also relaxes the requirements for testing each student
Each state now has more choice in terms of how they help students, schools, and districts that score low on state tests
ESSA offers support to states in developing and using high-quality assessments and has a goal of helping teachers in using assessments to foster deep learning among students
What are a few of the objections to the approach taken by ESSA and NCLB?
The policy requires that all students must be tested, meaning that children even of low proficiency in English and with significant disabilities are expected to perform at grade level
Students are not only tested but are also tested using standardized tests, which doesn’t help students learn
Only public schools (and other schools that receive federal funding) are expected to meet these standards
The bills have never been funded at the levels proposed by the federal government, leading to shortages in teachers and training for those who are not highly qualified, as dictated by law.
There are no rewards for doing well, only sanctions for not
What constitutes a “highly qualified” teacher is open to discussion (subjective as there would need to be a set qualification of what a 'qualified teacher’ is)
What was the Education for All Handicapped Children Act?
Signed by President Gerald Ford in 1975
Known as Public Law 94-142 (PL 94-142)
It guarantees all children, regardless of disability, the right to a free and appropriate public education; it is the predecessor of the law now known as the Individuals with Disabilities Education Act, or IDEA
What does least restrictive environment mean (LRE)?
An environment that places the fewest restrictions on a child with disabilities
What were the purposes of the PL 94-142 (Education for all Handicapped Children Act)?
“Assure that all children with disabilities have available to them… a free appropriate public education which emphasizes special education and related services designed to meet their unique needs”
“Assure that the rights of children with disabilities and their parents are protected”
“Assist States and localities to provide for the education of all children with disabilities”
“Assess and assure the effectiveness of efforts to educate all children with disabilities”
However, the law was renamed the Individuals With Disabilities Education Act (IDEA) in 1990 and was substantially amended in 1997.
What is the Individuals With Disabilities Education Act?
A federal law that guarantees a free and appropriate education for all children regardless of level of disability
Created under that name in 1990 and reauthorized with major amendments in 1997
Why was the Individuals With Disabilities Education Act (IDEA) created?
More than 10 million children in the United States today have a variety of special needs, from mild to severe physical, cognitive, emotional, and intellectual disabilities that affect their ability to learn, and traditional special education programs were not meeting their needs.
Now almost 200,000 infants and toddlers (and their families) and 7 million children/youth receive special education and related services (about 14% of all students)
What are the principles of IDEA?
All children are entitled to a free and appropriate public education- This means that special education and related services will be provided at public expense without any charges to the parents, and these services will meet standards set by the state and suit the individual needs of the child
Evaluations and assessments will take place only to the extent that they help place the child in the correct program and measure their progress- The people conducting the evaluations must be knowledgeable and trained, procedures must be consistent with the child’s level of skill, and tests must be as nondiscriminatory and unbiased as possible
An Individualized Education Program (IEP) will be developed and adhered to for each child- These written plans will be revised regularly and include input from parents, students, teachers, and other important interested parties
Children with disabilities will be educated in the least restrictive environment- “They will take classes with their nondisabled peers, and only those children who cannot be educated in regular education in a satisfactory fashion should be removed”
Both students and parents play an important role in the decision-making
There should be a variety of mechanisms to ensure that these previous five principles are adhered to and, if not, that any disagreements between students, teachers, parents, and school can be resolved constructively
Some of these include parental consent, mediation, parental notification, and parental access to records
What is an Individualized Education Program (IEP)?
A component of Public Law 94-142; a written plan of educational goals and strategies for children who are referred to special programs
What is the Truth in testing law?
Passed in 1979 with the support of New York State Senator Kenneth LaValle
A law first passed in New York State that guarantees access to tests and their results. It required that admission tests be available for review of their content and scoring procedures, that the test items be released to the public, and that there be due process for any student accused of cheating
What is the Family Educational Rights and Privacy Act (FERPA)?
Also known as the Buckley Amendment; it was passed in 1974
It is a federal law that protects the privacy of students and their test results
The law applies to any public or private elementary, secondary, or post secondary school and any state or local education agency that receives federal funds
Important milestones in shaping ethical practices
1620: Francis Bacon publishes Novum Organum, in which he argues that scientific research should benefit humanity
1830: Charles Babbage publishes Reflections on the Decline of Science in England, And Some of Its Causes, in which he argues that many of his colleagues were engaging in dishonest research practices, including fabricating, cooking, trimming, and fudging data.
1874: Roberts Bartholow inserts electrodes into a hole in the skull of Mary Rafferty caused by a tumor. (Rafferty fell into a coma and died a few days after the experiment)
1912: Charles Dawson discovers a skull at the Piltdown gravel bed in Sussex, U.K., that people thought was the “missing link” between apes and humans (The skull was later shown to be a fake; it had been chemically treated to make it look old)
1939-1945: German scientists conducted morally abominable research on concentration camp prisoners, including experiments that exposed subjects to freezing temperatures, low air pressures, ionizing radiation and electricity, and infectious diseases; as well as wound-healing and surgical studies
1944-1980s: The U.S. Department of Energy sponsors secret research on the effects of radiation on human beings. Subjects were not told that they were participating in the experiments
1945: Vannevar Bush writes the report Science: The Endless Frontier that argues for a major increase in government spending on science and defends the ideal of a self-governing scientific community free from significant public oversight.
What are the key ethical principles of testing and measurements?
Nothing should be done that harms the participants physically, emotionally, or psychologically. - All testing formats and content should be carefully screened to be sure that such threats are eliminated or, if that’s not possible, minimized.
When behavior is assessed, especially in a research setting, the test takers should provide their consent. - If there are children or adults who are incapable of providing consent, then someone who cares for them should provide the consent.
If incentives are offered (such as paying people to complete a survey), the incentives should be reasonable and appropriate.
Unless there is a necessity otherwise (like when testing students or clients), test takers’ responses should be anonymous.
Not only should the research materials be anonymous, but as the researcher in your testing and assessment activity, you must ensure that the records will be kept in strictest confidentiality.
You have to be judicious in your reporting, even if you wish to share things with the participants- If you are a licensed school psychologist and you are administering a personality test, you want to be careful how much information you share with the child versus how much you share with the parents. You want to be informative and helpful but not provide more information than the child needs to know, because the information may end up being hurtful.
Assessment techniques should be appropriate to the purpose of the testing and appropriate to the audience being tested- It means that if you can’t find the right tool to use, then build your own or don’t use any
Tests that are constructed under your watch need to have all the qualities of any assessment tool that we recognize to be most important- they should be reliable and valid
Finally, what you’re doing needs to have an important purpose- frivolous testing is exploitive and unfair
What does the FERPA law say?
Parents of students can inspect the students’ education records, but schools are not required to provide copies of records unless it is impossible for parents or eligible students to review the records
Parents or eligible students can request that their school correct records that the parents or students think are inaccurate.
Schools must have written permission from the parent or eligible student to release any information from a student’s education record. However, schools may share records without that permission with the following parties or in the following situations
School officials with a legitimate educational interest
Other schools to which a student is transferring
Specified officials for audit or evaluation purposes
Appropriate parties in connection with financial aid to a student
Organizations conducting certain studies for or on behalf of the school
Accrediting organizations
Appropriate parties in compliance with a judicial order or lawfully issued subpoena
Appropriate officials in cases of health and safety emergencies
State and local authorities within a juvenile justice system, pursuant to specific state law
What are the downsides to the FERPA law?
A huge amount of paperwork is generated by the implementation of the law
Lots of questions were left unanswered by the original law, such as secondary school students’ access to college letters of recommendations and cocnerne
What is the Buckley Amendment?
Another name for FERPA. Among other things, the law prohibits professors from telling other students what grade you received on an exam/assignment or posting the grades publicly
What makes a job a profession?
There is a set of standards for training and performance that has been established, usually by a group of people in that profession.
What are the three major professional organizations that worry about the right way to use tests and measurements?
American Psychological Association
American Educational Research Association
National Council on Measurement in Education
They have combined to write, maintain, and publish the Standards for Educational and Psychological Testing
What is the Standards for Educational and Psychological Testing book about?
It provides the standard that all professionals involved in testing should follow in the development, selection, and interpretation of tests.
What are the parts of the Standards For Educational and Psychological Testing?
Part 1 focuses on the key principles of validity, reliability, and fairness
Part 2 provides the professional standards for making tests and administering them
Part 3 discusses the use of tests in various contexts, like schools, psychology, and the workplace
What are the issues that relate to testing in general?
The Flynn effect
Teacher Competency - how do you measure the competency of a teacher
School admissions- based on what criteria should students be admitted to schools
Cyril Burt’s alleged fabrication of data
What is the Flynn effect?
Based on the work of James Flynn, who published a study in 1994 showing that intelligence test scores over the past 60 years had increased from one generation to the next (by between 5 and 25 points).
How can the Flynn effect impact the costs of standardized testing?
If scores continue to increase, it means that the tests have to be re-normed or re-standardized every few years so that there will be consistent and accurate standards across all test takers. This is expensive, time consuming, and controversial
Who was Cyril Burt? What was he accused of?
A luminary in the field of educational psychology who was accused in 1976 in peer-reviewed journals of fabricating data
He was known for his work in the specialized statistical technique of factor analysis and his investigations of the role that genetics or inheritance plays in intelligence
He designed a study to examine twins’ similarities in intelligence and compared the findings to less “related” siblings as well as other pairs of children using correlational analysis.
How many years have organizations been producing standards for psychological testing?
About 60 years
In 2010, how many U.S. children were receiving special education services?
More than 6 million students
What does “saliency” mean?
The characteristic of being outstanding, striking, memorable
What does resiliency mean?
the process and outcome of successfully adapting to difficult or challenging life experiences, especially through mental, emotional, and behavioral flexibility and adjustment to external and internal demands.
What can several Likert items be combined into?
They combine into a scale to create a total score that summarizes those item responses
What is the result of unbiased tests?
A higher validity
How are portfolio assessments beneficial for teachers?
They are flexible
they are highly personalized for both the student and teacher
They are an attractive alternative to traditional methods of assessment when other tools are either too limiting or inappropriate.
What has happened regarding college admissions testing since the Truth in Testing law was enacted?
Nearly 1,000 institutions of higher learning, smaller and larger, have decided not to require the test (SAT,ACT) for undergraduate admissions, or to make it optional
What surveys measure opinions and feelings?
Attitude surveys
What is the meaning of Thurstone items? How are they formatted? How are they weighted?
Used to measure attitudes
They are formatted using various ranges of statements
They are weighted based on their strength as a result of the different levels of points based on how strongly they are worded
What type of test formats are associated with the different levels of Bloom’s Taxonomy?
Memorized Knowledge- Multiple choice items because recognition is easier than creating or organizing a response
Comprehension
Application
Analysis
Synthesis
Evaluation- Essay questions
Who is Samuel Messick and what did he propose
An American psychologist who took our conventional definitions of validity and extended them one step further to form consequential validity
Colleges that use test scores and other selection criteria for admission decisions must be sure of what?
They must make sure that some philosophy or set of value-based rules underlies their model of how this admission information will be used, and this philosophy should be written down and explicit so anyone can read and understand it.
Who is Allan Bakke?
He was a student who applied twice to the University of California, Davis medical school, once in 1973 and again in 1974. He was rejected both times even though his entry test scores were higher than those of other students
He took the school to court claiming that the admissions policy was discriminatory because it was based in part on race.
In 1978 the Supreme Court ruled 5 to 4 in favor of Bakke, finding that he had been discriminated against and had to be admitted.