Assessment - Assessment comes from the Latin word assidere,
which means to sit beside. In education, this "sitting beside"
includes observing, collecting information, coaching, and otherwise
supporting the learning process. Assessment is an integral part of
learning. It invites the learner to also take a close look at
his/her own learning process, to reflect on it, to build on
strengths, and to work on improving.
Alternative Assessment - Alternative assessment is any
assessment in which the learner creates a response to a question
rather than choosing from responses that have been provided.
Alternative assessments might include short answer questions,
essays, performance assessments, oral presentations, exhibitions, and
portfolios.
Achievement Test - Achievement tests are standardized tests
designed to measure the amount of skill or knowledge students in a
school or district have gathered with respect to a very focused
area. The "standardized" component of these tests has nothing to do
with how good or complete the test is. It simply refers to the fact
that all the tests are administered and scored the same way
(essentially by machine), and also that the tests are designed to
measure content that has (presumably) been taught to students in a
fairly standardized way.
Analytic Trait Scoring - Analytic scoring identifies traits
essential to success in a given performance and requires trained
raters to score those traits individually. In six-trait analytic
writing assessment, for instance, instead of getting just one score
for "overall effectiveness," a paper receives six separate scores -
for ideas, organization, voice, word choice, fluency, and
conventions. Together, these scores create a profile of performance.
The use of a scoring guide in which traits are defined in writing
helps ensure consistency in the way writing (or any kind of
performance) is assessed.
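As a rough illustration of the profile idea, the sketch below keeps six trait scores for one paper and reports them separately instead of collapsing them into a single number. The 1-to-5 scale, the helper name trait_profile, and the sample scores are assumptions made for this example; they are not taken from any published scoring guide.

```python
# The six traits named in the entry above, in a fixed reporting order.
TRAITS = ("ideas", "organization", "voice", "word choice", "fluency", "conventions")


def trait_profile(scores, low=1, high=5):
    """Return (trait, score) pairs in a fixed order.

    Raises ValueError if a trait is unscored or a score falls outside
    the assumed low..high scale (hypothetical 1-5 scale by default).
    """
    missing = [t for t in TRAITS if t not in scores]
    if missing:
        raise ValueError(f"missing trait scores: {missing}")
    for trait in TRAITS:
        if not low <= scores[trait] <= high:
            raise ValueError(f"{trait} score {scores[trait]} is outside {low}-{high}")
    return [(trait, scores[trait]) for trait in TRAITS]


# Hypothetical scores for one paper.
paper = {
    "ideas": 4,
    "organization": 3,
    "voice": 5,
    "word choice": 4,
    "fluency": 3,
    "conventions": 2,
}

for trait, score in trait_profile(paper):
    print(f"{trait:<13}{score}")
```

Read as a profile, this hypothetical paper is strong in voice and weak in conventions, which is information a single "overall" score would hide.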
Authentic Assessment - Authentic assessment is based on tasks
that mimic real life as closely as possible. A good example is a
driving test, in which a would-be driver is asked to cope with many
of the situations he/she will encounter in everyday driving.
Competency Test - This is a test intended to demonstrate that
a student has met established standards of skills or knowledge.
Criteria - When you hear the word criteria, think language.
In performance assessment - of which writing is one example - we do
not have "right" or "wrong" answers as we would on, say, a
multiple-choice test. Instead, we have a continuum of performance
that ranges from beginning levels through developing right on up to
proficient. Criteria are the language or the descriptors that define
levels of performance for each trait assessed.
Criterion-Referenced Test - In a criterion-referenced test,
students are not compared to each other. Instead, each student's
performance is measured against criteria that define success. If
every student meets the criteria (standards) considered important,
each student will be regarded as successful.
Essay Test - An essay test requires students to respond to a
question (prompt) by writing original text.
Evaluation - An evaluation is a judgment about whether a
behavior, product, program (or whatever) is or is not producing the
desired results. Evaluations are usually based on multiple sources
of information that might include surveys, test scores,
observations, and many other sources.
High Stakes Testing - High stakes testing occurs whenever a
major decision with significant consequences is made on
the basis of test results. Examples include promotion,
certification, graduation, and access to, or denial of, learning
opportunities.
Multiple-Choice Testing - A multiple-choice test is one in
which students select the correct or best answer from several
alternatives.
Norm-Referenced Test - In a norm-referenced test, a student's
or group's performance is compared to that of students who are like
them - a peer group known as the "norm group."
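The contrast with a criterion-referenced test can be sketched in a few lines. In the example below, the same score is read two ways: against a fixed cutoff, and against a norm group. The cutoff of 75, the sample scores, and both function names are assumptions for illustration only, and percentile rank is computed here as the percentage of norm-group scores strictly below the student's score, which is one of several conventions in use.

```python
def meets_criterion(score, cutoff):
    """Criterion-referenced reading: compare the score to a fixed standard."""
    return score >= cutoff


def percentile_rank(score, norm_group):
    """Norm-referenced reading: percent of norm-group scores strictly below score."""
    if not norm_group:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)


# Hypothetical norm group and student score.
norm_group = [52, 61, 64, 70, 73, 78, 81, 85, 90, 94]
student = 78

print(meets_criterion(student, cutoff=75))   # True
print(percentile_rank(student, norm_group))  # 50.0
```

The student meets the assumed standard, yet sits in the middle of the norm group. The two readings answer different questions.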
Objective Test - The term objective is a little misleading.
It is often taken to mean "fair" or "free of human judgment."
Actually, both impressions are a little off the mark. An objective
test is one in which scoring procedures do not depend on human
judgment (quite unlike, say, a writing assessment, in which human
judgment of performance is the whole point). Usually, objective
tests are multiple-choice and are machine scored. It is important to
keep in mind, though, that while human judgment does not influence
the scoring of objective (e.g., multiple-choice) tests, it is a very
large factor in test construction - that is, in determining test
content or test design. On a multiple-choice test, each item must be
written by someone who decides (1) which content is worth testing,
(2) how it will be tested, and (3) how both "correct" and
"incorrect" responses will be worded. If we take a close,
scrutinizing look at a multiple-choice test, we'll likely find that
some questions were important to ask and were worded clearly with
well-defined correct answers. In other cases, though, we might
wonder whether a given question was worth asking in the first place
- or a careful reading may show that it was worded in a confusing
manner or that more than one option could be considered "correct."
In short, "objective" has nothing to do with fairness or quality,
but only with the way in which the test is scored.
Performance Assessment - Performance assessment is based on
direct observation of a student's work (a writing sample) or process
(the performance itself - say, a dive or an oral presentation). The
quality of the performance is judged on the basis of clearly
specified criteria that define what the given performance looks like
at the beginning, developing, and proficient levels. Sound
performance assessment is characterized by clear targets; a
well-defined sense of purpose (how will we use results?); sound,
thoroughly tested criteria that are known to everyone (including
students); and quality tasks that are engaging, challenging (without
demanding the impossible), and relevant to what we really want
students to be able to do.
Portfolio - A portfolio is a purposeful collection of
significant work, carefully selected, dated, and presented to tell
the story of a student's achievement or growth in well-defined areas
of performance (writing, reading, math, etc.). A portfolio usually
includes personal analysis in which the student explains why each
piece was chosen and what it shows about his/her growing skills and
abilities.
Prompt - A prompt is a picture, word, phrase, sentence, or
paragraph intended to generate ideas and give a student a starting
point for writing. A prompt is just that - a stimulus. In most
writing assessments, therefore, students are scored on the quality
of the writing, not on meticulous attention to following the
directions of the prompt. For instance, a prompt might ask a student
to write about a favorite or memorable place. A student might, as
one did, write about the inside of his own mind - his imagination.
Some people might argue that this is not a "place" in the sense that
"New York" is a place. But such literal interpretation rarely seems
as important as giving students every opportunity to respond
creatively to a prompt and to show what they can do as writers.
Rater - A rater is a person who is trained to use criteria
consistently and skillfully in assessing performance. Raters are
most often teachers, but can also be professionals whose work is
relevant to the area being assessed (e.g., editors or journalists
for writing performance) or parents with teaching or content area
experience.
Reliability - Reliability is a measure of consistency - over
time, over similar performances, or over raters. We would not want
the scores on any performance to be simply a matter of chance! Good
training and sound criteria help ensure that comparable performances
will receive comparable scores - regardless of when the scoring
occurs or which rater does the scoring. Sound performance
assessments should guarantee reliability; otherwise, results are
neither meaningful nor useful.
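One simple way to check consistency across raters is to have two raters score the same set of papers and count how often they agree. The sketch below reports exact agreement and adjacent agreement (scores within one point of each other). The function name, the one-point tolerance, and the sample scores are assumptions for this example, and agreement rates are only one of several reliability indicators.

```python
def agreement_rates(rater_a, rater_b):
    """Return (exact, adjacent) agreement as fractions between 0 and 1.

    exact:    both raters gave the same score
    adjacent: the two scores differ by at most one point (includes exact)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of papers")
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical scores from two raters on the same eight papers.
rater_a = [4, 3, 5, 2, 4, 3, 1, 5]
rater_b = [4, 3, 4, 2, 5, 3, 3, 5]

exact, adjacent = agreement_rates(rater_a, rater_b)
print(exact)     # 0.625
print(adjacent)  # 0.875
```

Here the raters match exactly on five of eight papers and land within one point on seven of eight; the one paper scored 1 and 3 is the kind of gap that further training or clearer criteria would aim to close.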
Rubric - Rubric is another word for scoring guide.
Scoring Guide - A scoring guide is a set of written criteria
used to judge a particular kind of performance: e.g., writing,
public speaking, or math problem solving. Criteria are the language
that defines how performance
looks at various levels: beginning, developing, and proficient.
Task - A task is simply the activity the student is required
to do as part of an assessment. Sample tasks include completing a
chemistry lab, preparing an argument for a debate, writing a paper, or
solving an open-ended math problem.
Task-Specific Scoring Guide - A task-specific scoring guide
is designed for use in judging performance on a particular
assignment - e.g., a literary analysis of The Helen Keller Story.
(Compare this highly specific, focused approach to assessing, say,
performance in writing.) Such scoring guides are not time efficient
since a separate one must be developed for every task assessed.
Generalizable scoring guides (guides that can be used with almost
any assignment in a given content area, such as math or writing)
are preferred by most teachers.
Validity - Validity is an indication of how well an
assessment actually measures what it is intended to measure. For
example, a valid measure of writing focuses primarily on the
writing, not the student's ability to read and interpret a difficult
prompt.
*Reproduced with permission from Northwest Regional Education
Laboratory