Five Characteristics of Quality Educational Assessments

Assessment literacy involves understanding how assessments are made, which types of assessments answer which questions, and how the data from assessments can be used to help teachers, students, parents, and other stakeholders make decisions about teaching and learning. Assessment designers strive to create assessments that show a high degree of fidelity to the following five traits:

Content Validity
Reliability
Fairness
Student Engagement and Motivation
Consequential Relevance

In this blog post, we’ll cover the first characteristic of quality educational assessments: content validity.

One of the most important characteristics of any quality assessment is content validity. Simply put, content validity means that the assessment measures what it is intended to measure for its intended purpose, and nothing more. For example, if an assessment is designed to measure Algebra I performance, then reading comprehension issues should not interfere with a student’s ability to demonstrate what he or she knows, understands, and can do in Algebra I. Content validity is evidenced at three main levels: the assessment design level, the assessment experience level, and the assessment question, or item, level.

The assessment design is guided by a content blueprint, a document that clearly articulates the content that will be included in the assessment and the cognitive rigor of that content. The content standards that the test is designed to assess determine which content makes it into the test’s item pool.
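
To make the idea of a blueprint concrete, here is a minimal sketch in Python. It assumes a blueprint can be represented as a table of required item counts for each combination of standard and cognitive rigor level. The standard names, rigor labels, and the `blueprint_gaps` helper are all hypothetical and chosen only for illustration; real blueprints carry far more detail.

```python
from collections import Counter

# Hypothetical blueprint: (standard, rigor level) -> minimum number of items.
# The names below are invented for illustration only.
blueprint = {
    ("linear-equations", "recall"): 2,
    ("linear-equations", "application"): 3,
    ("quadratics", "application"): 2,
}


def blueprint_gaps(blueprint, item_pool):
    """Return each blueprint cell that is short of items, with the shortfall."""
    counts = Counter((item["standard"], item["rigor"]) for item in item_pool)
    return {
        cell: needed - counts[cell]
        for cell, needed in blueprint.items()
        if counts[cell] < needed
    }
```

Calling `blueprint_gaps(blueprint, item_pool)` with a list of item records (each a dictionary with `standard` and `rigor` keys) would show which parts of the blueprint the pool does not yet cover.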

The next level where content validity matters is the assessment experience itself: when the student sits down to take the assessment, which items do they see? In a fixed-form, grade-level test, most or all students at a given grade level see the same item set, namely the items that assess the standards for the grade to which the student is assigned. In a cross-grade, computer-adaptive test, an item selection algorithm presents each student with items sampled from a broad range of standards and adapts to the in-the-moment performance of the test taker. Each student sees items at the difficulty level that’s appropriate for them, based on their previous responses. This adaptivity enables test developers to provide very precise information about a student’s learning and performance in a domain area.
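
The adaptive idea can be illustrated with a deliberately simplified sketch in Python. This is not the item selection algorithm used by any particular test. It assumes each item has a single difficulty value on an arbitrary scale, picks the unused item closest to the current ability estimate, and nudges the estimate up or down by a shrinking step after each response. All names here (`Item`, `pick_next_item`, `run_adaptive_test`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    difficulty: float  # assumed difficulty on an arbitrary scale


def pick_next_item(pool, used_ids, ability):
    """Return the unused item whose difficulty is closest to the estimate."""
    candidates = [item for item in pool if item.item_id not in used_ids]
    return min(candidates, key=lambda item: abs(item.difficulty - ability))


def run_adaptive_test(pool, answer_fn, n_items=3, start=0.0, first_step=1.0):
    """Administer n_items adaptively; answer_fn(item) returns True if correct."""
    ability, step = start, first_step
    used_ids, administered = set(), []
    for _ in range(n_items):
        item = pick_next_item(pool, used_ids, ability)
        used_ids.add(item.item_id)
        correct = answer_fn(item)
        administered.append((item.item_id, correct))
        # Raise the estimate after a correct answer, lower it otherwise.
        ability += step if correct else -step
        step /= 2  # smaller adjustments as more evidence accumulates
    return ability, administered
```

Operational adaptive tests typically rely on statistical measurement models rather than a fixed step rule, but the loop has the same shape: select an item, score the response, update the estimate, repeat.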

Content validity also applies at the building-block level of MAP® Growth™ from NWEA: the questions, or items, themselves. Experts in both content and assessment design each item to measure the concepts and skills in the standards at the indicated levels of cognitive complexity. Every item in a high-quality assessment goes through a rigorous development process with several levels of review, which ensures that item content is clear, accurate, and relevant. The result is a robust and aligned item pool that serves to provide the most accurate information possible about a student.

Content validity is supported in a number of ways in educational assessments, such as:

+ General assessment design principles that control for readability

+ Content expert review cycles

+ Evidence-centered design methodology

+ Statistical analysis of student performance on test items
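
As one illustration of the last point, the sketch below computes two common classical item statistics in Python: item difficulty (the proportion of students answering correctly) and item discrimination (the correlation between an item score and the total score on the remaining items). It assumes responses are scored 0 or 1 and stored as one row per student. The function names are hypothetical, and this is only one of many possible analyses.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of students who answered the item correctly."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Correlation between the item score and the score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        return None  # undefined when either score has no variation
    mean_item, mean_rest = mean(item_scores), mean(rest_scores)
    covariance = mean(
        (x - mean_item) * (y - mean_rest)
        for x, y in zip(item_scores, rest_scores)
    )
    return covariance / (sd_item * sd_rest)
```

An item that nearly everyone answers correctly or incorrectly, or one whose discrimination is near zero or negative, is usually a candidate for further content review.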

One way to check content validity is to ask these guiding questions:

+ How closely does what the assessment measures match the intended (instructed) content?

+ What knowledge or skills does the student most need to perform successfully on this assessment?

+ If the student performs successfully on this assessment, what does that mean?

Content validity is foundational to making accurate inferences. If one is unclear about what the assessment is measuring, then the inferences made will be uninformative. In other words, the assessment has failed in its prime directive: to provide valuable information about what the test taker knows and can do. An assessment can have all sorts of bells and whistles, incorporate cutting-edge technology and functionality, and offer a great suite of reports that tell a compelling assessment narrative, but if the test lacks content validity, it is not worth much. What’s more, when data from an assessment that lacks content validity are used to inform instruction, the result could include wasted time and inappropriate growth expectations of students. For these reasons, content validity is central to a high-quality educational assessment.

About NWEA

NWEA® is a not-for-profit organization that supports students and educators worldwide by providing assessment solutions, insightful reports, professional learning offerings, and research services. Visit NWEA.org to find out how NWEA can partner with you to help all kids learn.