A concept inventory is a criterion-referenced test designed to help determine whether a student has an accurate working knowledge of a specific set of concepts. Historically, concept inventories have been in the form of multiple-choice tests in order to aid interpretability and facilitate administration in large classes. Unlike a typical, teacher-authored multiple-choice test, questions and response choices on concept inventories are the subject of extensive research. The aims of the research include ascertaining (a) the range of what individuals think a particular question is asking and (b) the most common responses to the questions. Concept inventories are evaluated to ensure test reliability and validity. In its final form, each question includes one correct answer and several distractors.
Ideally, a score on a criterion-referenced test reflects the amount of content knowledge a student has mastered. Criterion-referenced tests differ from norm-referenced tests in that (in theory) the former is not used to compare an individual's score to the scores of the group. Ordinarily, the purpose of a criterion-referenced test is to ascertain whether a student mastered a predetermined amount of content knowledge; upon obtaining a test score that is at or above a cutoff score, the student can move on to study a body of content knowledge that follows next in a learning sequence. In general, item difficulty values ranging between 30% and 70% are best able to provide information about student understanding.
The distractors are incorrect or irrelevant answers that are usually (but not always) based on students' commonly held misconceptions. Test developers often research student misconceptions by examining students' responses to open-ended essay questions and conducting "think-aloud" interviews with students. The distractors chosen by students help researchers understand student thinking and give instructors insights into students' prior knowledge (and, sometimes, firmly held beliefs). This foundation in research underlies instrument construction and design, and plays a role in helping educators obtain clues about students' ideas, scientific misconceptions, and didaskalogenic ("teacher-induced" or "teaching-induced") confusions and conceptual lacunae that interfere with learning.
Concept inventories are education-related diagnostic tests. In 1985 Halloun and Hestenes introduced a "multiple-choice mechanics diagnostic test" to examine students' concepts about motion. It evaluates student understanding of basic concepts in classical (macroscopic) mechanics. A little later, the Force Concept Inventory (FCI), another concept inventory, was developed. The FCI was designed to assess student understanding of the Newtonian concepts of force. Hestenes (1998) found that while "nearly 80% of the [students completing introductory college physics courses] could state Newton's Third Law at the beginning of the course. FCI data showed that less than 15% of them fully understood it at the end". These results have been replicated in a number of studies involving students at a range of institutions (see sources section below). That said, there remain questions as what exactly the FCI measures. Results using the FCI have led to greater recognition in the science education community of the importance of students' "interactive engagement" with the materials to be mastered.
Since the development of the FCI, other physics instruments have been developed. These include the Force and Motion Conceptual Evaluation concept and the Brief Electricity and Magnetism Assessment. For a discussion of how a number of concept inventories were developed see Beichner.
In addition to physics, concept inventories have been developed in statistics, chemistry, astronomy, basic biology, natural selection, genetics, engineering, geoscience. and computer science.
In many areas, foundational scientific concepts transcend disciplinary boundaries. An example of an inventory that assesses knowledge of such concepts is an instrument developed by Odom and Barrow (1995) to evaluate understanding of diffusion and osmosis. In addition, there are non-multiple choice conceptual instruments, such as the essay-based approach and the essay and oral exams concept to measure student understanding of Lewis structures in chemistry.
Some concept inventories are problematic. The concepts tested may not be fundamental or important in a particular discipline, the concepts involved may not be explicitly taught in a class or curriculum, or answering a question correctly may require only a superficial understanding of a topic. It is therefore possible to either over-estimate or under-estimate student content mastery. While concept inventories designed to identify trends in student thinking may not be useful in monitoring learning gains as a result of pedagogical interventions, disciplinary mastery may not be the variable measured by a particular instrument. Users should be careful to ensure that concept inventories are actually testing conceptual understanding, rather than test-taking ability, language skills, or other abilities that can influence test performance.
The use of multiple-choice exams as concept inventories is not without controversy. The very structure of multiple-choice type concept inventories raises questions involving the extent to which complex, and often nuanced situations and ideas must be simplified or clarified to produce unambiguous responses. For example, a multiple-choice exam designed to assess knowledge of key concepts in natural selection does not meet a number of standards of quality control. One problem with the exam is that the two members of each of several pairs of parallel items, with each pair designed to measure exactly one key concept in natural selection, sometimes have very different levels of difficulty. Another problem is that the multiple-choice exam overestimates knowledge of natural selection as reflected in student performance on a diagnostic essay exam and a diagnostic oral exam, two instruments with reasonably good construct validity. Although scoring concept inventories in the form of essay or oral exams is labor-intensive, costly, and difficult to implement with large numbers of students, such exams can offer a more realistic appraisal of the actual levels of students' conceptual mastery as well as their misconceptions. Recently, however, computer technology has been developed that can score essay responses on concept inventories in biology and other domains, promising to facilitate the scoring of concept inventories organized as (transcribed) oral exams as well as essays.