Assessing in VET: Issues of reliability and validity - Review of research

By Shelley Gillis and Andrea Bateman
Research report, 11 June 1999
ISBN 0 87397 542 1

Description

This review examines both the Australian discussion papers on the reliability and validity of competency-based assessment and the international empirical research in this field. It discusses two types of competency-based assessment, paper-based objective testing and performance assessment, together with the implications of each for validity and reliability. The review also includes guidelines for establishing procedures that enhance reliability and validity.

Summary

Executive summary

Main conclusions

Validity of an assessment refers to the use and interpretation of evidence collected, as opposed to the assessment method or task. It is not simply a property of the assessment task. An assessment task that is highly valid for one use or context may be invalid for another.

There are a number of different types of validity, including face, content, construct, criterion (concurrent and predictive) and consequential. Each type needs to be considered when designing assessment tasks and/or interpreting assessment outcomes for a particular purpose.

Validity is largely determined through inferences made by both the task developers and users.

An essential component of the validity of assessments is the assessor's intention. Assessors should be very clear about their intentions when assessing candidates against competency standards, and should identify the boundaries and limitations of the interpretations they make of assessments for a particular purpose and context.

The validity of workplace assessments is often defended on the grounds of the authentic nature of the assessments. Although this provides evidence of face validity, further evidence of content, criterion, construct and consequential validity is needed before the assessment can be said to be valid.

The reliability of an assessment is an estimate of how accurate or precise the task is as a measurement instrument. Reliability is concerned with how much error is included in the evidence.

There are common sources of error associated with both objective tests and performance assessment. These are associated with:

  • the method of gathering evidence (i.e. the level of precision of the assessment task and the degree of standardisation of the administration and scoring procedures)
  • the characteristics of the candidate (e.g. fatigue during a long test)

In performance assessment, there are additional sources of error:

  • the characteristics of the assessor (e.g. preconceived expectations of the competency level of the candidate)
  • the context of the assessment (e.g. location)
  • the range and complexity of the task(s) (e.g. the level of contextualisation)

Each of the above factors needs to be controlled throughout the assessment in order to improve reliability. Assessment procedures need to be developed to minimise the error in the evidence collected and interpreted by assessors. Establishing clear task specifications, including the evidence to be collected and the decision-making rules, will increase reliability.

Evidence is crucial in establishing reliability and validity of assessments. The methods used to collect the evidence will impact on the reliability, whilst the way in which assessors use and interpret the evidence collected will impact on the validity of the assessment. As reliability creates a foundation for validity, an assessment should aim to reduce the error or 'noise' in the evidence collected or used.
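One common way to quantify the assessor-related error described above is to have two assessors judge the same candidates and measure their agreement beyond chance. The report does not prescribe a particular statistic; the sketch below uses Cohen's kappa as one illustrative option, with invented judgements ("C" = competent, "N" = not yet competent):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two assessors, corrected for
    the agreement expected by chance alone."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed proportion of candidates on whom the assessors agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each assessor's marginal rating frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical judgements on ten candidates (illustration only)
a = ["C", "C", "N", "C", "N", "C", "C", "N", "C", "C"]
b = ["C", "C", "N", "N", "N", "C", "C", "C", "C", "C"]
print(round(cohens_kappa(a, b), 2))  # → 0.52
```

A kappa well below 1 here signals that assessor characteristics or loose decision-making rules are adding noise to the evidence, which is the kind of error the procedures discussed above are intended to reduce.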

Validation of an assessment process should therefore address the various forms of reliability and validity. It will require the assessment task developers and users (i.e. assessors) to make an holistic judgement as to whether this evidence supports the intended use and interpretation of assessment evidence for the specified purpose(s). The intended use, context and limitations of the assessment task need to be reported to potential users. Ultimately, the validation of an assessment requires evidence of task development, clear and concise assessment criteria against the competency standards, appropriate task administration procedures, adequate scoring/decision-making rules and recording procedures.

Findings and directions for further research

The review of literature has revealed a number of areas requiring further research. These include research into:

  • validation approaches used by workplace assessors and VET practitioners within Australia
  • transferability of competencies outside the assessment event
  • consequences of competency-based assessments in both vocational educational settings and the workplace
  • factors that influence judgements in competency-based assessment and how such factors impact on reliability and validity

Download

Assessing-in-vet-435.pdf (PDF, 2.1 MB)