Kirkpatrick Levels in Course Evaluation

Objectives of Evaluation

The term evaluation can hold a number of meanings for an audience, depending on its intent. In this article, evaluation means review with the intent to improve something. In this case, we are talking about review procedures for producing training, either in the classroom or for distance education.

We will use four descriptive levels to explain how this might work, as defined in the seminal work Evaluating Training Programs by Donald and James Kirkpatrick. Their work outlined four levels for measuring the success of a training program: reaction, learning, behavior, and results. The intent of this evaluation is to gain a clearer understanding of the effectiveness of a course, measure the impact of the training on the wider organization, understand the costs and benefits of implementation, and determine where improvements can be made.


Pre-delivery evaluation

The pre-delivery step in the evaluation process actually begins during the conception of the course, and maintains a cyclical pattern of review and measurement up to the point of release of the course to its intended audience. In a professional environment, multiple groups with different skill sets are involved in the pre-delivery evaluation of a course. It is up to the project leader to determine how to best use the input of these groups to move the course development forward.

In order to determine if developed material is ready to be delivered to the field, content is reviewed by multiple audiences at multiple milestones (Bradley & Martin, 2002). For the purpose of clarity, this process has been consolidated into one final checklist for all groups. This format is actionable from a process perspective. The results here can be directly applied to altering the course to make it ready for delivery. What follows is a list of the primary participating groups and their anticipated contributions (Table 1 Topic: Contributing Course Development Groups.)

Group Contribution
Content Experts
  • Determine accuracy of content
  • Determine if depth of content is effective
  • Confirm that content is appropriate for audience
Instructional Designer
  • Affirm that major message is consistent
  • Confirm sequencing of information is effective and orderly
  • Determine consistency in presentation of information for media type
  • Confirm validity of interim and final assessment
  • Confirm that content is correct for stated learning objectives and audience
Editor
  • Ensure that page organization and hierarchy are consistent for levels of information (example: headers and call-outs)
  • Note spelling, grammar, typeface, punctuation and translation errors
  • Note written and spoken narration errors
Designer/Multimedia Developer
  • Confirm that typefaces/fonts are properly represented and suggest alternatives
  • Confirm that look and feel are consistent within design guidelines
  • Confirm color coordination is effective and adheres to corporate requirements
  • Confirm that graphics are as clear as possible
  • Confirm that audio is as clear as possible whether produced mechanically or through human recording
  • Confirm that media is being used effectively to convey content in most direct method possible
  • Suggest additions in the form of audio, video, animation, interaction, simulation or immersive learning simulation (ILS) that could enhance content

Pre-delivery Evaluation Checklist

During each stage of evaluation, some consistent and quantifiable method is necessary to collect input from the evaluating group. These are often called checklists, and are merely organized forms for the evaluators to fill out, depending upon their findings. They may be paper-based or electronic in form. A simple example for Content Experts might be like the following (Table 2 Topic: Content Evaluator Checklist.)


Review Criteria Content Accuracy S U Comment/Ref.
Example: Material for Lesson One is accurate.
If U is selected, then indicate the slide number and the specific error. Error
Example: Slide #
 
 

For Designers or Multimedia experts, a pre-evaluation checklist might look similar, but with a different focus (Table 3 Topic: Design Evaluator Checklist.)

Review Criteria Content Errors S U Comment/Ref.
Example: Material for Lessons One through Four meet the criteria of successful adherence to the design guidelines for color, typeface, fonts, graphic clarity, audio clarity, media usage, and effective communication. Also, suggest additional animations, simulations and interactions that could be beneficial to the content.
If U is selected, then indicate the slide number and the specific error. Non-Compliance or Suggestion for Enhancement
Example: Slide #
 
 

Realizations of the Checklist

A checklist is an important tool for evaluating the accuracy, relevancy and impact of the content, as well as for improving the anticipated learner experience. It does not, however, give insight into how the student will react to the course itself. Evaluating the learner reaction takes place after the content has been made available to the intended audience, and gives an initial indicator of the relative impact of the course.


Learner Reaction Evaluation

In order to quantify the reactions of the learner (customer), it is necessary to specify what data is being requested and what form the answers should take. Since we are evaluating the learner's confidence in the material, and the learner has already used similarly structured online courses as prerequisites for this course, the reaction evaluation focuses primarily on the content and less on elements such as registration, navigation, and the general learner experience, as that information would be redundant.

It is necessary to design a tool that gathers reaction data from the anticipated audience, focused on whether the material has sufficient relevancy, breadth and depth, and that affirms or refutes the rationale behind the development choices. This evaluation measures reactions by participants, and not necessarily knowledgeable critiques. Their experiences will be colored by other details such as experience with the product, satisfaction with their assignments or the product, workload, and individual personality traits. For these reasons, it is important to structure the reaction sheet so that it draws the students' attention to the important choices made in presenting the material, so that they can provide useful data to the evaluator. A simple example for Measuring Learner Reaction might be similar to the following (Table 4 Topic: Learner Reaction Evaluation Form.)


Learner Reaction Evaluation Form Example
Question Not Satisfactory Satisfactory More than Satisfactory Additional Comments
1. How did the course answer your questions as to the primary differences between the ABC and its replacement, the XYZ ?        
2. How did the course avoid extraneous material?        
3. Where large amounts of information were presented, how did the course present you with the relevancy of that information?        
4. How well did the on-screen text add to your understanding of the concept?
5. How well did the narration add to your understanding of the concept?
6. How well did the accompanying workbook and downloaded materials add to your understanding of the concept?
7. Do you have any comments or concerns about the material not addressed in these questions?  

Creating a Variable Range Scale

To properly quantify data from evaluation tools, especially feedback instruments, it is necessary to create some type of variable range scale that gives relative weight to the reactions. For example, for the Learner Reaction form, we might provide a range as follows (Table 5: Topic: Variable Range of Reaction Values.)


Variable Range of Reaction Values
Selection Value
Not Satisfactory 0
Satisfactory 1
More Than Satisfactory 2
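
The scale above can be applied mechanically once responses are collected. What follows is a minimal sketch, assuming each learner's selections are recorded by question number; the function name, data layout and sample responses are hypothetical and only illustrate the weighting.

```python
from statistics import mean

# Values taken from the Variable Range of Reaction Values table.
REACTION_VALUES = {
    "Not Satisfactory": 0,
    "Satisfactory": 1,
    "More Than Satisfactory": 2,
}


def score_responses(responses):
    """Return the mean reaction value for each question.

    `responses` maps a question number to the list of selections
    made by the learners who answered that question.
    """
    return {
        question: mean(REACTION_VALUES[selection] for selection in selections)
        for question, selections in responses.items()
    }


# Hypothetical selections from three learners for questions 1 and 2.
sample = {
    1: ["Satisfactory", "More Than Satisfactory", "Satisfactory"],
    2: ["Not Satisfactory", "Satisfactory", "Satisfactory"],
}
scores = score_responses(sample)
for question, value in sorted(scores.items()):
    print(f"Question {question}: mean reaction value {value:.2f}")
```

A mean near 1 suggests the question was answered as Satisfactory on balance; means closer to 0 point to content that may need revision.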

The Implication of Individual Change on the Success of the Organization

If behavioral change in the individual learners can be measured, then the efforts of a learner to modify their environment through application of that change can be measured. Taken in aggregate through enough agents of change, a significant organizational impact can be felt (Smith, 2001). The question is how change can be most effectively measured. Part of the answer is to include both internal evaluations, completed by the learner, and external evaluations, completed by a manager, that measure direct application of the changed behavior and the results. In this case, the goal is to measure how the field technician applies their training, what impact they perceive it has, and how their manager perceives that impact.


Organization Results Measurement

Though Return on Investment (ROI) is more easily quantifiable, a direct cost-to-results ratio is not always the best initial evaluation step for training within an organization. One concept about business organizations that is often misunderstood by those outside the business arena is that the goal of any successful, legitimate business is to be the best business organization possible. In training representatives of that business, it is important that the results reinforce the business plan being followed by that organization. (Table 6: Topic: Differences Impact Survey.)


Differences Impact Survey Results Estimation
Event Percentage Additional Comments
If it occurred, what was the change of X from the previous quarter? +/- %  
To what percent did the revenue related to uptime usage change from last quarter? +/- %  
What was the change in secondary results since last quarter? +/- %  
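
Each row of the survey asks for a signed percentage change from the previous quarter. A minimal sketch of that arithmetic follows, assuming the respondent has both quarterly figures on hand; the function name and the sample figures are hypothetical and are not part of the survey itself.

```python
def percent_change(previous: float, current: float) -> float:
    """Return the signed percentage change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("previous-quarter value must be non-zero")
    # Multiply before dividing to limit floating-point rounding error.
    return (current - previous) * 100 / previous


# Hypothetical quarterly figures for a measured event X.
last_quarter = 200.0
this_quarter = 230.0
change = percent_change(last_quarter, this_quarter)
print(f"Change from previous quarter: {change:+.1f} %")
```

A positive result is entered as an increase and a negative result as a decrease, matching the +/- % column of the form.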

The Requirement to Move Beyond Impressions

Measuring impressions is valuable and productive for this project for improving long-term results within the organization, as is using those results to extend the useful life-span of trained skills and behaviors (Rands, 2007). However, at some juncture, it is necessary to arrive at more of a cost/benefit ratio to continue the discussion with upper management, who may not directly equate training with continued improvement for the technical services division. In order to put the findings into a more direct context for this group, a Return on Investment (ROI) approach is more applicable.


ROI Measurement

Though so often used in business parlance in the United States that it has become a common phrase in general speech, ROI is intended to represent a specific comparison between measurable expenditures and results, in an effort to give some approximation of the inherent cost-effectiveness of an undertaking. In this case the ROI formulas are being used to evaluate the proper alignment of resources necessary for the production of this type of training, and to gain insight into areas for greater efficiency and cost-effective strategies, and their benefits for the company as can be directly measured. ROI is typically calculated both as a percentage return, based upon expenditures vs. savings or profits, and as a cost/benefits ratio (Sheperd, 1999).


Limitations of This Type of Measurement

There are several limitations to this approach and its results in this instance. Primarily, an instrument such as this attempts to evaluate the measurable cost of the budget applied directly to the creation of the resulting course material. It does not take into account the overhead costs for administrative personnel, subject-matter experts, technical support personnel, or persistent technology costs, such as the licensing of tools, media, storage, distribution, communication, tracking or measuring that is not used exclusively for the course (Kruse, 2004). (Table 7: Topic: Hard Data Elements.)


Element Assigned Value Total Cost
Project Management Work Hours (n) $x/hour $x*n
Content Research Work Hours (n) $x/hour $x*n
Curriculum Developer Work Hours (n) $x/hour $x*n
Layout and Design Work Hours (n) $x/hour $x*n
Additional image Assets (n) $x/image $x*n
Audio production (n minutes) $x/minute $x*n
Animation(n)/Simulation Assets(n) $x per animation / $x per simulation $x*n
Testing/Review/Evaluation Hours (n) $x/hour $x*n
LMS Placement Hours (n) $x/hour $x*n
Total  
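
The Total row is the sum of each element's quantity (n) multiplied by its assigned value ($x). A minimal sketch of that calculation follows; the class name, the three elements chosen, and every quantity and rate are placeholder assumptions, not figures from an actual project.

```python
from dataclasses import dataclass


@dataclass
class CostElement:
    name: str
    quantity: float   # n: hours, images, minutes or assets
    unit_rate: float  # $x per unit

    @property
    def total(self) -> float:
        """Total cost for this element: $x * n."""
        return self.quantity * self.unit_rate


# Placeholder quantities and rates, for illustration only.
elements = [
    CostElement("Project Management Work Hours", 40, 50.0),
    CostElement("Audio production (minutes)", 30, 20.0),
    CostElement("Additional image Assets", 10, 15.0),
]

hard_cost_total = sum(element.total for element in elements)
for element in elements:
    print(f"{element.name}: ${element.total:,.2f}")
print(f"Total: ${hard_cost_total:,.2f}")
```

Only the hard data elements belong in this sum; costs that are not directly tracked would need to be estimated separately.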

Soft Data Elements

These are sample costs that may not be directly tracked.

  • Subject Matter Expert (SME) production milestone evaluation and input
  • LMS administration
  • Review cycles
  • Web services and support group

ROI Impact Results

Using the standard evaluation measurement for ROI, calculate the hard data results: ROI (%) = ((benefits - costs) / costs) x 100


Cost/Benefits Ratio

Using the standard formula of Benefit/Cost Ratio, calculate the soft data results: Benefit/Cost Ratio = (benefits/costs)
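
Both measures can be computed from the same two totals. The sketch below reads the ROI formula as net benefits (benefits minus costs) divided by costs, expressed as a percentage, and the Benefit/Cost Ratio as benefits divided by costs. The function names and the sample dollar amounts are hypothetical.

```python
def roi_percent(benefits: float, costs: float) -> float:
    """ROI (%) = ((benefits - costs) / costs) x 100."""
    if costs <= 0:
        raise ValueError("costs must be greater than zero")
    return (benefits - costs) / costs * 100


def benefit_cost_ratio(benefits: float, costs: float) -> float:
    """Benefit/Cost Ratio = benefits / costs."""
    if costs <= 0:
        raise ValueError("costs must be greater than zero")
    return benefits / costs


# Hypothetical totals: measured benefits and total tracked costs.
benefits = 15000.0
costs = 10000.0

roi = roi_percent(benefits, costs)
bcr = benefit_cost_ratio(benefits, costs)
print(f"ROI: {roi:.1f} %")
print(f"Benefit/Cost Ratio: {bcr:.2f}")
```

The two figures describe the same result in different terms: a ratio above 1 corresponds to a positive ROI, and a ratio below 1 to a negative one.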


Conclusion

The application of the Kirkpatrick levels in this manner allows for the stratification and isolation of development, evaluation and interpretation of a course and its recipients to a more granular level than might normally be feasible in a typical production environment.

The benefit is that it allows serious and conscious thought about applying theoretical educational ideas in a corporate environment, which does not produce education as a product, but rather uses education to further the business model.

Evaluation plans such as the one described here should serve as a best-approach example using a corporate model of training production, delivery and evaluation, with the training goal being the increase of sales and improvement of service. Though the Kirkpatrick levels may be applied as distinct evaluations, such distinctions are not necessarily absolute. For example, the distinction between organizational results and ROI would most likely become blurred in any ongoing evaluation and discussion of the effectiveness of the program during post-review stages. Evaluation is mutable based on the needs of the business process.

The greatest benefit achieved from this type of examination and exploration is a clearer understanding of the accepted tools available and their current interpretations and limitations, in order to build organization-specific tools that meet the business process need as accurately as possible.

References

Bradley, C., & Martin, O. (2002). Developing e-learning courses for work-based learning. Retrieved October 22, 2007, from www2002.org Web site: http://www2002.org/CDROM/alternate/703/

Kirkpatrick, D. L., & Kirkpatrick, J. D. (2006). Evaluating training programs. San Francisco, CA: Berrett-Koehler Publishers, Inc.

Kruse, K. (2004). Measuring the total cost of e-learning. Retrieved November 21, 2007, from http://www.e-learningguru.com/articles/art5_2.htm

Rands, A. (2007). Extending the half-life of training. Training Journal, 40-43. Retrieved November 15, 2007, from ABI/INFORM Global database. (Document ID: 1214903871).

Sheperd, K. (1999). Evaluating Online Learning. Retrieved November 20, 2007, from Fastrak Consulting Web site: http://www.fastrak-consulting.co.uk/tactix/Features/evaluate/evaluate.htm

Smith, M. (2001). 360-degree assessment: To see ourselves as others see us. Training Journal, 26. Retrieved November 15, 2007, from ABI/INFORM Global database. (Document ID: 75379408).


