9

Course Evaluation and Improvement


Study to show thyself approved.
II Timothy, 2:15

 

9.1  INTRODUCTION

We now turn from student assessment to course evaluation. In its most literal sense, evaluation is judgement of something's worth. Measurement, of course, is part of the process of carrying out an evaluation, but there are two other essential ingredients. First, evaluation involves making value judgements about what constitutes 'good' or 'bad' in our courses and teaching. Second, evaluation itself has little purpose unless it is to be followed by adequate review of the findings and appropriate action where necessary.

In this chapter, we show how evaluation methods can be used for a variety of related purposes: to diagnose course problems, to identify the reasons for the success of particular courses and teaching methods, to guide course innovations and to demonstrate course effectiveness. Our aim is to explore some of the more useful evaluation methods currently available, provide practical guidance on how to implement them, report on how they have been applied in geography courses and draw out broader implications wherever appropriate. In doing so, we positively endorse the value of evaluation exercises in promoting opportunities for effective student feedback and, in turn, the health of teaching, advocating a culture which sees evaluation as beneficial rather than threatening.

The methods that we discuss in this chapter will be grouped according to the objects that can be evaluated, such as courses, teaching methods and materials, and teachers. There are many other ways of thinking about what can be evaluated. For example, the pioneering Association of American Geographers study on college evaluation (Hastings et al., 1970) identifies the primary objects of evaluation as the context, content, process and outcomes of instruction.

 

Table 9.1  The objects, methods and personnel of evaluation

                     Who
What                 Students   Teachers   Peers   Outsiders

Lesson               D          OIJ        O       O
Module               QD         JG         I       I
Course               QD         G          I       I
Teaching method      -          OJT        OT      OT
Teaching material    -          OJT        OT      OT
Teacher              Q          J          OI      OI
Student              J          G          D       -
Department           DQI        DJ         -       DO

J, journal; Q, questionnaire; O, observation; I, interview; D, discussion; G, grades analysis; T, testing

Inevitably, there is a degree of overlap between the categories that we have chosen; course evaluations, for example, usually include judgements on both teaching methods and teachers. Nevertheless, we hope that our framework matches the common experience of our readers. Within each section, we describe methods that are available for evaluating a particular 'object', and we shall identify the people who are best qualified to carry out the evaluations. The relationship between the objects, methods and personnel of evaluation is summarized in table 9.1.

 

9.2  EVALUATING COURSES

Student questionnaires

It's no good a teacher claiming that his or her teaching is successful if the students don't corroborate the claim. (Walford, 1979: 54)

Examining students' feelings about courses provides a useful starting point for our exploration of evaluation methods. Fox and Wilkinson (1977: 67), for instance, claim that students' feelings about a course are important, since, if they dislike it 'they are unlikely to be as well-motivated as when they are generally receptive to the ideas and approach of the course'. In some countries, direct student feedback constitutes a major component of higher education course evaluation (Flood Page, 1974): indeed, in the USA, it is frequently the only form of evaluation used (Cranton and Smith, 1986). One way of acquiring course feedback from students is to ask them to fill in a loosely structured form of the type illustrated in figure 9.1. By this means, students can be asked to comment both on the delivery of a course (i.e. the teaching performance and the course content) and on what they feel they gained from it. This type of form is useful for picking up unforeseen issues and problems. Because of its simplicity and brevity, it can be used to good effect at convenient points in a course, such as at the end of a 4-week sequence of lectures on a particular topic or after a 6-week block of practical classes. Used in this way, the form allows areas of weakness in a course to be pinpointed and rectified while the course is still being taught.

Figure 9.1  An informal student evaluation feedback form

 

At the end of a course, a more detailed and structured questionnaire can be issued which addresses all aspects of the course, including content and structure, teacher performance and attitude, student learning gains and problems. Many such questionnaires can be purchased from educational suppliers, especially in North America, or can be designed by teachers themselves or by the college's educational methods unit (if there is one). Custom-designed questionnaires will ideally build on the results of earlier student feedback through less formal means, with Doyle (1975) providing a guide to some of the key technical issues (especially reliability, validity, generalizability and utility) that are involved in the design of effective student ratings. Some colleges store material for a comprehensive questionnaire on a word-processor, allowing tailored versions to be run off to meet the particular needs of individual courses.

However the questionnaire is produced, students are typically presented with a series of questions to which they are asked to provide a numerical rating, usually in the form of a standard 5-point scale. Where student attitudes are being elicited, it is common for questions to be phrased as statements, to which students are asked to record the extent of their agreement or disagreement (e.g. 'the teacher is enthusiastic about his or her subject' or 'the course helped me to think through contemporary issues more clearly'). Such questionnaires are commonly divided into sections dealing with particular aspects of a course, and figure 9.2 illustrates a typical example (see also Hastings et al. (1970) for examples of questionnaires designed for geography courses and Gibbs and Haigh (1983a,b) for a discussion of the principles behind their design). It is also useful to include in such questionnaires broader summary questions such as 'overall, how would you rate this course?' or 'would you recommend this course to next year's students?'
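As a rough illustration of how ratings of this kind might be tabulated, the following sketch computes a mean rating and a count of each response for every statement. The responses are invented, the coding of the 5-point scale (1 for strong disagreement, 5 for strong agreement) is our assumption, and the function name `summarize` is purely hypothetical; it is not a description of any particular commercial questionnaire.

```python
from collections import Counter
from statistics import mean

# Invented responses: one dict per completed form, mapping a statement
# to a rating on an assumed 5-point scale (1 = strongly disagree,
# 5 = strongly agree).
responses = [
    {"teacher is enthusiastic": 5, "course clarified issues": 4},
    {"teacher is enthusiastic": 4, "course clarified issues": 2},
    {"teacher is enthusiastic": 5, "course clarified issues": 3},
    {"teacher is enthusiastic": 3, "course clarified issues": 2},
]


def summarize(forms):
    """Return, for each statement, the mean rating and the number of
    students who chose each point on the scale (1 to 5, in order)."""
    summary = {}
    for statement in forms[0]:
        # Skip forms on which a student left this statement unanswered.
        ratings = [form[statement] for form in forms if statement in form]
        counts = Counter(ratings)
        summary[statement] = {
            "mean": mean(ratings),
            "counts": [counts.get(point, 0) for point in range(1, 6)],
        }
    return summary


summary = summarize(responses)
for statement, stats in summary.items():
    print(f"{statement}: mean {stats['mean']:.2f}, counts 1-5 {stats['counts']}")
```

Reporting the full spread of responses alongside the mean is a deliberate choice here: a middling mean can conceal a class that is sharply divided in its views.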

Among the advantages of using structured student questionnaires are that they are easy to administer and computerize, producing clearly tabulated results. There is the danger, however, that students are reduced to merely ticking boxes. Some commercially produced student evaluation forms, for example, restrict students to making soft pencil marks in small oval shapes ready for optical input to a computer. It is essential that this type of exercise be given a human face, that it provides students with an opportunity to reveal their own particular concerns and that it is alert to the many unpredictable consequences of teaching and learning. For this latter reason, the questionnaire should always include 'open' questions in each section, and perhaps a 'free comment' section at the end. This will be of particular value to those implementing innovative courses. The benefits of such feedback are illustrated in Jackson's (1989) account of his teaching of racism in an urban geography course.

Figure 9.2  A formal student evaluation questionnaire

 

The vast majority of course questionnaires are of the checklist variety. They attempt to assess the level of student contentment or dissatisfaction and help to identify unexpected or potential trouble spots, but they rarely probe deeply into student attitudes or examine their approaches to study. One major exception to this rule, however, is provided by the 'Approaches to Studying' and 'Perceptions of Course' questionnaires developed at the University of Lancaster (Entwistle and Ramsden, 1983). These questionnaires, which have become standard instruments for course evaluation in several countries, are concerned with students' approaches to learning. They attempt to distinguish between students who adopt a 'deep' approach as opposed to a 'surface' approach to their study. By applying the questionnaires before and after students take a course, it is possible to determine whether the teaching strategy adopted on the course has had any impact on the study approaches adopted by students.

In the Lancaster system, questionnaire scores are used to quantify particular 'scales', such as whether students are adopting a 'meaning orientation' or a 'reproducing orientation' in their study. This analysis is performed in two stages. First, scores from individual questions are combined in batches of between three and five to build 'subscales', and then scores from three or four subscales are aggregated to form higher level scales. For example, the meaning orientation scale (which identifies those students who intend to understand what they are studying rather than simply to reproduce it) is created from the following four subscales: deep approach, relating ideas, use of evidence and intrinsic motivation. The entire system of scales and subscales, together with the questionnaires used to derive them, is presented by Entwistle and Ramsden (1983).
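The two-stage aggregation can be sketched as follows. Only the scale name and the four subscale names are taken from the account above; the question numbers assigned to each subscale, the answer values and the use of a simple sum at both stages are assumptions made for illustration, and the actual allocation of questions should be taken from Entwistle and Ramsden (1983).

```python
# Hypothetical allocation of question numbers to subscales. The subscale
# names follow the text; the question numbers are invented.
SUBSCALES = {
    "deep approach": [1, 2, 3, 4],
    "relating ideas": [5, 6, 7, 8],
    "use of evidence": [9, 10, 11, 12],
    "intrinsic motivation": [13, 14, 15],
}

# Stage two: subscales that make up each higher level scale.
SCALES = {
    "meaning orientation": [
        "deep approach",
        "relating ideas",
        "use of evidence",
        "intrinsic motivation",
    ],
}


def score_subscales(answers):
    """Stage one: combine individual question scores into subscale scores
    (assumed here to be a simple sum)."""
    return {
        name: sum(answers[q] for q in questions)
        for name, questions in SUBSCALES.items()
    }


def score_scales(subscale_scores):
    """Stage two: aggregate subscale scores into higher level scales
    (again assumed to be a simple sum)."""
    return {
        name: sum(subscale_scores[s] for s in parts)
        for name, parts in SCALES.items()
    }


# One invented student: every question answered 3, except question 1.
answers = {q: 3 for q in range(1, 16)}
answers[1] = 5

subscale_scores = score_subscales(answers)
scale_scores = score_scales(subscale_scores)
print(subscale_scores)
print(scale_scores)
```

Running the same scoring on questionnaires completed before and after a course would give the pair of scale scores needed for the kind of comparison described above.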

Haigh (1986) provides an example of the application of these questionnaires in evaluating an innovative introductory course in physical geography. Rather than teach first year students various disparate elements of physical geography, Haigh organized his course according to the integrating concepts of general systems theory, illustrating these with themes from physical geography. His aim was to encourage students to adopt an integrative approach to their study, and to encourage them to move away from compartmentalized thinking and superficial memorization. It was to test whether these goals had been achieved that the Lancaster questionnaires were used.

The results of Haigh's evaluation were both disappointing and encouraging. Although the evaluation exercise revealed no overall increase in the meaning orientation scale, statistical analysis of the questionnaire results revealed that this general result masked a difference between the non-academic students, who retreated into memorization, and the better students who improved in their ability to think integratively.

We can now offer a number of practical suggestions for implementing student questionnaires.

Teachers new to this type of exercise can find it deeply disturbing to read highly critical comments about their teaching. They may become angry at blunt student comment or retreat behind the defence of 'teacher knows best', but a more dangerous reaction is to choose to ignore issues raised by students. A single barbed comment might well be from the pen of a disaffected student, but a pattern of consistent critical comments has something important to say. One way around the problem of teacher sensitivity is to ask a trusted colleague, perhaps in another department, to analyse the individual forms and to present a summary of their contents, starting with points of appreciation and ending with matters that could be improved. Any personally disparaging comments can be removed.

There are several ways of ensuring that the most effective use is made of student feedback information.

 

Conversations with students

One of the most sobering experiences in teaching occurs when you realize that you have taught a class for an entire year, yet know next to nothing about the progress that individual students may have made or the learning difficulties that they may have encountered during that time. If course evaluation is to be at all meaningful, then it must come to grips with this problem and try to assess the functioning of a course in terms of the students' personal experiences.

Unfortunately, the way that higher education is organized does not make it easy to monitor individual students closely. Teachers often meet a group of students only once a week, making it difficult to identify the specific learning difficulties or course-related problems of individuals. This contrasts strongly with early stages of school education, where groups of pupils are taken through the entire year by a single teacher and where the progress of each child can be monitored almost continuously.

Teachers can identify the problems that individual students are encountering on particular courses simply by listening to what they have to say (Fink, 1977). Guidelines for this form of evaluation include the following.

 

Small group discussions

Student opinion can also be acquired by setting time for informal but structured discussions with groups of students. One way of doing this is to hold a 'round-table' discussion, similar to a seminar, but where the topic for discussion is the course itself rather than some element of geography. The mechanics are as follows.

An alternative approach is to meet with the entire class, and adopt the 'pyramiding' style of discussion advocated by Gibbs (1982).

This approach ensures that individual students have time to think about their personal reactions before being immersed in group discussion. It also ensures that minority views on a course are given less attention than those held by a majority of students. Jenkins and Youngs (1983) used structured discussions of this kind in their 10-week course on film in geography, as part of a strategy of evaluating a course before, during and after it was run.

 

                      Strengths, good points    Weaknesses, bad points

About the course

About the teaching

About yourself

Figure 9.3  Student reaction sheet for classroom discussion

 

Course open house

The evaluation methods discussed so far are most appropriately directed at individual courses and are largely student based. We turn now to an evaluation method that is more appropriately aimed at an entire degree programme, and involves teachers as well as students.

The idea is simple. As many students and their teachers as is practicable meet together in a room for a fixed period of time and air their views, put questions to each other and exchange comments and suggestions. Although this may sound impractical, chaotic and liable only to generate hostility, these impressions are usually dispelled after the first experimental meeting. Experience suggests that these group discussions can perform several useful functions.

To get the best out of these meetings, the following ground rules need to be observed.

This type of event must not be overdone: one or two such meetings a year are probably ample. The final requirement is a procedure for taking up the issues raised in the open house session, for identifying which of the problems discussed need to be examined further, for generating solutions to significant problems and for putting these solutions into effect. As with other diagnostic evaluation methods, action should be seen to be taken as a result of these meetings, and feedback to all staff and students must take place.

The course evaluation methods discussed in this section have ranged across a broad spectrum: non-interactive feedback from individual students, one-to-one and one-to-many discussions between students and their teachers, and many-to-many meetings of everyone associated with a course. Each of these methods involves listening to what learners have to say and, together, they can tell us a great deal about our courses. In putting them forward, we fully recognize that student evaluation causes unease, especially for those who work in systems where such evaluation is not common. We also recognize that it is commonly argued that evaluation is unnecessary, because its function is fulfilled by other mechanisms, or that students cannot evaluate a course properly since they only have narrow objectives and, as a consequence, their views on such issues as course content, level of difficulty, study methods and the like will be coloured by short-term objectives.

While taking these points seriously, we would argue that, in practice, both sets of objections are primarily defensive and contrary to experience. The widespread acceptance of student evaluation in North America would certainly suggest a large measure of support in higher education for the idea of listening to the views of those who are the immediate 'consumers'.

However, one further objection is raised, namely that student-centred course evaluations only tell us what students think that they have learned from a course, or how successful they feel that the course has been. Critics claim that the acid test is how the students actually perform when assessed in these courses. This criticism, which centres around what assessment results can tell us about course quality and how those results can be analysed to derive this information, needs discussion in its own right.

 

Using assessment information for evaluation

The nature of assessment was discussed in the previous chapter, but here we explore the view that 'assessment can be regarded as a way of testing [course] effectiveness, that is as a diagnosis of faults.... It is a means to an end, the end being the improvement of the curriculum' (Humphrys, 1978: 82). However, while assessment is an indicator that things may or may not be well with a course, it will probably only provide diagnostic information at the extremes. In other words, it may help to identify excellent courses and very bad courses, but will fail to say anything significant about the majority of courses in between. Examination results, in particular, will at best only provide a starting point in the evaluation of courses; they will have little to say about how to improve the curriculum and teaching (Torrance, 1986). Certainly, the following reasons would suggest that it is a mistake to rely exclusively on such information as an indicator of the success or failure of courses.

Despite these limitations, assessment information should not be entirely neglected for evaluation purposes because, first, it is regularly available for virtually every course and, second, its periodic availability can provide an appropriate stimulus for teachers to explore ways in which they can improve their courses. Therefore the question is: How can we make the most effective use of assessment information for course evaluation purposes?

One readily apparent approach is to compare the results gained on the constituent course units in a degree programme. An example is reported by Moffatt (1986), who evaluated his course in computer modelling of environmental systems by comparing each student's assessment on this course with their overall degree results. The high degree of correlation between the two, over a period of 4 years, was held to indicate that the students' results in the modelling course were broadly in line with their overall ability as indicated by their final degree classification.
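A comparison of this kind can be sketched with a simple correlation coefficient. The marks below are invented, and the choice of Pearson's product-moment coefficient is our assumption; the account above does not say which measure of correlation Moffatt used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Invented percentage marks for six students: the mark on the course
# being evaluated, and the overall mark across the degree programme.
course_marks = [52, 58, 61, 65, 70, 74]
overall_marks = [50, 57, 60, 66, 68, 75]

r = pearson_r(course_marks, overall_marks)
print(f"correlation between course and overall marks: {r:.2f}")
```

A coefficient close to 1 indicates only that the course ranks students in much the same way as the degree as a whole; it says nothing about whether the marks on the course are systematically higher or lower than elsewhere.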

Geography teachers at Carleton University adopted a different approach. They hoped that student exposure to an innovative self-paced mastery course in the first year would create improved standards of learning, and that these would be reflected in improved grades awarded for subsequent courses in the second year (Fox and Wilkinson, 1977). They examined student grades to see if this was the case, and in an ensuing course review (Fox et al., 1987) they also compared the grades of students taught in the traditional way with those taught by the new approach. However, the authors urge caution in attempting such comparisons if, as in their case, there are significant differences not only in the teaching methods used, but also in the instructors, course materials, assignments, examinations and markers. We shall return to this issue a little later when we examine controlled tests of teaching methods.

 

Timing

Before leaving our discussion of course evaluation methods, it is important to consider one further issue, namely, the question of timing. In much the same manner as for assessment (see chapter 8), an end-of-course timing is generally accepted as the norm, but there are also other times at which evaluation can usefully be carried out.

'Pre-course' evaluation  The 'brainstorming' technique described in chapter 5 can effectively be used to discover the amount of pre-knowledge of the subject to be covered by the course, and to elicit students' preconceptions and expectations of the course. This information can be used to modify the course content and orientation before it begins, and can also serve as a baseline against which information from later evaluations can be compared. This type of evaluation was successfully adopted on an economic geography course at the University of Chicago (Fink, 1973).

'Mid-course' evaluation  An evaluation undertaken half-way through a course provides findings which can be used to improve that course to the benefit of the current cohort of students. Mid-course evaluations were used in the exercise reported by Fink (1973), and also in the film-based geography course described by Jenkins and Youngs (1983). In the latter course, a mid-course evaluation revealed that students saw little connection between the course themes and geography per se. As a result, the content of a subsequent class was changed, and a simulation exercise was constructed to get over this difficulty.

There is a major issue of principle here. Each group of students that takes a particular course has a distinctive ethos and its own special mix of learning styles, problems and abilities. Consequently, it may be far more sensible to use the results of a mid-course evaluation to improve the course for the current cohort of students than to use the findings from an end of course evaluation to modify the course for subsequent student groups.

'One year later' evaluation  Here students are asked to evaluate a course that they had completed a year earlier. This can be useful because it tends to pick up broader issues about course effectiveness and can identify those elements of a course, if any, which have had a lasting influence on students. This technique can be used to monitor individual course units by interviewing final year students about their first year courses, and can also be used to evaluate an entire degree programme by interviewing students a year after they have graduated. The latter exercise can be implemented when graduates visit the institution for a degree awards ceremony or through a mailed questionnaire distributed by an alumni association. This type of evaluation is frequently able to provide information on the value of a course as a preparation for life or, more narrowly, for subsequent employment. A complementary approach, illustrated by Unwin (1986), is for academics to ask employers for their opinions of former students and to use this information to re-orient courses.

 

9.3  EVALUATING TEACHING

There are several ways in which teaching and teaching methods can be evaluated. We shall begin our survey with a naturalistic method, the teaching diary, before moving on to consider systematic data-gathering exercises and then rigorously controlled evaluation experiments.

 

The teaching diary

Most teachers emerge from a class with clear feelings about how it went. The teaching diary or journal is one way in which these reflections, comments and ratings can be recorded for diagnostic use. Several items can be recorded in a teaching diary about each class: the general context of the class (time, date, teaching methods used, subject matter covered); comments on specific teaching issues (e.g. general student group attitudes or problems raised on a particular topic); teaching problems that occurred and their possible causes; suggestions on how the class might be improved next time round.

It is tempting to use a teaching diary to record gut reactions to each class: 'That hour just dragged - none of them said anything'; 'I rather enjoyed that. I covered everything I intended to say, and they all seemed interested for the entire hour'; 'That practical exercise just doesn't work. I think I'll abandon it next year.' There are two reasons, however, why comments such as these may not be too helpful as a basis for evaluating your teaching.

First, many of your impressions might not be shared by your students. Therefore, before writing anything down, check out your perceptions of how a class went by talking it through with some of those on the 'receiving end'. For example, your 'excellent' lecture may not have had the positive impact you expected, but your 'flat' seminar or 'chaotic' workshop session might have scored an unexpected hit.

Second, the 'stream of consciousness' type of comment does little to explain what has happened in the classroom and why. It is far better to use the diary to record critical analyses of your teaching, rather than let it become a repository of broad unexamined value judgements. The most useful comments are those which point more clearly to causes and possible solutions to classroom problems, or that attempt to pin down why a particular approach seems to have worked.

The three principal advantages of keeping a teaching diary are its immediacy, with its contents undistorted by hindsight, its richness of data and its encouragement of a generally reflective approach towards teaching. These can be used directly as the basis for improvements in teaching, identifying, say, the amount of material included in teaching sessions, the way ideas are sequenced in a class, the most appropriate mix of teaching methods and the 'sticking points' at which students most frequently experience learning difficulties. If all else fails, and you are stuck for a remedy, follow Unwin's (1984) example and write up your reflections for a geographical or educational journal. You may be pleasantly surprised at the helpful advice that other readers are able to offer.

 

Systematic observation

Systematic observation of classroom behaviour can furnish valuable information on which to base improvements in teaching. It can be carried out by teachers themselves when students are engaged in activities such as simulations or practicals, or undertaken by a colleague during activities when the teacher is more centrally involved. The data produced depend on the type of observational method that is being employed. The observer, for example, may just be recording simple frequency counts (e.g. how many times students appear to be engaged in a particular desired activity) or timing activities to build up detailed 'activity budgets' for individual students or small groups of students who are working together.
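As an indication of how timed observations might be turned into activity budgets, the sketch below converts an observer's log into the proportion of observed time that each group spent on each activity. The groups, activity labels and timings are all invented, and the function name `activity_budgets` is hypothetical.

```python
from collections import defaultdict

# Invented observer's log: (who was observed, activity, minutes).
observations = [
    ("group A", "discussing task", 12),
    ("group A", "using map", 18),
    ("group A", "off task", 10),
    ("group B", "discussing task", 25),
    ("group B", "using map", 10),
    ("group B", "off task", 5),
]


def activity_budgets(records):
    """Return, for each student or group, the share of observed time
    spent on each activity (shares sum to 1 for each student or group)."""
    minutes = defaultdict(lambda: defaultdict(float))
    for who, activity, duration in records:
        # Repeated entries for the same activity are added together.
        minutes[who][activity] += duration
    budgets = {}
    for who, by_activity in minutes.items():
        total = sum(by_activity.values())
        budgets[who] = {
            activity: spent / total for activity, spent in by_activity.items()
        }
    return budgets


budgets = activity_budgets(observations)
for who, budget in budgets.items():
    shares = ", ".join(f"{activity} {share:.0%}" for activity, share in budget.items())
    print(f"{who}: {shares}")
```

Expressing the budget as proportions rather than minutes makes it easier to compare groups that were observed for different lengths of time.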

An alternative method of undertaking detailed classroom observations is to use a video recorder. Video-taping has one major advantage over standard observation methods: the recording can be viewed over and over again in order to decide which activities are really going on in the classroom. As many colleges have a video-recording service on campus, and some have consultant services to help with educational analysis of tapes, this type of approach need not be as daunting as it might first appear.

Video-recording can also be employed to provide immediate feedback on teaching and learning behaviour in order to improve classroom interaction. This was attempted by Hollis and Terry (1977) when introducing small group discussion sessions into a geography practical course in the University of London. In their experiments, a selection of sessions were recorded and played back to the groups involved. This helped both the teacher and the students to clarify the previously unclear objectives of the discussion sessions, and further review of the tapes helped the teacher to establish more effective ways to begin the discussions. It was also found that playing the tapes encouraged the participants to feel a greater sense of shared responsibility for making a success of subsequent discussion sessions.

 

Student interviews

Interviews with students can provide complementary information to that gathered by classroom observation. Students can be interviewed at various times and for a variety of purposes. They can, for example, be interviewed before they engage in a teaching or learning activity in order to identify their expectations and attitudes. They might be interviewed again during teaching or learning activities to elicit perceptions of what they are learning. Finally, students can be interviewed after teaching has been completed in order to check on their learning experiences and to identify possible learning outcomes.

Evaluation interviews should largely consist of open-ended questions, and should not be dominated by pre-coded questions that are normally found in self-completion questionnaires. The object of the exercise, above all, is to elicit the unexpected, not to confirm the teacher's prior expectations. For this reason, and also to encourage greater frankness in student responses, it is recommended that evaluation interviews are carried out by those not directly involved in teaching the course.

Several examples of the use of student interviewing to gauge the effect of geography courses have been published in the literature. Jenkins and Youngs (1983), for instance, used student interviews alongside structured questionnaires to determine student perceptions of a geography course on film. Interviewing a sample of students is highly recommended whenever a new approach to teaching is being adopted, and is especially useful for evaluating the first-time use of new teaching materials.

 

Controlled tests

To many educational researchers, evaluation is essentially a form of hypothesis testing. It should not come as a surprise, therefore, to find that the entire paraphernalia of the scientific method has found its way into educational evaluation. One of the fundamental questions asked in evaluation studies is: 'How well does this particular teaching method perform its intended role?' Converted into a testable hypothesis, this can be re-expressed in the following way: 'Teaching topic A by method B will produce outcome C in student group D.' The standard scientific approach to testing such hypotheses is to establish students' existing knowledge before teaching takes place, by means of a pre-test, and then to identify, by means of a post-test carried out after the teaching has been completed, whether the intended learning has been achieved.

This method goes beyond the rather impressionistic approach of the teaching diary or the loosely structured approach of classroom observation and student interviewing. Evaluating teaching by means of controlled tests attempts to glean 'hard' and unequivocal evidence about what happens as a result of using a particular teaching method. The general procedure for applying this approach has been developed from experimental science, and involves the principles and methods of statistically grounded quantitative research. The six steps to be taken in carrying out a formal test of a teaching method are as follows.

  1. Specify the aims and objectives of the teaching.
  2. Adopt teaching method(s) that you believe may best achieve these aims and objectives.
  3. Design pre-test questions to determine the prior knowledge of the students, and design post-test questions to measure subsequent attainment of the objectives.
  4. Carry out the teaching and implement the tests. If appropriate, monitor and record student performance during the teaching.
  5. Compare the results of the two tests statistically in order to measure the extent of student learning gains. Draw suitable conclusions.
  6. If the learning objectives are not attained, modify or replace the chosen teaching method(s).

Stating the learning objectives required from your teaching may not be as straightforward as it seems. Bloom et al. (1971: see also chapter 8), for example, suggest that a learning objective should not be a statement of the content matter to be covered (e.g. glacial landforms or urban ghettos), despite the frequent appearance of such statements in course syllabuses. Nor should it be a statement of what the teacher intends to do (e.g. to demonstrate the skill of using a soil auger or to provide an overview of government industrial location policy). Rather, a learning objective should focus on the way in which it is intended that the students should change in response to the teaching or learning activity. Although objectives do not necessarily have to be highly specific, they are usually more helpful if clearly and unambiguously stated and testable by the chosen evaluation procedures. Aspects of learning objectives are explored further in chapter 10 on curriculum design.
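The statistical comparison called for in step 5 can be sketched as follows. The scores are invented, and the choice of a paired t statistic on each student's gain is our assumption; the procedure outlined above does not prescribe a particular test, and other tests may suit small or skewed samples better.

```python
from math import sqrt
from statistics import mean, stdev


def paired_gain(pre, post):
    """Return the mean gain (post minus pre) and the paired t statistic
    for the same students tested before and after teaching."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need matched scores for at least 2 students")
    gains = [after - before for before, after in zip(pre, post)]
    mean_gain = mean(gains)
    spread = stdev(gains)  # sample standard deviation of the gains
    if spread == 0:
        raise ValueError("t is undefined when every student gains the same amount")
    t = mean_gain / (spread / sqrt(len(gains)))
    return mean_gain, t


# Invented scores out of 20 for six students, listed in the same order.
pre_test = [8, 10, 7, 12, 9, 11]
post_test = [12, 13, 9, 15, 14, 13]

mean_gain, t = paired_gain(pre_test, post_test)
print(f"mean gain {mean_gain:.2f} marks, t = {t:.2f} "
      f"with {len(pre_test) - 1} degrees of freedom")
```

The t value is compared with the tabulated critical value for the stated degrees of freedom (the number of students minus one). A significant gain shows only that scores rose between the two tests, not that the teaching method was the cause.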

Setting up suitable pre- and post-tests presents additional problems. An acceptable test should include a reasonably large number of test items, and these should be chosen so as to probe those aspects of learning that are stated in the evaluation objectives. A number of standard tests are commercially available for 'mainstream' curricula such as mathematics and languages, but few are available for geography (see, however, Marsden, 1976).

A third problem concerns the validity of test results. Even if testing suggests that students have changed in the way intended, can this be attributed to teaching alone? For example, an evaluation of a course on natural hazards might reveal that students had grasped the mechanisms involved in situations of environmental damage from human activities, yet that understanding might have been due as much to a major television documentary on pollution that was broadcast during the trial period as to the teaching methods used. A more general threat to the validity of test findings comes from the testing itself. It is well known, for example, that the experience of doing pre-tests can sometimes help students do better on the matched post-test and can alert them to the content and expectations of the course. What this means is that in order to make a valid claim for the efficacy of your teaching, you have to ensure that all such rival explanations are eliminated.

Cooke et al. (1980) used the testing approach to determine whether their use of the Visual Response System to teach basic map skills was effective. To the usual pre-test and post-test they also added further 'post-checks' at 2 and 6 weeks after completion of their teaching. The runs of scores across the four sets of tests indicated successful acquisition of the skills taught by this particular method.
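The statistical comparison called for in step 5 of the procedure can be sketched in a few lines of Python. This is only one possible approach, assuming that each student sits both tests and that a paired t statistic on the individual gains is an acceptable summary. The function name paired_gain and the scores are hypothetical and are not taken from any of the studies cited here.

```python
import math
import statistics


def paired_gain(pre, post):
    """Return (mean gain, t statistic, degrees of freedom) for matched scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post scores")
    # Gain for each student: post-test score minus pre-test score.
    gains = [after - before for before, after in zip(pre, post)]
    mean_gain = statistics.mean(gains)
    sd = statistics.stdev(gains)  # sample standard deviation of the gains
    if sd == 0:
        raise ValueError("all gains are identical, so t is undefined")
    t = mean_gain / (sd / math.sqrt(len(gains)))
    return mean_gain, t, len(gains) - 1


# Hypothetical scores for eight students (illustration only).
pre = [8, 11, 9, 14, 10, 12, 7, 13]
post = [12, 15, 10, 18, 13, 15, 11, 16]

mean_gain, t, df = paired_gain(pre, post)
print(f"mean gain = {mean_gain:.2f}, t = {t:.2f}, df = {df}")
```

The t value would then be compared with a critical value from standard tables for the stated degrees of freedom. A large value shows only that scores rose between the two tests. It does not by itself rule out the rival explanations discussed above, such as outside influences or the practice effect of the pre-test.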

Not all teaching methods are susceptible to evaluation by formal testing. In particular, new teaching methods and materials cannot always be evaluated simply by applying pre-tests and post-tests. This is because the full range of learning effects to be achieved with these methods and materials is often not yet known. Evaluation in these circumstances therefore needs to take a different tack, aiming to identify the learning outcomes that can be achieved, rather than to measure the match between expected and actual learning gains.

 

9.4  EVALUATING TEACHING MATERIALS

Many geography teachers use a mixture of resources in their teaching (e.g. overhead and slide projectors, video and film, computer software), and prescribe other materials for use by students (e.g. resource-based learning materials, textbooks, programmed instruction). There are several aspects to evaluating these materials, including the decision to adopt, evaluation of materials during their use and the formative evaluation used during the design of new teaching materials. (Formative evaluation, which seeks to form or mould the object being evaluated to bring about improvements, is considered further in chapter 10.)

The process of identifying appropriate materials for adoption on a course is a perennial concern, and is one of the most common faces of evaluation, but how are educational resources best evaluated for potential use in teaching? Most educational journals support a third-party evaluation industry, in that the 'reviews' or 'resources' sections of such publications regularly report the views of practising teachers on textbooks, audiovisual materials, laboratory and field equipment and, more recently, educational software. Yet there are several reasons why such reviews do not always serve the needs of practising geography teachers.

The main conclusion to draw from this is that the only effective evaluations of prospective teaching material are ones that are carried out by the intending users and that focus directly on its suitability for their teaching. We shall now describe some techniques for carrying out such evaluations.

 

Evaluating individual teaching materials

A possible procedure for selecting individual teaching materials can be illustrated by looking at educational computer software. A large amount of educational software has appeared in recent years aimed at teachers of geography at various levels, and poses considerable problems as to what materials will be suitable for use in a course, how they will assist student learning and how they will affect the existing pattern of teaching.

One of the simplest ways round these problems is to run through a checklist of questions, answering each of them in order to arrive at a decision. Table 9.2 shows one such list, based on a form devised by the American educational software clearing house CONDUIT (see also MicroSIFT, 1982; Council of Ministers of Education, 1985). Note particularly how technical subject matter and teaching issues are clearly separated in this list and how the user is encouraged to rate each attribute.

 

Table 9.2  A computer assisted learning evaluation checklist

Criteria                                                        Quality     Importance

Subject matter content
  Definition of key concepts                                    1 2 3 4 5   A B C D
  Discussion of underlying assumptions                          1 2 3 4 5   A B C D
  Validity of theories, principles, techniques, facts           1 2 3 4 5   A B C D
  Guide to relevant literature                                  1 2 3 4 5   A B C D
  Overall quality of subject matter content                     1 2 3 4 5   A B C D

User documentation
  Clarity of presentation                                       1 2 3 4 5   A B C D
  Completeness                                                  1 2 3 4 5   A B C D
  Adequacy of instructions for operating the software           1 2 3 4 5   A B C D
  Documentation for different users (e.g. teachers, students)   1 2 3 4 5   A B C D
  Worksheets and other teaching materials                       1 2 3 4 5   A B C D
  Consistency with the accompanying program                     1 2 3 4 5   A B C D
  Overall quality of documentation                              1 2 3 4 5   A B C D

Educational support
  Ease of integration with teaching and learning styles         1 2 3 4 5   A B C D
  Potential for improving grasp of principles and theories      1 2 3 4 5   A B C D
  Potential for improving facility with methods and techniques  1 2 3 4 5   A B C D
  Potential for improving retention and recall of knowledge     1 2 3 4 5   A B C D
  Overall quality of educational support                        1 2 3 4 5   A B C D

Student motivation
  Potential for capturing student interest                      1 2 3 4 5   A B C D
  Ability to stimulate student creativity                       1 2 3 4 5   A B C D
  Appropriateness for student-centred work                      1 2 3 4 5   A B C D
  Overall quality for student motivation                        1 2 3 4 5   A B C D

Software
  Freedom from errors                                           1 2 3 4 5   A B C D
  Ease of use                                                   1 2 3 4 5   A B C D
  Compatibility with available computer hardware                1 2 3 4 5   A B C D
  Overall quality of software                                   1 2 3 4 5   A B C D

Overall evaluation

Quality ratings: 1, excellent; 2, very good; 3, average; 4, poor; 5, bad. Importance ratings: A, critical; B, important; C, optional; D, inappropriate.

 

This type of evaluation can suggest various forms of action. For example, if an item of educational software is technically incompatible with existing hardware but scores well in terms of its teaching potential, then it might well be worth either purchasing appropriate computer equipment or converting the software to run on available hardware.
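One way of turning a completed checklist into a short list of points for action is sketched below in Python. It assumes the rating scales given with the checklist (quality from 1, excellent, to 5, bad; importance from A, critical, to D, inappropriate). The function name needs_action, the cut-off values and the ratings themselves are hypothetical, chosen only to mirror the example just described.

```python
# Hypothetical ratings for one software package: criterion -> (quality, importance).
# Quality runs from 1 (excellent) to 5 (bad); importance from A (critical) to D.
ratings = {
    "Compatibility with available computer hardware": (5, "A"),
    "Potential for improving grasp of principles and theories": (1, "A"),
    "Ease of use": (2, "B"),
    "Guide to relevant literature": (4, "C"),
}


def needs_action(ratings, worst_acceptable=3, importances=("A", "B")):
    """List criteria that matter to the teacher but were rated poor or bad."""
    return sorted(
        name
        for name, (quality, importance) in ratings.items()
        if importance in importances and quality > worst_acceptable
    )


for criterion in needs_action(ratings):
    print("Needs attention:", criterion)
```

Here only the hardware compatibility criterion is flagged: it is rated critical but bad, whereas the poorly rated literature guide is merely optional. Where the cut-offs should fall is a matter of judgement for the individual teacher.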

 

Comparative evaluation of teaching materials

There is much to gain from evaluating related teaching materials comparatively rather than individually. Salisbury (1981), for example, carried out a comparative review of the subject matter and teaching style of a number of introductory physical geography textbooks. In the USA, the National Council for Geographic Education has published a textbook evaluation form that can also be used for this purpose. A similar approach used for the evaluation of educational computer software is found in the CCCEM model produced by the University of Florida, which provides descriptive ratings of programs on the basis of physical, presentation, instruction and management characteristics (Micceri, 1989). It is claimed that the use of objective ratings provides a reliable and consistent evaluation of software, and avoids some of the vagaries of human judgement typical of many standard published reviews.

At first glance this might appear a somewhat trivial approach to evaluation, but we would suggest that its usefulness is partly due to its simplicity. Moreover, systematically derived summary tables can draw a teacher's attention to gaps in subject matter or educational approaches that the reading of individual books might miss. To make the most of such an evaluation, it is important not to focus exclusively on content, but to consider elements that are specifically designed to help students learn, such as self-assessed questions, data-gathering activities, reading assignments and data response exercises. Above all, it is important for each teacher to consider how easily each book can be adapted for use on their particular course and their own particular teaching style, or whether the adoption of a significant new teaching resource will itself create the need for change.

 

Evaluating teaching materials during their design

Evaluation methods can also be used during the design of new teaching materials to test whether they will perform satisfactorily in use (termed 'formative evaluation' above). In particular, several techniques were pioneered in the UK during the 1970s to provide 'illuminative' information during the design of new curricula (Tawney, 1976).

Whitelegg (1982), for instance, describes the evaluation of video-recordings that were used experimentally in a first year geography practical class. Besides providing information used to improve the video material, this exercise raised a number of significant methodological issues about the evaluation itself, such as how to evaluate a teaching material if it is used in a class by a different teacher from the rest of the course. Watson (1987) reports on the rigorous cycle of evaluation and testing that educational software development teams now build into their design efforts; indeed, in some items of educational software, the evaluation of the program's performance is built into the program itself.

Recent developments in intelligent educational software have taken this a step further. So-called 'intelligent tutoring systems' have been developed that are able to give a continuous evaluation of their own behaviour in relation to student performance, and adjust their instructional strategy accordingly (Sleeman and Brown, 1982; Polson and Richardson, 1987; Wenger, 1987). In such developments, evaluation is no longer a separate activity; it has become an integral part of the design and delivery of the curriculum (see chapter 10).

 

9.5  APPRAISING TEACHERS

This chapter has focused on evaluating teaching and improving courses, but teaching is also about teachers. Although we recognize that there are those who view teacher appraisal as a weapon that can be used against teachers by those in charge of the educational system, we believe that sensitive and sensible teacher appraisal should be a fundamental part of course improvement. There are three principal ways (Braskamp, 1980) in which this can be carried out: self-evaluation, peer appraisal and student assessment.

 

Self-evaluation

Self-evaluation has the considerable advantage of being non-comparative and non-competitive, and of allowing features that are unique to each teacher to come out into the open. One way that this can be achieved is by use of teaching diaries in which the teacher can record self-evaluations (see above). A more formal alternative is to produce a periodic (e.g. annual) written narrative of course objectives, teaching strategies and perceptions of student performance, which can be collated and reviewed at course or departmental level.

Critics argue that self-evaluation will be self-serving at best, and at worst will camouflage poor teachers. Although the research evidence (Braskamp, 1980) suggests that self-evaluations are generally not overfavourable to the teachers who make them, some argue for a method of self-evaluation that produces information capable of being checked for validity.

One such method is for teachers to fill in complementary versions of the course questionnaires distributed to students. The students are asked: 'Have the major objectives of the course been made clear by the teacher?' The teacher is asked: 'To what extent have you made the major objectives of your course clear to students?' Similarly, the student question 'Does the teacher encourage critical thinking and analysis?' is replaced on the teacher's form by 'To what extent have you encouraged critical thinking and analysis?' A comparative analysis of the teacher and student scores on each question will reveal discrepancies between the teacher's intentions and students' perceptions, and indicate areas where further effort is required to ensure that the course delivers what it promises.
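The comparative analysis of teacher and student scores can be sketched as follows in Python. The sketch assumes that both forms use the same five-point scale, with 5 meaning 'to a great extent', and that a gap of one scale point or more is worth investigating. The scale, the threshold, the question labels and the scores are all hypothetical.

```python
from statistics import mean


def discrepancies(teacher, students, threshold=1.0):
    """Return {question: gap} where the teacher's self-rating differs from the
    mean student rating by at least `threshold` scale points."""
    flagged = {}
    for question, self_score in teacher.items():
        gap = self_score - mean(students[question])
        if abs(gap) >= threshold:
            flagged[question] = round(gap, 2)
    return flagged


# Hypothetical responses (illustration only).
teacher = {"objectives_clear": 5, "critical_thinking": 4}
students = {
    "objectives_clear": [3, 4, 3, 2, 3],
    "critical_thinking": [4, 4, 5, 3, 4],
}

print(discrepancies(teacher, students))
```

A positive gap indicates that the teacher rated the item more favourably than the students did. In this illustration only the clarity of course objectives is flagged, which would point to that aspect of the course as the place to direct further effort.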

 

Peer appraisal

Teaching is an unusual occupation in that the majority of workers rarely perform in the presence of their peers. Even on team-taught courses, most teachers work alone with their students throughout their entire career. When they do perform in public, as on geographical field courses, there is rarely a formal mechanism for one colleague to evaluate another's performance with a view to seeking improvements. By comparison, students who participate in regular seminars and tutorial groups are constantly subjected to evaluative criticism from their peers. What, then, can be done to encourage peer evaluation of teachers and what methods are available?

The most direct approach is for one teacher's classroom practice to be observed by a colleague, but peer evaluation can also involve the examination of a teacher's lesson plans, course handouts, coursework assignments, grading methods and evaluation arrangements. These methods are widely used in the training of school teachers and during probationary periods for new teachers. They are also used by the United Kingdom's educational inspectorate during their periodic visits to review courses or departments (Jenkins and Smith, 1990). A similar approach is taken in those North American schools which employ an educational consultant as an in-house evaluator and counsellor.

As to who should carry out a peer evaluation, we feel that while expert observers can make a significant contribution to both evaluation and improvement, observation by one's own colleagues can also be highly beneficial. An incidental benefit of being evaluated by one's colleagues is that information about non-conventional methods used by innovative teachers on a degree programme can become more widely known and practised. One of the largest problems to be overcome in peer evaluation is fear of exposure to one's colleagues. This fear can be reduced by adopting one or more of the following tactics.

Several options exist for teachers who remain uneasy about appraisal by their immediate colleagues. One is for the head of department to carry out the appraisal, perhaps as part of a regular cycle of staff development (Brunn, 1990). Another is for the appraisal to be undertaken by a single external assessor, either from a neighbouring department or from a geography department in another institution. Hastings et al. (1970), for example, describe interviews of geography teachers by trained staff from the Centre for Instructional Research at the University of Illinois. They found that interviews took less time than regular self-evaluation but yielded more information than filling out a questionnaire. Finally, as Lawton (1986) indicates, the assessor's role can be carried out by external examiners, alongside their main role of scrutinizing grades awarded to students by internal departmental examiners. The latter approach could serve to forge a link between the evaluation of courses based on a study of their outcomes (i.e. student grades) and evaluations based on inputs (i.e. teacher performance).

Finally, what aspects of a teacher's performance does peer evaluation best illuminate? Braskamp (1980) suggests that evaluation by one's peers is most appropriate for judging a teacher's professional competence and the extent to which a scholarly orientation is being shown towards the subject matter being taught.

 

Student assessment of teachers

The idea that student views on how we teach should be a major part of evaluation exercises is not always easy for teachers of geography to accept. However, North American practice has long been to use student opinion in the appraisal of teachers and the experience does not support the worst fears of those suspicious of such a system. Surveys show that about 80 per cent of US institutions use student evaluations for staff development purposes (Centra, 1977). Student rating of instruction is often concerned with teacher characteristics such as skill, dynamism, fairness, enthusiasm, sense of humour, openness, approachability, availability, appearance and interest (see Dowell and Neal, 1982; Kyriacou and McKelvey, 1985).

According to Braskamp (1980), the three commonest methods of collecting student information on teachers are responses to a set of alternative fixed items in survey form, written comments on open-ended questions and verbal comments in a semi-structured interview conducted by an independent third party. The choice of data collection method does not seem to matter much if an overall rating of an instructor is all that is required. However, the methods vary in the amount of diagnostic and feedback information that they provide, and in the manner in which information is communicated back to the instructor.

We have already discussed the use of feedback forms, interviews and questionnaires in our earlier section on course evaluation; most of the principles discussed in these contexts apply equally to student assessment of teachers. Fink (1985) provides an example of the types of question that geography students can be asked on a structured questionnaire based on the IDEA course evaluation system developed at Kansas State University. This asks students to rate what they have learnt on a course. The teacher abilities about which students can be questioned are as follows:

General
Knowledge of the subject
Attitude towards teaching
Ability to design courses
Desire to continue learning about teaching

Particular abilities
Ability to make objectives clear
Establishes good relationships with students
Effectively communicates course content
Uses particular teaching techniques effectively
Generates enthusiasm
Provides frequent and useful feedback to the students
Changes the teaching approach when appropriate
Provides intellectual leadership
Constructs good tests

Summing up, the main lessons to draw from this discussion of teacher appraisal are as follows:

 

9.6  CONCLUSION

The previous discussion of teacher appraisal points to the wider dimensions of evaluation beyond mere concern with a particular teaching method, an individual class with students or a new item of educational technology. It is also possible to go beyond teacher appraisal towards a holistic appraisal of the total teaching system, including evaluation of the complete degree programme, the department or even the institution as a whole. While a large literature exists on this subject (e.g. Parlett, 1977; Moos, 1979; Rutherford, 1987; Kogan, 1989), its substance lies outside the terms of reference of this chapter or, indeed, this book.

We have endeavoured here to consider the evaluation of several features in the educational system: course units, individual classes, teaching methods and materials, and appraisal of the teachers themselves. We have described a number of complementary evaluation methods, and have discussed the issue of who best undertakes evaluations and when. Throughout, our main concern has been with the use of evaluation to improve how we help students to learn. In doing so, we have tried to show that evaluation need not be complex and time consuming to be useful. It is now time to draw out some general conclusions from the material that we have presented. The key ideas can be summarized in the form of a handful of precepts.

We offer the following suggestions that you may consider for further action.

 

KEY READING

The Encyclopaedia of Educational Evaluation (Anderson et al., 1975) provides a valuable overview of the concepts, methods and terminology of educational evaluation. This contains more than 140 cross-referenced articles, each of which provides a guide to further reading. Murphy and Torrance's (1987) Evaluating Education brings together classic material on evaluation issues and methods. Dressel (1976), Doyle (1983) and Tuckman (1985) are textbooks that supply useful overviews of the basic issues. Bloom et al. (1971) provide a comprehensive survey of evaluation methods which adopts the objectives testing tradition most highly developed in North America. A broader, but no less comprehensive, compilation of techniques is provided by Patton (1990).

For those who require a historical review of the changing emphases in modern evaluation, Curriculum Evaluation by Hamilton et al. (1977) can be recommended. The early 1970s switch to qualitative approaches is well illustrated by Tawney (1976). This essentially British view is complemented by the North American perspectives of Eisner (1985).


Pages created 20 May 1999
Pages created by Heidi Meehan and Phil Gravestock