James Lawley and Penny Tompkins
Evaluation is a kind of ‘behind the scenes’ process we all do regularly. We are constantly and informally evaluating people's behaviour against our own internal (often subconscious) standards. What makes a formal assessment 'formal' is that the standards and the process of assessing are known and, hopefully, well defined.
A different kind of evaluation happens when we make an assessment of another person's evaluation. To do so we need to take into account their means of evaluating, which may not be the same as ours. How accurate – or not – are we at calibrating another person's evaluation?
We devoted the December 2011 Developing Group to Clean Evaluative Interviewing. The aim on that day was to learn how to use Clean Language as a research interview method when the topic being researched was how people evaluate an experience.
Dr. Susie Linder-Pelz and James have recently concluded an academic research project in which six coaching sessions were evaluated from three perspectives: by the coach, the client and an expert-assessor.*
At the 2nd August Developing Group James updated the group on the findings of the research, and we explored how we can individually and collectively make use of the conclusions. In particular we experientially investigated:
- As a coach, how aware are you of how your client and an expert would evaluate a coaching session?
- Does knowing your client's and an expert's opinions affect your own evaluation?
Calibration and Evaluation
Over the years we have approached the topic of calibrating in different ways.
We have long noticed that when people on a training course are asked to evaluate a practice coaching session, they often give an answer which differs wildly from the opinion of the client and/or us as expert observers.
For example, one coach said a session was “catastrophic”, while the client said “I got some useful insights and lots to think about”. James, who was observing, said to the coach, “You did what the activity called for. The client got what they asked for with their desired outcome. A more direct approach might have got to the meat earlier, and even so, you and they now have a lot more of a landscape to work with and a good basis for the next session.”
When the coach was asked what their evaluation of the session was now, having heard the opinions of the client and the expert, they said, “Well, I’m pleased the client got something out of it and I still think it was catastrophic.” We wonder what scale the coach was using to evaluate their effectiveness, and what they would have labelled a much worse session! (See The Importance of Scale)
Our modelling of excellent facilitators (not only those who use Clean Language) showed that a key skill was the ability to calibrate the experience of the client and to notice when it changed and in what direction. (See Systemic Outcome Orientation)
There are lots of ways to calibrate, and what seems more important than the method of calibrating is that (a) the facilitator is actively calibrating moment-by-moment; (b) there is a correspondence between the facilitator’s calibration and the client’s experience; and (c) the facilitator can quickly change in response to the results of their calibration. This led us to make the “First Principle of Symbolic Modelling” (See REPROCess and Modelling Attention):
Know what kind of experience the client is having (i.e. what you are modelling).
While calibrating is a matter of efficacy, we have pointed out that it is also an ethical matter. If you do not calibrate the kind of experience the client is having, how do you know whether what you are doing is, or is not, working for the client? (See Calibrating Whether What You Are Doing is Working – Or Not)
James and Susie’s research into coaching sessions shows that even experienced coaches and experts can give widely differing ratings compared to those of the client and to each other. While this may be surprising at first, once it is appreciated that each party tends to use different criteria in coming to their evaluations, the variation makes more sense.
In our opinion, a bigger issue is the difficulty there appears to be in managing multiple perspectives when they diverge. Many certification and evaluation processes use one perspective: experts decide if a coach is competent to be certified or suitable for a job, or clients decide if they are satisfied with the service. Rarely are both taken into account. Even more rarely does the coach’s ability to calibrate both the client's and the expert's perspective become part of the assessment.
One reason for this may be the difficulty in comparing apples, oranges and bananas. This is compounded if the aim is to find a single composite score. The result is likely to be an arbitrary weighting of the contribution of each perspective. Rather than trying to reduce the perspectives to a single rating, an alternative is to live with the complexity of three perspectives and set acceptable levels in all three.**
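The “acceptable levels in all three” alternative can be sketched in code. This is our illustration, not part of the study: the function name and threshold values are hypothetical, and the point is simply that each perspective must clear its own minimum rather than contribute to a weighted composite.

```python
# Hypothetical sketch: instead of a weighted composite score,
# require each of the three perspectives to clear its own minimum.
def meets_all_thresholds(client, coach, expert, minimums=(6.0, 6.0, 6.0)):
    """Return True only if every perspective reaches its acceptable level."""
    return all(score >= minimum
               for score, minimum in zip((client, coach, expert), minimums))

# A composite average could mask a failing perspective:
print(meets_all_thresholds(9, 9, 4))  # False, even though the mean is 7.3
```

The design choice here mirrors the point in the text: a single composite of 7.3 looks acceptable, while the per-perspective check exposes the one rating that is not.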
By bringing our own evaluations out from ‘behind the scenes’ and making them 'centre stage' we can play with our own patterns of assuming, and get a ‘reality check’ on how and what we are unconsciously calibrating.
— — — — — —
* The first part of the study was published as: Linder-Pelz, S. & Lawley, J. (2015). Using Clean Language to explore the subjectivity of coachees' experience and outcomes. International Coaching Psychology Review, 10(2):161-174. http://shop.bps.org.uk/publications/
Download a free preprint version: Linder-Pelz_Lawley-ICPR_preprint_15_Jun_2015.pdf
The second part of the study was published as: Lawley, J. & Linder-Pelz, S. (2016). Evidence of competency: exploring coach, coachee and expert evaluations of coaching. Coaching: An International Journal of Theory, Research and Practice.
Download a free preprint version: Lawley&Linder-Pelz_CIJTRP_preprint_03_May_2016.pdf
** We are grateful to Michelle Duval who helped us to get clear on this point.
Research Methodology used at Developing Group, 2 Aug 2014.
1. A Goal-focused Coaching Skills Questionnaire (GCSQ) was emailed to participants in advance with a request to complete it and bring it on the day.* The instruction given was:
Circle the number that most reflects your assessment of your current clean coaching competency.
2. Twelve questionnaires were completed. The average scores for each person were converted to the equivalent scores out of ten:
The scores ranged from 6 to 9 out of 10, with an average of 7.3.
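The conversion in step 2 can be sketched as follows. This is our illustration, with one loud assumption: we treat the questionnaire items as rated on a 1–7 scale, which may not match the actual GCSQ response scale, so the `scale_max` parameter is adjustable.

```python
# Sketch of step 2 (assumption: items are rated on a 1-7 scale;
# the actual GCSQ response scale may differ, hence the parameter).
def to_out_of_ten(item_scores, scale_max=7):
    """Average a participant's item scores and rescale to a mark out of ten."""
    average = sum(item_scores) / len(item_scores)
    return round(average / scale_max * 10, 1)

print(to_out_of_ten([7, 6, 5, 6]))  # 8.6
```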
3. Ten of the participants were paired up and each pair was allocated an expert-observer (a recognised assessor of Clean Facilitator competencies). The participants in each dyad took turns to be the coach and the client for an observed 30-minute session.
4. At the end of each session the client, coach and observer each completed, in private, a sheet designed specifically for their role.** The sheets were collected without other participants seeing them. The sheets contained requests for:
(i) numerical evaluations out-of-ten from various perspectives, and
(ii) a textual list of the key criteria used in the evaluation of the session (see below).
5. After both coaching sessions had finished and the figures entered into a computer, the sheets were handed back to each triad for reflection and discussion.
6. An anonymised summary of the results was shown to the whole group for more reflection and discussion.
7. Lastly, the group was split in half and two of the expert-observers conducted a 30-minute coaching session observed by the other four participants and another expert-observer. Evaluation sheets were completed as in #4 and compared within the group.

NOTES
* Questionnaire used with permission. See: Goal-focused Coaching Skills Questionnaire (GCSQ), Anthony M. Grant & Michael J. Cavanagh. Social Behavior and Personality, 2007, 35 (6), 751-760.
The GCSQ sent out was slightly modified from the original. One word in each of questions f, g, h and j was changed to make the questions compatible with clean coaching. Download: Clean_modified_GFCSQ_questionnaire.pdf
** The three sheets asked for the following evaluations and information. The letters (a)-(f) refer to Table 1 in Findings:
Completed by CLIENT:
On a scale of 1 to 10, the value of the session to me was .......................... (a)
Please list the key criteria you used to assess the value of the session to you:

Completed by COACH:
On a scale of 1 to 10, I evaluate the quality of the session as ..................... (f)
I estimate the CLIENT rated the value of the session to him/her as ............. (b)
I estimate the OBSERVER rated my clean coaching skills as ....................... (e)
Please list the key criteria you used to assess the quality of the session:

Completed by EXPERT-OBSERVER:
On a scale of 1 to 10, I evaluate the coach’s clean coaching skills as ........... (d)
I estimate the CLIENT rated the value of the session to him/her as.............. (c)
Please list the key criteria you used to assess the clean coaching you observed:
Table 1 shows the evaluations for the 5 triads involving 10 observed coaching sessions.
a = Client's rating for value of session to them.
b = Coach's estimate of client's rating (a)
c = Observer's estimate of client's rating (a)
d = Observer's rating of coach's clean coaching skills
e = Coach's estimate of observer's rating (d)
f = Coach's rating for quality of the session
Table 1: Ratings by Clients, Coaches and Expert-Observers
Table 2: Clients' criteria used to assess the value of the session
Criteria collated into three categories:
Effect on self
Relationship with coach
Other coaching skills
Note: The italics have been added by the authors to indicate why the category was chosen.
EFFECT ON SELF
- New insights.
- Some ‘aha’ moments.
- I like to feel [I've] have had insights.
- New information came out.
- Do I feel I have a clearer idea of what I want and what the current state is?
- I now have a clearer idea of what is happening and how I can go forward.
- Did I feel clearer?
- Had clarification of what I needed to do and a check on whether I would actually do it.
- Helped me clarify and develop my outcome.
- The importance of the situation is now more obvious.
- The things that are combining to perpetuate the present situation are more developed and understandable.
- Changes in the metaphorical representation.
- Changes in my inner response.
- Paper mapping provided different perspective.
- I “renovated” - reframed two [of my] coaching programs into a new offer for 2015.
- Disconnected a big value criterion.
- I like to feel like I’ve made progress
- The movement towards what I want out of the session.
- Learned something about ‘system’ outcome.
- Support in identifying actions that feel appealing and potentially useful.
- This is also something I can develop with [name].
- I feel I have something I can take away of use that will make a difference
- Confidence in following thru on identified actions.
- Able to take action on my outcome.
- Had actions to do what felt correct.
- Resonates with things which have come up before.
- My resources from the past are memorable now and can be used in the future.
- The importance of the topic that I was working on (the value to me).
RELATIONSHIP WITH COACH
- Sense of permission to explore.
- Sense of acceptance that I could explore whatever I wanted to explore.
- How safe I felt in the session to say what I wanted to say.
- Attention of coach on my words and actions.
- Level of presence of the coach.
- Would have liked to feel more presence from the coach.
- Rapport with coach.
- Did facilitator feel sympathetic
OTHER COACHING SKILLS
- Coach was a valuable asset in this.
- The fluidity/flow of the session.
- Did not write so session flowed.
- Necessary conditions to ‘feel’
- [Getting to a] Decision point, gut-mind together.
- Clean Space modelling of 1st step provided valuable information.
- The extent that the coach could distinguish between problem and desired outcome
- Responsiveness and useful directing of attention.
- When had to answer questions that did not feel relevant/or clean that detracted from the session
Table 1 shows that six of the ten clients rated their session as 7.5 or 8 out of 10. Two rated it higher at 9 or 9.5, one at 6 and one much lower at 3.
Table 3 compares the figures in Table 1 in pairs. The codes a to f refer to the key above Table 1.
Table 3: Comparison of Client, Coach and Observer ratings.
Table 3 shows that, generally, the coaches' and clients' ratings were close, with eight being within one point of each other (f-a). The two exceptions with larger variations occurred when the client rated the session highest (E at 9.5) or lowest (J at 3).
It was a similar picture for the coaches' estimates of their client's rating (b-a). This suggests that the coaches were able to calibrate their clients when the session was 'close to the norm', but they were less able to do so when the client's evaluations were at the extremities.
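The pairwise comparisons used in Table 3 can be sketched as a small helper. The keys follow the a–f key above Table 1; the example session is hypothetical, not the study's data.

```python
# Sketch of the Table 3 comparisons; keys a-f follow the Table 1 key.
def rating_differences(session):
    """Return the paired differences discussed in the text."""
    return {
        "f-a": session["f"] - session["a"],  # coach's vs client's rating
        "b-a": session["b"] - session["a"],  # coach's estimate of client's rating
        "d-a": session["d"] - session["a"],  # observer's vs client's rating
        "c-a": session["c"] - session["a"],  # observer's estimate of client's rating
        "d-e": session["d"] - session["e"],  # observer's rating vs coach's estimate of it
    }

# Hypothetical session, not the study's data:
example = {"a": 8, "b": 7.5, "c": 8, "d": 7, "e": 6, "f": 8}
print(rating_differences(example))
```

With differences computed per session, "within one point of each other" is simply a check that the absolute difference is at most 1.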
Comparison of observer and client ratings (d–a) and the observer's estimate of the client's ratings (c–a) showed a similar pattern to the coach–client comparisons. However, in two instances, the observer's rating of the coach’s skills and the client's rating varied by three or more points (Observer 1 and Observer 3). Interestingly, these were also the sessions where observers most misjudged the client’s ratings (c–a). This shows that even experts can, on occasion, seriously misjudge the value perceived by the client. (This is in line with the Linder-Pelz & Lawley research.)
The coaches seemed to have more difficulty estimating the observer's rating (d–e) than they did the client's rating (b–a). Only four of the coaches' estimates were within one point of the observer's rating of them, suggesting some coaches either lack awareness of what the observer is looking for, or are unable to take the observer perspective while (or immediately after) they are coaching.

Client-Rating Criteria
The criteria mentioned by the ten clients and listed in Table 2 were clustered into three groups:
28 (62%) – the effect on the client
8 (18%) – relationship with the coach
9 (20%) – other coaching skills
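For transparency, the percentages above can be reproduced from the raw counts. The counts come from Table 2; the code is our sketch, not part of the study.

```python
# Reproducing the category percentages from the raw counts in Table 2.
counts = {
    "Effect on self": 28,
    "Relationship with coach": 8,
    "Other coaching skills": 9,
}
total = sum(counts.values())  # 45 criteria in all
percentages = {category: round(count / total * 100)
               for category, count in counts.items()}
print(percentages)
# {'Effect on self': 62, 'Relationship with coach': 18, 'Other coaching skills': 20}
```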
Much research suggests that the coaching relationship (or "alliance") is the primary factor in the outcome of coaching. And while this may be so, for these clients at least, it was mentioned a relatively low number of times compared to the effect (outcome) on the client.