WHAT IS IMPROVEMENT IN THERAPY? HOW IS IT MEASURED?
Imagine the average poor prognosis (HOUND) client, who has an average MMPI
score of 90 before therapy and improves to an average of 70 after therapy.
Compare this client with the good prognosis (YAVIS) client who improves
pre-post from 60 to 50. Who is more improved? Is the amount of change more
important, or how well the client is doing after therapy?
There are 4 common ways of measuring improvement from therapy.
1. Global improvement scores. These have the benefits of good face validity
and low cost. Their drawbacks are subjectivity and reactivity.
For
example, Garfield, Prager, & Bergin (1971) had
therapists and their supervisors fill out a disturbance rating (0=none,
5=great) for each client before and after therapy (pre-post). The client,
therapist, and supervisor also gave global improvement ratings after therapy.
Notice the discrepancy between the global improvement scores and changes in
disturbance ratings.
Improvement measured by 1-point changes in pre-post disturbance ratings:
therapist 26%, supervisor 21%.
Improvement rates from global ratings: therapist 80%, supervisor 56%,
client 80%.
The amount of agreement among sources of ratings of global improvement is
quite low. In the Temple Study (Sloane et al. 1976), global ratings of
improvement were obtained from the client, therapist, independent assessor,
and informant.
Correlations among sources of global improvement ratings:
    Therapists and assessors,   r = .13
    Therapists and patients,    r = .21
    Therapists and informants,  r = -.04
    Assessors and patients,     r = .65
    Assessors and informants,   r = .40
    Patients and informants,    r = .25
2. Raw (pre-post) change scores have the advantage of being objective and the
disadvantages of (a) increased unreliability (subtracting 2 less than
perfectly reliable measures from each other yields a difference score that is
less reliable than either measure) and (b) regression to the mean (the amount
of raw change is correlated with the initial level).
3. Residual change scores. These scores are “residual” in the sense that they
are calculated as the deviation of scores on a 2nd occasion (post) from those
predicted by regression analysis from scores on the 1st occasion (pre). They
are often calculated by an analysis of covariance using post-treatment
measures as the primary dependent measure with pre-treatment scores as the
covariate. They have the advantage of being objective and the disadvantages
of (a) possibly removing real change statistically and (b) a lack of
agreement on how the scores should be calculated.
4. Post-therapy (end point) scores have the advantage of being objective and
two disadvantages: (a) they don’t measure change, and (b) YAVIS clients will
always look better than HOUND clients.
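The arithmetic behind measures 2 through 4 can be sketched in a few lines of Python. This is only an illustration: the two average clients come from the opening example, but the six-client data set and the reliability values (.80 for each testing, .60 for the pre-post correlation) are invented for the sketch, and the function names are hypothetical. The reliability function uses the classical difference-score formula, which assumes equal pre and post variances.

```python
from statistics import mean


def raw_change(pre, post):
    """Pre minus post; positive means improvement when lower scores are healthier."""
    return [a - b for a, b in zip(pre, post)]


def residual_change(pre, post):
    """Post scores minus the post scores predicted from pre by least-squares regression."""
    m_pre, m_post = mean(pre), mean(post)
    sxx = sum((x - m_pre) ** 2 for x in pre)
    sxy = sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = m_post - slope * m_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


def difference_score_reliability(r_pre, r_post, r_prepost):
    """Classical reliability of a difference score, assuming equal pre and post variances."""
    return ((r_pre + r_post) / 2 - r_prepost) / (1 - r_prepost)


# The two average clients from the opening example (MMPI; lower is healthier).
hound_pre, hound_post = 90, 70
yavis_pre, yavis_post = 60, 50
hound_raw = hound_pre - hound_post  # raw change favors HOUND
yavis_raw = yavis_pre - yavis_post
endpoint_favors_yavis = yavis_post < hound_post  # end point favors YAVIS

# Hypothetical scores for six clients, invented only to exercise the functions.
pre = [90, 85, 95, 60, 55, 65]
post = [70, 72, 68, 50, 48, 52]
raw = raw_change(pre, post)
residuals = residual_change(pre, post)  # negative = better than predicted from pre

# Hypothetical reliabilities: each testing .80, pre-post correlation .60.
r_diff = difference_score_reliability(0.80, 0.80, 0.60)

print(hound_raw, yavis_raw, endpoint_favors_yavis)
print(raw)
print([round(e, 2) for e in residuals])
print(round(r_diff, 2))
```

Raw change ranks the HOUND client as more improved (20 points versus 10), while the end point score ranks the YAVIS client as better off (50 versus 70). Under the assumed reliabilities, the difference score has a reliability of about .50, lower than either testing alone.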
The state of the art in therapy outcome research:
1. Use of composites of many measures to increase reliability. If this is
done, agreement across different sources and types of data reaches
r = .5 to .7.
2. Some studies use #4 above; most studies use #3 above, especially if there
are group differences in pre-test measures.
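Why a composite is more reliable than any single measure can be illustrated with the Spearman-Brown formula. This is a sketch under a strong assumption, namely that the measures are roughly parallel (equally reliable and equally intercorrelated). The input values are hypothetical and are not taken from any of the studies cited here.

```python
def spearman_brown(r_single, k):
    """Predicted reliability of a composite of k parallel measures.

    r_single is the average correlation between any two of the measures.
    """
    return k * r_single / (1 + (k - 1) * r_single)


# Hypothetical: single measures that agree only weakly (r = .20).
for k in (1, 2, 4, 8):
    print(k, round(spearman_brown(0.20, k), 2))
```

Under these assumptions the composite's reliability rises steadily as measures are added, even though each measure on its own agrees only weakly with the others.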
CLINICAL SIGNIFICANCE is generally a more stringent criterion than
statistical significance. A client generally can be said to have shown
clinically significant change when s/he moves from a dysfunctional
distribution into a functional distribution, and the magnitude of change
exceeds measurement error.
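One widely used way to operationalize these two criteria is the Jacobson and Truax (1991) approach, which is not named in these notes and is swapped in here for illustration: a cutoff between the dysfunctional and functional distributions, plus a reliable change index (RCI) that compares the change to the standard error of a difference score. The sketch below assumes lower scores are healthier. The norms (means 80 and 50, SDs of 10) and the test reliability (.85) are hypothetical, as are the function names.

```python
import math


def reliable_change_index(pre, post, sd_pre, reliability):
    """Change divided by the standard error of the difference between two scores."""
    se_measurement = sd_pre * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2 * se_measurement ** 2)
    return (post - pre) / s_diff


def cutoff(mean_dys, sd_dys, mean_fun, sd_fun):
    """Point between the two distributions, weighted by their standard deviations."""
    return (sd_dys * mean_fun + sd_fun * mean_dys) / (sd_dys + sd_fun)


def clinically_significant(pre, post, sd_pre, reliability, cut):
    """True if the client crosses the cutoff AND improves by more than measurement error."""
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    crossed = pre > cut and post <= cut   # lower scores are healthier
    reliable = rci < -1.96                # improvement beyond chance at p < .05
    return crossed and reliable


# Hypothetical norms, invented for illustration only.
cut = cutoff(mean_dys=80, sd_dys=10, mean_fun=50, sd_fun=10)

# The average HOUND client (90 to 70): reliable change, but still above the cutoff.
hound = clinically_significant(90, 70, sd_pre=10, reliability=0.85, cut=cut)

# A hypothetical client who moves from 78 to 58: reliable change that crosses the cutoff.
other = clinically_significant(78, 58, sd_pre=10, reliability=0.85, cut=cut)

print(cut, hound, other)
```

Under these assumed norms, the HOUND client's 20-point drop exceeds measurement error but does not count as clinically significant, because the post-therapy score of 70 remains on the dysfunctional side of the cutoff.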
In investigating “moving targets,” Sorensen, Gorsuch, & Mintz (1985)
conducted telephone interviews with 22 couples in family therapy, assessing
the severity of 3 different target complaints (How much is the problem
bothering you? 0=Not at all, 5=Pretty much, 10=Couldn’t be worse).
There were
2 groups in the study: Group 1 was assessed at weeks 1 and 10 of the study;
Group 2 was assessed at weeks 1, 4, 7, and 10.
On re-contact, clients were asked: “ARE THERE ANY ADDITIONAL PROBLEMS?”
25 of 44 clients (56%) listed new complaints: 18 had 1 new complaint, 5 had
2 new complaints, and 2 had 3 new complaints. Clients were also asked to rate
improvement on previous target complaints.
The primary data analysis was a multiple regression using global improvement
ratings at the end of therapy as the criterion and patient ratings of
improvement on target complaints as the predictors.
For the 19 patients with only 3 initial problems: Improvement ratings on the
3 initial problems correlated R = .77 with global ratings (59% of variance),
and the 1st target complaint accounted for most of that 59% (complaints 2 and
3 added little).
For the 25 patients with >3 complaints, the first 3 problems accounted for
70% of the variance in global ratings of outcome. Adding the 4th complaint
led to a significant increment in predictability of global outcomes (to 79%).
When the 4th problem was entered 1st, it explained most of the variance.