John M. Grohol
NOVA SOUTHEASTERN UNIVERSITY
Abstract

The development of reliable and systematic rating systems for studying the psychotherapy process began with Strupp's (1957) method of categorizing therapist utterances. Since that time, dozens of rating systems have been constructed, with varying degrees of success, reliability, acceptance and use by other researchers. This article outlines the basic constructs of some of these widely known rating systems and compares them to the Cognitive Elaboration Rating System (CERS). The CERS was recently developed as an attempt to reliably assess the occurrence and number of positive cognitive elaborations verbalized by clients in therapy sessions and the therapist interventions which led to these elaborations. The development and construction of the CERS are discussed, as well as future applicability for researchers.
Development of the Cognitive Elaboration Rating System (CERS)
John M. Grohol

The process of psychotherapy has been the focus of a wide range of research for the past few decades, as researchers and clinicians alike look for those variables which might make the greatest impact on the outcome of therapy. Psychotherapy has often been described as an "art" (e.g., Lindner, 1982; Storr, 1980) which is composed of many intricate parts. These parts include personality variables of both the client and the therapist, the psychotherapeutic relationship, the therapeutic alliance and rapport established between the client and therapist, and the specific interventions the therapist brings to therapy. Research on these factors has shown that there may be important relationship factors that influence positive therapy outcomes, that the client's contribution to the therapeutic alliance is also important, and that various types of therapies work best for specific disorders (Windholz & Silberschatz, 1988). This latter point, however, is often disputed because of research which shows that improvement in therapy is more strongly correlated with therapeutic relationship factors than with specific therapist techniques (Garfield & Bergin, 1986).

One method of examining the process of therapy is to divide therapy sessions into discrete units of verbal communication and examine the therapist-client interactions that correlate most highly with positive therapy outcomes. By devising a system of coding therapist utterances, Strupp (1957) was one of the first researchers to realize this method of examining the therapeutic process. Strupp's first system, however, was limited; it measured only therapist communications. Since that time, dozens of classification systems have been developed for research, with varying degrees of success (for reviews see, for example, Greenberg & Pinsof, 1988; Elliott, Hill, Stiles, Friedlander, Mahrer, & Margison, 1987; Russell & Stiles, 1979).
It is beyond the scope of the present article to summarize or evaluate all psychotherapy process rating systems that are currently found in the literature. Rather, the present author will examine the construction and validation of some of the more stable and comprehensive systems in contemporary process research. The development of the Cognitive Elaboration Rating System (CERS) will then be described, and pertinent aspects of this system will be compared to the other psychotherapy rating systems discussed. The future direction and applicability of the CERS in psychotherapy process research will also be examined.

Vanderbilt Psychotherapy Process Scale (VPPS)

The Vanderbilt Psychotherapy Process Scale (VPPS) was originally devised in 1974 by Strupp, Hartley, and Blackwood (Suh, O'Malley, Strupp, & Johnson, 1989). The scale has undergone two revisions since then, by Gomes-Schwartz in 1978 and in 1983 by O'Malley, Suh, and Strupp. The VPPS seeks to be neutral in its theoretical orientation as a measurement of positive and negative predictive outcome variables in therapist-client interactions. It consists of 80 5-point Likert-type items - rated on an ordinal scale from 1 ("not at all") to 5 ("a great deal") - which are divided into therapist and client sections. Each section also contains two parts, one dealing with characteristics of each person's behavior and the other part dealing with characteristics of each person's demeanor during the session (Suh et al., 1989).

Eight subscales are derived from these 80 items, the first five of which are patient-oriented while the remaining three are therapist-oriented (Table 1; Suh et al., 1989). Each scale contains between 6 and 13 items. The scales were derived from the items on the basis of a principal components factor analysis (O'Malley, Suh, & Strupp, 1983). Patient participation describes the extent to which the client is actively involved in the therapeutic relationship.
Patient hostility is used to tap the more negative aspects of the client's behaviors and beliefs. Patient psychic distress seeks to measure the client's emotional state, especially feelings of discouragement. Patient exploration describes the extent to which the client is engaged in examination of his or her feelings and experiences. Patient dependency is used to measure the client's dependency and reliance on the therapist. Therapist exploration measures the therapist's attempts to examine the client's behaviors, emotions, and underlying motivations. Therapist warmth and friendliness seeks to measure the therapist's display of emotional involvement, while Negative therapist attitude describes therapist attitudes and behaviors that may frighten, threaten, or intimidate the client.

Table 1. Vanderbilt Psychotherapy Process Scale Subscales

Patient                               Therapist
---------------------------------     --------------------------------------
Patient participation - 8 items       Therapist exploration - 13 items
Patient hostility - 6 items           Therapist warmth - 9 items
Patient psychic distress - 9 items    Negative therapist attitude - 6 items
Patient exploration - 7 items
Patient dependency - 6 items

The VPPS has been used in a number of different ways to rate therapist-client interactions within therapy. Early on, investigators defined the entire therapy hour as the unit length and rated only the third session of twenty-five clients (Gomes-Schwartz & Schwartz, 1978). Gomes-Schwartz (1978) refined the sampling method by using 10-minute segments from thirty-five different client sessions, randomly chosen from pre-defined representative sessions (session 3, the sessions one-half and three-quarters of the way through therapy, and the next-to-last session). The current version of the VPPS was developed using a systematic sampling method (5 minutes from the beginning, middle, and end of the hour) and 15-minute unit lengths taken from thirty-eight clients (O'Malley, Suh, & Strupp, 1983).
Only the first three sessions of therapy were used for each client. Suh et al. (1989) make their case for this sampling method:

    ... Early sessions appear to be critically important for the subsequent course of therapy. Furthermore, even if ratings from later sessions demonstrate stronger associations with outcome, they may not elucidate the actual processes responsible for the development of qualities manifested in the later sessions. (p. 136)

Multiple raters and media ranging from written transcripts to audio and videotapes of therapy sessions can be used in conjunction with the VPPS. Current investigators working with the VPPS (Suh et al., 1989) describe one rating procedure, the consensus team method, in which raters working in pairs first independently rate videotaped therapy sessions, then compare ratings with each other, and finally reach a consensus on items in disagreement through discussions and videotape review. Suh et al. (1989) suggest that raters using the VPPS should be at least graduate students with minimal clinical experience; no other rater selection criteria are given. Raters are first trained to criterion (r = .85 to .90) on 12 to 19 training segments and then begin working in assigned pairs. Although various forms of media can be used with the VPPS, researchers have found that ratings based upon audio and videotapes are the most accurate and caution that transcripts should not be used (Suh et al., 1989). Interrater reliabilities have ranged from .60 to .94, averaging .86 across three studies (cited in Suh et al., 1989).

Suh et al. (1989) review the use of the VPPS in process research and conclude that it has sufficient reliability and validity as a psychotherapy research instrument (see also Windholz & Silberschatz, 1988). Suh et al. (1989) cite studies which have found that the client's level of interpersonal functioning prior to therapy is predictive of the client's participation in therapy, which is then predictive of client outcome.
Another discovery the researchers mention is that changes in therapist attitudes early on in therapy are important to client outcomes (cited in Suh et al., 1989). These findings come from the Vanderbilt Psychotherapy Research Project I; the second project is currently underway and is investigating the efficacy of time-limited dynamic psychotherapy. Six raters are being used, working in consensus teams as described above.

Moras and Hill (1991) recently examined the rater selection criteria found in current process rating systems and categorized these systems based upon the amount of inference required. While Hill's own system (the CVRMCS, which will be discussed below) and Stiles' system (VRM, also discussed below) were classified as "moderate inference instruments," Moras and Hill describe the VPPS as a "high inference instrument," that is, a system that requires individuals to rate stimuli that are complex and require a large amount of inference on behalf of the raters. Because of this factor and since the VPPS categories measure intentions and internal states that focus on the client (and are somewhat psychodynamically oriented, despite the researchers' claims otherwise), this system is inadequate to measure the cognitions and elaborations in which the present researchers were interested. It also fails to accurately measure, with just three categories, the wide range of therapist interventions that especially interest the current researchers.

Verbal Response Mode System (VRM)

Perhaps the most detailed, comprehensive, and complex of rating taxonomies is Stiles' (1992) Verbal Response Mode system (VRM). VRM was developed by Stiles under the influence of Jerry Goodman, Stiles' clinical supervisor at UCLA in 1969. Goodman categorized six distinct response modes in therapy: question, advisement, silence, interpretation, reflection, and disclosure (Stiles, 1992).
After further exploration of response modes, Stiles and a colleague proposed three underlying principles of classification: source of experience, presumption about experience, and frame of reference. By using these principles, Stiles (1992) proposed that raters could categorize utterances simply by answering three questions: Whose experience is the topic? Does the utterance require the speaker to presume knowledge of the other's experience? Whose frame of reference is used? Two additional response modes, edification (providing information) and confirmation, were eventually added to Goodman's original categories. The "silence" category was renamed "acknowledgment," in an effort to better identify such responses (Stiles, 1992).

By utilizing Stiles' principles of classification, utterances automatically fall into one of the eight categories. Disclosure describes thoughts, feelings, perceptions, or intentions. Edification states objective information. Advisement attempts to guide behavior with suggestions, commands, permission and prohibition. Confirmation compares the speaker's experience with the other's through agreement, disagreement, and by sharing experiences or beliefs. Question describes a request for information or guidance. Acknowledgment conveys receipt of a communication (including salutations). Interpretation explains or labels the other and can describe judgments or evaluations of the other's experiences or behaviors. Reflection puts the other's experiences into words through repetitions, restatements, and clarifications (Stiles, 1992).

VRM seeks to transcend traditional category systems by focusing on who is speaking (rather than "therapist" and "client" categories) and the "other" person participating in the discussion (Table 2). In this way, VRM recognizes that clients can also make verbalizations usually ascribed exclusively to therapists, such as reflections and interpretations.
VRM is a generalized coding system, applicable to coding any conversation in almost any context.

Table 2. Stiles (1992) Verbal Response Mode (VRM) System

Source           Presumption          Frame of Reference
of Experience    About Experience     Other                 Speaker
-------------    ----------------     ---------------       ----------------
Other            Other                Reflection (R)        Interpretation (I)
                 Speaker              Acknowledgment (K)    Question (Q)
Speaker          Other                Confirmation (C)      Advisement (A)
                 Speaker              Edification (E)       Disclosure (D)

Another way in which VRM was developed to be used as a generalized rating taxonomy is in its definition of response units. Stiles' system breaks down conversations into their most basic and fundamental units, called utterances. Since each utterance can be a simple sentence, an independent clause, a nonrestrictive dependent clause, an element of a compound predicate, or a term of acknowledgment, evaluation, or address, there can be dozens of codes given to one talking turn alone (Stiles, 1992). Each utterance is coded twice - once for form or literal meaning and once for intent or pragmatic meaning - making Stiles' system very detailed. For instance, "Would you roll up your sleeve?" would be coded as a question in form, but as an advisement in intent (Stiles, 1992). Interrater reliabilities for form range from .50 to .98, averaging .81, and for intent from .30 to .96, averaging .68. Reliabilities for intent were usually significantly lower than those for form across two studies (cited in Stiles, 1992; Elliott, Hill, Stiles, Friedlander, Mahrer, & Margison, 1987).

A negative aspect of coding this much detail for research purposes is that transcripts of conversations to be rated must be unitized first. Stiles (1992) recommends that individuals who divide a transcript into units not be the same persons who then code each unit. Although audio or videotapes can be used for rating, Stiles cautions that these modalities require more skill because of the level of complexity they involve.
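The three-question classification summarized in Table 2 can be illustrated as a simple lookup. The following is only a sketch: the names `VRM_MODES` and `classify` are hypothetical and do not come from Stiles (1992), and the sketch assumes that each of the three questions has already been answered by a rater as either "other" or "speaker" (for the presumption question, "other" means the utterance presumes knowledge of the other's experience).

```python
# Illustrative lookup built from Table 2. All names here are hypothetical.
OTHER, SPEAKER = "other", "speaker"

# Key: (source of experience, presumption about experience, frame of reference)
VRM_MODES = {
    (OTHER, OTHER, OTHER): "Reflection (R)",
    (OTHER, OTHER, SPEAKER): "Interpretation (I)",
    (OTHER, SPEAKER, OTHER): "Acknowledgment (K)",
    (OTHER, SPEAKER, SPEAKER): "Question (Q)",
    (SPEAKER, OTHER, OTHER): "Confirmation (C)",
    (SPEAKER, OTHER, SPEAKER): "Advisement (A)",
    (SPEAKER, SPEAKER, OTHER): "Edification (E)",
    (SPEAKER, SPEAKER, SPEAKER): "Disclosure (D)",
}


def classify(source, presumption, frame):
    """Return the VRM mode for one rater's answers to the three questions."""
    return VRM_MODES[(source, presumption, frame)]


# Each utterance receives two codes, one for form and one for intent.
# "Would you roll up your sleeve?" is a question in form, an advisement in intent.
form_code = classify(OTHER, SPEAKER, SPEAKER)
intent_code = classify(SPEAKER, OTHER, SPEAKER)
print(form_code, "/", intent_code)
```

Because the three principles are binary, the eight modes exhaust every combination, which is why utterances fall into exactly one category once the three questions are answered. The judgment involved in answering those questions is not captured by the lookup.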
Raters using the VRM system should have a "high verbal aptitude, interest in interpersonal communication, patience with details, and intensive training and practice. Competence in basic grammar is essential" (Stiles, 1992, p. 21).

The VRM system has been successfully used in a wide range of research studies. One study found that the mode intents of therapists vary dramatically with regard to their theoretical orientation (cited in Stiles, 1992; this finding is also supported by Hill's research). Other research using the VRM taxonomy has dealt with topics ranging from the differences in relationships and roles, to medical interviews, relationship styles, state and trait anxiety, awkward silences, etc. (cited in Stiles, 1992). Stiles (1992) claims that hundreds of raters have coded thousands of utterances under the VRM system in dozens of studies.

While the complexity of Stiles' system mirrors the difficulty of coding human speech, it was overly detailed for the present study's use. The categories appear to be more useful than those found in the VPPS for detecting elaborations of thoughts, emotions, and experiences, but the VRM system is a painstakingly complex system that has a lengthy learning curve and requires committed, long-term raters. (Initially, it takes an average of 5 hours to code a 1-hour therapy session; after 6 months, it still takes over 2 hours to rate a 1-hour session [Stiles, 1992].) The required resources - such as time to adequately train raters and code dozens of sessions, the availability of long-term raters, the ability to provide accurate unitized transcripts, etc. - were not available to the present researchers.

Counselor Verbal Response Mode Category System (CVRMCS)

Hill (1978) attempted to develop a counselor response category system that incorporated the components of systems existing at the time. The result of this attempt is the Counselor Verbal Response Mode Category System (CVRMCS).
Five stages of development were needed to obtain the final categories used for ratings. Throughout its development, the same two raters were used to help identify important and reliable rating categories. During the first stage, 25 categories taken from the existing rating literature were used to rate two practice sessions. Discussion resulted in revising some categories to reduce overlap. During the second stage, 24 categories were used to rate five practice sessions. During these first two stages, interrater reliability remained low. Further discussion and ratings in the third version of the system resulted in interrater reliability on two practice sessions at 80% and 90% agreement. Face validity was then tested by asking three experienced counseling psychologists to match examples of the various categories with the appropriate definitions. Only half of the examples were matched, leading to a reexamination and clarification of the existing 24 categories. This fourth version, now with only seventeen categories, was given to another panel of three experienced counseling psychologists, two of whom were able to obtain 80% agreement on matching definitions with the appropriate examples.

The fifth and final revision used for Hill's (1978) initial study was just a reworded and clarified version of the fourth version. This revision contained the following categories: minimal encourager, approval-reassurance, structuring, information, direct guidance, closed question, open question, restatement, reflection, nonverbal referent, interpretation, confrontation, self-disclosure, silence, friendly discussion, criticism, and unclassifiable (Hill, 1978).

The CVRMCS, like Stiles' VRM system, uses complete and accurate transcripts of therapy sessions as its primary rating material. Transcripts are divided into what Hill (1978) terms "response units (essentially grammatical sentences)," (p. 463) which include brief phrases such as "mmhmm" and "yes."
(Three years later, Hill better defined this unit as any independent clause [cited in Friedlander, 1982].) Raters independently listened to therapy tapes, which consisted of 12 intake sessions (as opposed to therapy sessions, which also can be used), and followed along in a unitized transcript, rating each response unit according to one of the 17 categories. Each response unit could be placed into one or more of the categories. Disagreements between raters were resolved using a procedure similar to the consensus team method used in the VPPS, in which discussion was used to reach a unanimous agreement for those items that were discrepant.

At the conclusion of Hill's (1978) initial study, she determined that there were 14 statistically significant, mutually exclusive categories. Minimal encourager describes an acknowledgment, simple agreement, or understanding. Approval-reassurance provides emotional support, approval, or reinforcement. Information describes supplied information, usually taking the form of facts, data, or resources. Direct guidance consists of directions or advice that the therapist gives to the client. Closed question is a type of question that usually only requires a one- or two-word answer, such as yes or no. Open question is a type of question which requests a clarification of feelings or an exploration of some situation. Restatement describes a simple restating or rephrasing of the client's statement which often contains similar but fewer words and is more concrete and clear than the original statement. Reflection is a simple restating or rephrasing of the client's statement which contains reference to stated or implied feelings. Nonverbal referent points out body posture, voice tone or level, facial expressions, etc. Interpretation may take several forms, but always goes beyond what the client has stated.
For instance, it might establish connections between seemingly unrelated events or statements; it interprets defenses, feelings, resistance, or transference; it might indicate themes, patterns, or causal relationships in the client's behavior. Confrontation is defined by two parts: the first part may be implied rather than stated and refers to some aspect of the client's message or behavior; the second part usually begins with the word "but" and presents a discrepancy or contradiction. Self-disclosure describes a statement in which the therapist shares his or her own personal experiences or feelings with the client and usually begins with the word "I." Silence is a pause of five seconds or more. Other describes statements that are unrelated to the client's problems, such as small talk and salutations (Hill, 1978).

Friedlander (1982) refined Hill's (1978) rating system by examining some of the most prominent problems with the CVRMCS. Two major problem areas were identified: the mixture of classical and pragmatic coding categories (as defined by Russell & Stiles, 1979) within the 14 categories used, and the inconsistency of the definition of the response unit (Friedlander, 1982). Friedlander (1982), using interrater discrepancies and face and content validity tests similar to Hill's (1978), combined a number of redundant categories, resulting in nine mutually exclusive categories (CVRMCS-R). Those categories are: encouragement/approval/reassurance, reflection/restatement, self-disclosure, confrontation, interpretation, providing information, information seeking, direct guidance/advice, and unclassifiable. The scoring unit was also redefined to include any dependent or independent clause that, at a minimum, contained a verb phrase. Compound predicates also constituted individual units. Ratings again were conducted from unitized transcripts and disagreements between independent raters were handled in the same manner as in Hill (1978).
Rater qualifications were not initially specified in either of the above systems. Two undergraduate students majoring in psychology and a counseling psychologist (Hill) were used as raters in Hill's (1978) study; interrater reliability was reported at .80. Two raters were used for Friedlander's (1982) updated system; interrater reliability was reported as .85. Elliott et al. (1987) found interrater reliabilities for Hill's system to range from .48 to .94, averaging .64, and for Friedlander's system from .32 to .82, averaging .57.

Hill (1986) later did note selection criteria and training for raters; raters were selected on the basis of "a high grade-point average, motivation, and ability to do the task" (p. 140). Training required raters to become familiar with the rating categories and then practice with the system until at least two out of the three raters agreed on 75-80% of all categories (Hill, 1986). As with the other rating systems described here, a rater training manual is available. Hill (1986) also suggests weekly meetings amongst raters to correct for rater drift, provide an opportunity for affiliation to reduce boredom and loneliness, and reconcile disagreements. Judgments of two out of three of the raters are usually accepted without discussion; when all three raters disagree on a category, discussion ensues. To ensure that no one rater dominates or influences the discussion process, Hill (1986) suggests alternating which person talks first and allowing equal time and respect to each rater's opinions and reasons for disagreement.

Since 1987, only a handful of studies have utilized the CVRMCS taxonomy in research. Cummings (1989) used the system to discover that novice counselors used more information-oriented responses when addressing a help-seeking individual with intrapersonal problems (such as procrastination, loneliness, etc.)
and used more reflection-oriented responses when an individual presented with interpersonal problems (such as dealing with conflicts or relationships with other people in that person's life). Other studies have dealt with changes in graduate students after taking a course devoted to developing counseling skills (Kivlighan, 1989) and an overview of effective therapist techniques by Hill (1992). There is a larger base of studies conducted before 1987 that use the CVRMCS to examine effectiveness of various theoretical orientations in therapy (cited in Hill, 1986). Unfortunately, most of these rated only the initial intake session, with only a few examining response modes across the course of treatment (Hill, 1986).

While not as complicated as Stiles' VRM system, nor as content- or dynamically-oriented as the VPPS, neither the CVRMCS nor Friedlander's refinement of the CVRMCS rating system was adequate for the present study, for two important reasons. First, the CVRMCS is a counselor-oriented rating system and does not include ratings of client responses (Hill developed a similar, yet separate system for rating client responses [Hill, 1986]). Second, like the VRM system, the CVRMCS highly recommends that verbatim transcripts of therapy sessions be used. The present researchers had limited time and resources available and could not provide such unitized transcripts for this study.

Cognitive Elaboration Rating System (CERS)

Categorizing talking turns in therapy sessions is a difficult undertaking. Past studies have illustrated problems in attempting to define and rate therapist and client interactions. For instance, interrater reliability in such studies has averaged only in the high .50's (Elliott et al., 1987). When examining the most reliable rating categories, questions (.71) were followed by advisement (.66), information (.64), and self-disclosure (.61).
These researchers concluded that there was no "best" response-mode rating system, and that researchers should use a rating system best suited to their own particular needs (Elliott et al., 1987).

The Cognitive Elaboration Rating System (CERS) was developed as an extension of a literature review on the development and current research of the elaboration likelihood model (Petty & Cacioppo, 1986). The elaboration likelihood model (ELM) states, in general, that when clients expand or elaborate on positive experiences, thoughts, or emotions on their own during a therapy session, they are more likely to gain positive outcomes from therapy. Successful outcomes in ELM can be best measured by examining the therapeutic process.

The positive-outcome model of cognitive elaboration theorizes a particular series of events in therapy. First, the client's negative or problem thoughts, ideas, or cognitions are identified and examined in therapy. The therapist then targets those faulty thoughts that are most likely to be susceptible to change and begins a series of interventions designed to promote new, healthier or more adaptive cognitions. By presenting favorable, strong arguments for these new cognitions, the therapist encourages the client to begin elaborating on them, through verbal and imagery strategies. Short-term change is then likely to take place, at which time clients may exhibit actual or superficial change, which must be accurately assessed. Finally, by incorporating the new constructive thoughts into his or her own cognitive schema, the client maintains changes and begins to generalize the changes to other ideas and behaviors (Barone & Hutchings, 1993).

The current researchers were interested in examining the client's cognitive elaborations as a result of the therapist's arguments for change. It would be informative to discover, for example, whether certain types of therapist responses would be more likely to elicit positive client elaborations.
By incorporating a number of pre-treatment and post-treatment outcome measures, it could then be determined whether greater cognitive elaboration in therapy sessions contributes to more positive therapy outcomes and, more specifically, the type of therapist interventions that result in greater amounts of positive client cognitive elaborations.

Numerous studies have been conducted that support the above general hypothesis, but none so far have been completed on a clinical population (Barone & Hutchings, 1993). Such studies are likely to be large undertakings that require analyzing dozens of individual therapy sessions of different clients and therapists in a clinical setting. A rating system would need to be developed to address the specific research issues unique to studying cognitive elaborations within therapy sessions.

Initial Development of CERS

The Cognitive Elaboration Rating System (CERS) is the direct product of those needs. CERS was developed over approximately a one-year period of time. The researchers first modeled a response mode rating system based upon the findings of Elliott et al. (1987) and the basic theoretical underpinnings of ELM (Petty & Cacioppo, 1986). This experimental rating system eventually contained fifteen categories - nine for coding therapist communications and six for rating client communications (Table 3). Three of the six client categories were also rated according to whether they were positively, neutrally, or negatively goal-directed, according to each client's individual treatment plan. It was hypothesized that by noting whether certain interventions in therapy were followed by either positive or negative cognitive elaborations, the impact of such interventions could be an important factor for measuring treatment efficacy.
Hence, for example, if a therapist who asked an abundance of open-ended questions continually received positive self-disclosures in response to his or her questions from different clients across the spectrum, this would be an important variable to note.

Table 3. Original experimental categories for CERS

Therapist            Client
-----------------    -----------------------
Confrontation        Compliance (+, -, N)
Information          Resistance (+, -, N)
Advisement           Self-disclosure (+, -, N)
Interpretation       Reflection
Self-disclosure      Question
Reflection           Other (Inaudible)
Reassurance
Question
Other (Inaudible)

The units of measurement in this initial study were complete talking turns between therapist and client. Each talking turn was rated according to the speaker's categories, while minimal encouragements (such as, "Mmhm") were ignored. Talking turns could receive more than one rating, which occurred most often during long stretches of speech.

Ratings were drawn from audiotaped therapy sessions of clients being seen in a local, suburban community mental health center. Clients carried an Axis I or Axis II mental disorder diagnosis from the Diagnostic and Statistical Manual of Mental Disorders - 3rd Edition, Revised (APA, 1987) and were generally of lower socio-economic status. Therapists were doctoral-level clinical psychology students operating in a general adult outpatient psychotherapy program and were supervised by a licensed clinical Ph.D. psychologist. Therapists represented the entire continuum of therapeutic orientations and interventions currently practiced.

Initially, ratings were completed for the entire 45-minute audiotape. While the system was under development, raters - two doctoral-level students in a clinical psychology program and two clinical Ph.D. psychologists - independently rated each session according to the initial CERS categories. Treatment plans describing specific treatment goals for each client were provided and reviewed at the onset of each rating session.
After approximately 20 to 25 talking turns had been completed, the audiotape was stopped and ratings among the four individuals were compared. Discussion resulting in a consensus among the raters helped clarify discrepancies between rating categories as the study progressed.

After approximately six months of developing this initial cognitive elaboration rating system, it was completely discarded for a number of reasons. First, it was discovered to be inefficient for examining therapy sessions for cognitive elaborations and the context in which they occur. That is, more ratings were performed than were necessary for the detection of cognitive elaboration and the therapist's prompting of client elaboration. Second, adequate interrater reliability among the researchers using this system was never obtained, and varied widely depending upon the complexity of the session being rated (r = .20 to .78, averaging .53). Discussions regarding discrepancies among raters occurred very frequently, even after the rating categories had been defined and used for months by the same set of raters. Third, the rating system required that raters remember the client's treatment goals for therapy throughout the entire rating period. These goals were developed by the therapist and client without regard to research concerns. Consequently, ratings were often open to interpretation as to whether a client's responses were positively, negatively, or neutrally goal-oriented.

Final Version of CERS

The response mode rating system which rated verbal talking turns in therapy was exchanged for a rating system which coded the content of therapists' and clients' responses during 3-minute time intervals. Although a time-interval mode was judged to be less sensitive than a response mode rating system, it was believed that it would be accurate and sensitive enough to determine whether cognitive elaboration took place and during which types of client-therapist interactions.
At the same time, new categories were developed to simplify rating tasks and increase interrater agreement. These categories were: old or existing experiences, emotions, or beliefs/ideas; and, new or developing experiences, emotions, or beliefs/ideas (Table 4). A rating worksheet was used and during each 3-minute interval, raters were instructed to simply check off the occurrence of one of the above categories. Face validity for these categories was never obtained.

Table 4. Final categories for CERS

Therapist           Client
----------------    -----------------
Old/existing:       Old/existing:
  Experiences         Experiences
  Emotions            Emotions
  Beliefs/ideas       Beliefs/ideas
New/developing:     New/developing:
  Experiences         Experiences
  Emotions            Emotions
  Beliefs/ideas       Beliefs/ideas

At the end of each 15-minute interval (or five 3-minute intervals, as the worksheet was designed [Appendix B]), raters halted the audiotape and also rated the following five categories on a 7-point Likert-type scale: client's agreement with therapist, client's amount of elaboration, polarity of client's elaboration, therapist's persuasive attempts, and therapist's prompting of elaboration. Client's agreement with therapist refers to the amount of agreement or disagreement the client had with the therapist during this segment. Client's amount of elaboration refers to the amount of elaborations the client made compared to the overall percentage of time the client spoke and compared with all other clients. Polarity of client's elaboration refers to whether the client's elaborative attempts were mostly negative, neutral, or positive in the 15-minute interval. Therapist's persuasive attempts refers to how much the therapist attempted to persuade the client as compared to the overall percentage of time the therapist spoke and compared with all other therapists.
Therapist's prompting of elaboration refers to the amount of elaboration the therapist prompted in the client as compared to the overall percentage of time the therapist spoke and compared to all other therapists. While the three previous raters (one of the Ph.D. psychologists discontinued rating materials) independently rated the same audiotapes used in the earlier rating system, one of the raters would keep track of the time. When three minutes had passed, the audiotape was stopped and ratings for that 3-minute segment were compared. Discussion ensued over disagreements among the ratings and final ratings were based on a consensus among the raters. During these discussions, it was discovered that a number of rating categories were difficult to distinguish and were sometimes misunderstood. It was at this point that a rater's training manual was begun, to record rules that were developed for these difficult decisions (Appendix B). Rules were developed from criteria discussed amongst the raters and based upon pragmatic concerns. For instance, it was decided that small talk that often occurred at the beginning of many therapy sessions could be ignored, since it involved client-therapist interchanges that were of little therapeutic value. The manual also provides an overview of the rating system, explains each of the rating categories (including the five Likert-type scales), and describes in detail each of the eight rules developed to distinguish especially difficult rating situations. Raters were encouraged to familiarize themselves with this manual and review it prior to each rating session. When it became evident that this system had much greater interrater reliability (averaging 83% agreement), the initial interrater reliability study was initiated. This study followed essentially the same format as described above, with a few minor changes. 
Only two raters were used, one female doctoral-level student previously unrelated to the study and one male doctoral-level student who had been one of the raters throughout the development of the CERS. An audiotape with a tone sounding every three minutes was made to free the raters from the additional task of time-keeping. It was determined pragmatically that the best sessions to rate for each client would be the third session, some mid-point session, and the last session of therapy. This interrater reliability study was conducted and the details of this study, as well as its results, will be presented in an independent paper. Raters were selected independent of their theoretical orientations or personality characteristics. Due to practical limitations, raters selected for the initial and final versions of the CERS were always doctoral-level students and the researchers themselves. No effort was made to prescreen the desirability of raters based upon a theoretical understanding of ELM, cognitive elaboration, or therapy experience; all of the current raters, however, did have at least minimal therapy experience. Moras & Hill (1991) suggest that raters be selected based upon preset criteria established by individual researchers and that such criteria be noted. The criteria used for the present study included the attainment of a preset interrater reliability statistic among raters (r = .85 or better) and a motivation to complete the task. The CERS is unique in its unit length. Whereas most other systems define the unit as either an utterance or a response unit, the CERS uses a time-interval unit of every 3 minutes within three complete therapy sessions. The VPPS has been used in studies with varying unit lengths, from 10 to 15 minutes, to the entire therapy hour. Researchers currently recommend 15-minute unit lengths systematically sampled throughout a therapy session (Suh et al., 1989). 
Also like the VPPS, the CERS utilizes Likert-type scales for some of its ratings, and uses 15-minute unit segments. The development of the CERS somewhat parallels the construction of the other systems presented here. CERS was developed and refined over a period of time in which categories were fully explored and defined, while ratings were discussed at great length amongst the system's authors. This method was similar to Hill's (1978) development of the CVRMCS, which also underwent different stages of development and refinement (culminating in Friedlander's 1982 revision of the original system). Other systems began from more stable roots and have changed only slightly since their introduction, including the VPPS, which is only in its second full version, and the VRM system, which remains largely unchanged since 1976 (Stiles, 1992). Once the reliability and validity of the CERS have been established, researchers can begin pursuing the purpose for which CERS was constructed: to examine whether cognitive elaborations occur in therapy and under which types of therapist interventions positive and negative client elaborations occur. With a system sensitive enough to distinguish such interventions in therapy, yet pragmatic enough that raters can be trained on it quickly and reliably, a large research study can be undertaken with a clinical population. It must be emphasized that the final version of CERS has not yet been determined to be sensitive enough to distinguish the therapist interventions which elicit greater client cognitive elaborations. The therapeutic process is a dynamic process that includes client personality variables, nonverbal behaviors, often dissimilar intentions between the therapist and client, and situational variables, all of which must be taken into account when examining therapeutic outcomes. CERS is just one small part of understanding the entire therapeutic process, but may prove extremely useful in this understanding.
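As an illustration of the interval-based worksheet and the percent-agreement figures reported for the final version of CERS, the following Python sketch models one rater's check-offs for each 3-minute interval and computes simple percent agreement between two raters. It is a minimal sketch under stated assumptions, not the procedure used in the study: the names `IntervalRating`, `CELLS`, and `percent_agreement` are hypothetical, and counting agreement over every category cell in every interval is an assumed scoring rule, since the text does not specify how agreement was calculated.

```python
from dataclasses import dataclass, field

# Worksheet layout taken from Table 4: speaker x old/new x content type.
SPEAKERS = ("therapist", "client")
STATUS = ("old", "new")
CONTENT = ("experiences", "emotions", "beliefs")

# All 12 cells a rater could check during one interval (hypothetical encoding).
CELLS = frozenset((s, t, c) for s in SPEAKERS for t in STATUS for c in CONTENT)


@dataclass
class IntervalRating:
    """Cells one rater checked off during a single 3-minute interval."""
    checked: frozenset = field(default_factory=frozenset)


def percent_agreement(rater_a, rater_b):
    """Percentage of (interval, cell) decisions on which two raters agree.

    Assumption: a cell counts as agreement when both raters checked it
    or both left it blank.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same number of intervals")
    total = 0
    agree = 0
    for a, b in zip(rater_a, rater_b):
        for cell in CELLS:
            total += 1
            agree += (cell in a.checked) == (cell in b.checked)
    return 100.0 * agree / total if total else 0.0


# Made-up example: two intervals, one disagreement out of 24 decisions.
rater_a = [
    IntervalRating(frozenset({("client", "old", "experiences")})),
    IntervalRating(frozenset({("therapist", "new", "beliefs")})),
]
rater_b = [
    IntervalRating(frozenset({("client", "old", "experiences"),
                              ("client", "new", "beliefs")})),
    IntervalRating(frozenset({("therapist", "new", "beliefs")})),
]
print(f"{percent_agreement(rater_a, rater_b):.1f}% agreement")
```

Because most cells are left blank in any given interval, this cell-by-cell rule tends to produce high agreement by default; a chance-corrected statistic would be a stricter alternative, though the text reports only percent agreement and Pearson r.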
References

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author.
Barone, D.F., & Hutchings, P.S. (1993). Cognitive elaboration: Basic research and clinical application. Clinical Psychology Review, 13, 187-201.
Cummings, A.L. (1989). Relationship of client problem type to novice counselor response modes. Journal of Counseling Psychology, 36, 331-335.
Elliott, R., Hill, C.E., Stiles, W.B., Friedlander, M.L., Mahrer, A.R., & Margison, F.R. (1987). Primary therapist response modes: Comparison of six rating systems. Journal of Consulting and Clinical Psychology, 55, 218-223.
Friedlander, M.L. (1982). Counseling discourse as a speech event: Revision and extension of the Hill Counselor Verbal Response Category System. Journal of Counseling Psychology, 29, 425-429.
Garfield, S.L., & Bergin, A.E. (Eds.). (1986). Handbook of psychotherapy and behavior change. New York: Wiley & Sons.
Gomes-Schwartz, B. (1978). Effective ingredients in psychotherapy: Prediction of outcome from process variables. Journal of Consulting and Clinical Psychology, 46, 1023-1035.
Gomes-Schwartz, B., & Schwartz, J.M. (1978). Psychotherapy process variables distinguishing the "inherently helpful" person from the professional psychotherapist. Journal of Consulting and Clinical Psychology, 46, 196-197.
Greenberg, L.S., & Pinsof, W.M. (Eds.). (1986). The psychotherapeutic process: A research handbook. New York: Guilford Press.
Hill, C.E. (1978). Development of a Counselor Verbal Response Category System. Journal of Counseling Psychology, 25, 461-468.
Hill, C.E. (1986). An overview of the Hill Counselor and Client Verbal Response Modes Category Systems. In L.S. Greenberg & W.M. Pinsof (Eds.), The psychotherapeutic process: A research handbook (pp. 131-160). New York: Guilford.
Hill, C.E. (1992). Research on therapist techniques in brief individual therapy: Implications for practitioners. Counseling Psychologist, 20, 689-711.
Kivlighan, D.M. (1989). Changes in counselor intentions and response modes and in client reactions and session evaluation after training. Journal of Counseling Psychology, 36, 471-476.
Lindner, R. (1954). The fifty-minute hour. New York: Delta.
Moras, K., & Hill, C.E. (1991). Rater selection for psychotherapy process research: An evaluation of the state of the art. Psychotherapy Research, 1, 113-123.
O'Malley, S.S., Suh, C.S., & Strupp, H.H. (1983). The Vanderbilt Psychotherapy Process Scale: A report on the scale development and a process-outcome study. Journal of Consulting and Clinical Psychology, 51, 581-586.
Petty, R.E., & Cacioppo, J.T. (1986). Communication and persuasion: Central and peripheral routes to attitude change. New York: Springer-Verlag.
Russell, R., & Stiles, W. (1979). Categories for classifying language in psychotherapy. Psychological Bulletin, 86, 406-419.
Stiles, W.B. (1978). Manual for a taxonomy of verbal response modes. Chapel Hill: Institute for Research in Social Science, University of North Carolina at Chapel Hill.
Stiles, W.B. (1992). Describing talk: A taxonomy of verbal response modes. London: Sage Publications.
Storr, A. (1990). The art of psychotherapy (2nd ed.). New York: Routledge.
Strupp, H.H. (1957). A multidimensional system for analyzing psychotherapeutic techniques. Psychiatry, 20, 293-306.
Suh, C.S., O'Malley, S.S., Strupp, H.H., & Johnson, M.E. (1989). The Vanderbilt Psychotherapy Process Scale (VPPS). Journal of Cognitive Psychotherapy: An International Quarterly, 3, 123-154.
Windholz, M.J., & Silberschatz, G. (1988). The Vanderbilt Psychotherapy Process Scale: A replication with adult outpatients. Journal of Consulting and Clinical Psychology, 56, 56-60.
Last reviewed: By John M. Grohol, Psy.D. on 15 Sep 2002
Published on PsychCentral.com. All rights reserved.