The ARF Copy Research Validity Project, which was completed in 1990, had its roots in a speech made by Ted Dunn at the ARF Annual Conference in 1977. Addressing the issue of the validity of copy testing, he hypothesized that the answer existed in the archives of copy-testing experience but that, because it took so long to accumulate large enough databases to permit generalizations and because each copy researcher saw only a small piece of the total, we might never learn the "truth." He called for the formation of a committee to survey advertisers about the amount of validation work that had been done and their willingness to submit their copy-testing files to the ARF, given strong guarantees of confidentiality.
Accordingly, such a committee was formed and, in accordance with time-honored principles, Ted Dunn was named chairman.
A three-year effort devoted to finding available cases bore little fruit. Cases were either unavailable due to confidentiality restrictions, skewed toward certain companies and brands, or unable to discriminate based on sales.
At this point the committee was convinced that the answers everyone was seeking were not in the files, at least not in the available files. But it had become even more convinced that the issue of copy-testing validity was an important one, that finding out more about it would be a service to the industry, and that the ARF was the logical organization to design and oversee research in that area.
The next question was how best to approach something then being called the "Forward Experiment" to differentiate it from any approach that involved the collection of past experience. The committee quickly discovered that a complete experiment covering all aspects of copy research would involve some 6,480 design cells and would cost in the neighborhood of $2 billion. So a design committee was appointed to develop an affordable compromise.
In June 1982, the design committee came up with preliminary design specifications reflecting various combinations of on-air versus off-air, recruited versus self-selected audiences, in-program versus naked contexts, and single versus multiple exposures. They called for split cable (no matched markets), a minimum test length of six months, and a target of 10 pairs of commercials that were producing significant sales differences. In November 1983, a copy research workshop audience was polled on the kinds of copy-testing contrasts that were of most interest to them. In rank order those were:
* Recall vs. Persuasion
* On-Air vs. Off-Air
* Recruited vs. Self-selected Audiences
* Single vs. Multiple Exposures
* In-program vs. Naked Testing Environments
* Pre/post vs. Post-only Designs
Judging from the number of votes, the first two of these topics were clearly considered the most important.
A funding drive was then launched involving a six-cell design (three off-air and three on-air) and a financial target of approximately $750,000. However, by November 1984, it was apparent that the drive would fall short and that design changes would be required if the project were to be brought to fruition. Fortunately, commercial copy-testing firms stepped in and assumed the responsibility for conducting, at their own expense, the on-air tests. They were assured that their identities would be carefully protected regardless of the outcome of the testing and, in turn, agreed not to promote the results of the tests no matter how well their service fared.
The design committee also developed a cost-efficient alternative. Rather than simply putting pairs of commercials that had been copy tested into very expensive sales tests, it recommended that the process be reversed. In other words, the sales-test results would be examined first, and only those pairs that were producing significantly different sales results would be copy tested. This avoided situations in which copy tests showed differences but sales tests did not. It was felt that, while that kind of situation is more likely to occur, it offers less opportunity for learning. After considerable discussion, it was also decided that to expand the design to include this circumstance (i.e., where no sales difference occurred) would be of less interest to the community and be beyond the financial resources available for the study. So the objective of the final design became "to see how well the types of copy tests in common use can identify known sales winners from pairs of commercials for specified brands."
The committee then began to look for pairs of commercials that were producing significant sales differences with a target of 10 such pairs. Once an eligible pair was located, those commercials were quickly copy tested in markets that were close geographically to the markets from which the sales information had come but at the same time far enough away so that the people in those markets had not been exposed to the commercials.
Split-cable research companies, of course, could not divulge the names of client companies that were running copy tests, so the committee contacted clients directly. For a year and a half, the research directors of companies thought to be using split-cable copy testing were contacted regularly to determine whether they had an eligible pair of commercials and, if so, whether they would be willing to release that pair to the ARF for its own copy testing.
As mentioned above, the goal had been to find 10 eligible pairs. However, it was quickly found that the conditions placed on eligibility made pairs of commercials hard to locate. To be eligible, the commercials in a pair had to be previously unaired, come from an advertiser making minimal use of print in the test markets, be the only commercials in use during the test, and have at least six months' (and preferably a year's) sales data available. While it was apparent that strict adherence to these criteria would substantially lengthen the period needed to obtain pairs, it was decided that the ideal of relatively clean tests was worth the wait.
By 1989 five pairs had been obtained and tested. Some had been lost to fast rollouts and some were rejected because the sales differences, while statistically significant, were only one or two percentage points. It was getting progressively more difficult to locate eligible pairs and funds for testing purposes were running low. So it was decided that results would be reported for the available cases with the proviso that, if additional cases turned up and industry interest levels were high enough to indicate that additional funding would be available, the project might be extended.
A top-line report on findings was made at the ARF Annual Conference in April 1990 and a final report at the ARF Copy Research Workshop in July 1990.
Objectives and Research Designs
The objectives of the research that emerged from this process can be summarized as follows:
* How well do copy tests, as presently conducted, identify known sales "winners"?
* Which individual measures do the best job?
* Which general types of measures are most predictive?
* Are on-air designs preferable to off-air?
* Are pre/post designs preferable to post-only (matched group) designs?
* Are multiple-exposure designs better than single-exposure designs?
* Is any one copy-testing system superior to the others?
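The first of these objectives can be illustrated with a simple tally: for a given measure, count the pairs in which the commercial known to be the sales winner also received the higher copy-test score. The sketch below is only an illustration of that idea. The function name and the scores are hypothetical, the numbers are made up rather than taken from the study, and the article does not say that results were scored in exactly this way.

```python
def hit_rate(pair_scores):
    """Share of pairs in which the known sales winner got the higher score.

    pair_scores: list of (winner_score, loser_score) tuples, one per pair.
    A tie counts as a miss, since the measure failed to pick the winner.
    """
    hits = sum(1 for winner, loser in pair_scores if winner > loser)
    return hits / len(pair_scores)


# Hypothetical scores for one measure across five pairs (NOT study data).
example_scores = [(32, 25), (18, 21), (40, 33), (27, 27), (15, 9)]

rate = hit_rate(example_scores)
print(f"Winner identified in {rate:.0%} of pairs")
```

With only five pairs, a tally like this moves in steps of 20 percentage points, which is one reason small differences between measures would need cautious reading.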
The final research design involved six copy-testing methods, three of which were off-air and three on-air. Ten commercials (five brand pairs) were tested via each of the six copy-testing methods; for each pair it was known that one of the commercials was producing significantly more sales than the other (by margins from +8 percent to +41 percent). So the study design consisted of 30 cells and looked as shown in Figure 1.
Between 400 and 500 people were interviewed in each cell, thus totaling between 12,000 and 15,000 interviews. All respondents were members of the target groups as defined by the brands being advertised. In most cases these were simply category users.
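The arithmetic behind these totals can be checked with a short sketch that crosses the five brand pairs with the six copy-testing methods. The pair and method labels are placeholders invented for the illustration; only the counts and the per-cell sample range come from the text.

```python
from itertools import product

# Placeholder labels (assumed); the article names neither brands nor systems.
pairs = [f"pair_{i}" for i in range(1, 6)]              # five brand pairs
methods = ([f"off_air_{i}" for i in range(1, 4)]        # three off-air methods
           + [f"on_air_{i}" for i in range(1, 4)])      # three on-air methods

# One design cell per (brand pair, method) combination.
cells = list(product(pairs, methods))
n_cells = len(cells)

# Between 400 and 500 respondents were interviewed in each cell.
min_per_cell, max_per_cell = 400, 500
interview_range = (n_cells * min_per_cell, n_cells * max_per_cell)

print(n_cells, interview_range)
```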
The products advertised were all packaged goods in the food and health-and-beauty-aids categories. The brand names were all established brand names, and for each of them television was the predominant form of advertising.
The copy-testing measures that were built into the questionnaires for the off-air cells fell into six general types: measures of persuasion, brand salience, recall, communications (playback), overall commercial reaction (liking), and commercial diagnostics.
Within each general type were a number of specific measures as follows:
Individual Measures
(1) Persuasion
* Brand Choice
* Constant Sum
* Purchase Interest (top box)
* Consideration Frame
* Overall Brand Rating (top box and average)
(2) Salience
* Top-of-mind Awareness
* Unaided Awareness
* Total Awareness (Unaided Plus Aided)
(3) Recall
* Recall Brand from Product Category Cue
* Recall from Brand Cue
* Claimed Recall from Full Set of Cues
* Related Recall from Product Category Cue
* Related Recall from Full Set of Cues
(4) Communication
* Main Point Communication
* Ad Situation/Visual
* Ad Positive
* Sales Point Communication
* Main Point Situation
(5) Commercial Reaction (liking)
* One of the Best Seen Lately (top box and average)
* Like/dislike (top box and average)
(6) Commercial Reaction (diagnostics)
Positive
* I learned a lot from this advertising.
* Tells me a lot about how product works.
* Helps me to find the product that I want.
* Told me something new about the product that I didn't know before.
* This advertising is funny or clever.
* I find this advertising artistic.
* This advertising is enjoyable.
Negative (Items reversed)
* The product doesn't perform as well as this ad claims.
* This ad doesn't give any facts, just creates an image.
* This advertising insults the intelligence of the average consumer.
* This advertising is boring.
* This advertising is in poor taste.
Detailed descriptions of the on-air cells are not available for reasons of maintaining confidentiality. The research firms involved used their normal procedures. The three off-air systems were a pre/post system, a post-only system, and a multiple-exposure (re-exposure) system, as shown in Figure 2. Respondents for the off-air cells were recruited in shopping centers and exposed individually to a 10-minute program that contained a single three-commercial pod, one of which was the commercial of interest with the remaining two included to provide noncompetitive clutter. All commercials were 30 seconds in length. Respondents were recruited to watch the program (not the commercials). Interviewing took …