Consumer/Patient Decision-support programs: Should they be regulated

Humphrey Bldg, Washington, DC, 4/17/97
Remarks during a panel on the question:
Joseph V. Henderson, MD
Interactive Media Laboratory
Dartmouth Medical School
Hanover, NH 03755

In this last segment of the afternoon we’re asked to consider the question of whether decision-support systems for patients (or consumers) should be regulated. Let me begin by focusing the discussion on health decisions that are significant, that is, where there are significant concerns, in the long or short term, about impact on quality and quantity of life, and where an individual, or those he or she would care for, may reasonably choose among possible interventions. We’re concerned with the technology-based support of such choices, in which information, education, and decision-support services are delivered via new media. (I define new media as interactive access to information, ideas, and experiences, using technologies that combine computers, electronic media, and communication networks.)

Regulation, if enacted, must be applied at the level of program development, so let me next characterize the potential "regulees," the developers. They are clearly the vital link between the consumer and various sources of health-related information. But they’re a diverse group. They range from small, one-person concerns to large, multi-party efforts. They may be motivated by passion about a topic or cause, by potential for economic profit, or both. They may be communicating with text or brochure-like presentations, or with more complex and interactive uses of multiple media. They may be very experienced and knowledgeable about the topics covered, about how to apply technology, about how to communicate, or not. They may have enough time and an adequate budget to do an excellent job in developing their information services, or they may not. And they may be equipped to perform adequate evaluations of their products, and have a will to do so… or they may not.

Now let me touch on the use of new media to educate, inform, and, in some cases, persuade.

The craft of creating engaging, high-impact presentations has been honed to a very fine edge. Sound, video, and computer animation can make for very memorable experiences. We now have a mature art and method of recording, or creating, human stories that can, in the best cases, convey meaning and insight, perhaps even engender wisdom. Combined with sound design and married to computers, the art and technology of media can provide interactive learning and decision support that is powerful and appropriate, that combines the best that education and media have to offer, to help make complex concepts clear and to bring the complexities of decision-making under difficult circumstances under some control by the individual who would choose.

On the other hand, these powerful media tools have often been used inappropriately to persuade rather than to educate; subtle or blatant, biased information and propaganda of various shades — commercial, political, or for the "public good" — are widely used to shape public and private decision-making. Health care, subject to powerful economic forces and vested interests, is no stranger to these kinds of bias. Consumer health information services are likely to become a lucrative market, and network capacity will initially create a programming vacuum that many commercial interests will seek to exploit. And with interactive technologies the presentation can be even more powerful. Great care must be taken to assure that information provided to decision makers, be they clinicians or patients, is as free of content and framing bias as possible; it's unlikely that this will happen without vigilance on the part of public-interest agencies.

Development must include some evidence that what is imparted, and how it’s imparted, is at least doing no harm and, in what we would hope for the great majority of cases, is doing good. That means some type of evaluation must be part of the development process. However, even when circumstances are optimal, with well-equipped and well-intentioned developers, problems remain. There is the balancing of excellence in content, approach, and execution against the constraints of tight budgets and schedules. This is where the realities of the marketplace, capitalism, shrinking public funding, and evaluation itself intrude. A "scientific," rigorous evaluation, even when the topic, audience, and setting lend themselves to such studies, is expensive and time-consuming, in some cases exceeding the costs and timelines of initial system development. (I’ll return to the topic of evaluation later, to distinguish evaluation in this domain from others, such as drug trials.) If funded as part of a commercial venture, an excessively demanding development process involving evaluation and revision may end in the failure of the product: delays in getting to market may cost competitive advantage, and obsolete information may cause the product to lose value or become a liability. This would be true even if the product were outstanding and could provide a valuable and useful service.

Those considering the development of consumer health information systems must strike a good balance between the essential assurance of quality and the practical constraints of this very complex domain. And policy makers, and health care providers who might purchase and apply these products, must do the same. Proposals to assure excellence, via a process of evaluation and, some have proposed, certification or approval, must carefully consider these and other realities. We should keep clearly in mind a goal to promote the development of useful products that serve the public well. This requires a development environment that assures sufficient quality, while allowing the timely and profitable delivery of these products and services.

Let me give some examples to give you some sense of how complex the development, and attendant evaluation, of a "full-up" decision-support system is. I’ll frame these in terms of some principles that I think good decision-support systems for consumers should adhere to.

Accuracy of content: It’s axiomatic that if information is to be valuable, it must be accurate for the individual concerned. Of key concern is having appropriate expertise available. Even when content is based on good data and peer-reviewed literature, the development team must know how to judge its appropriateness and accuracy. In the best of worlds, the team has a resident expert (ie, full-time and committed); ideally, that individual is trained generally in health care and is experienced in technology-based learning and decision support. If an appropriate expert isn’t available in-house, the team must recruit one. When recruitment is required, as is usually the case, problems are common. The development team often discovers, too late into a project, that the schedule is increasingly slowed by the experts’ clinical duties… and by their waning commitment to the project. Even with committed experts, some fields (eg, HIV or, to a lesser extent, breast cancer) change so rapidly that the development process must either be truncated or go on forever. In fact, development, via a process of scheduled revisions and updates, MUST be a continuing process; funding and development must allow for timely updates, or the product can rapidly lose its value and become a liability. The key to accuracy of content is good staffing, committed for the life of the project (initial development, revision, and maintenance phases); it can be a very long haul, with many falling by the wayside. And, of course, there must be good and timely communication within the team.

There are also instances where the lay public and peers provide information. These can be extremely insightful, valuable sources of information. It may be that such sources, operating as autonomous, individual efforts driven by interest and passion, should not fall under the rubric "developer." However, if made part of a more fully implemented system, as part of a larger development effort, these sources should be subject to the same concerns about accuracy as the other components of the system. Our concern is that the integrity of the whole will suffer if any component is not held accountable for accuracy of content.

Evaluation of content is difficult. Meta-analyses of the literature pose methodological problems. Consensus among experts is difficult to achieve. And even when the content is laid flat (in a content document, as opposed to buried in the branching logic of a program), it’s difficult to get sufficiently careful attention from busy, often distracted content experts. Planning for delays at this stage is axiomatic. However, careful design of the review process, guidance that focuses experts’ attention and minimizes frustration over trivia, and informed consent when recruiting experts are all techniques that can help.

Another example is appropriateness of content: Information and its presentation should be appropriate to the characteristics and needs of the consumer. I’ll focus on two aspects: accuracy and understandability.

As discussed previously, a potential advantage of new media is the ability to tailor information to the individual. This raises issues beyond the overall accuracy of content. Tailoring requires appropriate granularity of information, ie, categories of content that adequately reflect the combinations of health situations that can exist among consumers; this is an extension of our previous concerns about accuracy of content, and it necessarily involves expertise in both content and data management. It requires the gathering of consumer-specific information, and this can be extremely difficult, particularly when the information is elicited directly from a consumer or lay surrogate. Interface and communication/understandability issues abound; this is clearly a vital, demanding area for research, let alone practice. Careful evaluation of this aspect is absolutely essential when dealing with any significant health concern. Having gathered data from the consumer, the system must then map that input accurately into the various categories of content that appear appropriate. While this may be relatively easy to do on paper, mapping logic is very difficult to implement. The mapping model (categorical, mathematical, actuarial, etc.) must be appropriate, and this often requires special expertise and research. Programmers will generally not be able to see the connections between their code and the information it presents, and it is difficult for them to know where to place the greatest care while developing and testing their code. Since the permutations for mapping are usually very large, it is very difficult, sometimes impossible, to exercise and test the system adequately. There is great potential for inappropriate, in some situations dangerous, mistakes in architecture, mapping logic, or coding, mistakes that go undetected until they do attributable harm. We must develop and implement generic evaluation processes to assure the accuracy of information tailoring.
This is clearly an area for policy to provide guidance and incentives. This committee will formulate recommendations to policy-makers in this regard. We will conduct a survey of leading developers and compile a list of "best practices" regarding this aspect of development.
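The mapping and testing problem can be made concrete with a toy sketch. The attributes, values, and selection rules below are entirely hypothetical, invented for illustration and not drawn from any real system; the point is only that even three attributes with three values each yield 27 distinct consumer profiles to exercise, and every added dimension multiplies that count.

```python
from itertools import product

# Hypothetical consumer attributes and their possible values. Real systems
# have many more dimensions, so the number of combinations grows
# multiplicatively and exhaustive testing quickly becomes impossible.
ATTRIBUTES = {
    "age_group": ["under_40", "40_to_64", "65_plus"],
    "stage": ["early", "locally_advanced", "metastatic"],
    "comorbidity": ["none", "cardiac", "renal"],
}

def select_content(profile):
    """Map a consumer profile to a content category (illustrative rules only)."""
    if profile["stage"] == "metastatic":
        return "palliative_and_systemic_options"
    if profile["comorbidity"] != "none":
        return "treatment_with_comorbidity_cautions"
    return "standard_treatment_options"

# Enumerate and exercise every combination: feasible here (3 * 3 * 3 = 27
# profiles), which lets us verify that no profile falls through the mapping.
profiles = [dict(zip(ATTRIBUTES, values)) for values in product(*ATTRIBUTES.values())]
assert len(profiles) == 27
for p in profiles:
    assert select_content(p) is not None  # every profile must map somewhere
```

With five attributes of five values each the same enumeration would require 3,125 cases, and with ten attributes nearly ten million, which is why the text argues that mapping logic often cannot be adequately tested case by case and needs generic evaluation processes instead.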

Understandability is easy to describe but difficult to specify, develop, test, and implement. Various methods exist to establish the "reading level" of text presentations in terms of grade level (eg, 10th-grade level); these are widely available and, when used with an understanding of their strengths and weaknesses, they provide an automated way to characterize understandability. No such methods exist for media other than text, however. When we add images, motion video, and the spoken word, things change. Moving beyond text and its narrow bandwidth, communication can occur at many levels and change dramatically. Nuance, inflection, facial expression, and body attitude all convey meaning, sometimes at an unconscious level (see the discussion of bias, below). The addition of graphics, animations, sounds, and music can assist or impede communication (and there is a tendency to insert them without clear purpose other than having the capability, or to fulfill a marketing need for "multimedia"). Other methods must be used to establish understandability, combining formative evaluation (subjective response; easier to do) and summative evaluation (objective measures of learning, which are harder to do; examination of outcomes, which is MUCH harder and more expensive to do). Some evaluation in this regard must be conducted.
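One widely used example of the grade-level methods mentioned above is the Flesch-Kincaid grade-level formula, which estimates readability from sentence length and syllable density. The sketch below uses a crude vowel-group heuristic for syllable counting (my own simplification; production tools use pronunciation dictionaries and more careful rules), so it illustrates the approach rather than any particular product.

```python
import re

def count_syllables(word):
    """Crude syllable estimate: count vowel groups, trimming a silent final 'e'."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1  # rough correction for silent final 'e'
    return max(n, 1)

def fk_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

# Short, monosyllabic sentences score at or below early grade levels.
print(round(fk_grade("The cat sat on the mat."), 2))
```

This is exactly the kind of automated measure that exists for text and, as the passage notes, has no counterpart for video, narration, or imagery, where meaning rides on inflection, expression, and design rather than word and sentence length.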

There are many other principles, each marking traps into which development, and the final product, can be ensnared.

Bias: content bias vs. framing bias (what is said vs. how it’s said); intentional vs. unintentional.

Usability: an information service or product must be easy to use. The current preponderance of query- or browser-driven access to database-like information structures and interfaces may not be suitable for a majority of potential users. Media professionals add sizzle to the database approach, but so far the result has been a static presentation of brochure-like information, with the only (though very significant) advantage being rapid access.

Maintainability: to be of continuing value, the product or service must remain current. How is this to be done? Contrast typical short-term strategies (one-shot sponsored research; the next-quarter profits of a business approach) with the need: medical knowledge and choices EVOLVE, and the decision-support product must evolve with them. How is this funded? Under what models?

As a last topic, let me return to evaluation methodology. I assert that, in the great majority of cases, it is unwise, and probably pointless, to view the evaluation of technology-based consumer health information programs as a scientific process in the sense that drug trials are. There are, in most cases, no hard endpoints or outcomes, only feelings and opinions. These are valuable, but, as Neil Postman points out in his book Technopoly, application of the scientific method to matters of opinion is inappropriate; it’s pseudoscience. There are other methods, adopted by market researchers, anthropologists, educators, and psychologists, that do allow for opinion gathering and formative evaluation; summative evaluation in terms of actual patient outcomes (opinions, attitudes, and behaviors) is also possible under certain circumstances (those circumstances, adequate time and budget, being rare).

My point is this. Regulation requires, it would seem obvious, clear criteria by which we approve a product or not. Clear results must be measured against those criteria, using methodologies that are generally accepted as producing good results. We can establish general criteria that can be measured: accuracy of content may be the easiest, but even this is difficult (what’s the yardstick? Western medicine? The city you live in? Research?). In some cases, outcomes research has made great strides in establishing research-based criteria for accuracy, but these are oases in the desert of a very diverse practice world. What criteria would we establish for evaluation methodologies? Evaluation of learning and decision-support technologies is a developing field, and evaluators resort to a grab-bag of approaches, most of which give useful but highly subjective information. And even if we can agree on methods, how do we establish criteria for effectiveness, or even acceptability, when we get into issues beyond accuracy: usability and bias, for example?

In the end, I think the answer to assuring quality in these decision-support systems is NOT regulation. There IS a role for policy, however, in assuring that well-intentioned, caring, careful developers have the tools they need to make evaluation part of the development process. That process has already started with the formation by HHS of the Scientific Panel on Information and Consumer Health. A result of their efforts will, we hope, be a set of practical guidelines to assist developers (and those who would apply and use their programs). As discussed previously, developers will greatly benefit from applied research to develop methods, in the form of procedures and software tools, for building quality assurance into the development process; policy-makers will want to consider carefully the application of funding and other incentives for such development. Finally, policy should fund and otherwise promote demonstration projects that yield generally useful knowledge about how to develop, evaluate, and deploy decision-support systems that address a major health problem and, simultaneously, the question of how best to use these technologies in the future. They’re coming; in some cases they’re here.