Guest Blog – Jason R. Baron
“Quality…you know what it is, yet you don’t know what it is. But that’s self-contradictory. But some things are better than others, that is, they have more quality. But when you try to say what the quality is, apart from the things that have it, it all goes poof! … Obviously some things are better than others … but what’s the “betterness”? … So round and round you go, spinning mental wheels and nowhere finding anyplace to get traction. What the hell is Quality? What is it?”
- Robert Pirsig, Zen and the Art of Motorcycle Maintenance
Ralph Losey has once again graciously given over his column to me – and so this week I wish to use this platform to share a few personal, informal thoughts on the “hardness” of the problem of e-discovery search. I also wish to shamelessly “hawk” the June 6, 2011, DESI IV workshop in Pittsburgh — intended to be a high-level exploration of what should or could be future e-discovery standards governing the search for electronic evidence. Even if you choose to skip reading the rest of this blog, please note: all devotees of Ralph’s column are warmly welcomed to submit papers to DESI IV.
Robert Pirsig spent a good deal of time on his motorcycle in the 1970s contemplating the metaphysics of quality. In my own way, I’ve been on a similar quixotic mission for at least the past eight years — in search of “quality” in the e-discovery search space. This particular quest for the Holy Grail has involved seeking out the perfect search where one finds “just” highly relevant documents in response to a FRCP 34 document request, or, as a matter of early case assessment, “just” the hot documents one needs to win the case. I’ve searched the world over for answers, and along the way decided that I had been asking the wrong question.
At one time, I thought I knew what the problem was, and what the information retrieval “task” should be to overcome the problem. The problem, I thought, was simply the naïve use of keywords. Or at least, the way lawyers naively think about keywords when going about the task of searching for electronic evidence. I think many lawyers still practice with the assumption that using simple keywords, without more, to find responsive ESI is sufficient to get them through the day in dealing with their e-discovery obligations. While this remains a problem, it is not in my view the problem. And the task is not simply to try to “beat Boolean” with other search methods.
When I started thinking about this back in the Dark Ages (no, not those Dark Ages, I mean before the 2006 Rules changes), I was convinced that there “must” be a better method out there that reliably beat the kind of keyword and Boolean searches that lawyers used. My views were formed after being tasked to search through a “mere” 20 million presidential emails from the Clinton era in search of “tobacco”-related documents of relevance in US v. Philip Morris (still-active litigation pending in federal district court in D.C. on remand from the Supreme Court). After dreaming up an arbitrary set of keywords, and running a search that initially produced 200,000 hits but ultimately yielded 100,000 relevant documents after a six-month document review exercise conducted by 25 archivists and lawyers, I understood that the profession was facing a crisis point. Simply put, the exponentially increasing volume of information would soon render impossible the task of manually sifting through 1% of unimaginably large volumes of electronically stored information (ESI). Not just costly, resource-intensive and woefully inefficient, but simply impossible, given real-world constraints.
Knowing that there was no way that we could “absorb the hit” of manually searching through 1% of a projected cumulative total of 1 billion White House emails by 2017, I started on my journey of e-discovery, in search of better searches. I joined The Sedona Conference® in 2003, to hobnob with leading thinkers in the legal profession, especially in the Working Group on Electronic Document Retention and Production. Although no one professed to have “the answer,” there were many like-minded folks who found the question interesting enough to spend time pursuing. The Sedona Conference Best Practices Commentary on Search and Information Retrieval (2007) took on the issue of limitations in keyword searches, based on the problem of language, and for the first time proposed alternative search methods for the profession to consider, based on thesauri and mathematical/statistical means. These efforts helped frame the issues and led to citations in leading opinions from the e-discovery bench, including Judge Facciola writing in Disability Rights and US v O’Keefe, and Judge Grimm’s initial decision in Victor Stanley v Creative Pipe (a/k/a Victor Stanley I).
At some point in my journey, I sidled up to academics with fancy PhDs, going to various sorts of information retrieval conferences, such as CIKM and especially TREC. I liked hanging out with these types so much and found the experience so valuable that in 2006 I founded the TREC Legal Track, along with Doug Oard and Dave Lewis (more on TREC in a moment). I also decided that lawyers and academic types should hang out more together, to mutually benefit from application of esoteric information retrieval theories to the rough and tumble practice of e-discovery “in the trenches.” And so I created, along with Doug Oard, the DESI (Discovery of ESI) workshop series. DESI IV follows three successful workshops, in Palo Alto (DESI I), London (DESI II) and Barcelona (DESI III), the first and third of which were co-located with conferences put on by the International Association for Artificial Intelligence and Law, and the second sponsored by University College London. (Ralph has kindly allowed me to report from London and Barcelona in prior guest blogs. See Losey, R., Electronic Discovery: New Ideas, Trends, Case Law, and Practices (West Thomson Reuters, 2010).) These gatherings have brought together, for the first time, a wide array of e-discovery practitioners and members of various research communities. The “best of the DESI” workshop papers have been expanded into full peer-reviewed articles in a special e-discovery issue of the Artificial Intelligence and Law journal.
Two fundamental obstacles stand in the way of achieving “the perfect search”: the intractability of language, and the problem of expanding ESI volume. George Paul and I wrote a law review article, Information Inflation: Can the Legal System Adapt?, highlighting these difficulties, and made an initial stab at suggesting coping strategies for the profession, including the use of alternative search methods, sampling, and cooperation, including staged negotiations amongst parties (i.e., multiple meet and confers), all of which I am happy to report have come to the fore over the last few years. (A sidenote: for an amusing excursion into the mysteries of language, including whether there is such a thing as a “grammatical imperative” when it comes to the meaning of adjectival forms of nouns, check out the Supreme Court’s March 1, 2011 decision in FCC v. AT&T. The decision involved whether the term “personal privacy,” as used in Exemption 7 of the Freedom of Information Act, could be invoked by AT&T to shield documents from public disclosure. Disagreeing with the Third Circuit, which had held that the term “personal” was equivalent to “person” and that corporations are well understood to be “persons” as a matter of corporate law, Chief Justice Roberts, on behalf of a unanimous Court, took occasion to note the “distinct meanings” of adjectives and nouns, such as “crab” and “crabby,” “corn” and “corny,” and “crank” and “cranky,” and held that the construction of statutory language often turns on context. In ruling that AT&T did not deserve protection under the FOIA, the Court ended its opinion by saying “We trust that AT&T will not take it personally.”)
Context is, indeed, everything. As language is infinitely malleable, none of us mere mortals can reliably account for all possible word-choices that exist in deep repositories of ESI that make documents relevant (or that render documents seemingly relevant when they are not), without necessarily relying on more powerful stratagems in the form of concept search, predictive analytics, artificial intelligence, and the like. I address these new forms of search, and the efficiencies that practitioners are gaining with them, in my own forthcoming piece, titled “Law in the Age of Exabytes,” in the Spring 2011 e-Discovery issue of the Richmond Journal of Law and Technology.
And volume. I’ve had fun along the way thinking about the problem of volumes of data. I created that infamous movie with Ralph, “e-Discovery: Did you Know?,” with the loud electro-trance music, which Greg Bufithis of The Posse List is working on translating into a variety of foreign languages soon. Prof. Richard Esenberg of Marquette Law School, at a recent Federalist Society (!) meeting on changing the federal rules of civil procedure, called our video a notable instance of the genre of “Electronic Gothic”: scary images of the future of lawyering amidst an exploding universe of ESI. (See his remarks on YouTube, at 27 minutes in.) But I digress.
The TREC Legal Track has been a giant step forward in advancing our knowledge of what constitutes “quality” in e-discovery searches. The mission of the TREC Legal Track has been to serve as a research platform for “evaluating” the efficacy of search methods applied in a legal context, using two large data sets (first the 7 million OCR documents in a tobacco litigation collection, and in the last couple of years the Enron data set of emails and attachments). The Legal Track is now in its 5th year, with the latest year’s results from the running of the track in 2010 to be reported sometime soon. (Results can be found in Overview papers on the TREC Legal Track web page, and individual participant papers are reported on the NIST TREC website by year, starting in 2006.)
In the early years of the track, the paradoxical results coming out of the research were that Boolean systems only found on the order of 22% or so of relevant documents, with 78% found by the use of other alternative search methods, but that no single alternative fully automated search method reliably beat a well-formed Boolean search as a matter of one-on-one competition. More recently, what has emerged out of the research is that we can in fact do a much better job finding relevant documents if we employ iterative processes with human-in-the-loop experts serving as topic authorities — in other words, a form of hybrid approach that relies neither on brute force manual search nor fancy computer algorithms alone. In an article to appear in the above-mentioned Spring 2011 e-discovery issue of the Richmond Journal of Law and Technology, Maura Grossman and Gordon Cormack present tantalizing findings derived from the 2009 running of the Track’s “interactive task,” in which participating teams could use any combination of search methods including keyword searches, machine learning, and/or human review. The article supports the use of technology-assisted review and places one more nail in the coffin where the “myth of manual review being the gold standard for the legal profession” resides (or should).
How does one benchmark or evaluate whether you’re doing well in a particular search task? If the two measures you use to judge quality are recall (the percentage of all the relevant documents in the collection that your search actually found) and precision (the percentage of the documents your search returned that are actually relevant, as opposed to ‘false positives’), then it turns out that the picture is somewhat muddy out there. Take a look at data from the 2008 and 2009 runnings of the TREC Legal Track from a variety of participants, in both the academic and commercial e-discovery sectors:
Some participants did very well in both recall and precision, some only in recall, some only in precision, and some in neither. One can’t even fit a line to this data easily; it’s simply all over the place! This simple illustration tells me that it is still the Wild Wild West out there, with the definite possibility that we as lawyers, who generally start off not all that well informed about the quality of the search algorithm being employed by a legal service provider, have a lot of questions to ask. Outside a particular legal setting, the larger academic question of interest devolves to: what kind of “process” has been employed so as to achieve optimum results in a given legal context? Or, to put it more bluntly: how do we get to the top right of the above chart, or as far as possible in that direction?
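To make the two measures concrete, here is a minimal sketch in Python. The function names and the toy document IDs are hypothetical, made up purely for illustration. The only real figures are the ones from the presidential email search described earlier (200,000 hits, 100,000 of them relevant). Those numbers pin down precision but say nothing about recall, because nobody knows how many relevant emails the keywords missed.

```python
def precision(retrieved, relevant):
    """Share of the retrieved documents that are actually relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Share of all relevant documents that the search managed to retrieve."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Toy collection with hypothetical document IDs (not real data).
relevant_docs = {1, 2, 3, 4, 5, 6, 7, 8}   # what a perfect search would find
retrieved_docs = {1, 2, 3, 4, 9, 10}       # what our search actually returned

p = precision(retrieved_docs, relevant_docs)  # 4 of the 6 hits are relevant
r = recall(retrieved_docs, relevant_docs)     # 4 of the 8 relevant docs were found
print(f"precision = {p:.2f}, recall = {r:.2f}")

# Figures from the presidential email search: 200,000 hits, 100,000 relevant.
# Precision follows directly; recall cannot be computed from these numbers alone.
email_search_precision = 100_000 / 200_000
print(f"email search precision = {email_search_precision:.2f}")
```

The "top right" of a recall/precision chart corresponds to both functions returning values close to 1.0 at the same time, which is exactly what is hard to achieve in practice.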
I still consider myself a “seeker” in the area of designing better searches, but my views continue to evolve as to the difficulty of the problem presented and what constitutes a good question to work on asking. As I said, the wrong question to ask is whether a given method “beats Boolean.” The right questions: how does one go about designing an optimal process that produces a quality result? And are there ways to regularize or standardize that process so as to “certify” the result in a way that is defensible?
A partial step forward is the growing cottage industry of cases and commentaries discussing what constitutes good search negotiations and protocols. See, e.g., The Sedona Commentary on Achieving Quality in E-Discovery. There is an emerging consensus that quality consists of sound project management of the process, and using an array of statistical and industrial techniques grounded in quality assurance and quality control, including testing and sampling of results, so as to ensure defensibility. However, there is no widely agreed-upon set of standards or best practices for how to conduct a reasonable e-discovery search for relevant evidence. The Sedona Conference and others have, however, called out to industry and academia to assist in the further development of standards of what constitutes “best practices” in the area of performing searches of electronic evidence.
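As one illustration of what “testing and sampling of results” can look like, the sketch below draws a random sample from the documents a search did not retrieve, treats a reviewer’s coding of that sample as given, and estimates what share of the discarded pile is relevant. Everything here is an assumption for illustration: the pile size, the 2% rate, the sample size, and the function name are invented, and the simple normal-approximation interval is only one of several ways to express uncertainty (a statistician might prefer an exact method, especially when the rate is very low).

```python
import math
import random


def sample_proportion_ci(labels, z=1.96):
    """Estimate a proportion from a coded sample.

    labels: list of booleans, True where the reviewer coded the sampled
    document as relevant. Returns (estimate, lower bound, upper bound)
    using a normal approximation (z=1.96 gives a nominal 95% interval).
    """
    n = len(labels)
    if n == 0:
        raise ValueError("sample must not be empty")
    p_hat = sum(labels) / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical discard pile: 50,000 unretrieved documents, of which 1,000
# (2%) happen to be relevant. In real life this rate is the unknown.
discard_pile = [i < 1_000 for i in range(50_000)]

rng = random.Random(42)                    # fixed seed so the run is repeatable
sample = rng.sample(discard_pile, 1_500)   # documents sent to a human reviewer

estimate, low, high = sample_proportion_ci(sample)
print(f"estimated share of relevant documents left behind: {estimate:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The point of the exercise is defensibility: reviewing 1,500 documents gives a documented, repeatable estimate of what was missed, rather than an assertion that the search “worked.”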
In initiating a discussion about standards for what constitutes a “quality” process in e-discovery search, the DESI IV workshop aims to bring together academia and industry in the development of standards in this area. A recent article that I co-authored, “Evaluation of Information Retrieval in E-Discovery,” in the journal Artificial Intelligence and Law’s special issue on E-Discovery suggested that:
One broad class of approaches that has gained currency in recent years … is known as ‘process quality.’ Essentially … the important thing is that we agree on how each performer of E-discovery services should design measures to gain insight into the quality of the results achieved by their particular process. The design of their process, and of their specific measures, is up to each performer. Of course, some performers might benefit from economies of scale by adopting measures designed by others, but because the measures must fit the process and because process innovation should be not just accommodated but encouraged, forced convergence on specific measures can be counterproductive. So process quality approaches seek to certify the way in which the measurement process is performed rather than what specifically is measured.
The DESI IV workshop is intended to provide a platform for discussion of an open standard governing the elements of a state-of-the-art search for electronic evidence in the context of civil discovery. The dialog at the workshop might take several forms, ranging from a straightforward discussion of how to measure and improve upon the “quality” of existing search processes; to discussing the creation of a nationally or internationally recognized standard on what constitutes a “quality process” when undertaking e-discovery searches; to a more ambitious discussion concerning creation of a type of standards “authority” or certification entity that could certify compliance with whatever standard emerges. Issues to be considered include the potential benefits of a standard (e.g., reducing the need for evidentiary hearings on the reasonableness of a process), versus its potential costs (e.g., the risk of inhibiting innovation through early convergence on rigid standards), and timelines.
For more details on how the workshop will be constructed, see the DESI IV workshop homepage. The day will be organized in four parts:
- There will be an initial overview of recent developments in the area of e-discovery search, including a presentation of recent case law, the needs of the legal profession in the area of reducing cost through better use of automated methods, and recent work on evaluation design.
- Then we will be holding brief presentations selected to illustrate the diversity of present e-discovery search processes (ranging from fully manual, to machine assisted, to fully automatic) and a second set of brief presentations to illustrate the diversity of present standard-setting efforts in the area of “quality” (e.g., ISO 9000 family of international quality management systems standards, and Capability Maturity Model Integration (CMMI) in software engineering).
- After that, we will encourage breakout sessions to engage the workshop participants in brainstorming with respect to what process quality standards for e-discovery search might entail. Each breakout group will be asked to initially look at the problem from the perspective of some specific process.
- Finally, there will be a facilitated session with panelists representing the various breakout groups discussing the ideas that have emerged, and further discussion with all participants. The panel will conclude by focusing on recommendations for further work, building from two questions:
● What research questions should be explored, so as to contribute to the development of process quality standards in the e-discovery search area?
● Who, beyond those already in the room, do we need to engage with to address the issues that we have identified?
We will be inviting both e-discovery stakeholders and practitioners from the law, government, and industry, along with researchers on process quality, information retrieval, human language technology, human-computer interaction, artificial intelligence, and other fields connected with e-discovery.
To help craft the program, we encourage the submission of research papers and position papers on emerging best practices in e-discovery search as well as papers discussing the efficacy of standards setting in this area. Accepted position papers and accepted research papers will be made available on the Workshop’s Web page and distributed to participants on the day of the event, and some speakers may be selected from among those submitting position papers. See the Call for Submissions for submission details.
There may well be many good ideas that we can draw on that have been worked out in the context of existing standards-setting processes in other fields.
I am hopeful that what comes out of the workshop will be an emerging set of best practice guidelines, if not standards, that might be susceptible to codification, either by a certification entity or by participants themselves. My dream: future litigation in which proactive judges working with sophisticated litigants can ensure that parties do not engage in resource-intensive evidentiary disputes about search protocols and methods. Rather, parties will be able to reach agreement over a search protocol that measures up to some known quality standard.
In fact, I am so enthusiastic about DESI IV that I have gone ahead and, with Doug Oard, Dave Lewis, and Maura Grossman, organized a second SIRE workshop in Beijing as part of SIGIR 2011. If you can’t make it to the mystical, exotic city of Pittsburgh in June, perhaps you can join us in China in July!