The Rand Corporation is a well-known and prestigious non-profit institution. Its stated charitable purpose is to improve policy and decision-making through research and analysis. It has recently turned its attention to electronic discovery. Rand concluded, as have I, and many others, that the primary problem in e-discovery is the high cost of document review. They found it constitutes 73% of the total cost of e-discovery. For that reason, Rand focused its first report on electronic discovery on this topic, with side comments on the issue of preservation. The study was written by Nicholas M. Pace and Laura Zakaras and is entitled Where The Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery. It can be downloaded for free, both a summary and the full report (131 pages). A nicely bound paper version can be purchased for a modest fee of $20.
The full report is actually much better than the summary, in no small part because it shows the degree of care they used, and the honest disclaimers they make concerning the research. The disclaimers are needed because the study was only based on input from eight corporations. Still, it is a well written report with excellent analysis. I suggest you make time to read the full report.
The Rand Corporation Confirms Our Own Analysis and Makes The Same Bold Recommendations
The report not only analyzes the problem, it recommends a solution. Basically it says what I have been saying for years now: be bold, and take forward-thinking action now to fight the high-cost problem head on. See Impactful, Fast, Bold, Open, Values: Guidance of the “Hacker Way.” As I have said before, the words lawyer and timid are not supposed to go together. Yet that is what we have here when it comes to the Bar’s use of advanced technologies, even when it is in the clients’ best interests. The Rand report recognizes the widespread timidity of many in the legal community, and makes the following recommendation at page 83, one that I strongly endorse:
To truly open the doors to more-efficient ways of conducting large-scale reviews in the face of ever-increasing volumes of digital information, litigants that have complained in the past about the high costs of e-discovery will have to take some very bold steps.
What action does the Rand study recommend as the core solution to the high costs of review? Again, it is the same mantra that almost everyone in the field of e-discovery has been repeating: fight the problems caused by technology (i.e., too much information) by the intelligent use of even more technology. By intelligent we mean using the technology as part of a valid legal methodology, one based on the law. Do not just use technology on its own, for its own sake. The technology has to be run by lawyers, not techs. Sorry, my tech friends, lawyers have to drive the CAR: computer assisted review.
The legal method I promote for CAR is called Bottom Line Driven Proportional Review. It is based on the well-established legal doctrine of proportionality. See, e.g., Good, Better, Best: a Tale of Three Proportionality Cases, Part One and Part Two. Of course, my way is not the only way for the CAR highway. There are many other valid legal methods to use advanced technologies, and many other reasonable applications in use by other respected attorneys in the field. The focus on budgeting, estimation, transparency, cooperation, and proportionality is just my particular method, one that I encourage others to follow.
The Rand report does more than just recommend the use of advanced technology; it actually endorses one particular type of technology, my friend Predictive Coding. That’s right, this prestigious, non-profit, independent group has reached the same conclusions that I have, and that many, many others have (in fact, you would be hard pressed to find any bona fide expert to argue against the idea of predictive coding). It is now official. Predictive coding is the best answer we have to the problem of the high costs of e-discovery. Of course, there will be good faith debates for years to come on the best methods to use this new technology, and in what cases it is appropriate. The Rand report discusses all of these considerations.
The most promising alternative available today for large-scale reviews is the use of predictive coding and other computerized categorization strategies that can rank electronic documents by the likelihood that they are relevant, responsive, or privileged. Eyes-on review is still required but only for a much smaller set of documents determined to be the most-likely candidates for production. Empirical research suggests that predictive coding is at least as accurate as humans in traditional large-scale review. Moreover, there is evidence that the number of hours of attorney time that would be required in a large-scale review could be reduced by as much as three-fourths, depending on the nature of the documents and other factors, which would make predictive coding one answer to the critical need of significantly reducing review costs. …
Despite the apparent promise of predictive coding and other computerized categorization techniques, however, the legal world has been reluctant to embrace the new technology. … the key reason is the absence of widespread judicial approval of the methodology, specifically regarding any acknowledgment of the adequacy of the results in actual cases or whether the process was a reasonable way to prevent inadvertent privilege waiver. Without clear signs from the bench that the use of computer-categorized review tools should be considered in the same light as eyes-on review or keyword searching, litigants involved in large-scale reviews are unlikely to employ the technologies on a routine basis. …
The use of computerized categorization techniques, such as predictive coding, will likely become the norm for large-scale reviews in the future, given the likelihood of increasing societal acceptance of artificial intelligence technologies that might have seemed like improbable science fiction only a few decades ago. The problem is that considerable sums of money are being spent unnecessarily today while attitudes slowly change over time. New court rules might move the process forward, but the best catalyst for more-widespread use of predictive coding would be well-publicized instances of successful implementation in cases in which the process has received close judicial scrutiny. It will be up to forward-thinking litigants to make that happen.
Again, I join the call to all forward-thinking litigants to, in the words of Star Trek, boldly go where no man has gone before. See, e.g., Predictive Coding Based Legal Methods for Search and Review; and, New Methods for Legal Search and Review. I am reminded once again of the words of a famous Indian lawyer turned saint: Be the change that you wish to see in the world. Mahatma Gandhi.
By the way, even though this report basically affirms my own analysis and blogs, I had absolutely no involvement in the research or preparation of this report. I am not sure I have even met Nicholas Pace and Laura Zakaras. But I note that two of the top experts in our field did help out the Rand newcomers, namely Thomas Y. Allman and Jason R. Baron. I am of course influenced by their many excellent writings, just as I will henceforth be influenced by the Rand report of Pace and Zakaras. That is how knowledge always advances in every field of law, technology, and science. As my readers well know, my opinions are an amalgamation of the thinking of all of the leaders in the field. Only a few of my thoughts are truly original. If I occasionally appear to be smart and far-seeing, it is only because I am standing on the shoulders of giants. It has always been so.
Rand Describes Predictive Coding
Pace and Zakaras not only recommend predictive coding, they venture deeply into the who, what, when, where and why of the new technology. For instance, they do a nice job of describing how predictive coding works at page 59 of the report:
Predictive coding, sometimes referred to as suggestive coding, is a process by which the computer does the heavy lifting in deciding whether documents are relevant, responsive, or privileged. This process is not to be confused with keyword-based Boolean searches or the similarity detection technologies described in Chapter Four. Near-duplication techniques, clustering, and email threading can help provide organizational structure to the corpus of documents requiring review but do not reduce the document set that has to be reviewed by attorneys for specific aspects, such as responsiveness or privilege. Predictive coding, on the other hand, takes the very substantial next step of automatically assigning a rating (or proximity score) to each document to reflect how close it is to the concepts and terms found in examples of documents attorneys have already determined to be relevant, responsive, or privileged. This assignment becomes increasingly accurate as the software continues to learn from human reviewers about what is, and what is not, of interest. This score and the self-learning function are the two key characteristics that set predictive coding apart from less robust analytical techniques.
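For readers who want to see the scoring idea in miniature, here is a rough sketch in Python. It is my own toy illustration, not anything taken from the Rand report or from any vendor's product: it assumes a simple word-count comparison (cosine similarity) between each unreviewed document and the pooled text of documents an attorney has already marked relevant. The function names and the sample documents are hypothetical, and real predictive coding software presumably relies on far more sophisticated statistical models.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document cannot resemble anything
    return dot / (norm_a * norm_b)


def score_documents(seed_relevant, unreviewed):
    """Score each unreviewed document against the pooled seed set.

    Returns (score, document) pairs sorted from most to least similar.
    """
    profile = Counter()
    for doc in seed_relevant:
        profile.update(tokenize(doc))
    scored = [(cosine(Counter(tokenize(doc)), profile), doc) for doc in unreviewed]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical seed documents an attorney has already coded as relevant.
seed = [
    "pricing agreement with the distributor",
    "distributor pricing terms and rebate schedule",
]
# Hypothetical documents still awaiting review.
pile = [
    "lunch menu for friday",
    "revised rebate schedule for distributor pricing",
    "parking garage closed",
]

for score, doc in score_documents(seed, pile):
    print(f"{score:.2f}  {doc}")
```

The "self-learning" feature the report mentions would correspond, in this toy version, to adding each newly confirmed relevant document to the seed list and scoring the remaining pile again, so the ranking shifts as the attorneys make more calls.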
They go on to point out at page 61 what they call an ironic feature of predictive coding, which, by the way, I now sometimes also like to call Intelligent Review or Probabilistic Review:
As should be clear from this description, predictive coding does not take humans out of the review loop. It requires intensive attorney support throughout the process in order to advance machine learning. Ironically, for a technique that could substantially reduce discovery expenses, the best results will be achieved if the attorneys most closely involved in the case select the seed documents and review sampled extracts, effectively precluding the use of lower cost contract attorneys or LPO vendors for these particular tasks. Moreover, attorney judgment continues to loom large in the process after the application has completed its work, with eyes on review required, for example, to check documents of unknown relevance and responsiveness or look for privileged communications.
Advanced technologies like predictive coding do not replace lawyers. Instead they require better educated lawyers. Still, the days of vast armies of minimally skilled contract lawyers are numbered. Fewer lawyers will be needed for intelligent review, but they will have to be better trained about the case and the technology. They will need to be SMEs (subject matter experts) and technophiles. I know that most contract lawyers will be quite happy about this change, as they have only been willing to suffer through the drudgery of never-ending email reading because of the economy. I predict that many of these lawyers will rise to the occasion and become the best SMEs of the future.
Rand Dares to Mention the Elephant in the Room
The Rand report discusses many resistance factors against the widespread adoption of predictive coding technologies. They even touch on the one that most analysts dare not mention. They raise the issue of the vested financial interests of certain companies and law firms to continue expensive, over-review of documents. Here is how Pace and Zakaras describe it at page 76:
Resistance of External Counsel
Another barrier to the widespread use of predictive coding could well be resistance to the idea of outside counsel motivated not so much by accuracy issues as by the potential loss of a historical revenue stream. Some interviewees reported grumblings from outside counsel when their companies decided to directly handle a fraction of the overall review process or to markedly reduce what was shipped out for review through the use of additional data processing.
My applause to the Rand Corporation for this bold statement of the obvious. I hope they have been warned, as I was when I stood next to the elephant in the picture, not to touch him. If he steps on your toes, your whole foot will be crushed.
I always include in my essays on predictive coding a call for vendors to bring down the prices of these advanced software features. The high prices are a serious impediment to adoption by even brave attorneys and forward-thinking litigants. The prices of most vendors today usually restrict the use of predictive coding to big cases. The Rand report once again validates my complaints at page 98:
Moreover, computer applications for conducting review are unlikely to be economically viable options when dealing with smaller document sets, in which any savings in attorney hours might be overwhelmed by vendor costs and machine-training requirements. Existing approaches, such as deduplication, cluster analysis, and email threading, may provide a more practical answer in these situations.
By the way, predictive coding is not a replacement for all other search methods; it is a supplement. It is the current crown jewel of search, to be sure, but it is still just one of many methods. It is one tool in an arsenal of weapons. That is why I call my search method multimodal. It features predictive coding, but includes other types of review too, including keyword search and human eyes-on review. See Predictive Coding Based Legal Methods for Search and Review.
As the Rand report indicates, predictive coding is not yet economically viable for cases with smaller document sets. But, when vendors do finally heed my call and lower prices, predictive coding will be economically viable for many more cases. Then the full arsenal of truth-seeking missiles can be used in even medium-sized cases.
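The break-even logic behind that point can be sketched with some back-of-the-envelope arithmetic in Python. Everything here is a simplifying assumption of mine, not a figure from the Rand report, except the 0.75 reduction in attorney hours, which is the upper end ("as much as three-fourths") of what the report says may be possible. The review speed, hourly rate, and flat vendor fee are hypothetical placeholders, and real vendor pricing is rarely a single flat fee.

```python
import math


def net_savings(num_docs, docs_per_hour, hourly_rate, hours_cut_fraction, vendor_fee):
    """Attorney-review dollars saved minus the vendor fee, for one review."""
    baseline_hours = num_docs / docs_per_hour
    saved = baseline_hours * hours_cut_fraction * hourly_rate
    return saved - vendor_fee


def break_even_docs(docs_per_hour, hourly_rate, hours_cut_fraction, vendor_fee):
    """Smallest document count at which the savings cover the vendor fee."""
    saved_per_doc = hourly_rate * hours_cut_fraction / docs_per_hour
    return math.ceil(vendor_fee / saved_per_doc)


# Hypothetical inputs: 50 documents reviewed per hour, $60 per attorney hour,
# a flat $30,000 vendor fee, and the report's best-case 75% cut in hours.
params = dict(docs_per_hour=50, hourly_rate=60, hours_cut_fraction=0.75, vendor_fee=30_000)

print("Break-even document count:", break_even_docs(**params))
print("Net savings, 10,000 documents:", net_savings(10_000, **params))
print("Net savings, 500,000 documents:", net_savings(500_000, **params))
```

Under these made-up numbers the small case loses money and the large case saves a great deal. Lower the vendor fee and the break-even point drops with it, which is the whole reason for my call on pricing.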
The Rand report also looks into corporate complaints of the high cost of preservation. This topic is something of an add-on to the primary topic of review, but it is still well worth reading. Preservation expenses are, after all, present in every case, which is not necessarily true of expensive review costs. The survey showed that preservation has become a significant financial burden for many companies, with many explanations of why, but nobody seemed to have good metrics on the burdens. Rand recommends that corporations begin to systematically track costs in this area. Uncertainty and conflicts in the law of preservation were also discussed, but no recommendations were made. For a new case finding gross negligence in preservation, but only awarding monetary sanctions, see Telecom, Inc. v. Global Crossing Bandwidth, Inc. No. 05-CV-6734T (W.D.N.Y. Mar. 22, 2012). Compare with Aviva USA Corp. v. Vazirani, No. 11-0369 (D. Ariz. Jan. 10, 2012) where monetary sanctions and an adverse inference were granted. Compare both with Spanish Peaks Lodge, LLC v. Keybank National Assoc., No. 10-453 (W.D. Penn. Mar. 15, 2012) where no sanctions were granted. Compare all of these with United Factory Furniture Corp. v. Alterwitz, No. 2:12-cv-00059-KJD-VCF, 2012 WL 1155741 (D. Nev. Apr. 6, 2012) where mirror imaging was ordered for preservation.
Where The Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery is a must read that is within everyone’s budget. It can be downloaded for free, both a summary and the full report (131 pages), but I recommend you read the full report. Although I disagree with a few points in the report, they are not worth examination. For the most part they got it right. It will be interesting to see what companies, if any, heed their call for forward-thinking litigants to take bold steps to use predictive coding. Regardless, kudos to the Rand Corporation, the RAND Institute for Civil Justice and the authors of the report, Nicholas M. Pace and Laura Zakaras, for a job well done.