The Legal Implications of What Science Says About Recall

January 29, 2012

I hear a lot about how different software will find all relevant documents. That would be 100% recall. I also hear demands from requesting parties to find and produce all relevant documents. In the context of large disorganized banks of electronic data, such as email collections, these claims and demands are not only contrary to the rules of law, embedded as they are in reasonability, but they are also unrealistic and contrary to the latest scientific research. In my Bottom Line Driven Proportional Review article I showed how this kind of demand for all relevant ESI is not permitted under the rules and Doctrine of Proportionality in big data cases (and most cases these days are big data cases). I explained, as many have done before me, that the rules do not require production of all relevant documents, if the burden to do so is disproportional. I also shared my method for keeping the costs for review proportional to the value and importance of the case and the production request. But aside from the cost issue, how practical is it to expect to find all relevant ESI? I examined this question at length in my Secrets of Search series, volumes one, two and three. Still, people find it hard to accept, especially in view of the unregulated clamor of the marketplace.

So as my last gift to readers before Legal Tech starts tomorrow, the ultimate event of marketplace claims and competing exaggerations, I present you with a hard dose of reality: more findings on legal search from the world of science. This time I direct you to an important article, Evaluation of Information Retrieval for E-Discovery, Artificial Intelligence and Law, 18(4):347-386 (2011). It was written by leaders of TREC Legal Track and established giants in the field of legal search: Douglas W. Oard, Jason R. Baron, Bruce Hedin, David D. Lewis, and Stephen Tomlinson. They analyzed the now fully published test results of the experiments in 2008, and carefully examined the interactive task, Topic 103, as the best test of competing legal search technologies. This task made use of a subject matter expert and an appeals process for quality control on relevance determinations. Four teams of experts participated in the test, two academic and two commercial. A well known e-discovery vendor won the test (scientists hate it when I put it that way). They won because they attained better precision and recall scores than the three other participants.

Now we come to the punch line: the winning vendor attained a recall rate of only 62%. That’s right, they missed 38% of the relevant documents. And they were the winner. Think about it. The other three participants in the scientific experiment attained recall rates of less than 20%! That’s right, they missed over 80% of the relevant documents. Now what do you think about a requesting party who demands that you produce all of the relevant email?

If you find my summary of the experiments hard to believe, then read the report for yourself. Here is the excerpt on which I rely, at page 24 of Evaluation of Information Retrieval for E-Discovery:

On the basis of the adjudicated sample assessments, we estimated that there are 786,862 documents (11.4% of the collection) relevant to Topic 103 in the test collection (as the topic was defined by the TA). All four teams attained quite high precision; point estimates ranged from 0.71 to 0.81. One team (notably the one that made the most use of TA time) attained relatively high recall (0.62), while the other three (all making significantly less use of TA time) obtained recall values below 0.20.
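To put the quoted estimates in document counts rather than percentages, here is a minimal Python sketch. The three figures come straight from the excerpt above; the function name is my own illustration, and the counts it produces are only rough point estimates, since the underlying numbers are themselves sample-based estimates.

```python
# Figures quoted in the TREC 2008 interactive task excerpt above.
ESTIMATED_RELEVANT = 786_862   # estimated relevant documents for Topic 103
BEST_RECALL = 0.62             # recall attained by the top-scoring team
OTHER_RECALL_CEILING = 0.20    # the other three teams scored below this


def found_and_missed(relevant: int, recall: float) -> tuple[int, int]:
    """Split the relevant documents into those retrieved and those missed.

    Hypothetical helper for illustration; recall is the fraction of all
    relevant documents that a search actually retrieves.
    """
    found = round(relevant * recall)
    return found, relevant - found


best_found, best_missed = found_and_missed(ESTIMATED_RELEVANT, BEST_RECALL)
other_found, other_missed = found_and_missed(ESTIMATED_RELEVANT, OTHER_RECALL_CEILING)

print(f"Best team: found about {best_found:,}, missed about {best_missed:,}")
print(f"Other teams: found fewer than {other_found:,}, missed more than {other_missed:,}")
```

On these estimates, even the top-scoring team left roughly 299,000 relevant documents unfound, and each of the other teams left more than 629,000.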

The team of information scientists, and their lawyer guide, Jason R. Baron, next report on the 2009 TREC experiments, specifically the one they found most representative, the interactive tasks, again with subject matter consultations and appeals. This time there were eleven teams participating in the experiment, three academic and eight commercial. That’s right, eight e-discovery vendors were in the game this time. How did they do? They did a little better, but not much. Only five of the 24 submitted runs attained a recall of 70% or better.

The post-adjudication results for the 2009 topics showed some encouraging signs. Of the 24 submitted runs (aggregating across all seven topics), 6 (distributed across 5 topics) attained an F1 score (point estimate) of 0.7 or greater. In terms of recall, of the 24 submitted runs, 5 (distributed across 4 topics) attained a recall score of 0.7 or greater; of these 5 runs, 4 (distributed across 3 topics) simultaneously attained a precision score of 0.7 or greater.
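For readers unfamiliar with the F1 score mentioned in the excerpt, it is the harmonic mean of precision and recall, so a run cannot score well by doing well on only one of the two. The sketch below uses that standard definition. The two example runs are hypothetical numbers of my own, not TREC results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical run that just meets both 0.7 thresholds named in the excerpt.
balanced = f1_score(precision=0.7, recall=0.7)

# Hypothetical document dump: very high recall bought with very low precision.
dump = f1_score(precision=0.10, recall=0.95)

print(f"Balanced run F1: {balanced:.2f}")
print(f"Document dump F1: {dump:.2f}")
```

The balanced run scores 0.70, while the hypothetical dump scores about 0.18 despite its 95% recall, which is why F1 is a useful single number for comparing runs.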

Id. at pgs. 24-25. If you follow the article’s direction and see the Overview of the TREC 2009 Legal Track, by B. Hedin, S. Tomlinson, J. Baron, and D. Oard, you can find more details of the 2009 test results. After you wade through the wonderfully dense language that information scientists love to use to convey information, you find section 2.3.5 Final Results. There you are pointed to a table of numbers: Table 6: Post-adjudication estimates of recall, precision, and F1.

What does this chart tell us? The best anyone did was an 86.5% recall on one of the seven tasks. Look at the third column from the left for the recall rates attained. The lowest was 9%. Digging deeper, the analysts found that the teams with the highest scores appealed the most, and those with the lowest scores, not at all. Consultation with the topic authority also helped improve scores. But the bottom line for purposes of my point today is that the average recall rate was only 41% (993/24), and even the best attained on one search, by one team of experts, was only 86%. Demands for recall in the 80s for every project are thus unrealistic.

Conclusion

The scientific research proves, once again, that it is unreasonable to ask for any better recall than 70%; in fact, it should be substantially less. The law demands reasonable efforts, not perfection. The best recall results attainable in scientific experiments, with the best software and top experts at the helm, are way too high a standard for reasonable efforts. Reasonability should be more like average results attained by average lawyers making good faith efforts, not results attained by information scientists and specialists using the best software money can buy. So that means it should be less than the 41% average of experts. Even standards like that should be used with caution and the efforts meter should always be tempered by costs. Proportionality of efforts should, if they are in good faith and reasonable, always trump any quality control efforts. See Bottom Line Driven Proportional Review.

In fairness to my vendor friends, the latest reports from TREC are dated. That was then, 2008 and 2009; this is now, 2012. The test scores showed substantial progress from 2008 to 2009. In my experience, the predictive coding type search software has significantly improved in the last year or so. I have also heard unsubstantiated reports of much higher recall rates attained in the 2011 TREC Legal Track tests, but I take all of these claims with a big grain of salt. Until Dr. Oard and his information scientist crew (that, by the way, includes two lawyers, Jason Baron and Maura Grossman) publish results, obtuse as their publications are, I will remain skeptical. Right now science shows that if you can find an estimated 41% of the relevant documents in a large collection of ESI, then you are doing just as well as the experts. That has got to be good enough to meet the reasonable efforts required under the law.

You should be skeptical of any claims or demands for better results than that. You should stop chasing, or being chased by, unreasonable demands for high recall rates. The only way to attain 70% or higher rates today is by document dumps, where precision plummets as you produce irrelevant documents, or, perhaps, by budget busting, near-endless iterations of search and seed-set training. Even then, your expensive pursuit is quixotic from the point of view of science, where the fuzziness measurement issue remains unresolved. Furthermore, and most importantly, in today’s world of big data, where everyone has 100,000 emails, it is wasteful in the extreme to try to find all relevant documents. If you are still trying to find them all, and not just the few super-relevant smoking guns, you have not understood that in today’s age, relevant is irrelevant, nor that the ultimate goal of discovery is to prepare for trial, where the 7±2 rule of persuasion reigns supreme.


Reply to an Information Scientist’s Critique of My “Secrets of Search” Article

January 28, 2012

One of the leading information scientists in the field of legal search was kind enough to write a detailed critique of my Secrets of Search series. I tried to post a response on his blog, Information Discovery, where it appeared. But the website would not accept the comments for some technical reason, so I am replying here, knowing that Herb will find them, and maybe some of his readers. I bring Dr. Roitblat’s comments to your attention, even though they are in the nature of a critique, because I want my readers to hear all sides of the story, not just mine. I am just a lawyer, and especially welcome peer review from the scientific community. That is part of a team approach to e-discovery, where Law, Science, and IT work together, and learn from each other, to do e-discovery right. The world is too complex, the electronic haystacks too vast, for lawyers to find relevant evidence without such an interdisciplinary team approach.

It is also interesting to see that a lot of Herb Roitblat’s stated disagreements appear to be based on misunderstandings of what I was trying to say. That is common in interdisciplinary team efforts. I accept that it was probably my fault, as I was writing about information science topics in Secrets of Search. But I don’t beat myself up too much about it because it is just so damned difficult to write intelligibly on the matrix between e-discovery law and information science. That is one reason that almost no one even tries. Still, this apparent miscommunication presents an opportunity. By addressing Herb’s issues we can attain greater clarity of the emerging consensus. His comments, and my responses, suggest that we both agree on far more than we disagree. From my perspective, at least, the points of disagreement are really minor and technical. They pale in comparison to our mutual agreement as to the superiority of technology assisted review over mere manual review.

Before I post the reply, I have to give you an idea of Dr. Roitblat’s commentary, so you can understand what I’m replying to. But better still, take a few minutes to read his entire article, On Some Selected Search Secrets. Only in this way will my response make full sense. I do not really know Herb, although I think we’ve met at some events over the years, but I certainly know of him. He is one of the few info scientists around that is focused on legal search and actually makes a living out of it. (Apparently he also likes dolphins and killer whales.) He owns a company, OrcaTec, that, in his words, provides professional services and software for information discovery and information management. My Secrets of Search article frequently cited one of his works with the e-Discovery Institute: Roitblat, Kershaw, and Oot, Document categorization in legal electronic discovery: computer classification vs. manual review; Journal of the American Society for Information Science and Technology, 61(1):70–80, 2010.

Summary of Herb Roitblat’s Critique

Here is Herb’s analysis, which begins in a very flattering manner that I don’t deserve:

Ralph Losey recently wrote an important series of blog posts (here, here, and here) describing five secrets of search. He pulled together a substantial array of facts and ideas that should have a powerful impact on eDiscovery and the use of technology in it. He raised so many good points, that it would take up all of my time just to enumerate them. He also highlighted the need for peer review. In that spirit I would like to address a few of his conclusions in the hope of furthering discussions among lawyers, judges, and information scientists about the best ways to pursue eDiscovery.

These are the problematic points I would like to consider:
1. Machines are not that good at categorizing documents. They are limited to about 65% precision and 65% recall.
2. Webber’s analysis shows that human review is better than machine review
3. Reviewer quality is paramount.
4. Human review is good for small volumes, but not large ones.
5. Random samples with 95% confidence levels +/- 2 are unrealistically high.

Ralph’s Response: Thanks for taking the time to provide input on my article. I appreciate your comments, and actually agree with most of the points you make. I think you may have misunderstood some of what I was saying and your disagreement is actually agreement, but I much appreciate your clarifications. I will respond based on your enumerated points above.

Dr. Roitblat then explains the five problems that he had with a few of the conclusions that I made in Secrets of Search. Again, I urge you to read all of his comments, but for ease of reference, I here quote what I think is the essence of each of his five issues, and then follow with my response.

Issue [1]: Machines are not that good at categorizing documents. They are limited to about 65% precision and 65% recall. Losey quotes extensively from a paper written by William Webber, which reanalyzes some results from the TREC Legal Track, 2009, and some other sources. Like Losey’s commentary, this paper also has a lot to recommend it. Some of the conclusions that Losey reaches are fairly attributable to Webber, but some go beyond what Webber would probably be comfortable with. The most significant fact, because important arguments are based on it, is a description of some work by Ellen Voorhees that concluded that 65% recall at 65% precision is the best performance one can expect. The problem is that this 65% factoid is taken out of context. In the context of the TREC studies and the way that documents are ultimately determined to be relevant or not, this is thought to be the best that can be achieved. The 65% is not a fact of nature. It says, actually, nothing about the accuracy of the predictive coding systems being studied. Losey notes that this limit is due to the inherent uncertainty in human judgments of relevance, but goes on to claim that this is a limit on machine-based or machine assisted categorization. It is not. …

Ralph’s Response [1]: I agree with you. I was not trying to say 65% precision or recall is all that is possible to attain, just that the fuzziness of our lenses makes it hard to prove any more than that, unless special review controls are put in place for the measurements. These controls have been lacking in most legal tests to date. TREC is making progress with limited subject matter expert input, but even there, thanks to monetary constraints, we still have a ways to go to use a true gold standard that could improve our measurements. So I agree with you that 65% is no “fact of nature” as you put it, or inherent limitation in human relevancy determinations. (I am not ruling that possibility out entirely, but if such a mental limit like that does exist, my experience tells me that it is higher than 65%.) This fuzziness issue is more than a mere anomaly and deserves widespread discussion and recognition. In so far as large-scale human reviews are concerned, reviews unassisted by technologies, the kind of reviews that were common in the past, the 65% fuzzy focus may well be an inherent human limit. With predictive coding and other automated processes, however, this barrier can be broken. Finally, I like your suggestion to improve TREC experiments by using both an authoritative training set and an authoritative judgment set.

Issue [2]: Webber’s analysis shows that human review is better than machine review. I have no doubt that human review could sometimes be better than machine-assisted review, but the data discussed by Webber do not say anything one way or the other about this claim. Webber did, in fact, find that some of the human reviewers showed higher precision and recall than did the best-performing computer system on some tasks. But, because of the methods used, we don’t know whether these differences were due merely to chance, to specific methods used to obtain the scores, or to genuine differences among reviewers. Moreover, the procedure prevents us from making a valid statistical comparison. …

Ralph’s Response [2]: Again, I agree with you. I get that Webber’s analysis suggests that humans are only sometimes better, not always. In fact, I would go much further and say that humans always lose over large-scale review (weeks on end of 8 hours a day reviewing hundreds of thousands of boring documents) when paired against today’s good software. Still, Webber pointed out what no one else had before about the TREC results, that the humans sometimes did win on the small scale, even when substandard manual review methods were used. I think it is wrong to just sweep that under the rug as an anomaly or luck. This realization of human abilities is important for proper application of the predictive coding process, where, in my opinion, input by experts on the seed coding is key. These experts need a clear understanding of what is relevant, and what is not. Otherwise, no matter how good the software, the computer principle of garbage in, garbage out, will control.

This realization of the continued importance of Man in the technology equation is also important to defeat the sophistic arguments of some plaintiffs’ lawyers (or, better put, “requesting party” lawyers). They are now arguing in multiple courts around the country that a defendant (responding party) should forgo any manual review and just turn over documents based solely on automated review. They use that argument to oppose motions for protective orders based on excessive cost and burden to review. They are misusing distorted reports of scientific research to try to force quick peek disclosures. But the truth is, automated coding is not good enough yet to dispense with final manual quality control reviews to protect confidential information in a litigation context. Webber’s findings help prove that. The advantage to plaintiffs’ counsel of such a disingenuous, forced quick peek strategy is obvious and substantial. Claw backs and Rule 502 are inadequate protections. Once the bell has been rung, the damage is done, regardless of whether the documents are returned. The main point I was trying to make by publicizing Webber’s finding is that humans still have a place at the table, not that they should sit there alone without reliance on the latest software for culling review. I suspect you agree with me on that.

Issue [3]: Reviewer quality is paramount. Webber found that some assessors performed better than others. Continuing the argument of the previous section, though, we cannot infer from this that some assessors were more talented, skilled, or better prepared than others. … The best reviewers on each topic could have been the best because they got lucky and got an easy bin, or they got a bin with a large number of responsive documents, or just by chance. Unless we disentangle these three possibilities, we cannot claim that some reviewers were better or that reviewer quality matters. In fact, these data provide no evidence one way or the other relative to these claims. … In some sense, the ideal would be for the senior attorney in the case to read every single document with no effect of fatigue, boredom, distraction, or error. Instead, the current practice is to farm out first pass review to either a team of anonymous, ad hoc, or inexpensive reviewers or to search by keyword. Even if Losey were right, the standard is to use the kind of reviewers that he says are not up to the task.

Ralph’s Response [3]: I think you misunderstood my point and again assumed incorrectly that I was advocating for large-scale manual review. I am not. I agree the reviewers are not up to the task, even the best. As explained above, I think humans cannot perform over long periods of time, and so I am not advocating against machine review, I am advocating for hybrid review, man and machine working together. Like you, I advocate for change. So really we agree.

But I do disagree with some of your statements here. To paraphrase Shakespeare: methinks thou dost protest too much. I don’t think it is wrong to assume a correlation between accuracy and skill. That connection is based on experience and common sense. All large-scale review project metrics show that some reviewers are better than others, just like some trial lawyers are better than others, and some scientists, etc. It is inherent that we all perform to different levels at different tasks. I do not understand the need to try to explain all of the variances as just luck or chance. (As the great golfer Gary Player used to say: the more I practice, the luckier I get.) Although I concede some chance or luck is possible, the same could be said of the software tested. Perhaps the “winning software” just got lucky. I would not seriously make that argument, so I am surprised to hear it made about the reviewers here. TREC only tried to measure comparisons, as you said, and lady luck knows no favorites.

Issue [4]: Human review is good for small volumes, but not large ones. This claim may also be true, but the present data do not provide any evidence for or against it. The evidence that Losey cites in support of this claim is the same evidence that, I argued, failed to show that human review is better than machine review. It requires the same circular reasoning. … Based on other evidence from psychology and other areas, it is likely that performance will decline somewhat with larger document sets, but there is no evidence here for that. If this were the only factor, we could arrange the situation so that reviewers only looked at 500 documents at a time before they took a break.

Ralph’s Response [4]: I agree this was not tested. I was again relying on my experience outside of these experiments and relying on my common sense built from over 30 years of doing document review, paper and electronic, big and small. But I get your point of scientific discipline that it was not tested and so not here proven. Still, I’m not a scientist, nor do I care to become one. Also, I write primarily for lawyers, not scientists (although I am very happy a few of you are interested enough to read my articles too, at least when they touch on your work). I am a lawyer interested in learning from science for purposes of improving law, not vice versa, although that may be a secondary benefit. That would depend on scientists like yourself. Also, as you know, there is more to establishing best practices in review processes than simply adding in periodic breaks.

Issue [5]: Random samples with 95% confidence levels +/- 2% confidence intervals are unrealistically high. It’s not entirely clear what this claim means. On the one hand, there is a common misperception of what it means to have a 95% confidence level. Some people mistakenly assume that the confidence level refers to the accuracy of the results. But the confidence level is not the same thing as the accuracy level. … I suspect that Losey means something different. I suspect that he is referring to the relatively weak levels of agreement found by the TREC studies and others. If our measurement is not very precise, then we can hardly expect that our estimates will be more precise.
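For readers who want to see what a 95% confidence level with a +/- 2% interval implies in practice, here is a minimal sketch using the textbook normal-approximation formula for sample size. It is not taken from either article. The function name is hypothetical, z = 1.96 is the usual value for 95% confidence, and p = 0.5 is the conservative worst-case assumption about prevalence.

```python
import math


def sample_size(margin: float, z: float = 1.96, p: float = 0.5) -> int:
    """Documents to sample to estimate a proportion within +/- margin.

    Textbook normal approximation for a large collection:
        n = z^2 * p * (1 - p) / margin^2
    z = 1.96 corresponds to a 95% confidence level; p = 0.5 is the worst case.
    """
    raw = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Round first so floating-point noise cannot push ceil() up by one.
    return math.ceil(round(raw, 6))


print(sample_size(0.02))  # 95% confidence, +/- 2%
print(sample_size(0.05))  # 95% confidence, +/- 5%
```

This gives 2,401 documents for +/- 2% and 385 for +/- 5%. Note what the formula leaves out: it assumes every sampled document is coded correctly. It says nothing about disagreement among the reviewers doing the coding, which is the separate measurement problem discussed in this exchange.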

Ralph’s Response [5]: You have correctly divined my intent here on sampling. I was again referring to the measurement fuzziness issue reported by your scientific colleagues, Voorhees, Webber and to some extent Oard. I understand that you are uncomfortable with their findings and conclusions on accuracy. I sincerely hope that you and other scientists will work this issue out.

I want accurate measurements too, especially when important points of justice are at stake. I want all of the scientific research out there for full public view, even the troubling preliminary conclusions of Voorhees, Webber and Oard. If the measurements are disputed, I want full disclosure on that. If it takes more money, time and effort to get these measurements done properly in scientific testing, then let’s raise the funds to do it right. I support the important scientific research now going on in legal search. On that point I suspect we once again agree.

Again, thanks for your comments on my article.

_________________

DEAR READERS: I’m off to LegalTech, where I will not only be presenting with Craig Ball and Judge Andrew Peck in the much hyped debate on Tuesday, January 31st at 4:00 at the Sutton Center on the 2nd floor (sponsored by BIA), but I will also be presenting three more times on predictive coding related subjects. I am thinking of preparing for all of that the way Pat Sajak prepared to host Wheel of Fortune.

On Monday the 30th, I present at 12:30 on The Promise and Challenge of Predictive Coding and Other Disruptive Technologies with Judge Andrew Peck, Maura Grossman and Dean Gonsowski (sponsored by Clearwell/Symantec).

On Wednesday, February 1st, I present at 10:30 on Technology Assisted Review: When to Use it and How to Defend It, with Maura Grossman, Judge Frank Maas, and Ann Marie Gibbs (sponsored by Daegis).

My last gig on Wednesday is at 1:45 in the Sutton South Parlor on E-discovery Circa 2015: Will Aggressive Preservation/Collection and Predictive Coding be Commonplace? My fellow panelists are David Kessler, Robert Trenchard, Julie Colgan, Stephanie Blair, and Craig Carpenter (sponsored by Recommind and ARMA).

If you see me around, please stop and say hello. I like to meet all of my readers whenever possible. Please forgive me if you catch me at a time during the day when I don’t have time to chat, but I always have time to shake hands and say hello.



Bottom Line Driven Proportional Review

January 15, 2012

I have been working on the problem of out-of-control e-discovery costs since 2006. At that time I phased out my general trial practice, went full-time e-discovery, and started this blog. (By the way, did you notice the new ® in the blog title? It means the U.S. Patent and Trademark Office granted me the trademark to e-Discovery Team.) I focused on the expense side because it was obvious that crazy high e-discovery cost was a core problem of civil litigation. It still is. Indeed, the high price of e-discovery, and the uncertainty of these costs, are the main reasons most attorneys still avoid e-discovery like the plague. For more reasons see Tell Me Why?

The primary expense of e-discovery comes from the document search and review process; most estimate that it constitutes from 60% to 80% of the total. The core expense of the review process comes from the final manual quality control checks of each document to be produced to verify relevancy and to protect confidentiality by redaction and privilege logging. Confidentiality protection is an enormous problem in litigation. See Anonymous, An Open Letter to the Judiciary – Can We Talk? Parts One and Two.

Further, you cannot just dispense with final manual review. As I explained in my series Secrets of Search, Parts One, Two and Three, we are not going to turn that over to the Borg anytime soon. I’ve asked around and no law firms do that now. No experts advocate that approach either, even the most extreme advocates for automation (of which I’m one). The only exception I have heard of is in non-litigation circumstances, such as second reviews with production to the government. Automated review is nowhere near good enough to go it alone. You use predictive coding to speed up the final manual review to be sure, but only a fool (or con artist trying to get at a producing party’s secrets) trusts coding software today without human verification.

My thinking and experiments since 2006 have focused on how to control the final review costs. By early 2008 I came up with one possible method that looked promising. I have been testing and refining this invention ever since with several e-discovery teams. I have also talked about it with many other attorneys, friend and foe, and used this new method in many lawsuits, big and small. I am now ready to write publicly about my proposed fix for the first time. I call it Bottom Line Driven Proportional Review and Production. A more technical description for it, the one I used in a legal methods patent application, is: System and Method for Establishing, Managing, and Controlling the Time, Cost, and Quality of Information Retrieval and Production in Electronic Discovery. But I usually just call it Bottom Line Driven Review, and who knows, if it catches on – and I think it should because it really works – I may trademark that phrase too.

In the meantime, try it out. The more attorneys that use this method, the more accepted it will be by judges. Right now they are hearing it from my teams for the first time, and, like anything new, it takes some explaining and getting used to. But, once understood, it appears obvious, and I expect all thinking clients will demand that their attorneys use this approach. It saves money.

Bottom Line Driven Review

The bottom line in e-discovery production is what it costs. Believe me, clients care about that … a lot! In Bottom Line Driven Proportional Review and Production everything starts with the bottom line. What is the production going to cost? Despite what some lawyers and vendors may tell you, that is not an impossible question to answer. It takes an experienced lawyer’s skill to answer, but after a while, you can get quite good at such estimation. It is basically a matter of man-hours estimation. With my method it becomes a reliable art that you can count on.

Price estimation is second nature to me, and an obvious thing to do before you begin work on any big project. That is primarily because I worked as a construction estimator out of college to save up money for law school back in the seventies. Believe me, estimating review costs is basically the same thing, projecting materials and labor costs. In construction you come up with prices per square foot. In e-discovery you estimate prices per file, as I will explain in detail later.

My new strategy and methodology is based on the bottom line. It is based on projected review costs, defensible culling, and best-practices of review. Under this method the producing party determines the number of documents to be subjected to costly final review by calculating backwards from the bottom line of what they are willing, or required, to pay for the production.

The process begins by the producing party calculating the maximum amount of money appropriate to spend on ESI production. A budget. This requires not only an understanding of the ESI production requests, but also a careful evaluation of the merits of the case. The amount selected for the budget should be proportional to the monies and issues in the case. Any more than that is unduly burdensome and prohibited under Rule 26(b)(2)(C), Federal Rules of Civil Procedure and other rules that underlie what is now generally known as the Proportionality Principle. See Rule 1, Rule 26(b)(2)(C), Rule 26(b)(2)(B), and Rule 26(g), Federal Rules of Civil Procedure; Commentary on Proportionality in Electronic Discovery, 11 SEDONA CONF. J. 289 (2010); Oot, Kershaw & Roitblat, Mandating Reasonableness in a Reasonable Inquiry, Denver University Law Review, 87:2, 522-559 (2010). Also see Rule 403 of the Federal Rules of Evidence (inadmissibility of cumulative evidence).

The budget becomes the bottom line that drives the review and keeps the costs proportional. The producing party seeks to keep the total costs within that budget. The budget should either be by agreement of the parties, or at least without objection, or by court order. The failure to estimate and project future costs, and to decide in advance to conduct the review so as to stay within the budget, is the primary reason that e-discovery costs are so high.

After analysis of the case merits and determination of the maximum expense for production proportional to a case, the responding party makes a good faith estimate of the likely maximum number of documents that can be reviewed within that budget. The document count represents the number of documents that you estimate can be reviewed for final decisions of relevance, confidentiality, privilege and other issues, and still remain within your budget. The review costs you estimate must be based on best practices and be accurate (no puffing).

The producing party then uses smart search techniques and quality controls to find the documents most likely to be responsive within the number of documents that the budget allows. This is usually based on relevancy ranking, and thus the need for hybrid multimodal best practices in the search and review. Predictive coding is inherently rank based and so it makes bottom line driven review especially easy to do. That is one reason I am especially pleased to see the price of predictive coding software finally coming down. It can be done without predictive coding ranking to be sure, but it is harder to be accurate, especially with recall. Using best methods allows you to get the most bang for your buck, the core truth, and thus persuades the requesting party or court to go along with your budgetary limits. More on the new gold standards in a minute.
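The rank-then-cap idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document IDs and relevance scores are hypothetical, and `select_for_review` is a made-up helper name, not a function from any review platform. It assumes the software has already assigned each document a predicted relevance score.

```python
def select_for_review(scored_docs, max_documents):
    """Return the IDs of the highest-scoring documents that fit the review cap.

    scored_docs: list of (doc_id, predicted_relevance) pairs (hypothetical data).
    max_documents: the number of documents the budget allows for final review.
    """
    # Sort by predicted relevance, highest first, then keep only what the budget allows.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _score in ranked[:max_documents]]


# Hypothetical scores, purely for illustration.
scores = [("doc-a", 0.91), ("doc-b", 0.12), ("doc-c", 0.67), ("doc-d", 0.45)]
review_set = select_for_review(scores, max_documents=2)
print(review_set)
```

In practice the cap would come from the budget calculation, and the ranking would be checked with sampling and quality controls rather than trusted blindly.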

Example

An example may help clarify how it works. If you set a proportional cost for a case of $100,000, and estimate that it will cost you $5.00 per file for the final manual review before production of the ESI at issue, then you can review no more than 20,000 documents and stay within budget. It is basically that simple. No higher math is required.
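For readers who like to see the arithmetic written out, here is a minimal sketch using the figures from the example above. The function name is my own invention for illustration.

```python
def max_reviewable_documents(budget_dollars, cost_per_file_dollars):
    """Number of whole documents that can be reviewed without exceeding the budget."""
    # Floor division: a partially reviewed document does not count.
    return int(budget_dollars // cost_per_file_dollars)


cap = max_reviewable_documents(100_000, 5.00)
print(cap)
```

With a $100,000 budget and $5.00 per file, the cap works out to the 20,000 documents stated in the text.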

The only difficult part is the legal analysis to determine a budget proportional to the real merits of the case. But that is nothing new. What is the golden mean in litigation expense? How to balance just, with speedy and inexpensive? The essence of the ideal proportionality question has preoccupied lawyers for decades. It has also preoccupied scientists, mathematicians, and artists for centuries. They claim to have found an answer that they call the golden mean or golden ratio.

In law this is the perennial Goldilocks question. How much is too much? Too little? Just right? How much is an appropriate spend to produce documents? The issue is old. I have been dealing with this problem for over thirty years. What’s new is applying that legal analysis to a modern-day high-volume-ESI search and review plan. Unfortunately, unlike art and math, there is no accepted golden ratio in the law, so it has to be recalculated and reargued for each case. (Side Note: If the golden ratio were accepted in law as an ideal proportionality, the number is 1.61803399, aka Phi. That would mean 38% is the perfect proportion. I have argued that when applied to litigation that means the total cost of litigation should never exceed 38% of the amount at issue. In turn, the total cost of discovery should not exceed 38% of the total litigation cost, and the cost of document production should not exceed 38% of the total costs of discovery. (It’s like Russian dolls that get proportionally smaller.) Thus for a $1 Million case you should not spend more than $54,872 for document productions (38% of $1,000,000 is $380,000; 38% of that is $144,400; and 38% of that is $54,872). See Losey, R., Beware of the ESI-discovery-tail wagging the poor old merits-of-the-dispute dog. But I digress too far.)
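The Russian-doll arithmetic in the side note can be checked with a short sketch. Two assumptions are mine, not the text's: that the 38% figure comes from 1/Phi squared (about 0.382), which is one common way to derive it, and that each level is capped at the rounded 38% of the level above. The function name is hypothetical.

```python
PHI = (1 + 5 ** 0.5) / 2          # the golden ratio, about 1.61803399
exact_share = 1 / PHI ** 2        # about 0.382 (assumed source of the 38% figure)
ROUNDED_SHARE = 0.38              # the rounded share used in the side note


def nested_caps(amount_at_issue, share=ROUNDED_SHARE, levels=3):
    """Apply the same share repeatedly: litigation, then discovery, then production."""
    caps = []
    current = amount_at_issue
    for _ in range(levels):
        current = current * share
        caps.append(round(current))  # round to whole dollars for display
    return caps


litigation_cap, discovery_cap, production_cap = nested_caps(1_000_000)
print(litigation_cap, discovery_cap, production_cap)
```

For a $1 million case this reproduces the three figures in the side note: $380,000, $144,400, and $54,872.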

Estimation for bottom line driven review is essentially a method for marshaling evidence to support an undue burden argument under Rule 26(b)(2)(C). It is basically the same thing we have been doing to support motions for protective orders in the paper production world for over sixty years. The only difference is that now the facts are technological, the numbers and variety of documents are enormous, sometimes astronomical, and the methods of review are very complex and not yet standardized.

The calculation of projected cost per file to review can be quite complicated, and is frequently misunderstood, or is not based on best practices. Still, in essence this cost projection is also fairly simple. You basically project how long it will take to do the review and the total cost of the time. Thus, for example, and this is a gross oversimplification, in a review project of 20,000 documents (after computer assisted culling – it probably started as 100,000 or 200,000), if the average review and coding rate is 50 files per hour, it will take 400 hours to complete. If the projected total cost for the reviewer time, including supervision and other costs, is $250 per hour, the projected total cost for the review is $100,000 ($5.00 per file).
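The same projection, written as a small sketch. The function and parameter names are hypothetical; the inputs are the figures from the paragraph above, and the hourly rate is assumed to be a single blended rate that already includes supervision and other costs.

```python
def project_review_cost(documents, files_per_hour, blended_hourly_rate):
    """Return (hours, total cost, cost per file) for a manual review project."""
    hours = documents / files_per_hour
    total_cost = hours * blended_hourly_rate
    cost_per_file = total_cost / documents
    return hours, total_cost, cost_per_file


hours, total_cost, cost_per_file = project_review_cost(
    documents=20_000, files_per_hour=50, blended_hourly_rate=250
)
print(hours, total_cost, cost_per_file)
```

Changing the review speed shows how sensitive the estimate is: at the same hourly rate, halving the files-per-hour rate doubles the cost per file.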

This may seem high when you consider the cost of contract lawyers is $50 per hour or less for their time, but you have to also include expensive partner and senior associate management time, direct supervision, quality control reviews, and privilege logging, etc. Do not be fooled by promises of $1.00 per file charges by contract review companies (or even less than that). That does not include the time and expenses of the law firm of record to supervise, etc., and often is based on a pre-culling rate for file count. In this business one of the hardest aspects for good estimation is getting true apples-to-apples comparisons from vendors.

Also, quality control is important and best practices can be expensive, even though with bottom line driven review the total cost is still dramatically less than old-school. As I said before, when you talk about capping the number of documents you review, you also have to talk about finding the most likely relevant documents for this capped review. You have to provide the most bang for your buck, the most truth. That, along with transparency to earn trust, is a key to the success of this method, a key to persuading the other side or court to accept this new approach to reasonable discovery.

Estimate of Projected Costs

Another key to persuading the requesting party or court is to be sure your estimate is realistic. You cannot just dream up estimates, or puff the likely expense. The estimate must be based on knowledge of the types of documents that you will be reviewing in your particular case. It must be based on the times that you find it takes for manual review of the documents. Some document collections are faster and easier to review than others. The speed is measured in files per hour. (I like the sound of that, plus pages per hour is just a relic of the paper world. Computer files don’t have pages, only paper print-outs do.)

The typical speeds we see today in final manual review are anywhere from 25 files per hour, for collections with a lot of dense long documents and crappy review software, to 100 or even 200 files per hour for collections with easier to skim documents and the best software. Putting aside the question of the wide divergence in the quality of review software, we tend to see faster files per hour rates in email-heavy cases with few attachments, than we do in cases with a high percentage of complex documents and spreadsheets. You have to know your case, know your ESI, to make a proper estimate.

The projected costs must also be based on best practices for economical review and not be inflated. You can’t justify $500 per hour partners to do all of the review (although they may be needed as subject matter experts to do the seed set review in predictive coding). Of course, old-fashioned full manual review is out of the question. It has to be hybrid with computers doing most of the first-pass document culling. Even if you use technology assisted review, you still must also use best practices for methods. You cannot justify old-fashioned stupid review methods, such as batch out to reviewers solely based on first in, chronological, or just random. It has to be a best practice based multimodal type of review, where, for instance, you batch out documents for manual review based on issues, clusters, language, or other smart review methods. Best practice also means quality control and a random button as discussed at length in the Secrets of Search series. If you do not use the nine best practices to get the most bang for your buck, the core truth, the requesting party or court may not agree to limit the number of documents to be reviewed.

The processes behind the estimate should also be transparent. This means you should be willing to disclose them to the requesting party. That is how you can convince them that the estimate is reasonable and that you are not still stuck in the old paradigm of hide-the-ball discovery games. I cannot overstate how important it is to develop trust between counsel on discovery, and often the only way to do that is through transparency. You do not have to disclose all of your trade secrets, but you have to keep the requesting party pretty well informed and involved in the process. That is what cooperation looks like.

In general, I have found that in 2011, $5.00 per document was a good place to start in projecting costs for review of a typical email collection (an email is one file, and each attachment is another). This price includes the expensive redaction and privilege logging processes. Review with a simple relevant or irrelevant coding is the easiest and cheapest to do, and is also fairly rare. There are usually multiple additional factors to consider.

The five dollars per file is a starting point of estimation, a rule of thumb that is often correct, but sometimes way off. It is comparable to the rule of thumb in construction estimation where you start with the typical costs to build on a square footage basis. But in some cases, especially ones involving cross-border issues, the costs could go much higher, as high as $15 per file. In others, where the review is simple, it could go as low as $2.00 per file. It is just like construction where various buildings in different locations have different costs.

The $5.00 per file price is based on my recent experiences in 2011. In 2009 the average cost was more like $6.50 per file, and I expect average costs will keep going down a little in 2012 and then level off.

It is important to note that you can justify starting with a much higher number based on legal precedent alone. For instance, the Department of Justice spent $9.09 per document (or file, same thing) for review in the Fannie Mae case, even though it used contract lawyers for the review work. In re Fannie Mae Securities Litig., 552 F.3d 814, 817 (D.C. Cir. 2009) ($6,000,000/660,000 emails). There were no comments by the court that this price was excessive when the government later came back and sought cost shifting. My current $5.00 per file general rule is also lower than the $6.09 per document that Verizon paid for a massive second review project that enjoyed large economies of scale and, again, utilized contract review lawyers. Roitblat, Kershaw, and Oot, Document categorization in legal electronic discovery: computer classification vs. manual review, Journal of the American Society for Information Science and Technology, 61(1):70–80, 2010 ($14,000,000 to review 2.3 million documents in four months).
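The two per-document figures follow directly from the totals quoted in the citations, as this quick check shows. The variable names are mine; the dollar amounts and document counts are the ones given above.

```python
def cost_per_document(total_cost_dollars, documents_reviewed):
    """Average review cost per document, rounded to the nearest cent."""
    return round(total_cost_dollars / documents_reviewed, 2)


fannie_mae = cost_per_document(6_000_000, 660_000)     # In re Fannie Mae figures
verizon = cost_per_document(14_000_000, 2_300_000)     # Roitblat, Kershaw, and Oot figures
print(fannie_mae, verizon)
```

Both results sit above the $5.00 per file rule of thumb, which is the point of citing them.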

So, if your experience suggests a starting review rate higher than $5.00 per file, there is legal justification to use a higher number. Just be prepared to go to the next steps and back it up.

The price per file is just a starting point, a way to get a quick picture, a quick estimate, without doing all of the detail work. A more accurate picture starts to emerge with sample reviews and more detailed analysis of the tasks required in the review and the actual data to be reviewed. You have to, as I like to say, get your hands dirty in the digital mud. You have to know your ESI collection. Even in just one type of ESI, the one most common in e-discovery today, email and attachments, the variances in email collections can be tremendous.

Once you get your hands on the data you need to start to break down and analyze the time involved in the various tasks required in the review project. Here, as in construction estimation, the spreadsheet is your friend. This move to actual examination of the ESI at issue, and study of the specific review tasks that need to be performed in your case, is equivalent to the move in construction estimation from rough estimates based on average per square foot prices, to a careful study of the building’s plans and specifications, and a site visit with inspection and measurements of all relevant conditions. No builder would bid on a project without first doing the detailed real world estimation work.

Even in the same organization, and just dealing with email, the variances between custodians can be tremendous. Some for instance may have large amounts of privileged communications. This kind of email takes the most time to review, and if relevant, to log. High percentages of confidential documents, especially partially confidential, can also significantly drive up the costs of review. All of the many unique characteristics of ESI collections can affect the speed of review and total costs of review. That is why you have to look at your data and test sample the emails in your collection to make accurate predictions. Estimation in the blind is never adequate. It would be like bidding on a building without first studying the plans and specs.

Even when you have dealt with a particular client’s email collection before, a repeat customer so to speak, the estimates can still vary widely depending on the type of lawsuit, the issues, and on the amount of money in controversy or general importance of the case.

Although this may seem counter-intuitive, the truth is, the complex, big-ticket cases are the easiest ones in which to do e-discovery, especially if your goal is to do so in a proportional manner. If there is a billion dollars at issue, a reasonable budget for ESI review is pretty big. On the other hand, proportional e-discovery in small cases is a real challenge, no matter how simple they supposedly are. Many cases that are small in monetary value are still very complex. And complex or not, all cases today have a lot of ESI.

The medium size to small cases are where my bottom line driven proportional review has the highest application for cost control and the greatest promise to bring e-discovery to the masses.

The Quest for Gold

In Secrets of Search Parts One, Two and Three, I outlined the five key characteristics of search today, using the rubric of secrets. To support my outline I used the latest scientific research on legal search, and focused on the work of William Webber. See Webber, Re-examining the Effectiveness of Manual Review. In Part Three I summarized my ideas on search and review using the symbol of the Pythagoreans, the five-sided polygon, or pentagon:

With this blog on Bottom Line Driven Proportional Review I add a sixth idea, where the process gets real and takes money into consideration. Here I have shared my method to use estimation, projections, budget, cooperation, transparency, and the legal doctrine of proportionality to control the costs of search and review. With this final piece my proposal for a new gold standard of search and review is complete.

Bottom Line Driven Review is a method to try to control the key problem in electronic discovery law today, the runaway costs of review. The number of documents we have to review seems to double every two to three years, so this new legal method is imperative. New and better software, especially predictive coding type, is also important. As shown, the ranking of relevancy and other categories built into the latest algorithms is, under my bottom line driven analysis, an especially helpful new capability. You rank the documents within your budget limit that the computer predicts, based on your training, will be the most relevant to your case. But new technology alone is not enough. We must also have new legal methods. Technology and law have to work together, grounded in science, to create a new gold standard.

In Secrets of Search Part II, I proposed a new gold standard, one that would replace the now disgraced old-gold brute-force manual review unassisted by technology. I drew upon the findings in the latest scientific research, legal literature, and my over thirty years of experience with discovery to create a first draft list of the nine criteria of the new gold. The first criterion listed was Bottom Line Driven Proportional Review, which I promised to explain later and have now done so. Here is how I put it in Part II:

The old gold standard of average human reviewers, working in dungeons <smile>, unassisted by smart technology, and not properly managed, has been exposed as a fraud. What else do you call a 28% overlap rate? We must now develop a new gold standard, a new best practice for big data review. And we must do so with the help and guidance of science and testing. The exact contours of the new gold are now under development in dozens of law firms, private companies, and universities around the world. Although we do not know all of the details, we know it will involve:

  1. Bottom Line Driven Proportional Review where the projected costs of review are estimated at the beginning of a project (more on this in a future blog);
  2. High quality tech assisted review, with predictive coding type software, and multiple expert review of key seed-set training documents using both subject matter experts (attorneys) and AI experts (technologists);
  3. Direct supervision and feedback by the responsible lawyer(s) (merits counsel) signing under 26(g);
  4. Extensive quality control methods, including training and more training, sampling, positive feedback loops, clever batching, and sometimes, quick reassignment or firing of reviewers who are not working well on the project;
  5. Experienced, well motivated human reviewers who know and like the AI agents (software tools) they work with;
  6. New tools and psychological techniques (e.g. game theory, story telling) to facilitate prolonged concentration (beyond just coffee, $, and fear) to keep attorney reviewers engaged and motivated to perform the complex legal judgment tasks required to correctly review thousands of usually boring documents for days on end (voyeurism will only take you so far);
  7. Highly skilled project managers who know and understand their team, both human and computer, and the new tools and techniques under development to help coach the team;
  8. Strategic cooperation between opposing counsel with adequate disclosures to build trust and mutually acceptable relevancy standards; and,
  9. Final, last-chance review of a production set before going out the door by spot checking, judgmental sampling (i.e. search for those attorney domains one more time), and random sampling.

I have probably missed a few key factors. This is a group effort and I cannot talk to everyone, nor read all of the literature. If you think I have missed something key here, please let me know. Of course we also need understanding clients who demand competence, and judges willing to get involved when needed to rein in intransigent non-cooperators and to enforce fair proportionality. Also, you should always go for confidentiality and clawback agreements and orders.

I repeated this nine-point list of the new gold in Part III of Secrets of Search, and again repeated my invitation for input with a comment on standards that bears repetition:

I have probably missed a few key factors. This is a group effort and I cannot talk to everyone, nor read all of the literature. If you think I have missed something key here, please let me know. I will be at Legal Tech New York for three days with four presentations. Seek me out and let’s talk. You can reach me at ralph.losey@gmail.com.

You may note that I am herewith joining the call of other leaders in the field to develop best practice standards, notably including Jason Baron, and have overcome my initial reluctance to go there for a variety of reasons. See Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search, XVII RICH. J.L. & TECH. 9, at 29-33. My concerns on arbitrary standards and unfounded malpractice claims remain, but I think we have no choice but to develop some basic industry standards. The nine characteristics of good document review outlined above constitute a first modest step in that direction.

I will be at LegalTech NY on January 30th, 31st and February 1st. My invitation for dialogue and input from readers continues. Seek me out and let’s talk, but spare me the sales pitches, please. (I am, however, open to writing pitches.) My main focus right now is the quest for a new gold standard of search and review. I know that many of you share this quest, so let’s use the power of groups, or teamwork, to make it happen. What do you think of my nine-point first draft list? Any suggestions to add new criteria, consolidate, or add to what any of these nine mean?

Two of my readers have already responded to my outreach for input, Bill Hamilton and Larry Chapin. They provided some more concrete details on criterion number six in the list, new tools and psychological techniques. They submitted an essay on one such new technique, which is actually of ancient origin, and well-known to the best trial lawyers, namely the use of story and storytelling to improve legal review. Storytelling: The Shared Quest For Excellence in Document Review. If you have ideas for an article, please send me an email with outline or first draft and I will consider it for possible publication. As coaches everywhere love to say, there is no “i” in team, even if there is in e-Discovery Team ® <grin>.

Conclusion

I have a dream, like all humans do. It is one of the key attributes that separates us from machines. My dream is not as noble or stirring as the public dreams of Martin Luther King or John Lennon. Those were grand dreams indeed. But my dream is important to me, and you can probably relate to it. I dream of a day where man and computer work together to bring truth and justice for all, not just the elite few who can afford it now. I dream of a day where e-discovery is affordable and used in all size cases. This dream of truth and justice for all is deeply rooted in my psyche. I suspect it is in yours too. We all grew up understanding the importance of Superman’s never-ending battle for truth, justice, and the American way. Join me in this battle. Join the e-discovery team fight for truth, justice, and the American way. (For my many readers outside of the U.S., my American way reference is not meant to be nationalistic or exclusionary, but rather to refer to the highest ideals of a great country.) All professionals in the field are invited. So too are computers, especially the latest generation of super smart ones. Yeah, their programmers too. All behind the scenes coders, techs, and scientists are an important part of the e-Discovery Team.

High tech lawyers working with computers and their handlers are key to my version of the archetypical American dream of truth and justice. Techs and computers helped bring about the nightmare we must now overcome – the explosion of ESI that hides the truth and makes justice too expensive. They helped get us into this mess, they can help get us out. We cannot turn back.

Jason Baron’s depressing prophesy of information dystopia, where we all drown in a flood of information, is no prophetic dream. It is a realistic assessment of the current state of the law and the discovery of electronic evidence. The reality today is that the vast majority of lawyers avoid the discovery of information in computers, even though that is where the truth lies. They have a prejudice against it. They believe in the inherent superiority of paper. We all know that most of the truth left paper filing cabinets over a decade ago (with the sole exception, perhaps, of the federal government), yet most lawyers still look there, and only there, for justice. Jason’s dream of extreme information overload is a projection of the current reality getting worse. He speaks the truth, but only if we don’t do something about it, if we don’t continue in the never-ending battle. Truth and justice can triumph. They must.

The motivation is clear. So is the solution. Imagine a world where the fool’s errand, the paper chase, comes to an end. Imagine a world where these old ways, based as they are on ignorance and delusion, are replaced by an affordable and effective process of e-discovery. Imagine a world where all the people, the litigants in all size cases, can all afford to do e-discovery. Imagine all the people living life in justice. It isn’t hard to do. As John Lennon said: You may say I’m a dreamer, but I’m not the only one. I hope some day you’ll join us. And the world will be as one.

I have a dream of a new method of technology assisted discovery, where Man and Machine work together to find the core truth. This day will come, in fact it is already here. As William Gibson said: “The future is already here – it’s just not evenly distributed yet.” The key facts you need to try a case and to do justice can be found in any size case, big and small, at an affordable price. But you have to open your mind. You have to embrace change and adopt new legal and technical methodologies. The Bottom Line Driven Review method is, I suggest, an important part of that answer. It is working for me today, it can work for you too. Our dreams can come true. The nightmare scenarios of justice for only the super-rich can be avoided. The battle for truth and justice must continue.

I see a way out, where we can overcome, where truth and justice can be attained for all the people. I see a day where the truth in our computers can be found and brought to the court room for justice to prevail.

Although I also have a dream of a new generation of tech-smart lawyers, who understand and apply the new methods, the new gold, to keep e-discovery available for all, we do not need to wait for this slow gradual change. We can win the battle now, even without the young geek Supermen. The time for change is now – in this generation, not the next. As King said in his I Have a Dream speech, this is no time to take the tranquilizing drug of gradualism.

Join me in the dream of e-truth and justice today. As a team we can get there, we can and shall overcome, we shall be free of the paper-prejudices of the pre-computer world. And when that day comes, let the bells of truth and justice ring throughout the world. To quote the end of King’s famous speech:

And when this happens, when we allow freedom to ring … [we] will be able to join hands and sing in the words of the old Negro spiritual, “Free at last! free at last! thank God Almighty, we are free at last!”


Storytelling: The Shared Quest For Excellence in Document Review

January 8, 2012

Guest Blog by William F. Hamilton and Lawrence C. Chapin.

Bill Hamilton is an attorney with nearly thirty years of experience in business litigation who is a partner at Quarles & Brady. Bill also serves as the Dean of the E-Discovery Department of Bryan University, which includes an online educational program in e-discovery project management. Bill is also an Adjunct Law Professor teaching Electronic Discovery and Digital Evidence at the University of Florida, and has frequently contributed to this blog. See, e.g., The E-Discovery Crisis: An Immediate Challenge to Our Nation’s Law Schools, and The E-Discovery Sanctions Cube.

Larry Chapin is an attorney with 30+ years of experience, including corporate Wall Street law, who now works as a contract review lawyer in New York City. Larry has taught at the New School for Social Research in NYC and currently serves on the Board of Directors for an asset management company in Stockholm, Sweden. Larry is the first graduate of our e-Discovery Team Training program. He contributed a must-read blog here earlier this year entitled Contract Coders: e-Discovery’s “Wasting Asset”?

_____________

EDITOR’S NOTE: Over the last several blogs on the Secrets of Search we have examined the latest scientific research on manual and automated reviews. The research shows that although brute-force manual linear review is as dead as a doornail, or should be, there is still an important place for skilled human reviewers and review, even in the latest predictive coding models. But the emphasis is on skilled human reviewers and skilled methods. Simply asking some lawyers to look at documents all day on a computer screen for weeks on end and decide relevance or not is unacceptable. If that is how you conduct manual reviews, and just bid things out to the lowest paid reviewers, then you are inviting error. You probably would be better off turning it over to the Borg, and just skipping final quality control reviews altogether. But if you care about quality, if you are diligent in the protection of client confidentiality – and as a lawyer you have a clear ethical duty to do so – then you must improve and innovate on manual review. This guest blog by professional reviewer, Larry Chapin, and an expert in e-discovery and project management, Bill Hamilton, helps show the way.

In Part III of Secrets of Search I listed a nine-point checklist for quality reviews. Point number six was: “New tools and psychological techniques (e.g. game theory, story telling) to facilitate prolonged concentration … ” This guest blog will flesh out a new approach that Chapin and Hamilton have developed to use storytelling to improve the quality of contract reviews. I think this is a great idea. Lawsuits are essentially a battle of competing stories. They can become high drama as the Casey Anthony trial that took place across from my office in Orlando showed in 2011. Good trial lawyers already know the importance of story to a case. They should quickly understand this idea and appreciate how this new review technique could help their cause. All attorneys, and especially companies that do contract review work, should look into including this new technique into their projects. Feel free to email Bill Hamilton or Larry Chapin to see how they may be able to assist.

_____________

By putting its faith in logic, control and optimization, command-and-control management has lost sight of the crucial role that passion plays in human action.

Stephen Denning, The Leader’s Guide to Storytelling

_________

Storytelling: The Shared Quest For Excellence in Document Review

by William F. Hamilton and Lawrence C. Chapin

What is the future of large-scale human document reviews? With the startling advances of search technology, is human document review about to be consigned to the dustbin of history? Some believe so. Yet, others think that the death of human review has been grossly exaggerated. There is no doubt that computer assisted reviews will be increasingly important for large and even moderate scale reviews. However, the contest between human and computer, between manual and automated review is far from over. In this blog, Ralph Losey recently discussed some of the implications of the fascinating work of information scientist William Webber. It seems that in the proper setting, the best human reviewers can still out-perform the automated review.

Watson may be the Jeopardy winner, and IBM’s Deep Blue the chess champion, but the identification and evaluation of documents in the litigation context stretches the utility of computer algorithms. In the document review setting, well-trained, well-led and properly motivated women and men are, in fact, able to excel. How can we build reviews to maximize human review performance? What can be done about the powerful disincentives of long hours of dreadfully monotonous work at rates of pay already low and still in decline? Put more constructively, what can be done to tap the intelligence, marshal the talents, and harness the energies of the contract lawyers who fill the ranks in the typical review? How do we rid ourselves of the upstairs-downstairs mentality that isolates and confines our reviewers, turns them into servants and cripples their reviews?

We believe that the answer may be found in an approach to document review that harkens back to a simpler time, before litigators faced the enormous volumes of documents common in our digital age. That is to say, answers are to be found in building reviews around the art of storytelling. Shakespeare was right: all the world’s a stage, and all the men and women merely players. Certainly, litigation is drama. It is the drama of competing and clashing human passions. It is the stuff of stories. Document reviews must be understood as a central player in the litigation storytelling process.

A fundamental shift in the way that lawyers think, speak, and conduct document reviews is required. We propose a new paradigm. We propose building “story-centric” reviews. First, though, let’s face it. Storytelling usually gets a hard knock. It’s for children. It’s the stuff of fairy tales. Storytelling is said to have no place in the hard-edged, logic-driven, command-and-control culture to which the legal and business communities have grown accustomed. Euphemisms – like “business narrative” – have been invented so that stories might have a place of some kind in the working world.

Yet, storytelling has long been a part of lawyering. Good trial lawyers have always known that cases are won on the strength of their story. Even crazy ones can be convincing. Empirical studies show that appellate briefs, too, are more persuasive if they tell stories rather than rely on logic alone. A case can’t resonate with a judge or jury – emotionally, intellectually, or intuitively – unless it’s tied to a compelling story. The litigation team itself can’t know what evidence most belongs before the court unless it knows the story to which the evidence belongs. The discovery process serves to yield the elements and the contours of the story, and shed light on the connections between the cause and effect that are at its heart. It is the job of the entire team – including document reviewers – to construct the most persuasive story possible, and to diminish and discredit the tale told by the other side.

Our experience, unfortunately, is that too many lawyers separate document review from that creative process. They fail to see document reviewers for what they are: investigators sharing fully in the common tasks of discerning, shaping, and telling the client’s story. This kind of engagement requires that the review structure and evaluation adopt the elements and language of the story. It’s an orientation that triggers active reviewer participation and has real potential to address the problems now plaguing review. We believe that the failure to engage the review team in this way results in a process that is less true and just than it might be.

Suggestions to Add Story to Document Reviews

Accordingly, we offer a series of suggestions for the use of storytelling in the discovery process, toward building a story-centric review.

First, at the outset, use the client’s story and its themes to define the goal of the review project. Articulate clearly the central purpose of every reviewer’s contribution: to enable the story to be told. The story needs to remain the constant center of their focus. We might liken reviewers to crew members who sailed in search of new lands during the great age of exploration. Not every day was filled with adventure. During more days than we realize, their ships were becalmed on windless seas. What got them through those days was their purpose for being there, the vision of things that had launched their journey. So they kept their focus, mindful always of the possibility of a sighting and the promise of discovery. In that way, let the story of the case be what drives and sustains the review team. Remember that the critical document, like new land, may be just a moment away. Everyone needs to stay alert.

Project metrics should be designed to reflect this orientation. Story-centric metrics should measure: linkage, the degree to which documents pull the story together tightly to help tell the tale; gravity, the degree to which the document collection gives weight, heft and power to the tale; and resonance, the degree to which documents provide compound richness to the story.

“Linkage Docs” provide the basic story line. They establish the necessary cause and effect that transforms otherwise isolated facts into a real story. They reflect the fact that every story is composed of details that unfold at a particular time and place, over a span of time, and that involve human motivations and conflicts. They are the sinews, the connecting tissues without which a story does not exist. For example, in a case involving a business breach of contract for failure to maintain premises, a document that shows the defendant’s financial distress shortly before the breach establishes linkage. Linkage allows the story to begin to congeal.

Links are related to gravity, but different. “Gravity Docs” are those documents that move the story events out of stasis towards resolution. They function as a pivotal column or anchor that marks a transition, direction or resolution within the story. We ultimately want links that tie to these pivotal columns. The documents with gravity are the turning point documents.

Finally, “Resonance Docs” are those documents that strike a chord in us. They evoke sympathies in ways that align us with the actors in the story. They establish decisive commonalities between persons hearing the story and those persons within it. In helping the story ring true, they persuade us. They lead us safely past any temptation to turn to unpersuasive clichés, triteness, and banality in telling the story. A document that provides resonance will tie story links (sub-plots) and pivotal gravity markers (the main plot) together.

The story can have links and pivoting documents, and still be unpersuasive. Resonating documents provide understanding, the “now I get it” feeling, and are often documents that directly speak to human motivation and intention (or give rise to strong presumptions of actual motivation). The irony of the traditional review is that a review team shackled by traditional coding blinders can row past a proverbial “smoking gun” document and not recognize its value to the story. Reviewers should not resemble the galley rowers portrayed in Ben-Hur who are driven to exhaustion as the pace of the review escalates to ramming speed.

The review team must be able to recognize documents with story-centric values, not merely label documents as responsive or non-responsive according to abstract coding rules. A good review team requires graphics. The review team’s identification of Linkage Docs, Gravity Docs and Resonance Docs composes the story as the review progresses. The review team needs to literally see the story mapped as it develops. The story-centric review replaces the traditional white board with a large story board that simultaneously shapes and is shaped by the review.

Linkage, gravity, and resonance can be seen as three overlapping circles. In practice, depending upon the story, the circles may vary in size and shape (e.g. oblong), but in the overlapping section we are likely to find the 7±2 documents that the trial team needs to tell the winning story.
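For teams that track review coding in software, the overlapping-circles idea can be sketched as a simple set intersection. The snippet below is only an illustration under stated assumptions: the `ReviewedDoc` record, the tag names, and the document IDs are hypothetical and do not come from any particular review platform.

```python
from dataclasses import dataclass, field

# Hypothetical tag names for the three story-centric values.
STORY_TAGS = {"linkage", "gravity", "resonance"}


@dataclass
class ReviewedDoc:
    """A reviewed document and the story tags reviewers gave it (illustrative only)."""
    doc_id: str
    tags: set = field(default_factory=set)


def core_story_docs(docs):
    """Return the IDs of documents that carry all three story tags,
    i.e. the documents in the overlap of the three circles."""
    return [d.doc_id for d in docs if STORY_TAGS <= d.tags]


# Made-up sample data for demonstration.
docs = [
    ReviewedDoc("DOC-001", {"linkage"}),
    ReviewedDoc("DOC-002", {"linkage", "gravity", "resonance"}),
    ReviewedDoc("DOC-003", {"gravity", "resonance"}),
    ReviewedDoc("DOC-004", {"linkage", "gravity", "resonance"}),
]

core = core_story_docs(docs)
print(core)
```

A list like `core` is a starting point for the story board, not a substitute for the team's judgment about which handful of documents actually tells the story.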

So invest your own time in a solid understanding of the client’s story. Invest more time still in discussing it with the review team, so that together you reach a shared grasp of its themes and important facts. This initial investment may turn out to be substantial, but the rewards will be enormous. Don’t make the mistake of spending more time on the software the team will be using than on the story they will be helping to tell.

On one project of which we are aware, the trial and discovery teams developed a highly detailed, rule-based review book. It was more than one hundred and fifty pages long, but devoted fewer than one hundred words in not even ten lines of text to actually telling the client’s story. Don’t do that. Don’t let a narrow focus on the chains of logic obscure the compelling threads of the underlying narrative.

Second, use storytelling with the review team to create a sense of quest. Remember again our metaphor of voyage. The reviewers are, of course, engaged in a real pursuit – weaving a tight, compelling story worthy of being told. Beyond that, quests intimate a feeling of authentic commitment – even a passion – among members of the review team. The power of the story transforms the document review experience. Stories have a unique ability to bind members of the team to a broader purpose, and to each other. As we work together, we are reminded of the human drama that has already unfolded for our client. We remember, too, that our own stories are still unfolding in our work together. On several levels, then, we feel connected. The present has new and important depth.

The organization of the review teams is critical to a sense of quest. The reviewers must identify with the quest to face its hardships and celebrate its victories. The review itself should be seen as a story that has drama, disappointments, dead ends, clues, and ultimately triumph. Banish forever the factory concept of document review as a mass production based on the principles of Taylorism and Fordism.

Third, use of a lawsuit’s stories serves to continually define and redefine the team’s analytical tasks, and to sharpen their focus as the review progresses. Use graphics and models to demonstrate the elements and cohesion of the story as the review is taking place. If the reviewers can’t understand and relate to your story, no judge or jury ever will. Emphasize that the story being told to them is provisional, and that their investigation may, in fact, bring about a retelling of the story. Reiterate key themes as you talk to the members of the team. Challenge them to discern both its strengths and its weaknesses. Provide opportunities for them to share their impressions and their hunches, their discoveries and concerns. This might be as simple and productive as it was on one recent project in which every day or so, one of the law firm’s associates on the case went among the reviewers and asked them, “What are you finding? What do you think?”

It is hard to exaggerate the importance of these interactions. They’re not drive-by questions that are all too easily answered with a yes or no. They are chances for leaders to demonstrate deep listening. They are open-ended invitations to contribute to the group’s learning. They are small streams of one-on-one talk that contribute to what Denning has called the river of conversation that keeps the project moving forward. They are also brief opportunities for members of the team to be acknowledged and affirmed in their work. The goal is to create short but meaningful exercises in team building and fleshing out the lawsuit’s story.

Fourth, share “discoveries” among the team. After all, many of the decisions made by reviewers are close calls, and need to be shared and socialized for consistency and accuracy. In part, this question of sharing is a matter for science. There are, no doubt, a wide variety of wiki-like technologies that might be brought to bear for purposes of shared learning. But there are several things to be remembered in that regard. First, the technologies seem to be variations on the same theme. That is, they provide ways in which reviewers can articulate their rule-based questions, which are then migrated upwards for consideration by someone on the trial team. The review team is then given access to a database containing all the questions and their answers. There are many other technologies available for broader, more open learning, but sadly they are rarely employed. It is ironic that in this digital era that has spawned massive reviews, few of the readily available social networking and communications tools have been applied to “humanize” the review process. Then again, the reason is clear: non-story-centric reviews seem to have little use for creativity and collaboration.

The reviewers should be organized into “review teams.” Review teams should ideally be small teams (10-15 reviewers) located in physical proximity. The identification of Linkage Docs, Gravity Docs, and Resonance Docs should be quickly shared and celebrated. Review team members should encourage one another. Review metrics should not exclusively focus on number of documents reviewed per hour. All genuine work and creativity has valleys and plateaus. A review should not be a forced march. The football team regroups in the huddle before each play as it creatively marches down the field. A good, productive review will have its own rhythm. To facilitate this rhythm the successes of one review team should be shared with other teams. Success encourages success and friendly goal-oriented competition. Reporting, feedback, and encouragement should be emphasized.

Why have we ignored the lessons of sports competition in our document reviews? Sports motivation coaches are paid millions to inspire athletes and teams. Yet in million-dollar reviews, and where even more is at stake in the litigation, we tolerate performance that would be banished elsewhere. What is needed are genuine “review coaches.”

Fifth, collaboration thrives on human face-to-face contact. The 2009 Text Retrieval Conference (TREC) validated this important point. The TREC team sponsored by the School of Information Sciences of the University of Pittsburgh was provided with shared digital space that allowed them to communicate with each other and to store and organize results. Early on, communication between the searchers consisted mostly of texting, with very little actual, verbal communication. Later on, as tasks became more difficult and the need to collaborate became greater, real talk between the searchers virtually replaced texting, as trust and familiarity developed.

The Pittsburgh team results suggest that while wiki-like technologies are useful in knowledge sharing, trust-based communication such as that involved in document review will gravitate towards ordinary face-to-face communication. They also remind us that, especially in knowledge sharing exercises, “talk is work” as Stephen Denning has said. This may be another surprise for readers. Absolute silence may not simply mean a focused project. It may signal a failure to share critical information.

It is precisely in such spontaneous conversations that members of the team draw from the pool of cognitive diversity. A good team will comprise individuals with different strengths, training and backgrounds. When left to themselves, high-functioning teams learn to take full advantage of their diversity. Good leaders will make sure that team members know their neighbors. Sadly, that rarely happens. On one project related to the life sciences, one reviewer had nearly a decade of law firm experience in that field. But the rest of the team never found out, because the supervisors never thought or wanted to ask. In another project involving the global capital markets, one of the reviewers had two decades of high-level experience trading financial instruments. He decided not to reveal that to anyone. Somehow, the message had gotten across to him that the smart approach to “surviving” document review was to “keep your head down.” It’s a saying you hear a lot on the project floor. What a terrible reflection upon the kind of “supervision” and “management” to which document reviewers are commonly subjected!

Sixth, use storytelling to generate the connections that will make document review a meaningful experience. The most profound concerns about document review have always revolved around the lack of connection between the purposes of the work and those doing it. Storytelling, on the other hand, is all about connections. Remember what stories are: accounts of causally connected events. So, document review is really an investigation into the nature of those connections. Further, stories are a shared human experience; we all have our own stories. In working together to formulate the story of the case, our own stories become part of the story of the group.

Storytelling establishes common meanings and transmits the values characteristic of high-performing teams. Denning writes that the most striking thing about being part of a great team is the meaningfulness of the experience. “People talk about being part of something larger than themselves, of being connected, or being generative … their experiences as part of truly great teams stand out as singular periods of life lived to the fullest.” We have seen the reviewers’ faces light up, their smiles appear, and genuine excitement erupt when participating in story-centric reviews.

In our view, these are issues of leadership, more than management. The dominant language of document review management reflects the values of traditional command-and-control culture. Such management is about structure, schedules, budgets and the like. This management operates out of hierarchical schemes and derives its presumed effectiveness from the power of authority. Naturally, such things have their place in well-run reviews, as most published literature attests. Metrics matter; things need to be measured and counted. But traditional measures of performance are not always the most revealing.

Consider, for example, the story told in the movie Moneyball about Billy Beane’s discovery that the “five tools” traditionally used to evaluate baseball players missed the mark. Metrics such as batting average and speed on the bases mattered, but they were really pointing to something else that was the most telling factor between ball players on winning and losing teams, that is, on-base percentage. What mattered was how often batters got on base by any means. What if the metrics relied upon in review command-and-control structures – such as documents per reviewer per hour – are off the mark?

Seventh, remember that the document review may have to be explained and defended. If challenged as to its reasonableness, the review will have its own story to be told. The McDermott case is now a powerful reminder of what may be at stake. Stories about the labors of well-equipped, fully engaged, and highly motivated reviewers are bound to be the most persuasive stories of all.

Conclusion

Good storytelling lies at the very heart of good litigation. Neither the information revolutions of the digital age, nor the dizzying advances of technology have changed that.

The challenge lawyers face is that of adapting the storytelling art to the requirements and capacities of our day. Discovery and review must articulate the client’s most compelling story. They must disable the counter-story told by the other side. Story-centric reviews serve as powerful levers for the other assets – both human and hard – committed to the work of review excellence. This is important work. Justice depends on a compelling story and injustices arise when we forget that.

