Now finally we come to the conclusion of this series on the Secrets of Search where all will be revealed. Secrets of Search: Parts One, Two, and Three. (Well, to be entirely honest, not all will be revealed. I’m still going to keep a few trade secrets up my sleeve for law partners and family.) As you can see by the photo, junior here was quite astonished by the latest revelations. I hope you will be too.
Recap of the First Three Secrets
Before I get to the fourth secret of search, I need to review the first three again and connect a few more dots. The first secret was already known to many. (Craig Ball said it was about as much of a secret as the square root of 256.) It was that keyword search, done alone, and as part of a blind Go Fish game of dueling attorneys, is remarkably ineffective. Keyword search only works when performed as part of an interactive, multi-modal process, one which uses constant sampling and review. Still, keyword search is yesterday’s (1960s) technology. No matter how many Boolean bells and whistles, interactivity, and quality controls you may add to keyword search, its only real strengths are familiarity and quick peeks. The future of legal search, the best promise for adequate precision and recall, lies in artificial intelligence software. By this I mean the so-called predictive coding algorithms, in which expert humans train computer agents, combined with ever-improving legal methods.
The second secret really was a secret, kind of like knowledge of the square root of two was in ancient Greece. This secret was little known outside of information science circles, whose members, in speech at least, tend to emulate the Pythagoreans in enigmaticalities. This secret is that the gold standard used to test precision and recall is, like keyword search, remarkably ineffective. That so-called gold standard is human review. This is a very imprecise, very fuzzy standard. The few studies we have on big data projects, ones where humans reviewed thousands of documents for days on end, reveal terribly inconsistent relevancy calls. (Not surprising when you consider how bleary-eyed and underpaid they were.) For instance, in the $14 Million Verizon project, human reviewers only agreed 28% of the time. This means that our yardstick for recall measurements has nothing smaller on it than a foot. All claims of precision within a few inches are bull. We really have no way of knowing that.
As information scientist William Webber notes, our maximum possible mean precision and recall rate (“F1”), measured objectively, is only 44%, and other studies suggest a somewhat higher maximum F1 of 66%. This is very significant because it means there is no objective basis to ever demand a recall rate better than 66%. A requesting party that asks for recall better than that is asking for something that cannot be reliably measured.
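For readers who want the arithmetic behind the F1 figure, it is simply the harmonic mean of precision and recall. A minimal sketch in Python, plugging in the ceilings discussed above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# When precision and recall are equal, F1 equals that common value,
# so a 66% ceiling on both measures caps F1 at 0.66.
print(round(f1(0.66, 0.66), 2))

# The harmonic mean punishes imbalance: high precision cannot
# rescue poor recall.
print(round(f1(0.75, 0.50), 2))  # -> 0.6
```

The practical point: if the human gold standard itself tops out around 0.44 to 0.66, no measured F1 above that range can be trusted.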
Logically, this also means random samples demanding a 95% confidence level with a ±2% confidence interval are unrealistically precise. Plus or minus 5% might be more realistic considering the vagaries of our measurements and subjective determinations. I favor random sample buttons on software, but I want our use of them to be realistic and not budget busting. What is the point of such accuracy when the underlying data is so fuzzy? Demands for a 99% confidence level, or a plus or minus 1% confidence interval, are completely misplaced and illogical. Our measuring stick is too imprecise to justify such large sample sizes. The experts who ask for that kind of delusional certainty have not understood the second secret. Either that, or they are just trying to drive up the costs of the other side’s quality control efforts.
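To see why tighter intervals bust budgets, here is a rough sketch of the standard sample-size formula for estimating a proportion (worst case p = 0.5; z-scores of 1.96 for 95% confidence and 2.576 for 99% are the usual textbook values):

```python
import math

def sample_size(z: float, margin: float, p: float = 0.5) -> int:
    """Minimum simple random sample size: n = z^2 * p(1-p) / margin^2,
    rounded up to the next whole document."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

print(sample_size(1.96, 0.05))   # 95% confidence, +/-5%: 385 documents
print(sample_size(1.96, 0.02))   # 95% confidence, +/-2%: about 2,400
print(sample_size(2.576, 0.01))  # 99% confidence, +/-1%: 16,590
```

Note how tightening from ±5% to ±2% more than sextuples the sample, and a 99% / ±1% demand requires over sixteen thousand review decisions, each of which is itself a fuzzy human judgment call.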
Still, sampling is a powerful tool if used right, and if you understand what it can, and cannot, do. For instance, it cannot by itself improve the accuracy of a search at all. It is just a tool to get an idea of how you are doing in your search processes. Since I am a strong proponent and have been urging all software providers to add random sample generators to their programs for years, I decided to practice what I preach and figured out a way to add one to this blog. It can now always be found on the blog sidebar on the right, identified as a Math Tool for Quality Control.
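A random sample generator of the kind described is only a few lines of code. A sketch (the seed parameter is my addition, there just to make a draw reproducible for audit purposes):

```python
import random

def qc_sample(corpus_size: int, n: int, seed=None) -> list:
    """Draw n distinct document numbers (1..corpus_size),
    uniformly at random, for quality-control review."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, corpus_size + 1), n))

# e.g., pick 385 documents for QC out of a 100,000-file corpus
picks = qc_sample(100_000, 385, seed=2012)
print(len(picks), min(picks), max(picks))
```

Each document has the same chance of selection, which is what makes the resulting confidence interval statistics valid.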
The third secret is that even though humans are terrible at large-scale reviews, it is a completely different story when dealing with small-scale reviews. When reviewing small sets of data, in the 500-1,000 document range (this is the number of documents reviewed by the individual TREC reviewers), there were several professional reviewers in TREC who were more precise and had better recall than the best computer systems, even though they were not subject matter experts and had no access to such experts. Even a couple of the law students won a few times. Webber’s analysis showed that the complete demise of human reviewers has been grossly exaggerated. See Webber, Re-examining the Effectiveness of Manual Review.
Although pure manual review is good for a few hours, it is poor and inaccurate over large scales, as the second secret revealed. Even if it were not, manual review is far too expensive and slow for large-scale review projects. We cannot go it alone. We need the machines. But we also need to keep the arts alive, the special skills of persuasion and evidence evaluation that we lawyers have refined over centuries. (More on that in the fifth secret at the end of this blog.)
Requesters who demand production with only machine review, and any responders foolish enough to comply, have not understood the third secret. It is way too risky to turn it all over to the machines. They are not that good! The reports of their excellence have been grossly over-stated. Humans, there is need for you yet. The Borg be damned! Jobs may have passed away, but his work continues. Technology is here to empower art, not replace it. (For more on this see the blog comments at the end.)
Webber’s research, and the common experience of our best law firms and vendor review teams nationwide, suggest that a hybrid multi-modal combination of both manual and machine review is the best approach. The new emerging gold standard uses the talents of both and a variety of automated tools. It also uses extensive interactivity between humans, and between humans and machines. In Part Two of Secrets of Search I suggested nine characteristics of what I hope may become an accepted best practice for legal review worldwide. I invited peer review and comments on what I may have left out, or any challenges to what I put in, but so far this list of nine remains unchallenged:
- Bottom Line Driven Proportional Review where the projected costs of review are estimated at the beginning of a project (more on this in the next blog);
- High quality tech assisted review, with predictive coding type software, and multiple expert review of key seed-set training documents using both subject matter experts (attorneys) and AI experts (technologists);
- Direct supervision and feedback by the responsible lawyer(s) (merits counsel) signing under 26(g);
- Extensive quality control methods, including training and more training, sampling, positive feedback loops, clever batching, and sometimes, quick reassignment or firing of reviewers who are not working well on the project;
- Experienced, well motivated human reviewers who know and like the AI agents (software tools) they work with;
- New tools and psychological techniques (e.g. game theory, storytelling) to facilitate prolonged concentration (beyond just coffee, $, and fear) to keep attorney reviewers engaged and motivated to perform the complex legal judgment tasks required to correctly review thousands of usually boring documents for days on end (voyeurism will only take you so far);
- Highly skilled project managers who know and understand their team, both human and computer, and the new tools and techniques under development to help coach the team;
- Strategic cooperation between opposing counsel with adequate disclosures to build trust and mutually acceptable relevancy standards; and,
- Final, last-chance review of a production set before going out the door by spot checking, judgmental sampling (i.e. search for those attorney domains one more time), and random sampling.
I have probably missed a few key factors. This is a group effort and I cannot talk to everyone, nor read all of the literature. If you think I have missed something key here, please let me know. I will be at Legal Tech New York for three days with four presentations. Seek me out and let’s talk. You can reach me at firstname.lastname@example.org.
You may note that I am herewith joining the call of other leaders in the field to develop best practice standards, notably including Jason Baron, and have overcome my initial reluctance to go there for a variety of reasons. See Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search, XVII RICH. J.L. & TECH. 9, at 29-33. My concerns on arbitrary standards and unfounded malpractice claims remain, but I think we have no choice but to develop some basic industry standards. The nine characteristics of good document review outlined above constitute a first modest step in that direction.
The Fourth Secret of Search:
Relevant Is Irrelevant
Sorry to sound like one of Steve Jobs’ Zen Masters, but a contradiction like Relevant Is Irrelevant has more impact than the technically more accurate statement, which is: merely relevant documents in big data reviews are irrelevant as compared to highly relevant documents. In other words, all that counts in litigation are the hot documents, the highly relevant ones with strong probative value, not the documents which are just relevant, not to mention just responsive. In fact, in big data collections, I couldn’t care less about merely relevant documents. Their only purpose is to lead me to highly relevant documents. Moreover, as we will see in the fifth and final secret, I only care about a handful of those.
In a case involving tens of thousands of documents, much less hundreds of thousands of documents, or millions of documents, almost all of the documents that are merely relevant will not be admissible into evidence. (I’ll explain why in a minute.) For that reason alone their discovery should be subject to very close scrutiny. The gathering of evidence for admission at trial is, after all, the only valid purpose of discovery. Discovery is never an end in itself, although many litigators (as opposed to true trial lawyers) and vendors often lose track of that basic truth. Discovery is only permitted for purposes of preparation for trial. It is never permitted to extort one side into a settlement to avoid the costs of a document review, or to at least gain a strategic edge, although we all know this happens all of the time.
Why won’t most merely relevant evidence be admissible as evidence, you may wonder? For the same reason that most of the even highly relevant evidence won’t be admissible. Even though relevant, this evidence is a cumulative waste of time, and for that reason is inadmissible under Rule 403 of the Federal Rules of Evidence and its state law equivalents. To refresh your memory on the rule:
Rule 403. Excluding Relevant Evidence for Prejudice, Confusion, Waste of Time, or Other Reasons.
The court may exclude relevant evidence if its probative value is substantially outweighed by a danger of one or more of the following: unfair prejudice, confusing the issues, misleading the jury, undue delay, wasting time, or needlessly presenting cumulative evidence.
Also see Rule 611. (“The court should exercise reasonable control over … presenting evidence so as to … (2) avoid wasting time”)
The typical fact scenario used in law school to exemplify the principle of cumulative evidence is a situation where 100 witnesses see the same accident. Each would give roughly the same description of the event and the testimony of each would be equally relevant. Still, the testimony of 100 witnesses would never be allowed because it would be a waste of time, and/or a needless presentation of cumulative evidence, to have all 100 repeat the same facts at trial. The same principle applies to documentary evidence. If there are 100 emails that show essentially the same relevant fact, you cannot admit all 100 of them. That would be a cumulative waste of time.
The question of admissibility presented in Federal Rule of Evidence 403 requires a balancing of the costs and benefits of logically relevant evidence. This is sometimes referred to as the Rule 403 balancing test. This is similar to the balancing tests in Rule 26(b)(2)(C)(i) and (iii) of the Federal Rules of Civil Procedure between the benefits and burdens of discovery.
26(b)(2)(C) The frequency or extent of use of the discovery methods otherwise permitted under these rules and by any local rule shall be limited by the court if it determines that:
(i) the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some other source that is more convenient, less burdensome, or less expensive; … or
(iii) the burden or expense of the proposed discovery outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties’ resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues.
New e-discovery Rule 26(b)(2)(B) has a similar balancing test for hard-to-access ESI. So too does Rule 26(g) that requires only a reasonable inquiry of completeness in a response to discovery. Perhaps more importantly, Rule 26(g)(1)(B) also prohibits any request for discovery made “for any improper purpose, such as to harass, cause unnecessary delay, or needlessly increase the cost of litigation” and prohibits any request that is unreasonable or unduly burdensome or expensive “considering the needs of the case, prior discovery in the case, the amount in controversy, and the importance of the issues at stake in the action.” All the rules point to reasonability in discovery, and yet in e-discovery we routinely engage in unreasonable, cumulative overkill. See Patrick Oot, Anne Kershaw and Herbert L. Roitblat, Mandating Reasonableness in a Reasonable Inquiry, Denver University Law Review, 87:2, 522-559, at 537-538 (2010).
The rules clearly state that cumulative evidence is not, or at least should not be, subject to discovery. It would be a waste of time and money. Thus even though the documents might be relevant, if they are unreasonably cumulative, repetitive, or duplicative, such that the burden outweighs the benefit, they are not only inadmissible as evidence, but they are, or should be, outside the scope of discovery.
This is buttressed by the prime directive of the Federal Rules of Civil Procedure, Rule 1. It requires all of the other rules of procedure to be interpreted and applied so as to make litigation just, speedy and inexpensive.
In spite of the clear law against cumulative, overly burdensome discovery, lawyers and judges faced with big data cases today still routinely engage in discovery overkill. A 2010 survey of large cases that went to trial in 2008 showed that, on average, 4,980,441 pages of documents were produced in discovery, but only 4,772 exhibit pages were entered into evidence. Duke Litigation Cost Survey of Major Companies (2010) at pg. 3. That is a ratio of over one thousand to one! Also see DCG Sys., Inc. v. Checkpoint Techs., LLC, No. C-11-03792 PSG, 2011 WL 5244356 (N.D. Cal. Nov. 2, 2011) (little benefit to justify burden of large scale email production because on average only “.0074% of the documents produced actually made their way onto the trial exhibit list” and in appeals “email appears more rarely as relevant evidence”).
These are absurd numbers for a variety of reasons. The 4,772 pages admitted into evidence is ridiculous overkill, as will be shown further in the fifth secret, and so is the number of documents produced. The producing parties, acting in concert and cooperation with the requesting parties, should do a better job of culling out the irrelevant and marginally relevant documents. They are not needed for trial preparation.
This so-called Duke Survey, which was commissioned by the Lawyers for Civil Justice, not Duke, also offered an opinion, convergent with my own, that such discovery is excessive (although we disagree on causation):
Whatever marginal utility may exist in undertaking such broad discovery pales in light of the costs. … Reform is clearly needed. A discovery system that requires the production of a field full of “haystacks” of information merely on the hope that the proverbial “needle” might exist and without any requirement for any showing that it actually does exist, creates a suffocating burden on the producing party. Despite this, courts almost never allocate costs to equalize the burden of discovery.
The Fifth Secret of Search: 7±2
Should Control All e-Discovery (But Doesn’t)
We have already established that the purpose of discovery is to prepare for trial. But what is the purpose of a trial? We have to understand that in order to grasp the fifth secret: 7±2. The purpose of all trials is to persuade. A trial is a time and place, a level playing field, where lawyers try to persuade a judge and/or jury as to what happened and what should be done about it.
In this place of trial of humans, by humans, the rule of 7±2 reigns supreme. It always has and, unless we allow robots as jurors, always will. Unfortunately, most litigators are unaware of this rule of the transmission of information, or, if they do know of it, fail to see its connection to discovery and search. The rule of 7±2 currently has little place in e-discovery analysis.
It is a secret, and because it is unknown, we have gone astray in e-discovery. Because this secret is unknown, vast sums of money are routinely wasted in the production of fields full of “haystacks” of information. Because the secret has not yet been heard, and its clear implications have not yet been understood, trial lawyers everywhere still scratch their heads in disbelief at the mere mention of e-discovery. Yes, this secret is also the key to the seventh insight into widespread lawyer resistance to e-discovery analyzed in Tell Me Why?
I have alluded to this rule of seven in a few past blogs, and discussed it at a few late night dinners. But this is the first time I have written at length on the magic power of seven, plus or minus two. I hesitate to go to this deep place of information transmission and cognitive limitations, but, in order to keep the search for truth and justice on track, we really have no choice. We must, like the Pythagoreans of old, consider the significance of the number seven and its impact on our work, especially on our conceptions of proportionality.
The fifth secret of search is based on the legal art of persuasion and the limitations of information transmission. The truth is, no jury can possibly hold more than five to nine documents in their head at a time.
It is a waste of time to build a jury case around more documents than that. Judges who are trained in the law, and are quite comfortable with documents, can do a little better, but not that much. In a bench trial you might be able to use eight to twelve documents to persuade the skilled judge. But even then, you may be pushing your luck. Judges, after all, have a lot on their mind, and your particular case is just one among hundreds (in state court make that thousands).
Computers Expand Document Counts, Not Minds
Even though the computerization of society has exploded the number of documents we retain a trillion-fold, the ability of the human mind to remember and process has remained the same. We still can only be persuaded by a handful of writings. That is all of the information we can retain. Presenting dozens of documents is a waste of time. The only reason to present more than five to nine documents at trial is to provide context and an evidentiary foundation. The few dozen other documents that you may need at trial are merely window dressing, a frame for the real art.
A computer can easily process and recall millions of documents, and can do so in minutes, but we cannot. Even fast readers are limited to about 500 words per minute, or a skim-review rate of 1,000. No matter how much time we may have, and in legal proceedings the time is always constrained, our ability to read, understand, and comprehend relevant writings is limited. This is especially true in the high pressure and expedited schedule of a trial and formal presentation of evidence in court. That is why all experienced trial lawyers I have talked to agree that the average juror is likely to remember and be influenced by only a handful of documents. By the way, this rule of seven in persuasion is a corollary to the KISS principle (“keep it simple, stupid”), well known to all persuaders, along with “tell-tell-and-tell.”
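The arithmetic here is sobering. A back-of-the-envelope sketch, using the 500 words-per-minute fast-reader rate noted above and an assumed average of 300 words per document (that average is my assumption for illustration, not a measured figure):

```python
docs = 1_000_000
words_per_doc = 300        # assumed average; not from any study
words_per_minute = 500     # fast-reader rate noted above

minutes = docs * words_per_doc / words_per_minute
days = minutes / 60 / 8    # eight-hour review days
print(f"{days:,.0f} eight-hour days to read a million documents")
```

Under those assumptions, one reader would need well over a thousand working days just to read, never mind analyze, a million-document collection.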
Although most trial lawyers learn this just from hard experience, there is good theoretical support in psychology for such memory limitations. It is sometimes called Miller’s Law, after cognitive psychologist George A. Miller, a professor at Princeton University. Professor Miller first described this limitation of human cognition in his 1956 article: The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information, Psychological Review 63 (2): 81–97. This is supposedly the most widely quoted psychology paper of all time. According to Wikipedia, Miller’s paper suggests that seven (plus or minus two) is the magic number that characterizes people’s memory performance on random lists of letters, words, numbers, or almost any kind of meaningful familiar item. He essentially found that human beings were only capable of receiving, processing and remembering seven (plus or minus two) variables at any one time.
Professor Miller ends his famous paper on the limits of our capacity to process information with this somewhat odd remark, especially considering his reputation as a scientist:
What about the magical number seven? What about the seven wonders of the world, the seven seas, the seven deadly sins, the seven daughters of Atlas in the Pleiades, the seven ages of man, the seven levels of hell, the seven primary colors, the seven notes of the musical scale, and the seven days of the week? What about the seven-point rating scale, the seven categories for absolute judgment, the seven objects in the span of attention, and the seven digits in the span of immediate memory? For the present I propose to withhold judgment. Perhaps there is something deep and profound behind all these sevens, something just calling out for us to discover it. But I suspect that it is only a pernicious, Pythagorean coincidence.
George A. Miller, The Magical Number Seven, Plus or Minus Two (1956), 42-3.
Apparently some psychologists think Professor Miller overestimated the average human capacity when he put it at between five and nine. They think the limit is more likely to be from two to six, and that the magic number is 4, not 7. Farrington, Jeanne, Seven plus or minus two, Performance Improvement Quarterly 23 (4): 113–6, doi:10.1002/piq.20099 (2011).
In any event, it is not hundreds of documents, much less thousands or millions. Yet in an average large case today 4,980,441 pages of documents are produced and 4,772 pages allowed into evidence. What is wrong with this picture? The discovery chase has lost track of the goal.
An experienced trial lawyer, who may use hundreds of exhibits in a very large trial for context and technical reasons, will still only focus on five to nine documents. They know jurors cannot handle more information than that. They know the rest of the documents that go into evidence will have little or no real persuasive value.
The limitations of the human mind thus provide a consistency and continuity with the trials and systems of justice of our past pre-computer civilizations. No matter how many more documents may exist today within the technical scope of legal relevance, our jurors’ capacities are the same; the art of legal persuasion remains the same.
These mental persuasion limits provide a governor on the number of documents useful to a trial lawyer, judge, and jury. The human mind has its limits. Computer discovery must start to realize these limits and take them into consideration. This is a basic truth that we e-discoverers have lost sight of.
It is the core of why most old-time trial lawyers think the whole business of e-discovery is ridiculous. It is high time for the secret of seven to be outed and, more importantly, to be followed. The rule of seven should have significant consequences on our legal practice and scientific research.
Uneducated Searchers Will Never Find the Top 7±2
The location of these few highly relevant documents has always been a problem in the law. But in the low volume paper world it was never an overwhelming one. The paper document search and retrieval process was a relatively simple problem, traditionally assigned to the youngest, most inexperienced lawyers. Today the search for the smoking e-guns is much more difficult than ever before, yet untrained young associates are still commonly given this task. Many are simply told to go do e-discovery. They are provided with little more training than attendance at a few CLEs, which, you should know by now, don’t really teach you that much.
That is one compelling reason I took the time to make my law school training program available online to law firms, attorneys, paralegals and students everywhere. e-DiscoveryTeamTraining.com. It provides over 75 hours of instruction, which is what it takes to really learn something. Just don’t try to learn more than seven things at a time. Take your time and study online whenever it is convenient to you.
Lack of real education is the primary impediment to further progress in all e-discovery issues, including search. Patrick Oot, Anne Kershaw, and Herbert Roitblat explained it well in their excellent Mandating Reasonableness article:
The problem is not technology; it is attorneys’ lack of education and the judicial system’s inattentiveness to ensure that attorneys have the proper education and training necessary for a proportional and efficient discovery process. Lack of attorney education aggravates the problem because uneducated litigators are unable to make informed judgments as to where to draw the line on discovery, thereby creating unrealistic expectations from the courts—particularly as to costs and burdens. For example, failing to understand how different methods of search methodology work, some judges will unnecessarily mandate traditional and expensive “brute force” attorney review. …
Simply put, the legal system has a crisis of education. Both attorneys and judges need to better understand technology as it applies to the reasonable inquiry.
Mandating Reasonableness, supra at pg. 545, 547.
Just Give Me the Smoking Guns
Since only a few documents are needed for analysis of a case, and even fewer for persuasion at trial, paper-only search has sufficed, until recently, for most trial lawyers. They have found the few they needed in printouts. But these days are now all but gone. Paper searches, and even ESI searches driven by old paper-based methods, are not likely to uncover the best documents. The smoking guns will remain hidden in the data deluge. Lawyers will not find the top seven needed for the judge and jury.
As the nature of documents changes, and the previously noted habit of witnesses printing key documents disappears, this problem will worsen. No one today says incriminating things in paper letters. Very few still even write paper letters. They say it in emails, text messages, instant messages, Facebook posts, blogs, tweets, etc., and almost no one prints these out and puts them in filing cabinets.
There is a key lesson for e-discovery in the trial lawyer wisdom of seven. To be useful, discovery must drastically cull down from the millions of ESI files that may be relevant, to the few hundred that are useful, and the five to nine really needed for persuasion. Culling down from millions to only tens of thousands is not serving the needs of the law. It is a pointless waste of resources, a waste of client money. A production of tens of thousands of documents, not to mention hundreds of thousands, is unjust, slow, and inefficient.
Many vendors today brag about how their smart culling was able to eliminate up to 80% of the corpus. They will tell you this is an excellent cull rate before you begin review. It is not. They may also tell you that it is unreasonable for you to try to cull out more than that. They are wrong. They have a financial motivation to take such conservative positions. The more documents you review, the more money they make. Some law firms see it that way too. But they won’t last; their clients will eventually catch on and move their work away from the haystack builders.
Even if well-intentioned, many vendors (and lawyers) don’t understand that the law requires only reasonable and proportional efforts, not perfect or exhaustive efforts. They don’t understand the basic limitations of a trial, or the rules on cumulative evidence. Many have never even seen a trial, much less tried one. Vendors are not supposed to give legal advice, yet I hear them do it all of the time when, for instance, they talk about how much you should review to meet your obligations under the law. Or they may say it would be very risky to try to cull out more than that. As if they could ever really eliminate risk, much less quantify it. The only way to eliminate risk is by cooperation or court order, not by following vendor best practice suggestions.
When you understand the fourth and fifth search secrets, you realize that a cull rate of at least 90% is proportional. It does not matter if you weed out a few merely relevant documents. If you have a million files, you should be able to weed out at least 90%, 900,000 documents, before you begin review. In fact, you should aim for elimination of 98%+ by using relevancy ranking, and only do a human hybrid review of the remaining 20,000 documents.
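The cull-rate arithmetic above, spelled out:

```python
corpus = 1_000_000

for cull_rate in (0.80, 0.90, 0.98):
    remaining = round(corpus * (1 - cull_rate))
    print(f"{cull_rate:.0%} cull leaves {remaining:,} documents to review")
```

The difference between a vendor's 80% cull and a 98% cull is the difference between paying lawyers to review 200,000 documents and paying them to review 20,000.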
New e-discovery search and culling methods need to be perfected that limit the quantity of documents to a size that the human mind can deal with and comprehend. The processes should try to find all, or nearly all, of the highly relevant documents, even if a significant percentage of marginally relevant documents are missed. Who cares about these technically relevant documents? No one, except maybe those dazzled by recall stats who do not understand the natural speed limits of the mind. All that really matters are the hot documents. That is the lesson of the fourth secret of search, that Relevant Is Irrelevant.
The lesson of the fifth secret, 7±2, is that the true goal of e-discovery should be the five to nine of the hot documents that the triers of fact can understand. If your search finds those magic seven, and no others, it is a great success, regardless of all of its other misses. If your search finds a million relevant documents, and attains a precision and recall rate of 99%, but misses the top seven key documents, it is a complete failure. We have to change our search methods to focus on the top seven.
Change the Scientific Testing
We also have to redesign our scientific testing to measure what really counts: the 7±2, plus time and money. I suggest that the TREC Legal Track have a seeded test set next year where all searchers look to find seven planted Easter eggs. Whoever finds them all, or finds the most, and does so the fastest, and at the least expense, gets the highest score. In fact, for the tests to be fair and realistic, they should be time limited and cost limited. Participants should no longer be allowed to keep their time and costs secret. In the law, time and money matter. A search process that costs too much, or takes too long, is worthless.
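One possible scoring rule for such a seeded test is sketched below. The weights on time and cost are purely hypothetical placeholders of my own, not anything TREC has adopted; the point is only that the metric rewards finding the planted hot documents and discounts slow, expensive searches.

```python
def seeded_score(found: set, planted: set, hours: float, dollars: float) -> float:
    """Fraction of planted 'Easter egg' documents found, discounted
    by review time and expense (illustrative weights only)."""
    hit_rate = len(found & planted) / len(planted)
    return hit_rate / (1 + 0.01 * hours + 0.0001 * dollars)

eggs = {f"egg{i}" for i in range(1, 8)}  # seven planted hot documents

perfect = seeded_score(eggs, eggs, hours=0, dollars=0)      # ideal: 1.0
slow = seeded_score(eggs, eggs, hours=100, dollars=10_000)  # same recall, lower score
print(perfect, round(slow, 2))
```

Under a rule like this, a search that finds all seven eggs cheaply beats one that finds them after a long, expensive slog, which is exactly how the law weighs the same trade-off.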
So far, all of the scientific experiments I have heard about in e-discovery have measured effectiveness, meaning how well or poorly a search performs, by looking only at Relevance measures, primarily precision and recall (or the harmonic mean thereof, F1). But in information science, Relevance is just one of the four basic measures of search effectiveness. The other three are Efficiency, Utility, and User Satisfaction. Sándor Dominich, The Modern Algebra of Information, pp. 87-88 (Springer-Verlag, 2008). According to Dominich, the Efficiency measures are the costs of search and the time it takes. We need to start including Efficiency measures in our tests, as well as giving heavy weight to ranking within our Relevance measures.
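For readers who have not worked with these measures, here is a minimal sketch of how precision, recall, and their harmonic mean (F1) are computed. The document counts below are invented purely to exercise the formulas.

```python
# Standard information-retrieval Relevance measures mentioned above.
# tp = relevant documents found, fp = irrelevant documents found,
# fn = relevant documents missed. The counts are made up for illustration.

def precision(tp: int, fp: int) -> float:
    """Of the documents retrieved, what fraction were relevant?"""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Of all relevant documents, what fraction were retrieved?"""
    return tp / (tp + fn)

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

p = precision(tp=80, fp=20)   # 0.8
r = recall(tp=80, fn=120)     # 0.4
print(round(f1(p, r), 3))     # prints 0.533
```

Note what F1 does not measure: it says nothing about cost, speed, or whether the handful of case-winning documents were among the 80 found, which is precisely the gap this section argues about.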
In Law One Key Document is Worth a Million Relevant Documents
Too few experts in e-discovery today understand the fifth secret of search, namely the magic limiting power of seven. On the other hand, all experienced trial lawyers seem to know it well, even if they have never heard of Professor Miller. As a result of 7±2 being such a secret to many of my friends in e-discovery, they have erroneously focused on an effort to recall as many relevant documents as possible. They pride themselves on amassing large volumes of relevant documents, when in fact that is the last thing real trial lawyers want. They don’t want ten thousand relevant documents; they want ten. They want just a handful of killer documents that will help persuade the jury, that will make their story clear and convincing. The failure of e-discovery proponents to focus on this is another reason, the 7th in fact, why so many lawyers think e-discovery is stupid.
Electronic discovery search is not an academic game to be played. It is all about finding evidence for trial. Statistics and methods are worthless unless they properly weight recall statistics by persuasive impact. One highly relevant document can, and usually does, counteract ten million merely relevant ones. It is like one grand master playing a thousand amateurs. The amateurs don’t have a chance. Because of this, if your search is not designed to find the five to nine most persuasive documents, then your search is flawed, no matter what your precision and recall rates are.
High recall rates are only imperative for highly relevant documents, the hot documents. Nothing else matters, except for the costs involved, the time and money it takes to find evidence. If you don’t focus your search on the 7±2 hottest documents, you may never find them.
I know that some will argue you have to find all of the relevant documents in order to be able to find the top 7±2. That was true in the paper world of linear review of hundreds of documents, but it is not true in large-scale electronic review. You can now use software that focuses its search on the highly ranked relevant documents. But you have to adapt your methods accordingly.
New methods for ESI review should be used that focus on retrieval by ranked relevancy, not just relevancy. The methods should focus on finding the hot documents with the understanding that merely responsive documents are, due to their extreme number, of little importance. Relevant is irrelevant. The same ranking applies to identification of privileged and confidential ESI. If one hot privileged document is missed in a privilege review, it can be far more damaging than the inadvertent production of hundreds of marginally privileged ones.
Bottom line, to follow the fourth and fifth secrets we have examined in this blog, the key feature you should look for in search software is the ability to accurately rank the probable relevant documents. Ranking must be a far more sophisticated function than simply counting the number of times a keyword, or pattern, appears in a document. It should synthesize all of the criteria and indices used by the software black box – latent semantic, four-dimensional geometric, or otherwise.
The ideal e-discovery Watson computer must not only search and find, it must rank. Put the highest on top, please. Watson may not be able to place the five documents you will actually use at the very top of the list, but it is not too much to expect that the 7±2 will be in the top 5,000. The humans working with Watson will narrow them down, and the trial lawyers making the pitch will make the final selections.
Recap of All Five Secrets
To recap, in Part I we discussed the first two secrets. The first is that keyword search sucks, and so most attorneys still using this old method are searching for ESI the wrong way. The second secret is that large-scale linear manual review also sucks, which means we do not have a reliable gold standard by which to make precision and recall measurements. We do, however, know that a hybrid approach of man and machine, using keyword, predictive coding, and other automated methods, is at least as accurate as manual review and far faster and less expensive.
In Part II we discussed the third secret that in small scale reviews of 500-1,000 documents professional reviewers are still better than our best automated methods, and it is foolhardy to take human review out of the final computer proposed production set. We need human review not only to instruct the computer, but for quality control and confidentiality protection. We also discussed the parameters for a new gold standard of hybrid, multimodal search and review.
In this Part III we discussed the fourth secret that relevant is irrelevant, meaning that smart culling that follows best practices is required by the rules to keep the time and cost of review proportional. The fifth secret gleaned from our friends the trial lawyers, 7±2, reminds us of the true goal of e-discovery and the need to heavily weight and constrain our relevancy searches.
The following graphic summarizes these thoughts using the symbol of the Pythagoreans, the five-sided polygon, or pentagon. The Pythagoreans were, by the way, famous among the ancient Greeks for secret keeping and a relentless search for truth.
As you have no doubt guessed by now, my real goal here was not to give away secrets, but to lay the foundation for new standards of search and review. The pentagon shows the first five steps, but there is still one more. In the next blog I will discuss that step and use the six-sided figure, a hexagon, to show my current understanding of best practices.
Way back in 1947 the Supreme Court in Hickman v. Taylor, the landmark case on discovery, stated that “[m]utual knowledge of all the relevant facts gathered by both parties is essential to proper litigation.” 329 U.S. 495, 507 (1947). The opinion was written by Justice Frank Murphy (1890-1949), shown right. Today his statement is obsolete insofar as it says ALL the relevant facts gathered should be shared. This statement was reasonable when written in 1947, but not today. In those days, the forties, all of the relevant facts could be found in a few dozen documents. In the sixties that became at most a few hundred. In the seventies and eighties, a few thousand.
Today, sixty-five years after Hickman v. Taylor, we live in a completely different world. Today written words proliferate and multiply with the help of computers in a way that our ancestors could never have imagined. Now you can gather hundreds of thousands or millions of relevant documents in even small cases. Now we write all of the time, and our writings multiply and remain, albeit in electronic form only.
The sharing of marginally important knowledge is no longer essential to proper litigation. In fact, as we have seen, it is contrary to the rules, especially Rule 26, Federal Rules of Civil Procedure. Most merely relevant documents today are inadmissible. Rule 403, Federal Rules of Evidence. They are a cumulative waste of time. It is unreasonable to gather them, much less disclose them. Rule 1 prohibits such a waste of time and money. Moreover, it is unjust. For it is easy to bury the truth in mountains of technically relevant haystacks. Document dumps are a way to hide the truth essential to proper litigation.
We need to design our e-discovery to be reasonably calculated to lead to admissible evidence, which means non-cumulative. We need to focus on the hot documents. We need to remember that all that really matters are the five to nine hottest documents. This is what the trial lawyers need to tell their story of prosecution or defense. The few other documents that you may want to put into evidence are just window dressing. The millions of other technically relevant documents are of little or no use in the preparation for trial, and of no use whatsoever in the conduct of a trial.
This means we need smart, AI-enhanced software tools. Software that we can teach to find the hottest documents. Software that has ranking built in as a core function. It also means that we need informed e-discovery attorneys who understand the secrets of search. They can then bridge the gap that now exists with trial lawyers. Then maybe the avoidance strategy that most lawyers follow today will be abandoned. Then maybe all lawyers will adopt proportional e-discovery designed for trial. There is a new year coming. Let’s all resolve to work together as a team to make it happen! Let’s focus our efforts. As Pythagoras supposedly said: Do not talk a little on many subjects, but much on a few.