Fifteenth Class: Step Seven – ZEN Quality Assurance Tests
[There have been no significant changes in this step from Version 3.0 to Version 4.0.] ZEN here stands for Zero Error Numerics. Predictive Coding 4.0 requires quality control activities in all steps, but the efforts peak in this Step Seven. For more details than provided here on the ZEN approach to quality control in document review see ZeroErrorNumerics.com.
The class begins with a video introduction by Ralph Losey. The video does not go into the math, or into concentration and reviewer focus. Instead, it introduces the main purpose of Step Seven from a workflow perspective: to test and validate the decision to stop the Training Cycle steps, 4-5-6.
The Training Cycle steps 4-5-6 continue until the expert in charge of the training decides to stop. This is a decision to complete the first pass document review. You decide to stop the review after weighing a multitude of considerations, including when the software has attained a highly stratified distribution of documents.
The all-important stop decision is a legal and statistical decision requiring a holistic approach, including metrics, sampling and overall project assessment. Then you test your decision with a random sample in Step Seven.
By the way, we are using the phrase “accept on zero error” in the general quality control sense, not in the specialized usage of the phrase contained in The Grossman-Cormack Glossary of Technology Assisted Review. Losey did not even know that phrase was in their glossary until recently. The e-Discovery Team has been using this phrase in the more general sense for several years. We do not advocate use of the accept on zero error method as defined in their glossary.
In Step Seven a random sample is taken for two distinct, but related reasons: as a quality assurance test of the stop decision made at the end of the last step six; and, to evaluate the recall range attained. The recall calculation method currently favored is described in detail in Introducing “ei-Recall” – A New Gold Standard for Recall Calculations in Legal Search – Part One, Part Two and Part Three. Also see: In Legal Search Exact Recall Can Never Be Known.
The ei-Recall test is based on a random sample of all documents to be excluded from the Final Review for possible production. Unlike the ill-fated control set of Predictive Coding 1.0 methodologies, the sample here is taken at the end of the project. At that time the final relevance conceptions have evolved to their final form and therefore much more accurate projections of recall can be made from the sample. The documents sampled can be based on documents excluded by category prediction (i.e. probable irrelevant) and/or by probable ranking of documents with proportionate cut-offs. The focus is on a search for any false negatives (i.e., relevant documents incorrectly predicted to be irrelevant) that are Highly Relevant or otherwise of significance.
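A minimal sketch of how a recall range could be derived from such a sample follows. It assumes the general approach of projecting the sample's false negative rate, as a low-to-high interval, onto the full set of excluded documents and then comparing that to the relevant documents already found. The function names and the example numbers are hypothetical. The Wilson score interval is used here only because it needs nothing beyond the standard library; see the ei-Recall articles cited above for the interval calculation the method itself specifies.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (roughly 95% at z=1.96).

    Assumption: this interval is a stand-in chosen for simplicity.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= hits <= n:
        raise ValueError("hits must be between 0 and the sample size")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Pin the exact boundary cases so rounding cannot leak past 0 or 1.
    low = 0.0 if hits == 0 else max(0.0, center - half)
    high = 1.0 if hits == n else min(1.0, center + half)
    return low, high


def recall_range(true_positives, excluded_count, sample_size,
                 false_negatives_in_sample, z=1.96):
    """Return (recall_low, recall_high) from a sample of excluded documents.

    true_positives: relevant documents already found and verified.
    excluded_count: documents excluded from final review (the sampled set).
    """
    if true_positives <= 0:
        raise ValueError("need at least one verified relevant document")
    rate_low, rate_high = wilson_interval(false_negatives_in_sample,
                                          sample_size, z)
    # Project the low and high false negative rates onto the excluded set.
    fn_low = rate_low * excluded_count
    fn_high = rate_high * excluded_count
    # More projected false negatives means lower recall, and vice versa.
    recall_low = true_positives / (true_positives + fn_high)
    recall_high = true_positives / (true_positives + fn_low)
    return recall_low, recall_high


# Hypothetical project: 9,000 relevant documents found, 100,000 excluded,
# 1,500 of the excluded documents sampled, 3 false negatives in the sample.
low, high = recall_range(9000, 100000, 1500, 3)
print(f"recall range: {low:.1%} to {high:.1%}")
```

Reporting the range rather than a single point estimate reflects the fact that exact recall cannot be known from a sample. A range alone does not settle the question, though: any false negative that turns out to be Highly Relevant still matters regardless of the numbers.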
The professors say that total, 100% recall of all relevant documents is scientifically impossible (unless you produce all documents, 0% precision). The e-Discovery Team shattered that myth in TREC 2015 and again in 2016 in our Total Recall Track experiments. Still, total recall is very rare, and only happens in relatively simple search and review projects, akin to a straightforward single plaintiff employment case with clear relevance. In any event, total recall of all relevant documents is legally unnecessary. Perfection – zero error – is a good goal, but never a legal requirement. The legal requirement is reasonable, proportional efforts to find the ESI that is important to resolve the key disputed issues of fact in the case. The goal is to avoid all false negatives of Highly Relevant documents. If this error is encountered, one or more additional iterations of Steps 4, 5 and 6 are required.
As explained in the video, in Step Seven you also test the decision made at the end of Step Six to stop the training. This decision is evaluated by the random sample, but determined by a complex variety of factors that can be case specific. Typically it is determined by when the software has attained a highly stratified distribution of documents. See License to Kull: Two-Filter Document Culling and Visualizing Data in a Predictive Coding Project – Part One, Part Two and Part Three, and Introducing a New Website, a New Legal Service, and a New Way of Life / Work; Plus a Postscript on Software Visualization.
When the stratification has stabilized you will see very few new documents found as predicted relevant that have not already been human reviewed and coded as relevant. You essentially run out of documents for step six review. Put another way, your step six no longer uncovers new relevant documents. This exhaustion marker may, in many projects, mean that the rate of newly found documents has slowed, but not stopped entirely. I have written about this quite a bit, primarily in Visualizing Data in a Predictive Coding Project – Part One, Part Two and Part Three. The distribution ranking of documents in a mature project, one that has likely found all relevant documents of interest, will typically look something like the diagram below. We call this the upside down champagne glass with red relevant documents on top and irrelevant on the bottom. Also see Postscript on Software Visualization where even more dramatic stratifications are encountered and shown.
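The exhaustion marker described above can be illustrated with a rough sketch that tracks how many new relevant documents each training round turns up. Everything here is an assumption made for illustration: the function names, the three-round window, and the 2% yield threshold are hypothetical, not values taken from the method. The signal is only one input to the holistic stop decision, never the decision itself.

```python
def yield_rates(new_relevant_per_round, reviewed_per_round):
    """Fraction of each round's reviewed documents that were newly found relevant."""
    return [
        (found / reviewed) if reviewed else 0.0
        for found, reviewed in zip(new_relevant_per_round, reviewed_per_round)
    ]


def looks_exhausted(new_relevant_per_round, reviewed_per_round,
                    window=3, max_yield=0.02):
    """True when the last `window` rounds all stayed at or below `max_yield`.

    Both `window` and `max_yield` are hypothetical tuning values.
    """
    rates = yield_rates(new_relevant_per_round, reviewed_per_round)
    if len(rates) < window:
        return False  # too few rounds to judge a trend
    return all(rate <= max_yield for rate in rates[-window:])


# Hypothetical project: 500 documents reviewed in each of six rounds.
found = [400, 250, 90, 8, 5, 3]
reviewed = [500] * 6
print(looks_exhausted(found, reviewed))
```

A slowed but nonzero rate, as in the last three rounds of this example, is consistent with the point that the flow of new documents may slow without stopping entirely. Cost, proportionality and whether the needed information has been found still have to be weighed before stopping.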
Another key determinant of when to stop is the cost of further review. Is it worth it to continue on with more iterations of steps four, five and six? See Predictive Coding and the Proportionality Doctrine: a Marriage Made in Big Data, 26 Regent U. Law Review 1 (2013-2014) (note that this article was based on the earlier version 2.0 of our methods, where the training was not necessarily continuous). Another criterion in the stop decision is whether you have found the information needed. If so, what is the purpose of continuing a search? Again, the law never requires finding all relevant documents, only reasonable efforts to find the relevant documents needed to decide the important fact issues in the case. This last point is often overlooked by inexperienced lawyers.
Another important quality control technique, one used throughout a project, is the avoidance of all dual tasking in favor of learned, focused concentration: a flow state, like that of an all-absorbing video game, movie, or meditation. Here is a short video I did on the importance of focus in document review.
Speaking of a relaxed, thought-free flow state, did you know that United States Supreme Court Justice Stephen Breyer is a regular meditator? In an interview with a CNN reporter in 2011 he said:
“For 10 or 15 minutes twice a day I sit peacefully. I relax and think about nothing or as little as possible. … And really I started because it’s good for my health. My wife said this would be good for your blood pressure and she was right. It really works. I read once that the practice of law is like attempting to drink water from a fire hose. And if you are under stress, meditation – or whatever you choose to call it – helps. Very often I find myself in circumstances that may be considered stressful, say in oral arguments where I have to concentrate very hard for extended periods. If I come back at lunchtime, I sit for 15 minutes and perhaps another 15 minutes later. Doing this makes me feel more peaceful, focused and better able to do my work.”
Apparently Steve Breyer also sometimes meditates with friends, including legendary Public Interest Lawyer, Professor and meditation promoter, Charles Halpern. Also see Halpern, Making Waves and Riding the Currents (2008) (his interesting autobiography); Charles Halpern on Empathy, Meditation, Barack Obama, Justice and Law (YouTube Interview in 2011 with interesting thoughts on judicial selection).
Document review is not as stressful as a Supreme Court oral argument, but it does go on far longer. Everybody needs to relax with a clear mind, and with focused attention, to attain their peak level of performance. That is the key to all quality control. How you get there is your business. Me, in addition to frequent breaks, I like headphones with music to help me there and help me to stay undistracted, focused. So, sit comfortably, spine erect, and enjoy this moment of ZEN.
For more details on Step Seven see ZeroErrorNumerics.com.
Still working on suggested “homework” assignments for this class. Check back with us later to see what we come up with.
e-Discovery Team LLC COPYRIGHT 2017
ALL RIGHTS RESERVED