TAR Course Updated to Add Video on Step Seven and the All Important “Stop Decision”

June 11, 2017

We added to the TAR Course again this weekend with a video introducing Class Fourteen on Step Seven, ZEN Quality Assurance Tests. ZEN stands for Zero Error Numerics, with the double entendre on purpose, but this video does not go into the math, concentration or reviewer focus. Ralph’s video instead provides an introduction to the main purpose of Step Seven from a work-flow perspective: to test and validate the decision to stop the Training Cycle, steps four, five and six.

The Training Cycle shown in the diagram continues until the expert in charge of the training decides to stop. This is the decision to complete the first-pass document review. The stop decision is a legal, statistical decision requiring a holistic approach, including metrics, sampling and overall project assessment. You decide to stop the review after weighing a multitude of considerations, including whether the software has attained a highly stratified distribution of documents. See License to Kull: Two-Filter Document Culling and Visualizing Data in a Predictive Coding Project, Part One, Part Two and Part Three, and Introducing a New Website, a New Legal Service, and a New Way of Life / Work; Plus a Postscript on Software Visualization. Then you test your decision with a random sample in Step Seven.
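For readers who want to see the shape of that Step Seven test, here is a minimal sketch in Python of an elusion-style quality control sample. All of the numbers in it are hypothetical, my own illustration and not part of the TAR Course itself: you randomly sample the documents you do not intend to produce, have that sample reviewed, and project how many relevant documents may have eluded the review.

```python
# Illustrative sketch only; all counts are hypothetical, not from any real project.
# Step Seven elusion-style check: random sample the "discard pile" (documents coded
# or predicted irrelevant) and estimate how many relevant documents eluded the review.

def elusion_estimate(discard_pile_size, sample_size, relevant_found_in_sample):
    """Project the sample elusion rate onto the whole discard pile."""
    elusion_rate = relevant_found_in_sample / sample_size
    estimated_missed = elusion_rate * discard_pile_size
    return elusion_rate, estimated_missed

# Hypothetical project: 900,000 documents will not be produced, we sample 1,534 of
# them for attorney review, and that review turns up 2 relevant, but unimportant, docs.
rate, missed = elusion_estimate(discard_pile_size=900_000,
                                sample_size=1_534,
                                relevant_found_in_sample=2)
print(f"sample elusion rate: {rate:.2%}; projected missed relevant: {missed:,.0f}")

# A zero or near-zero count of important relevant documents in the sample supports the
# stop decision; finding significant new documents in the sample means you return to
# the Training Cycle, steps four, five and six.
```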


Team Methods in TREC Skipped Steps 1, 3 & 7


By the way, I am using the phrase “accept on zero error” in the video in the general quality control sense, not in the specialized usage of the phrase contained in The Grossman-Cormack Glossary of Technology Assisted Review. I forgot that phrase was in their glossary until recently. I have been using the term in the more general sense for several years. I do not advocate use of the accept on zero error method as defined in their glossary. I am not sure anyone does, but it is in their dictionary, so I felt this clarification was in order.

Stop Decision

The stop decision is the most difficult decision in predictive coding. The decision must be made in all types of predictive coding methods, not just our Predictive Coding 4.0. Many of the scientists attending TREC 2015 were discussing this decision process. There was no agreement on criteria for the stop decision, except that all seemed to agree it is a complex issue that cannot be resolved by random sampling alone. The prevalence in most projects is too low for that.
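To see why low prevalence defeats a purely random sampling approach, consider a rough back-of-the-envelope calculation. The numbers below are hypothetical, my own illustration and not anything discussed at TREC: a one million document collection, roughly one percent prevalence, and a standard 1,534 document random sample.

```python
# Back-of-the-envelope illustration with hypothetical numbers, showing why a random
# sample alone cannot settle the stop decision when prevalence is low.
import math

collection_size = 1_000_000
sample_size = 1_534            # classic 95% confidence, +/- 2.5% sample size
relevant_in_sample = 15        # about 1% prevalence observed in the sample

p_hat = relevant_in_sample / sample_size
# Normal-approximation 95% margin of error on the observed proportion
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)

low_estimate = max(p_hat - margin, 0) * collection_size
high_estimate = (p_hat + margin) * collection_size
print(f"estimated relevant documents: {low_estimate:,.0f} to {high_estimate:,.0f}")
# Roughly 4,900 to 14,700 relevant documents -- a three-fold spread. A recall
# estimate built only on that range is far too loose, by itself, to justify stopping.
```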

The e-Discovery Team grapples with the stop decision in every project, although in most it is a fairly simple decision because no more relevant documents have surfaced to the higher rankings. Still, in some projects it can be tricky. That is where experience is especially helpful. We do not want to quit too soon and miss important relevant information. On the other hand, we do not want to waste time looking at uninteresting documents.

Still, in most projects we know it is about time to stop when the stratification of the document ranking has stabilized. The training has stabilized when you see very few new documents predicted relevant that have not already been human reviewed and coded as relevant. You essentially run out of documents for Step Six review. Put another way, your Step Six review no longer uncovers new relevant documents.

This exhaustion marker may, in many projects, mean that the rate of newly found documents has slowed, but not stopped entirely. I have written about this quite a bit, primarily in Visualizing Data in a Predictive Coding Project, Part One, Part Two and Part Three. The distribution ranking of documents in a mature project, one that has likely found all relevant documents of interest, will typically look something like the diagram below. We call this the upside down champagne glass, with red relevant documents on top and irrelevant on the bottom. Also see Postscript on Software Visualization, where even more dramatic stratifications are encountered and shown.

Another key determinant of when to stop is the cost of further review. Is it worth it to continue on with more iterations of steps four, five and six? See Predictive Coding and the Proportionality Doctrine: a Marriage Made in Big Data, 26 Regent U. Law Review 1 (2013-2014) (note the article was based on the earlier version 2.0 of our methods, where the training was not necessarily continuous). Another criterion in the stop decision is whether you have found the information needed. If so, what is the purpose of continuing the search? Again, the law never requires finding all relevant documents, only reasonable efforts to find the relevant documents needed to decide the important fact issues in the case. Rules 1 and 26(b)(1) must be considered.

The stop decision is the state of the art in both difficulty and creativity. We often provide custom solutions for testing the decision, depending upon project contours and other unique circumstances. I wish Duke would hold a conference on that, instead of one to reinvent old wheels. But as George Bernard Shaw said, those who can, do. You know the rest.

Conclusion

We continue with our work improving our document review methods and improving the free TAR Course. We want to make information on best practices in this area as accessible and as easy to understand as possible. We have figured out our processes over thousands of projects since the Da Silva Moore days (2011-2012). They have come out of legal practice, trial and error. We learn by doing, but we also teach this stuff, just not for a living. We also run scientific experiments in TREC and on our own, again, just not for a living. Our Predictive Coding 4.0 Hybrid Multimodal IST method has not come out of conferences and debates. It is a legal practice, not an academic study or exercise in group consensus.

Try it yourself and see. Just do not use the first version methods of predictive coding that we used back in Da Silva Moore. See: Another TAR Course Update and a Mea Culpa for the Negative Consequences of ‘Da Silva Moore’. Use the latest version 4.0 methods.

The old methods, versions 1.0 and 2.0, that most of the industry still follows, must be abandoned. Predictive Coding 1.0 did not use continuous active training; it used Train Then Review (TTR). That invited needless disclosure debates and other poor practices. Version 1.0 also used control sets. In version 2.0 continuous active training (CAT) replaced TTR, but control sets were still used. In version 3.0 CAT remains and control sets are abandoned. In our version 3.0 we replaced the secret control set basis of recall calculation with a prevalence-based random sample guide in Step Three and an elusion-based quality control sample in Step Seven. See: Predictive Coding 3.0 (October 2015).

In version 4.0, our current version, we further refined the continuous training aspects of our method with the technique we call Intelligently Spaced Training (IST).

Our new eight-step Predictive Coding 4.0 is easier to use than ever before and is now battle tested in both legal and scientific arenas. Take the TAR Course and try using our new methods of document review, instead of the old Da Silva Moore methods. If you do, we think you will be as excited about predictive coding as we are. See: Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding.


Another TAR Course Update and a Mea Culpa for the Negative Consequences of ‘Da Silva Moore’

June 4, 2017

We lengthened the TAR Course again by adding a video focusing on the three iterated steps in the eight-step workflow of predictive coding. Those are steps four, five and six: Training Select, AI Document Ranking, and Multimodal Review. Here is the new video introducing these steps. It is divided into two parts.

This video was added to the thirteenth class of the TAR Course. It has sixteen classes altogether, which we continue to update and announce on this blog. There were also multiple revisions to the text in this class.

Unintended Negative Consequences of Da Silva Moore

Predictive coding methods have come a long way since Judge Peck first approved predictive coding in our Da Silva Moore case. The method Brett Anders and I used back then, including disclosure of irrelevant documents in the seed set, was primarily derived from the vendor whose software we used, Recommind, and from Judge Peck himself. We had a good intellectual understanding, but it was the first use for all of us, except the vendor. I had never done a predictive coding review before, nor, for that matter, had Judge Peck. As far as I know, Judge Peck has still never actually used predictive coding software to do a document review, although you would be hard pressed to find anyone else in the world with a better intellectual grasp of the issues.

I call the methods we used in Da Silva Moore Predictive Coding 1.0. See: Predictive Coding 3.0 (October 2015) (explaining the history of predictive coding methods). Now, more than five years later, my team is on version 4.0. That is what we teach in the TAR Course. What surprises me is that the rest of the profession is still stuck in our first method, our first ideas of how to best use the awesome power of active machine learning.

This failure to move on past the Predictive Coding 1.0 methods of Da Silva Moore is, I suspect, one of the major reasons that predictive coding has never really caught on. In fact, the most successful document review software developers since 2012 have ignored predictive coding altogether.

Mea Culpa

Looking back now at the 1.0 methods we used in Da Silva Moore, I cannot help but cringe. It is truly unfortunate that the rest of the legal profession still uses these methods. The free TAR Course is my attempt to make amends, to help the profession move on from the old methods. Mea Culpa.

In my presentation in Manhattan last month I humorously quipped that my claim to fame, Da Silva Moore, was also my claim to shame. We never intended for the methods in Da Silva Moore to be the last word. It was the first word, writ large, to be sure, but in pencil, not stone. It was like a billboard that was supposed to change, but never did. Who knew what we did back in 2012 would have such unintended negative consequences?

In Da Silva Moore we all considered the method of using machine learning that we came up with as something of an experiment. That is what happens when you are the first at anything. We assumed that the methods we came up with would quickly mature and evolve in other cases. They certainly did for us. Yet the profession has mostly been silent about methods since the first version 1.0 was explained. (I could not take part in these early explanations by other “experts” as the case was ongoing and I was necessarily silenced from all public comment about it.) From what I have been told by a variety of sources, many, perhaps even most, attorneys and vendors are still using the same methods that we used back in 2012. No wonder predictive coding has not caught on like it should. Again, sorry about that.

Why the Silence?

Still, it is hardly all my fault. I have been shouting about methods ever since 2012, even if I was muzzled from talking about Da Silva Moore. Why is no one else talking about the evolution of predictive coding methods? Why is mine the only TAR Course?

There is some discussion of methods going on, to be sure, but most of it is rehashed, or so high-level and intellectual as to be superficial and worthless. The discussions and analysis do not really go into the nitty-gritty of what to do. Why are we not talking about the subtleties of the “stop decision”? About the ins and outs of document training selection? About the respective merits of CAL versus IST? I would welcome dialogue on this with other practicing attorneys or vendor consultants. Instead, all I hear is silence and old issues.

The biggest topic still seems to be the old one of whether to filter documents with keywords before beginning machine training. That is a big, no duh, don’t do it, unless lack of money or some other circumstance forces you to, or unless the filtering is incidental and minor, to cull out the obviously irrelevant. See, e.g.: Stephanie Serhan, Calling an End to Culling: Predictive Coding and the New Federal Rules of Civil Procedure, 23 Rich. J.L. & Tech. 5 (2016). Referring to the 2015 Rule Amendments, Serhan, a law student, concludes:

Considering these amendments, predictive coding should be applied at the outset on the entire universe of documents in a case. The reason is that it is far more accurate, and is not more costly or time-consuming, especially when the parties collaborate at the outset.

Also see, e.g., William Webber’s analysis of the Biomet case, where this kind of keyword filtering was used before predictive coding began. What is the maximum recall in re Biomet?, Evaluating e-Discovery (4/24/13). Webber, an information scientist, showed back in 2013 that the keyword filtering used in the Biomet case filtered out over 40% of the relevant documents. This doomed the second-filter predictive coding review to a maximum possible recall of 60%, even if it was perfect, meaning even if it would otherwise have attained 100% recall, which (almost) never happens. I have never seen a cogent rebuttal of this analysis, aside from proportionality and cost arguments.
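The arithmetic behind that ceiling is worth spelling out. Here is a short sketch using round, hypothetical numbers of my own, not Webber’s actual Biomet figures:

```python
# Hypothetical round numbers, mine not Webber's: suppose the collection holds
# 10,000 truly relevant documents and keyword culling discards 40% of them.
relevant_total = 10_000
relevant_surviving_keywords = relevant_total * (1 - 0.40)   # 6,000 remain

# Even a perfect predictive coding review of the culled set can only ever find
# the 6,000 relevant documents that survived the first keyword filter.
max_recall = relevant_surviving_keywords / relevant_total
print(f"maximum possible recall: {max_recall:.0%}")          # 60%
```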

There was discussion for a while on another important, yet sort of no-brainer issue: whether to keep on machine training or not, which Grossman and Cormack called Continuous Active Learning (CAL). We did not do that in Da Silva Moore, but we were using Predictive Coding 1.0 as explained by our vendor. We have known better than that for years now. In fact, later in 2012, during my two public Enron document review experiments with predictive coding, I did not follow the two-step procedure of version 1.0. Instead, I just kept on training until I could not find any more relevant documents. A Modest Contribution to the Science of Search: Report and Analysis of Inconsistent Classifications in Two Predictive Coding Reviews of 699,082 Enron Documents (Part One); Comparative Efficacy of Two Predictive Coding Reviews of 699,082 Enron Documents (Part Two); Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron (in PDF form, and the blog introducing this 82-page narrative, with a second blog regarding an update); Borg Challenge: Report of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology (a five-part written and video series comparing two different kinds of predictive coding search methods).

Of course you keep training. I have never heard any viable argument to the contrary. Train then review, which was the protocol in Da Silva Moore, was the wrong way to do it. Clear and simple. The right way to do machine training is to keep training until you are done with the review. This is the main thing that separates Predictive Coding 1.0 from 2.0. See: Predictive Coding 3.0 (October 2015). I switched to version 2.0 right after Da Silva Moore in late 2012 and started using continuous training on my own initiative. It seemed obvious once I had some experience under my belt. Still, I do credit Maura Grossman and Gordon Cormack with the terminology and the scientific proof of the effectiveness of CAL, a term which they have now trademarked for some reason. They have made important contributions to methods and are tireless educators of the profession. But where are the other voices? Where are the lawyers?

The Grossman and Cormack efforts are scientific and professorial. To me this is just work. This is what I do as a lawyer to make a living. This is what I do to help other lawyers find the key documents they need in a case. So I necessarily focus on the details of how to actually do active machine learning. I focus on the methods, the work-flow. Aside from Professors Cormack and Grossman, and myself, almost no one else is talking about predictive coding methods. Lawyers mostly just do what the vendors recommend, like I did back in the Da Silva Moore days. Yet almost all of the vendors are stagnant. (The new KrolLDiscovery and Catalyst are two exceptions, and even the former still has some promised software revisions to make.)

From what I have seen of the secret sauce that leaks out in the predictive coding software demos of most vendors, they are stuck in the old version 1.0 methods. They know nothing, for instance, of the nuances of double-loop learning taught in the TAR Course. The vendors are instead still using the archaic methods that I thought were good back in 2012. I call these methods Predictive Coding 1.0 and 2.0. See: Predictive Coding 3.0 (October 2015).

Beyond the question of continuous training, most of those methods still use nonsensical random control sets that ignore concept drift, a fact of life in every large review project. Id. Moreover, the statistical analysis they use for recall in 1.0 and 2.0 does not survive close scrutiny. Most vendors routinely ignore the impact of confidence intervals on the range of their recall estimates, and the added difficulty of low prevalence data-sets. They do not even mention the binomial calculations designed to deal with low prevalence. Id. Also see: ZeroErrorNumerics.com.
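To illustrate the binomial point with a sketch of my own, using made-up counts rather than any vendor’s actual calculation: at very low prevalence the familiar normal-approximation confidence interval breaks down, while an exact binomial (Clopper-Pearson) interval stays meaningful.

```python
# Hypothetical counts, illustration only: 3 relevant documents found in a random
# sample of 1,534. Compare the naive normal-approximation interval with an exact
# binomial (Clopper-Pearson) interval.
import math
from scipy.stats import beta

n, x = 1_534, 3
p_hat = x / n

# Normal approximation (the calculation many recall reports rely on)
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"normal approximation: {p_hat - margin:.4%} to {p_hat + margin:.4%}")
# The lower bound comes out negative -- a nonsense prevalence -- at counts this small.

# Exact binomial (Clopper-Pearson) interval, which behaves properly at low prevalence
lower = beta.ppf(0.025, x, n - x + 1)
upper = beta.ppf(0.975, x + 1, n - x)
print(f"exact binomial: {lower:.4%} to {upper:.4%}")
# Roughly 0.04% to 0.57% -- wide and asymmetric. That honest uncertainty is exactly
# what gets ignored when confidence intervals and binomial methods are left out of
# the recall math.
```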

Conclusion

The e-Discovery Team will keep on writing and teaching, satisfied that at least some of the other leaders in the field are doing essentially the same thing. You know who you are. We hope that someday others will experiment with the newer methods. The purpose of the TAR Course is to provide the information and knowledge needed to try these methods. If you have tried predictive coding before, and did not like it, we hear you. We agree. I would not like it either if I still had to use the antiquated methods of Da Silva Moore.

We try to make amends for the unintended consequences of Da Silva Moore by offering this TAR Course. Predictive coding really is breakthrough technology, but only if used correctly. Come back and give it another try, but this time use the latest methods of Predictive Coding 4.0.

Machine learning is based on science, but the actual operation is an art and a craft. So few writers in the industry seem to understand that. Perhaps that is because they are not hands-on. They do not step in. (Stepping in is discussed in Davenport and Kirby, Only Humans Need Apply, and by Dean Gonsowski, A Clear View or a Short Distance? AI and the Legal Industry, and A Changing World: Ralph Losey on “Stepping In” for e-Discovery. Also see: Losey, Lawyers’ Job Security in a Near Future World of AI, Part Two.) Even most vendor experts have never actually done a document review project of their own. And the software engineers, well, forget about it. They know very little about the law (and what they think they know is often wrong) and very little about what really goes on in a document review project.

Knowledge of the best methods for machine learning, for AI, does not come from thinking and analysis. It comes from doing, from practice, from trial and error. This is something all lawyers understand because most difficult tasks in the profession are like that.

The legal profession needs to stop taking legal advice from vendors on how to do AI-enhanced document review. Vendors are not supposed to be giving legal advice anyway. They should stick to what they do best, creating software, and leave it to lawyers to determine how to best use the tools they make.

My message to lawyers is to get on board the TAR train. Even though Da Silva Moore blew the train whistle long ago, the train is still in the station. The tracks ahead are clear of all legal obstacles. The hype and easy money phase has passed. The AI review train is about to get moving in earnest. Try out predictive coding, but by all means use the latest methods. Take the TAR Course on Predictive Coding 4.0 and insist that your vendor adjust their software so you can do it that way.

