New Homework Added to the TAR Course and a New Video Added to AI-Ethics

September 3, 2017

We have added a homework assignment to Class Sixteen of the TAR Course. This is the next to last class in the course. Here we cover the eighth step of our eight-step routine, Phased Production. I share the full homework assignment below for those not yet familiar with our instructional methods, especially our take on homework. Learning is or should be a life-long process.

But before we get to that I want to share the new video added to the web at the end of the Intro/Mission page. Here I articulate the opinion of many in the AI world that an interdisciplinary team approach is necessary for the creation of ethical codes to regulate artificial intelligence. This team approach has worked well for electronic discovery, and I am convinced it will work for AI Law as well. AI Ethics is one of the most important issues facing humanity today. It is way too important to leave to lawyers and government regulators alone. It is also way too important to leave to AI coders and professors to improvise on their own. We have to engage in true dialogue and collaborate.


Now back to the more mundane world of homework and learning the Team’s latest process for the application of machine learning to find evidence for trial. Here is the new homework assignment for Class Sixteen of the TAR Course.


Go on to the Seventeenth and last class, or pause to do this suggested “homework” assignment for further study and analysis.

SUPPLEMENTAL READING: It is important to have a good understanding of privilege and work-product protection. The basic U.S. Supreme Court case in this area is Hickman v. Taylor, 329 U.S. 495 (1947). Another key case to know is Upjohn Co. v. United States, 449 U.S. 383 (1981). For an authoritative digest of case law on the subject with an e-discovery perspective, download and study The Sedona Conference Commentary on Protection of Privileged ESI 2015.pdf (Dec. 2015).

EXERCISES: Study Judge Andrew Peck’s form 502(d) order.  You can find it here. His form order started off as just two sentences, but he later added a third sentence at the end:

The production of privileged or work-product protected documents, electronically stored information (“ESI”) or information, whether inadvertent or otherwise, is not a waiver of the privilege or protection from discovery in this case or in any other federal or state proceeding. This Order shall be interpreted to provide the maximum protection allowed by Federal Rule of Evidence 502(d).
Nothing contained herein is intended to or shall serve to limit a party’s right to conduct a review of documents, ESI or information (including metadata) for relevance, responsiveness and/or segregation of privileged and/or protected information before production.

Do you know the purpose of this additional sentence? Why might someone oppose a 502(d) Order? What does that tell you about them? What does that tell the judge about them? My law firm has been opposed a few times, but we have never failed. Well, there was that one time where both sides agreed, and the judge would not enter the stipulated order, saying it was not necessary and that he would provide such protection anyway. So, mission accomplished anyway.

Do you think it is overly hyper for us to recommend that a 502(d) Order be entered in every case where there is ESI review and production? Think that some cases are too small and too easy to bother with that? That it is o.k. to just have a claw-back agreement? Well, take a look at this opinion and you may well change your mind. Irth Solutions, LLC v. Windstream Communications, LLC (S.D. Ohio, E. Div., 8/2/17). Do you think this was a fair decision? What do you think about the partner putting all of the blame on the senior associate (seven-year) for the mistaken production of privileged ESI? What do you think of the senior associate who in turn blamed the junior associate (two-year)? The opinion does not state who signed the Rule 26(g) response to the request to produce. Do you think that should matter? By the way, having been a partner in a law firm since at least 1984, I think this kind of blame-game behavior was reprehensible!

Students are invited to leave a public comment below. Insights that might help other students are especially welcome. Let’s collaborate!


TAR Course Updated to Add Video on Step Seven and the All Important “Stop Decision”

June 11, 2017

We added to the TAR Course again this weekend with a video introducing Class Fourteen on Step Seven, ZEN Quality Assurance Tests. ZEN stands for Zero Error Numerics with the double-entendre on purpose, but this video does not go into the math, concentration or reviewer focus. Ralph’s video instead provides an introduction to the main purpose of Step Seven from a work-flow perspective, to test and validate the decision to stop the Training Cycle steps, 4-5-6.

The Training Cycle shown in the diagram continues until the expert in charge of the training decides to stop. This is a decision to complete the first pass document review. The stop decision is a legal, statistical decision requiring a holistic approach, including metrics, sampling and over-all project assessment. You decide to stop the review after weighing a multitude of considerations, including when the software has attained a highly stratified distribution of documents. See License to Kull: Two-Filter Document Culling; Visualizing Data in a Predictive Coding Project, Part One, Part Two and Part Three; and Introducing a New Website, a New Legal Service, and a New Way of Life / Work; Plus a Postscript on Software Visualization. Then you test your decision with a random sample in Step Seven.


Team Methods in TREC Skipped Steps 1, 3 & 7



By the way, I am using the phrase “accept on zero error” in the video in the general quality control sense, not in the specialized usage of the phrase contained in The Grossman-Cormack Glossary of Technology Assisted Review. I forgot that phrase was in their glossary until recently. I have been using the term in the more general sense for several years. I do not advocate use of the accept on zero error method as defined in their glossary. I am not sure anyone does, but it is in their dictionary, so I felt this clarification was in order.

Stop Decision

The stop decision is the most difficult decision in predictive coding. The decision must be made in all types of predictive coding methods, not just our Predictive Coding 4.0. Many of the scientists attending TREC 2015 were discussing this decision process. There was no agreement on criteria for the stop decision, except that all seemed to agree it is a complex issue that cannot be resolved by random sampling alone. The prevalence of most projects is too low for that.
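To see why random sampling alone struggles at low prevalence, here is a minimal Python sketch. It is only an illustration, not the Team's method: the corpus size, sample size and count of relevant documents are hypothetical numbers, and the Wilson score interval is one standard way, chosen here for illustration, to put a range around a sampled proportion.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical project: 1,000,000 documents, a random sample of 1,500,
# of which 15 are coded relevant (a 1% point estimate of prevalence).
corpus_size = 1_000_000
low, high = wilson_interval(hits=15, n=1500)

print(f"Prevalence is somewhere between {low:.2%} and {high:.2%}")
print(f"Relevant documents: between {low * corpus_size:,.0f} "
      f"and {high * corpus_size:,.0f}")
```

With these assumed numbers the upper bound on the count of relevant documents is more than double the lower bound, so any recall figure built on that sample alone inherits the same wide range. That is one way to picture why the other stop criteria matter.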

The e-Discovery Team grapples with the stop decision in every project, although in most it is a fairly simple decision because no more relevant documents have surfaced to the higher rankings. Still, in some projects it can be tricky. That is where experience is especially helpful. We do not want to quit too soon and miss important relevant information. On the other hand, we do not want to waste time looking at uninteresting documents.

Still, in most projects we know it is about time to stop when the stratification of document ranking has stabilized. The training has stabilized when you see very few new documents predicted relevant that have not already been human reviewed and coded as relevant. You essentially run out of documents for step six review. Put another way, your step six no longer uncovers new relevant documents.
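The stabilization signal described above can be sketched in a few lines of Python. This is a rough illustration only. The function name, the thresholds (`max_new`, `quiet_rounds`) and the round-by-round counts are all hypothetical, and the check is just one input to the stop decision, not the decision itself.

```python
def training_has_stabilized(new_relevant_per_round, max_new=5, quiet_rounds=3):
    """Return True when each of the last `quiet_rounds` training rounds
    surfaced at most `max_new` newly found relevant documents.

    Both thresholds are assumptions for illustration; a real project
    would set them based on its size, prevalence and needs.
    """
    if len(new_relevant_per_round) < quiet_rounds:
        return False
    recent = new_relevant_per_round[-quiet_rounds:]
    return all(count <= max_new for count in recent)


# Hypothetical counts of new relevant documents found in step six, per round.
history = [420, 310, 150, 60, 12, 4, 2, 1]

print(training_has_stabilized(history))      # True: last three rounds found 4, 2, 1
print(training_has_stabilized(history[:5]))  # False: still finding 150, 60, 12
```

A check like this only flags that it may be time to consider stopping. The expert in charge still has to weigh the rest of the project before making the call.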

This exhaustion marker may, in many projects, mean that the rate of newly found documents has slowed, but not stopped entirely. I have written about this quite a bit, primarily in Visualizing Data in a Predictive Coding Project, Part One, Part Two and Part Three. The distribution ranking of documents in a mature project, one that has likely found all relevant documents of interest, will typically look something like the diagram below. We call this the upside down champagne glass with red relevant documents on top and irrelevant on the bottom. Also see Postscript on Software Visualization, where even more dramatic stratifications are encountered and shown.

Another key determinant of when to stop is the cost of further review. Is it worth it to continue on with more iterations of steps four, five and six? See Predictive Coding and the Proportionality Doctrine: a Marriage Made in Big Data, 26 Regent U. Law Review 1 (2013-2014) (note that this article was based on the earlier version 2.0 of our methods, where the training was not necessarily continuous). Another criterion in the stop decision is whether you have found the information needed. If so, what is the purpose of continuing a search? Again, the law never requires finding all relevant documents, only reasonable efforts to find the relevant documents needed to decide the important fact issues in the case. Rules 1 and 26(b)(1) must be considered.

The stop decision is state of the art in difficulty and creativity. We often provide custom solutions for testing the decision depending upon project contours and other unique circumstances. I wish Duke would have a conference on that, instead of one to reinvent old wheels. But as George Bernard Shaw said, those who can, do. You know the rest.


We continue with our work improving our document review methods and improving the free TAR Course. We want to make information on best practices in this area as accessible as possible and as easy to understand as possible. We have figured out our processes over thousands of projects since the Da Silva Moore days (2011-2012). It has come out of legal practice, trial and error. We learn by doing, but we also teach this stuff, just not for a living. We also run scientific experiments in TREC and on our own, again, just not for a living. Our Predictive Coding 4.0 Hybrid Multimodal IST method has not come out of conferences and debates. It is a legal practice, not an academic study or exercise in group consensus.

Try it yourself and see. Just do not use the first version methods of predictive coding that we used back in Da Silva Moore. See Another TAR Course Update and a Mea Culpa for the Negative Consequences of ‘Da Silva Moore’. Use the latest version 4.0 methods.

The old methods, versions 1.0 and 2.0, that most of the industry still follows, must be abandoned. Predictive Coding 1.0 did not use continuous active training; it used Train Then Review (TTR). That invited needless disclosure debates and other poor practices. Version 1.0 also used control sets. In version 2.0 continuous active training (CAT) replaced TTR, but control sets were still used. In version 3.0 CAT is used, and control sets are abandoned. In our version 3.0 we replaced the secret control set basis of recall calculation with a prevalence based random sample guide in Step Three and an elusion based quality control sample in Step Seven. See: Predictive Coding 3.0 (October 2015).
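For readers new to elusion sampling, the arithmetic behind an elusion based check can be sketched as follows. This is a generic illustration and not necessarily the exact calculation used in Step Seven: the function name and all of the numbers are hypothetical, and it produces point estimates only, with no confidence interval.

```python
def elusion_recall_estimate(found_relevant, null_set_size,
                            sample_size, sample_relevant):
    """Point estimates from a random sample of the null set, meaning the
    documents not reviewed because they were predicted irrelevant.

    Returns (elusion_rate, projected_missed, estimated_recall).
    """
    if not 0 < sample_size <= null_set_size:
        raise ValueError("sample must be drawn from within the null set")
    if not 0 <= sample_relevant <= sample_size:
        raise ValueError("relevant count cannot exceed the sample size")
    elusion_rate = sample_relevant / sample_size
    projected_missed = elusion_rate * null_set_size
    total_relevant = found_relevant + projected_missed
    # If nothing relevant exists anywhere, treat recall as complete.
    recall = found_relevant / total_relevant if total_relevant else 1.0
    return elusion_rate, projected_missed, recall


# Hypothetical project: 9,000 relevant documents found, 900,000 documents
# left in the null set, and 2 relevant found in a random sample of 1,500.
rate, missed, recall = elusion_recall_estimate(9_000, 900_000, 1_500, 2)
print(f"Elusion rate: {rate:.3%}")
print(f"Projected relevant documents missed: {missed:,.0f}")
print(f"Estimated recall: {recall:.1%}")
```

Because only a couple of relevant documents turn up in such a sample, the estimate is rough, which is why a sample like this is better treated as a quality control test than as a precise recall measurement.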

In version 4.0, our current version, we further refined the continuous training aspects of our method with the technique we call Intelligently Spaced Training, IST.

Our new eight-step Predictive Coding 4.0 is easier to use than ever before and is now battle tested in both legal and scientific arenas. Take the TAR Course and try using our new methods of document review, instead of the old Da Silva Moore methods. If you do, we think you will be as excited about predictive coding as we are. See Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding.



