Another TAR Course Update and a Mea Culpa for the Negative Consequences of ‘Da Silva Moore’

June 4, 2017

We lengthened the TAR Course again by adding a video focusing on the three iterated steps in the eight-step workflow of predictive coding. Those are steps four, five and six: Training Select, AI Document Ranking, and Multimodal Review. Here is the new video introducing these steps. It is divided into two parts.

This video was added to the thirteenth class of the TAR Course. The course has sixteen classes altogether, which we continue to update and announce on this blog. There were also multiple revisions to the text in this class.

Unintended Negative Consequences of Da Silva Moore

Predictive coding methods have come a long way since Judge Peck first approved predictive coding in our Da Silva Moore case. The method Brett Anders and I used back then, including disclosure of irrelevant documents in the seed set, was primarily derived from the vendor whose software we used, Recommind, and from Judge Peck himself. We had a good intellectual understanding, but it was the first use for all of us, except the vendor. I had never done a predictive coding review before, nor, for that matter, had Judge Peck. As far as I know Judge Peck still has not ever actually used predictive coding software to do document review, although you would be hard pressed to find anyone else in the world with a better intellectual grasp of the issues.

I call the methods we used in Da Silva Moore Predictive Coding 1.0. See: Predictive Coding 3.0 (October 2015) (explaining the history of predictive coding methods). Now, more than five years later, my team is on version 4.0. That is what we teach in the TAR Course. What surprises me is that the rest of the profession is still stuck in our first method, our first ideas of how to best use the awesome power of active machine learning.

This failure to move past the Predictive Coding 1.0 methods of Da Silva Moore is, I suspect, one of the major reasons that predictive coding has never really caught on. In fact, the most successful document review software developers since 2012 have ignored predictive coding altogether.

Mea Culpa

Looking back now at the 1.0 methods we used in Da Silva I cannot help but cringe. It is truly unfortunate that the rest of the legal profession still uses these methods. The free TAR Course is my attempt to make amends, to help the profession move on from the old methods. Mea Culpa.

In my presentation in Manhattan last month I humorously quipped that my claim to fame, Da Silva Moore, was also my claim to shame. We never intended for the methods in Da Silva Moore to be the last word. It was the first word, writ large, to be sure, but in pencil, not stone. It was like a billboard that was supposed to change, but never did. Who knew what we did back in 2012 would have such unintended negative consequences?

In Da Silva Moore we all considered the way we used machine learning to be something of an experiment. That is what happens when you are the first at anything. We assumed that the methods we came up with would quickly mature and evolve in other cases. They certainly did for us. Yet, the profession has mostly been silent about methods since the first version 1.0 was explained. (I could not take part in these early explanations by other “experts” as the case was ongoing and I was necessarily silenced from all public comment about it.) From what I have been told by a variety of sources, many, perhaps even most, attorneys and vendors are using the same methods that we used back in 2012. No wonder predictive coding has not caught on like it should. Again, sorry about that.

Why the Silence?

Still, it is hardly all my fault. I have been shouting about methods ever since 2012, even if I was muzzled from talking about Da Silva Moore. Why is no one else talking about the evolution of predictive coding methods? Why is mine the only TAR Course?

There is some discussion of methods going on, to be sure, but most of it is rehashed, or so high-level and intellectual as to be superficial and worthless. The discussions and analysis do not really go into the nitty-gritty of what to do. Why are we not talking about the subtleties of the “Stop decision”? About the ins and outs of selecting documents for training? About the respective merits of CAL versus IST? I would welcome dialogue on this with other practicing attorneys or vendor consultants. Instead, all I hear is silence and old issues.

The biggest topic still seems to be the old one of whether to filter documents with keywords before beginning machine training. The answer is a big, no-duh, don’t do it, unless lack of money or some other circumstance forces you to, or unless the filtering is incidental and minor, used only to cull out obviously irrelevant documents. See, e.g., Stephanie Serhan, Calling an End to Culling: Predictive Coding and the New Federal Rules of Civil Procedure, 23 Rich. J.L. & Tech. 5 (2016). Referring to the 2015 Rule Amendments, Serhan, a law student, concludes:

Considering these amendments, predictive coding should be applied at the outset on the entire universe of documents in a case. The reason is that it is far more accurate, and is not more costly or time-consuming, especially when the parties collaborate at the outset.

Also see, e.g., William Webber’s analysis of the Biomet case, where this kind of keyword filtering was used before predictive coding began. What is the maximum recall in re Biomet?, Evaluating e-Discovery (4/24/13). Webber, an information scientist, showed back in 2013 that when keyword filtering was used in the Biomet case, it filtered out over 40% of the relevant documents. This doomed the second filter, the predictive coding review, to a maximum possible recall of at most 60%, and that only if it were perfect, meaning it found every relevant document that survived the keyword cull, which (almost) never happens. I have never seen a cogent rebuttal of this analysis, again aside from proportionality and cost arguments.
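The arithmetic behind that ceiling is simple enough to sketch. The snippet below is only an illustration: the function name and the 10,000-document collection are invented, and the 40% figure is rounded from the "over 40%" in Webber's analysis.

```python
def overall_recall(relevant_total, relevant_culled, second_stage_recall=1.0):
    """Recall over the whole collection when a keyword cull runs before a second review stage.

    second_stage_recall is the share of the *surviving* relevant documents
    that the later predictive coding review actually finds.
    """
    if relevant_total <= 0 or not 0 <= relevant_culled <= relevant_total:
        raise ValueError("counts are inconsistent")
    surviving = relevant_total - relevant_culled
    return surviving * second_stage_recall / relevant_total

# Hypothetical collection: 10,000 relevant documents, 4,000 of them lost to the cull.
ceiling = overall_recall(10_000, 4_000)          # second stage assumed perfect
realistic = overall_recall(10_000, 4_000, 0.80)  # second stage finds 80% of what is left
print(f"ceiling {ceiling:.0%}, with an 80% second stage {realistic:.0%}")
```

The point of the second call is that the two losses multiply: a review that looks respectable on the culled set can still fall below half of the relevant documents overall.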

There was discussion for a while on another important, yet sort of no-brainer, issue: whether to keep on machine training throughout the review, which Grossman and Cormack called Continuous Active Learning (CAL). We did not do that in Da Silva Moore, but we were using Predictive Coding 1.0 as explained by our vendor. We have known better than that now for years. In fact, later in 2012, during my two public ENRON document review experiments with predictive coding, I did not follow the two-step procedure of version 1.0. Instead, I just kept on training until I could not find any more relevant documents. A Modest Contribution to the Science of Search: Report and Analysis of Inconsistent Classifications in Two Predictive Coding Reviews of 699,082 Enron Documents. (Part One); Comparative Efficacy of Two Predictive Coding Reviews of 699,082 Enron Documents (Part Two); Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron (in PDF form and the blog introducing this 82-page narrative, with second blog regarding an update); Borg Challenge: Report of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology (a five-part written and video series comparing two different kinds of predictive coding search methods).

Of course you keep training. I have never heard any viable argument to the contrary. Train then review, which is the protocol in Da Silva Moore, was the wrong way to do it. Clear and simple. The right way to do machine training is to keep training until you are done with the review. This is the main thing that separates Predictive Coding 1.0 from 2.0. See: Predictive Coding 3.0 (October 2015). I switched to version 2.0 right after Da Silva Moore in late 2012 and started using continuous training on my own initiative. It seemed obvious once I had some experience under my belt. Still, I do credit Maura Grossman and Gordon Cormack with the terminology and scientific proof of the effectiveness of CAL, a term which they have now trademarked for some reason. They have made important contributions to methods and are tireless educators of the profession. But where are the other voices? Where are the lawyers?
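For readers who think better in code, the difference between train-then-review and keep-training can be shown with a toy loop. This is a rough sketch under loud assumptions: the word-count "model", the ten invented documents, and the stop rule are all hypothetical, and it is neither the Grossman and Cormack protocol nor the Predictive Coding 4.0 workflow taught in the TAR Course.

```python
from collections import Counter

def train(labeled):
    """Toy model: a word gains +1 per relevant doc and -1 per irrelevant doc it appears in."""
    weights = Counter()
    for words, is_relevant in labeled:
        for word in set(words):
            weights[word] += 1 if is_relevant else -1
    return weights

def score(weights, words):
    """Rank score for one document: sum of the weights of its distinct words."""
    return sum(weights[word] for word in set(words))

def continuous_review(docs, truth, seed_ids, batch_size=2):
    """Retrain after every reviewed batch; stop when a batch has no relevant documents."""
    reviewed = list(seed_ids)
    while True:
        weights = train([(docs[i], truth[i]) for i in reviewed])
        remaining = [i for i in docs if i not in reviewed]
        if not remaining:
            break
        # Highest-ranked unreviewed documents go to the human reviewer next.
        remaining.sort(key=lambda i: score(weights, docs[i]), reverse=True)
        batch = remaining[:batch_size]
        reviewed.extend(batch)  # truth[i] stands in for the reviewer's coding call
        if not any(truth[i] for i in batch):
            break  # deliberately naive stop rule, for illustration only
    return reviewed

# Hypothetical ten-document collection; ids, words and coding calls are invented.
docs = {
    1: ["bonus", "promotion", "gender"], 2: ["lunch", "menu"],
    3: ["promotion", "pay", "gender"],   4: ["pay", "raise", "maternity"],
    5: ["lunch", "parking"],             6: ["maternity", "leave", "policy"],
    7: ["parking", "garage"],            8: ["menu", "garage"],
    9: ["lunch", "garage"],              10: ["menu", "parking"],
}
truth = {1: True, 2: False, 3: True, 4: True, 5: False,
         6: True, 7: False, 8: False, 9: False, 10: False}

order = continuous_review(docs, truth, seed_ids=[1, 2])
found = [i for i in order if truth[i]]
print("review order:", order)
print("relevant found:", found)
```

In this toy data a model trained only on the two seed documents gives document 6 (relevant) and document 7 (irrelevant) the same score, so a one-time ranking cannot tell them apart. After retraining on the first reviewed batch, document 6 outranks document 7. A real stop decision is far more subtle than the one-empty-batch rule used here.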

The Grossman and Cormack efforts are scientific and professorial. To me this is just work. This is what I do as a lawyer to make a living. This is what I do to help other lawyers find the key documents they need in a case. So I necessarily focus on the details of how to actually do active machine learning. I focus on the methods, the work-flow. Aside from Professors Cormack and Grossman, and myself, almost no one else is talking about predictive coding methods. Lawyers mostly just do what the vendors recommend, like I did back in the Da Silva Moore days. Yet almost all of the vendors are stagnant. (The new KrolLDiscovery and Catalyst are two exceptions, and even the former still has some promised software revisions to make.)

From what I have seen of the secret sauce that leaks out in predictive coding software demos of most vendors, they are stuck in the old version 1.0 methods. They know nothing, for instance, of the nuances of double-loop learning taught in the TAR Course. The vendors are instead still using the archaic methods that I thought were good back in 2012. I call these methods Predictive Coding 1.0 and 2.0. See: Predictive Coding 3.0 (October 2015).

Whether or not they train continuously, most of those methods still use nonsensical random control sets that ignore concept drift, a fact of life in every large review project. Id. Moreover, the statistical analysis in 1.0 and 2.0 that they use for recall does not survive close scrutiny. Most vendors routinely ignore the impact of confidence intervals on the range of possible values, and their impact on low-prevalence data sets. They do not even mention binomial calculations designed to deal with low prevalence. Id. Also see: ZeroErrorNumerics.com.
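To see why low prevalence matters, compare the usual bell-curve (normal approximation) interval with an exact binomial interval on a small hit count. The sketch below uses the Clopper-Pearson construction, which is one common exact binomial method; the text above does not say which binomial calculation is meant, so that choice is an assumption, as are the function names and the sample of 3 relevant documents in 1,500.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return total

def _solve(f, target, increasing):
    """Bisection on (0, 1) for the p where a monotonic function f(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(x, n, confidence=0.95):
    """Clopper-Pearson (exact binomial) interval for x hits in n sampled documents."""
    alpha = 1 - confidence
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper

def normal_interval(x, n, z=1.96):
    """Normal-approximation interval, shown only for contrast."""
    p = x / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical sample: 3 relevant documents in a random sample of 1,500.
print("exact :", exact_interval(3, 1500))
print("normal:", normal_interval(3, 1500))
```

With counts this small the normal approximation produces a lower bound below zero, which is impossible for a proportion, while the exact interval stays inside the range from 0 to 1 and is visibly lopsided around the point estimate. A full recall range calculation involves more than this single interval.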

Conclusion

The e-Discovery Team will keep on writing and teaching, satisfied that at least some of the other leaders in the field are doing essentially the same thing. You know who you are. We hope that someday others will experiment with the newer methods. The purpose of the TAR Course is to provide the information and knowledge needed to try these methods. If you have tried predictive coding before, and did not like it, we hear you. We agree. I would not like it either if I still had to use the antiquated methods of Da Silva Moore.

We try to make amends for the unintended consequences of Da Silva Moore by offering this TAR Course. Predictive coding really is breakthrough technology, but only if used correctly. Come back and give it another try, but this time use the latest methods of Predictive Coding 4.0.

Machine learning is based on science, but the actual operation is an art and craft. So few writers in the industry seem to understand that. Perhaps that is because they are not hands-on. They do not step-in. (Stepping-In is discussed in Davenport and Kirby, Only Humans Need Apply, and by Dean Gonsowski, A Clear View or a Short Distance? AI and the Legal Industry, and A Changing World: Ralph Losey on “Stepping In” for e-Discovery. Also see: Losey, Lawyers’ Job Security in a Near Future World of AI, Part Two.) Even most vendor experts have never actually done a document review project of their own. And the software engineers, well, forget about it. They know very little about the law (and what they think they know is often wrong) and very little about what really goes on in a document review project.

Knowledge of the best methods for machine learning, for AI, does not come from thinking and analysis. It comes from doing, from practice, from trial and error. This is something all lawyers understand because most difficult tasks in the profession are like that.

The legal profession needs to stop taking legal advice from vendors on how to do AI-enhanced document review. Vendors are not supposed to be giving legal advice anyway. They should stick to what they do best, creating software, and leave it to lawyers to determine how to best use the tools they make.

My message to lawyers is to get on board the TAR train. Even though Da Silva Moore blew the train whistle long ago, the train is still in the station. The tracks ahead are clear of all legal obstacles. The hype and easy money phase has passed. The AI review train is about to get moving in earnest. Try out predictive coding, but by all means use the latest methods. Take the TAR Course on Predictive Coding 4.0 and insist that your vendor adjust their software so you can do it that way.


More Enhancements to the TAR Course with New Videos on the Importance of Keyword Search, Blair and Maron, the Search Quadrant and a Similarity Search Tip

May 28, 2017

Many new enhancements were made to the TAR Course this weekend, including additions and revisions to the written materials, new graphics, new homework (for the first time) for the Twelfth Class (Random Prevalence), along with two new videos, one for the Sixth Class (Similarity Searches) and a longer one for the Seventh Class on the Search Quadrant and the classic Blair and Maron research. The videos are reproduced below for the convenience of those who have already gone through the course or otherwise may be curious about my latest thoughts on legal search.

The Seventh Class is entitled Keyword and Linear Review. The new video gives background on legal search in general, and Keyword search in particular, including its known limitations. It is shown in two parts. I start off simple explaining the basic terminology but eventually get to some more nuanced points, including discussion of the Search Quadrant and the Blair and Maron study.


In spite of the limits of keyword search, we still use a sophisticated form of keyword search in every project, especially at the beginning of a project. We use tested, Boolean Parametric keyword search to find the low hanging fruit. That is part of Step Two of our eight-part method. It is also part of Step Six. We feed the documents we find by this, and all other methods, into our training matrix for our machine learning. That is part of Step Four. The eight steps in our Predictive Coding 4.0 method are covered in Classes Nine through Fifteen of the sixteen class TAR Course.

One of the things we learned in our 2016 experiments at TREC was that keyword search is more valuable than we had originally thought, when done right and when done in a relatively simple search project. But still, when keyword search is done in a naive Go Fish manner, it is very poor at Recall and Precision, even in simple cases. In complex projects even sophisticated keyword search needs to be supplemented with the more powerful machine learning algorithms. Even the best forms of keyword search can only work well alone in projects with simple data, a clear target and a good SME. The war story in part two of my video above demonstrated that.
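For anyone new to the two terms, Recall is the share of all relevant documents that a search finds, and Precision is the share of what the search returns that is actually relevant. Here is a minimal sketch of both calculations; the document ids and the "Go Fish" result set are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved ids against the truly relevant ids."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical project: documents 1 through 10 are the relevant ones.
relevant = set(range(1, 11))
# A guessed keyword returns eight documents, only two of which are relevant.
go_fish_hits = {1, 2, 50, 51, 52, 53, 54, 55}

precision, recall = precision_recall(go_fish_hits, relevant)
print(f"precision {precision:.0%}, recall {recall:.0%}")
```

In this made-up example the search is poor on both counts: three quarters of what it returns is noise, and it misses eight of the ten relevant documents.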

The second new video is a short one providing a search tip on one way to use Similarity Searches. It was added to the Sixth Class.


Here is one of the new graphics I added. It uses a photo of the Compact Muon Solenoid (CMS) detector in the Large Hadron Collider. That is the famous seventeen-mile-long particle accelerator that straddles the border of Switzerland and France. It is the largest machine in the world and was built by the European Organization for Nuclear Research (CERN).

This photo of a key component of the world’s most sophisticated electronic tool is shown with a lift in place. The lift allows engineers to step-in and keep the technology in good working order. (Stepping-In is discussed in Davenport and Kirby, Only Humans Need Apply, and by Dean Gonsowski, A Clear View or a Short Distance? AI and the Legal Industry, and A Changing World: Ralph Losey on “Stepping In” for e-Discovery. Also see: Losey, Lawyers’ Job Security in a Near Future World of AI, Part Two.) The lift in the Hadron Collider photo illustrates the importance of humans to maintain and operate all of the new technologies we are creating. It is truly a Man-Machine hybrid relationship, just like predictive coding, where we lawyers need to step-in and enhance our evidence finding by working with our own new technology tools.

I chose the CERN CMS because it is the ultimate technology tool now existing to enhance human capabilities, in this case to see elementary particles. The tool makes and records forty million measurements per second of high energy particle collisions. To understand my enthusiasm for the Compact Muon Solenoid in the Large Hadron Collider, the beauty of the design and boldness of the experiments, check out a few instructional videos. Start with this one by the BBC, then, if you are interested, watch a few more. The one below allows for a 360 view that you control.


Back to stepping-in and double-loop IST training: this is taught in the Fifth Class of the TAR Course. That class is called Balanced Hybrid and Intelligently Spaced Training. We use IST, Intelligently Spaced Training, a form of continuous active learning, as part of our process to select documents to use for machine training. This allows us to set up a Double Feedback Loop, where we both teach and learn to better understand the machine’s training needs. IST and double-loop training are advanced concepts and techniques taught throughout the TAR Course, but featured in the Fifth Class. The writing in this class was also slightly improved and expanded. Here is one of the new graphics for that class. The class now explains that the extra control provided by the IST method provides more wiggle-room for human creativity and innovation. (This next graphic is not a GIF animation. It is an optical illusion based on work of the Japanese experimental psychologist, Akiyoshi Kitaoka. The image itself is static.)

Another photo of the CERN collider without the lift is shown below. This graphic was added to the Second Class, on TREC Total Recall Track, 2015 and 2016. It illustrates the importance of experiments and research to the e-Discovery Team’s current understanding of the three primary quality controls in TAR: (1) Method, (2) Software and (3) SME.

These three QC process factors are explained in the Eighth Class, SME, Method, Software; the Three Pillars of Quality Control. In this class we discuss the debate between AI leading to automation, versus, IA, intelligence augmentation. We advocate for enhancement and empowerment of attorneys by technology, including quality controls and fraud detection. We oppose delegation of control to the machine for document review. See Why the ‘Google Car’ Has No Place in Legal Search.

This delegation to automated methods will not stop fraud as the full-automation side argues. The SMEs are still programming relevance input. But it will decrease precision and so drive up the costs of review. It will also result in too many lost black swans when a bad stop decision is made. There are other more effective ways to guard against a crooked attorney than trying to remove the human attorney from the equation. Experienced lawyers can already detect omissions, especially when using ranking based searches.

Finally, I also added new writings and some challenging homework assignments for the Twelfth Class. This class covers Step Three – Random Prevalence, of the Team’s standard eight-step workflow. In this step a little math is required, so I added some more explanations and detailed exercises. This should make it easier to learn this new knowledge. Now only the fourteenth, fifteenth and sixteenth classes do not have homework assignments. They will be added soon enough. Consider this a rolling production.
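As a taste of the kind of math involved, here is a minimal sketch of a basic prevalence estimate from a random sample. The function name and all of the numbers are invented for illustration, and this is only the simplest point projection, not the full set of calculations in the Twelfth Class.

```python
def project_relevant(sample_size, relevant_in_sample, collection_size):
    """Point estimate of prevalence and the projected count of relevant documents."""
    if sample_size <= 0 or not 0 <= relevant_in_sample <= sample_size:
        raise ValueError("sample counts are inconsistent")
    prevalence = relevant_in_sample / sample_size
    return prevalence, round(prevalence * collection_size)

# Hypothetical project: 30 relevant documents in a random sample of 1,500,
# drawn from a collection of 500,000 documents.
prevalence, projected = project_relevant(1_500, 30, 500_000)
print(f"prevalence {prevalence:.1%}, about {projected:,} relevant documents")
```

With these made-up numbers the sample suggests a prevalence of 2%, or roughly 10,000 relevant documents in the collection. That figure is a single point estimate from one sample, so it should be read as a rough target and not as an exact count.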

 

 

 

 

 

 

 


Announcing the e-Discovery Team’s TAR Training Program: 16 Classes, All Online, All Free – The TAR Course

March 19, 2017

We launch today a sixteen class online training program on Predictive Coding: the e-Discovery Team TAR Course. This is a “how to” course on predictive coding. We have a long descriptive name for our method, Hybrid Multimodal IST Predictive Coding 4.0. By the end of the course you will know exactly what that means. You will also understand the seventeen key things you need to know to do predictive coding properly, shown in this diagram.


Hands-on hacking of predictive coding document reviews has been my obsession since Da Silva went viral. Da Silva Moore v. Publicis Groupe & MSL Group, 287 F.R.D. 182 (S.D.N.Y. 2012). That is the case where I threw Judge Peck the softball opportunity to approve predictive coding for the first time. See: Judge Peck Calls Upon Lawyers to Use Artificial Intelligence and Jason Baron Warns of a Dark Future of Information Burn-Out If We Don’t.

Alas, because of my involvement in Da Silva I could never write about it, but I can tell you that none of the thousands of commentaries on the case have told the whole nasty story, including the outrageous “alternate fact” attacks by plaintiff’s counsel on Judge Andrew Peck and me. I guess I should just take the failed attempts to knock me and the Judge out of the case as flattery, but it still leaves a bad taste in my mouth. A good judge like Andy Peck did not deserve that kind of treatment. 

At the time of Da Silva, 2012, my knowledge of predictive coding was mostly theoretical, informational. But now, after “stepping-in” for five years to actually make the new software work, it is practical. For what “stepping-in” means see the excellent book on artificial intelligence and future employment by Professor Thomas Davenport and Julia Kirby, titled Only Humans Need Apply (HarperBusiness, 2016). Also see: Dean Gonsowski, A Clear View or a Short Distance? AI and the Legal Industry, and, Gonsowski, A Changing World: Ralph Losey on “Stepping In” for e-Discovery (Relativity Blog). 

If you are looking to craft a speciality in the law that rides the new wave of AI innovations, then electronic document review with TAR is a good place to start. See Part Two of my January 22, 2017 blog, Lawyers’ Job Security in a Near Future World of AI. This is where the money will be.

 

Our TAR Course is designed to teach this practical, stepping-in based knowledge. The link to the course will always be shown on this blog at the top of the page. The TAR page next to it has related information.

Since Da Silva we have learned a lot about the actual methods of predictive coding. This is hands-on learning through actual cases and experiments, including sixty-four test runs at TREC in 2015 and 2016.

We have come to understand very well the technical details, the ins and outs of legal document review enhanced by artificial intelligence, AI-enhanced review. That is what TAR and predictive coding really mean, the use of active machine learning, a type of specialized artificial intelligence, to find the key documents needed in an investigation. In the process I have written over sixty articles on the subject of TAR, predictive coding and document review, most of them focused on what we have learned about methods.

The TAR Course is the first time we have put all of this information together in a systematic training program. In sixteen classes we cover all seventeen topics, and much more. The result is an online instruction program that can be completed in one long weekend. After that it can serve as a reference manual. The goal is to help you to step-in and improve your document review projects.

The TAR Course has sixteen classes listed below. Click on some and check them out. All free. We do not even require registration. No tests either, but someday soon that may change. Stay tuned to the e-Discovery Team. This is just the first step, dear readers, of my latest hack of the profession. Change we must, and not just gradual, but radical. That is the only way the Law can keep up with the accelerating advances in technology. Taking the TAR Course is a minimum requirement and will get you ready for the next stage.

  1. First Class: Introduction
  2. Second Class: TREC Total Recall Track
  3. Third Class: Introduction to the Nine Insights Concerning the Use of Predictive Coding in Legal Document Review
  4. Fourth Class: 1st of the Nine Insights – Active Machine Learning
  5. Fifth Class: Balanced Hybrid and Intelligently Spaced Training
  6. Sixth Class: Concept and Similarity Searches
  7. Seventh Class: Keyword and Linear Review
  8. Eighth Class: GIGO, QC, SME, Method, Software
  9. Ninth Class: Introduction to the Eight-Step Work Flow
  10. Tenth Class: Step One – ESI Communications
  11. Eleventh Class: Step Two – Multimodal ECA
  12. Twelfth Class: Step Three – Random Prevalence
  13. Thirteenth Class: Steps Four, Five and Six – Iterate
  14. Fourteenth Class: Step Seven – ZEN Quality Assurance Tests
  15. Fifteenth Class: Step Eight – Phased Production
  16. Sixteenth Class: Conclusion

This course is not about the theory or law of predictive coding. You can easily get that elsewhere. It is about learning the latest methods to do predictive coding. It is about learning how to train an AI to find the ESI evidence you want. The future looks bright for attorneys with both legal knowledge and skills and software knowledge and skills. The best and brightest will also be able to work with various kinds of specialized AI to do a variety of tasks, including AI-enhanced document review. If that is your interest, then jump onto the TAR Course and start your training today. Who knows where it may take you?
