I cannot believe that the best document review vendors in the world, the ones that include active machine learning in their software, still include secret Control Sets in their built-in methodology. It was a mistake made by most vendors when predictive coding was first released years ago. It is well past time for vendors to own up to the mistake. Please modify your software to eliminate it before you do any more damage, both to yourselves and, more importantly, to the whole profession. Lose your fear of academic institutions and do what’s right. I am not naming names yet, but I may have to eventually. My patience is wearing thin. Maybe you can tell that from my video rant below, where I get so worked up that I use the “H” word. This is another new video for the e-Discovery Team’s TAR Course. It is included in the new First Class that we just added to the course.
Every day that vendors keep phony control set procedures is another day that lawyers are misled by recall calculations based on them; another day lawyers are frustrated by wasting their time on overly large random samples; another day everyone has a false sense of protection from the very few unethical lawyers out there, and the very many not fully competent lawyers; and another day clients pay too much for document review. Stop shooting yourselves in the foot, software vendors. And lawyers, stop using control sets in your methods. Do not just do what vendors tell you to do. Demand that your vendor change its software or at least show you how to use it without secret control sets.
The method of predictive coding taught here has been purged of vendor hype and bad science and proven effective many times. We know that the secret control set almost never works, and it is high time it be expressly abandoned. Here are the main reasons why: (1) relevance is never static; it changes over the course of the review; (2) the random selection size was typically too small for statistically meaningful calculations; (3) the random selection was typically too small in low prevalence collections (the vast majority in legal search) for complete training selections; and (4) it supposedly required a senior SME’s personal attention for days of document review work, a mission impossible for most e-discovery teams.
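To put rough numbers on reasons (2) and (3), here is a minimal Python sketch. It is my own illustration, not any vendor's actual calculation. The control set size (1,500), the prevalence (1%), and the observed recall (80%) are all hypothetical figures chosen for the example, and the Wilson score interval is simply one standard way to put a confidence range around a proportion. The point is that in a low prevalence collection the control set holds only a handful of relevant documents, so any recall figure computed from it carries a very wide range.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def recall_range_from_control_set(control_size, prevalence, observed_recall):
    """Hypothetical helper: how uncertain is recall measured on a control set?

    Recall is measured only on the relevant documents in the control set,
    so the effective sample size is control_size * prevalence, not control_size.
    """
    relevant = round(control_size * prevalence)
    found = round(relevant * observed_recall)
    low, high = wilson_interval(found, relevant)
    return relevant, found, low, high


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    relevant, found, low, high = recall_range_from_control_set(1500, 0.01, 0.80)
    print(f"Relevant documents in control set: {relevant}")
    print(f"Found by the classifier: {found}")
    print(f"Approximate 95% range for recall: {low:.0%} to {high:.0%}")
```

Under these assumed figures, the 1,500 document control set contains only about fifteen relevant documents, and the recall estimate rests entirely on those few. Try raising the prevalence or the control set size in the call and watch how slowly the range narrows.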
The e-Discovery Team calls on all vendors of advanced AI software for document review to stop using secret control sets and to phase them out of their software.
New First Class Added to the TAR Course
We also added a new class on the historical background of the development of predictive coding. We felt that it was important to clarify the many conflicting claims and procedures still out there in the e-discovery marketplace. We made this historical discussion the First Class to the TAR Course. This class also includes a discussion of predictive coding patents. Yes. They are lots of fun.
This new First Class expands the number of classes from sixteen to seventeen (a prime number). We feel pretty good about this expansion. Sure, with more time we could have made the class writings a little shorter, but we still think it is an improvement over my prior writings on the subject of control sets. Hopefully the repetition will help you learn this initial, difficult material in the TAR Course.