Beware of the TAR pits of predictive coding, the places where you can easily get stuck in the mud and end up with poor search results. Despite what some of the software pitchmen say, without experience and good advice it is all too easy to get stuck in TAR. If you do not know any better, it is easy to take a wrong turn. But, with a little care, you can avoid the TAR pits, at least the worst ones. You can instead have a great ride in your CAR. You can achieve unparalleled new speeds of document review and accuracy. You can cruise your CAR worry-free, confident that you have met the minimum reasonable-efforts search requirements of Rule 26(g).
I am not going to attempt a complete road map of all of the TAR (technology assisted review) pits in this blog, but will instead give a short (for me) introduction to the three biggest in my book: poor software, poor SMEs (subject matter experts), and poor AI trainers. Going back to my favorite analogy, the CAR (computer assisted review), these are equivalent to the automobile itself (the software), the navigation system (the SME), and the driver (the active machine learning trainer).
There is always a downside to everything, and we need to talk about the whole picture of predictive coding, good and bad. I keep running into people who say they have tried predictive coding and do not see why I think it is such a big deal. They say it did not work well for them. They got stuck in a TAR pit.
Having a lousy driver, a poor navigator, or a crummy car are easy mistakes to make, and everyone should look out for them, but in fact few do. In part that is because I seem to be the only one willing to go out on a limb and talk about these three. The discussion does, after all, necessarily entail criticisms and qualitative comparisons. So most people stay away from them for fear of offending someone. But I am a lawyer, not a politician. I learned long ago that lawyers are not popular, and that if you are afraid of speaking your mind and offending someone, then law is not the right profession for you. But before I go into my usual bulldog attack mode, let me throw out a few positive statements about someone else’s work.
Schieneman & Gricks
As a preface I point to two recent articles by Karl Schieneman & Thomas Gricks III that talk about TAR pits. Check out the short vendor article in LTN on Getting Stuck in TAR. It provides a quick peek at their more ambitious paper, The Implications of Rule 26(g) on the Use of Technology-Assisted Review, published in the Federal Courts Law Review. The article has a good Foreword by Judge Paul Grimm. They point out that predictive coding can get screwed up and cause the unwary to get stuck in TAR; it can possibly even cause a lawyer to violate their reasonable search duties under Rule 26(g). They point out five problem areas where mistakes are often made. Here is their list with my interpretations and key takeaways.
- Collection: you should never review everything you collect (predictive coding, for example, only works with text); you must bulk cull first. In my experience this can be a tricky area that requires significant knowledge and skill to maximize efficiency and effectiveness without sacrificing quality.
- Disclosure: you should make disclosure of irrelevant training documents as a defensive measure, which I agree with in principle, but not their suggested implementation. For my compromise position, which focuses on grey-area documents only, see, e.g.: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part Three at the subheading Disclosure of Irrelevant Training Documents.
- Training: The authors point out that use of small judgmental samples alone is under-inclusive and should be supplemented with random sampling, and once again they promote full disclosure. For my more detailed approach and analysis, see the Three-Cylinder Multimodal Approach To Predictive Coding (training from multimodal judgmental sampling, random sampling, and machine suggestions – the three cylinders of a good predictive coding search engine).
- Stabilization: This involves the question of when you stop training. They recommend that recall and precision estimates be used along with proportionality analysis, to which I agree, but must add that there are also obvious ad hoc observational considerations. For example, when you begin to see no, or relatively few, new relevant documents being uncovered, and the ones that are uncovered are of no importance. Search my blog for Relevant Is Irrelevant.
- Validation: They suggest a statistical analysis of recall, precision, and elusion, which in my view is an important part of validation, but not the end-all that their article would seem to suggest (for a concrete look at these three metrics, see the sketch just after this list). Process is also very important. They also correctly point out, again, the need for a proportionality analysis to temper and influence the validation process. Obviously I am a strong proponent of proportionality. See Predictive Coding and the Proportionality Doctrine: a Marriage Made in Big Data, 26 Regent U. Law Review 1 (2013-2014).
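To make those three validation metrics concrete, here is a minimal sketch, in Python, of the point estimates behind them. All of the counts are hypothetical numbers made up purely for illustration; a real validation protocol would also report confidence intervals based on the sample sizes and, as noted above, temper the numbers with a proportionality analysis.

```python
# Minimal sketch of the point estimates behind a validation analysis.
# All counts below are hypothetical, for illustration only.

def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of documents coded relevant that really are relevant."""
    return true_positives / (true_positives + false_positives)

def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of all relevant documents that the review found."""
    return true_positives / (true_positives + false_negatives)

def elusion(relevant_in_discard_sample: int, discard_sample_size: int) -> float:
    """Estimated rate of relevant documents hiding in the discard (null) set."""
    return relevant_in_discard_sample / discard_sample_size

# Hypothetical numbers: 9,000 documents coded relevant, of which quality
# control confirms 8,100 as truly relevant; an estimated 900 relevant
# documents were missed; 3 relevant documents turn up in a random sample
# of 1,500 drawn from the discard pile.
print(f"Precision: {precision(8100, 900):.1%}")   # 90.0%
print(f"Recall:    {recall(8100, 900):.1%}")      # 90.0%
print(f"Elusion:   {elusion(3, 1500):.2%}")       # 0.20%
```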
Although I do not agree with everything they say, a further synopsis or rebuttal is not needed. Read The Implications of Rule 26(g) on the Use of Technology-Assisted Review for yourself. It is good to see as many views as possible on the dangers and issues of TAR.
Many TAR Pits
There are many more issues with predictive coding than whether irrelevant documents should be disclosed or not, even though that is what most predictive coding panels are still preoccupied with. Schieneman & Gricks describe a few of them. I will now add a few more. My focus here will be on a three-fold, higher-level schema of how you can go wrong. But there are many more TAR pits. No one has yet mapped them all. But any good driver can see them coming, even new ones, and swerve to avoid them.
My three are a higher-level addition that builds on the Schieneman & Gricks analysis. They all point to quality control counter-measures. Consider their five along with my three and you have a more complete list of basic TAR problem areas, though there are still many more.
I have previously written about some of them. For instance, my lengthy analysis of the over-randomized approach that I called the Borg methodology. See: Three-Cylinder Multimodal Approach To Predictive Coding. I even went so far as to test that approach myself and reported on it. Borg Challenge: Report of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology (a five-part written and video series comparing two different kinds of predictive coding search methods). Also see: A Modest Contribution to the Science of Search: Report and Analysis of Inconsistent Classifications in Two Predictive Coding Reviews of 699,082 Enron Documents. (Part One); Comparative Efficacy of Two Predictive Coding Reviews of 699,082 Enron Documents. (Part Two).
Also consider the many possible errors of random sampling, and the over-reliance on inconsistent humans, SMEs and contract reviewers alike. I have written on these TAR pits before too. Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three; and Random Sample Calculations And My Prediction That 300,000 Lawyers Will Be Using Random Sampling By 2022.
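For readers who have never seen one of these random sample calculations, here is a minimal sketch of the standard sample-size formula for estimating a proportion, with a finite population correction. The 95% confidence level and plus-or-minus 2% margin of error are just common example inputs, not a recommendation, and the document count is borrowed from the Enron studies cited above.

```python
import math

# Z-scores for common confidence levels (standard normal quantiles).
Z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(population: int, confidence: float = 0.95,
                margin: float = 0.02, p: float = 0.5) -> int:
    """Classic sample-size formula for estimating a proportion,
    with finite population correction. p=0.5 is the worst case."""
    z = Z[confidence]
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)        # finite population correction
    return math.ceil(n)

# Hypothetical example: a 699,082 document collection (the Enron set from
# the studies above), 95% confidence, +/- 2% margin of error.
print(sample_size(699_082))   # about 2,393 documents
```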
Driver Education Needed
It becomes hard to follow when you go down into the boggy details, especially if you have not driven a CAR before. Lots of people never have, even a few of the best theorists. So I will keep this blog high level and relatively short. That is one thing I have learned for sure: you cannot teach a person to drive a car with books and articles. Although they may help, you can only teach driving by doing. First someone watches you drive, then they drive with you telling them what to do, ready to take over the wheel. Then eventually they drive by themselves.
Sorry to be a bearer of bad news, but that is really the only way it works. It is much like learning to try a case. First you watch and do little things, then you second chair, then you do it by yourself with some assistance, then you are on your own. We have a long, proud apprenticeship tradition in the law. Legal search is just like any other complex legal activity. We do not call it the practice of law for nothing.
Three Boggy Areas: Method, SME, Software
I mentioned these same three just last week in my article on PreSuit, but they deserve more attention than a passing remark. See PreSuit: How Corporate Counsel Could Use “Smart Data” to Predict and Prevent Litigation. The diagram, shown as my favorite Penrose triangle, summarizes the concepts.
Quality of Trainer’s Work
The quality of a Trainer’s work is the most obvious catch-all of my three. The Trainer, of course, is the person, or persons, who does the teaching for the active machine learning. Trainers are necessarily masters of the software and of the techniques needed to make the software sing, to get the maximum out of the AI systems. They are masters of method. They are the users in charge, the humans who, with the input of the SMEs, control how the machine is trained. They are the H in the all-important HCIR process (Human Computer Information Retrieval). Search my blog for HCIR, and see especially Reinventing the Wheel: My Discovery of Scientific Support for “Hybrid Multimodal” Search. They are the drivers of the CAR.
If you have an inexperienced trainer, one with no special gifts and talent in human computer interaction, one who does not have a deep intuitive grasp of the software, one who does not really know and understand how search works, one who does not have decades of evidence search experience, then you are likely to fall into a TAR pit. No matter how good your software, and how smart your SME, you may not go far. Even if you do end up where you wanted to go, it will be an overly long, frightening, and expensive drive. Much like some New York taxi rides we have all been in.
As Gary Marchionini, Professor and Dean of the School of Information and Library Sciences of U.N.C. at Chapel Hill, explained in Information Seeking in Electronic Environments (Cambridge 1995), information seeking expertise is a critical skill for successful search. It is based on both experience and innate talents. For instance, “capabilities such as superior memory and visual scanning abilities interact to support broader and more purposive examination of text.” Id. Professor Marchionini goes on to say: “One goal of human-computer interaction research is to apply computing power to amplify and augment these human abilities.” Put a professional driver in a Model-T and he will do okay. Put him in a Ferrari and it is a beautiful thing.
Without a good trainer, one with information seeking expertise, you could be riding around in a CAR with a clueless driver. Maybe your driver has never even driven a complicated CAR before, much less earned a chauffeur’s license. I am reminded of my favorite scene in Rain Man where Dustin Hoffman keeps saying I’m an excellent driver, when, of course, he could not drive at all.
Dustin Hoffman’s lovable Rain Man character knows the car backwards and forwards, of course, all of the technical details, but Rain Man does not really know how to drive. Well, he can take it up and down the driveway, but that’s about all. Much like some software demo guys. In fact, Rain Man reminds me of many so-called experts in predictive coding. They know the software, they know the talk, they claim to be excellent drivers, but they do not really know search. Even if they have used the software many times, many have never been in a courtroom, never introduced an exhibit into evidence, never used a hot document to destroy a witness on the stand. Ah, now that too is a beautiful thing.
Unless you really know and understand search, legal search for evidence, you cannot be an excellent driver. You cannot really walk the talk, much less drive a CAR in an excellent manner. But do not just take my word for it, or Professor Marchionini’s; read the research. For instance, study Monica Bay’s excellent, albeit slightly age-insulting article, EDI-Oracle Study: Humans Are Still Essential in E-Discovery: Phase I of the study shows that older lawyers still have e-discovery chops and you don’t want to turn EDD over to robots. Personally I do not think age is a prerequisite. What counts is experience and natural talent.
I am sure there are some excellent drivers of AI-enhanced CARs who do not have grey hair. It only takes a few years to become a good driver if you have the right background and special innate skills. That background includes at least some trial experience, but even more importantly, the ability to understand and implement complex instructions from trial lawyers and other legal experts. This brings us to the next big TAR pit of my big three, the SME.
To be continued ….