Why I Love Predictive Coding

Making document review fun with Mr. EDR and Predictive Coding 3.0

Many lawyers and technologists like predictive coding and recommend it to their colleagues. They have good reasons to do so. It has worked for them. It has allowed them to do e-discovery reviews in an effective, cost-efficient manner. That is true for me too, but that is not why I love predictive coding. My feelings come from the excitement, fun, and amazement that often arise from seeing it in action. I love watching my predictive coding software find documents that I never could have found on my own. I love the way the AI in the software helps me do the impossible. I love how it makes me far smarter and more skilled than I really am.

I have been getting those kinds of positive feelings a lot lately using the new Predictive Coding 3.0 methodology and Kroll Ontrack’s latest eDiscovery.com Review software (“EDR”). So too have the e-Discovery Team members who helped me participate in this year’s TREC (the annual science experiment on the latest text search techniques, sponsored by the National Institute of Standards and Technology). During our grueling forty-five days of experiments we came to admire the intelligence of the new EDR software so much that we decided to personalize the AI as a robot. We named him Mr. EDR out of respect. He even has his own website now, MrEDR.com, where he explains how he helped my e-Discovery Team in the 2015 TREC Total Recall Track experiments. With Mr. EDR at your side, document review need never be boring again.

How and Why Predictive Coding is Fun

Step Six of the eight-step workflow for Predictive Coding 3.0 is called Hybrid Active Training. That is where we work with the active machine-learning features of Mr. EDR, the predictive coding functions that are a type of artificial intelligence. We train the computer on our conception of relevance by showing it relevant and irrelevant documents that we have found. The software is then designed to go out and find all of the other relevant documents in the total dataset.
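
For readers curious about the mechanics, here is a minimal sketch of what one such training pass can look like, written in generic machine-learning terms. This is not Kroll Ontrack’s actual code; the scikit-learn calls, model choice, and toy documents below are illustrative assumptions only.

```python
# A minimal sketch of a single training pass, in the spirit of Step Six.
# NOT Kroll Ontrack's EDR code; every name here is an illustrative assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ["draft retail marketing plan ...",        # coded relevant
               "office holiday party schedule ..."]      # coded irrelevant
train_labels = [1, 0]                                    # 1 = relevant
collection = train_texts + ["uncoded document one ...",
                            "uncoded document two ..."]

vectorizer = TfidfVectorizer(stop_words="english")
X_all = vectorizer.fit_transform(collection)
X_train = vectorizer.transform(train_texts)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, train_labels)

# Rank every document in the collection by predicted probability of relevance.
scores = model.predict_proba(X_all)[:, 1]
ranking = scores.argsort()[::-1]   # the top of this list is reviewed next
```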

We use a multimodal approach to find training documents, meaning we use all of the other search features of Mr. EDR to find relevant ESI, such as keyword, similarity, and concept searches. We iterate the training with sample documents, both relevant and irrelevant, until the computer starts to understand the scope of relevance we have in mind. It is a training exercise to make our AI smart, to get it to understand the basic ideas of relevance for that case. It usually takes multiple rounds of training for Mr. EDR to understand what we have in mind. But he is a fast learner, and by using the latest hybrid multimodal continuous active learning techniques, we can usually complete his training in a day or two.
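
The iteration just described is what the research literature calls continuous active learning (CAL). A toy version of the loop, with every name a hypothetical stand-in rather than EDR’s real API, might look like this:

```python
# A toy continuous-active-learning (CAL) loop. 'model' is any scikit-learn
# style classifier, 'X' the vectorized collection, 'labels' a dict
# {doc_index: 1/0} of coding so far, 'unreviewed' a set of uncoded indexes,
# and 'human_review' the attorney making the relevance call.
def cal_loop(model, X, labels, unreviewed, human_review,
             batch_size=50, max_rounds=10):
    for _ in range(max_rounds):
        coded = sorted(labels)
        model.fit(X[coded], [labels[i] for i in coded])   # retrain on all coding so far
        scores = model.predict_proba(X)[:, 1]
        # Next batch: the highest-ranked documents no one has coded yet.
        batch = sorted(unreviewed, key=lambda i: scores[i],
                       reverse=True)[:batch_size]
        if not batch:
            break
        for i in batch:
            labels[i] = human_review(i)   # 1 = relevant, 0 = irrelevant
            unreviewed.discard(i)
    return model, labels
```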

After a while Mr. EDR starts to “get it”; he starts to really understand what we are after, what we think is relevant in the case. That is when a happy shock-and-awe moment can happen. That is when Mr. EDR’s intelligence and search abilities start to exceed our own. Yes. It happens. The pupil starts to evolve beyond his teachers. The smart algorithms start to see patterns and find evidence invisible to us. At that point we let him teach himself by automatically accepting his top-ranked predicted relevant documents without even looking at them. Our main role then is to determine a good range for the automatic acceptance and do some spot-checking. We are, in effect, allowing Mr. EDR to take over the review. Oh, what a feeling to then watch what happens, to see him keep finding new relevant documents and keep getting smarter and smarter by his own self-programming. That is the special AI high that makes it so much fun to work with Predictive Coding 3.0 and Mr. EDR.
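
The auto-acceptance step can be pictured as a simple cutoff over the model’s relevance scores, with a random spot-check behind it. This continues the toy names from the loop above; both cutoff numbers are purely illustrative, not values from our actual projects.

```python
import random

# Hypothetical auto-acceptance step: 'scores' holds the model's relevance
# probabilities and 'unreviewed' the uncoded document indexes (from the
# sketches above). The numbers are illustrative only.
AUTO_ACCEPT_CUTOFF = 0.95   # the "good range" is a per-project judgment call
SPOT_CHECK_RATE = 0.02      # fraction of auto-accepted docs a human still reads

auto_accepted = [i for i in unreviewed if scores[i] >= AUTO_ACCEPT_CUTOFF]
if auto_accepted:
    sample_size = max(1, int(len(auto_accepted) * SPOT_CHECK_RATE))
    spot_check = random.sample(auto_accepted, sample_size)
    # If the spot check turns up mistakes, lower the cutoff and keep
    # training rather than trusting the auto-accepted range.
```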

It does not happen in every project, but with the new Predictive Coding 3.0 methods and the latest Mr. EDR, we are seeing this kind of transformation happen more and more often. It is a tipping point in the review when we see Mr. EDR go beyond us. He starts to unearth relevant documents that my team would never even have thought to look for. The relevant documents he finds are sometimes completely dissimilar to any others we found before. They do not have the same keywords, or even the same known concepts. Still, Mr. EDR sees patterns in these documents that we do not. He can find the hidden gems of relevance, even outliers and black swans, if they exist. When he starts to train himself, that is the point in the review when we think of Mr. EDR as going into superhero mode. At least, that is the way my young e-Discovery Team likes to talk about him.

By the end of many projects the algorithmic functions of Mr. EDR have attained a higher intelligence and skill level than our own (at least on the task of finding the relevant evidence in the document collection). He is always lightning fast and inexhaustible, even untrained, but by the end of his training he becomes a search genius. Watching Mr. EDR in that kind of superhero mode is one of the things that make Predictive Coding 3.0 a pleasure.

The Empowerment of AI Augmented Search

It is hard to describe the combination of pride and excitement you feel when Mr. EDR, your student, takes your training and then goes beyond you. More than that, the super-AI you created then empowers you to do things that would have been impossible before, absurd even. That feels pretty good too. You may not be Iron Man, or look like Robert Downey, but you will be capable of remarkable feats of legal search strength.


For instance, using Mr. EDR as our Iron Man-like suit, my e-discovery team of three attorneys was able to do thirty different review projects and classify 17,014,085 documents in 45 days. See the TREC experiment summary at MrEDR.com. We did these projects mostly at night and on weekends, while holding down our regular jobs. What makes this seem crazy, even impossible, is that we accomplished it while personally reviewing only 32,916 documents. That is less than 0.2% of the total collection. That means we relied on predictive coding to do 99.8% of our review work. Incredible, but true. Using traditional linear review methods it would have taken us 45 years to review that many documents! Instead, we did it in 45 days. Plus, our recall and precision rates were insanely good. We even scored 100% precision and 100% recall in one TREC project. You read that right. Perfection. Many of our other projects attained scores in the high and mid nineties. We are not saying you will get results like that. Every project is different, and some are much more difficult than others. But we are saying that this kind of AI-enhanced review is not only fast and efficient, it is effective.
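
The headline numbers are easy to verify, and the two quality scores mentioned have standard definitions. Note that the linear-review rate assumed below is my own illustrative figure, not a number from TREC.

```python
# Checking the arithmetic above, plus the definitions of the two scores.
total_docs = 17_014_085
human_reviewed = 32_916
print(human_reviewed / total_docs)   # ~0.00193, i.e. under 0.2%

# At an assumed linear rate of ~60 docs/hour, 2,000 hours/year, 3 reviewers:
print(total_docs / (60 * 2000 * 3))  # ~47 years, consistent with "45 years"

def precision(tp, fp):
    return tp / (tp + fp)   # of the documents produced, the share truly relevant

def recall(tp, fn):
    return tp / (tp + fn)   # of the truly relevant documents, the share found
```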

Yes, it’s pretty cool when your little AI creation does all the work for you and makes you look good. Still, no robot could do this without your training and supervision. We are a team, which is why we call it hybrid multimodal, man and machine.

Having Fun with Scientific Research at TREC 2015

During the 2015 TREC Total Recall Track experiments my team would sometimes get totally lost on a few of the really hard Topics. We were not given legal issues to search, as we usually are. They were arcane technical hacker issues, political issues, or local news stories. Not only were we in new fields, but the scope of relevance of the thirty Topics was never really explained (we were given only one- to three-word descriptions). We had to figure out the intended relevance during the project based on feedback from the automated TREC document adjudication system. We would have some limited understanding of relevance based on our suppositions about the initial keyword hints, and so we could begin to train Mr. EDR with that. But in several Topics we never had any real understanding of exactly what TREC thought was relevant.
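
In other words, the TREC adjudication system replaced the human reviewer as the oracle in the training loop. A hypothetical stand-in for that call, with an invented endpoint and JSON shape (the real track API differed), could be as simple as this, plugging in wherever human_review() did in the earlier sketch:

```python
import requests

# Hypothetical stand-in for the track's automated adjudication call; the
# endpoint and JSON shape are invented for illustration, not the real API.
TREC_SERVER = "http://example.invalid/jig"

def trec_adjudicate(doc_id):
    resp = requests.post(f"{TREC_SERVER}/judge", json={"doc": doc_id})
    return resp.json()["relevant"]   # 1 or 0, fed straight back into training
```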

This was a very frustrating situation at first, but, and here is the cool thing, even though we did not know, Mr. EDR knew. That’s right. He saw the TREC patterns of relevance hidden from us mere mortals. In many of the thirty Topics we would just sit back and let him do all of the driving, like a Google car. We would often just cheer him on (and each other) as the TREC systems kept saying Mr. EDR was right, that the documents he selected were relevant. The truth is, during much of the 45 days of TREC we were all like kids in a candy store having a great time. That is when we decided to give Mr. EDR a cape and superhero status. He never let us down. It is a great feeling to create an AI with greater intelligence than your own and then see it augment and improve your legal work. It is truly a hybrid human-machine partnership at its best.

I hope you get the opportunity to experience this for yourself someday. This year’s TREC experiments are over, but the search for truth and justice goes on in lawsuits across the country. Try it on your next document review project.

Do What You Love and Love What You Do

Mr. EDR, and other good predictive coding software like it, can augment our own abilities and make us incredibly productive. This is why I love predictive coding and would not trade it for any other legal activity I have ever done (although I have had similar highs from oral arguments that went great, or the rush that comes from winning a big case).

The excitement of predictive coding comes through clearly when Mr. EDR is fully trained and able to carry on without you. It is a kind of Kurzweilian mini-singularity event. It usually happens near the end of the project, but it can happen earlier, when your computer catches on to what you want and starts to find the hidden gems you missed. I suggest you give Predictive Coding 3.0 and Mr. EDR a try. Then you too can have fun with evidence search. You too can love what you do. Document review need never be boring again.

Caution

One note of caution: most e-discovery vendors, including several prominent software makers, still do not follow the hybrid multimodal Predictive Coding 3.0 approach that we use to attain these results. They instead rely entirely on machine-selected documents for training, or, even worse, rely entirely on randomly selected documents to train the software, or use elaborate, unnecessary secret control sets. On the other end of the spectrum, some vendors use all search methods except for predictive coding, to keep it simple, they say. It may be simple, but the power, speed, quality control, and just plain fun given up for that simplicity are not worth it. The old ways are more costly because they take so much lawyer time to complete; they are also less effective, and they are boring. The use of AI data analytics is clearly the way of the future. It is what makes document review enjoyable and why I love to do big projects. It turns scary into fun.
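
The difference between random and machine-selected training documents is easy to state in code. This reuses the toy scores and unreviewed names from the earlier sketches; it is a contrast of strategies, not a depiction of any vendor’s product.

```python
import random

# Two of the seed-selection strategies named above, side by side.
pool = sorted(unreviewed)
k = min(50, len(pool))
random_batch = random.sample(pool, k)                    # random selection
machine_batch = sorted(pool, key=lambda i: scores[i],
                       reverse=True)[:k]                 # machine-selected
# The hybrid multimodal approach adds a third source: documents the
# attorneys find themselves with keyword, similarity, and concept searches.
```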

I have also heard that the algorithms used by some vendors for predictive coding are not very good. Scientists tell me that some are only dressed-up concept search or unsupervised document clustering. Only bona fide active machine-learning algorithms create the kind of AI experience that I am talking about. So, if it does not work for you, it could well be the software’s fault, not yours. The new 3.0 methods are not very hard to follow, and they certainly will work, as we have proven at TREC, but only if you have good software. With just a little training, and some help at first from consultants (most vendors will have good ones to help), you can have the kind of success and excitement that I am talking about.
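
The distinction the scientists are drawing can be shown in one contrast, again with scikit-learn as a neutral stand-in (no vendor’s product is depicted), reusing X_all, X_train, and train_labels from the first sketch.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Unsupervised clustering groups documents by similarity alone; your
# relevance calls never enter the model, so it cannot learn "relevant".
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X_all)

# Bona fide active machine learning is supervised by your coding and is
# retrained every round, which is what produces the experience described here.
ranker = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
```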

Do not give up if it does not work for you the first time, especially in a complex project. Try another vendor instead, one that may have better software and better consultants. Also, be sure that your consultants are Predictive Coding 3.0 experts, and that you follow their advice. Finally, remember that the cheapest is almost never the best, and in the long run it will cost you a small fortune in wasted time and frustration.

Conclusion

Love what you do. It is a great feeling and a sure-fire way to job satisfaction and success. With these new predictive coding technologies it is easier than ever to love e-discovery. Try them out. Treat yourself to the AI high that comes from using smart machine-learning software and fast computers. There is nothing else like it. If you switch to the 3.0 methods and software, you too can know that thrill. You can watch an advanced intelligence, which you helped create, exceed your own abilities, exceed anyone’s abilities. You can sit back and watch Mr. EDR complete your search for you. You can watch him do so in record time and with record results. It is amazing to see good software find documents that you know you would never have found on your own.

Predictive coding AI in superhero mode can be exciting to watch. Why deprive yourself of that? Who says document review has to be slow and boring? Start making the practice of law fun again.


10 Responses to Why I Love Predictive Coding

  1. Greg Fordham says:

    Ralph,

    When is predictive coding not a good fit? I don’t think I have seen you discuss this. My understanding is that it is not good for numerical data like spreadsheets. It is not good for short text documents of, say, less than 100 words, because there is not enough data present for it to develop a meaningful “fingerprint.” It is not good for low-accuracy text data, such as when graphic images were OCR’d and the images were not of high quality, or the text conversion was not highly accurate (say, 99% or better). Of course, it is also not good for graphical images without text that can be converted, or with textual content so low that it is not meaningful.

    • gvc says:

      Greg,

      I would frame the question somewhat differently: when does [predictive coding method] fail to improve on [alternative method] to solve [problem].

      If [predictive coding method] is a state-of-the-art learning method, and [alternative method] is some combination of Boolean search and manual review, it is difficult to find [problem] for which this question is answered in the affirmative.

      Attached please find a document from the 2008 TREC Legal Track that a TAR system found relevant to the request: “All documents which describe, refer to, report on, or mention any “in-store,” “on-counter,” “point of sale,” or other retail marketing campaigns for cigarettes.”

      —————
      Attn: RETAIL MANAGER
      Store Z-~ 09 0 SIS N
      6 3 Division _ 5 y~ ga ~
      Street: City: l~
      ~-
      ~/5 n i nrn ~ a~Lf
      U,-~ klOA –
      0
      foS S’ L 6uu 2 ASL?&:e
      Promotion Eiecution: ~
      ~ ~ ~
      ~z6 z ~ ~~ 1Z
      v
      C~ti../
      ~’~
      Contractual Compliance:
      Merchandising:
      KAM/AM A-(~..~ L i:~C~~ I~ate:
      ..w
      cc: Roger Farmer

      pgNbr=1
      ]]

      • Greg Fordham says:

        And I bet predictive coding would have returned it too if the training document had been as simplistic as the criteria in your example from TREC. Similarly, if this had been an important document, I bet predictive coding would have missed it because it lacks adequate data to provide a meaningful fingerprint. Everything is a nail if all you have is a hammer. We’re going to see the same kind of troublesome results with predictive coding if users of the technology don’t have a better understanding of the concepts that make it work, and of when that tool works well versus when it will not.

      • Ralph Losey says:

        A bit harsh, Greg.

      • Ralph Losey says:

        Thanks. Nice comment, Gordon. How do you remember examples like that?!

  2. Ralph Losey says:

    I agree with Gordon and adopt that answer. Thanks for adding that.

    I have discussed in other articles some of the things that do not work well for predictive coding, including graphics. I don’t like large documents as initial training documents either, but once the classifier is well trained, it seems to have no problem with them. The same goes for spreadsheets that are mostly numbers, but almost always have some key text too. Really short text documents are probably not too good either, so I often look them over separately. Remember, I use a multimodal approach and do not limit myself to predictive coding alone (not yet, anyway). Still, in my way-too-long experience in legal search I have found it to be a very rare case indeed where a short text is relevant. For instance, in practice I have never seen a standalone email or text that just says “Yes” or “No” that was relevant, much less critical; yet that is often the example people use. In my experience such relevant short responses are usually in a chain. If one does become important, and that is usually obvious from the questioning email or text, then you can look for it. Again, I promote the multimodal approach, including the occasional linear review, and I always include the human mind, and thus the “hybrid,” machine-and-man part of hybrid multimodal.

    Finally, I note that Kroll Ontrack has a neat little feature that I use a lot, where you can search audio by using text and phonetics. That is part of my multimodal tool belt. I understand new algorithms that search video are working pretty well too.

    • Greg Fordham says:

      I think that is very interesting and useful information that I do not see very often in discussions of predictive coding, nor do I find practitioners familiar with it when they talk about predictive coding. My concern is that they perceive it as some kind of magic bullet that they can implement without forethought or understanding. So, that is why I suggested more discussion about its limitations and quirks.

  3. […] You will not be replaced by robots exactly, but by other AI-enhanced human reviewers. See: Why I Love Predictive Coding (The Empowerment of AI Augmented […]

  4. […] Good drivers of CARs – Computer Assisted Reviews – can see the curves. They expect them, even when driving a new course. When they come to a curve, they are not surprised, they know how to speed through the curves. They can do a power drift through any corner. Change in relevance should not be a speed-bump. It should be an opportunity to do a controlled skid, an exciting drift with tires burning. Speed drifts help keep a document review interesting, even fun, much like a race track. If you are not having a good time with large scale document review, then you are obviously doing something wrong. You may be driving an old car using the wrong methods. See: Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding 3.0. […]
