Document Review and Predictive Coding: Video Talks – Part One

March 1, 2016

[Image: Predictive Coding 3.0 eight-step work flow]

This is the first of seven informal video talks on document review and predictive coding. These short videos share my thoughts on the e-Discovery Team’s eight-step work flow for document review, shown above. I explain predictive coding and the Team’s Hybrid Multimodal Method. This first video addresses the big picture: why it is critical to our system of justice for the legal profession to keep up with technology, especially active machine learning (predictive coding).

The flood of data now all too often hides the truth and frustrates justice. Cases tend to be decided on shadows, smoke and mirrors, because the key documents cannot be found. The needles of truth hide in vast haystacks in the clouds. Justice demands the truth, the full truth, not some bastardized Twitter version.

The use of AI in legal search can change that. It can empower lawyers to find the needles and decide cases on what really happened, and do so quickly and inexpensively. It can usher in a new age of greater justice for all, blind to wealth and power. The stability of society demands nothing less.

The videos after this introduction are more technical. They delve into details of the work flow and show that it is easier than you might think. After all, only two of the eight steps (four and six) are unique to document reviews that use predictive coding. The others are found in any large scale review project, or should be.

For a more systematic explanation of the methods and the eight steps, see Predictive Coding 3.0. Still more information on predictive coding and electronic document review can be found in the fifty-six articles published here on the topic since 2011.

________



Why the ‘Google Car’ Has No Place in Legal Search

February 24, 2016

Hybrid Multimodal is the preferred legal search method of the e-Discovery Team. The Hybrid part of our method means that our Computer Assisted Review, our CAR, uses active machine learning (predictive coding), but still has a human driver. They work together. Our review method is thus like the Tesla Model S with full autopilot capabilities. It is designed to be driven by both Man and Machine. Our CAR is unlike the Google car, which can only be driven by a machine. When it comes to legal document review, we oppose fully autonomous driving. In our view there is no place for a Google car in legal search.

A Google car has no steering wheel, no brakes, no gas pedal, no way for a human to drive it at all. It is fully autonomous. A human driver cannot take over, even if they wanted to. In Google’s view, allowing humans to take over makes driverless cars less safe. Google thinks passengers could try to assert themselves in ways that could lead to a crash, so it is safer to be autonomous.

We have no opinion about the driverless automobile debate, and only like the analogy up to a point. Our opinion is limited to computer assisted review CARs that search for relevant evidence in lawsuits. For purposes of Law, we want our CARs to be like a Tesla. You can let the car drive and go hands free, if and when you want to. The Tesla AI will then drive the car for you. But you can still drive the car yourself. The second you grab the wheel, the Tesla senses that and turns the Autopilot off. Full control is instantly passed back to you. It is your car, and you are the driver, but you can ask your car to help you drive, when, in your judgment, that is appropriate. For instance, it has excellent fully autonomous parallel parking features, and you can even summon it to come pick you up from a nearby parking lot, a truly cool valet service. It is also good in slow commuter traffic and on highways, much like cruise control.

When it comes to law, and legal review, we want an attorney’s hands on, or at least near, the wheel at all times. Our Hybrid Multimodal approach includes an autopilot mode using active machine learning, but our attorneys are always responsible. They may allow the programmed AI to take over in some situations, and go hands free, much like autonomous parallel parking or highway driving, but they always control the journey.

Defining the Terms

The e-Discovery Team’s Hybrid Multimodal method of document review is based on a flexible blend of human and machine skills, where a lawyer may often delegate, but always retains control. Before we explore this further, a quick definition of terms is in order. Multimodal means that we use all kinds of search methods, and not just one type. For example, we do not just use active machine learning, a/k/a Predictive Coding, to find relevant documents. We do not just use keyword search, or concept search. We use every kind of search we can. This is shown in the search pyramid below, which does not purport to be complete, but catches the main types of document search used today. Using our car analogy, this means that when a human drives, they have a stick shift, and can run in many gears, use many search engines. They can also let go of the wheel, when they want to, and use AI-enhanced search.

[Image: search pyramid]

We call this a Hybrid method because of the manner in which we use one particular kind of search, predictive coding. To us predictive coding means active machine learning. See, e.g., Legal Search Science. It is a Man-Machine process, a hybrid process, where we work together with our machine, our robot, whom we call Mr. EDR. In other words, we use the artificial intelligence generated by active machine learning, but we keep lawyers in the loop. We stay involved, hands on or near the wheel.
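For readers who like to see the idea in code, here is a minimal sketch in Python of a hybrid loop of this kind. It is not Mr. EDR or any real review software. The names `score`, `hybrid_review`, and `lawyer_says_relevant` are hypothetical, and the simple word-weight scoring is only a stand-in, assumed for illustration, for whatever classifier a real tool uses. The point it tries to show is the division of labor: the machine ranks, the lawyer decides, and the machine learns only from the lawyer's coding.

```python
from collections import Counter


def score(doc, weights):
    """Sum the learned word weights for the distinct words in a document."""
    return sum(weights[w] for w in set(doc.lower().split()))


def hybrid_review(docs, lawyer_says_relevant, batch_size=2):
    """Machine ranks the unreviewed documents; the lawyer codes every batch.

    Illustrative assumption: a toy word-weight model stands in for a real
    active machine learning classifier.
    """
    weights = Counter()      # blank slate: the machine knows nothing yet
    unreviewed = list(docs)
    coded = {}               # document text -> the lawyer's relevance call
    while unreviewed:
        # Machine step: rank what is left by the current model.
        unreviewed.sort(key=lambda d: score(d, weights), reverse=True)
        batch, unreviewed = unreviewed[:batch_size], unreviewed[batch_size:]
        for doc in batch:
            # Human step: the lawyer, not the machine, decides relevance.
            relevant = lawyer_says_relevant(doc)
            coded[doc] = relevant
            # Training step: the model is updated only from that decision.
            for w in set(doc.lower().split()):
                weights[w] += 1 if relevant else -1
    return coded, weights


docs = [
    "merger price memo",
    "lunch menu friday",
    "merger draft terms",
    "friday parking notice",
]
coded, weights = hybrid_review(docs, lambda d: "merger" in d)
```

In this toy run the second batch puts "merger draft terms" ahead of "friday parking notice" because of what the lawyer taught the model in the first batch. A real project would stop well before every document is reviewed; this sketch codes them all only to keep the example short.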

Augmentation, Not Automation

The e-Discovery Team’s Hybrid approach enhances what lawyers do in document review. It improves our ability to make relevance assessments of complex legal issues. The hybrid approach thus leads to augmentation, where lawyers can do more, faster and better. It does not lead to automation, where lawyers are replaced by machines.

The Hybrid Multimodal approach is designed to improve a lawyer’s ability to find evidence. It is not designed to fully automate the tasks. It is not designed to replace lawyers with robots. Still, since one lawyer with our methods can now do the work of hundreds, some lawyers will inevitably be out of a job. They will be replaced by other, more tech savvy lawyers who can work with the robots, who can control them and be empowered by them at the same time. This development in turn creates new jobs for the experts who design and care for the robots, and for lawyers who find new ways to use them.

We think that empowering lawyers, and keeping them in the loop, hands near the wheel, is a good thing. We believe that lawyers bring an instinct and a moral sense that is way beyond the grasp of all automation. Moreover, at least today, lawyers know the law, and robots do not. The active machine learning process – predictive coding – begins with a blank slate. Our robots only know what we teach them about relevance. This may change soon, but we are not there yet. See PreSuit.com. Another advantage that we currently have, again one that may someday be replaced, is legal analysis. Humans are capable of legal reasoning, at least after years of schooling and years of legal practice. Right now no machine in the world is even close. Again, we concede this may someday be automated, but we suspect that day is at least ten years away.

The one thing we do not think can ever be automated is the human moral sense of right and wrong, our ethics, our empathy, our humor, our instinct for justice, and our capacity for creativity and imagination, for molding novel remedies to attain fair results in new fact scenarios. This means that, at the present time at least, only lawyers have an instinct for the probative value of documents and their ability to persuade. Even if legal knowledge and legal analysis are some day programmed into a machine, we contend that the unique human qualities of ethics, fairness, empathy, humor, imagination, creativity, flexibility, etc., will always keep trained lawyers in the loop. When it comes to questions of law and justice, humans will always be needed to train and supervise the machines. Not everyone agrees with us.

There is a struggle going on about this right now, one that is largely under the radar. The clash became apparent to the e-Discovery Team during our venture into the world of science and academia at TREC 2015. Some argue that lawyers should be replaced, not enhanced. They favor fully automated methods for a variety of reasons, including cost, a point with which we agree, but also including the alleged inherent unreliability and dishonesty of humans, especially lawyers, a point with which we strenuously disagree. Some scientists and technologists do not appreciate the unique capabilities that humans bring to legal search. More than that, some even think that lawyers should not be trusted to find evidence, especially documents that could hurt their client’s case. They doubt our ability to be honest in an adversarial system of justice. They see the cold hard logic of machines as the best answer to human subjectivity and deceitfulness. They see machines as the impartial counter-point to human fallibility. They would rather trust a machine than a lawyer. They see fully automated processes as a way to overcome the base elements of man. We do not. This is an important Roboethics issue that has ramifications far beyond legal search.

Although we have faced our fair share of dishonest lawyers, we still contend they are the rare exception, not the rule. Lawyers can be trusted to do the right thing. The few bad actors can be policed. The existence of a few unethical lawyers should not dictate the processes used for legal search. That is the tail wagging the dog. It makes no sense and, frankly, is insulting. Just because there are a few bad drivers on the road does not mean that everyone should be forced into a Google car. Plus, please remember the obvious: these same bad actors could also program their robots to do evil for them. Asimov’s laws are a fiction. Not only that, think of the hacking exposure. No. Turning it all over to supposedly infallible and honest machines is not the answer. A hybrid relationship with Man in control is the answer. Trust, but verify.

The e-Discovery Team members have been searching for evidence, both good and bad, all of our careers. We do not put our thumb on the scale of justice. Neither do the vast majority of attorneys. We do, however, routinely look for ways to show bad evidence in a good light; that is what lawyers are supposed to do. Making silk purses out of sow’s ears is Trial Law 101. But we never hide the ears. We argue the law, and application of the law to the facts. We also argue what the facts may be, what a document may mean for instance, but we do not hide facts that should be disclosed. We do not destroy or alter evidence. Explaining is fine, but hiding is not.

Many laypersons outside of the law do not understand the clear line. The same misunderstanding applies to some novice lawyers too, especially the ones that have only heard of trials. Hiding and destroying evidence are things that criminals do, not lawyers. If we catch opposing counsel hiding the ball, we respond accordingly. We do not give up and look for ways to turn our system of justice over to cold machines.

Conclusion

We should not take away everyone’s license just because a few cannot drive straight. A Computer Assisted Review guided by AI alone has no place in the law. AI guidance is fine, and we encourage it; that is what Hybrid means. But the CARs should always have a steering wheel and brake. Lawyers should always participate. It is total delegation to AI that we oppose, fully automated search. Legal robots can and should be our friends, but they should never be our masters.

Having said that, we do concede that the balance between Man and Machine is slowly shifting. The e-Discovery Team is gradually placing more and more reliance on the Machine. We learned many lessons on that in our participation in the TREC experiments in 2015. The fully automated methods that the academic teams used did surprisingly well, at least in relatively simple searches requiring limited legal analysis. We expect to put greater and greater reliance on AI in years to come as the software improves, but we will always keep our hands near the wheel.

We believe in a collaborative Man-Machine process, but insist that Man, here Lawyers, be the leaders. The buck must stop with the attorney of record, not a robot, even a superior AI like our Mr. EDR. Man must be responsible. Artificial intelligence can enhance our own intelligence, but should never replace it. Back to the AI car analogy, we can and should let the robot drive from time to time. Robots are, for instance, great at parallel parking. But we should never discard the steering wheel. Law is not a logic machine, nor should it be. It is an exercise in ethics, in fairness, justice and empathy. We should never forget the priority of the human spirit. We should never put too much faith in inhuman automation.

For more on these issues, the hybrid multimodal method, competition with fully automated methods, and much more, please see the e-Discovery Team’s final report of its participation in the 2015 TREC Total Recall Track, found on NIST’s website at: http://trec.nist.gov/pubs/trec24/papers/eDiscoveryTeam-TR.pdf. It was just published last week. At 116 pages, it should help you to fall asleep for many nights, but hopefully not while you are driving like the bozos in the hands-free driving video below.

 


Return of the Robots!

June 29, 2014

Tired of all of the words thrown at you by the e-Discovery Team blog? Just want to relax and enjoy the summer, but still keep up? Maybe learn something interesting and potentially useful? We understand. We have just the thing for you: a nostalgic look back at our robot movies. They are not extinct yet, and although some sequels stink, these are pretty good. Our robots cover transforming topics that are still cutting edge. They explain the use of storytelling and gamification in predictive coding. They also cover the ethics of viruses and bad robots, and then end with our robots getting ready to testify before Judge Waxse on random sampling in predictive coding. I dare say few people can follow their talk on sampling in just one viewing.

Love words like we do? Not satisfied with robot reruns? We understand that too. Our summer reading is mainly full of cool cybersecurity books found at eDiscovery Security, especially the Cyberthriller novels. Check them out. I’m reading Trojan Horse right now. It has to do with a virus that allows documents to be altered en route after they are sent by email. Talk about an evidence authentication nightmare!

Remember, for full enjoyment of these videos press the HD button on the upper right corner, and then expand in the lower right for full size screen. Maybe someday we will do 3D and IMAX too!

eDiscovery Robots Explain How STORYTELLING Will Be Used in Predictive Coding in the Not Too Distant Future

______

eDiscovery Robots Explain How GAMIFICATION Will Be Used in Predictive Coding in the Not Too Distant Future

______

 

eDiscovery Robots Explain ETHICS and Predictive Coding in the Not Too Distant Future


_____

eDiscovery Robots Explain How RANDOM SAMPLING is Used in Predictive Coding

 

____________


Goodbye Lexie! We luv ya! It was a great run while it lasted.

Who knows? Maybe you’ll return someday too?