e-Discovery Team’s Best Practices Education Program

May 8, 2016


EDBP | Mr. EDR | Predictive Coding 3.0 | 59 TAR Articles | Doc Review | Videos



e-Discovery Team Training

Information → Knowledge → Wisdom

Education is the clearest path from Information to Knowledge in all fields of contemporary culture, including electronic discovery. The above links take you to the key components of the best-practices teaching program I have been working on since 2006. It is my hope that these education programs will help move the Law out of the dangerous information flood, where it is now drowning, to a safer refuge of knowledge. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers; and How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom.” For more of my thoughts on e-discovery education, see the e-Discovery Team School Page.

The best practices and general educational curriculum that I have developed over the years focuses on the legal services provided by attorneys. The non-legal, engineering and project management practices of e-discovery vendors are only collaterally mentioned. They are important too, but students have the EDRM and other commercial organizations and certifications for that. Vendors are part of any e-Discovery Team, but the programs I have developed are intended for law firms and corporate law departments.

The e-Discovery Team program, both general educational and legal best-practices, is online and available 24/7. It uses lots of imagination, creative mixes, symbols, photos, hyperlinks, interactive comments, polls, tweets, posts, news, charts, drawings, videos, video lectures, slide lectures, video skits, video slide shows, music, animations, cartoons, humor, stories, cultural themes and analogies, inside baseball references, rants, opinions, bad jokes, questions, homework assignments, word-clouds, links for further research, a touch of math, and every lawyer’s favorite tools: words (lots of them), logic, arguments, case law and precedent.

All of this is an attempt to take the e-Discovery Team approach from just Information to Knowledge. In spite of these efforts, most of the legal community still does not know e-discovery very well. What they do know is often misinformation. Scenes like the following in a law firm lit-support department are all too common.

The e-Discovery Team’s education program has an emphasis on document review. That is because the fees for lawyers reviewing documents are by far the most expensive part of e-discovery, even when contract lawyers are used. The lawyer review fees, and review supervision fees, including SME fees, have always been much more costly than all vendor costs and expenses put together. Still, the latest AI technologies, especially active machine learning using our Predictive Coding 3.0 methods, are now making it possible to significantly reduce review fees. We believe this is a critical application of best practices. The three steps we identify for this area in the EDBP chart are shown in green, to signify money. The reference to C.A. Review is to Computer Assisted Review or CAR, using our Hybrid Multimodal methods.



Predictive Coding 3.0 Hybrid Multimodal Document Search and Review

Our new version 3.0 techniques for predictive coding make it far easier than ever before to include AI in a document review project. The secret control set has been eliminated, and so have the seed set and the practice of SMEs wasting their time reviewing random samples of mostly irrelevant junk. It is a much simpler technique now, although we still call it Hybrid Multimodal.

Hybrid is a reference to the Man/Machine interactive nature of our methods. A skilled attorney uses a type of continuous active learning to train an AI to help them to find the documents they are looking for. This Hybrid method greatly augments the speed and accuracy of the human attorneys in charge. This leads to cost savings and improved recall. A lawyer with an AI helper at their side is far more effective than lawyers working on their own. This means that every e-discovery team today could use a robot like Kroll Ontrack’s Mr. EDR to help them to do document review.
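For readers who want a feel for the loop, here is a toy sketch of continuous active learning in Python. It is only an illustration under loud assumptions: the word-count scoring is a stand-in for a real classifier, the stopping rule (stop after a batch with no relevant documents) is a simplification, and every name in it (`score`, `continuous_active_learning`, the sample documents) is hypothetical. It is not Kroll Ontrack’s algorithm, and it is not a full statement of the Predictive Coding 3.0 method.

```python
from collections import Counter


def score(text, weights):
    """Sum the learned weights of the distinct words in a document."""
    return sum(weights[w] for w in set(text.split()))


def continuous_active_learning(docs, is_relevant, seed_ids, batch_size=1, max_rounds=10):
    """Toy hybrid loop: the machine ranks, the human reviews, the machine relearns.

    docs        -- dict of doc_id -> text
    is_relevant -- callable standing in for the attorney's relevance call
    seed_ids    -- documents found by other search methods (e.g. keywords)
    """
    weights = Counter()  # word -> crude relevance weight (assumed model)
    labels = {}          # doc_id -> True/False as coded by the reviewer

    def review(doc_id):
        label = is_relevant(doc_id)
        labels[doc_id] = label
        for word in set(docs[doc_id].split()):
            weights[word] += 1 if label else -1

    for doc_id in seed_ids:
        review(doc_id)

    for _ in range(max_rounds):
        unreviewed = [i for i in docs if i not in labels]
        if not unreviewed:
            break
        # Rank the unreviewed documents, highest predicted relevance first.
        ranked = sorted(unreviewed, key=lambda i: score(docs[i], weights), reverse=True)
        found = 0
        for doc_id in ranked[:batch_size]:
            review(doc_id)
            if labels[doc_id]:
                found += 1
        if found == 0:  # simplified stopping rule, an assumption of this sketch
            break
    return labels


# Hypothetical six-document collection; documents 1, 3 and 5 are relevant.
docs = {
    1: "merger price secret",
    2: "lunch menu friday",
    3: "merger secret memo",
    4: "friday lunch plans",
    5: "price memo merger",
    6: "parking garage notice",
}
truly_relevant = {1, 3, 5}

labels = continuous_active_learning(docs, lambda i: i in truly_relevant, seed_ids=[1])
found_relevant = sorted(i for i, rel in labels.items() if rel)
print(found_relevant, "found after reviewing", len(labels), "of", len(docs), "documents")
```

In this toy run the reviewer finds all three relevant documents while reading only four of the six, which is the kind of saving the Hybrid approach is after.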

Multimodal is a reference to the use of a variety of search methods to find target documents, including, but not limited to, predictive coding type ranked searches. We encourage humans in the loop running a variety of searches of their own invention, especially at the beginning of a project. This always makes for a quick start in finding relevant and hot documents. See Why the ‘Google Car’ Has No Place in Legal Search. The multimodal approach also makes for precise, efficient reviews with broad scope. The latest active machine learning software when fully integrated with a full suite of other search tools is attaining higher levels of recall than ever before. That is one reason Why I Love Predictive Coding.

I have found that Kroll Ontrack’s EDR software is ideally suited for these Hybrid, Multimodal techniques. Try using it on your next large project and see for yourself. The Kroll Ontrack consultant specialists in predictive coding, Jim and Tony, have been trained in this method (and many others). They are well qualified to assist you in every step of the way and their rates are reasonable. With you calling the shots on relevancy, they can do most of the search work for you and still save your client money. If the matter is big and important enough, then, if I have a time opening, and it clears my firm’s conflicts, I can also be brought in for a full turn-key operation. Whether to include extra time for training your best experts is your choice, but it is our preference.



Embrace e-Discovery Team Education to Escape Information Overload


Introducing a New Website, a New Legal Service, and a New Way of Life / Work; Plus a Postscript on Software Visualization and Thanks to Kroll Ontrack

May 3, 2015

This month my blog is not just an article, but a whole new website and Internet domain.


Check it out


The new website introduces a new legal service, ZEN Document Review. It includes three short videos of me talking and, as usual for me, lots of words and graphics. This new service is part of the social transition that I wrote about last month: Information → Knowledge → Wisdom: Progression of Society in the Age of Computers. It represents a post-information approach to legal work, specifically document review, that goes beyond the first level of Information services. ZEN Document Review is instead a service based on Knowledge and Wisdom. It is here now, but represents the future.

In this new website, Zero Error Numerics, I share, for the first time, some of the inner side of how I work. I also share more about the quality control procedures that I have developed for predictive coding based document review.

Go to ZeroErrorNumerics.com and see what I mean, including especially A Word About Zen Meditation. Unlike Steve Jobs I am not a Zen Buddhist, but, like Steve, and many others, I am a life-long meditator. I am also a lawyer and futurist with certain ideas as to what the next two stages of society should look like. I talked about this in Information → Knowledge → Wisdom. My major creation this month, Zero Error Numerics, implements these ideas in the field of work that I know.

Zero Error Numerics represents a knowledge and wisdom based approach to legal search and document review. It is not as weird as you might think. As I point out in A Word About Zen Meditation, about 25% of mainstream corporate America now encourages meditation at work for physical and mental health benefits. It creates a good vibe to control stress and get things done.




Postscript to Data Visualization with a Thanks to Kroll Ontrack

On another subject entirely, I have a postscript on my prior blog, Visualizing Data in a Predictive Coding Project – Parts One, Two and Three. I wrote that series in November 2014. You may recall my challenge to all vendors to include a probability distribution graphic along the lines I described in my blog. I asked for software developers to include such a graphics feature in future versions. I wanted to have a visual display of the relevance ranking of all documents in a predictive coding project. I made no special calls, nor even once asked Kroll Ontrack, my firm’s preferred vendor, to step up to the plate and do it. Yet, being the company that they are, they quietly added that feature in the version they released this Spring and waited to see when I would notice. It is an early, simple version, but it is there and it works well. That’s the way KO rolls, on track and ahead of the pack.

You may also recall that I shared a graphic in the Visualizing Data blog to show a probability distribution visualization. At the time, during the first three quarters of 2014, I was often seeing in my mind’s eye the kind of rankings that looked like an upside down champagne glass, shown right. I would typically see such a distribution at or near the end of active machine learning. I wanted a software feature that would take it out of my mind’s eye, my imagination, and put it onto the computer screen. My projects then would typically shape the data so that most documents were either highly ranked irrelevant, which I visualized as near the bottom of a vertical array in blue, or highly ranked relevant, which I visualized on the top in red. I was also finding documents in between, with a more gradual sloping of irrelevance at the bottom, than with relevance at the top. That is why it had an upside down champagne glass look.

If you take my graphic and turn it 90 degrees clockwise, so it goes left to right, from irrelevant to relevant, and then flatten it out, it would look like this.

Kroll Ontrack met my challenge and implemented a visualization of data ranking by using a horizontal bar graph approach. Thanks and kudos to my software development friends at Kroll Ontrack for such a quick response. You never let me down. There is good reason that Kroll Ontrack was chosen by National Law Journal readers in 2014 as the leading predictive coding technology in the industry.


The latest version of KO’s software, which they call EDR, for EDiscovery.com Review, includes a cool graphics tool that does the job of visualizing probability data ranking. It is included in the Technology Assisted Review Metrics page under Probability Distribution. They basically added a spreadsheet bar graph display. You can also see the probability distribution in numeric table form for exact metrics of the probability distributions. You also have the choice to see the probability graph in increments of 5%, 10% or 25%. The screen shot below shows 10% increments. The bar graph display shows the probability ranking from left to right, irrelevant to relevant. Here is a screen shot from a recent project after training was complete. You can click on the graph to see a larger version.


This project had about a 4% prevalence of relevant documents, so it made sense for the relevant half to be far smaller. But what is striking about the data stratification is how polarized the groupings are. This means the separation in the ranking distribution, relevant from irrelevant, is very well-formed. There is an extremely small number of documents where the AI is unsure of classification. The slow curving shape of irrelevant probability on the left (or the bottom of my upside down champagne glass) is gone.
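The kind of binned display described above, and the idea of a small "unsure" middle, can be sketched in a few lines of Python. This is a minimal illustration and not how EDR computes its graph: the function names, the sample scores, and the choice of 25% to 75% as the "unsure" band are all my assumptions. Scores are whole-number relevance probabilities from 0 to 100.

```python
def probability_distribution(scores, increment=10):
    """Count documents per probability band (e.g. 0-9%, 10-19%, ...).

    scores    -- integer relevance probabilities, 0 to 100
    increment -- band width in percent; must divide 100 (5, 10, 25, ...)
    """
    if increment <= 0 or 100 % increment != 0:
        raise ValueError("increment must divide 100 evenly")
    n_bins = 100 // increment
    counts = [0] * n_bins
    for s in scores:
        if not 0 <= s <= 100:
            raise ValueError("scores must be between 0 and 100")
        # A score of exactly 100 goes in the top band.
        counts[min(s // increment, n_bins - 1)] += 1
    return counts


def unsure_fraction(scores, low=25, high=75):
    """Share of documents ranked in the middle band (assumed 25% to 75%)."""
    return sum(1 for s in scores if low <= s < high) / len(scores)


# Hypothetical rankings for 20 documents after training: mostly clearly
# irrelevant, a couple clearly relevant, and one the machine is unsure about.
scores = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 12, 15, 50, 97, 99]

by_tens = probability_distribution(scores, increment=10)
by_quarters = probability_distribution(scores, increment=25)
print(by_tens)      # [15, 2, 0, 0, 0, 1, 0, 0, 0, 2]
print(by_quarters)  # [17, 0, 1, 2]
print(unsure_fraction(scores))  # 0.05
```

A polarized project shows up as large counts in the first and last bands with almost nothing in between, and as a very small unsure fraction.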

The visualization shows a much clearer and more complete ranking at work than I had ever seen before. The AI is much more certain about what documents are irrelevant. Below is a screenshot of the table form display of this same project in 5% increments. It shows the exact numerics of the probability distribution in place when the machine training was completed. This is the most pronounced polar separation I have ever seen, which shows that my training on relevancy has been well understood by the machine.


I am unsure of the reason for this significant change in probability distribution from what I routinely saw last year. It could just be a chance event. Time will tell. It could also just be a peculiarity of this data and search project, but it did seem typical to me, and certainly a prevalence of just over 4% is common. It could also be a result of some of the latest enhancements to the predictive coding functions in Kroll Ontrack’s EDR. The distribution attained might be more pronounced because the software is smarter. They are always working to make it better. That is how you stay number one.

The better results shown here might even be explained by improvements in my methods and my team’s performance. Maybe we are more relaxed and in the flow now than ever before. Who knows. It could also be some combination of these factors. I will keep a careful eye on the probability distributions in the future to see if this is the new normal, or just a lucky fluke.

Either way, in my experience the active machine learning, aka predictive coding, functions of Kroll Ontrack’s EDR software are working very well. It is a powerful and sophisticated tool. Like a top race car, it is hard to beat when driven correctly. Still, if you do not know how to drive, the best race car in the world will never win. If you combine both a bad car and poor driver, you may well get the world’s largest manual review project. I am told this kind of disaster happens all too often.

What passes as a good faith use of predictive coding by some law firms is a disgrace. Of course, if hide-the-ball is still your real game of choice, then all of the good software in the world will not make any difference. Keep breaking the law like that and someday you are bound to crash and burn. See, e.g., my prior articles: Discovery As Abuse; There Can Be No Justice Unless Lawyers Maintain High Ethical Standards; and E-Discovery Gamers: Join Me In Stopping Them.
