Predictive Coding 4.0 – Nine Key Points of Legal Document Review and an Updated Statement of Our Workflow – Part Three

September 26, 2016

This is the third installment of my lengthy article explaining the e-Discovery Team’s latest enhancements to electronic document review using Predictive Coding. Here are Parts One and Two. This series explains the nine insights (6+3) behind the latest upgrade to version 4.0 and the slight revisions these insights triggered to the eight-step workflow. This is all summarized by the diagram below, which you may freely copy and use if you make no changes.

To summarize, this series explains the seventeen points listed below; the first nine are insights and the last eight are workflow steps:

  1. Active Machine Learning (aka Predictive Coding)
  2. Concept & Similarity Searches (aka Passive Learning)
  3. Keyword Search (tested, Boolean, parametric)
  4. Focused Linear Search (key dates & people)
  5. GIGO & QC (Garbage In, Garbage Out) (Quality Control)
  6. Balanced Hybrid (man-machine balance with IST)
  7. SME (Subject Matter Expert, typically trial counsel)
  8. Method (for electronic document review)
  9. Software (for electronic document review)
  10. Talk (step 1 – relevance dialogues)
  11. ECA (step 2 – early case assessment using all methods)
  12. Random (step 3 – prevalence range estimate, not control sets)
  13. Select (step 4 – choose documents for training machine)
  14. AI Rank (step 5 – machine ranks documents according to probabilities)
  15. Review (step 6 – attorneys review and code documents)
  16. Zen QC (step 7 – Zero Error Numerics Quality Control procedures)
  17. Produce (step 8 – production of relevant, non-privileged documents)

So far in Part One of this series we explained how these insights came about and provided other general background. In Part Two we explained the first of the nine insights, Active Machine Learning, including the method of double-loop learning. In the process we introduced three more insights, Balanced Hybrid, Concept & Similarity Searches, and Software. For continuity purposes we will address Balanced Hybrid next. (I had hoped to cover many more of the seventeen in this third installment, but it turns out it all takes more words than I thought.)

Balanced Hybrid
Using Intelligently Spaced Training – IST™

The Balanced Hybrid insight is complementary to Active Machine Learning. It has to do with the relationship between the human training the machine and the machine itself. The name itself says it all, namely that it is balanced. We rely on both software and skilled attorneys using the software.


We advocate reliance on the machine after it becomes trained, after it starts to understand your conception of relevance. At that point we find it very helpful to rely on what the machine has determined to be the documents most likely to be relevant. We have found it is a good way to improve precision in the sixth step of our 8-step document review methodology shown below. We generally use a balanced approach where we start off relying more on human selections of documents for training based on their knowledge of the case and other search selection processes, such as keyword or passive machine learning, a/k/a concept search. See steps 2 and 4 of our 8-step method – ECA and Select. Then we switch to relying more on the machine as its understanding catches on. See steps 4 and 5 – Select and AI Rank. Over the whole of a project the approach is usually balanced, with equal weight given to the human trainer, typically a skilled attorney, and the machine, a predictive coding algorithm of some type, typically logistic regression or a support vector machine.


[Diagram: Predictive Coding 4.0 eight-step workflow with IST]

Unlike other methods of Active Machine Learning, we do not completely turn over to the machine all decisions as to what documents to review next. We look to the machine for guidance as to what documents should be reviewed next, but it is always just guidance. We never completely abdicate control to the machine. I have gone into this before at some length in my article Why the ‘Google Car’ Has No Place in Legal Search. In that article I cautioned against over-reliance on fully automated methods of active machine learning. Our method is designed to empower the humans in control, the skilled attorneys. Thus, although our Hybrid method is generally balanced, our scale tips slightly in favor of humans, the team of attorneys who run the document review. So while we like our software very much, and have even named it Mr. EDR, we have an unabashed favoritism for humans. More on this at the conclusion of the Balanced Hybrid section of this article.


Three Factors That Influence the Hybrid Balance

We have shared the previously described hybrid insights in earlier e-Discovery Team writings on predictive coding. The new insights on Balanced Hybrid are described in the rest of this segment. Again, they are not entirely new either. They represent more of a deepening of understanding and should be familiar to most document review experts. First, we have gained better insight into when and why the Balanced Hybrid approach should be tipped one way or another, towards greater reliance on the humans or the machine. We see three factors that influence our decision.

  1. On some projects your precision and recall improve by putting greater reliance on the AI, on the machine. These are typically projects where one or more of the following conditions exist:

* the data itself is very complex and difficult to work with, such as specialized forum discussions; or,

* the search target is ill-defined, i.e., no one is really sure what they are looking for; or,

* the Subject Matter Expert (SME) making final determinations on relevance has limited experience and expertise.

  2. On some projects your precision and recall improve by putting even greater reliance on the humans, on the skilled attorneys working with the machine. These are typically projects where the converse of one or more of the three criteria above is present:

* the data itself is fairly simple and easy to work with, such as a disciplined email user (note this has little or nothing to do with data volume); or,

* the search target is well-defined, i.e., there are clearly defined search requests and everyone is on the same page as to what they are looking for; or,

* the Subject Matter Expert (SME) making final determinations on relevance has extensive experience and expertise.

What was somewhat surprising from our 2016 TREC research is how far you can tip towards the Human side of the equation and still attain near perfect recall and precision. The Jeb Bush email underlying all thirty of our topics in the TREC Total Recall Track 2016 is, at this point, well-known to us. It is fairly simple and easy to work with. Although the spelling of the thousands of constituents who wrote to Jeb Bush was atrocious (far worse than general corporate email, except maybe construction company emails), Jeb’s use of email was fairly disciplined and predictable. As a Florida native and lawyer who lived through the Jeb Bush era, who was generally familiar with all of the issues, and who has become very familiar with his email, I have become a good SME, and, to a somewhat lesser extent, so has my whole team. (I did all ten of the Bush Topics in 2015 and another ten in 2016.) Also, we had fairly well-defined, simple search goals in most of the topics.

For these reasons, in many of these 2016 TREC document review projects the role of the machine and machine ranking became fairly small. In some that I handled it was reduced to a quality control, quality assurance method. The machine would pick up and catch a few documents that the lawyers alone had missed, but only a few. The machine thus had a slight impact on improved recall, but not much effect at all on precision, which was very high anyway. (More on this experience with easy search topics later in this essay when we talk about our Keyword Search insights.)
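For readers less familiar with the retrieval metrics used throughout this series, recall and precision carry their standard information retrieval meanings, which can be stated in one line each:

```latex
\text{Recall} = \frac{\text{relevant documents found}}{\text{all relevant documents in the collection}},
\qquad
\text{Precision} = \frac{\text{relevant documents found}}{\text{all documents retrieved as relevant}}
```

High recall means few relevant documents were missed; high precision means few irrelevant documents were swept in with the relevant ones.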

On a few of the 2016 TREC Topics the search targets were not well-defined. On these Topics our SME skills were necessarily minimized. Thus in these few Topics, even though the data itself was simple, we had to put greater reliance on the machine (in our case Mr. EDR) than on the attorney reviewers.

It bears repeating that the volume of emails has nothing to do with the ease or difficulty of the review project. This is a secondary question and is not dispositive as to how much weight you need to give to machine ranking. (Volume size may, however, have a big impact on project duration.)

We Use IST, Not CAL

Another Balanced Hybrid insight in our new version 4.0 of Predictive Coding is what we call Intelligently Spaced Training, or IST™. See Part Two of this series for more detail on IST. We now use the term IST, instead of CAL, for two reasons:

1. Our previous use of the term CAL was only to refer to the fact that our method of training was continuous, in the sense that it was ongoing throughout a document review project. The term CAL has come to mean much more than that, as will be explained, and thus our continued use of the term may cause confusion.

2. Trademark rights have recently been asserted by Professors Grossman and Cormack, who originated the acronym CAL. As they have refined the use of the mark, it now stands not only for Continuous Active Learning throughout a project, but also for a particular method of training that uses only the highest-ranked documents.

Under the Grossman-Cormack CAL method the machine training continues throughout the document review project, as it does under our IST method, but there the similarities end. Under their CAL method of predictive coding the machine trains automatically as soon as a new document is coded. Further, the document or documents to review are selected by the software itself. It is a fully automatic process. The only role of the human is to say yes or no as to the relevance of each document. The human does not select which document or documents to review next. That is controlled by the algorithm, the machine. Their software always selects and presents for review the document or documents that it considers to have the highest probability of relevance among those not already coded.
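As an illustration only, the selection logic just described might be sketched in Python along the following lines. This is my own hedged approximation of the idea, not Grossman and Cormack's actual implementation; the helper callables are hypothetical placeholders, not any vendor's real API.

```python
def cal_style_review(documents, seed_coding, train_model, rank_documents,
                     attorney_codes, stop_criterion_met):
    """Hedged sketch of a CAL-style loop: the machine always serves up its
    top-ranked uncoded document, and the human's only role is yes/no coding.
    All callables passed in are hypothetical placeholders."""
    coded = dict(seed_coding)                       # doc_id -> True/False (relevant?)
    while not stop_criterion_met(coded):
        model = train_model(coded)                  # retrain as soon as new coding arrives
        uncoded = [d for d in documents if d not in coded]
        ranked = rank_documents(model, uncoded)     # best-first by predicted relevance
        next_doc = ranked[0]                        # machine picks what to review next
        coded[next_doc] = attorney_codes(next_doc)  # human only says yes or no
    return coded
```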

The CAL method is only hybrid, like the e-Discovery Team method, in the sense of man and machine working together. But, from our perspective, it is not balanced. In fact, from our perspective the CAL method is way out of balance in favor of the machine. This may be the whole point of their method, to limit the human role as much as possible. The attorney has no role to play at all in selecting which documents to review next, and it does not matter whether the attorney understands the training process. Personally, we do not like that. We want to be in charge and fully engaged throughout. We want the computer to be our tool, not our master.


Under our IST method the attorney chooses what documents to review next. We do not need the computer’s permission. We decide whether to accept a batch of high-ranking documents from the machine, or not. The attorney may instead find documents that they think are relevant by other methods. Even if the high-ranking method of selecting training documents is used, the attorney decides the number of such documents to use and whether to supplement the machine’s selection with other training documents.

In fact, the only things IST and CAL have in common are that both processes continue throughout the life of a document review project and both are concerned with the Stop decision (when to stop the training and the project). Under both methods, after the Stopping point no new documents are selected for review and production. Instead, quality assurance methods that include sampling reviews begin. If the quality assurance tests affirm that the decision to stop review was reasonable, then the project concludes. If they fail, more training and review are initiated.

Aside from the differences in document selection between CAL and IST, the primary difference is that under IST the attorney decides when to train. The training does not occur automatically after each document, or after a specified number of documents, as in CAL, or at certain arbitrary time periods, as is common with other software. In the e-Discovery Team method of IST, which, again, stands for Intelligently Spaced (or staggered) Training, the attorney in charge decides when to train. We control the clock; the clock does not control us. The machine does not decide. Attorneys use their own intelligence to decide when to train the machine.
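By way of contrast, here is an equally hedged sketch of the IST control flow: the attorney decides what to review and training runs only when the attorney initiates a session. Again, all the callables are hypothetical placeholders, not the actual EDR interface or anyone's real API.

```python
def ist_style_review(documents, model, attorney_selects_batch, attorney_codes_batch,
                     attorney_wants_to_train_now, initiate_training_session,
                     review_ranking_shift, attorney_decides_to_stop):
    """Hedged sketch of IST-style control flow; every callable is hypothetical."""
    coded = {}
    while not attorney_decides_to_stop(coded, model):
        # The attorney picks the next batch: machine suggestions, keyword hits,
        # concept search results, key custodians, or any mix of methods.
        batch = attorney_selects_batch(documents, model, coded)
        coded.update(attorney_codes_batch(batch))
        if attorney_wants_to_train_now():
            # Training is spaced: it runs only when the attorney says so, which
            # lets the attorney observe the impact of each session on the ranking.
            model = initiate_training_session(model, coded)
            review_ranking_shift(model, documents)
    return coded
```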

This timing control allows the attorney to observe the impact of the training on the machine. It is designed to improve the communication between man and machine. That is the double-loop learning process described in Part Two as part of the insights into Active Machine Learning. The attorney trains the machine, and the machine is observed so that the trainer can learn how well the machine is doing. The attorney can learn what aspects of the relevance rule have been understood and what aspects still need improvement. Based on this student-to-teacher feedback the teacher is able to customize the next rounds of training to fit the needs of the student. This maximizes efficiency and effectiveness and is the essence of double-loop learning.

Pro-Human Approach to Hybrid Man-Machine Partnership

To wrap up the new Balanced Hybrid insights, we would like to point out that our terminology speaks of Training (IST) rather than Learning (CAL). We do this intentionally because training is consistent with our human perspective, whereas the perspective of the machine is to learn. The attorney trains and the machine learns. We favor humans. Our goal is empowerment of attorney search experts to find the truth (relevance), the whole truth (recall), and nothing but the truth (precision). Our goal is to enhance human intelligence with artificial intelligence. That is why we prefer a Balanced Hybrid approach with IST, and not CAL.

This is not to say the CAL approach of Grossman and Cormack is not good or does not work. It appears to work fine. It is just a tad too boring for us and sometimes too slow. Overall we think it is less efficient, and may sometimes even be less effective, than our Hybrid Multimodal method. But, even though it is not for us, it may well be great for many beginners. It is very easy and simple to operate. From language in the Grossman-Cormack patents, that appears to be what they are going for – simplicity and ease of use. They have that, and a growing body of evidence that it works. We wish them well, and also their software and CAL methodology.


I expect Grossman and Cormack, and others in the pro-machine camp, to move beyond the advantages of simplicity and also argue safety issues. I expect them to argue that it is safer to rely on AI because a machine is more reliable than a human, in the same way that Google’s self-driving car is safer and more reliable than a human-driven car. Of course, unlike driving a car, they still need a human, an attorney, to decide yes or no on relevance, and so they are stuck with human reviewers. They are stuck with at least a partial Hybrid method, albeit one favoring the machine side of the partnership as much as possible. We do not think the pro-machine approach will work with attorneys, nor should it. We think that only an unabashedly pro-human approach like ours is likely to be widely adopted in the legal marketplace.

The goal of the pro-machine approach of Professors Cormack and Grossman, and others, is to minimize human judgments, no matter how skilled, and thereby reduce as much as possible the influence of human error and outright fraud. This is a part of a larger debate in the technology world. We respectfully disagree with this approach, at least in so far as legal document review is concerned. (Personally I tend to agree with it in so far as the driving of automobiles is concerned.) We instead seek enhancement and empowerment of attorneys by technology, including quality controls and fraud detection. See Why the ‘Google Car’ Has No Place in Legal Search. No doubt you will be hearing more about this interesting debate in the coming years. It may well have a significant impact on technology in the law, the quality of justice, and the future of lawyer employment.


To be continued …


Predictive Coding 4.0 – Nine Key Points of Legal Document Review and an Updated Statement of Our Workflow – Part Two

September 18, 2016

In Part One we announced the latest enhancements to our document review method, the upgrade to Predictive Coding 4.0. We explained the background that led to this upgrade – the TREC research and hundreds of projects we have done since our last upgrade a year ago. Millions have been spent to develop the software and methods we now use for Technology Assisted Review (TAR). As a result our TAR methods are more effective and simpler than ever.

The nine insights we will share are based on our experience and research. Some of our insights may be complicated, especially our lead insight on Active Machine Learning covered in this Part Two with our new description of IST, Intelligently Spaced Training. We consider IST the smart, human-empowering alternative to CAL. If I am able to write these insights up correctly, their obviousness should come through. They are all simple in essence. The insights and methods of Predictive Coding 4.0 document review are partially summarized in the chart below (which you are free to reproduce without edit).

[Chart: summary of the Predictive Coding 4.0 insights and methods]

1st of the Nine Insights: Active Machine Learning

Our method is Multimodal in that it uses all kinds of document search tools. Although we emphasize active machine learning, we do not rely on that method alone. Our method is also Hybrid in that we use both machine judgments and human (lawyer) judgments. Moreover, in our method the lawyer is always in charge. We may take our hand off the wheel and let the machine drive for a while, but under our versions of Predictive Coding, we watch carefully. We remain ready to take over at a moment’s notice. We do not rely on one brain to the exclusion of another. See, e.g., Why the ‘Google Car’ Has No Place in Legal Search (cautioning against over-reliance on fully automated methods of active machine learning). Of course the converse is also true; we never rely on our human brain alone. It has too many limitations. We enhance our brain with predictive coding algorithms. We add to our own natural intelligence with artificial intelligence. The perfect balance between the two, the Balanced Hybrid, is another of the insights that we will discuss later.

Active Machine Learning is Predictive Coding – Passive Analytic Methods Are Not

Even though our methods are multimodal and hybrid, the primary search method we rely on is Active Machine Learning. The overall name of our method is, after all, Predictive Coding. And, as any information retrieval expert will tell you, predictive coding means active machine learning. That is the only true AI method. The passive type of machine learning that some vendors use under the name Analytics is NOT the same thing as Predictive Coding. These passive Analytics have been around for years and are far less powerful than active machine learning.

These search methods, which used to be called Concept Search, were a big improvement upon relying on keyword search alone. I remember talking about concept search techniques in reverent terms when I did my first Legal Search webinar in 2006 with Jason Baron and Professor Doug Oard. That same year, Kroll Ontrack bought one of the original developers and patent holders of concept search, Engenium. For a short time in 2006 and 2007, Kroll Ontrack was the only vendor to have these concept search tools. The founder of Engenium, David Chaplin, came with the purchase and became Kroll Ontrack’s VP of Advanced Search Technologies for three years. (Here is an interesting interview of Chaplin that discusses what he and Kroll Ontrack were doing with advanced search analytic-type tools when he left in 2009.)

But search was hot, and soon boutique search firms like Clearwell, Cataphora, Content Analyst (the company recently purchased by popular newcomer kCura), and other e-discovery vendors developed their own concept search tools. Again, they were all using passive machine learning. It was a big deal ten years ago. For a good description of these admittedly powerful, albeit now dated, search tools, see the concise, well-written article by D4’s Tom Groom, The Three Groups of Discovery Analytics and When to Apply Them.

Search experts and information scientists know that active machine learning, also called supervised machine learning, was the next big step in search after concept searches, which are, in programming language, also known as passive or unsupervised machine learning. I am getting out of my area of expertise here, and so am unable to go into any details, other than to present the instructional chart below by Hackbright Academy that sets forth key differences between supervised learning (predictive coding) and unsupervised learning (analytics, aka concept search).

[Chart by Hackbright Academy: supervised versus unsupervised machine learning algorithms]

What I do know is that the bona fide active machine learning software in the market today all uses either a form of Logistic Regression, including Kroll Ontrack, or SVM, which means Support Vector Machine.
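To make the supervised-versus-passive distinction concrete, here is a minimal sketch in Python of the general idea of ranking documents by probability of relevance with logistic regression, using the open-source scikit-learn library. This is a generic illustration only; it is not Kroll Ontrack's patented algorithm or any vendor's actual implementation, and the document texts and labels are made up.

```python
# Generic sketch of supervised (active) learning for document ranking.
# Hypothetical toy data; NOT any vendor's actual predictive coding engine.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Documents the attorney has already coded (1 = relevant, 0 = not relevant)
coded_texts = ["budget spreadsheet for the project", "lunch plans for friday"]
coded_labels = [1, 0]

# Uncoded documents awaiting ranking
uncoded_texts = ["revised project budget attached", "company picnic reminder"]

vectorizer = TfidfVectorizer()
X_coded = vectorizer.fit_transform(coded_texts)
X_uncoded = vectorizer.transform(uncoded_texts)

model = LogisticRegression()
model.fit(X_coded, coded_labels)

# Probability of relevance for each uncoded document, highest first
probabilities = model.predict_proba(X_uncoded)[:, 1]
for text, prob in sorted(zip(uncoded_texts, probabilities), key=lambda p: p[1], reverse=True):
    print(f"{prob:.2f}  {text}")
```

The unsupervised, concept-search style tools discussed above would instead cluster or group the documents with no coded labels at all; the attorney's yes/no labels, and the retraining on them, are what make this active machine learning.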

e-Discovery Vendors Have Been Market Leaders in Active Machine Learning Software

After Kroll Ontrack absorbed the Engenium purchase, and its founder Chaplin completed his contract with Kroll Ontrack and moved on, Kroll Ontrack focused their efforts on the next big step, active machine learning, aka predictive coding. They have always been that kind of cutting-edge company, especially when it comes to search, which is one reason they are one of my personal favorites. A few of the other then-leading e-discovery vendors did too, including especially Recommind and the Israel-based search company Equivio. Do not get me wrong, the concept search methods, now being sold under the name of TAR Analytics, are powerful search tools. They are a part of our multimodal tool-kit and should be part of yours. But they are not predictive coding. They do not rank documents according to your external input, your supervision. They do not rely on human feedback. They group documents according to passive analytics of the data. It is automatic, unsupervised. These passive analytic algorithms can be good tools for efficient document review, but they are not active machine learning and are nowhere near as powerful.


Search Software Ghosts

Many of the software companies that made the multi-million dollar investments necessary to go to the next step and build document review platforms with active machine learning algorithms have since been bought out by big-tech and repurposed out of the e-discovery market. They are the ghosts of legal search past. Clearwell was purchased by Symantec and has since disappeared. Autonomy was purchased by Hewlett Packard and has since disappeared. Equivio was purchased by Microsoft and has since disappeared. See e-Discovery Industry Reaction to Microsoft’s Offer to Purchase Equivio for $200 Million – Part One and Part Two. Recommind was recently purchased by OpenText and, although it is too early to tell for sure, may also soon disappear from e-Discovery.

Slightly outside of this pattern, but with the same ghosting result, the e-discovery search company Cataphora was bought by Ernst & Young and has since disappeared. The year after the acquisition, Ernst & Young added predictive coding features from Cataphora to its internal discovery services. At this point, all of the Big Four accounting firms claim to have their own proprietary software with predictive coding. Along the same lines, at about the time of the Cataphora buy-out, consulting giant FTI purchased another e-discovery document review company, Ringtail Solutions (known for its petri-dish-like visualizations). Although not exactly ghosted from the e-discovery world after the purchase, Ringtail has been absorbed into the giant FTI.

Outside of consulting/accountancy, in the general-service e-discovery industry for lawyers, there are, at this point (late 2016), just a few document review platforms left that have real active machine learning. Some of the most popular ones remaining certainly do not. They only have passive learning analytics. Again, those are good features, but they are not active machine learning, one of the nine basic insights of Predictive Coding 4.0 and a key component of the e-Discovery Team’s document review capabilities.


The power of the advanced, active learning technologies that have been developed for e-discovery is the reason for all of these acquisitions by big-tech and the big-4 or 5. It is not just about wild overspending, although that may well have been the case for Hewlett Packard’s payment of $10.3 billion to buy Autonomy. The ability to do AI-enhanced document search and review is a very valuable skill, one that will only increase in value as our data volumes continue to explode. The tools used for such document review are also quite valuable, both inside the legal profession and, as the ghostings prove, well beyond into big business. See e-Discovery Industry Reaction to Microsoft’s Offer to Purchase Equivio for $200 Million – Part Two.

The indisputable fact that so many big-tech companies have bought up the e-discovery companies with active machine learning software should tell you a lot. It is a testimony to the advanced technologies that the e-discovery industry has spawned. When it comes to advanced search and document retrieval, we in the e-discovery world are the best in the world, my friends, primarily because we have (or can easily get) the best tools. Smile.


Search is king of our modern Information Age culture. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers. The search for evidence to peacefully resolve disputes is, in my most biased opinion, the most important search of all. It sure beats selling sugar water. Without truth and justice all of the petty business quests for fame and fortune would crumble into anarchy, or worse, dictatorship.

With this background it is easy to understand why some of the e-discovery vendors left standing are not being completely candid about the capabilities of their document review software. (It is called puffing and is not illegal.) The industry is unregulated and, alas, most of our expert commentators are paid by vendors. They are not independent. As a result, many of the lawyers who have tried what they thought was predictive coding, and had disappointing results, have never really tried predictive coding at all. They have just used slightly updated concept search.

Alternatively, some of the disappointed lawyers may have used one of the many now-ghosted vendor tools. They were all early version 1.0 type tools. For example, Clearwell’s active machine learning feature was only on the market for a few months before they were bought and ghosted by Symantec. (I think Jason Baron and I were the first people to see an almost completed demo of their product at a breakfast meeting a few months before it was released.) Recommind’s predictive coding software was well-developed at the time of their sell-out, but not its methods of use. Most of its customers can testify as to how difficult it is to operate. That is one reason that OpenText was able to buy them so cheaply, which, we now see, was part of their larger acquisition plan culminating in the purchase of Dell’s EMC document management software.

All software still using early methods, what we call version 1.0 and 2.0 methods based on control sets, is cumbersome and hard to operate, not just Recommind’s system. I explained this in my article last year, Predictive Coding 3.0. I also mentioned in that article that some vendors with predictive coding would only let you use predictive coding for search. It was, in effect, mono-modal. That is also a mistake. All types of search must be used – multimodal – for the predictive coding type of search to work efficiently and effectively. More on that point later.

Maura Grossman Also Blows the Whistle on Ineffective “TAR tools”


Maura Grossman, who is now an independent expert in this field, made many of these same points in a recent interview with Artificial Lawyer, a periodical dedicated to AI and the Law. AI and the Future of E-Discovery: AL Interview with Maura Grossman (Sept. 16, 2016). When asked about the viability of the “over 200 businesses offering e-discovery services” Maura said, among other things:

In the long run, I am not sure that the market can support so many e-discovery providers …

… many vendors and service providers were quick to label their existing software solutions as “TAR,” without providing any evidence that they were effective or efficient. Many overpromised, overcharged, and underdelivered. Sadly, the net result was a hype cycle with its peak of inflated expectations and its trough of disillusionment. E-discovery is still far too inefficient and costly, either because ineffective so-called “TAR tools” are being used, or because, having observed the ineffectiveness of these tools, consumers have reverted back to the stone-age methods of keyword culling and manual review.

Now that Maura is no longer with the conservative law firm of Wachtell Lipton, she has more freedom to speak her mind about caveman lawyers. It is refreshing and, as you can see, echoes much of what I have been saying. But wait, there is still more that you need to hear from the interview of the new Professor Grossman:

It is difficult to know how often TAR is used given confusion over what “TAR” is (and is not), and inconsistencies in the results of published surveys. As I noted earlier, “Predictive Coding”—a term which actually pre-dates TAR—and TAR itself have been oversold. Many of the commercial offerings are nowhere near state of the art; with the unfortunate consequence that consumers have generalised their poor experiences (e.g., excessive complexity, poor effectiveness and efficiency, high cost) to all forms of TAR. In my opinion, these disappointing experiences, among other things, have impeded the adoption of this technology for e-discovery. …

Not all products with a “TAR” label are equally effective or efficient. There is no Consumer Reports or Underwriters Laboratories (“UL”) that evaluates TAR systems. Users should not assume that a so-called “market leading” vendor’s tool will necessarily be satisfactory, and if they try one TAR tool and find it to be unsatisfactory, they should keep evaluating tools until they find one that works well. To evaluate a tool, users can try it on a dataset that they have previously reviewed, or on a public dataset that has previously been labelled; for example, one of the datasets prepared for the TREC 2015 or 2016 Total Recall tracks. …

She was then asked another popular question by the Artificial Lawyer interviewer (never identified by name), whose publication is apparently based in the UK:

As is often the case, many lawyers are fearful about any new technology that they don’t understand. There has already been some debate in the UK about the ‘black box’ effect, i.e., barristers not knowing how their predictive coding process actually worked. But does it really matter if a lawyer can’t understand how algorithms work?

The following is an excerpt of Maura’s answer. I suggest you consult the full article for a complete picture. AI and the Future of E-Discovery: AL Interview with Maura Grossman (Sept. 16, 2016). I am not sure whether she put on her Google Glasses to answer (probably not), but anyway, I rather like it.

Many TAR offerings have a long way to go in achieving predictability, reliability, and comprehensibility. But, the truth that many attorneys fail to acknowledge is that so do most non-TAR offerings, including the brains of the little black boxes we call contract attorneys or junior associates. It is really hard to predict how any reviewer will code a document, or whether a keyword search will do an effective job of finding substantially all relevant documents. But we are familiar with these older approaches (and we think we understand their mechanisms), so we tend to be lulled into overlooking their limitations.

The brains of the little black boxes we call contract attorneys or junior associates. So true. We will go into that more thoroughly in our discussion of the GIGO & QC insight.

Recent Team Insights Into Active Machine Learning

To summarize what I have said so far, in the field of legal search, only active machine learning:

  • effectively enhances human intelligence with artificial intelligence;
  • qualifies for the term Predictive Coding.

I want to close this discussion of active machine learning with one more insight. This one is slightly technical, and again, if I explain it correctly, should seem perfectly obvious. It is certainly not new, and most search experts will already know this to some degree. Still, even for them, there may be some nuances to this insight that they have not thought of. It can be summarized as follows: active machine learning should have a double feedback loop with active monitoring by the attorney trainers.


Active machine learning should create feedback for both the algorithm (the data classified) AND the human managing the training. Both should learn, not just the robot. They should, so to speak, be friends. They should get to know each other.

Many predictive coding methods that I have read about, or heard described, including how I first used active machine learning, did not sufficiently include the human trainer in the feedback loop. They were static types of training using a single feedback loop. These methods are, so to speak, very stand-offish, aloof. Under these methods the attorney trainer does not even try to understand what is going on with the robot. The information flow is one-way, from attorney to machine.

As I grew more experienced with the EDR software I started to realize that it is possible to understand, at least a little, what the black box is doing. Logistic-based AI is a foreign intelligence, but it is intelligence. After a while you start to understand it. So although I started out just using one-sided machine training, I slowly gained the ability to read how EDR was learning. I then added another dimension, another feedback loop, and a very interesting one indeed. Now I not only trained and provided feedback to the AI as to whether its predictions of relevance were correct, or not, but I also received training from the AI as to how well, or not, it was learning. That in turn led to the humorous personification of the Kroll Ontrack software that we now call Mr. EDR. See MrEDR.com. When we reached this level, machine training became a fully active, two-way process.

We now understand that to fully supervise a predictive coding process you have to have a good understanding of what is happening. How else can you supervise it? You do not have to know exactly how the engine works, but you at least need to know how fast it is going. You need a speedometer. You also need to pay attention to how the engine is operating, whether it is over-heating, needs oil or gas, etc. The same holds true for teaching humans. Their brains are indeed mysterious black boxes. You do not need to know exactly how each student’s brain works in order to teach them. You find out if your teaching is getting through by asking questions.

For us supervised learning means that the human attorney has an active role in the process. A role where the attorney trainer learns by observing the trainee, the AI in creation. I want to know as much as possible, so long as it does not slow me down significantly.

In other methods of using predictive coding that we have used or seen described, the only role of the human trainer is to say yes or no as to the relevance of a document. The decision as to what documents to select for training has already been predetermined. Typically it is the highest-ranked documents, but sometimes some mid-ranked “uncertain documents” or some “random documents” are added to the mix. The attorney has no say in what documents to look at. They are all fed to him or her according to predetermined rules. These decision-making rules are set in advance and do not change. These active machine learning methods work, but they are slow, and less precise, not to mention boring as hell.

The recall of these single-loop passive supervision methods may also not be as good. The jury is still out on that question. We are trying to run experiments on that now, although it can be hard to stop yawning. See an earlier experiment on this topic testing the single loop teaching method of random selection: Borg Challenge: Report of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology.

These mere yes-or-no, limited-participation methods are hybrid Man-Machine methods, but, in our opinion, they are imbalanced towards the Machine. (Again, more on the question of Hybrid Balance will be covered in the next installment of this article.) This distinction between single and dual feedback seems to be the basic idea behind the Double-Loop Learning approach to human education depicted in the diagram below. Also see Graham Attwell, Double Loop Learning and Learning Analytics (Pontydysgu, May 4, 2016).

[Diagram: double-loop learning]

To quote Wikipedia:

The double loop learning system entails the modification of goals or decision-making rules in the light of experience. The first loop uses the goals or decision-making rules, the second loop enables their modification, hence “double-loop.” …

Double-loop learning is contrasted with “single-loop learning”: the repeated attempt at the same problem, with no variation of method and without ever questioning the goal. …

Double-loop learning is used when it is necessary to change the mental model on which a decision depends. Unlike single loops, this model includes a shift in understanding, from simple and static to broader and more dynamic, such as taking into account the changes in the surroundings and the need for expression changes in mental models.


The method of active machine learning that we use in Predictive Coding 4.0 is a type of double loop learning system. As such it is ideal for legal search, which is inherently ad hoc, where even the understanding of relevance evolves as the project develops. As Maura noted near the end of the Artificial Lawyer interview:

… e-discovery tends to be more ad hoc, in that the criteria applied are typically very different for every review effort, so each review generally begins from a nearly zero knowledge base.

The driving impetus behind our double feedback loop system is to allow training document selection to vary according to the circumstances encountered. Attorneys select documents for training and then observe how these documents impact the AI’s overall ranking of the documents. Based on this information, decisions are then made by the attorney as to which documents to next submit for training. A single fixed mental model is not used, such as only submitting the ten highest-ranked documents for training.

The human stays involved and engaged and selects the next documents to add to the training based on what she sees. This makes the whole process much more interesting. For example, if I find a group of relevant spreadsheets by some other means, such as a keyword search, then, when I add these documents to the training, I observe how they impact the overall ranking of the dataset. For instance, did this training result in an increase in the relevance ranking of other spreadsheets? Was the increase nominal or major? How did it impact the ranking of other documents? For instance, were emails with a lot of numbers in them suddenly ranked much higher? Overall, was this training effective? Were the documents that moved up in rank to the top, or near top, of probable relevance in fact relevant as predicted? What was the precision rate like for these documents? Does the AI now have a good understanding of the relevance of spreadsheets, or does it need more training on that type of document? Should we focus our search on other kinds of documents?
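A hedged sketch of what that observation step might look like in code follows. The scores and document names are hypothetical illustrations of the double-loop idea, not actual EDR output or reporting features.

```python
# Hedged sketch of double-loop observation: compare each document's probable
# relevance score before and after a training session to see what the newly
# added training documents taught the machine. All values are hypothetical.

before = {"spreadsheet_07.xls": 0.42, "email_123.msg": 0.38, "memo_55.doc": 0.71}
after = {"spreadsheet_07.xls": 0.88, "email_123.msg": 0.61, "memo_55.doc": 0.70}

for doc_id in before:
    shift = after[doc_id] - before[doc_id]
    print(f"{doc_id}: {before[doc_id]:.2f} -> {after[doc_id]:.2f} ({shift:+.2f})")

# A big jump for other spreadsheets after training on a few relevant ones
# suggests the classifier is generalizing; little or no movement suggests it
# needs more, or different, examples of that document type.
```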

You see all kinds of variations on that. If the spreadsheet understanding (ranking) is good, how does it compare to its understanding (correct ranking) of Word docs or emails? Where should I next focus my multimodal searches? What documents should I next assign to my reviewers to read and make a relevancy determination? These kinds of considerations keep the search interesting, fun even. Work as play is the best kind. Typically we simply assign for attorney review the documents that have the highest ranking (which is the essence of what Grossman and Cormack call CAL), but not always. We are flexible. We, the human attorneys, are the second positive feedback loop.

We like to remain in charge of teaching the classifier, the AI. We do not just turn it over to the classifier to teach itself. Although sometimes, when we are out of ideas and are not sure what to do next, we will do exactly that. We will turn over to the computer the decision of what documents to review next. We just go with his top predictions and use those documents to train. Mr. EDR has come through for us many times when we have done that. But this is more the exception than the rule. After all, the classifier is a tabula rasa. As Maura put it: each review generally begins from a nearly zero knowledge base. Before the training starts, it knows nothing about document relevance. The computer does not come with built-in knowledge of the law or relevance. You know what you are looking for. You know what is relevant, even if you do not know how to find it, or even whether it exists at all. The computer does not know what you are looking for, aside from what you have told it by your yes-no judgments on particular documents. But, after you teach it, it knows how to find more documents that probably have the same meaning.

By observation you can see for yourself, first hand, how your training is working, or not working. It is like a teacher talking to their students to find out what they learned from the last assigned reading materials. You may be surprised by how much, or how little, they learned. If the last approach did not work, you change the approach. That is double-loop learning. In that sense our active monitoring approach is like a continuous dialogue. You learn how, and if, the AI is learning. This in turn helps you to plan your next lessons. What has the student learned? Where does the AI need more help to understand the conception of relevance that you are trying to teach it?

This monitoring of the AI’s learning is one of the most interesting aspects of active machine learning. It is also a great opportunity for human creativity and value. The inevitable advance of AI in the law can mean more jobs for lawyers overall, but only for those able to step up and change their methods. The lawyers able to play the second-loop game of active machine learning will have plenty of employment opportunities. See, e.g., Thomas H. Davenport, Julia Kirby, Only Humans Need Apply: Winners and Losers in the Age of Smart Machines (Harper 2016).

Going down into the weeds a little bit more, our active monitoring, dual feedback approach means that when we use Kroll Ontrack’s EDR software, we adjust the settings so that new learning sessions are not created automatically. They only run when and if we click on the Initiate Session button shown in the EDR screenshot below (arrow and words were added). We do not want the training to go on continuously in the background (typically meaning at periodic intervals of every thirty minutes or so). We only want the learning sessions to occur when we say so. In that way we can know exactly what documents EDR is training on during a session. Then, when that training session is complete, we can see how the input of those documents has impacted the overall data ranking. For instance, are there now more documents in the 90% or higher probable relevance category, and if so, how many? The picture below is of a completed TREC project. The probability rankings are on the far left with the number of documents shown in the adjacent column. Most of the documents in the 290,099-document collection of Bush email were in the 0-5% probable relevance ranking, which is not included in the screenshot.

[Screenshot: EDR “Initiate Session” button and the probability ranking summary for a completed TREC project]
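The kind of post-session summary shown in that screenshot can be approximated with a few lines of generic code. This is only an illustration of bucketing documents by probable relevance after a training session; it is not the EDR interface, and the scores are made up.

```python
# Hedged sketch: count documents in each probable-relevance band after a
# training session, similar in spirit to the ranking summary described above.
# The probability scores are hypothetical, not real EDR output.
from collections import Counter

scores = [0.97, 0.93, 0.72, 0.41, 0.08, 0.03, 0.02]   # one score per document

def band(p):
    low = int(p * 100) // 5 * 5            # 5%-wide bands: 0-5%, 5-10%, ...
    return f"{low}-{low + 5}%"

counts = Counter(band(p) for p in scores)
for label, n in sorted(counts.items(), key=lambda kv: int(kv[0].split("-")[0]), reverse=True):
    print(f"{label:>8}  {n} documents")
```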

This means that the e-Discovery Team’s active learning is not continuous, in the sense of always training. It is instead intelligently spaced. That is an essential aspect of our Balanced Hybrid approach to electronic document review. The machine training only begins when we click on the “Initiate Session” button in EDR that the arrow points to. It is only continuous in the sense that the training continues until all human review is completed. The spaced training, in the sense of staggered  in time, is itself an ongoing process until the production is completed. We call this Intelligently Spaced Training or IST. Such ongoing training improves efficiency and precision, and also improves Hybrid human-machine communications. Thus, in our team’s opinion, IST is a better process of electronic document review than training automatically without human participation, the so-called CAL approach promoted (and recently trademarked) by search experts and professors, Maura Grossman and Gordon Cormack.


Exactly how we space out the timing of training in IST is a little more difficult to describe without going into the particulars of a case. A full, detailed description would require the reader to have intimate knowledge of the EDR software. Our IST process is, however, software neutral. You can follow the IST dual feedback method of active machine learning with any document review software that has active machine learning capacities and also allows you to decide when to initiate a training session. (By the way, a training session is the same thing as a learning session, but we like to say training, not learning, as that takes the human perspective, and we are pro-human!) You cannot do that if the training is literally continuous and cannot be halted while you input a new batch of relevance-determined documents for training.

The details of IST, such as when to initiate a training session and which human-coded documents to select next for training, are worked out ad hoc. They depend on the data itself, the issues involved in the case, the progress made, the stage of the review project and time factors. This is the kind of thing you learn by doing. It is not rocket science, but it does help keep the project interesting. Hire one of our team members to guide your next review project and you will see it in action. It is easier than it sounds. With experience Hybrid Multimodal IST becomes an intuitive process, much like riding a bicycle.

To summarize, active machine learning should be a dual feedback process with double-loop learning. The training should continue throughout a project, but it should be spaced in time so that you can actively monitor the progress, what we call IST. The software should learn from the trainer, of course, but the trainer should also learn from the software. This requires active monitoring by the teacher, who reacts to what he or she sees and adjusts the training accordingly so as to maximize recall and precision.

This is really nothing more than a common sense approach to teaching. No teacher who just mails in their lessons, and does not pay attention to the students, is ever going to be effective. The same is true for active machine learning. That’s the essence of the insight. Simple really.

Next, in Part Three, I will address the related insights of Balanced Hybrid.

To be Continued …


Predictive Coding 4.0 – Nine Key Points of Legal Document Review and an Updated Statement of Our Workflow – Part One

September 11, 2016

This blog introduces the e-Discovery Team’s latest insights and methods of document review. We call this Predictive Coding 4.0 because it substantially improves upon, and replaces, the methods and insights we announced in our October 2015 publication – Predictive Coding 3.0. In that two-part blog we explained the history of predictive coding software and methods in legal review, including versions 1.0 and 2.0. Then we described our new version 3.0 in some detail. Since that publication we have developed more enhancements to our methods, including many new, innovative uses of the predictive coding features of Kroll Ontrack’s EDR software. We even developed some new features not related to predictive coding. (Try out the new Folder Similar search in EDR, for example.) Most of our new insights, just like our prior 3.0 version methodologies, can also be used on other software platforms. To use all of the features, however, the software will have to have bona fide active machine learning capacities. Most do not. More on that later.

These improvements naturally evolved to a certain degree as part of the e-Discovery Team members’ normal work supervising hundreds, maybe even thousands, of document review projects over the past year. But the new insights that require us to make a complete restatement, a new Version 4.0, arose just recently. Major advances were attained as part of an intensive three months of experiments, all conducted outside of our usual legal practice and document reviews. The e-Discovery Team doing this basic research consisted of myself and several of Kroll Ontrack’s top document review specialists, including especially Jim Sullivan and Tony Reichenberger. They have now fully mastered the e-Discovery Team search and review Hybrid Multimodal methodologies. As far as I can see, at this point in the race for the highest quality legal document review, no one else comes even close to their skill level. Yes, e-discovery is highly competitive, but they trained hard and are now looking back and smiling.


The insights we gained, and the skills we honed, including speed, did not come easily. It took full-time work on client projects all year, plus three full months of research, often in lieu of real summer vacations (my wife is still waiting). This is hard work, but we love it. See: Why I Love Predictive Coding. This kind of dedication of time and resources by an e-discovery vendor or law firm is unprecedented. There is a cost to attain the research benefits realized, both hard out-of-pocket costs and lost time. So I hope you understand that we are only going to share some of our techniques. The rest we will keep as trade secrets. (Retain us and watch. Then you can see them in action.)


Kroll Ontrack understands the importance of pure research and enthusiastically approved these expenditures. (My thanks again to CEO Mark Williams, a true visionary leader in this industry who approved and supported the research program.) I suggest you ask your vendor, or law firm, how much time they spent last year researching and experimenting with document review methods? As far as we know, the only other vendor with an active research program is Catalyst, whose work is also to be commended. (No one else showed up for TREC.) The only other law firm we know of is Maura Grossman’s new solo practice. Her time spent with research is also impressive.

The results we attained certainly make this investment worthwhile, even if many in the profession do not realize it, much less appreciate it. They will in time, so will the consumers. This is a long term investment. Pure research is necessary for any technology company, including all companies in the e-Discovery field. The same holds true, albeit to a lesser extent, to any law firm claiming to have technological superiority.

Experience from handling live projects alone is too slow an incubator for the kind of AI breakthrough technologies we are now using. It is also too inhibiting. You do not experiment on important client data or review projects. Any expert will improvise somewhat during such projects to match the circumstances, and sometimes do post hoc analysis. But such work on client projects alone is not enough. Pure research is needed to continue to advance in AI-enhanced review. That is why the e-Discovery Team spent a substantial part of our waking hours in June, July and August 2016 working on experiments with the Jeb Bush email. The Jeb Bush email collection was our primary laboratory this year. As a result of the many new things we learned, and new methods practiced and perfected, we have now reached a point where a complete restatement of our method is in order. Thus we here release Predictive Coding 4.0.

Our latest breakthroughs this summer primarily came out of the e-Discovery Team’s participation in the annual Text Retrieval Conference, aka TREC, sponsored by the National Institute of Standards and Technology. This is the 25th year of the TREC event. We were honored to again participate, as we did last year, in the Total Recall Track of TREC. This is the closest Track that TREC now offers to a real legal review project. It is not a Legal Track, however, and so we necessarily did our own side-experiments and had our own unique approach, different from the universities that participated. The TREC leadership of the Total Recall Track was once again in the capable hands of Maura Grossman, Gordon Cormack and other scientists.

This blog will not report on the specifics of the 2016 Total Recall Track. That will come at a later time, after we finish analyzing the enormous amount of data we generated and submit our formal reports to TREC. In any event, the TREC-related work we did this summer went beyond the thirty-four research topics included in the TREC event. It went well beyond the 9,863,366 documents we reviewed with Mr. EDR’s help as part of the formal submittals. Countless more documents were reviewed for relevance if you include our side-experiments.

At the same time that we did the formal tests specified by the Total Recall Track, we did multiple side-experiments of our own. Some of these tests are still ongoing. We did so to investigate our own questions that are unique to legal search and thus beyond the scope of the Total Recall Track. We also performed experiments to test unique attributes of Kroll Ontrack’s EDR software. It uses a proprietary type of logistic regression algorithm that was awarded a patent this year. Way to go, KO and Mr. EDR!

Although this blog will not report on our TREC experiments per se, we will share the bottom line, the take-aways of this testing. Not everything will be revealed. We keep some of our methods and techniques trade-secret.

We will also not be discussing in this multi-part blog our future plans and spin-off projects. Let’s just say for now that we have several in mind. One in particular will, I think, be very exciting for all attorneys and paralegals who do document review. Maybe even fun for those of you who, like us, are really into and enjoy a good computer search. You know who you are! If my recommendations are accepted, we will open that one up to all of our fellow doc-review freaks. I will say no more at this point, but watch for announcements in the coming year from Kroll Ontrack and me. We are having too much fun here not to share some of the good times.

Even if we did adopt 100% transparency on our methods, it would take a book to write it all down, and it would still be incomplete. Many things can only be learned by doing, especially methods. Document review is, after all, a part of legal practice. As the scientists like to put it, legal search is essentially ad hoc. It changes and is customized to fit the particular evidence search assignments at hand. But we will try to share all of the basic insights. They have all been discussed here before. The new insights we gained are more like a deepening understanding and matter of emphasis. They are refinements, not radical departures, although some are surprising.

Nine Insights Concerning the Use of Predictive Coding in Legal Document Review

The diagram below summarizes the nine basic insights that have come out of our work this year. These are the key concepts that we now think are important to understand and implement. [Just like the 8-Step Workflow diagram above, this and other diagrams in this blog may be freely used with attribution, but please do not change anything without my permission. I am also happy to provide you with higher resolution graphics if needed for presentation or publication purposes.]

The diagrams above and following will be explained in detail throughout the rest of this multipart blog, as will the restated 8-Step Workflow shown at the top of the page. These are not new concepts. I have discussed most of these here before. I am confident that all readers will be able to follow along as I set forth the new nuances we learned.

Although these concepts are all familiar, some of our deepened understanding of these concepts may surprise you. Some were surprising to us. These insights include several changes in thinking on our part. Some of the research results we saw were unexpected. But we follow the data. Our opinions are always held lightly. I have argued both sides of a legal issue too many times as a lawyer to fall into that trap. Our thinking follows the evidence, not our preconceptions. That is, after all, the whole point of research. Schedule permitting, we are also happy to provide in-person or online presentations that explain these concept-summary diagrams. If retained, you can also see it in action.

Although these insights and experiments were derived using Kroll Ontrack EDR software, they are essentially vendor neutral. The methods will work on any full-featured document review platform, especially those that include bona fide active machine learning abilities, aka Predictive Coding. As all experts in this field know, many of the most popular document review platforms do not have these features, even those stating they use Analytics. Active Machine Learning is very different from, and far more advanced than, Analytics, the early forms of which were called Concept Search. That type of machine learning is passive and clearly is not predictive coding. It has its place in any multimodal system such as ours, and can be a powerful feature to improve search and review. But such software is incomplete and cannot meet the standards and capability of software that includes active machine learning. Only full-featured document review platforms with active machine learning abilities can use all of the Predictive Coding 4.0 methods described here.
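For readers who want to see the difference in kind, rather than just take my word for it, here is a minimal, vendor-neutral sketch in Python. Everything in it is hypothetical, including the tiny document set and the review_and_code() placeholder for the human reviewer; the only point is that active machine learning retrains on every new round of attorney coding, while a passive concept search never learns from the reviewer at all.

```python
# Hypothetical, vendor-neutral sketch of an active learning loop: each round of
# human coding feeds the next training session. Passive analytics (concept search)
# has no such loop; it only returns documents similar to a query.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def review_and_code(doc):
    """Placeholder for the attorney's relevance call (1 = relevant, 0 = not)."""
    return 1 if "merger" in doc else 0

documents = ["merger price analysis", "holiday party invite",
             "merger board minutes", "cafeteria menu", "merger due diligence list"]
X = TfidfVectorizer().fit_transform(documents)

coded = {0: 1, 1: 0}          # seed set: two documents already coded by the attorney

for training_round in range(2):                      # each pass = one training session
    model = LogisticRegression().fit(X[list(coded)], [coded[i] for i in coded])
    uncoded = [i for i in range(len(documents)) if i not in coded]
    if not uncoded:
        break
    probs = model.predict_proba(X[uncoded])[:, 1]    # rank the unreviewed documents
    pick = uncoded[max(range(len(uncoded)), key=lambda k: probs[k])]
    coded[pick] = review_and_code(documents[pick])   # human feedback drives the next round

print("Documents coded so far:", coded)
```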

Sorry dear start-up vendors, and others, but that’s the truth. Consumers, you get what you pay for. You know that. Not sure? Get the help of an independent expert advisor before you make substantial investments in e-discovery software or choose a vendor for a major project. Also, if you have tried predictive coding, or what you were told was advanced TAR, whatever the hell that is, and it did not work well, do not blame yourself. It could be the software. Or if not the software, then the antiquated version 1.0 or 2.0 methods used. There is a lot of bullshit out there. Excuse my French. There always has been when it comes to new technology. It does, however, seem especially prevalent in the legal technology field. Perhaps they think we lawyers are naive and technologically gullible. Do not be fooled. Again, look to an independent consultant if you get confused by all the vendor claims.

Contrary to what some vendors will tell you (typically the ones without bona fide predictive coding features), predictive coding 3.0, and now 4.0, methods are not rocket science. You do not have to be a TAR-whisperer or do nothing but search, like my A-team for TREC. With good software it is not really that hard at all. These methods do, however, require an attorney knowledgeable in e-discovery and comfortable with software. This is not for novices. But every law firm should have attorneys with special training and experience in technology and e-discovery anyway. For instance, if you practice in the Northern District of California, an e-discovery liaison with such expertise is required in most cases. See Guidelines for the Discovery of Electronically Stored Information. Almost half of the Bar Associations in the U.S. require basic technology competence as an ethical imperative. See, e.g., ABA Model Rule 1.1, Comment [8] and Robert Ambrogi’s list of 23 states, and counting, that now require such competence. (My own law firm has had an e-discovery liaison program in place since 2010, which I lead and train. I am proud to say that after six years of work it is now a great success.) So no, you do not have to be a full-time specialist, like the members of my TREC e-Discovery team, to successfully use AI-enhanced review, which we call Hybrid Multimodal. This is especially true when you work with vendors like Kroll Ontrack, Catalyst and others that have teams of special consultants to guide you. You just have to pick your vendors wisely.

To be continued …


The Law’s “Reasonable Man,” Judge Haight, Love, Truth, Justice, “Go Fish” and Why the Legal Profession Is Not Doomed to be Replaced by Robots

June 29, 2016

Reasonability is a core concept of the law and foundation of our system of justice. Reason, according to accepted legal doctrine, is how we judge the actions of others and determine right from wrong. We do not look to Truth and Love for Justice, we look to Truth and Reason. If a person’s actions are reasonable, then, as a general matter, they are good and should not be punished, no matter what the emotional motives behind the actions. It is an objective standard. Actions judged as unreasonable are not good, no matter the emotional motive (think mercy killing).

Irrational actions are discouraged by law, and, if they cause damages, they are punished. The degree of punishment slides according to how unreasonable the behavior was and the extent of damages caused. Bad behavior ranges from the barely negligent – a close question – to intentionally bad, scienter. Analysis of reasonability in turn always depends on the facts and circumstances surrounding the actions being judged.

Reasonability Depends on the Circumstances

Whenever a lawyer is asked a legal question they love to start the answer by pointing out that it all depends. We are trained to see both sides, to weigh the evidence. We dissect, assess and evaluate degrees of reasonability according to the surrounding circumstances. We deal with reason, logic and cold hard facts. Our recipe for justice is simple: add reason to facts and stir well.

The core concept of reasonability not only permeates negligence and criminal law, it underlies discovery law as well. We are constantly called upon to evaluate the reasonability of efforts to save, find and produce electronically stored information. This evaluation of reasonability always depends on the facts. It requires more than information. It requires knowledge of what the information means.

Perfect efforts are not required in the law, but reasonable efforts are. Failure to make such efforts can be punished by the court, with the severity of the punishment contingent on the degree of unreasonability and extent of damages. Again, this requires knowledge of the true facts of the efforts, the circumstances.

In discovery, litigants and their lawyers are not permitted to make anything less than reasonable efforts to find the information requested. They are not permitted to make sub-standard, negligent efforts, and certainly not grossly negligent efforts. Let us not even talk about intentionally obstructive or defiant efforts. The difference between good enough practice – meaning reasonable efforts – and malpractice is where the red line of negligence is drawn.

Bagley v. Yale

Professor Constance Bagley

One of my favorite district court judges – 86-year-old Charles S. Haight – pointed out the need to evaluate the reasonability of e-discovery efforts in a well-known, and at this time still ongoing, employment discrimination case, Bagley v. Yale, Civil Action No. 3:13-CV-1890 (CSH). See, e.g., Bagley v. Yale University, 42 F. Supp. 3d 332 (D. Conn. 2014). On April 27, 2015, Judge Haight considered Defendant’s Motion for Protective Order.

The plaintiff, Constance Bagley, wanted her former employer, Yale University, to look through the emails of more witnesses to respond to her request for production. The defendant, Yale University, said it had already done enough, that it had reviewed the emails of several custodians, and should not be required to do more. Judge Haight correctly analyzed this dispute as requiring his judgment on the reasonability of Yale’s efforts. He focused on Rule 26(b)(2)(B) involving the “reasonable accessibility” of certain ESI and the reasonable efforts requirements under then Rule 26(b)(2)(C) (now 26(b)(1) – proportionality factors under the 2015 Rules Amendments). In the judge’s words:

Yale can — indeed, it has — shown that the custodians’ responsive ESI is not readily accessible. That is not the test. The question is whether this information is not reasonably accessible: a condition that necessarily implies some degree of effort in accessing the information. So long as that creature of the common law, the reasonable man,[6] paces the corridors of our jurisprudence, surrounding circumstances matter.

[6] The phrase is not gender neutral because that is not the way Lord Coke spoke.

Bagley v. Yale, Ruling on Defendant’s Motion for Protective Order (Doc. 108) (April 27, 2015) (emphasis added).

The Pertinent e-Discovery Facts of Bagley v. Yale

Judge Haight went on to deny the motion for protective order by defendant Yale University, his alma mater, by evaluation of the facts and circumstances. Here the plaintiff originally wanted the defendant to review for relevant documents the ESI of 24 custodians that contained certain search terms. The parties later narrowed the list of terms and reduced the custodian count from 24 to 10. The defendant began a linear review of each and every document. (Yes, their plan was to have a paralegal or attorney look at each and every document with a hit, instead of more sophisticated approaches, i.e. – concept search or predictive coding.) Here is Judge Haight’s description:

Defendants’ responsive process began when University staff or attorneys commandeered — a more appropriate word than seized — the computer of each of the named custodians. The process of ESI identification and production then “required the application of keyword searches to the computers of these custodians, extracting the documents containing any of those keywords, and then reading every single document extracted to determine whether it is responsive to any of the plaintiff’s production requests and further to determine whether the document is privileged.” Defendants’ Reply Brief [Doc. 124], at 2-3. This labor was performed by Yale in-house paralegals and lawyers, and a third-party vendor the University retained for the project.

It appears from the opinion that Yale was a victim of a poorly played game of Go Fish where each side tries to find relevant documents by guessing keywords without study of the data, much less other search methods. Losey, R., Adventures in Electronic Discovery (West 2011); Child’s Game of ‘Go Fish’ is a Poor Model for e-Discovery Search. This is a very poor practice, as I have often argued, and frequently results in surprise burdens on the producing party.

This is what happened here. As Judge Haight explained, Yale did not complain of these keywords and custodian count (ten instead of five), until months later when the review was well underway:

[I]t was not until the parties had some experience with the designated custodians and search terms that the futility of the exercise and the burdens of compliance became sufficiently apparent to Defendants to complain of them.

Too bad. If they had tested the keywords first before agreeing to review all hits, instead of following the Go Fish approach, none of this would have happened. National Day Laborer Organizing Network v. US Immigration and Customs Enforcement Agency, 877 F.Supp.2d 87 (SDNY, 2012) (J. Scheindlin) (“As Judge Andrew Peck — one of this Court’s experts in e-discovery — recently put it: “In too many cases, however, the way lawyers choose keywords is the equivalent of the child’s game of `Go Fish’ … keyword searches usually are not very effective.” FN 113“); Losey, R., Poor Plaintiff’s Counsel, Can’t Even Find a CAR, Much Less Drive One (9/1/13).
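Testing keywords before committing to review every hit does not require anything exotic. Here is a minimal sketch of the kind of per-term testing the parties apparently never did, with hypothetical documents and a stand-in function for the attorney’s judgment on a small coded sample of hits; the real point is the hit counts and sampled precision estimates, not the toy data.

```python
# Hypothetical sketch: count hits per proposed keyword and estimate precision from
# a small random sample of coded hits, before agreeing to review every hit.
import random

documents = ["merger price memo", "pricing the holiday party",
             "merger closing schedule", "printer toner price quote"]  # hypothetical collection
proposed_terms = ["merger", "price"]

def is_relevant(doc):
    return "merger" in doc          # stand-in for an attorney's call on each sampled hit

for term in proposed_terms:
    hits = [d for d in documents if term in d]
    sample = random.sample(hits, min(25, len(hits)))        # code only a small sample per term
    precision = sum(is_relevant(d) for d in sample) / max(len(sample), 1)
    print(f"{term!r}: {len(hits)} hits, estimated precision {precision:.0%}")
```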

After reviewing the documents of only three custodians, following the old-fashioned, buggy-whip method of looking at one document after another (linear review), the defendant complained to the judge about the futility of the effort. They alleged that the effort:

… required paralegals and lawyers to review approximately 13,393 files, totaling 4.5 gigabytes, or the equivalent of about 450,000 pages of emails. Only 6% of this data was responsive to Plaintiff’s discovery request: about 300 megabytes, or about 29,300 pages of emails. In excess of 95% of this information, while responsive to the ESI request, has absolutely nothing to do with any of the issues in this case. Thus, defendants’ lawyers and paralegals reviewed approximately 450,000 pages of material in order to produce less than 1,500 pages of information which have any relationship whatsoever to this dispute; and the majority of the 1,500 pages are only marginally relevant.

I do not doubt that at all. It is typical in cases like this. What do you expect from blind negotiated keyword search and linear review? For less effort try driving a CAR instead of walking. As Judge Scheindlin said in National Day Laborer back in 2012:

There are emerging best practices for dealing with these shortcomings and they are explained in detail elsewhere.[114] There is a “need for careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or `keywords’ to be used to produce emails or other electronically stored information.”[115] And beyond the use of keyword search, parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents.[116] Through iterative learning, these methods (known as “computer-assisted” or “predictive” coding) allow humans to teach computers what documents are and are not responsive to a particular FOIA or discovery request and they can significantly increase the effectiveness and efficiency of searches. In short, a review of the literature makes it abundantly clear that a court cannot simply trust the defendant agencies’ unsupported assertions that their lay custodians have designed and conducted a reasonable search.

National Day Laborer Organizing Network, supra 877 F.Supp.2d at pgs. 109-110.

Putting aside the reasonability of search and review methods selected, an issue never raised by the parties and not before the court, Judge Haight addressed whether the defendant should be required to review all ten custodians in these circumstances. Here is Judge Haight’s analysis:

Prior to making this motion, Yale had reviewed the ESI of a number of custodians and produced the fruits of those labors to counsel for Bagley. Now, seeking protection from — which in practical terms means cessation of — any further ESI discovery, the University describes in vivid, near-accusatory prose the considerable amount of time and treasure it has already expended responding to Bagley’s ESI discovery requests: an exercise which, in Yale’s non-objective and non-binding evaluation, has unearthed no or very little information relevant to the lawsuit. Yale’s position is that given those circumstances, it should not be required to review any additional ESI with a view toward producing any additional information in discovery. The contention is reminiscent of a beleaguered prizefighter’s memorable utterance some years ago: “No mas!” Is the University entitled to that relief? Whether the cost of additional ESI discovery warrants condemnation of the total as undue, thereby rendering the requested information not reasonably accessible to Yale, presents a legitimate issue and, in my view, a close question.

Judge Charles Haight (“Terry” to his friends) analyzed the facts and circumstances to decide whether Yale should continue its search and review of four more custodians. (It was five more, but Yale reviewed one while the motion was pending.) Here is his summary:

Defendants sum up the result of the ESI discovery they have produced to Plaintiff to date in these terms: “In other words, of the 11.88 gigabytes of information[3](which is the equivalent of more than 1 million pages of email files) that has so far been reviewed by the defendant, only about 8% of that information has been responsive and non-privileged. Furthermore, only a small percentage of those documents that are responsive and non-privileged actually have any relevance to the issues in this lawsuit.” Id., at 4-5.  . . .

[3] 11.88 gigabytes is the total of 4.5 gigabytes (produced by review of the computers of Defendant custodians Snyder, Metrick and Rae) and 7.38 gigabytes (produced by review of the computers of the additional five custodians named in text).

Defendants assert on this motion that on the basis of the present record, “the review of these remaining documents will amount to nothing more than a waste of time and money. This Court should therefore enter a protective order relieving the defendant[s] from performing the requested ESI review.” Id.  . . .
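As an aside, the yield figures quoted above are easy to sanity-check. Here is a quick back-of-the-envelope sketch using only the numbers quoted in the briefs; the pages-per-gigabyte ratio is simply implied by those numbers, not independently verified.

```python
# Back-of-the-envelope check of the defendant's quoted figures. All inputs come
# from the quotes above; nothing here is independently verified.
reviewed_gb = 4.5
reviewed_pages = 450_000
pages_per_gb = reviewed_pages / reviewed_gb          # ~100,000 pages per gigabyte, as implied

responsive_pages = 29_300
print(f"Responsive rate: {responsive_pages / reviewed_pages:.1%}")    # ~6.5%, close to the quoted 6%

relevant_pages = 1_500
print(f"Truly relevant rate: {relevant_pages / reviewed_pages:.2%}")  # ~0.33% of what was reviewed

total_gb = 4.5 + 7.38                                # the 11.88 GB from footnote 3
print(f"Total reviewed so far: {total_gb:.2f} GB, roughly {total_gb * pages_per_gb:,.0f} pages")
```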

Ruling in Bagley v. Yale

Judge Haight, a wise senior judge who has seen and heard it all before, found that under these facts Yale had not yet made a reasonable effort to satisfy their discovery obligations in this case. He ordered Yale to review the email of four more custodians. That, he decided, would be a reasonable effort. Here is Judge Haight’s explanation of his analysis of reasonability, which, in my view, is unaffected by the 2015 Rule Amendments, specifically the change to Rule 26(b)(1).

In the case at bar, the custodians’ electronically stored information in its raw form was immediately accessible to Yale: all the University had to do was tell a professor or a dean to hand over his or her computer. But Bagley’s objective is to discover, and Defendants’ obligation is to produce, non-privileged information relevant to the issues: Yale must review the custodians’ ESI and winnow it down. That process takes time and effort; time and effort can be expensive; and the Rule measures the phrase “not reasonably accessible” by whether it exposes the responding party to “undue cost.” Not some cost: undue cost, an adjective Black’s Law Dictionary (10th ed. 2014 at 1759) defines as “excessive or unwarranted.” . . .

In the totality of circumstances displayed by the case at bar, I think it would be an abuse of discretion to cut off Plaintiff’s discovery of Defendants’ electronically stored information at this stage of the litigation. Plaintiff’s reduction of custodians, from the original 24 targeted by Defendants’ furiously worded Main Brief to the present ten, can be interpreted as a good-faith effort by Plaintiff to keep the ESI discovery within permissible bounds. Plaintiff’s counsel say in their Opposing Brief [Doc. 113] at 2: “Ironically, this last production includes some of the most relevant documents produced to date.” While relevance, like beauty, often lies in the eyes of the beholder, and Defendants’ counsel may not share the impressions of their adversaries, I take the quoted remark to be a representation by an officer of the Court with respect to the value and timing of certain evidence which has come to light during this discovery process. The sense of irritated resignation conveyed by the familiar aphorism — “it’s like looking for a needle in a haystack” — does not exclude the possibility that there may actually be a needle (or two or three) somewhere in the haystack, and sharp needles at that. Plaintiff is presumptively entitled to search for them.

As Judge Haight understood when he said that the “Plaintiff is presumptively entitled to search for them,” the search effort is actually upon the defendant, not the plaintiff. The law requires the defendant to expend reasonable efforts to search for the needles in the haystack that the plaintiff would like to be found. Of course, if those needles are not there, no amount of effort can find them. Still, no one knows in advance whether there are hot documents left to be found (although probabilities can be calculated), so reasonable efforts are often required to show they are not there. This can be difficult, as any e-discovery lawyer well knows.
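As for how those probabilities can be calculated, the usual tool is a random sample drawn from the unreviewed documents and a confidence bound on prevalence. Here is a minimal sketch with entirely hypothetical sample counts; the zero-hit case uses the standard rule-of-three approximation (a 95% upper bound of roughly 3 divided by the sample size).

```python
# Hypothetical sketch: bound how many relevant documents could remain unreviewed,
# based on a random sample coded by attorneys. Counts are made up for illustration.
import math

unreviewed_count = 100_000      # documents not yet reviewed (hypothetical)
sample_size = 1_500             # random sample actually coded (hypothetical)
relevant_in_sample = 0          # relevant documents found in that sample (hypothetical)

if relevant_in_sample == 0:
    upper_prevalence = 3 / sample_size               # rule of three: ~95% upper bound when zero found
else:
    p = relevant_in_sample / sample_size             # normal-approximation one-sided 95% upper bound
    upper_prevalence = p + 1.645 * math.sqrt(p * (1 - p) / sample_size)

print(f"With roughly 95% confidence, at most about {upper_prevalence * unreviewed_count:,.0f} "
      f"relevant documents remain in the unreviewed set.")
```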

Faced with this situation most e-discovery specialists will tell you the best solution is to cooperate, or at least try. If your cooperative efforts fail and you seek relief from the court, it needs to be clear to the judge that you did try. If the judge thinks you are just another unreasonable, over-assertive lawyer, your efforts are doomed. This is apparently part of what was driving Judge Haight’s analysis of “reasonable” as the following colorful, one might say “tasty,” quote from the opinion shows:

A recipe for a massive and contentious adventure in ESI discovery would read: “Select a large and complex institution which generates vast quantities of documents; blend as many custodians as come to mind with a full page of search terms; flavor with animosity, resentment, suspicion and ill will; add a sauce of skillful advocacy; stir, cover, set over high heat, and bring to boil. Serves a district court 2-6 motions to compel discovery or for protection from it.”


You have got to love a judge with wit and wisdom like that. My only comment is that truly skillful advocacy here would include cooperation, and lots of it. The sauce added in that case would be sweet and sour, not just hot and spicy. It should not give a judge any indigestion at all, much less six motions. That is one reason why Electronic Discovery Best Practices (EDBP.com) puts such an emphasis on skillful cooperation.

EDBP.com – You are free to use this chart in any manner so long as you do not change it.

What is Reasonable?

Bagley shows that the dividing line between what is reasonable and thus acceptable efforts, and what is not, can often be difficult to determine. It depends on a careful evaluation of the facts, to be sure, but this evaluation in turn depends on many subjective factors, including whether one side or another was trying to cooperate. These factors include all kinds of prevailing social norms, not just cooperativeness. They also include personal values, prejudices, education, intelligence, and even how the mind itself works, the hidden psychological influences. They all influence a judge’s evaluation in any particular case as to which side of the acceptable behavior line a particular course of conduct falls.

In close questions the subjectivity inherent in determinations of reasonability is obvious. This is especially true for the attorneys involved, the ones paid to be independent analysts and objective advisors. People can, and often do, disagree on what is reasonable and what is not. They disagree on what is negligent and what is not. On what is acceptable and what is not.

All trial lawyers know that certain tricks of argument and appeals to emotion can have a profound effect on a judge’s resolution of these supposedly reason-based disagreements. They can have an even more profound effect on a jury’s decision. (That is the primary reason that there are so many rules on what can and cannot be said to a jury.)

Study of Legal Psychology

Every good student of the law knows this, but how many attempt to study the psychological dynamics of persuasion? How many attempt to study perceptions of reasonability? Of cognitive bias? Not many, and there are good reasons for this.

First and foremost, few law professors have this kind of knowledge. The only attorneys that I know of with this knowledge are experienced trial lawyers and experienced judges. They know quite a lot about this, but not from any formal or systematic study. They pick up information, and eventually knowledge of the psychological underpinnings of justice, through many long years of practice. They learn about the psychology of reasonability through thousands of test cases. They learn what is reasonable by involvement in thousands of disputes. Whatever I know of the subject was learned that way, although I have also read numerous books and articles on the psychology of legal persuasion written by still more senior trial lawyers.

That is not to say that experience, trial and error, is the quickest or best way to learn these insights. Perhaps there is an even quicker and more effective way? Perhaps we could turn to psychologists and see what they have to say about the psychological foundations of perception of reasonability. After all, this is, or should be, a part of their field.

Up until now, not very much has been said from psychologists on law and reasonability, at least not to my knowledge. There are a few books on the psychology of persuasion. I made a point in my early years as a litigator to study them to try to become a better trial lawyer. But in fact, the field is surprisingly thin. There is not much there. It turns out that the fields of Law and Psychology have not overlapped much, at least not in that way.

Perhaps this is because so few psychologists have been involved with legal arguments on reasonability. When psychologists are in the legal system, they are usually focused on legal issues of sanity, not negligence, or on cases involving issues of medical diagnoses.

The blame for the wide gulf between the two fields falls on both sides. Most psychologists, especially research psychologists, have not been interested in the law and legal process. Or when they have, it has involved criminal law, not civil. See eg: Tunnel Vision in the Criminal Justice System (May 2010, Psychology Today). This disinterest has been reciprocal. Most lawyers and judges are not really interested in hearing what psychologists have to say about reasonability. They consider their work to be above such subjective vagaries.

Myth of Objectivity

Lawyers and judges consider reasonability of conduct to be an exclusively legal issue. Most lawyers and judges like to pretend that reasonability exists in some sort of objective, platonic plane of ideas, above all subjective influences. The just decision can be reached by deep, impartial reasoning. This is the myth of objectivity. It is an article of faith in the legal profession.

The myth continues to this day in legal culture, even though all experienced trial lawyers and judges know it is total nonsense, or nearly so. They know full well the importance of psychology and social norms. They know the impact of cognitive biases of all kinds, even transitory ones. As trial lawyers like to quip – What did the judge have for breakfast?

Experienced lawyers take advantage of these biases to win cases for their clients. They know how to push the buttons of judge and jury. See Cory S. Clements, Perception and Persuasion in Legal Argumentation: Using Informal Fallacies and Cognitive Biases to Win the War of Words, 2013 BYU L. Rev. 319 (2013). Justice is sometimes denied as a result. But this does not mean judges should be replaced by robots. No indeed. There is far more to justice than reason. Still, a little help from robots is surely part of the future we are making together.

More often than not the operation of cognitive biases happens unconsciously, without any puppet masters intentionally pulling the strings. There is more to this than just rhetoric and sophistry. Justice is hard. So is objective ratiocination.

Even assuming that the lawyers and judges in the know could articulate their knowledge of decisional bias, they have little incentive to do so. (The very few law professors with such knowledge do have an incentive, as we see in Professor Clements’ article cited above, but these articles are rare and too academic.) Moreover, most judges and lawyers are incapable of explaining these insights in a systematic manner. They lack the vocabulary of psychology to do so, and, since they learned by long, haphazard experience, that is their style of teaching as well.

Shattering the Myth

One psychologist I know has studied these issues and shared his insights. They are myth shattering to be sure, and thus will be unwelcome to some idealists. But for me this is a much-needed analysis. The psychologist who has dared to expose the myth, to lift the curtain, has worked with lawyers for over a decade on discovery issues. He has even co-authored a law review article on reasonability with two distinguished lawyers. Oot, Kershaw, Roitblat, Mandating Reasonableness in a Reasonable Inquiry, Denver University Law Review, 87:2, 522-559 (2010).

I am talking about Herbert L. Roitblat, who has a PhD in psychology. Herb did research and taught psychology for many years at the University of Hawaii. Only after a distinguished career as a research psychologist and professor did Herb turn his attention to computer search in general and then ultimately to law and legal search. He is also a great admirer of dolphins.

Schlemiel and Schlimazel

Herb has written a small gem of a paper on law and reasonability that is a must read for everyone, especially those who do discovery. The Schlemiel and the Schlimazel and the Psychology of Reasonableness (Jan. 10, 2014, LTN) (link is to republication by a vendor without attribution). I will not spoil the article by telling you Herb’s explanation of the Yiddish terms, Schlemiel and Schlimazel, nor what they have to do with reasonability and the law, especially the law of spoliation and sanctions. Only a schmuck would do that. It is a short article; be a mensch and go read it yourself. I will, however, tell you the Huffington Post definition:

A Schlemiel is an inept clumsy person and a Schlimazel is a very unlucky person. There’s a Yiddish saying that translates to a funny way of explaining them both. A schlemiel is somebody who often spills his soup and a schlimazel is the person it lands on.

This is folk wisdom for what social psychologists today call attribution error. It is the tendency to blame your own misfortune on outside circumstances beyond your control (the schlimazel) and blame the misfortune of others on their own negligence (the schlemiel). Thus, for example, when I make a mistake, it is in spite of my reasonable efforts, but when you make a mistake it is because of your unreasonably lame efforts. It is a common bias that we all have. The other guy is often unreasonable, whereas you are not.

Herb Roitblat’s article should be required reading for all judges and lawyers, especially new ones. Understanding the many inherent vagaries of reasonability could, for instance, lead to a much more civil discourse on the subject of sanctions. Who knows, it could even lead to cooperation, instead of the theatre and politics we now see everywhere.

Hindsight Bias

Roitblat’s article contains a two-paragraph introduction to another important psychological factor at work in many evaluations of reasonability: Hindsight Bias. This has to do with the fact that most legal issues consider past decisions and actions that have gone bad. The law almost never considers good decisions, much less great decisions with terrific outcomes. Instead it focuses on situations gone bad, where it turns out that wrong decisions were made. But were they necessarily negligent decisions?

The mere fact that a decision led to an unexpected, poor outcome does not mean that the decision was negligent. But when we examine the decision with the benefit of 20/20 hindsight, we are naturally inclined towards a finding of negligence. In the same way, if the results prove to be terrific, the hindsight bias is inclined to perceive most any crazy decision as reasonable.

Due to hindsight bias, we all have, in Roitblat’s words:

[A] tendency to see events that have already occurred as being more predictable than they were before they actually took place. We over-estimate the predictability of the events that actually happened and under-estimate the predictability of events that did not happen.  A related phenomenon is “blame the victim,” where we often argue that the events that occurred should have been predicted, and therefore, reasonably avoided.

Hindsight bias is well known among experienced lawyers and you will often see it argued, especially in negligence and sanctions cases. Every good lawyer defending such a charge will try to cloak all of the mistakes as seemingly reasonable at the time, and any counter-evaluation as merely the result of hindsight bias. They will argue, for instance, that while it may now seem obvious that wiping the hard drives would delete relevant evidence, that is only because of the benefit of hindsight, and that it was not at all obvious at the time.

Good judges will also sometimes mention the impact of 20/20 hindsight, either on their own initiative, or in response to defense argument. See for instance the following analysis by Judge Lee H. Rosenthal in Rimkus v Cammarata, 688 F. Supp. 2d 598 (S.D. Tex. 2010):

These general rules [of spoliation] are not controversial. But applying them to determine when a duty to preserve arises in a particular case and the extent of that duty requires careful analysis of the specific facts and circumstances. It can be difficult to draw bright-line distinctions between acceptable and unacceptable conduct in preserving information and in conducting discovery, either prospectively or with the benefit (and distortion) of hindsight. Whether preservation or discovery conduct is acceptable in a case depends on what is reasonable, and that in turn depends on whether what was done–or not done–was proportional to that case and consistent with clearly established applicable standards. [FN8] (emphasis added)

Judge Shira A. Scheindlin also recognized the impact of hindsight in Pension Committee of the University of Montreal Pension Plan, et al. v. Banc of America Securities, LLC, et al., 685 F. Supp. 2d 456 (S.D.N.Y. Jan. 15, 2010, as amended May 28, 2010) at pgs. 463-464:

While many treatises and cases routinely define negligence, gross negligence, and willfulness in the context of tortious conduct, I have found no clear definition of these terms in the context of discovery misconduct. It is apparent to me that these terms simply describe a continuum. FN9 Conduct is either acceptable or unacceptable. Once it is unacceptable the only question is how bad is the conduct. That is a judgment call that must be made by a court reviewing the conduct through the backward lens known as hindsight. It is also a call that cannot be measured with exactitude and might be called differently by a different judge. That said, it is well established that negligence involves unreasonable conduct in that it creates a risk of harm to others, but willfulness involves intentional or reckless conduct that is so unreasonable that harm is highly likely to occur. (emphasis added)

The relatively well-known backward lens known as hindsight can impact anyone’s evaluation of reasonability. But there are many other less obvious psychological factors that can alter a judge or jury’s perception. Herb Roitblat mentions a few more, such as the overconfidence effect, where people tend to inflate their own knowledge and abilities, and framing, an example of cognitive bias where the outcome of questions is impacted by the way they are asked. The latter is one reason that trial lawyers fight so hard on jury instructions and jury interrogatories.

Conclusion

Many lawyers are interested in this law-psych intersection and the benefits that might be gained by cross-pollination of knowledge. I have a life-long interest in psychology, and so do many others, some with advanced degrees. That includes my fellow predictive coding expert, Maura R. Grossman, an attorney who also has a Ph.D. in Clinical/School Psychology. A good discovery team can use all of the psychological insights it can get.

The myth of objectivity and the “Reasonable Man” in the law should be exposed. Many naive people still put all of their faith in legal rules and the operation of objective, unemotional logic. The system does not really work that way. Outsiders trying to automate the law are misguided. The Law is far more than logic and reason. It is more than the facts, the surrounding circumstances. It is more than evidence. It is about people and by people. It is about emotion and empathy too. It is about fairness and equity. Its prime directive is justice, not reason.

That is the key reason why AI cannot automate law, nor legal decision making. Judge Charles (“Terry”) Haight could be augmented and enhanced by smart machines, by AI, but never replaced. The role of AI in the Law is to improve our reasoning, minimize our schlemiel biases. But the robots will never replace lawyers and judges. In spite of the myth of the Reasonable Man, there is far more to law than reason and facts. I for one am glad about that. If it were otherwise the legal profession would be doomed.

