Lawyers as Legal-Fortune Tellers

March 30, 2014

Most lawyers predict the future as part of their everyday work. The best lawyers are very good at it. Indeed, the top lawyers I have worked with have all been great prognosticators, at least when it comes to predicting litigation outcomes. That is why concepts of predictive coding come naturally to them. Since they already do probability analysis as part of their work, it is easy for them to accept the notion that new software can extend these forward-looking skills. They are not startled by the ability of predictive analytics to discover evidence.

Although these lawyers will not know how to operate predictive coding software, nor understand the many intricacies of computer assisted search, they will quickly understand the concept of probability relevance predictions. This deep intuitive ability is found in all good transactional and litigation attorneys. Someday soon AI and data analytics, perhaps in the form of Watson as a lawyer, will significantly enhance all lawyers' abilities. It will not only help them to find relevant evidence, but also to predict case outcomes.
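
For readers who like to see the machinery, here is a minimal sketch in Python (using the open source scikit-learn library) of the core idea behind predictive coding: the software learns from a small set of attorney-coded examples and then assigns each unreviewed document a probability of relevance. The documents, labels, and model below are all invented for illustration; commercial review platforms use far larger training sets and more sophisticated methods.

```python
# A minimal sketch of the idea behind predictive coding: a lawyer codes a small
# "seed set" of documents as relevant or not, and the software learns to assign
# a probability of relevance to the rest. Documents and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = [
    "board approved the merger despite known accounting irregularities",
    "quarterly revenue was restated after the audit flagged errors",
    "lunch menu for the holiday party and parking reminders",
    "fantasy football league standings and weekend plans",
]
seed_labels = [1, 1, 0, 0]  # 1 = relevant, 0 = not relevant (the lawyer's coding)

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

unreviewed = [
    "auditors questioned the restated revenue figures before the merger vote",
    "reminder: bring a dish to the holiday potluck",
]
probs = model.predict_proba(vectorizer.transform(unreviewed))[:, 1]
for doc, p in zip(unreviewed, probs):
    print(f"{p:.2f}  {doc}")  # a probability of relevance, not a yes/no verdict
```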

Transactional Lawyers and Future Projections

A good contract lawyer is also a good prognosticator. They try to imagine all of the problems and opportunities that may arise from a new deal. The lawyer will help the parties foresee issues that he or she thinks are likely to arise in the future. That way the parties can address the issues in advance. The lawyers include provisions in the agreement to implement the parties' intent. They predict events that may, or may not, ever come to pass. Even if it is a new type of deal, one that has never been done before, they try to predict the future of what is likely to happen. I recall doing this when I helped create some of the first Internet hosting agreements in the mid-nineties. (We started off making them like shopping center agreements and used real estate analogies.)

Contract lawyers become very good at predicting the many things that might go wrong and providing specific remedies for them. Many of the contractual provisions based on possible future events are fairly routine. For instance, what happens if a party does not make a payment? Others are creative and pertain to specific conduct under the agreement. For example, what happens if a party loses shared information? What disclosure obligations are triggered? What other curative actions? Who pays for it?

Most transactional lawyers focus on the worst case scenario. They write contract provisions that try to protect their clients from major damages if bad things happen. Many become very good at that. Litigators like myself come to appreciate that soothsaying gift. When a deal goes sour, and a litigator is then brought in to try to resolve a dispute, the first thing we do is read the contract. If we find a contract provision that is right on point, our job is much easier.

Litigation Lawyers and Future Projections

In litigation the prediction of probable outcomes is a constant factor in all case analysis. Every litigator has to dabble in this kind of future prediction. The most basic prediction, of course, is: will you win the case? What are the probabilities of prevailing? What will have to happen in order to win the case? How much can you win or lose? What is the probable damage range? What is the current settlement value of the case? If we prevail on this motion, how will that impact settlement value? What would be the best time for mediation? How will the judge rule on various issues? How will the opposing counsel respond to this approach? How will this witness hold up under the pressure of deposition?

All litigation necessarily involves near constant probability analysis. The best litigators in the world become very good at this kind of future projection. They can very accurately predict what is likely to happen in a case. Not only that, they can provide pretty good probability ranges for each major future event. It becomes a part of their everyday practice.

Clients rely on this analysis and come to expect their lawyers to be able to accurately predict what will happen in court. Trust develops as they see their lawyer’s predictions come true. Eventually clients become true believers in their legal oracles. They even accept it when they are told from time to time that no reasonable prediction is possible, that anything might happen. They also come to accept that there are no certainties. They get used to probability ranges, and so do the soothsaying lawyers.

Good lawyers quickly understand the limits of all predictions. A successful lawyer will never say that anything will certainly happen (well, almost never). Instead the lawyer almost always speaks in terms of probabilities. For instance, they rarely say we cannot lose this motion, only that loss is highly unlikely. That way they are almost never wrong.

Insightful or Wishful

Professor Jane Goodman-Delahunty, JD, PhD, Australia.

An international team of law professors has looked into the legal-fortune telling aspects of lawyers and litigation. Goodman-Delahunty, Granhag, Hartwig, Loftus, Insightful or Wishful: Lawyers' Ability to Predict Case Outcomes (Psychology, Public Policy, and Law, 2010, Vol. 16, No. 2, 133–157). This is the introduction to their study:

In the course of regular legal practice, judgments and meta-judgments of future goals are an important aspect of a wide range of litigation-related decisions. (English & Sales, 2005). From the moment when a client first consults a lawyer until the matter is resolved, lawyers must establish goals in a case and estimate the likelihood that they can achieve these goals. The vast majority of lawyers recognize that prospective judgments are integral features of their professional expertise. For example, a survey of Dutch criminal lawyers acknowledged that 90% made predictions of this nature in some or all of their real-life cases (Malsch, 1990). The central question addressed in the present study was the degree of accuracy in lawyers’ forecasts of case outcomes. To explore this question, we contacted a broad national sample of U.S. lawyers who predicted their chances of achieving their goals in real-life cases and provided confidence ratings in their predictions.

Assoc. Professor Maria Hartwig, PhD, Psychology & Law, Sweden

Prediction of success is of paramount importance in the system for several reasons. In the course of litigation, lawyers constantly make strategic decisions and/or advise their clients on the basis of these predictions. Attorneys make decisions about future courses of action, such as whether to take on a new client, the value of a case, whether to advise the client to enter into settlement negotiations, and whether to accept a settlement offer or proceed to trial. Thus, these professional judgments by lawyers are influential in shaping the cases and the mechanisms selected to resolve them. Clients’ choices and outcomes therefore depend on the abilities of their counsel to make reasonably accurate forecasts concerning case outcomes. For example, in civil cases, after depositions of key witnesses or at the close of discovery, the parties reassess the likelihood of success at trial in light of the impact of these events.

Professor Pär Anders Granhag, PhD, Psychology, Sweden

In summary, whether lawyers can accurately predict the outcome of a case has practical consequences in at least three areas: (a) the lawyer’s professional reputation and financial success; (b) the satisfaction of the client; and (c) the justice environment as a whole. Litigation is risky, time consuming, and expensive. The consequences of judgmental errors by lawyers can be costly for lawyers and their clients, as well as an unnecessary burden on an already overloaded justice system. Ultimately, a lawyer’s repute is based on successful calculations of case outcome. A lawyer who advises clients to pursue litigation without delivering a successful outcome will not have clients for long. Likewise, a client will be most satisfied with a lawyer who is accurate and realistic when detailing the potential outcomes of the case. At the end of the day, it is the accurate predictions of the lawyer that enable the justice system to function smoothly without the load of cases that were not appropriately vetted by the lawyers.

Elizabeth F. Loftus, Professor of Social Ecology, Law and Cognitive Science, PhD., California

The law professors found that a lawyer’s prognostication ability does not necessarily come from experience. This kind of legal-fortune telling appears to be a combination of special gift, knowledge, and learned skills. It certainly requires more than just age and experience.

The law professor survey showed two things: (1) that lawyers as a whole tend to be overconfident in their predictions of favorable outcomes, and (2) that experienced lawyers do not on average do a better job of predicting outcomes than inexperienced lawyers. Insightful or Wishful ("Overall, lawyers were over-confident in their predictions, and calibration did not increase with years of legal experience"). The professors also found that women lawyers tend to be better at future projection than men, and that specialists tend to be better than generalists.

Experience should make lawyers better prognosticators, but it does not. Their ego gets in the way. The average lawyer does not get better at predicting case outcomes over time because they grow overconfident with experience. They remember the victories and rationalize the losses. They delude themselves into thinking that they can control things more than they can.

I have seen this happen in legal practice time and time again. Indeed, as a young lawyer I remember surprising senior attorneys I went up against. They were confident, but wrong. My son is now having the same experience. The best lawyers do not fall into the overconfidence trap with age. They encourage their team to point out issues and problems, and to challenge them on strategy and analysis. The best lawyers I know tend to err on the side of caution. They are typically glass-half-empty types. They remember the times they have been wrong.

How Lawyers Predict The Future

Accurate prediction of future events by lawyers, or anyone for that matter, requires deep understanding of process, rules, and objective analysis. Deep intuitive insight into the people involved also helps. Experience assists too, but only in providing a deep understanding of process and rules, and knowledge of relevant facts in the past and present. Experience alone does not necessarily assist in analysis, for the reasons discussed. Effective analysis has to be objective. It has to be uncoupled from personal perspectives and ego inflation.

The best lawyers understand all this, even if they may not be able to articulate it. That is how they are able to consistently and accurately calibrate case outcomes, including, when appropriate, probable losses. They do not take it personally. Accurate future vision requires not only knowledge, but also objectivity, humility, and freedom from ulterior motives. Since most lawyers lack these qualities, especially male lawyers, they end up simply engaging in wishful thinking.

The Insightful or Wishful study seems to have proven this point. (Note my use of the word seems, a typical weasel word that lawyers are trained to use. It is indicative of probability, as opposed to certainty, and protects me from ever being wrong. That way I can maintain my illusion of omnipotence.)

The best lawyers inspire confidence, but are not deluded by it. They are knowledgeable and guided by hard reason, coupled with deep intuition into the person or persons whose decisions they are trying to predict. That is often the judge, sometimes a jury too, if the process gets that far (less than 1% of cases go to trial). It is often opposing counsel or opposing parties, or even individual witnesses in the case.

All of these players have emotions. Unlike Watson, the human lawyers can directly pick up on these emotions. The top lawyers understand the non-verbal flows of energy, the irrational motivations. They can participate in them and influence them.

If lawyers with these skills can also maintain objective reason, then they can become the best in their field. They can become downright uncanny in their ability to both influence and forecast what is likely to happen in a law suit. Too bad so few lawyers are able to attain that kind of extremely high skill level. I think most are held back by an incapacity to temper their emotions with objective ratiocination. The few that can rarely also have the empathic, intuitive skills.

Watson as Lawyer Will be a Champion Fortune Teller

Is Watson coming to Legal Jeopardy?

The combination of impartial reason and intuition can be very powerful, but, as the law professor study shows, impartial reason is a rarity reserved for the top of the profession. These are the attorneys who understand both reason and emotion. They know that the reasonable man is a myth. They understand the personality frailties of being human. See Scientific Proof of Law's Overreliance On Reason: The "Reasonable Man" is Dead, Long Live the Whole Man, Parts One, Two and Three; and The Psychology of Law and Discovery.

I am speaking about the few lawyers who have human empathy, who are able to overcome the human tendency toward overconfidence, and who are able to look at things impartially, like a computer. Computers lack ego. They have no confidence, no personality, no empathy, no emotions, no intuitions. They are cold and empty, but they are perfect thinking machines. Thus they are the perfect tool to help lawyers become better prognosticators.

This is where Watson the lawyer comes in. Someday soon, say the next ten years, maybe sooner, most lawyers will have access to a Watson-type lawyer in their office. It will provide them with objective data analysis. It will provide clear rational insights into likely litigation outcomes. Then human lawyers can add their uniquely human intuitions, empathy, and emotional insights to this (again ever mindful of overconfidence).

The AI-enhanced analysis will significantly improve legal prognostications. It will level the playing field and up everyone's game in the world of litigation. I expect it will also have the same quality improvement impact on contract and deal preparations. The use of data analytics to predict the outcome of patent cases is already enjoying remarkable success with a project called Lex Machina. The CEO of Lex Machina, Josh Becker, calls his data analytics company the "moneyball" of IP litigation. Tam Harbert, Supercharging Patent Lawyers With AI. Here is the Lex Machina description of services:

We mine litigation data, revealing insights never before available about judges, lawyers, parties, and patents, culled from millions of pages of IP litigation information.

Many corporations are already using Lex Machina's analytics to help them select the litigation counsel most likely to do well in particular kinds of patent cases, and with particular courts and judges. Law firms are mining the past case data for similar reasons.
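
To give a feel for what this kind of mining looks like at its very simplest, here is a toy sketch in Python that computes base-rate win statistics by judge and by defense firm from a handful of made-up case records. This is only an illustration of the "moneyball" idea; it is not Lex Machina's method, and every record below is fabricated.

```python
# A toy illustration of "moneyball" style litigation analytics: mine past case
# records for base rates, here the plaintiff win rate by judge and by defense
# firm. The records are fabricated; real systems use far richer data and models.
from collections import defaultdict

past_cases = [
    {"judge": "Judge A", "defense_firm": "Firm X", "plaintiff_won": True},
    {"judge": "Judge A", "defense_firm": "Firm Y", "plaintiff_won": False},
    {"judge": "Judge A", "defense_firm": "Firm X", "plaintiff_won": True},
    {"judge": "Judge B", "defense_firm": "Firm Y", "plaintiff_won": False},
    {"judge": "Judge B", "defense_firm": "Firm X", "plaintiff_won": False},
]

def win_rate(cases, key):
    # Tally wins and totals for each value of the chosen field (judge, firm, etc.)
    wins, totals = defaultdict(int), defaultdict(int)
    for c in cases:
        totals[c[key]] += 1
        wins[c[key]] += c["plaintiff_won"]
    return {k: wins[k] / totals[k] for k in totals}

print(win_rate(past_cases, "judge"))         # historical plaintiff win rate per judge
print(win_rate(past_cases, "defense_firm"))  # historical performance against each firm
```

Real systems add many more variables and far more history, but the underlying idea is the same: let the past record, not gut feel, set the baseline prediction.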

Conclusion

Here is my prediction for the future of the legal profession. In just a few more years, perhaps longer, the linear, keyword-only evidence searchers will be gone. They will be replaced by multi-modal, predictive coding based evidence searchers. In just a decade, perhaps longer (note the weasel word qualifier), all lawyers who are not using the assistance of artificial intelligence and data analytics for general litigation analysis will be obsolete.

Lawyers in the future who overcome their arrogance and overconfidence, and accept the input and help of Watson-type robot lawyers, will surely succeed. Those who do not will surely go the way of linear, keyword-only searchers in discovery today. These dinosaurs are already being replaced by AI-enhanced searchers and AI-enhanced reviewers. I could be overconfident, but that is what I am starting to see. It appears to me to be an inevitable trend pulled along by larger forces of technological change. If you think I am missing something, please leave a comment below.

This rapid forced evolution is a good thing for the legal profession. It is good because the quality of legal practice will significantly improve as the ability of lawyers to make more accurate predictions improves. For instance, the justice system will function much more smoothly when it does not have to bear the load of cases that have not been appropriately vetted by lawyers. Fewer frivolous and marginal cases will be filed that have no chance of success, except in the deluded minds of second rate attorneys. (Yes, that is what I really think.) These poor prognosticators will be aided by robots to finally recognize a hopeless case. That is not to say that good lawyers will avoid taking any high risk cases. I think they should and I believe they will. But the cases will be appropriately vetted with realistic risk-reward analysis. The clients will not be seduced into them with false expectations.

With data analytics, unnecessary motions and depositions will be reduced for the same reason. The parties will instead focus on the real issues, the areas where there is bona fide dispute and uncertainty. The Watson-type legal robots will help the judges as well. With data analytics and AI, more and more lawyers and judges will be able to follow Rule 1 of the Federal Rules of Civil Procedure. Then just, speedy, and inexpensive litigation will be more than a remote ideal. The AI law robots will make lawyers and judges smart enough to run the judicial system properly.

Artificial intelligence and big data analytics will enable all lawyers to become excellent outcome predictors. It will allow all lawyers to move their everyday practice from art to science, much like predictive coding has already done for legal search.


Best Practices in e-Discovery for Handling Unreviewed Client Data

March 16, 2014

Big data security, hackers, and data breaches are critical problems facing the world today, including the legal profession. That is why I have focused on development of best practices for law firms to handle large stores of client data in e-discovery. The best practice I have come up with is simple. Do not do it. Outsource.

Attorneys should only handle evidence. Law firms should not take possession of large, unprocessed, unreviewed stores of client data, the contents of which are typically unknown. They should not even touch it. They should stay out of the chain of custody. Instead, lawyers should rely on professional data hosting vendors that have special expertise and facilities designed for data security. In today's world, rife as it is with hackers and data breaches, hosting is too dangerous and complex a business for law firms. The best practice is to delegate to security professionals the hosting of large stores of unreviewed client data.

Although it is still a best practice for knowledgeable lawyers to control the large stores of client data collected for preservation and review, they should limit actual possession of client data. Only after electronic evidence has been reviewed and identified as relevant, or probably relevant, should a law firm take possession. Before that, all large stores of unidentified client data, such as a custodian's email box, should only be handled by the client's IT experts and professional data hosting companies, typically e-discovery vendors. The raw data should go directly from the client to the vendor. Alternatively, the client should never let the data leave its own premises. It should host the data on site for review by outside counsel. Either way, the outside law firm should not touch it, and certainly should not host it on the law firm's computer systems. Instead, lawyers should search and review the data by secure online connections.
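
For what it is worth, the hand-off I am describing is typically documented with a hash manifest, so everyone can verify that what the client's IT staff collected is exactly what the vendor received, without the law firm ever holding a copy. Here is a minimal sketch in Python of that idea; the directory paths are hypothetical placeholders, not a prescription for any particular vendor's workflow.

```python
# A minimal sketch of documenting a collection without the law firm holding the
# data: the client's IT staff compute a hash manifest at collection time, the
# vendor recomputes hashes on receipt, and the two manifests are compared.
# File paths here are hypothetical placeholders.
import hashlib
from pathlib import Path

def sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    # Hash a file in chunks so large ESI containers do not have to fit in memory.
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def manifest(collection_dir: str) -> dict:
    root = Path(collection_dir)
    return {str(p.relative_to(root)): sha256(p) for p in root.rglob("*") if p.is_file()}

# Client IT runs this before shipping to the vendor; the vendor runs it again on
# receipt. Matching manifests show the data arrived intact, with the law firm
# staying out of the chain of custody.
client_manifest = manifest("/collections/custodian_smith")   # hypothetical path
# vendor_manifest = manifest("/ingest/custodian_smith")       # run at the vendor
# assert client_manifest == vendor_manifest
```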

This outsourcing arrangement is, in my opinion, the best practice for law firm handling of large stores of unreviewed client data. I know that many private law firms, especially their litigation support departments, will strongly disagree.

Law firms should stick to providing legal services, a position I have stated several times before. Losey, R., Five Reasons to Outsource Litigation Support (LTN, Nov. 2, 2012); WRECK-IT RALPH: Things in e-discovery that I want to destroy!; Going "All Out" for Predictive Coding and Vendor Cost Savings. Data hosting is a completely different line of work, and is very hazardous in today's world of hacking and data breaches.

Best Practice: Full Control, But Limited Possession

Again, to be clear, law firms must have actual possession of evidence, including original client documents. Lawyers cannot do their job without that. But lawyers do not need possession of vast hoards of unidentified, irrelevant data. The best practice is for law firms to control such client data, but to do so without taking possession. Attorneys should limit possession to the evidence.

Only after the large stores of the client's raw data have been searched, and evidence identified, should the digital evidence be transferred to the law firm for hosting and use in the pending legal matter. In other words, lawyers and law firms only need the signal, they do not need the noise. The noise, the raw data not identified as evidence or possible evidence, should be returned to the client, or destroyed. Typically this return or destruction is delayed pending the final outcome of the matter, just in case a further search of the raw data is required.

I know this is a very conservative view. My law firm may well be the only AmLaw 100 firm that now has this rule. This hands-off rule as to all large stores of ESI is a radical departure from the status quo. But even if no other large law firm in the world now does this, that does not mean such outsourcing is wrong. It just means we are the first.

Remember the T.J. Hooper, the tugboat whose valuable barge sank at sea because the tug was not equipped with a radio to warn of an approaching storm? The case involving this tragic loss is required reading in every law school torts class in the country. The T.J. Hooper, 60 F.2d 737 (2d Cir. 1932) (Hand, J.).

Sometimes a whole profession can lag behind the times. There is no safety in numbers. The only safety is in following best practices that make sense in today's environment. Although law firm hosting of large stores of client data once made sense, I no longer think it does. The volume of data and the security threats in today's environment make it too risky for me to continue to accept this as a best practice.

Current Practice of Most Law Firms

Most of the legal profession today, including most private attorneys and their law firms, collect large stores of ESI from their clients when litigation hits. This is especially true in significant cases. They do so for preservation purposes and in the hopes they may someday find relevant evidence. The law firms take delivery of the data from their clients. They hold the entire haystack, even though they only need the few needles they hope are hidden within. They insert themselves into the chain of custody. This needs to stop.

Corporate counsel often make the same mistake. The data may go from the client IT, to the client legal department, and then to the outside counsel. Three hands, at least, have by then already touched the data. Sometimes the metadata changes and sanctions motions follow.

It gets even worse from there, much worse. When the data arrives at the law firm, the firm typically keeps the data. The data is sent by the client on CDs, DVDs, thumb drives, or portable USB drives. Sometimes FTP transfer is used. It is received by the outside attorney, or their assistant, or paralegal, or law firm office manager, or tech support person. We are talking about receipt of giant haystacks of information, remember, not just a few hundred or few thousand documents, but millions of documents and other computer files. The exact contents of these large collections are unknown. Who knows, they might contain critical trade secrets of the company. They almost certainly contain some protected information. Perhaps a lot of protected information. Regardless, all of it must be treated as confidential and protected from disclosure, except by due process in the legal proceeding.

After the law firm receives the client's confidential data, one of three things typically happens:

1.  The law firm forwards the data to a professional data processing and hosting company and deletes all copies from its system, and, for example, does not keep a copy of the portable media on which the large stores of ESI were received. This is not a perfect best practice, because the law firm is still in the chain of custody, but it is far better than the next two alternatives, which are what usually happen in most firms.

2.  The law firm again forwards the data to a professional data processing and hosting company, but does not delete all copies from its system, and, for example, keeps a copy of the portable media on which the large stores of ESI were received. This is a very common practice. Many attorneys think this is a good practice because that way they have a backup copy, just in case. (The backup should be kept by the client's IT department as part of the collection and forwarding, not by the law firm.) I used to do this kind of thing for years, until one day I realized how it was all piling up. I realized the risk from holding thousands of PST files and other raw unprocessed client data collections. I was literally holding billions of emails in little storage devices in my office or in subdirectories of one of my office computers. Trillions more were on our firm's litigation support computers, which brings us to the third, worst-case scenario, where the data is not forwarded to a vendor.

3.  In this third alternative, which is the most common practice in law firms today, and the most dangerous, the law firm keeps the data. All it does is transfer the data from the receiving attorney (or secretary) to another department in the law firm, typically called Litigation Support. The Litigation Support Department, or whatever name the law firm may choose to call it, then holds the billions of computer files, contents unknown, on law firm computers and in storage closets, hopefully locked. Copies are placed on law firm servers, so that some attorneys and paralegals in the firm can search them for evidence. Then they often multiply by backups and downloads. They stay in the firm's IT systems until the case is over.

Rows of server cages

At that time, in theory at least, they are either returned to the client or destroyed. But in truth this often never happens, and raw data tends to live on and on in law firm computers, backup tapes, personal hard drives, DVDs, etc. Some people call that dark data. Most large law firms have tons of client dark data like that. It is a huge hidden liability. Dark or not, it is subject to subpoena. Law firms can be forced to search and produce from these stores of client data. I know of one firm forced to spend over a million dollars to review such data for privilege before production to the government. The client was insolvent and could not pay, but still the firm had to spend the money to protect the privileged communications.

Dangers of Data Intrusions at Law Firms

These practices are unwise and pose a serious risk to client data security, a risk that grows bigger each day. The amount of data in the world doubles every two years, so this problem is getting worse as the amount of data held for litigation grows at an exponential rate. The sophistication of data thieves is also growing. The firewall that law firms think protects their clients' data is child's play to some hackers. The security is an illusion. It is only a matter of time before disaster strikes and a large store of client data is stolen. The damages from even an average sized data breach can be extensive, as the chart below shows.

Chart: the costs of a data breach

Client data is usually held by law firms on their servers so that their attorneys can search and review the data as part of e-discovery. As IT security experts know, servers are the ultimate target at the end of a lateral kill chain that advanced persistent threat (APT)-type attackers pursue. Moreover, servers are the coveted prize of bot herders seeking persistent access to high-capacity computing. Application control and comprehensive vulnerability management are essential to breaking the lateral kill chain of attackers. You do not follow all of this? Never seen a presentation titled Keeping Bot Herders Off Your Servers and Breaking the Lateral Kill Chain of Today’s Attackers? Of course not. I do not really understand this either. IT security has become a very specialized and complex field. That is one of my key points here.

Law firms are the soft underbelly of corporate data security. More and more bad hackers are realizing the vulnerability of law firms and beginning to exploit it. So many lawyers are technically naive. They do not yet see the danger of hacking, nor the severity and complexity of issues surrounding data security.

Sharon Nelson, President of the Virginia State Bar and well known expert in this area, has been warning about this threat to law firms for years. In 2012 her warnings were echoed by the FBI. FBI Again Warns Law Firms About the Threat From Hackers. Mary Galligan, the special agent in charge of cyber and special operations for the FBI's New York Office, is reported by Law Technology News as saying: "We have hundreds of law firms that we see increasingly being targeted by hackers." Bloomberg's Business Week quoted Galligan as saying: "As financial institutions in New York City and the world become stronger, a hacker can hit a law firm and it's a much, much easier quarry." China-Based Hackers Target Law Firms to Get Secret Deal Data (Bloomberg 1/31/12).

If lawyers are in a big firm, their client's data may already have been hacked and they may never have been told about it. According to Sharon Nelson's report on a survey done in 2013, 70% of large firm lawyers do not know if their firm has ever been breached. The same survey reported that 15% of the law firms have experienced a security breach. That's right. Fifteen percent of the law firms surveyed admitted to having discovered a computer security intrusion of some kind.

Sharon said that the survey confirmed what her company Sensei Enterprises already knew from decades of experience with lawyers and data security. She reports that most law firms never tell their attorneys when there has been a breach. Your law firm may already have been hacked multiple times. You just do not know about it. Sharon, never an attorney to mince words, went on to say in her excellent blog, Ride the Lightning:

We often hear “we have no proof that anything was done with client data” in spite of the fact that the intruders had full access to their network. Our encounters with these breaches indicate that if law firms can keep the breach quiet, they will.

They will spend the money to investigate and remediate the breach, but they will fail to notify clients under state data breach laws and they won’t tell their own lawyers for fear the data breach will become public. Is that unethical? Probably. Unlawful? Probably. But until there is a national data breach law with teeth, that approach to data breaches is unlikely to change.

Someday a breach will go public. A big data breach and loss by just one law firm could quickly make the whole profession as conservative as me when it comes to big data and confidentiality. All it would take is public disclosure of one large data breach of one large law firm, especially if the ESI lost or stolen included protected information requiring widespread remedial action. Then everyone will outsource hosting to specialists.

What if a law firm happened to have credit card information and it was stolen? Or worse yet, what if client data was lost when a lawyer misplaced his briefcase with a portable hard drive? This would be a nightmare for any law firm, even if it did not get publicized. Why take that risk? That is my view. I am sounding the alarm now on big data security so that the profession can change voluntarily, without the motivation of crisis.

Outsource To Trusted Professionals

I have never seen a law firm with even close to the same kind of data security protocols that I have seen with the top e-discovery vendors. Law firms do not have 24/7 human in-person monitoring of all computer systems. They do not have dozens of video cameras recording all spaces where data is maintained. They do not have multiple layers of secured clean rooms, with iris scans and fingerprint scans, and other super high-tech security systems. You have seen this kind of thing in movies, I'm sure, but not in your law firm.

Some vendors have systems like that. I know. I have seen them. As part of my due diligence for my firm’s selection of Kroll Ontrack, I visited their secure data rooms (well, some of them; others I was not allowed in). These were very cold, very clean, very secure rooms where the client data is stored. I am not even permitted to disclose the general location of these secure rooms. They are very paranoid about the whole thing. I like that. So do our clients. This kind of data security does not come cheap, but it is money well spent. The cheapest vendor is often a false bargain.

Have you seen your vendor's secure rooms? Does your law firm have anything like that? How many technical experts in data security does your firm employ? Again, I am not referring to legal experts, but to computer engineers who specialize in hacker defenses, the ones who know about the latest intrusion detection systems, viruses, bot herders, and breaking a lateral kill chain of attackers. Protecting client data is a serious business and should be treated seriously.

Any data hosting company that you choose should at least have independent certifications of security and other best practices based on audits. The ones I know about are the ISO/IEC 27000 series and the SSAE 16 SOC 2 certification. Is your law firm so certified? Your preferred vendor?

The key question here in choosing vendors is: do you know where your client's data is? "In the cloud somewhere within your vendor's control" is not an acceptable answer, at least not for anyone who takes data security seriously. As a best practice you should know, and you should have multiple assurances, including third party certifications and input from security experts. In large matters, or when selecting a preferred vendor, you should also make a personal inspection, and you should verify adequate insurance coverage. You want to see cyber liability insurance. Remember, even the NSA screws up from time to time. Are you covered if this happens?

Client data security should be job number one for all e-discovery lawyers. I know it is for me, which is why I take this conservative hands-off position.

Most Law Firms Do a Poor Job of Protecting Client Data

From what I have seen, very few law firms have highly secure client data hosting sites. Most do not even have reliable, detailed data accounting for checking in and out client data. The few that do rarely enforce it. They rarely (never?) audit attorneys and inspect their offices and equipment to verify that they do not have copies of client data on their hard drives and DVDs, etc. In most law firms a person posing as a janitor could rummage through any office undisturbed, and probably even gain access to confidential computers. Have you ever seen all the sticky notes with passwords around the monitors of many (most?) attorneys?

Attorneys and law firms can and should be trusted to handle evidence, even when that may sometimes include hundreds of thousands of electronic and paper files. But they should not be over-burdened with the duty to also host large volumes of raw unprocessed data. Most are simply not up to the task. That is not their skill set. It is not part of the legal profession. It is not a legal service at all. Secure data hosting is a highly specialized computer engineering service, one that requires an enormous capital investment and constant diligence to do correctly. I do not think law firms have made that kind of investment, nor do I think they should. Again, it is beyond our core competence. We provide legal services, not data hosting.

Even data hosting by the best professionals is not without its risks. Just ask the NSA about the risks of rogue employees like Snowden. Are law firms equipped to mitigate these risks? Are they even adequately insured to deal with losses if client data is lost or stolen? I doubt it, and yet only a few more sophisticated clients even think to ask.

Is your law firm ready? Why even put yourself in that kind of risky position? Do you really make that much money in e-discovery charges to your clients? Is that profit worth the risk?

Ethical Considerations

This issue also has ethical implications. We are talking about protecting the confidentiality of client data. When it comes to issues like this I think the best practice is to take a very conservative view. The governing ethical rule for lawyers is Rule 1.6 of the ABA Model Rules of Professional Conduct. Subsection (c) of this rule applies here:

(c)  A lawyer shall make reasonable efforts to prevent the inadvertent or unauthorized disclosure of, or unauthorized access to, information relating to the representation of a client.

Again we are faced with the difference between reasonable efforts and best practices. The ABA and most lawyers agree that Rule 1.6 allows a law firm to take possession of the raw, unreviewed client data, no matter what the size, so long as certain minimum "reasonable efforts" are made to safeguard the data. I do not disagree with this. I am certainly not attempting to create a new, higher standard for professional malpractice. It is not negligent for a law firm to possess large stores of unreviewed client data, although it could be, if rudimentary safeguards were not in place. My position is that it is no longer a best practice to do so. The best practice is now to outsource to reliable professionals who specialize in this sort of thing.

Conclusion

Law firms are in the business of providing legal services, not data hosting. They need to handle and use evidence, not raw data. Lawyers and law firms are not equipped to maintain and inventory terabytes of unknown client data. Some firms have petabytes of client data and seem to be very pleased with themselves about it. They brag about it. They seem oblivious of the risks. Or, at the very least, they are overconfident. That is something that bad hackers look for. Take a conservative view like I do and outsource this complex task. That is the best practice in e-discovery for handling large stores of unreviewed client data.

I sleep well at night knowing that if Anonymous or some other hacker group attacks my firm, and penetrates our high security, as they often do with even the best defenses of military security systems, they will not get a treasure trove of client data.

This does not mean law firms should be lax in handling their own data and communications. They must be hyper-vigilant in this too. Security and hacker defense is everyone's concern. Law firms should focus on defense of their own information. Firms should not compound their problems by vastly increasing the size and value of their targets. Law firms are the soft underbelly of corporate data security because of the information of their corporate clients that most of them hold.

Although some hackers may be hired by litigants for purposes of illegal discovery of privileged communications and work product, most are not. They are after money and valuable trade secrets. The corporate stashes are the real target. If these potential treasure troves of data must leave a corporation’s possession, be sure they are in the hands of professional big data security experts. Do not set yourself up to be the next hacker victim.



IT-Lex Discovers a Previously Unknown Predictive Coding Case: “FHFA v. JP Morgan, et al”

March 2, 2014

The researchers at IT-Lex have uncovered a previously unknown predictive coding case out of the SDNY, Federal Housing Finance Agency v. JP Morgan Chase & Co., Inc. et al. The et al here includes just about every other major bank in the world, each represented by one of the top 25 mega law firms in the country. The interesting orders approving predictive coding were entered in 2012, yet, until now, no one has ever talked about FHFA v. JP Morgan. That is amazing considering the many players involved.

The two main orders in the case pertaining to predictive coding are here (order dated July 24, 2012), and here (order dated July 31, 2012). I have highlighted the main passages in these long transcripts. These are ore tenus orders, but orders nonetheless. The Pacer file is huge, so IT-Lex may have missed others, but we doubt it. The two key memoranda underlying the orders are by the defendant JP Morgan's attorneys, Sullivan & Cromwell, dated July 20, 2012, and by the plaintiff FHFA's lawyers, Quinn Emanuel Urquhart & Sullivan, dated July 23, 2012.

The fact that these are ore tenus rulings on predictive coding explains how they have remained under the radar for so long. The orders show the mastery, finesse, and wisdom of the presiding District Court Judge Denise Cote. She was hearing her first predictive coding issue and handled it beautifully. Unfortunately, judging from the transcripts, the trial lawyers arguing pro and con did not hold up as well. Still, they appear to have been supported by good e-discovery lawyer experts behind the scenes. It all seems to have turned out relatively well in the end, as a recent Order dated February 14, 2014 suggests. Predictive coding was approved, and court ordered cooperation resulted in a predictive coding project that appears to have gone pretty well.

Defense Wanted To Use Predictive Coding

The case starts with the defense, primarily JP Morgan, wanting to use predictive coding and the plaintiff, FHFA, objecting. The FHFA wanted the defendant banks to review everything. Good old tried and true linear review. The plaintiff also had fallback objections to the way the defense proposed to conduct the predictive coding.

The letter memorandum by Sullivan & Cromwell for JP Morgan is only three pages in length, but has 63 pages of exhibits attached. The letter relies heavily on the then-new Da Silva Moore opinion by Judge Peck. The exhibits include the now famous 2011 Grossman and Cormack law review article on TAR, a letter from plaintiff's counsel objecting to predictive coding, and a proposed stipulation and order. Here are key segments of Sullivan & Cromwell's arguments:

According to Plaintiff, it will not agree to JPMC's use of any Predictive Coding unless JPMC agrees to manually review each and every one of the millions of documents that JPMC anticipates collecting. As Plaintiff stated: "FHFA's position is straightforward. In reviewing the documents identified by the agreed-upon search terms, the JPM Defendants should not deem a document nonresponsive unless that document has been reviewed by an attorney."

Plaintiff's stated position, and its focus on "non-responsive" documents, necessitates this request for prompt judicial guidance. Predictive Coding has been recognized widely as a useful, efficient and reliable tool precisely because it can help determine whether there is some subset of documents that need not be manually reviewed, without sacrificing the benefit, if any, gained from manual review. Predictive Coding can also aid in the prioritization of documents that are most likely to be responsive. As a leading judicial opinion as well as commentators have warned, the assumption that manual review of every document is superior to Predictive Coding is "a myth" because "statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review." Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. LEXIS 23350, at *28 (S.D.N.Y. Feb. 24, 2012) (Peck, Mag. J.) …

JPMC respectfully submits that this is an ideal case for Predictive Coding or "machine learning" to be deployed in aid of a massive, expedited document production. Plaintiff's claims in this case against JPMC concern more than 100 distinct securitizations, issued over a several year period by three institutions that were entirely separate until the end of that period, in 2008 (i.e., JPMorgan Chase, Bear Stearns & Co., and Washington Mutual). JPMC conservatively predicts that it will have to review over 2.5 million documents collected from over 100 individual custodians. Plaintiff has called upon JPMC to add large numbers of custodians, expand date ranges, and otherwise augment this population, which could only expand the time and expense required. Computer assisted review has been approved for use on comparable volumes of material. See, e.g., Da Silva Moore, 2012 U.S. Dist. LEXIS 23350, at *40 (noting that the manual review of 3 million emails is "simply too expensive.").

Plaintiff’s Objections

The plaintiff federal government agency, FHFA, filed its own three-page response letter with 11 pages of exhibits. The response objects to the use of predictive coding and to the defendant's proposed methodology. Here is the core of their argument:

First, JPMC's proposal is the worst of both worlds, in that the set of documents to which predictive coding is to be applied is already narrowed through the use of search terms designed to collect relevant documents, and predictive coding would further narrow that set of documents without attorney review, thereby eliminating potentially responsive documents. …

Finally, because training a predictive coding program takes a considerable amount of time, the truncated timeframe for production of documents actually renders these Actions far from "ideal" for the use of predictive coding.

The first objection, on keyword search screening, is good, but the second, that training would take too long, shows that the FHFA needed better experts. The machine learning training time is usually far less than the document review time, especially in a case like this, and the overall time savings from using predictive coding are dramatic. So the second objection was a real dog.

Still, FHFA made one more objection to method that was well placed, namely that there had been virtually no disclosure as to how Sullivan & Cromwell intended to conduct the process. (My guess is, they had not really worked that all out yet. This was all new then, remember.)

[I]t has similarly failed to provide this Court with any details explaining (i) how it intends to use predictive coding, (ii) the methodology or computer program that will be used to determine responsiveness, or (iii) any safeguards that will ensure that responsive documents are not excluded by the computer model. Without such details, neither FHFA nor this Court can meaningfully assess JPMC's proposal. See Da Silva Moore v. Publicis Groupe SA, 2012 U.S. Dist. LEXIS 23350, at *23 (S.D.N.Y. Feb. 24, 2012) ("[Defendant's] transparency in its proposed ESI search protocol made it easier for the Court to approve the use of predictive coding."). JPMC's proposed order sets forth an amorphous proposal that lacks any details. In the absence of such information, this Court's authorization of JPMC's use of predictive coding would effectively give JPMC carte blanche to implement predictive coding as it sees fit.

Hearing of July 24, 2012

Judge Denise Cote came into the hearing having read the briefs and Judge Peck's then-recent landmark ruling in Da Silva Moore. It was obvious from her initial comments that her mind was made up that predictive coding should be used. She understood that this mega-size case needed predictive coding to meet the time deadlines and not waste a fortune on e-document review. Here are Judge Cote's words at pages 8-9 of the transcript:

It seems to me that predictive coding should be given careful consideration in a case like this, and I am absolutely happy to endorse the use of predictive coding and to require that it be used as part of the discovery tools available to the parties. But it seems to me that the reliability and utility of predictive coding depends upon the process that takes place in the initial phases in which there is a pool of materials identified to run tests against, and I think that some of the documents refer to this as the seed — S-E-E-D — set of documents, and then there are various rounds of further testing to make sure that the code becomes smart with respect to the issues in this case and is sufficiently focused on what needs to be defined as a responsive document. And for this entire process to work, I think it needs transparency and cooperation of counsel.

I think ultimately the use of predictive coding is a benefit to both the plaintiff and the defendants in this case. I think there’s every reason to believe that, if it’s done correctly, it may be more reliable — not just as reliable but more reliable than manual review, and certainly more cost effective — cost effective for the plaintiff and the defendants.

To plaintiff's counsel's credit, she quickly shifted her arguments from whether to how. Defense counsel also falls all over herself about how cooperative she has been and will continue to be, all the while implying that the other side is a closet non-cooperator.

As it turns out, very little actual conversation had occurred between the two lead counsel before the hearing, as both had preferred snarly emails and paper letters. At the hearing Judge Cote ordered the attorneys to talk first, rather than shoot off more letters, and to call her if they could not agree.

I strongly suggest you read the whole transcript of the first hearing to see the effect a strong judge can have on trial lawyers. Page 24 is especially instructive as to just how active a bench can be. For the second hearing, of July 31, 2012, I suggest you read the transcript at pages 110-111 to get an idea as to just how difficult those attorney meetings proved to be.

As a person obsessed with predictive coding I find the transcripts of the two hearings to be kind of funny in a perverse sort of way. The best way for me to share my insights is by using the format of a lawyer joke.

Two Lawyers Walked Into A Bar

One e-discovery lawyer walks into a Bar and nothing much happens. Two e-discovery lawyers walk into a Bar and an interesting discussion ensues about predictive coding. One trial lawyer walks into a Bar and the volume of the whole place increases. Two trial lawyers walk into a Bar and an argument starts.

The 37 lawyers who filed appearances in the FHFA case walk into a Bar and all hell breaks loose. There are arguments everywhere. Memos are written, motions are filed, and the big bank clients are billed a million or more just talking about predictive coding.

Then United States District Court Judge Denise Cote walks into the Bar. All the trial lawyers immediately shut up, stand up, and start acting real agreeable, nice, and polite. Judge Cote says she has read all of the letters and they should all talk less, and listen more to the two e-discovery specialists still sitting in the bar bemused. Everything becomes a cooperative love-fest thereafter, at least as far as predictive coding and Judge Cote are concerned. The trial lawyers move on to fight and bill about other issues more within their ken.

Substantive Disputes in FHFA v. JP Morgan

The biggest substantive issues in the first hearing of July 24, 2012 had to do with disclosure and keyword filtering before machine training. Judge Cote was prepared on the disclosure issue from having read the Da Silva Moore protocol, and so were the lawyers. The judge easily pressured defense counsel to disclose both relevant and irrelevant training documents to plaintiff's counsel, with the exception of privileged documents.

As to the second issue of keyword filtering, the defense lawyers had been told by the experts behind the scenes that JP Morgan should be allowed to keyword filter the custodians' ESI before running predictive coding. Judge Peck had not addressed that issue in Da Silva Moore, since the defense had not asked for that, so Judge Cote was not prepared to rule on that then-new and esoteric issue. The trial lawyers were not able to articulate much on the issue either.

Judge Cote asked trial counsel if they had previously discussed this issue, not just traded memos, and they admitted no. So she ordered them to talk about it. It is amazing how much easier it is to cooperate and reach agreement when you actually speak, and have experts with you guiding the process. So Judge Cote ordered them to discuss the issue, and, as it turns out from the second order of July 31, 2012, they reached agreement. There would be no keyword filtering.

Although we do not know all of the issues discussed by the attorneys, we do know they managed to reach agreement, and we know from the first hearing what a few of the issues were. They were outlined by plaintiff's counsel, who complained at page 19 of the first hearing transcript that they had no idea how defense counsel was going to handle the following issues:

What is the methodology for creating the seed set? How will that seed set be pulled together? What will be the number of documents in the seed set? Who will conduct the review of the seed set documents? Will it be senior attorneys or will it be junior attorneys? Whether the relevant determination is a binary determination, a yes or no for relevance, or if there’s a relevance score or scale in terms of 1 to 100. And the number of rounds, as your Honor noted, in terms of determining whether the system is well trained and stable.
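
For those who have not lived through one of these negotiations, those questions map onto a fairly standard training loop. Here is a minimal sketch in Python with scikit-learn: a small seed set coded by attorneys, graded relevance scores rather than a bare yes or no, and repeated training rounds in which the least certain documents go back to the reviewers. Everything in it (documents, labels, the number of rounds) is invented for illustration; the actual FHFA protocol is not public at this level of detail.

```python
# A minimal sketch of a predictive coding training loop: a seed set coded by
# attorneys, relevance expressed as a 0.0-1.0 score, and repeated rounds of
# review until the budgeted rounds are used up. All data here is invented.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

collection = [
    "loan files misstated borrower income in the securitization",   # relevant
    "due diligence memo flagged defective underwriting standards",  # relevant
    "prospectus supplement described the loan pools to investors",  # relevant
    "cafeteria will be closed friday for maintenance",              # not relevant
    "please update your parking pass before the end of the month",  # not relevant
    "golf outing sign-up sheet attached",                           # not relevant
]
true_labels = [1, 1, 1, 0, 0, 0]   # stands in for attorney judgments

X = TfidfVectorizer().fit_transform(collection)
reviewed = {0, 3}                  # the initial seed set: one relevant, one not

for round_no in range(1, 4):       # a fixed number of training rounds
    model = LogisticRegression()
    model.fit(X[sorted(reviewed)], [true_labels[i] for i in sorted(reviewed)])
    scores = model.predict_proba(X)[:, 1]   # graded relevance, 0.0 to 1.0
    unreviewed = [i for i in range(len(collection)) if i not in reviewed]
    if not unreviewed:
        break
    # send the least certain document back to the attorneys for the next round
    next_doc = min(unreviewed, key=lambda i: abs(scores[i] - 0.5))
    reviewed.add(next_doc)
    print(f"round {round_no}: relevance scores {np.round(scores, 2)}")
```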

So it seems likely all these issues and more were later discussed and accommodations reached. At the second hearing of July 31, 2012, we get a pretty good idea as to how difficult the attorneys' meetings must have been. At pages 110-111 of the second hearing transcript we see how counsel for JP Morgan depicted these meetings and the quality of input received from plaintiff's counsel and experts:

We meet every day with the plaintiff to have a status report, get input, and do the best we can to integrate that input. It isn’t always easy, not just to carry out those functions but to work with the plaintiff.

The suggestions we have had so far have been unworkable and by and large would have swamped the project from the outset and each day that a new suggestion gets made. But we do our best to explain that and keep moving forward.

Defense counsel then goes into what most lawyers would call “suck-up” mode to the judge and says:

We very much appreciate that your Honor has offered to make herself available, and we would not be surprised if we need to come to you with a dispute that hasn’t been resolved by moving forward or that seems sufficiently serious to put the project at risk. But that has not happened yet and we hope it will not.

After that, plaintiff's counsel complains that defense counsel has not agreed to allow deposition transcripts and witness statements to be used as training documents. That's right. The plaintiff wanted to include congressional testimony, depositions, and other witness statements that they found favorable to their position as part of the training documents used to find relevant information in the custodians' stores of information.

Judge Cote was not about to be tricked into making a ruling on the spot, but instead wisely told them to go back, talk some more, and get real expert input on the advisability of this approach. She is a very quick study, as the following exchange with defense counsel at page 114 of the transcript, after hearing the argument of plaintiff's counsel, illustrates:

THE COURT: Good. We will put those over for another day. I’m learning about predictive coding as we go. But a layperson’s expectation, which may be very wrong, would be that you should train your algorithm from the kinds of relevant documents that you might actually uncover in a search. Maybe that’s wrong and you will all educate me at some other time. I expect, Ms. Shane, if a deposition was just shot out of this e-discovery search, you would produce it. Am I right?

MS. SHANE: Absolutely, your Honor. But your instinct that what they are trying to train the system with are the kinds of documents that would be found within the custodian files as opposed to a batch of alien documents that will only confuse the computer is exactly right.

It is indeed a very interesting issue, but we cannot find a report in the case on PACER that shows how the issue was resolved. I suspect the transcripts were all excluded, unless they were within a custodian's account.

2014 Valentine's Day Hearing

The only other order we found in the case mentioning predictive coding is here (dated February 14, 2014). Most of the Valentine's Day transcript pertains to the trial lawyers arguing about perjury and complaining that some key documents were missed in JP Morgan's predictive coding production. But the fault appears to lie in the failure to include a particular custodian in the search, an easy mistake to make. That has nothing to do with whether the predictive coding itself succeeded.

Judge Cote handled that well, stating that no review is "perfect" and she was not about to have a redo at this late date. Her explanation at pages 5-6 of the February 14, 2014 transcript provides a good wrap-up for FHFA v. JP Morgan:

Parties in litigation are required to be diligent and to act in good faith in producing documents in discovery. The production of documents in litigation such as this is a herculean undertaking, requiring an army of personnel and the production of an extraordinary volume of documents. Clients pay counsel vast sums of money in the course of this undertaking, both to produce documents and to review documents received from others. Despite the commitment of these resources, no one could or should expect perfection from this process. All that can be legitimately expected is a good faith, diligent commitment to produce all responsive documents uncovered when following the protocols to which the parties have agreed, or which a court has ordered.

Indeed, at the earliest stages of this discovery process, JP Morgan Chase was permitted, over the objection of FHFA, to produce its documents through the use of predictive coding. The literature that the Court reviewed at that time indicated that predictive coding had a better track record in the production of responsive documents than human review, but that both processes fell well short of identifying for production all of the documents the parties in litigation might wish to see.

Conclusion

There are many unpublished decisions out there approving and discussing predictive coding. I know of several more. Many of them, especially the ones that came out first and pretty much blindly followed our work in Da Silva Moore, call for complete transparency, including disclosure of irrelevant documents used in training. That is what happened in FHFA v. JP Morgan, and the world did not come to an end. Indeed, the process seemed to go pretty well, even with a plaintiff's counsel who, in the words of Sullivan & Cromwell, made suggestions every day that were unworkable and by and large would have swamped the project … but we do our best to explain that and keep moving forward. Pages 110-111 of the second hearing transcript. So it seems cooperation can happen, even when one side is clueless, and even if full disclosure has been ordered.

Since the days of 2011 and 2012, when our Da Silva Moore protocol was developed, we have had much more experience with predictive coding. We have more information on how the training actually functions with a variety of chaotic email datasets, including the new Oracle ESI collection, and even more testing with the Enron dataset.

Based on what we know now, I do not think it is necessary to disclose all irrelevant documents used in training. The only documents that have a significant impact on machine learning are the borderline, grey area documents. These are the ones whose relevancy is a close call, often a matter of opinion and of how you view the case. Only these grey area irrelevant documents need to be disclosed to protect the integrity of the process.
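To make the borderline idea concrete, here is a minimal sketch, assuming a Python review pipeline built on scikit-learn, of how grey area documents might be flagged by a predicted relevance probability near 0.5. The toy documents, the 0.4 to 0.6 band, and the flag_grey_area helper are my own illustrative assumptions, not any vendor's actual method.

```python
# A minimal sketch (not any vendor's actual method) of flagging "grey area"
# documents: those whose predicted probability of relevance sits near 0.5.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training set coded by the SME: 1 = relevant, 0 = irrelevant.
train_docs = [
    "mortgage securitization terms and loan quality",
    "underwriting guidelines for subprime loans",
    "office holiday party schedule",
    "cafeteria menu for next week",
]
train_labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)
model = LogisticRegression().fit(X_train, train_labels)

def flag_grey_area(docs, low=0.4, high=0.6):
    """Return documents whose relevance probability falls in the grey zone."""
    probs = model.predict_proba(vectorizer.transform(docs))[:, 1]
    return [(doc, round(p, 2)) for doc, p in zip(docs, probs) if low <= p <= high]

# Unreviewed documents; only the borderline ones would be candidates for
# disclosure under the approach discussed above.
print(flag_grey_area([
    "loan quality discussed at the holiday party",
    "weekly cafeteria menu",
]))
```

The point of the sketch is only that "grey area" has a simple operational meaning in a probabilistic ranking tool; everything scoring clearly relevant or clearly irrelevant falls outside the disclosure band.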


The science and other data behind that have to do with Jaccard Index classification inconsistencies, as well as the importance of mid-range ranked documents to most predictive coding algorithmic analysis. See, e.g., Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part Three at the subheadings Disclosure of Irrelevant Training Documents and Conclusions Regarding Inconsistent Reviews. When you limit disclosure to grey area training documents and relevant documents, the process can become even more efficient without any compromise in quality or integrity. This of course assumes honest evaluations of grey area documents and forthright communications between counsel. But then so does all discovery in our system of justice. So this is really nothing new, nor out of the ordinary.

All discovery depends on the integrity and trustworthiness of the attorneys for the parties. Fortunately, almost all attorneys honorably fulfill these duties, except perhaps for the duty of technology competence. That is the greatest ethical challenge of the day for all litigators.


Beware of the TAR Pits! – Part Two

February 23, 2014

This is the conclusion of a two part blog. For this to make sense please read Part One first.

Quality of Subject Matter Experts

The quality of Subject Matter Experts in a TAR project is another key factor in predictive coding. It is one that many would prefer to sweep under the rug. Vendors especially do not like to talk about this (and they sponsor most panel discussions) because it is beyond their control. SMEs come from law firms. Law firms hire vendors. What dog will bite the hand that feeds him? Yet, we all know full well that not all subject matter experts are alike. Some are better than others. Some are far more experienced and knowledgeable than others. Some know exactly what documents they need at trial to win a case. They know what they are looking for. Some do not. Some have done trials, lots of them. Some do not know where the court house is. Some have done many large search projects, first paper, now digital. Some are great lawyers; and some, well, you’d be better off with my dog.

The SMEs are the navigators. They tell the drivers where to go. They make the final decisions on what is relevant and what is not. They determine what is hot, and what is not. They determine what is marginally relevant, what is grey area, and what is not. They determine what is just unimportant, more of the same. They know full well that some relevant is irrelevant. They have heard and understand the frequent mantra at trials: Objection, Cumulative. Rule 403 of the Federal Rules of Evidence. Also see The Fourth Secret of Search: Relevant Is Irrelevant found in Secrets of Search – Part III.

Quality of SMEs is important because the quality of input in active machine learning is important. A fundamental law of predictive coding as we now know it is GIGO: garbage in, garbage out. Your active machine learning depends on correct instruction. Although good software can mitigate this problem somewhat, it can never be eliminated. See: Webber & Pickens, Assessor Disagreement and Text Classifier Accuracy, SIGIR 2013 (24% more ranking depth needed to reach equivalent recall when not using SMEs, even in a small data search of news articles with rather simple issues).
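As a back-of-the-envelope illustration of the GIGO point (my own toy example, not drawn from the Webber & Pickens study), the sketch below flips a growing fraction of training labels in a synthetic dataset and prints how test accuracy typically degrades. All of the data and parameters are invented for the demonstration.

```python
# A toy illustration of GIGO: flipping a fraction of the training labels
# ("garbage in") degrades the accuracy of the resulting classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=50, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
for noise in [0.0, 0.1, 0.3]:  # fraction of training labels coded "wrong"
    y_noisy = y_train.copy()
    flip = rng.random(len(y_noisy)) < noise
    y_noisy[flip] = 1 - y_noisy[flip]  # simulate inconsistent or wrong coding
    model = LogisticRegression(max_iter=1000).fit(X_train, y_noisy)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"label noise {noise:.0%}: test accuracy {acc:.2f}")
```

Good software can soften the slope of that decline, but it cannot repeal it, which is the point of the paragraph above.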

Information scientists like Jeremy Pickens are, however, working hard on ways to minimize the effect of SME document-classification errors on overall corpus rankings. Good thing too, because even one good SME will not be consistent in ranking the same documents. That is what the Jaccard Index scientists like to measure. Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part Two, and search for Jaccard on my blog.

In my Enron experiments I was inconsistent in determining the relevance of the same document 23% of the time. That’s right, I contradicted myself on relevancy 23% of the time. (If you include the irrelevancy coding, the inconsistencies were only 2%.) Lest you think I’m a complete idiot (which, by the way, I sometimes am), the 23% rate is actually the best on record for an experiment. It is the best ever measured, by far. Other experimentally measured rates have inconsistencies of 50% to 90% (with multiple reviewers). Pathetic, huh? Now you know why AI is so promising and why it is so important to enhance our human intelligence with artificial intelligence. When it comes to consistency of document identifications in large scale data reviews, we are all idiots!
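For readers who want to see the arithmetic behind these consistency measures, here is a minimal sketch of how a Jaccard Index of agreement between two coding passes is computed: the intersection of the two sets of documents coded relevant, divided by their union. The document IDs are hypothetical and are not drawn from the Enron experiments.

```python
# Jaccard Index between two relevance-coding passes over the same collection:
# size of the intersection of the "relevant" sets divided by the size of their
# union. Identical coding scores 1.0; no overlap at all scores 0.0.
def jaccard(coded_relevant_a, coded_relevant_b):
    a, b = set(coded_relevant_a), set(coded_relevant_b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Hypothetical document IDs coded relevant in a first and a second review pass.
first_pass = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
second_pass = {"DOC-002", "DOC-003", "DOC-004", "DOC-005"}

print(f"Jaccard overlap: {jaccard(first_pass, second_pass):.2f}")  # prints 0.60
```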

With these human frailty facts in mind, not only variable quality in subject matter expertise but also human inconsistency, it is obvious why scientists like Pickens and Webber are looking for techniques to minimize the impact of errors and, get this, even use these inevitable errors to improve search. Jeremy Pickens and I have been corresponding about this issue at length lately. Here is Jeremy’s later response to this blog: In TAR, Wrong Decisions Can Lead to the Right Documents (A Response to Ralph Losey). Jeremy does at least concede that coding quality is indeed important. He goes on to argue that his study shows that wrong decisions, typically on grey area documents, can indeed be useful.

I do not doubt Dr. Pickens’ findings, but I am skeptical of the search methods and the conclusions derived from them; in other words, of how the training was accomplished, the supervision of the learning. This is what I call here the driver’s role, shown on the triangle as the Power User and Experienced Searcher. In my experience as a driver/SME, much depends on where you are in the training cycle. As the training continues, the algorithms eventually do become able to detect and respond to subtle document distinctions. Yes, it takes a while, and you have to know what and when to train on, which is the driver’s skill (for instance, you never train with giant documents), but it does eventually happen. Thus, while it may not matter if you code grey area documents wrong at first, eventually it will, unless you do not really care about the distinctions. (The TREC overturn documents Jeremy tested on, the ones he called wrong documents, were in fact grey area documents, that is, close questions. Attorneys disagreed on whether they were relevant, which is why they were overturned on appeal.) The lack of precision in training, which is inevitable anyway even when one SME is used, may not matter much in the early stages of training, and may not matter at all when testing simplistic issues using easy databases, such as news articles. In fact, I have used semi-supervised training myself, as Jeremy describes from old experiments in pseudo relevance feedback. I have seen it work, especially in early training.
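For readers unfamiliar with the pseudo relevance feedback idea mentioned above, here is a bare-bones sketch of the classic loop: run an initial query, assume the top-ranked hits are relevant without any human review, and use them to expand the query before re-ranking. The tiny corpus, the single feedback round, and the 0.5 weighting are my own simplifications of the general Rocchio-style approach.

```python
# A bare-bones sketch of pseudo relevance feedback: treat the top-k results of
# an initial query as if they were relevant and nudge the query toward them.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [  # hypothetical mini-collection
    "loan underwriting standards for mortgage pools",
    "mortgage pool loss projections and loan defaults",
    "quarterly cafeteria budget report",
    "employee parking policy update",
]
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)

query_vec = vectorizer.transform(["mortgage loan quality"]).toarray()
scores = cosine_similarity(query_vec, doc_vectors).ravel()

k = 2  # pretend the top-2 hits are relevant, with no human review of them
top_k = np.argsort(scores)[::-1][:k]
centroid = doc_vectors[top_k].toarray().mean(axis=0, keepdims=True)

expanded_query = query_vec + 0.5 * centroid  # simplified Rocchio-style update
new_scores = cosine_similarity(expanded_query, doc_vectors).ravel()
print("initial ranking :", np.argsort(scores)[::-1])
print("expanded ranking:", np.argsort(new_scores)[::-1])
```

This kind of unreviewed feedback can help in early rounds, which is consistent with my experience above; the open question is how far it holds up once the fine distinctions start to matter.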

Still, the fact that some errors do not matter in early training does not mean you should not care about consistency and accuracy of training during the whole ride. In my experience, as training progresses and the machine gets smarter, it does matter. But let’s test that, shall we? All I can do is report on what I see; in other words, anecdotal evidence.

Outside of TREC and science experiments, in the messy real world of legal search, the issues are typically maddeningly difficult. Moreover, the difference in the cost of reviewing hundreds of thousands of irrelevant documents can mean millions of dollars. The fine points of differentiation in matured training are needed for precision in results to reduce the costs of final review. In other words, both precision and recall matter in legal search, and all are governed by the overarching legal principle of proportionality. That is not part of information science, of course, but we lawyers must govern our search efforts by proportionality.

Also see William Webber’s response: Can you train a useful model with incorrect labels? I believe that William’s closing statement may be correct; either that, or software differences explain the results:

It may also be, though this is speculation on my part, that a trainer who is not only a subject-matter expert, but an expert in training itself (an expert CAR driver, to adopt Ralph Losey’s terminology) may be better at selecting training examples; for instance, in recognizing when a document, though responsive (or non-responsive), is not a good training example.

I hope Pickens and Webber get there some day. In truth, I am a big supporter of their efforts and experiments. We need more scientific research. But for now, I still do not believe we can turn lead into gold. It is even worse if you have a bunch of SMEs arguing with each other about where they should be going, about what is relevant and what is not. That is a separate issue they do not address, which points to the downside of all trainers, both amateurs and SMEs alike. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

For additional support on the importance of SMEs, see again Monica’s article, EDI-Oracle Study, where she summarizes Patrick Oot’s conclusion from the study that:

Technology providers using similar underlying technology, but different human resources, performed in both the top-tier and bottom-tier of all categories. Conclusion: Software is only as good as its operators. Human contribution is the most significant element. (emphasis in original)

Also see the recent Xerox blog, Who Prevails in the E-Discovery War of Man vs. Machine? by Gabriela Baron.

Teams that participated in Oracle without a bona fide SME, much less a good driver, were doomed. The software was secondary. How could you possibly replicate the work of the original SME trial lawyers who did the first search without having an SME yourself, one with at least a similar level of experience and knowledge?

This means that even with a good driver and good software, if you do not also have a good SME, you can still end up driving in circles. It is even worse when you try to do a project with no SME at all. Remember, the SME in the automobile analogy is the navigation system, or to use the pre-digital reality, the passenger with the map. We have all seen what happens when the navigation system screws up, or the map is wrong, or, more typically, out of date (like many old SMEs). You do not get to the right place. You can have a great driver, and go quite fast, but if you have a poor navigator, you will not like the results.

The Oracle study showed this, but it is hardly new or surprising. In fact, it would be shocking if the contrary were true. How can incorrect information ever create correct information? The best you can hope for is to have enough correct information to smooth out the errors. Put another way, without signal, noise is just noise. Still, Jeremy Pickens claims there is a way. I will be watching and hope he succeeds where the alchemists of old always failed.

Tabula Rasa

There is one way out of the SME frailty conundrum that I have high hopes for and can already understand. It has to do with teaching the machine about relevance for all projects, not just one. The way predictive coding works now, the machine is a tabula rasa, a blank slate. The machine knows nothing to begin with. It only knows what you teach it as the search begins. No matter how good the AI software is at learning, it still does not know anything on its own. It is just good at learning.

That approach is obviously not too bright. Yet, it is all we can manage now in legal search at the beginning of the Second Machine Age. Someday soon it will change. The machine will not have its memory wiped after every project. It will remember. The training from one search project will carry over to the next, similar project. The machine will remember the training of past SMEs.

That is the essential core of my PreSuit proposal: to retain the key components of past SME training so that you do not have to start afresh on each search project. PreSuit: How Corporate Counsel Could Use “Smart Data” to Predict and Prevent Litigation. When that happens (I don’t say if, because this will start happening soon; some say it already has), the machine could start smart.
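As a purely illustrative sketch of the start-smart idea (not the PreSuit system itself, which is only a proposal), a trained relevance model and its vectorizer could be saved at the end of one matter and reloaded as the starting point for the next, assuming the issues and data are similar enough for the old training to transfer. Every name and document below is made up.

```python
# Illustrative only: persist the trained vectorizer and classifier from one
# matter and reload them so the next, similar matter does not start from a
# blank slate.
from joblib import dump, load
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier

# --- End of Project A: save the training ---
docs_a = ["mortgage pool loss estimates", "company picnic signup sheet"]
labels_a = [1, 0]  # hypothetical SME coding: 1 = relevant, 0 = irrelevant
vectorizer = TfidfVectorizer().fit(docs_a)
model = SGDClassifier(random_state=0).fit(vectorizer.transform(docs_a), labels_a)
dump((vectorizer, model), "project_a_relevance_model.joblib")

# --- Start of Project B: reload and keep learning ---
vectorizer, model = load("project_a_relevance_model.joblib")
docs_b = ["mortgage loss projections memo"]  # new SME coding in the new matter
labels_b = [1]
# partial_fit continues training without discarding what Project A taught it.
# (A real system would also need to handle new vocabulary, drift in what counts
# as relevant, and confidentiality of the prior matter's data.)
model.partial_fit(vectorizer.transform(docs_b), labels_b)
```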

That is what we all want. That is the holy grail of AI-enhanced search — a smart machine. (For the ultimate implications of this, see the movie Her, which is about an AI enhanced future that is still quite a few years down the road.) But do not kid yourself, that is not what we have now. Now we only have baby robots, ones that are eager and ready to learn, but do not know anything. It is kind of like 1-Ls in law school, except that when they finish a class they do not retain a thing!

When my PreSuit idea is implemented, the next SME will not have to start afresh. The machine will not be a tabula rasa. It will be able to see litigation brewing. It will help general counsel stop lawsuits before they are filed. The SMEs will then build on the work of prior SMEs, or maybe build on their own previous work in another similar project. Then the GIGO principle will be much easier to mitigate. Then the computer will not be completely dumb; it will have some intelligence from the last guy. There will be some smart data, not just big dumb data. The software will know stuff, know the law and relevance, not just know how to learn stuff.

When that happens, the SME in a particular project will not be as important. But for now, when working from scratch with dumb data, the SME is still critical. The smarter and more consistent the SME, the better. Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

Professor Marchionini, like all other search experts, recognizes the importance of SMEs to successful search. As he puts it:

Thus, experts in a domain have greater facility and experience related to information-seeking factors specific to the domain and are able to execute the subprocesses of information seeking with speed, confidence, and accuracy.

That is one reason the Grossman-Cormack glossary builds the role of SMEs into its base definition of computer assisted review:

A process for Prioritizing or Coding a Collection of electronic Documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of Documents and then extrapolates those judgments to the remaining Document Collection.

Glossary at pg. 21 defining TAR.

Most SMEs Today Hate CARs
(And They Don’t Much Like High-Tech Drivers Either)

This is an inconvenient truth for vendors. Predictive coding is defined by SMEs. Yet vendors cannot make good SMEs step up to the plate and work with the trainers, the drivers, to teach the machine. All the vendors can do is supply the car and maybe help with the driver. The driver and navigator have to be supplied by the law firm or corporate client. There is no shortage of good SMEs, but almost all of them have never even seen a CAR. They do not like them. They can barely even speak the language of the driver. They don’t much like most of the drivers either. They are damn straight not going to spend two weeks of their lives riding around in one of those newfangled horseless carriages.


That is the reality of where we are now. Also see: Does Technology Leap While Law Creeps? by Brian Dalton, Above the Law. Of course this will change with the generations. But for now, that is the way it is. So vendors work on error minimization. They try to minimize the role of SMEs. That is a good idea anyway, because, as mentioned, all human SMEs are inconsistent. I was lucky to be inconsistent only 23% of the time on relevance. But still, there is another obvious solution.

There is another way to deal today with the reluctant-SME problem, a way that works right now with today’s predictive coding software. It is a kind of non-robotic surrogate system that I have developed, and I’m sure several other professional drivers have as well. See my CAR page for more information on this. But in reality it is one of those things I would just have to show you, in a driver education school type setting. I do it frequently. It involves acting on behalf of an SME and dealing with the driver for them. It places the SMEs in their comfort zone, where they just make yes-or-no decisions on the close-question documents, although there is obviously more to it than that. It is not nearly as good as the surrogate system in the movie Her, and of course, I’m no movie star, but it works.


My own legal subject matter expertise is, like most lawyers’, fairly limited. I know a lot about a few things, and am a stand-alone SME in those fields. I know a fair amount about many more legal fields, enough to understand real experts, enough to serve as their surrogate or right hand. Those are the CAR trips I will take.

If I do not know enough about a field of law to understand what the experts are saying, then I cannot serve as a surrogate. I could still drive, of course, but I would refuse to do that out of principle, unless I had a navigator, an SME, who knew what they were doing and where they wanted to go. I would need an SME willing to spend the time in the CAR needed to tell me where to go. I hate a TAR pit as much as the next guy. Plus, at my age and experience, I can drive anywhere I want, in pretty much any CAR I want. That brings us to the final corner of the triangle: the variance in the quality of predictive coding software.

Quality of the CAR Software

I am not going to spend a lot of time on this. No lawyer could be naive enough to think that all of the software is equally good. That is never how it works. It takes time and money to make sophisticated software like this. Anybody can simply bolt open-source machine learning code onto their review platform. That does not take much, but the result is a Model T.


To make active machine learning work really well, to take it to the next level, requires thousands of programming hours. It takes large teams of programmers. It takes years. It takes money. It takes scientists. It takes engineers. It takes legal experts too. It takes many versions and continuous improvements of search and review software. That is how you tell the difference between okay, good, and great software. I am not going to name names, but I will say that Gartner’s so-called Magic Quadrant evaluation of e-discovery software is not too bad. Still, be aware that evaluation of predictive coding is not really their thing, or even a primary factor in rating review software.


It is kind of funny how pretty much everybody wins in the Gartner evaluation. Do you think that’s an accident? I am privately much more critical. Many well known programs are very late to the predictive coding party. They are way behind. Time will tell if they are ever able to catch up.

Still, these things do change from year to year, as new versions of software are continually released. For some companies you can see real improvements, real investments being made. For others, not so much, and what you do see is often just skin deep. Always be skeptical. And remember, the software CAR is only as good as your driver and navigator.


When it comes to software evaluation, what counts is whether the algorithms can find the documents needed or not. Even the best driver-navigator team in the world can only go so far in a clunker. But give them a great CAR, and they will fly. The software will more than pay for itself in saved reviewer time and the added security of a job well done.

Deja Vu All Over Again

Predictive coding is a great leap forward in search technology. In the long term, predictive coding and other AI-based software will have a bigger impact on the legal profession than did the original introduction of computers into the law office. No large changes like this come without problems. When computers were first brought into law offices, they too caused all sorts of problems and had their pitfalls and naysayers. It was a rocky road at first.

Ralph in the late 1980s

I was there and remember it all very well. The Fonz was cool. Disco was still in. I can remember the secretaries yelling many times a day that they needed to reboot. Reboot! Better save. It became a joke, a maddening one. The network was especially problematic. The partner in charge threw up his hands in frustration. The other partners turned the whole project over to me, even though I was a young associate fresh out of law school. They had no choice. I was the only one who could make the damn systems work.

It was a big investment for the firm at the time. Failure was not an option. So I worked late and led my firm’s transition from electric typewriters and carbon paper to personal computers, IBM System 36 minicomputers, word processing, printers, hardwired networks, and incredibly elaborate time and billing software. Remember Manac time and billing in Canada? Remember Displaywriter? How about the eight-inch floppy? It was all new and exciting. Computers in a law office! We were written up in IBM’s small business magazine.

For years I knew what every DOS operating file was on every computer in the firm. The IBM repair man became a good friend. Yes, it was a lot simpler then. An attorney could practice law and run his firm’s IT department at the same time.

Hey, I was the firm’s IT department for the first decade. Computers, especially word processing and time and billing software, eventually made a huge difference in efficiency and productivity. But at first there were many pitfalls. It took us years to create new systems that worked smoothly in law offices. Business methods always lag way behind new technology. This is clearly shown by MIT’s Erik Brynjolfsson and Andrew McAfee in their bestseller, The Second Machine Age. It typically takes a generation to adjust to major technology breakthroughs. Also see Brynjolfsson’s TED Talk on the subject.

I see parallels between the 1980s and now. The main difference is that legal tech pioneers were very isolated then. The world is much more connected now. We can observe together how, like in the eighties, a whole new level of technology is starting to make its way into the law office. AI-enhanced software, starting with legal search and predictive coding, is something new and revolutionary. It is like the first computers and word processing software of the late 1970s and early 80s.

It will not stop there. Predictive coding will soon expand into information governance. This is the PreSuit project idea that I, and others, are starting to talk about. See, e.g., the Information Governance Initiative. Moreover, many think AI software will soon revolutionize legal practice in a number of other ways, including contract generation and other types of repetitive legal work and analysis. See, e.g., Rohit Talwar, Rethinking Law Firm Strategies for an Era of Smart Technology (ABA LPT, 2014). The potential impact of supervised learning and other cognitive analytics tools on all industries is vast. See, e.g., Deloitte’s 2014 paper, Cognitive Analytics (“For the first time in computing history, it’s possible for machines to learn from experience and penetrate the complexity of data to identify associations.”); also see Digital Reasoning software and Paragon Science software. Who knows where it will lead the world, much less the legal profession? Back in the 1980s I could never have imagined the online, Internet-based legal practice that most of us have now.

The only thing we know for sure is that it will not come easy. There will be problems, and the problems will be overcome. It will take creativity and hard work, but it will be done. Easy buttons have always been a myth, especially when dealing with the latest advancements of technology. The benefits are great. The improvements from predictive coding in document review quality and speed are truly astonishing. And it lowers cost too, especially if you avoid the pits. Of course there are issues. Of course there are TAR pits. But they can be avoided and the results are well worth the effort. The truth is we have no choice.

Conclusion


If you want to remain relevant and continue to practice law in the coming decades, then you will have to learn how to use the new AI-enhanced technologies. There is really no choice, other than retirement. Keep up, learn the new ways, or move on. Many lawyers my age are retiring now for just this reason. They have no desire to learn e-discovery, much less predictive coding. That’s fine. That is the honest thing to do. The next generation will learn to do it, just like a few lawyers learned to use computers in the 1980s and 1990s. Stagnation and more of the same is not an option in today’s world. Constant change and education is the new normal. I think that is a good thing. Do you?

Leave a comment. Especially feel free to point out a TAR pit not mentioned here. There are many, I know, and you cannot avoid something you cannot see.

