Keywords and Search Methods Should Be Disclosed, But Not Irrelevant Documents

May 26, 2013

A common question these days among lawyers discussing e-discovery is: What Keywords Did You Use? This is often followed by I’ll show you mine if you show me yours. Often this latter statement is made out of a bona fide spirit of cooperation, typically in cases where:

  1. both sides had too much ESI to search manually;
  2. they culled using simple keyword technology either because that is the only search they knew how to do, or they did not deem more advanced predictive coding technology to be appropriate for that case; and,
  3. the attorneys knew how to cooperate to get discovery done without spending too much money.

In these cases attorneys freely exchange the final keywords they used. No breath is wasted, and no valuable client dollars are spilled, over the question.

Only rarely would attorneys in this symmetrical position want to know not only the keywords finally chosen, but also the keywords, parametrics, Boolean logic, etc., tested and rejected along the way. If they asked for a list of all the keywords ever tested, the proper response is no, or in my case, no such list exists, and I can’t recall, but there were quite a few. They might want to ask whether you tried this, that, or the other keyword. That’s fair, and an expert searcher would probably say, yes, I tried all of those early on, and they were all rejected because (fill in the blank). Alternatively, they might say to one or more of the suggestions: No, I didn’t think of that one, but I’ll check on it later today. It’ll just take a minute to try it out, and then I’ll get back to you on that.

A Keyword Search Hiding Kimono Made of Work Product is a Kimono Made of Whole Cloth

But what about another scenario, the asymmetrical one? You know, where one side has tons of ESI (actually ESI is weightless, but this sounds good), and the other side has virtually none, aside these days from a pesky Facebook page or two. In these cases the kind of cooperation described for symmetrical cases is often lacking. One side, typically the plaintiff, has nothing to disclose. The conversation is more like, you show me yours, but I can’t show you mine because, well, I don’t have one. So sad. All too often this conversation serves as a prelude to a real waste of client money fighting over the question.

The well-endowed defense counsel is often too shy to show theirs, keywords of course. So they hide theirs in a kimono made of a fabric called work product. This legal doctrine is designed to protect an attorney’s mental impressions, conclusions, opinions, or legal theories. It also protects from discovery documents and tangible things that are prepared in anticipation of litigation. Hickman v. Taylor, 329 U.S. 495 (1947); Rule 26(b)(3), FRCP. Therefore, you cannot send an interrogatory asking for the other side’s strategy to win the case; or more correctly stated, you can ask, but the attorney does not have to answer.

Many lawyers have long considered the particular methods they used to find documents responsive to a request for production to be work product. It was, after all, their own thought processes and legal techniques that created the keywords. They object to disclosing the keywords they used. They argue such disclosure would unfairly require them to disclose their theory of the case, their mental impressions of how to find relevant information.

This seems like a stretch to many attorneys, and judges, some of whom have rejected this argument outright. They do not think that any significant attorney ideas are revealed by something as mechanical as keywords, especially the final keywords used to cull a large dataset. They think keywords relate to the underlying facts of what documents are responsive to a document request, not mental impressions. They do not see any work product in keywords.

Still, many lawyers cling to this very broad interpretation of work product, especially in asymmetrical litigation. In those cases defense counsel may respond to the question of what keywords did you use by saying something like: You cannot see my keywords, they are mine, all mine, and mine alone. No one may see my magic words. They are secret. They are protected by privilege. I would rather die than let you peek under my kimono. Well, ok, maybe the last phrase is not uttered too often, but the others essentially are. Passion by lawyers to protect their secrets can run high. This is usually displaced passion. It should be directed to protecting client confidences instead. Lawyers grappling with e-discovery often confuse the attorney-client privilege with the work product privilege.

Attorney-client and work product are two completely different kinds of privilege. The AC privilege is owned by the client, not the lawyer. The lawyer has a strong ethical duty to protect the client’s AC privilege. This is dramatically contrasted with a work product privilege that is owned by the attorney and is given far less protection under the law. There is no ethical duty whatsoever for an attorney to keep his work product secret, except for the duty of competent representation. Often competence requires an attorney not to reveal his or her mental impressions of a case to opposing counsel. They reasonably construe the scope of the privilege and determine that:

  1. disclosure would not be in their client’s best interests, and
  2. the rules do not otherwise require them to share this particular aspect of their mind-set about the case.

But having said all of that, it is important to understand that attorneys often deem it to be in their client’s best interests to share some of their mental impressions of a case. Moreover, like it or not, the rules often require an attorney to make some disclosure of their mind-set, theories, etc., or material prepared for the case. So they do it. They move the case along. They do not get bogged down with a question of open or closed kimono, which is often just an ego-trip where a lawyer is over-valuing their own mental impressions at the client’s expense. Yes, kimono-closing motion play can be a very expensive process.

All actual trial lawyers, and not mere paper-pushers as we used to say, or now, maybe better said, mere electron-pushers, know full well that good lawyers share mental impressions with opposing counsel all of the time. Indeed, is that not what legal briefs are all about?

Some disclosure of work product is required for any attorney to comply with the rules of civil procedure, including the almighty Rule One, and especially the discovery rules. Discovery is built on the premise of cooperation, and that in turn requires some rudimentary sharing of mental impressions, such as what do you think is relevant, what documents do you want us to try to find to respond to this or that category in a Request For Production?

How could you possibly comply with Rule 26(f), for instance, without some select waiver of work product? Remember in subsection (2) it mandates attorneys to discuss the “nature and basis of their claims and defenses and the possibilities for promptly settling or resolving the case” and to develop a joint discovery plan. In subsection (3) the rules require lawyers to talk to each other and “state the parties’ views and proposals” on topics A-F. Subsection (C) of Rule 26(f)(3) in turn requires discussion of the views and proposals concerning “any issues about disclosure or discovery of electronically stored information…” All of these mandated discussions require disclosure of an attorney’s mental impressions, conclusions, opinions, and legal theories.

Trial lawyers have always disclosed some work product to each other to prepare for trial (of which discovery is a part), conduct trials, and settle cases. A lawyer can share his mental impressions with the other side, if he wants, and can do so without fear of opening the door to a complete waiver. Again, this happens all of the time, especially in any settlement discussions, where both sides will try to persuade the other of the strength of their case. They will explain why and how they will win and the other side will lose. The same kind of discussion is inherent in any proportionality issue.

Lawyers usually love to argue about their opinions, so why this recalcitrance about keywords? Could it be because they know or suspect that their keywords suck? Do they fear ridicule and reversal because they just dreamed up keywords without testing? Or worse, did they use bad keywords on purpose to try to hide the truth?


Bottom line: to represent a client’s best interests and comply with the Rules, a lawyer has to share mental impressions to a certain extent. If lawyers refuse to talk to each other, refuse to cooperate, all on some misguided notion that they have a right to remain silent because of the work product doctrine, discovery will never get done. The case will go off track and may never be resolved on the merits. The hide-your-keywords-under-a-kimono doctrine that seems to be in fashion among many e-discovery lawyers these days is misguided at best, and at worst, may be illegal.

Kimono closing lawyers, get over yourselves and how valuable your mental impressions are. Tell the other side what your keywords are. Or are you hiding them because your keywords are so poor? Are you embarrassed by what you have to show? Then get a keyword search expert to help you out. Unlike predictive coding experts, there are plenty of power users and professional keyword searchers around.

Cases Supporting Disclosure of Keywords

Like it or not more and more judges are growing tired of obstructionism and expensive discovery side-shows. They are requiring lawyers to show their keywords, at least the ones used in final culling. They are compelling lawyers to open their kimonos. Consider the ruling in a recent trade-secret theft case in California that cites to the law of several other jurisdictions.

To the extent Plaintiff argues that disclosure of search terms would reveal privileged information, the Court rejects that argument. Such information is not subject to any work product protection because it goes to the underlying facts of what documents are responsive to Defendants’ document request, rather than the thought processes of Plaintiff’s counsel. See Romero v. Allstate Ins. Co., 271 F.R.D. 96, 109-10 (E.D. Pa. 2010) (finding that document production information, including search terms, did not fall under work product protection because such information related to facts) (citing Upjohn Co. v. United States, 449 U.S. 383, 395–96 (1981) (“Protection of the privilege extends only to communications and not to facts. The fact is one thing and a communication concerning that fact is entirely different.”)); see also Doe v. District of Columbia, 230 F.R.D. 47, 55-56 (D.D.C. 2005) (holding that Rule 26(b)(1) of the Federal Rules of Civil Procedure can be read to allow for discovery of document production policies and procedures and such information is not protected under the work product doctrine or attorney-client privilege). Moreover, Defendants’ substantial need for this information is apparent. See In re Enforcement of Subpoena Issued by F.D.I.C., 2011 WL 2559546, at *1 (N.D. Cal. June 28, 2011) (LaPorte, J.) (“Fact work product consists of factual material and is subject to a qualified protection that a showing of substantial need can overcome.”). There is simply no way to determine whether Plaintiff did an adequate search without production of the search terms used.

Formfactor, Inc. v. Micro-Probe, Inc., Case No. C-10-03095 PJH (JCS), 2012 WL 1575093, at *7 n.4 (N.D. Cal. May 3, 2012).

The holding in Formfactor was recently followed in the well-known Apple v. Samsung case involving a third-party subpoena of Google, Apple Inc. v. Samsung Electronics Co. Ltd. The court compelled Google to disclose the keywords it used to respond to the subpoena and also to disclose the names of the custodians whose computer records were searched. Google’s argument that a third party responding under Rule 45 did not have to make such disclosure was rejected. The court instead noted that discovery cooperation, including transparency of search methods, is required of anyone in litigation, both parties and non-parties.

Keyword Search Alone is Good Enough for Most Cases

Keywords are here to stay. Sure, keyword search is old technology, but it is still an effective means of search. It should not be abandoned entirely. Predictive coding, by which I mean search using near-infinite-dimensional vector space probability analysis of all documents searched, is far more advanced than mere keyword search. But this kind of advanced-math search is hard to do correctly, and anyway, is not needed for all search projects.
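
To make the idea of “vector space probability analysis” a little more concrete, here is a minimal, hypothetical sketch in Python using the open-source scikit-learn library (not any vendor’s actual product; the tiny example documents and labels are made up). Each document becomes a point in a very high-dimensional term space, and a probabilistic classifier then ranks every unreviewed document by its estimated probability of relevance, rather than by a yes-or-no keyword hit.

```python
# Minimal sketch of probability-ranked review, assuming scikit-learn.
# The example documents and labels are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_docs = ["meeting about the disputed contract", "lunch plans for friday"]
train_labels = [1, 0]  # 1 = relevant, 0 = irrelevant (attorney coding calls)
unreviewed_docs = ["draft amendment to the contract", "holiday party invite"]

# Each document becomes a vector in a high-dimensional term space.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)
X_unreviewed = vectorizer.transform(unreviewed_docs)

# A probabilistic classifier scores every unreviewed document by its
# estimated probability of relevance, which drives the ranking and culling.
model = LogisticRegression().fit(X_train, train_labels)
probabilities = model.predict_proba(X_unreviewed)[:, 1]
for doc, p in sorted(zip(unreviewed_docs, probabilities), key=lambda t: -t[1]):
    print(f"{p:.2f}  {doc}")
```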

Forgive the primitive image, but you do not need to use an elephant gun to kill a mouse. Since most cases today, even in federal court, involve less than $100,000 at issue, predictive coding is not needed in most suits. Keyword search alone, without advanced analytics, is proportionally sound for most of these small value cases. Indeed, it is proportionally sound today for any case that does not involve high volumes of ESI or otherwise have complex search challenges.

Moreover, even in the big cases involving complex search problems, you would never use predictive coding search alone. That is about as silly as relying on random chance alone to train your predictive coding robots. You would use all kinds of search, what I call the multimodal approach. That includes keyword search using modern-day parametric Boolean features.


Anytime keywords are used to screen out files for review you should be prepared to disclose those keywords. I personally do not like to use keywords as an independent filter in a predictive coding process. But sometimes it happens, such as to limit the initial documents collected and thereafter searched with predictive coding. If that happens, and if the question is asked – What keywords did you use? – you should be prepared to answer. You should not try to hide that under your kimono. More and more courts consider that work product argument to be made of whole cloth.

Disclosure in Predictive Coding Search


Assuming that a predictive coding process is done properly, and keywords are not used to select what documents get searched, then the question of what keywords you used as part of the multimodal search becomes moot. A predictive coding CAR is not driven by keywords. It is driven by infinite dimensional probability math. Keywords are not in the black box anymore, hyper-dimensions are.

The keywords used in a predictive coding project are just one of many types of search used to find the documents that fuel the predictive coding engine. They help an expert searcher find documents to train the machine in an active learning process. The training documents are what drive the culling by probability ranking, not particular keywords. A predictive coding CAR runs on whole documents, usually thousands of documents, not a few keywords. Whether these training documents should be disclosed, or not, is the real issue in predictive coding search projects. Despite what some may say, there is no one set answer to that question that applies to all cases. Da Silva Moore does not purport to provide the only possible answer for all cases. None of the orders in the field do that. The judges involved know better.
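
For readers who want to see the shape of that training loop, here is a self-contained toy sketch in Python (scikit-learn), with a made-up six-document corpus and an attorney_judgment function standing in for the human SME. For simplicity it selects the next training document by uncertainty (the score nearest 0.5); in the multimodal method described above the human searcher would choose training documents using many kinds of search, not just the machine’s suggestion.

```python
# Toy active-learning sketch, assuming scikit-learn. The corpus and the
# attorney_judgment() oracle are hypothetical stand-ins for real review.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

corpus = [
    "contract amendment draft", "fantasy football picks",
    "termination clause dispute", "office birthday cake",
    "indemnification negotiation email", "weekend hiking trip",
]

def attorney_judgment(doc):  # stand-in for the SME's relevance call
    return 1 if any(w in doc for w in ("contract", "clause", "indemnification")) else 0

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

# Seed the training set with a couple of coded examples.
labeled = {0: attorney_judgment(corpus[0]), 1: attorney_judgment(corpus[1])}

for _ in range(2):  # a couple of training rounds
    rows = list(labeled)
    model = LogisticRegression().fit(X[rows], [labeled[i] for i in rows])
    probs = model.predict_proba(X)[:, 1]
    unlabeled = [i for i in range(len(corpus)) if i not in labeled]
    # Ask the human to code the document the machine is least sure about.
    pick = min(unlabeled, key=lambda i: abs(probs[i] - 0.5))
    labeled[pick] = attorney_judgment(corpus[pick])

# The final probability ranking, not any keyword, drives the culling.
for i in np.argsort(-probs):
    print(f"{probs[i]:.2f}  {corpus[i]}")
```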

For now it is still an open question as to how far the work product doctrine applies to predictive coding search processes. It may not apply at all. You may have to share your mental impressions, your basic search plan. You may have to disclose what you did and why. You may have to explain what predictive coding search methods you used, but not disclose your entire training set, not disclose your irrelevant documents. Even if courts hold to the contrary that your search methods are protected by work product, you may want to share the methods anyway to save time and client money. It may be in your client’s best interests to explain what you did. You will not lose any competitive advantage by doing so.

Sell Proportionality by Explaining How Good Your Search Is

I for one enjoy sharing how I did a complex, advanced analytics, iterative, multimodal search project. I enjoy this kind of show and tell not because I like the sound of my own voice (well, ok, that may be part of it), but because it helps persuade the requesting party that they are getting the most bang possible for the buck. It supports my proportionality argument. The abilities of predictive coding are truly mind-blowing. When done right, with good software (and, the truth is, most of the software out there is not good, is not bona fide active machine learning, and is not used properly), it can accomplish miracles. At least from our three-dimensional, keyword-conditioned perspective, the search results seem miraculous.

If the requesting party, or the judge, is still not convinced, and insists on an explanation of how the software black box really works, then you can bring in an information scientist familiar with the software you used to explain it all. They like to talk almost as much as lawyers. They will go on and on. The other side may be sorry they asked, and most judges will be sorry they allowed a Daubert hearing for discovery. To me it is fascinating to hear how near-infinite-dimensional vector space probability analysis really works. So fascinating, in fact, that I have lined up a guest blog by Jason R. Baron and Jesse B. Freeman, a whiz-kid math genius he found, that introduces multidimensional support vector machines to lawyers. It is coming soon.

Predictive coding, when done right, is the best thing that ever happened to a requesting party in a complex ESI case. So if you are proud of what you have got, open up your predictive coding kimono for the other side to see. They should be impressed by the advanced methods, by the not just reasonable, but stellar, multidimensional efforts. After all, did they use hyper-dimensional probability algorithms in their search? Did they use an iterative multimodal approach with both statistical quality control and quality assurance methods? Your best practices in search justify a low-cost proportional approach.

You Should Not Have to Share Irrelevant Documents Unless and Until a Showing is Made that Your Predictive Coding Search Efforts Were Unreasonable

Does this mean that you must share all of the actual documents used in the machine training? Absolutely not. I have been talking about sharing process, not documents. The attorney work product doctrine has nothing to do with sharing, or not sharing, the client’s documents. We are not talking about documents prepared in connection with litigation. We are talking about documents prepared in the ordinary course of business, of life. These documents have their own protections from disclosure, for instance the attorney-client privilege, which is held by the client, not the lawyer. A lawyer cannot waive that. Only a client can. There are many other protections that apply to ESI, such as trade-secret and personal privacy laws. But the most central protection is the one built into the rules, where only relevant information must be disclosed. Irrelevant ESI is not discoverable.

That means that unless there is some dispute as to the adequacy of the search efforts, only the relevant documents in a training set used in predictive coding need be produced. There is no legal basis in the initial stage to require production of all documents, including irrelevant documents.

Still, the client may choose to do so in certain cases, with certain sets of documents, and with certain protections in place. But that has nothing to do with work product analysis. It typically has to do with building confidence and trust between litigation parties and avoiding expensive disputes. It has to do with concerns that a reasonable effort to find relevant ESI is being made. It has to do with mitigating risk by cooperation and participation. It has to do with avoiding motions for sanctions, motions predicated upon allegations of an unreasonable search effort.

If a requesting party is kept in the dark, and the producing party does not reveal their search kung fu, and if the requesting party later makes a good cause showing that the producing party’s search effort was unreasonable, then the producing party is facing possible sanctions and the dreaded redo. All a requesting party will have to do to show good cause is provide proof that the producing party missed certain key hot documents. Then in a sanctions motion the reasonability of the producing party’s search efforts becomes relevant. This would usually happen in the context of a post-production motion.

Then, and only then, would there be legal authority to require production of the irrelevant documents used in the training sets. That is because these formerly irrelevant documents would then become relevant. They would then be relevant to the issue of reasonable efforts. They would then be discoverable, assuming the producing party claims its efforts were reasonable.

I contend that this magical transformation from irrelevant to relevant requires a good cause showing. There must at least be a justiciable issue of fact that the search efforts were unreasonable before the irrelevant documents in a training set become discoverable. Based on my CLE efforts, and listening to judges from around the country, I am confident that the judges, when they hear these arguments, will agree with this logic. They will, in most cases, only require disclosure of process, and not also require disclosure of irrelevant documents.

A recent case out of New York confirms this belief. Hinterberger v. Catholic Health Systems, Inc., No. 1:08-cv-00380-WMS-LGF (W.D.N.Y. May 21, 2013). In this complicated case with tons of ESI the defendant finally gave up on using keyword search alone to try to find which of its millions of emails were likely relevant and needed to be reviewed for possible production. After Magistrate Judge Leslie Foschio pointed out the Da Silva Moore case to the parties, the defendant decided to drive a more advanced CAR, one that included a predictive coding search engine. They hoped that would allow them to accomplish their search task within budget.

Before the predictive coding work began, however, plaintiff demanded that the exact same Da Silva Moore search protocol be followed, and that they be allowed to participate in the seed set generation, including a quick peek of all unprivileged training documents. Defendant objected, as well it should. No good cause had been shown to force such disclosure of irrelevant documents. Defendant argued that plaintiff’s motion was premature, and that plaintiff misread the intent of Judge Peck’s order in Da Silva Moore. Defendant asserted that plaintiffs had no right to access Defendants’ seed-set documents at this time. Judge Foschio essentially agreed with defendant, and denied plaintiff’s motion to compel, but did so largely upon defense counsel’s representation that they would cooperate with plaintiff. Judge Foschio got it right, and so did the defendant: cooperation is the key, not disclosure of irrelevant documents or particular protocols agreed to in other cases.

Conclusion

As the predictive coding landscape matures, and as counsel learn to cooperate and make disclosure of methods used, there will be no need to build trust by disclosing irrelevant documents in training sets. Judges will not have to go there. Counsel will only need to disclose the multimodal search processes used, including details of the predictive coding methods. I do this all the time, although in much greater detail than required. See, e.g.: summary of CAR and list of over thirty articles on predictive coding theory and methods; Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron (detailed description of a multimodal search of 699,082 ENRON documents); Borg Challenge (description of same search using a semi-automated monomodal method). Worst case scenario, counsel may have to explain the black boxes of the predictive coding software they used. All they need do for that is pull a science rabbit out of their hat who will explain hyper-dimensional probability vectors, regression analysis and the like. There are experts for that too. Every good software company has at least one. I know several.

In simple keyword search cases a similar logic will prevail. Counsel will have to disclose the keywords used in final culling, but not the documents deemed irrelevant by keywords or second-pass relevance attorney review teams. The instruction books prepared for these human review teams, if any, will also be kept secret, but not the general methods used, such as the quality controls. Case specific reviewer instruction manuals are documents prepared for litigation. That is classic work product. Moreover, they typically include far more information than keyword disclosure, or search method disclosure. They often explain an attorney’s strategies and theories of a case. Here a clear line still exists to protect a lawyer’s work product. Yes, the kimono still lives, so too does the concept of relevance.


TAR Course Expands Again: Standardized Best Practice for Technology Assisted Review

February 11, 2018

The TAR Course has a new class, the Seventeenth Class: Another “Player’s View” of the Workflow. Several other parts of the Course have been updated and edited. It now has Eighteen Classes (listed at end). The TAR Course is free and follows the Open Source tradition. We freely disclose the method for electronic document review that uses the latest technology tools for search and quality controls. These technologies and methods empower attorneys to find the evidence needed for all text-based investigations. The TAR Course shares the state of the art for using AI to enhance electronic document review.

The key is to know how to use the document review search tools that are now available to find the targeted information. We have been working on various methods of use since our case before Judge Andrew Peck in Da Silva Moore in 2012. After we helped get the first judicial approval of predictive coding in Da Silva, we began a series of several hundred document reviews, both in legal practice and scientific experiments. We have now refined our method many times to attain optimal efficiency and effectiveness. We call our latest method Hybrid Multimodal IST Predictive Coding 4.0.

The Hybrid Multimodal method taught by TARcourse.com combines law and technology. Successful completion of the TAR Course requires knowledge of both fields. In the technology field active machine learning is the most important technology to understand, especially the intricacies of training selection, such as Intelligently Spaced Training (“IST”). In the legal field the proportionality doctrine is key to the pragmatic application of the method taught at TAR Course. We give away the information on the methods; we open-source it through this publication.

All we can transmit by online teaching is information, and a small bit of knowledge. Knowing the Information in the TAR Course is a necessary prerequisite for real knowledge of Hybrid Multimodal IST Predictive Coding 4.0. Knowledge, as opposed to Information, is taught the same way as advanced trial practice, by second chairing a number of trials. This kind of instruction is the one with real value, the one that completes a doc review project at the same time it completes training. We charge for document review and throw in the training. Information on the latest methods of document review is inherently free, but Knowledge of how to use these methods is a pay to learn process.

The Open Sourced Predictive Coding 4.0 method is applied for particular applications and search projects. There are always some customization and modifications to the default standards to meet the project requirements. All variations are documented and can be fully explained and justified. This is a process where the clients learn by doing and following along with Losey’s work.

What he has learned through a lifetime of teaching and studying Law and Technology is that real Knowledge can never be gained by reading or listening to presentations. Knowledge can only be gained by working with other people in real-time (or near-time), in this case, to carry out multiple electronic document reviews. The transmission of knowledge comes from the Q&A ESI Communications process. It comes from doing. When we lead a project, we help students to go from mere Information about the methods to real Knowledge of how it works. For instance, we do not just make the Stop decision, we also explain the decision. We share our work-product.

Knowledge comes from observing the application of the legal search methods in a variety of different review projects. Eventually some Wisdom may arise, especially as you recover from errors. For background on this triad, see Examining the 12 Predictions Made in 2015 in “Information → Knowledge → Wisdom” (2017). Once Wisdom arises some of the sayings in the TAR Course may start to make sense, such as our favorite “Relevant Is Irrelevant.” Until this koan is understood, the legal doctrine of Proportionality can be an overly complex weave.

The TAR Course is now composed of eighteen classes:

  1. First Class: Background and History of Predictive Coding
  2. Second Class: Introduction to the Course
  3. Third Class:  TREC Total Recall Track, 2015 and 2016
  4. Fourth Class: Introduction to the Nine Insights from TREC Research Concerning the Use of Predictive Coding in Legal Document Review
  5. Fifth Class: 1st of the Nine Insights – Active Machine Learning
  6. Sixth Class: 2nd Insight – Balanced Hybrid and Intelligently Spaced Training (IST)
  7. Seventh Class: 3rd and 4th Insights – Concept and Similarity Searches
  8. Eighth Class: 5th and 6th Insights – Keyword and Linear Review
  9. Ninth Class: 7th, 8th and 9th Insights – SME, Method, Software; the Three Pillars of Quality Control
  10. Tenth Class: Introduction to the Eight-Step Work Flow
  11. Eleventh Class: Step One – ESI Communications
  12. Twelfth Class: Step Two – Multimodal ECA
  13. Thirteenth Class: Step Three – Random Prevalence
  14. Fourteenth Class: Steps Four, Five and Six – Iterative Machine Training
  15. Fifteenth Class: Step Seven – ZEN Quality Assurance Tests (Zero Error Numerics)
  16. Sixteenth Class: Step Eight – Phased Production
  17. Seventeenth Class: Another “Player’s View” of the Workflow (class added 2018)
  18. Eighteenth Class: Conclusion

With a lot of hard work you can complete this online training program in a long weekend, but most people take a few weeks. After that, this course can serve as a solid reference to consult during complex document review projects. It can also serve as a launchpad for real Knowledge and eventually some Wisdom into electronic document review. TARcourse.com is designed to provide you with the Information needed to start this path to AI enhanced evidence detection and production.

 


Concept Drift and Consistency: Two Keys To Document Review Quality – Part Three

January 29, 2016

This is Part Three of this blog. Please read Part One and Part Two first.

Mitigating Factors to Human Inconsistency

When you consider all of the classifications of documents, both relevant and irrelevant, my consistency rate in the two ENRON reviews jumps to about 99% (1% inconsistent). Compare this with the Grossman Cormack study of the 2009 TREC experiments, where agreement on all non-relevant adjudications, assuming all non-appealed decisions were correct, was 97.4 percent (2.6% inconsistent). My guess is that most well run CAR review projects today are in fact attaining overall high consistency rates. The existing technologies for duplication, similarity, concept and predictive ranking are very good, especially when all used together. When you consider both relevant and irrelevant coding, it should be in the 90s for sure, probably the high nineties. Hopefully, by using today’s improved software and the latest, fairly simple 8-step methods, we can reduce the relevance inconsistency problem even further. Further scientific research is, however, needed to test these hopes and suppositions. My results in the Enron studies could be a black swan, but I doubt it. I think my inconsistency is consistent.
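
A toy calculation, with made-up coding lists, shows why overall agreement looks so much better than agreement on relevance calls alone: most documents are irrelevant, so the easy irrelevant calls dominate the overall percentage.

```python
# Hypothetical two-pass coding of ten documents: R = relevant, I = irrelevant.
first_pass  = ["R", "I", "I", "I", "I", "I", "I", "R", "I", "I"]
second_pass = ["R", "I", "I", "I", "I", "I", "I", "I", "I", "I"]

# Overall agreement counts every document, relevant or not.
matches = sum(a == b for a, b in zip(first_pass, second_pass))
overall_agreement = matches / len(first_pass)                     # 9/10 = 90%

# Jaccard-style overlap on relevant calls only: intersection over union.
rel_1 = {i for i, c in enumerate(first_pass) if c == "R"}
rel_2 = {i for i, c in enumerate(second_pass) if c == "R"}
relevant_only_overlap = len(rel_1 & rel_2) / len(rel_1 | rel_2)   # 1/2 = 50%

print(overall_agreement, relevant_only_overlap)
```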

Even though overall inconsistencies may be small, the much higher inconsistency in relevance calls alone remains a continuing problem. It is a fact of life of all human document review, as Voorhees showed years ago. The inconsistency problem must continue to be addressed by a variety of ongoing quality controls, including the use of predictive ranking, and including post hoc quality assurance tests such as ei-Recall. The research to date shows that duplicate, similarity and predictive coding ranking searches can help mitigate the inconsistency problem (the overlap has increased from the 30% range to the 70% range), but not eliminate it entirely. By 2012 I was able to use these features to get the relevant-only disagreement rates down to 23%, and even then, the 63 inconsistently coded relevant documents were all unimportant. I suspect, but do not know, that my rates are now lower with improved quality controls. Again, further research is required before any blanket statements like that can be made authoritatively.

Our quest for quality legal search requires that we keep the natural human weakness of inconsistency front and center. Only computers are perfectly consistent. To help keep the human reviewers as consistent as possible, and so mitigate any damages that inconsistent coding may cause, a whole panoply of quality control and quality assurance methods should be used, not just improved search methods. See eg: ZeroErrorNumerics.com.


The Zero Error Numerics (ZEN) quality methods include:

  • predictive coding analytics, a type of artificial intelligence, actively managed by skilled human analysts in a hybrid approach;
  • data visualizations with metrics to monitor progress;
  • flow-state of human reviewer concentration and interaction with AI processes;
  • quiet, uninterrupted, single-minded focus (dual tasking during review is prohibited);
  • disciplined adherence to a scientifically proven set of search and review methods including linear, keyword, similarity, concept, and predictive coding;
  • repeated tests for errors, especially retrieval omissions;
  • objective measurements of recall, precision and accuracy ranges;
  • judgmental and random sampling and analysis such as ei-Recall;
  • active project management and review-lawyer supervision;
  • small team approach with AI leverage, instead of large numbers of reviewers;
  • recognition that mere relevant is irrelevant;
  • recognition of the importance of simplicity under the 7±2 rule;
  • multiple fail-safe systems for error detection of all kinds, including reviewer inconsistencies;
  • use of only the highest quality, tested e-discovery software and vendor teams under close supervision and teamwork;
  • use of only experienced, knowledgeable Subject Matter Experts for relevancy guidance, either directly or by close consultation;
  • extreme care taken to protect client confidentiality; and,
  • high ethics – our goal is to find and disclose the truth in compliance with local laws, not win a particular case.

That is my quality play book. No doubt others have come up with their own methods.

Conclusion

High quality effective legal search depends in part on recognition of the common document review phenomena of concept shift and inconsistent classifications. Although you want to avoid inconsistencies, concept drift is a good thing. It should appear in all complex review projects. Think Bob Dylan – He not busy being born is busy dying. Moreover, you should have a standard protocol in place to both encourage and efficiently deal with such changes in relevance conception. If coding does not evolve, if relevance conceptions do not shift by conversations and analysis, there could be a quality issue. It is a warning flag and you should at least investigate.

Very few projects go in a straight line known from the beginning. Most reviews are not like a simple drag race. There are many curves. If you do not see a curve in the road, and you keep going straight, a spectacular wreck can result. You could fly off the track. This can happen all too easily if the SME in charge of defining relevance has lost track of what the reviewers are doing. You have to keep your eyes on the road and your hands on the wheel.


Good drivers of CARs – Computer Assisted Reviews – can see the curves. They expect them, even when driving a new course. When they come to a curve, they are not surprised; they know how to speed through the curves. They can do a power drift through any corner. Change in relevance should not be a speed-bump. It should be an opportunity to do a controlled skid, an exciting drift with tires burning. Speed drifts help keep a document review interesting, even fun, much like a race track. If you are not having a good time with large scale document review, then you are obviously doing something wrong. You may be driving an old car using the wrong methods. See: Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding 3.0.

Concept shift makes it harder than ever to maintain consistency. When the contours of relevance are changing, at least somewhat, as they should, then you have to be careful and be sure all of your prior codings are redone and made consistent with the latest understanding. Your third step of a baseline random sample should, for instance, be constantly revisited. All of the prior codings should be corrected to be consistent with the latest thinking. Otherwise your prevalence estimate could be way off, and with it all of your rough estimates of recall. The concern with consistency may slow you down a bit, and make the project cost a little more, but the benefits in quality are well worth it.
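
To see why stale codings in the baseline sample matter, here is a back-of-the-envelope sketch with purely hypothetical numbers. The prevalence point estimate from the random sample drives the projected number of relevant documents in the whole collection, and the rough recall estimate is built on that projection (real practice would also use confidence intervals, not just point estimates).

```python
# Hypothetical numbers only: how re-coding the baseline random sample
# changes the prevalence estimate and, with it, the rough recall estimate.
corpus_size = 1_000_000
sample_size = 1_500
relevant_found_in_review = 9_000   # documents coded relevant in the review so far

def rough_recall(relevant_in_sample):
    prevalence = relevant_in_sample / sample_size          # point estimate only
    projected_relevant = prevalence * corpus_size
    return relevant_found_in_review / projected_relevant

print(rough_recall(relevant_in_sample=30))   # original sample coding -> ~0.45
print(rough_recall(relevant_in_sample=18))   # after consistent re-coding -> ~0.75
```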

If you are foolish enough to still use secret control sets, you will not be able to make these changes at all. When the drift hits, as it almost always does, your recall and precision reports based on this control set will be completely worthless. Worse, if the driver does not know this, they will be misled by the software reports of precision and recall based on the secret control set. That is one reason I am so adamantly opposed to the use of secret control sets and have called for all software manufacturers to remove them. See Predictive Coding 3.0 article, part one.

If you do not go back and correct for changes in conception, then you risk withholding a relevant document that you initially coded irrelevant. It could be an important document. There is also the chance that the inconsistent classifications can impact the active machine learning by confusing the algorithmic classifier. Good predictive coding software can handle some errors, but you may slow things down, or if it is extreme, mess them up entirely. Quality controls of all kinds are needed to prevent that.

All types of quality controls are needed to address the inevitability of errors in reviewer classifications. Humans, even lawyers, will make some mistakes from time to time. We should expect that and allow for it in the process. Use of duplicate and near-duplicate guides, email strings, and other similarity searches, concept searches and probability rankings can mitigate the fact that no human will ever attain perfect, machine-like consistency. So too can a variety of additional quality control measures, primary among them being the use of as few human reviewers as possible. This is in accord with the general review principle that I call less is more. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part One and Part Two. That is not a problem if you are driving a good CAR, one with the latest predictive coding search engines. More than a couple of reviewers in a CAR like that will just slow you down. But it’s alright, Ma, it’s life, and life only.

________________

Since I invoked the great Bob Dylan and It’s Alright, Ma earlier in this blog, I thought I owed it to you to share the full lyrics, plus a video of young Bob’s performance. It could be his all time best song-poem. What do you think? Feeling very creative? Leave a poem below that paraphrases Dylan to make one of the points in this blog.

______________________

 “It’s Alright, Ma (I’m Only Bleeding)”

Bob Dylan as a young man

Bob Dylan

Darkness at the break of noon
Shadows even the silver spoon
The handmade blade, the child’s balloon
Eclipses both the sun and moon
To understand you know too soon
There is no sense in trying.
Pointed threats, they bluff with scorn
Suicide remarks are torn
From the fools gold mouthpiece
The hollow horn plays wasted words
Proved to warn
That he not busy being born
Is busy dying.
Temptation’s page flies out the door
You follow, find yourself at war
Watch waterfalls of pity roar
You feel to moan but unlike before
You discover
That you’d just be
One more person crying.
So don’t fear if you hear
A foreign sound to you ear
It’s alright, Ma, I’m only sighing.
As some warn victory, some downfall
Private reasons great or small
Can be seen in the eyes of those that call
To make all that should be killed to crawl
While others say don’t hate nothing at all
Except hatred.
Disillusioned words like bullets bark
As human gods aim for their marks
Made everything from toy guns that sparks
To flesh-colored Christs that glow in the dark
It’s easy to see without looking too far
That not much
Is really sacred.
While preachers preach of evil fates
Teachers teach that knowledge waits
Can lead to hundred-dollar plates
Goodness hides behind its gates
But even the President of the United States
Sometimes must have
To stand naked.
An’ though the rules of the road have been lodged
It’s only people’s games that you got to dodge
And it’s alright, Ma, I can make it.
Advertising signs that con you
Into thinking you’re the one
That can do what’s never been done
That can win what’s never been won
Meantime life outside goes on
All around you.
You loose yourself, you reappear
You suddenly find you got nothing to fear
Alone you stand without nobody near
When a trembling distant voice, unclear
Startles your sleeping ears to hear
That somebody thinks
They really found you.
A question in your nerves is lit
Yet you know there is no answer fit to satisfy
Insure you not to quit
To keep it in your mind and not forget
That it is not he or she or them or it
That you belong to.
Although the masters make the rules
For the wise men and the fools
I got nothing, Ma, to live up to.
For them that must obey authority
That they do not respect in any degree
Who despite their jobs, their destinies
Speak jealously of them that are free
Cultivate their flowers to be
Nothing more than something
They invest in.
While some on principles baptized
To strict party platforms ties
Social clubs in drag disguise
Outsiders they can freely criticize
Tell nothing except who to idolize
And then say God Bless him.
While one who sings with his tongue on fire
Gargles in the rat race choir
Bent out of shape from society’s pliers
Cares not to come up any higher
But rather get you down in the hole
That he’s in.
But I mean no harm nor put fault
On anyone that lives in a vault
But it’s alright, Ma, if I can’t please him.
Old lady judges, watch people in pairs
Limited in sex, they dare
To push fake morals, insult and stare
While money doesn’t talk, it swears
Obscenity, who really cares
Propaganda, all is phony.
While them that defend what they cannot see
With a killer’s pride, security
It blows the minds most bitterly
For them that think death’s honesty
Won’t fall upon them naturally
Life sometimes
Must get lonely.
My eyes collide head-on with stuffed graveyards
False gods, I scuff
At pettiness which plays so rough
Walk upside-down inside handcuffs
Kick my legs to crash it off
Say okay, I have had enough
What else can you show me?
And if my thought-dreams could been seen
They’d probably put my head in a guillotine
But it’s alright, Ma, it’s life, and life only.

 


Beware of the TAR Pits! – Part Two

February 23, 2014

This is the conclusion of a two part blog. For this to make sense please read Part One first.

Quality of Subject Matter Experts

The quality of Subject Matter Experts in a TAR project is another key factor in predictive coding. It is one that many would prefer to sweep under the rug. Vendors especially do not like to talk about this (and they sponsor most panel discussions) because it is beyond their control. SMEs come from law firms. Law firms hire vendors. What dog will bite the hand that feeds him? Yet, we all know full well that not all subject matter experts are alike. Some are better than others. Some are far more experienced and knowledgeable than others. Some know exactly what documents they need at trial to win a case. They know what they are looking for. Some do not. Some have done trials, lots of them. Some do not know where the courthouse is. Some have done many large search projects, first paper, now digital. Some are great lawyers; and some, well, you’d be better off with my dog.

The SMEs are the navigators. They tell the drivers where to go. They make the final decisions on what is relevant and what is not. They determine what is hot, and what is not. They determine what is marginally relevant, what is grey area, what is not. They determine what is just unimportant more of the same. They know full well that some relevant is irrelevant. They have heard and understand the frequent mantra at trials: Objection, Cumulative. Rule 403 of the Federal Rules of Evidence. Also see The Fourth Secret of Search: Relevant Is Irrelevant found in Secrets of Search – Part III.

Quality of SMEs is important because the quality of input in active machine learning is important. A fundamental law of predictive coding as we now know it is GIGO, garbage in, garbage out. Your active machine learning depends on correct instruction. Although good software can mitigate this somewhat, it can never be eliminated. See: Webber & Pickens, Assessor Disagreement and Text Classifier Accuracy, SIGIR 2013 (24% more ranking depth needed to reach equivalent recall when not using SMEs, even in a small data search of news articles with rather simple issues).

Information scientists like Jeremy Pickens are, however, working hard on ways to minimize the errors of SME document classifications on overall corpus rankings. Good thing too, because even one good SME will not be consistent in ranking the same documents. That is the Jaccard Index scientists like to measure. See Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part Two, and search for Jaccard on my blog.

In my Enron experiments I was inconsistent in determining the relevance of the same document 23% of the time. That’s right, I contradicted myself on relevancy 23% of the time. (If you include irrelevancy coding the inconsistencies were only 2%.) Lest you think I’m a complete idiot (which, by the way, I sometimes am), the 23% rate is actually the best on record for an experiment. It is the best ever measured, by far. Other experimentally measured rates have inconsistencies of from 50% to 90% (with multiple reviewers). Pathetic, huh? Now you know why AI is so promising and why it is so important to enhance our human intelligence with artificial intelligence. When it comes to consistency of document identifications in large scale data reviews, we are all idiots!

With these human frailty facts in mind, not only variable quality in expertise of subject matter, but also human inconsistencies, it is obvious why scientists like Pickens and Webber are looking for techniques to minimize the impact of errors and, get this, even use these inevitable errors to improve search. Jeremy Pickens and I have been corresponding about this issue at length lately. Here is Jeremy’s later response to this blog: In TAR, Wrong Decisions Can Lead to the Right Documents (A Response to Ralph Losey). Jeremy does at least concede that coding quality is indeed important. He goes on to argue that his study shows that wrong decisions, typically on grey area documents, can indeed be useful.

I do not doubt Dr. Pickens’ findings, but am skeptical of the search methods and conclusions derived therefrom. In other words, I am skeptical of how the training was accomplished, the supervision of the learning. This is what I call here the driver’s role, shown on the triangle as the Power User and Experienced Searcher. In my experience as a driver/SME, much depends on where you are in the training cycle. As the training continues the algorithms eventually do become able to detect and respond to subtle document distinctions. Yes, it takes a while, and you have to know what and when to train on, which is the driver’s skill (for instance you never train with giant documents), but it does eventually happen. Thus, while it may not matter if you code grey area documents wrong at first, it eventually will, that is unless you do not really care about the distinctions. (The TREC overturn documents Jeremy tested on, the ones he called wrong documents, were in fact grey area documents, that is, close questions. Attorneys disagreed on whether they were relevant, which is why they were overturned on appeal.) The lack of precision in training, which is inevitable anyway even when one SME is used, may not matter much in early stages of training, and may not matter at all when testing simplistic issues using easy databases, such as news articles. In fact, I have used semi-supervised training myself, as Jeremy describes from old experiments in Pseudo Relevance Feedback. I have seen it work myself, especially in early training.
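
For readers unfamiliar with the term, here is a small, self-contained sketch of the general pseudo-relevance-feedback idea in Python (scikit-learn). The documents and labels are invented, and this is a simplification of the semi-supervised approach discussed here, not a reproduction of anyone’s experiments. The top-ranked unlabeled documents are provisionally treated as relevant and fed back into training; in real legal review those provisional calls would still be verified by human coding.

```python
# Toy pseudo-relevance-feedback sketch, assuming scikit-learn.
# Documents and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_docs   = ["breach of the supply contract", "cafeteria menu update"]
labeled_codes  = [1, 0]  # 1 = relevant, 0 = irrelevant
unlabeled_docs = ["supply contract renewal terms", "parking garage notice",
                  "contract penalty schedule", "gym membership discount"]

vectorizer = TfidfVectorizer()
X_labeled   = vectorizer.fit_transform(labeled_docs)
X_unlabeled = vectorizer.transform(unlabeled_docs)

model  = LogisticRegression().fit(X_labeled, labeled_codes)
scores = model.predict_proba(X_unlabeled)[:, 1]

# Treat the top-2 ranked unlabeled documents as provisional positives ...
top = sorted(range(len(unlabeled_docs)), key=lambda i: -scores[i])[:2]
expanded_docs  = labeled_docs + [unlabeled_docs[i] for i in top]
expanded_codes = labeled_codes + [1] * len(top)

# ... and retrain on the expanded set. Early in training a few wrong
# provisional calls tend to wash out; as training matures they matter more.
model = LogisticRegression().fit(vectorizer.transform(expanded_docs), expanded_codes)
```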

Still, the fact that some errors do not matter in early training does not mean you should not care about consistency and accuracy of training during the whole ride. In my experience, as training progresses and the machine gets smarter, it does matter. But let’s test that, shall we? All I can do is report on what I see, i.e., anecdotal evidence.

Outside of TREC and science experiments, in the messy real world of legal search, the issues are typically maddeningly difficult. Moreover, the difference in cost of review of hundreds of thousands of irrelevant documents can mean millions of dollars. The fine points of differentiation in matured training are needed for precision in results to reduce costs of final review. In other words, both precision and recall matter in legal search, and all are governed by the overarching legal principle of proportionality. That is not part of information science of course, but we lawyers must govern our search efforts by proportionality.

Also See William Webber’s response: Can you train a useful model with incorrect labels? I believe that William’s closing statement may be correct, either that or software differences:

It may also be, though this is speculation on my part, that a trainer who is not only a subject-matter expert, but an expert in training itself (an expert CAR driver, to adopt Ralph Losey’s terminology) may be better at selecting training examples; for instance, in recognizing when a document, though responsive (or non-responsive), is not a good training example.

I hope Pickens and Webber get there some day. In truth, I am a big supporter of their efforts and experiments. We need more scientific research. But for now, I still do not believe we can turn lead into gold. It is even worse if you have a bunch of SMEs arguing with each other about where they should be going, about what is relevant and what is not. That is a separate issue they do not address, which points to the downside of all trainers, both amateurs and SMEs alike. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

For additional support on the importance of SMEs, see again Monica’s article, EDI-Oracle Study, where she summarizes the conclusion of Patrick Oot from the study that:

Technology providers using similar underlying technology, but different human resources, performed in both the top-tier and bottom-tier of all categories. Conclusion: Software is only as good as its operators. Human contribution is the most significant element. (emphasis in original)

Also see the recent Xerox blog, Who Prevails in the E-Discovery War of Man vs. Machine? by Gabriela Baron.

Teams that participated in Oracle without a bona fide SME, much less a good driver, well, they were doomed. The software was secondary. How could you possibly replicate the work of the original SME trial lawyers that did the first search without having an SME yourself, one with at least a similar experience and knowledge level?

This means that even with a good driver, and good software, if you do not also have a good SME, you can still end up driving in circles. It is even worse when you try to do a project with no SME at all. Remember, the SME in the automobile analogy is the navigation system, or to use the pre-digital reality, the passenger with the map. We have all seen what happens when the navigation system screws up, or the map is wrong, or more typically, out of date (like many old SMEs). You do not get to the right place. You can have a great driver, and go quite fast, but if you have a poor navigator, you will not like the results.

The Oracle study showed this, but it is hardly new or surprising. In fact, it would be shocking if the contrary were true. How can incorrect information ever create correct information? The best you can hope for is to have enough correct information to smooth out the errors. Put another way, without signal, noise is just noise. Still, Jeremy Pickens claims there is a way. I will be watching and hope he succeeds where the alchemists of old always failed.

Tabula Rasa

There is one way out of the SME frailty conundrum that I have high hopes for and can already understand. It has to do with teaching the machine about relevance for all projects, not just one. The way predictive coding works now the machine is a tabula rasa, a blank slate. The machine knows nothing to begin with. It only knows what you teach it as the search begins. No matter how good the AI software is at learning, it still does not know anything on its own. It is just good at learning.

That approach is obviously not too bright. Yet, it is all we can manage now in legal search at the beginning of the Second Machine Age. Someday soon it will change. The machine will not have its memory wiped after every project. It will remember. The training from one search project will carry over to the next one like it. The machine will remember the training of past SMEs.

That is the essential core of my PreSuit proposal: to retain the key components of the past SME training so that you do not have to start afresh on each search project. PreSuit: How Corporate Counsel Could Use “Smart Data” to Predict and Prevent Litigation. When that happens (I don’t say if, because this will start happening soon, some say it already has) the machine could start smart.

That is what we all want. That is the holy grail of AI-enhanced search — a smart machine. (For the ultimate implications of this, see the movie Her, which is about an AI enhanced future that is still quite a few years down the road.) But do not kid yourself, that is not what we have now. Now we only have baby robots, ones that are eager and ready to learn, but do not know anything. It is kind of like 1-Ls in law school, except that when they finish a class they do not retain a thing!

When my PreSuit idea is implemented, the next SME will not have to start afresh. The machine will not be a tabula rasa. It will be able to see litigation brewing. It will help general counsel to stop law suits before they are filed. The SMEs will then build on the work of prior SMEs, or maybe build on their own previous work in another similar project. Then the GIGO principle will be much easier to mitigate. Then the computer will not be completely dumb, it will have some intelligence from the last guy. There will be some smart data, not just big dumb data. The software will know stuff, know the law and relevance, not just know how to learn stuff.

When that happens, the SME in a particular project will not be as important. But for now, when working from scratch with dumb data, the SME is still critical. The smarter and more consistent, the better. Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

Professor Marchionini, like all other search experts, recognizes the importance of SMEs to successful search. As he puts it:

Thus, experts in a domain have greater facility and experience related to information-seeking factors specific to the domain and are able to execute the subprocesses of information seeking with speed, confidence, and accuracy.

That is one reason the Grossman-Cormack glossary builds the role of SMEs into its base definition of computer-assisted review:

A process for Prioritizing or Coding a Collection of electronic Documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of Documents and then extrapolates those judgments to the remaining Document Collection.

Glossary at pg. 21 defining TAR.
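
Read as code, that definition boils down to two steps: fit a model on the SME's judgments of the smaller reviewed set, then extrapolate those judgments by scoring and ranking every remaining document. Here is a minimal sketch of that extrapolation step, again with scikit-learn and hypothetical names standing in for any real platform's internals.

```python
# Hypothetical sketch of the glossary definition: human judgments on a small
# sample are extrapolated to prioritize the remaining, unreviewed collection.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def prioritize(all_docs, judged_idx, judged_labels):
    """judged_idx: positions of the SME-reviewed sample within all_docs.
    judged_labels: the SME's calls on that sample (1 = relevant, 0 = not)."""
    X = TfidfVectorizer().fit_transform(all_docs)
    model = LogisticRegression(max_iter=1000).fit(X[judged_idx], judged_labels)

    judged = set(judged_idx)
    unjudged = [i for i in range(len(all_docs)) if i not in judged]
    scores = model.predict_proba(X[unjudged])[:, 1]    # extrapolated judgments
    return [unjudged[i] for i in np.argsort(-scores)]  # most-likely-relevant first
```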

Most SMEs Today Hate CARs
(And They Don’t Much Like High-Tech Drivers Either)

This is an inconvenient truth for vendors. Predictive coding is defined by SMEs. Yet vendors cannot make good SMEs step up to the plate and work with the trainers, the drivers, to teach the machine. All the vendors can do is supply the car and maybe help with the driver. The driver and navigator have to be supplied by the law firm or the corporate client. There is no shortage of good SMEs, but almost all of them have never even seen a CAR. They do not like them. They can barely even speak the language of the driver. They don't much like most of the drivers either. They are damn straight not going to spend two weeks of their lives riding around in one of those newfangled horseless carriages.

That is the reality of where we are now. Also see: Does Technology Leap While Law Creeps? by Brian Dalton, Above the Law. Of course this will change with the generations, but for now, that is the way it is. So vendors work on error minimization. They try to minimize the role of SMEs. That is a good idea anyway, because, as mentioned, all human SMEs are inconsistent. I was lucky to be inconsistent only 23% of the time on relevance calls. But still, there is another obvious solution.

There is another way to deal today with the reluctant-SME problem, a way that works right now with today's predictive coding software. It is a kind of non-robotic surrogate system that I have developed, and I'm sure several other professional drivers have as well. See my CAR page for more information on this. But in reality it is one of those things I would just have to show you, in a driver-education-school type of setting. I do it frequently. It involves acting on behalf of an SME and dealing with the driver for them. It places them in their comfort zone, where they just make yes-or-no decisions on the close-question documents, although there is obviously more to it than that. It is not nearly as good as the surrogate system in the movie Her, and of course I'm no movie star, but it works.

My own legal subject matter expertise is, like that of most lawyers, fairly limited. I know a lot about a few things, and am a stand-alone SME in those fields. I know a fair amount about many more legal fields, enough to understand real experts, enough to serve as their surrogate or right hand. Those are the CAR trips I will take.

If I do not know enough about a field of law to understand what the experts are saying, then I cannot serve as a surrogate. I could still drive, of course, but I would refuse to do that on principle, unless I had a navigator, an SME, who knew what they were doing and where they wanted to go. I would need an SME willing to spend the time in the CAR needed to tell me where to go. I hate a TAR pit as much as the next guy. Besides, at my age and level of experience I can drive anywhere I want, in pretty much any CAR I want. That brings us to the final corner of the triangle: the variance in the quality of predictive coding software.

Quality of the CAR Software

I am not going to spend a lot of time on this. No lawyer should be naive enough to think that all of the software is equally good. That is never how it works. It takes time and money to make sophisticated software like this. Anybody can bolt open-source machine learning code onto their review platform. That does not take much, but that is a Model-T.

To make active machine learning work really well, to take it to the next level, requires thousands of programming hours. It takes large teams of programmers. It takes years. It takes money. It takes scientists. It takes engineers. It takes legal experts too. It takes many versions and continuous improvements of search and review software. That is how you tell the difference between okay, good, and great software. I am not going to name names, but I will say that Gartner's so-called Magic Quadrant evaluation of e-discovery software is not too bad. Still, be aware that evaluation of predictive coding is not really their thing, or even a primary factor in rating review software.

It is kind of funny how pretty much everybody wins in the Gartner evaluation. Do you think that's an accident? I am privately much more critical. Many well-known programs are very late to the predictive coding party. They are way behind. Time will tell whether they are ever able to catch up.

Still, these things do change from year to year, as new versions of software are continually released. For some companies you can see real improvements, real investments being made. For others, not so much, and what you do see is often just skin deep. Always be skeptical. And remember, the software CAR is only as good as your driver and navigator.

When it comes to software evaluation, what counts is whether the algorithms can find the documents needed or not. Even the best driver-navigator team in the world can only go so far in a clunker. But give them a great CAR, and they will fly. The software will more than pay for itself in saved reviewer time and the added security of a job well done.

Deja Vu All Over Again

Predictive coding is a great leap forward in search technology. In the long term, predictive coding and other AI-based software will have a bigger impact on the legal profession than the original introduction of computers into the law office did. No change this large comes without problems. When computers were first brought into law offices, they too caused all sorts of problems and had their pitfalls and naysayers. It was a rocky road at first.

I was there and remember it all very well. The Fonz was cool. Disco was still in. I can remember the secretaries yelling many times a day that they needed to reboot. Reboot! Better save. It became a joke, a maddening one. The network was especially problematic. The partner in charge threw up his hands in frustration. The other partners turned the whole project over to me, even though I was a young associate fresh out of law school. They had no choice. I was the only one who could make the damn systems work.

It was a big investment for the firm at the time. Failure was not an option. So I worked late and led my firm's transition from electric typewriters and carbon paper to personal computers, IBM System/36 minicomputers, word processing, printers, hardwired networks, and incredibly elaborate time and billing software. Remember Manac time and billing in Canada? Remember Displaywriter? How about the eight-inch floppy? It was all new and exciting. Computers in a law office! We were written up in IBM's small business magazine.

For years I knew what every DOS operating system file was on every computer in the firm. The IBM repairman became a good friend. Yes, it was a lot simpler then. An attorney could practice law and run his firm's IT department at the same time.

Hey, I was the firm's IT department for the first decade. Computers, especially word processing and time and billing software, eventually made a huge difference in efficiency and productivity. But at first there were many pitfalls. It took us years to create new systems that worked smoothly in law offices. Business methods always lag way behind new technology. This is clearly shown by MIT's Erik Brynjolfsson and Andrew McAfee in their bestseller, The Second Machine Age. It typically takes a generation to adjust to major technology breakthroughs. Also see the TED Talk by Brynjolfsson.

I see parallels between the 1980s and now. The main difference is that legal tech pioneers were very isolated then. The world is much more connected now. We can observe together how, as in the eighties, a whole new level of technology is starting to make its way into the law office. AI-enhanced software, starting with legal search and predictive coding, is something new and revolutionary. It is like the first computers and word processing software of the late 1970s and early 80s.

It will not stop there. Predictive coding will soon expand into information governance. This is the PreSuit project idea that I, and others, are starting to talk about. See, e.g., the Information Governance Initiative. Moreover, many think AI software will soon revolutionize legal practice in a number of other ways, including contract generation and other types of repetitive legal work and analysis. See, e.g., Rohit Talwar, Rethinking Law Firm Strategies for an Era of Smart Technology (ABA LPT, 2014). The potential impact of supervised learning and other cognitive analytics tools on all industries is vast. See, e.g., Deloitte's 2014 paper, Cognitive Analytics (“For the first time in computing history, it’s possible for machines to learn from experience and penetrate the complexity of data to identify associations.”). Also see Digital Reasoning software and Paragon Science software. Who knows where it will lead the world, much less the legal profession? Back in the 1980s I could never have imagined the online, Internet-based legal practice that most of us have now.

The only thing we know for sure is that it will not come easily. There will be problems, and the problems will be overcome. It will take creativity and hard work, but it will be done. Easy buttons have always been a myth, especially when dealing with the latest advances in technology. The benefits are great. The improvements from predictive coding in document review quality and speed are truly astonishing. And it lowers costs too, especially if you avoid the pits. Of course there are issues. Of course there are TAR pits. But they can be avoided, and the results are well worth the effort. The truth is we have no choice.

Conclusion

If you want to remain relevant and continue to practice law in the coming decades, then you will have to learn how to use the new AI-enhanced technologies. There is really no choice, other than retirement. Keep up, learn the new ways, or move on. Many lawyers my age are retiring now for just this reason. They have no desire to learn e-discovery, much less predictive coding. That’s fine. That is the honest thing to do. The next generation will learn to do it, just like a few lawyers learned to use computers in the 1980s and 1990s. Stagnation and more of the same is not an option in today’s world. Constant change and education is the new normal. I think that is a good thing. Do you?

Leave a comment. Especially feel free to point out a TAR pit not mentioned here. There are many, I know, and you cannot avoid something you cannot see.

