TAR Course Expands Again: Standardized Best Practice for Technology Assisted Review

February 11, 2018

The TAR Course has a new class, the Seventeenth Class: Another “Player’s View” of the Workflow. Several other parts of the Course have been updated and edited. It now has Eighteen Classes (listed at end). The TAR Course is free and follows the Open Source tradition. We freely disclose the method for electronic document review that uses the latest technology tools for search and quality controls. These technologies and methods empower attorneys to find the evidence needed for all text-based investigations. The TAR Course shares the state of the art for using AI to enhance electronic document review.

The key is to know how to use the document review search tools that are now available to find the targeted information. We have been working on various methods of use since our case before Judge Andrew Peck in Da Silva Moore in 2012. After we helped get the first judicial approval of predictive coding in Da Silva, we began a series of several hundred document reviews, both in legal practice and scientific experiments. We have now refined our method many times to attain optimal efficiency and effectiveness. We call our latest method Hybrid Multimodal IST Predictive Coding 4.0.

The Hybrid Multimodal method taught by TARcourse.com combines law and technology. Successful completion of the TAR Course requires knowledge of both fields. In the technology field, active machine learning is the most important technology to understand, especially the intricacies of training selection, such as Intelligently Spaced Training (“IST”). In the legal field, the proportionality doctrine is key to the pragmatic application of the method taught at the TAR Course. We give away the information on the methods; we open-source it through this publication.

All we can transmit by online teaching is information, and a small bit of knowledge. Knowing the Information in the TAR Course is a necessary prerequisite for real Knowledge of Hybrid Multimodal IST Predictive Coding 4.0. Knowledge, as opposed to Information, is taught the same way as advanced trial practice, by second-chairing a number of trials. This kind of instruction is the one with real value, the one that completes a doc review project at the same time it completes training. We charge for document review and throw in the training. Information on the latest methods of document review is inherently free, but Knowledge of how to use these methods is a pay-to-learn process.

The Open Sourced Predictive Coding 4.0 method is applied for particular applications and search projects. There are always some customizations and modifications to the default standards to meet the project requirements. All variations are documented and can be fully explained and justified. This is a process where the clients learn by doing and following along with Losey’s work.

What he has learned through a lifetime of teaching and studying Law and Technology is that real Knowledge can never be gained by reading or listening to presentations. Knowledge can only be gained by working with other people in real-time (or near-time), in this case, to carry out multiple electronic document reviews. The transmission of knowledge comes from the Q&A ESI Communications process. It comes from doing. When we lead a project, we help students to go from mere Information about the methods to real Knowledge of how it works. For instance, we do not just make the Stop decision, we also explain the decision. We share our work-product.

Knowledge comes from observing the application of the legal search methods in a variety of different review projects. Eventually some Wisdom may arise, especially as you recover from errors. For background on this triad, see Examining the 12 Predictions Made in 2015 in “Information → Knowledge → Wisdom” (2017). Once Wisdom arises some of the sayings in the TAR Course may start to make sense, such as our favorite “Relevant Is Irrelevant.” Until this koan is understood, the legal doctrine of Proportionality can be an overly complex weave.

The TAR Course is now composed of eighteen classes:

  1. First Class: Background and History of Predictive Coding
  2. Second Class: Introduction to the Course
  3. Third Class: TREC Total Recall Track, 2015 and 2016
  4. Fourth Class: Introduction to the Nine Insights from TREC Research Concerning the Use of Predictive Coding in Legal Document Review
  5. Fifth Class: 1st of the Nine Insights – Active Machine Learning
  6. Sixth Class: 2nd Insight – Balanced Hybrid and Intelligently Spaced Training (IST)
  7. Seventh Class: 3rd and 4th Insights – Concept and Similarity Searches
  8. Eighth Class: 5th and 6th Insights – Keyword and Linear Review
  9. Ninth Class: 7th, 8th and 9th Insights – SME, Method, Software; the Three Pillars of Quality Control
  10. Tenth Class: Introduction to the Eight-Step Work Flow
  11. Eleventh Class: Step One – ESI Communications
  12. Twelfth Class: Step Two – Multimodal ECA
  13. Thirteenth Class: Step Three – Random Prevalence
  14. Fourteenth Class: Steps Four, Five and Six – Iterative Machine Training
  15. Fifteenth Class: Step Seven – ZEN Quality Assurance Tests (Zero Error Numerics)
  16. Sixteenth Class: Step Eight – Phased Production
  17. Seventeenth Class: Another “Player’s View” of the Workflow (class added 2018)
  18. Eighteenth Class: Conclusion

With a lot of hard work you can complete this online training program in a long weekend, but most people take a few weeks. After that, this course can serve as a solid reference to consult during complex document review projects. It can also serve as a launchpad to real Knowledge, and eventually some Wisdom, in electronic document review. TARcourse.com is designed to provide you with the Information needed to start this path to AI enhanced evidence detection and production.

 


Concept Drift and Consistency: Two Keys To Document Review Quality – Part Three

January 29, 2016

This is Part Three of this blog. Please read Part One and Part Two first.

Mitigating Factors to Human Inconsistency

When you consider all of the classifications of documents, both relevant and irrelevant, my consistency rate in the two ENRON reviews jumps to about 99% (1% inconsistent). Compare this with the Grossman Cormack study of the 2009 TREC experiments, where agreement on all non-relevant adjudications, assuming all non-appealed decisions were correct, was 97.4 percent (2.6% inconsistent). My guess is that most well-run CAR review projects today are in fact attaining overall high consistency rates. The existing technologies for duplication, similarity, concept and predictive ranking are very good, especially when all used together. When you consider both relevant and irrelevant coding, consistency should be in the 90s for sure, probably the high nineties. Hopefully, by using today’s improved software and the latest, fairly simple 8-step methods, we can reduce the relevance inconsistency problem even further. Further scientific research is, however, needed to test these hopes and suppositions. My results in the Enron studies could be a black swan, but I doubt it. I think my inconsistency is consistent.
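
To make the arithmetic behind these percentages concrete, here is a minimal Python sketch (my illustration, with made-up codings, not the actual Enron or TREC data) of how an overall agreement rate and a relevant-only overlap rate can be computed from two coding passes over the same documents. The relevant-only figure is the Jaccard-style overlap that explains why the all-decisions numbers look so much rosier than the relevance-only numbers.

```python
# Minimal sketch: overall vs. relevant-only agreement between two coding
# passes. Document IDs and codings below are hypothetical.

def agreement_rates(coding_a: dict, coding_b: dict):
    """Each coding maps document id -> True (relevant) / False (irrelevant)."""
    shared = coding_a.keys() & coding_b.keys()
    agree = sum(coding_a[d] == coding_b[d] for d in shared)
    overall = agree / len(shared)

    # Relevant-only overlap: of the documents either pass coded relevant,
    # what fraction did both code relevant? (Jaccard-style measure.)
    either = {d for d in shared if coding_a[d] or coding_b[d]}
    both = {d for d in shared if coding_a[d] and coding_b[d]}
    relevant_only = len(both) / len(either) if either else 1.0
    return overall, relevant_only

pass1 = {1: True, 2: False, 3: False, 4: True, 5: False}
pass2 = {1: True, 2: False, 3: True, 4: True, 5: False}
print(agreement_rates(pass1, pass2))  # (0.8, 0.666...): 80% overall, 67% relevant-only
```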

Even though overall inconsistencies may be small, the much higher inconsistencies in relevance calls alone remain a continuing problem. It is a fact of life of all human document review, as Voorhees showed years ago. The inconsistency problem must continue to be addressed by a variety of ongoing quality controls, including the use of predictive ranking, and post hoc quality assurance tests such as ei-Recall. The research to date shows that duplicate, similarity and predictive coding ranking searches can help mitigate the inconsistency problem (the overlap has increased from the 30% range to the 70% range), but not eliminate it entirely. By 2012 I was able to use these features to get the relevant-only disagreement rates down to 23%, and even then, the 63 inconsistently coded relevant documents were all unimportant. I suspect, but do not know, that my rates are now lower with improved quality controls. Again, further research is required before any blanket statements like that can be made authoritatively.

Our quest for quality legal search requires that we keep the natural human weakness of inconsistency front and center. Only computers are perfectly consistent. To help keep the human reviewers as consistent as possible, and so mitigate any damages that inconsistent coding may cause, a whole panoply of quality control and quality assurance methods should be used, not just improved search methods. See eg: ZeroErrorNumerics.com.


The Zero Error Numerics (ZEN) quality methods include:

  • predictive coding analytics, a type of artificial intelligence, actively managed by skilled human analysts in a hybrid approach;
  • data visualizations with metrics to monitor progress;
  • flow-state of human reviewer concentration and interaction with AI processes;
  • quiet, uninterrupted, single-minded focus (dual tasking during review is prohibited);
  • disciplined adherence to a scientifically proven set of search and review methods including linear, keyword, similarity, concept, and predictive coding;
  • repeated tests for errors, especially retrieval omissions;
  • objective measurements of recall, precision and accuracy ranges (see the sketch following this list);
  • judgmental and random sampling and analysis such as ei-Recall;
  • active project management and review-lawyer supervision;
  • small team approach with AI leverage, instead of large numbers of reviewers;
  • recognition that mere relevant is irrelevant;
  • recognition of the importance of simplicity under the 7±2 rule;
  • multiple fail-safe systems for error detection of all kinds, including reviewer inconsistencies;
  • use of only the highest quality, tested e-discovery software and vendor teams under close supervision and teamwork;
  • use of only experienced, knowledgeable Subject Matter Experts for relevancy guidance, either directly or by close consultation;
  • extreme care taken to protect client confidentiality; and,
  • high ethics – our goal is to find and disclose the truth in compliance with local laws, not win a particular case.
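
As promised in the list above, here is a minimal sketch of the objective measurements item (hypothetical counts, offered as my illustration rather than as any vendor’s reporting): recall, precision, and accuracy computed from a confusion matrix of review decisions scored against a gold standard.

```python
# Minimal sketch: recall, precision, accuracy and F1 from a confusion
# matrix of review decisions. The counts below are hypothetical.

def review_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    precision = tp / (tp + fp)  # of docs coded relevant, fraction truly relevant
    recall = tp / (tp + fn)     # of truly relevant docs, fraction found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "F1": f1}

# E.g., 800 relevant found, 200 false positives, 200 relevant missed,
# and 8,800 irrelevant docs correctly excluded:
print(review_metrics(tp=800, fp=200, fn=200, tn=8800))
# precision 0.8, recall 0.8, accuracy 0.96, F1 ~0.8
```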

That is my quality playbook. No doubt others have come up with their own methods.

Conclusion

High quality effective legal search depends in part on recognition of the common document review phenomena of concept drift and inconsistent classifications. Although you want to avoid inconsistencies, concept drift is a good thing. It should appear in all complex review projects. Think Bob Dylan – He not busy being born is busy dying. Moreover, you should have a standard protocol in place to both encourage and efficiently deal with such changes in relevance conception. If coding does not evolve, if relevance conceptions do not shift through conversations and analysis, there could be a quality issue. It is a warning flag and you should at least investigate.

Very few projects go in a straight line known from the beginning. Most reviews are not like a simple drag race. There are many curves. If you do not see a curve in the road, and you keep going straight, a spectacular wreck can result. You could fly off the track. This can happen all too easily if the SME in charge of defining relevance has lost track of what the reviewers are doing. You have to keep your eyes on the road and your hands on the wheel.


Good drivers of CARs – Computer Assisted Reviews – can see the curves. They expect them, even when driving a new course. When they come to a curve, they are not surprised; they know how to speed through the curves. They can do a power drift through any corner. Change in relevance should not be a speed-bump. It should be an opportunity to do a controlled skid, an exciting drift with tires burning. Speed drifts help keep a document review interesting, even fun, much like a race track. If you are not having a good time with large scale document review, then you are obviously doing something wrong. You may be driving an old car using the wrong methods. See: Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding 3.0.

Concept drift makes it harder than ever to maintain consistency. When the contours of relevance are changing, at least somewhat, as they should, then you have to be careful to be sure all of your prior codings are redone and made consistent with the latest understanding. Your third step of a baseline random sample should, for instance, be constantly revisited. All of the prior codings should be corrected to be consistent with the latest thinking. Otherwise your prevalence estimate could be way off, and with it all of your rough estimates of recall. The concern with consistency may slow you down a bit, and make the project cost a little more, but the benefits in quality are well worth it.
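
To show why a stale baseline sample corrupts the numbers, here is a small sketch (hypothetical figures, not from any actual project) of the standard prevalence-based recall arithmetic the paragraph above refers to: the random sample yields a point estimate of prevalence, the estimate projects the total number of relevant documents in the corpus, and rough recall is measured against that projection.

```python
# Minimal sketch: prevalence point estimate from a random sample, and the
# rough recall estimate built on it. All numbers are hypothetical.

def estimated_recall(sample_size, sample_relevant, corpus_size, relevant_found):
    prevalence = sample_relevant / sample_size      # point estimate from sample
    est_total_relevant = prevalence * corpus_size   # projected richness
    return relevant_found / est_total_relevant

# Sample of 1,535 docs with 61 coded relevant, in a 1,000,000 doc corpus;
# the review ultimately found 30,000 relevant documents:
print(estimated_recall(1535, 61, 1_000_000, 30_000))  # ~0.75 recall

# After drift-corrected recoding of the same sample drops its relevant
# count to 45, the identical production computes to recall above 1.0, an
# impossible number showing the old prevalence estimate was way off:
print(estimated_recall(1535, 45, 1_000_000, 30_000))  # ~1.02
```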

If you are foolish enough to still use secret control sets, you will not be able to make these changes at all. When the drift hits, as it almost always does, your recall and precision reports based on this control set will be completely worthless. Worse, if the driver does not know this, they will be misled by the software reports of precision and recall based on the secret control set. That is one reason I am so adamantly opposed to the use of secret control sets and have called for all software manufacturers to remove them. See Predictive Coding 3.0 article, part one.

If you do not go back and correct for changes in conception, then you risk withholding a relevant document that you initially coded irrelevant. It could be an important document. There is also the chance that the inconsistent classifications can impact the active machine learning by confusing the algorithmic classifier. Good predictive coding software can handle some errors, but you may slow things down, or if it is extreme, mess them up entirely. Quality controls of all kinds are needed to prevent that.

All types of quality controls are needed to address the inevitability of errors in reviewer classifications. Humans, even lawyers, will make some mistakes from time to time. We should expect that and allow for it in the process. Use of duplicate and near-duplicate guides, email strings, and other similarity searches, concept searches and probability rankings can mitigate the fact that no human will ever attain perfect machine-like consistency. So too can a variety of additional quality control measures, primary among them being the use of as few human reviewers as possible. This is in accord with the general review principle that I call less is more. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part One and Part Two. That is not a problem if you are driving a good CAR, one with the latest predictive coding search engines. More than a couple of reviewers in a CAR like that will just slow you down. But it’s alright, Ma, it’s life, and life only.

________________

Since I invoked the great Bob Dylan and It’s Alright, Ma earlier in this blog, I thought I owed it to you to share the full lyrics, plus a video of young Bob’s performance. It could be his all-time best song-poem. What do you think? If you are feeling creative, leave a poem below that paraphrases Dylan to make one of the points in this blog.

______________________

 “It’s Alright, Ma (I’m Only Bleeding)”


Bob Dylan

Darkness at the break of noon
Shadows even the silver spoon
The handmade blade, the child’s balloon
Eclipses both the sun and moon
To understand you know too soon
There is no sense in trying.
Pointed threats, they bluff with scorn
Suicide remarks are torn
From the fool’s gold mouthpiece
The hollow horn plays wasted words
Proved to warn
That he not busy being born
Is busy dying.
Temptation’s page flies out the door
You follow, find yourself at war
Watch waterfalls of pity roar
You feel to moan but unlike before
You discover
That you’d just be
One more person crying.
So don’t fear if you hear
A foreign sound to your ear
It’s alright, Ma, I’m only sighing.
As some warn victory, some downfall
Private reasons great or small
Can be seen in the eyes of those that call
To make all that should be killed to crawl
While others say don’t hate nothing at all
Except hatred.
Disillusioned words like bullets bark
As human gods aim for their marks
Made everything from toy guns that spark
To flesh-colored Christs that glow in the dark
It’s easy to see without looking too far
That not much
Is really sacred.
While preachers preach of evil fates
Teachers teach that knowledge waits
Can lead to hundred-dollar plates
Goodness hides behind its gates
But even the President of the United States
Sometimes must have
To stand naked.
An’ though the rules of the road have been lodged
It’s only people’s games that you got to dodge
And it’s alright, Ma, I can make it.
Advertising signs that con you
Into thinking you’re the one
That can do what’s never been done
That can win what’s never been won
Meantime life outside goes on
All around you.
You lose yourself, you reappear
You suddenly find you got nothing to fear
Alone you stand without nobody near
When a trembling distant voice, unclear
Startles your sleeping ears to hear
That somebody thinks
They really found you.
A question in your nerves is lit
Yet you know there is no answer fit to satisfy
Insure you not to quit
To keep it in your mind and not forget
That it is not he or she or them or it
That you belong to.
Although the masters make the rules
For the wise men and the fools
I got nothing, Ma, to live up to.
For them that must obey authority
That they do not respect in any degree
Who despite their jobs, their destinies
Speak jealously of them that are free
Cultivate their flowers to be
Nothing more than something
They invest in.
While some on principles baptized
To strict party platforms ties
Social clubs in drag disguise
Outsiders they can freely criticize
Tell nothing except who to idolize
And then say God Bless him.
While one who sings with his tongue on fire
Gargles in the rat race choir
Bent out of shape from society’s pliers
Cares not to come up any higher
But rather get you down in the hole
That he’s in.
But I mean no harm nor put fault
On anyone that lives in a vault
But it’s alright, Ma, if I can’t please him.
Old lady judges, watch people in pairs
Limited in sex, they dare
To push fake morals, insult and stare
While money doesn’t talk, it swears
Obscenity, who really cares
Propaganda, all is phony.
While them that defend what they cannot see
With a killer’s pride, security
It blows the minds most bitterly
For them that think death’s honesty
Won’t fall upon them naturally
Life sometimes
Must get lonely.
My eyes collide head-on with stuffed graveyards
False gods, I scuff
At pettiness which plays so rough
Walk upside-down inside handcuffs
Kick my legs to crash it off
Say okay, I have had enough
What else can you show me?
And if my thought-dreams could be seen
They’d probably put my head in a guillotine
But it’s alright, Ma, it’s life, and life only.

 


Beware of the TAR Pits! – Part Two

February 23, 2014

This is the conclusion of a two part blog. For this to make sense please read Part One first.

Quality of Subject Matter Experts

The quality of Subject Matter Experts in a TAR project is another key factor in predictive coding. It is one that many would prefer to sweep under the rug. Vendors especially do not like to talk about this (and they sponsor most panel discussions) because it is beyond their control. SMEs come from law firms. Law firms hire vendors. What dog will bite the hand that feeds him? Yet, we all know full well that not all subject matter experts are alike. Some are better than others. Some are far more experienced and knowledgeable than others. Some know exactly what documents they need at trial to win a case. They know what they are looking for. Some do not. Some have done trials, lots of them. Some do not know where the courthouse is. Some have done many large search projects, first paper, now digital. Some are great lawyers; and some, well, you’d be better off with my dog.

The SMEs are the navigators. They tell the drivers where to go. They make the final decisions on what is relevant and what is not. They determine what is hot, and what is not. They determine what is marginally relevant, what is grey area, what is not. They determine what is just unimportant more of the same. They know full well that some relevant is irrelevant. They have heard and understand the frequent mantra at trials: Objection, Cumulative. Rule 403 of the Federal Rules of Evidence. Also see The Fourth Secret of Search: Relevant Is Irrelevant found in Secrets of Search – Part III.

Quality of SMEs is important because the quality of input in active machine learning is important. A fundamental law of predictive coding as we now know it is GIGO, garbage in, garbage out. Your active machine learning depends on correct instruction. Although good software can mitigate the effects of erroneous instruction somewhat, they can never be eliminated. See: Webber & Pickens, Assessor Disagreement and Text Classifier Accuracy, SIGIR 2013 (24% more ranking depth needed to reach equivalent recall when not using SMEs, even in a small data search of news articles with rather simple issues).
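
The GIGO principle is easy to demonstrate for yourself. Below is a minimal sketch (synthetic data, my own illustration, not the Webber and Pickens experiment) that trains the same simple learner twice, once on correct labels and once with 30% of the training labels flipped, and compares accuracy on a clean test set.

```python
# Minimal GIGO sketch: identical learner, clean vs. corrupted training
# labels, scored on the same clean test set. Data is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clean = LogisticRegression().fit(X_tr, y_tr)

rng = np.random.default_rng(0)
flip = rng.random(len(y_tr)) < 0.30           # garbage in: 30% wrong labels
noisy = LogisticRegression().fit(X_tr, np.where(flip, 1 - y_tr, y_tr))

print(clean.score(X_te, y_te), noisy.score(X_te, y_te))  # garbage out: lower accuracy
```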

Information scientists like Jeremy Pickens are, however, working hard on ways to minimize the errors of SME document classifications on overall corpus rankings. Good thing too, because even one good SME will not be consistent in ranking the same documents. That is the inconsistency the Jaccard Index scientists like to measure. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Part Two, and search for Jaccard in my blog.

In my Enron experiments I was inconsistent in determining the relevance of the same document 23% of the time. That’s right, I contradicted myself on relevancy 23% of the time. (If you include the irrelevancy coding, the inconsistencies were only 2%.) Lest you think I’m a complete idiot (which, by the way, I sometimes am), the 23% rate is actually the best on record for an experiment. It is the best ever measured, by far. Other experimentally measured rates have inconsistencies from 50% to 90% (with multiple reviewers). Pathetic, huh? Now you know why AI is so promising and why it is so important to enhance our human intelligence with artificial intelligence. When it comes to consistency of document identifications in large scale data reviews, we are all idiots!

With these human frailty facts in mind, not only variable quality in subject matter expertise, but also human inconsistencies, it is obvious why scientists like Pickens and Webber are looking for techniques to minimize the impact of errors and, get this, even use these inevitable errors to improve search. Jeremy Pickens and I have been corresponding about this issue at length lately. Here is Jeremy’s later response to this blog: In TAR, Wrong Decisions Can Lead to the Right Documents (A Response to Ralph Losey). Jeremy does at least concede that coding quality is indeed important. He goes on to argue that his study shows that wrong decisions, typically on grey area documents, can indeed be useful.

I do not doubt Dr. Pickens’ findings, but am skeptical of the search methods and conclusions derived therefrom. In other words, I question how the training was accomplished, the supervision of the learning. This is what I call here the driver’s role, shown on the triangle as the Power User and Experienced Searcher. In my experience as a driver/SME, much depends on where you are in the training cycle. As the training continues, the algorithms eventually do become able to detect and respond to subtle document distinctions. Yes, it takes a while, and you have to know what and when to train on, which is the driver’s skill (for instance, you never train with giant documents), but it does eventually happen. Thus, while it may not matter if you code grey area documents wrong at first, it eventually will matter, that is, unless you do not really care about the distinctions. (The TREC overturn documents Jeremy tested on, the ones he called wrong documents, were in fact grey area documents, that is, close questions. Attorneys disagreed on whether they were relevant, which is why they were overturned on appeal.) The lack of precision in training, which is inevitable anyway even when one SME is used, may not matter much in the early stages of training, and may not matter at all when testing simplistic issues using easy databases, such as news articles. In fact, I have used semi-supervised training myself, as Jeremy describes from old experiments in Pseudo Relevance Feedback. I have seen it work myself, especially in early training.

Still, the fact that some errors do not matter in early training does not mean you should not care about consistency and accuracy of training during the whole ride. In my experience, as training progresses and the machine gets smarter, it does matter. But let’s test that, shall we? All I can do is report on what I see, i.e., anecdotal evidence.

Outside of TREC and science experiments, in the messy real world of legal search, the issues are typically maddeningly difficult. Moreover, the difference in cost of review of hundreds of thousands of irrelevant documents can mean millions of dollars. The fine points of differentiation in matured training are needed for precision in results to reduce the costs of final review. In other words, both precision and recall matter in legal search, and all are governed by the overarching legal principle of proportionality. That is not part of information science, of course, but we lawyers must govern our search efforts by proportionality.

Also see William Webber’s response: Can you train a useful model with incorrect labels? I believe that William’s closing statement may be correct, either that or software differences:

It may also be, though this is speculation on my part, that a trainer who is not only a subject-matter expert, but an expert in training itself (an expert CAR driver, to adopt Ralph Losey’s terminology) may be better at selecting training examples; for instance, in recognizing when a document, though responsive (or non-responsive), is not a good training example.

I hope Pickens and Webber get there some day. In truth, I am a big supporter of their efforts and experiments. We need more scientific research. But for now, I still do not believe we can turn lead into gold. It is even worse if you have a bunch of SMEs arguing with each other about where they should be going, about what is relevant and what is not. That is a separate issue they do not address, which points to the downside of all trainers, both amateurs and SMEs alike. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

For additional support on the importance of SMEs, see again Monica’s article, EDI-Oracle Study, where she summarizes the conclusion of Patrick Oot from the study that:

Technology providers using similar underlying technology, but different human resources, performed in both the top-tier and bottom-tier of all categories. Conclusion: Software is only as good as its operators. Human contribution is the most significant element. (emphasis in original)

Also see the recent Xerox blog, Who Prevails in the E-Discovery War of Man vs. Machine? by Gabriela Baron.

Teams that participated in Oracle without a bona fide SME, much less a good driver, well, they were doomed. The software was secondary. How could you possibly replicate the work of the original SME trial lawyers that did the first search without having an SME yourself, one with at least a similar experience and knowledge level?

This means that even with a good driver, and good software, if you do not also have a good SME, you can still end up driving in circles. It is even worse when you try to do a project with no SME at all. Remember, the SME in the automobile analogy is the navigation system, or to use the pre-digital reality, the passenger with the map. We have all seen what happens when the navigation system screws up, or the map is wrong, or more typically, out of date (like many old SMEs). You do not get to the right place. You can have a great driver, and go quite fast, but if you have a poor navigator, you will not like the results.

The Oracle study showed this, but it is hardly new or surprising. In fact, it would be shocking if the contrary were true. How can incorrect information ever create correct information? The best you can hope for is to have enough correct information to smooth out the errors. Put another way, without signal, noise is just noise. Still, Jeremy Pickens claims there is a way. I will be watching and hope he succeeds where the alchemists of old always failed.

Tabula Rasa

There is one way out of the SME frailty conundrum that I have high hopes for and can already understand. It has to do with teaching the machine about relevance for all projects, not just one. The way predictive coding works now, the machine is a tabula rasa, a blank slate. The machine knows nothing to begin with. It only knows what you teach it as the search begins. No matter how good the AI software is at learning, it still does not know anything on its own. It is just good at learning.

That approach is obviously not too bright. Yet, it is all we can manage now in legal search at the beginning of the Second Machine Age. Someday soon it will change. The machine will not have its memory wiped after every project. It will remember. The training from one search project will carry over to the next one like it. The machine will remember the training of past SMEs.

That is the essential core of my PreSuit proposal: to retain the key components of the past SME training so that you do not have to start afresh on each search project. PreSuit: How Corporate Counsel Could Use “Smart Data” to Predict and Prevent Litigation. When that happens (I don’t say if, because this will start happening soon, some say it already has) the machine could start smart.

That is what we all want. That is the holy grail of AI-enhanced search — a smart machine. (For the ultimate implications of this, see the movie Her, which is about an AI-enhanced future that is still quite a few years down the road.) But do not kid yourself, that is not what we have now. Now we only have baby robots, ones that are eager and ready to learn, but do not know anything. It is kind of like 1-Ls in law school, except that when they finish a class they do not retain a thing!

When my PreSuit idea is implemented, the next SME will not have to start afresh. The machine will not be a tabula rasa. It will be able to see litigation brewing. It will help general counsel to stop law suits before they are filed. The SMEs will then build on the work of prior SMEs, or maybe build on their own previous work in another similar project. Then the GIGO principle will be much easier to mitigate. Then the computer will not be completely dumb, it will have some intelligence from the last guy. There will be some smart data, not just big dumb data. The software will know stuff, know the law and relevance, not just know how to learn stuff.
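
To make the point concrete, here is a toy sketch (my illustration of the PreSuit idea only, not a feature of any existing review platform) of what starting smart could look like: persist the relevance model trained in one matter, then load it to rank documents in the next, instead of beginning from a blank slate. The file name and sample documents are hypothetical, and scikit-learn stands in for a vendor’s machine learning engine.

```python
# Toy sketch of "smart data": save the trained relevance model from one
# project and reuse it to seed the next. All names are hypothetical.
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Matter 1: train on SME-coded documents, then persist instead of wiping.
docs = ["memo re: off-book swap valuations", "lunch schedule", "swap exposure email"]
labels = [1, 0, 1]  # 1 = relevant per the SME
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(docs, labels)
joblib.dump(model, "matter1_relevance_model.joblib")

# Matter 2: load the prior training and rank new documents from day one,
# then continue training from there rather than from a tabula rasa.
seeded = joblib.load("matter1_relevance_model.joblib")
new_docs = ["quarterly swap exposure memo", "holiday party invite"]
print(seeded.predict_proba(new_docs)[:, 1])  # relevance scores from prior knowledge
```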

When that happens, then the SME in a particular project will not be as important, but for now, when working from scratch with dumb data, the SME is still critical. The smarter and more consistent the better. See: Less Is More: When it comes to predictive coding training, the “fewer reviewers the better” – Parts One, Two, and Three.

Professor Marchionini, like all other search experts, recognizes the importance of SMEs to successful search. As he puts it:

Thus, experts in a domain have greater facility and experience related to information-seeking factors specific to the domain and are able to execute the subprocesses of information seeking with speed, confidence, and accuracy.

That is one reason that the Grossman Cormack glossary builds in the role of SMEs as part of their base definition of computer assisted review:

A process for Prioritizing or Coding a Collection of electronic Documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of Documents and then extrapolates those judgments to the remaining Document Collection.

Glossary at pg. 21 defining TAR.
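
That definition, human judgments on a small set extrapolated to the whole collection, maps directly onto ordinary supervised text classification. Here is a minimal sketch (hypothetical documents, my own illustration; scikit-learn stands in for the review software’s engine) of the extrapolation step the Glossary describes: fit a classifier on the SME-coded seed set, then rank the uncoded remainder by predicted probability of relevance.

```python
# Minimal sketch of the Glossary's TAR definition: learn from SME
# judgments on a seed set, extrapolate to the collection. Documents
# and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

seed_docs = [
    "memo on off-book swap valuations",      # SME: relevant
    "fantasy football standings",            # SME: not relevant
    "email re: swap counterparty exposure",  # SME: relevant
    "cafeteria menu for next week",          # SME: not relevant
]
seed_labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(seed_docs, seed_labels)  # harness the SME's judgments

# Extrapolate those judgments: rank uncoded documents by relevance score.
collection = ["draft swap term sheet", "parking garage notice"]
for score, doc in sorted(zip(model.predict_proba(collection)[:, 1], collection),
                         reverse=True):
    print(f"{score:.2f}  {doc}")
```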

Most SMEs Today Hate CARs
(And They Don’t Much Like High-Tech Drivers Either)

This is an inconvenient truth for vendors. Predictive coding is defined by SMEs. Yet vendors cannot make good SMEs step up to the plate and work with the trainers, the drivers, to teach the machine. All the vendors can do is supply the car and maybe help with the driver. The driver and navigator have to be supplied by the law firm or corporate clients. There is no shortage of good SMEs, but almost all of them have never even seen a CAR. They do not like them. They can barely even speak the language of the driver. They don’t much like most of the drivers either. They are damn straight not going to spend two weeks of their lives riding around in one of those newfangled horseless carriages.


That is the reality of where we are now. Also see: Does Technology Leap While Law Creeps? by Brian Dalton, Above the Law. Of course this will change with the generations. But for now, that is the way it is. So vendors work on error minimization. They try to minimize the role of SMEs. That is anyway a good idea, because, as mentioned, all human SMEs are inconsistent. I was lucky to only be inconsistent 23% of the time on relevance. But still, there is another obvious solution.

There is another way to deal today with the reluctant SME problem, a way that works right now with today’s predictive coding software. It is a kind of non-robotic surrogate system that I have developed, and I’m sure several other professional drivers have as well. See my CAR page for more information on this. But, in reality, it is one of those things I would just have to show you in a driver education school type setting. I do it frequently. It involves acting on behalf of an SME, and dealing with the driver for them. It places the SMEs in their comfort zone, where they just make yes-no decisions on the close question documents, although there is obviously more to it than that. It is not nearly as good as the surrogate system in the movie Her, and of course, I’m no movie star, but it works.


My own legal subject matter expertise is, like that of most lawyers, fairly limited. I know a lot about a few things, and am a stand-alone SME in those fields. I know a fair amount about many more legal fields, enough to understand real experts, enough to serve as their surrogate or right hand. Those are the CAR trips I will take.

If I do not know enough about a field of law to understand what the experts are saying, then I cannot serve as a surrogate. I could still drive, of course, but I would refuse to do that as a matter of principle, unless I had a navigator, an SME, who knew what they were doing and where they wanted to go. I would need an SME willing to spend the time in the CAR needed to tell me where to go. I hate a TAR pit as much as the next guy. Plus, at my age and experience, I can drive anywhere I want, in pretty much any CAR I want. That brings us to the final corner of the triangle, the variance in the quality of predictive coding software.

Quality of the CAR Software

I am not going to spend a lot of time on this. No lawyer could be naive enough to think that all of the software is equally good. That is never how it works. It takes time and money to make sophisticated software like this. Anybody can simply add open source machine learning code to their review platform. That does not take much, but that is a Model-T.


To make active machine learning work really well, to take it to the next level, requires thousands of programming hours. It takes large teams of programmers. It takes years. It takes money. It takes scientists. It takes engineers. It takes legal experts too. It takes many versions and continuous improvements of search and review software. That is how you tell the difference between ok, good, and great software. I am not going to name names, but I will say that Gartner’s so-called Magic Quadrant evaluation of e-discovery software is not too bad. Still, be aware that evaluation of predictive coding is not really their thing, or even a primary factor for rating review software.


It is kind of funny how pretty much everybody wins in the Gartner evaluation. Do you think that’s an accident? I am privately much more critical. Many well known programs are very late to the predictive coding party. They are way behind. Time will tell if they are ever able to catch up.

Still, these things do change from year to year, as new versions of software are continually released. For some companies you can see real improvements, real investments being made. For others, not so much, and what you do see is often just skin deep. Always be skeptical. And remember, the software CAR is only as good as your driver and navigator.


When it comes to software evaluation, what counts is whether the algorithms can find the documents needed or not. Even the best driver-navigator team in the world can only go so far in a clunker. But give them a great CAR, and they will fly. The software will more than pay for itself in saved reviewer time and added security of a job well done.

Deja Vu All Over Again

Predictive coding is a great leap forward in search technology. In the long term, predictive coding and other AI-based software will have a bigger impact on the legal profession than did the original introduction of computers into the law office. No large changes like this are without problems. When computers were first brought into law offices, they too caused all sorts of problems and had their pitfalls and naysayers. It was a rocky road at first.


I was there and remember it all very well. The Fonz was cool. Disco was still in. I can remember the secretaries yelling many times a day that they needed to reboot. Reboot! Better save. It became a joke, a maddening one. The network was especially problematic. The partner in charge threw up his hands in frustration. The other partners turned the whole project over to me, even though I was a young associate fresh out of law school. They had no choice. I was the only one who could make the damn systems work.

It was a big investment for the firm at the time. Failure was not an option. So I worked late and led my firm’s transition from electric typewriters and carbon paper to personal computers, IBM System 36 minicomputers, word processing, printers, hardwired networks, and incredibly elaborate time and billing software. Remember Manac time and billing in Canada? Remember Displaywriter? How about the eight inch floppy? It was all new and exciting. Computers in a law office! We were written up in IBM’s small business magazine.

For years I knew what every DOS operating file was on every computer in the firm. The IBM repair man became a good friend. Yes, it was a lot simpler then. An attorney could practice law and run his firm’s IT department at the same time.

Hey, I was the firm’s IT department for the first decade. Computers, especially word processing and time and billing software, eventually made a huge difference in efficiency and productivity. But at first there were many pitfalls. It took us years to create new systems that worked smoothly in law offices. Business methods always lag way behind new technology. This is clearly shown by MIT’s Erik Brynjolfsson and Andrew McAfee in their bestseller, The Second Machine Age. It typically takes a generation to adjust to major technology breakthroughs. Also see the TED Talk by Brynjolfsson.

I see parallels between the 1980s and now. The main difference is that legal tech pioneers were very isolated then. The world is much more connected now. We can observe together how, like in the eighties, a whole new level of technology is starting to make its way into the law office. AI-enhanced software, starting with legal search and predictive coding, is something new and revolutionary. It is like the first computers and word processing software of the late 1970s and early 80s.

It will not stop there. Predictive coding will soon expand into information governance. This is the PreSuit project idea that I, and others, are starting to talk about. See, e.g.: Information Governance Initiative. Moreover, many think AI software will soon revolutionize legal practice in a number of other ways, including contract generation and other types of repetitive legal work and analysis. See, e.g.: Rohit Talwar, Rethinking Law Firm Strategies for an Era of Smart Technology (ABA LPT, 2014). The potential impact of supervised learning and other cognitive analytics tools on all industries is vast. See, e.g.: Deloitte’s 2014 paper, Cognitive Analytics (“For the first time in computing history, it’s possible for machines to learn from experience and penetrate the complexity of data to identify associations.”); also see Digital Reasoning software and Paragon Science software. Who knows where it will lead the world, much less the legal profession? Back in the 1980s I could never have imagined the online, Internet-based legal practice that most of us have now.

The only thing we know for sure is that it will not come easy. There will be problems, and the problems will be overcome. It will take creativity and hard work, but it will be done. Easy buttons have always been a myth, especially when dealing with the latest advancements of technology. The benefits are great. The improvements from predictive coding in document review quality and speed are truly astonishing. And it lowers cost too, especially if you avoid the pits. Of course there are issues. Of course there are TAR pits. But they can be avoided and the results are well worth the effort. The truth is we have no choice.

Conclusion


If you want to remain relevant and continue to practice law in the coming decades, then you will have to learn how to use the new AI-enhanced technologies. There is really no choice, other than retirement. Keep up, learn the new ways, or move on. Many lawyers my age are retiring now for just this reason. They have no desire to learn e-discovery, much less predictive coding. That’s fine. That is the honest thing to do. The next generation will learn to do it, just like a few lawyers learned to use computers in the 1980s and 1990s. Stagnation and more of the same is not an option in today’s world. Constant change and education is the new normal. I think that is a good thing. Do you?

Leave a comment. Especially feel free to point out a TAR pit not mentioned here. There are many, I know, and you cannot avoid something you cannot see.


