Happy Birthday to Abraham Lincoln, America’s First Tech-Lawyer

February 12, 2019

Lincoln in his lawyer phase

Abraham Lincoln was born on February 12, 1809. He was probably our greatest President. Putting aside the tears honest Abe must now be shedding over his political party, it is good to remember Lincoln as an exemplar of a U.S. lawyer. All lawyers would benefit from emulating aspects of his Nineteenth Century legal practice and Twenty First Century thoughts on technology. He was honest, diligent, a deep thinker and ethical. Very ethical. He did not need to be lectured on Cooperation and Rule 1. He also did not need to be told to embrace technology, not hide from it. In fact, he was a prominent Tech-Lawyer of his day, well known for his speaking abilities on the subject. Near the end of his legal career Abe was busy pushing technology and his vision of the future. Sound familiar dear readers? It should. Most of you are like that.

Close up of Lincoln's face on April 10, 1865

Lincoln Was a Technophile

Lincoln was as obsessed with the latest inventions and advances in technology as any techno-geek e-discovery lawyer alive today. The latest things in Lincoln’s day were mechanical devices of all kinds, typically steam-powered, and the early electromagnetic devices, then primarily the telegraph. Indeed, the first electronic transmission from a flying machine, a balloon, was a telegraph sent from inventor Thaddeus Lowe to President Lincoln on June 16, 1861. Unlike Lincoln’s generals, he quickly realized the military potential of flying machines and created an Aeronautics Corps for the Army, appointing Professor Lowe as its chief. See Bruce, Robert V., Abraham Lincoln and the Tools of War. Below is a copy of a handwritten note by Lincoln introducing Lowe to General Scott.

Lincoln's handwritten introduction of Professor Lowe

At the height of his legal career, Lincoln’s biggest clients were the Googles of his day, namely the railroad companies with their incredible new locomotives. These newly rich, super-technology corporations dreamed of uniting the new world with a cross-country grid of high speed transportation. Little noticed today is one of Lincoln’s proudest achievements as President, the enactment of legislation that funded these dreams, the Pacific Railway Act of 1862. The transcontinental railroad did unite the new world, much like the Internet and airlines today are uniting the whole world. A lawyer as obsessed with telegraphs and connectivity as Lincoln was would surely have been an early adopter of the Internet and an enthusiast of electronic discovery. See: Abraham Lincoln: A Technology Leader of His Time (U.S. News & World Report, 2/11/09).

Abraham Lincoln loved technology and loved to think and talk about the big picture of technology, of how it is used to advance the dreams of Man. In fact, Lincoln gave several public lectures on technology, having nothing to do with law or politics. The first such lecture known today was delivered on April 6, 1858, before the Young Men’s Association in Bloomington, Illinois, and was entitled “Lecture on Discoveries and Inventions.” In this lecture, he traced the progress of mankind through its inventions, starting with Adam and Eve and the invention of the fig leaf for clothing. I imagine that if he were giving this speech today (and I’m willing to try to replicate it should I be so invited) he would end with AI and blockchain.

In Lincoln’s next and last lecture series first delivered on February 11, 1859, known as “Second Lecture on Discoveries and Inventions,” Lincoln used fewer biblical references, but concentrated instead on communication. For history buffs, see the complete copy of Lincoln’s Second Lecture, which, in my opinion, is much better than the first. Here are a few excerpts from this little known lecture:

The great difference between Young America and Old Fogy, is the result of Discoveries, Inventions, and Improvements. These, in turn, are the result of observation, reflection and experiment.

Writing – the art of communicating thoughts to the mind, through the eye – is the great invention of the world. Great in the astonishing range of analysis and combination which necessarily underlies the most crude and general conception of it, great, very great in enabling us to converse with the dead, the absent, and the unborn, at all distances of time and of space; and great, not only in its direct benefits, but greatest help, to all other inventions.

I have already intimated my opinion that in the world’s history, certain inventions and discoveries occurred, of peculiar value, on account of their great efficiency in facilitating all other inventions and discoveries. Of these were the arts of writing and of printing – the discovery of America, and the introduction of Patent-laws.

Can there be any doubt that the lawyer who wrote these words would instantly “get” the significance of the total transformation of writing, “the great invention of the world,” from tangible paper form, to intangible, digital form?  Can there be any doubt that a lawyer like this would understand the importance of the Internet, the invention that unites the world in a web of inter-connective writing, where each person may be a printer and instantly disseminate their ideas “at all distances of time and of space?”

Lincoln standing by his generals in the field; close up

Abraham Lincoln did not just have a passing interest in new technologies. He was obsessed with them, like most good e-discovery lawyers are today. In the worst days of the Civil War, the one thing that could still bring Lincoln joy was his talks with the one true scientist then residing in Washington, D.C., the first director of the Smithsonian Institution, Dr. Joseph Henry, a specialist in light and electricity. Despite the fact that Henry’s political views were anti-emancipation and virtually pro-secession, Lincoln would sneak over to the Smithsonian every chance he could get to talk to Dr. Henry. Lincoln told the journalist, Charles Carleton Coffin:

My visits to the Smithsonian, to Dr. Henry, and his able lieutenant, Professor Baird, are the chief recreations of my life…These men are missionaries to excite scientific research and promote scientific knowledge. The country has no more faithful servants, though it may have to wait another century to appreciate the value of their labors.

Bruce, Lincoln and the Tools of War, p. 219.

Lincoln was no mere poser about technology and inventions. He walked his talk and railed against the Old Fogies who opposed technology. Lincoln was known to be willing to meet with every crackpot inventor who came to Washington during the war and claimed to have a new invention that could save the Union. Lincoln would talk to most of them and quickly separate the wheat from the chaff. As mentioned, he recognized the potential importance of aircraft to the military and forced the army to fund Professor Lowe’s wild-eyed dreams of aerial reconnaissance. He also recognized another inventor, Dr. Richard Gatling, and insisted, over much opposition, that the army adopt his new invention. Gatling’s improved version of the machine gun began to be used by the army in 1864, and before that, the Gatling guns that Lincoln funded are credited with defending the New York Times from an invasion by “anti-draft, anti-negro mobs” that roamed New York City in mid-July 1863. Bruce, Lincoln and the Tools of War, p. 142.

As final proof that Lincoln was one of the preeminent technology lawyers of his day, and if he were alive today, surely would be again, I offer the little known fact that Abraham Lincoln is the only President in United States history to have been issued a patent. He patented an invention for “Buoying Vessels Over Shoals.” It is U.S. Patent Number 6,469, issued on May 22, 1849. I could only find the patent on the USPTO web, where it is not celebrated and is hard to read. So as my small contribution to Lincoln memorabilia in the bicentennial year of 2009, I offer the complete copy below of Abraham Lincoln’s three page patent. You should be able to click on the images with your browser to enlarge and download.

Lincoln Patent Pg. 1
Lincoln Patent Pg. 2
Lincoln Patent Pg. 3 (Drawings)

The invention consisted of a set of bellows attached to the hull of a ship just below the water line. After reaching a shallow place, the bellows were to be filled with air, buoying the vessel so that it floated higher and cleared the river shoals. The patent application was accompanied with a wooden model depicting the invention. Lincoln whittled the model with his own hands. It is on display at the Smithsonian and is shown below.

Lincoln Hand-Carved Wooden Model of Patent

Lincoln Filing Invention at Patent Office (fictionalized depiction)

Conclusion

On Abe Lincoln’s birthday it is worth recalling the long, prestigious pedigree of Law and Technology in America. Lincoln is a symbol of freedom, emancipation. He is also a symbol of Law and Technology. If Abe were alive today, I have no doubt he would be, among other things, a leader of Law and Technology.

Stand tall friends. We walk in long shadows and, like Lincoln, we shall overcome the hardships we face. As Abe himself was fond of saying: down with the Old Fogies; it is young America’s destiny to embrace change and lead the world into the future. Let us lead with the honesty and integrity of Abraham Lincoln. Nothing less is acceptable.



Elusion Random Sample Test Ordered Under Rule 26(g) in a Keyword Search Based Discovery Plan

August 26, 2018

There is a new case out of Chicago that advances the jurisprudence of my sub-specialty, Legal Search. City of Rockford v. Mallinckrodt ARD Inc., 2018 WL 3766673, Case 3:17-cv-50107 (N.D. Ill., Aug. 7, 2018). This discovery order was written by U.S. Magistrate Judge Iain Johnston who entitled it: “Order Establishing Production Protocol for Electronically Stored Information.” The opinion is both advanced and humorous, destined to be an oft-cited favorite for many. Thank you Judge Johnston.

In City of Rockford an Elusion random sample quality assurance test was required as part of the parties’ discovery plan to meet the reasonable efforts requirements of Rule 26(g). The random sample procedure proposed was found to impose only a proportional, reasonable burden under Rule 26(b)(1). What makes this holding particularly interesting is that an Elusion test is commonly employed in predictive coding projects, but here the parties had agreed to a keyword search based discovery plan. Also see: Tara Emory, PMP, Court Holds that Math Matters for eDiscovery Keyword Search, Urges Lawyers to Abandon their Fear of Technology (Driven, August 16, 2018) (“party using keywords was required to test the search effectiveness by sampling the set of documents that did not contain the keywords.”)

The Known Unknowns and Unknown Unknowns

Judge Johnston begins his order in City of Rockford with a famous quote by Donald Rumsfeld, a two-time Secretary of Defense.

“[A]s we know there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. . .”
Donald Rumsfeld

Here the knowledge logic is spelled out in a chart, since I know we all love that sort of thing. Deconstructing Rumsfeld: Knowledge and Ignorance in the Age of Innovation (Inovo 5/114).

Anybody who does complex investigations is familiar with this problem. Indeed, you can argue this insight is fundamental to all of science and experimental method. Logan, David C. (March 1, 2009). “Known knowns, known unknowns, unknown unknowns and the propagation of scientific enquiry”, Journal of Experimental Botany 60 (3). pp. 712–4. [I have always wanted to quote a botany journal.]

How do you deal with the known unknowns and the unknown unknowns, the information that we don’t even know that we don’t know about? The deep, hidden information that is both obtuse and rare. Information that is hard to retrieve and harder still to prove does not exist at all. Are you chasing something that might not exist? Something unknown because nonexistent? Such as an overlooked Highly Relevant document? (The stuff of nightmares!) Are you searching for nothing? Zero? If you find it, what does that mean? What can be known and what can never be known? Scientists, investigators and the Secretary of Defense alike all have to ponder these questions and all want to use the best tools and best people possible to do so. See: Deconstructing Rumsfeld: Knowledge and Ignorance in the Age of Innovation (Inovo 5/114).

Seeking Knowledge of the Unknown Elusion Error Rate

These big questions, though interesting, are not why Judge Johnston started his opinion with the Rumsfeld quote. Instead, he used the quote to emphasize that new e-discovery methods, namely random sampling and statistical analysis, can empower lawyers to know what they never did before. A technical way to know the known unknowns. For instance, a way to know the number of relevant documents that will be missed and not produced: the documents that elude retrieval.

As the opinion and this blog will explain, you can do that, know that, by using an Elusion random sample of the null-set. The statistical analysis of the sample transforms the unknown quantity to a known (subject to statistical probabilities and range). It allows lawyers to know, at least within a range, the number of relevant documents that have not been found. This is a very useful quality assurance method that relies on objective measurements to demonstrate success of your project, which here is information retrieval. This and other random sampling methods allow for the calculation of Recall, meaning the percent of total relevant documents found. This is another math-based, quality assurance tool in the field of information retrieval.
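To make the idea concrete, here is a minimal sketch in Python of how a null-set sample becomes a range of likely missed documents. The function name and all of the numbers are hypothetical illustrations, not figures from City of Rockford, and the interval uses the simple normal approximation, which is only rough when very few relevant documents turn up in the sample.

```python
import math

def elusion_estimate(null_set_size, sample_size, relevant_in_sample, z=1.96):
    """Estimate elusion and the implied range of missed relevant documents.

    z=1.96 corresponds to a 95% confidence level (assumed default).
    """
    rate = relevant_in_sample / sample_size  # point estimate of elusion
    # Normal-approximation margin of error around the sampled rate
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    low = max(0.0, rate - margin)
    high = min(1.0, rate + margin)
    return {
        "elusion": rate,
        "missed_low": round(low * null_set_size),
        "missed_point": round(rate * null_set_size),
        "missed_high": round(high * null_set_size),
    }

# Hypothetical project: 500,000 withheld documents, 2,400 sampled, 24 found relevant
result = elusion_estimate(null_set_size=500_000, sample_size=2_400, relevant_in_sample=24)
print(result)
```

The point estimate is the single most likely number of missed documents; the low and high values are the range a lawyer could fairly report along with the confidence level.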

One of the main points Judge Johnston makes in his order is that lawyers should embrace this kind of technical knowledge, not shy away from it. As Tara Emory said in her article, Court Holds that Math Matters for eDiscovery Keyword Search:

A producing party must determine that its search process was reasonable. In many cases, the best way to do this is with objective metrics. Producing parties often put significant effort into brainstorming keywords, interviewing witnesses to determine additional terms, negotiating terms with the other party, and testing the documents containing their keywords to eliminate false positives. However, these efforts often still fail to identify documents if important keywords were missed, and sampling the null set is a simple, reasonable way to test whether additional keywords are needed. …

It is important to overcome the fear of technology and its related jargon, which can help counsel demonstrate the reasonableness of search and production process. As Judge Johnston explains, sampling the null set is a process to determine “the known unknown,” which “is the number of the documents that will be missed and not produced.” Judge Johnson disagreed with the defendants’ argument “that searching the null set would be costly and burdensome.” The Order requires Defendants to sample their null set at a 95% +/-2% margin of error (which, even for a very large set of documents, would be about 2,400 documents to review).[4] By taking these measures—either with TAR or with search terms, counsel can more appropriately represent that they have undertaken a “reasonable inquiry” for relevant information within the meaning of FRCP 26(g)(1).

Small Discovery Dispute in an Ocean of Cooperation

Judge Johnston was not asked to solve the deep mysteries of knowing and not knowing in City of Rockford. The parties came to him instead with an interesting, esoteric discovery dispute. They had agreed on a great number of things, for which the court profusely congratulated them.

The attorneys are commended for this cooperation, and their clients should appreciate their efforts in this regard. The Court certainly does. The litigation so far is a solid example that zealous advocacy is not necessarily incompatible with cooperation. The current issue before the Court is an example of that advocacy and cooperation. The parties have worked to develop a protocol for the production of ESI in this case, but have now reached an impasse as to one aspect of the protocol.

The parties disagreed on whether to include a document review quality assurance test in the protocol. The Plaintiffs wanted one and the Defendants did not. Too burdensome, they said.

To be specific, the Plaintiffs wanted a test where the efficacy of any party’s production would be tested by use of an Elusion type of Random Sample of the documents not produced. The Defendants opposed any specific test. Instead, they wanted the discovery protocol to say that if the receiving party had concerns about the adequacy of the producing party’s efforts, then they would have a conference to address the concerns.

Judge Johnston ruled for the Plaintiffs in this dispute and ordered a random elusion sample to be taken after the Defendants stopped work and completed production. In this case it was a good decision, but it should not be routinely required in all matters.

The Stop Decision and Elusion Sample

One of the fundamental problems in any investigation is to know when you should stop the investigation because it is no longer worth the effort to carry on. When has a reasonable effort been completed? Ideally this happens after all of the important documents have already been found. At that point you should stop the effort and move on to a new project. Alternatively, perhaps you should keep on going and look for more? Should you stop or not?

In Legal Search we call this the “Stop Decision.” Should you conclude the investigation or continue further AI training rounds and other search? As explained in the e-Discovery Team TAR Course:

The all important stop decision is a legal, statistical decision requiring a holistic approach, including metrics, sampling and over-all project assessment. You decide to stop the review after weighing a multitude of considerations. Then you test your decision with a random sample in Step Seven.

See: TAR Course: 15th Class – Step Seven – ZEN Quality Assurance Tests.

If you want to go deeper into this, then listen in on this TAR Course lecture on the Stop decision.

____________

Once a decision is made to Stop, then a well managed document review project will use different tools and metrics to verify that the Stop decision was correct. Judge Johnston in City of Rockford used one of my favorite tools, the Elusion random sample that I teach in the e-Discovery Team TAR Course.

Judge Johnston ordered an Elusion type random sample of the null set in City of Rockford. The sample would determine the range of relevant documents that likely eluded you. These are called False Negatives. Documents presumed Irrelevant and withheld that were in fact Relevant and should have been produced. The Elusion sample is designed to give you information on the total number of Relevant documents that were likely missed, unretrieved, unreviewed and not produced or logged. The fewer the number of False Negatives the better the Recall of True Positives. The goal is to find, to retrieve, all of the Relevant ESI in the collection.

Another way to say the same thing is to say that the goal is Zero False Negatives. You do not miss a single relevant file. Every file designated Irrelevant is in fact not relevant. They are all True Negatives. That would be Total Recall: “the Truth, the Whole Truth …” But that is very rare and some error, some False Negatives, are expected in every large information retrieval project. Some relevant documents will almost always be missed, so the goal is to make the False Negatives inconsequential and keep the Elusion rate low.
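The relationship between these terms can be sketched in a few lines of Python. The function names and document counts below are hypothetical, chosen only to show how Recall and Elusion both follow from the same counts of True Positives, False Negatives and True Negatives.

```python
def recall(true_positives, false_negatives):
    """Share of all relevant documents that were actually found."""
    total_relevant = true_positives + false_negatives
    # With no relevant documents at all, nothing could be missed
    return true_positives / total_relevant if total_relevant else 1.0

def elusion(false_negatives, true_negatives):
    """Share of the null set (documents withheld) that is in fact relevant."""
    null_set = false_negatives + true_negatives
    return false_negatives / null_set if null_set else 0.0

# Hypothetical review: 45,000 relevant found, 5,000 relevant missed,
# 495,000 correctly withheld as irrelevant
print(recall(45_000, 5_000))    # share of relevant documents retrieved
print(elusion(5_000, 495_000))  # share of withheld documents that were relevant

# Total Recall is the case of zero False Negatives
print(recall(50_000, 0))
```

Note that a low Elusion rate does not by itself guarantee high Recall: when the null set is very large, even a small rate can represent many missed documents.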

Here is how Judge Iain Johnston explained the random sample:

Plaintiffs propose a random sample of the null set. (The “null set” is the set of documents that are not returned as responsive by a search process, or that are identified as not relevant by a review process. See Maura R. Grossman & Gordon V. Cormack, The Grossman-Cormack Glossary of Technology-Assisted Review, 7 Fed. Cts. L. Rev. 1, 25 (2013). The null set can be used to determine “elusion,” which is the fraction of documents identified as non-relevant by a search or review effort that are, in fact, relevant. Elusion is estimated by taking a random sample of the null set and determining how many or what portion of documents are actually relevant. Id. at 15.) FN 2

Judge Johnston’s Footnote Two is interesting for two reasons. One, it attempts to calm lawyers who freak out when hearing anything having to do with math or statistics, much less information science and technology. Two, it does so with a reference to Fizbo the clown.

The Court pauses here for a moment to calm down litigators less familiar with ESI. (You know who you are.) In life, there are many things to be scared of, including, but not limited to, spiders, sharks, and clowns – definitely clowns, even Fizbo. ESI is not something to be scared of. The same is true for all the terms and jargon related to ESI. … So don’t freak out.

Accept on Zero Error for Hot Documents

Although this is not addressed in the court order, in my personal view, no False Negatives, i.e., overlooked documents, are acceptable when it comes to Highly Relevant documents. If even one document like that is found in the sample, one Highly Relevant Document, then the Elusion test has failed in my view. You must conclude that the Stop decision was wrong and training and document review must recommence. That is called an Accept on Zero Error test for any hot documents found. Of course my personal views on best practice here assume the use of AI ranking, and the parties in City of Rockford only used keyword search. Apparently they were not doing machine training at all.

The odds of finding False Negatives in a modest sized random sample, assuming that only a few exist (very low prevalence) and the database is large, are very low. With very low prevalence of relevant ESI the test can be of limited effectiveness. That is an inherent problem with low prevalence and random sampling. That is why statistics have only limited effectiveness and should be considered part of a total quality control program. See Zero Error Numerics: ZEN. Math matters, but so too does good project management and communications.
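The low prevalence problem is easy to see with a little arithmetic. The sketch below, in Python, computes the chance that a simple random sample drawn without replacement contains at least one of a handful of rare documents. The function name and the numbers are hypothetical, picked only to illustrate the scale of the problem.

```python
def prob_sample_finds_any(population, rare_docs, sample_size):
    """Probability that a random sample (without replacement) contains
    at least one of the rare documents hidden in the population."""
    p_none = 1.0
    for i in range(sample_size):
        # Chance that the next document drawn is not one of the rare ones
        p_none *= (population - rare_docs - i) / (population - i)
    return 1.0 - p_none

# Hypothetical null set: 500,000 documents hiding 10 hot documents, sample of 2,400
p = prob_sample_finds_any(population=500_000, rare_docs=10, sample_size=2_400)
print(f"{p:.1%}")
```

Under these assumed numbers the sample has a chance of just under 5 percent of surfacing even one of the ten documents, which is why a clean sample alone cannot prove that nothing important was missed.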

The inherent problem with random sampling is that the only way to reduce the error interval is to increase the size of the sample. For instance, to decrease the margin of error to only 2% either way, a total error of 4%, a random sample size of around 2,400 documents is needed. Even though that narrows the error rate to 4%, there is still another error factor of the Confidence Level, here at 95%. Still, it is not worth the effort to review even more sample documents to reduce that to a 99% Level.
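The figure of around 2,400 documents comes from the standard sample size formula for a proportion, n = z² · p(1 − p) / e². Here is a minimal sketch in Python, assuming the usual worst-case prevalence of 50% and the conventional z-scores of 1.96 for a 95% Confidence Level and 2.576 for 99%. The function name and the optional finite population correction are illustrative additions, not anything taken from the order.

```python
import math

def sample_size(margin_of_error, z=1.96, p=0.5, population=None):
    """Sample size needed to estimate a proportion within +/- margin_of_error.

    p=0.5 is the conservative worst case (assumed); population, if given,
    applies the finite population correction for smaller collections.
    """
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    # Round away floating-point noise before rounding up to a whole document
    return math.ceil(round(n, 6))

n_95 = sample_size(0.02)           # 95% confidence, +/-2%
n_99 = sample_size(0.02, z=2.576)  # 99% confidence, +/-2%
print(n_95, n_99)
```

Moving from 95% to 99% at the same margin of error raises the sample from 2,401 to 4,148 documents, roughly 70 percent more review for a modest gain in certainty.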

Random sampling has limitations in low prevalence datasets, which is typical in e-discovery, but still sampling can be very useful. Due to this rarity issue, and the care that producing parties always take to attain high Recall, any documents found in an Elusion random sample should be carefully studied to see if they are of any significance. We look very carefully at any new documents found that are of a kind not seen before. That is unusual. Typically any relevant documents found by random sample of the elusion set are of a type that have been seen before, often many, many times before. These “same old, same old” type of documents are of no importance to the investigation at this point.

Most email related datasets are filled with duplicative, low value data. It is not exactly irrelevant noise, but it is not a helpful signal either. We do not care if we get all of that kind of merely relevant data. What we really want are the Hot Docs, the high value Highly Relevant ESI, or at least Relevant and of a kind not seen before. That is why the Accept On Zero Error test is so important for Highly Relevant documents.

The Elusion Test in City of Rockford 

In City of Rockford Judge Johnston considered a discovery stipulation where the parties had agreed to use a typical keyword search protocol, but disagreed on a quality assurance protocol. Judge Johnston held:

With key word searching (as with any retrieval process), without doubt, relevant documents will be produced, and without doubt, some relevant documents will be missed and not produced. That is a known known. The known unknown is the number of the documents that will be missed and not produced.

Back to the False Negatives again, the known unknown. Judge Johnston continues his analysis:

But there is a process by which to determine that answer, thereby making the known unknown a known known. That process is to randomly sample the null set. Karl Schieneman & Thomas C. Gricks III, The Implications of Rule 26(g) on the Use of Technology-Assisted Review, 2013 Fed. Cts. L. Rev. 239, 273 (2013)(“[S]ampling the null set will establish the number of relevant documents that are not being produced.”). Consequently, the question becomes whether sampling the null set is a reasonable inquiry under Rule 26(g) and proportional to the needs of this case under Rule 26(b)(1).

Rule 26(g) Certification

Judge Johnston takes an expansive view of the duties placed on counsel of record by Rule 26(g), but concedes that perfection is not required:

Federal Rule of Civil Procedure 26(g) requires all discovery requests be signed by at least one attorney (or party, if proceeding pro se). Fed. R. Civ. P. 26(g)(1). By signing the response, the attorney is certifying that to the best of counsel’s knowledge, information, and belief formed after a reasonable inquiry, the disclosure is complete and correct at the time it was made. Fed. R. Civ. P. 26(g)(1)(A). But disclosure of documents need not be perfect. … If the Federal Rules of Civil Procedure were previously only translucent on this point, it should now be clear with the renewed emphasis on proportionality.

Judge Johnston concludes that Rule 26(g) on certification applies to require the Elusion sample in this case.

Just as it is used in TAR, a random sample of the null set provides validation and quality assurance of the document production when performing key word searches.  Magistrate Judge Andrew Peck made this point nearly a decade ago. See William A. Gross Constr. Assocs., 256 F.R.D. at 135-6 (citing Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 262 (D. Md. 2008)); In re Seroquel Products Liability Litig., 244 F.R.D. 650, 662 (M.D. Fla. 2007) (requiring quality assurance).

Accordingly, because a random sample of the null set will help validate the document production in this case, the process is reasonable under Rule 26(g).

Rule 26(b)(1) Proportionality

Judge Johnston considered as a separate issue whether it was proportionate under Rule 26(b)(1) to require the elusion test requested. Again, the court held that it was, in this large case on the pricing of prescription medication:

The Court’s experience and understanding is that a random sample of the null set will not be unreasonably expensive or burdensome. Moreover and critically, Defendants have failed to provide any evidence to support their contention. Mckinney/Pearl Rest. Partners, L.P. v. Metro. Life Ins. Co., 322 F.R.D. 235, 242 (N.D.Tex. 2016) (party required to submit affidavits or offer evidence revealing the nature of the burden).

Once again we see a party seeking protection from having to do something because it is so burdensome, then failing to present actual evidence of burden. We see this a lot lately. See: Responding Party’s Complaints of Financial Burden of Document Review Were Unsupported by the Evidence, Any Evidence (e-Discovery Team, 8/5/18).

Judge Johnston concludes his “Order Establishing Production Protocol for Electronically Stored Information” with the following:

The Court adopts the parties’ proposed order establishing the production protocol for ESI with the inclusion of Plaintiffs’ proposal that a random sample of the null set will occur after the production and that any responsive documents found as a result of that process will be produced. Moreover, following that production, the parties should discuss what additional actions, if any, should occur. If the parties cannot agree at that point, they can raise the issue with the Court.

Conclusion

City of Rockford is important because it is the first case to hold that a quality control procedure should be used to meet the reasonable efforts certification requirements of Rule 26(g). The procedure here required was a random sample Elusion test with related, limited data sharing. If this interpretation of Rule 26(g) is followed by other courts, then it could have a big impact on legal search jurisprudence. Tara Emory in her article, Court Holds that Math Matters for eDiscovery Keyword Search goes so far as to conclude that City of Rockford stands for the proposition that “the testing and sampling process associated with search terms is essential for establishing the reasonableness of a search under FRCP 26(g).”

The City of Rockford holding could persuade other judges and encourage courts to be more active and impose specific document review procedures on all parties, including requiring the use of sampling and artificial intelligence. The producing party cannot always have a free pass under Sedona Principle Six. Testing and sampling may well be routinely ordered in all “large” document review cases in the future.

It will be very interesting to watch how other attorneys argue City of Rockford. It will continue a line of cases examining methodology and procedures in document review. See, e.g., William A. Gross Construction Associates, Inc. v. American Manufacturers Mutual Insurance Co., 256 F.R.D. 134 (S.D.N.Y. 2009) (“wake-up call” for lawyers on keyword search); Winfield v. City of New York (SDNY, Nov. 27, 2017), where Judge Andrew Peck considers methodologies and quality controls of the active machine learning process. Also see Special Master Maura Grossman’s Order Regarding Search Methodology for ESI, a validation Protocol for the Broiler Chicken antitrust cases.

The validation procedure of an Elusion sample in City of Rockford is just one of many possible review protocols that a court could impose under Rule 26(g). There are dozens more, including whether predictive coding should be required. So far, courts have been reluctant to order that, as Judge Peck explained in Hyles:

There may come a time when TAR is so widely used that it might be unreasonable for a party to decline to use TAR. We are not there yet.

Hyles v. New York City, No. 10 Civ. 3119 (AT)(AJP), 2016 WL 4077114 (S.D.N.Y. Aug. 1, 2016).

Like a kid in the backseat of the car, I cannot help but ask, are we there yet? Hyles was published over two years ago now. Maybe some court, somewhere in the world, has already ordered a party to do predictive coding against their will, but not to our knowledge. That is a known unknown. Still, we are closer to “There” with the City of Rockford’s requirement of an Elusion test.

When we get “there,” and TAR is finally ordered in a case, it will probably arise in a situation like City of Rockford where a joint protocol applicable to all parties is involved. That is easier to sell than a one-sided protocol. The court is likely to justify the order by Rule 26(g), and hold that it requires all parties in the case to use predictive coding. Otherwise, they will not meet the reasonable effort burdens of Rule 26(g). Other rules will be cited too, of course, including Rule 1, but Rule 26(g) is likely to be key.

____________


Judge Goes Where Angels Fear To Tread: Tells the Parties What Keyword Searches to Use

June 24, 2018

John Facciola was one of the first e-discovery expert judges to consider the adequacy of a producing party’s keyword search efforts in United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008). He first observed that keyword search and other computer assisted legal search techniques required special expertise to do properly. Everyone agrees with that. He then reached an interesting, but still somewhat controversial conclusion: because he lacked such special legal search expertise, and knew full well that most of the lawyers appearing before him did too, he could not properly analyze and compel the use of specific keywords without the help of expert testimony. To help make his point he paraphrased Alexander Pope’s famous line from An Essay on Criticism: “For fools rush in where angels fear to tread.”

Here are the well-known words of Judge Facciola in O’Keefe (emphasis added):

As noted above, defendants protest the search terms the government used.[6] Whether search terms or “keywords” will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics. See George L. Paul & Jason R. Baron, Information Inflation: Can the Legal System Adapt?, 13 RICH. J.L. & TECH. 10 (2007). Indeed, a special project team of the Working Group on Electronic Discovery of the Sedona Conference is studying that subject and their work indicates how difficult this question is. See The Sedona Conference, Best Practices Commentary on the Use of Search and Information Retrieval, 8 THE SEDONA CONF. J. 189 (2008).

Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread. This topic is clearly beyond the ken of a layman and requires that any such conclusion be based on evidence that, for example, meets the criteria of Rule 702 of the Federal Rules of Evidence. Accordingly, if defendants are going to contend that the search terms used by the government were insufficient, they will have to specifically so contend in a motion to compel and their contention must be based on evidence that meets the requirements of Rule 702 of the Federal Rules of Evidence.

Many courts have followed O’Keefe, even though it is a criminal case, and declined to step in and order specific searches without expert input. See, e.g., the well-known patent case, Vasudevan Software, Inc. v. Microstrategy Inc., No. 11-cv-06637-RS-PSG, 2012 U.S. Dist. LEXIS 163654 (N.D. Cal. Nov. 15, 2012). The opinion was by U.S. Magistrate Judge Paul S. Grewal, who later became the V.P. and Deputy General Counsel of Facebook. Judge Grewal wrote:

But as this case makes clear, making those determinations often is no easy task. “There is no magic to the science of search and retrieval: only mathematics, linguistics, and hard work.”[9]

Unfortunately, despite being a topic fraught with traps for the unwary, the parties invite the court to enter this morass of search terms and discovery requests with little more than their arguments.

More recently, e-discovery expert Judge James Francis addressed this issue in Greater New York Taxi Association v. City of New York, No. 13 Civ. 3089 (VSB) (JCF) (S.D.N.Y. Sept. 11, 2017) and held:

The defendants have not provided the necessary expert opinions for me to assess their motion to compel search terms. The application is therefore denied. This leaves the defendants with three options: “They can cooperate [with the plaintiffs] (along with their technical consultants) and attempt to agree on an appropriate set of search criteria. They can refile a motion to compel, supported by expert testimony. Or, they can request the appointment of a neutral consultant who will design a search strategy.”[10] Assured Guaranty Municipal Corp. v. UBS Real Estate Securities Inc., No. 12 Civ. 1579, 2012 WL 5927379, at *4 (S.D.N.Y. Nov. 21, 2012).

I am inclined to agree with Judge Francis. I know from daily experience that legal search, even keyword search, can be very tricky and depends on many factors, including the documents searched. I have spent over a decade working hard to develop expertise in this area. I know that the appropriate searches to be run depend on experience and on scientific, technical knowledge of information retrieval and statistics. It also depends on tests of proposed keywords; it depends on sampling and document reviews; it depends on getting your hands dirty in the digital mud of the actual ESI. It cannot be done effectively in the blind, no matter what your level of expertise. It is an iterative process of trial and error, false positives and negatives alike.
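The kind of keyword testing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `sample` of documents and their relevance codes are invented, and `evaluate_keyword` is a hypothetical helper, not a feature of any review platform. It runs a proposed whole-word keyword against a sample that a reviewer has already coded and reports precision, recall, and the false positives and false negatives.

```python
import re

# Hypothetical hand-coded sample: (document text, reviewer says relevant?)
sample = [
    ("The coach asked about her salary for next season", True),
    ("Salary survey for the accounting department", False),
    ("Her pay was lower than the other head coaches", True),
    ("Track meet schedule attached", False),
    ("Compensation for the cross-country coach is under review", True),
]

def evaluate_keyword(term, coded_docs):
    """Return precision and recall of one whole-word keyword on a coded sample."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # Relevance codes of every document the keyword hits.
    hits = [relevant for text, relevant in coded_docs if pattern.search(text)]
    true_pos = sum(hits)
    false_pos = len(hits) - true_pos
    total_relevant = sum(relevant for _, relevant in coded_docs)
    false_neg = total_relevant - true_pos
    precision = true_pos / len(hits) if hits else 0.0
    recall = true_pos / total_relevant if total_relevant else 0.0
    return {"term": term, "precision": precision, "recall": recall,
            "false_positives": false_pos, "false_negatives": false_neg}

for term in ("salary", "coach"):
    print(evaluate_keyword(term, sample))
```

Even in this toy sample, “salary” pulls in an irrelevant accounting document (a false positive), and the whole-word term “coach” misses the document that says “coaches” (a false negative). Real testing works the same way, only on real samples of the actual ESI and over many rounds.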

Enter a Judge Braver Than Angels

Recently appointed U.S. Magistrate Judge Laura Fashing in Albuquerque, New Mexico, heard a case involving a dispute over keywords. United States v. New Mexico State University, No. 1:16-cv-00911-JAP-LF, 2017 WL 4386358 (D.N.M. Sept. 29, 2017). It looks like the attorneys in the case neglected to inform Judge Fashing of United States v. O’Keefe. It is a landmark case in this field, yet was not cited in Judge Fashing’s order. More importantly, Judge Fashing did not take the advice of O’Keefe, nor the many cases that follow it. Unlike Judge Facciola and his angels, she told the parties what keywords to use, even without input from experts.

The New Mexico State University opinion did, however, cite to two other landmark cases in legal search, William A. Gross Const. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009) by Judge Andrew Peck and Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260, 262 (D. Md. May 29, 2008) by Judge Paul Grimm. Judge Fashing held in New Mexico State University:

This case presents the question of how parties should search and produce electronically stored information (“ESI”) in response to discovery requests. “[T]he best solution in the entire area of electronic discovery is cooperation among counsel.” William A. Gross Const. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009). Cooperation prevents lawyers designing keyword searches “in the dark, by the seat of the pants,” without adequate discussion with each other to determine which words would yield the most responsive results. Id.

While keyword searches have long been recognized as appropriate and helpful for ESI search and retrieval, there are well-known limitations and risks associated with them, and proper selection and implementation obviously involves technical, if not scientific knowledge.

* * *

Selection of the appropriate search and information retrieval technique requires careful advance planning by persons qualified to design effective search methodology. The implementation of the methodology selected should be tested for quality assurance; and the party selecting the methodology must be prepared to explain the rationale for the method chosen to the court, demonstrate that it is appropriate for the task, and show that it was properly implemented.

Id. (quoting Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260, 262 (D. Md. May 29, 2008)).

Although NMSU has performed several searches and produced thousands of documents, counsel for NMSU did not adequately confer with the United States before performing the searches, which resulted in searches that were inadequate to reveal all responsive documents. As the government points out, “NMSU alone is responsible for its illogical choices in constructing searches.” Doc. 117-1 at 8. Consequently, which searches will be conducted is left to the Court.

Judges Francis, Peck and Facciola

Judge Laura Fashing had me in the quote above until the final sentence. Up till then she had been wisely following the four great judges in this area, Facciola, Peck, Francis and Grimm. Then in the next several paragraphs she rushes in to specify what search terms should be used for what categories of ESI requested. Why should the Court go ahead and do that without expert advice? Why not wait? Especially since Judge Fashing starts her opinion by recognizing the difficulty of the task, that “there are well-known limitations and risks associated with them [keyword searches], and proper selection and implementation obviously involves technical, if not scientific knowledge.” Knowing that, why was she fearless? Why did she ignore Judge Facciola’s advice? Why did she make multiple detailed, technical decisions on legal search, including specific keywords to be used, without the benefit of expert testimony? Was that foolish as several judges have suggested, or was she just doing her job by making the decisions that the parties asked her to make?

Judge Fashing recognized that she did not have enough facts to make a decision, much less expert opinions based on technical, scientific knowledge, but she went ahead and ruled anyway.

Although NMSU argues that the search terms proposed by the government will return a greater number of non-responsive documents than responsive documents, this is not a particular and specific demonstration of fact, but is, instead, a conclusory argument by counsel. See Velasquez, 229 F.R.D. at 200. NMSU’s motion for a protective order with regard to RFP No. 8 is DENIED.

NMSU will perform a search of the email addresses of all individuals involved in salary-setting for Ms. Harkins and her comparators, including Kathy Agnew and Dorothy Anderson, to include the search terms “Meaghan,” “Harkins,” “Gregory,” or “Fister” for the time period of 2007-2012. If this search results in voluminous documents that are non-responsive, NMSU may further search the results by including terms such as “cross-country,” “track,” “coach,” “salary,” “pay,” “contract,” or “applicants,” or other appropriate terms such as “compensation,” which may reduce the results to those communications most likely relevant to this case, and which would not encompass every “Meaghan” or “Gregory” in the system. However, the Court will require NMSU to work with the USA to design an appropriate search if it seeks to narrow the search beyond the four search terms requested by the United States.

Judge Fashing goes on to make several specific orders on what to do to make a reasonable effort to find relevant evidence:

NMSU will conduct searches of the OIE databases, OIE employee’s email accounts, and the email accounts of all head coaches, sport administrators, HR liaisons working within the Athletics Department, assistant or associate Athletic Directors, and/or Athletic Directors employed by NMSU between 2007 and the present. The USA suggests that NMSU conduct a search for terms that are functionally equivalent to a search for (pay or compensate! or salary) and (discriminat! or fair! or unfair!). Doc. 117-1 at 13. If NMSU cannot search with “Boolean” connectors as suggested, it must search for the terms “pay” or “compensate” or “salary” and “discriminate” or “fair” or “unfair” and the various derivatives of these terms (for example the search would include “compensate” and “compensation”). The parties are to work together to determine what terms will be used to search these databases and email accounts.
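To make the Boolean search in the order concrete, here is a rough Python sketch of (pay or compensate! or salary) and (discriminat! or fair! or unfair!). It assumes that a trailing “!” is a root expander, meaning any word beginning with those letters is a hit, as in common legal research search syntaxes; the order itself does not define the symbol. The helper names `term_to_regex` and `matches` are hypothetical, and a real review platform will have its own syntax and stemming rules.

```python
import re

def term_to_regex(term):
    """Turn a term like 'fair!' into a whole-word, case-insensitive regex.
    A trailing '!' is read as a root expander (any word characters may follow)."""
    if term.endswith("!"):
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)

GROUP_A = [term_to_regex(t) for t in ("pay", "compensate!", "salary")]
GROUP_B = [term_to_regex(t) for t in ("discriminat!", "fair!", "unfair!")]

def matches(text):
    """True when the text has at least one hit from each group (A AND B)."""
    return (any(p.search(text) for p in GROUP_A)
            and any(p.search(text) for p in GROUP_B))

print(matches("Her salary seems unfair compared to other coaches"))  # True
print(matches("She was compensated fairly"))                         # True
print(matches("Compensation was discriminatory"))                    # False
```

The last line is the instructive one. Read literally as a prefix, “compensate!” reaches “compensated” but not “compensation,” even though the order expects both to be covered. Small details like that are exactly why proposed terms need to be tested against the actual documents before anyone relies on them.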

Judge Laura Fashing hangs her hat on cooperation, but not on experts. She concludes her order with the following admonishment:

The parties are reminded that:

Electronic discovery requires cooperation between opposing counsel and transparency in all aspects of preservation and production of ESI. Moreover, where counsel are using keyword searches for retrieval of ESI, they at a minimum must carefully craft the appropriate keywords, with input from the ESI’s custodians as to the words and abbreviations they use, and the proposed methodology must be quality control tested to assure accuracy in retrieval and elimination of “false positives.” It is time that the Bar—even those lawyers who did not come of age in the computer era—understand this.

William A. Gross Const. Assocs., Inc., 256 F.R.D. at 136.

Conclusion

Of course I agree with Judge Fashing’s concluding reminder to the parties. Cooperation is key, but so is expertise. There is a good reason for the fear felt by Facciola’s angels. They wisely knew that they lacked the necessary technical, scientific knowledge for the proper selection and implementation of keyword searches. I only wish that Judge Fashing’s order had reminded the parties of this need for experts too. It would have made her job much easier and also helped the parties. Sometimes the wisest thing to do is nothing, at least not until you have more information.

There is widespread agreement among legal search experts on such simplistic methods as keyword search. They would have helped. The same holds true on advanced search methods, such as active machine learning (predictive coding), at least among the elite. See TARcourse.com. There is still some disagreement on TAR methods, especially when you include the many pseudo experts out there. But even they can usually agree on keyword search methods.

I urge judges and litigants faced with a situation like the one Judge Fashing had to deal with in New Mexico State University to consider the three choices set out by Judge Francis in Greater New York Taxi Association:

  1. Cooperation with the other side and their technical consultants to attempt to agree on an appropriate set of search criteria.
  2. Motions supported by expert testimony and facts regarding the search.
  3. Appointment of a neutral consultant who will design a search strategy.

Going it alone with legal search in a complex case is a fool’s errand. Bring in an expert. Spend a little to save a lot. It is not only the smart thing to do, it is also required by ethics. Rule 1.1: Competence, Model Rules of Professional Conduct. ABA Comment 2 to Rule 1.1 states that “Competent representation can also be provided through the association of a lawyer of established competence in the field in question.” Yet, in my experience, this is seldom done and is not something that clients are clamoring for. That should change, and quickly, if we are ever to stop wasting so much time and money on simplistic e-discovery arguments. I am again reminded of the great Alexander Pope (1688–1744) and another of his famous lines from An Essay on Criticism: “A little learning is a dangerous thing; drink deep, or taste not the Pierian spring.”

_______________

 

After I wrote this blog I did a webinar for ACEDS about this topic. Here is a one-hour talk to add to your personal Pierian spring.

 
