Another Judge is Asked to Settle a Keyword Squabble and He Hesitates To Go Where Angels Fear To Tread: Only Tells the Parties What Keywords NOT To Use

July 15, 2018

In this blog we discuss yet another case where the parties bickered over keywords and the judge was asked to intervene. Webasto Thermo & Comfort v. BesTop, Inc., 2018 WL 3198544, No. 16-13456 (E.D. Mich. June 29, 2018). The opinion was written in a patent case in Detroit by Executive Magistrate Judge R. Steven Whalen. He looked at the proposed keywords and found them wanting, but wisely refused to go further and tell the parties what keywords to use. Well done, Judge Whalen!

This case is similar to the one discussed in my last blog, Judge Goes Where Angels Fear To Tread: Tells the Parties What Keyword Searches to Use, where Magistrate Judge Laura Fashing in Albuquerque was asked to resolve a keyword dispute in United States v. New Mexico State University, No. 1:16-cv-00911-JAP-LF, 2017 WL 4386358 (D.N.M. Sept. 29, 2017). Judge Fashing not only found the proposed keywords inadequate, but came up with her own replacement keywords and did so without any expert input.

In my prior blog on Judge Fashing’s decision I discussed Judge John Facciola’s landmark legal search opinion in United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008) and other cases that follow it. In O’Keefe Judge Facciola held that because keyword search questions involve complex, technical, scientific issues, a judge should not decide them without the help of expert testimony. That is the context for his famous line:

Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread. This topic is clearly beyond the ken of a layman and requires that any such conclusion be based on evidence that, for example, meets the criteria of Rule 702 of the Federal Rules of Evidence.

In this week’s blog I consider the opinion by Judge Whalen in Webasto Thermo & Comfort v. BesTop, Inc., 2018 WL 3198544, No. 16-13456 (E.D. Mich. June 29, 2018), where he told the parties what keywords not to use, again without expert input, but stopped there. The two make interesting counterpoint cases. It is also interesting to observe that in all three cases, O’Keefe, New Mexico State University and Webasto, the judges end on the same note: the parties are ordered to cooperate. Ah, if it were only so easy.

Stipulated Order Governing ESI Production

In Webasto Thermo & Comfort v. BesTop, Inc., the parties cooperated at the beginning of the case. They agreed to the entry of a stipulated ESI Order governing ESI production. The stipulation included a cooperation paragraph where the parties pledged to try to resolve all ESI issues without judicial intervention. Apparently, the parties’ cooperation did not go much beyond the stipulated order. Cooperation broke down and the plaintiff filed a barrage of motions to avoid having to do document review, including an Emergency Motion to Stay ESI Discovery. The plaintiff alleged that the defendant violated the ESI stipulation by “propounding overly broad search terms in its request for ESI.” Oh, how terrible. Red Alert!

Plaintiff further accused defense counsel of “propounding prima facie inappropriate search criteria, and refusal to work in good faith to target its search terms to specific issues in this case.” Again, the outrageous behavior reminds me of the Romulans. I can see why plaintiff’s counsel called an emergency and asked for costs and relief from having to produce any ESI at all. That kind of approach rarely goes over well with any judge, but here it worked. That’s because the keywords the defense wanted plaintiff to use in its search for relevant ESI were, in fact, very bad.

Paragraph 1.3(3) of the ESI Order establishes a protocol designed to constrain e-discovery, including a limitation to eight custodians with no more than ten keyword search terms for each. It goes on to provide the following very interesting provision:

The search terms shall be narrowly tailored to particular issues. Indiscriminate terms, such as the producing company’s name or its product name, are inappropriate unless combined with narrowing search criteria that significantly reduce the risk of overproduction. A conjunctive combination of multiple words or phrases (e.g. ‘computer’ and ‘system’) narrows the search and shall count as a single term. A disjunctive combination of multiple words or phrases (e.g. ‘computer’ or ‘system’) broadens the search, and thus each word or phrase shall count as a separate search term unless they are variants of the same word. Use of narrowing search criteria (e.g. ‘and,’ ‘but not,’ ‘w/x’) is encouraged to limit the production and shall be considered when determining whether to shift costs for disproportionate discovery.

Remember, this is negotiated wording that the parties agreed to, including the bit about product names and “conjunctive combination.”
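The counting rule in that paragraph is mechanical enough to sketch in code. What follows is only an illustration of one plausible reading of paragraph 1.3(3), not anything the parties or the court used. The helper names, the crude plural-stripping test for "variants of the same word," and the example term "hinge" are all my assumptions.

```python
import re

# Product names taken from the post; everything else here is a hypothetical helper.
PRODUCT_NAMES = {"swaptop", "throwback"}

# Narrowing operators named in the Order: 'and', 'but not', 'w/x'.
NARROWING = re.compile(r"\b(and|but not|w/\d+)\b", re.IGNORECASE)


def stem(word):
    """Crude variant test (an assumption): treat 'sale' and 'sales' as one word."""
    w = word.lower().strip()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def count_terms(request):
    """Count search terms in one request string under the assumed reading.

    Each disjunct ('or') counts separately unless it is a variant of a word
    already counted; a conjunctive combination counts as a single term.
    Parentheses and nesting are not handled in this sketch.
    """
    disjuncts = re.split(r"\s+or\s+", request, flags=re.IGNORECASE)
    seen = set()
    for d in disjuncts:
        key = d.lower() if NARROWING.search(d) else stem(d)
        seen.add(key)
    return len(seen)


def is_indiscriminate(request):
    """True if a product name appears with no narrowing operator at all."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", request)}
    return bool(words & PRODUCT_NAMES) and not NARROWING.search(request)


print(count_terms("sale or sales"))            # variants of one word
print(count_terms("computer or system"))       # disjunctive: two terms
print(count_terms("computer and system"))      # conjunctive: one term
print(is_indiscriminate("Swaptop"))            # bare product name
print(is_indiscriminate("Swaptop and hinge"))  # narrowed product name
```

Under this reading a bare product name is flagged, while the same name joined to another word with "and" is not. Whether "dwg" counts as a variant of "drawing" is a judgment call the sketch does not try to make.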

Defendant’s Keyword Demands

The keywords proposed by defense counsel for plaintiff’s search then included: “Jeep,” “drawing” and its abbreviation “dwg,” “top,” “convertible,” “fabric,” “fold,” “sale or sales,” and the plaintiff’s product names,  “Swaptop” and “Throwback.”

Plaintiff’s counsel advised Judge Whalen that the ten terms created the following results with five custodians (no word on the other three):

  • Joseph Lupo: 30 gigabytes, 118,336 documents.
  • Ryan Evans: 13 gigabytes, 44,373 documents.
  • Tyler Ruby: 10 gigabytes, 44,460 documents.
  • Crystal Muglia: 245,019 documents.
  • Mark Denny: 162,067 documents.

In Footnote Three Judge Whalen adds, without citation to authority or the record, that:

One gigabyte would comprise approximately 678,000 pages of text. 30 gigabytes would represent approximately 21,696,000 pages of text.

Note that Catalyst did a study of average number of files in a gigabyte in 2014. They found that the average number was 2,500 files per gigabyte. They suggest using 3,000 files per gigabyte for cost estimates, just to be safe. So I have to wonder where Judge Whalen got this 678,000 pages of text per gigabyte.
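A quick back-of-the-envelope check makes the puzzle concrete. The sketch below uses only the numbers quoted above (the footnote's pages-per-gigabyte figure and the three custodians for whom both gigabytes and document counts were reported). The variable names are mine, and the implied pages-per-document figure is just what falls out of the arithmetic, not something in the record.

```python
PAGES_PER_GB = 678_000       # figure stated in Footnote Three
FOOTNOTE_TOTAL = 21_696_000  # pages the footnote attributes to 30 gigabytes

# 30 gigabytes at the footnote's own rate.
pages_for_30_gb = 30 * PAGES_PER_GB

# How many gigabytes the footnote's total actually corresponds to.
gb_implied_by_total = FOOTNOTE_TOTAL / PAGES_PER_GB

# Custodians for whom both size and document count were reported: (GB, documents).
custodians = {
    "Joseph Lupo": (30, 118_336),
    "Ryan Evans": (13, 44_373),
    "Tyler Ruby": (10, 44_460),
}

# Documents per gigabyte in this collection, for comparison with Catalyst's 2,500.
docs_per_gb = {name: round(docs / gb) for name, (gb, docs) in custodians.items()}

# Pages per document that the footnote's rate would imply for each custodian.
pages_per_doc = {
    name: gb * PAGES_PER_GB / docs for name, (gb, docs) in custodians.items()
}

print(pages_for_30_gb)
print(gb_implied_by_total)
print(docs_per_gb)
print({name: round(v, 1) for name, v in pages_per_doc.items()})
```

Thirty times 678,000 is 20,340,000, so the footnote's 21,696,000 actually matches 32 gigabytes, not 30. The reported custodians run at roughly 3,400 to 4,400 documents per gigabyte, and at the footnote's rate each document would have to average well over a hundred pages, which is hard to square with an email collection.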

Plaintiff’s counsel added that:

Just a subset of the email discovery requests propounded by BesTop have returned more than 614,000 documents, comprising potentially millions of individual pages for production.

Plaintiff’s counsel also filed an affidavit where he swore that he reviewed the first 100 consecutively numbered documents to evaluate the burden. Very impressive effort. Not! He looked at the first one hundred documents that happened to be on top of a pile of 614,000. He also swore that none of these first one hundred were relevant. (One wonders how many of them were empty PST container files. They are often the “documents” found first in consecutive numbering of an email collection. A better sample might have been to look at the 100 docs with the most hits.)

Judge Whalen Agrees with Plaintiff on Keywords

Judge Whalen agreed with plaintiff and held that:

The majority of defendant’s search terms are overly broad, and in some cases violate the ESI Order on its face. For example, the terms “throwback” and “swap top” refer to Webasto’s product names, which are specifically excluded under 1.3(3) of the ESI Order.

The overbreadth of other terms is obvious, especially in relation to a company that manufactures and sells convertible tops: “top,” “convertible,” “fabric,” “fold,” “sale or sales.” Using “dwg” as an alternate designation for “drawing” (which is itself a rather broad term) would call into play files with common file extension .dwg.

Apart from the obviously impermissible breadth of BesTop’s search terms, their overbreadth is borne out by Mr. Carnevale’s declarations, which detail a return of multiple gigabytes of ESI potentially comprising tens of millions of pages of documents, based on only a partial production. In addition, the search of just the first 100 records produced using BesTop’s search terms revealed that none were related to the issues in this lawsuit. Contrary to BesTop’s contention that Webasto’s claim of prejudice is conclusory, I find that Webasto has sufficiently “articulate[d] specific facts showing clearly defined and serious injury resulting from the discovery sought ….” Nix, 11 Fed.App’x. at 500.

Thus, BesTop’s reliance on City of Seattle v. Professional Basketball Club, LLC, 2008 WL 539809 (W.D. Wash. 2008), is inapposite. In City of Seattle, the defendant offered no facts to support its assertion that discovery would be overly burdensome, instead “merely state[ing] that producing such emails ‘would increase the email universe exponentially[.]’” Id. at *3. In our case, Webasto has proffered hard numbers as to the staggering amount of ESI returned based on BesTop’s search requests. Moreover, while disapproving of conclusory claims of burden, the Court in City of Seattle recognized that the overbreadth of some search terms would be apparent on their face:

“‘[U]nless it is obvious from the wording of the request itself that it is overbroad, vague, ambiguous or unduly burdensome, an objection simply stating so is not sufficiently specific.’” Id., quoting Boeing Co. v. Agric. Ins. Co., 2007 U.S. Dist. LEXIS 90957, *8 (W.D.Wash. Dec. 11, 2007).

As discussed above, many of BesTop’s terms are indeed overly general on their face. And again, propounding Webasto’s product names (e.g., “throwback” and “swap top”) violates the express language of the ESI Order.

Defense Counsel Did Not Cooperate

Judge Whalen then went on to address the apparent lack of cooperation by defendant.

Adversarial discovery practice, particularly in the context of ESI, is anathema to the principles underlying the Federal Rules, particularly Fed.R.Civ.P. 1, which directs that the Rules “be construed, administered, and employed by the court and the parties to secure the just, speedy, and inexpensive determination of every action and proceeding.” In this regard, the Sedona Conference Cooperation Proclamation states:

“Indeed, all stakeholders in the system–judges, lawyers, clients, and the general public–have an interest in establishing a culture of cooperation in the discovery process. Over-contentious discovery is a cost that has outstripped any advantage in the face of ESI and the data deluge. It is not in anyone’s interest to waste resources on unnecessary disputes, and the legal system is strained by ‘gamesmanship’ or ‘hiding the ball,’ to no practical effect.”

The stipulated ESI Order, which controls electronic discovery in this case, is an important step in the right direction, but whether as the result of adversarial overreach or insufficient effort, BesTop’s proposed search terms fall short of what is required under that Order.

Judge Whalen’s Ruling

Judge Whalen concluded his short Order with the following ruling:

For these reasons, Webasto’s motion for protective order [Doc. #78] is GRANTED as follows:

Counsel for the parties will meet and confer in a good-faith effort to focus and narrow BesTop’s search terms to reasonably limit Webasto’s production of ESI to emails relevant (within the meaning of Rule 26) to the issues in this case, and to exclude ESI that would have no relationship to this case.

Following this conference, and within 14 days of the date of this Order, BesTop will submit an amended discovery request with the narrowed search terms.  …

Because BesTop will have the opportunity to reformulate its discovery request to conform to the ESI Order, Webasto’s request for cost-shifting is DENIED at this time. However, the Court may reconsider the issue of cost-shifting if BesTop does not reasonably narrow its requests.

Difficult to Cooperate on Legal Search Without the Help of Experts

The defense in Webasto violated their own stipulation by the use of a party’s product names without further Boolean limiters, such as “product name AND another term.” Then defense counsel added insult to injury by coming across as uncooperative. I don’t know if they alone were uncooperative, or if it was a two-way street, but appearances are everything. The emails between counsel were attached to the motions, and the judge scowled at the defense here, not plaintiff’s counsel. No judge likes attorneys who ignore orders, stipulated or otherwise, and are uncooperative to boot. “Uncooperative” is a label that you should avoid being called by a judge, especially in the world of e-discovery. Better to be an angel for discovery and save the devilish details for motions and trial.

In Webasto Thermo & Comfort v. BesTop, Inc., Judge Whalen struck down the proposed keywords without expert input. Instead Judge Whalen based his order on some incomplete metrics, namely the number of hits produced by the keywords that defense dreamed up. At least Judge Whalen did not go further and order the use of specific keywords as Judge Fashing did in United States v. New Mexico State University. Still, I wish he had not only ordered the parties to cooperate, but also ordered them to bring in some experts to help with the search tasks. You cannot just talk your way into good searches. No matter what the level of cooperation, you still have to know what you are doing.

If I had been handling this for the plaintiff, I would have gotten my hands much dirtier in the digital mud, meaning I would have done far more than just look at the first one hundred of 614,000 documents. That was a poor quality control test, but obviously, here at least, was better than nothing. I would have done a sample review of each keyword and evaluated the precision of each. Some might have been OK as is, although probably not. They usually require some refinement. Sometimes it only takes a few minutes of review to determine that. Bottom line, I would have checked out the requested keywords. There were only ten here. That would take maybe three hours or so with the right software. You do not need big judgmental sampling most of the time to see the effectiveness, or not, of keywords.
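For readers who want to see what a per-keyword check might look like, here is a minimal sketch. It assumes you can export the document IDs hit by each keyword, then draws a simple random sample for review and reports the share found relevant with a Wilson score interval. The function names and the sample size of 100 are my assumptions, not a standard protocol, and real review platforms have their own sampling tools.

```python
import math
import random


def sample_hits(doc_ids, n, seed=0):
    """Draw a simple random sample of up to n hit documents for one keyword."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    ids = list(doc_ids)
    return rng.sample(ids, min(n, len(ids)))


def precision_estimate(relevant, reviewed, z=1.96):
    """Return (point estimate, lower, upper) using a Wilson score interval.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if reviewed == 0:
        raise ValueError("no documents reviewed")
    p = relevant / reviewed
    denom = 1 + z**2 / reviewed
    center = (p + z**2 / (2 * reviewed)) / denom
    half = z * math.sqrt(p * (1 - p) / reviewed + z**2 / (4 * reviewed**2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical usage: IDs 0..613,999 stand in for the 614,000 hit documents.
to_review = sample_hits(range(614_000), 100)
p, low, high = precision_estimate(relevant=0, reviewed=len(to_review))
print(f"precision {p:.1%}, interval {low:.1%} to {high:.1%}")
```

The interval only means something when the sample is random. Zero relevant in a random sample of 100 would suggest precision somewhere below about four percent; zero relevant in the first 100 consecutively numbered documents tells you very little about the rest of the pile.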

The next step is to come up with, and test, a number of keyword refinements based on what you see in the data. Learn from the data. Test and improve various keyword combinations. That can take a few more hours. Some may think this is too much work, but it is far less time than preparing motions, memos and attending hearings. And anyway, you need to find the relevant evidence for your case.

After the tests, you share what you learned with opposing counsel and the judge, assuming they want to know. In my experience, most could not care less about your methods, so long as your production includes the information they were looking for. You do not have to disclose your every little step, but you should at least provide, again if asked, information about “hit results.” This disclosure alone can go a long way, as this opinion demonstrates. Plaintiff’s counsel obtained very little data about the ineffectiveness of the defendant’s proposed search terms, but that was enough to persuade the judge to enter a protective order.

To summarize, after evaluating the proposed search terms I would have improved on them. Using the improved searches I would have begun the attorney review and production. I would have shared the search information, cooperated as required by stipulation, case-law and rules, and gone ahead with my multimodal searches. I would use keywords and the many other wonderful kinds of searches that the Legal Technology industry has come up with in the last 25 years or so since keyword search was new and shiny.

Conclusion

The stipulation the parties used in Webasto could have been used at the turn of the century. Now it seems a little quaint, but alas, suits most inexperienced lawyers today. Anyway, talking about and using keywords is a good way to start a legal search. I sometimes call that Relevancy Dialogues or ESI Communications. Try out some keywords, refine and use them to guide your review, but do not stop there. Try other types of search too. Multimodal. Harness the power of the latest technology, namely AI enhanced search (Predictive Coding). Use statistics too and random sampling to better understand the data prevalence and overall search effectiveness.

If you do not know how to do legal search, and I estimate that 98% of lawyers today do not, then hire an expert. (Or take the time to learn; see, e.g., TARcourse.com.) Your vendor probably has a couple of search experts. There may also be a lawyer in town with this expertise. Now there are even a few specialty law firms that offer these services nationwide. It is a waste of time to reinvent the wheel, plus it is an ethical dictate under Rule 1.1 – Competence, to associate with competent counsel on a legal task when you are not.

Regarding the vendor experts, remember that even though they may be lawyers, they can only go so far. They can only provide technical advice, not legal, such as proportionality analysis under Rule 26, etc. That requires a practicing lawyer who specializes in e-discovery, preferably as a full-time specialty and not just something they do every now and then. If you are in a big firm, like I am, find the expert in your firm who specializes in e-discovery, like me. They will help you. If your firm does not have such an expert, better get one, either that or get used to losing and having your clients complain.

 


Judge Goes Where Angels Fear To Tread: Tells the Parties What Keyword Searches to Use

June 24, 2018

John Facciola was one of the first e-discovery expert judges to consider the adequacy of a producing party’s keyword search efforts in United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008). He first observed that keyword search and other computer assisted legal search techniques required special expertise to do properly. Everyone agrees with that. He then reached an interesting, but still somewhat controversial conclusion: because he lacked such special legal search expertise, and knew full well that most of the lawyers appearing before him did too, he could not properly analyze and compel the use of specific keywords without the help of expert testimony. To help make his point he paraphrased Alexander Pope’s famous line from An Essay on Criticism: “For fools rush in where angels fear to tread.”

Here are the well-known words of Judge Facciola in O’Keefe (emphasis added):

As noted above, defendants protest the search terms the government used.[6] Whether search terms or “keywords” will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics. See George L. Paul & Jason R. Baron, Information Inflation: Can the Legal System Adapt?, 13 RICH. J.L. & TECH. 10 (2007). Indeed, a special project team of the Working Group on Electronic Discovery of the Sedona Conference is studying that subject and their work indicates how difficult this question is. See The Sedona Conference, Best Practices Commentary on the Use of Search and Information Retrieval, 8 THE SEDONA CONF. J. 189 (2008).

Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread. This topic is clearly beyond the ken of a layman and requires that any such conclusion be based on evidence that, for example, meets the criteria of Rule 702 of the Federal Rules of Evidence. Accordingly, if defendants are going to contend that the search terms used by the government were insufficient, they will have to specifically so contend in a motion to compel and their contention must be based on evidence that meets the requirements of Rule 702 of the Federal Rules of Evidence.

Many courts have followed O’Keefe, even though it is a criminal case, and declined to step in and order specific searches without expert input. See, e.g., the well-known patent case, Vasudevan Software, Inc. v. Microstrategy Inc., No. 11-cv-06637-RS-PSG, 2012 U.S. Dist. LEXIS 163654 (N.D. Cal. Nov. 15, 2012). The opinion was by U.S. Magistrate Judge Paul S. Grewal, who later became the V.P. and Deputy General Counsel of Facebook. Judge Grewal wrote:

But as this case makes clear, making those determinations often is no easy task. “There is no magic to the science of search and retrieval: only mathematics, linguistics, and hard work.”[9]

Unfortunately, despite being a topic fraught with traps for the unwary, the parties invite the court to enter this morass of search terms and discovery requests with little more than their arguments.

More recently, e-discovery expert Judge James Francis addressed this issue in Greater New York Taxi Association v. City of New York, No. 13 Civ. 3089 (VSB) (JCF) (S.D.N.Y. Sept. 11, 2017) and held:

The defendants have not provided the necessary expert opinions for me to assess their motion to compel search terms. The application is therefore denied. This leaves the defendants with three options: “They can cooperate [with the plaintiffs] (along with their technical consultants) and attempt to agree on an appropriate set of search criteria. They can refile a motion to compel, supported by expert testimony. Or, they can request the appointment of a neutral consultant who will design a search strategy.”[10] Assured Guaranty Municipal Corp. v. UBS Real Estate Securities Inc., No. 12 Civ. 1579, 2012 WL 5927379, at *4 (S.D.N.Y. Nov. 21, 2012).

I am inclined to agree with Judge Francis. I know from daily experience that legal search, even keyword search, can be very tricky and depends on many factors, including the documents searched. I have spent over a decade working hard to develop expertise in this area. I know that the appropriate searches to be run depend on experience and on scientific, technical knowledge of information retrieval and statistics. They also depend on tests of proposed keywords, on sampling and document reviews, and on getting your hands dirty in the digital mud of the actual ESI. It cannot be done effectively in the blind, no matter what your level of expertise. It is an iterative process of trial and error, false positives and negatives alike.

Enter a Judge Braver Than Angels

Recently appointed U.S. Magistrate Judge Laura Fashing in Albuquerque, New Mexico, heard a case involving a dispute over keywords. United States v. New Mexico State University, No. 1:16-cv-00911-JAP-LF, 2017 WL 4386358 (D.N.M. Sept. 29, 2017). It looks like the attorneys in the case neglected to inform Judge Fashing of United States v. O’Keefe. It is a landmark case in this field, yet was not cited in Judge Fashing’s order. More importantly, Judge Fashing did not take the advice of O’Keefe, nor the many cases that follow it. Unlike Judge Facciola and his angels, she told the parties what keywords to use, even without input from experts.

The New Mexico State University opinion did, however, cite to two other landmark cases in legal search, William A. Gross Const. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009) by Judge Andrew Peck and Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260, 262 (D. Md. May 29, 2008) by Judge Paul Grimm. Judge Fashing held in New Mexico State University:

This case presents the question of how parties should search and produce electronically stored information (“ESI”) in response to discovery requests. “[T]he best solution in the entire area of electronic discovery is cooperation among counsel.” William A. Gross Const. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009). Cooperation prevents lawyers designing keyword searches “in the dark, by the seat of the pants,” without adequate discussion with each other to determine which words would yield the most responsive results. Id.

While keyword searches have long been recognized as appropriate and helpful for ESI search and retrieval, there are well-known limitations and risks associated with them, and proper selection and implementation obviously involves technical, if not scientific knowledge.

* * *

Selection of the appropriate search and information retrieval technique requires careful advance planning by persons qualified to design effective search methodology. The implementation of the methodology selected should be tested for quality assurance; and the party selecting the methodology must be prepared to explain the rationale for the method chosen to the court, demonstrate that it is appropriate for the task, and show that it was properly implemented.

Id. (quoting Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260, 262 (D. Md. May 29, 2008)).

Although NMSU has performed several searches and produced thousands of documents, counsel for NMSU did not adequately confer with the United States before performing the searches, which resulted in searches that were inadequate to reveal all responsive documents. As the government points out, “NMSU alone is responsible for its illogical choices in constructing searches.” Doc. 117-1 at 8. Consequently, which searches will be conducted is left to the Court.

Judges Francis, Peck and Facciola

Judge Laura Fashing had me in the quote above until the final sentence. Up till then she had been wisely following the four great judges in this area, Facciola, Peck, Francis and Grimm. Then in the next several paragraphs she rushes in to specify what search terms should be used for what categories of ESI requested. Why should the Court go ahead and do that without expert advice? Why not wait? Especially since Judge Fashing starts her opinion by recognizing the difficulty of the task, that “there are well-known limitations and risks associated with them [keyword searches], and proper selection and implementation obviously involves technical, if not scientific knowledge.” Knowing that, why was she fearless? Why did she ignore Judge Facciola’s advice? Why did she make multiple detailed, technical decisions on legal search, including specific keywords to be used, without the benefit of expert testimony? Was that foolish as several judges have suggested, or was she just doing her job by making the decisions that the parties asked her to make?

Judge Fashing recognized that she did not have enough facts to make a decision, much less expert opinions based on technical, scientific knowledge, but she went ahead and ruled anyway.

Although NMSU argues that the search terms proposed by the government will return a greater number of non-responsive documents than responsive documents, this is not a particular and specific demonstration of fact, but is, instead, a conclusory argument by counsel. See Velasquez, 229 F.R.D. at 200. NMSU’s motion for a protective order with regard to RFP No. 8 is DENIED.

NMSU will perform a search of the email addresses of all individuals involved in salary-setting for Ms. Harkins and her comparators, including Kathy Agnew and Dorothy Anderson, to include the search terms “Meaghan,” “Harkins,” “Gregory,” or “Fister” for the time period of 2007-2012. If this search results in voluminous documents that are non-responsive, NMSU may further search the results by including terms such as “cross-country,” “track,” “coach,” “salary,” “pay,” “contract,” or “applicants,” or other appropriate terms such as “compensation,” which may reduce the results to those communications most likely relevant to this case, and which would not encompass every “Meaghan” or “Gregory” in the system. However, the Court will require NMSU to work with the USA to design an appropriate search if it seeks to narrow the search beyond the four search terms requested by the United States.

Judge Fashing goes on to make several specific orders on what to do to make a reasonable effort to find relevant evidence:

NMSU will conduct searches of the OIE databases, OIE employee’s email accounts, and the email accounts of all head coaches, sport administrators, HR liaisons working within the Athletics Department, assistant or associate Athletic Directors, and/or Athletic Directors employed by NMSU between 2007 and the present. The USA suggests that NMSU conduct a search for terms that are functionally equivalent to a search for (pay or compensate! or salary) and (discriminat! or fair! or unfair!). Doc. 117-1 at 13. If NMSU cannot search with “Boolean” connectors as suggested, it must search for the terms “pay” or “compensate” or “salary” and “discriminate” or “fair” or “unfair” and the various derivatives of these terms (for example the search would include “compensate” and “compensation”). The parties are to work together to determine what terms will be used to search these databases and email accounts.
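The search the government suggested in that passage, (pay or compensate! or salary) and (discriminat! or fair! or unfair!), can be sketched in a few lines to show how the connectors and the "!" root expander interact. This is only an illustration. I am assuming "!" means "any word beginning with this root," as in the familiar legal research syntax, and the function names are mine, not anything from the order.

```python
import re


def term_to_regex(term):
    """Translate one term: 'fair!' matches words starting with 'fair'; plain terms match whole words."""
    if term.endswith("!"):
        return r"\b" + re.escape(term[:-1]) + r"\w*"
    return r"\b" + re.escape(term) + r"\b"


def any_of(terms):
    """Build a case-insensitive test that is true if any term in the group matches."""
    pattern = re.compile("|".join(term_to_regex(t) for t in terms), re.IGNORECASE)
    return lambda text: bool(pattern.search(text))


pay_group = any_of(["pay", "compensate!", "salary"])
fairness_group = any_of(["discriminat!", "fair!", "unfair!"])


def matches(text):
    """A document hits only if it satisfies both groups (the 'and' connector)."""
    return pay_group(text) and fairness_group(text)


print(matches("Her salary was unfair"))
print(matches("She was compensated fairly"))
print(matches("The payment schedule was fair"))
print(matches("Her compensation was discriminatory"))
```

Under this reading the last example does not hit, because the root "compensate" does not reach "compensation." That may be why the order adds that the fallback search should cover derivatives such as "compensate" and "compensation." It is also a small demonstration of why proposed terms need to be tested against the data.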

Judge Laura Fashing hangs her hat on cooperation, but not on experts. She concludes her order with the following admonishment:

The parties are reminded that:

Electronic discovery requires cooperation between opposing counsel and transparency in all aspects of preservation and production of ESI. Moreover, where counsel are using keyword searches for retrieval of ESI, they at a minimum must carefully craft the appropriate keywords, with input from the ESI’s custodians as to the words and abbreviations they use, and the proposed methodology must be quality control tested to assure accuracy in retrieval and elimination of “false positives.” It is time that the Bar—even those lawyers who did not come of age in the computer era—understand this.

William A. Gross Const. Assocs., Inc., 256 F.R.D. at 136.

Conclusion

Of course I agree with Judge Fashing’s concluding reminder to the parties. Cooperation is key, but so is expertise. There is a good reason for the fear felt by Facciola’s angels. They wisely knew that they lacked the necessary technical, scientific knowledge for the proper selection and implementation of keyword searches. I only wish that Judge Fashing’s order had reminded the parties of this need for experts too. It would have made her job much easier and also helped the parties. Sometimes the wisest thing to do is nothing, at least not until you have more information.

There is widespread agreement among legal search experts on such simplistic methods as keyword search. They would have helped. The same holds true on advanced search methods, such as active machine learning (predictive coding), at least among the elite. See TARcourse.com. There is still some disagreement on TAR methods, especially when you include the many pseudo experts out there. But even they can usually agree on keyword search methods.

I urge the judges and litigants faced with a situation like Judge Fashing had to deal with in New Mexico State University, to consider the three choices set out by Judge Francis in Greater New York Taxi Association:

  1. Cooperation with the other side and their technical consultants to attempt to agree on an appropriate set of search criteria.
  2. Motions supported by expert testimony and facts regarding the search.
  3. Appointment of a neutral consultant who will design a search strategy.

Going it alone with legal search in a complex case is a fool’s errand. Bring in an expert. Spend a little to save a lot. It is not only the smart thing to do, it is also required by ethics. Rule 1.1: Competence, Model Rules of Professional Conduct. The ABA Comment two to Rule 1.1 states that “Competent representation can also be provided through the association of a lawyer of established competence in the field in question.” Yet, in my experience, this is seldom done and is not something that clients are clamoring for. That should change, and quickly, if we are ever to stop wasting so much time and money on simplistic e-discovery arguments. I am again reminded of the great Alexander Pope (1688–1744) and another of his famous lines from An Essay on Criticism.

_______________

 

After I wrote this blog I did a webinar for ACEDS about this topic. Here is a one-hour talk to add to your personal Pierian spring.

 

_________

 

 

 


Disproportionate Keyword Search Demands Defeated by Metric Evidence of Burden

June 10, 2018

The defendant in a complex commercial dispute demanded that plaintiff search its ESI for all files that had the names of four construction projects. Am. Mun. Power, Inc. v. Voith Hydro, Inc. (S.D. Ohio, 6/4/18) (copy of full opinion below). These were the four projects underlying the lawsuit. Defense counsel, like many attorneys today, thought that they had magical powers when it comes to finding electronic evidence. They thought that all, or almost all, of the ESI with these fairly common project names would be relevant or, at the very least, worth examining for relevance. As it turns out, defense counsel was very wrong: most of the docs with keyword hits were not relevant and the demand was unreasonable.

The Municipal Power opinion was written by Chief Magistrate Judge Elizabeth A. Preston Deavers of the U.S. District Court for the Southern District of Ohio. She reached this conclusion based on evidence of burden, what we like to call the project metrics. We do not know the total evidence presented, but we do know that Judge Deavers was impressed by the estimate that the privilege review alone would cost the plaintiff between $100,000 and $125,000. I assume that estimate was based on a linear review of all relevant documents. That is very expensive to do right, especially in large, diverse data sets with high privilege and relevance prevalence. Triple and quadruple checks are common and are built into standard protocols.

Judge Deavers ruled against the defense on the four project names keywords request, and granted a protective order for the plaintiff because, in her words:

The burden and expense of applying the search terms of each Project’s name without additional qualifiers outweighs the benefits of this discovery for Voith and is disproportionate to the needs of even this extremely complicated case.

The plaintiff made its own excessive demand upon defendant to search its ESI using a long list of keywords, including Boolean logic. The plaintiff’s keyword list was much more sophisticated than the defendant’s four-name search demand. The plaintiff’s proposal was rejected by the defendant and the judge for the same proportionality reason. It kind of looks like tit for tat with excessive demands on both sides. But it is hard to say, because the negotiations were apparently focused on mere guessed keywords instead of a process of testing and refining that produces evolved, tested keywords.

Defense counsel responded to the plaintiff’s keyword demands by presenting their own metrics of burden, including the projected costs of redaction of confidential customer information. These confidentiality concerns can be difficult, especially where you are required to redact. Better to agree upon an alternative procedure where you withhold the entire document and log it with a description. This can be a less expensive alternative to redaction.

When reading the opinion below note how the Plaintiff’s opposition to the demand to review all ESI with the four project names gave specific examples of types of documents (ESI) that would have the names on them and still have nothing whatsoever to do with the parties’ claims or defenses, the so-called “false positives.” This is a very important exercise that should not be overlooked in any argument. I have seen some pretty terrible precision percentages, sometimes as low as two percent.
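The false-positive exercise described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample documents, the relevance codes and the qualifier words are all invented here and are not taken from the Municipal Power record, and the helper names hits and precision are hypothetical. The point is simply to run a standalone project name against a hand-coded sample, then run it again with qualifiers, and compare.

```python
# Hypothetical hand-coded sample: (document text, reviewer's relevance call).
# None of these documents come from the actual case record.
SAMPLE = [
    ("Cannelton turbine installation schedule slipped again", True),
    ("Cannelton parking lot signage order", False),
    ("Cannelton facility tour for visitors", False),
    ("Cannelton employee benefits enrollment", False),
    ("Discharge ring weld defect at Cannelton unit 2", True),
    ("Quarterly budget memo", False),
]

def hits(docs, include, require_any=()):
    """Return docs containing `include` and, if qualifiers are given,
    at least one of them (simple case-insensitive substring match)."""
    out = []
    for text, relevant in docs:
        low = text.lower()
        if include.lower() not in low:
            continue
        if require_any and not any(q.lower() in low for q in require_any):
            continue
        out.append((text, relevant))
    return out

def precision(hit_list):
    """Share of hits that the reviewer coded relevant (0.0 if no hits)."""
    if not hit_list:
        return 0.0
    return sum(1 for _, rel in hit_list if rel) / len(hit_list)

# Standalone project name versus the same name with assumed qualifiers.
broad = hits(SAMPLE, "cannelton")
narrow = hits(SAMPLE, "cannelton", require_any=("turbine", "weld", "unit"))

for label, result in (("standalone", broad), ("qualified", narrow)):
    false_positives = [text for text, rel in result if not rel]
    print(label, len(result), "hits, precision", precision(result))
    print("  false positives:", false_positives)
```

On a real project the sample would be a random draw from the actual keyword hits, coded by a reviewer, and the list of false positives is what you would show the court as concrete examples of irrelevant hits.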

Get your hands in the digital mud. Go deep into TAR if you need to. It is where the time warps happen and we bend space and time to attain maximum efficiency. Our goal is to attain: (1) the highest possible review speeds (files per hr), both hybrid and human; (2) the highest precision (% of retrieved docs that are relevant); and, (3) the countervailing goal of total recall (% of all relevant docs that are found). The recall goal is typically given the greatest weight, with emphasis on highly relevant. The question is how much greater weight to give recall and that depends on the total facts and circumstances of the doc review project.
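For readers who want the three goals as arithmetic, here is a minimal sketch. The function name review_metrics and every number in the example are assumptions made up for illustration; in a real project the total count of relevant documents is never known exactly and has to be estimated by sampling.

```python
def review_metrics(relevant_found, docs_retrieved, relevant_total, hours):
    """Return (precision, recall, files_per_hour) for one review pass.

    precision      = relevant docs found / docs retrieved and reviewed
    recall         = relevant docs found / all relevant docs in the collection
    files_per_hour = docs reviewed / reviewer hours spent
    """
    if docs_retrieved <= 0 or relevant_total <= 0 or hours <= 0:
        raise ValueError("retrieved, relevant_total and hours must be positive")
    precision = relevant_found / docs_retrieved
    recall = relevant_found / relevant_total
    files_per_hour = docs_retrieved / hours
    return precision, recall, files_per_hour

# Hypothetical project: a keyword pulls 10,000 docs, only 200 are relevant,
# the collection is estimated to hold 250 relevant docs, review took 100 hours.
p, r, speed = review_metrics(
    relevant_found=200, docs_retrieved=10_000, relevant_total=250, hours=100
)
print(f"precision {p:.1%}, recall {r:.1%}, speed {speed:.0f} files/hr")
```

In this made-up example recall looks respectable while precision is stuck at two percent, which is the kind of trade-off that has to be weighed against the facts and circumstances of the particular project.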

Keywords are the Model T of legal search, but we all start there. Keyword search is still a very important skill for everyone to learn before moving on to other techniques, especially active machine learning.

In some simple projects it can still be effective, especially if the user is highly skilled and the data is simple. It also helps if the data is well known to the searcher from earlier projects. See TAR Course: 8th Class (Keyword and Linear Review).

________________________

Below is the unedited full opinion (very short). We look forward to more good opinions by Judge Deavers on e-discovery.

__________

UNITED STATES DISTRICT COURT FOR THE SOUTHERN DISTRICT OF OHIO, EASTERN DIVISION. No. 2:17-cv-708

June 4, 2018

AMERICAN MUNICIPAL POWER, INC., Plaintiff, vs. VOITH HYDRO, INC., Defendant.

ELIZABETH A. PRESTON DEAVERS, UNITED STATES MAGISTRATE JUDGE. Judge Algenon L. Marbley.

MEMORANDUM OF DECISION

This matter came before the Court for a discovery conference on May 24, 2018. Counsel for both parties appeared and participated in the conference.

The parties provided extensive letter briefing regarding certain discovery disputes relating to the production of Electronically Stored Information (“ESI”) and other documents. Specifically, the parties’ dispute centers around two ESI-related issues: (1) the propriety of a single-word search by Project name proposed by Defendant Voith Hydro, Inc. (“Voith”) which it seeks to have applied to American Municipal Power, Inc.’s (“AMP”) ESI; 1 and (2) the propriety of AMP’s request that Voith run crafted search terms which AMP has proposed that are not limited to the Project’s name. 2 After careful consideration of the parties’ letter briefing and their arguments during the discovery conference, the Court concluded as follows:

  • Voith’s single-word Project name search terms are over-inclusive. AMP’s position as the owner of the power-plant Projects puts it in a different situation than Voith in terms of how many ESI “hits” searching by Project name would return. As owner, AMP has stored millions of documents for more than a decade that contain the name of the Projects which refer to all kinds of matters unrelated to this case. Searching by Project name, therefore, would yield a significant amount of discovery that has no bearing on the construction of the power plants or Voith’s involvement in it, including but not limited to documents related to real property acquisitions, licensing, employee benefits, facility tours, parking lot signage, etc. While searching by the individual Project’s name would yield extensive information related to the name of the Project, it would not necessarily bear on or be relevant to the construction of the four hydroelectric power plants, which are the subject of this litigation. AMP has demonstrated that using a single-word search by Project name would significantly increase the cost of discovery in this case, including a privilege review that would add $100,000 – $125,000 to its cost of production. The burden and expense of applying the search terms of each Project’s name without additional qualifiers outweighs the benefits of this discovery for Voith and is disproportionate to the needs of even this extremely complicated case.
  • AMP’s request that Voith search its ESI collection without reference to the Project names by using as search terms including various employee and contractor names together with a list of common construction terms and the names of hydroelectric parts is overly inclusive and would yield confidential communications about other projects Voith performed for other customers. Voith employees work on and communicate regarding many customers at any one time. AMPs proposal to search terms limited to certain date ranges does not remedy the issue because those employees still would have sent and received communications about other projects during the times in which they were engaged in work related to AMP’s Projects. Similarly, AMP’s proposal to exclude the names of other customers’ project names with “AND NOT” phrases is unworkable because Voith cannot reasonably identify all the projects from around the world with which its employees were involved during the decade they were engaged in work for AMP on the Projects. Voith has demonstrated that using the terms proposed by AMP without connecting them to the names of the Projects would return thousands of documents that are not related to this litigation. The burden on Voith of running AMP’s proposed search terms connected to the names of individual employees and general construction terms outweighs the possibility that the searches would generate hits that are relevant to this case. Moreover, running the searches AMP proposes would impose on Voith the substantial and expensive burden of manually reviewing the ESI page by page to ensure that it does not disclose confidential and sensitive information of other customers. The request is therefore overly burdensome and not proportional to the needs of the case.

1 Voith seeks to have AMP use the names of the four hydroelectric projects at issue in this case (Cannelton, Smithland, Willow and Meldahl) as standalone search terms without qualifiers across all of AMP’s ESI. AMP proposed and has begun collecting from searches with numerous multiple-word search terms using Boolean connectors. AMP did not include the name of each Project as a standalone term.

2 AMP contends that if Voith connects all its searches together with the Project name, it will not capture relevant internal-Voith ESI relating to the construction claims and defenses in the case. AMP asserts Voith may have some internal documents that relate to the construction projects that do not refer to the Project by name, and included three (3) emails with these criteria it had discovered as exemplars. AMP proposes that Voith search its ESI collection without reference to the Project names by using as search terms including various employee and contractor names together with a list of generic construction terms and the names of hydroelectric parts.

IT IS SO ORDERED.

DATED: June 4, 2018

/s/ Elizabeth A. Preston Deavers

ELIZABETH A. PRESTON DEAVERS

UNITED STATES MAGISTRATE JUDGE

 

 


e-Discovery and Poetry on a Rainy Night in Portugal

April 17, 2018

From time to time I like to read poetry. Lately it has been the poetry of Billy Collins, a neighbor and famous friend. (He was the Poet Laureate of the United States from 2001 to 2003.) I have been reading his latest book recently, The Rain in Portugal. Billy’s comedic touches balance the heavy parts. Brilliant poet. I selected one poem from this book to write about here, The Five Spot, 1964. It has a couple of obvious e-discovery parallels. It also mentions a musician I had never heard of before, Roland Kirk, who was a genius at musical multi-tasking. Enjoy the poem and videos that follow. There is even a lesson here on e-discovery.

The Five Spot, 1964

There’s always a lesson to be learned
whether in a hotel bar
or over tea in a teahouse,
no matter which way it goes,
for you or against,
what you want to hear or what you don’t.

Seeing Roland Kirk, for example,
with two then three saxophones
in his mouth at once
and a kazoo, no less,
hanging from his neck at the ready.

Even in my youth I saw this
not as a lesson in keeping busy
with one thing or another,
but as a joyous impossible lesson
in how to do it all at once,

pleasing and displeasing yourself
with harmony here and discord there.
But what else did I know
as the waitress lit the candle
on my round table in the dark?
What did I know about anything?

Billy Collins

The famous musician in this poem is Rahsaan Roland Kirk (August 7, 1935 – December 5, 1977). Kirk was an American jazz multi-instrumentalist who played tenor saxophone, flute, and many other instruments. He was renowned for his onstage vitality, during which virtuoso improvisation was accompanied by comic banter, political ranting, and, as mentioned, the astounding ability to simultaneously play several musical instruments.

Here is a video of Roland Kirk with his intense multimodal approach to music.

One more Kirk video. What a character.

____

The Law

There are a few statements in Billy Collins’ Five Spot poem that have obvious applications to legal discovery, such as “There’s always a lesson to be learned … no matter which way it goes, for you or against, what you want to hear or what you don’t.” We are all trained to follow the facts, the trails, wherever they may lead, pro or con.

I do not say either pro or con “my case” because it is not. It is my client’s case. Clients pay lawyers for their knowledge, skill and independent advice. Although lawyers like to hear evidence that supports their client’s positions and recollections, after all it makes their job easier, they also want to hear evidence that goes against their client. They want to hear all sides of a story and understand what it means. They look at everything to craft a reasonable story for judge and jury.

Almost all cases have good and bad evidence on both sides. There is usually some merit to each side’s positions. Experienced lawyers look for the truth and present it in the light most favorable to their client. The Rules of Procedure and duties to the court and client require this too.

Bottom line for all e-discovery professionals is that you learn the lessons taught by the parties’ notes and documents, all of the lessons, good and bad.

The poem calls this a “… joyous impossible lesson in how to do it all at once, pleasing and displeasing yourself with harmony here and discord there.” All lawyers know this place, this joyless lesson of discovering the holes in your client’s case. As far as the “do it all at once” phrase, this too is very familiar to any e-discovery professional. If it is done right, at the beginning of a case, the activity is fast and furious. Kind of like a Roland Kirk solo, but without Roland’s exuberance.

Everybody knows that the many tasks of e-discovery must be done quickly and pretty much all at once at the beginning of a case: preservation notices, witness interviews, ESI collection, processing and review. The list goes on and on. Yet, in spite of this knowledge, most everyone still treats e-discovery as if they had bags of time to do it. Which brings me to another Billy Collins poem that I like:

BAGS OF TIME

When the keeper of the inn
where we stayed in the Outer Hebrides
said we had bags of time to catch the ferry,
which we would reach by traversing the causeway
between this island and the one to the north,

I started wondering what a bag of time
might look like and how much one could hold.
Apparently, more than enough time for me
to wonder about such things,
I heard someone shouting from the back of my head.

Then the ferry arrived, silent across the water,
at the Lochmaddy Ferry Terminal,
and I was still thinking about the bags of time
as I inched the car clanging onto the slipway
then down into the hold for the vehicles.

Yet it wasn’t until I stood at the railing
of the upper deck with a view of the harbor
that I decided that a bag of time
should be the same color as the pale blue
hull of the lone sailboat anchored there.

And then we were in motion, drawing back
from the pier and turning toward the sea
as ferries had done for many bags of time,
I gathered from talking to an old deckhand,
who was decked out in a neon yellow safety vest,

and usually on schedule, he added,
unless the weather has something to say about it.

Conclusion

Take time out to relax and let yourself ponder the works of a poet. We have bags of time in our life for that. Poetry is liable to make you a better person and a better lawyer.

I leave you with two videos of poetry readings by Billy Collins, the first at the Obama White House. He is by far my favorite contemporary poet. Look for some of his poems on dogs and cats. They are especially good for any pet lovers like me.

One More Billy Collins video.

 

