In this blog we discuss yet another case where the parties bickered over keywords and the judge was asked to intervene: Webasto Thermo & Comfort v. BesTop, Inc., 2018 WL 3198544, No. 16-13456 (E.D. Mich. June 29, 2018). The opinion was written in a patent case in Detroit by Executive Magistrate Judge R. Steven Whalen. He looked at the proposed keywords and found them wanting, but wisely refused to go further and tell the parties what keywords to use. Well done, Judge Whalen!
This case is similar to the one discussed in my last blog, Judge Goes Where Angels Fear To Tread: Tells the Parties What Keyword Searches to Use, where Magistrate Judge Laura Fashing in Albuquerque was asked to resolve a keyword dispute in United States v. New Mexico State University, No. 1:16-cv-00911-JAP-LF, 2017 WL 4386358 (D.N.M. Sept. 29, 2017). Judge Fashing not only found the proposed keywords inadequate, but came up with her own replacement keywords and did so without any expert input.
In my prior blog on Judge Fashing’s decision I discussed Judge John Facciola’s landmark legal search opinion in United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008) and other cases that follow it. In O’Keefe Judge Facciola held that because keyword search involves complex, technical, scientific questions, a judge should not decide such issues without the help of expert testimony. That is the context for his famous line:
Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread. This topic is clearly beyond the ken of a layman and requires that any such conclusion be based on evidence that, for example, meets the criteria of Rule 702 of the Federal Rules of Evidence.
In this week’s blog I consider the opinion by Judge Whalen in Webasto Thermo & Comfort v. BesTop, Inc., 2018 WL 3198544, No. 16-13456 (E.D. Mich. June 29, 2018), where he told the parties what keywords not to use, again without expert input, but stopped there. The two make interesting counterpoint cases. It is also interesting to observe that in all three cases, O’Keefe, New Mexico State University and Webasto, the judges end on the same note: the parties are ordered to cooperate. Ah, if it were only so easy.
Stipulated Order Governing ESI Production
In Webasto Thermo & Comfort v. BesTop, Inc., the parties cooperated at the beginning of the case. They agreed to the entry of a stipulated ESI Order governing ESI production. The stipulation included a cooperation paragraph where the parties pledged to try to resolve all ESI issues without judicial intervention. Apparently, the parties’ cooperation did not go much beyond the stipulated order. Cooperation broke down and the plaintiff filed a barrage of motions to avoid having to do document review, including an Emergency Motion to Stay ESI Discovery. The plaintiff alleged that the defendant violated the ESI stipulation by “propounding overly broad search terms in its request for ESI.” Oh, how terrible. Red Alert!
Plaintiff further accused defense counsel of “propounding prima facie inappropriate search criteria, and refusal to work in good faith to target its search terms to specific issues in this case.” Again, the outrageous behavior reminds me of the Romulans. I can see why plaintiff’s counsel declared an emergency and asked for costs and relief from having to produce any ESI at all. That kind of approach rarely goes over well with any judge, but here it worked. That’s because the keywords the defense wanted plaintiff to use in its search for relevant ESI were, in fact, very bad.
Paragraph 1.3(3) of the ESI Order establishes a protocol designed to constrain e-discovery, including a limitation to eight custodians with no more than ten keyword search terms for each. It goes on to provide the following very interesting provision:
The search terms shall be narrowly tailored to particular issues. Indiscriminate terms, such as the producing company’s name or its product name, are inappropriate unless combined with narrowing search criteria that significantly reduce the risk of overproduction. A conjunctive combination of multiple words or phrases (e.g. ‘computer’ and ‘system’) narrows the search and shall count as a single term. A disjunctive combination of multiple words or phrases (e.g. ‘computer’ or ‘system’) broadens the search, and thus each word or phrase shall count as a separate search term unless they are variants of the same word. Use of narrowing search criteria (e.g. ‘and,’ ‘but not,’ ‘w/x’) is encouraged to limit the production and shall be considered when determining whether to shift costs for disproportionate discovery.
Remember, this is negotiated wording that the parties agreed to, including the bit about product names and “conjunctive combination.”
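The counting rule in that paragraph is easy to misread, so here is a minimal sketch of how I understand it. The data model is my own assumption and not part of the ESI Order: a request is a list of OR-ed alternatives, each alternative is a pair of AND-ed words plus an optional label, and alternatives that share a label are treated as variants of the same word. The function name `count_terms` and the 'system'/'systems' variant pair are hypothetical.

```python
def count_terms(request):
    """Count search terms in one request under the rule sketched above.

    `request` is a list of OR-ed alternatives. Each alternative is a pair
    (and_words, variant_of):
      - and_words: tuple of words joined by AND; counts as ONE term
      - variant_of: None, or a label shared by variants of the same word
    """
    seen_variants = set()
    total = 0
    for and_words, variant_of in request:
        if variant_of is None:
            total += 1                      # each distinct OR-alternative is a term
        elif variant_of not in seen_variants:
            seen_variants.add(variant_of)   # variants of one word count once
            total += 1
    return total


# 'computer' AND 'system': a conjunctive combination, one term
conjunctive = [(("computer", "system"), None)]
# 'computer' OR 'system': a disjunctive combination, two terms
disjunctive = [(("computer",), None), (("system",), None)]
# 'system' OR 'systems': variants of the same word (hypothetical example), one term
variants = [(("system",), "system"), (("systems",), "system")]

for name, request in [("conjunctive", conjunctive),
                      ("disjunctive", disjunctive),
                      ("variants", variants)]:
    print(name, count_terms(request))
```

Whether two words really are "variants of the same word" is a judgment call the parties would have to make; the sketch only shows how the tally changes once that call is made.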
Defendant’s Keyword Demands
The keywords proposed by defense counsel for plaintiff’s search then included: “Jeep,” “drawing” and its abbreviation “dwg,” “top,” “convertible,” “fabric,” “fold,” “sale or sales,” and the plaintiff’s product names, “Swaptop” and “Throwback.”
Plaintiff’s counsel advised Judge Whalen that the ten terms created the following results with five custodians (no word on the other three):
- Joseph Lupo: 30 gigabytes, 118,336 documents.
- Ryan Evans: 13 gigabytes, 44,373 documents.
- Tyler Ruby: 10 gigabytes, 44,460 documents.
- Crystal Muglia: 245,019 documents.
- Mark Denny: 162,067 documents.
One gigabyte would comprise approximately 678,000 pages of text. 30 gigabytes would represent approximately 21,696,000 pages of text.
Note that Catalyst did a study in 2014 of the average number of files in a gigabyte. They found that the average was 2,500 files per gigabyte. They suggest using 3,000 files per gigabyte for cost estimates, just to be safe. So I have to wonder where Judge Whalen got this figure of 678,000 pages of text per gigabyte.
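A quick back-of-the-envelope check shows why the page figures deserve a second look. The sketch below uses only numbers quoted above (the per-gigabyte page figure, the 30-gigabyte page total, the Catalyst file counts, and the three custodians for whom both gigabytes and document counts were reported). The variable names are mine, and the pages-per-document result is just what the quoted figures imply, not a measurement.

```python
PAGES_PER_GB = 678_000             # pages-per-gigabyte figure quoted above
QUOTED_PAGES_FOR_30_GB = 21_696_000  # page total quoted above for 30 gigabytes
CATALYST_FILES_PER_GB = (2_500, 3_000)  # Catalyst average and "safe" estimate

# What 30 GB works out to at the quoted rate, and how many GB the quoted total implies
pages_30_gb = 30 * PAGES_PER_GB
implied_gb = QUOTED_PAGES_FOR_30_GB / PAGES_PER_GB
print(f"30 GB at the quoted rate: {pages_30_gb:,} pages")
print(f"The quoted total corresponds to {implied_gb:g} GB, not 30")

# Custodians reported with both size (GB) and document count
custodians = {
    "Joseph Lupo": (30, 118_336),
    "Ryan Evans": (13, 44_373),
    "Tyler Ruby": (10, 44_460),
}
docs_per_gb = {name: docs / gb for name, (gb, docs) in custodians.items()}
for name, rate in docs_per_gb.items():
    implied_pages_per_doc = PAGES_PER_GB / rate
    print(f"{name}: {rate:,.0f} documents per GB, "
          f"implying about {implied_pages_per_doc:,.0f} pages per document")
```

The documents-per-gigabyte rates for these three custodians land in the same general range as the Catalyst file counts, while the page figure would imply well over a hundred pages per document. Keep in mind that pages and files are different units, so this is a sanity check and nothing more.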
Plaintiff’s counsel added that:
Just a subset of the email discovery requests propounded by BesTop have returned more than 614,000 documents, comprising potentially millions of individual pages for production.
Plaintiff’s counsel also filed an affidavit where he swore that he reviewed the first 100 consecutively numbered documents to evaluate the burden. Very impressive effort. Not! He looked at the first one hundred documents that happened to be on top of a pile of 614,000. He also swore that none of these first one hundred were relevant. (One wonders how many of them were empty PST container files. They are often the “documents” found first in consecutive numbering of an email collection. A better sample might have been to look at the 100 docs with the most hits.)
Judge Whalen Agrees with Plaintiff on Keywords
Judge Whalen agreed with plaintiff and held that:
The majority of defendant’s search terms are overly broad, and in some cases violate the ESI Order on its face. For example, the terms “throwback” and “swap top” refer to Webasto’s product names, which are specifically excluded under 1.3(3) of the ESI Order.
The overbreadth of other terms is obvious, especially in relation to a company that manufactures and sells convertible tops: “top,” “convertible,” “fabric,” “fold,” “sale or sales.” Using “dwg” as an alternate designation for “drawing” (which is itself a rather broad term) would call into play files with common file extension .dwg.
Apart from the obviously impermissible breadth of BesTop’s search terms, their overbreadth is borne out by Mr. Carnevale’s declarations, which detail a return of multiple gigabytes of ESI potentially comprising tens of millions of pages of documents, based on only a partial production. In addition, the search of just the first 100 records produced using BesTop’s search terms revealed that none were related to the issues in this lawsuit. Contrary to BesTop’s contention that Webasto’s claim of prejudice is conclusory, I find that Webasto has sufficiently “articulate[d] specific facts showing clearly defined and serious injury resulting from the discovery sought ….” Nix, 11 Fed.App’x. at 500.
Thus, BesTop’s reliance on City of Seattle v. Professional Basketball Club, LLC, 2008 WL 539809 (W.D. Wash. 2008), is inapposite. In City of Seattle, the defendant offered no facts to support its assertion that discovery would be overly burdensome, instead “merely state[ing] that producing such emails ‘would increase the email universe exponentially[.]’” Id. at *3. In our case, Webasto has proffered hard numbers as to the staggering amount of ESI returned based on BesTop’s search requests. Moreover, while disapproving of conclusory claims of burden, the Court in City of Seattle recognized that the overbreadth of some search terms would be apparent on their face:
“‘[U]nless it is obvious from the wording of the request itself that it is overbroad, vague, ambiguous or unduly burdensome, an objection simply stating so is not sufficiently specific.’” Id., quoting Boeing Co. v. Agric. Ins. Co., 2007 U.S. Dist. LEXIS 90957, *8 (W.D.Wash. Dec. 11, 2007).
As discussed above, many of BesTop’s terms are indeed overly general on their face. And again, propounding Webasto’s product names (e.g., “throwback” and “swap top”) violates the express language of the ESI Order.
Defense Counsel Did Not Cooperate
Judge Whalen then went on to address the apparent lack of cooperation by defendant.
Adversarial discovery practice, particularly in the context of ESI, is anathema to the principles underlying the Federal Rules, particularly Fed.R.Civ.P. 1, which directs that the Rules “be construed, administered, and employed by the court and the parties to secure the just, speedy, and inexpensive determination of every action and proceeding.” In this regard, the Sedona Conference Cooperation Proclamation states:
“Indeed, all stakeholders in the system–judges, lawyers, clients, and the general public–have an interest in establishing a culture of cooperation in the discovery process. Over-contentious discovery is a cost that has outstripped any advantage in the face of ESI and the data deluge. It is not in anyone’s interest to waste resources on unnecessary disputes, and the legal system is strained by ‘gamesmanship’ or ‘hiding the ball,’ to no practical effect.”
The stipulated ESI Order, which controls electronic discovery in this case, is an important step in the right direction, but whether as the result of adversarial overreach or insufficient effort, BesTop’s proposed search terms fall short of what is required under that Order.
Judge Whalen’s Ruling
Judge Whalen concluded his short Order with the following ruling:
For these reasons, Webasto’s motion for protective order [Doc. #78] is GRANTED as follows:
Counsel for the parties will meet and confer in a good-faith effort to focus and narrow BesTop’s search terms to reasonably limit Webasto’s production of ESI to emails relevant (within the meaning of Rule 26) to the issues in this case, and to exclude ESI that would have no relationship to this case.
Following this conference, and within 14 days of the date of this Order, BesTop will submit an amended discovery request with the narrowed search terms. …
Because BesTop will have the opportunity to reformulate its discovery request to conform to the ESI Order, Webasto’s request for cost-shifting is DENIED at this time. However, the Court may reconsider the issue of cost-shifting if BesTop does not reasonably narrow its requests.
Difficult to Cooperate on Legal Search Without the Help of Experts
The defense in Webasto violated their own stipulation by the use of a party’s product names without further Boolean limiters, such as “product name AND another term.” Then defense counsel added insult to injury by coming across as uncooperative. I don’t know if they alone were uncooperative, or if it was a two-way street, but appearances are everything. The emails between counsel were attached to the motions, and the judge scowled at the defense here, not plaintiff’s counsel. No judge likes attorneys who ignore orders, stipulated or otherwise, and are uncooperative to boot. “Uncooperative” is a label that you should avoid being called by a judge, especially in the world of e-discovery. Better to be an angel for discovery and save the devilish details for motions and trial.
In Webasto Thermo & Comfort v. BesTop, Inc., Judge Whalen struck down the proposed keywords without expert input. Instead Judge Whalen based his order on some incomplete metrics, namely the number of hits produced by the keywords that the defense dreamed up. At least Judge Whalen did not go further and order the use of specific keywords as Judge Fashing did in United States v. New Mexico State University. Still, I wish he had not only ordered the parties to cooperate, but also ordered them to bring in some experts to help with the search tasks. You cannot just talk your way into good searches. No matter what the level of cooperation, you still have to know what you are doing.
If I had been handling this for the plaintiff, I would have gotten my hands much dirtier in the digital mud, meaning I would have done far more than just look at the first one hundred of 614,000 documents. That was a poor quality control test, but obviously, here at least, it was better than nothing. I would have done a sample review of each keyword and evaluated the precision of each. Some might have been OK as is, although probably not. They usually require some refinement. Sometimes it only takes a few minutes of review to determine that. Bottom line, I would have checked out the requested keywords. There were only ten here. That would take maybe three hours or so with the right software. You do not need big judgmental samples most of the time to see the effectiveness, or not, of keywords.
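For readers who want to see what a per-keyword check looks like in practice, here is a minimal sketch. It swaps in a simple random sample of each keyword's hits for the "first 100 documents" approach, and estimates precision as the share of sampled hits a reviewer marks relevant. Everything in it is an assumption for illustration: the function name `sample_precision`, the toy document IDs, the search `swaptop AND hinge`, and the set `relevant_ids`, which stands in for an attorney's relevance calls.

```python
import random


def sample_precision(hit_ids, is_relevant, sample_size=100, seed=0):
    """Estimate a keyword's precision from a simple random sample of its hits.

    Returns (precision, number_reviewed), or (None, 0) if there are no hits.
    """
    if not hit_ids:
        return None, 0
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    sample = rng.sample(list(hit_ids), min(sample_size, len(hit_ids)))
    relevant = sum(1 for doc_id in sample if is_relevant(doc_id))
    return relevant / len(sample), len(sample)


# Toy data, invented for illustration: document IDs hit by each search.
hits_by_keyword = {
    "top": list(range(5000)),               # broad term, many hits
    "swaptop AND hinge": list(range(200)),  # hypothetical narrowed search
}
# Stand-in for attorney review: pretend documents 0-149 are the relevant ones.
relevant_ids = set(range(150))

for keyword, hit_ids in hits_by_keyword.items():
    precision, reviewed = sample_precision(hit_ids, relevant_ids.__contains__)
    print(f"{keyword!r}: {precision:.0%} of {reviewed} sampled hits were relevant")
```

In real life the `is_relevant` step is a person reading documents, and the review platform draws the sample. The point is only that a small random sample per keyword gives you a number you can show opposing counsel and the judge.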
The next step is to come up with, and test, a number of keyword refinements based on what you see in the data. Learn from the data. Test and improve various keyword combinations. That can take a few more hours. Some may think this is too much work, but it is far less time than preparing motions, memos and attending hearings. And anyway, you need to find the relevant evidence for your case.
After the tests, you share what you learned with opposing counsel and the judge, assuming they want to know. In my experience, most could not care less about your methods, so long as your production includes the information they were looking for. You do not have to disclose your every little step, but you should at least share, again if asked, information about “hit results.” This disclosure alone can go a long way, as this opinion demonstrates. Plaintiff’s counsel obtained very little data about the ineffectiveness of the defendant’s proposed search terms, but that was enough to persuade the judge to enter a protective order.
To summarize, after evaluating the proposed search terms I would have improved on them. Using the improved searches I would have begun the attorney review and production. I would have shared the search information, cooperated as required by stipulation, case law and rules, and gone ahead with my multimodal searches. I would use keywords and the many other wonderful kinds of searches that the Legal Technology industry has come up with in the last 25 years or so since keyword search was new and shiny.
The stipulation the parties used in Webasto could have been used at the turn of the century. Now it seems a little quaint, but alas, suits most inexperienced lawyers today. Anyway, talking about and using keywords is a good way to start a legal search. I sometimes call that Relevancy Dialogues or ESI Communications. Try out some keywords, refine and use them to guide your review, but do not stop there. Try other types of search too. Multimodal. Harness the power of the latest technology, namely AI enhanced search (Predictive Coding). Use statistics too and random sampling to better understand the data prevalence and overall search effectiveness.
If you do not know how to do legal search, and I estimate that 98% of lawyers today do not, then hire an expert. (Or take the time to learn, see, e.g., TARcourse.com.) Your vendor probably has a couple of search experts. There may also be a lawyer in town with this expertise. Now there are even a few specialty law firms that offer these services nationwide. It is a waste of time to reinvent the wheel, plus it is an ethical dictate under Rule 1.1 – Competence, to associate with competent counsel on a legal task when you are not.
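To make the random sampling point concrete, here is a minimal sketch of a prevalence estimate. It assumes you drew a simple random sample from the whole collection and had it reviewed, then puts a Wilson score interval around the observed rate. The sample size and the number of relevant documents are hypothetical; only the 614,000 collection size comes from the case. The function name `wilson_interval` is mine.

```python
import math


def wilson_interval(relevant, sample_size, z=1.96):
    """Wilson score interval for a proportion; z=1.96 is roughly 95% confidence."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = relevant / sample_size
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


COLLECTION_SIZE = 614_000  # document count mentioned in the case
SAMPLE_SIZE = 1_500        # hypothetical random sample of the whole collection
RELEVANT_IN_SAMPLE = 45    # hypothetical review result

low, high = wilson_interval(RELEVANT_IN_SAMPLE, SAMPLE_SIZE)
print(f"Observed prevalence: {RELEVANT_IN_SAMPLE / SAMPLE_SIZE:.1%} "
      f"(interval about {low:.1%} to {high:.1%})")
print(f"Projected relevant documents: "
      f"{low * COLLECTION_SIZE:,.0f} to {high * COLLECTION_SIZE:,.0f}")
```

The interval only means something if the sample is truly random. The first one hundred consecutively numbered documents are not a random sample, which is why that test tells you so little.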
Regarding the vendor experts, remember that even though they may be lawyers, they can only go so far. They can only provide technical advice, not legal, such as proportionality analysis under Rule 26, etc. That requires a practicing lawyer who specializes in e-discovery, preferably as a full-time specialty and not just something they do every now and then. If you are in a big firm, like I am, find the expert in your firm who specializes in e-discovery, like me. They will help you. If your firm does not have such an expert, better get one, either that or get used to losing and having your clients complain.