Project Cost Estimation Is Key To Opposing ESI Discovery as Disproportionately Burdensome Under Rule 26(b)(1)

May 6, 2018

If you are opposing ESI discovery as over-burdensome under Rule 26(b)(1), then you MUST provide evidence of the economic burden of the requested review. You cannot just say it is over-burdensome. Even if it seems obvious, you must provide some metrics, some data, some hard evidence to back that up. That requires the ability to estimate the costs and burdens involved in a document review. In the old days, the nineties, almost every litigator could estimate the cost of a paper review. It was not a tough skill. But today, where large volumes of ESI are common, everything is much more complicated. Today you need an expert to accurately and reliably estimate the costs of various types of ESI reviews.

Requiring proof of burden is nothing new to the law, yet most lawyers today need outside help to do it, especially in large ESI projects. For example, consider the defense team of lawyers representing the City of Chicago and other defendants in a major civil rights case with lots of press, Mann v. City of Chicago, Nos. 15 CV 9197, 13 CV 4531, (N.D. Ill. Sept. 8, 2017); Chicago sued for ‘unconstitutional and torturous’ Homan Square police abuse (The Guardian, 10/19/15). They did not even attempt to estimate the costs of the review they opposed. They also failed or refused to hire an expert who could do that for them. Since they had no evidence, not even an estimate, their argument under Rule 26(b)(1) failed miserably.

Mann v. City of Chicago: Case Background

The background of the case is interesting, but I won’t go into the fact details here; just enough to set up the discovery dispute. Plaintiffs in later consolidated cases sued the City of Chicago and the Chicago police alleging that they had been wrongfully arrested, detained and abused at “off the books” detention centers without access to an attorney. Aside from the salacious allegations, it does not look like the plaintiffs have a strong case. It looks like a fishing expedition to me, in more ways than one as I will explain. With this background, it seems to me that if defendants had made any real effort to prove burden here, they could have prevailed on this discovery dispute.

The parties agreed on the majority of custodians whose ESI would be searched, but, as usual, the plaintiffs wanted more custodians searched, including that of the mayor himself, Rahm Emanuel. The defendants did not want to include the mayor’s email in the review. They argued, without any real facts showing burden, that the Mayor’s email would be irrelevant (a dubious argument that seemed to be a throw-away) and too burdensome (their real argument).

Here is how Magistrate Judge Mary M. Rowland summarized the custodian dispute in her opinion:

Plaintiffs argue Mayor Emanuel and ten members of his senior staff, including current and former chiefs of staff and communications directors are relevant to Plaintiffs’ Monell claim. (Id. at 5).[2] The City responds that Plaintiffs’ request is burdensome, and that Plaintiffs have failed to provide any grounds to believe that the proposed custodians were involved with CPD’s policies and practices at Homan Square. (Dkt. 74 at 1, 6). The City proposes instead that it search the two members of the Mayor’s staff responsible for liasoning with the CPD and leave “the door open for additional custodians” depending on the results of that search. (Id. at 2, 4).[3]

Another Silly “Go Fish” Case

As further background, this is one of those negotiated keywords Go Fish cases where the attorneys involved all thought they had the magical powers to divine what words were used in relevant ESI. The list is not shared, but I bet it included wondrous words like “torture” and “off the books,” plus every plaintiff’s favorite “claim.”

The parties agreed that the defendants would review for relevant evidence only the ESI of the custodians that happened to have one or more of the keyword incantations they dreamed up. Under this still all-too-common practice, agreed to by attorneys none of whom appear to have any e-discovery search expertise, the majority of documents in the custody of the defense custodians would never be reviewed. They would not be reviewed because they did not happen to have a “magic word” in them. This kind of untested keyword-filtering agreement is irrational, archaic and not a best practice in any but small cases, but that is what the attorneys for both sides agreed to. They were convinced they could guess what words were used by police, city administrators and politicians in any relevant document. It is a common delusion facilitated by Google’s search of websites.

When will the legal profession grow up and stop playing Go Fish when it comes to a search for relevant legal evidence? I have been writing about this for years. Losey, R., Adventures in Electronic Discovery (West 2011); Child’s Game of ‘Go Fish’ is a Poor Model for e-Discovery Search. Guessing keywords does not work. It almost always fails in both precision and recall. The keyword-hit docs are usually filled with junk, and relevant docs often use unexpected language, not to mention abbreviations and spelling errors. If you do not at least test proposed keywords on a sample custodian, then your error rate will multiply. I saw a review recently where the precision rate on keywords was only six percent, and that was with superficial feedback, i.e., unskilled testing. You never want to waste so much attorney time, even if you are reviewing at low rates. Reviewing ninety-four irrelevant docs to find six relevant ones is an inefficient, expensive approach. We try to improve precision without a significant loss of recall.
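The precision arithmetic above is simple enough to sketch in a few lines. This is a toy illustration only; the sample size, hit counts, and total hit population are hypothetical numbers chosen to mirror the six-percent example in the text:

```python
# Toy illustration of keyword precision measured from a test sample.
# All numbers are hypothetical; they mirror the 6% example in the text.
sample_size = 100          # keyword-hit documents pulled for a test review
relevant_in_sample = 6     # documents a reviewer judged relevant

precision = relevant_in_sample / sample_size   # fraction of hits worth reviewing
print(f"Precision: {precision:.0%}")

# Projected waste if the full keyword-hit population is reviewed at this precision
total_hits = 50_000        # hypothetical total keyword hits across custodians
wasted_docs = total_hits * (1 - precision)
print(f"Estimated irrelevant documents reviewed: {wasted_docs:,.0f}")
```

At six percent precision, roughly 47,000 of the 50,000 hypothetical hits would be irrelevant, which is exactly the kind of metric that makes a burden argument concrete.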

When I first wrote about Go Fish and keywords back in 2010 most everyone agreed with me, even if they disagreed on the significance, the meaning and what you should do about it. That started the proportionality debate in legal search. E-Discovery search expert Judges Peck and Scheindlin joined in the chorus of criticism of negotiated keywords. National Day Laborer Organizing Network v. US Immigration and Customs Enforcement Agency, 877 F.Supp.2d 87 (SDNY, 2012) (J. Scheindlin) (“As Judge Andrew Peck — one of this Court’s experts in e-discovery — recently put it: ‘In too many cases, however, the way lawyers choose keywords is the equivalent of the child’s game of `Go Fish’ … keyword searches usually are not very effective.’” FN 113); Losey, R., Poor Plaintiff’s Counsel, Can’t Even Find a CAR, Much Less Drive One (9/1/13). Don’t you love the quote within a quote? A rare gem in legal writing.

Judge Rowland’s Ruling

I have previously written about the author of the Mann v. City of Chicago opinion, Judge Mary Rowland. Spoliated Schmalz: New Sanctions Case in Chicago That Passes-Over a Mandatory Adverse Inference. She is a rising star in the e-discovery world. Judge Rowland found that the information sought from the additional custodians would be relevant. This disposed of the defendants’ first and weakest argument. Judge Rowland then held that Defendants did not meet the burden of proof “—failing to provide even an estimate—” and for that reason granted, in part, Plaintiffs’ motion to compel, including their request to add the Mayor. Judge Rowland reviewed all six of the proportionality factors under Rule 26(b)(1), including the importance of the issues at stake and the plaintiffs’ lack of access to the requested information.

On the relevance issue Judge Rowland held that, in addition to the agreed-upon staff liaisons, the Mayor and his “upper level staff” might also have relevant information in their email. As to the burden argument, Judge Rowland held that the City did not “offer any specifics or even a rough estimate about the burden.” Judge Rowland correctly rejected the City’s argument that they could not provide any such information because “it is impossible to determine how many emails there may be ‘unless the City actually runs the searches and collects the material.’” Instead, the court held that the defendants should have at least provided “an estimate of the burden.” Smart Judge. Here are her words:

The City argues that it will be “burdened with the time and expense of searching the email boxes of nine (9) additional custodians.” (Dkt. 74 at 5). The City does not offer any specifics or even a rough estimate about the burden. See Kleen Prods. LLC 2012 U.S. Dist. LEXIS 139632, at *48 (“[A] party must articulate and provide evidence of its burden. While a discovery request can be denied if the `burden or expense of the proposed discovery outweighs its likely benefit,’ Fed. R. Civ. P. 26(b)(2)(C)(iii), a party objecting to discovery must specifically demonstrate how the request is burdensome.”) (internal citations and quotations omitted).

As the Seventh Circuit stated in Heraeus Kulzer, GmbH, v. Biomet, Inc., 633 F.3d 591, 598 (7th Cir. 2011):

[The party] could have given the district court an estimate of the number of documents that it would be required to provide Heraeus in order to comply with the request, the number of hours of work by lawyers and paralegals required, and the expense. A specific showing of burden is commonly required by district judges faced with objections to the scope of discovery . . . Rough estimates would have sufficed; none, rough or polished, was offered.

The City argues in its sur-reply that it is impossible to determine how many emails there may be “unless the City actually runs the searches and collects the material.” (Dkt. 78-1 at 4). Still, the City should have provided an estimate of the burden. The Court is not convinced by the City’s argument about the burden.

Judge Rowland also held that the City should have addressed the “other Rule 26 factors—the importance of the issues and of the discovery in resolving the issues, and the parties’ relative access to information and their resources.” She noted that these other factors “weigh[ed] in favor of allowing discovery of more than just the two custodians proposed by the City.” However, the court declined to compel the search of four proposed custodians based on their “short tenure” or the “time during which the person held the position,” concluding the requested searches were “not proportional to the needs of the case.”

Judge Rowland’s opinion notes with seeming surprise the failure of the City of Chicago to provide any argument at all on the five non-economic factors in Rule 26(b)(1). I do not fault them for that. Their arguments on these points were necessarily weak in this type of case, but a conciliatory gesture, a polite acknowledgement showing awareness, might have helped sweeten the vinegar. As it is, they came across as oblivious to the full requirements of the Rule.

What Chicago Should Have Done

What additional information should the defendants have provided to oppose the search and review of the additional nine custodians, including the Mayor’s email? Let’s start with the obvious. They should have shared the total document count and GB size of the nine custodians, and they should have broken that information down on a per-custodian basis. Then they should have estimated the costs to review that many emails and attachments.

The file count information should have been easy to ascertain from the City’s IT department. They know the PST sizes and can also determine, or at least provide a good estimate of, the total document count. The problem they had with this obvious approach is that they wanted a keyword filter. They did not want to search all documents of the custodians, only the ones with keyword hits. Still, that just made the process slightly more difficult, not impossible.
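Even a back-of-the-envelope conversion from mailbox size to document count would have been better than nothing. A minimal sketch of that arithmetic follows; the custodian names, PST sizes, and especially the documents-per-gigabyte rate are all assumptions for illustration, since real rates vary widely with attachment mix and should be calibrated from data already processed in the case:

```python
# Hypothetical sketch: rough per-custodian document counts from PST sizes.
# The 10,000 docs-per-GB rate is an assumed figure for illustration only;
# calibrate it from custodians already processed in the case.
DOCS_PER_GB = 10_000

custodian_pst_gb = {        # hypothetical mailbox sizes in gigabytes
    "Custodian A": 12.5,
    "Custodian B": 4.0,
    "Mayor": 30.0,
}

for name, gb in custodian_pst_gb.items():
    est_docs = int(gb * DOCS_PER_GB)
    print(f"{name}: {gb} GB, roughly {est_docs:,} documents")

total_docs = sum(custodian_pst_gb.values()) * DOCS_PER_GB
print(f"Total estimate: {total_docs:,.0f} documents")
```

Numbers like these, broken down per custodian, are exactly the “rough estimates” the Seventh Circuit said would have sufficed.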

Yes, it is true, as defendants alleged, that to ascertain this supporting information they would have to run the searches and collect the material. So what? Their vendor or Chicago IT department should have helped them with that. It is not that difficult or expensive to do. No expensive lawyer time is required. It is just a computer process. Any computer technician could do it. Certainly any e-discovery vendor. The City could easily have gone ahead and done the silly keyword filtering and provided an actual file count. This would have given the City some hard facts to support their burden argument. It should not have been that expensive to do. Almost certainly the expense would have been less than this motion practice.

Alternatively, the City could have at least estimated the file count and other burden metrics. They could have made reasonable estimates based on their document review experience in the case so far. They had already reviewed uncontested custodians under their Go Fish structure, so they could have made projections based on past results. Estimates made by projections like this would probably have been sufficient in this case, and certainly better than the track they chose, providing no information at all.

Another alternative, the one that would have produced the most persuasive evidence, would be to load the filtered ESI of at least a sample of the nine custodians, including the Mayor. Then begin the review, say for a couple of days, and see what that costs. Then project those costs for the rest of the review and rest of the custodians. By this gold standard approach you would not only have the metrics from the data itself — the file counts, page counts, GB size — but also metrics of the document review, what it costs.
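The sampling projection described above reduces to simple arithmetic once the sample review is done. Here is a minimal sketch; every figure is a hypothetical placeholder, since the real inputs would come from the review platform and billing records for the sample period:

```python
# Minimal sketch of projecting total review cost from a sample review.
# Every figure is hypothetical; real inputs come from the review platform
# and billing records for the sample period.
docs_reviewed_in_sample = 2_000    # documents reviewed in the two-day sample
sample_review_cost = 9_000.00      # attorney fees for the sample, in dollars
total_docs_to_review = 60_000      # filtered ESI across all nine custodians

cost_per_doc = sample_review_cost / docs_reviewed_in_sample
projected_total = cost_per_doc * total_docs_to_review

print(f"Cost per document: ${cost_per_doc:.2f}")
print(f"Projected total review cost: ${projected_total:,.2f}")
```

A projection grounded in actual review-to-date costs carries far more weight with a court than a bare assertion of burden, because each input can be documented.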

You would need to do this on the Mayor’s email separately and argue this burden separately. The Mayor’s email would likely be much more expensive to review than any of the other custodians. It would take attorneys longer to review his documents. There would be more privileged materials to find and log and there would be more redactions. It is like reviewing a CEO’s email. If the attorneys for the City had at least begun some review of Emanuel’s email, they would have been able to provide extensive evidence on the cost and time burden to complete the review.

I suspect the Mayor was the real target here and the other eight custodians were of much less importance. The defense should have gauged their response accordingly. Instead, they did little or nothing to support their burdensome argument, even with the Mayor’s sensitive government email account.

We have a chance to learn from Chicago’s mistake. Always, at the very least, provide some kind of an estimate of the burden. The estimate should include as much information as possible, including time and costs. These estimates can, with time and knowledge, be quite accurate and should be used to set budgets, along with general historical knowledge of costs and expenses. The biggest problem now is a shortage of experts on how to properly estimate document review projects, specifically large ESI-only projects. I suggest you consult with such a cost expert anytime you are faced with a disproportionate ESI review demand. You should do so before you make final decisions or reply in writing.

Conclusion

Mann v. City of Chicago is one of those cases where we can learn from the mistakes of others. At least provide an estimate of costs in every dispute under Rule 26(b)(1). Learn to estimate the costs of document reviews. Either that or hire an expert who can do that for you, one that can provide testimony. Start with file counts and go from there. Always have some metrics to back up your argument. Learn about your data. Learn what it will likely cost to review that data. Learn how to estimate the costs of document reviews. It will probably be a range. The best way to do that is by sampling. With sampling you at least start the document review and estimate total costs by projection of what it has actually cost to date. There are fewer speculative factors that way.

If you agree to part of the review requested, for instance to three out of ten custodians requested, then do that review and measure its costs. That creates the gold standard for metrics of burden under Rule 26(b)(1) and is, after all, required in any objections under Rule 34(b)(2)(B)&(C). See: Judge Peck Orders All Lawyers in NY to Follow the Rules when Objecting to Requests for Production, or Else ….

For more on cost burden estimation listen to my upcoming Ed-Talk on the subject, Proportional Document Review under the New Rules and the Art of Cost Estimation.

 


Evidence Code Revisions and the Grimm/Brady Evidence Admissibility Chart

April 22, 2018

Great fanfare accompanied the changes to the Federal Rules of Civil Procedure in December 2015. But not much attention has been given to the December 2017 changes to the Federal Rules of Evidence. Maybe that has to do with the disappearing trial, the fact that less than one percent of federal cases actually go to trial. Still, you need to know the rules of evidence admissibility, even if you are preparing for a trial that will never come. You need to collect and discover evidence in a way that it can be used, even if it is just in a motion for summary judgment.

Two New Subsections to Rule 902 on Self-Authenticating Evidence

In December 2017 two new subsections were added to Evidence Rule 902, subsections (13) and (14). They are designed to streamline authentication of electronically stored information (ESI). The goal is to eliminate the need to call a witness at trial to authenticate evidence, at least in most instances. Here are the two new provisions:

Rule 902. Evidence That Is Self-Authenticating

The following items of evidence are self-authenticating; they require no extrinsic evidence of authenticity in order to be admitted: . . .

(13) Certified Records Generated by an Electronic Process or System. A record generated by an electronic process or system that produces an accurate result, as shown by a certification of a qualified person that complies with the certification requirements of Rule 902(11) or (12). The proponent must also meet the notice requirements of Rule 902(11).

(14) Certified Data Copied from an Electronic Device, Storage Medium, or File. Data copied from an electronic device, storage medium, or file, if authenticated by a process of digital identification, as shown by a certification of a qualified person that complies with the certification requirements of Rule 902(11) or (12). The proponent also must meet the notice requirements of Rule 902(11).

The Evidence Rules Committee Notes explain the background of these two new subsections.

Committee Notes on Rules—2017 Amendment

Paragraph (14). The amendment sets forth a procedure by which parties can authenticate data copied from an electronic device, storage medium, or an electronic file, other than through the testimony of a foundation witness. As with the provisions on business records in Rules 902(11) and (12), the Committee has found that the expense and inconvenience of producing an authenticating witness for this evidence is often unnecessary. It is often the case that a party goes to the expense of producing an authentication witness, and then the adversary either stipulates authenticity before the witness is called or fails to challenge the authentication testimony once it is presented. The amendment provides a procedure in which the parties can determine in advance of trial whether a real challenge to authenticity will be made, and can then plan accordingly.

Today, data copied from electronic devices, storage media, and electronic files are ordinarily authenticated by “hash value”. A hash value is a number that is often represented as a sequence of characters and is produced by an algorithm based upon the digital contents of a drive, medium, or file. If the hash values for the original and copy are different, then the copy is not identical to the original. If the hash values for the original and copy are the same, it is highly improbable that the original and copy are not identical. Thus, identical hash values for the original and copy reliably attest to the fact that they are exact duplicates. This amendment allows self-authentication by a certification of a qualified person that she checked the hash value of the proffered item and that it was identical to the original. The rule is flexible enough to allow certifications through processes other than comparison of hash value, including by other reliable means of identification provided by future technology.

Nothing in the amendment is intended to limit a party from establishing authenticity of electronic evidence on any ground provided in these Rules, including through judicial notice where appropriate.

A proponent establishing authenticity under this Rule must present a certification containing information that would be sufficient to establish authenticity were that information provided by a witness at trial. If the certification provides information that would be insufficient to authenticate the record if the certifying person testified, then authenticity is not established under this Rule.

The reference to the “certification requirements of Rule 902(11) or (12)” is only to the procedural requirements for a valid certification. There is no intent to require, or permit, a certification under this Rule to prove the requirements of Rule 803(6). Rule 902(14) is solely limited to authentication, and any attempt to satisfy a hearsay exception must be made independently.

A certification under this Rule can only establish that the proffered item is authentic. The opponent remains free to object to admissibility of the proffered item on other grounds—including hearsay, relevance, or in criminal cases the right to confrontation. For example, in a criminal case in which data copied from a hard drive is proffered, the defendant can still challenge hearsay found in the hard drive, and can still challenge whether the information on the hard drive was placed there by the defendant.

A challenge to the authenticity of electronic evidence may require technical information about the system or process at issue, including possibly retaining a forensic technical expert; such factors will affect whether the opponent has a fair opportunity to challenge the evidence given the notice provided.

The reference to Rule 902(12) is intended to cover certifications that are made in a foreign country.
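The hash-value comparison the Committee Note describes can be sketched in a few lines. This is an illustrative example only, not a forensic collection procedure; the file names and contents are hypothetical placeholders:

```python
# Illustrative sketch of hash-value authentication under Rule 902(14):
# if the copy's hash matches the original's, the copy is an exact duplicate.
import hashlib

def file_hash(path: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo: write a hypothetical "original" and a byte-for-byte "copy".
with open("evidence_original.bin", "wb") as f:
    f.write(b"contents of the collected drive image")
with open("evidence_copy.bin", "wb") as f:
    f.write(b"contents of the collected drive image")

if file_hash("evidence_original.bin") == file_hash("evidence_copy.bin"):
    print("Hashes match: the copy is an exact duplicate of the original.")
else:
    print("Hashes differ: the copy is NOT identical to the original.")
```

A qualified person's certification that this comparison was performed, and that the values matched, is the sort of showing the new subsection contemplates in place of live foundation testimony.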

Also see: Paul Grimm, Gregory Joseph, Daniel Capra, Manual on Best Practices for Authenticating Digital Evidence; Authenticating Digital Evidence, 69 BAYLOR L. REV. 1 (2017).

Grimm/Brady Evidence Admissibility Chart

The rule change is a helpful addition to the litigator’s toolkit, but many challenges remain for attorneys handling electronic evidence. I agree with Kevin Brady, a top expert in the field of ESI evidence, who says that “the challenge for lawyers trying to authenticate digital evidence using the traditional rules of evidence can be confusing.” This may be an understatement! Kevin thinks that part of the challenge for attorneys arises from the rapidly evolving landscape of data sources. He gives examples such as bitcoin, blockchain, smart contracts, social media, IoT, mobile devices, and cloud computing services. The use of social media like Facebook, LinkedIn, Instagram and others continues to increase at an unbelievable rate and adds to the problem. Moreover, according to Business Insider, there are more people using the top four social messaging apps (WhatsApp, Messenger, WeChat, and Viber) than the top four social media apps (Facebook, Instagram, Twitter, and LinkedIn). According to TechCrunch, Facebook’s Messenger alone has more than 1.3 billion monthly active users, and Instagram is officially testing a standalone messaging app, Direct.

Recognizing the problem, Kevin Brady teamed up with U.S. District Court Judge Paul Grimm, the leading judicial expert in the field, to create the Grimm/Brady Evidence Admissibility Chart shown below.

The detailed reference chart provides discovery lawyers and trial attorneys with a quick reference guide for handling many different sources of ESI evidence. It covers Rules 104, 803(6), 901 and 902. The chart provides a step-by-step approach for authenticating digital information and successfully getting that information admitted into evidence.

The e-Discovery Team highly recommends that you carefully study this chart. Click on the photos and they will open in a larger size. I also suggest you download your own copy here: Grimm Brady Evidence Admission Chart 2018. Many thanks to Kevin Brady for helping me with this blog.



 


Guest Blog: “Follow the Money” and Metrics: A User’s Guide to Proportionality in the Age of e-Discovery

April 8, 2018

This is a guest blog by a friend and colleague, Philip Favro. Phil is a consultant for Driven, Inc. where he serves as a trusted advisor to organizations and their counsel on issues relating to the discovery process and information governance. Phil is also currently active in The Sedona Conference. He obtained permission from them to include a description of a recent event they sponsored in Nashville on Proportionality.

“Follow the Money” and Metrics: A User’s Guide to Proportionality in the Age of e-Discovery

Moviegoers and political junkies have flocked to theaters over the past few months to watch period-piece epics including Darkest Hour and The Post. While there is undoubted attraction (especially in today’s political climate) in watching the reenactment of genuine leadership and courageous deeds these movies portray, The Post should have particular interest for counsel and others involved in electronic discovery.

With its emphasis on investigation and fact-gathering; culling relevant information from marginally useful materials; and decision-making on how such information should be disclosed and presented to adversaries, The Post features key traits associated with sophisticated discovery counsel.

Not coincidentally, those same attributes were on display in another drama from the 1970s of which The Post viewers were reminded: All The President’s Men. That investigative journalism classic depicts Washington Post correspondents Bob Woodward and Carl Bernstein as dogged reporters determined to identify the culprits responsible for the Watergate Hotel break-in in June 1972.

A critical aspect of their work involved Woodward’s furtive meetings with Mark Felt, who served at that time as the deputy director of the FBI. Known only as “Deep Throat” (until Felt revealed himself in 2005), Felt provided cryptic yet key direction that aided the reporters’ investigation. One of Felt’s most significant tips (as portrayed in the movie) was his suggestion that Woodward investigate the cash contributions made to help reelect then President Richard Nixon in 1972. Played by iconic actor Hal Holbrook in All The President’s Men, Felt’s soft-spoken but serious demeanor underscored the importance of his repeated direction to Woodward to “just follow the money.” By following the money, the Washington Post reporters helped discover many of the nefarious tactics that eventually brought down the Nixon presidency.

Proportionality

The directive to “follow the money” applies with equal force to counsel responsible for handling discovery. This is particularly the case in 2018 since courts now expect counsel to address discovery consistent with proportionality standards. Those standards – as codified in Federal Rule of Civil Procedure 26(b)(1) – require counsel, clients, and the courts to consider various factors bearing on the discovery analysis. They include:

(1) the importance of the issues at stake in this action; (2) the amount in controversy; (3) the parties’ relative access to relevant information; (4) the parties’ resources; (5) the importance of the discovery in resolving the issues; and (6) whether the burden or expense of the proposed discovery outweighs its likely benefit.

While all of the factors may be significant, monetary considerations – elements of which are found in both the “amount in controversy” and “burden or expense” factors – frequently predominate in a proportionality analysis. As Ralph Losey (the owner, host, and principal author of this blog) has emphasized many times, “[t]he bottom line in e-discovery production is what it costs.” By following the money or, perhaps more appropriate for discovery, focusing on the money, counsel can drive an effective discovery process and obtain better results for the client.

As lawyers do so, they will find an increasingly sophisticated judiciary who expect counsel to approach discovery through the lens of proportionality. This certainly was the case in Oxbow Carbon & Minerals LLC v. Union Pacific Railroad Company, which has been prominently spotlighted in this blog. In Oxbow, the court applied the Rule 26(b)(1) proportionality factors to a disputed document request, holding that it was not unduly burdensome and that it properly targeted relevant information. While the court examined all of the Rule 26(b)(1) proportionality standards, money was clearly the determinative factor. The amount in controversy, coupled with the comparative costs of discovery – discovery completed and still to be undertaken – tipped the scales in favor of ordering plaintiffs to respond to defendants’ document requests.

The Critical Role of Metrics

Essential to Oxbow’s holding were the metrics the parties shared with the court. Metrics – typically defined as a standard of measurement or (as used in the business world) a method for evaluating performance – offer counsel ways to assess the “performance” of a particular document production. Metrics can measure the extent to which a production contains relevant materials, undisclosed privileged information, and even nonresponsive documents. Metrics can also estimate – as was the case in Oxbow – the resources (including time, manpower, and costs) a party may be forced to incur to comply with a discovery request.

Metrics enable a court to follow the money and properly balance the burdens of discovery against its benefits. Without metrics, a responding party could hardly expect to establish that a request is disproportionate and thereby prevail in motion practice. As Ralph observed in his post entitled Judge Facciola’s Successor, Judge Michael Harvey, Provides Excellent Proportionality Analysis in an Order to Compel:

Successful arguments on motions to compel require hard evidence. To meet your burden of proof you must present credible estimates of the costs of document review. This requires . . . reliable metrics and statistics concerning the ESI that the requesting party wants the responding party to review.

As discussed later on, other courts have also emphasized the critical role of metrics in evaluating the proportionality of a particular discovery request.

The Sedona Conference, Proportionality, and Metrics

For counsel who wish to better understand the role of metrics in discovery, the directive to “follow the money” will bring them to The Sedona Conference (“Sedona”). Sedona is the preeminent legal institution dedicated to advancing thoughtful reforms on important legal issues. While Sedona addresses matters ranging from patent litigation and trade secret misappropriation to data privacy and cross-border data protection, the organization is best known for its work on electronic discovery.

Renowned for its volunteer model and for attracting many of the best minds in the legal industry, Sedona prepares authoritative resources that are regularly relied on by judges, lawyers, and scholars. This is particularly the case with proportionality standards and how they should drive the determination of discovery issues.

Sedona published its first Commentary on Proportionality in Electronic Discovery (“Commentary” or “Proportionality Commentary”) in 2010 and a second version in 2013. Last spring, Sedona released a third iteration of the Commentary. Collaboratively prepared by a group of renowned judges and practitioners, the third version of the Commentary provides common sense direction on how metrics can help achieve proportional results in discovery:

Burden and expense should be supported by hard information and not by unsupported assertions. For example, if a party claims that a search would result in too many documents, the party should run the search and be prepared to provide the opposing party with the number of hits and any other applicable qualitative metrics. If the party claims that the search results in too many irrelevant hits, the party may consider providing a description or examples of irrelevant documents captured by the search.

Quantitative metrics in support of a burden and expense argument may include the projected volume of potentially responsive documents. It may also encompass the costs associated with processing, performing data analytics, and review, taking into consideration the anticipated rate of review and reviewer costs, based upon reasonable fees and expenses.

As the Commentary makes clear, metrics can provide insights regarding the effectiveness of search methodologies or the nature and extent of a party’s burden in responding to a particular discovery request. By sharing these metrics with litigation adversaries, counsel can informally address legitimate discovery questions or crystallize the issues for resolution by a court. Either way represents a more cost effective approach to discovery than the opacity of traditional meet and confers or motion practice.

Framing the Issues through Sedona’s TSCI Event

These issues were on display last month at Sedona’s TSCI conference in Nashville, Tennessee. The TSCI event typically provides attendees with an annual opportunity to stay current on developing trends in e-Discovery. The 2018 TSCI event remained consistent with that objective, spotlighting practice developments for counsel “from ‘eDiscovery 1.0’ to New and Evolving Legal Challenges.” Expertly chaired by Jerone “Jerry” English and Maura Grossman, TSCI featured sessions covering discovery and other issues relating to artificial intelligence (AI), the Internet of Things (IoT), mobile applications, data breaches, cross-border discovery, and the always engaging case law panel and judicial round-table.

One of the more practical sessions focused on the importance of using metrics, analytics, and sampling to achieve proportionality in discovery. Entitled Using Data Analytics and Metrics to Achieve Proportionality, the purpose of this session was to help attendees understand how counsel should present analytics, metrics, and sampled data to a court. The session featured a fantastic line-up of speakers – Gareth Evans, Maura Grossman, U.S. Magistrate Judge Anthony Porcelli, and U.S. Magistrate Judge Leda Dunn Wettre – who were well situated to provide views on these topics. Audience members additionally offered insightful comments on the issues.[1]

The most important guidance the speakers and audience emphasized was the need for more complete disclosure of supporting metrics. Unless specific metrics are disclosed, neither adversaries nor the court can address issues as varied as the performance of particular search terms, the reasonableness of a production made using TAR or other search methodologies, or the burdens of a particular discovery request.

On the latter issue of substantiating arguments of undue burden, one particularly insightful comment offered during the session concisely summarized the interplay between metrics, proportionality, and cost: “Follow the money.” This admonition dovetailed nicely with the discussion of two recent cases during that session – Duffy v. Lawrence Memorial Hospital, No. 2:14-cv-2256-SAC-TJJ, 2017 WL 1277808 (D. Kan. Mar. 31, 2017) and Solo v. United Parcel Service Co., No. 14-cv-12719, 2017 WL 85832 (E.D. Mich. Jan. 10, 2017). Both of these cases spotlight how reliable metrics enable a court to follow the money and resolve discovery disputes consistent with proportionality standards.

Duffy v. Lawrence Memorial Hospital

In Duffy, the court modified a discovery order issued less than two months beforehand that granted plaintiff’s requests for various categories of emergency room patient records. In that first round of motion practice, defendant had argued that plaintiff’s requests were disproportionate and unduly burdensome. The court overruled those objections, explaining that defendant failed to provide any substantive metrics to support those objections:

Defendant objects to every document request as being unduly burdensome, but provides no facts to support the objection. Neither does Defendant provide evidence of the costs it would incur in responding to the requests.

In summary, defendant’s failure to share any meaningful metrics regarding the time, manpower, or costs it would incur to comply with plaintiff’s requests ultimately left its arguments bereft of any evidentiary support.

In the second round of motion practice, defendant adopted a different approach that yielded a more proportional result. Confronted by the staggering reality of the court’s production order and having learned how to properly use supporting metrics in motion practice, defendant moved for a protective order.

In contrast to its prior briefing, defendant shared specific metrics associated with the burdens of production. Those burdens involved the deployment of staff to individually review 15,574 electronic patient files so as to identify particular patient visit information. Such a process would be labor intensive and cost well over $230,000:

Defendant estimates it would take 7,787 worker hours to locate and produce responsive information for 15,574 patient records. If Defendant had ten employees working on the task, they would spend more than ninety-seven days working eight hours a day, at an estimated cost to Defendant of $196,933.23.

After aggregating the information, Defendant asserts it would need to redact patients’ personal confidential information . . . redaction would take ten reviewers fourteen days at a cost of $37,259.50. The process would include a quality control attorney reviewer who would spend two hours a day, and reviewers who would review 15 documents per hour for eight hours a day.

In sum . . . producing the information relevant to RFP Nos. 40, 41, 43, and 58 would take 8,982 hours of work and cost in excess of $230,000 if done by contract staff.

Simply put, defendant urged the court to follow the money. Because defendant substantiated its proportionality arguments with appropriate metrics, the court recognized that its initial production order placed an undue burden on defendant.

As a result, the court adopted a modified order that instead allowed defendant to produce a random sample of 257 patient records. While advancing a number of justifications for its modified order, the court ultimately relied on the tripartite mandate from Federal Rule of Civil Procedure 1. The order would provide the parties to the litigation with a substantively better, more efficient, and less expensive method for producing relevant information.

Solo v. United Parcel Service

Solo v. United Parcel Service reached a result analogous to the Duffy holding, ordering that defendant produce only a sample of the information sought by plaintiffs. In Solo, plaintiffs served an interrogatory that sought identification of shipment information relating to their putative class action claims (plaintiffs claimed that defendant overcharged certain customers for “shipments that had a declared value of over $300”). The interrogatory sought shipping record information that spanned a period of six years.

Defendant argued in response that the interrogatory was unduly burdensome and would impose a disproportionate production obligation on the company. Because most of the requested information was archived on backup tapes, defendant shared specific metrics regarding the “overwhelming” burdens associated with responding to the interrogatory:

UPS estimates that it would take at least six months just to restore the archived tapes as described above, at a cost of $120,000 in labor . . . that estimate does not include the time and expense of analyzing the data once extracted in order to answer Interrogatory No. 1, which would require extensive additional analysis of each account number and the manual review of contract language for individual shipper. Such a process would also require a substantial amount of time and resources on the part of UPS.

Based on the metrics defendant disclosed and given that plaintiffs’ claims had yet to be certified as a class action, the court found the interrogatory to be disproportionate. Following the money and drawing on the linked concepts of cooperation and proportionality from Rule 1, the court instead ordered that defendant produce a sample of the requested information from a six-month period. The court also directed the parties to meet and confer on developing an agreeable sampling methodology.

Conclusion

Duffy and Solo reinforce the critical interplay between metrics, proportionality, and money. Just like Oxbow, the responding parties from Duffy and Solo could hardly expect to substantiate arguments regarding undue burden and disproportionality without metrics. Indeed, the court in Duffy initially rejected such arguments when defendant failed to support them with actual information. However, by disclosing metrics with reasonable estimates of time, manpower, and costs, Duffy and Solo resulted in production orders more consistent with proportionality limitations.

All of which translated into substantial cost savings for the responding parties. Defendant in Duffy was facing a discovery bill of over $230,000 to review 15,574 patient files. Dividing the projected cost of the entire review process into the number of patient records – $230,000 ÷ 15,574 – reveals that defendant would pay approximately $15 to review an individual patient record. Under the modified production order, the new projected cost – $15 multiplied by 257 patient records – equals $3,855. Follow the money: the tactical use of metrics apparently saved the client over $225,000!
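The arithmetic behind that savings figure is easy to verify. Here is a quick sketch, using only the figures quoted above:

```python
# Back-of-the-envelope check of the Duffy savings, using the figures
# discussed above: a $230,000 projected review cost, 15,574 patient
# records, and the 257-record sample allowed by the modified order.

full_cost = 230_000
total_records = 15_574
sample_records = 257

per_record = full_cost / total_records            # about $14.77, i.e. roughly $15
sample_cost = round(per_record) * sample_records  # $15 x 257 = $3,855
savings = full_cost - sample_cost                 # $226,145

print(f"per record:  ${per_record:.2f}")
print(f"sample cost: ${sample_cost:,}")
print(f"savings:     ${savings:,}")
```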

Duffy and Solo are consistent with and confirmed by Oxbow, the Proportionality Commentary, and the Federal Rules of Civil Procedure. These authoritative resources collectively teach that counsel who use metrics and focus on cost can drive an effective discovery process. Lawyers that do so will ultimately obtain better results for the client in discovery.

_______

[1] To encourage candid and robust debate during its events, Sedona has promulgated a nondisclosure rule. Known as “The Sedona Rule,” it proscribes attendees from identifying by name the speakers or audience members who share particular insights. It also forbids divulging the contents of particular brainstorming or drafting projects that have yet to be released for publication. The Sedona Rule otherwise allows for the anonymous disclosure of session content from its events.


Document Review and Proportionality – Part Two

March 28, 2018

This is a continuation of a blog post that I started last week. I suggest you read Part One before this one.

Simplified Six Step Review Plan for Small and Medium Sized Cases or Otherwise Where Predictive Coding is Not Used

Here is the workflow for the simplified six-step plan. The first three steps repeat until you have a viable plan where the cost estimate is proportional under Rule 26(b)(1).
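Before walking through the steps individually, the iterative core of the plan can be sketched in a few lines of code. Everything below is an invented toy illustration, not part of any actual review tool: the corpus, the search plans, the review rates, and the budget are all placeholder assumptions.

```python
# Sketch of the iterative Steps One-Three: run searches, test the
# results, estimate the review cost, and refine until the estimate
# is proportional. All data, rates, and the budget are toy examples.

DOCS_PER_HOUR = 50      # assumed reviewer speed
HOURLY_RATE = 50.0      # assumed contract-reviewer rate
BUDGET = 3.0            # hypothetical proportionality ceiling (toy scale)

corpus = [
    "call me about the widget contract",
    "widget shipment delayed",
    "lunch plans for friday",
    "widget party invitation",
    "signed widget contract attached",
]

# Successively narrower search plans, from broad to refined.
search_plans = [["widget"], ["widget contract"]]

def run_searches(terms):
    """Step One (toy version): substring match standing in for multimodal search."""
    return {doc for t in terms for doc in corpus if t in doc}

def estimate_cost(n_docs):
    """Step Three: projected review hours priced at the reviewer rate."""
    return n_docs / DOCS_PER_HOUR * HOURLY_RATE

for terms in search_plans:          # iterate until a plan is proportional
    hits = run_searches(terms)
    cost = estimate_cost(len(hits))
    if cost <= BUDGET:              # proportional: proceed to Step Four
        break                       # (if no plan fits, the last plan is kept)

print(f"plan {terms}: {len(hits)} docs, estimated ${cost:.2f}")
```

The broad plan returns too many documents to fit the toy budget; the refined plan passes, and the loop exits with a cost estimate ready for the disclosures in Step Four.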

Step One: Multimodal Search

The document review begins with Multimodal Search of the ESI. Multimodal means that all modes of search are used to try to find relevant documents. Multimodal search uses a variety of techniques in an evolving, iterated process. It is never limited to a single search technique, such as keyword. All methods are used as deemed appropriate based upon the data to be reviewed and the software tools available. The basic types of search are shown in the search pyramid.

In Step One we use a multimodal approach, but we typically begin with keyword and concept searches. Also, in most projects we will run similarity searches of all kinds to make the review more complete and to broaden the reach of the keyword and concept searches. Sometimes we may even use linear search, the expert manual review at the base of the search pyramid. For instance, it might be helpful to see all communications that a key witness had on a certain day. The two-word, stand-alone “call me” email, when seen in context, can sometimes be invaluable to proving your case.

I do not want to go into too much detail about the types of searches we use in this first step because each vendor’s document review software has different types of searches built in. Still, the basic types of search shown in the pyramid can be found in most software, although AI, the active machine learning at the top, is still found only in the best.

History of Multimodal Search

Professor Marcia Bates

Multimodal search, wherein a variety of techniques are used in an evolving, iterated process, is new to the legal profession, but not to Information Science. That is the field of scientific study which is, among many other things, concerned with computer search of large volumes of data. Although the e-Discovery Team’s promotion of multimodal search techniques to find evidence only goes back about ten years, multimodal search is a well-established technique in Information Science. The pioneer professor who first popularized this search method was Marcia J. Bates, in her article The Design of Browsing and Berrypicking Techniques for the Online Search Interface, 13 Online Info. Rev. 407, 409–11, 414, 418, 421–22 (1989). Professor Bates of UCLA did not use the term multimodal; that is my own small innovation. Instead she coined the word “berrypicking” to describe the use of all types of search to find relevant texts. I prefer the term “multimodal” to “berrypicking,” but they describe basically the same techniques.

In 2011 Marcia Bates explained on Quora her classic 1989 article and work on berrypicking:

An important thing we learned early on is that successful searching requires what I called “berrypicking.” . . .

Berrypicking involves 1) searching many different places/sources, 2) using different search techniques in different places, and 3) changing your search goal as you go along and learn things along the way. . . .

This may seem fairly obvious when stated this way, but, in fact, many searchers erroneously think they will find everything they want in just one place, and second, many information systems have been designed to permit only one kind of searching, and inhibit the searcher from using the more effective berrypicking technique.

Marcia J. Bates, Online Search and Berrypicking, Quora (Dec. 21, 2011). Professor Bates also introduced the related concept of an evolving search. In 1989 this was a radical idea in information science because it departed from the established orthodox assumption that an information need (relevance) remains the same, unchanged, throughout a search, no matter what the user might learn from the documents in the preliminary retrieved set. The Design of Browsing and Berrypicking Techniques for the Online Search Interface. Professor Bates dismissed this assumption and wrote in her 1989 article:

In real-life searches in manual sources, end users may begin with just one feature of a broader topic, or just one relevant reference, and move through a variety of sources.  Each new piece of information they encounter gives them new ideas and directions to follow and, consequently, a new conception of the query.  At each stage they are not just modifying the search terms used in order to get a better match for a single query.  Rather the query itself (as well as the search terms used) is continually shifting, in part or whole.   This type of search is here called an evolving search.

Furthermore, at each stage, with each different conception of the query, the user may identify useful information and references. In other words, the query is satisfied not by a single final retrieved set, but by a series of selections of individual references and bits of information at each stage of the ever-modifying search. A bit-at-a-time retrieval of this sort is here called berrypicking. This term is used by analogy to picking huckleberries or blueberries in the forest. The berries are scattered on the bushes; they do not come in bunches. One must pick them one at a time. One could do berrypicking of information without  the search need itself changing (evolving), but in this article the attention is given to searches that combine both of these features.

I independently noticed evolving search as a routine phenomenon in legal search and only recently found Professor Bates’ prior descriptions. I have written about this often in the field of legal search (although never previously crediting Professor Bates) under the names “concept drift” and “evolving relevance.” See, e.g., Concept Drift and Consistency: Two Keys To Document Review Quality – Part Two (e-Discovery Team, 1/24/16). Also see Voorhees, Variations in Relevance Judgments and the Measurement of Retrieval Effectiveness, 36 Info. Processing & Mgmt 697 (2000) at page 714.

SIDE NOTE: The somewhat related term query drift in information science refers to a different phenomenon in machine learning. In query drift the concept of document relevance unintentionally changes from the use of indiscriminate pseudo-relevance feedback. Cormack, Buttcher & Clarke, Information Retrieval: Implementation and Evaluation of Search Engines (MIT Press 2010) at pg. 277. This can lead to severe negative relevance feedback loops where the AI is trained incorrectly, which in turn corrupts everything built on that training. It must be avoided.

Yes, that means that skilled humans must still play a key role in all aspects of the delivery and production of goods and services, lawyers too.

UCLA Professor Bates first wrote about concept shift in early computer assisted search in the late 1980s. She found that users might execute a query, skim some of the resulting documents, and then learn things that slightly change their information need. They then refine their query, not only to better express their information need, but also because the information need itself has now changed. This was a new concept at the time because under the Classical Model of Information Retrieval an information need is single and unchanging. Professor Bates illustrated the old Classical Model with the following diagram.

The Classical Model was misguided. All search projects, including the legal search for evidence, are an evolving process where the understanding of the information need progresses, improves, as the information is reviewed. See diagram below for the multimodal berrypicking type approach. Note the importance of human thinking to this approach.

See Cognitive models of information retrieval (Wikipedia). As this Wikipedia article explains:

Bates argues that searches are evolving and occur bit by bit. That is to say, a person constantly changes his or her search terms in response to the results returned from the information retrieval system. Thus, a simple linear model does not capture the nature of information retrieval because the very act of searching causes feedback which causes the user to modify his or her cognitive model of the information being searched for.

Multimodal search assumes that the information need evolves over the course of a document review. It is never just run one search and then review all of the documents found by that search. That linear approach was used in version 1.0 of predictive coding, and is still used by most lawyers today. The dominant model in law today is linear, wherein a negotiated list of keywords is used to run one search. I called this failed method “Go Fish,” and a few judges, like Judge Peck, picked up on that name. Losey, R., Adventures in Electronic Discovery (West 2011); Child’s Game of ‘Go Fish’ is a Poor Model for e-Discovery Search; Moore v. Publicis Groupe & MSL Group, 287 F.R.D. 182, 190-91, 2012 WL 607412, at *10 (S.D.N.Y. Feb. 24, 2012) (J. Peck).

The popular, but ineffective, Go Fish approach is like the Classical Information Retrieval Model in that only a single list of keywords is used as the query. The keywords are not refined over time as the documents are reviewed. This is a mono-modal process. It stands in contrast to our evolving multimodal process, Step One in our six-step plan. In the first step we run many, many searches, review some of the results of each search, some of the documents, and then change the searches accordingly.

Step Two: Tests, Sample

Each search run is sampled by quick reviews and its effectiveness evaluated, tested. For instance, did a search for what you expected would be an unusual word turn up far more hits than anticipated? Did the keyword show up in all kinds of documents that had nothing to do with the case? For example, a couple of minutes of review might show that what you thought would be a carefully and rarely used word, Privileged, was in fact part of the standard signature line of one custodian. All his emails had the keyword Privileged on them. The keyword in these circumstances may be a surprise failure, at least as to that one custodian. These kinds of unexpected language usages and surprise failures are commonplace, especially for neophyte lawyers.

Sampling here does not mean random sampling, but rather judgmental sampling, just picking a few representative hit documents and reviewing them. Were a fair number of berries found in that new search bush, or not? In our example, assume that your sample review of the documents with “Privileged” showed that the word was only part of one person’s standard signature on every one of their emails. When a new search is run wherein this custodian is excluded, the search results may now test favorably. You may devise other searches that exclude or limit the keyword “Privileged” whenever it is found in a signature.
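The signature-line problem described above can be handled mechanically once testing reveals it. Here is a minimal sketch of that refinement, using invented emails and custodian names and a plain substring test rather than any particular review tool’s query syntax:

```python
# Toy sketch of testing and refining a "Privileged" keyword search
# when one custodian's signature block contains the keyword.
# The emails and custodian names are invented examples.

emails = [
    {"custodian": "alice", "body": "Draft response attached. Privileged and Confidential - A. Smith"},
    {"custodian": "alice", "body": "Lunch today? Privileged and Confidential - A. Smith"},
    {"custodian": "bob",   "body": "Marking this memo privileged per counsel."},
    {"custodian": "bob",   "body": "Game tickets for saturday"},
]

def hits(query, exclude_custodian=None):
    """Return emails whose body contains the query, optionally skipping one custodian."""
    return [e for e in emails
            if query in e["body"].lower()
            and e["custodian"] != exclude_custodian]

naive = hits("privileged")                               # signature noise included
refined = hits("privileged", exclude_custodian="alice")  # noisy custodian excluded

print(len(naive), "naive hits;", len(refined), "refined hits")
```

In a real project the exclusion would more likely target the signature text itself rather than the whole custodian, but the testing logic is the same: sample, spot the false hits, refine, and re-test.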

There are many computer search tools used in a multimodal search method, but the most important tool of all is not algorithmic, but human. The most important search tool is the human ability to think the whole time you are looking for tasty berries. (The all-important “T” in Professor Bates’ diagram above.) This means the ability to improvise, to spontaneously respond and react to unexpected circumstances. It means ad hoc searches that change with time and experience. It is not a linear, set-it-and-forget-it, keyword cull-in and read-all-documents approach. This was true in the early days of automated search with Professor Bates’ berrypicking work in the late 1980s, and it is still true today. Indeed, since the complexity of ESI has expanded a million times since then, our thinking, improvisation and teamwork are now more important than ever.

The goal in Step Two is to identify effective searches. Typically, that means searches where most of the results are relevant, greater than 50%. Ideally we would like to see roughly 80% relevancy. Alternatively, searches whose hits are very few in number, and thus inexpensive to review in full, may be accepted. For instance, you may try a search that returns only ten documents, which you could review in just a minute. You may find only one relevant, but it could be important. The acceptable number of documents to review in a Bottom Line Driven Review will always take cost into consideration. That is where Step Three comes in, Estimation. What will it cost to review the documents found?

Step Three: Estimates

It is not enough to come up with effective searches, which is the goal of Steps One and Two; the cost to review all of the documents returned by those searches must also be considered. It may still cost far too much to review the documents when considering the proportionality factors under Rule 26(b)(1), as discussed in Part One of this article. The plan of review must always take the cost of review into consideration.

In Part One we described an estimation method that I like to use to calculate the cost of an ESI review. When the projected cost, the estimate, is proportional in your judgment (and, where appropriate, in the judge’s judgment), you conclude the iterative process of refining searches. You can then move on to Step Four: preparing your discovery plan and making disclosures about that plan.
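For readers who have not yet seen Part One, the kind of calculation involved can be sketched in a few lines. This is a generic illustration only: the review rate, the hourly rate, and the document count are placeholder assumptions, not figures from the estimation method as described in Part One.

```python
# Generic sketch of a review cost estimate of the kind discussed above.
# The rates and the document count are placeholder assumptions.

def review_estimate(n_docs, docs_per_hour=50, hourly_rate=50.0):
    """Projected hours and dollars to review n_docs documents."""
    hours = n_docs / docs_per_hour
    return hours, hours * hourly_rate

hours, dollars = review_estimate(12_472)
print(f"{hours:,.0f} hours, about ${dollars:,.0f}")
```

If the projected dollars exceed what is proportional for the case, you return to Steps One and Two and refine the searches; if not, you proceed to Step Four.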

Step Four: Plan, Disclosures

Once you have created effective searches that produce an affordable number of documents to review for production, you articulate the plan and make some disclosures about it. The extent of transparency in this step can vary considerably, depending on the circumstances and people involved. Long talkers like me can go on about legal search for many hours, far past the boredom tolerance level of most non-specialists. You might be fascinated by the various searches I ran to come up with the, say, 12,472 documents for final review, but most opposing counsel do not care beyond making sure that certain pet keywords they may like were used and tested. You should be prepared to reveal that kind of work-product for purposes of dispute avoidance and to build good will. Typically they want you to review more documents, no matter what you say. They usually save their arguments for the bottom line, the costs. They usually argue for greater expense based on the first five criteria of Rule 26(b)(1):

  1. the importance of the issues at stake in this action;
  2. the amount in controversy;
  3. the parties’ relative access to relevant information;
  4. the parties’ resources;
  5. the importance of the discovery in resolving the issues; and
  6. whether the burden or expense of the proposed discovery outweighs its likely benefit.

Still, although early agreement on the scope of review is often impossible, as the requesting party always wants you to spend more, you can usually move past this initial disagreement by agreeing to phased discovery. The requesting party can reserve its objections to your plan, but still agree it is adequate for phase one. Usually we find that after the phase one production is completed the requesting party’s demands for more are either eliminated or considerably tempered. It may well now be possible to reach a reasonable final agreement.

Step Five: Final Review

Here is where you start to carry out your discovery plan. In this stage you finish looking at the documents and coding them as Responsive (relevant), Irrelevant (not responsive), Privileged (relevant but privileged, and so logged and withheld), and Confidential (all levels, from notations and legends, to redactions, to withhold-and-log). A fifth, temporary document code is used for communication purposes throughout a project: Undetermined. Issue tagging is usually a waste of time and should be avoided. Instead, you should rely on search to find documents to support various points. There are typically only a dozen or so documents of importance at trial anyway, no matter what the original corpus size.


I highly recommend use of professional document review attorneys to assist you in this step. These so-called “contract lawyers” specialize in electronic document review and do so at a very low cost, typically in the neighborhood of $50 per hour. The best of them, who may often command slightly higher rates, are speed readers with high comprehension. They also know what to look for in different kinds of cases. Some have impressive backgrounds. Of course, good management of these resources is required. They should have their own management and team leaders. Outside attorneys signing under Rule 26(g) will also need to supervise them carefully, especially as to relevance intricacies. The day will come when a court will find it unreasonable not to employ these attorneys in a document review. The savings are dramatic, and this in turn increases the persuasiveness of your cost burden argument.

Step Six: Production

The last step is transfer of the appropriate information to the requesting party and designated members of your team. Production is typically followed by later delivery of a log of all documents withheld, even though responsive or relevant. The withheld, logged documents are typically: Attorney-Client Communications, protected from disclosure under the client’s privilege; or Attorney Work-Product documents, protected from disclosure under the attorney’s privilege. These are two different privileges. The attorney’s work-product privilege is frequently waived in some part, though often a very small one. The client’s communications with its attorneys are, however, protected by an inviolate privilege that is never waived.

Typically you should produce in stages and not wait until project completion. The only exception might be where the requesting party would rather wait and receive one big production instead of a series of small productions. That is very rare. So plan on multiple productions. We suggest the first production be small and serve as a test of the receiving party’s abilities and otherwise get the bugs out of the system.

Conclusion

In this essay I have shown the method I use in document reviews to control costs by use of estimation and multimodal search. I call this a Bottom Line Driven approach. The six-step process is designed to help uncover the costs of review as part of the review itself. This kind of experience-based estimate is an ideal way to meet the evidentiary burdens of a proportionality objection under revised Rules 26(b)(1) and 26(b)(2). It provides the hard facts needed to be specific as to what you will review and what you will not, and the likely costs involved.

The six-step approach described here uses the costs incurred at the front end of the project to predict the total expense. The costs are controlled by use of best practices, such as contract review lawyers, but primarily by limiting the number of documents reviewed. Although it is somewhat easier to follow this approach using predictive coding and document ranking, it can still be done without that search feature. You can try this approach using any review software. It works well in small or medium sized projects with fairly simple issues. For large complex projects we still recommend using the eight-step predictive coding approach as taught in the TarCourse.com.

