My Basic Plan for Document Reviews: The “Bottom Line Driven” Approach – Part Two

This is part two of the series by Ralph Losey explaining his document review strategy. Please read part one first.

“Small” Document Review Example

A few examples should help clarify how this works. If you set a proportional document review cost for a case of $25,000.00, and estimate based on experience, sampling and other hard facts that it will cost you about $1.00 per file for both the automated first-pass and subsequent manual review of documents, then you can review no more than 25,000 documents and stay within budget. It is basically that simple. No higher math is required.
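The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration using the figures given above; the function name `max_reviewable_documents` is my own hypothetical label and does not come from any review software.

```python
from decimal import Decimal

def max_reviewable_documents(budget, cost_per_file):
    """Return how many whole documents fit within a review budget.

    Both arguments are dollar amounts. Decimal is used so that
    currency values are not distorted by floating point rounding.
    """
    budget = Decimal(str(budget))
    cost_per_file = Decimal(str(cost_per_file))
    if cost_per_file <= 0:
        raise ValueError("cost_per_file must be positive")
    # Floor division: a partially funded document cannot be reviewed.
    return int(budget // cost_per_file)

# Figures from the example: $25,000 budget at about $1.00 per file.
print(max_reviewable_documents("25000.00", "1.00"))  # 25000
```

If the estimated cost per file changes, the same calculation shows how far the budget will stretch, which is the number the rest of the review plan has to live within.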

The most difficult part is the legal analysis to determine a budget proportional to the real merits of the case. But that is nothing new. What is the golden mean in litigation expense? How to balance just, with speedy and inexpensive?[14] The ideal proportionality question of perfect balance has preoccupied lawyers for decades. It has also preoccupied scientists, mathematicians, and artists for centuries. Unlike lawyers, they claim to have found an answer, which they call the golden mean or golden ratio (approximately 1.618).

In law this is the perennial Goldilocks question. How much is too much? Too little? Just right? How much is an appropriate spend for document production? The issue is old. I have personally been dealing with this problem for over thirty-three years. What is new is applying that legal analysis to a modern-day high-volume electronic document search and review plan. Unfortunately, unlike art and math, there is no accepted golden ratio in the law, so it has to be recalculated and reargued for each case.[15]

Estimation for bottom line driven review is essentially a method for marshaling evidence to support an undue burden argument under Rule 26(b)(2)(C). It is basically the same thing we have been doing to support motions for protective orders in the paper production world for over sixty years. The only difference is that now the facts are technological, the numbers and varieties of documents are enormous, sometimes astronomical, and the methods of review are very complex and not yet standardized.

Estimate of Projected Costs

The calculation of projected cost per file to review can be quite complicated, and is frequently misunderstood, or is not based on best practices. Still, in essence this cost projection is fairly simple. You basically project how long it will take to do the review and the total cost of the time. To be more accurate you can also include the materials costs, for example, software usage fees, processing costs, and other non-legal vendor charges.[16]
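As a rough sketch of that projection, the cost per file is the cost of the review time plus any materials costs, divided by the number of files. The hourly rate and materials figure below are hypothetical placeholders chosen only to make the arithmetic visible, not numbers from any actual project, and the function name is likewise my own.

```python
def projected_cost_per_file(num_files, files_per_hour, hourly_rate,
                            materials_cost=0.0):
    """Estimate review cost per file from time and materials.

    num_files      -- documents to be reviewed
    files_per_hour -- expected review speed
    hourly_rate    -- cost of one hour of reviewer time, in dollars
    materials_cost -- software fees, processing and other vendor charges
    """
    if num_files <= 0 or files_per_hour <= 0:
        raise ValueError("num_files and files_per_hour must be positive")
    hours = num_files / files_per_hour
    total_cost = hours * hourly_rate + materials_cost
    return total_cost / num_files

# Hypothetical inputs: 25,000 files at 100 files per hour,
# a $60 hourly rate, and $5,000 in vendor charges.
estimate = projected_cost_per_file(25_000, 100, 60.0, materials_cost=5_000.0)
print(f"${estimate:.2f} per file")  # $0.80 per file
```

In practice the review speed and rates would come from experience and sampling on the actual documents, as described above, rather than from assumed values.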

Thus, for example, think of our relatively small project to review 25,000 documents.[17] Your first step in the review is to identify the relevant documents in these 25,000, or put another way, to weed out the irrelevant documents. In most types of lawsuits, less than five percent of a custodian’s emails will be relevant. Sometimes it is less than one percent, or a tenth of a percent. The primary task is to quickly and efficiently identify the rare relevant needles from the large haystacks of irrelevant documents. Once the probable irrelevant documents are removed and the probable relevant documents are identified, the next Protections review stage can begin. As mentioned, documents coded relevant, and only those documents, are then re-reviewed for privilege and confidentiality, and redacted, labeled and logged as necessary. They are often also issue tagged at this stage for the later use and convenience of trial lawyers, or sometimes the issue tagging may be included in the first pass review. Mistakes in first pass relevancy review are also corrected with quality controls built into the reviewer classifications.

First pass relevancy reviews were originally done by having a lawyer actually look at, meaning skim or read, each of the 25,000 documents. Using low paid contract lawyers, this kind of first-pass relevancy review typically goes at a rate of between 50 and 100 files per hour. But by using AI enhanced review software, a skilled search expert, who must also be a subject matter expert (SME) for predictive coding to work well, can attain speeds in excess of 1,000 files per hour (or even much faster than that when larger volumes of documents are involved) for first pass review.[18]
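To put those speeds in perspective, here is a small illustration of the reviewer hours each approach would need for the 25,000 documents in this example. It uses only the rates quoted above; the labels and the helper name `review_hours` are my own.

```python
def review_hours(num_files, files_per_hour):
    """Hours needed to make a first pass through num_files documents."""
    return num_files / files_per_hour

documents = 25_000
rates = {
    "contract lawyers, low end": 50,     # files per hour
    "contract lawyers, high end": 100,
    "AI enhanced SME review": 1_000,
}

for label, rate in rates.items():
    print(f"{label}: {review_hours(documents, rate):,.0f} hours")
# contract lawyers, low end: 500 hours
# contract lawyers, high end: 250 hours
# AI enhanced SME review: 25 hours
```

Hours alone do not settle the cost question, since an SME's time and the software cost more per hour than a contract lawyer's time, but the difference in scale is what makes the comparison worth running.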

Very Small Productions

When you are working with a volume of approximately 25,000 or more documents, as in this example, then the AI enhanced SME type of review is usually a financially viable approach. But with fewer documents than that, or a smaller budget, it may not be. The exact cut off point depends on the costs you incur with your vendor for use of the software, and the costs (and sometimes availability) to use a qualified SME to do the AI enhanced reviews. It also depends on the value of the case.

For very small document productions, but ones that still involve more than a few thousand documents, the linear review of each and every document is still not the solution. Document by document review of thousands of documents by multiple reviewers is a notoriously inefficient, ineffective, and costly approach. Instead, we fall back on other methods for first pass culling that were state of the art just a few years ago before AI enhanced software became available. These easier and less expensive methods employ what I call a multimodal tested keyword approach. The vast majority of the employment law cases I deal with now involve very small productions where we use this alternative non-AI approach.

The methods rely primarily on tested keyword searches. Tested keyword search is described in part by Judge Peck’s well-known wake-up call opinion in Gross Construction, which provides basic advice to lawyers on the proper way to use keyword search.[19] In essence, you do not simply guess what keywords might be effective, but instead you use witness interviews and other factual research and analysis to derive various possible terms. You then test proposed keywords using both judgmental and random sampling. You also use keywords with Boolean combinations and parametric field limitations (applied to various metadata fields).
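One way to picture the random sampling part of keyword testing is the toy sketch below. It is my own simplified illustration, not a procedure prescribed by Gross Construction or by any review platform: the document fields (`custodian`, `text`, `relevant`), the function names, and the sample data are all hypothetical, and the `relevant` flag stands in for a human reviewer's judgment on each sampled document.

```python
import random

def matches(doc, all_of=(), any_of=(), none_of=(), custodians=None):
    """Simple Boolean keyword test with an optional custodian field limit."""
    if custodians is not None and doc["custodian"] not in custodians:
        return False
    words = set(doc["text"].lower().split())
    return (all(term in words for term in all_of)
            and (not any_of or any(term in words for term in any_of))
            and not any(term in words for term in none_of))

def sampled_relevance_rate(docs, is_relevant, sample_size, rng):
    """Fraction of a random sample of docs that is judged relevant."""
    if not docs:
        return 0.0
    sample = rng.sample(docs, min(sample_size, len(docs)))
    return sum(1 for doc in sample if is_relevant(doc)) / len(sample)

def evaluate_keyword(docs, is_relevant, sample_size=50, seed=0, **query):
    """Sample both the hits and the non-hits of a proposed search."""
    rng = random.Random(seed)
    hits = [d for d in docs if matches(d, **query)]
    misses = [d for d in docs if not matches(d, **query)]
    return {
        "hits": len(hits),
        # Share of sampled hits that are actually relevant.
        "precision_estimate": sampled_relevance_rate(hits, is_relevant, sample_size, rng),
        # Share of sampled non-hits that are relevant, i.e. what the search missed.
        "missed_rate_estimate": sampled_relevance_rate(misses, is_relevant, sample_size, rng),
    }

docs = [
    {"custodian": "smith", "text": "overtime pay was denied again", "relevant": True},
    {"custodian": "smith", "text": "lunch on friday", "relevant": False},
    {"custodian": "jones", "text": "overtime schedule for the holiday party", "relevant": False},
    {"custodian": "jones", "text": "they refused to pay my extra hours", "relevant": True},
]

result = evaluate_keyword(docs, lambda d: d["relevant"],
                          sample_size=10, all_of=("overtime",))
print(result)
```

In this toy set the term "overtime" pulls in one irrelevant document and misses one relevant document, which is exactly the kind of result that sends you back to revise the terms before running the search for real.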

Other types of search, including similarity searches, concept searches, and even some linear review of key time periods and custodians, are used to supplement the tested parametric Boolean keyword searches. This is why I call it a multimodal method. This type of search was state of the art just a few years ago. It still has the advantage of being easier to use, requiring less sophisticated/expensive software, and far less SME involvement. But, compared to AI enhanced review it is far less reliable, and when large volumes of documents are involved, far more expensive.

To Be Continued …..

[14] Rule One of the Federal Rules of Civil Procedure requires all other rules to be interpreted to accomplish the just, speedy and inexpensive resolution of all cases.

[15] If the golden ratio were accepted in law as an ideal proportionality, the number is 1.61803399, aka Phi. A whole divided in that ratio splits into parts of roughly 62% and 38%, which would make 38% the perfect proportion. I have argued that when applied to litigation that means the total cost of litigation should never exceed 38% of the amount at issue. In turn, the total cost of discovery should not exceed 38% of the total litigation cost, and the cost of document production should not exceed 38% of the total costs of discovery (as opposed to our current 73% reality). (It’s like Russian nesting dolls that get proportionally smaller.) Thus for a $1 million case you should not spend more than $54,872 for document productions (38% of $1,000,000 is $380,000; 38% of $380,000 is $144,400; 38% of $144,400 is $54,872). See Losey, R., Beware of the ESI-discovery-tail wagging the poor old merits-of-the-dispute dog. I have not yet made this math and art golden ratio argument in court, but, who knows, it just might work with the right judge. To me the proportions seem reasonable.

[16] For a full picture of total e-discovery spend you should include other e-discovery costs unrelated to review, such as vendor and attorney costs related to preservation and collection. Recall the Rand survey found these non-review related expenses to average about 27% of the total e-discovery costs.

[17] Note it probably started as 100,000 documents collected for preservation and possible review, but you bulk-culled it down to 25,000 documents for review (Culling is step six in the EDBP). Bulk culling is performed by making such legal decisions as custodian ranking (for instance, a decision to review only the top five most important custodians, even though the email of ten custodians was collected), date ranges, and file types (for instance, exclude all music and photo file types). See http://www.edbp.com/search-review/bulk-culling/.

[18] For more on how the predictive coding review process works, including the importance of qualified SMEs to train the machine, a process called active machine learning, see my introduction to the subject at http://e-discoveryteam.com/car/ and http://www.edbp.com/search-review/computer-assisted-review/ and other articles cited therein.

[19] William A. Gross Construction Associates, Inc. v. American Manufacturers Mutual Insurance Co., 256 F.R.D. 134, 136 (S.D.N.Y. 2009). Also see: Losey, R. Child’s Game of “Go Fish” is a Poor Model for e-Discovery Search found at http://e-discoveryteam.com/2009/10/04/childs-game-of-go-fish-is-a-poor-model-for-e-discovery-search/.