The Trials and Tribble-ations of the Data Deluge

March 13, 2010

The cover article in this week’s The Economist magazine is The data deluge, featuring a 14-page special report on information management entitled Data, data, everywhere. This article, and our collective situation of data overload, reminds me of one of my favorite Star Trek episodes, The Trouble With Tribbles. The cute little ESI files that everyone owns are now growing out of control and as a result our institutions are imperiled. There is great good that can come out of all this data, as the article points out, but a whole lot of tribble too. (Sorry.)

The Economist article talks about the insights that can be teased out of the data deluge, while at the same time pointing out that no one really knows how to manage it. Information management is a key part of the e-discovery world. In fact, it is the first step in the nine-fold Electronic Discovery Reference Model. We all know that ESI should be better managed, but it seems nigh impossible to do right. For some, it is a mess because they don’t even try. But for others, especially large organizations, it is beyond their capacity no matter how hard they try, primarily because of the information explosion. How can you manage something that multiplies faster than you can count? How do you find something that moves and morphs into something new when you are not looking? One day it’s a Word doc, the next a wiki, the next a tweet.

Data Are Like Tribbles

For most people data are like tribbles – cheap, easy to come by, cute and endearing, at least at first, but then after a while, they can become a real nightmare. You may think that you have your ESI managed, and that you can find certain tribbles if you need to, but you are probably just kidding yourself. Chances are they are growing faster than you can imagine and anyway they all tend to look alike, even when they aren’t. (Even Jason Baron and I can’t imagine how fast the data universe is growing and we’ve both spent way too much time thinking about it. See: e-Discovery: Did You Know?)

Yes, information is our generation’s tribble. As Dr. McCoy said, “they are born pregnant.” One little email can become dozens before you know it. Also, like the tribbles in Star Trek who somehow squeezed into the overhead bins, ESI nowadays is migrating into unlikely places, such as employee phones, music players, home media centers, even the clouds. If the law demands that you preserve and produce these pesky ESI tribbles in a court proceeding, the chances are high that you won’t catch them all. It’s also likely to cost too much to even try. That’s the trouble with tribbles: they seem so innocuous, just like data, but before you know it, they multiply out of control and destroy your ship. No wonder the Klingons hate them.

New Kind of Professional

The Economist article notes that a new kind of professional is emerging, one very important to business, and I suggest also to law, the data scientist. This insight supports my position that metrics, including multimodal search, is one of the three pillars of e-discovery best practices.

Chief information officers (CIOs) have become somewhat more prominent in the executive suite, and a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.

Substitute relevant evidence for wisdom and you have the challenge of e-discovery in a nutshell. If you can actually find the right tribble, the needle-of-relevance in the haystack-of-noise, then, according to the author of The Economist article, Kenneth Cukier, you would be a corporate sex-icon. Personally, I have my doubts. For those of us in the world of e-discovery, that is just another day in the office. Still, this is the sweet-spot in discovery today – the ultimate challenge: “to extract the nuggets of gold hidden under mountains of data.” Those who can do this, who can find the most smoking guns for the least amount of money, are the critical players for the 98% of lawsuits that never go to trial. The Perry Mason jury trial lawyer types are good for the other 2%, but even they are dependent on the smoking gun documents for their gotcha cross-exams.

This point is not lost on scientists who generate huge amounts of information on a daily basis. For instance, according to The Economist, the latest Large Synoptic Survey Telescope set to open in Chile in 2016 is designed to generate 28 terabytes of data a day. Facts like this cause Alex Szalay, an astrophysicist at Johns Hopkins University, to observe what the law has already noted and embodied in Rule 26(b)(2)(B), namely that the proliferation of data is making them increasingly inaccessible. Szalay goes on to conclude, as do I, that education is key and must change to better train the next generation:

How to make sense of all these data? People should be worried about how we train the next generation, not just of scientists, but people in government and industry.

The Economist also quotes T. S. Eliot, who in 1934 wrote in his poem The Rock:

Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?

Where indeed? That is the golden question today in e-discovery. Where is the highly relevant evidence and how do we find it? In my view the answer lies in a multimodal Where’s Waldo? approach, one that is bottom line driven based on cost and Rule 26(b)(2)(C). But that’s another story.

The Positive Side of Big Data

The article focuses on both the good and bad sides of the deluge. The positive side is often overlooked:

… the world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account.

The Economist website also includes an audio interview with Kenneth Cukier. He thinks that Big Data, as he calls it, is just starting and is not really here yet. When it comes, he predicts it will change everything. To which I add “again.” The introduction of the interview (after a short commercial) summarizes the Economist article:

We will have more data than ever before, we will be able to tease out new insights with it that we could not ever do before, but it is going to create huge new headaches because we simply don’t know how to handle it.

The positive aspects of the data deluge, the ability to tease out new insights, are often forgotten in the legal arena. We tend to focus on the headaches. The data deluge allows the law to have new insights too, not just the headaches of simply not knowing how to handle all the tribbles.

The Economist article discusses some of the many ways that business innovators are teasing out the insights. Google seems to be the corporate leader. The positive side in the legal profession is that lawyers are teasing out evidence that would otherwise never be found. In fact, the evidence would not even exist. The smoking gun emails found in many controversies today are a prime example.

In the past paper-world the private, informal, wish-I-hadn’t-said-that types of communications were never recorded. They were just sound waves or telephone calls. They were admissions that never became writings. They were near impossible to discover to impeach recalcitrant memories. They may have been suspected, but were easy to deny and near impossible to prove. The world has become far more literate and writing oriented in the last thirty years. People write more today than in the past, and write more informally and with less thought. According to The Economist:

The amount of reading people do, previously in decline because of television, has almost tripled since 1980, thanks to all that text on the Internet.

The wish-I-hadn’t-said-that types of e-communications are some of the new insights that lawyers can tease out of Big Data. They are insights into previously secret, private communications. This is discovery into true intentions. The new insights of e-discovery reveal what people were really saying and doing in the past by mining their emails, IMs, text messages, Facebooks, twitters, etc.

Conclusion

Electronic records create a history that never existed before. The relevant histories can sometimes be very hard to find because there are so many of them. But for most moderns today, the histories are there for Googlesque lawyers to find. If you know where and how to look, it is really no tribble at all.


Video Supplement to the LegalTech Debate on Keyword Search v. Multimodal Search

March 7, 2010
Keyword Search v. Multimodal

This three-minute video is designed for viewing with a high-speed connection in high definition (HD), full screen mode. This video supplements last week’s blog on the keyword search debate at LegalTech. It is a video excerpt of a law school class where I respond to a question by co-professor William Hamilton.

In Bill’s question, not included on the video, he changed the fact scenario created by Jason R. Baron for LegalTech to take the asymmetric ethics issue out of the picture. He asked me to assume the requesting party did not specify a list of keywords, and that the producing party did not know of any special “AvatarApp” keyword. Instead, he asked me to assume a very typical scenario in legal practice today where both sides just meet and discuss what search terms to use. Bill then asked me how my argument to general counsel would change. He asked how I would respond to a suggestion by the general counsel that good keywords be used, better than the initial list proposed by the requesting party, instead of a multimodal approach.

Recall from last week’s blog that this is the argument at LegalTech where I try to persuade Orange Corp. to retain my firm, over Jason’s hypothetical firm, to handle the e-discovery work in defense of a large case. My firm argued for using a variety of search techniques, not just keyword search, in what I called multimodal search. Jason advanced the contrary position, one which he does not personally endorse. He proposed that the prospective client follow current custom, and most case law, and simply negotiate and use keyword searches. He argued against spending time and money on other alternative approaches, such as concept search and other expert techniques.

Bill’s variance of the factual scenario takes out the ethical issue and thus brings the conflict between keyword search and multimodal into clear focus. The video concludes with my ad-libbed two-minute pitch for multimodal, which is about all of the time you would be given to present a novel idea to the general counsel of a large corporation, assuming they would agree to see you at all. In today’s environment, with runaway e-discovery costs and a severe economic recession, more in-house counsel will likely take the time to listen to novel cost saving suggestions. Many are likely to conclude, as I did in this scenario, that it is well worth spending a little more money up front to save a lot of money overall.


The Multi-Modal “Where’s Waldo?” Approach to Search and My Mock Debate with Jason Baron

February 27, 2010

At LegalTech New York a few weeks ago, Jason R. Baron and I staged a debate over search strategies and cooperation ethics. As you probably already know, Jason is the Director of Litigation for the National Archives and Records Administration and Co-Chair of The Sedona Conference Working Group on Electronic Document Retention and Production. Jason came up with a factual hypothetical for the debate involving a suit by the 3-D Start Up Corp. against Orange Corp. for copyright infringement, theft of trade secrets and breach of contract. I added an additional hypothetical of the metrics of the search qualities and projected costs. Under the hypothetical Jason and I were in charge of the e-discovery departments of competing law firms. Our mock argument was in the form of presentations to the general counsel of Orange to try to persuade her to retain our respective law firm, and not the other. By the way, can you find the lawyer Waldo in the picture?

Our firms had very different strategies for search and retrieval. Jason’s firm proposed the traditional keyword search approach. My firm proposed a more diverse and innovative approach, which Jason characterized as unproven and vulnerable to attack. My proposal, which I called multi-modal, used keyword searches along with a variety of other judgmental and concept type search methods.

Although the competing firms agreed on the desirability of strategic cooperation, they had sharply different proposals on how to cooperate. The different approaches raise ethical issues that, in real life, both Jason and I feel are important.

Jeane A. Thomas, Partner and Chair of the e-Discovery Information Management Group of Crowell & Moring played the role of the General Counsel of Orange listening to and questioning our alternative proposals. Judge Paul Grimm played the role of a judge commenting on the differing proposals for legal services.

Pro-Keyword Gotcha Type Cooperation Argument

Jason argued that if his law firm were selected to represent Orange, he would cooperate with the plaintiff by agreeing to their demands to use the 150 keywords they specified for the search and retrieval of ESI with only minor amendments. Jason noted that although the search terms proposed were numerous and broad, it would still be advantageous to the client to accept this search protocol. He recommended that Orange cooperate with the plaintiff on this point because their keywords did not include the secret code-name that some key Orange employees had used for the project: “AvatarApp.”

Jason noted, and I had to agree, that our mutual review of samples of Orange’s ESI made in advance of the meeting showed that the ESI containing the keyword “AvatarApp” was frequently detrimental to the defense of Orange, although not so severely as to be a smoking gun. Jason suggested that Orange agree to the plaintiff’s search demand in order to avoid producing ESI that would harm its defense. Jason argued that Orange should retain his firm, rather than mine, so that Orange could appear to be cooperative and at the same time avoid production of harmful information. This was an appealing type of “have your cake and eat it too” argument designed to get business for his law firm.

Jason’s argument at LegalTech for gotcha type cooperation is quite a stretch from his real beliefs. This was all just role playing for educational purposes. Still, it is true that Jason is a very competitive fellow, supposedly banned from family Monopoly games, and prone to one-upmanship in all areas, including cooperation. I for one would certainly not want to compete against Jason Baron in real life for e-discovery search services.

Jason created this hypothetical to follow up on the ethical problem of asymmetric knowledge, an issue that he first raised at the Mercer Ethics Conference in late 2008 (see page 866). His factual scenario for NY LegalTech continues and refines the theme of asymmetric knowledge, adding issues of real versus feigned cooperation. The scenario of a requesting party seeking to dictate the search terms is somewhat extreme, but not unheard of. Some will propose terms as part of the request, but actively seek input from the producing party. This hypothetical shows what can happen when the requesting party does not seek to make the responding party a partner in the request and place some, if not all, of the burden on the producing party to devise a reasonable search.

The ethical issues raised by the hypothetical implicate the professional duties of competence (ABA Model Rule of Professional Conduct 1.1), diligence (Rule 1.3), expediting litigation (Rule 3.2), candor (Rule 3.3), and fairness (Rule 3.4). Jason and I will further examine these e-discovery ethics issues at a symposium in Chicago at Northern Illinois University on April 16, 2010. Judge John Facciola and William Hamilton will also participate in the panel discussion.

Alternative Where’s Waldo? Multi-Modal Argument

While Jason got stuck at LegalTech arguing the old-line positions and wearing the black hat, I got to be the good guy. I proposed a truly cooperative approach using more advanced search techniques, which I called the “Where’s Waldo?” Multi-Modal approach. I proposed that Orange use both keyword and other alternative search methods, including what is often called concept-search methods. I call that approach multi-modal because its essence is to use a variety of search methods, and not just rely on keyword search alone, or concept search alone either. I call it a “Where’s Waldo?” type of multi-modal approach because I proposed that Orange control the search of its own ESI, and not allow the requesting party to dictate the search, and that Orange conduct the search in an impartial and transparent manner. This means they search to try to find the ball, not hide it, and look for all relevant ESI without regard to whether it is positive or negative. See my prior article Child’s Game of “Go Fish” is a Poor Model for e-Discovery Search where I explain the inherent defects in keyword search as it is now conducted by most law firms and the advantages of my proposed alternative Where’s Waldo? approach.

I argued that this controlled, full disclosure Where’s Waldo? type of approach was advantageous to Orange, even in this situation where the requesting party did not know of, and thus did not propose, use of the key keyword “AvatarApp.” I proposed to use the AvatarApp term in our search, and also use other language and patterns that we knew about, and the requestor did not, since it was our data and we could look at all of it and they could not. I tried to persuade the general counsel that it was in the best interests of Orange to do this, even though I had to concede that it would uncover and lead to the production of a horde of otherwise hidden, negative ESI.

Consistent with the Where’s Waldo? method, I suggested that we make full disclosure to the requesting party of our search methods, including the previously secret AvatarApp slang word, and we demonstrate how and why our search protocol was not only reasonable, but superior to theirs. In short, I proposed that we make our best efforts under budget constraints to find as much relevant evidence as possible, be it good, bad or indifferent. I suggested that it was a waste of time and money, and also invalid cooperation of dubious ethics, to slant the search so as to hide unfavorable ESI from the other side (except of course for privileged ESI). I questioned both the ethics and efficacy of Jason’s approach to accept the plaintiff’s uninformed, Go Fish guess-based keyword list as a way to hide unfavorable ESI. I tried to show that it was a flawed approach, certain only to waste money, and very unlikely to succeed in its dubious hide-the-ball goal.

Scientific Research Supports Multi-Modal

My idea of multi-modal is to create a recipe of search methods appropriate for the particular project. One project may rely heavily on keyword boolean search, with just a few alternatives, or maybe none at all. Another may rely heavily on linguistic analysis, or on new types of software, or other creative approaches. I had the pleasure of citing TREC Legal Track research against Jason to support my argument that a multi-modal approach would be much more effective than simple keyword search alone. See: Jason Baron on Search – How Do You Find Anything When You Have a Billion Emails? TREC shows that a variety of approaches works best, and that for some projects and issues boolean keyword search alone is very effective, but for others it fails miserably, and only alternative concept type searches will work. This is shown in the TREC chart below summarizing findings:

The TREC Legal Track research also shows that keyword search alone (by which I always mean boolean type keyword search that uses connector logic) accounts for only 22% of the relevant documents found in a seven million document database. The alternative search methods employed found the other 78% of the relevant documents. This supports my multi-modal argument that the most effective search method for a particular project will often require concept and other search methods to supplement keyword search. A master carpenter uses a number of different tools for most projects and does not rely on his hammer alone.

Metric Analysis of Projected Search and Review Costs

I also argued that my multi-modal approach would save the client money, lots of money, and uncover more relevant documents in the process. I showed by sampling that use of the 150 keywords proposed by the requesting party would produce far too many documents to review, approximately one million computer files, most of which would be irrelevant. My more precise multi-modal approach would, I contended, generate only 500,000 files, 50% fewer than the 150 keyword approach. Jason’s firm challenged my sampling and projections, but still had to concede that my methods would likely generate only 750,000 files, 25% fewer than the 150 keyword approach. I had to concede that my multi-modal search, using both keyword and various concept searches, along with iterative sampling, would cost more to set up and run than the simple keyword search. We agreed that the use of a multi-modal approach in this situation, with tens of millions of files to search and a large number of custodians, would cost $125,000 to perform, whereas the simple keyword search would cost only $25,000.

In spite of the $100,000 higher initial search costs, the multi-modal approach comes out ahead, because the imprecision inherent in the 150 guessed keyword approach generates too many false positives. The one million files generated would drive up the costs of final review and production far more than the initial savings. In fact, my metrics, which the competing law firm could not rebut, showed a total savings of from $550,000 to $1,200,000 by using the more precise multi-modal approach. This savings naturally flows from the fact that the greatest costs in e-discovery are in review and my approach resulted in 250,000 to 500,000 fewer files to review. For a full analysis of the costs of review metrics see my previously referenced metrics hypothetical. For an overview see the video below of my three-slide presentation at LegalTech.
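
The arithmetic behind these figures can be sketched in a few lines of Python. The per-file review cost used here, $2.60, is not stated in this post; it is an assumption inferred from the numbers above, being the rate at which the stated search costs and file counts reproduce both ends of the stated savings range. The function and variable names are hypothetical.

```python
# Rough cost model for the Orange Corp. hypothetical.
# ASSUMPTION: the review cost per file is inferred from the stated figures,
# not quoted from the presentation. Stored in cents to keep the math exact.
REVIEW_COST_CENTS_PER_FILE = 260  # i.e. $2.60 per file (inferred)


def total_cost(search_cost, files_to_review):
    """Search cost plus the cost of reviewing every file the search returns (dollars)."""
    return search_cost + files_to_review * REVIEW_COST_CENTS_PER_FILE // 100


# Figures stated in the post.
keyword = total_cost(25_000, 1_000_000)               # 150 guessed keywords
multimodal_claimed = total_cost(125_000, 500_000)     # my projection
multimodal_conceded = total_cost(125_000, 750_000)    # the competing firm's concession

savings_high = keyword - multimodal_claimed
savings_low = keyword - multimodal_conceded

print(f"Savings range: ${savings_low:,} to ${savings_high:,}")
```

Under that assumed rate, the extra $100,000 spent on search is recovered once the multi-modal approach removes roughly 38,500 files from the review set, which is why even the conceded 750,000-file estimate still shows a large net savings.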

GoFish v. Where's Waldo - Metric Analysis of Review Costs

The gotcha pseudo-cooperation approach of using the requesting party’s 150 guessed keywords might, and I emphasize might, succeed in hiding some of the bad documents. But my sampling and projected estimates of the cost to review showed that any such advantage would come at too high a cost. The metrics showed that the informed multi-modal approach would save Orange over a million dollars in e-discovery review costs. My presentation to get this case appealed to both the client’s sense of ethics and pecuniary interests. The prospect of doing the right thing, and saving a million dollars, makes for a compelling argument, although the role-playing client here, Jeane Thomas, never told us her decision.

She did, however, ask both of our firms to tell her what the savings would be to Orange to limit the final human review to a privilege review only. In this scenario Orange would use a kind of Quick Peek agreement, strengthened by a Rule 502 Order and an attorneys’ eyes only Confidentiality Order. They would not do a relevancy review, nor confidentiality review and redaction. The privilege review itself would be partially automated with confirmation by expert human reviewers. Some privileged documents would certainly be produced, but Orange would be protected from waiver by court orders and agreement. The amount of money saved by drastically reducing human review in this way was staggering. The last slide in the above movie sets this out. By limiting the expense of human reviewers to partial privilege review and logging, you save $2,000,000 under the keyword approach, and from $1,000,000 to $1,500,000 using multi-modal. The multi-modal approach was still overall less expensive than keyword, saving between $150,000 and $400,000.
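
As a quick check on the privilege-only figures, the short Python sketch below divides each stated savings amount by the number of files in the corresponding review set. Pairing the $1,000,000 and $1,500,000 figures with the 500,000 and 750,000 file estimates from the earlier cost discussion is an assumption on my part, and the variable names are hypothetical.

```python
# Savings from limiting human review to privilege review only, per the post.
# ASSUMPTION: each savings figure is paired with the file count projected
# for that search approach earlier in the hypothetical.
scenarios = {
    "keyword, 1,000,000 files": (2_000_000, 1_000_000),
    "multi-modal, 500,000 files": (1_000_000, 500_000),
    "multi-modal, 750,000 files": (1_500_000, 750_000),
}

# Dollars saved per file that no longer gets a full human review.
saved_per_file = {
    name: saved / files for name, (saved, files) in scenarios.items()
}

for name, rate in saved_per_file.items():
    print(f"{name}: ${rate:.2f} saved per file")
```

On that reading, every scenario works out to the same per-file savings, which suggests the savings scale directly with the size of the review set rather than with the search method used.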

My appeal to metrics and cost analysis to counter Jason’s hide-the-ball arguments was, like the citation to TREC Legal Track, using Jason’s own petard against him. He was a good sport to set himself up in that way. Jason is the Editor in Chief of one of the lead articles on metrics, The Sedona Conference Commentary on Achieving Quality in the E-Discovery Process (2009). See my prior article on this important work: Sedona on Quality: a Must-Read Commentary. Again, let me stress that Jason was just arguing a position here, and the mock argument obviously does not reflect his personal views, which are certainly not of the hide-the-ball variety. Indeed, the next time we do this particular educational skit, I may lose the coin toss. Then I will have to argue for keyword search and feigned cooperation and Jason will wear the white hat and argue for innovative search and bona fide cooperation.

Strategic Cooperation

In our mock argument Jason told the general counsel of Orange that his approach would appear to be very cooperative, since it involved acceptance of the plaintiff’s search strategy, whereas mine would not. Jason argued that the side that appears to be most cooperative will have a strategic advantage in the case, especially with the supervising judge. I agreed with the latter point, but disagreed with the rest. I argued that the approach recommended by Jason’s firm would not fool anyone for long, including the judge. My firm’s approach to cooperation was genuine. Although the refusal to accept the other side’s 150 guessed keywords might appear uncooperative at first, over time it would become obvious that it was driven by a desire for true cooperation. It was driven by the desire to get at as much of the truth as possible under the constraints of time and money placed by this case. My approach would fulfill the ethical duties of candor to the court and fairness to the opposing party and counsel. We would do so by voluntarily disclosing the secret AvatarApp word to the other side at the initial 26(f) conference. This would also fulfill our ethical duty under Rule 3.2 to expedite litigation, not to mention the prime directive of the Federal Rules of Civil Procedure, Rule 1, which calls for the just, speedy and inexpensive resolution of every case.

I argued that the deception of only using the requesting party’s keywords, which did not include the all important AvatarApp keyword, would eventually be uncovered. Some of the documents using the code name AvatarApp would likely turn up because they also contained one or more of the plaintiff’s keywords. The plaintiff and its attorneys would then know that they had been had. They would discover that the supposed cooperation of the defense was all along nothing but a trap to allow the plaintiff and its legal counsel to be hoisted by their own petards. In this case the petard, the grenade, was the uninformed arrogance of plaintiff’s counsel to think they knew enough to dictate search terms. Some might think, serves them right, and indeed it would. But what is the result of this feigned cooperation, this clever discovery gamesmanship? How does plaintiff’s counsel then react? They counter-attack, and this time with great fury.

After a maneuver like that, there is a complete lack of trust and a flurry of expensive motion practice ensues. The plaintiff would argue bad faith and false cooperation on the part of Orange and its counsel. They would argue fraudulent concealment, that Orange had a duty to disclose the keyword AvatarApp, but did not. They would move for sanctions and to compel another search using the AvatarApp keyword. They would demand a do-over at Orange’s expense. They might persuade the judge and win on some points. The court could order an expensive do-over. Sanctions might even be imposed.

Jason’s firm objected to this part of my argument, and asserted that the plaintiffs would again lose this battle, since we merely followed their demands. I conceded that a do-over was by no means certain. There is authority and logic behind applying the doctrine of estoppel against the plaintiff in this situation. Also, the doctrine of invited error could apply. The plaintiff got what it asked for and then stipulated to. Jason’s clever hide the ball petard trap strategy could succeed, depending on the quality of opposing counsel and the judge. On the other hand, a judge ruling on this issue could well make the producing party, Orange, pay for at least some of the cost of a second search and production. In any event, the litigation costs over the issue would certainly be expensive. I argued that this tactic was ethically questionable, unlikely to prevent the discovery of harmful evidence, and very likely to inflate litigation costs.

How Will the Courts and Clients React?

When we were finished with the competing proposals and arguments, it was then Judge Grimm’s turn to comment. Personally, I had hoped he would indicate how he would rule in such a scenario. Judge Grimm was, however, cautious. He did not indicate how he would rule on this sticky issue of a do-over and who should pay for it. Still, having read Judge Grimm’s many legal opinions and writings over the years, I am confident that if this scenario were presented to him in his courtroom, he would rule against the producing party that snookered the requesting party.

Judge Grimm understands the limits of keyword searches and the ethical duties of cooperation, competence, diligence, expeditiousness, fairness and candor. The feigned cooperation maneuver that Jason’s hypothetical raises is not likely to succeed in front of a sophisticated judge who is knowledgeable in the ways of e-discovery. Still, there is a dearth of e-discovery expert judges, and no legal authority on this issue at this time, nor is such authority likely to come soon. For these reasons, the argument of feigned strategic cooperation will be quite tempting to many attorneys for many years to come.

It will be especially tempting to litigants trying to decide which law firm to retain to help win an emotional or high stakes “bet-the-company” type of law suit. That is why I found Jason’s hypothetical and mock-debate to be especially interesting and important to the profession. Let us hope that litigants will have the opportunity to hear both sides of the argument and the courage and financial sense to make the right decision. I am concerned that only the tough-guy voices of feigned cooperation will be heard, at least at first. I am concerned that clients will not hear the more restrained voices of cooperation, and even if they do, they will not understand the financial savings that these innovative approaches can make possible.

Rulings by our leading judges should help get this message across to litigants so that the quiet voices of reason by competing outside counsel can be heard. Strong opinions on sanctions motions can, and I am confident will, send clear messages to litigants and their attorneys. These rulings already have and will continue to encourage real cooperation and new and improved search and production methods.

Conclusion

The legal profession in the United States is now preoccupied with playing electronic Go Fish-like games of keyword guessing. We can continue this business as usual, and we can continue to over-review and over-produce unwanted mountains of data. We can ignore the cooperative Where’s Waldo multi-modal approaches. But should we? Can our clients, and society as a whole, afford to continue the old ways of hide-the-ball discovery gamesmanship? I think not. It is too expensive and it’s morally bankrupt.

We should not give up the Twentieth Century American tradition of discovery, especially e-discovery, as some contend. We should, however, change our attitudes and move to a cooperative model of discovery. When it comes to e-discovery in particular, we should move to a smarter, more high-tech oriented model. Keyword search is so last century. The scientific research is in and it shows that keyword search alone usually does not work. It just produces inefficient searches and giant haystacks of irrelevant data that are incredibly expensive to review.

The research shows that we have to employ new, alternative methods that vary according to the needs of the particular case. We should embrace concept search inclusive, multi-modal, Where’s Waldo approaches to e-discovery. We should not just walk away from e-discovery entirely, as most attorneys today are still doing. Electronic discovery is too expensive now, and something to be avoided, because it is usually carried out under the old paper discovery model of gamesmanship and “any and all” productions. The truth is, most litigation attorneys lack the technical competence and attitude needed for e-discovery.

Electronic discovery is too expensive at this point because the profession has been unable to change its ways fast enough to keep up with the mind-boggling advances of technology. See, e.g., e-Discovery: Did You Know? We have to pick up the pace and become comfortable with the new technologies. We have to understand that in today’s world of terabytes and exabytes of ESI, no one can afford the whole truth. We have to rein in document reviews with proportionality. That has to start with smarter search that generates smaller document sets. The multi-modal cooperative Waldo approach that I argued at LegalTech is one way to get there.

I do not contend that it is the only way to get there. You can, for instance, have a multi-modal approach that is not also Where’s Waldo. You could, for instance, allow the requester to be an equal partner in designing search protocols. You could engage in an iterative series of negotiated multi-modal sessions where the parties meet to come up with the best search recipe for the case. This would, however, necessarily entail the transparency aspects of the Where’s Waldo approach. The producing party would test and report back in order to make these meetings meaningful. A series of Go Fish guessing games, with no tests and sampling in between would, I contend, be a big waste of time. Even with tests, sampling and transparency, the TREC research shows rapidly diminishing returns after the first two meetings.

It seems obvious to me that the best role of the requesting party is to specify what they are looking for: what does the Waldo they want look like? The Where’s Waldo driven counsel meetings would focus on the requester explaining what they are looking for, narrowing the request and making it more specific. The search design would be controlled by the producing party. The requesting party is in no position to design a search of data that they have not seen, and can never see. More than one meeting may still be necessary under Waldo, as the producing party will need to report back, explain the search they have used and why, and hopefully get buy-in from the requester. The requesting party may even have some good search suggestions from time to time, and I am not saying they should not be heard by the producing party. I am just saying they should not dictate or control the search.

New search methods and cooperative attitudes are the best way out of the e-discovery morass we are now in, not rewriting the rules once again. The rules are pretty much fine as they are (although I would make a Rule 16(b) hearing mandatory). We do not need to abandon discovery or dramatically change the rules of the game. We need to improve our game skills and attitude. We need to think different and to cooperate. We need to channel our adversarial skills and arguments to the meaning of the law and the facts, not the hiding of facts. The desire of many trial lawyers today to control the facts, and rewrite history, so that they can win a case is misguided. This is what is ruining litigation today, not discovery or e-discovery per se.

Electronic discovery is overly expensive today because it is driven by this type of misplaced adversarial attitude, compounded by a lack of competence and over-reliance on vendors. Vendors have their place, and are often a key part of a good e-discovery team, but they are not lawyers and should never be in charge of e-discovery. Most of them profit from keyword search models of over-retrieval and review. So too do many law firms with their armies of reviewers.

The over-review models that dominate e-discovery today are doomed. The future belongs instead to a cooperative, ambidextrous, concept laden fellow named Waldo. The problem is, at this still early stage of the game, he can be awfully hard to find.


Ten Minute Attention Rule

February 23, 2010

Many experienced presenters and educators have found that people tend to get bored after ten minutes of listening to the same thing. The brain seems to be hard-wired to receive new stimulus after that time and it is hard for most people to focus their concentration longer than that. This is especially true for arcane and difficult subjects like e-discovery. Here is a short excerpt from my law school class at the University of Florida last week where I discuss this ten minute rule. Be sure to set the HD (high definition) button to “On” in the upper right corner of the video and view in full screen mode by clicking on the arrows in the lower right corner. Don’t worry, it’s less than five minutes long.

10 Minute Attention Rule lecture at U.F.

I am unsure of the science behind the ten minute attention rule. It is promoted by Dr. John Medina, who is a molecular biologist and director of the Brain Center for Applied Learning Research. Although I have not seen the full data to support the science, I am convinced of the efficacy of the rule. After years of talking too long (and too fast), I now try to follow this rule whenever I make presentations. So too does my generation’s true master of presentations, Steve Jobs. Our best political speakers do also.

As a “reformed trial lawyer,” as Craig Ball likes to say, I am always focused on the last step in the EDRM model. I have spent my whole career preparing for trials and thinking about the presentation of evidence. This is, after all, the whole point of e-discovery. So any course on e-discovery should include some of the key rules of the ninth step – presentation.

The rules of evidence are what usually come to mind in considering the ninth step, and they are important to be sure; see for example my prior blog on George Paul’s great book, Foundations of Digital Evidence. But the psychological rules of presentations must also be learned. They are important to anyone who tries to persuade or to teach. So too is the related ability to explain a complex subject within a set, usually short, period of time. That is why I assign students in my seminar on advanced e-discovery the task of making a five-minute verbal presentation of the articles they are working on. The exercise forces you to think about the key features of your paper and how to summarize your ideas. Practicing attorneys do this kind of exercise too. It helps to refine your positions and arguments.

I am very interested in online education and, as I have written about before, there is a right way and wrong way to go about doing online instruction. Online programs that simply play a one-hour video on a web page, which is 90% of what passes for online higher education today, do not begin to utilize the full potential of the new media. Such productions also ignore the latest thinking in learning research. They fail to take advantage of the unique features of the world wide web and hyperlinked writings and they violate the ten minute rule. I join in the criticism of such low level productions.

Videos are an important part of online education, but they should be short and enhanced. My quickly made, low-budget video for this blog demonstrates a little of what I mean by enhanced. It does not take that much time and effort to add a few interest-enhancing special effects to a video. What I have done here is very basic.

A good online video should also be integrated with hyper-linked text that puts it in context and links it with other segments of the web. This blog again demonstrates this feature by including links with background and further reading related to the ideas presented in the video. There are many other features that make for a good online training program, primary among them interactivity exercises and mentorships.

Simple podcasts and videos that dominate online education today miss all of these key ingredients to good programming. Yet research shows that even these primitive online learning modules are just as effective as current face-to-face classroom instruction. See: Why Online Education Will Surpass Traditional Face-to-Face Education in the Next 5-10 Years. Just think how learning could accelerate with truly advanced online models. For this reason, I predict that in five to ten years almost all CLEs and other continuing education programs will be online. The face-to-face CLEs that remain will be for relationship building and dialogue, not instruction per se.

Any thoughts or comments you may care to share with us about the ten minute rule? Online education? CLEs? Videos? Or the ninth step in general? Please share them with us by leaving a comment below. Keep it short for obvious reasons.


One Minute Summary of My Three Keys to e-Discovery

February 20, 2010

In my CLEs around the country I usually mention at some point my three keys to successful e-discovery. They are the top three things to do in order to do e-discovery right. All involve fundamental changes to the way most lawyers practice. For that reason, these suggestions seem novel and difficult, if not impossible, to most lawyers today. But in the future I am confident that they will seem commonplace, even obvious. They may seem like that to you already because, as one of my readers, you are probably in the vanguard of e-discovery already. Certainly most experts that I have talked to already embrace these three suggestions as part of their tool kit, even if they do not pick them as their top three. Still, should you disagree, feel free to say so in the comments section at the end.

These three fundamental changes to legal practice must be made if we are to continue to resolve disputes based on the facts, primary among them the writings of the parties. We can no longer practice law with paper documents and paper mentality the way Abraham Lincoln did. The world no longer works that way. It is all digital now. The legal profession must change with society and embrace technology, not run from it. If you do not change your practice to do at least these three things, then e-discovery will be an expensive morass. Put another way, if your company or law firm already finds electronic discovery to be a problem, and most do, then these three new activities provide a way out. They are the keys to your successful, economic deployment of e-discovery.

This video provides a very short digest of this message and, as we all know, sometimes less is more. By the way, if I had to add a fourth step, it would be education. But that should already be obvious to my usual readers and certainly was to the University of Florida law students in this class to whom I was talking.

Threefold Solution

Click on the four arrows in the bottom right corner of the video for full screen HD view.

Please leave me a comment below that is in any way related, as I am interested in what all of my readers have to say. If it is an off-topic comment, feel free to send me a private email at ralph.losey@gmail.com.



Baron and Losey Video Now on YouTube

February 11, 2010

My movie with Jason R. Baron is now on YouTube. It is called e-Discovery: Did You Know? It is best seen in high definition using the full screen view. Hope you like it.

Please feel free to add this video to your own website and share it with your friends and colleagues. Jason and I also love to hear all comments.

For more information on the video, see my prior blog where I first published the video on WordPress. Also see the blog with my interview on ESI Bytes.


Why Online Education Will Surpass Traditional Face-to-Face Education in the Next 5-10 Years

February 7, 2010

Most online education today is poorly designed and implemented. It merely places a boring classroom lecture into an online video, changing a synchronous three-dimensional realtime experience into an asynchronous two-dimensional one. It does so without creativity and without harnessing the full power of online technologies. It does not begin to fully use the hyper-linked, multi-media, interactive, community sharing elements of the world-wide web. Still, even in these early primitive forms, current research shows that online education is already more effective than traditional face-to-face instruction!

Although today’s students get this already, many in the academic community find it hard to believe. That is one reason the U.S. Department of Education published a report recently based on a survey of 1,000 studies of online learning conducted between 1996 and 2008. See: Evaluation of Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies. The consensus of research shows that online instruction is better than traditional bricks and mortar instruction for today’s plugged-in students. Also see the NY Times article Study Finds That Online Education Beats the Classroom.

It is time for teachers and administrators to wake up and accept this fact of life. That includes our Universities, our law schools, our programs for continuing professional education, and our educational publishers. Those who change and go with the times will prosper; those who do not will go the way of the newspapers. For law schools that means their income and rankings will decline, their enrollment will suffer, and their faculty will transfer. They will struggle to make ends meet, and ultimately, many will close. The few who lead the way, or quickly catch up, will make up the difference as world-wide matriculation increases. They will grow in quality, prestige, and wealth. I predict a large shakeup from the current, already dubious rankings.

The Key is Liberation from the Restraints of Time and Place

I will examine the details of current research into online education later, but first, here is my personal view of why traditional classroom instruction will soon fade in preeminence, at least at the University and professional continuing education levels. Online instruction has one inherent advantage over traditional bricks and mortar. This advantage is present in even the most rudimentary (primitive) forms of online instruction prevalent today, the video of a teacher on a webpage. Online education is asynchronous, meaning it is free of the traditional classroom learning restrictions of time and space.

A student can receive instruction when and where they want. The virtual online class can be viewed anywhere in the world that has Internet connectivity, which today means pretty much everywhere. It does not require travel or residence in any particular place. It does not require attendance at a particular time. Online studies take place at a time of convenience to the student. The bad dreams common to those of us trained in the bricks and mortar world, of not being able to find the classroom or get there on time, are a thing of the past. (I suppose in the future the nightmares will consist of not being able to find your computer or losing connectivity.)

This liberation from the constraints of place and schedule gives online instruction a huge advantage over traditional instruction. It allows a community of teachers and students to converge together from all over the world to study a subject of common interest. There will still be some time constraints, but they will be in broader time frames, of entire days, weeks, or even months and years. A student can choose the best time to study. A teacher the best time to teach. Yes. The liberation of time and space applies to both students and teachers.

This freedom from the four walls of a classroom meeting at a particular time gives online learning its principal advantage. This fact alone radically transforms teaching and learning as we know it. It makes education accessible to everyone. It also makes it possible to drastically reduce the costs of education. And, potentially at least, as new, more creative online instruction programs are developed, this new education will not just be slightly better than time-space restricted classroom education (a position it has already achieved), it will be far more effective, especially for advanced University level instruction and beyond.

A student can log on to study at the time when they are most alert and receptive. They can do so in an environment of their choosing, one that they have found to be most conducive for learning. They may choose to study alone, or in a group. Some may learn best in a crowded coffee-shop. Others may prefer a quiet room by themselves. For some the preferred time to learn may be in the morning. For others it may be late at night. Online learning can happen anywhere and anytime.

The liberation from the constraints of time and place has, to date, primarily impacted students. But its potential for impact on faculty is just as profound. Today most online instruction consists of a video of one professor teaching a class, plus perhaps some written materials. The video may include questions and interactions with a live class, or may simply consist of a lecture. Either way, it follows the traditional bricks and mortar model of one instructor per class teaching at a set schedule.

The faculty too should teach when and where they are at their best, and in the subject for which they are best qualified. No one professor on campus is the best qualified subject matter expert in the world (or even in the University by which they are employed) on all topics that they teach. They may be an expert in their general field, but as an expert, they will readily acknowledge that there are at least a few others better qualified than they to teach on some specific sub-areas in that field. Also, sometimes the best subject matter expert is not the best teacher of that subject, no matter how broad or focused the area. The experts and best teachers in any field are always scattered throughout the world. They are never all conveniently located on one University campus. Online education removes that spatial obstacle.

As online instruction matures, the best and the brightest instructors will converge online to teach a course together. A new role of moderator instructor will then develop, one who introduces and weaves together the teaching of specialists to present a unified whole for that course. The people with the most knowledge in the world on a subject, and the best skills at explaining it, at transmitting it, will be the teachers of that subject. This may ultimately reduce the number of teachers on any given subject, but should dramatically improve quality. Further, this reduction in the number of teachers needed for a subject is likely to be balanced by an increase in the number of subjects taught.

The top-level teachers can be filmed in bricks and mortar classes interacting with students, or in a studio, or in their home, or in front of a large audience, whatever fits their personality, whatever environment captures their best teaching moments. The best video-takes, where everything really gels and the magic of teaching happens, will be the videos that are preserved and shared in future online instruction of that subject. The video presentations that will be part of an ideal online program will be the videos of the best teachers in the world at their best moments. (I emphasize that these videos will be only a part of the program because there is far more to online education than viewing another person speak and gesture. That is mere distance learning, the precursor to modern online instruction.) These teaching moments will be preserved for posterity. This again is a game-changer for education and heralds a paradigm shift in quality of instruction.

Transformation to Online Education

This transformation will happen in steps. It will begin by a University or other learning institution capturing its best instructors on a particular subject and weaving their good presentations together into an online course. They may supplement the course by video input from other subject matter experts who serve as guest-lecturers. The variety of input and styles this entails will require, as mentioned, a new type of moderator instructor. They will provide continuity and also serve as mentors for students, to interact with them on a group or individualized basis. Others may specialize in the mentor role, and still others in the testing and certification roles. The personal touch in education will continue, so too will community, but it will change forms.

Such online community building and interactivity will be key, just as it is in current bricks and mortar instruction, but it will use many different media. It could include realtime video and teleconferences, and instant messages, or convenient-time emails, messages, twitters, wikis, and forums. Face-to-face meetings may still occur on occasion. The teachers who respond to questions may also include a much wider universe of subject matter experts, who would agree to be on call for certain universities on various subjects on which they possessed special expertise. Students can also easily interact with each other in online communities built and supported by their school. This interaction, like the instruction itself, will take many forms, including multi-media formats with audio-visual enhancements to videos. It will constantly change and evolve as new technologies emerge.

The instruction will include many short videos, not just a few long-winded talks. Research has shown that students tune out after 15 to 30 minutes. The videos will be interspersed with extensive and fully hyper-linked writings, drawings, diagrams, and other images and animations. Videos of instructors will also be enhanced with graphics and animations, as I have already done with a few videos on this blog. See, e.g., EDRM – The Unofficial Video Version. Instruction will be designed to impact both the left and right brains. Each course will also include a variety of creative task-oriented challenges and interactivity opportunities. It will include periodic testing, both computerized and person-administered and graded. There will be structure and order in the curriculum, but this will be balanced by student-empowering selection and arrangement choices. Online instruction in the future will be more like a challenging video game and less like a public television show. As I have often said before: Creativity is the best friend of learning and boredom is the enemy.

Research on Online Education

Current scientific research supports these postulations. First of all, research shows that online education is already better than traditional classroom instruction. Evaluation of Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies (hereinafter “Department of Education Report” or “Report”). The Report at page ix of the Abstract states that:

… on average, students in online learning conditions performed better than those receiving face-to-face instruction.

The Report stresses repeatedly that online instruction tests “significantly positive for undergraduate and other older learners, but not for K–12 students.” The advantages, if any, for the K-12 age group are still somewhat speculative, in large part due to the absence of significant research for that age group. By contrast, there have been many tests and research concerning the university level and beyond, including significant testing in the area of medicine. Report pgs. xiii, xvii. Unfortunately, only a few were noted in the area of law. This is not surprising because there has, until recently, been virtually no online training offered in law. This is, however, now changing rapidly, as seen for instance by NYU’s new offering of an online Masters of Law program in Taxation. This new online program by New York University School of Law is, I submit, of great importance, and should serve as a wake-up call to all in the academic community.

Some of the key findings of the Report include the following quotes:

Students who took all or part of their class online performed better, on average, than those taking the same course through traditional face-to-face instruction. Id. at xiv.

The effectiveness of online learning approaches appears quite broad across different content and learner types. Id. at xv.

Blended and purely online learning conditions implemented within a single study generally result in similar student learning outcomes. Id. at xvi. [Note: some of the studies suggested to the contrary that the inclusion of some face-to-face instruction did improve learning. For that reason I believe that some real-time interaction remains important, be it in person, by video, telephone, IM, or the like.]

Elements such as video or online quizzes do not appear to influence the amount that students learn in online classes. Id. [Note: in my opinion this finding in one or more studies is the result of overly-long and otherwise poor quality videos and quizzes.]

Online learning can be enhanced by giving learners control of their interactions with media and prompting learner reflection. Studies indicate that manipulations that trigger learner activity or learner reflection and self-monitoring of understanding are effective when students pursue online learning as individuals. Id.

Providing guidance for learning for groups of students appears less successful than does using such mechanisms with individual learners. When groups of students are learning together online, support mechanisms such as guiding questions generally influence the way students interact, but not the amount they learn. Id.

The Report also includes a caveat about the conclusions and limitations of these studies. This cautions against immediate abandonment of traditional face-to-face instruction for the seemingly superior and obviously much cheaper online instruction modes:

However, several caveats are in order: Despite what appears to be strong support for online learning applications, the studies in this meta-analysis do not demonstrate that online learning is superior as a medium. In many of the studies showing an advantage for online learning, the online and classroom conditions differed in terms of time spent, curriculum and pedagogy. It was the combination of elements in the treatment conditions (which was likely to have included additional learning time and materials as well as additional opportunities for collaboration) that produced the observed learning advantages. At the same time, one should note that online learning is much more conducive to the expansion of learning time than is face-to-face instruction. Id. at xvii.

There are other legitimate criticisms and caveats that you can make concerning the Department of Education Report. See for instance Clive on Learning. But the major thrust of the report is incontrovertible. Online asynchronous learning has many distinct advantages over traditional synchronous classroom learning and is more effective for many students at advanced University and post-graduate levels.

The following quotations provide a good flavor for the contents of the sixty-six page Report. They also suggest ways, which I consider especially important, to improve online education programs by student-powered creative interactivity and community-based functions:

  • One common conjecture is that learning a complex body of knowledge effectively requires a community of learners (Bransford, Brown and Cocking 1999; Riel and Polin 2004; Schwen and Hara 2004; Vrasidas and Glass 2004) and that online technologies can be used to expand and support such communities. Another conjecture is that asynchronous discourse is inherently self-reflective and therefore more conducive to deep learning than is synchronous discourse (Harlen and Doubler 2004; Hiltz and Goldman 2005; Jaffee et al. 2006).  Id. at pg. 2.
  • In deciding how to implement online learning, it is important to understand the practices that research suggests will increase effectiveness (e.g., community building among participants, use of an online facilitator, blending work and training). Id.
  • Typically, in expository instruction, the technology delivers the content. In active learning, the technology allows students to control digital artifacts to explore information or address problems. In interactive learning, technology mediates human interaction either synchronously or asynchronously; learning emerges through interactions with other students and the technology. Id. at pg. 4.
  • Scoville and Buskirk (2007) examined whether the use of traditional or virtual microscopy would affect learning outcomes in a medical histology course. Students were assigned to one of four sections: (a) a control section where learning and testing took place face-to-face, (b) a blended condition where learning took place virtually and the practical examination took place face-to-face, (c) a second blended condition where learning took place face-to-face and testing took place virtually, and (d) a fully online condition. Scoville and Buskirk found no significant differences in unit test scores by learning groups. Id. at pg. 39.
  • A study by Zhang et al. (2006) suggests that the way in which a medium is used is more important than merely having access to it. Zhang et al. found that the effect of video on learning hinged on the learner’s ability to control the video (“interactive video”). The authors used four conditions: traditional face-to-face and three online environments—interactive video, noninteractive video, and nonvideo. Students were randomly assigned to one of the four groups. Students in the interactive video group performed significantly better than the other three groups. There was no statistical difference between the online group that had noninteractive video and the online group that had no video. Id. at pg. 40.
  • Zhang (2005) reports on two studies comparing expository learning with active learning, both of which found statistically positive results in favor of active learning. Zhang manipulated the functionality of a Web course to create two conditions. For the control group, video and other instruction received over the Web had to be viewed in a specified order, videos had to be viewed in their entirety (e.g., a student could not fast forward) and rewinding was not allowed. The treatment group could randomly access materials, watching videos in any sequence, rewinding them and fast forwarding through their content. Zhang found a statistically significant positive effect in favor of learner control over Web functionality (see also the Zhang et al. 2006 study described above). Gao and Lehman (2003) found that students who were required to complete a “generative activity” in addition to viewing a static Web page performed better on a test about copyright law than did students who viewed only the static Web page. Id. at pg. 41.
  • Grant and Courtoreille (2007) studied the use of post-unit quizzes presented either as (a) fixed items that provided feedback only about whether or not the student’s response was correct or (b) post-unit quizzes that gave the student the opportunity for additional practice on item types that had been answered incorrectly. The response-sensitive version of the tutorial was found to be more effective than the fixed-item version, resulting in greater changes between pre- and posttest scores. Id. at pg. 44.
  • These studies found that a tool or feature prompting students to reflect on their learning was effective in improving outcomes. Id.
  • Zhao et al. also found that instructor involvement was a strong mediating variable. Distance learning outcomes were less positive when instructor involvement was low (as in “canned” applications), with effects becoming more positive, up to a point, as instructor involvement increased. Id. at pg. 53.

Conclusion

I am not arguing for the dismantlement of our brick-and-mortar universities, including our law schools. Nor am I suggesting that our current system of CLEs be discontinued entirely, a system where people travel all over the country, if not the world, for just a few minutes of face-to-face presentations. What I am saying is that the current brick-and-mortar-only systems cannot and should not continue. More than that, they will not continue. The forces of history, competition, and technological advances will compel change, whether we like it or not.

This is a good thing. Online education is already better than face-to-face education, even though it is, I contend, still in its infancy and just beginning to realize its full potential. Moreover, online education is far more energy efficient and far less expensive. It is also far more capable of widespread distribution. Although we should not abandon or quickly replace our traditional modes of education, we should immediately begin to supplement these traditional modes with innovative online methods.

I predict that in the next five to ten years online education will surpass traditional education in popularity in all major fields of education, including legal education. But surpass does not mean replace. Just as some people will still read paper newspapers in the future, myself included, some will continue to attend and teach in face-to-face schools. Our universities are better funded and established than our newspapers. For that reason, they are better positioned to perpetuate the traditional pre-technological forms while adding to and embracing new online methods.

I call upon all schools to embrace this coming change, including law schools and the ABA, which accredits them. The law should take notice and follow the lead of NYU. Law schools should start planning today to add online J.D. instruction and Master of Laws degree programs to supplement their in-person classroom instruction and degrees. CLE programs must do the same. If anything, the need for change is strongest in the area of continuing education, especially in my field of electronic discovery. Our scarce educational resources should be invested in technologies, including video and multimedia production facilities, not in more brick-and-mortar classrooms and jet fuel. We should be able to turn on our computers to meet the best and the brightest in the field, not travel to New York, Washington, or London. The way of the future is green and efficient. It is online.