BORG CHALLENGE: The Complete Report

April 18, 2013

For ease of future reference this blog contains the entire narrative of the Borg Challenge that was previously reported in five installments: Part One, Two, Three, Four and Five

Borg Challenge: Part One of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology

This is the first in a series of reports on a fifty-hour predictive coding experiment using a semi-automated, single-search-method. I usually refer to this method as the Borg approach, after the infamous Star Trek villains. See Never Heard of Star Trek’s Borg? This method is used by many predictive coding software systems. It contrasts with the hybrid multimodal search method that I use, along with many others.

Multimodal in the context of predictive coding means the use of all types of searches to find machine training documents, not just one. I call it a hybrid approach because the human search expert remains in control of the machine learning. I recently discovered this follows a well-established practice in information retrieval science called Human–computer information retrieval (HCIR). The multimodal hybrid approach is contra to a monomodal approach where predictive coding only is used, and where the interaction of humans in the process is minimized, or reduced entirely to yes-no decisions.

I think the hybrid multimodal method is superior to the popular alternative Borg methods, but I have no real factual basis for this supposition. To my knowledge no comparisons or other tests have ever been made between the two methodologies. My opinion was based on logic alone, that if one search method was good, many would be better. But as I have previously stated here, my opinion could be wrong. After a lifetime in the law I know that for judgments to be valid they must be based on evidence, not just reason. See eg. There Can Be No Justice Without Truth, And No Truth Without Search. That is also a basic tenet of our scientific age. For these reasons I decided to perform an experiment to test the Borg approach against my own preferred methods. Thus the Borg Challenge experiment was born.

I had already performed a fifty-hour search project using my methods, which I reported here: Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron. That set a benchmark for me to test the Borg approach against. My experiment would be to repeat the same search I did before, but using the competing method. I would once again search an Enron dataset of 699,082 emails and attachments for evidence of involuntary employment terminations. I would also use the same software, Kroll Ontrack’s Inview, but this time I would configure and use it according to the Borg method. (One of the strengths of Inview is the flexibility of its features, which allowed for easy adaptation to this alternative method.)

As far as I know, this kind of test comparison of the same large search project, by the same person, searching the same data, but with different methods, has never been done before. I began my project in 2012 during Christmas vacation. I did not finish my final notes and analysis until March 2013. The results of my experiment may surprise you. That is the beauty of the scientific method. But you will have to hang in there with me for the full experiment to see how it turned out. I learned many things in the course of this experiment, including my endurance level at reading Enron chatter.

As in the first review I spent fifty hours to make it a fair comparison. But the whole project took quite a bit longer than that, as there is more to this work than just reading Enron emails. I had to keep records of what I did and create a report to share my journey. In my original hybrid multimodal review I wrote a 72-page Narrative describing the search. Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron. I know it is a difficult read for all but the most ardent searchers, and it was just as difficult to write. For these reasons I looked for a way to spice up my reporting of this second effort. I wanted to entice my fellow e-discovery professionals into following my little experiment. My solution was to use video reports instead of a long written narrative. Then I added some twists to the video, twists that you will see for yourself as the Challenge of the Borg experiment unfolds.

Here is the opening act. It begins in my backyard in Florida in late December 2012.

___________

Borg Challenge: Part Two where I begin the search with a random sample.

This is the second in a series of reports on a fifty-hour predictive coding experiment using a modified Borg approach to predictive coding. I say a modified Borg approach because I refused to eliminate human judgment entirely. That would be too hideously boring for me to devote over fifty hours of my time. Some of the modifications I made to slightly humanize the Borg approach are explained in the first video below.

In this segment I begin the search project with a random sample. Then I began the first two rounds of machine training with the usual auto coding runs and prediction error corrections (steps four and five in the below diagram). The second video below describes the first two iterative rounds. For these videos to make sense you first need to read and watch Borg Challenge: Part One of my experimental review of 699,082 Enron documents using a semi-automated monomodal methodology.

Creating a random sample at the beginning of a review is not necessarily part of a Borg review, and may not be included in most Borg-like software, but it easily could be. It is a best practice that I always include in large scale reviews. When I explain hybrid multimodal machine training using an eight-step model, the first random sample is the third step, which I call Random Baseline in this diagram.

I use the first random sample as part of my quality control program. With the sample I calculate prevalence, the percent of relevant documents in the total collection. This information gives you a good idea of how many of the relevant documents you have located during the course of the project. Of course, what information scientists call concept drift, and we lawyers call improved understanding of relevance, can impact this calculation and should be taken into account. In this search concept drift was not really a factor because I had already done this same search before. Since this was a repeat search, albeit using a different method, the Borg did have an advantage in this comparative test.
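To make the prevalence arithmetic concrete, here is a minimal Python sketch of the point estimate and spot projection, using the sample and collection sizes from this project (and the 3-hit figure used in the prevalence line of my notes below). The variable names are mine for illustration only; this is not code from Inview or any other review platform.

    # Point estimate of prevalence from a simple random sample, plus the
    # "spot projection" of total relevant documents in the collection.
    # Figures come from the Borg Challenge notes reproduced later in this report.
    collection_size = 699_082    # documents in the Enron collection
    sample_size = 1_183          # documents randomly selected and reviewed
    relevant_in_sample = 3       # hits used in the notes' prevalence line

    prevalence = relevant_in_sample / sample_size        # about 0.25359%
    spot_projection = round(collection_size * prevalence)

    print(f"Prevalence:      {prevalence:.5%}")
    print(f"Spot projection: {spot_projection:,} documents")   # about 1,773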

I included a random sample at the beginning of the Borg experiment to give the monomodal method every possible benefit and make the comparison as fair as possible. During the experiment I used several other quality control methods that Borg-like software may not have for the same reasons.

Watch the video and learn more about how I used a random sample of 1,183 documents to begin the Borg Challenge. The next video will describe the first two rounds of training after the baseline sample. These videos will provide many details on the methods used and progress made.

Technical Notes

Although in the video I refer to the first two rounds of training, to be technically correct, they are actually the second and third rounds. Since I found some relevant documents in the random sample, I trained on them. The computer used at least some of them as part of its training set to calculate the first 200 documents for my review. Each round of review trained on at least 200 documents. I selected this 200 documents number (it could be configured to be any size), because in my experience with Inview software, this has been a good minimum number of training documents to use.

To get even more technical, some of the documents I identified as relevant in the first random sample were not used in the first training; instead they were held in reserve, also known as held-out data, in what Kroll Ontrack and others call the testing set. Manning, et al, Introduction to Information Retrieval, (Cambridge, 2008) at pg. 262. This testing set derivation from an initial random sample is part of the internal quality control and metrics evaluation built into Kroll Ontrack’s Inview and other good predictive coding enabled software. It allows the software itself to monitor and validate the progress of the review, but as you will see, in my modified hybrid approach, I primarily rely upon my own techniques to monitor progress. See Human–computer information retrieval (HCIR).
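The held-out testing set idea can be illustrated generically. The split below is only a sketch of the general concept from Manning et al.; the 30% holdout fraction and the function itself are my own assumptions for illustration, and say nothing about how Inview actually partitions its testing set.

    import random

    def split_sample(labeled_docs, holdout_fraction=0.3, seed=42):
        # Split an already-reviewed random sample into a training set and a
        # held-out testing set. The 30% holdout fraction is illustrative only.
        rng = random.Random(seed)
        docs = list(labeled_docs)
        rng.shuffle(docs)
        cut = int(len(docs) * (1 - holdout_fraction))
        return docs[:cut], docs[cut:]

    # Example: a 1,183-document sample, identified here only by placeholder ids
    sample = [{"id": i, "relevant": False} for i in range(1183)]
    training_set, testing_set = split_sample(sample)
    print(len(training_set), "for training,", len(testing_set), "held out for testing")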

The first round of training came after coding the random sample. In the random review some documents were marked as relevant and provided for training. So what I call the first round of 200 documents in the video was selected both by random and internal evaluation, which I have previously labeled the Enlightened Borg approach. Three-Cylinder Multimodal Approach To Predictive Coding. This means that, to be entirely accurate, when my training sets are concluded (and I am not yet going to reveal how many rounds I went before it concluded (care to place bets?)), you should add one to the total count.

You should also know that the very first round of training, here described as the random sample, was what I call a Luck Borg round based on chance selection. Id. All of the ensuing rounds of training used an Enlightened Borg approach, enhanced with limited HCIR, but excluding any kind of search other than predictive coding based search. I will explain these subtle distinctions further as this Borg Challenge narrative continues.

___________

______________

__________

Borg Challenge: Part Three where I continue my search through round 16 of machine training

This is the third in a series of reports on a fifty-hour predictive coding experiment using the Borg approach to predictive coding. In this segment my video describes rounds three through sixteen of the training. For these videos to make sense you first need to read and watch Part One and Part Two of the Borg Challenge. If this still makes no sense, you could try reading a science fiction version of the battle between these two competing types of predictive coding review methods, Journey into the Borg Hive. And you thought all predictive coding software and review methods were the same? No, it is a little more complicated than that. For more help and information on Computer Assisted Review, see my CAR page on this blog.

The kind of repetitive review task method I am testing here, where you let the computer do most of the thinking and merely make yes-no decisions, can be tedious and difficult. As the project progressed I began to suspect that it was not only taking a toll on my mind and concentration, but also having physical effects. Some might say the changes are an improvement over my normal appearance, but my wife did not think so. Note the Borg appear to be somewhat like vampires and prefer the dark. Click on these two short video reports and see for yourself.

_____________

_________

Borg Challenge: Part Four where I complete the experiment

This is the fourth in a series of reports on a fifty-hour predictive coding experiment using the Borg approach to predictive coding. In this segment my video describes rounds seventeen onward of the machine training. The experiment comes to an end, and none too soon, as I was beginning to feel like a character in a Kafka novel. For these videos to make sense you first need to read and watch Part One, Part Two, and Part Three of the Borg Challenge. Even then, who knows? Kafkaesque videos providing predictive coding narratives based on Star Trek villains are not for everyone.

___________

_______

Borg Challenge: Part Five where I summarize my findings

This is the fifth installment in a series of reports on a fifty-hour predictive coding experiment using Borg-like monomodal methods. In this last segment I summarize my findings and compare the results with my earlier search using multimodal methods. The two searches of 699,082 documents were identical in every respect, except for methodology. The experiment itself is described in Part One, Part Two, Part Three and Part Four of the Borg Challenge. The results reported in my videos below may surprise you.

___________

 

_______

Now that you have heard the full story of the Borg Challenge, you may want to see the contemporaneous notes that I made during the experiment. I also reproduce them below.

_______

Notes of Borg Challenge Review of 699,082 ENRON Documents

Ralph C. Losey

Initial Random Sample: began the project with review of a random sample of 1,183 documents.

• 4 relevant were found.

• Prevalence = 0.25359% (3/1183)

• Spot Projection = 1,773 documents (699,082 × 0.25359%)

• Using Binomial: 0.25%, with a range of 0.05% to 0.74%

• Projected range: from 350 documents to 5,173 documents

• Took 130 minutes to review (2.16 hrs)
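The binomial range above is consistent with an exact (Clopper-Pearson) confidence interval on the sample proportion, projected onto the collection. The sketch below reproduces roughly the same bounds with scipy; the choice of method is my inference from the numbers, not a statement of what the review software actually computed, and the notes appear to project from the rounded percentages to reach 350 and 5,173.

    from scipy.stats import beta

    def clopper_pearson(hits, n, alpha=0.05):
        # Exact binomial confidence interval for a sample proportion.
        lower = beta.ppf(alpha / 2, hits, n - hits + 1) if hits > 0 else 0.0
        upper = beta.ppf(1 - alpha / 2, hits + 1, n - hits) if hits < n else 1.0
        return lower, upper

    collection = 699_082
    low, high = clopper_pearson(hits=3, n=1_183)
    print(f"Interval:  {low:.2%} to {high:.2%}")    # about 0.05% to 0.74%
    print(f"Projected: {collection * low:,.0f} to {collection * high:,.0f} documents")
    # about 365 to 5,180 documents before rounding of the percentages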

Total Relevant Found by Borg After 50 Rounds = 579

Total Relevant Found by Multimodal = 661.

That’s 82 more documents found by Multimodal, which is a 14% improvement.

Put the other way, the Borg method found just under 88% of the Multimodal total.

Final Quality Assurance Random Sample: ended the project with a random sample of 1,183 documents.

• 1 relevant was found in the 1,183. Borderline. News article about Enron firing two employees – 5403549. It had not been reviewed.

• Of the 1,183 sample, 26 had been previously reviewed and all had been categorized irrelevant. I agreed with all prior coding.

• Prevalence = 0.0845% (1/1183)

• Spot Projection = 591 documents (699,082 × 0.0845%)

• Using Binomial: 0.08%, with a range of 0.0% to 0.47%

• Projected range: from 0 documents to 3,286 documents

• Took 150 minutes to review (2.5 hrs)

Elusion = 0.0845% and none were Hot. Shows inadequacies of the Elusion Test in low prevalence.
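One way to see the point about elusion in a low-prevalence collection is to project the elusion rate back onto the documents never coded relevant. The sketch below uses the final-sample figures above; approximating the discard pile as the whole collection less the documents coded relevant is a simplification of mine (the final sample here was actually drawn from the full collection).

    collection, coded_relevant = 699_082, 578
    elusion_sample, elusion_hits = 1_183, 1

    discard_pile = collection - coded_relevant      # rough size of the null set
    elusion_rate = elusion_hits / elusion_sample    # about 0.0845%
    projected_missed = discard_pile * elusion_rate

    print(f"Elusion rate: {elusion_rate:.4%}")
    print(f"Projected relevant documents left behind: about {projected_missed:,.0f}")
    # about 590 -- on the same order as the 578 found, which is why a low
    # elusion number by itself says little when prevalence is this low.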

Combining the two Borg random samples taken: 2,366 documents with 5 found. (95% +/- 1.57%)

Multimodal random sample of 1,507 with 2 found = 0.13%

All Three: 3,873 samples with 7 found = 0.18% = 1,264 spot projection

Binomial = 0.07% – 0.37% = 489 to 2,587

Using most accurate spot projection of 1,264.

Borg retrieval of 578 = Recall 46% (worst case scenario = 578/2,587 = 22% recall)

Multimodal retrieval 659 = Recall 52% (that’s a 13% improvement) (worst case = 659/2,587 = 25%; a 13.6% improvement)

Best Case scenario = 691 relevant. Borg 578 = 83.6%. Multimodal 659 = 95.4%
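The recall percentages above are straight division of documents found by each estimate of the total relevant documents. A short sketch of that arithmetic, using the counts from these notes (the scenario labels are mine):

    found = {"Borg": 578, "Multimodal": 659}

    # Alternative estimates of total relevant documents, from the notes above:
    # the spot projection, the binomial upper bound (worst case), and the
    # best-case figure of 691 confirmed relevant.
    estimates = {"spot projection": 1_264, "worst case": 2_587, "best case": 691}

    for scenario, total in estimates.items():
        recalls = "  ".join(f"{name} {n / total:5.1%}" for name, n in found.items())
        print(f"{scenario:15s} {recalls}")
    # spot projection: ~46% vs ~52%; worst case: ~22% vs ~25%; best case: ~84% vs ~95%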

TOTAL HOT FOUND IN MULTIMODAL = 18.

TOTAL HOT FOUND IN BORG = 13 (only 72% of total found by multimodal)

____________

ROUNDS

Each round consists of a review of 200 documents selected by Inview for training. Inview selected 80% of the 200 on the basis of its uncertainty rankings (160 documents) and 20% by random sample (40 documents).
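Based on that description, each training batch mixed machine-selected uncertain documents with a small random draw. The function below is a minimal sketch of that kind of batch selection; the 80/20 split and batch size of 200 come from these notes, but the scoring field, the "closest to 0.5" uncertainty rule, and the function itself are my own illustrative assumptions, not Inview's actual selection logic.

    import random

    def select_training_batch(unreviewed, batch_size=200, uncertain_share=0.8, seed=0):
        # Pick a training batch: mostly documents whose predicted probability of
        # relevance is closest to 0.5 (the most uncertain), topped up with a
        # random draw. Mirrors the 160/40 split described above; illustrative only.
        rng = random.Random(seed)
        n_uncertain = int(batch_size * uncertain_share)                  # 160
        by_uncertainty = sorted(unreviewed, key=lambda d: abs(d["p_relevant"] - 0.5))
        uncertain = by_uncertainty[:n_uncertain]
        chosen_ids = {d["id"] for d in uncertain}
        remaining = [d for d in unreviewed if d["id"] not in chosen_ids]
        random_pick = rng.sample(remaining, batch_size - n_uncertain)    # 40
        return uncertain + random_pick

    # Example with made-up probability scores
    docs = [{"id": i, "p_relevant": random.random()} for i in range(10_000)]
    print(len(select_training_batch(docs)), "documents queued for review")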

1. 3 relevant found.

• control # 470687; marginally relevant email

• 12005752; barely relevant email

• 12004180 – marginal

• 6 seconds per file: 9.1 files per minute, 546 files per hour

2. -0- relevant found.

• over 80% were foreign language

• took 30 mins – 6.7 files per min., 9 seconds per file

3. -0- relevant found.

• 155 of them were empty PST files, 3 mins; 12 minutes for the rest

4. 3 relevant found.

• mix of file types; 54 obvious irrelevant; took 5 mins; 146 took 25 mins

• 12007083 – good email re severance payments

• 12009832 – severance plan related email

• 12006688 – waiver and release form

5. 1 relevant found.

• Zero relevant files were found in the 200 served up by the machine; the one relevant document came from the ranking search noted below

• At this point I also ran a special search to see if any documents ranked 50%+ probable relevant; one new doc was predicted relevant at 75.7%, control 12006689 – waiver agreement (similar to one already marked)

6. -0- relevant found.

• 50% + search found no new

7. 1 relevant found.

• one relevant release form 12006941;
• all else were not even close except for a contract termination email
• no new ranked relevant docs

8. 1 relevant found.

• 1 relevant employment handbook with term provisions – 12010249
• The rest of the documents were junk, meaning not even close

9. 3 relevant found.

• one relevant email; a general notice to all employees on bankruptcy that mentions employment termination. Control 11236804

• relevant general release (as before) 12000320

• relevant separation agreement 12000546

10. 38 relevant found.

• 34 relevant found in 200 + 4 known by rank outside of the 200 = 38 total

• 28 copies of the same email re Reporting to Work Next Week; this email was sent to all employees just before bankruptcy

• 2 more of same found that were on claims

• 4 relevant employee agreement form

• total 34 relevant out of this 200

• ran relevancy marking search – 4 new ones seen ranked

11. 6 relevant found.

• three of the six found in the 200 were of a new type; the other three were more of the same

• relevancy search rank; none new

12. 12 relevant found.

• 2 relevant severance contracts

• 1 relevant employment manual

• 9 relevant FAQ scripts

• 40 minutes for review here (agreements require more reading to see if there are provisions related to termination of employment)

13. 15 relevant found.

• 3 relevant found out of 200, plus 12 from ranked search; total 15 new relevant

• two relevant from the 200 re notice to all employees

• one Canadian release was relevant in the 200

• search of 50%+ probability showed 202 new files predicted relevant, 182 of which were spreadsheets. That was a big jump. But only 12 of the predictions were correct (all the Word docs)

• 190 of the 202 50%+ predicted relevant were not in fact relevant

• all of the 12 Word docs were correct predictions

• 6 pdf files – all were irrelevant.

• took another hour for the probability search of 202 documents

14. -0- relevant found.

• included many (150) agreements having nothing to do with employment. All irrelevant

• Predictions were usually about 33% likely relevant and 33% likely irrelevant

15. 19 relevant found.

• 14 were found from the 200:

• 1 relevant spreadsheet calculating termination severance amount

• 1 script to use to talk to terminated employee 12007598

• 4 benefit severance case scenarios

• 1 form termination letter 12006691

• 1 legal advice letter re severance plan 12114935

• 3 actual employment termination letters 12010037

• 1 policy plan change notice

• 1 restated severance plan 12007612

• 2 copies 1 advice from Littler re severance 12009928

• Ran a 50%+ relevancy probability search and 1,487 new predicted relevant documents were shown. A partial review of the 212 I thought most likely to be relevant showed that five (5) were in fact relevant.

• 106 of these were spreadsheets; all reviewed and irrelevant

• 1,331 of the 1,487 were document files; I coded 58 of them. All

irrelevant.

• 31 emails (approx); reviewed all 31, 3 were relevant

• 1 of these 31 found relevant, 12007475, was a close question

• The other two were relevant, e.g. 4500240

• I also reviewed 17 odd type file extensions and all were irrelevant

16. 8 relevant found.

• one of the relevant was a close ques. where an offer of employment was rescinded. I considered it relevant – 12010086

17. 12 relevant found.

• 6 found in the 200

• Another 6 in 50%+ probability search

• These 6 were found in the 200:

• relevant emp. contract provisions discussion re termination 12001241

• severance questions – 12006474

• ques re termination for cause – 12006554

• severance plan revisions email – 12011276

• letter rescinding emp. offer – 12007269

• email re severance meeting – 10715815

• search of 50% + probable relevant; 9 not previously reviewed; 6 were relevant; 3 were not

18. 6 relevant found.

• 2 – interrogatory in an employment case 12006517

• 2- Littler re emp. contract – 8400148 – 2 copies

• employment offer rescission

• Oregon statute re: employee requests after termination – 12002602

• many irrelevant contracts were in the review

• 50% + probability search showed no new

19. 28 relevant found.

• email re severance plan revision 12006469

• note – 2 already marked relevant docs were in this set, they were 99% probable

• email re termination of one employee, 12006943

• email re emp. termination – 15508142

• email with ques. re severance – 12005851

• close ques – ERISA plan question email – 12006777

• employee (Ming) dispute re her termination documentation – 12010029

• employee (Kim) ques. re his severance and email internal re: giving Kim severance – 12005206

• employee complaint of discriminatory termination – 15514893

• 5 email re script for employee terminations – 12007133

• email ques. re severance – 12005421

• email ques. re: severance – 12005422

• another email reply re severance

• another email reply

• another email reply

• another email reply 12010778

• employee incorrectly sent termination

• 2 notices due to computer error 4109833

• voluntary and involuntary termination discussed

• 4 emails re severance plan

• email re severance legal issue (WARN)

• email re termination law suit – 8003332

• this round had many relevant documents and complicated ones; round took 50-60 minutes

• search of 50%+ shows no uncategorized docs

20. -0- relevant found.

• search of 50% + shows no new documents

21. 12 relevant found.

• did 50% + – nothing new

• minutes re emp. contract of Fastow discussing termination – 8003490

• 2 emails re: severance contract – 12007223

• 2 emails re severance contract

• 2 email from individual employee re unfair termination 15511658

• email re severance plan

• 2 complaints by employee, Lion, re discriminatory termination – 8004527

• Oregon statute re: termination

• waiver interpretation email

• email re severance plan

• note irrelevant email that is close – 17312114

22. 5 relevant found.

• 3 relevant found in 200, plus 2 more in 50% + – total 5 relevant

• email re severance plan

• email re severance plan

• email re severance plan

• 50% + probable search: 8 not previously categorized; 2 were relevant

23. 8 relevant found.

• Relevant – severance calculation spreadsheet

• email re emp. term. and immigration

• email re: severance plan communication to all employees

• email that mentions contract termination of employee for cause

• email re: revised severance plan

• follow-up email re: immigration issue above

• email re severance plan

• email re severance plan

• 50% + probability search – 185 found – all previously categorized

24. 10 relevant found.

• 9 from the 200, and 1 from probability – total 10

• 50%+ probability search – one new prediction 62.6% which was correct – total 179

• Relevant found from the 200

• email re termination date and vesting

• email re tax on severance and agreement

• email re severance and part-time employees

• irrelevant, but close call – 15508438

• relevant email re severance and merge

• email re UK employee that mentions severance if terminated

• email re let the severance plan speak for itself

• severance question

• Ms. Bates employment termination

• EEOC case – strong relevant 12005636; only predicted 19.5%

• Severance pay plan assessment – close – 12011364

• total number of relevant so far = 273

25. 3 relevant found in 200, and 1 from probability – total 4.

• relevant email re termination

• reinstatement remedy in EEOC case for wrongful termination – 12007303

• email re: voluntary and involuntary question

• search 50% + probability – 184 – 2 not categorized, 1 relevant (57.3%) and 1 not relevant (58%)

26. 1 relevant found in 200.

• email re: severance plan

27. 8 relevant found in 200

• 3 emails with emp. complaint re severance and vacation

• severance email

• mass email re mistaken notice of termination 12305510

• part-time employee and severance issue

• retention of outside counsel re termination without legal permission. Close question. 12005884

• severance and UK and US employees

28. 5 relevant found in 200, and 7 from probability – total 12

• email re hiring outside counsel

• Fairness in severance based on tenure 12512159

• 3 emails re emp. contract terms, specific as to termination

• run 50% + probability search – 185 total. Review all. Notes below.

• one doc was 70.4% predicted relevant. I had previously marked it as irrelevant and of course trained on it. I now read the agreement more carefully and found termination provisions I had missed before. I agreed it was relevant! 12003549

• another email found like the one above, where the termination language is at the end of the email, so I changed my mind and marked it relevant 1200928

• another, this time in a form letter where I now see it has termination language in it 12001275

• another, a WARN notice discussion 12005220

• another, a non-disclosure form with termination discussion that I had marked irrelevant before; 58.3% probable relevant 14632937

• one where I stuck with my guns and still disagreed 12011390. Computer predicted 90.5% relevant and only 1% irrelevant even though I had categorized it as irrelevant.

• another one where computer is right and I changed my mind 12005421. It was ranked 88.8% likely relevant even though I had previously coded as irrelevant.

• another one re termination of employee named Ming where I changed my mind and agree – 67.7% likely relevant. 12010029

29. 50 relevant found in 200 and 3 from probability – total 53.

• 3 emails re severance and tenure

• termination complaint email re India

• email re: severance benefits

• 40 copies of same email re mistaken notice of termination and savings plan

• email re employee relocation or termination

• severance and part-timers

• 3 severance re total costs and Dynergy

• 50% search – 3 previously marked by me as irrelevant now show as likely relevant. Looking again I now see computer is right and they should be marked relevant:

– one concerns a foreign employment contract with 70.1% relevant ranking and I change my mind and agree – 13725405. It only ranked the agreement as 7.3% likely irrelevant, even though I had coded as irrelevant.

– I agreed with a release email, but it was close call. Computer said 90.5% relevant 12011390

– I agreed with the third one too re co scripts and a question on severance 12007134

30. 8 relevant found from 200. Note: corrections made to errors by a Quality Control check of marked relevant documents. The corrected total relevant is now 292.

• email re severance payments

• email re severance calculations

• email re employment contracts and severance

• email re waivers and release

• email re a suit on wrongful termination; very obvious relevant email 12006801

• 2 emails re amended severance to extend employment a month

• email re bankruptcy and severance payments

• ran 50% + not categorized – zero

• First Quality Control Check: I ran a total relevancy search and found 363 total, and double checked all of them. I found 71 errors, which I corrected. They were labeled relevant and were not – all obvious. Pretty sure this was caused by checking the wrong box when doing rapid coding of obvious irrelevant documents. Made an adjustment to the coding input layout to make this less likely to happen. Good lesson learned here.

• I reran all of the training to make sure the bad input was corrected

31. 3 relevant found from 200. Note: additional QC efforts found 5 more mistakes in prior manual coding. All errors were documents incorrectly marked irrelevant.

• Email re complaint about termination

• email re revisions and terminations

• Q&A related questions re severance

• Quality Control Check: Probability search of 25%+ predicted relevant but not coded relevant – 12 found. 5 of them were mistakes where I changed my mind and now coded them as relevant. The details of this QC process are as follows:

– Employment contract 13737371 with termination provisions. I mistakenly marked this as irrelevant before because I saw the employer was UBS Warburg. I looked even more carefully now and saw they were acting as a successor in interest to Enron. For that reason I changed my mind. Relevant prediction was 85.4%

– non-disclosure contract with 41.8% relevant prediction. I stuck with my original decision and didn’t change, but it was a close call as other non disclosure agreements had termination provisions (this one did not) and I had called the others with such provisions as relevant.

– a foreign language email sending picture was 78.6% predicted relevant. Computer was obviously wrong.

– email that on top was about crude oil, but below was about severance. Very unusual. I missed that before. 12512165. I changed the coding to relevant. It had a 36.1% probability rank.

– another email just like above. I was wrong. 32.8% probability.

– email about severance calculation that I mistakenly marked as irrelevant before and have now corrected. 29.4% probability.

– email re employee termination that I missed before as the language was near end of otherwise irrelevant discussion 47.5% probability

– 3 agreements between companies that had termination language. 89.7% probable, but computer is wrong. Not about employees. It was about deal contract terminations between companies.

– list of 7 employees in an email. 60.2%, but computer is wrong as nothing relevant 12832822

– waiver of benefits, 52.6% predicted relevant but computer is wrong, not relevant

• total relevant after corrections is now exactly 300

32. 4 relevant found from 200.

• email re termination and rescissions. Predicted relevant at 62.8%. Only document over 28% predicted relevant in this 200 group

• next highest ranked document was 27% probable relevant, about terminated employees 12007207

• email question re part time employees and severance

• email re suggestion on severance plan and severance

33. 1 relevant found from 200. Note, additional QC finds 7 mistaken irrelevant, which were corrected. And one new relevant found by the QC ranking search. Total net gain of 8.

• email thanking Ken Lay for remaining $200 Million 15511524

• The QC search was a 25%+ probable relevant but uncategorized relevant search. Had 19 results. 10 were correctly categorized as irrelevant. 2 not categorized at all. Of these, one was relevant 69.9%, another was not 52.4%. 7 mistakes found in my prior coding of the documents as irrelevant. Net gain of eight relevant.

34. 9 relevant found from 200.

• Started with a QC of relevant search 25%+ not categorized relevant – 10 docs found. No mistakes.

• Of the 200 machine selected, 9 relevant were found

• email re decision on which employees in a small group to keep and which to terminate 12841827

• email re employee complaint re severance

• 3 more emails re Ken Lay not taking the $200 million bonus

• email re severance

• email related to the Ken Lay $200 Million

• email re buyer and severance terms

• email re revocation and offers and signing bonus

35. 16 relevant found from 200.

• email re termination and an employment contract

• 4 emails re Fastow rumor and payout clause

• 2 forwards of Fastow rumor email

• notes re floor meeting included termination

• bankruptcy and severance

• re rescissions of offers

• talking points

• re: Ken Lay’s email

• Q&A type memos, includes severance

• email re job offer rescission

• email re offer rescission

• re severance and bankruptcy sale of entity

36. 10 relevant found from 200.

• 2 emails re Fastow rumor

• Q&A type

• Korean employee termination

• severance and Dynergy sale

• employment contract and severance

• offer rescind

• offer rescind

• offer rescind

• emp. contract related

• QC Search using 25% and not marked relevant = 10. Same as before. All irrelevant.

37. 5 relevant found from 200.

• re employment contract termination provisions

• Employee firing. Strong. 12005248

• non-disclosure agreement with termination language

• note re Dynergy meeting severance

• offer rescission

38. 10 relevant found from 200.

• employee contract re termination

• 4 – Q&A type

• offer rescission related

• 2 – rescission related

• employee term in sale of company

• foreign employee termination

• payment error on terminated employee

• for cause terminations

• rescission

• QC search 25% + not categorized relevant: 10. Same as before. No changes.

39. 16 relevant found from 200.

• employee complaint re termination

• foreign employee termination

• 3 – employment contract and term provisions

• severance contract related

• termination and bankruptcy related

• mistaken termination

• foreign separation contract

• employee complaint re firing and benefits

• employee contract re termination

• re litigation

• re termination

• employee litigation

• offer rescission

• offer rescission

40. 9 relevant found from 200. Another 50 found from search of 25%+ not previously categorized. Total 59.

• employee complaint re termination

• no terminations to be announced on thanksgiving

• EEOC case was discussed

• Complaint re severance. Strong relevance. 12005256

• email re CEO termination in sale and severance from David Oxley to Louise Kitchen

• mistaken termination notice

• email re termination of employees and benefits

• email re termination of employees and benefits

• re termination timing

• Ran a search for all 25%+ probable relevant but not previously categorized documents. Total 62 found in that search. Review of all of them shows that 50 of them were relevant, and 12 were not.

• Relevant count search shows total now at 444

41. 10 relevant found from 200. Another 3 found from 25% + search – 13 total.

• re termination notices

• 2 re severance payout

• mistaken severance notice

• termination benefits

• Terminated employees

• terminated foreign citizen employee

• 2 – termination action plan

• mistaken termination notice

• “may Ken Lay rot in Hell” – disgruntled employee 15516349

• Ran a 25%+ probable relevant but not categorized as relevant. 15 found. 3 were in fact relevant. No mistakes in my coding.

42. 9 relevant found from 200. Another 2 found from 25%+ search – total 11.

• re termination bonus

• complaint re review and severance

• complaint re notice of benefits

• 2 – complaint re notice of benefits

• re employment contract for Japan and termination

• re terminations

• terminations and stock options

• termination letters

• Ran a 25%+ search and found 17 not categorized. Only 2 were relevant.

– One prior mistake in coding where I missed a reference to termination. It was 31.7% probable relevant. Pertained to Leaf River. 12006992

• check on total number of relevant shows 468

• Metrics check shows I have now reviewed 9,893 documents (114,037 pgs)

– 9,425 were irrelevant

– 468 were relevant

43. 4 relevant found from 200.

• re non-compete for laid off employees

• re Japanese employee’s contract and termination

• complaint re fairness of termination selection

• Japanese employee lawsuit

44. 4 relevant found from 200. Another 4 mistakes corrected – total 8.

• complaint email to Ken Lay re $80 million bonus 15507937

• severance packages for terminated employees

• employment letter mentions severance payments

• concerning an impromptu firing for Internet posting

• Ran a 25%+ not categorized relevant and 9 were found. 4 were relevant. I had made a mistake on two documents, one of which had three copies:

– 3  Q&A severance with a severance mention I missed before. 12009822

– non-disclosure contract where close study showed it had a termination provision in it.

45. 8 relevant found out of the 200. Another 4 from QC – total 12.

• Q&A type

• discussion re severance in buy out

• reimbursement contract mentions termination

• ques. re severance plan (close question)

• emp. contract and severance

• severance calculations

• severance calculations

• ques. re termination and severance

• Ran a 25%+ not categorized as relevant search and 10 found. I found 6 to be irrelevant, but 4 were relevant. These had all previously been marked by me as irrelevant. I changed my mind on 4 total. Three here described (forgot to describe the last one):

– Visa related question. Close ques. 49.4%. 12010923

– question re severance and tuition reimbursement 34.2%

– an emp. Contract. Close question; see paragraph 4. 29.7%. 12000991

• to date I’ve trained 9,429 docs, reviewed total of 10,493 files and 116,309 pages

• 492 total relevant

46. 5 relevant found out of 200.

• bonus forms with termination provisions

• Japan terminated manager, relevant part at end

• eliminating positions

• mistaken termination

• no planned layoffs for one office

47. 18 relevant found out of 200.

• Note I reviewed the 200 by relevancy ranking. The top four highest ranked were relevant. 58% -> 36%

– 1 was law suit re EEOC,

– 3 were termination benefits

• severance plan

• termination and savings

• employment reinstatement after termination

• 2 – employment termination (13th in ranking). So this means 10 out of top 13 ranked were relevant.

• change link in above email on termination and savings. (Now 11 out of top 14 were relevant.)

• wrongfully terminated employee. (Now 12 out of top 15.)

• Q&A related (now 13/17)

• 3 – Q&A – 16/21 top ranked; slowed down when hit below 10% probable relevance.

• 2 – relevant legal memos re closure of plant and union regulations. Had an IP relevance score of only .0000687131, but a probability score of 33.8%. It was in fact strongly relevant and of a new document type (2 copies) #12002609 #12003939 (took picture).

• Note – this round took over an hour, as did several of the prior rounds, where I ran into longer documents and close calls.

48. 16 relevant found out of 200.

• 32 of the 200 were, judging by file name, obvious junk files of the kind I had seen before. Bulk coded them all irrelevant

• 4 others were found that had already been coded irrelevant. They were all obvious irrelevant.

• I sorted the remaining documents by IP score and looked at most likely relevant on down to get a better sense of training.

• The highest probability ranked was .7980, an email – 4112126. It was a question about severance plan, which I have seen before on another part of the email chain, or one close to it. Marked relevant (1). Note this email had only a 6.9% probability in the category ranking.

• Next 2 highest ranked were the same email type, having to do with a list of employees who may have been laid off. 3 more were later found at 6th, 8th and 11th ranked. A total of 5 relevant. 13757418 and 13757419.

• the 4th highest ranked doc was an email on reinstatement that I judged close, but decided was relevant (1) – 3107238

• 2 Q&As, which I’d seen before – both relevant (2) – 12010371 and 12007128. Ranked only 3.2% in category relevance and also .32 (32%) in IP score.

• 1 email re Q&A talking point. Only 2.3% category relevant

• 3 – emails re headcount identifying who will be cut. All relevant. All had only 7.6% probability

• 1 email re French subsidiary employee future

• 1 email again on employee lists, which I think was done for purposes related to termination. Close question.

• 1 borderline relevant

• After this point it was all irrelevant, showing the effectiveness of the IP score ranking

• 533 – total relevant found so far

49. 14 relevant found out of 200.

• I sorted all of the 200 by IP score again for review to try to get a better sense of ranking and whether more rounds would be productive.

• 2 – highest IP score was .694 with category probability of 56.3% – 13759801. They are emails re 4 employees wanting to know if still employed or not. Borderline, but I say relevant again. Another in chain.

• Q&A answer type – relevant – 120006812

• Email re eliminating positions was relevant; but the next document, with 6th highest rank was borderline, but not relevant.

• 7th ranked is relevant – another talking points memo

• 2 – 8th and 10th ranked are obvious relevant, but only cat. prob. of 9.8% (IP score of .347)

• 2 – email and response by employee indicating when he thinks he’ll be terminated

• email re termination and non-compete

• email saying a particular employee shouldn’t be “fired.”

• Note: up to here I’ve reviewed top 19 ranked docs and 13 have been relevant.

• email re waiver form for separation being illegal – 12011427

• email re analysis of who should be terminated; interesting – 3817115

• Note: after around the top-ranked 120 I switched to file name sort, as all were probably irrelevant anyway and it’s faster to review with that sort view in place.

• 1 odd email of no importance was found relevant after that; an outline re where to find terminated employees email – 8909940 – category probability only 1.0%

50. 13 relevant found out of 200. An additional 11 more relevant from 25%+ category searches. Total 24.

• before any review, a search showed 550 docs were 50%+ probable relevant and 55 of the 550 were not yet categorized relevant. I reviewed these and the 200 machine-selected documents.

• First marked 81 obvious irrelevant documents in file name order. Then switched to ranking order to review the rest; only 3 docs scored .54, .58 and .706; the rest were under .54

• Top ranked .706 – #13723687 re an employee termination. I had seen another earlier part of this email chain before. Her boss didn’t want to lose her.

• 2nd ranked was another talking points memo

• 0 – 3rd ranked – close call, but I know from other emails that the employee list here discussed pertains to who gets retention bonuses, not who gets terminated. Not relevant.

• 4th ranked .466 – is relevant; employee list, but this one pertains to terminations

• list email, close call, but relevant. .406

• list email, close call, but relevant.

• email re keeping the employee again, same before. .389

• another close relevant, repeat re list

• another re list and cuts of employees. Only 8.9% cat. prob. and .356 IP score #13758900

• email asking question re bankruptcy court and payments to terminated employees. Had not seen before #15507580 – only 10% category probability.

• Another re keeping same employee (Deirdre)

• 2 emails re keeping two other people

• Note: at this point I am down to the lowest ranked 92, where I find only one relevant, next described. Also, I’m seeing many emails re termination, but they involve deal contract terminations between companies.

• Email from fired employee complaining re low severance. After that was a funny (and dirty) irrelevant email – 10617832.

• Run total relevant searches and find 567 have been categorized relevant, and 47,021 categorized irrelevant.

• Run probability ranked searches and find 520 documents are ranked 50%+. Of those, 15 were not categorized relevant. Ran a 25%+ and only found 25 more documents, meaning only 10 documents between 25% and 50% probable relevant.

• Reviewed all 25 of the 25%+ documents that were not categorized relevant. Of those, 11 were not previously categorized at all; all were reviewed and found to be relevant. The other 14 had been previously reviewed and marked irrelevant. I reviewed them again. Most of these were close calls, but I still considered all of them irrelevant and did not change any. Here are the 11 not previously categorized documents, all of which I found to be relevant:

• 2 – re terminated employee list – 13759551 and 13723640

• 4 – re position lists with rationale – 13757156

• 2 – employees and positions not to terminate – 9703787

• keep two employees, seen before

• 2 – re save two employees from cut, seen before

• New Total Relevant found is 578

STOPPED ROUNDS HERE AND PERFORM QUALITY ASSURANCE TEST

Decision to stop was based on the few documents found going all of the way down to 25%+, the lack of any real new documents in several rounds, and the total time expended to date, which was about the same as the time expended in the prior multimodal review before its final test.


TAR

March 23, 2013

The only types of Technology Assisted Review (TAR) software that we endorse for the search of large ESI collections include active machine learning algorithms, which provide full featured predictive coding capacities. Active machine learning is a type of artificial intelligence (AI). When used in legal search these AI algorithms significantly improve the search, review, and classification of electronically stored information (ESI). For this reason I prefer to call predictive coding AI-enhanced review or AI-enhanced search. For more background on the science involved, see LegalSearchScience.com and our sixteen class TAR Training Course.

In TARs with AI-enhanced active machine learning, attorneys train a computer to find documents identified by the attorney as a target. The target is typically relevance to a particular lawsuit or legal issue, or some other legal classification, such as privilege. This kind of AI-enhanced review, along with general e-discovery training, are now my primary interests as a lawyer.

Personal Legal Search Background

In 2006 I dropped my civil litigation practice and limited my work to e-discovery. That is also when I started this blog. At that time I could not even imagine specializing more than that. In 2006 I was interested in all aspects of electronic discovery, including computer assisted review. AI-enhanced software was still just a dream that I hoped would someday come true.

The use of software in legal practice has always been a compelling interest for me. I have been an avid user of computer software of all kinds since the late 1970s, both legal and entertainment. I even did some game software design and programming work in the early 1980s. My now-grown kids still remember the computer games I made for them.

I carefully followed the legal search and review software scene my whole career, but especially since 2006. It was not until 2011 that I began to be impressed by the new types of predictive coding software entering the market. After I got my hands on the new software, I began to do what had once been unimaginable. I started to limit my legal practice even further. I began to spend more and more of my time on predictive coding types of review work. Since 2012 my work as an e-discovery lawyer and researcher has focused almost exclusively on using predictive coding in large document production projects, and on e-discovery training, another passion of mine. In that year one of my cases produced a landmark decision by Judge Andrew Peck that first approved the use of predictive coding. Da Silva Moore v. Publicis Groupe, 2012 WL 607412 (SDNY Feb. 24, 2012) (approved and adopted in Da Silva Moore v. Publicis Groupe, 2012 WL 1446534, at *2 (SDNY Apr. 26, 2012)). There have been many cases thereafter that follow Da Silva Moore and encourage the use of predictive coding. See eg.: Rio Tinto v. Vale, 2015 WL 872294 (March 2, 2015, SDNY) with a case collection therein.

Attorney Maura R. Grossman and I are among the first attorneys in the world to specialize in predictive coding as an e-discovery sub-niche. Maura is a colleague who is both a practicing attorney and an expert in the new field of Legal Search Science. We have frequently presented on CLE panels as technology evangelists for these new methods of legal review. Maura and her partner, Professor Gordon Cormack, an esteemed information scientist, wrote the seminal scholarly paper on the subject and an excellent glossary of terms used in TAR. Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Richmond Journal of Law and Technology, Vol. XVII, Issue 3, Article 11 (2011); The Grossman-Cormack Glossary of Technology-Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013); Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, SIGIR’14, July 6–11, 2014.

I recommend reading all of their works. I also recommend reviewing my more than sixty articles on the subject, studying the LegalSearchScience.com website that I put together, and the many references and citations included at Legal Search Science, including the writings of other pioneers in the field, such as the founders of TREC Legal Track, Jason R. Baron and Doug Oard, and other key figures in the field, such as information scientist William Webber. Also see Baron and Grossman, The Sedona Conference® Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (December 2013).

Advanced TARs Require Completely New Driving Methods

TAR is more than just new software. It entails a whole new legal method, a new approach to large document reviews. Below is the diagram of the latest Predictive Coding 4.0 workflow I use in a typical TAR project.

[Diagram: Predictive Coding 4.0 workflow]

See: TAR Training Course. This sixteen-class course teaches our latest insights and methods of Predictive Coding 4.0.

Predictive Coding using the latest 4.0 methods is the new tool for finding the ESI needles of relevant evidence. When used properly, good predictive coding software allows attorneys to find the information they need to defend or prosecute a case in the vast haystacks of ESI they must search, and to do so in an effective and affordable manner.

Professor Cormack and Maura Grossman have also performed experiments on predictive coding methodologies, which, among other things, tested the efficacy of random-only search for machine training. Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, SIGIR '14, July 6–11, 2014. They reached the same conclusions that I did, and showed that this random-only Borg approach is far less effective than even the most simplistic judgmental methods. I reported on this study in full in a series of blogs in the Summer of 2014, Latest Grossman and Cormack Study Proves Folly of Using Random Search for Machine Training; see especially Part One of the series.

The CAL Variation

After studying the 2014 experiments by Professor Cormack and Maura Grossman reported at the SIGIR conference, I created a variation of my predictive coding workflow based on the protocol they call CAL, for Continuous Active Learning. Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, SIGIR '14, July 6–11, 2014, at pg. 9. Also see Latest Grossman and Cormack Study Proves Folly of Using Random Search for Machine Training – Parts One, Two, Three and Four. The part that intrigued me about their study was the use of continuous machine training as part of the entire review. This is explained in detail in Part Three of my lengthy blog series on the Cormack Grossman study.

The form of CAL that Cormack and Grossman tested used the highest-probability relevant documents in all but the first training round. (In the first round, the so-called seed set, they trained using documents found by keyword search.) The experiment showed that reviewing the documents with the highest rankings works well, and that this method should be given significant weight in any multimodal approach, especially when the goal is to quickly find as many relevant documents as possible. This is another take-away from this important experiment.

The "continuous" training aspect of the CAL approach means that you keep doing machine training throughout the review project and batch the reviews accordingly. This can become a project management issue. But if you can pull it off within proportionality and requesting-party constraints, it just makes common sense to do so. You might as well get as much help from the machine as possible, and keep using its probability predictions for as long as you are still reviewing documents and can make last-minute batch assignments accordingly.

I have done several reviews in such a continuous training manner without really thinking about the fact that the machine input was continuous, including my first Enron experiment. Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron. But the Cormack Grossman study on the continuous active learning approach caused me to rethink my flow chart and create the Version 4.0 process shown above. See: TAR Training Course (teaching Predictive Coding 4.0).
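
For readers who like to see the logic spelled out, below is a simplified, runnable sketch of a CAL-style loop: a keyword-seeded first round, followed by continuous retraining on each newly coded batch of the highest-ranked documents. The toy collection, batch size, and seeding shortcut are my own illustrative assumptions, not the exact protocol tested by Cormack and Grossman or implemented in any particular software.

```python
# A simplified sketch of a Continuous Active Learning (CAL) style loop.
# The tiny corpus, labels, batch size, and seeding rule are illustrative
# assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy collection: (document text, label the "reviewer" will assign when asked)
collection = [
    ("notice of involuntary termination and severance pay", 1),
    ("layoff list and employee termination benefits", 1),
    ("fantasy football picks for this week", 0),
    ("cafeteria lunch menu and parking notice", 0),
    ("termination of employment agreement draft", 1),
    ("holiday party invitation for the trading floor", 0),
]

vectorizer = TfidfVectorizer()
X_all = vectorizer.fit_transform([doc for doc, _ in collection])

# Round one: seed set from a keyword search (documents containing "termination"),
# plus one known irrelevant document so the model sees both classes.
training_idx = [i for i, (doc, _) in enumerate(collection) if "termination" in doc]
training_idx.append(next(i for i, (_, lab) in enumerate(collection) if lab == 0))
unreviewed_idx = [i for i in range(len(collection)) if i not in training_idx]

batch_size = 1
while unreviewed_idx:
    model = LogisticRegression()
    model.fit(X_all[training_idx], [collection[i][1] for i in training_idx])

    # Rank unreviewed documents by predicted probability of relevance
    probs = model.predict_proba(X_all[unreviewed_idx])[:, 1]
    ranked = sorted(zip(probs, unreviewed_idx), reverse=True)

    # "Review" the highest-ranked batch (the attorney supplies the label),
    # then feed it back into training: continuous active learning.
    batch = [idx for _, idx in ranked[:batch_size]]
    training_idx.extend(batch)
    unreviewed_idx = [i for i in unreviewed_idx if i not in batch]

print("Review complete; documents coded:", len(training_idx))
```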

Hybrid Human Computer Information Retrieval


In further contradistinction to the Borg, or random-only approaches, where the machine controls the learning process, I advocate a hybrid approach where Man and Machine work together. In my hybrid method the expert reviewer remains in control of the process, and their expertise is leveraged for greater accuracy and speed. The human intelligence of the SME is a key part of the search process. In the scholarly literature of information science this hybrid approach is known as Human–computer information retrieval (HCIR).

The classic text in the area of HCIR, which I endorse, is Information Seeking in Electronic Environments (Cambridge 1995) by Gary Marchionini, Professor and Dean of the School of Information and Library Sciences of U.N.C. at Chapel Hill. Professor Marchionini speaks of three types of expertise needed for a successful information seeker:

  1. Domain Expertise. This is equivalent to what we now call SME, subject matter expertise. It refers to a domain of knowledge. In the context of law the domain would refer to particular types of lawsuits or legal investigations, such as antitrust, patent, ERISA, discrimination, trade-secrets, breach of contract, Qui Tam, etc. The knowledge of the SME on the particular search goal is extrapolated by the software algorithms to guide the search. If the SME also has System Expertise, and Information Seeking Expertise, they can drive the process themselves. (That is what I did in the EDI Oracle competition. I did the whole project as an Army of One, and my results were unbeatable.)  Otherwise, an SME will need expert helpers with such system and search expertise. These experts must also have legal knowledge because they must be capable of learning enough from the SME to recognize the relevant documents.
  2. System Expertise. This refers to expertise in the technology system used for the search. A system expert in predictive coding would have a deep and detailed knowledge of the software they are using, including the ability to customize the software and use all of its features. In computer circles a person with such skills is often called a power-user. Ideally a power-user would have expertise in several different software systems. They would also be an expert in a particular method of search.
  3. Information Seeking Expertise. This is a skill that is often overlooked in legal search. It refers to a general cognitive skill related to information seeking. It is based on both experience and innate talents. For instance, “capabilities such as superior memory and visual scanning abilities interact to support broader and more purposive examination of text.” Professor Marchionini goes on to say that: “One goal of human-computer interaction research is to apply computing power to amplify and augment these human abilities.” Some lawyers seem to have a gift for search, which they refine with experience, broaden with knowledge of different tools, and enhance with technologies. Others do not.

Id. at pgs. 66-69, with the quotes from pg. 69.

All three of these skills are required for an attorney to attain expertise in legal search today, which is one reason I find this new area of legal practice requires a team effort.

[Diagram: predictive coding expertise triangles]

It is not enough to be an SME, or a power-user, or to have a special knack for search. You have to be able to do it all, and usually the only way to do that is to work with a team that has all of these skills, and good software too. With a team it is not really that difficult, but like anything it requires initial training and then experience. Still, among the three skill-sets, studies have shown that System Expertise, which in legal search primarily means mastery of the particular software used (Power User), is the least important. Id. at 67. More important are the SMEs, those who have mastered a domain of knowledge. In Professor Marchionini's words:

Thus, experts in a domain have greater facility and experience related to information-seeking factors specific to the domain and are able to execute the subprocesses of information seeking with speed, confidence, and accuracy.

Id. That is one reason that the Grossman-Cormack glossary builds the role of SMEs into its base definition of computer assisted review:

A process for Prioritizing or Coding a Collection of electronic Documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of Documents and then extrapolates those judgments to the remaining Document Collection.

Glossary at pg. 21 defining TAR.

According to Marchionini, Information Seeking Expertise, much like Subject Matter Expertise, is also more important than specific software mastery. Id. This may seem counterintuitive in the age of Google, where an illusion of simplicity is created by typing in words to find websites. But legal search of user-created data is a completely different type of search task than looking for information from popular websites. In the search for evidence in a litigation, or as part of a legal investigation, special expertise in information seeking is critical, including especially knowledge of multiple search techniques and methods. Again quoting Professor Marchionini:

Expert information seekers possess substantial knowledge related to the factors of information seeking, have developed distinct patterns of searching, and use a variety of strategies, tactics and moves.

Id. at 70.

In the field of law this kind of information seeking expertise includes the ability to understand and clarify what the information need is; in other words, to know what you are looking for and to articulate that need into specific search topics. This important step precedes the actual search, but it is an integral part of the process. As one of the basic texts on information retrieval, co-written by Gordon Cormack, explains:

Before conducting a search, a user has an information need, which underlies and drives the search process. We sometimes refer to this information need as a topic …

Büttcher, Clarke & Cormack, Information Retrieval: Implementing and Evaluating Search Engines (MIT Press, 2010) at pg. 5. The importance of pre-search refinement of the information need is stressed in the first step of the above diagram of Predictive Coding 4.0 methods, ESI Discovery Communications. It seems very basic, but it is often underappreciated, or overlooked entirely, in the litigation context, where information needs are often vague and ill-defined, lost in overly long requests for production and adversarial hostility.

Hybrid Multimodal Bottom Line Driven Review

My descriptive name for what Marchionini calls the variety of strategies, tactics and moves is Hybrid Multimodal. See, e.g., Bottom Line Driven Proportional Review (2013). I refer to it as a multimodal method because, although the predictive coding types of search predominate (shown on the diagram below as AI-enhanced review – AI), I also use the other modes of search, including Unsupervised Learning Algorithms (explained in LegalSearchScience.com) (often called clustering or near-duplication searches), keyword search, and even some traditional linear review (although usually very limited). As described, I do not rely entirely on random documents, or computer-selected documents, for the AI-enhanced searches, but use a four-cylinder approach that includes human judgment sampling and AI document ranking. See: TAR Training Course. This sixteen-class course teaches our latest insights and methods of Predictive Coding 4.0.
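
Here is a minimal sketch of what pooling several search modes into one review queue can look like in code: keyword hits, a similarity ("more like this") expansion from a known relevant document, and documents flagged by AI ranking are combined and deduplicated, rather than relying on any single mode. The corpus, the similarity cutoff, and the placeholder AI result are illustrative assumptions only.

```python
# A minimal sketch of assembling a multimodal review queue from several
# search modes. Hypothetical corpus and scoring; for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "severance package offered after involuntary termination",
    "notice of layoff and final paycheck details",
    "weekend golf outing and tee times",
    "employment agreement termination clause review",
    "cafeteria menu and parking garage closure",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(documents)

# Mode 1: keyword search
keyword_hits = {i for i, d in enumerate(documents) if "termination" in d}

# Mode 2: similarity expansion from a document already judged relevant
known_relevant = 0
sims = cosine_similarity(X[known_relevant], X).ravel()
similar_hits = {int(i) for i in sims.argsort()[::-1][:3] if i != known_relevant}

# Mode 3: AI ranking would supply its own top-ranked documents
# (placeholder: assume the model ranked document 1 highly)
ai_hits = {1}

# Pool the modes, dedupe, and queue for human review
review_queue = sorted(keyword_hits | similar_hits | ai_hits)
print("Documents queued for review:", review_queue)
```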

The various types of legal search methods used in a multimodal process are shown in this search pyramid.

[Diagram: the multimodal search pyramid]

Most information scientists I have spoken to agree that it makes sense to use multiple methods in legal search and not just rely on any single method. UCLA Professor Marcia J. Bates first advocated for using multiple search methods back in 1989, an approach she called berrypicking. Bates, Marcia J., The Design of Browsing and Berrypicking Techniques for the Online Search Interface, Online Review 13 (October 1989): 407-424. As Professor Bates explained in 2011 in Quora:

An important thing we learned early on is that successful searching requires what I called “berrypicking.” … Berrypicking involves 1) searching many different places/sources, 2) using different search techniques in different places, and 3) changing your search goal as you go along and learn things along the way. This may seem fairly obvious when stated this way, but, in fact, many searchers erroneously think they will find everything they want in just one place, and second, many information systems have been designed to permit only one kind of searching, and inhibit the searcher from using the more effective berrypicking technique.

This berrypicking approach, combined with HCIR, is what I have found from practical experience works best with legal search.

My Battles in Court Over Predictive Coding

In 2012 my case became the first in the country where the use of predictive coding was approved. See Judge Peck’s landmark decision Da Silva Moore v. Publicis, 2012 WL 607412 (S.D.N.Y. Feb. 24, 2012) (approved and adopted in Da Silva Moore v. Publicis Groupe, 2012 WL 1446534, at *2 (S.D.N.Y. Apr. 26, 2012)). In that case my methods of using Recommind’s Axcelerate software were approved. Later in 2012, in another first, an AAA arbitration approved our use of predictive coding in a large document production. In that case I used Kroll Ontrack’s Inview software over the vigorous objections of the plaintiff, which, after hearings, were all rejected. These and other decisions have helped pave the way for the use of predictive coding search methods in litigation.

Scientific Research

In addition to these activities in court I have focused on scientific research on legal search, especially machine learning. I have, for instance, become one of the primary outside reporters on the legal search experiments conducted by the TREC Legal Track of the National Institute of Standards and Technology. See, e.g., Analysis of the Official Report on the 2011 TREC Legal Track – Parts One, Two and Three; Secrets of Search: Parts One, Two, and Three. Also see Jason Baron, DESI, Sedona and Barcelona. In 2015, and again in 2016, I was a participant in the TREC Total Recall Track. My team members were the top search experts at Kroll Ontrack, who have all trained in and mastered Predictive Coding 4.0 methods. The e-Discovery Team's participation in TREC is reported at MrEDR.com, the name my team gave to the Kroll Ontrack software we used in these experiments.

Aside from the TREC Legal Track, which closed down in 2011 and then reopened in 2015 and 2016 as the Total Recall Track, the only group-participant scientific study to test the efficacy of various predictive coding software and search methods is the one sponsored by Oracle, the Electronic Discovery Institute and Stanford. This search of a 1,639,311 document database was conducted in early 2013, with the results reported in Monica Bay's article, EDI-Oracle Study: Humans Are Still Essential in E-Discovery (LTN Nov. 2013). The chart below, published by LTN, summarizes the results.

[Chart: EDI-Oracle study results, as published by LTN]

Monica Bay summarizes the findings of the research as follows:

Phase I of the study shows that older lawyers still have e-discovery chops and you don’t want to turn EDD over to robots.

With respect to my dear friend Monica, I must disagree with her conclusion. The age of the lawyers is irrelevant. The best predictive coding trainers do not have to be old; they just have to be SMEs, power users of good software, and have good search skills. In fact, not all SMEs are old, although many may be. It is the expertise and skills that matter, not age per se. It is true, as Monica reports, that the lawyer, a team of one, who did better in this experiment than all of the other much larger participant groups, was chronologically old. But that fact is irrelevant. The skill set and small group size, namely one, is what made the difference. See Less Is More: When it comes to predictive coding training, the "fewer reviewers the better" – Parts One, Two, and Three.

Moreover, although Monica is correct to say we do not want to "turn over" review to robots, this assertion misses the point. We certainly do want to turn over review to robot-human teams. We want our predictive coding software, our robots, to hook up with our experienced lawyers. We want our lawyers to enhance their own limited intelligence with artificial intelligence – the Hybrid approach. Robots are the future, but only if and as they work hand-in-hand with our top human trainers. Then they are unbeatable, as the EDI-Oracle study shows.

For the time being the details of the EDI-Oracle scientific study remain closed, and even though Monica Bay was permitted to publicize the results, and make her own summary and conclusions, participants are prohibited from discussion and public disclosures. For this reason I can say no more on this study, and can only assert, without facts, that Monica's conclusions are in some respects incorrect, that age is not critical, and that the hybrid multimodal method is what is important. I hope and expect that someday soon the gag order for participants will be lifted, the full findings of this most interesting scientific experiment will be released, and a free dialogue will commence. Truth only thrives in the open, and science concealed is merely occult. That is one of many reasons why the more open TREC experiments in 2015 and 2016 are so important. See MrEDR.com.

Why Predictive Coding Is Important

I continue to focus on this sub-niche area of e-discovery because I am convinced that it is critical to the advancement of the law in the 21st Century. Our own intelligence and search skills must be enhanced by the latest AI software. Predictive Coding 4.0 methods allow a skilled attorney using the latest predictive coding software to review documents at remarkable speed and low cost. The AI-enhanced review rates are more than 250 times faster than traditional linear review, and the costs are less than a tenth as much. See, e.g., Predictive Coding Narrative: Searching for Relevance in the Ashes of Enron; EDI-Oracle Study: Humans Are Still Essential in E-Discovery (LTN Nov. 2013); also see MrEDR.com.

My Life as a Limo Driver and Trainer

I have spoken on this subject at many CLEs around the world since 2011. I explain the theory and practice of this new breakthrough technology. I also consult on a hands-on basis to help others learn the new methods. As an old software lover who has been doing legal document reviews since 1980, I still like to do these review projects myself. I like to run AI-enhanced document review projects, not just teach others or supervise what they do. I enjoy the interaction and enhancements of the hybrid, human-robot approach. Certainly I need and appreciate the artificial intelligence boost to my own limited capacities.

I also like to serve as a kind of limo driver for trial lawyers from time to time. The top SMEs in the world (I prefer to work with the best) are almost never also software power-users, nor do they have special skills or talents for information seeking outside of depositions. For that reason they need me to run the review projects for them. To switch to the robot analogy again: I like and can work with the bots; they cannot.

I can only do my job as a limo driver and robot friend in an effective manner if the SME first teaches me enough of their domain to know where I am going; to know what documents would be relevant or hot, or not. That is where decades of legal experience handling a variety of cases is quite helpful. It makes it easier to get a download of the SME's concept of relevance into my head, and then into the machine. Then I can act as a surrogate SME and do the machine training for them in an accurate and consistent manner.


Working as a driver for an SME presents many special communication challenges. I have had to devise a number of techniques to facilitate a new kind of SME surrogate agency process. See Predictive Coding 4.0, restated here in one post.

Of course, it is easier to do the search when you are also the SME. For instance, in one project I reviewed almost two million documents, by myself, in only two weeks. That's right. By myself. (There was no redaction or privilege logging, which are tasks that I always delegate anyway.) A quality assurance test at the end of the review, based on random sampling, showed that a very high accuracy rate was attained. There is no question that it met the reasonability standards required by law and the rules of procedure.
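
The kind of random-sample quality assurance test mentioned above can be illustrated with a short, hypothetical calculation: draw a random sample from the documents coded not relevant, count how many were actually relevant, and compute a confidence interval on the true error rate. The sample size, error count, and choice of a Wilson score interval below are illustrative assumptions, not the actual figures or formula from that project.

```python
# A sketch of a random-sample quality assurance check at the end of a review.
# The sample size and error count are made-up numbers for illustration only.
import math

def wilson_interval(errors, sample_size, z=1.96):
    """95% Wilson score confidence interval for the true error rate."""
    p = errors / sample_size
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    margin = z * math.sqrt(p * (1 - p) / sample_size
                           + z**2 / (4 * sample_size**2)) / denom
    return max(0.0, center - margin), min(1.0, center + margin)

# Suppose 1,500 documents are drawn at random from the set coded not relevant,
# and reviewers find 6 that were actually relevant (missed documents).
sample_size = 1500
errors = 6
low, high = wilson_interval(errors, sample_size)
print(f"Observed error rate: {errors / sample_size:.2%}")
print(f"95% confidence interval: {low:.2%} to {high:.2%}")
```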

It was only possible to do a project of this size so quickly because I happened to be an SME on the legal issues under review, and, just as important, because I was a power-user of the software and had, by that point, mastered my own search and review methods.

Thanks to the new software and the version 4.0 methods, what was considered impossible, even absurd, just a few short years ago, namely one attorney accurately reviewing two million documents by him or herself in 14 days, is now attainable by many experts. My story is not unique. Maura tells me that she once did a seven-million document review by herself. That is why Maura and Gordon were correct to refer to TAR as a disruptive technology in the Preface to their Glossary. Technology that can empower one skilled lawyer to do the work of hundreds of unskilled attorneys is certainly a big deal, one for which we have Legal Search Science to thank. It is also why I urge you to study this subject more carefully and learn to train the document review robots yourself. Either that, or hire a limo driver like me.

Before you begin to actually carry out a predictive coding project, with or without an expert to run your project, you need to plan for it. This is critical to the success of the project. Here is a detailed outline of a Form Plan for a Predictive Coding Project that I used to use as a complete checklist. (It's a little dated now.)

My Writings on TAR

A good way to continue your study in this area is to read the articles by Grossman and Cormack, and the more than sixty articles on the subject that I have written since mid-2011. They are listed in rough chronological order, with the most recent on top.

I am especially proud of the legal search experiments I have done using AI-enhanced search software provided to me by Kroll Ontrack to review the 699,082 public Enron documents, and of my reports on these reviews. Comparative Efficacy of Two Predictive Coding Reviews of 699,082 Enron Documents (Part Two); A Modest Contribution to the Science of Search: Report and Analysis of Inconsistent Classifications in Two Predictive Coding Reviews of 699,082 Enron Documents (Part One). I have been told by scientists that my over 100 hours of search, consisting of two fifty-hour search projects using different methods, is the largest search project by a single reviewer that has ever been undertaken, not only in Legal Search, but in any kind of search. I do not expect this record will last for long, as others begin to understand the importance of Information Science in general, and Legal Search Science in particular. But for now I will enjoy both the record and the lessons learned from the hard work involved.

April 2014 Slide Presentation by Ralph Losey on Predictive Coding, using the now "slightly dated" 3.0 Methods

Please contact me at Ralph.Losey at gmail dot com if you have any questions.

