TAR Course

This is the e-Discovery Team’s training course on how to do TAR (Technology Assisted Review). What TAR really means is electronic document review enhanced by active machine learning, a type of specialized Artificial Intelligence. Our method of AI-enhanced document review is called Hybrid Multimodal IST Predictive Coding 4.0. The Course is composed of Sixteen Classes:

  1. First Class: Introduction (this page)
  2. Second Class: TREC Total Recall Track
  3. Third Class: Introduction to the Nine Insights Concerning the Use of Predictive Coding in Legal Document Review
  4. Fourth Class: 1st of the Nine Insights – Active Machine Learning
  5. Fifth Class: Balanced Hybrid and Intelligently Spaced Training
  6. Sixth Class: Concept and Similarity Searches
  7. Seventh Class: Keyword and Linear Review
  8. Eighth Class: GIGO, QC, SME, Method, Software
  9. Ninth Class: Introduction to the Eight-Step Work Flow
  10. Tenth Class: Step One – ESI Communications
  11. Eleventh Class: Step Two – Multimodal ECA
  12. Twelfth Class: Step Three – Random Prevalence
  13. Thirteenth Class: Steps Four, Five and Six – Iterate
  14. Fourteenth Class: Step Seven – ZEN Quality Assurance Tests
  15. Fifteenth Class: Step Eight – Phased Production
  16. Sixteenth Class: Conclusion

With a lot of hard work you can complete this online training program in a long weekend. After that, this course can serve as a solid reference to consult during your complex document review projects.

First Class: Introduction

The sixteen classes in this course cover seventeen topics:

  1. Active Machine Learning (aka Predictive Coding)
  2. Concept & Similarity Searches (aka Passive Learning)
  3. Keyword Search (tested, Boolean, parametric)
  4. Focused Linear Search (key dates & people)
  5. GIGO & QC (Garbage In, Garbage Out) (Quality Control)
  6. Balanced Hybrid (man-machine balance with IST)
  7. SME (Subject Matter Expert, typically trial counsel)
  8. Method (for electronic document review)
  9. Software (for electronic document review)
  10. Talk (step 1 – relevance dialogues)
  11. ECA (step 2 – early case assessment using all methods)
  12. Random (step 3 – prevalence range estimate, not control sets)
  13. Select (step 4 – choose documents for training machine)
  14. AI Rank (step 5 – machine ranks documents according to probabilities)
  15. Review (step 6 – attorneys review and code documents)
  16. Zen QC (step 7 – Zero Error Numerics Quality Control procedures)
  17. Produce (step 8 – production of relevant, non-privileged documents)
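The prevalence range estimate in step three (Random) can be illustrated with a short calculation. Below is a minimal sketch of how such an estimate might be computed from a simple random sample, using the standard Wilson score interval; the function name and the sample numbers are hypothetical illustrations, not figures from the course:

```python
import math

def prevalence_interval(sample_size, relevant_found, z=1.96):
    """Wilson score confidence interval for the proportion of relevant
    documents, based on a simple random sample from the collection.
    z = 1.96 corresponds to roughly 95% confidence."""
    p = relevant_found / sample_size
    denom = 1 + z ** 2 / sample_size
    center = (p + z ** 2 / (2 * sample_size)) / denom
    margin = (z / denom) * math.sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    )
    return center - margin, center + margin

# Hypothetical example: 1,500 randomly sampled documents, 75 coded
# relevant (5%), yields a prevalence range of roughly 4% to 6.2%.
low, high = prevalence_interval(1500, 75)
```

Note that this yields only a range estimate of prevalence, not a pass/fail "control set" benchmark, which is consistent with the course's rejection of control-set methods.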

The first nine are insights. They are shown in the chart below.

The last eight points covered in this course are the workflow steps shown in the circular chart below.

These seventeen points, and more, are covered in the sixteen Classes.
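Steps four through six form an iterative loop: attorneys select and code training documents, the machine re-ranks the collection by probability of relevance, and top-ranked documents return to reviewers for coding. As a toy illustration only, the sketch below uses a simple centroid-similarity ranker as a stand-in for real active machine learning software; every document name and snippet of text here is invented for the example:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_unreviewed(documents, relevant_ids):
    """Rank unreviewed documents by similarity to the centroid of
    documents already coded relevant -- a crude stand-in for the
    probability ranking a real classifier would produce in step 5."""
    centroid = Counter()
    for doc_id in relevant_ids:
        centroid.update(vectorize(documents[doc_id]))
    unreviewed = [d for d in documents if d not in relevant_ids]
    return sorted(unreviewed,
                  key=lambda d: cosine(vectorize(documents[d]), centroid),
                  reverse=True)

docs = {
    "d1": "merger price fixing agreement",
    "d2": "lunch menu and parking passes",
    "d3": "draft agreement on fixing the merger price",
    "d4": "holiday party schedule",
}
# Step 4: attorney codes d1 relevant. Step 5: machine ranks the rest.
ranking = rank_unreviewed(docs, relevant_ids={"d1"})
# d3 shares the most relevant vocabulary, so it ranks first; in step 6
# attorneys would review and code the top-ranked documents, then repeat.
```

In actual practice the ranking comes from active machine learning software trained over multiple rounds, not from a similarity measure, but the loop structure (code, rank, review, repeat) is the same.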

We call our latest version of AI enhanced document review “Predictive Coding 4.0” because it substantially improves upon and replaces the methods and insights we announced in our October 2015 publication – Predictive Coding 3.0. There we explained the history of predictive coding software and methods in legal review, including versions 1.0 and 2.0. Unfortunately, most vendors are still stuck in these earlier methods. If you have tried predictive coding and did not like it, then the probable reason is that you used the vendor’s recommended, but wrong, method. Either that, or the software was to blame, but it is probably the method. Many lawyers report that they attain better results when they follow their own methods, not the vendors’ default methods.

Most vendors are still promoting the use of random-based control sets based on a misunderstanding of statistics and search. The use of control sets is simply wrong and a waste of time. We never saw any of these same vendors at TREC, and for good reason. They are mostly clueless and are not keeping up with the latest developments in search science. They are a business. We are not. The e-Discovery Team is a group of lawyers, led by me, Ralph Losey, a practicing attorney. We are lawyers sharing what we know with other lawyers (and vendors).

We offer this information for free on this blog to encourage as many people as possible in this industry to get on the AI bandwagon. Predictive coding is based on active machine learning, which is a classic, powerful type of Artificial Intelligence (AI). Our Predictive Coding 4.0 method is designed to harness this power to help attorneys find key evidence in ESI quickly and effectively.

Our October 2015 publication – Predictive Coding 3.0 – debunked the bogus science behind the old vendor methods, versions 1.0 and 2.0. In that same article we also described our then-new version 3.0. Since that time we have developed further enhancements to our methods.

Go on to Class Two.

Or pause to do this suggested “homework” assignment for further study and analysis.

SUPPLEMENTAL READING: If you have not already done so, go to the Team’s TAR page and look around. Also read in full the related website created by the Team: Legal Search Science. Also read Losey’s My Basic Plan for Document Reviews: The “Bottom Line Driven” Approach article. It was written in October 2013 and so describes the old, dated version 2.0 methods of multimodal hybrid document review. But it still contains important background discussions, especially for understanding the economics and proportionality considerations in predictive coding. Finally, search for and find the law review article that Losey wrote that also embodies these old methods. It too provides important background reading and has good citations and analysis that you will not find elsewhere. Plus, as a law review article, it is good to cite for the main points that are still valid.

EXERCISES: Here are several exercises for this first, otherwise very short, introductory class:

  1. Consider why the profession has moved so slowly to adopt predictive coding in spite of the benefits. It was first approved by Judge Peck in my Da Silva Moore case back in February 2012, and yet is still seldom used.
  2. Consider what arguments you would use with opposing counsel to persuade them that they should use predictive coding to search for the documents you have requested.
  3. What arguments would you use to persuade a judge to approve your use of predictive coding, assuming the requesting party foolishly opposed it?
  4. What arguments would you use to persuade a judge to force a responding party to use predictive coding?
  5. If you do not already know the case, find the opinion by Judge Peck that addresses this last question. Speculate on different conditions that might cause Judge Peck to reach a different conclusion.

Students are invited to leave a public comment below. Insights that might help other students are especially welcome. Let’s collaborate!


e-Discovery Team LLC COPYRIGHT 2017


