Robots From The Not-Too-Distant Future Explain How They Use Random Sampling For Artificial Intelligence Based Evidence Search

Byte and Switch, my future-law robots, star here in another video animation, this time on random sampling. They explain how sampling is used in machine-learning-based evidence review. In this first segment of a two-part video, set sometime in the near future, we watch Switch help Byte get ready to give expert testimony in a Daubert hearing. In this future, the presiding judge, David J. Waxse, routinely insists on that sort of thing. See: Waxse & Yoakum-Kriz, Experts on Computer Assisted Review: Why Federal Rule of Evidence 702 Should Apply To Their Use, 52 WBJLJ 207 (Spring 2013).

Byte, who is an expert by virtue of his knowledge base, programming, and search experience, makes the perfect witness. Verified programming establishes that he is incapable of lies or evasion. Not only that, he has total recall of everything that happened in every search project he has been involved with. Still, Switch needs to help Byte get ready to testify. Byte, like the scientists and programmers who created him, needs to learn how to speak simply enough for non-expert humans to understand. This animation shows Byte practicing for his testimony.

In this video Byte (shown right) explains how and why random samples are taken at the start of a project, before the active learning training begins. Byte also explains that random sampling is used again, in a limited fashion, during the training. (The Borg-type predictive coding software that relies entirely on random chance has, in this near-future scenario, long since been discredited and abandoned.) In part two Byte and Switch will go on to explain final quality assurance sampling at or near the end of a robot-enhanced search project.
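For readers who want to see the arithmetic behind a start-of-project sample, here is a minimal sketch. It is not Byte's actual method, only an illustration of one common use of an initial simple random sample: estimating prevalence, the share of relevant documents in the collection. The function name `estimate_prevalence`, the toy collection, and the sample size of 1,500 are all assumptions made up for this example, and the interval uses the simple Gaussian (normal) approximation rather than an exact binomial calculation.

```python
import math
import random


def estimate_prevalence(collection_size, is_relevant, sample_size, seed=0, z=1.96):
    """Estimate prevalence from a simple random sample (hypothetical helper).

    is_relevant stands in for a human reviewer: it takes a document id and
    returns True or False. z=1.96 corresponds to roughly 95% confidence
    under the Gaussian approximation.
    """
    rng = random.Random(seed)
    # Simple random sample without replacement: every document is equally likely.
    sample_ids = rng.sample(range(collection_size), sample_size)
    hits = sum(1 for doc_id in sample_ids if is_relevant(doc_id))
    p = hits / sample_size
    # Gaussian approximation to the binomial margin of error.
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Toy collection (an assumption for illustration): every 20th document is relevant.
relevant_ids = set(range(0, 100_000, 20))
point, low, high = estimate_prevalence(100_000, relevant_ids.__contains__, 1_500)
print(f"prevalence estimate: {point:.3f} (range {low:.3f} to {high:.3f})")
```

The Gaussian approximation becomes unreliable when prevalence is very low or the sample is small, which is one reason the binomial calculations mentioned in the acknowledgments matter in practice.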

As usual, pause to let the streaming video get ahead, especially if your connection is slow, and increase the video screen to full size for best effect.

Special thanks to William Webber, Information Scientist, for his background information and help. William has endured hours of my Switch-like questioning on random sampling in active machine learning search projects. His explanations of sampling have been invaluable, including such esoteric topics as Gaussian and binomial calculations; simple random and stratified random sampling (William’s speciality); quality control sampling for testing, as opposed to training; prevalence; concept shift; and recall testing. All credit goes to William for what I get right in this future scenario of random sampling. Any mistakes in the explanation, or errors in predictions, are entirely my own.

For the earlier adventures of Byte and Switch, see:

This entry was posted on Sunday, May 19th, 2013 at 5:21 pm and is filed under Review, Search, Technology.

2 Responses to Robots From The Not-Too-Distant Future Explain How They Use Random Sampling For Artificial Intelligence Based Evidence Search

Electronic Discovery Best Practices Update | e-Discovery Team ® says:

May 30, 2013 at 3:05 pm

[…] I recently added several revisions and citations in the Predictive Coding page, including a summary of a recent article by Warwick Sharp, Ten Essential Best Practices in Predictive Coding (Today’s General Counsel, May 2013). Warwick, who I don’t think I’ve ever met, is a co-founder and VP of Equivio. His suggestions were all good and warranted inclusion on EDBP. I also added to this page a discussion of the difference between a control set and a training set, something that I touched upon in my most recent robot animation, Robots From The Not-Too-Distant Future Explain How They Use Random Sampling For Artificial Intellige…. […]

Legal Search Science | e-Discovery Team ® says:

November 17, 2013 at 2:12 pm

[…] Robots From The Not-Too-Distant Future Explain How They Use Random Sampling For Artificial Intellige…. Video Animation. […]
