e-Discovery Team’s Best Practices Education Program

May 8, 2016


EDBP                   Mr.EDR         Predictive Coding 3.0
59 TAR Articles
Doc Review  Videos

_______


e-Discovery Team Training

Information → Knowledge → Wisdom

Education is the clearest path from Information to Knowledge in all fields of contemporary culture, including electronic discovery. The above links take you to the key components of the best-practices teaching program I have been working on since 2006. It is my hope that these education programs will help move the Law out of the dangerous information flood, where it is now drowning, to a safer refuge of knowledge. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers; and How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom.” For more of my thoughts on e-discovery education, see the e-Discovery Team School Page.

The best practices and general educational curriculum that I have developed over the years focus on the legal services provided by attorneys. The non-legal, engineering and project management practices of e-discovery vendors are only collaterally mentioned. They are important too, but students have the EDRM and other commercial organizations and certifications for that. Vendors are part of any e-Discovery Team, but the programs I have developed are intended for law firms and corporate law departments.

The e-Discovery Team program, both general educational and legal best-practices, is online and available 24/7. It uses lots of imagination, creative mixes, symbols, photos, hyperlinks, interactive comments, polls, tweets, posts, news, charts, drawings, videos, video lectures, slide lectures, video skits, video slide shows, music, animations, cartoons, humor, stories, cultural themes and analogies, inside baseball references, rants, opinions, bad jokes, questions, homework assignments, word-clouds, links for further research, a touch of math, and every lawyer’s favorite tools: words (lots of them), logic, arguments, case law and precedent.

All of this is an attempt to take the e-Discovery Team approach from just information to knowledge. In spite of these efforts, most of the legal community still does not know e-discovery very well. What they do know is often misinformation. Scenes like the following in a law firm lit-support department are all too common.

The e-Discovery Team’s education program has an emphasis on document review. That is because the fees for lawyers reviewing documents are by far the most expensive part of e-discovery, even when contract lawyers are used. The lawyer review fees, and review supervision fees, including SME fees, have always been much more costly than all vendor costs and expenses put together. Still, the latest AI technologies, especially active machine learning using our Predictive Coding 3.0 methods, are now making it possible to significantly reduce review fees. We believe this is a critical application of best practices. The three steps we identify for this area in the EDBP chart are shown in green, to signify money. The reference to C.A. Review is to Computer Assisted Review or CAR, using our Hybrid Multimodal methods.

[Image: detail of the EDBP chart]

____

Predictive Coding 3.0 Hybrid Multimodal Document Search and Review

Our new version 3.0 techniques for predictive coding make it far easier than ever before to include AI in a document review project. The secret control set has been eliminated, and so too have the seed set and the practice of SMEs wasting their time reviewing random samples of mostly irrelevant junk. It is a much simpler technique now, although we still call it Hybrid Multimodal.

Hybrid is a reference to the Man/Machine interactive nature of our methods. A skilled attorney uses a type of continuous active learning to train an AI to help them to find the documents they are looking for. This Hybrid method greatly augments the speed and accuracy of the human attorneys in charge. This leads to cost savings and improved recall. A lawyer with an AI helper at their side is far more effective than lawyers working on their own. This means that every e-discovery team today could use a robot like Kroll Ontrack’s Mr. EDR to help them to do document review.

Multimodal is a reference to the use of a variety of search methods to find target documents, including, but not limited to, predictive coding type ranked searches. We encourage humans in the loop running a variety of searches of their own invention, especially at the beginning of a project. This always makes for a quick start in finding relevant and hot documents. See Why the ‘Google Car’ Has No Place in Legal Search. The multimodal approach also makes for precise, efficient reviews with broad scope. The latest active machine learning software, when fully integrated with a full suite of other search tools, is attaining higher levels of recall than ever before. That is one reason Why I Love Predictive Coding.

I have found that Kroll Ontrack’s EDR software is ideally suited for these Hybrid, Multimodal techniques. Try using it on your next large project and see for yourself. The Kroll Ontrack consultant specialists in predictive coding, Jim and Tony, have been trained in this method (and many others). They are well qualified to assist you in every step of the way and their rates are reasonable. With you calling the shots on relevancy, they can do most of the search work for you and still save your client money. If the matter is big and important enough, then, if I have a time opening, and it clears my firm’s conflicts, I can also be brought in for a full turn-key operation. Including extra time for training your best experts is your option, but it is our preference.


__________

Embrace e-Discovery Team Education to Escape Information Overload

____


Five Reasons You Should Read the ‘Practical Law’ Article by Maura Grossman and Gordon Cormack called “Continuous Active Learning for TAR”

April 11, 2016

There is a new article by Gordon Cormack and Maura Grossman that stands out as one of their best and most accessible. It is called Continuous Active Learning for TAR (Practical Law, April/May 2016). The purpose of this blog is to get you to read the full article by enticing you with some of the information and knowledge it contains. But before we go into the five reasons, we will examine the purpose of the article, which aligns with our own, and touch on the differences between their trademarked TAR CAL method and our CAR Hybrid Multimodal method. Both of our methods use continuous, active learning, the acronym for which, CAL, they now claim as a Trademark. Since they clearly did invent the acronym, CAL, we for one will stop using it – CAL – as a generic term.

The Legal Profession’s Remarkably Slow Adoption of Predictive Coding

The article begins with the undeniable point of the remarkably slow adoption of TAR by the legal profession, in their words:

Adoption of TAR has been remarkably slow, considering the amount of attention these offerings have received since the publication of the first federal opinion approving TAR use (see Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012)).

I remember getting that landmark ruling in our Da Silva Moore case, a ruling that pissed off plaintiffs’ counsel, because, despite what you may have heard to the contrary, they were strenuously opposed to predictive coding. Like most other lawyers at the time who were advocating for advanced legal search technologies, I thought Da Silva would open the flood gates, that it would encourage attorneys to begin using the then new technology in droves. In fact, all it did was encourage the Bench, but not the Bar. Judge Peck’s more recent ruling on the topic contains a good summary of the law. Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125 (S.D.N.Y. 2015). There was a flood of judicial rulings approving predictive coding all around the country, and lately, around the world. See, e.g., Pyrrho Investments v MWB Property [2016] EWHC 256 (Ch) (2/26/16).

The rulings were followed in private arbitration too. For instance, I used the Da Silva Moore ruling a few weeks after it was published to obtain what was apparently the first ruling by an arbitrator in AAA approving use of predictive coding. The opposition to our use of cost-saving technology in that arbitration case was again fierce, and again included personal attacks, but the arguments for use in arbitration are very compelling. Discovery in arbitration is, after all, supposed to be constrained and expedited.

After the Da Silva Moore opinion, Maura Grossman and I upped our speaking schedule (she far more than me), and so did several tech-minded judges, including Judge Peck (although never at the same events as me, until the cloud of false allegations created by a bitter plaintiff’s counsel in Da Silva Moore could be dispelled). At Legal Tech for the next few years Predictive Coding was all anybody wanted to talk about. Then IG, Information Governance, took over as the popular tech-child of the day. In 2015 we had only a few predictive coding panels at Legal Tech, but they were well attended.

The Grossman-Cormack article speculates that the cause of the remarkably slow adoption is:

The complex vocabulary and rituals that have come to be associated with TAR, including statistical control sets, stabilization, F1 measure, overturns, and elusion, have dissuaded many practitioners from embracing TAR. However, none of these terms, or the processes with which they are associated, are essential to TAR.

We agree. The vendors killed what could have been their golden goose with all this control set nonsense and their engineers’ love of complexity and misunderstanding of legal search. I have ranted about this before. See Predictive Coding 3.0. I will not go into that again here, except to say the statistical control set nonsense that had large sampling requirements was particularly toxic. It was not only hard and expensive to do, it led to mistaken evaluations of the success or failure of projects because it ignored the reality of the evolving understanding of relevance, so-called concept drift. Another wrong turn involved the nonsense of using only random selection to find training documents, a practice that Grossman and I opposed vigorously. See Latest Grossman and Cormack Study Proves Folly of Using Random Search For Machine Training – Part One, Part Two, Part Three, and Part Four. Grossman and Cormack correctly criticize these old vendor driven approaches in Continuous Active Learning for TAR. They call them SAL and SPL protocols (a couple of acronyms that no one wants to trademark!).

Bottom line, the tide is changing. Over the last several years the few private attorneys who specialize in legal search, but are not employed by a vendor, have developed simpler methods. Maura and I are just the main ones writing and speaking about it, but there are many others who agree. Many have found that it is counter-productive to use control sets, random input, non-continuous training with its illogical focus on the seed set, and misleading recall point projections.

We do so in defiance of the vendor establishment and other self-proclaimed pundits in this area who benefitted by such over-complexity. Maura and Gordon, of course, have their own software (Gordon’s creation), and so never needed any vendors to begin with. Not having a world renowned information scientist like Professor Cormack as my life partner, I had no choice but to rely on vendors for their software. (Not that I am complaining, mind you. I’m married to a mental health counselor, and it does not get any better than that!)

After a few years I ultimately settled on one vendor, Kroll Ontrack, but I continue to try hard to influence all vendors. It is a slow process. Even Kroll Ontrack’s software, which I call Mr. EDR, still has control set functions built in. Thanks to my persistence, it is easy to turn off these settings and do things my way, with no secret control sets and false recall calculations. Hopefully soon that will be the default setting. Their eyes have been opened. Hopefully all of the other major vendors will soon follow suit.

All of the Kroll Ontrack experts in predictive coding are now, literally, a part of my Team. They are now fully trained and believers in the simplified methods, methods very similar to those of Grossman and Cormack, albeit, as I will next explain, slightly more complicated. We proved how well these methods worked at TREC 2015 when the Kroll Ontrack experts and I did 30 review projects together in 45 days. See e-Discovery Team at TREC 2015 Total Recall Track, Final Report (116 pg. PDF, and web page with short summary). Also see Mr. EDR for background information on the Team’s participation in the TREC 2015 Total Recall Track.

We Agree to Disagree with Grossman and Cormack on One Issue, Yet We Still Like Their Article

We are fans of Maura Grossman and Gordon Cormack’s work, but not sycophants. We are close, but not the same; colleagues, but not followers. For those reasons we think our recommendation for you to read this article means more than a typical endorsement. We can be critical of their writings, but, truth is, we liked their new article, although we continue to dislike the name TAR (not important, but we prefer CAR). Also, and this is of some importance, my whole team continues to disagree with what we consider the somewhat over-simplified approach they take to finding training documents, namely reliance on the highest ranking documents alone.

Despite what some may think, the high-ranking approach does eventually find a full diversity of relevant documents. All good predictive coding software today pretty much uses some type of logistic regression based algorithms that are capable of building out probable relevance in that way. That is one of the things we learned by rubbing shoulders with text retrieval scientists from around the world at TREC when participating in the 2015 Total Recall Track that Grossman and Cormack helped administer. This regression type of classification system works well to avoid the danger of over-training on a particular relevancy type. Grossman and Cormack have proven that before to our satisfaction (so have our own experiments), and they again make a convincing case for this approach in this article.

Still, we disagree with their approach of only using high-ranking documents for training, but we do so on the grounds of efficiency and speed, not effectiveness. The e-Discovery Team continues to advocate a Hybrid Multimodal approach to active machine learning. We use what I like to call a four-cylinder type of CAR search engine, instead of one-cylinder, like they do.

  1. High-ranking documents;
  2. Mid-level, uncertain documents;
  3. A touch, a small touch, of random documents; and,
  4. Human ingenuity found documents, using all types of search techniques (multimodal) that seem appropriate to the search expert in charge, including keyword, linear, similarity (including chains and families), concept (including passive machine learning, clustering type search).

See Predictive Coding 3.0, where the method is described as an eight-part work flow (Step 6 – Hybrid Active Training).
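For readers who think in code, here is a minimal sketch of how one training batch might be assembled from the four cylinders listed above. It is only an illustration under stated assumptions: the function name `select_training_batch`, the quota split, and the 0.5 midpoint used to define "uncertain" are all hypothetical, and nothing here describes how Mr. EDR or any other product actually works.

```python
import random

def select_training_batch(scores, human_found, batch_size=8, seed=0):
    """Pick the next documents to review from four sources.

    scores:      dict of doc_id -> estimated probability of relevance
    human_found: doc_ids a reviewer located with keyword, similarity,
                 concept or other searches of their own
    """
    rng = random.Random(seed)
    chosen = []

    def take(candidates, quota):
        # Add up to `quota` documents that are not already in the batch.
        added = 0
        for doc_id in candidates:
            if added == quota:
                break
            if doc_id not in chosen:
                chosen.append(doc_id)
                added += 1

    # Hypothetical quotas. The only guidance assumed here is that the
    # random share should be "a small touch".
    n_random = max(1, batch_size // 8)
    n_human = batch_size // 4
    n_mid = batch_size // 4
    n_high = batch_size - n_random - n_human - n_mid

    by_score = sorted(scores, key=scores.get, reverse=True)
    by_uncertainty = sorted(scores, key=lambda d: abs(scores[d] - 0.5))
    shuffled = list(scores)
    rng.shuffle(shuffled)

    take(by_score, n_high)        # 1. high-ranking documents
    take(by_uncertainty, n_mid)   # 2. mid-level, uncertain documents
    take(shuffled, n_random)      # 3. a small touch of random documents
    take(human_found, n_human)    # 4. documents found by human ingenuity
    return chosen
```

A one-cylinder approach would be the same function with every quota except `n_high` set to zero, which is one way to see how small the difference between the two methods really is.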

The latest Grossman and Cormack version of CAL (their trademark) uses only the highest-ranking documents for active training. Still, in spite of this difference, we liked their article and recommend you read it.

The truth is, we also emphasize the high-probable relevant documents for training. The difference between us is that we use the three other methods as well. On that point we agree to disagree. To be clear, we are not talking about continuous training or not, we agree on that. We are not talking about active training, or not (passive), we agree on that. We are not talking about using what they call SAL or SPL protocols (read their article for details), we agree with them that these protocols are ineffective relics invented by misguided vendors. We are only talking about a difference in methods to find documents to use to train the classifier. Even that is not a major disagreement, as we agree with Grossman and Cormack that high-ranking documents usually make the best trainers, just not in the first seed set. There are also points in a search, depending on the project, where the other methods can help you get to the relevant documents in a fast, efficient manner. The primary difference between us is that we do not limit ourselves to that one retrieval method like Grossman and Cormack do in their trademarked CAL methodology.

Cormack and Grossman emphasize simplicity, ease of use, and reliance on the software algorithms as another way to try to overcome the Bar’s continued resistance to TAR. The e-Discovery Team has the same goal, but we do not think it is necessary to go quite that far for simplicity’s sake. The other methods we use, the other three cylinders, are not that difficult and have many advantages. See e-Discovery Team at TREC 2015 Total Recall Track, Final Report (116 pg. PDF, and web page with short summary). Put another way, we like the ability of fully automatic driving from time to time, but we want to keep an attorney’s learned hand at or near the wheel at all times. See Why the ‘Google Car’ Has No Place in Legal Search.

Accessibility with Integrity: The First Reason We Recommend the Article


Here’s the first reason we like Grossman & Cormack’s article, Continuous Active Learning for TAR: you do not have to be one of Professor Cormack’s PhD students to understand it. Yes. It is accessible, not overly technical, and yet still has scientific integrity, still has new information, accurate information, and still has useful knowledge.

It is not easy to do both. I know because I try to make all of my technical writings that way, including the 57 articles I have written on TAR, which I prefer to call Predictive Coding, or CAR. I have not always succeeded in getting the right balance, to be sure. Some of my articles may be too technical, and perhaps some suffer from breezy information over-load and knowledge deficiency. Hopefully none are plain wrong, but my views have changed over the years. So have my methods. If you compare my latest work-flow (below) with earlier ones, you will see some of the evolution, including the new emphasis over the past few years on continuous training.

[Image: latest predictive coding work-flow chart]

The Cormacks and I are both trying hard to get the word out to the Bar as to the benefits of using active machine learning in legal document review.  (We all agree on that term, active machine learning, and all agree that passive machine learning is not an acceptable substitute.) It is not easy to write on this subject in an accurate, yet still accessible and interesting manner. There is a constant danger that making a subject more accessible and simple will lead to inaccuracies and misunderstandings. Maura and Gordon’s latest article meets this challenge.

Take for example the first description in the article of their continuous active training search method using highest ranking documents:

At the outset, CAL resembles a web search engine, presenting first the documents that are most likely to be of interest, followed by those that are somewhat less likely to be of interest. Unlike a typical search engine, however, CAL repeatedly refines its understanding about which of the remaining documents are most likely to be of interest, based on the user’s feedback regarding the documents already presented. CAL continues to present documents, learning from user feedback, until none of the documents presented are of interest.

That is a good way to start an article. The comparison with a Google search having continued refinement based on user feedback is well thought out; simple, yet accurate. It represents a description honed by literally hundreds of presentations on the topic by Maura Grossman. No one has talked more on this topic than she has, and I for one intend to start using this analogy.
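The loop in that quote is simple enough to sketch in a few lines of code. What follows is a toy illustration only, not Grossman and Cormack's software. The crude add-or-subtract term weighting is a stand-in for a real learning algorithm, and the names `cal_review` and `is_relevant`, the batch size, and the seven-document collection are all assumptions made up for the example.

```python
def tokenize(text):
    return set(text.lower().split())

def cal_review(docs, query, is_relevant, batch_size=2):
    """Toy continuous active learning loop over docs (dict of id -> text).

    is_relevant is a callable standing in for the human reviewer.
    Returns (ids in review order, ids coded relevant).
    """
    weights = {term: 1.0 for term in tokenize(query)}  # start from the query
    unreviewed = set(docs)
    reviewed, relevant = [], []

    def score(doc_id):
        return sum(weights.get(t, 0.0) for t in tokenize(docs[doc_id]))

    while unreviewed:
        # Present the documents currently judged most likely to be of interest.
        batch = sorted(unreviewed, key=lambda d: (-score(d), d))[:batch_size]
        hits = 0
        for doc_id in batch:
            label = is_relevant(doc_id)
            reviewed.append(doc_id)
            unreviewed.discard(doc_id)
            if label:
                relevant.append(doc_id)
                hits += 1
            # Learn from the feedback: nudge each term's weight up or down.
            delta = 1.0 if label else -1.0
            for term in tokenize(docs[doc_id]):
                weights[term] = weights.get(term, 0.0) + delta
        if hits == 0:
            break  # none of the documents presented were of interest
    return reviewed, relevant

# Hypothetical mini-collection; "c" never mentions the query word.
docs = {
    "a": "apple ipad launch plan",
    "b": "apple mac purchase order",
    "c": "mac laptop purchase for office",
    "d": "hurricane relief budget",
    "e": "school budget vote",
    "f": "road budget hearing",
    "g": "staff meeting notes",
}
reviewed, relevant = cal_review(docs, "apple", lambda d: d in {"a", "b", "c"})
```

In this toy run document "c" never contains the word apple, yet it is found because the feedback on "a" and "b" taught the model that mac and purchase matter. The review then stops after a batch with nothing of interest, leaving "f" unread.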

Rare Description of Algorithm Types – Our Second Reason to Recommend the Article

Another reason our Team liked Continuous Active Learning for TAR is the rare description of search algorithm types that it includes. Here we see the masterful touch of one of the world’s leading academics on text retrieval, Gordon Cormack. First, the article makes clear the distinction between effective analytic algorithms that truly rank documents using active machine learning, and a few other popular programs now out there that use passive learning techniques and call it advanced analytics.

The supervised machine-learning algorithms used for TAR should not be confused with unsupervised machine-learning algorithms used for clustering, near-duplicate detection, and latent semantic indexing, which receive no input from the user and do not rank or classify documents.

These other older, unsupervised search methods are what I call concept search. They are not predictive coding. They are not advanced analytics, no matter what some vendors may tell you. They are yesterday’s technology – helpful, but far from state-of-the-art. We still use concept search as part of multimodal, just like any other search tool, but our primary reliance to properly rank documents is placed on active machine learning.

The Cormack-Grossman article goes farther than pointing out this important distinction; it also explains the various types of bona fide active machine learning algorithms. Again, some are better than others. First Professor Cormack explains the types that have been found to be effective by extensive research over the past ten years or so.

Supervised machine-learning algorithms that have been shown to be effective for TAR include:

– Support vector machines. This algorithm uses geometry to represent each document as a point in space, and deduces a boundary that best separates relevant from not relevant documents.

– Logistic regression. This algorithm estimates the probability of a document’s relevance based on the content and other attributes of the document.

Conversely, Cormack explains:

Popular, but generally less effective, supervised machine-learning algorithms include:

– Nearest neighbor. This algorithm classifies a new document by finding the most similar training document and assuming that the correct coding for the new document is the same as its nearest neighbor.

– Naïve Bayes (Bayesian classifier). This algorithm estimates the probability of a document’s relevance based on the relative frequency of the words or other features it contains.
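To make two of these descriptions concrete, here is a bare-bones sketch of a logistic regression scorer next to a nearest neighbor classifier, both working on simple word-presence features. This is an assumption-laden teaching toy, not the article's code or any vendor's implementation: real tools use far richer features and tuned training, and the function names, learning rate, and epoch count are invented for the illustration.

```python
import math

def features(text):
    return set(text.lower().split())

def train_logistic(examples, epochs=200, lr=0.5):
    """examples: list of (text, label), label 1 = relevant, 0 = not relevant."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = features(text)
            z = bias + sum(weights.get(f, 0.0) for f in feats)
            p = 1.0 / (1.0 + math.exp(-z))   # estimated probability of relevance
            error = label - p                # gradient of the log-likelihood
            bias += lr * error
            for f in feats:
                weights[f] = weights.get(f, 0.0) + lr * error
    return weights, bias

def probability(model, text):
    weights, bias = model
    z = bias + sum(weights.get(f, 0.0) for f in features(text))
    return 1.0 / (1.0 + math.exp(-z))

def nearest_neighbor(examples, text):
    """Copy the coding of the most similar training document (Jaccard similarity)."""
    feats = features(text)

    def similarity(example):
        other = features(example[0])
        union = feats | other
        return len(feats & other) / len(union) if union else 0.0

    return max(examples, key=similarity)[1]

# Hypothetical training documents coded by a reviewer.
training = [
    ("apple ipad order", 1),
    ("mac laptop purchase", 1),
    ("budget hearing schedule", 0),
    ("school budget vote", 0),
]
model = train_logistic(training)
```

The practical difference shows up in the output. `probability(model, ...)` returns a number between 0 and 1 that can be used to rank every document in the collection, while `nearest_neighbor` only hands back the coding of a single look-alike document.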

Ask your vendor which algorithms its software includes. Prepare yourself for double-talk.


If you try out your vendor’s software and the Grossman-Cormack CAL method does not work for you, and even the e-Discovery Team’s slightly more diverse Hybrid Multimodal method does not work, then your software may be to blame. As Grossman-Cormack put it, where the phrase “TAR tool” means software:

[I]t will yield the best possible results only if the TAR tool incorporates a state-of-the-art learning algorithm.

That means software that uses a type of support vector machine and/or logistic regression.

Teaching by Example – Our Third Reason to Recommend the Article

The article uses a long example involving search of Jeb Bush email to show you how their CAL method works. This is an effective way to teach. We think they did a good job with this. Rather than spoil the read with quotes and further explanation, we urge you to check out the article to see for yourself. Yes, it is an oversimplification, after all this is a short article, but it is a good one, and is still accurate.

Quality Control Suggestions – Our Fourth Reason to Recommend the Article

Another reason we like the article is the quality control suggestions it includes. They essentially speak of using other search methods, which is exactly what we do in Hybrid Multimodal. Here are their words:

To increase counsel’s confidence in the quality of the review, they might:

Review an additional 100, 1,000, or even more documents.

Experiment with additional search terms, such as “Steve Jobs,” “iBook,” or “Mac,” and examine the most-likely relevant documents containing those terms.

Invite the requesting party to suggest other keywords for counsel to apply.

Review a sample of randomly selected documents to see if any other documents of interest are identified.

We like this because it shows that the differences are small between the e-Discovery Team’s Hybrid Multimodal method (hey, maybe I should claim Trademark rights to Hybrid Multimodal, but then again, no vendors are using my phrase to sell their products) using continuous active training, and the Grossman-Cormack trademarked CAL method. We also note that their section on Measures of Success essentially mirrors our own thoughts on metric analysis and ei-Recall. See Introducing “ei-Recall” – A New Gold Standard for Recall Calculations in Legal Search – Part One, Part Two and Part Three.

Article Comes With an Online “Do it Yourself” CAL Trial Kit – Our Fifth Reason to Recommend the Article

We are big believers in learning by doing. That is especially true in legal tasks that seem complicated in the abstract. I can write articles and give presentations that provide explanations of AI-Enhanced Review. You may get an intellectual understanding of predictive coding from these, but you still will not know how to do it. On the other hand, if we have a chance to show someone an entire project, have them shadow us, then they will really learn how it is done. It is like teaching a young lawyer how to try a case. For a price, we will be happy to do so (assuming conflicts clear).

Maura and Gordon seem to agree with us on that learn by doing point and have created an online tool that anyone can use to try out their method. It allows for a search of the Jeb Bush email, the same set of 290,099 emails that we used in ten of the thirty topics in 2015 TREC. In their words:

There is no better way to learn CAL than to use it. Counsel may use the online model CAL system to see how quickly and easily CAL can learn what is of interest to them in the Jeb Bush email dataset. As an alternative to throwing up their hands over seed sets, control sets, F1 measures, stabilization, and overturns, counsel should consider using their preferred TAR tool in CAL mode on their next matter.

You can try out their method with their online tool, or in a real project using your vendor’s tool. By the way, we did that as part of our TREC 2015 experiments, and the Kroll Ontrack software worked about the same as theirs, even when we used their one-cylinder, high ranking only, CAL (their trademark) method.

Here is where you can find their CAL testing tool: cormack.uwaterloo.ca/cal. Those of you who are still skeptical can see for yourself how it works. You can follow the example given in the article about searching for documents relevant to Apple products, to verify their description of how that works. For even more fun, you can dream up your own searches.

President George W. Bush. Photo by Eric Draper, White House.

Perhaps, if you try hard enough, you can find some example searches where their high-end only method, which is built into the test software, does not work well. For example, try finding all emails that pertain to, or in any way mention, the then President, George Bush. Try entering George Bush in the demo test and see for yourself what happens.

It becomes a search for George + Bush in the same document, and then goes from there based on your coding the highest ranked documents presented as either relevant or non-relevant. You will see that you quickly end up in a TAR pit. The word Bush is in every email (I think), so you are served up with every email where George is mentioned, and believe me, there are many Georges, even if there is only one President George Bush. Here is the screen shot of the first document presented after entering George Bush. I called it relevant.

[Screen shot of the first document presented after entering George Bush]

These kinds of problem searches do not discredit TAR, or even the Grossman Cormack one-cylinder search method. If this happened to you in a real search project, you could always use our Hybrid Multimodal™ method for the seed set (1st training), or start over with a different keyword or keywords to start the process. You could, for instance, search for President Bush, or President within five of George, or “George Bush.” There are many ways, some faster and more effective than others.
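For those who want to see why these alternatives behave so differently, here is a small sketch of the three kinds of keyword search just mentioned: both words anywhere in the document, an exact phrase, and a within-five-words proximity search. The function names and the two sample emails are made up for the illustration, and real review platforms have their own query syntax for the same ideas.

```python
import re

def words(text):
    return re.findall(r"[a-z0-9']+", text.lower())

def contains_all(text, terms):
    """Boolean AND, like George + Bush anywhere in the same document."""
    present = set(words(text))
    return all(t.lower() in present for t in terms)

def contains_phrase(text, phrase):
    """Exact phrase on word boundaries, like "George Bush" in quotes."""
    tokens, target = words(text), words(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

def within(text, term_a, term_b, distance=5):
    """True if term_a and term_b occur within `distance` words of each other."""
    tokens = words(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return any(abs(a - b) <= distance for a in pos_a for b in pos_b)

# Two hypothetical emails from a collection where "Bush" is in every header.
county = "From: Jeb Bush. George from the county office called about the permit."
visit = "From: Jeb Bush. President George W. Bush will visit Tampa next week."
```

With these samples, `contains_all(county, ["George", "Bush"])` is true even though the email has nothing to do with the President, which is the TAR pit described above. The phrase search has the opposite problem, since `contains_phrase(visit, "George Bush")` is false because of the middle initial. The proximity search `within(visit, "President", "George")` catches the second email and skips the first.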

Even using the single method approach, if you decided to use the keywords “President + Bush”, then the search will go quicker than “George + Bush.” Even just using the term “President” works better than George + Bush, but still seems like a TAR pit, and not a speeding CAR. It will probably get you to the same destination, high recall, but the journey is slightly longer and, at first, more tedious. This high recall result was verified in TREC 2015 by our Team, and by a number of universities who participated in the fully automatic half of the Total Recall Track, including Gordon’s own team. This was all done without any manual review by the fully automatic participants because there was instant feedback of relevant or irrelevant based on a prejudged gold standard. See e-Discovery Team at TREC 2015 Total Recall Track, Final Report (116 pg. PDF, and web page with short summary). With this instant feedback protocol, all of the teams attained high recall and good precision. Amazing but true.

You can criticize this TREC experiment protocol, which we did in our report, as unrealistic to legal practice because:

(1) there is no SME who works like that (and there never will be, until legal knowledge itself is learned by an AI); and,

(2) the searches presented as tasks were unrealistically over-simplistic. Id.

But you cannot fairly say that CAL (their trademark) does not work. The glass is most certainly not half empty. Moreover, the elixir in this glass is delicious and fun, especially when you use our Hybrid Multimodal™ method. See Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding 3.0.

Conclusion

Active machine learning (predictive coding) using support vector or logistic regression algorithms, and a method that employs continuous active training, using either one cylinder (their CAL), or four (our Hybrid Multimodal), really works, and is not that hard to use. Try it out and see for yourself. Also, read the Grossman Cormack article, Continuous Active Learning for TAR (Practical Law, April/May 2016); it only takes about 30 minutes. Feel free to leave any comments below. I dare say you can even ask questions of Grossman or Cormack here. They are avid readers and will likely respond quickly.


How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom”

April 5, 2016

T. S. Eliot (1888-1965). See his work The Rock

A year ago, April 5, 2015, we published what some consider the e-Discovery Team’s best essay, even though it had little to do with e-discovery: Information → Knowledge → Wisdom: Progression of Society in the Age of Computers. We wrote about the rapid changes in society caused by personal computers and set out our theory of three stages of social development.

The Information → Knowledge → Wisdom blog included twelve predictions to test the accuracy of our social-technological hypothesis. The predictions concerned the transition of society from an Information Age, in which we believe we now live, to a society based on Knowledge. The transition from mere Information to Knowledge is seen as a necessary survival step for society, not an idealistic dream.

The next Knowledge Age is also seen as a transition step to the ultimate goal of a society based on Wisdom. Our predictions did not address this last step to Wisdom because this step is, in our opinion, too far out time-wise for any meaningful predictions. It is possible for some individuals to make this step now, but not enough for a whole society to be centered in Wisdom. We have a long way to go to move from an Information to a Knowledge Society before we can make predictions on how a Wisdom based society will arise.

Our time-line for the first transition from Information to Knowledge is already pretty “far-out.” We thought the predictions would come true in five to twenty years. In this blog, a year later, we check the predictions (in bold) and provide a short report on how well they are doing.

Bottom line, our predictions are holding up remarkably well, especially our top prediction of new kinds of cyber education environments and VR. It is very encouraging to see how far society has progressed in just a year. Our technological civilization is still in danger from Information overload, and lack of processed Knowledge, to be sure. The political events of the last year underscore the serious threats. Still, our technology is evolving as predicted and, overall, society is moving in the right direction.

Here is our first prediction.

Top Prediction – VR Community Education

1. Several inventions, primarily in insanely great new computer hardware and software, will allow for the creation of many new types of cyber and physical interconnectivity environments. There will be many more places that will help people to go beyond information to knowledge. They will be both virtual realities, for you or your avatars to hang out, and real-world meeting places for you and your friends to go to. They will not be all fun and games (and sex), although that will be a part of it. Many will focus exclusively on learning and knowledge. The new multidimensional, holographic, 3D, virtual realities will use wearables of all kinds, including Oculus-like glasses, iWatches, and the like. Implant technology will also arise, including some brain implants, and may even be common in twenty years. Many of the environments, both real and VR, will take education and knowledge to a new level. Total immersion in a learning environment will take on new meaning. The TED of the future will be totally mind-blowing.

Although we said five to twenty years for these predictions, as it turns out the first prediction is much further along than we knew. On the new inventions front, we now know that Sony will release a PlayStation VR in October 2016 for $399. Also see 7 Virtual Reality Highlights From the Game Developers Conference (NYT, 3/19/16).

We also now know that Microsoft is releasing new technology in 2016 that it calls Windows Holographic, and modestly describes as “the most advanced holographic computer the world has ever seen.” It allows for both VR and augmented reality. That means it can add holograms to the real 3D world around you. Microsoft is also reportedly working on a related 3D communications technology, dubbed ‘Holoportation’ (shown in picture above), that simulates teleportation using the HoloLens augmented-reality glasses. That looks really cool.

There have been a host of other inventions as well, including improved VR hand sensors, animated ebooks controlled by the speed of your voice, and Samsung phone display holograms. Virtual reality is taking off for a number of small and large companies. Virtual reality trips were, for instance, everywhere at the South by Southwest conference in 2016. Many VR communities are in various stages of development, including a VR City called Hypatia that is well along and has many social-educational components.

Facebook’s long-awaited Oculus Rift (shown right) began shipping at the end of March 2016. Right now it costs $599 and you need a souped-up PC to use it. First reviews of the Oculus hardware are praise-filled, although the device itself is well ahead of the software designed to use it. All of these new VR headsets are expected to trigger many new apps, some of which will likely have educational components.

The New York Times give-away of Google Cardboard VR viewers was also a big deal in late 2015. I got one included with my Sunday paper. I was surprised by the quality of both the free cardboard headset and the content the Times created for it. All you do is put your cell phone in the cardboard box that has lenses in it. It’s simple and works well. This is a good start to a new type of total-immersion journalistic report. I highly recommend you try one of the Google Cardboard viewers, especially since they are still very low cost. Most of the apps designed to work on them, including games, are also free or low cost.

On the importance of VR to education, and thus the transition to a Knowledge based society, a noteworthy conference called IMMERSION 2015, sponsored by the Sorbonne and the Smithsonian, was held in Paris in October 2015. One of the modules of the conference was Immersive Education: Teaching and Learning in the Age of Immersion. See the VR Education group. Also see Two students hope to help explain complex 3D math and science concepts through virtual reality enterprise. I am sure if we kept researching we would find many more examples like that.

The first prediction is moving into reality faster than we expected. That is good news. Larissa Bailiff, the senior editor of education and content for WoofbertVR (shown right), wrote in her article When Virtual Reality Meets Education:

In what may turn out to be an immersive education game changer, Google launched its Pioneer Expeditions in September 2015. Under this program, thousands of schools around the world are getting — for one day — a kit containing everything a teacher needs to take their class on a virtual trip …

And with VR platforms like AltspaceVR and LectureVR (an initiative of Immersive VR Education), entirely new possibilities are available for teachers of all kinds, as the technology of making avatars and supporting “multi-player” sessions allows for an exponentially scaled level of socialization and outreach.

The use of VR for educational environments in global communities is already far along. When the even better technology just released is developed in the marketplace, and prices come down, this should scale quickly. The Sony, Microsoft, Samsung, and Facebook (Oculus) hardware coming out in 2016 will enable thousands of software entrepreneurs to enter the market. So too will new projects coming out by Google, Apple, etc., in 2017. These new technologies will allow education and art content creators to have the kind of impact needed to push us into a Knowledge-based Society.

The use of implants that is part of the first prediction is also progressing rapidly. See, e.g., Grinders, Cyborgs & Transhumanists; Scientists propose ‘cortical modem’ implant; DARPA is sending brain implants on a voyage round the body to power artificial limbs; Do-it-yourself biology: Biohackers implanting rice grain-sized chips under skin. I question the balance of people experimenting with body augmentation at this early stage, but some people like dangerous things, such as the hand implant shown by the thumb in the x-ray photo above.

New applications for the iWatch and other wearables will hopefully come out soon too. (The iWatch to date has largely been a dud, thanks to poor sensors and app development delays.) Increased sensor abilities should also come soon. When that happens it will be easier to personalize information in a holistic manner and so hopefully facilitate self-knowledge.

There is at least one far-out type of technology research now underway that involves the targeted stimulation of the peripheral nervous system to facilitate learning in a wide range of cognitive skills. It would sound bogus but for the fact that it is sponsored by DARPA, the Defense Department’s Advanced Research Projects Agency. DARPA calls the project Targeted Neuroplasticity Training (TNT). According to the DARPA announcement of 3/16/16:

“Recent research has shown that stimulation of certain peripheral nerves, easily and painlessly achieved through the skin, can activate regions of the brain involved with learning,” said TNT Program Manager Doug Weber (shown here) adding that the signals can potentially trigger the release of neurochemicals in the brain that reorganize neural connections in response to specific experiences. “This natural process of synaptic plasticity is pivotal for learning, but much is unknown about the physiological mechanisms that link peripheral nerve stimulation to improved plasticity and learning,” Weber said. “You can think of peripheral nerve stimulation as a way to reopen the so-called ‘Critical Period’ when the brain is more facile and adaptive. TNT technology will be designed to safely and precisely modulate peripheral nerves to control plasticity at optimal points in the learning process.”


DARPA chart by Dr. Weber

You can follow Dr. Weber here on Twitter. We are.

An article on this project by Kurzweil News, DARPA’s ‘Targeted Neuroplasticity Training’ program aims to accelerate learning ‘beyond normal levels’ (3/23/16), states:

DARPA already has research programs underway to use targeted stimulation of the peripheral nervous system as a substitute for drugs to treat diseases and accelerate healing*, to control advanced prosthetic limbs**, and to restore tactile sensation.

But now DARPA plans to take an even more ambitious step: It aims to enlist the body’s peripheral nerves to achieve something that has long been considered the brain’s domain alone: facilitating learning — specifically, training in a wide range of cognitive skills. …

The program is also notable because it will not just train; it will advance capabilities beyond normal levels — a transhumanist approach. …

The engineering side of the program will target development of a non-invasive device that delivers peripheral nerve stimulation to enhance plasticity in brain regions responsible for cognitive functions.

Obviously the military is interested in the potential of brain stimulation; they say it is to train super-spy agents to rapidly master foreign languages and cryptography. If this works (a big if right now), it likely would go much further than that. What would a commando unit of super-quick learners look like? Could they beat robots (another DARPA project)? I hope we never find out.


If TNT neurostimulation is really able to enhance learning, as DARPA thinks, then it could have many non-military applications too. What if anybody could study law for just a few months, or a week, and pass the Bar exam? What if the same applied to most PhD programs? What if you could learn to speak a new language in a week? Write in a new software code? New martial arts moves as in The Matrix? What if you could learn anything you wanted, when you wanted, really really fast? Or, what if it was just fast, say half the time, or a tenth the time, that it would normally take to learn a complex skill?

What if electro-stimulation (or some other method) could hack your brain into a super high-gear that was once the exclusive province of rare geniuses? If genius becomes commonplace, could a knowledge society be far off? The TNT project has the potential to accelerate our transition to a knowledge-based society very rapidly, especially if there is widespread distribution of this new technology. The twists and turns that could come out of this are mind-boggling. Let’s just hope we do not become overwhelmed with idiot-savants.

To be honest, the whole theory of simple nerve stimulation triggering freak learning abilities sounds more than a little ridiculous to us. Too easy. Nevertheless, the DARPA funding and Doug Weber give it credibility. DARPA’s past projects include ARPANET, which later became the Internet. Indeed, many commonplace things once seemed ridiculous, such as a computer in every home.

Four Predictions on Social Media and Dissemination of Expertise

2. Some of the new types of social media sites will be environments where subject matter experts (SME) are featured, avatars and real, cyber and in-person, shifted and real-time. There will also be links to other sites or rooms that are primarily information sources.

The Pope is now on Instagram. What more need we say? There has been real progress in this area, although we still have a long way to go. See, e.g., What Do These Top Industry Experts Use Social Media For?; How Social Media Can Help Students Study; Connecting a Classroom: Reflections on Using Social Media With My Students. Still, when a Pope like Francis uses media for educational, inspirational purposes, we have made real progress. Everyday people are doing it too in their own way, even us. See the Team community growing on Twitter.

3. The new SME environment will include products and services, with both free and billed aspects. 

Development here is slow and steady but, as expected, it has not taken off yet. See, e.g., PrestoExperts (online hook-up to experts in many fields); www.experts.com; Experts Exchange; and the site for legal services, AVVO.

4. The knowledge nest community environments will be both online and in-person. The real life, real world, interactions will be in safe public environments with direct connections with cyberspaces. It will be like stepping out of your computer into a Starbucks or laid-back health spa.

Universities with old-timey, all too linear professors still rule the roost. Although some colleges are becoming more online and digital oriented, real innovation is still a few years off. See Penn Study: Massive Open Online Courses Not a Threat to Traditional Business Schools; edX – Free online courses from the world’s best universities; The 30 Most Innovative Online Colleges; COURSERA.

Most of the professors and other professionals, including law and medicine, have yet to step out of their comfort zone and into cyberspace, much less non-traditional education zones. As the technologies improve we expect that they will be more motivated to do so. Real progress and innovation will follow after that happens. We do note, however, that Amazon has just opened its first physical bookstores and see this as an encouraging step. It may seem retro, but it is really a step forward toward knowledge based communities. We may see more high-tech libraries constructed soon that also fill that purpose. We expect they will be more about space than books.

5. The knowledge focused cyberspaces, both those with and without actual real-world SMEs, will look and feel something like a good social media site of today, but with multimedia of various kinds. Some will have Oculus type VR enhancements like the Star Trek holodeck. All will have system administrators and other staff who are tireless, knowledgeable, and fair; but most will not be human.

This prediction depends in large part on the actualization of the first four. These kinds of mature multidimensional cyberspaces will come later, when the other predictions come true, and when AIs are more developed as discussed next.

Predictions on AI

Seven of our predictions as to how society will likely transition from an Information Age to a Knowledge Age involved the use of new and improved kinds of artificial intelligence entities. Although this was a big year for AI PR, there were no major breakthroughs. Not yet.

The big news this year in AI is that Google created a deep learning based AI system for playing the world’s most complicated game. The AlphaGo software was able to beat a reigning Grand Master in Go in four out of five games. See AlphaGo, Lee Sedol, and the Reassuring Future of Humanity (The New Yorker, 3/15/16). Many thought that it would take a decade for a computer to learn how to beat a Grand Master at the world’s most complex game.

It was an impressive victory. Still, Google’s AlphaGo, which used deep learning algorithms, can only do one thing, play Go. See AlphaGo and the Limits of Machine Intuition (Harvard Business Review, 3/18/16). To overuse the word, the fact that this was the big news in AI development this year shows that we still have a long way to go. Moreover, no AI yet born, much less conceived, would appreciate why you are now snickering, or annoyed, or both.

It may be that new hardware development was the big news last year: computers designed to help run AI code. See Nvidia announces a supercomputer aimed at deep learning and AI (TechCrunch, April 5, 2016). The new Nvidia computers are designed to run deep learning systems a/k/a neural networks. They are due to be released in June 2016 and will sell for $129,000. These Nvidia supercomputers could also become the gold standard in VR machines.

Coldewey in his TechCrunch article explains that:

These are programs that simulate human-like thought processes by looking very closely at a huge set of data and noting similarities and differences on multiple levels of organization.

This is how Nvidia explains the new technology: (emphasis added)

Computer programs contain commands that are largely executed sequentially. Deep learning is a fundamentally new software model where billions of software-neurons and trillions of connections are trained, in parallel. Running DNN algorithms and learning from examples, the computer is essentially writing its own software. This radically different software model needs a new computer platform to run efficiently.

Nvidia claims to have created a supercomputer designed to fill that platform need. It uses what they call GPUs instead of CPUs. I think this will soon be a crowded field.

Here are the seven AI related predictions made last year. Again, we do not expect to see these advances for at least five years, and as many as twenty.

6. The admins, operators and other staff in these cyberspaces will be advanced AI, like cyber-robots. Humans will still be involved too, but will delegate where appropriate, which will be most of the time. This is one of my key predictions.

The only development I am aware of along these lines is on Facebook. It now has an AI that is automatically writing photo captions. If you hear of anything more, please let me know. It would not seem that difficult to do on at least a rudimentary level, so I still expect to see this advance soon. Much easier than an adult Turing test. See Edge.org contributors discuss the future of AI.

7. The presence of AIs will spread and become ubiquitous. They will be a key part of the IOT – Internet of Things. Even your refrigerator will have an AI, one that you program to fit your current dietary mood and supply orientation.

The IOT is spreading fast as expected, but not yet the communicative AI. Since our cybersecurity is so poor, we are not so sure that is a bad thing. Still, the recent advances in Amazon’s Alexa are promising, and no doubt Siri will also get a lot smarter in the next few years. So too will Google Now and Cortana, and so too might a new personal assistant startup called Viv. There are many like this in the works. See Virtual Personal Assistants: The software secretaries (The Economist, 9/12/15).

8. The knowledge products and services will come in a number of different forms, many of which do not exist in the present time, but will be made possible by other new inventions, especially in the area of communications, medical implants, brain-mind research, wearables, and multidimensional video games and conferences.

See our prior comments to the related predictions two through five. Until the AI improves, and/or human inventors take off with great new ideas and products, this prediction of innovation remains conjecture. The creative diversity here predicted requires a developed market that is still several years out. Still, we are seeing early forms of this in things like online mental health counseling using video connections and the like.

9. All subject areas will be covered, somewhat like Wikipedia, but with super-intelligent cyber robots to test, validate and edit each area. The AI robots will serve most of the administrator and other cyber-staffing functions, but not all.

This kind of super-librarian AI still seems decades away. But we recently found out that Wikipedia is already working on something like this. See Artificial intelligence introduced to improve Wikipedia edits (“The Wikimedia foundation is embracing machine learning to make the editing process more streamlined and forgiving for new contributors.”). That is a good start.

10. The AI admins will monitor, analyze, and screen out alleged SMEs who do not meet certain quality standards. The AI admins will thus serve as a truth screen and quality assurance. An SME’s continued participation in an AI certified site will be like a Good Housekeeping Seal of Approval.

We see nothing like this yet on the horizon, although we do see some physician and attorney ranking systems that work on crowd-sourcing. We do, however, remain confident that this prediction will come true within our outside time range of twenty years.

11. The AI admins will also monitor and police the SME services and opinions for fraud and other unacceptable use, and for general cybersecurity. The friendly management AIs will even be involved in system design, billing, collection, and dispute resolution.

The use of AI in general fraud detection, credit scoring and all types of financial analysis, including stock trading, is already well underway. But our prediction here was oriented to AI administration and monitoring of SME services. When these new social media type SME services are developed, the AIs will be well equipped to service the sites and protect users.

12. Environments hosted by such friendly, fair, patient, sometimes funny, polite (per your specified level, which may include insult mode), high IQ intelligence, both human and robot, will be generally considered to be reliable, bona fide, effective, safe, fun, enriching, and beautiful. They will provide a comforting alternative to information overload environments filled with conflicting information, including its lowest form, data. These alternative knowledge nests will become a refuge of music in a sea of noise. Some will become next generation Disney World vacation paradises.

This twelfth prediction is built on all of the rest. It will necessarily be one of the last to come true.

Conclusion

The development of VR and education is proceeding very rapidly, well ahead of our minimum five-year projections. AI is also making steady progress, especially with deep learning algorithms. See D. Scott Phoenix, How artificial intelligence is getting even smarter (World Economic Forum, Aug. 2015); Clark, Jack, Why 2015 Was a Breakthrough Year in Artificial Intelligence (Bloomberg, 12/8/15). Although we do not think 2015 was a breakthrough year for AI, we remain confident that its day will come. When the breakthrough year does in fact arrive, it will be quite momentous.

We remain hopeful that artificial intelligence will help usher in a Golden Age of Knowledge, then ultimately of Wisdom. This is not to deny the possibility of dark futures with human subjugation by robot overlords or all-too-human political despots, etc. In order to avoid these dystopias we need to know and understand the real dangers we are now facing, including, without limitation, AI, and act accordingly. The AI dangers of unethical robots are another area where lawyers could work with scientists and others to make valuable contributions to the future of humanity.

In closing I leave you with a question: who would you rather hang out with, a well informed person, a knowledgeable person, or a wise one? Here are my thoughts. One important thing I forgot to mention in my video is that the wise are always funny. If they sound wise, but are very serious, you know you are in the presence of a merely knowledgeable person who has pretensions of wisdom. Run.

_________

____


Five Tips to Avoid Costly Mistakes in Electronic Document Review – Part 3

April 2, 2016

This is part three of my blog series, Five Tips to Avoid Costly Mistakes in Document Review. Part One gave an introduction and explained the first tip, the Time factor, talking about the importance of avoiding time pressures and resultant hurried activities. Part Two explained the second tip, Ethics, and how scrupulous integrity and compliance with the Rules of Professional Conduct will help you to avoid a host of serious mistakes, including career-ending ones. This third and final blog explains tip Three on Focused Concentration, which was mentioned in passing in the Part One video on Time, and tips four and five, on Worms and Check Again.

The Focus tip is based on my own experiences in cultivating the ability to concentrate on legal work, or anything else. It is contrary to the popular but erroneous notion, a myth really, that you can multi-task and still do each task efficiently. Our brain does not work that way. See, e.g., Crenshaw, The Myth of Multitasking: How “Doing It All” Gets Nothing Done; and the work of neuroscientist Daniel J. Levitin, who has found the only exception is adding certain background music. All document reviewers who wear head-sets, myself included, know this exception very well.

For more on quality control and improved lifestyle by focused attention and other types of meditation, see my earlier video blog, Document Review and Predictive Coding: Video Talks – Part Six, especially the 600 word introduction to that video that includes information on the regular meditation practices of Supreme Court Justice Stephen Breyer, among others. Also see A Word About Zen Meditation. This practice helped Steve Jobs, and helps Justice Breyer and countless others. It could help you too. It will, at the very least, allow for more focused attention to what you are doing, including document review, and thus greatly reduce mistakes.

The Worms tip is a simple technical one, unique to e-discovery, where WORM is an acronym for write once, read many times. I prefer to make productions on write-once, recordable-only CDs or DVDs, aka CD-R or DVD-R, and not by file transfers. I do not want to use a CD-RW or DVD-RW, meaning one that is rewritable.

The fifth tip, Check Again, has to do with the importance of redundancy in quality control, subject only to proportionality considerations, including the tip to spot check your final production CD. I discuss briefly the tendency of lawyers to be trapped by paralysis by analysis, and why we are sometimes considered deal killers by business people because we focus so much on risk avoidance and over-think things. There has to be a proportional limit on the number and cost of double-checks in document review. I also mention in the fifth tip my Accept on Zero Error and ei-Recall checks, which are quality assurance efforts that we make in larger document review projects.

________

Tip #3 – Focused Concentration

____

Tip #4 – Use WORMS to Produce

____

Tip #5 – Check Again

____

