What Information Theory Tell Us About e-Discovery and the Projected ‘Information → Knowledge → Wisdom’ Transition

May 28, 2016

This is an article on Information Theory, the Law, e-Discovery, Search and the evolution of our computer technology culture from Information → Knowledge → Wisdom. As usual, the article assumes familiarity with writings on AI and the Law, especially active machine learning types of Legal Search. The article also assumes some familiarity with the scientific theory of Information as set forth in James Gleick’s book, The Information: a history, a theory, a flood (2011). I will begin the essay with several good instructional videos on Gleick’s book and Information Theory, including a bit about the life and work of the founder of Information Theory, Claude Shannon. Then I will provide my personal recapitulation of this theory and explore the application to two areas of my current work:

  1. The search for needles of relevant evidence in large, chaotic, electronic storage systems, such as email servers and email archives, in order to find the truth, the whole truth, and nothing but the truth needed to resolve competing claims of what happened – the facts – in the context of civil and criminal law suits and investigations.
  2. The articulation of a coherent social theory that makes sense of modern technological life, a theory that I summarize with the phrase: Information → Knowledge → Wisdom. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers and the more recent, How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom.”

I essentially did the same thing in my blog last week applying Chaos Theories. What Chaos Theory Tell Us About e-Discovery and the Projected ‘Information → Knowledge → Wisdom’ Transition. This essay will, to some extent, build upon the last and so I suggest you read it first.

Information Theory

Gleick’s The Information: a history, a theory, a flood covers the history of cybernetics, computer science, and the men and women involved with Information Theory over the last several decades. Gleick explains how these information scientists today think that everything is ultimately information. The entire Universe, matter and energy, life itself, is made up of information. Information in turn is ultimately binary, zeros and ones, on and off, yes and no. It is all bits and bytes.

Here are three videos, including two interviews of James Gleick, to provide a refresher on Information Theory for those who have not read his book recently. Information Wants to Have Meaning. Or Does It? (3:40, Big Think, 2014).

The Story of Information (3:47, 4th Estate Books, 2012).

The generally accepted Father of Information Theory is Claude Shannon (1916-2001). He was a great visionary engineer whose ideas and inventions led to our current computer age. Among other things, his 1948 paper introduced the word bit as the basic unit of information, a term he credited to his colleague John Tukey. He was also one of the first MIT hackers, in the original sense of the word as a tinkerer, who was always building new things. The following is a half-hour video by University of California Television (2008) that explains his life’s work and theories. It is worth taking the time to watch it.

Shannon was an unassuming genius and, like Mandelbrot, very quirky and interested in many different things in a wide variety of disciplines. Aside from being a great mathematician, Bell Labs engineer, and MIT professor, Shannon also studied game theory. He went beyond theory and devised several math-based probability methods to win at certain games of chance, including card counting at blackjack. He collaborated with a friend at MIT, another mathematician, Edward Thorp, who became a professional gambler.

Shannon, his wife, and Thorp travelled regularly to Las Vegas for a couple of years in the early sixties, where they constantly won at the tables using their math tricks, including card counting. Shannon wanted to beat the roulette wheel too, but the system he and Thorp developed to do that required probability calculations beyond what he could do in his head. To solve this problem, in 1961 he invented a small, concealable computer, the world’s first wearable computer, to help him calculate the odds. It was the size of a cigarette pack. His Las Vegas exploits became the very loose factual basis for a 2008 movie, “21,” where Kevin Spacey played Shannon. (Poor movie, not worth watching.)

Shannon made even more money by applying his math abilities in the stock market. The list of his eclectic genius goes on and on, including his invention in 1950 of an electromechanical mouse named Theseus that could teach itself how to escape from a maze. Shannon’s mouse appears to have been the first artificial learning device. All that, and he was also an ardent juggler and builder/rider of little bitty unicycles (you cannot make this stuff up). Here is another good video of his life, and yet another to celebrate 2016 as the 100th year after his birth, The Shannon Centennial: 1100100 years of bits by the IEEE Information Theory Society.

_______

For a different view loosely connected with Information Theory I recommend that you listen to an interesting Google Talk by Gleick: “The Information: A History, a Theory, a Flood” – Talks at Google (53:45, Google, 2011). It pertains to news and culture and the tension between a humanistic and mechanical approach, a difference that mirrors the tension between Information and Knowledge. This is a must watch for all news readers, especially NY Times readers, and for everyone who consumes, filters, creates and curates Information (a Google term). This video has a good dialogue concerning modern culture and search.

As you can see from the above Google Talk, a kind of Hybrid Multimodal approach seems to be in use in all advanced search. At Google they called it a “mixed-model.” The search tools are designed to filter identity-consonance in favor of diverse-harmonies. Crowd sourcing and algorithms function as curation authority to facilitate Google search. This is a kind of editing by omission that human news editors have been doing for centuries.

The mixed-model approach implied here has both human and AI editors working together to create new kinds of interactive search. Again, good search depends upon a combination of AI and human intelligence. Neither side should work alone and commercial interests should not be allowed to take control. Both humans and machines should create bits and transmit them. People should use AI software to refine their own searches as an ongoing process. This should be a conversation, an interactive Q&A. This should provide a way out of Information to Knowledge.

Personal Interpretation of Information Theory

My takeaway from the far out reaches of Information theories is that everything is information, even life. All living entities are essentially algorithms of information, including humans. We are intelligent programs capable of deciding yes or no, capable of conscious, intelligent action, binary code. Our ultimate function is to transform information, to process and connect otherwise cold, random data. That is the way most Information Theorists and their philosophers see it, although I am not sure I agree.

Life forms like us are said to stand as the counter-pole to the Second Law of Thermodynamics. The First Law you will remember is that energy cannot be created or destroyed. The Second Law is that the natural tendency of any isolated system is to degenerate into a more disordered state. The Second Law is concerned with the observed one-directional nature of all energy processes. For example, heat always flows spontaneously from hotter to colder bodies, and never the reverse, unless external work is performed on the system. The result is that entropy always increases with the flow of time.

The Second Law is causality by multiplication, not a zig-zag Mandelbrot fractal division. See my last blog on Chaos Theory. Also see the work of the Austrian physicist Ludwig Boltzmann (1844–1906) on gas-dynamical equations, and his famous H-theorem: the entropy of a gas prepared in a state of less than complete disorder must inevitably increase as the gas molecules are allowed to collide. Boltzmann’s proof of the theorem assumed “molecular chaos,” or, as he put it, the Stosszahlansatz, where all particle velocities were completely uncorrelated, random, and did not follow from Newtonian dynamics. His proof of the Second Law was attacked based on the random state assumption and the so-called Loschmidt’s paradox. The attacks from pre-Chaos, Newtonian dominated scientists, many of whom still did not even believe in atoms and molecules, contributed to Boltzmann’s depression and, tragically, he hanged himself at age 62.

My personal interpretation of Information Theory is that humans, like all of life, counter-act and balance the Second Law. We do so by an organizing force called negentropy that balances out entropy. Complex algorithms like ourselves can recognize order in information, can make sense of it. Information can have meaning, but only by our apprehension of it. We hear the falling tree and thereby make it real.

This is what I mean by the transition from Information to Knowledge. Systems that have the ability to process information, to bring order out of chaos, and to attach meaning to information embody that transition. Information is essentially dead, whereas Knowledge is living. Life itself is a kind of Information spun together and integrated into meaningful Knowledge.

We humans have the ability to process information, to find connections and meaning. We have created machines to help us to do that. We now have information systems – algorithms – that can learn, both on their own and with our help. We humans also have the ability to find things. We can search and filter to perceive the world in such a way as to comprehend its essential truth, to see through appearances. It is an essential survival skill. The unseen tiger is death. Now, in the Information Age, we have created machines to help us find things, help us see the hidden patterns.

We can create meaning, we can know the truth. Our machines, our robot friends, can help us in these pursuits. They can help us attain insights into the hidden order behind chaotic systems of otherwise meaningless information. Humans are negentropic to a high degree, probably more so than any other living system on this planet. With the help of our robot friends, humans can quickly populate the world with meaning and move beyond a mere Information Age. We can find order, process the binary yes-or-no choices and generate Knowledge. This is similar to the skilled editor’s function discussed in Gleick’s Talks at Google (53:45, Google, 2011), but one whose abilities are greatly enhanced by AI analytics and crowdsourcing. The arbitration of truth, as they put it in the video, is thereby facilitated.

With the help of computers our abilities to create Knowledge are exploding. We may survive the Information flood. Some day our Knowledge may evolve even further, into higher-level integrations – into Wisdom.

When James Gleick was interviewed by Publishers Weekly in 2011 about his book, The Information: a history, a theory, a flood, he touched upon the problem with Information:

By the technical definition, all information has a certain value, regardless of whether the message it conveys is true or false. A message could be complete nonsense, for example, and still take 1,000 bits. So while the technical definition has helped us become powerful users of information, it also instantly put us on thin ice, because everything we care about involves meaning, truth, and, ultimately, something like wisdom. And as we now flood the world with information, it becomes harder and harder to find meaning. That paradox is the final tension in my book.
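Gleick’s point that the technical measure is blind to meaning can be illustrated with a small sketch. It assumes the simplest possible model, one that only counts how often each character appears, and the sample sentence and the function name `entropy_bits_per_symbol` are made up for this illustration. Under that assumption a sentence and a scrambled, nonsensical copy of it carry the same number of bits per character.

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(message: str) -> float:
    """Shannon entropy of the character frequencies, in bits per character."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


meaningful = "the quick brown fox jumps over the lazy dog"

# Scramble the same characters into nonsense (fixed seed so it is repeatable).
chars = list(meaningful)
random.Random(0).shuffle(chars)
nonsense = "".join(chars)

h_meaningful = entropy_bits_per_symbol(meaningful)
h_nonsense = entropy_bits_per_symbol(nonsense)

print(f"{h_meaningful:.3f} bits/char for the sentence")
print(f"{h_nonsense:.3f} bits/char for the scrambled version")
```

The two figures match because this measure looks only at symbol statistics. Whether the message says anything true, or anything at all, never enters the calculation.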

Application of Information Theory to e-Discovery and Social Progress

In responding to lawsuits we must search through information stored in computer systems. We are searching for information relevant to a dispute. This dispute always arises after the information was created and stored. We do not order and store information according to issues in a dispute or litigation that has not yet happened. This means that for purposes of litigation all information storage systems are inherently entropic, chaotic. They are always inadequately ordered, as far as the lawsuit is concerned. Even if the ESI storage is otherwise well-ordered, which in practice is very rare (think randomly stored PST files and personal email accounts), it is never well-ordered for a particular lawsuit.

As forensic evidence finders we must always sort through meaningless, irrelevant noise to find the meaningful, relevant information we need. The information we search is usually not completely random. There is some order to it, some meaning. There are, for instance, custodian and time parameters that assist our search for relevance. But the ESI we search is never presented to us arranged in an order that tracks the issues raised by the new lawsuit. The ESI we search is arranged according to other logic, if any at all.

It is our job to bring order to the chaos, meaning to the information, by separating the relevant information from the irrelevant information. We search and find the documents that have meaning for our case. We use sampling, metrics, and iteration to achieve our goals of precision and recall. Once we separate the relevant documents from the irrelevant, we attain some knowledge of the total dataset. We have completed First Pass Review, but our work is not finished. Not all of the relevant information found in the First Pass is produced.
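For readers who want the two metrics spelled out: precision is the share of the documents we retrieved that are actually relevant, and recall is the share of all relevant documents that we managed to retrieve. Here is a minimal sketch. The counts are hypothetical, chosen only to make the arithmetic easy to follow, and the function name is my own.

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall) for a completed review.

    true_positives:  relevant documents that were found
    false_positives: irrelevant documents wrongly marked relevant
    false_negatives: relevant documents that were missed
    """
    retrieved = true_positives + false_positives
    relevant = true_positives + false_negatives
    precision = true_positives / retrieved if retrieved else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# Hypothetical review: 10,000 documents marked relevant, 8,000 of them correctly,
# while another 8,000 relevant documents were left behind in the collection.
precision, recall = precision_recall(
    true_positives=8_000, false_positives=2_000, false_negatives=8_000
)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```

In practice the missed documents are never counted directly. That number is estimated by sampling, which is why sampling and metrics go together.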

Additional information refinement is required. More yes-no decisions must be made in what is called Second Pass Review. Now we consider whether a relevant document is privileged and thus excluded from production, or whether portions of it must be redacted to protect confidentiality.

Even after our knowledge is so further enhanced by confidentiality sorting, and a production set is made, the documents produced, our work is still incomplete. There is almost always far too much information in the documents produced for them to be useful. The information must be further processed. Relevancy itself must be ranked. The relevant documents must be refined down to the 7 +/- 2 documents that will persuade the judge and jury to rule our way, to reach the yes or no decision we seek. The vast body of knowledge, relevant evidence, must become wisdom, must become persuasive evidence.

In a typical significant lawsuit the metrics of this process are as follows: from trillions, to thousands, to a handful. (You can change the numbers if you want to fit the dispute, but what counts here are the relative proportions.)

In a typical lawsuit today we begin with an information storage system that contains trillions of computer files. A competent e-discovery team is able to reduce this down to tens of thousands of files, maybe less, that are relevant. The actual count depends on many things, including issue complexity, cooperation and Rule 26(b)(1) factors. The step from trillions of files, to tens of thousands of relevant files, is the step from information to knowledge. Many think this is what e-discovery is all about: find the relevant evidence, convert Information to Knowledge. But it is not. It is just the first step: from 1 to 2. The next step, 2 to 3, the Wisdom step, is more difficult and far more important.

The tens of thousands of relevant documents, the knowledge of the case, are still too vast to be useful. After all, the human brain can, at best, only keep seven items in mind at a time. Miller, The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information, Psychological Review 63 (2): 81–97 (1956). Tens of thousands of documents, or even thousands of documents, are not helpful to jurors. It may all be relevant, but it is not all important. All trial lawyers will tell you that trials are won or lost by only five to nine documents. The rest is just noise, or soon forgotten foundation. Losey, Secrets of Search – Part III (5th secret).

The final step of information processing in e-discovery is only complete when the tens of thousands of files are winnowed down to five to nine documents, or fewer. That is the final step of Information’s journey, the elevation from Knowledge to Wisdom.

Our challenge as e-discovery team members is to take raw information and turn it into wisdom – the five to nine documents with powerful meaning that will produce the favorable legal rulings that we seek. Testimony helps too of course, but without documents, it is difficult to test memory accuracy, much less veracity. This evidence journey mirrors the challenge of our whole culture, to avoid drowning in too-much-information, to rise above, to find Knowledge and, with luck, a few pearls of Wisdom.

Conclusion

From trillions to a handful, from mere information to practical wisdom — that is the challenge of our culture today. On a recursive self-similar level, that is also the challenge of justice in the Information Age, the challenge of e-discovery. How to meet the challenges? How to self-organize from out of the chaos of too much information? The answer is iterative, cooperative, interactive, interdisciplinary team processes that employ advanced hybrid, multimodal technologies and sound human judgment. See What Chaos Theory Tell Us About e-Discovery and the Projected ‘Information → Knowledge → Wisdom’ Transition.

The micro-answer for cyber-investigators searching for evidence is fast becoming clear. It depends on a balanced hybrid application of human and artificial intelligence. What was once a novel invention, TAR, or technology assisted review, is rapidly becoming an obvious solution accepted in courts around the world. Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125 (S.D.N.Y. 2015); Pyrrho Investments v MWB Property [2016] EWHC 256 (Ch) (2/26/16). That is how information works. What was novel one day, even absurd, can very quickly become commonplace. We are creating, transmitting and processing information faster than ever before. The bits are flying at a rate that even Claude Shannon would never have dreamed possible.

The pace of change quickens as information and communication grow. New information flows and inventions propagate. The encouragement of negentropic innovation – ordered bits – is the basis of our property laws and commerce. The right information at the right time has great value.

Just ask a trial lawyer armed with five powerful documents — five smoking guns. These essential core documents are what make or break a case. The rest is just so much background noise, relevant but unimportant. The smoking hot Wisdom is what counts, not Information, not even Knowledge, although they are, of course, necessary prerequisites. There is a significant difference between inspiration and wisdom. Real wisdom does not just appear out of thin air. It arises out of True Information and Knowledge.

The challenge of Culture, including Law and Justice in our Information Age, is to never lose sight of this fundamental truth, this fundamental pattern: Information → Knowledge → Wisdom. If we do, we will get lost in the details. We will drown in a flood of meaningless information. Either that, or we will progress, but not far enough. We will become lost in knowledge and suffer paralysis by analysis. We will know too much, know everything, except what to do. Yes or No. Binary action. The tree may fall, but we never hear it, so neither does the judge or jury. The power of the truth is denied.

There is deep knowledge to be gained from both Chaos and Information Theories that can be applied to the challenges. Some of the insights can be applied in legal search and other cyber investigations. Others can be applied in other areas. As shown in this essay, details are important, but never lose sight of the fundamental pattern. You are looking for the few key facts. Like the Mandelbrot Set they remain the same, or at least similar, over different scales of magnitude, from the small county court case, to the largest complex multinational actions. Each case is different, yet the same. The procedures tie them all together.

Meaning is the whole point of Information. Justice is the whole point of the Law.

You find the truth of a legal controversy by finding the hidden order that ties together all of the bits of evidence. You find the hidden meaning behind all of the apparently contradictory clues, a fractal link of the near infinite strings of bits and bytes.

What really happened? What is the just response, the equitable remedy? That is the ultimate meaning of e-discovery, to find the few significant, relevant facts in large chaotic systems, the facts that make or break your case, so that judges and juries can make the right call. Perhaps this is the ultimate meaning of many of life’s challenges? I do not have the wisdom yet to know, but, as Cat Stevens says, I’m on the road to find out.


What Chaos Theory Tell Us About e-Discovery and the Projected ‘Information → Knowledge → Wisdom’ Transition

May 20, 2016
Gleick & Losey meeting sometime in the future

This article assumes a general, non-technical familiarity with the scientific theory of Chaos. See James Gleick’s book, Chaos: making a new science (1987). This field of study is not usually discussed in the context of “The Law,” although there is a small body of literature outside of e-discovery. See: Chen, Jim, Complexity Theory in Legal Scholarship (Jurisdynamics 2006).

The article begins with a brief, personal recapitulation of the basic scientific theories of Chaos. I buttress my own synopsis with several good instructional videos. My explanation of the Mandelbrot Set and Complex numbers is a little long, I know, but you can skip over that and still understand all of the legal aspects. In this article I also explore the application of the Chaos theories to two areas of my current work:

  1. The search for needles of relevant evidence in large, chaotic, electronic storage systems, such as email servers and email archives, in order to find the truth, the whole truth, and nothing but the truth needed to resolve competing claims of what happened – the facts – in the context of civil and criminal law suits and investigations.
  2. The articulation of a coherent social theory that makes sense of modern technological life, a theory that I summarize with the words/symbols: Information → Knowledge → Wisdom. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers and the more recent, How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom.”

Introduction to the Science of Chaos

Gleick’s book on Chaos provides a good introduction to the science of chaos and, even though written in 1987, is still a must read. For those who have read this long ago, like me, here is a good, short, 3:53, refresher video James Gleick on Chaos: Making a New Science (Open Road Media, 2011) below:

A key leader in the Chaos Theory field is the late great French mathematician, Benoit Mandelbrot (1924-2010) (shown right). Benoit, a math genius who never learned the alphabet, spent most of his adult life employed by IBM. He discovered and named the natural phenomenon of fractals. He discovered that there is a hidden order to any complex, seemingly chaotic system, including economics and the price of cotton. He also learned that this order was not causal and could not be predicted. He arrived at these insights by study of geometry, specifically the rough geometric shapes found everywhere in nature and mathematics, which he called fractals. The most famous fractal he discovered now bears his name, The Mandelbrot Fractal, shown in the computer photo below, and explained further in the video that follows.

Mandelbrot set

Look here for thousands of additional videos of fractals with zoom magnifications. You will see the recursive nature of self-similarity over varying scales of magnitude. The patterns repeat with slight variations. The complex patterns at the rough edges continue infinitely without repetition, much like Pi. They show the unpredictable element and the importance of initial conditions played out over time. The scale of the in-between dimensions can be measured. Metadata remains important in all investigations, legal or otherwise.

The Mandelbrot set is based on a simple mathematical formula involving feedback and Complex Numbers: z ⇔ z² + c. The ‘c’ in the formula stands for any Complex Number. Unlike the real numbers, such as the natural numbers one through nine (1, 2, 3, 4, 5, 6, 7, 8, 9), the Complex Numbers do not all lie on a horizontal number line. They live on an x-y coordinate plane where regular numbers on the horizontal axis combine with so-called Imaginary Numbers on the vertical axis. A complex number is written as c = a + bi, where a and b are real numbers and i is the imaginary unit.

A complex number can be visually represented as a pair of numbers (a, b) forming a vector on a diagram called an Argand diagram, representing the complex plane. “Re” is the real axis, “Im” is the imaginary axis, and i is the imaginary unit. And that is all there is to it. Mandelbrot calls the formula embarrassingly simple. That is the Occam’s razor beauty of it.

To understand the full dynamics of all of this, remember what Imaginary Numbers are. They are a special class of numbers whose squares are negative, not positive as is the rule with all real numbers. In other words, with imaginary numbers 2i times 2i = -4, and -2i times -2i = -4, not +4. The imaginary unit i is formally defined by i² = −1.
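The arithmetic of the imaginary unit is easy to check for yourself. The following is a small sketch using Python’s built-in complex type, where `1j` is the language’s own notation for i. The variable names are mine.

```python
# The imaginary unit: Python writes i as 1j.
i = 1j

# Two purely imaginary numbers, 2i and -2i.
pos = complex(0, 2)
neg = complex(0, -2)

print("i squared:      ", i * i)
print("2i times 2i:    ", pos * pos)
print("-2i times -2i:  ", neg * neg)

# For comparison, squaring the ordinary real numbers 2 and -2.
print("2 times 2:      ", 2 * 2)
print("-2 times -2:    ", (-2) * (-2))
```

Squaring either imaginary number lands on negative four, while squaring either real number lands on positive four. That sign flip is what lets the squaring step in the Mandelbrot formula rotate points around the plane instead of only stretching them along a line.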

Thus, the formula z ⇔ z² + c can be restated as z ⇔ z² + (a + bi).

The Complex Numbers when iterated according to this simple formula – subject to constant feedback – produce the Mandelbrot set.

The value for z in the iteration always starts with zero. The ⇔ symbol stands for iteration, meaning the formula is repeated in a feedback loop. The end result of the last calculation becomes the starting value of the next: z² + c becomes the z in the next repetition. Each run begins with z at zero and a different value for c. When you repeat the simple multiplication and addition formula millions of times, and plot the results on a Cartesian grid, the Mandelbrot shape is revealed.
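The feedback loop can be written out in a few lines. This is a minimal sketch, with the function name `orbit` and the two sample values of c chosen by me for illustration. It shows how each result is fed back in as the next z.

```python
def orbit(c: complex, steps: int) -> list[complex]:
    """Return the first `steps` values of z under z -> z*z + c, starting from z = 0."""
    z = 0j
    values = []
    for _ in range(steps):
        z = z * z + c  # the output of this step is the input of the next
        values.append(z)
    return values


# Two sample constants: one whose orbit stays small, one whose orbit runs away.
for c in (complex(-1, 0), complex(1, -1)):
    print(c, "->", orbit(c, 4))
```

With c = -1 the values simply alternate between -1 and 0 forever. With c = 1 - i they grow quickly with each repetition.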

When iteration of a squaring process is applied to non-complex numbers the results are always known and predictable. For instance, when any non-complex number greater than one is repeatedly squared, it quickly approaches infinity: 1.1 × 1.1 = 1.21; 1.21 × 1.21 = 1.4641; 1.4641 × 1.4641 = 2.14358; and after ten iterations the number created is 2.43… × 10⁴², which written out is 2,430,000,000,000,000,000,000,000,000,000,000,000,000,000. A number so large as to dwarf even the national debt. Mathematicians say of this size number that it is approaching infinity.

The same is true for any non-complex number which is less than one, but in reverse; it quickly goes to the infinitely small, the zero. For example with .9: .9 × .9 = .81; .81 × .81 = .6561; .6561 × .6561 = .43046; and after only ten iterations it becomes 1.39… × 10⁻⁴⁷, which written out is .0000000000000000000000000000000000000000000000139…, a very small number indeed.

With non-complex numbers, such as real, rational or natural numbers, the squaring iteration must always go to the infinitely large or the infinitely small unless the starting number is one. No matter how many times you square one, it will still equal one. But just the slightest bit more or less than one and the iteration of squaring will attract it to the infinitely large or small. The same behavior holds true for complex numbers: numbers just outside of the circle |z| = 1 on the complex plane will jump off into the infinitely large, and complex numbers just inside |z| = 1 will quickly square into zero.
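The runaway behavior of plain squaring is easy to reproduce. Here is a short sketch, with a helper name of my own choosing, that squares a starting number ten times over.

```python
def square_repeatedly(x: float, times: int) -> float:
    """Square x, then square the result, and so on, `times` times in all."""
    for _ in range(times):
        x = x * x
    return x


grown = square_repeatedly(1.1, 10)   # slightly more than one
shrunk = square_repeatedly(0.9, 10)  # slightly less than one
fixed = square_repeatedly(1.0, 10)   # exactly one

print(f"1.1 after ten squarings: {grown:.3e}")
print(f"0.9 after ten squarings: {shrunk:.3e}")
print(f"1.0 after ten squarings: {fixed}")
```

Ten squarings is the same as raising the starting number to the power 1,024, which is why a difference of one tenth turns into dozens of orders of magnitude.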

The magic comes by adding the constant c (a complex number) to the squaring process and starting from z at zero: z ⇔ z² + c. Then stable iterations – a set attracted to neither the infinitely small nor the infinitely large – become possible. The potentially stable Complex numbers lie both outside and inside of the circle |z| = 1; specifically on the complex plane they lie between -2.4 and .8 on the real number line, the horizontal x grid, and between -1.2 and +1.2 on the imaginary line, the vertical y grid. These numbers are contained within the black of the Mandelbrot fractal.

In the Mandelbrot formula z ⇔ z² + c, where you always start the iterative process with z equals zero, and c equaling any complex number, an endless series of seemingly random or chaotic numbers are produced. Like the weather, the stock market and other chaotic systems, negligible changes in quantities, coupled with feedback, can produce unexpected chaotic effects. The behavior of the complex numbers thus mirrors the behavior of the real world where Chaos is obvious or lurks behind the most ordered of systems.

With some values of ‘c’ the iterative process immediately begins to exponentially increase or fall into infinity. These numbers are completely outside of the Mandelbrot set. With other values of ‘c’ the iterative process is stable for a number of repetitions, and only later in the dynamic process are they attracted to infinity. These are the unstable strange attractor numbers just on the outside edge of the Mandelbrot set. They are shown on computer graphics with colors or shades of grey according to the number of stable iterations. The values of ‘c’ which remain stable, repeating as a finite number forever, never attracted to infinity, and thus within the Mandelbrot set, are plotted as black.

Some iterations of complex numbers like 1 - 1i run off into infinity from the start, just like all of the real numbers. Other complex numbers are always stable, like -1 + 0i. Other complex numbers stay stable for many iterations, and then only further into the process do they unpredictably begin to increase or decrease exponentially (for example, .37 + .4i stays stable for 12 iterations). These are the numbers on the edge of inclusion of the stable numbers shown in black.
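The three kinds of behavior can be told apart by counting how many repetitions it takes before z leaves a circle of radius 2, the usual cut-off for deciding that a value is headed to infinity. The sketch below is illustrative only. The function name and the cap of 1,000 repetitions are my own choices, and the edge example is taken here as 0.37 + 0.4i, which is an assumption about the intended value.

```python
def escape_count(c: complex, max_iterations: int = 1000):
    """Return the repetition at which |z| first exceeds 2 under z -> z*z + c.

    Returns None if z is still inside the circle of radius 2 after
    max_iterations repetitions, which is treated here as "stable".
    """
    z = 0j
    for n in range(1, max_iterations + 1):
        z = z * z + c
        if abs(z) > 2:
            return n
    return None


samples = {
    "runs off at once": complex(1, -1),
    "always stable": complex(-1, 0),
    "edge case (assumed value)": complex(0.37, 0.4),
}
for label, c in samples.items():
    print(f"{label}: c = {c}, escape count = {escape_count(c)}")
```

A cap on the number of repetitions is unavoidable in practice. A value that has not escaped by the cap is only presumed stable, which is part of why the edge of the set cannot be predicted in advance.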

Chaos enters into the iteration because, out of the potentially infinite number of complex numbers in the window of -2.4 to .8 along the horizontal real number axis, and -1.2 to 1.2 along the vertical imaginary number axis, there is an infinite subset of such numbers on the edge, and they cannot be predicted in advance. All that we know about these edge numbers is that if the z produced by any iteration lies outside of a circle with a radius of 2 on the complex plane, then the subsequent z values will go to infinity, and there is no need to continue the iteration process.

By using a computer you can escape the normal limitations of human time. You can try a very large number of different complex numbers and iterate them to see what kind they may be, finite or infinite. Under the Mandelbrot formula you start with z equals zero and then try different values for c. When a particular value of c is attracted to infinity – produces a z whose distance from zero is greater than 2 – then you stop that iteration, go back to z equals zero again, and try another c, and so on, over and over again, millions and millions of times as only a computer can do.
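The trial and error procedure just described can be sketched as a short program. This is a rough illustration, not a polished renderer: the grid size, the cap of 100 repetitions, and the function names are my own assumptions, and the picture is drawn with text characters instead of colors. The window matches the one given above, -2.4 to .8 across and -1.2 to 1.2 up and down.

```python
def in_mandelbrot(c: complex, max_iterations: int = 100) -> bool:
    """Treat c as stable if z stays within radius 2 for max_iterations repetitions."""
    z = 0j
    for _ in range(max_iterations):
        z = z * z + c
        if abs(z) > 2:
            return False  # attracted to infinity: stop and move on to the next c
    return True


def render(width: int = 64, height: int = 24) -> list[str]:
    """Try one value of c per grid cell and mark the stable ones with '#'."""
    rows = []
    for row in range(height):
        imag = 1.2 - 2.4 * row / (height - 1)       # top of window down to bottom
        line = ""
        for col in range(width):
            real = -2.4 + 3.2 * col / (width - 1)   # left of window across to right
            line += "#" if in_mandelbrot(complex(real, imag)) else "."
        rows.append(line)
    return rows


picture = render()
print("\n".join(picture))
```

Even at this coarse scale the familiar shape shows up in the printed characters. Raising the grid size and the repetition cap sharpens the edge, at the cost of more computer time.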

Mandelbrot was the first to discover that by using zero as the base z for each iteration, and trying a large number of the possible complex numbers with a computer on a trial and error basis, he could define the set of stable complex numbers graphically by plotting their location on the complex plane. This is exactly what the Mandelbrot figure is. Along with this discovery came the surprise realization of the beauty and fractal recursive nature of these numbers when displayed graphically.

The following Numberphile video by Holly Krieger, an NSF postdoctoral fellow and instructor at MIT, gives a fairly accessible, almost cutesy, yet still technically correct explanation of the Mandelbrot set.

Fractals and the Mandelbrot set are key parts of the Chaos theories, but there is much more to it than that. Chaos Theory impacts our basic Newtonian, cause-effect, linear world view of reality as a machine. For a refresher on the big picture of the Chaos insights and how the old linear, Newtonian, machine view of reality is wrong, look at this short summary: Chaos Theory (4:48)

Another Chaos Theory instructional video, applying the insights to psychology, is worth your view: The Science and Psychology of the Chaos Theory (8:59, 2008). It suggests the importance of spontaneous actions in the moment, the so-called flow state.

Also see High Anxieties – The Mathematics of Chaos (59:00, BBC 2008) concerning Chaos Theories, Economics and the Environment, and Order and Chaos (50:36, New Atlantis, 2015).

Application of Chaos Theories to e-Discovery

The use of feedback, iteration and algorithmic processes is central to work in electronic discovery. For instance, my search methods to find relevant evidence in chaotic systems follow iterative processes, including continuous, interactive, machine learning methods. I use these methods to find hidden patterns in the otherwise chaotic data. An overview of the methods I use in legal search is summarized in the following chart. As you can see, steps four, five and six iterate. These are the steps where human-computer interactions take place.
predictive_coding_3.0

My methods place heavy reliance on these steps and on human-computer interaction, which I call a Hybrid process. Like Maura Grossman and Gordon Cormack, I rely heavily on high-ranking documents in this Hybrid process. The primary difference in our methods is that I do not begin to place a heavy reliance on high-ranking documents until after completing several rounds of other training methods. I call this four-cylinder multimodal training. This is all part of the sixth step in the 8-step workflow chart above. The four cylinders of the search engine are: (1) high ranking, (2) midlevel ranking or uncertain, (3) random, and (4) multimodal (including all types of search, such as keyword) directed by humans.
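For readers who think in code, the four cylinders can be pictured as four different ways of picking the next training candidates from a ranked collection. The sketch below is a toy illustration only, not the logic of any actual review software: the function name, the 0.5 midpoint used for "uncertain," the sample data, and the use of a single keyword to stand in for human-directed multimodal search are all assumptions made for the example.

```python
import random


def four_cylinder_selection(scores, texts, keyword, k=2, seed=0):
    """Pick training candidates four ways (hypothetical helper).
    scores: doc id -> predicted probability of relevance (0 to 1).
    texts:  doc id -> document text."""
    ids = list(scores)
    return {
        # (1) high ranking: the documents the machine is most sure are relevant
        "high": sorted(ids, key=lambda d: scores[d], reverse=True)[:k],
        # (2) midlevel or uncertain: the documents closest to the 0.5 midpoint
        "uncertain": sorted(ids, key=lambda d: abs(scores[d] - 0.5))[:k],
        # (3) random: a plain random sample
        "random": random.Random(seed).sample(ids, k),
        # (4) multimodal: here just one keyword search chosen by a human
        "multimodal": [d for d in ids if keyword in texts[d].lower()],
    }


scores = {"a": 0.95, "b": 0.80, "c": 0.52, "d": 0.45, "e": 0.10}
texts = {
    "a": "Revised invoice attached",
    "b": "Pricing call notes",
    "c": "Lunch on Friday?",
    "d": "Invoice dispute summary",
    "e": "Golf outing",
}
picks = four_cylinder_selection(scores, texts, keyword="invoice")
print(picks["high"], picks["uncertain"], picks["multimodal"])
# ['a', 'b'] ['c', 'd'] ['a', 'd']
```

In practice the fourth cylinder is not one keyword but whatever combination of searches the attorney's judgment suggests, which is exactly the part that cannot be reduced to a function.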

Analogous Application of Similar Mandelbrot Formula For Purposes of Expressing the Importance of the Creative Human Component in Hybrid 

4-5-6-only_predictive_coding_3.0

Recall Mandelbrot’s formula: z ⇔ z² + c, which is the same as z ⇔ z² + (a + bi). I have something like that going on in my steps four, five and six. If you plugged the numbers of the steps into the Mandelbrot formula it would read something like this: 4 ⇔ 4² + (5+6i). The fourth step is the key AI Predictive Ranking step, where the algorithm ranks the probable relevance of all documents. The fourth step of computer ranking is the whole point of the formula, so I will call AI Ranking ‘z’ here; it represents the left side of the formula. The fifth step is where humans read documents to determine relevance; let’s call that ‘r’. The sixth step is where humans train the computer; call that ‘t’. This is the Hybrid Active Training step where the four-cylinder multimodal training methods are used to select documents to train the whole set. The documents in steps five and six, r and t, are added together for relevance feedback, (r + ti).

Thus, z ⇔ z² + c, which is the same as z ⇔ z² + (a + bi), becomes under my system z ⇔ z + (r + ti). (Note: I took out the squaring, z², because there is no such exponential function in legal search; it’s all addition.) What, you might ask, is the i in my version of the formula? This is the critical part in my formula, just as it is in Mandelbrot’s. The imaginary number – i – in my formula version represents the creativity of the human conducting the training.

The Hybrid Active Training step is not fully automated in my system. I do not simply use the highest ranking documents to train, especially in the early rounds of training, as do some others. I use a variety of methods in my discretion, especially the multimodal search methods such as keywords, concept search, and the like. In text retrieval science this use of human discretion, human creativity and judgment, is called an ad hoc search. It contrasts with fully automated search, where the text retrieval experts try to eliminate the human element. See Mr EDR for more detail on 2016 TREC Total Recall Track that had both ad hoc and fully automated sections.

My work with legal search engines, especially predictive coding, has shown that new technologies do not work with the old methods and processes, such as linear review or keyword search alone. New processes are required that employ new ways of thinking. The new methods link creative human judgments (i) with the computer’s amazing abilities at text reading speed, consistency, analysis, learning and ranking (z).

A rather Fat Cat. My latest processes, Predictive Coding 3.0, are variations of Continuous Active Training (CAT) where steps four, five and six iterate until the project is concluded. Grossman & Cormack call this Continuous Active Learning or CAL, and they claim Trademark rights to CAL. I respect their right to do so (no doubt they grow weary of vendor rip-offs) and will try to avoid the acronym henceforth. My use of the acronym CAT essentially takes the view of the other side, the human side that trains, not the machine side that learns. In both Continuous Active Learning and CAT the machine keeps learning with every document that a human codes. Continuous Active Learning, or Training, makes the linear seed-set method obsolete, along with the control set and random training documents. See Losey, Predictive Coding 3.0.

In my typical implementation of Continuous Active Training I do not automatically include every document coded as a training document. This is the sixth training step (‘t’ in the prior formula). Instead of automatically using every document to train that has been coded relevant or irrelevant, I select particular documents that I decide to use to train. This, in addition to multimodal search in step six, Hybrid Active, is another way in which the equivalent of Imaginary Numbers comes into my formula, the uniquely human element (ti). I typically use almost every relevant document coded in step five, the ‘r’ in the formula, as a training document, but not all: z ⇔ z + (r + ti)

I exercise my human judgment and experience to withhold certain training documents. (Note: I never withhold hot trainers, that is, highly relevant documents.) I do this if my experience (I am tempted to say ‘my imagination’) suggests that including them as training documents will likely slow down or confuse the algorithm, even if temporarily. I have found that this improves efficiency and effectiveness. It is one of the techniques I used to win document review contests.
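The loop of steps four, five and six, z ⇔ z + (r + ti), can be sketched as a toy Python program. To be clear about the assumptions: the word-count "model" below is a stand-in I made up for illustration and is far simpler than any real predictive coding engine; the names rank, continuous_active_training, code_document and use_as_trainer are hypothetical; and the loop runs until every document is coded only because the example is tiny, whereas a real project stops when the humans in charge decide it is done.

```python
from collections import Counter


def rank(unreviewed, weights):
    """Step four (z): order unreviewed documents by predicted relevance."""
    def score(doc_id):
        return sum(weights[word] for word in unreviewed[doc_id].split())
    return sorted(unreviewed, key=score, reverse=True)


def continuous_active_training(docs, code_document, use_as_trainer, batch_size=1):
    """Iterate steps four, five and six over a toy collection."""
    weights = Counter()  # toy model: one additive weight per word
    coded = {}           # doc id -> True (relevant) or False (irrelevant)
    review_order = []
    while len(coded) < len(docs):
        unreviewed = {d: text for d, text in docs.items() if d not in coded}
        for doc_id in rank(unreviewed, weights)[:batch_size]:
            coded[doc_id] = code_document(doc_id)  # step five (r): human review
            review_order.append(doc_id)
            # step six (t): the human decides whether this document trains (the i)
            if use_as_trainer(doc_id, coded[doc_id]):
                for word in docs[doc_id].split():
                    # all addition, no squaring: the new z is the old z plus feedback
                    weights[word] += 1 if coded[doc_id] else -1
    return review_order, weights


docs = {
    "d1": "merger price fix",
    "d2": "lunch menu",
    "d3": "price fix call",
    "d4": "golf lunch",
    "d5": "fix price memo",
}
relevant = {"d1", "d3", "d5"}
order, _ = continuous_active_training(
    docs,
    code_document=lambda d: d in relevant,
    use_as_trainer=lambda d, is_relevant: True,  # here every coded document trains
)
print(order)
# ['d1', 'd3', 'd5', 'd2', 'd4']
```

After the first relevant document is coded, the ranking pulls the other two relevant documents to the front of the line. Swapping in a use_as_trainer function that returns False for selected documents is the toy equivalent of withholding a trainer.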

This kind of intimate machine communication is possible because I carefully observe the impact of each set of training documents on the classifying algorithm, and carry over lessons – iterate – from one project to the next. I call this keeping a human in the loop and the attorney in charge of relevance scope adjudications. See Losey, Why the ‘Google Car’ Has No Place in Legal Search. We humans provide experienced observation, new feedback, different approaches, empathy, play and emotion. We also add a whole lot of other things too. The AI-Robot is the Knowledge fountain. We are the Wisdom fountain. That is why we should strive to progress into and through the Knowledge stage as soon as possible. We will thrive in the end-goal Wisdom state.

Application of Chaos Theory to Information→Knowledge→Wisdom

The first Information stage of the post-computer society in which we live is obviously chaotic. It is like the disconnected numbers that lie completely outside of the Mandelbrot set. It is pure information with only haphazard meaning. It is often just misinformation. Just exponential. There is an overwhelming deluge of such raw information, raw data, that spirals off into an infinity of dead-ends. It leads nowhere and is disconnected. The information is useless. You may be informed, but to no end. That is modern life in the post-PC era.

The next stage of society we seek, a Knowledge-based culture, is geometrically similar to the large black blobs that unite most of the figure. This is the finite set of numbers that provide all connectivity in the Mandelbrot set. Analogously, this will be a time when many loose ends will be discarded, false theories abandoned, and consensus will arise.

In the next stage we will not only be informed, we will be knowledgeable. The information will all be processed. The future Knowledge Society will be static, responsible, serious and well fed. People will be brought together by common knowledge. There will be large scale agreements on most subjects. A tremendous amount of diversity will likely be lost.

After a while a knowledgeable world will become boring. Ask any professor or academic. The danger of the next stage will be stagnation, complacency, self-satisfaction. The smug complacency of a know-it-all world. This may be just as dangerous as the pure-chaos Information world in which we now live.

If society is to continue to evolve after that, we will need to move beyond mere Knowledge. We will need to challenge ourselves to attain new, creative applications of Knowledge. We will need to move beyond Knowledge into Wisdom.

I am inclined to think that if we ever do progress to a Wisdom-based society, it will be a place and time much like the unpredictable fractal edges of the Mandelbrot. Stable to a point, but ultimately unpredictable, constantly changing, evolving. The basic patterns of our truth will remain the same, but they will constantly evolve and be refined. The deeper we dig, the more complex and beautiful it will be. The dry sameness of a Knowledge-based world will be replaced by an ever-changing flow, by more and more diversity and individuality. Our social cohesivity will arise from recursivity and similarity, not sameness and conformity. A Wisdom-based society will be filled with fractal beauty. It will live ever zigzagging between the edge of the known and unknown. It will also necessarily have to be a time when people learn to get along together and share in prosperity and health, both physical and mental. It will be a time when people are accustomed to ambiguities and comfortable with them.

In Wisdom World knowledge itself will be plentiful, but will be held very lightly. It will be subject to constant reevaluation. Living in Wisdom will be like living on the rough edge of the Mandelbrot. It will be a culture that knows infinity firsthand. An open, peaceful, ecumenical culture that knows everything and nothing at the same time. A culture where most of the people, or at least a strong minority, have attained a certain level of personal Wisdom.

Conclusion

Back to our times, where we are just now discovering what machine learning can do, we are just beginning to pattern our investigations, our search for truth, in the Law and elsewhere, on new information gleaned from the Chaos theories. Active machine learning, Predictive Coding, is a natural outgrowth of Chaos Theory and the Mandelbrot Set. The insights of hidden fractal order that can only be seen by repetitive computer processes are prevalent in computer based culture. These iterative, computer assisted processes have been the driving force behind thousands of fact investigations that I have conducted since 1980.

I have been using computers to help me in legal investigations since 1980. The reliance on computers at first increased slowly, but steadily. Then from about 2006 to 2013 the increase accelerated and peaked in late 2013. The shift is beginning to level off. We are still heavily dependent on computers, but now we understand that human methods are just as important as software. Software is limited in its capacities without human additive, especially in legal search. Hybrid, Man and Machine, that is the solution. But remember that the focus should be on us, human lawyers and search experts. The AIs we are creating and training should be used to Augment and Enhance our abilities, not replace them. They should complement and complete us.

The converse realization of Chaos Theory, that disorder underlies all apparent order, that if you look closely enough, you will find it, also informs our truth-seeking investigatory work. There are no smooth edges. It is all rough. If you look closely enough, the border of any coastline is infinite.

The same is true of the complexity of any investigation. As every experienced lawyer knows, there is no black and white, no straight line. It always depends on so many things. Complexity and ambiguity are everywhere. There is always a mess, always rough edges. That is what makes the pursuit of truth so interesting. Just when you think you have it, the turbulent echo of another butterfly’s wings knocks you about.

The various zigs and zags of e-discovery, and other investigative, truth-seeking activities, are what make them fascinating. Each case is different, unique, yet the same patterns are seen again and again with recursive similarity. Often you begin a search only to have it quickly burn out. No problem, try again. Go back to square one, back to zero, and try another complex number, another clue. Pursue a new idea, a new connection. You chase down all reasonable leads, understanding that many of them will lead nowhere. Even failed searches rule out negatives and so help in the investigation. Lawyers often try to prove a negative.

The fractal story that emerges from Hybrid Multimodal search is often unexpected. As the search matures you see a bigger story, a previously hidden truth. A continuity emerges that connects previously unrelated facts. You literally connect the dots. The unknown complex numbers – (a + bi) – the ones that do not spiral off into the infinite large or small, do in fact touch each other when you look closely enough at the spaces.

z ⇔ z² + (a + bi)

I am no Sherlock, but I know how to find ESI using computer processes. It requires an iterative sorting process, a hybrid multimodal process, using the latest computers and software. This process allows you to harness the infinite patience, analytics and speed of a machine to enhance your own intelligence ... to augment your own abilities. You let the computer do the boring bits, the drudgery, while you do the creative parts.

The strength comes from the hybrid synergy. It comes from exploring the rough edges of what you think you know about the evidence. It does not come from linear review, nor simple keyword cause-effect. Evidence is always complex, always derived from chaotic systems. A full multimodal selection of search tools is needed to find this kind of dark data.

The truth is out there, but sometimes you have to look very carefully to find it. You have to dig deep and keep on looking to find the missing pieces, to move from Information → Knowledge → Wisdom.

Mandelbrot_zoom (blue zoom Mandelbrot fractal animation of looking deeper into the details)


e-Discovery Team’s Best Practices Education Program

May 8, 2016

EDBP_BANNER

EDBP
Mr. EDR
Predictive Coding 3.0
59 TAR Articles
Doc Review Videos

_______

TEAM_TRAINING_screen_shot

e-Discovery Team Training

Information → Knowledge → Wisdom

Education is the clearest path from Information to Knowledge in all fields of contemporary culture, including electronic discovery. The above links take you to the key components of the best-practices teaching program I have been working on since 2006. It is my hope that these education programs will help move the Law out of the dangerous information flood, where it is now drowning, to a safer refuge of knowledge. See Information → Knowledge → Wisdom: Progression of Society in the Age of Computers; and How The 12 Predictions Are Doing That We Made In “Information → Knowledge → Wisdom.” For more of my thoughts on e-discovery education, see the e-Discovery Team School Page.

The best practices and general educational curriculum that I have developed over the years focuses on the legal services provided by attorneys. The non-legal, engineering and project management practices of e-discovery vendors are only collaterally mentioned. They are important too, but students have the EDRM and other commercial organizations and certifications for that. Vendors are part of any e-Discovery Team, but the programs I have developed are intended for law firms and corporate law departments.

The e-Discovery Team program, both general educational and legal best-practices, is online and available 24/7. It uses lots of imagination, creative mixes, symbols, photos, hyperlinks, interactive comments, polls, tweets, posts, news, charts, drawings, videos, video lectures, slide lectures, video skits, video slide shows, music, animations, cartoons, humor, stories, cultural themes and analogies, inside baseball references, rants, opinions, bad jokes, questions, homework assignments, word-clouds, links for further research, a touch of math, and every lawyer’s favorite tools: words (lots of them), logic, arguments, case law and precedent.

All of this to try to take the e-Discovery Team approach from just information to knowledge. In spite of these efforts, most of the legal community still does not know e-discovery very well. What they do know is often misinformation. Scenes like the following in a law firm lit-support department are all too common.

The e-Discovery Team’s education program has an emphasis on document review. That is because the fees for lawyers reviewing documents are by far the most expensive part of e-discovery, even when contract lawyers are used. The lawyer review fees, and review supervision fees, including SME fees, have always been much more costly than all vendor costs and expenses put together. Still, the latest AI technologies, especially active machine learning using our Predictive Coding 3.0 methods, are now making it possible to significantly reduce review fees. We believe this is a critical application of best practices. The three steps we identify for this area in the EDBP chart are shown in green, to signify money. The reference to C.A. Review is to Computer Assisted Review or CAR, using our Hybrid Multimodal methods.

EDBP_detail_LARGE

____

Predictive Coding 3.0 Hybrid Multimodal Document Search and Review

Our new version 3.0 techniques for predictive coding make it far easier than ever before to include AI in a document review project. The secret control set has been eliminated, and so too have the seed set and the practice of SMEs wasting their time reviewing random samples of mostly irrelevant junk. It is a much simpler technique now, although we still call it Hybrid Multimodal.

Hybrid is a reference to the Man/Machine interactive nature of our methods. A skilled attorney uses a type of continuous active learning to train an AI to help them to find the documents they are looking for. This Hybrid method greatly augments the speed and accuracy of the human attorneys in charge. This leads to cost savings and improved recall. A lawyer with an AI helper at their side is far more effective than lawyers working on their own. This means that every e-discovery team today could use a robot like Kroll Ontrack’s Mr. EDR to help them to do document review.

Multimodal is a reference to the use of a variety of search methods to find target documents, including, but not limited to, predictive coding type ranked searches. We encourage humans in the loop to run a variety of searches of their own invention, especially at the beginning of a project. This always makes for a quick start in finding relevant and hot documents. See Why the ‘Google Car’ Has No Place in Legal Search. The multimodal approach also makes for precise, efficient reviews with broad scope. The latest active machine learning software, when fully integrated with a full suite of other search tools, is attaining higher levels of recall than ever before. That is one reason Why I Love Predictive Coding.

I have found that Kroll Ontrack’s EDR software is ideally suited for these Hybrid, Multimodal techniques. Try using it on your next large project and see for yourself. The Kroll Ontrack consultant specialists in predictive coding, Jim and Tony, have been trained in this method (and many others). They are well qualified to assist you in every step of the way and their rates are reasonable. With you calling the shots on relevancy, they can do most of the search work for you and still save your client money. If the matter is big and important enough, then, if I have a time opening, and it clears my firm’s conflicts, I can also be brought in for a full turn-key operation. Whether you want to include extra time for training your best experts is your option, but our preference.

Team_TREC_2

__________

Embrace e-Discovery Team Education to Escape Information Overload

____


