Ralph Losey, March 29, 2026.
We are currently living through a “Gutenberg Moment,” but with a complex, digital twist: our new printing press is alive, probabilistic, and prone to “confident delusions.” While AI may be humanity’s most transformative invention, it remains an enigma to most.
For many legal professionals, the outputs of Generative AI feel like a digital seance—words appearing out of the ether with no visible logic. This “Black Box” is not just a technical curiosity; it is a professional liability. If you cannot at least partially understand and explain how your “assistant” reached a conclusion, you are effectively practicing in the dark. To move from being a passenger to a pilot, you must understand the mechanical soul of the machine and learn how to make it sing with the voices you command.

My recent article, What People Want To Know About AI: Top 10 Curiosity Index, revealed that the primary thing people want to know is how the machine actually works. They are asking the most difficult question in the field: How does AI “think” or make decisions?
This article answers that question by providing a structured understanding of Large Language Models (LLMs) across five levels of technical complexity:
- The Smart Child: The world’s best guessing game.
- The High School Graduate: Statistical probability at a global scale.
- The College Graduate: Mapping meaning in Latent Space.
- The Computer Scientist: The logic of the Transformer and Self-Attention.
- The Tech-Minded Legal Professional: Navigating probabilistic advocacy.

There is a meta-lesson here too that goes beyond the words on this page. Some of my favorite explanations of complex subjects emulate the fresh, clear speech of fifth graders. You will often find deep creativity when AI models parrot their language.
I chose five kinds of speech to describe how AI works. There are hundreds more that I could have picked. I also could have asked for explanations that use story or humor, much like Abraham Lincoln liked to do. It is fun to learn to tell AI what to do so that you can better communicate. It empowers a level of creativity never before possible. Maybe next time I will use comedy or poetry. For now, let’s peel back the curtain using these five.

1. The Smart Child Level: The World’s Best Guessing Game
Definition: Generative AI is like a magic “Fill-in-the-Blank” machine that has played the game trillions of times with almost every book ever written.
Imagine you are playing a game. If I say, “The peanut butter and…”, you immediately think of the word “jelly.” You don’t need to look at a jar of jelly to know that word fits. You’ve heard those words together so many times that your brain just knows they belong together.
An AI is a computer that has “listened” to almost everyone in the world talk and “read” almost every story ever told. It doesn’t “know” what a sandwich is, and it doesn’t have a stomach that feels hungry. It simply knows that in the history of human writing, the word “jelly” follows “peanut butter” more than almost any other word.
But it’s even smarter than that. If you say, “I am at the library and I am reading a…”, the AI knows that “book” is a much better guess than “sandwich”. It looks at all the words you give it—the “clues”—to narrow down the billions of possibilities into one likely answer. It makes decisions by picking the word that is most likely to come next to complete a pattern that makes sense to us. It isn’t “thinking” about the story; it’s just very, very good at predicting the next piece of the puzzle.
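The child's guessing game can be sketched in a few lines of Python using nothing but word counts. The four-line "corpus" below is invented for illustration; real models train on trillions of words, but the principle of picking the most frequent completion is the same:

```python
from collections import Counter

# A toy version of the fill-in-the-blank game. The four-line "corpus"
# is invented for illustration; real models see trillions of words.
corpus = [
    "peanut butter and jelly",
    "peanut butter and jelly",
    "peanut butter and jelly",
    "peanut butter and crackers",
]

# Count which word has followed "peanut butter and" most often.
next_words = Counter(line.split()[-1] for line in corpus)
best_guess = next_words.most_common(1)[0][0]
print(best_guess)  # prints "jelly" — the most frequent completion
```

The machine never tastes the sandwich; it just counts that "jelly" showed up three times and "crackers" only once.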

2. The High School Level: Statistical Probability at Global Scale
Definition: AI is a Prediction Engine. It uses “Big Data” to calculate the statistical likelihood of the next piece of information.
Most of us use the “Autofill” feature on our smartphones every day. As you type a text, the phone suggests the next likely word based on your past habits. If you often text “I’m on my way,” the phone learns that “way” usually follows “my.” Generative AI—specifically Large Language Models—is essentially Autofill scaled to include the vast majority of digitized human knowledge.
During its “training” phase, the model does not “memorize” facts like a traditional database. If you ask it for the date of the Magna Carta, it isn’t looking it up in a digital encyclopedia. Instead, it has learned through billions of examples that the words “Magna Carta” and “1215” have a very high statistical correlation.
This explains why AI can sometimes be “confidently wrong.” It isn’t “lying” in the human sense; it is simply following a statistical path that leads to a mistake. If the data it was trained on contains a common error, the AI will repeat that error because, in its mathematical world, that error is the “most likely” next word. It recognizes the “shape” of human thought without actually having a human mind.
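This statistical behavior can be illustrated with a toy bigram model — a minimal sketch assuming an invented handful of training sentences, one of which contains a deliberate error, to show how a repeated mistake in the data shifts the odds:

```python
from collections import Counter, defaultdict

# A miniature "prediction engine": estimate P(next word | previous word)
# from raw bigram counts, much as a phone's autofill does. The training
# sentences are invented; one contains a deliberate wrong date to show
# how errors in the data become part of the statistics.
sentences = [
    "magna carta 1215", "magna carta 1215", "magna carta 1215",
    "magna carta 1512",  # a repeated error would shift these odds
]

bigrams = defaultdict(Counter)
for s in sentences:
    words = s.split()
    for prev, nxt in zip(words, words[1:]):
        bigrams[prev][nxt] += 1

counts = bigrams["carta"]
total = sum(counts.values())
probs = {word: c / total for word, c in counts.items()}
print(probs)  # {'1215': 0.75, '1512': 0.25} — the majority pattern wins
```

If the erroneous date appeared more often than the true one in the training data, the same arithmetic would confidently produce the wrong answer.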

3. The College Graduate Level: Mapping the Latent Space
Definition: AI organizes information using Vector Embeddings, which convert words into numerical coordinates on a massive, multi-dimensional map called Latent Space.
To understand how AI moves beyond mere word-matching, we have to look at how it “maps” meaning. In a physical library, books are organized in one dimension (the order of call numbers along the spines) or two (the grid of the shelves). AI organizes information in a “map” that has thousands of dimensions.
- Vectoring (The Coordinate System): Every word or concept is assigned a “Coordinate”—a long string of numbers. For example, the word “Stealing” is mathematically plotted very close to “Larceny” but far away from “Charity”.
- Conceptual Proximity: Think of this as the “Relativity” of language. If you ask the AI about “theft,” it doesn’t look for that specific word. It navigates to those coordinates in Latent Space and finds all the “neighboring” concepts like “property,” “intent,” and “deprivation.”
- Vector Arithmetic: Researchers discovered that you can actually perform “logic” using these numbers. A famous example is: King – Man + Woman = Queen. The model “understands” the relationship between these concepts because the direction and distance from “Man” to “King” is nearly the same as from “Woman” to “Queen.”
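A hand-built example makes the arithmetic concrete. The two-dimensional coordinates below are invented toys (real embeddings use hundreds or thousands of dimensions), but they let you verify King – Man + Woman = Queen by hand:

```python
import math

# A hand-built 2-D "latent space." Real models use hundreds or
# thousands of dimensions; these toy coordinates are invented so the
# arithmetic is easy to check. Dimension 0 ≈ gender, dimension 1 ≈ royalty.
vectors = {
    "king":  (1.0, 1.0),
    "man":   (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "queen": (-1.0, 1.0),
}

def nearest(target, vocab):
    """Return the word whose coordinates are closest to `target`."""
    return min(vocab, key=lambda w: math.dist(vocab[w], target))

# King - Man + Woman, computed coordinate by coordinate
k, m, w = vectors["king"], vectors["man"], vectors["woman"]
result = (k[0] - m[0] + w[0], k[1] - m[1] + w[1])
print(nearest(result, vectors))  # prints "queen"
```

Subtracting “Man” removes the gender coordinate from “King” while keeping the royalty coordinate; adding “Woman” supplies the opposite gender, landing exactly on “Queen.”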
When you provide a prompt, the AI identifies the coordinates of your request. It then “walks” through the nearby clusters of meaning to synthesize an answer. The “Black Box” is the result of the sheer scale of this map. With thousands of dimensions and billions of learned connections, the path the AI takes is so complex that no human can trace the logic of a single output back to a single “rule.”

4. The Computer Scientist Level: The Decoder-Only Transformer
Definition: Generative AI is a system powered by neural network architectures—most notably the Decoder-only Transformer—that is specifically tuned to generate the next piece of information by mathematically looking back at everything that came before it. Rather than relying on rigid rules, these models evaluate entire inputs using a mathematical weighting system called Self-Attention to determine the contextual relationship between every element.
To achieve this generative capability, the architecture relies on several complex mathematical mechanisms:
A. The “Query, Key, and Value” System: To decide how much “weight” to give a word, the AI creates three numerical identities for every token. The Query represents what the token is looking for (like a pronoun searching for a subject), the Key represents what the token offers (like a subject offering its identity), and the Value represents the token’s actual semantic meaning.

B. The Logic of Self-Attention: The AI establishes context by comparing the Query of one word against the Keys of all other words in the sequence. Imagine a judge sitting through a long trial. When a witness says the word ‘It,’ the judge immediately looks back at previous exhibits to see what ‘It’ refers to. For example, in the sentence “The court sanctioned the attorney because his motion was meritless,” the AI mathematically calculates the relationship between “his” and the surrounding words. The Query for “his” finds a high match with the Key for “attorney,” allowing the model to assign a high Attention Weight to “attorney” so the word “his” inherits the correct context.
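The Query, Key, and Value mechanics above can be sketched in plain Python. The vectors below are invented toy numbers, not real model weights, and this shows a single attention head rather than a full Transformer:

```python
import math

# A minimal sketch of scaled dot-product Self-Attention for a single
# head. The three tokens' Query/Key/Value vectors are invented toy
# numbers, not real model weights.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys    = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values  = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values, dim=2):
    # 1. Compare the Query against every Key (scaled dot product).
    scores = [dot(q, k) / math.sqrt(dim) for k in keys]
    # 2. Softmax turns raw scores into attention weights that sum to 1.
    weights = softmax(scores)
    # 3. Blend the Values according to those weights.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]

output = attend(queries[0], keys, values)
print(output)  # a weighted mix of all three Value vectors
```

The first token’s Query matches the first Key most strongly, so that token’s Value dominates the blend — the numerical equivalent of the judge resolving what ‘It’ refers to.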

C. Multi-Head Attention (Parallel Deliberation): The model doesn’t just evaluate the text once; it runs these calculations dozens of times in parallel. Different “Heads” focus on different aspects simultaneously—one might evaluate syntax and grammar, another focuses on technical legal definitions, and a third assesses the overall tone or sentiment.

D. The Decision Layer (Feed-Forward Networks): After attention weights are settled, the data moves into a decision-making layer consisting of billions of Weights (connection strengths) and Biases (baseline leanings). These act as the model’s “institutional knowledge,” which was grown during training to satisfy the objective of predicting the next token.

E. The Softmax Verdict: Finally, the model uses a Softmax function to produce a probability list of every possible word in its vocabulary. It calculates the exact odds—for example, assigning “Court” an 85% probability and “Sandwich” a 0.01% probability—and then mathematically samples the winner to generate the next word. Since the Softmax Verdict generates words based on statistical odds rather than verified facts, it is crucial for lawyers to verify the output, which we will also discuss in more detail later in this article.
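A minimal sketch of that final step, using invented logits (raw scores) for three candidate words:

```python
import math
import random

# A sketch of the "Softmax Verdict": invented raw scores (logits) for
# three candidate next words are converted into probabilities, and the
# next word is then sampled according to those odds.
logits = {"court": 5.0, "judge": 2.0, "sandwich": -4.0}

exps = {word: math.exp(score) for word, score in logits.items()}
total = sum(exps.values())
probs = {word: e / total for word, e in exps.items()}

random.seed(0)  # fixed seed so the sample is reproducible
choice = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs)   # "court" dominates; "sandwich" is vanishingly unlikely
print(choice)
```

Note that even a 0.01% word is never impossible — the sampler can, on rare occasions, pick it. That residual randomness is part of why outputs vary from run to run.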

5. The Tech-Minded Legal Professional Level: Probabilistic Advocacy
Definition: For the legal professional, Generative AI is not a database, but a Probabilistic Inference Engine. It does not “find” data in the traditional sense; it infers the most likely response based on the conceptual coordinates of your request and the mathematical “gravity” of the language it was trained on.
A. From Search to Inference
For fifty years, the legal industry’s relationship with technology was deterministic. Traditional legal databases use rigid logic gates: Does Document A contain Word X AND Word Y? If the words are present, it is a ‘hit’; if not, it is ignored, functioning as a simple ‘On/Off’ switch. The Transformer changes this completely. It is not a search database, but a Probabilistic Inference Engine. When you ask it to ‘analyze a witness’s credibility,’ it doesn’t just look for the word ‘credibility’; it infers a conclusion by weighing the context of every word in the record.
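That deterministic ‘On/Off switch’ is easy to sketch. Note that the second invented document below is plainly about credibility, yet the Boolean test misses it because the literal word never appears:

```python
# A sketch of the deterministic "logic gate": a document is a hit only
# if it contains both literal words. The two documents are invented;
# the second is plainly about credibility but never says the word.
docs = [
    "The witness lacked credibility on cross examination.",
    "The witness hesitated and contradicted earlier testimony.",
]

def boolean_hit(doc, word_x, word_y):
    words = doc.lower().split()
    return word_x in words and word_y in words   # a pure On/Off switch

hits = [d for d in docs if boolean_hit(d, "witness", "credibility")]
print(len(hits))  # prints 1 — the second document is invisible to Boolean search
```

An inference engine, by contrast, would recognize that hesitation and contradiction sit near “credibility” in Latent Space, and surface both documents.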

B. Navigating the Latent Space
To perform this analysis, the model navigates the Latent Space coordinates of your query. It uses the Self-Attention weights discussed in Level 4 to “infer” a conclusion by weighing the context of every word in the record. It identifies the “Intent” and “Sentiment” within millions of documents in a second. Such tasks were previously impossible for deterministic software.
C. The Weight of the Legal Oath
While the machine provides the “Magic Guesses” of a child and the “Neural Weights” of a scientist, it lacks the professional standing to be an advocate.
- The Black Box as an Invitation: The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of legal practice.
- The Human Validator: We use the machine to find the “needle” (the insight), but we use our human judgment to prove it is evidence and not a hallucination.
- The Ultimate Weight: In this new era, the most important “Weight” in the entire system is the one held by the human professional.

6. The “Growing, not Building” Concept: The Genesis of the Black Box
To understand why even the creators of these models cannot always explain a specific output, we have to understand that AI is trained into complexity, rather than just hard-coded with logic.
- The Old World of Software: In the past, we built programs based on rigid, transparent logic. If the code said “If X, then Y,” but it did something else, it was a “bug” to be corrected within a deterministic machine.
- The New World of Generative AI: This technology is created through Self-Supervised Learning. We don’t provide the model with logic blueprints; instead, we provide an ocean of data and a single objective: “Predict the next piece of information.”
- The “Growth” of Intelligence: The model then “grows” its own internal pathways—billions of connections known as Weights and Biases—to satisfy that objective.
Think of it like a massive vine growing through a lattice. As engineers, we provide the lattice (the Transformer architecture), but the vine (the intelligence) grows itself. By the time training is finished, there are hundreds of billions of connections. There is no “Master Code” for a human to read or audit. The “Black Box” is not a wall; it is a forest so dense that no human can map every leaf.
In the era of AI Entanglement, we must judge the AI by its results (the fruit) rather than its process (the roots).

7. The “Context Window” as a Trial Record
In the computer scientist level we discussed the Transformer’s ability to look at a whole document simultaneously. In practice, this capability is governed by the Context Window. In AI, the Context Window is the specific amount of data the model can “Attend” to at any one time. When you upload a 100-page contract, the AI holds that text in a temporary “workspace.”
The Judicial Analogy: Think of the Context Window as a judge’s Active Memory during a hearing.
- The Risk of Loss: If a trial lasts ten days, but the judge can only remember the last two hours of testimony, they will lose the thread of the case.
- Hallucination via Omission: They might “hallucinate” a fact not because they are lying, but because they have lost the beginning of the record.
- Legal Strategy: For the tech-minded lawyer, you must manage the “Active Record” of your conversation to ensure the model maintains access to critical early facts. In a similar way, a judge relies on a court reporter who makes a transcript of the record to ensure nothing is lost to the passage of time.
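One practical sketch of such “Active Record” management, assuming a hypothetical `build_context` helper: critical facts are pinned so they always stay in the window, and the remaining budget is filled with the newest messages. Tokens are approximated by word counts, and every message is invented for illustration:

```python
# A sketch of "Active Record" management: critical facts are pinned so
# they always stay in the window, and the remaining budget is filled
# with the most recent messages. Tokens are approximated by words, and
# every message here is invented for illustration.
def build_context(pinned, history, budget):
    def tokens(msg):
        return len(msg.split())

    used = sum(tokens(m) for m in pinned)
    kept = []
    for msg in reversed(history):          # walk newest to oldest
        if used + tokens(msg) > budget:
            break                          # budget exhausted
        kept.append(msg)
        used += tokens(msg)
    return pinned + kept[::-1]             # restore chronological order

pinned = ["Key fact: the contract was signed on day one."]
history = [f"exhibit {i} discussed" for i in range(1, 6)]
context = build_context(pinned, history, budget=20)
print(context)  # the pinned fact plus the newest exhibits that fit
```

The oldest exhibits fall away as the budget fills, but the pinned fact never does — the code equivalent of the court reporter’s transcript.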

8. Anatomy of a Hallucination
Examining a “case study” of a hallucination through the lens of Latent Space will help us understand how they happen.
Suppose you ask an AI for a case supporting a specific point of Florida law. The AI navigates to the “Neighborhood” of Florida Law and the “Street” of that specific legal issue. It sees a cluster of real cases—Smith v. Jones and Doe v. Roe.
Because it is a Probabilistic Inference Engine, the AI doesn’t naturally “check” a verified list of real cases. Instead, it follows the mathematical pattern of how Florida cases are typically named and cited.
The AI then “generates” Brown v. State—a case that sounds perfectly correct because its coordinates are exactly where a real case should be based on the surrounding patterns. It has followed the statistical “gravity” of the neighborhood, but it has drifted into a sequence of words that is factually untethered from reality.
It is a perfectly logical mathematical guess that happens to be a factual lie. This is the primary reason why we must cross-examine our assistants. We use our human judgment to prove the output is a needle of truth and not a hallucination of the “Black Box.” Cross-Examine Your AI: The Lawyer’s Cure for Hallucinations (12/17/25).
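The lesson of this case study can be reduced to a few lines: pattern-matching can confirm that a citation has the right shape, but only a check against a verified source (here, an invented two-case list) can confirm it is real:

```python
# A sketch of the verification gap: a name can match the statistical
# pattern of a citation without existing. The "verified" list and all
# case names are invented for illustration.
verified_cases = {"Smith v. Jones", "Doe v. Roe"}

def looks_like_case(name):
    # Checks only the *shape* of a citation, not its truth.
    return " v. " in name

candidate = "Brown v. State"           # pattern-perfect, but untethered
print(looks_like_case(candidate))      # True  — the shape is right
print(candidate in verified_cases)     # False — the case is not on the list
```

The model’s internal check is the first function; the lawyer’s duty is the second.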

Conclusion: A Symphony of Five Understandings
We have traveled from the magic toy box to the multi-dimensional math of the Transformer. To close, let’s look at the “Black Box” one last time through all five lenses.
The Smart Child sees a magic friend who is the best guesser in the world. To the child, the lesson is simple: the magic friend is fun, but sometimes they make up stories. Enjoy the story, but don’t bet your lunch money on it.
The High Schooler sees a massive “Autocomplete” engine. They understand that the AI is just a mirror of everything we’ve ever written. The lesson: the mirror is only as good as the light you shine into it.
The College Graduate sees the “Latent Space”—a map of human culture turned into math. They realize that meaning is not found in isolated words, but in the mathematical distance and relationship between them.
The Computer Scientist sees the Decoder-only Transformer—a masterpiece of matrix multiplication and Self-Attention weights. They know that “thinking” is just the sound of billions of Query and Key vectors finding their mathematical match.
The Tech-Minded Legal Professional—the “Human in the Loop”—sees a revolution. We see a tool that can navigate the “Intent” and “Sentiment” of millions of documents in a heartbeat using Probabilistic Inference. But we also see the weight of our professional oath.

Our New Role: From Searcher to Validator. Electronic discovery professionals are no longer just “Searchers” of data; we are the Validators of a new, probabilistic reality.
We are the ones who must take the “Magic Guesses” of the child, the “Statistical Patterns” of the high schooler, the “Latent Map” of the college graduate, and the “Neural Weights” of the scientist, and forge them into Evidence.
The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of practice. We use the machine to find the needle, but we use our human judgment to prove it is a needle and not a hallucination.
In the era of AI Entanglement, the most important “Weight” in the entire system is the human in charge: You.

Ralph Losey Copyright 2026 — All Rights Reserved
There is a battle in the legal tech world between Information Governance and Search. It reflects a larger conflict in IT and all of society. Last year I came to believe that Information Governance’s preoccupation with classification, retention, and destruction of information was a futile pursuit. I challenged these activities as inefficient and doomed to failure in the age of information explosion. Instead of classify and kill, I embraced the googlesque approach of save and search. 


Where are the rights to both privacy and security in the challenge of too-much-information? I am a strong proponent of privacy, and so are many in the IG world. I am also a strong proponent of cybersecurity. I think it is possible to have both. In both the Search and IG camps there are people who agree with me on these points, and others who disagree. Many see it as one or the other, especially people in government. They take extreme views favoring either security or privacy. Many in both tech and government simply dismiss the importance of privacy, and say just get over it. Advocacy for individual privacy is a separate battle in both worlds, IG and Search. The same is true over cybersecurity. I favor a balanced approach, and so do many in the IG world.
The traditionalists in the IG world whom I continue to oppose, the ones who are glorified records managers, have another five years, at best, before complete obsolescence. The classify and control lock-down approach of records management is contrary to the times. It cannot withstand the continuing exponential growth of data, nor the basic entropy forces aligned against all attempts to govern by all-too-human rules and compliance. Records managers are caterpillars waiting to be reborn. They should withdraw into a cocoon and embrace the change.
My prediction is that within five years the traditional records management activities, specifically the classification, filing and obsessive deletion of data, will no longer be worth the effort. (I concede that some deletion is necessary and will continue.) It will be far more efficient to rely on advanced Search, than classify and kill. This five-year projection assumes continued exponential growth and complexity of ESI. Breakthroughs in search in the next five years would be nice too, but my prediction does not depend on that. It assumes instead a slow, steady improvement of search technologies. They are already awesome, when used properly. The caterpillar record managers will grow big and fly high with search if they will only allow themselves to have new eyes.

Stewart Brand of Whole Earth Catalog fame is credited with originating the phrase information wants to be free, but in fact his quote is taken out of context.
In the meantime we have records managers running around who serve like heroic bomb squads. Some know that it is just a noble quest, doomed to failure. Most do not.

My understanding and experiences with Big Data analytics over the last few years have led me to understand that more data can mean more intelligence, and that it does not necessarily mean more trouble and expense. I understand that more and bigger data has its own unique values, so long as it can be analyzed and searched effectively.
The point is, with the never-ending uncertainties of tomorrow, you can never know for sure which information is valueless and should be destroyed, and which has value and should be saved. There may be an unimaginably large haystack of information, and you may think it only has a few valuable needles. But you never really know. Today’s irrelevant straw could be tomorrow’s relevant needle. With the AI-based search capacities we already have, capacities that are sure to improve, when you need to find a needle in these near-infinite stacks, you will be able to. The cost of storage itself has become so low as to be a negligible factor for most large corporations. Why destroy data when you can effectively search it and mine it for value? That is the butterfly view.



The records life-cycle ideas all made perfect sense in the world of paper information. It cost a lot of money to save and store paper records. Everyone with a monthly Iron Mountain paper records storage bill knows that. Even after the computer age began, it still cost a fair amount of money to save and store ESI. The computers needed to buy and maintain digital storage used to be very expensive. Finding the ESI you needed quickly on a computer was still very difficult and unreliable. All we had at first was keyword search, and that was very ineffective.
The New York Times, in a late-2014 opinion editorial titled Our Machine Masters, discussed recent breakthroughs in Artificial Intelligence and speculated on the alternative futures this could create.

Conclusion
Preservation is far less difficult when you are saving everything forever anyway. With this approach the challenging task remaining in e-discovery is really just search. That is why I say, only slightly tongue in cheek, that Information Governance is actually a subset of Search, not vice versa. Insofar as e-discovery is concerned, that is true; but IG is a concern that goes beyond e-discovery.
In the IG now emerging – IG 2.0 – Information Governance serves as a kind of umbrella organization for all things information. It is not just a hyped up version of records management. It is a center of a high-tech wheel built around information. That image has traction for Search advocates such as myself, just so long as search is not considered to be just another spoke in the Wheel. Search has a much more important position. It is the tire around the wheel, where the rubber meets the road. In today’s world you are likely to get lost without it.