Ralph Losey, March 29, 2026.
We are currently living through a “Gutenberg Moment,” but with a complex, digital twist: our new printing press is alive, probabilistic, and prone to “confident delusions.” While AI may be humanity’s most transformative invention, it remains an enigma to most.
For many legal professionals, the outputs of Generative AI feel like a digital seance—words appearing out of the ether with no visible logic. This “Black Box” is not just a technical curiosity; it is a professional liability. If you cannot at least partially understand and explain how your “assistant” reached a conclusion, you are effectively practicing in the dark. To move from being a passenger to a pilot, you must understand the mechanical soul of the machine and learn how to make it sing with the voices you command.

My recent article, What People Want To Know About AI: Top 10 Curiosity Index, revealed that the primary thing people want to know is how the machine actually works. They are asking the most difficult question in the field: How does AI “think” or make decisions?
This article answers that question by providing a structured understanding of Large Language Models (LLMs) across five levels of technical complexity:
- The Smart Child: The world’s best guessing game.
- The High School Graduate: Statistical probability at a global scale.
- The College Graduate: Mapping meaning in Latent Space.
- The Computer Scientist: The logic of the Transformer and Self-Attention.
- The Tech-Minded Legal Professional: Navigating probabilistic advocacy.

There is a meta-lesson here too that goes beyond the words on this page. Some of my favorite explanations of complex subjects emulate the fresh, clear speech of fifth graders. You will often find deep creativity when AI models parrot their language.
I chose five kinds of speech to describe how AI works. There are hundreds more that I could have picked. I also could have asked for explanations that use story or humor, much like Abraham Lincoln liked to do. It is fun to learn to tell AI what to do so that you can better communicate. It empowers a level of creativity never before possible. Maybe next time I will use comedy or poetry. For now, let’s peel back the curtain using these five.

1. The Smart Child Level: The World’s Best Guessing Game
Definition: Generative AI is like a magic “Fill-in-the-Blank” machine that has played the game trillions of times with almost every book ever written.
Imagine you are playing a game. If I say, “The peanut butter and…”, you immediately think of the word “jelly.” You don’t need to look at a jar of jelly to know that word fits. You’ve heard those words together so many times that your brain just knows they belong together.
An AI is a computer that has “listened” to almost everyone in the world talk and “read” almost every story ever told. It doesn’t “know” what a sandwich is, and it doesn’t have a stomach that feels hungry. It simply knows that in the history of human writing, the word “jelly” follows “peanut butter” more than almost any other word.
But it’s even smarter than that. If you say, “I am at the library and I am reading a…”, the AI knows that “book” is a much better guess than “sandwich”. It looks at all the words you give it—the “clues”—to narrow down the billions of possibilities into one likely answer. It makes decisions by picking the word that is most likely to come next to complete a pattern that makes sense to us. It isn’t “thinking” about the story; it’s just very, very good at predicting the next piece of the puzzle.
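The guessing game above can be sketched in a few lines of Python. The word counts below are invented for illustration; a real model derives its "hunches" from trillions of examples rather than a hand-made table:

```python
# Toy sketch of the "fill-in-the-blank" game: hypothetical counts of which
# word followed "peanut butter and" in a pretend pile of books.
from collections import Counter

next_word_counts = Counter({"jelly": 9200, "honey": 310, "bananas": 150, "pickles": 4})

def best_guess(counts: Counter) -> str:
    """Return the word seen most often after the prompt."""
    return counts.most_common(1)[0][0]

print(best_guess(next_word_counts))  # -> jelly
```

The machine never tastes jelly; it simply picks the word with the biggest count.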

2. The High School Level: Statistical Probability at Global Scale
Definition: AI is a Prediction Engine. It uses “Big Data” to calculate the statistical likelihood of the next piece of information.
Most of us use the “Autofill” feature on our smartphones every day. As you type a text, the phone suggests the next likely word based on your past habits. If you often text “I’m on my way,” the phone learns that “way” usually follows “my.” Generative AI—specifically Large Language Models—is essentially Autofill scaled to include the vast majority of digitized human knowledge.
During its “training” phase, the model does not “memorize” facts like a traditional database. If you ask it for the date of the Magna Carta, it isn’t looking it up in a digital encyclopedia. Instead, it has learned through billions of examples that the words “Magna Carta” and “1215” have a very high statistical correlation.
This explains why AI can sometimes be “confidently wrong.” It isn’t “lying” in the human sense; it is simply following a statistical path that leads to a mistake. If the data it was trained on contains a common error, the AI will repeat that error because, in its mathematical world, that error is the “most likely” next word. It recognizes the “shape” of human thought without actually having a human mind.
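Here is a toy "autofill" trained on a tiny, hypothetical corpus. It shows both the core mechanic (count what follows what, then pick the most frequent continuation) and why a common error in the training data becomes the model's "most likely" answer:

```python
# Minimal bigram "autofill": learn next-word frequencies from a tiny corpus.
# If the corpus contains a common error, the model repeats whatever is most
# frequent -- it has no way to know a continuation is factually wrong.
from collections import defaultdict, Counter

corpus = [
    "magna carta was signed in 1215",
    "magna carta was signed in 1215",
    "magna carta was signed in 1216",   # an error baked into the data
]

bigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1

def autofill(word: str) -> str:
    """Return the statistically most likely next word."""
    return bigrams[word].most_common(1)[0][0]

print(autofill("in"))  # -> 1215
```

Here "1215" wins only because it dominates the training data; had the error been the majority, the model would repeat it with the same confidence.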

3. The College Graduate Level: Mapping the Latent Space
Definition: AI organizes information using Vector Embeddings, which convert words into numerical coordinates on a massive, multi-dimensional map called Latent Space.
To understand how AI moves beyond mere word-matching, we have to look at how it “maps” meaning. In a physical library, books are organized in one dimension (the position of a spine on a shelf) or two (the grid of shelves). AI organizes information in a “map” that has thousands of dimensions.
- Vectoring (The Coordinate System): Every word or concept is assigned a “Coordinate”—a long string of numbers. For example, the word “Stealing” is mathematically plotted very close to “Larceny” but far away from “Charity”.
- Conceptual Proximity: Think of this as the “Relativity” of language. If you ask the AI about “theft,” it doesn’t look for that specific word. It navigates to those coordinates in Latent Space and finds all the “neighboring” concepts like “property,” “intent,” and “deprivation.”
- Vector Arithmetic: Researchers discovered that you can actually perform “logic” using these numbers. A famous example is: King – Man + Woman = Queen. The model “understands” the relationship between these concepts because the vector offset from “Man” to “King” (roughly, the direction of “royalty”) is nearly the same as the offset from “Woman” to “Queen.”
When you provide a prompt, the AI identifies the coordinates of your request. It then “walks” through the nearby clusters of meaning to synthesize an answer. The “Black Box” is the result of the sheer scale of this map. With thousands of dimensions and hundreds of billions of learned connections, the path the AI takes is so complex that no human can trace the logic of a single output back to a single “rule.”
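The famous King – Man + Woman = Queen arithmetic can be demonstrated in miniature. Real embeddings have thousands of learned dimensions; the hand-made three-dimensional vectors below are illustrative assumptions only, with each axis standing in for an invented feature:

```python
# Sketch of vector arithmetic in a toy 3-dimensional "latent space".
# Axes (invented for illustration): [royalty, maleness, femaleness].
import math

emb = {
    "king":  [0.9, 0.9, 0.1],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.9, 0.1, 0.9],
}

def cosine(a, b):
    """Similarity of two vectors: 1.0 means pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# king - man + woman lands at a new coordinate in the space...
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# ...and the nearest stored word to that coordinate is "queen".
best = max(emb, key=lambda word: cosine(emb[word], target))
print(best)  # -> queen
```

The model is not reasoning about monarchy; it is finding the nearest neighbor to a point in the map.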

4. The Computer Scientist Level: The Decoder-Only Transformer
Definition: Generative AI is a system powered by neural network architectures—most notably the Decoder-only Transformer—that is specifically tuned to generate the next piece of information by mathematically looking back at everything that came before it. Rather than relying on rigid rules, these models evaluate entire inputs using a mathematical weighting system called Self-Attention to determine the contextual relationship between every element.
To achieve this generative capability, the architecture relies on several complex mathematical mechanisms:
A. The “Query, Key, and Value” System: To decide how much “weight” to give a word, the AI creates three numerical identities for every token. The Query represents what the token is looking for (like a pronoun searching for a subject), the Key represents what the token offers (like a subject offering its identity), and the Value represents the token’s actual semantic meaning.

B. The Logic of Self-Attention: The AI establishes context by comparing the Query of one word against the Keys of all other words in the sequence. Imagine a judge sitting through a long trial. When a witness says the word ‘It,’ the judge immediately looks back at previous exhibits to see what ‘It’ refers to. For example, in the sentence “The court sanctioned the attorney because his motion was meritless,” the AI mathematically calculates the relationship between “his” and the surrounding words. The Query for “his” finds a high match with the Key for “attorney,” allowing the model to assign a high Attention Weight to “attorney” so the word “his” inherits the correct context.
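For readers who want to see the Query-against-Keys comparison as actual arithmetic, here is a minimal sketch of scaled dot-product attention for that example sentence. The two-dimensional vectors are invented for illustration; real models learn these projections during training:

```python
# Minimal sketch of scaled dot-product attention for the word "his".
# Vectors are hand-made, not learned -- illustrative assumptions only.
import math

query_his = [1.0, 0.0]                      # what "his" is looking for
keys = {                                    # what earlier words offer
    "court":      [0.1, 1.0],
    "sanctioned": [0.0, 0.2],
    "attorney":   [0.9, 0.1],
}

# 1. Compare the Query against every Key (dot product), scaled by sqrt(dim).
scores = {w: sum(q * k for q, k in zip(query_his, key)) / math.sqrt(2)
          for w, key in keys.items()}

# 2. Softmax turns raw scores into attention weights that sum to 1.
exp = {w: math.exp(s) for w, s in scores.items()}
total = sum(exp.values())
weights = {w: e / total for w, e in exp.items()}

print(max(weights, key=weights.get))  # -> attorney
```

“Attorney” earns the highest weight, so the meaning carried forward for “his” is blended mostly from the attorney’s Value vector.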

C. Multi-Head Attention (Parallel Deliberation): The model doesn’t just evaluate the text once; it runs these calculations dozens of times in parallel. Different “Heads” focus on different aspects simultaneously—one might evaluate syntax and grammar, another focuses on technical legal definitions, and a third assesses the overall tone or sentiment.

D. The Decision Layer (Feed-Forward Networks): After attention weights are settled, the data moves into a decision-making layer consisting of billions of Weights (connection strengths) and Biases (baseline leanings). These act as the model’s “institutional knowledge,” which was grown during training to satisfy the objective of predicting the next token.

E. The Softmax Verdict: Finally, the model uses a Softmax function to produce a probability list of every possible word in its vocabulary. It calculates the exact odds—for example, assigning “Court” an 85% probability and “Sandwich” a 0.01% probability—and then mathematically samples the winner to generate the next word. Since the Softmax Verdict generates words based on statistical odds rather than verified facts, it is crucial for lawyers to verify the output, which we will also discuss in more detail later in this article.
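A minimal sketch of that final step, with invented raw scores (“logits”), shows how numbers become a probability list and how the winner is sampled by odds rather than verified:

```python
# Sketch of the "Softmax Verdict": raw logits become probabilities,
# and the next word is drawn by weighted chance. Scores are invented.
import math
import random

logits = {"court": 6.0, "judge": 4.5, "sandwich": -3.0}

exp = {w: math.exp(v) for w, v in logits.items()}
total = sum(exp.values())
probs = {w: e / total for w, e in exp.items()}   # sums to 1.0

# "court" dominates, but "sandwich" keeps a tiny nonzero chance --
# the model samples by odds; it never checks the facts.
random.seed(0)
choice = random.choices(list(probs), weights=probs.values())[0]
print(probs["court"] > probs["judge"] > probs["sandwich"])  # -> True
```

Because even absurd words retain a sliver of probability, low-probability errors can and do surface, which is why verification remains a human duty.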

5. The Tech-Minded Legal Professional Level: Probabilistic Advocacy
Definition: For the legal professional, Generative AI is not a database, but a Probabilistic Inference Engine. It does not “find” data in the traditional sense; it infers the most likely response based on the conceptual coordinates of your request and the mathematical “gravity” of the language it was trained on.
A. From Search to Inference
For fifty years, the legal industry’s relationship with technology was deterministic. Traditional legal databases use rigid logic gates: Does Document A contain Word X AND Word Y? If the words are present, it is a ‘hit’; if not, it is ignored, functioning as a simple ‘On/Off’ switch. The Transformer changes this completely. It is not a search database, but a Probabilistic Inference Engine. When you ask it to ‘analyze a witness’s credibility,’ it doesn’t just look for the word ‘credibility’; it infers a conclusion by weighing the context of every word in the record.
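The difference can be shown in miniature. In this toy example, a rigid Boolean query finds only the document containing the literal keywords, while a document that is plainly about credibility, just phrased differently, stays invisible to it (documents and query are invented for illustration):

```python
# Deterministic Boolean search: a rigid on/off switch over literal words.
docs = {
    1: "the witness testimony lacked credibility",
    2: "the deponent repeatedly contradicted her earlier statements",
}

# Query: credibility AND witness
hits = [d for d, text in docs.items()
        if "credibility" in text and "witness" in text]

print(hits)  # -> [1]
# Document 2 goes to the heart of credibility, yet the Boolean
# query never sees it. An inference engine, navigating meaning
# rather than matching strings, would place both documents in the
# same conceptual neighborhood.
```

This is the gap probabilistic inference closes, and the reason keyword-era search habits undersell what these tools can do.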

B. Navigating the Latent Space
To perform this analysis, the model navigates the Latent Space coordinates of your query, applying the Self-Attention weights discussed in Level 4 to every word in the record. It identifies the “Intent” and “Sentiment” within millions of documents in a second. Such tasks were previously impossible for deterministic software.
C. The Weight of the Legal Oath
While the machine provides the “Magic Guesses” of a child and the “Neural Weights” of a scientist, it lacks the professional standing to be an advocate.
- The Black Box as an Invitation: The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of legal practice.
- The Human Validator: We use the machine to find the “needle” (the insight), but we use our human judgment to prove it is evidence and not a hallucination.
- The Ultimate Weight: In this new era, the most important “Weight” in the entire system is the one held by the human professional.

6. The “Growing, not Building” Concept: The Genesis of the Black Box
To understand why even the creators of these models cannot always explain a specific output, we have to understand that AI is trained into complexity, rather than just hard-coded with logic.
- The Old World of Software: In the past, we built programs based on rigid, transparent logic. If the code said “If X, then Y,” but it did something else, it was a “bug” to be corrected within a deterministic machine.
- The New World of Generative AI: This technology is created through Self-Supervised Learning. We don’t provide the model with logic blueprints; instead, we provide an ocean of data and a single objective: “Predict the next piece of information.”
- The “Growth” of Intelligence: The model then “grows” its own internal pathways—billions of connections known as Weights and Biases—to satisfy that objective.
Think of it like a massive vine growing through a lattice. As engineers, we provide the lattice (the Transformer architecture), but the vine (the intelligence) grows itself. By the time training is finished, there are hundreds of billions of connections. There is no “Master Code” for a human to read or audit. The “Black Box” is not a wall; it is a forest so dense that no human can map every leaf.
In the era of AI Entanglement, we must judge the AI by its results (the fruit) rather than its process (the roots).

7. The “Context Window” as a Trial Record
In the computer scientist level we discussed the Transformer’s ability to look at a whole document simultaneously. In practice, this capability is governed by the Context Window. In AI, the Context Window is the specific amount of data the model can “Attend” to at any one time. When you upload a 100-page contract, the AI holds that text in a temporary “workspace.”
The Judicial Analogy: Think of the Context Window as a judge’s Active Memory during a hearing.
- The Risk of Loss: If a trial lasts for ten days, but the judge can only remember the last two hours of testimony, they will lose the thread of the case.
- Hallucination via Omission: They might “hallucinate” a fact not because they are lying, but because they have lost the beginning of the record.
- Legal Strategy: For the tech-minded lawyer, you must manage the “Active Record” of your conversation to ensure the model maintains access to critical early facts. In a similar way, a judge relies on a court reporter who makes a transcript of the record to ensure nothing is lost to the passage of time.
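A toy sketch of the mechanic: if a model can only “attend” to the last N tokens, everything earlier silently drops out of its active memory. The record and the window size below are invented for illustration:

```python
# Context-window truncation in miniature: the model "sees" only the
# most recent WINDOW tokens of the running record.
record = ("Exhibit A was admitted on day one . " * 3 +
          "Closing arguments referenced Exhibit A . " * 3).split()

WINDOW = 10  # hypothetical context window, measured in tokens
visible = record[-WINDOW:]

print("Exhibit" in visible)        # -> True: the name survives...
print(visible.count("admitted"))   # -> 0: the admission itself is gone
```

The model still sees “Exhibit A” being discussed, but the fact that it was admitted on day one has fallen outside the window, exactly the kind of silent omission that breeds hallucination.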

8. Anatomy of a Hallucination
A “case study” of a single hallucination, viewed through the lens of Latent Space, will help us understand how they occur.
Suppose you ask an AI for a case supporting a specific point of Florida law. The AI navigates to the “Neighborhood” of Florida Law and the “Street” of that specific legal issue. It sees a cluster of real cases—Smith v. Jones and Doe v. Roe.
Because it is a Probabilistic Inference Engine, the AI doesn’t naturally “check” a verified list of real cases. Instead, it follows the mathematical pattern of how Florida cases are typically named and cited.
The AI then “generates” Brown v. State—a case that sounds perfectly correct because its coordinates are exactly where a real case should be based on the surrounding patterns. It has followed the statistical “gravity” of the neighborhood, but it has drifted into a sequence of words that is factually untethered from reality.
It is a perfectly logical mathematical guess that happens to be a factual lie. This is the primary reason we must cross-examine our assistants. We use our human judgment to prove the output is a needle of truth and not a hallucination of the “Black Box.” See Cross-Examine Your AI: The Lawyer’s Cure for Hallucinations (12/17/25).

Conclusion: A Symphony of Five Understandings
We have traveled from the magic toy box to the multi-dimensional math of the Transformer. To close, let’s look at the “Black Box” one last time through all five lenses.
The Smart Child sees a magic friend who is the best guesser in the world. To the child, the lesson is simple: the magic friend is fun, but sometimes they make up stories. Enjoy the story, but don’t bet your lunch money on it.
The High Schooler sees a massive “Autocomplete” engine. They understand that the AI is just a mirror of everything we’ve ever written. The lesson: the mirror is only as good as the light you shine into it.
The College Graduate sees the “Latent Space”—a map of human culture turned into math. They realize that meaning is not found in isolated words, but in the mathematical distance and relationship between them.
The Computer Scientist sees the Decoder-only Transformer—a masterpiece of matrix multiplication and Self-Attention weights. They know that “thinking” is just the sound of billions of Query and Key vectors finding their mathematical match.
The Tech-Minded Legal Professional—the “Human in the Loop”—sees a revolution. We see a tool that can navigate the “Intent” and “Sentiment” of millions of documents in a heartbeat using Probabilistic Inference. But we also see the weight of our professional oath.

Our New Role: From Searcher to Validator. Electronic discovery professionals are no longer just “Searchers” of data; we are the Validators of a new, probabilistic reality.
We are the ones who must take the “Magic Guesses” of the child, the “Statistical Patterns” of the high schooler, the “Latent Map” of the college graduate, and the “Neural Weights” of the scientist, and forge them into Evidence.
The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of practice. We use the machine to find the needle, but we use our human judgment to prove it is a needle and not a hallucination.
In the era of AI Entanglement, the most important “Weight” in the entire system is the human in charge: You.

Ralph Losey Copyright 2026 — All Rights Reserved

