Ralph Losey, May 2026
This article is about a real event. It is not satire, parody, or metaphor. In late April 2026, OpenAI publicly explained why one of its frontier AI systems had developed an unusual tendency to mention goblins, gremlins, raccoons, trolls, ogres, pigeons, and similar creatures in places where they did not belong. OpenAI titled its official explanation “Where the Goblins Came From.” The title sounds fictional. The problem was not.

If you take the time to study this strange episode, you will gain more than an amusing story about artificial intelligence. You will see, in unusually visible form, how Large Language Models can acquire unintended behavior from training incentives, how that behavior can spread beyond its original context, why prompt-level or developer-level instructions may be used to suppress it, and how the same root causes help explain the ongoing problem of AI hallucination. For lawyers, judges, e-discovery professionals, and legal technology vendors, this is not a curiosity. It is a warning label written in unusually memorable ink.

The Most Bizarre Codex Instruction of All Time
OpenAI’s example involved Codex, its AI coding agent. For non-programmers, Codex is not a fantasy product and not a casual chatbot. It is a professional software-development tool designed to help engineers plan, write, refactor, test, review, and release code. OpenAI describes Codex as “a coding agent that helps you build and ship with AI,” used for real engineering work across development tools.
That context matters. The now-famous instruction was not a joke inserted into a toy system. It was a developer-level instruction in a serious AI coding agent. According to reporting and OpenAI’s later explanation, Codex had been instructed not to talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless they were clearly relevant to the user’s request.
WIRED first reported the Codex CLI instruction that the model should “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.” Maxwell Zeff, OpenAI Really Wants Codex to Shut Up About Goblins (WIRED, Apr. 2026). OpenAI then responded with its own article, Where the Goblins Came From (OpenAI, Apr. 29, 2026), explaining that GPT-5.5 in Codex showed an affinity for goblin metaphors and tracing the behavior to training incentives connected with the “Nerdy” personality. It is well worth the read.
The facts are unusual enough that they do not need embellishment. Indeed, embellishment would weaken the point. The issue is not that an AI system said something funny. The issue is that a frontier model, shaped by modern training methods, developed a persistent behavior that its maker had to investigate, explain, and mitigate. That is precisely why lawyers should pay attention.

The “Goblin” Problem Was an Alignment Problem in Plain Sight
The legal technology world often discusses AI alignment in abstract language. We talk about bias, safety, truthfulness, reliability, explainability, auditability, and human values. Those are important words, but they can become bloodless. The goblin incident gives us something more concrete.
OpenAI explained that the behavior emerged from “many small incentives,” including the AI training itself in connection with its personality customization feature, especially an introverted “Nerdy” personality. That personality was designed to make the model more playful, intellectually enthusiastic, and metaphor-friendly. In the process, certain creature metaphors were rewarded often enough that the model learned to repeat and generalize them.
I have frequently written about the ability of AI to form fictitious sub-personalities for brainstorming purposes, and note that the Devil’s Advocate character is especially effective. Fortunately, he was not involved in this OpenAI fiasco. I have never instructed AI to form a shy, super-nerd personality type for training purposes. If I ever do in the future (doubtful), I will obviously be very careful to provide strong human supervision, something that was obviously missing here. This whole incident looks like over-delegation, where the humans in the loop were not paying attention and so triggered this gremlin crisis.
This brings up a key point. The AI model was not “thinking about goblins.” It was responding to patterns shaped by training data, reinforcement learning, preference signals, and later adjustments. If a certain style of answer receives favorable feedback, the model can learn that style as a useful pattern. If that pattern includes odd creature metaphors, those metaphors can become part of the model’s behavior.
OpenAI’s post-mortem is valuable because it shows something that usually remains hidden. Model behavior does not simply appear at deployment. It is cultivated. It is selected. It is rewarded. It is penalized. It is patched. It is monitored. Sometimes, it is suppressed by instructions that users never see. I never knew that before.
In this case, the visible symptom was bizarre. The underlying process was ordinary. That is what makes the episode important.

What Are These “Instructions,” and Why Should Lawyers Care?
Modern AI systems are not governed only by the words users type into the chat window. They also operate under layers of instructions. Some instructions come from the system level. Some come from developers. Some come from product settings, safety policies, tool configurations, or specialized agent workflows. Some come from users themselves. The user may never see, or even know about, the developer instructions that shape the response to the user’s prompts.
A developer instruction is essentially a command placed above the ordinary user prompt. It tells the model how to behave in a particular product environment. In Codex, such instructions may shape how the model writes code, uses tools, comments on programming tasks, avoids certain behaviors, or responds within a software-development workflow.
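For readers who want to see what that layering looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The instruction wording is my invention for illustration; it is not OpenAI’s actual Codex instruction, and real products layer many more policies than this.

```python
# Minimal sketch of layered instructions, using the OpenAI Python SDK.
# The instruction wording below is hypothetical, for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # any chat model; this name is an assumption
    messages=[
        # Developer-level instruction: the user never sees this text,
        # but it shapes every answer the product returns.
        {"role": "system",
         "content": ("You are a legal drafting assistant. Protect "
                     "confidentiality, cite sources, and flag uncertainty. "
                     "Say 'I don't know' when the record does not support "
                     "an answer.")},
        # User-level prompt: the only layer the user actually types.
        {"role": "user",
         "content": "Summarize the indemnification clause in this contract."},
    ],
)
print(response.choices[0].message.content)
```

The system-level (or, in newer APIs, developer-level) message outranks the user prompt, and that hierarchy is exactly where the no-goblin rule lived.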
That is not improper. In fact, layered instructions are necessary. A legal AI tool should be told to protect confidentiality, avoid unauthorized practice of law, cite sources, flag uncertainty, preserve privilege, and follow the user’s workflow. The problem is not the existence of instructions. The problem is invisibility, the lack of auditability, and, as just mentioned, the lack of proper human supervision of the whole process. The humans in the loop were asleep at the wheel, and as a consequence the dogs got out.
In legal work, hidden constraints can matter. If a model suppresses certain language (such as profanity), favors certain categories (such as propriety), emphasizes certain risks (such as letting the dogs out), avoids certain conclusions (such as that the user is wrong), or changes behavior after an update (such as no hacking allowed, eh Claude), the lawyer may not know why. That matters in e-discovery, privilege review, contract analysis, legal research, expert preparation, and litigation strategy. Another layer of e-discovery opens up.
The Codex no-goblin instruction is therefore not important because lawyers care about goblins. (I for one do not, although I do care about “not letting the dogs out.”) It is important because it reveals how behavioral control can operate behind the scenes.

The Hallucination Connection
The goblin problem is not identical to hallucination, but the two issues share root causes.
The goblin problem involved an unintended stylistic habit. Hallucination involves plausible but false content. One produces irrelevant creature metaphors. The other produces fake cases, invented quotations, nonexistent statutes, false summaries, fabricated citations, or confident statements unsupported by the record.
The difference is obvious. The connection is deeper.
Both problems arise from the same basic fact: Large Language Models are not born as truth engines. They are trained to predict and generate language. Later training stages, including supervised fine-tuning, reinforcement learning, preference optimization, safety training, and evaluation systems, try to make that language helpful, accurate, safe, and aligned with user expectations.
But training incentives can misfire. Evaluation methods can reward the wrong behavior. A system can learn to produce answers that sound good rather than answers that are verified. It can learn fluency before truth, confidence before calibration, and completion before uncertainty. It could be trained to say, “I don’t know,” but it wasn’t. There is not much of that on the Internet. So, instead, it just makes up an answer, one that it infers the user wants, because it is also trained to be a nice sycophant. Nobody wants a devil’s advocate around who disagrees with them. We should want one, of course, and that is why lawyers have the potential to be great users of generative AI.
OpenAI made this point directly in its 2025 discussion of why language models hallucinate. Why Language Models Hallucinate (OpenAI, Sept. 5, 2025). OpenAI explained that hallucinations persist in part because many evaluation systems reward accuracy alone, which can push models to guess rather than admit uncertainty. If a model guesses, it may get lucky and receive credit. If it says “I don’t know,” it may receive no credit at all. Over many evaluations, that scoring structure can make a guessing model appear more successful than a more careful model that abstains when it lacks reliable information.
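The arithmetic behind that incentive is simple enough to sketch. Assume a model has a 30% chance of guessing correctly when it lacks reliable information; the numbers below are illustrative, not OpenAI’s.

```python
# Illustrative arithmetic: why accuracy-only scoring rewards guessing.
# p = chance a guess happens to be right when the model lacks real knowledge.
p = 0.30

# Accuracy-only benchmark: 1 point if right, 0 if wrong, 0 for "I don't know".
guess_score_accuracy_only = p * 1 + (1 - p) * 0    # 0.30
abstain_score_accuracy_only = 0.0                  # abstaining always scores 0

# Calibrated benchmark: wrong answers cost a point; abstention costs nothing.
guess_score_penalized = p * 1 + (1 - p) * (-1)     # -0.40
abstain_score_penalized = 0.0

print(f"Accuracy-only: guess {guess_score_accuracy_only:.2f} "
      f"vs abstain {abstain_score_accuracy_only:.2f}")
print(f"With penalty:  guess {guess_score_penalized:.2f} "
      f"vs abstain {abstain_score_penalized:.2f}")
```

Under accuracy-only scoring, guessing always beats abstaining. Over thousands of evaluation questions, that gap is why a model that always answers can look better on leaderboards than one that honestly abstains.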
That is the real connection between goblins and hallucinations. They are different failures, but they reflect the same training logic. In the goblin case, the rewarded behavior was playful metaphor, so the model learned to repeat and generalize playful creature references. In hallucination, the rewarded behavior is often answer-giving itself, so the model may learn to produce a confident response even when it lacks adequate grounding. In both cases, the model is not following truth as an independent legal or evidentiary standard. It is following patterns that its training, feedback, and evaluation systems have taught it to treat as successful.
The danger for lawyers is that hallucinations usually do not look strange. Goblins and pigeons are obvious intrusions. They announce that something has gone wrong. A fake citation does not. A fabricated quotation does not. A false summary of a contract clause, deposition answer, medical record, email thread, or judicial opinion may read with the same polish and confidence as a correct one. The surface quality of the prose may conceal the absence of reliable support.
That is why hallucinations are more dangerous than the goblin problem. The goblins expose the machinery because they look absurd. Hallucinations hide the machinery because they look professional. For legal work, that difference is critical. The risk is not merely that an AI system may be odd. The risk is that it may be wrong in a way that looks authoritative, usable, and ready to file.

This Is Not Just an OpenAI Problem
It would be a mistake to treat this as an OpenAI-only issue. The OpenAI goblin post-mortem is useful because it is unusually visible, candid, and memorable. But hallucination and unintended model behavior afflict all modern LLM systems under development, including Claude, Gemini, and other leading models.
Anthropic’s own Claude documentation expressly addresses hallucination reduction, warning that even advanced models can generate text that is factually incorrect or inconsistent with context, and recommending mitigation techniques such as allowing Claude to say it does not know, grounding answers in provided source material, using direct quotations, verifying with citations, and validating critical information. Anthropic, Reduce Hallucinations (Claude API Docs).
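Those recommendations translate directly into prompt structure. Here is a minimal sketch of a grounded, abstention-friendly prompt using the Anthropic Python SDK; the instruction wording is my paraphrase of the documentation’s general advice, not Anthropic’s exact text, and the model name is a placeholder.

```python
# Minimal sketch of a grounded prompt, following the general advice in
# Anthropic's hallucination-reduction documentation. The instruction
# wording is my paraphrase, for illustration only.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

contract_text = open("contract.txt").read()  # the source material to ground on

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # substitute whatever model you use
    max_tokens=1024,
    system=(
        "Answer only from the document provided. Quote the document "
        "directly for every factual claim. If the document does not "
        "contain the answer, say 'I don't know' instead of guessing."
    ),
    messages=[{
        "role": "user",
        "content": f"<document>{contract_text}</document>\n\n"
                   "What notice period does the termination clause require?",
    }],
)
print(message.content[0].text)
```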
Google’s Gemini documentation similarly warns that Gemini for Google Cloud may produce hallucinations, including outputs that are plausible-sounding but factually incorrect, irrelevant, inappropriate, or nonsensical, and may even fabricate links to web pages that do not exist and have never existed. Google Cloud, Gemini for Google Cloud and Responsible AI (Google Cloud Documentation).
The vendors differ. The architectures differ. The safety philosophies differ. The product interfaces differ. But the fundamental problem is shared. These systems are trained to generate plausible language under complex incentives. Plausibility is not truth. Fluency is not verification. Confidence is not reliability.
This point should be stated carefully. It does not mean that all systems are equally risky, equally useful, or equally well governed. They are not. Some models perform better than others on particular tasks. Some products provide stronger grounding, citation, retrieval, logging, or enterprise controls. Some workflows are safer than others.
But no responsible legal professional should assume that hallucination and goblins are confined to one vendor. It is a structural limitation of current LLM technology.

The Legal Technology Lesson
Legal professionals should not respond to this by rejecting AI. That would be the wrong lesson. It would also ignore the enormous value these tools already provide when used with care.
The correct lesson is disciplined adoption.
In e-discovery, we already understand this principle. Technology-assisted review is not accepted because someone declares the software intelligent. It is accepted when the process is reasonable, validated, documented, and proportionate. Sampling matters. Quality control matters. Human judgment matters. Reproducibility matters. Transparency matters.
The same discipline must now be applied to generative AI. Legal AI workflows should be designed to answer practical questions:
- Can the output be traced to reliable source material?
- Did the model actually use the cited source?
- Can each legal citation be verified?
- Can each quotation be checked against the original?
- Can each factual assertion be tied to the record?
- Can the workflow be reproduced if challenged?
- Was the model permitted to say “I don’t know”?
- Was uncertainty preserved, or did the workflow pressure the model into confident completion?
- Were model version, prompt structure, source set, and review procedures documented?
- Was a qualified human responsible for final legal judgment?
These questions are not anti-AI. They are pro-reliability. They are the questions that separate professional use from casual use.
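None of these questions requires exotic tooling. Even the citation-verification question can start as a few lines of code. The sketch below is mine, not any vendor’s: extract_citations is a deliberately naive pattern matcher, and the trusted set is a hypothetical stand-in for a real citator, research database, or internal library of human-verified authorities.

```python
# A minimal sketch of the citation-verification question above.
# extract_citations and trusted_authorities are hypothetical stand-ins
# for a real citation parser and a human-verified authority database.
import re

def extract_citations(draft: str) -> list[str]:
    # Naive volume-reporter-page pattern (e.g., "123 F.3d 456");
    # a real workflow would use a proper citation parser.
    return re.findall(r"\b\d{1,4}\s+[\w.]+\s+\d{1,4}\b", draft)

def audit_draft(draft: str, trusted_authorities: set[str]) -> list[str]:
    """Return every citation in the draft that could not be verified."""
    return [c for c in extract_citations(draft) if c not in trusted_authorities]

draft = "As held in 123 F.3d 456, the duty attaches at filing. See 999 F.9d 999."
trusted = {"123 F.3d 456"}  # citations a human has actually verified

for citation in audit_draft(draft, trusted):
    print(f"UNVERIFIED: {citation} -- check against an authoritative source")
```

A script like this does not replace human review; it only flags what a human must still verify.
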
Why This Matters for Courts and Clients
Courts do not need lawyers to become machine-learning engineers. Clients do not need their lawyers to understand every detail of transformer architecture. But both courts and clients are entitled to competent professional judgment.
That includes knowing when an AI output is grounded and when it is merely plausible. It includes knowing when a citation has been verified and when it has merely been generated. It includes knowing when an AI tool is being used for brainstorming, drafting, summarization, classification, legal research, or evidence analysis, because each use carries different risks.
The goblin incident offers a rare window into model behavior because the symptom was so visible. Most legally significant failures will not be so obvious. They will not involve fantasy creatures. They will involve a misstated holding, an omitted exception, a distorted fact pattern, a privilege call made too broadly, a missed document, or a confident statement about law that is no longer current. By the way, humans can make all of the same mistakes, which is one reason we tend to do better working in small teams.
That is why the legal profession, indeed all of humanity, must treat generative AI as powerful but not self-validating.

Practical Guidance for Lawyers and Legal Tech Users
The practical response is straightforward:
- Use AI, but verify.
- Use AI for first drafts, issue spotting, summarization, brainstorming, and classification support, but do not outsource professional judgment.
- Use retrieval, citations, and source-grounded workflows whenever factual accuracy matters.
- Require the model to distinguish between sourced statements, inferences, and speculation.
- Require explicit uncertainty when the record is incomplete.
- For legal research, verify every case, statute, rule, quotation, and parenthetical against authoritative sources.
- For e-discovery and document review, use sampling, validation, audit trails, and human quality control.
- For AI vendor selection, ask what model is being used, how outputs are grounded, how hallucination risk is measured, what logs are preserved, what changes when the model is updated, and whether the workflow can be explained if challenged.
- For judicial or regulatory settings, avoid vague claims that an AI tool is “aligned,” “safe,” or “accurate” without evidence. Ask what was tested, how it was tested, and under what conditions.
The lesson is not distrust. The lesson is earned trust.

Conclusion: The Promise and the Work Ahead
At the beginning of this article, I promised that this strange episode would offer more than an amusing story. It does.
OpenAI’s real no-goblin, no-pigeon instruction gives lawyers a concrete example of how modern AI behavior can be shaped by training incentives, generalized beyond its original setting, and later mitigated through hidden or semi-hidden instructions. The hallucination problem shows the same root issue in more serious form. When models are rewarded for fluent completion, confidence, and benchmark performance, they may learn to answer when they should abstain, to sound certain when they should qualify, and to generate plausible legal authority when only verified authority will do.
Users must learn these idiosyncrasies and adapt.
This is not just about OpenAI. It is not just about Codex. It is not just about goblins. It is about every legal professional’s duty to understand the tools now entering legal practice. It is about understanding how to use them properly.
Generative AI can help lawyers become faster, broader, more creative, and more effective. It can improve access to justice, reduce drudgery, accelerate document review, strengthen legal education, and help professionals see patterns they might otherwise miss. But these benefits will not be realized by pretending the risks are gone. They will be realized by confronting the risks directly and building better habits, better workflows, better audits, better training, and better professional norms.
The goblins are real in the only sense that matters here: real enough to show us how fragile model behavior can be. The hallucinations are more dangerous because they usually do not look strange at all.
That is the call to action. Legal professionals should not stand outside the AI revolution, arms folded, waiting for perfect machines. Nor should they rush in, eyes closed, dazzled by fluent output. We should do what good lawyers have always done with powerful evidence and powerful tools: question them, test them, document them, verify them, and use them responsibly.
The future of legal AI will not be built by blind trust or reflexive fear. It will be built by informed confidence.
And informed confidence begins with verification.

Ralph Losey Copyright 2026. All Rights Reserved
For educational use only. Not legal advice.