Stochastic Parrots: How to tell if something was written by an AI or a human?

Ralph Losey. Published April 5, 2024.

There are two types of “tells” as to whether a writing is a fake, just another LLM-created parrot, or whether it’s real, a bona fide human creation. One is to look at the structure and style of the writing; the other is to look at the words used. This blog examines both kinds of tells, starting with the tell words.

All of the words used here are “real,” written by Ralph Losey, whereas all of the images are “fake,” prompted into existence by Ralph using various image generation tools, including his own GPT4: Visual Muse: illustrating concepts with style. This blog is a continuation, a part three, of two other recent blogs, Stochastic Parrots: the hidden bias of large language model AI, and Navigating the High Seas of AI: Ethical Dilemmas in the Age of Stochastic Parrots.

The ‘Tell Words’ favored by Stochastic Parrots

We begin with an introductory video of a Pirate who looks sort of like Ralph talking about AI tell words. The pirate, of course, is fake, created by Ralph Losey, but all of the words were personally written by him.

The ‘Tell Words’ favored by Stochastic Parrots, as told by a Pirate created by Ralph Losey. Just click to watch, Matey.

Transcript of the Pirate Video on AI Tell Words

Ahoy Mates!

Did a real person write my pirate script? Or was it written by an AI? A Stochastic Parrot on me shoulder? Aye, Matey, it happens all the time these days. Who is telling who what to say? Blimey! How can you tell real writing from fake? Arr, matey, listen up. I’ll give ye some tips on how to tell the difference!

Here are the top “tell words” favored by our phony feathered friends. It’s not foolproof, as even I have used these cliché words that AI seems to love so much. Ye be a teacher or supervisor concerned about AI plagiarism? Then you might save this list. These damn AIs, Stochastic Parrots they be called, often use these words incorrectly, or too much. Here they be, in rough order of parrot popularity.

Synergy, that be the worst of all, followed by Blockchain, which is often used inappropriately. Here are some more beauties to look out for: Leverage. Innovative. Disruptive or Disrupt. AI-driven. Pivot. Oh, I hate that one!

Here are more parrot nasties. Scale. Agile. Think outside the box. Paradigm shift. Bandwidth. Deep dive. Shiver me timbers! Are you starting to feel sick from all the vague cliches?

Aye. But there be more, many more. They all need to go to Davy Jones’ Locker: Ecosystem. Due diligence. Empower. Holistic. Ah, that was once a good word.

Blimey! Here’s a really bad one. Game-changer. This makes me heave ho! Follow the link below to hear the rest!

List of “Tell Words” Indicative of AI Writing

Thanks again to the “No More Delve” GPT for helping me with this. I recommend this program. These are roughly ranked in order of misuse. This list does not purport to be complete, nor is it based on scientific studies.

  1. Synergy – Often used to describe the potential benefits of combining efforts or entities, but frequently seen as vague.
  2. Leverage – Intended to convey using something to its maximum advantage, but often considered jargon when overused.
  3. Innovative – While it’s meant to describe something novel or original, its overuse has dulled its impact.
  4. Disruptive – Used to describe products or services that radically change an industry or technology, but now seen as clichéd.
  5. Blockchain – Specific to technology but has been overused to the point of becoming a buzzword even in irrelevant contexts.
  6. AI-driven – Meant to highlight the use of artificial intelligence, but often used without clear relevance to actual AI capabilities.
  7. Pivot – Originally meant to describe a significant strategy change, now often used for any minor adjustment.
  8. Scale – In business, it’s about growth, but its frequent use has made it a buzzword.
  9. Agile – A specific project management method that’s now broadly used to describe any flexible approach, diluting its meaning.
  10. Think outside the box – Intended to encourage creative thinking, it has become a cliché itself.
  11. Paradigm shift – Used to signify a fundamental change in approach or underlying assumptions, but now often seen as pretentious.
  12. Bandwidth – Borrowed from technology to describe personal or team capacity, its metaphorical use is now considered clichéd.
  13. Deep dive – Meant to indicate a thorough exploration, but often used unnecessarily.
  14. Ecosystem – In technology, refers to a complex network or interconnected system, but often used vaguely.
  15. Due diligence – Critical in legal contexts but used broadly and sometimes inaccurately in business.
  16. Blockchain – Bears repeating due to its pervasive use beyond relevant contexts.
  17. Empower – Intended to convey delegation or giving power to others, it’s now seen as an empty buzzword.
  18. Holistic – Meant to indicate consideration of the whole instead of just parts, but often used vaguely.
  19. Game-changer – Used to describe something that significantly alters the current scenario, but now seen as hyperbolic.
  20. Touch base – Intended as a casual way to say “let’s communicate,” but often viewed as unnecessarily jargony.
  21. Delve – Often overused to suggest a deep exploration, diminishing its impact.
  22. Journey – Used metaphorically to describe processes or experiences, becoming a cliché.
  23. Supercharge – Tends to overpromise on the impact of strategies or tools.
  24. Embrace – Frequently employed to suggest acceptance or adoption, often without specificity.
  25. Burning question – A dramatic way to highlight an issue, but overuse dilutes its urgency.
  26. Unlock – Commonly used to imply revealing or unleashing potential, becoming worn out.
  27. Roadmap – Overused in business and technology to describe plans or strategies, losing its originality.
  28. Uplevel – Buzzword suggesting improvement or upgrade, often vague.
  29. Future-proof – Used to describe strategies or technologies, but often without clear methodology.
  30. Revolutionize – Promises transformative change, but overuse has made it less meaningful.
  31. Navigate – Frequently used to describe maneuvering through challenges, becoming clichéd.
  32. Harness – Suggests utilizing resources or forces, but overused to the point of vagueness.
  33. Transform – A catch-all term for change, its impact has been diluted through overuse.
  34. Drives – Often used to denote motivation or causation, but has become a buzzword.
  35. Realm – Used to describe fields or areas of interest, but has grown to be seen as pretentious.
  36. Vibrant – A go-to adjective for lively or bright descriptions, now seen as overused.
  37. Innovation – Once meaningful, now a generic term for anything new or updated.
  38. Foster – Commonly used for encouraging development, but its impact is lessened through overuse.
  39. Elevate – Used to suggest improvement or enhancement, often without clear context.
  40. In summary – Overused transition that can be seen as unnecessary filler.
  41. In conclusion – Another filler transition that may unnecessarily signal the end of a discussion.
  42. Testament – Often used to prove or demonstrate, but has become clichéd.
  43. Unleash – Implies releasing potential or power, but overuse has weakened its effect.
  44. Trenches – Metaphorically used to describe deep involvement, now seen as overdone.
  45. Distilled – Suggests purification or simplification, but often used vaguely.
  46. Spearhead – Used to denote leadership or initiative, but has become buzzwordy.
  47. Revolution – Promises dramatic change, but is overused and often hyperbolic.
  48. Landscape – Used to describe the overview of a field or area, but now feels worn out.
  49. Imagine this – An attempt to draw the reader in, but can feel contrived.
  50. Master – Suggests a high level of skill or understanding, but often used imprecisely.
  51. Treasure trove – A clichéd way to describe a rich source or collection.
  52. Masterclass – Intended to denote top-tier instruction, but has become a marketing cliché.
  53. Optimize – Common in business and tech to describe making things as effective as possible, now overused.
  54. Pioneering – Meant to convey innovation or trailblazing, but diluted by frequent use.
  55. Groundbreaking – Similar to “pioneering,” its overuse has lessened its impact.
  56. Cutting-edge – Used to describe the forefront of technology or ideas, now a cliché.
  57. Impactful – Intended to denote significant effect or influence, but overuse has rendered it vague.
  58. Thought leader – Aimed to describe influential individuals, but has become a self-applied and diluted term.
  59. Value-add – Used to highlight additional benefits, but has become a buzzword with diluted meaning.
  60. Big data – Meant to describe vast data sets that can reveal patterns, trends, and associations, but now often used as a buzzword irrespective of the scale or complexity of the data analysis.
  61. Thought leadership – Intended to denote influential and innovative ideas, but overuse has made it a nebulous term often devoid of evidence of leading or innovative thinking. (By the way, if you see “thought leader” on a LinkedIn profile, better run!)
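For teachers or editors who want a quick first pass, a list like the one above can be turned into a simple scanner. The sketch below is my own illustration in Python, using an illustrative subset of the words above; the word list, function name, and sample text are assumptions for the example, not part of any real detection product, and a high count is only an indicator, never proof:

```python
import re
from collections import Counter

# Illustrative subset of the tell words listed above; extend as needed.
TELL_WORDS = [
    "synergy", "leverage", "innovative", "disruptive", "blockchain",
    "paradigm shift", "deep dive", "game-changer", "delve", "journey",
    "unlock", "roadmap", "revolutionize", "harness", "elevate",
    "testament", "landscape", "cutting-edge", "groundbreaking",
]

def tell_word_report(text: str) -> Counter:
    """Count whole-word occurrences of each tell word or phrase."""
    lowered = text.lower()
    counts = Counter()
    for term in TELL_WORDS:
        # \b word boundaries so partial matches inside other words don't count
        hits = re.findall(r"\b" + re.escape(term) + r"\b", lowered)
        if hits:
            counts[term] = len(hits)
    return counts

sample = ("Let's delve into this game-changer: our innovative, "
          "cutting-edge roadmap will revolutionize the landscape.")
print(tell_word_report(sample).most_common())
```

Because even careful human writers use these words, any such scanner should be read as a rough smell test, in keeping with the caveat that the list is neither complete nor scientific.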

Looking Beyond the AI Favored Words to AI Typical Writing Styles

  1. Generalized Statements: AI-generated content often leans on generalized or vague statements rather than specific, detailed examples. When you read an AI-written article, notice the details, or lack thereof. Fake LLM writing is often generalized and overly formulaic. It strings many words together in an appearance of learned comprehension, but in actuality says little. I call it more fluff than substance. In other words, it talks like a typical politician, with lots of words but little meaning. This gets even worse when the AI does not have access to up-to-date information, or is generating content on a topic on which its data is limited, like lots of “inside baseball” talk. In my field, that includes the kinds of things that lawyers and judges privately say to each other, but are seldom, if ever, written down, much less published. Just go to a bar at a Bar convention. This limited detailed knowledge makes it easy for human experts in a field to detect fake writing in their own area, but outside their field, not so much.
  2. Neutral and Diplomatic Language: AI often defaults to very neutral and diplomatic language, sometimes excessively so, in an effort to avoid making controversial or unsubstantiated claims. “WTF” is not a phrase, or even initials, they are likely to use. The software manufacturers try to filter out the profanities found in many parts of the internet, which is typically a good thing. Still, the result is often an unnaturally squeaky-clean language and style that makes friggen fake language easy to spot.
  3. Excessive Politeness: Especially in responses or interactive content, LLMs use phrases that seem overly polite or formal, such as frequent use of “Thank you for your question,” or “I’m sorry, but I’m not able to.” Miss Manners in writing is a dead giveaway. Plus, it’s so damn annoying to real people, thank you very much!
  4. No Real Humor or Wit: The kind of snarky, subtle, almost funny remarks that permeate my writings seem to be beyond the grasp of AI. Much like the AI android “Data” in Star Trek, LLMs just don’t grok humor and their jokes usually suck.
  5. Lack of Emotion: Kind of obvious, but robot writing often has a tell of being robotic, overly structured, intellectual and emotionless. Personality and emotion seem irrelevant to these writing algorithms, although we are seeing new types of LLMs that specialize in emotion, so be careful, they can be charming and seductive too. See: Code of Ethics for “Empathetic” Generative AI.
  6. Too Perfect Spelling: Real humans make typographical errors, and, even with spell checkers, there are often a few mistakes that slip by. My blog is a good example of this. A typo in a text is a good indicator that it was human-written. Of course, this again is just a tell. We writers always strive to be perfect, and smart computers can help us with that.
  7. Lack of Personal Experience: AI-generated texts often lack personal anecdotes, experiences, or strong opinionated statements, unless specifically programmed to simulate such content. This reminds me of lectures I have sat through by self-proclaimed e-discovery experts who have never personally done a document review. I could go on and on with examples, if I wanted, because I have a lifetime of experience and my learning is hands-on, not just academic. Remember, no AI has personal experience of anything. It is all second-hand book learning, albeit from millions of books.
  8. Lack of Opinions: AIs are often trained by their makers not to be opinionated. After all, opinions might possibly offend someone. Can’t have that. That is one reason AI writing is often bland, which is another tell. Real humans have opinions, lots of them, many wrong. But that’s one reason we are such a charming species and our writing is so much more enjoyable to read. AI talk is not only general and vague, it is often stodgy, overly polite and politically correct. Also, try getting an AI to give you its legal opinion. Thank goodness for lawyers like me, they refuse and say instead to speak to a real lawyer. There are, of course, many ways to trick them into giving you a legal opinion anyway, but that’s another story, one that you will have to retain me to tell you.
  9. AI Avoids Slang and Uncommon Idioms: Since GPTs are trained on public data, they use the most common words, the ones they have literally read a billion times. I may be tilting at windmills, but in my experience GPTs are as blind as a bat to many idioms, phrases and words. We are talking about speech that is too regional, subcultural, seldom used, or still too new, the latest vocab. Unfortunately, I’ve checked, matey, and GPTs do speak fluent pirate. Arr, there be many of us pirates about.
  10. Repetitive language: AI written content may repeat words, phrases, or sentences. Yeah man, like over and over. In fake writings you may also see the same sentence structure repeated in different paragraphs. They repeat themselves and don’t seem to care, over and over. That is one reason fake AI talk can be so boring. I mean, they just keep saying things to death. Enough already!
  11. Lack of creativity: AI writing is often too predictable. Well, duh, of course. LLM intelligence does come from predicting the next word, you know. The Top_p or creativity settings may be too low. Again, the result is sooo boring speech. Humans are typically more enjoyable to read. That takes us back to know-it-all Data on Star Trek, who does not understand humor. Subtle humor is just not something they grok. Maybe someday.
  12. Error Patterns in Complex Constructs: AI may struggle with complex language constructs or highly nuanced topics, such as law (which is one reason they wisely refrain from giving out legal opinions). This mental challenge when dealing with complex ideas will, in my opinion, be corrected soon. But in the meantime this limitation in intelligence leads to errors in topic relevance and basic coherence. The LLMs now often may sound like a 1L trying to explain the elements of contract when cold-called in Contracts One, or like a newbie lawyer trying to argue a subject to a judge where they have only read the Gilbert’s. Note this last run-on sentence could not possibly have been written by an LLM.
  13. Overuse of Transitional Phrases: AI tends to overuse transitional phrases such as “Furthermore,” “Moreover,” “In addition,” and “On the other hand,” in an attempt to create cohesion. They seem to lack creativity in that regard, or fear simply doing without transitions. In any event, they often screw up transitions.
  14. Hedging Language: AI often uses hedging language like “It might be the case,” “It is often thought,” or “One could argue,” to avoid making definitive statements that could be incorrect. Whether this is appropriate or not all depends. Just ask any lawyer. We and our insurers frigging love qualifiers.
  15. Synonym Swapping: To avoid repetition, AI might use synonyms excessively or in slightly unusual contexts. This can lead to awkward phrasing or slightly off usage of certain terms. Comes from their being so damned repetitive and too general. The result is often the use of unnecessary “fancy words,” typical of a linguistic show-off.
  16. Repetitive Qualifiers: AI might use repetitive qualifiers like “very,” “extremely,” “significantly,” more than a typical human writer would, to emphasize points. I am very guilty of that myself.
  17. Standardized Introductions and Conclusions: AI-generated content might start with very formulaic introductions and end with standardized conclusions, often summarizing content in a predictable manner (e.g., “In conclusion,” followed by a restatement of key points). The fondness for opening and closing statements is much like a lawyer at trial or in legal memoranda, but AI does a poor job at it. I always end my blog with a conclusion, and I like to think that most of my blogs do not read anything like fake talk by an LLM. AI writing with catchy, not merely formalistic, conclusions will improve in the future, of that I have no doubt. But in the meantime, this is yet another trait that is a dead giveaway.
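Several of the style tells above, such as repetitive transitional phrases, hedging language, and repeated sentence structure, lend themselves to crude counting. The Python sketch below is my own toy illustration under stated assumptions: the phrase lists and function names are mine, the counts are indicators only, and real stylometric detection would need far more than this:

```python
import re
from collections import Counter

# Phrases drawn from the style tells discussed above (illustrative, not exhaustive).
TRANSITIONS = ["furthermore", "moreover", "in addition", "on the other hand",
               "in summary", "in conclusion"]
HEDGES = ["it might be the case", "it is often thought", "one could argue"]

def style_tells(text: str) -> dict:
    """Crude counts of a few stylistic tells; indicators only, never proof."""
    lowered = text.lower()
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    # How often sentences open with the same first word (repetitive-structure tell).
    openers = Counter(m.group(0).lower()
                      for s in sentences
                      if (m := re.match(r"\w+", s)))
    return {
        "transitions": sum(lowered.count(t) for t in TRANSITIONS),
        "hedges": sum(lowered.count(h) for h in HEDGES),
        "max_repeated_opener": max(openers.values(), default=0),
    }

sample = ("Moreover, the results are promising. Moreover, one could argue "
          "the approach scales. In conclusion, it might be the case that it works.")
print(style_tells(sample))
```

High numbers from something like this are the textual equivalent of the grain of salt discussed in the conclusion: suggestive, and useful in combination, but never dispositive on their own.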

Conclusion

These tell words and styles must all be taken with a grain of salt. They are only indicators. None are dispositive, but, taken all together, they can help you to determine if a writing is real or fake. All of these indicators should be used as part of a broader assessment of real or fake, rather than a one-and-done acid test. As AI technology evolves, the patterns and tells will likely become less noticeable. For this reason, I predict that oral exams will become more popular. Teachers on all levels will be forced to revert to the medieval guild tradition of oral defense of knowledge and skills, as has always been required of PhD candidates.

The obvious should also be stated here. For most people the greatest tell of all of fake writing is how good it is compared to past efforts, education and experience. A sudden improvement in a student’s writing is a strong tell of plagiarism. Indeed, aside from the few humans who write professionally, most flesh and blood intelligences are relatively poor writers as compared to LLM writers. Look at the history and qualifications of the writer. It may make it obvious that their writing is too good to be true.

Still, any objective observer, no matter how much they may dislike AI, would have to concede that GPTs can already write better than most people; most, but not all. GPTs are still, at best, mere B-grade writers, often far worse, for the reasons listed here. They can be vacuous blowhards, mere parrots, pushing watered-down Gilberts. They can be all style over substance, not to mention inaccurate and sometimes hallucinatory. They are not even close to the best human writers in their fields of special expertise, legal or otherwise. Yes, I am expressing a strong opinion here that might offend some. Don’t like it? Go back to reading the bland crap of stochastic parrots!

Still, despite these protestations, I love these odd birds for their great potential as hybrid tools. They should be used to help us to speak and think better, and make better decisions. But Stochastic Parrots should not dictate what we say or do, especially those of us engaged in critical fields like law and medicine. It is one thing to use LLMs for writing fiction, quite another to make life and death decisions in courts and hospitals. AI should be skillfully used by professionals as a consulting tool to assist, not replace, good human judgement and empathy.

Ralph Losey Copyright 2024 — All Rights Reserved


Navigating the High Seas of AI: Ethical Dilemmas in the Age of Stochastic Parrots

Ralph Losey. Published April 3, 2024.

Large Language Model generative AIs are well-described metaphorically as “stochastic parrots.” In fact, the American Dialect Society selected stochastic parrot as its AI word of the year for 2023, just ahead of the runners-up “ChatGPT, hallucination, LLM and prompt engineer.” These genius stochastic parrots can be of significant value to all legal professionals, even those who don’t like pirates. You may want one on your shoulder soon, or at least in your computer and phone. But, as you embrace them, you should know that these parrots can bite. You should be aware of the bias and fairness problems inherent in these new technical systems.

The ethical issues were raised in my last blog and video, Stochastic Parrots: the hidden bias of large language model AI. In the video blog an avatar, which looks something like me with a parrot on his shoulder, quoted the famous article on LLM AI bias, and briefly discussed how the prejudices are baked into the training data: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21), by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell. In this follow-up blog I dig a little deeper into the article and the controversies surrounding it.

Article Co-Author Timnit Gebru

First of all, it is interesting to note the internet rumor, based on a few tweets, concerning one of the lead authors of the Stochastic Parrots article, Timnit Gebru. She was a well-known leader of Google’s ethical AI team at the time she co-wrote it. She was allegedly forced to leave Google because its upper management didn’t like it. See: Karen Hao, We read the paper that forced Timnit Gebru out of Google (MIT Technology Review, 12/04/2020); Shirin Ghaffary, The controversy behind a star Google AI researcher’s departure (Vox, 12/09/20). According to Karen Hao’s article, more than 1,400 Google staff members and 1,900 other supporters signed a letter of protest about the alleged firing of Timnit Gebru as an act of research censorship. The rumor is that Google tried to stop the publication of the article, but obviously the article was in fact published on March 1, 2021.

According to the MIT Technology Review article, Google did not like all four points of criticism of LLMs that were made in the Parrot article:

  1. Environmental and financial costs. Pertaining to the need for vast amounts of computing to create the LLMs, and the energy costs, the carbon footprint.
  2. Massive data, inscrutable models. The training data mainly comes from the internet and so contains racist, sexist, and otherwise abusive language. Moreover, the vast amount of data used makes the LLMs hard to audit and eliminate embedded biases.
  3. Research opportunity costs. Basically complaining that too much money was spent on LLMs, and not enough on other types of AI. Note this complaint was made before the unexpected LLM breakthroughs in 2022 and 2023.
  4. Illusions of meaning. In the words of Karen Hao, the Parrot article complained that the problem with LLM models is that they are “so good at mimicking real human language, it’s easy to use them to fool people.”

Moreover, as the Hao article in the MIT Technology Review points out, Google’s then head of AI, well-known scientist, Jeff Dean, claimed that the research behind the article “didn’t meet our bar” and “ignored too much relevant research.” Specifically, he said it didn’t mention more recent work on how to make large language models more energy efficient and mitigate problems of bias. Maybe they didn’t know?

Criticisms of the Stochastic Parrot Article

The main article I found criticizing Stochastic Parrots also has a weird name: “The Slodderwetenschap (Sloppy Science) of Stochastic Parrots – A Plea for Science to NOT take the Route Advocated by Gebru and Bender” (2021). The author, Michael Lissack, challenges the ethical “woke” stance of the original “Parrot Paper,” and suggests a reevaluation of its argumentation. It should be noted that Gebru has accused Lissack of stalking her and colleagues. See: Claire Goforth, Men in tech are harassing Black female computer scientist after her Google ouster (Daily Dot, 2/5/21) (Michael Lissack has tweeted about Timnit Gebru thousands of times). By the way, “slodderwetenschap” is Dutch for sloppy science.

Here are the three criticisms that Lissack makes of Stochastic Parrots:

What is missing in the Parrot Paper are three critical elements: 1) acknowledgment that it is a position paper/advocacy piece rather than research, 2) explicit articulation of the critical presuppositions, and 3) explicit consideration of cost/benefit trade-offs rather than a mere recitation of potential “harms” as if benefits did not matter. To leave out these three elements is not good practice for either science or research.

Lissack, The Slodderwetenschap (Sloppy Science) of Stochastic Parrots, abstract.

Others have spoken in favor of Lissack’s criticisms of the Stochastic Parrots, including most notably, Pedro Domingos. Supra, Men in Tech (includes collection of Domingos’ tweets).

It should also be noted that Lissack’s article includes several positive comments about the Stochastic Parrots work:

The very topic of the Parrot Paper is an ethics question: does the current focus on “language models” of an ever-increasing size in the AI/NLP community need a grounding against potential questions of harm, unintended consequences, and “is bigger really better?” The authors thereby raise important issues that the community itself might use as a basis for self-examination. To the extent that the authors of the Parrot Paper succeed in getting the community to pay more attention to these issues, they will be performing a public service. . . .

The Parrot Paper correctly identifies an “elephant in the room” for the MI/ML/AI/NLP community: the very basis by which these large language models are created and implemented can be seen as multilayer neural network-based black boxes – the input is observable, the programming algorithm readable, the output observable, but HOW the algorithm inside that black box produces the output is not articulable in terms humans can comprehend. [10] What we know is some form of “it works.” The Parrot Paper authors prompt readers to examine what is meant by “it works.” Again, a valuable public service is being performed by surfacing that question. . . .

Most importantly, in my view, the Parrot Paper authors remind readers that potential harm lies in both the careless use/abuse of these language models and in the manner by which the outputs of those models are presented to and perceived by the general public. They quote Prabhu and Birhane echoing Ruha Benjamin: “Feeding AI systems on the world’s beauty, ugliness, and cruelty, but expecting it to reflect only the beauty is a fantasy.” [PP lines 565-567, 11, 12] The danger they cite is quite real. When “users” are unaware of the limitations of the models and their outputs, it is all too easy to confuse seeming coherence and exactness for verisimilitude. Indeed, Dr. Gebru first came to public attention highlighting similar dangers with respect to facial recognition software (a danger which remains, unfortunately, with us [13, 14]).

The Slodderwetenschap (Sloppy Science) of Stochastic Parrots at pages 2-3.

Lissack’s main objection appears to be the argumentative nature of what the article presents as science, and the many subjective opinions underlying the Parrot article. He argues that the paper itself is “ethically flawed.”

Talking Stochastic Parrots Have No Understanding

Artificial intelligences like ChatGPT4 may sound like they know what they are talking about, but they don’t. There is no understanding at all in the human sense; it is all just probability calculations of coherent speech. No self awareness, no sense of space and time, no feelings, no senses (yet) and no intuition – just math.
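The “just math” point can be made concrete with a toy model of my own. The sketch below (an illustration, emphatically not how ChatGPT actually works internally) trains a bigram model: it merely counts which word follows which, then “predicts” by picking the most frequent follower. Real LLMs replace the counting with neural networks trained on vast corpora, but the principle of next-word prediction from learned statistics, with no understanding anywhere in the loop, is the same:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: str) -> dict:
    """Count which word follows which; these counts are the model's entire 'knowledge'."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for word, nxt in zip(words, words[1:]):
        following[word][nxt] += 1
    return following

def predict_next(model: dict, word: str) -> str:
    """Return the statistically most likely next word; no meaning involved."""
    counts = model.get(word.lower())
    if not counts:
        return "<unknown>"
    return counts.most_common(1)[0][0]

corpus = ("polly wants a cracker . polly wants a nap . "
          "polly wants a cracker now")
model = train_bigrams(corpus)
print(predict_next(model, "wants"))   # follows the counts, nothing more
print(predict_next(model, "a"))       # "cracker" wins 2 to 1 over "nap"
```

The model will confidently complete “polly wants a …” with “cracker” because that sequence was most frequent in its training text, not because it knows anything about parrots or crackers. Scale that idea up by many orders of magnitude and you have the essence of the stochastic parrot.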

It is important to make a clear distinction between human cognitive processes, which are deeply linked to and arise out of bodily experiences and the external world, and computational models that lack a real-world, experiential basis. As lawyers we must recognize the limits of mere machine tools. We cannot over-delegate to them just because they sound good, especially when acting as legal counselors, judges, and mediators. See e.g. Yann LeCun and Browning, AI And The Limits Of Language (Noema, 8/23/22) (“An artificial intelligence system trained on words and sentences alone will never approximate human understanding.”); Valmeekam, et al., On the Planning Abilities of Large Language Models (arXiv, 2/13/23) (poor planning capabilities); Dissociating language and thought in large language models (arXiv, 3/23/24) (poor at functional competence tasks).

Getting back to the metaphor, a parrot may not understand the words it speaks, but it at least has some self-awareness and consciousness. An AI has none. As one thoughtful Canadian writer put it:

Though the output of a chatbot may appear meaningful, that meaning exists solely in the mind of the human who reads or hears that output, and not in the artificial mind that stitched the words together. If the AI Industrial Complex deploys “counterfeit people” who pass as real people, we shouldn’t expect peace and love and understanding. When a chatbot tries to convince us that it really cares about our faulty new microwave or about the time we are waiting on hold for answers, we should not be fooled.

Bart Hawkins Kreps, Beware of WEIRD Stochastic Parrots (Resilience, 2/15/24).

For interesting background, see The New Yorker article of 11/15/2023, by Angie Wang, Is My Toddler a Stochastic Parrot? Also see: Scientific research article on the lack of diversity in internet model training, Which Humans? by Mohammad Atari, et al. (arXiv, 9/23/23) (“Technical reports often compare LLMs’ outputs with “human” performance on various tests. Here, we ask, “Which humans?”“).

I also suggest you look at the often-cited technical blog post by the great contemporary mathematician Stephen Wolfram, What Is ChatGPT Doing … and Why Does It Work? As Wolfram states in the conclusion, ChatGPT is “just saying things that ‘sound right’ based on what things ‘sounded like’ in its training material.” Yes, it sounds good, but nobody’s home, no real meaning. That is ultimately why the fears of AI replacing human employment are way overblown. It is also why LLM-based plagiarism is usually easy to recognize, especially by experts in the field under discussion. The Chatbot writing is obvious by its style-over-substance language, which is high on fluff and stereotypical language, and its overuse of certain “tell” words. More on this in my next blog on how to spot stochastic parrots.

Personally, I’m already sick of the bland, low-meaning, fluffy news and analysis writing now flooding the internet, including legal writing. It is almost as bad as ChatGPT writing for political propaganda and sales. It is not only biased and riddled with errors, it is mediocre and boring.

Conclusion

Everyone agrees that LLM AIs will, if left unchecked, reproduce biases and inaccuracies contained in the original training data. This inevitably leads to the generation of false information – to skewed output to prompts – and that in turn can lead to poor human decisions made in reliance on biased output. This can be disastrous in sensitive applications like law and medicine.

Everyone also agrees that this problem requires AI software manufacturers to model designs to curb these biases, and to monitor and test to ensure the effectiveness and trustworthiness of LLMs.

The disagreement seems to be in the evaluation of the severity of the problem, and the priority that should be given to its mitigation. There is also disagreement as to the degree of success made to date in correcting this problem, and whether the problem can even be fixed at all.

My view is that these issues can be significantly reduced, but I doubt that LLMs will ever be perfect and entirely free of all bias, even though they may become better than the average human. See e.g. New Study Shows AIs are Genuinely Nicer than Most People – ‘More Human Than Human’.

Moreover, I believe that users of LLMs, especially lawyers, judges and other legal professionals, can be sensitized to these bias issues. They can learn to recognize previously unconscious bias in the data and in themselves. This sensitivity can then help AI users to recognize and overcome these challenges, and to realize when the responses given by an AI are wrong and must be corrected.

The language of ChatGPT may correctly echo what most people in the past said, but that does not, in itself, make it the right answer for today. As lawyers we need the true, correct and bias-free answers, the just and fair answers, not the most popular answers of the past. We have an ethical duty of competence to double-check the mindless speech of our stochastic parrots. We should ask why Polly always wants a cracker.

Ralph Losey Copyright 2024 – All Rights Reserved


AI Copyright and the Litigious Life of Harmenszoon van Rijn Rembrandt: as explained by a talking portrait of a robot

March 28, 2024

Ralph Losey. Published March 28, 2024.

Video, AI image in style of Rembrandt, research and words by Ralph Losey, an admirer of Rembrandt who is sympathetic to his litigious life.

Here is the transcript of the five minute talk by the robot portrait. (⏱ = 0.5 second pause in speech)

Hi,

I am a robot image created by Ralph Losey, roughly in the style of Rembrandt, one of his favorite artists. I think I also look like the work of another Dutch Master, Vermeer.    My headphone is kind of like a big pearl earring?

Ralph used a variety of digital tools to make me, primarily an AI tool called Midjourney, but several others too. Ralph says they are like paint brushes and, like a typical lawyer, claims copyright. It remains to be seen whether courts will agree with that position.

Ralph has also created an AI tool of his own, a GPT designed to interface with the Dall-E software of OpenAI. He calls his software Visual Muse, and even claims copyright to that too!

I wonder what Rembrandt would say about all of this? Unfortunately, he knew lawyers and litigation all too well.  

Rembrandt Harmenszoon van Rijn lived from 1606 to 1669. He was a multimodal master of all the visual media of his day: painting, printmaking and drawing. He was also well known for a variety of themes and styles, including his many selfies.

Rembrandt enjoyed early success in painting and in marriage to Saskia, the daughter of a successful Dutch lawyer. He and Saskia lived extravagantly at first, and he over-spent on a big house and many purchases of art. Tragically, their first three children died shortly after birth. The fourth child survived, but Saskia died within a year from tuberculosis. Rembrandt spent the rest of his life with fame and beautiful women, but no fortune. He was broke; worse than that, he was hounded by creditors and their lawyers.

Rembrandt became embroiled in a never-ending series of lawsuits a few years after his wife died. It all started with his seduction of the young woman employed in his mansion, Geertje Dircx. She was employed as a wet nurse for the child. I can easily imagine how that affair came about. Ironically, a few years later, Geertje became pregnant and sued Rembrandt for breach of promise of marriage, seeking alimony. She had good lawyers. He paid and agreed to alimony. Geertje later ended up in a special women’s prison anyway, which cost Rembrandt still more money.

Then Rembrandt began a relationship with his 23-year-old maid, Hendrickje Stoffels. His young mistress, Hendrickje, was recognized as the nude in Rembrandt’s painting, Bathsheba at Her Bath. Based on that, the Reformed Church charged the girl with, quote, committing the acts of a whore with Rembrandt the painter. She admitted her guilt and was banned from receiving communion. Nothing happened to Rembrandt.

Still, it was all downhill from there for Rembrandt, financially at least. He had another child with Hendrickje. More expenses, but he never married her. Ultimately Rembrandt filed for a type of voluntary bankruptcy, called a cessio bonorum, to avoid incarceration. Yes, they would jail debtors then for failure to pay, even famous artists like Rembrandt. The bankruptcy just delayed things. When he died in 1669, he had outlived his major creditors, but was still buried in a rented grave. Rented grave? Who knew such a thing even existed?

As a result of philandering and extravagant living, Rembrandt became all too familiar with lawyers, litigation and the protection and secretion of assets. ⏱⏱ His difficult financial and family situation is one cause of his prodigious output of art. He had to keep working to pay his creditors, and his lawyers! By some accounts he created 600 paintings, 400 etchings and 2,000 drawings.

No one would mistake me for a Rembrandt or Vermeer. But I wonder, am I even an original work? Can I be protected? Or can anyone steal me and do with me what they will? ⏱⏱ I certainly hope not. I would rather litigate than live like that! ⏱⏱ Wouldn’t you? ⏱⏱ 

Ralph Losey Copyright 2024 – All Rights Reserved


Stochastic Parrots: the hidden bias of large language model AI

March 25, 2024

Ralph Losey. Published March 25, 2024.

AI video written and directed by Ralph Losey.

Article Underlying the Video. The seminal article on the dangers of relying on stochastic parrots was written in 2021, On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21) by a team of AI experts, Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell. This article arose out of the 2021 Conference of the Association for Computing Machinery (ACM) on Fairness, Accountability, and Transparency (ACM Digital Library).

Transcript of Video

GPTs do not think anything like we do. They just parrot back pre-existing human word patterns with no actual understanding. The words generated by a GPT in response to prompts are sometimes called speech by a stochastic parrot!

According to the Oxford dictionary, Stochastic  is an adjective meaning “randomly determined;  having a random probability distribution or pattern that may be analyzed statistically but may not be predicted precisely.”

Wikipedia explains stochastic is derived from the ancient Greek word, stókhos,  meaning ‘aim or guess’ and today refers to “the property of being well-described by a random probability distribution.” 

Wikipedia also explains the meaning of a stochastic parrot.

In machine learning, the term stochastic parrot is a metaphor to describe the theory that large language models, though able to generate plausible language, do not understand the meaning of the language they process.
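The “stochastic” part of the metaphor can be illustrated with a toy sketch of my own (not taken from the cited article, and with made-up probabilities): a language model assigns a probability to each possible next word and then samples one at random, producing plausible text with no understanding behind the choice.

```python
import random

# Hypothetical next-word probabilities a language model might assign
# after the prompt "Polly wants a". The values are invented for
# illustration and sum to 1.
next_word_probs = {"cracker": 0.80, "treat": 0.15, "lawyer": 0.05}

def sample_next_word(probs):
    """Pick the next word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# Each call may return a different word: statistically plausible,
# but chosen with no comprehension of what the words mean.
print(sample_next_word(next_word_probs))
```

Run it a few times and the outputs vary, but “cracker” dominates, which is exactly the point: the parrot repeats whatever pattern was most common in its training material.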

The stochastic parrot characteristics are a source of concern when it comes to the fairness and bias of GPT speech.  That is because the words the GPTs are trained on, that they parrot back to you in clever fashion, come primarily from the internet. We all know how  messy and biased that source is.  

In the words of one scholar, Ruha Benjamin: “Feeding AI systems on the world’s beauty, ugliness, and cruelty, but expecting it to reflect only the beauty is a fantasy.”

Keep both of your ears wide open.  Talk to the AI parrot on your shoulder, for sure, but keep your other ear alert. It is dangerous to only listen to a stochastic parrot, no matter how smart it may seem.

The subtle biases of GPTs can be an even greater danger than the more obvious problems of AI errors and hallucinations. We need to improve the diversity of the underlying training data, the curation of that data, and the Reinforcement Learning from Human Feedback (RLHF). It is not enough to just keep adding more and more data, as some contend.

This view was forcefully argued in a 2021 article I recommend: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21), by AI ethics experts Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell.

We need to do everything we can to make sure that AI is a tool for good, for fairness and justice,  not a tool for dictators, lies and oppression. 

Let’s keep the parrot’s advice safe and effective. For like it, or not, this parrot will be on our shoulders for many years to come! Don’t let it fool you! There’s more to life than the crackers that Polly wants!

Ralph Losey Copyright 2024. — All Rights Reserved

