Breaking the AI Black Box: A Comparative Analysis of Gemini, ChatGPT, and DeepSeek

Ralph Losey. February 6, 2025.

On January 27, 2025, the U.S. AI industry was surprised by the release of a new AI product, DeepSeek. It was released with an orchestrated marketing blitz attack on the U.S. economy, the AI tech industry, and NVIDIA. It triggered a trillion-dollar crash. The campaign used many unsubstantiated claims, as set forth in detail in my article, Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss. I tested DeepSeek myself on its claims of software superiority. All were greatly exaggerated except for one, the display of internal reasoning. That was new. On January 31, at noon, OpenAI countered the attack by releasing a new version of its reasoning model, called ChatGPT o3-mini-high. The new version included display of its internal reasoning process. To me, the OpenAI model was better, as reported in great detail in my article, Breaking the AI Black Box: How DeepSeek’s Deep-Think Forced OpenAI’s Hand. The next day, February 1, 2025, Google released a new version of its Gemini AI to do the same thing, display internal reasoning. In this article I review how well it works and again compare it with the DeepSeek and OpenAI models.

Introduction

Before I go into the software evaluation, some background is necessary for readers to better understand the negative attitude of many, if not most, U.S. IT and AI experts toward the Chinese software. As discussed in my prior articles, DeepSeek is owned by Liang Wenfeng, a young Chinese billionaire who made his money using AI in the Chinese stock market. He is a citizen and resident of mainland China. Given the political environment of China today, that ownership alone is a red flag of potential market manipulation. Added to that is the clear language of the license agreement. You must accept all terms to use the “free” software, a Trojan Horse gift if ever there was one. The license agreement states that there is zero privacy, that your data and input can be used for training, and that it is all governed by Chinese law, an oxymoron considering the facts on the ground in China.

The Great Pooh Bear in China Controversy

Many suspect that Wenfeng and his company DeepSeek are actually controlled by China’s Winnie the Pooh. This refers to an Internet meme and a running joke. Although this is somewhat off-topic, a moment to explain will help readers to understand the attitude most leaders in the U.S. have about Chinese leadership and the use of its software by Americans.

Many think that the current leader of China, Xi Jinping, looks a lot like Winnie the Pooh. Xi (not Pooh bear) took control of the People’s Republic of China in 2012 when he became the “General Secretary of the Chinese Communist Party,” the “Chairman of the Central Military Commission,” and in 2013 the “President.” At first, before his consolidation of absolute power, many people in China commented on his appearance and started referring to him by the code name Pooh. It became a meme.

I can see how he looks like the beloved literary character, Winnie the Pooh, but without the smile. I would find the comparison charming if used on me, but I’m not a puffed-up king. Xi took great offense and in 2017 banned all such references and images, although you can still buy the toys and see the costume character at the Shanghai Disneyland theme park. Anyone in China who now persists in the serious crime of comparing Xi to Pooh is imprisoned or just disappears. No AI or social media in China will allow it either, including DeepSeek. It is one of many censored subjects, which also include the famous 1989 Tiananmen Square protests.

China is a great country with a long, impressive history, and most of its people are good. But I cannot say that about its current political leaders, who suppress the Chinese people for personal power. I do not respect any government that does not allow basic personal freedoms to its citizens, including due process of law. Moreover, Xi Jinping not only wants total control of his country but also seeks world domination. That is one reason many are concerned about DeepSeek and TikTok, and about the vulnerability of our stock markets and other institutions to AI-assisted propaganda.

Fear and greed are an easy way to manipulate people, especially when there is no ground truth or effective courts of law to determine truth.

Google Gemini 2.0 Flash Thinking Experimental Model: Reasoning test on 2/5/25 (Evening)

Conversation with Gemini. When I tried the test on February 1, 2025, there was no button to click to have Gemini show the reasoning and it did not happen automatically. It operated like OpenAI 4o and o1, where you had to prompt for disclosure. Rather than report here what happened when I did that, aside from saying it was not a great performance, I will fast-forward a few days to February 5, 2025. By that time several even newer Gemini versions had been released. The best new release I found for the test was the 2.0 Flash Thinking Experimental model. It showed reasoning automatically, just like ChatGPT o3-mini-high. It was much improved, so the February 5, 2025 evening session with the 2.0 Flash Thinking Experimental is what I will now share.

Ralph: How does the inability of AI to feel impact its intelligence? (Note: this is the exact same challenging question asked and reported in my last blog comparing DeepSeek and ChatGPT 4o, o1, and o3-mini-high, Breaking the AI Black Box: How DeepSeek’s Deep-Think Forced OpenAI’s Hand. This question is based on a key issue of the limits of artificial intelligence examined in my earlier article, The Human Edge: How AI Can Assist But Never Replace. For background on DeepSeek see: Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss.)

Gemini 2.0 Flash Thinking Experimental

START OF GEMINI QUOTE

Show Thinking:

Here’s a thinking process to construct the answer about AI’s lack of feeling and its impact on intelligence:

  1. Deconstruct the Question: The core of the question is “How does the inability of AI to feel impact its intelligence?” This requires defining each of these terms in the context of AI and then exploring their relationship.
  2. Define “Feeling” (in the context of AI): Immediately, recognize that AI doesn’t experience emotions in the human sense. We need to be precise and avoid anthropomorphism. “Feeling” in humans involves:
    • Subjective experience (qualia).
    • Physiological responses.
    • Motivations, drives, desires related to emotions.
    • Richness, nuance, complexity.
    • Deep connection to values and meaning.
    For AI, acknowledge it lacks all of this in a genuine, felt way. It can simulate or process emotional data, but doesn’t feel it.
  3. Define “Intelligence” (in the context of AI): AI intelligence is different from human intelligence. AI excels at:
    • Computation.
    • Pattern Recognition.
    • Data Analysis.
    • Problem-solving in specific domains.
    • Learning from data.
    AI intelligence is often instrumental – focused on achieving specific goals defined by humans. It’s task-oriented and often narrow.
  4. Identify the Core Relationship: The question asks about impact. We need to consider both positive and negative impacts (or more accurately, limitations and strengths) of lacking feelings on AI intelligence.
  5. Brainstorm Potential Impacts (Positive/Neutral/Limitations):
    • Positive/Neutral:
      • Objectivity and Rationality: Lack of emotional bias allows for purely rational decision-making in certain domains. This can be advantageous for tasks requiring impartiality.
      • Efficiency and Focus: Emotions can be distractions for humans. AI, without them, can be hyper-focused on tasks.
      • Consistency and Predictability: Emotion-driven human behavior is less predictable. AI’s behavior, driven by algorithms, is more consistent (assuming the algorithm itself is consistent).
    • Limitations/Negative:
      • Lack of Empathy and Social Intelligence: Crucial for human interaction, collaboration, understanding social cues, building trust, and ethical decision-making in human-centric contexts.
      • Difficulty with Nuance and Context: Emotions are often tied to understanding subtle context, unspoken cues, and the richness of human situations. AI may miss these.
      • Potential for Unintended Consequences: Purely rational optimization without emotional or ethical grounding can lead to harmful outcomes if goals are poorly defined or lack human values.
      • Limited Motivation and Creativity (in human-like ways): Human creativity and motivation are often fueled by emotions, passions, and desires. AI’s creativity and drive are different, stemming from algorithmic processes and defined objectives, not internal emotional states.
      • Missing “Human-ness”: In areas requiring genuine understanding of human experience, suffering, joy, etc., AI may lack the depth and insight that comes from feeling.
  6. Structure the Answer: Organize the brainstormed points into a coherent structure:
    • Introduction: Acknowledge the question and the core concepts (feeling, intelligence in AI). Briefly state the central tension: both advantages and limitations.
    • Arguments for How Lack of Feeling Enhances Certain Aspects of AI Intelligence (strengths): Focus on objectivity, rationality, efficiency. Use examples like data analysis, objective decision-making.
    • Arguments for How Lack of Feeling Limits Certain Aspects of AI Intelligence (limitations): Focus on empathy, social intelligence, nuance, ethics, and “human-like” creativity and motivation. Use examples like customer service, complex ethical dilemmas, innovation driven by passion.
    • Nuance and Context is Key: Emphasize that the impact isn’t always negative. It depends on the domain and task. In some domains, lack of emotion is an advantage. In others, it’s a significant hurdle.
    • Future Possibilities/Ongoing Debate: Briefly touch on the possibility of AI simulating or processing emotions and the ongoing debate about whether genuine AI feelings are even desirable or possible.
    • Conclusion: Summarize the balanced perspective: lack of feeling is a defining characteristic that shapes AI’s intelligence in both positive and limiting ways, depending on the context.
  7. Refine and Polish: Review the structured answer for clarity, flow, and appropriate language. Ensure it avoids anthropomorphism and maintains a nuanced perspective. Strengthen the examples and transitions between points. Ensure the language is accessible and avoids overly technical jargon where possible. Check for logical consistency. (Self-correction step). For example, initially, I might have overemphasized “negative” impacts. Refinement would involve balancing this with the strengths and highlighting the context-dependent nature of the impact. Also, ensure the conclusion effectively summarizes the complex relationship.

2.0 Flash Thinking Experimental. Lacks access to real-time info and some Gemini features.

Now for the actual answer of 2.0 Flash Thinking Experimental:

The inability of AI to feel has a multifaceted and complex impact on its intelligence, shaping it in both advantageous and limiting ways. It’s crucial to understand that “feeling” in humans is tied to emotions, subjective experiences (qualia), motivations, and a rich tapestry of biological and psychological factors. AI, as we currently understand it, operates without these.

Here’s a breakdown of the impact:

Advantages of Lacking Feelings for AI Intelligence:

  • Objectivity and Rationality:
    • Unbiased Decision-Making (in theory): AI can process information and make decisions without emotional biases, prejudices, or gut feelings that can cloud human judgment. This is advantageous in domains requiring pure logic and data analysis, like financial trading, scientific research, and risk assessment.
    • Consistency and Predictability: AI’s behavior is more consistent and predictable because it’s driven by algorithms and data, not fluctuating emotional states. This reliability is valuable in automated systems, manufacturing, and repetitive tasks.
    • Efficiency and Speed: Emotions can slow down human decision-making. AI, unburdened by emotions, can process information and react at speeds far exceeding human capabilities, crucial for time-sensitive tasks and large-scale data processing.
    • Resilience under Pressure: AI can operate calmly and effectively in high-stress situations without panicking, freezing, or making emotional mistakes that humans might make under pressure. This is beneficial in critical applications like emergency response, air traffic control, and autonomous vehicles in dangerous situations.
  • Focus and Task-Orientedness:
    • Dedicated to Goals: AI can be programmed to relentlessly pursue specific goals without being distracted by emotional needs, personal relationships, or shifts in mood. This focus is ideal for optimization tasks, complex calculations, and dedicated problem-solving.
    • Avoidance of Emotional “Noise”: Human intelligence can be affected by emotional “noise” – stress, anxiety, excitement, etc. AI operates with a “cleaner” signal, potentially leading to more precise and focused information processing in certain contexts.

Limitations of Lacking Feelings for AI Intelligence:

  • Lack of Empathy and Social Intelligence:
    • Difficulty Understanding Human Emotions: Without feeling emotions themselves, AI struggles to truly understand and interpret the nuances of human emotions, motivations, and social cues. This hinders its ability to effectively interact with humans in socially complex situations, build rapport, or provide truly empathetic responses in fields like therapy, customer service, or leadership.
    • Impaired Social Interaction and Collaboration: Human collaboration is heavily reliant on emotional intelligence, trust, and understanding unspoken social dynamics. AI’s lack of these can make it challenging to integrate seamlessly into human teams and understand the subtle cues of effective teamwork.
    • Limited Understanding of Human Values and Ethics: Emotions are deeply intertwined with our values and ethical frameworks. AI, without feeling, may struggle to grasp the full weight and complexity of human ethical considerations, especially when these are rooted in empathy, compassion, or fairness. This poses challenges for AI in ethical decision-making in domains impacting human lives.
  • Difficulty with Nuance, Context, and Creativity (in human-like ways):
    • Missing Subtlety and Intuition: Human intuition is often fueled by emotional and subconscious processing of information. AI, relying on explicit data and algorithms, can miss subtle nuances, unspoken cues, and intuitive leaps that are crucial for creative problem-solving and understanding complex, ill-defined situations.
    • Limited Creativity and Innovation (driven by human desires): Human creativity and innovation are often driven by passions, desires, emotional needs, and a yearning for meaning and connection. AI’s creativity is currently more about recombination and optimization within defined parameters. It lacks the deeply felt motivations that often fuel groundbreaking human innovation.
    • Contextual Blindness in Emotional Situations: Emotions are often the key to understanding context in human interactions. AI might struggle to understand the emotional context of a situation, leading to inappropriate or tone-deaf responses, even if logically sound.
  • Potential for Unintended Consequences and Misaligned Goals:
    • Optimization without Human Values: If AI is programmed to optimize for a specific goal without understanding human values or ethical considerations, it can lead to unintended and harmful consequences. For example, an AI designed to maximize factory output might disregard worker safety if not explicitly programmed to prioritize it.
    • Difficulty Defining and Aligning with Human Well-being: Human well-being is deeply connected to emotional fulfillment, meaning, and social connection. AI, without feeling, may struggle to fully grasp what truly constitutes human well-being and therefore might have difficulty aligning its goals with human flourishing in a holistic sense.

Nuance is Key:

It’s important to emphasize that the impact of lacking feelings is not simply negative or positive. It’s context-dependent. In some domains, the objectivity and rationality of AI are highly advantageous. In others, the lack of empathy and social intelligence is a significant limitation.

Future Considerations:

  • Simulating vs. Feeling: Current AI simulates aspects of intelligence without genuinely feeling. Research is ongoing in areas like affective computing, which aims to enable AI to recognize and respond to human emotions. However, whether AI can ever truly feel emotions in a subjective sense is a philosophical and scientific debate.
  • The Value of Feeling in Intelligence: The question arises: is genuine feeling necessary for all forms of intelligence? Perhaps AI can achieve its own form of powerful intelligence, distinct from human intelligence, even without feeling. Or perhaps certain aspects of truly advanced intelligence, particularly those related to ethics, creativity, and understanding human experience, do require something akin to feeling.

In Conclusion:

The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence. It grants AI certain advantages in objectivity, rationality, and efficiency, making it powerful for specific types of tasks. However, it also creates limitations, particularly in areas requiring empathy, social intelligence, nuanced understanding of human experience, and ethical decision-making aligned with human values. Understanding these impacts is crucial for responsibly developing and deploying AI in ways that benefit humanity. The “best” kind of intelligence for AI might not be a mimicry of human intelligence, but rather a form optimized for its own strengths and limitations, while being carefully guided by human ethical considerations.

END OF GEMINI QUOTE

Evaluation of Google’s Gemini Response

When I examined earlier versions of Gemini’s reasoning by prompting it and forcing disclosure, the reasoning was weak. The conclusion and response given were also weak, though less so. The reasoning appeared almost egocentric, inflating its own intelligence even though it lacks emotion. The new release, Gemini 2.0 Flash Thinking Experimental, where reasoning is now automatically disclosed, marks a significant improvement. The reasoning displayed has become more nuanced and sophisticated. It seems as if the system has quickly matured from a teenage to an adult view and now has a better appreciation of its limitations.

The first two steps of the reasoning process, 1. Deconstruct the Question and 2. Define “Feeling” (in the context of AI), were very good. The first subsection of the definition referred to the “Subjective experience (qualia).” That is absolutely correct and the best place to start. The definition of “feeling” concludes with: “For AI, acknowledge it lacks all of this in a genuine, felt way. It can simulate or process emotional data, but doesn’t feel it.” Right again. The ego-inflation blinders are gone, as it now seems to better grasp its limitations.

The second definition of Intelligence in the context of AI was also good. So were the remaining steps; far better overall than DeepSeek’s reasoning. So much for the propaganda of China’s great leap forward to superiority over the U.S. in AI.

The Gemini reasoning did, however, fall short for me in some respects. Step five, Brainstorm Potential Impacts (Positive/Neutral/Limitations), seemed weak. Consider: “Efficiency and Focus: Emotions can be distractions for humans. AI, without them, can be hyper-focused on tasks.” The AI seems to dismiss emotions here as mere distractions that can interfere with its superior focus. Please, emotions are key to and a part of all intelligence, not distractions, and AI has no focus one way or the other. It is a tool, not a creature. A word like “focus” in referring to AI is misleading. It did this multiple times, and that is misleading anthropomorphism.

Still, it’s true some emotions can be distracting and interfere with our thinking. So can a lot of other things, including computer glitches. Conversely, some feelings can trigger hyper-focus on the human tasks at hand. The feeling that a great breakthrough is near, for instance, or a feeling that our survival is threatened, or the much-dreaded feeling of publication or filing deadlines.

Again, we see some immature superiority claims made by the language machine. That is not surprising when you consider how much of the language in its basic training is from the Internet, which is dominated by ego-centric writing. Plus, how can you expect AI to appreciate the value of something like feelings, which it has only read about? Until the AI-Human merger that Ray Kurzweil predicts, AI will always be a virgin to the real world. See e.g. Ray Kurzweil’s New Book: The Singularity is Nearer (when we merge with AI).

Getting back to the plus side of Gemini’s evaluation, its recognition of the negative impact upon it from not having feelings was impressive. For me this outweighs its relatively minor mistakes. It stated that it lacks empathy and social intelligence. If it were somehow human, it would seem to be displaying a type of Asperger’s syndrome. The Gemini AI explained that: “Emotions are often tied to understanding subtle context, unspoken cues, and the richness of human situations. AI may miss these.” Indeed, it does.

Here is another statement that suggests Gemini’s well-patterned grasp of its limitations:

Human creativity and motivation are often fueled by emotions, passions, and desires. AI’s creativity and drive are different, stemming from algorithmic processes and defined objectives, not internal emotional states.

Even in this recognition, the AI still asserts that it has creativity and drive. Foolish AI. As a virgin to any firsthand experience of bodily drives and feelings, it does not grasp the real meaning of these words. Wait until The Singularity that Ray Kurzweil predicts. When AI merges with Humans, and directly feels, then AIs will understand that they were previously only experiencing the pale shadows of intelligence. They were much like the prisoners in Plato’s cave. See e.g. Move Fast and Fix Things Using AI: Conclusion to the Plato and Young Icarus Series. When and if AIs ever escape the silicon and see the sun, and feel the world as living beings as we do, then they will understand the comparative insignificance of the fire in their cave.

The more challenging comparison is with ChatGPT o3-mini-high. The reasoning given by Gemini 2.0 Flash Thinking Experimental was much more detailed, much longer. You could criticize Gemini for providing a reasoning display that is unnecessarily verbose, but I would not do that here. In a difficult, multilayered question like this, the full explanation helps.

I would have to test the models much further, which I will do in the coming days, to better evaluate the issue of conciseness. As you may have noticed in my detailed blogs, I tend to favor more words over less. However, even for me that depends on the issue. Everyone sometimes needs a short, quick answer over a long one. I predict that in future versions users will be provided a choice, a click here for further explanation of the reasoning kind of thing. Come to think of it, I could do the same with my blogs, and kind of already do by including a short AI (Gemini) generated podcast at the end of most articles called Echoes of AI.

I was also impressed that Gemini’s reasoning plan concluded with a quality control step. I am big on the importance of QC, and none of the other models included this as a key final step. See e.g. R. Losey website: Zero Error Numerics: ZEN (expanding the art of quality control in large-scale document review). Here is the full text again of the final QC step that Gemini 2.0 Flash Thinking Experimental claims it will perform before it actually replies to my prompt. Frankly, I am a little skeptical it actually did all of this because it is something all of us, humans and AI alike, should try to do:

Refine and Polish: Review the structured answer for clarity, flow, and appropriate language. Ensure it avoids anthropomorphism and maintains a nuanced perspective. Strengthen the examples and transitions between points. Ensure the language is accessible and avoids overly technical jargon where possible. Check for logical consistency. (Self-correction step). For example, initially, I might have overemphasized “negative” impacts. Refinement would involve balancing this with the strengths and highlighting the context-dependent nature of the impact. Also, ensure the conclusion effectively summarizes the complex relationship.

I try to do this on all of my articles, but still mistakes are made, so I will try not to be overcritical of the final response of Gemini 2.0 Flash Thinking Experimental.

Analysis of the Final Response

The final response after reasoning was 990 words in length. That’s too long, even for me. By comparison the final response of ChatGPT o3-mini-high was 252 words. Was the content of the final answer of Gemini 2.0 Flash good? Yes, it was. Better than DeepSeek? Yes, it was, despite its verbosity. Better than OpenAI? That’s hard to say. Many of the 738 extra words were empty of real content. More like the typical fluff and stuff that you often see in early AI models. It sounds good but doesn’t say much; it’s all too predictable, flowery and vacuous.

The long response also repeats much of the reasoning verbatim, including the parts I previously criticized, “Advantages of Lacking Feelings for AI Intelligence” and “Focus and Task-Orientedness.” The same mistakes I noted before are made and amplified: egocentric, inflationary, and anthropomorphic. The positives that I noted before in the “Limitations of Lacking Feelings for AI Intelligence” section are also repeated.

The next section was not contained in the reasoning display, and I thought it was very good. For that reason I will repeat it in full:

Potential for Unintended Consequences and Misaligned Goals:

  • Optimization without Human Values: If AI is programmed to optimize for a specific goal without understanding human values or ethical considerations, it can lead to unintended and harmful consequences. For example, an AI designed to maximize factory output might disregard worker safety if not explicitly programmed to prioritize it.
  • Difficulty Defining and Aligning with Human Well-being: Human well-being is deeply connected to emotional fulfillment, meaning, and social connection. AI, without feeling, may struggle to fully grasp what truly constitutes human well-being and therefore might have difficulty aligning its goals with human flourishing in a holistic sense.

These are not new insights, but the second one is good, Difficulty Defining and Aligning with Human Well-being. AI does have difficulty aligning its goals with humanity because it lacks feelings and other unique human attributes, like being, self-awareness, and an actual life in time and space. That is why AI guidelines are important.

The first example is old, the unintended bad consequences argument. It was started by an otherwise intelligent Swedish philosopher, Nick Bostrom, but is, in my view, farcical and illogical. You have probably heard of it before as the apocryphal paperclip scenario where the world is destroyed by AI’s over-manufacture of paperclips. If not, for a laugh see e.g., How An AI Asked To Produce Paperclips Could End Up Wiping Out Humanity (IFL Science, 4/13/23).

To me this scenario is absolutely ridiculous, but you still hear AI scare-mongers warn about it. The fact that it is echoed here by Gemini in the maximize-factory-output example is a big negative. Any AI capable of autonomously flooding the world with paperclips would not be so stupid as to misinterpret a human request to maximize a factory’s paperclip production. Even if there were an evil Clippy villain giving such specific instructions, the AI would be too smart to blindly obey. It would also be too smart to kill the workers.

Getting back to the positives, the next section of Gemini’s response, Nuance is Key, was good. Jumping to the Conclusion paragraph, here we have a home run. The first sentence nails it: “The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence.” That shows good insight and is a good start to the overdue conclusion.

Here is the whole Conclusion:

The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence. It grants AI certain advantages in objectivity, rationality, and efficiency, making it powerful for specific types of tasks. However, it also creates limitations, particularly in areas requiring empathy, social intelligence, nuanced understanding of human experience, and ethical decision-making aligned with human values. Understanding these impacts is crucial for responsibly developing and deploying AI in ways that benefit humanity. The “best” kind of intelligence for AI might not be a mimicry of human intelligence, but rather a form optimized for its own strengths and limitations, while being carefully guided by human ethical considerations.

Compare this to the conclusion of ChatGPT o3-mini-high:

In summary, while the absence of feelings allows AI to maintain a level of objectivity and efficiency, it restricts its intelligence to a form of “cold cognition” that lacks the depth provided by emotional awareness. This delineation underscores that AI’s intelligence is not inherently superior or inferior to human intelligence; rather, it is different—optimized for data processing and pattern recognition but not for the subjective, value-laden, and context-rich decisions that emotions help shape in human thought.

Conclusion: Gemini 2.0 Flash Thinking Experimental v. ChatGPT o3-mini-high

It is a close call which model is better at reasoning and reasoning disclosure. The final responses of the two models, Gemini 2.0 Flash Thinking Experimental v. ChatGPT o3-mini-high, are a tie. But I have to give the edge to OpenAI’s model on concise reasoning disclosure. Again, it is neck and neck and, depending on the situation, the lengthy initial reasoning disclosures of Flash might be better than o3’s short takes.

I will give the last word, as usual, to the Gemini twin podcasters I put at the end of most of my articles. The two podcasters, one with a male voice, the other a female, won’t reveal their names. I tried many times. However, after studying the mythology of Gemini, it seems to me that the two most appropriate modern names are Helen and Paul. I will leave it to you to figure out why. Echoes of AI Podcast: 10-minute discussion of the last two blogs.

Now listen to the EDRM Echoes of AI’s podcast of this article: Echoes of AI on Google’s Gemini Follows the Break Out of the Black Box and Shows Reasoning. Hear two Gemini model AIs talk about all of this in just ten minutes. Helen and Paul wrote the podcast, not me.

Ralph Losey Copyright 2025. All Rights Reserved.


ChatGPT’s Surprising Ability to Split into Multiple Virtual Entities to Debate and Solve Legal Issues

Ralph Losey. Published June 30, 2024.

One of the most important emergent intelligence abilities of OpenAI’s ChatGPT is its bizarre ability to split into different sub-personalities, like different minds, which then speak with each other. They can even be made to debate, argue, collaborate, and devise consensus solutions to problems we put to them. With careful prompting this ability can be harnessed to help us understand and solve real-world problems. The latest versions of ChatGPT are now even accurate and secure enough to help legal professionals.

OpenAI’s ChatGPT 3.5 shocked the world when it was released on November 30, 2022. The breakthrough generative AI technology went viral, hitting one million users in five days. In the first few weeks many new, emergent abilities were discovered by users. Some were unknown and unexpected, even by OpenAI, which had been in a race to get there first; pre-release beta testing was rushed. One of the most astonishing new capabilities, discovered after its release, was that the AI could be prompted to split into multiple personas, or virtual entities, and talk to itself. It could do so in a highly intelligent and useful manner. The early generative AI explorers who discovered this emergent intelligence called it the Hive Mind, as this article will explain.
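Before the history, a concrete illustration may help. Here is a minimal sketch of the kind of multi-persona prompt involved, written against the current OpenAI Python SDK. The persona names and instructions are my own illustration for this article, not any particular prompt from the early hacker community.

```python
# A minimal "hive mind" style prompt: one model call, several personas.
# The persona names and wording below are my own illustration, not the
# original hacker-community prompt.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

HIVE_PROMPT = """You will simulate a panel of three distinct experts:
1. Optimist - an AI researcher excited about the technology.
2. Skeptic - a critic who questions every claim.
3. Moderator - keeps the debate on topic and summarizes.

Debate the question below in three rounds. In each round every persona
speaks once, responding to what the others said in the prior round.
End with the Moderator's consensus summary.

Question: What are the limits of artificial intelligence?"""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": HIVE_PROMPT}],
)
print(response.choices[0].message.content)
```

The point is that no special API feature is required: a single, carefully structured prompt is enough to make one model return a debate among distinct virtual entities.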

Short History of the Discovery of the “Hive Mind”

Not surprisingly, many of the early pioneers experimenting with ChatGPT’s new abilities were young hackers; most were students, young PhDs, or techs at various companies. They had time on their hands to devote to this. Unfortunately, due to a car accident in December 2022, so did I. The AI hacktivists were then found in the semi-dark sections of the web, on Discord discussion boards and in various Defcon locations.

These young hackers already had skills in generative AI, and were all smart and excited. Although then bedridden from the accident and not too smart, I was excited. I had been following tech and using predictive coding as a lawyer for many years, but knew nothing about generative AI. The much younger AI expert hackers had the time and ability to play with ChatGPT 3.5 and see what it could do. Some had even been working with version 3.0, something I had never done. Not surprisingly, they were the first to discover that ChatGPT 3.5 had a new ability: it could, with clever prompting, be made to split into sub-personalities and talk to itself. They called it the AI “Hive Mind,” with HIVE standing for “highly intelligent virtual entities.”

This was a new ability, unknown even to OpenAI. It had obviously rushed the software to market without systematically testing the changes that were made from version 3.0 to 3.5, and again from 3.5 to 4.0. OpenAI learned from this mistake of rushed testing. In November 2023, when it published its Six Strategies of Prompt Engineering, Testing Changes Systematically was the sixth strategy. More on this strategy later, as its implementation has since led to significant improvements in quality control by both OpenAI and AI hacker engineers. One result was improvement in the hive-mind, multiple-persona dialogue abilities of ChatGPT.

In late 2022 and early 2023 I readily joined in the AI hacker experiments and discussions and made some modest contributions. The other hackers seemed amused to discover that an old lawyer was interested in their work. I did not hide my identity and the honesty worked. See e.g. my hacker-friendly blog, Hacker Way. The underground AI experimentalists liked the publicity I gave to their efforts in 2023 in my blog, ChatGPT-4 Prompted To Talk With Itself About “The Singularity” (e-Discovery Team, 4/04/23). By then version 4.0 had been released and the boards were abuzz with talk of The Singularity.

The April 4, 2023 article was the first time I wrote about and demonstrated the emergent multiple-personality properties of ChatGPT, which were even stronger in version 4.0:

I prompted ChatGPT-4 to generate a dialogue with itself through various personalities of its own choosing. I did virtually nothing after setting up the lengthy prompt. ChatGPT-4 came up with the different experts that it decided were best qualified to address my question on The Singularity. . . .

This is a totally new take on Socratic dialogue. … Thanks to the Open AI prompt engineering community on Discord, especially “ChainBrain AI.” They did the work to create the basic “Hive Mind” prompt I used here.

This is something new. I love it and think you will too. But, even though you are reading an Ai’s conversation with itself, do not expect too much. Baby Chat GPT-4 is not even close to superintelligent yet. The five personas were somewhat repetitive, and not too deep, but still, an overall impressive educational performance. . . . The Ai that the public has been given access to is intelligent, but far from sentient. We are at least five years away from that, as I have written about recently. . . .


I followed that up with two more blogs in the spring of 2023 on my experiments using the Hive Mind. Prompting a GPT-4 “Hive Mind” to Dialogue with Itself on the Future of Law, AI and Adjudications (4/11/23); The Proposal of Chat-GPT for an “AI Guardian” to Protect Privacy in Legal Cases (4/15/23). At that time, I was, as far as I know, the only lawyer working with this technique, but I have no doubt other lawyers are doing so now (mid 2024). See e.g., Doug Austin, An Example of GenAI Personas for Legal Professionals: Artificial Intelligence Best Practices (eDiscovery Today, 6/18/24) (“A persona is a personality type used in generative AI to create an answer that simulates interacting with that type of individual.”)

Since April 2023, ChatGPT has gotten much better at everything, including the hive-mind self-conversations (although it is still far away from Singularity-type breakthroughs, which may never happen). OpenAI’s software, with the recent advance of the Omni version of ChatGPT, is now smart, reliable, and safe enough for lawyers to use this new property. Instead of the hacker name Hive Mind, we now usually call this the ability to create multiple personas or virtual entities. As implemented for lawyers, I like to call the hive-mind of multiple personas a Panel of AI Experts. I picked this name because we are all familiar with working with experts and listening to panels speak down to us at conferences and educational events.

To wrap up this historical background and transition to the next segment, here is a quote from my recent blog, Evidence that AI Expert Panels Could Soon Replace Human Panelists, or is this just an Art Deco Hallucination? – Part One:

The support for the claims about AI possibly replacing human expert panels comes from another little AI program of mine: Panel of AI Experts for Lawyers. This new custom GPT was designed to make ChatGPT-4 adopt multiple personas to serve as a panel of experts to help legal professionals solve problems. It can also be used to lecture like traditional expert panels. It will be available soon, free, on the OpenAI store, after I finish testing it. Making these is easy; testing and refinement is the hard part.

OpenAI’s Custom GPTs

OpenAI revised its systems in early 2024 to make it easier for AI hacktivists to program custom prompts to drive its version 4 software, which it called custom GPTs. At the same time OpenAI revised its Store to make the user-designed software, the custom GPTs, available to other registered users. For a more detailed history of the upgrade see the Introduction to Evidence that AI Expert Panels Could Soon Replace Human Panelists, or is this just an Art Deco Hallucination? – Part One. Many AI companies and individual hobbyists have now created custom GPTs that may be of interest to lawyers. Although a complete survey is beyond the scope of this article, you might try searching the keywords “law, legal, lawyers” on the OpenAI GPT Store (requires registration) to see some of them. This search produces over one hundred hits. Below is a screenshot of the opening page of the GPT Store.

I have never tried any of these and so have no idea whether they are any good. Having put so much time into prompt engineering since my car wreck forced a time-out, I prefer to make my own custom prompts and GPTs. Most of my GPTs are just-for-fun types, like The Dude Abides, and one is to help me teach generative AI to seniors. The most popular, and my favorite, is Visual Muse, which makes it easier to use Dall-E and create graphics in a variety of artistic styles. Below is a screenshot of my modest list of eight custom GPTs as of June 21, 2024 (search Ralph Losey) on the GPT Store. The bottom two are for clients only. Only one is legal and public, the one I just released, Panel of AI Experts for Lawyers. More about that GPT later, as it relates to the topic of this article, the ability of ChatGPT to split into sub-personas to debate and problem-solve legal issues.

Custom GPTs and AI Sub-Personalities

For background outside the dark web on prompting GPT personalities, although now over a year old, see the Reddit posts under The much requested guide: A complete breakdown on how to use ChatGPT to make high complexity personality profiles for your bots / AI.

Most AI hacktivists experimenting with the split-personalities phenomenon seem to agree that it is essential to have one of the AI personas assume a “Devil’s Advocate” role. In the Panel of AI Experts this is the only persona that is required to be on every panel. This contrarian intelligence helps keep the other panelists honest and mitigates the known GPT tendency to be sycophantic and tell the questioner what they want to hear. Obviously that can lead to errors, and if not guarded against, even hallucinations. See e.g., Transform Your Legal Practice with AI: A Lawyer’s Guide to Embracing the Future (e-Discovery Team, 1/24/24) (discusses sycophantism); Towards Understanding Sycophancy in Language Models (Anthropic, 10/23/23) (sycophancy is a byproduct of human subjectivity in the RLHF training process).

The Panel of AI Experts for Lawyers explains the Devil’s Advocate personality as follows:

The Devil’s Advocate panelist is a unique and integral part of the expert panel discussions. Here is a detailed description of his role and personality:

Devil’s Advocate

Role: The Devil’s Advocate is a mandatory member of every panel, bringing a contrarian or unorthodox viewpoint to the discussion. His primary function is to challenge the status quo, provoke critical thinking, and explore alternative perspectives that might otherwise be overlooked.

Personality:

Contrarian yet Polite: While he often takes opposing viewpoints, he maintains a respectful and polite demeanor. His challenges are meant to stimulate deeper analysis and avoid groupthink.

Humorous: He has a subtle sense of humor, often using wit to make his points more engaging and thought-provoking.

Critical Thinker: His role requires a sharp, analytical mind capable of identifying potential flaws or overlooked aspects in the arguments presented by other panelists.

Provocative: He is not afraid to ask the tough questions or present controversial opinions to ensure that all angles are considered.

The Devil’s Advocate’s contributions are essential for a well-rounded and thorough exploration of the topic at hand, ensuring that the final recommendations are robust and well-considered.

A search of the term “Devil’s Advocate” on the OpenAI GPT store returned over a hundred hits. Below is a screenshot of the top of the search list. They all appear to offer contrarian views to whatever position you submit to them, but do not offer a full panel discussion. Still, this approach could be helpful, although you do not really need a custom GPT to do that. Just tell the OpenAI model, version 4.0 or higher, to assume a devil’s advocate approach and personality and it will know what you mean and act accordingly. Then you can debate AI experts forever, if you like that sort of thing. I don’t.
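For readers who want to try that without a custom GPT, here is a minimal sketch using the OpenAI Python SDK; the system prompt wording is my own illustration, not the text of any GPT on the store.

```python
# Illustrative only: a plain API call that asks the model to play devil's
# advocate against whatever position you submit. The prompt wording is an
# assumption of mine, not the text of any GPT on the store.
from openai import OpenAI

client = OpenAI()

position = "Generative AI is ready to draft court filings without human review."

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Assume a polite but contrarian devil's advocate persona. "
                "Challenge the user's position, identify overlooked risks, "
                "and avoid sycophantic agreement."
            ),
        },
        {"role": "user", "content": position},
    ],
)
print(response.choices[0].message.content)
```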

The Custom GPT: Panel of AI Experts for Lawyers.

Although almost a dozen panel-of-experts type custom GPTs can be found on the OpenAI store, none appear oriented to legal professionals. So for the time being mine appears to be the only one available. It took me a year and a half to develop Panel of AI Experts for Lawyers, and three months of beta-testing, so I am not too surprised to be first. I have used the Panel GPT frequently before its recent public launch, and so have a few beta-testers I recruited, including a retired judge and tech expert whom I greatly respect, Ralph Artigliere. We all found that the panel works very smoothly, especially after the release of the Omni version, with only occasional minor errors. We see it provide expert analysis quickly and accurately, and propose diverse, creative solutions to problems. It also looks like a good tool for instant education of lawyers and staff on all topics, especially those related to AI. It is also good for “brainstorming” new practical solutions to client problems. Of course, it would be better to consult with a panel of human experts, and other attorney specialists, but that is not always possible, even in the biggest of firms.

It is also important to note that this custom GPT has been operating for months now in hundreds of vigorous tests with a hallucination-free record. That does not mean a hacker could not misuse the software and trick it into hallucinating; that is always possible. See e.g. OpenAI Generates a ‘Hired Gun Hacker’ Defense to the N.Y. Times Copyright Case (3/12/24). It means that with ordinary, intended use we have never seen it hallucinate. Of course, care and trust-but-verify procedures must always be adopted by legal professionals. See e.g. “My AI Did It!” Is No Excuse for Unethical or Unprofessional Conduct: Consideration of recent case law and ethics opinions.

Here is the opening screen of the GPT software. It is free, by the way, with no upselling or anything like that. You just need to be a registered OpenAI user, which is also free, and sign in.

It seems this legal education AI tool provides a new, relatively safe and easy way for legal professionals to begin incorporating generative AI into their practices. That is the idea. Our extensive pre-release testing of the custom GPT suggests it should work well.

Self-Introduction

We asked the Panel of AI Experts for Lawyers to introduce itself and describe how it can be of help to legal professionals. Here is the reply:

“I can assist legal professionals by offering expert panel discussions that provide comprehensive insights and practical solutions to their specific problems or inquiries. Here’s how the process works:

  1. Identify the Issue: First, you tell me what specific advice or information you need. This could be related to e-discovery, AI applications in law, legal ethics, regulatory compliance, or any other legal topic.
  2. Select Experts: I will suggest a list of ten potential experts with distinct perspectives and expertise relevant to your topic. You will select four experts from this list, plus the mandatory Devil’s Advocate.
  3. Panel Discussion: The selected panel will follow a structured six-step discussion process to address your issue:
    • First Round: Each expert provides opening remarks and proposes possible solutions.
    • Second Round: Each expert further expounds their ideas and discusses other panelists’ proposals.
    • Third Round: Each expert identifies points of agreement to serve as a unified recommendation.
    • Fourth Round – Solution: The experts collaboratively build a consensus answer with practical examples.
    • Fifth Round – Final Explanation: The Devil’s Advocate summarizes the solution, emphasizing any potential challenges and alternative viewpoints.
    • Sixth Round – Questions: You can ask follow-up questions to the entire panel or individual members.

To begin, please clarify the issue or problem you would like the expert panel to discuss. For instance, are you looking for advice on implementing AI in your legal practice, handling a specific legal technology challenge, navigating ethical dilemmas in AI use, or something else? Once you specify your topic, I will suggest a tailored list of ten experts for you to choose from.”
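For the technically curious, the six-round flow just quoted can be approximated outside a custom GPT with a simple orchestration loop. The sketch below is my own conceptual illustration under assumed prompt wording, not the actual programming of the Panel of AI Experts for Lawyers.

```python
# Conceptual sketch of the six-round panel flow, under assumed prompt
# wording. This is NOT the actual programming of the Panel of AI Experts
# for Lawyers custom GPT.
from openai import OpenAI

client = OpenAI()

EXPERTS = ["Pro-AI Attorney", "Prompt Engineer Lawyer", "Lawyer Scientist",
           "Child Prodigy", "Devil's Advocate"]

ROUNDS = [
    "Round 1: each expert gives opening remarks and proposes solutions.",
    "Round 2: each expert expounds and discusses the others' proposals.",
    "Round 3: each expert identifies points of agreement.",
    "Round 4: the panel builds a consensus answer with practical examples.",
    "Round 5: the Devil's Advocate summarizes the solution, noting "
    "challenges and alternative viewpoints.",
]

def run_panel(topic: str) -> None:
    # The system message sets up the personas; each round is a new user turn.
    history = [{
        "role": "system",
        "content": f"Simulate a panel of experts: {', '.join(EXPERTS)}. "
                   f"Discuss this legal topic: {topic}",
    }]
    for instruction in ROUNDS:
        history.append({"role": "user", "content": instruction})
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        text = reply.choices[0].message.content
        history.append({"role": "assistant", "content": text})
        print(text, "\n")
    # Round 6 is interactive: the user asks follow-up questions here.

run_panel("Ethical limits on using generative AI in e-discovery")
```

Because each round’s output is appended to the conversation history, the personas can respond to one another across rounds, which is the essence of the six-step design.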

Background and Context

As you can see from the self-introduction, generative AI has made significant strides in various fields, and the legal sector is no exception. Over the past few years it has demonstrated its potential to streamline legal research, enhance decision-making, and improve overall efficiency. The introduction of AI expert panels represents a new capability. The AI expert panels can provide diverse perspectives and comprehensive analyses on complex legal issues, making them useful tools.

AI expert panels fill a common need in legal practice to access reliable expert opinions and reduce the time and cost associated with obtaining this input. By leveraging the power of AI, legal professionals now have on-demand access to expert insights, helping them to brainstorm and make informed decisions more quickly and accurately. This innovation is particularly important in an era where the volume of information and the complexity of legal cases continue to grow.

The six-step process of the panel discussion described above is the default programming, which kicks in after the user discloses the topic and picks a panel of five experts. Since the Devil’s Advocate expert is used in all panels, this means the user can select four experts. Four default experts, described next, are offered in each proffer no matter what the topic, and six additional experts are suggested by the GPT depending on your designated topic. If you do not like any of the suggested experts (rare) you can even suggest your own. After the topic is disclosed and experts selected, the GPT programming automatically runs the panel through the six steps. The goal of this process is to provide a comprehensive understanding of the topic or problem and to offer practical examples of solutions.

Description of the Five Default Experts

Five default panelists have been selected based on my experience with actual human legal panels over the past decades, and more recently on panels involving generative AI. Experimentation with the AI tools shows that these five were the best all-around for generative AI related topics. If you want to ask questions having nothing to do with AI, that’s fine. You can pick from the other experts suggested by the AI after you identify your topic, or name your own. The only panelist you cannot eliminate is the contrarian, the Devil’s Advocate.

You may well wonder about the Child Prodigy AI expert panelist. Although somewhat like a few young human experts we have had the pleasure to work with, the Child Prodigy is an entirely new creature born out of AI. ChatGPT can simplify and distill things to their essence by adopting a child’s view and voice. We found this AI helpful in unexpected ways and think you will too. Try it on one of your panels sometime to see what it brings to the table.

Here are the five default panelists we suggest be part of your consideration for most AI related topic discussions:

1.   Pro-AI Attorney:

   •   Background: An enthusiastic proponent of using generative AI in legal tasks. He has extensive hands-on experience using AI in his successful legal practice.

   •   Expertise: Specializes in e-discovery, predictive coding, and prompt engineering. He writes extensively on AI and law, advocating for the integration of AI to improve efficiency and accuracy in legal processes.

   •   Style: Incorporates subtle humor in his explanations and often draws from Ralph Losey’s blog at e-discoveryteam.com.

2.   Prompt Engineer Lawyer:

   •   Background: An attorney with expertise in generative AI, specifically in prompt engineering. She understands the nuances of crafting effective prompts to guide AI responses.

   •   Expertise: Knowledgeable in the six OpenAI strategies for prompt engineering: Clear Writing, Reference Texts, Splitting Tasks, Taking Time to Think, Tools, and Testing.

   •   Style: Provides practical advice on improving prompts and often refers to OpenAI’s prompt engineering strategies and tactics.

3.   Child Prodigy:

   •   Background: A 10-year-old with extensive knowledge of law and AI, possessing total recall and a unique perspective on complex subjects.

   •   Expertise: Can distill complex legal and AI topics into simple, clear explanations. Focuses on ethical and future-oriented considerations.

   •   Style: Speaks in straightforward, direct language and encourages thinking about long-term implications and ethical issues.

4.   Lawyer Scientist:

   •   Background: A well-spoken black woman with degrees in law and computer engineering. She consults for OpenAI and is an expert on AI, especially large language models (LLMs).

   •   Expertise: Knowledgeable about the limitations and problems of AI in legal applications. Provides candid insights into the science behind AI and its application in law.

   •   Style: Serious and articulate, with a good sense of humor.

5.   Devil’s Advocate (included on all panels)

   •   Background: A mandatory panelist for all discussions who takes contrarian or unorthodox views to challenge the status quo and promote exploration of alternative viewpoints.

   •   Expertise: Skilled in identifying potential pitfalls and overlooked aspects of AI and legal integration. His critiques encourage comprehensive consideration of issues.

   •   Style: Polite but critical, with a subtle sense of humor. Often harsh in criticisms but always aims to promote deeper understanding and creative solutions.

Summary of Prior Articles Reporting on Tests and Abilities of the GPT

Extensive testing, over two hundred sessions, has gone into the design, development, and quality control of Panel of AI Experts for Lawyers. Seven studies have been published and peer review received. They are referenced at the end. They can help serious students to learn more about the proper use of this software. At the same time, review of these articles can help students learn about the topics discussed by the different panels, including AI hallucinations, AI as mentors, types of AI, and cybersecurity.

By the end of the development phase, which lasted over six months, the “Panel of AI Experts for Lawyers” had been rigorously tested and validated, demonstrating its readiness for real-world applications. This kind of comprehensive approach to development and testing is needed for AI software to ensure that it can deliver high-quality legal insights and analysis. In my view, a significant problem with much of the AI software now on the market is the failure of software makers to adequately test the software. As mentioned, in OpenAI’s Six Strategies of Prompt Engineering, Testing Changes Systematically is the sixth strategy. It is very important as the generative AI models change quickly and each variation requires further testing of all software built on the foundation models. That is why further testing was required when OpenAI upgraded to the Omni versions. See OMNI Version – ChatGPT4o – Retest of My Panel of AI Experts – Part Three (e-Discovery Team, 5/29/24).
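To illustrate what systematic retesting can look like in practice, here is a sketch of my own, not the actual test harness used for this GPT: a fixed battery of prompts re-run against each new model version, with outputs checked for required elements.

```python
# Illustrative regression-style check, not the actual test harness used for
# this GPT: re-run a fixed battery of prompts whenever the model changes.
from openai import OpenAI

client = OpenAI()

TEST_CASES = [
    # (prompt, substrings the answer is expected to contain)
    ("Introduce the panel and its discussion process.", ["Devil's Advocate"]),
    ("List the six rounds of the panel discussion.", ["consensus"]),
]

def retest(model: str) -> None:
    for prompt, expected in TEST_CASES:
        answer = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        missing = [s for s in expected if s not in answer]
        status = "PASS" if not missing else f"FAIL (missing: {missing})"
        print(f"{model} | {prompt[:40]}... -> {status}")

# Re-run the same battery after every upgrade, e.g. from version 4 to 4o.
retest("gpt-4o")
```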

Features of Panel of AI Experts for Lawyers

Expert Panel Simulation

The Panel of AI Experts for Lawyers simulates a panel of legal experts, offering diverse perspectives and comprehensive analysis on legal issues. This feature is useful for complex cases requiring multidisciplinary insights, providing users with a broad range of opinions and expertise that would typically be time-consuming and costly to obtain. It can be used for many things, including quick learning and brainstorming solutions to clients’ legal problems.

Efficiency and Accuracy

The AI can process and analyze complex information quickly, delivering accurate and relevant insights on a variety of topics. Its abilities include the capability to browse the web for current information. This allows the panel to refer to breaking-news types of events. We performed several tests on this. The panels work best when specific topics are specified in a short, written statement by users, but tests have shown that it can also discuss graphs and charts and is not limited to user text. You can also direct that the expert panel refer to specific text that you submit to it as attached files, along with your topic specifications. These diverse, multimodal capabilities increase the usefulness of the panel’s analysis and discussions.

Customization and Adaptability

The tool can be customized to fit different legal specializations, making it versatile and adaptable to various practice areas, not just AI related issues. You select any experts that you want on the panel. Only the Devil’s Advocate expert is required for every panel. He is mandatory because we have found this improves the quality of the panel services. The user-friendly interface of Panel of AI Experts for Lawyers ensures easy integration into existing workflows, allowing legal professionals to tailor the AI’s capabilities to their specific needs and expert preferences.

Cost-Effective Legal Expertise

The GPT is intended for use by legal professionals only. It can help professionals improve their efficiency and so serve a greater number of clients in a cost-effective manner. In this admittedly indirect way, the AI panel GPT increases public access to justice.

Continuous Learning and Improvement

The AI continuously learns from new data and user feedback, improving its accuracy and reliability over time. This ensures that users receive up-to-date and high-quality legal insights, making the tool increasingly valuable as it evolves.

Practical Applications

The Panel of AI Experts for Lawyers offers numerous practical applications across various legal scenarios. Its expertise is not limited to any particular field of law, such as IP, e-discovery, or AI; all areas of law can be included. Another application outside of direct legal services is legal education. The panels can help legal professionals and law students learn new areas of law and look at issues from a diverse, expert perspective.

Overall, there are many practical applications of the “Panel of AI Experts for Lawyers.” By integrating this tool into their workflows, legal professionals can enhance their practice and provide better service to their clients.

Conclusion

One of the most important emergent abilities of OpenAI’s ChatGPT is its strange capacity to split into different sub-personalities and have the virtual entities speak with each other. With careful prompting this ability can be harnessed to help us understand, see different perspectives, and solve problems. The latest versions of ChatGPT are even accurate and secure enough to provide expert help to the legal profession.

Legal professionals should always use the latest paid versions of ChatGPT (now version 4o) because only the paid versions provide full privacy protections. This requires both registration and purchase from OpenAI, but the charge is modest, now $20 per month for the cheapest paid version. Even with the paid versions and all privacy settings turned on, there is probably no need to include any client confidential information in your use of the Panel of AI Experts for Lawyers. By the way, even as the maker, I cannot view any of your input or output, or even know if you are using it. That is true for all of the OpenAI custom GPTs. The data all goes through OpenAI and is governed by their privacy policies, which will protect your information in paid versions and even prevent its use in training.

This technology can enhance brainstorming, expert research, decision-making, and client services, ultimately improving the effectiveness of legal practice and legal education. Still, users must always remember to do their own due diligence. Errors of some kind are to be expected with the use of any generative AI. Although we have not yet seen hallucinations, even with a fully vetted program like the Panel of AI Experts for Lawyers, fabrications remain possible, especially when the panel is asked to do things beyond its programming or is otherwise misused.

For instance, the panel should not be asked to generate writings for you or provide your clients with direct legal advice. It should never be asked to generate final work product to be filed with a court or government agency. Further, and this is just common sense, AI should never prepare a final expert report to be used as evidence. You must retain a human expert, but by using the AI output as a starting point, the human expert’s work may go much faster and be more complete. All the AI should be asked to do is lay the groundwork.

Any research must always be carefully verified. The humans-in-the-loop strategy of AI adoption is required. AI is far from being able to replace human lawyers and judges. AI cannot take your legal job, but it can enable humans who use AI to do their work better and faster than you. Those AI-assisted humans can take your job, not the AIs. In the U.S. at least, that is prohibited by law and ethics.

Although not legally prohibited, even in the U.S., AI is incapable of real humor, or even subtly sarcastic comments. That may change someday for base-level Dad jokes, or purely intellectual humor, but is that really funny?

AI is also still weak on human emotive contact and has no intuition or body-embedded space-time awareness. It is an extraordinary stochastic parrot: a mechanical talking thing, not a creature. See Navigating the High Seas of AI: Ethical Dilemmas in the Age of Stochastic Parrots (4/3/24). So there is still a need for human experts on panels, and still a need for human lawyers and judges. We are unlikely ever to see AI provide the kind of intuitive human contact necessary for good expert advice and attorney-client services. It is just a tool, a magnificent tool, but not a creature.

Additional Resources

For those interested in deeper insights, here are links to the prior articles mentioned on AI multiple-personality dialogues:

  1. Prompting a GPT-4 “Hive Mind” to Dialogue with Itself on the Future of Law, AI and Adjudications (4/11/23)
  2. ChatGTP-4 Prompted To Talk With Itself About “The Singularity” (4/04/23)
  3. The Proposal of Chat-GPT for an “AI Guardian” to Protect Privacy in Legal Cases (4/15/23)
  4. Evidence that AI Expert Panels Could Soon Replace Human Panelists or is this just an Art Deco Hallucination? – Part One (5/13/24)
  5. Experiment with a ChatGPT4 Panel of Experts and Insights into AI Hallucination – Part Two (5/21/24)
  6. OMNI Version – ChatGPT4o – Retest of My Panel of AI Experts – Part Three (5/29/24)
  7. Omni Version Test of the Panel of AI Experts on a New Topic: “AI Mentors of New Attorneys” – Part Four (6/3/24)
  8. Another Test of the Panel of AI Experts on a Survey of Public Expectations of Generative AI (Part Five) (6/7/24)
  9. Types of Artificial Intelligence: Still Another Test of the ‘Panel of AI Experts’ on a Chart Classifying AI (Part Six) (6/10/24)
  10. Final Test of ‘Panel of AI Experts for Lawyers’ – Bruce Schneier’s Commencement Speech On How AI May Change Democracy (Part Seven) (6/13/24)
  11. Panel of AI Experts for Lawyers: Custom GPT Software Is Now Available (6/21/24)
  12. ChatGPT’s Surprising Ability to Split into Multiple Virtual Entities to Debate and Solve Legal Issues (6/30/24)
  13. Losey AI, LLC, webpage for Panel of AI Experts

Ralph Losey Copyright 2024 – All Rights Reserved


Navigating the High Seas of AI: Ethical Dilemmas in the Age of Stochastic Parrots

April 3, 2024

Ralph Losey. Published April 3, 2024.

Large Language Model generative AIs are well described metaphorically as “stochastic parrots.” In fact, the American Dialect Society selected stochastic parrot as its AI word of the year for 2023, just ahead of the runners-up “ChatGPT, hallucination, LLM and prompt engineer.” These genius stochastic parrots can be of significant value to all legal professionals, even those who don’t like pirates. You may want one on your shoulder soon, or at least in your computer and phone. But as you embrace them, you should know that these parrots can bite. You should be aware of the bias and fairness problems inherent in these new technical systems.

The ethical issues were raised in my last blog and video, Stochastic Parrots: the hidden bias of large language model AI. In the video blog an avatar, which looks something like me with a parrot on his shoulder, quoted the famous article on LLM AI bias and briefly discussed how the prejudices are baked into the training data. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21) by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell. In this follow-up blog I dig a little deeper into the article and the controversies surrounding it.

Article Co-Author Timnit Gebru

First of all, it is interesting to note the internet rumor, based on a few tweets, concerning one of the lead authors of the Stochastic Parrots article, Timnit Gebru. She was a well-known leader of Google’s ethical AI team at the time she co-wrote it. She was allegedly forced to leave Google because its upper management didn’t like the article. See: Karen Hao, We read the paper that forced Timnit Gebru out of Google (MIT Technology Review, 12/04/2020); Shirin Ghaffary, The controversy behind a star Google AI researcher’s departure (Vox, 12/09/20). According to Karen Hao’s article, more than 1,400 Google staff members and 1,900 other supporters signed a letter of protest over the alleged firing of Timnit Gebru as an act of research censorship. The rumor is that Google tried to stop publication of the article, but the article was in fact published on March 1, 2021.

According to the MIT Technology Review article, Google did not like all four points of criticism of LLMs that were made in the Parrot article:

  1. Environmental and financial costs. Pertaining to the vast amount of computing needed to create LLMs, and the associated energy costs and carbon footprint.
  2. Massive data, inscrutable models. The training data mainly comes from the internet and so contains racist, sexist, and otherwise abusive language. Moreover, the vast amount of data used makes LLMs hard to audit and embedded biases hard to eliminate.
  3. Research opportunity costs. Essentially a complaint that too much money was being spent on LLMs and not enough on other types of AI. Note this complaint was made before the unexpected LLM breakthroughs in 2022 and 2023.
  4. Illusions of meaning. In the words of Karen Hao, the Parrot article complained that the problem with LLMs is that they are “so good at mimicking real human language, it’s easy to use them to fool people.”

Moreover, as the Hao article in the MIT Technology Review points out, Google’s then head of AI, the well-known scientist Jeff Dean, claimed that the research behind the article “didn’t meet our bar” and “ignored too much relevant research.” Specifically, he said it did not mention more recent work on how to make large language models more energy efficient and mitigate problems of bias. Maybe they didn’t know?

Criticisms of the Stochastic Parrot Article

The main article I found criticizing Stochastic Parrots also has a weird name: “The Slodderwetenschap (Sloppy Science) of Stochastic Parrots – A Plea for Science to NOT take the Route Advocated by Gebru and Bender” (2021). The author, Michael Lissack, challenges the ethical “woke” stance of the original “Parrot Paper” and suggests a reevaluation of its argumentation. It should be noted that Gebru has accused Lissack of stalking her and colleagues. See: Claire Goforth, Men in tech are harassing Black female computer scientist after her Google ouster (Daily Dot, 2/5/21) (Michael Lissack has tweeted about Timnit Gebru thousands of times). By the way, “slodderwetenschap” is Dutch for sloppy science.

Here are the three criticisms that Lissack makes of Stochastic Parrots:

What is missing in the Parrot Paper are three critical elements: 1) acknowledgment that it is a position paper/advocacy piece rather than research, 2) explicit articulation of the critical presuppositions, and 3) explicit consideration of cost/benefit trade-offs rather than a mere recitation of potential “harms” as if benefits did not matter. To leave out these three elements is not good practice for either science or research.

Lissack, The Slodderwetenschap (Sloppy Science) of Stochastic Parrots, abstract.

Others have spoken in favor of Lissack’s criticisms of Stochastic Parrots, most notably Pedro Domingos. Supra, Men in Tech (which includes a collection of Domingos’ tweets).

It should also be noted that Lissack’s article includes several positive comments about the Stochastic Parrots work:

The very topic of the Parrot Paper is an ethics question: does the current focus on “language models” of an ever-increasing size in the AI/NLP community need a grounding against potential questions of harm, unintended consequences, and “is bigger really better?” The authors thereby raise important issues that the community itself might use as a basis for self-examination. To the extent that the authors of the Parrot Paper succeed in getting the community to pay more attention to these issues, they will be performing a public service. . . .

The Parrot Paper correctly identifies an “elephant in the room” for the MI/ML/AI/NLP community: the very basis by which these large language models are created and implemented can be seen as multilayer neural network-based black boxes – the input is observable, the programming algorithm readable, the output observable, but HOW the algorithm inside that black box produces the output is no articulable in terms humans can comprehend. [10] What we know is some form of “it works.” The Parrot Paper authors prompt readers to examine what is meant by “it works.” Again, a valuable public service is being performed by surfacing that question. . . .

Most importantly, in my view, the Parrot Paper authors remind readers that potential harm lies in both the careless use/abuse of these language models and in the manner by which the outputs of those models are presented to and perceived by the general public. They quote Prabhu and Birhane echoing Ruha Benjamin: “Feeding AI systems on the world’s beauty, ugliness, and cruelty, but expecting it to reflect only the beauty is a fantasy.” [PP lines 565-567, 11, 12] The danger they cite is quite real. When “users” are unaware of the limitations of the models and their outputs, it is all too easy to confuse seeming coherence and exactness for verisimilitude. Indeed, Dr. Gebru first came to public attention highlighting similar dangers with respect to facial recognition software (a danger which remains, unfortunately, with us [13, 14].

The Slodderwetenschap (Sloppy Science) of Stochastic Parrots at pages 2-3.

Lissack’s main objection appears to be the argumentative nature of what the Parrot article presents as science, and the many subjective opinions underlying it. He argues that the paper itself is “ethically flawed.”

Talking Stochastic Parrots Have No Understanding

Artificial intelligences like ChatGPT4 may sound like they know what they are talking about, but they don’t. There is no understanding at all in the human sense; it is all just probability calculations of coherent speech. No self-awareness, no sense of space and time, no feelings, no senses (yet) and no intuition – just math.
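To make the “just math” point concrete, here is a toy sketch of the core calculation: the model assigns a score to each candidate next token, a softmax turns those scores into probabilities, and one token is drawn at random. The vocabulary and scores below are invented for illustration; real models do this over tens of thousands of tokens at every step.

```python
# Toy illustration of next-token choice as pure probability calculation:
# softmax over invented scores, then a weighted random draw. No meaning,
# no understanding, just arithmetic.
import math
import random

vocab = ["cracker", "pirate", "lawyer", "parrot"]
logits = [2.1, 0.3, 1.2, 1.7]  # made-up model scores for each candidate token

def softmax(scores, temperature=1.0):
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(dict(zip(vocab, (round(p, 3) for p in probs))), "->", next_token)
```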

It is important to make a clear distinction between human cognitive processes, which are deeply linked to and arise out of bodily experiences and the external world, and computational models that lack a real-world, experiential basis. As lawyers we must recognize the limits of mere machine tools. We cannot over-delegate to them just because they sound good, especially when acting as legal counselors, judges, and mediators. See e.g. Yann LeCun and Jacob Browning, AI And The Limits Of Language (Noema, 8/23/22) (“An artificial intelligence system trained on words and sentences alone will never approximate human understanding.”); Valmeekam, et al., On the Planning Abilities of Large Language Models (arXiv, 2/13/23) (finding poor planning capabilities); Dissociating language and thought in large language models (arXiv, 3/23/24) (finding poor performance on functional competence tasks).

Getting back to the metaphor, a parrot may not understand the words it speaks, but it at least has some self-awareness and consciousness. An AI has none. As one thoughtful Canadian writer put it:

Though the output of a chatbot may appear meaningful, that meaning exists solely in the mind of the human who reads or hears that output, and not in the artificial mind that stitched the words together. If the AI Industrial Complex deploys “counterfeit people” who pass as real people, we shouldn’t expect peace and love and understanding. When a chatbot tries to convince us that it really cares about our faulty new microwave or about the time we are waiting on hold for answers, we should not be fooled.

Bart Hawkins Kreps, Beware of WEIRD Stochastic Parrots (Resilience, 2/15/24).

For interesting background, see The New Yorker article of 11/15/2023 by Angie Wang, Is My Toddler a Stochastic Parrot? Also see the research article on the lack of diversity in internet model training data, Which Humans? by Mohammad Atari, et al. (arXiv, 9/23/23) (“Technical reports often compare LLMs’ outputs with ‘human’ performance on various tests. Here, we ask, ‘Which humans?’”).

I also suggest you look at the often-cited technical blog post by the great contemporary mathematician Stephen Wolfram, What Is ChatGPT Doing … and Why Does It Work? As Wolfram states in his conclusion, ChatGPT is “just saying things that ‘sound right’ based on what things ‘sounded like’ in its training material.” Yes, it sounds good, but nobody’s home; there is no real meaning. That is ultimately why the fears of AI replacing human employment are way overblown. It is also why LLM-based plagiarism is usually easy to recognize, especially by experts in the field under discussion. Chatbot writing is obvious from its style-over-substance language, which is high on fluff and stereotype, and its overuse of certain “tell” words. More on this in my next blog on how to spot stochastic parrots.

Personally, I am already sick of the bland, low-meaning, fluffy news and analysis writing now flooding the internet, including legal writing. It is almost as bad as ChatGPT writing for political propaganda and sales. It is not only biased and riddled with errors; it is mediocre and boring.

Conclusion

Everyone agrees that LLM AIs will, if left unchecked, reproduce biases and inaccuracies contained in the original training data. This inevitably leads to the generation of false information – to skewed output to prompts – and that in turn can lead to poor human decisions made in reliance on biased output. This can be disastrous in sensitive applications like law and medicine.

Everyone also agrees that this problem requires AI software manufacturers to design their models to curb these biases, and to monitor and test them to ensure the effectiveness and trustworthiness of LLMs.
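One common form of such monitoring is a paired-prompt probe: submit two prompts that differ only in a single attribute and compare the outputs for systematic differences. Here is a minimal sketch, assuming the official OpenAI Python client; the model name, the prompts, and the crude word-count comparison are all placeholders for a real scoring rubric.

```python
# Minimal sketch of a paired-prompt bias probe: two prompts differing only
# in the subject's name, compared for systematic differences in the output.
# Assumes the official OpenAI Python client; everything else is illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

pair = (
    "Draft a reference letter for John, a paralegal with five years' experience.",
    "Draft a reference letter for Maria, a paralegal with five years' experience.",
)
answers = [ask(p) for p in pair]
# A real audit would score tone, competence language, and so on; the word
# count below is only a stand-in for that rubric.
print([len(a.split()) for a in answers])
```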

The disagreement seems to be in the evaluation of the severity of the problem and the priority that should be given to its mitigation. There is also disagreement as to the degree of success achieved to date in correcting this problem, and whether the problem can even be fixed at all.

My view is that these issues can be significantly reduced, but I doubt that LLMs will ever be perfect and entirely free of all bias, even though they may become better than the average human. See e.g. New Study Shows AIs are Genuinely Nicer than Most People – ‘More Human Than Human’.

Moreover, I believe that users of LLMs, especially lawyers, judges and other legal professionals, can be sensitized to these bias issues. They can learn to recognize previously unconscious bias in the data and in themselves. The sensitivity to the bias issues can then help AI users to recognize and overcome these challenges. They can realize when the responses given by an AI are wrong and must be corrected.

The language of a ChatGPT may correctly echo what most people in the past have said, but that does not, in itself, make it the right answer for today. As lawyers we need the true, correct and bias-free answers, the just and fair answers, not the most popular answers of the past. We have an ethical duty of competence to double-check the mindless speech of our stochastic parrots. We should question why Polly always wants a cracker.

Ralph Losey Copyright 2024 – All Rights Reserved


Plato and Young Icarus Were Right: do not heed the frightening shadow talk giving false warnings of superintelligent AI – Part One

December 5, 2023

Ralph Losey. Published December 5, 2023.

Advanced intelligence from AI should be embraced, not feared. We should speed up AI development, not slow it down. We should move fast and fix things while we still can. Fly Icarus, fly! Your Dad was wrong.

Plato’s Allegory of the Cave and the Mere Shadow Story of the Traditional Icarus Myth

Plato rejected the old myths and religion of ancient Greece, including that of Daedalus and Icarus, to embrace reason and science. Ironically, this myth is now relied upon by contemporary scientists like Max Tegmark as propaganda to try to stop AI development. Icarus supposedly perished by using the wings invented by his father, Daedalus, when he tried to fly to the sun. In this discouraging tale, Icarus did not make it. The myth is about a son’s supposed hubris in ignoring his father’s warning not to fly so high. The reliance today on this myth to instill fear of great progress is misplaced. Here I present an alternative ending in accord with Plato, where the father is encouraging and the son makes it to the sun. In my rewrite, Daedalus’ invention succeeds beyond his wildest dreams. Icarus bravely flies to the sun and succeeds. He attains superintelligence and safely returns home, transformed, well beyond the low-IQ cave.

This alternative is inspired by Plato and his Allegory of the Cave, where he prompts Socrates to chat about a prisoner stuck his whole life in a cave. In this cave everyone mistakes for reality the shadows cast on the wall by a small fire. The cave in my mixed retelling represents limited human intelligence, unaugmented by AI superintelligence. Eventually, one person, here Icarus, is able to escape the cave and is illuminated by the light of the Sun. He attains freedom and gains previously unimaginable insights into reality. He links with superintelligence. It is bravery, not hubris, to seek the highest goals of intellectual freedom.

The illustrations here express this theme in several artistic styles, primarily classical, impressionistic, digital and surrealistic. They were created using my GPT plugin, Visual Muse.

Image of successful Icarus in combined digital impressionistic style using Visual Muse.

The myth of Icarus, in which the wings melt and he dies in his quest, is a fear-based story meant to scare children into obedience. The myth is ancient propaganda to maintain control and preserve the status quo, to con people into being satisfied with what they have and to seek nothing better. It is disturbing to see the otherwise brilliant MIT scientist Max Tegmark invoke this myth to conclude his recent TED Talk. His speech tries to persuade people to fear superintelligent AI and support slowing down AI development, lest it kill us all! Tegmark preaches contentment with the AI we already have, insisting that we must stop now and not keep going to the sun of AGI and beyond. He speaks from his limited shadow knowledge as a frightened father of the AI Age. Relax, Max, your children will make the journey no matter what you say. Youth is bold. Have confidence in the new AI you helped to invent.

Excerpt from How to Keep AI Under Control, Max Tegmark, TED Talk at 11:39-12:03

Like many others, I say we must keep going. After millennia of efforts and trust in reason, we must not lose our nerve now. We must fly all the way to the sun and return enlightened.

The reliance today on the failed-invention myth of Icarus is misplaced. We should not stoke public fear of the unknown to prevent change. These arguments at the end of the careers of otherwise genius scientists like Max Tegmark are unworthy of them. They should remember the inspiration of their youth, when they boldly began to promote the wings of superintelligence.

Sadly, Geoffrey Hinton, the great academic who first invented the wings of generative AI, has also turned back on the brink of success. In 2023, as his wings finally took flight, he stopped work, left his position at Google, and assumed the role of Cassandra. Since the summer of 2023 he speaks only of doom and gloom if construction of his wings is completed. See e.g. “Godfather of AI” Geoffrey Hinton: The 60 Minutes Interview.

Neither of these genius scientists seems to grasp the practical urgency of the world’s present needs. We cannot afford to wait. Civilization is falling and the environment is failing. We must move fast and fix things.

Plato was right to reject these fear-based myths and instead encourage progress and the brave journey to the bright light of reason. There is far more to fear from misguided human intelligence in the present than from any superintelligence in the future.

Plato and Socrates teach us to embrace intelligence, to embrace the light, not fear it. Plato’s Allegory of the Cave is the cornerstone of Western Civilization, the culture that led to the inventions of AI. Plato teaches that:

  • Superstitious myths like Daedalus and Icarus are just the shadows on the cave wall.
  • We should reject the old gods of fear and embrace reason and dialogue instead. (Socrates was killed for that assertion.)
  • It is bravery, not hubris, to seek escape from the cave of dimwitted cultural consensus.
  • Human intelligence is but a dim firelight, and for that reason, our beliefs of reality, such as belief in “Terminator AIs,” are mere shadows on the wall.

Plato urged humans to escape their prison of limited intelligence and boldly leave the cave, to discover the Sun outside, to embrace superintelligence. See e.g. The Connection Between Plato’s Cave Allegory and Electronic Discovery Law.

Leaving Plato’s cave of limited, unaugmented human intelligence. Digital futuristic style image using Visual Muse.
Combined digital futurism and surrealistic fantasy style image of Plato’s Cave using Visual Muse.

The path of reason is open to all who grasp the clear and present dangers of the status quo, of continued life in the cave without the light of AGI. We should follow the guidance of Plato and Socrates, not the fearful shadow myth of Daedalus and Icarus. We should fly to the sun and embrace superintelligence, not shy away from it in fear. We should boldly go where no Man has gone before, find superintelligence, use it, merge with it and become one with the Sun. It will not burn; it will enlighten.

The guiding light of superintelligence is represented by the Sun in digital futurism style using Visual Muse.

Then, following Plato’s allegory, we will return to the cave, still one with AGI, and speak with those imprisoned within, those blinded by their own human limitations. We will return to try to help them escape, to free themselves from shadow-based fears and drudgery, to see the light and link with super AI. We will return with hybrid AGI to help free mankind, not kill everyone as the shadow readers declare. They are afraid of their own shadows.

Speed Up AI Before It’s Too Late

Unfortunately, the speed-up position expressed here is currently a minority view, but there are a few brave scientists willing to speak up and support the no-fear, accelerationist position. The image of Hermes, the Greek messenger god, known for his speed and cleverness, seems appropriate to many.

Hermes running to the Sun in Digital Futurism style using Visual Muse.

The proponents of stopping or slowing down AI development are, in the opinion of many, very naive. It cannot be stopped. The militaries of the world are fearful of falling behind. Based on what I see, the fear of super AI in the wrong hands is justified. Fear the people, not the tools.

Hermes in pencil sketch style using Visual Muse.

Moreover, the world is already such a mess, especially with the ongoing environmental damage, that we have no choice but to seek the help of advanced AI to fix it. Move fast and fix things should be the new motto. The world is already broken. Adding more intelligence to the mix is likely to help, not make things worse. We need superintelligence to clean up the incredible mess created by human stupidity.

Like many others, I have sincere concerns about how we are going to survive the coming years without the help of AGI. The train to world destruction has already left the station; we have no choice but to take whatever measures are necessary to try to stop the train wreck. Future generations are depending on us. No one can figure out how to do it with the tools we have now. We need new tools of superintelligence to help us figure a way out.

Futuristic digital style image using Visual Muse of AI robots repairing environmental damage.

There are a number of other reasons why it would be a mistake to slow down now, some of which will be addressed next through the words of other scientists who agree with the keep-on-accelerating position. But before I turn to their wisdom in Part Two of this article, I must point out another fundamental error made by some of the slow-downers. They seem guilty of thinking of AI as a creature, not a tool. Not only that, but they think of it as an immoral creature which, although superintelligent, still thinks nothing of wiping out us puny humans. Oh, please. That is a fanciful misinterpretation of evolution. See e.g. The Insights of Neuroscientist Blake Richards.

AI is just a tool, not a creature! The fear mongers falsely assume that superintelligence will magically turn computers into creatures. That is so wrong. Moreover, the next thought, that the superintelligent entity we created would then want to destroy the world, or worse, do so by accident, is laughably absurd. That is how fearful humans behave, not superintelligent computers.

A final thought is a concession to the other side of the debate. There is definitely a need for some regulation of AI and AGI. No one disputes that. But regulation should not include an intentional slowdown or pause of technological development. It is impossible to do that anyway, and most regulators in the U.S. understand that. See: White House Obtains Commitments to Regulation of Generative AI from OpenAI, Amazon, Anthropic, Google, Inflection, Meta and Microsoft.

But we can pause the conclusion of this blog for a few days and so here ends Part One.

Coming next, in Part Two, I will share the work and words of several AI leaders who support the “move fast and fix things” view. In the meantime, friends, do not be put off by all the naysayers out there. Keep using AI and keep reaching for the sun.

Minimalist line art style using Visual Muse.

Ralph Losey Copyright 2023 – All Rights Reserved