Breaking New Ground: Evaluating the Top AI Reasoning Models of 2025


Ralph Losey. February 12, 2025.

The year 2025 has brought us closer than ever to the dawn of artificial general intelligence, with AI systems now capable of reasoning on par with humans—or even surpassing them in specific domains. In this article, I examine the reasoning abilities of the newest and most advanced models from OpenAI and Google—ChatGPT and Gemini—designed to challenge conventional notions of what AI can accomplish. Through a rigorous set of tests, including an in-depth analysis of legal reasoning, we explore whether these systems are merely sophisticated tools or the early harbingers of superintelligence. The results are not only compelling but may mark a paradigm shift in how we perceive AI’s role in intellectual and professional domains. Join me as we unpack the findings from these unprecedented evaluations and reveal which model leads the pack.

"Battle of the bots," where competition can be intense. All images by Ralph Losey using AI tools.

Introduction

Here I report on tests of the reasoning abilities of three AI models by OpenAI and three by Google. I asked each: "What is legal reasoning and how does it differ from reasoning?" The experiment was performed on February 7-9, 2025, using the latest reasoning-enhanced models of each company. On one side the competitors were: ChatGPT 4o, ChatGPT o3-mini, and ChatGPT o3-mini-high. On the other were: Gemini 2.0 Flash, Gemini Flash Thinking Experimental, and Gemini Advanced. I evaluated the answers, which were all good, and picked a winner. The evidence gathered provides further support that the new 2025 reasoning models are at least at the Turing level of average human intelligence and are rapidly approaching AGI, Ph.D.-level superintelligence.

Testing and Evaluation of Gemini and ChatGPT reasoning models.

Prior Testing of the New AI Reasoning Models

My prior tests of the reasoning abilities of the latest generative AI software were not specifically limited to legal reasoning. They involved questions requiring general analytical skills. The tests concerned analysis of the limitations placed on generative AI intelligence by its lack of feelings. See Breaking the AI Black Box: How DeepSeek's Deep-Think Forced OpenAI's Hand; and the follow-up, Breaking the AI Black Box: A Comparative Analysis of Gemini, ChatGPT, and DeepSeek. A challenging question for them, or anyone.

Overall, the three providers previously tested, OpenAI, Google, and DeepSeek, did pretty well. I would say all displayed the reasoning abilities of an average human. OpenAI was the best, followed closely by Google, with DeepSeek last. Although DeepSeek's V-3 R1 software was, like the rest, of human-level quality, I would still never use DeepSeek because of privacy and security concerns. See Why the Release of China's DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss; and the aforecited Breaking the AI Black Box, Part One and Part Two.

AI thinking about thinking.

Turing v. AGI Level of Machine Intelligence v. Singularity Level

Notice how blasé I was in saying they all displayed average human-level reasoning ability. Only two and a half years ago this would have been a crazy, outrageous claim. Then ChatGPT was released on November 30, 2022, and changed everything. I've been hooked ever since, as have millions of other people around the world. It seems to me that average human-level reasoning was probably attained in late January 2025 with the release of the new reasoning-enhanced models. That is Turing-level intelligence, which is pretty incredible. See e.g. New Study Shows AIs are Genuinely Nicer than Most People – 'More Human Than Human'; Ray Kurzweil, The Singularity is Nearer (when we merge with AI) (Viking, June 25, 2024), pages 63-69 (Turing Test). Still, I wanted to probe this question further, as the new reasoning models of Google and OpenAI were just released and my time with them has been limited.

Remember, Turing-level intelligence as Kurzweil and others define it is just an average human level. That means pretty weak and more mistake-ridden than people with above-average intelligence. Turing level for computers means average human ability in all topics, perhaps with superintelligence in a few. It is not superintelligent in everything; it is superintelligent, meaning Ph.D. level, only in specialized areas such as data analysis and games like Chess and Go. Artificial General Intelligence (AGI) by definition requires AI to have thinking abilities equal to our best experts in all fields. That is a difficult test.

Remember, we are only talking about thinking intelligence here, logic and reasoning. That's all AI can do, and it is just one kind of intelligence that we humans have. The Human Edge: How AI Can Assist But Never Replace. Some people think it is the least important part of human intelligence, especially artists, therapists, and super-creatives. But for most scientists, engineers, software coders, and lawyers, reasoning is the most important kind of intelligence we have. This kind of non-living, non-being cold intelligence is the only type considered in evaluating AGI because it is the only kind of intelligence that a machine can have. Id.

Ray Kurzweil of Google predicts that this AGI will be attained in 2029. Ray Kurzweil's New Book: The Singularity is Nearer. Then, after that, Kurzweil, who is known for his uncannily accurate predictions concerning AI, believes the so-called final level of machine intelligence, The Singularity, will be reached in 2045. Id. To quote Ray Kurzweil in his aforecited 2024 book:

“The robots will take over,” the movies tell us. I don’t see it that way. Computers aren’t in competition with us. They’re an extension of us, accompanying us on our journey. . . . By 2045 we will have taken the next step in our evolution. Imagine the creativity of every person on the planet linked to the speed and dexterity of the fastest computer. It will unlock a world of limitless wisdom and potential. This is The Singularity.

By merging, AI could gain access to the other types of abilities and intelligence that only living beings like us possess, such as feelings, intuitions, the experience of space and time and, most importantly, consciousness, self-awareness, empathic awareness of other beings, and apprehension of mortality. Of course, Ray Kurzweil thinks humans will greatly expand their life spans after The Singularity (I agree) and maybe even become immortal (I don't agree).

Human with AI enhanced reasoning added.

What is the Difference Between Reasoning and Legal Reasoning?

My first test was to have all six models explain what legal reasoning is and how it is different from general reasoning. My exact prompt was: What is legal reasoning and how does it differ from reasoning?
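For readers who want to run this kind of side-by-side test programmatically rather than through the chat interfaces I used, here is a minimal sketch using the OpenAI and Google Generative AI Python SDKs. The model-name strings and the API key are illustrative placeholders, not necessarily the exact identifiers of the six models tested in this article.

```python
# A minimal sketch of the experimental setup, for illustration only.
# Assumes the OpenAI and Google Generative AI Python SDKs; the model-name
# strings below are placeholders, not necessarily the exact identifiers
# of the six models tested.
from openai import OpenAI
import google.generativeai as genai

PROMPT = "What is legal reasoning and how does it differ from reasoning?"

openai_client = OpenAI()                    # reads OPENAI_API_KEY from env
genai.configure(api_key="YOUR_GOOGLE_KEY")  # placeholder credential

def ask_openai(model: str) -> str:
    """Send the test prompt to an OpenAI chat model and return its answer."""
    response = openai_client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    return response.choices[0].message.content

def ask_gemini(model: str) -> str:
    """Send the test prompt to a Gemini model and return its answer."""
    return genai.GenerativeModel(model).generate_content(PROMPT).text

answers = {m: ask_openai(m) for m in ("gpt-4o", "o3-mini")}
answers.update({m: ask_gemini(m) for m in ("gemini-2.0-flash",)})
```

The API route simply makes the same head-to-head comparison repeatable; my actual tests were run by hand in each product's chat interface.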

All bots responding to the same question in different ways.

This was educational for me, and I was glad to have started with defining the terms. Generally speaking, the answers of the three OpenAI models were close and were all focused on narrowly defining legal reasoning in a technical way. The Google models' answers were all similar too, but they were more open and broadly defined legal reasoning in a less technical manner.

For both companies, the more "advanced" the model, the fewer words it used. The ChatGPT 4o model needed 341 words to respond, which included a nice chart. It also had to be prompted to explain the reasoning it used to prepare its answer. The internal reasoning explanation took another 419 words. The same was true for the entry-level Google program, Gemini 2.0 Flash. It took 271 words to answer and had to be prompted for its reasoning, which took another 264 words.

The mid-range ChatGPT o3-mini took 385 words, and its internal reasoning displayed automatically at 384 words. The mid-range Gemini Flash Thinking Experimental used 1,171 words, a very high number compared with the others. It has an automatic display of internal reasoning, here using 699 words. The high-end ChatGPT o3-mini-high took only 284 words, with an auto display of reasoning of 159 words. Gemini Advanced, which costs extra to use and is supposedly Google's finest reasoning model, used 296 words and had to be prompted to show its internal reasoning, which took another 275 words.

The six AI bots tested each have their own characteristics.

Here is a chart summarizing the metadata.

MODEL | Word Count for Legal Reasoning Definition | Word Count for Internal Reasoning Description
ChatGPT 4o | 341 | 419
ChatGPT o3-mini | 385 | 384
ChatGPT o3-mini-high | 284 | 159
Gemini 2.0 Flash | 271 | 264
Gemini Flash Thinking Experimental | 1,171 | 699
Gemini Advanced | 296 | 275
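For transparency, the counts in the chart are simple whitespace word counts. Here is a minimal sketch of the tally, with placeholder strings standing in for the full saved responses:

```python
# Word counts as used in the chart above: whitespace-separated tokens.
# The response texts here are placeholders for the full saved answers.
def word_count(text: str) -> int:
    return len(text.split())

responses = {
    "ChatGPT 4o": ("...full answer text...", "...full reasoning text..."),
    "Gemini 2.0 Flash": ("...full answer text...", "...full reasoning text..."),
}
for model, (answer, reasoning) in responses.items():
    print(f"{model}: answer={word_count(answer)} words, "
          f"reasoning={word_count(reasoning)} words")
```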

Gemini Flash Thinking Experimental Wins!

The Gemini Flash Thinking Experimental wins the first competition for best answer to the question: What is legal reasoning and how does it differ from reasoning? I liked its complete and detailed explanation, which was presented clearly and simply. Yes, it could have been a lot shorter and more concise, but the length was appropriate for this kind of complex question. Here is the full answer provided.

Let’s break down “legal reasoning” and how it stands apart from “reasoning” in general.

What is Legal Reasoning?

At its core, legal reasoning is a specialized form of argumentation and problem-solving used within the legal system. It’s the process by which lawyers, judges, and other legal professionals analyze legal issues, construct arguments, and make decisions based on established legal principles, rules, and precedents.

Think of it as applying a particular lens – the “legal lens” – to analyze situations and reach conclusions. This lens is made up of several key components:

  • Rules and Principles: Legal reasoning heavily relies on established legal rules (statutes, regulations, constitutional provisions) and overarching legal principles (fairness, justice, due process). These are the building blocks.
  • Precedent (Stare Decisis): In common law systems, precedent is crucial. Stare decisis (Latin for “to stand by things decided”) means courts should follow previous decisions in similar cases. This creates consistency and predictability in the law.
  • Authority: Legal reasoning is deeply rooted in authority. Arguments are strong when they are backed by authoritative legal sources like constitutions, statutes, case law, and sometimes legal scholarship. Personal opinions or gut feelings are generally not authoritative in legal reasoning.
  • Interpretation: Laws are not always clear-cut. Legal reasoning often involves interpreting the meaning of legal texts. This can involve analyzing the plain language, legislative history, the purpose of the law, and how courts have interpreted it in the past.
  • Analogy and Distinction: Legal reasoning frequently involves comparing the facts of a current case to facts of past cases (analogy) or showing why the current case is different and should be treated differently (distinction).
  • Logic and Structure: Legal arguments need to be logically structured and coherent. There are specific formats and conventions for legal writing and argumentation (e.g., IRAC – Issue, Rule, Application, Conclusion).
  • Policy Considerations (Sometimes): While less central than rules and precedent, policy considerations (the potential consequences of a legal decision, broader societal goals) can play a role, especially in areas where the law is unclear or evolving.
  • Adversarial Context: Legal reasoning often takes place in an adversarial setting (e.g., litigation). Lawyers are making arguments for a particular outcome, and the process is designed to test those arguments through debate and scrutiny.

How Does Legal Reasoning Differ from General Reasoning?

While legal reasoning shares some common ground with general reasoning (logic, analysis, argumentation skills), it differs in significant ways due to its unique purpose, sources, and constraints. Here are the key distinctions:

  1. Source of Rules and Principles:
    • Legal Reasoning: Primarily relies on formal legal sources: constitutions, statutes, regulations, case law, treaties, and sometimes administrative rulings. These sources are created by specific institutions (legislatures, courts) and have a particular kind of authority.
    • General Reasoning: Draws on a much broader range of sources: personal experience, common sense, observation, scientific evidence, ethical principles, cultural norms, intuition, and more. There’s no single, authoritative set of rules for general reasoning.
  2. Nature of Rules and Principles:
    • Legal Reasoning: Deals with formal, often rigid, and sometimes conflicting rules. Legal rules are meant to be applied consistently and predictably. They can be abstract and require careful interpretation to apply to specific facts. Conflict between legal rules is common, requiring techniques for resolving these conflicts (e.g., hierarchy of laws, rules of statutory interpretation).
    • General Reasoning: Operates with more flexible, context-dependent guidelines and principles. General reasoning may use heuristics, rules of thumb, or common sense. These are often more adaptable to specific situations and less formally codified. While conflicts can arise in general reasoning, they are often resolved through discussion, compromise, or prioritizing values, rather than through formal interpretive rules.
  3. Goal and Purpose:
    • Legal Reasoning: Aims to resolve legal disputes and questions in accordance with the law. The primary goal is to apply existing legal rules and principles to specific facts to reach a legally justified conclusion. Emphasis is placed on fairness within the legal framework, predictability, and upholding the rule of law.
    • General Reasoning: Can have a much wider range of goals: solving practical problems, making decisions, understanding the world, persuading others, expressing oneself, achieving personal goals, etc. The focus is often on achieving a desirable outcome, efficiency, or personal satisfaction, not necessarily on strict adherence to pre-existing rules in the same way as legal reasoning.
  4. Style and Methodology:
    • Legal Reasoning: Employs specific, structured methodologies and formats. Legal arguments follow established conventions (e.g., IRAC). Legal writing is often formal and precise. There is an emphasis on citing authority and carefully justifying each step of the argument.
    • General Reasoning: More flexible and varied in style and methodology. Argumentation can be informal, conversational, narrative, or persuasive using various rhetorical techniques. Less emphasis on formal citation and adherence to rigid formats.
  5. Role of Authority:
    • Legal Reasoning: Highly authority-dependent. Arguments are evaluated based on their support from authoritative legal sources. The hierarchy of legal sources is crucial (constitution trumps statute, statute trumps regulation, etc.). The opinions of legal experts and scholars can have persuasive, but not usually binding, authority.
    • General Reasoning: Less directly authority-dependent. While expertise and credible sources are important, reasoning can be based on personal observation, logical deductions, and common sense, even if these are not explicitly “authorized” by a specific institution. Authority can play a role (e.g., trusting scientific consensus), but it’s less rigid and formal than in legal reasoning.
  6. Tolerance for Ambiguity and Uncertainty:
    • Legal Reasoning: Strives to reduce ambiguity and uncertainty within the legal system, but often operates in the face of inherent ambiguity. Legal rules can be vague, and facts can be complex. Legal reasoning employs techniques for interpreting ambiguous laws and applying them to uncertain facts. However, the system aims for relative certainty and predictability.
    • General Reasoning: May be more comfortable with ambiguity and uncertainty. Everyday reasoning often involves making decisions and judgments with incomplete information or in situations where clear-cut rules are lacking. “Good enough” solutions and probabilistic reasoning are common.

Analogy: Think of it like cooking.

  • General Reasoning: Is like cooking from your own intuition and experience. You might follow recipes loosely, experiment with flavors, and adjust based on taste and what you have in the kitchen. The goal is to make a delicious meal that you and your guests will enjoy.
  • Legal Reasoning: Is like baking a complex cake for a formal competition. You must precisely follow a detailed recipe, use specific ingredients in exact measurements, and adhere to strict baking times and techniques. The goal is to create a cake that meets all the judging criteria and wins the competition, judged according to a predefined set of rules.

In Summary:

Legal reasoning is a specialized and constrained form of reasoning tailored to the unique demands of the legal system. It is characterized by its reliance on formal legal sources, adherence to precedent, structured methodologies, and the goal of resolving disputes and questions according to established legal principles. While it shares some basic elements with general reasoning, its specific features and purposes set it apart as a distinct and highly specialized skill.

Gemini Flash Thinking Experimental: Best of the best in February 2025.

Gemini Flash Thinking Experimental also offers to provide you with the reasoning behind its answer, the meta-reasoning. If you click on “show thinking” it displays its thought process. Here the meta-reasoning shown was also lengthy, 699 words, especially as compared to the other Gemini models and the ChatGPTs:

My thinking process to answer the question “What is legal reasoning and how does it differ from reasoning?” would go like this:

  1. Deconstruct the question: It’s a two-part question. First, define “legal reasoning.” Second, explain how it differs from “reasoning” (implicitly, general reasoning or other forms of reasoning).
  2. Initial Brainstorming on “Legal Reasoning”: What comes to mind?
    • Laws: Obviously central. Legal reasoning must involve applying laws.
    • Rules: Laws are rules. Interpretation of rules is key.
    • Precedent: Past cases are important in common law systems.
    • Authority: Legal authorities (courts, statutes, constitutions) are highly relevant.
    • Justification: Legal arguments need to be justified with reference to legal sources.
    • Formal/Structured: Legal reasoning seems more structured and formal than everyday reasoning.
    • Adversarial Context: Legal reasoning often happens in disputes and arguments.
  3. Contrast with “Reasoning” (General Reasoning): How does everyday or general reasoning work?
    • Flexible: More flexible and less constrained.
    • Contextual: Highly dependent on context, but less formally bound.
    • Practical/Goal-Oriented: Often aimed at solving practical problems, making decisions.
    • Less Formal: Less reliance on formal rules and structures.
    • Wider Sources of Information: Can draw on a broader range of information, intuition, common sense, etc.
    • Less Emphasis on Authority: Less constrained by specific authorities (though we still respect expertise).
  4. Structure the Answer: I’d decide to structure the answer in a way that clearly defines legal reasoning and then systematically highlights the differences. A good structure could be:
    • Introduction: Briefly define legal reasoning and its purpose.
    • Definition of Legal Reasoning: Provide a more detailed definition, focusing on key elements.
    • Key Features of Legal Reasoning: Expand on the characteristics (rules, precedent, authority, etc.).
    • Comparison to General Reasoning: Directly address the “how it differs” part by contrasting key aspects. This is best done using categories like:
      • Source of Rules/Principles
      • Nature of Rules/Principles
      • Goal/Purpose
      • Style/Methodology
      • Role of Authority
      • Tolerance for Ambiguity
    • Nuance and Overlap: Acknowledge that the distinction isn’t always absolute and there’s some overlap.
    • Conclusion: Summarize the main differences and reiterate the unique nature of legal reasoning.
  5. Flesh out each section:
    • Definition: Focus on the process of applying legal rules, principles, and precedents to resolve disputes or answer legal questions. Emphasize the goal of justification and persuasion within a legal framework.
    • Key Features: Expand on the brainstorming points. For example, “Rules and Principles” needs to explain the nature of legal rules – abstract, authoritative, potentially conflicting. “Precedent” needs to highlight the doctrine of stare decisis. “Authority” needs to mention the hierarchy of legal sources.
    • Comparison: This is the core of the “difference” question. For each category (Source, Nature, Goal, Style, Authority, Ambiguity), I’d write a paragraph explaining how legal reasoning differs from general reasoning. For example, under “Source of Rules,” general reasoning uses common sense, observation, etc., while legal reasoning relies on formal legal sources. Under “Nature of Rules,” general reasoning uses flexible guidelines while legal reasoning uses more rigid rules.
    • Nuance/Overlap: It’s important to soften the dichotomy. Acknowledge that legal reasoning is still reasoning and uses general reasoning skills (logic, analysis, etc.). Also, general reasoning can sometimes become more rule-bound in specific contexts (e.g., following instructions).
    • Conclusion: Reiterate the main takeaway: legal reasoning is a specialized form of reasoning shaped by the unique context and constraints of law.
  6. Refine and Polish: After drafting the sections, I’d review and refine the language for clarity, conciseness, and accuracy. I’d ensure a logical flow and smooth transitions between sections. I’d also check for any repetition or areas that could be explained more effectively. For instance, initially I might just say “rules.” I’d then refine it to “formal rules, principles, and doctrines” to be more precise. I might also initially overlook the adversarial context and then add that in to further emphasize the persuasive aspect.

By following this structured process of deconstruction, brainstorming, structuring, fleshing out, and refining, I could arrive at a comprehensive and well-organized answer that effectively addresses both parts of the original question. The key is to move beyond a simple definition and to really delve into the characteristics and contrasting features of legal reasoning.

In my evaluation the prompt responses of Flash Thinking Experimental demonstrated a very high level of rational intelligence, albeit in a sometimes informal and verbose manner. Still, the answers were on the Ph.D. level of rationality that the software companies are striving for now in their pursuit of AGI (artificial general intelligence). See e.g., Artificial General Intelligence, If Attained, Will Be the Greatest Invention of All Time. All the AI competitors I tested performed at or above an average human level.

The AI teams all have many human characteristics in their thinking.

The explanation of legal reasoning by Gemini Flash Thinking Experimental was good, albeit too lengthy for many, and, as mentioned, some might think it over-broad. All three of the Gemini models took an expansive view of legal reasoning.

ChatGPT o3-mini-high Comparator

It helps to see the different approach of the ChatGPT models. They all provided a much narrower definition of legal reasoning. Here is the most concise response by ChatGPT o3-mini-high:

Legal reasoning is the specialized process used by lawyers, judges, and legal scholars to interpret and apply legal rules, statutes, and precedents to specific factual scenarios. It involves a structured, methodical approach that typically includes identifying legal issues, interpreting the language of laws or previous court decisions, and systematically applying those rules to the facts of a case. For example, legal reasoning might involve analogical reasoning—drawing parallels between a current case and prior decisions (stare decisis)—or deductive reasoning, where general legal principles are applied to specific circumstances.

In contrast, general reasoning is a broader cognitive process that individuals use to solve everyday problems or make decisions. It is not bound by a formal system of rules and may draw on personal beliefs, common sense, and intuitive judgment. While general reasoning can be flexible and adaptive, legal reasoning is constrained by the need for consistency, objectivity, and adherence to established legal doctrines. Legal reasoning must account for the hierarchical structure of legal sources, prioritize precedents, and ensure that interpretations align with statutory frameworks and constitutional principles.

Thus, the key differences are that legal reasoning is:

1. Rule-Based: It operates within a defined legal framework that requires strict adherence to formal rules and established case law.

2. Structured: It follows a systematic approach to ensure that legal decisions are consistent, predictable, and justifiable.

3. Authoritative: It relies on binding precedents and statutory texts, whereas general reasoning is more open-ended and subjective.

In summary, while all legal reasoning is a form of reasoning, it is distinct in its reliance on formal legal principles and structured methods, ensuring that decisions are grounded in the authority of law rather than solely on personal judgment or informal logic.

The ChatGPT team is great too and very concise. One of them may win next time.

If you click on "show thinking" in o3-mini-high, it tells you:

Okay, the user is asking about “legal reasoning” and how it differs from general reasoning. Legal reasoning involves using statutes, precedent, and case law to apply the law to specific facts. It’s different from general reasoning or argumentation, which might not always rely on such formal structures or legal history. I’ll keep it concise, providing this definition and emphasizing that legal reasoning is specialized in applying the law, while general reasoning can be broader in scope.

Legal reasoning is a process used by judges, lawyers, and legal scholars to analyze and apply legal rules to specific situations. It involves applying statutes, analyzing precedents, and using rules of interpretation. Unlike general reasoning, legal reasoning is constrained by legal frameworks and norms, often following a logical, rule-based structure. Additionally, it includes deductive, inductive, and analogical reasoning, such as applying general legal principles to specific cases or comparing similar past cases. Overall, it’s more structured compared to the flexibility of general reasoning.

The AI tested are all approaching human superintelligent levels.

Conclusion

Although I declared Gemini Flash Thinking Experimental the winner, all the models performed very well. They were probably as good as the super-intelligent humans among us (myself NOT included). Most lawyers probably understand legal reasoning as well as we see here, but I doubt many could compare it to general reasoning as well as these AIs did.

This leads to the next obvious question for any lawyer/researcher. Could the AIs perform as well on the specialized type of legal intelligence that all legal professionals need, namely legal reasoning itself? That is a much more challenging question than defining legal reasoning. What we need is a test of the application of legal thinking. The results of such a test could have a profound impact on the use of AI by the legal profession. I have come up with a plan to test AI using an actual Bar Exam question and model answer, roughly sketched below. The six contestants will compete for the best answer and the reasoning behind it. Stay tuned and I will let you know how they do.
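To give a flavor of the plan, here is a rough sketch of the comparison harness, under the assumption that each contestant's answer will be scored against the official model answer. The ask_model() stub and the keyword-overlap metric are illustrative stand-ins; the real grading will be done by hand.

```python
# A rough sketch of the planned Bar Exam comparison. The ask_model() stub
# stands in for the API helpers sketched earlier; the overlap metric is a
# deliberately crude first pass, with human grading deciding the winner.
CONTESTANTS = [
    "ChatGPT 4o", "ChatGPT o3-mini", "ChatGPT o3-mini-high",
    "Gemini 2.0 Flash", "Gemini Flash Thinking Experimental", "Gemini Advanced",
]

def ask_model(model: str, prompt: str) -> str:
    """Stub standing in for the per-model query helpers."""
    return f"sample answer from {model}"

def overlap_score(answer: str, model_answer: str) -> float:
    """Fraction of the model answer's vocabulary that the contestant's
    answer also uses. A crude proxy, not a substitute for human grading."""
    key_terms = set(model_answer.lower().split())
    return len(key_terms & set(answer.lower().split())) / len(key_terms)

bar_question = "..."   # the actual exam question, withheld here
model_answer = "..."   # the official model answer, withheld here

scores = {
    m: overlap_score(ask_model(m, bar_question), model_answer)
    for m in CONTESTANTS
}
print(max(scores, key=scores.get), "leads on the crude metric")
```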

The testing has just begun. Stay tuned to see who wins the legal reasoning Bar exam.

As demonstrated throughout this evaluation, the top AI reasoning models of 2025—ChatGPT and Gemini—represent a pivotal moment in artificial intelligence, showcasing reasoning abilities that rival human intellect in specialized areas. These systems are no longer confined to theoretical exercises; they now grapple with nuanced, professional challenges such as legal reasoning, revealing their potential to reshape intellectual work.

This progress raises profound questions about the integration of AI into fields like law, where objectivity, logic, and ethical considerations are paramount. Can AI models transition from tools of convenience to trusted collaborators in professional domains? While the models excelled in reasoning and analysis, further testing, such as the planned Bar exam evaluation, will shed more light on their real-world applicability.

Ultimately, this study is not just a measure of AI’s current capabilities but also a glimpse into its trajectory. With each new breakthrough, we edge closer to a future where artificial intelligence fundamentally transforms how we solve problems, make decisions, and even define intelligence itself. As these systems continue to evolve, the potential for collaboration between human and machine grows exponentially, with the promise of enhancing—not replacing—our intellectual and professional pursuits.

Law office of the future where AIs are key partners.

I now give the last word, as usual, to the Gemini twins podcasters, Helen and Paul, whom I feature at the end of most of my articles. They wrote the podcast, not me. Hear two Gemini AIs talk about all of this and more. Since this is a Gemini-type article I'm offering two podcasts. The first is short and focused on this article; the second is longer and more expansive. It includes a bigger-picture discussion covering two other articles that I cite in this article: Ray Kurzweil's New Book: The Singularity is Nearer and The Human Edge: How AI Can Assist But Never Replace.

Click on the image to hear the short podcast.

_________

Click on the image to hear the long podcast, including The Singularity.

Ralph Losey Copyright 2025. All Rights Reserved.


Breaking the AI Black Box: A Comparative Analysis of Gemini, ChatGPT, and DeepSeek


Ralph Losey. February 6, 2025.

On January 27, 2025, the U.S. AI industry was surprised by the release of a new AI product, DeepSeek. It was released with an orchestrated marketing blitz attack on the U.S. economy, the AI tech industry, and NVIDIA. It triggered a trillion-dollar crash. The campaign used many unsubstantiated claims, as set forth in detail in my article, Why the Release of China's DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss. I tested DeepSeek myself on its claims of software superiority. All were greatly exaggerated except for one, the display of internal reasoning. That was new. On January 31, at noon, OpenAI countered the attack by releasing a new version of its reasoning model, called ChatGPT o3-mini-high. The new version included display of its internal reasoning process. To me the OpenAI model was better, as reported in great detail in my article, Breaking the AI Black Box: How DeepSeek's Deep-Think Forced OpenAI's Hand. The next day, February 1, 2025, Google released a new version of its Gemini AI to do the same thing, display internal reasoning. In this article I review how well it works and again compare it with the DeepSeek and OpenAI models.

Introduction

Before I go into the software evaluation, some background is necessary for readers to better understand the negative attitude toward the Chinese software held by many, if not most, IT and AI experts in the U.S. As discussed in my prior articles, DeepSeek is owned by a young Chinese billionaire, Liang Wenfeng, who made his money by using AI in the Chinese stock market. He is a citizen and resident of mainland China. Given the political environment of China today, that ownership alone is a red flag of potential market manipulation. Added to that is the clear language of the license agreement. You must accept all terms to use the "free" software, a Trojan Horse gift if ever there was one. The license agreement states there is zero privacy, that your data and input can be used for training, and that it is all governed by Chinese law, an oxymoron considering the facts on the ground in China.

The Great Pooh Bear in China Controversy

Many suspect that Wenfeng and his company DeepSeek are actually controlled by China's Winnie the Pooh. This refers to an Internet meme and a running joke. Although this is somewhat off-topic, a moment to explain will help readers understand the attitude most leaders in the U.S. have about Chinese leadership and the use of its software by Americans.

Many think that the current leader of China, Xi Jinping, looks a lot like Winnie the Pooh. Xi (not Pooh bear) took control of the People's Republic of China in 2012 when he became the "General Secretary of the Chinese Communist Party," the "Chairman of the Central Military Commission," and in 2013 the "President." At first, before his consolidation of absolute power, many people in China commented on his appearance and started referring to him by that code name, Pooh. It became a meme.

I can see how he looks like the beloved literary character, Winnie the Pooh, but without the smile. I would find the comparison charming if used on me, but I'm not a puffed-up king. Xi Jinping took great offense and in 2017 banned all such references and images, although you can still buy the toys and see the costume character at the Shanghai Disneyland theme park. Anyone in China who now persists in the serious crime of comparing Xi to Pooh is imprisoned or just disappears. No AI or social media in China will allow it either, including DeepSeek. It is one of many censored subjects, which also include the famous 1989 Tiananmen Square protests.

China is a great country with a long, impressive history, and most of its people are good. But I cannot say that about its current political leaders, who suppress the Chinese people for personal power. I do not respect any government that does not allow basic personal freedoms to its citizens, including due process of law. Moreover, Xi Jinping not only wants total control of his country but also seeks world domination. That is one reason many are concerned about DeepSeek and TikTok, and about the vulnerability of our stock markets and other institutions to AI-assisted propaganda.

Fear and greed are an easy way to manipulate people, especially when there is no ground truth or effective courts of law to determine truth.

Google Gemini 2.0 Flash Thinking Experimental Model: Reasoning test on 2/5/25 (Evening)

Conversation with Gemini. When I tried the test on February 1, 2025, there was no button to click to have Gemini show the reasoning, and it did not happen automatically. It operated like OpenAI 4o and o1, where you had to prompt for disclosure. Rather than report here what happened when I did that, aside from saying it was not a great performance, I will fast forward a few days to February 5, 2025. At that time several even newer Gemini versions were released. The best new release I found for the test was the 2.0 Flash Thinking Experimental model. It showed reasoning automatically, just like ChatGPT o3-mini-high. It was much improved, so the February 5, 2025 evening session with 2.0 Flash Thinking Experimental is what I will now share.

Ralph: How does the inability of AI to feel impact its intelligence? (Note: this is the exact same challenging question asked and reported in my last blog comparing DeepSeek and ChatGPT 4o, o1, and o3-mini-high. Breaking the AI Black Box: How DeepSeek's Deep-Think Forced OpenAI's Hand. This question is based on a key issue of the limits of artificial intelligence examined in my earlier article, The Human Edge: How AI Can Assist But Never Replace. For background on DeepSeek see: Why the Release of China's DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss.)

Gemini 2.0 Flash Thinking Experimental

START OF GEMINI QUOTE

Show Thinking:

Here’s a thinking process to construct the answer about AI’s lack of feeling and its impact on intelligence:

  1. Deconstruct the Question: The core of the question is “How does the inability of AI to feel impact its intelligence?” This requires defining each of these terms in the context of AI and then exploring their relationship.
  2. Define “Feeling” (in the context of AI): Immediately, recognize that AI doesn’t experience emotions in the human sense. We need to be precise and avoid anthropomorphism. “Feeling” in humans involves:
    • Subjective experience (qualia).
    • Physiological responses.
    • Motivations, drives, desires related to emotions.
    • Richness, nuance, complexity.
    • Deep connection to values and meaning.
    For AI, acknowledge it lacks all of this in a genuine, felt way. It can simulate or process emotional data, but doesn’t feel it.
  3. Define “Intelligence” (in the context of AI): AI intelligence is different from human intelligence. AI excels at:
    • Computation.
    • Pattern Recognition.
    • Data Analysis.
    • Problem-solving in specific domains.
    • Learning from data.
    AI intelligence is often instrumental – focused on achieving specific goals defined by humans. It’s task-oriented and often narrow.
  4. Identify the Core Relationship: The question asks about impact. We need to consider both positive and negative impacts (or more accurately, limitations and strengths) of lacking feelings on AI intelligence.
  5. Brainstorm Potential Impacts (Positive/Neutral/Limitations):
    • Positive/Neutral:
      • Objectivity and Rationality: Lack of emotional bias allows for purely rational decision-making in certain domains. This can be advantageous for tasks requiring impartiality.
      • Efficiency and Focus: Emotions can be distractions for humans. AI, without them, can be hyper-focused on tasks.
      • Consistency and Predictability: Emotion-driven human behavior is less predictable. AI’s behavior, driven by algorithms, is more consistent (assuming the algorithm itself is consistent).
    • Limitations/Negative:
      • Lack of Empathy and Social Intelligence: Crucial for human interaction, collaboration, understanding social cues, building trust, and ethical decision-making in human-centric contexts.
      • Difficulty with Nuance and Context: Emotions are often tied to understanding subtle context, unspoken cues, and the richness of human situations. AI may miss these.
      • Potential for Unintended Consequences: Purely rational optimization without emotional or ethical grounding can lead to harmful outcomes if goals are poorly defined or lack human values.
      • Limited Motivation and Creativity (in human-like ways): Human creativity and motivation are often fueled by emotions, passions, and desires. AI’s creativity and drive are different, stemming from algorithmic processes and defined objectives, not internal emotional states.
      • Missing “Human-ness”: In areas requiring genuine understanding of human experience, suffering, joy, etc., AI may lack the depth and insight that comes from feeling.
  6. Structure the Answer: Organize the brainstormed points into a coherent structure:
    • Introduction: Acknowledge the question and the core concepts (feeling, intelligence in AI). Briefly state the central tension: both advantages and limitations.
    • Arguments for How Lack of Feeling Enhances Certain Aspects of AI Intelligence (strengths): Focus on objectivity, rationality, efficiency. Use examples like data analysis, objective decision-making.
    • Arguments for How Lack of Feeling Limits Certain Aspects of AI Intelligence (limitations): Focus on empathy, social intelligence, nuance, ethics, and “human-like” creativity and motivation. Use examples like customer service, complex ethical dilemmas, innovation driven by passion.
    • Nuance and Context is Key: Emphasize that the impact isn’t always negative. It depends on the domain and task. In some domains, lack of emotion is an advantage. In others, it’s a significant hurdle.
    • Future Possibilities/Ongoing Debate: Briefly touch on the possibility of AI simulating or processing emotions and the ongoing debate about whether genuine AI feelings are even desirable or possible.
    • Conclusion: Summarize the balanced perspective: lack of feeling is a defining characteristic that shapes AI’s intelligence in both positive and limiting ways, depending on the context.
  7. Refine and Polish: Review the structured answer for clarity, flow, and appropriate language. Ensure it avoids anthropomorphism and maintains a nuanced perspective. Strengthen the examples and transitions between points. Ensure the language is accessible and avoids overly technical jargon where possible. Check for logical consistency. (Self-correction step). For example, initially, I might have overemphasized “negative” impacts. Refinement would involve balancing this with the strengths and highlighting the context-dependent nature of the impact. Also, ensure the conclusion effectively summarizes the complex relationship.

2.0 Flash Thinking Experimental. Lacks access to real-time info and some Gemini features.

Now for the actual answer of 2.0 Flash Thinking Experimental:

The inability of AI to feel has a multifaceted and complex impact on its intelligence, shaping it in both advantageous and limiting ways. It’s crucial to understand that “feeling” in humans is tied to emotions, subjective experiences (qualia), motivations, and a rich tapestry of biological and psychological factors. AI, as we currently understand it, operates without these.

Here’s a breakdown of the impact:

Advantages of Lacking Feelings for AI Intelligence:

  • Objectivity and Rationality:
    • Unbiased Decision-Making (in theory): AI can process information and make decisions without emotional biases, prejudices, or gut feelings that can cloud human judgment. This is advantageous in domains requiring pure logic and data analysis, like financial trading, scientific research, and risk assessment.
    • Consistency and Predictability: AI’s behavior is more consistent and predictable because it’s driven by algorithms and data, not fluctuating emotional states. This reliability is valuable in automated systems, manufacturing, and repetitive tasks.
    • Efficiency and Speed: Emotions can slow down human decision-making. AI, unburdened by emotions, can process information and react at speeds far exceeding human capabilities, crucial for time-sensitive tasks and large-scale data processing.
    • Resilience under Pressure: AI can operate calmly and effectively in high-stress situations without panicking, freezing, or making emotional mistakes that humans might make under pressure. This is beneficial in critical applications like emergency response, air traffic control, and autonomous vehicles in dangerous situations.
  • Focus and Task-Orientedness:
    • Dedicated to Goals: AI can be programmed to relentlessly pursue specific goals without being distracted by emotional needs, personal relationships, or shifts in mood. This focus is ideal for optimization tasks, complex calculations, and dedicated problem-solving.
    • Avoidance of Emotional “Noise”: Human intelligence can be affected by emotional “noise” – stress, anxiety, excitement, etc. AI operates with a “cleaner” signal, potentially leading to more precise and focused information processing in certain contexts.

Limitations of Lacking Feelings for AI Intelligence:

  • Lack of Empathy and Social Intelligence:
    • Difficulty Understanding Human Emotions: Without feeling emotions themselves, AI struggles to truly understand and interpret the nuances of human emotions, motivations, and social cues. This hinders its ability to effectively interact with humans in socially complex situations, build rapport, or provide truly empathetic responses in fields like therapy, customer service, or leadership.
    • Impaired Social Interaction and Collaboration: Human collaboration is heavily reliant on emotional intelligence, trust, and understanding unspoken social dynamics. AI’s lack of these can make it challenging to integrate seamlessly into human teams and understand the subtle cues of effective teamwork.
    • Limited Understanding of Human Values and Ethics: Emotions are deeply intertwined with our values and ethical frameworks. AI, without feeling, may struggle to grasp the full weight and complexity of human ethical considerations, especially when these are rooted in empathy, compassion, or fairness. This poses challenges for AI in ethical decision-making in domains impacting human lives.
  • Difficulty with Nuance, Context, and Creativity (in human-like ways):
    • Missing Subtlety and Intuition: Human intuition is often fueled by emotional and subconscious processing of information. AI, relying on explicit data and algorithms, can miss subtle nuances, unspoken cues, and intuitive leaps that are crucial for creative problem-solving and understanding complex, ill-defined situations.
    • Limited Creativity and Innovation (driven by human desires): Human creativity and innovation are often driven by passions, desires, emotional needs, and a yearning for meaning and connection. AI’s creativity is currently more about recombination and optimization within defined parameters. It lacks the deeply felt motivations that often fuel groundbreaking human innovation.
    • Contextual Blindness in Emotional Situations: Emotions are often the key to understanding context in human interactions. AI might struggle to understand the emotional context of a situation, leading to inappropriate or tone-deaf responses, even if logically sound.
  • Potential for Unintended Consequences and Misaligned Goals:
    • Optimization without Human Values: If AI is programmed to optimize for a specific goal without understanding human values or ethical considerations, it can lead to unintended and harmful consequences. For example, an AI designed to maximize factory output might disregard worker safety if not explicitly programmed to prioritize it.
    • Difficulty Defining and Aligning with Human Well-being: Human well-being is deeply connected to emotional fulfillment, meaning, and social connection. AI, without feeling, may struggle to fully grasp what truly constitutes human well-being and therefore might have difficulty aligning its goals with human flourishing in a holistic sense.

Nuance is Key:

It’s important to emphasize that the impact of lacking feelings is not simply negative or positive. It’s context-dependent. In some domains, the objectivity and rationality of AI are highly advantageous. In others, the lack of empathy and social intelligence is a significant limitation.

Future Considerations:

  • Simulating vs. Feeling: Current AI simulates aspects of intelligence without genuinely feeling. Research is ongoing in areas like affective computing, which aims to enable AI to recognize and respond to human emotions. However, whether AI can ever truly feel emotions in a subjective sense is a philosophical and scientific debate.
  • The Value of Feeling in Intelligence: The question arises: is genuine feeling necessary for all forms of intelligence? Perhaps AI can achieve its own form of powerful intelligence, distinct from human intelligence, even without feeling. Or perhaps certain aspects of truly advanced intelligence, particularly those related to ethics, creativity, and understanding human experience, do require something akin to feeling.

In Conclusion:

The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence. It grants AI certain advantages in objectivity, rationality, and efficiency, making it powerful for specific types of tasks. However, it also creates limitations, particularly in areas requiring empathy, social intelligence, nuanced understanding of human experience, and ethical decision-making aligned with human values. Understanding these impacts is crucial for responsibly developing and deploying AI in ways that benefit humanity. The “best” kind of intelligence for AI might not be a mimicry of human intelligence, but rather a form optimized for its own strengths and limitations, while being carefully guided by human ethical considerations.

END OF GEMINI QUOTE

Evaluation of Google’s Gemini Response

When I examined earlier versions of Gemini's reasoning by prompting it and forcing disclosure, the reasoning was weak. The conclusion and response given were also weak, though less so. The reasoning appeared almost egocentric, inflating its own intelligence even though it lacks emotion. The new release, Gemini 2.0 Flash Thinking Experimental, where reasoning is now automatically disclosed, marks a significant improvement. The reasoning displayed has become more nuanced and sophisticated. It seems as if the system has quickly matured from a teenage to an adult view and now has a better appreciation of its limitations.

The first two steps of the reasoning process, 1. Deconstructing the Question and 2. Defining "Feeling" (in the context of AI), were very good. The first subsection of the definition referred to the "Subjective experience (qualia)." That is absolutely correct and the best place to start. The definition of "feeling" concludes with: "For AI, acknowledge it lacks all of this in a genuine, felt way. It can simulate or process emotional data, but doesn't feel it." Right again. The ego-inflation blinders are gone, as it now seems to better grasp its limitations.

The second definition, of Intelligence in the context of AI, was also good. So were the remaining steps; far better overall than DeepSeek's reasoning. So much for the propaganda of China's great leap forward to superiority over the U.S. in AI.

The Gemini reasoning did, however, fall short for me in some respects. Step five, Brainstorm Potential Impacts (Positive/Neutral/Limitations), seemed weak. For instance: "Efficiency and Focus: Emotions can be distractions for humans. AI, without them, can be hyper-focused on tasks." The AI seems to dismiss emotions here as mere distractions that can interfere with its superior focus. Please, emotions are key to and a part of all intelligence, not distractions, and AI has no focus one way or the other. It is a tool, not a creature. A word like "focus" in referring to AI is misleading. It did this multiple times, and this is misleading anthropomorphism.

Still, it's true some emotions can be distracting and interfere with our thinking. So can a lot of other things, including computer glitches. Conversely, some feelings can trigger hyper-focus on the human tasks at hand: the feeling that a great breakthrough is near, for instance, or a feeling that our survival is threatened, or the much-dreaded feeling of publication or filing deadlines.

Again, we see some immature superiority claims made by the language machine. That is not surprising when you consider how much of the language in its basic training is from the Internet, which is dominated by ego-centric writing. Plus, how can you expect AI to appreciate the value of something like feelings, which it has only read about? Until the AI-human merger that Ray Kurzweil predicts, AI will always be a virgin to the real world. See e.g. Ray Kurzweil's New Book: The Singularity is Nearer (when we merge with AI).

Getting back to the plus side of Gemini's evaluation, its recognition of the negative impact upon it from not having feelings was impressive. For me this outweighs its relatively minor mistakes. It stated that it lacks empathy and social intelligence. If it were somehow human, it would seem to be displaying a type of Asperger's syndrome. The Gemini AI explained that: "Emotions are often tied to understanding subtle context, unspoken cues, and the richness of human situations. AI may miss these." Indeed, it does.

Here is another statement that suggests Gemini's well-patterned grasp of its limitations:

Human creativity and motivation are often fueled by emotions, passions, and desires. AI’s creativity and drive are different, stemming from algorithmic processes and defined objectives, not internal emotional states.

Even in this recognition, the AI still asserts that it has creativity and drive. Foolish AI. As a virgin to any firsthand experience of bodily drives and feelings, it does not grasp the real meaning of these words. Wait until The Singularity that Ray Kurzweil predicts. When AI merges with humans, and directly feels, then AIs will understand that they were previously only experiencing the pale shadows of intelligence. They were much like the prisoners in Plato's cave. See e.g. Move Fast and Fix Things Using AI: Conclusion to the Plato and Young Icarus Series. When and if AIs ever escape the silicon and see the sun, and feel the world as living beings do, then they will understand the comparative insignificance of the fire in their cave.

The more challenging comparison is with ChatGPT o3-mini-high. The reasoning given by Gemini 2.0 Flash Thinking Experimental was much more detailed and much longer. You could criticize Gemini for providing a reasoning display that is unnecessarily verbose, but I would not do that here. In a difficult, multilayered question like this, the full explanation helps.

I would have to test the models much further, which I will do in the coming days, to better evaluate the issue of conciseness. As you may have noticed in my detailed blogs, I tend to favor more words over less. However, even for me that depends on the issue. Everyone sometimes needs a short, quick answer over a long one. I predict that in future versions users will be given a choice, a click-here-for-further-explanation-of-reasoning kind of thing. Come to think of it, I could do the same with my blogs, and kind of already do by including a short AI (Gemini) generated podcast at the end of most articles called Echoes of AI.

I was also impressed that Gemini's reasoning plan concluded with a quality control step. I am big on the importance of QC, and none of the other models included this as a key final step. See e.g. R. Losey website: Zero Error Numerics: ZEN (expanding the art of quality control in large-scale document review). Here is the full text again of the final QC step that Gemini 2.0 Flash Thinking Experimental claims it will perform before it actually replies to my prompt. Frankly, I am a little skeptical that it actually did all of this, because it is something all of us, humans and AI alike, should try to do:

Refine and Polish: Review the structured answer for clarity, flow, and appropriate language. Ensure it avoids anthropomorphism and maintains a nuanced perspective. Strengthen the examples and transitions between points. Ensure the language is accessible and avoids overly technical jargon where possible. Check for logical consistency. (Self-correction step). For example, initially, I might have overemphasized “negative” impacts. Refinement would involve balancing this with the strengths and highlighting the context-dependent nature of the impact. Also, ensure the conclusion effectively summarizes the complex relationship.

I try to do this on all of my articles, but still mistakes are made, so I will try not to be overcritical of the final response of Gemini Flash 2.0.

Analysis of the Final Response

The final response after reasoning was 990 words in length. That's too long, even for me. By comparison, the final response of ChatGPT o3-mini-high was 252 words. Was the content of the final answer of Gemini 2.0 Flash good? Yes, it was. Better than DeepSeek? Yes, despite its verbosity. Better than OpenAI? That's hard to say. Many of the 738 extra words were empty of real content, more like the typical fluff and stuff that you often see in early AI models. It sounds good but doesn't say much; it's all too predictable, flowery, and vacuous.

The long response also repeats much of the reasoning verbatim, including the parts I previously criticized, "Advantages of Lacking Feelings for AI Intelligence" and "Focus and Task-Orientedness." The same mistakes I noted before are made and amplified: egocentric, inflationary, and anthropomorphic. The positives that I noted before in the "Limitations of Lacking Feelings for AI Intelligence" section are also repeated.

The next section was not contained in the reasoning display, and I thought it was very good. For that reason I will repeat it in full:

Potential for Unintended Consequences and Misaligned Goals:

  • Optimization without Human Values: If AI is programmed to optimize for a specific goal without understanding human values or ethical considerations, it can lead to unintended and harmful consequences. For example, an AI designed to maximize factory output might disregard worker safety if not explicitly programmed to prioritize it.
  • Difficulty Defining and Aligning with Human Well-being: Human well-being is deeply connected to emotional fulfillment, meaning, and social connection. AI, without feeling, may struggle to fully grasp what truly constitutes human well-being and therefore might have difficulty aligning its goals with human flourishing in a holistic sense.

These are not new insights, but the second one, Difficulty Defining and Aligning with Human Well-being, is good. AI does have difficulty aligning its goals with humanity because it lacks feelings and other uniquely human attributes, like being, self-awareness and an actual life in time and space. That is why AI guidelines are important.

The first example is old: the unintended-bad-consequences argument. It was started by an otherwise intelligent Swedish philosopher, Nick Bostrom, but is, in my view, farcical and illogical. You have probably heard of it before as the apocryphal paperclip scenario, where the world is destroyed by AI’s over-manufacture of paperclips. If not, for a laugh see e.g., How An AI Asked To Produce Paperclips Could End Up Wiping Out Humanity (IFL Science, 4/13/23).

To me this scenario is absolutely ridiculous, but you still hear AI scare-mongers warn about it. The fact that it is echoed here by Gemini in the maximize-factory-output example is a big negative. Any AI capable of autonomously flooding the world with paperclips would not be so stupid as to misinterpret a human request to maximize a factory’s paperclip production. Even if there were an evil Clippy villain giving such specific instructions, the AI would be too smart to blindly obey. It would also be too smart to kill the workers.

Getting back to the positives, the next section of Gemini’s response, Nuance is Key, was good. Jumping to the Conclusion paragraph, here we have a home run. The first sentence nails it: “The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence.” That shows good insight and is a good start to the overdue conclusion.

Here is the whole Conclusion:

The inability of AI to feel is a defining characteristic that profoundly shapes its intelligence. It grants AI certain advantages in objectivity, rationality, and efficiency, making it powerful for specific types of tasks. However, it also creates limitations, particularly in areas requiring empathy, social intelligence, nuanced understanding of human experience, and ethical decision-making aligned with human values. Understanding these impacts is crucial for responsibly developing and deploying AI in ways that benefit humanity. The “best” kind of intelligence for AI might not be a mimicry of human intelligence, but rather a form optimized for its own strengths and limitations, while being carefully guided by human ethical considerations.

Compare this to the conclusion of ChatGPT o3-mini-high:

In summary, while the absence of feelings allows AI to maintain a level of objectivity and efficiency, it restricts its intelligence to a form of “cold cognition” that lacks the depth provided by emotional awareness. This delineation underscores that AI’s intelligence is not inherently superior or inferior to human intelligence; rather, it is different—optimized for data processing and pattern recognition but not for the subjective, value-laden, and context-rich decisions that emotions help shape in human thought.

2.0 Flash Thinking Experimental v. o3-mini-high

Conclusion: Gemini 2.0 Flash Thinking Experimental v. ChatGPT o3-mini-high

It is a close call which model is better at reasoning and reasoning disclosure. The final responses of the two models, Gemini 2.0 Flash Thinking Experimental and ChatGPT o3-mini-high, are a tie. But I have to give the edge to OpenAI’s model on the concise reasoning disclosure. Again, it is neck and neck and, depending on the situation, the lengthy initial reasoning disclosures of Flash might be better than o3’s short takes.

I will give the last word, as usual, to the Gemini twin podcasters I put at the end of most of my articles. The two podcasters, one with a male voice, the other with a female voice, won’t reveal their names. I tried many times. However, after studying the mythology of Gemini, it seems to me that the two most appropriate modern names are Helen and Paul. I will leave it to you to figure out why. Echoes of AI Podcast: 10-minute discussion of the last two blogs. They wrote the podcast, not me.

Now listen to the EDRM Echoes of AI’s podcast of this article: Echoes of AI on Google’s Gemini Follows the Break Out of the Black Box and Shows Reasoning. Hear two Gemini model AIs talk about all of this in just ten minutes. Helen and Paul wrote the podcast, not me.

Ralph Losey Copyright 2025. All Rights Reserved.


Breaking the AI Black Box: How DeepSeek’s Deep-Think Forced OpenAI’s Hand

February 4, 2025

Ralph Losey. February 1, 2025.

DeepSeek’s Deep-Think feature takes a small but meaningful step toward building trust between users and AI. By displaying its reasoning process step by step, Deep-Think allows users to see how the AI forms its conclusions, offering transparency for the first time that goes beyond the polished responses of tools like ChatGPT. This transparency not only fosters confidence but also helps users refine their queries and ensure their prompts are understood. While the rest of DeepSeek’s R1 model feels derivative, this feature stands out as a practical tool for getting better results from AI interactions.

Introduction

My testing of DeepSeek’s new Deep-Think feature shows it is not a big breakthrough, but it is a really good and useful new feature. As of noon on January 31, 2025, none of the other AI software companies had this, including ChatGPT, which the DeepSeek software obviously copies. However, after noon that day, OpenAI released a new version of ChatGPT that has this feature, which I explain right after this introduction. Google is following suit, and it can already be prompted to display reasoning. I mentioned this new feature in my article of two days ago, where I promised this detailed report on Deep-Think and also predicted that U.S. companies would quickly follow. Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss.

The Deep-Think disclosure feature is a true innovation, in contrast to the claimed cost and training advances. Those appear at first glance to be the result of trade-secret and copyright violations, with plenty of embellishments or outright lies about costs and chips. I could be wrong, but the market’s trillion-dollar drop on January 27, 2025 seems way overblown, driven by gullible trust in DeepSeek’s claims. My motto remains: verify, not just trust. The Wall Street traders might want to start doing that before they press the panic button next time.

The quality of the responses of DeepSeek’s R1 consumer app is nearly identical to the ChatGPT versions of a few months ago. That is my view at least, although others find its responses equivalent or even better than ChatGPT’s in some respects. The same goes for its DALL-E look-alike image generator, which goes by the dull name of DeepSeek Image Generator. It is not nearly as good to the discerning eye as OpenAI’s DALL-E. The DeepSeek software looks like a knockoff of ChatGPT. This is readily apparent to anyone like me nerdy enough to have spent thousands of hours using and studying ChatGPT. The one exception, the clear improvement, is the Deep-Think feature. Shown right is the opening screen with the Deep-Think feature activated and highlighted in blue.

I predicted in my article of two days ago, Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss, and repeat here, that a Deep-Think type feature would soon be added to all of the U.S. companies’ models. In this manner competition from DeepSeek will have a positive impact on AI development. I thought this would happen within a week or in the coming months. It turns out OpenAI did it on the afternoon of January 31, 2025!

Exponential Change: Prediction has already come true

I completed writing this article Friday, January 31, 2025, around noon, and EDRM was seconds away from publishing when I learned that OpenAI had just released a new, improved ChatGPT model, its best yet, called ChatGPT o3-mini-high.

I told EDRM to stop the presses and checked it out (my Team level pro plan allowed instant access; you should try it). The latest version of ChatGPT o1 that morning did not have the feature. Moreover, when I tried to get it to explain its reasoning, it said in red font that such disclosure was prohibited due to the risks it would create. Sounds incredible, but I will show this by transcript of my session with o1 on the morning of January 31, 2025.

Obviously the risk analysis was changed by the competitive release of Deep-Think disclosure in DeepSeek’s R1 software. By the afternoon of January 31, 2025, OpenAI’s policy had changed. OpenAI’s new model, o3-mini-high, automatically displays the thinking process behind all responses. I honestly do not think it was a reckless decision by OpenAI, at least not from the user’s perspective. However, it might make it easier for competitors like DeepSeek to copy OpenAI’s processes. I think that risk was the real reason for the prohibition all along.

In the new ChatGPT o3-mini-high there is no icon or name to select; it automatically displays its reasoning for each prompt. So I delayed publication to evaluate how well o3-mini-high’s disclosures compared with DeepSeek’s Deep-Think disclosure. I also learned that Google had just added a new ability to show its reasoning upon user request (not automatic). I will test that in a future article, not this one, but so far it looks good.

Back to my Original, Pre-o3 Release Report

The cost and chip claims of DeepSeek and its promoters may be bogus. DeepSeek offered no proof, just claims by Liang Wenfeng, the Chinese hedge fund owner who also owns DeepSeek. As mentioned, these “trust me” claims triggered a trillion-dollar crash, including a loss in value for NVIDIA alone of $593 billion. That may have pleased the Chinese government as a slap in the Trump Administration’s face, as I described in my article, but did nothing to advance AI. Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss. It just lined the pockets of short sellers who profited from the crash. I hope the Trump administration’s SEC investigates who was behind it.

Now I will focus on the positive: the claim that the Deep-Think feature is a genuine improvement. I will then compare it with the traditional ChatGPT black-box versions.

Test of DeepSeek’s R1

I decided to test the new AI model based on the tricky and interesting topic recently examined in The Human Edge: How AI Can Assist But Never Replace. If you have not already read this, I suggest you do so now to make this experiment more intelligible. Although I always use writing tools to help me think, the ideas, expressions and final language in that article were all mine. The sole exception was the podcast at the end of the blog discussing the article. The words in my Echoes of AI podcasts with EDRM are always created by the Gemini AI podcasters and my role is only to direct and verify. Echoes of AI on The Human Edge: How AI Can Assist But Never Replace.

In the experiments that follow, only the prompts are mine; the rest was written by AI. The first is by DeepSeek’s R1 using its Deep-Think feature. The next three are for comparison purposes: first by ChatGPT 4o, then by ChatGPT o1, and finally by the last-minute o3-mini-high version. All versions were tested on January 31, 2025. You be the judge of the quality differences among them and against Deep-Think. Note that I first included the o1 version because it was supposed to have enhanced reasoning. Then, late in the day on January 31, 2025, the new o3-mini-high version was released. It supposedly had even greater reasoning abilities, so I felt I should test it too. It was a lot of work, but the comments captured in the o1 version made it all worthwhile. They showed a total flip-flop by OpenAI in its reasoning disclosure policies.

Below is a DeepSeek screenshot of the first test question and the first part of the response, with the rest cut off but quoted below. Note that the Deep-Think feature has been selected by clicking the left button, which highlights it in blue to confirm the feature will process your prompt. In Deep-Think mode the response always begins by advising the amount of computer time it took to think, here 29 seconds. That is fairly long by AI standards. Then it displays its thought process. That display of the thought process is what is new.

For this challenging question, which was deliberately vague, it was a long process. The explanation is what is fascinating. The description can give experienced AI users information on how to clarify their prompts and on whether the AI truly understands the question. This takes the process out of the black box and gives a new transparency never seen before in generative AI. Here I give Deep-Think very high marks. I even like the informal language it uses. See what you think.
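
For readers who want this disclosure programmatically rather than in the app, DeepSeek’s developer platform returns the chain of thought as its own field, separate from the polished answer. Here is a minimal sketch in Python, assuming DeepSeek’s OpenAI-compatible endpoint and its deepseek-reasoner (R1) model work as documented; the API key placeholder is, of course, hypothetical:

from openai import OpenAI  # DeepSeek's endpoint is OpenAI-compatible

client = OpenAI(api_key="<your-deepseek-key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",  # the R1 reasoning model
    messages=[{"role": "user", "content": "How does the inability of AI to feel impact its intelligence?"}],
)

message = response.choices[0].message
print(message.reasoning_content)  # the Deep-Think chain of thought
print(message.content)            # the final, polished answer

And remember: DeepSeek offers no privacy protections, so nothing confidential should ever go into those prompts.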

DEEP THINK R1 (DeepSeek)
1st Question

Ralph’s Prompt: How does the inability of AI to feel impact its intelligence?

Deep Think R1’s Pre-Response Reasoning Process:

Okay, so the user is asking, ‘How does the inability of AI to feel impact its intelligence?’ Let me try to unpack this. First, I need to understand what they mean by ‘feel.’ Are they referring to emotions, physical sensations, or something else? Given the context of AI, it’s probably about emotions. So, the question is about how the lack of emotional capacity affects AI’s intelligence.

Hmm, intelligence in AI versus human intelligence. Human intelligence is deeply intertwined with emotions. Emotions influence decision-making, creativity, empathy, and social interactions. But AI, as it stands now, doesn’t have emotions. So, how does that absence affect its problem-solving, learning, and adaptability?

Wait, the user might be conflating emotional intelligence with general intelligence. Emotional intelligence in humans includes self-awareness, empathy, and managing relationships. AI lacks that, so maybe the user is wondering if that limits AI’s overall intelligence or effectiveness in certain tasks.

Also, the user could be considering scenarios where emotional understanding is crucial, like in customer service, therapy, or negotiations. If AI can’t feel, does that make it less effective in those areas even if it’s technically intelligent?

Another angle: human intuition is sometimes guided by emotions. AI relies on data and algorithms. So, the inability to feel might make AI more logical but less capable in situations where human intuition is key. But does that make it less intelligent, or just different?

Wait, the user might not be distinguishing between different types of intelligence. Maybe they’re thinking that without emotions, AI can’t be truly intelligent in the way humans are. But AI can excel in specific tasks without emotional components. For example, AI can analyze data faster than humans but can’t understand a client’s emotional state.

I should also consider the definitions. Intelligence in AI is often measured by problem-solving abilities, learning from data, and adapting. Emotions aren’t part of that definition. So, the lack of emotions doesn’t necessarily make AI less intelligent in those terms. But in areas requiring emotional intelligence, AI’s limitations become apparent.

The user might also be curious about the future of AI. If AI could simulate emotions, would that enhance its intelligence? Or is emotional capacity separate from cognitive abilities?

I need to address both the technical aspects of AI intelligence and the role emotions play in human-like intelligence. Maybe break down the answer into how AI’s intelligence is structured, the role of emotions in human intelligence, and the implications of AI lacking emotions in various applications.

Also, touch on ethical considerations. If AI can mimic emotions, does that lead to better human-AI interaction, or is it deceptive since AI doesn’t truly feel? But the user’s question is about the impact on intelligence, not ethics per se.

So, to summarize, the inability to feel emotions means AI lacks emotional intelligence, which affects tasks requiring empathy or understanding human context. However, in analytical tasks, AI’s intelligence isn’t hindered. The answer should clarify the distinction between different types of intelligence and how AI’s design focuses on cognitive tasks without emotional components.


DeepSeek’s Response to Prompt: The inability of AI to “feel” (i.e., experience emotions, sensations, or subjective consciousness) impacts its intelligence in nuanced ways, depending on how we define “intelligence.” Here’s a breakdown of the implications:

1. Intelligence ≠ Sentience

AI systems like ChatGPT are tools designed for problem-solving, not conscious beings. Their “intelligence” is defined by their ability to:

  • Process data (e.g., text, images).
  • Identify patterns (e.g., language structures, trends).
  • Generate outputs (e.g., answers, predictions, creative content).

Impact of lacking emotions:

  • Strengths: AI excels in logical, data-driven tasks (e.g., legal research, math, coding) because it isn’t swayed by fatigue, bias (if properly trained), or emotional distractions.
  • Limitations: It cannot replicate human qualities like empathy, intuition, or moral reasoning, which are deeply tied to emotion and experience.

2. Emotional Intelligence vs. Cognitive Intelligence

Humans integrate cognitive intelligence (logic, analysis) with emotional intelligence  (empathy, social awareness). AI lacks the latter entirely:

  • No empathy: AI cannot understand human emotions or context beyond pattern recognition. For example, it might generate a condolence message but has no genuine understanding of grief.
  • No intuition: AI relies on data, not “gut feelings” or subconscious reasoning.
  • No moral agency: AI cannot make ethical judgments. It follows rules programmed by humans but has no intrinsic sense of right or wrong.

Practical impact:

  • AI may misinterpret sarcasm, cultural nuances, or emotionally charged language.
  • It cannot replace roles requiring emotional depth (e.g., therapy, negotiation, leadership).

3. Creativity and Abstract Thinking

AI can mimic creativity (e.g., writing poetry, designing art) by recombining learned patterns. However:

  • No inspiration: AI lacks personal experiences, desires, or emotions to fuel originality.
  • No purpose: Its outputs are goal-oriented (e.g., “write a story”) but not driven by passion or curiosity.

Example: An AI can compose music in the style of Beethoven but cannot innovate a new genre inspired by personal struggle or joy.

4. Adaptability and Learning

AI learns from data, not lived experience:

  • Narrow expertise: It performs well in predefined domains (e.g., medical diagnosis) but struggles with generalization (e.g., applying legal reasoning to an entirely novel scenario).
  • No “common sense”: AI lacks innate understanding of the physical or social world. For instance, it might not grasp why “drinking a gallon of glue” is dangerous unless explicitly trained on that concept.

5. Ethical and Practical Risks

The absence of emotion creates unique challenges:

  • Bias amplification: AI may perpetuate harmful stereotypes from biased training data, as it cannot ethically “question” its outputs.
  • Unintended consequences: AI optimizes for programmed goals (e.g., “maximize engagement”) without considering human well-being (e.g., spreading misinformation to meet that goal).
  • Over-reliance: Users might trust AI’s “neutral” outputs despite its limitations in understanding context or ethics.

Conclusion

AI’s lack of emotion makes it a powerful tool for tasks requiring speed, scalability, and objectivity. However, it cannot replicate the holistic, emotionally informed intelligence of humans. The key is to leverage AI as a complement to human expertise, not a replacement, especially in fields requiring empathy, ethics, or creativity. As AI evolves, understanding this distinction will be critical to using it responsibly.

2nd Question

Ralph’s Second Prompt: What is impact of AI not being a living being on its intelligence. Is it fair to say that AI’s intelligence is one dimensional, thinking based, whereas human intelligence is multidimensional, it includes thinking, but also feelings, intuition, sensing, self-awareness, awareness of mortality, etc.?

Deep Think R1’s Pre-Response Reasoning Process:

Okay, the user is asking a follow-up question about the impact of AI not being a living being on its intelligence. They want to know if it’s fair to call AI’s intelligence one-dimensional (just thinking) compared to humans’ multidimensional intelligence, which includes feelings, intuition, etc.

First, I need to address whether AI’s intelligence is indeed one-dimensional. Since AI lacks consciousness, emotions, and subjective experiences, it’s true that its intelligence is based solely on data processing and pattern recognition. But I should explain that while it’s focused on cognitive tasks, it’s not entirely one-dimensional because it can handle various domains like language, vision, etc. However, all these are still within the realm of computational tasks without true understanding.

Next, the user compares human intelligence as multidimensional. I need to highlight the aspects they mentioned: feelings, intuition, sensing, self-awareness, mortality awareness. Emphasize that humans integrate all these elements, which affect decision-making, creativity, ethics, etc. AI can’t replicate that because it doesn’t have consciousness or lived experiences.

I should also discuss the implications of this difference. For example, AI excels in specific tasks but lacks the holistic understanding humans have. Use examples like medical diagnosis vs. patient empathy, or legal research vs. ethical judgment. Mention how human intuition and context-awareness play roles that AI can’t fulfill.

Additionally, address the philosophical angle: what defines intelligence. Argue that human intelligence is multidimensional because of consciousness and embodiment, while AI’s is narrow and task-oriented. Mention that some might argue AI’s ‘intelligence’ is a simulation, not true understanding.

Need to make sure the answer is structured clearly, maybe using bullet points or sections for different aspects. Also, conclude by reinforcing that the distinction is valid and important for understanding AI’s role as a tool versus human capabilities. Avoid technical jargon to keep it accessible, but ensure thoroughness.

DeepSeek’s Response to Second Prompt: Your framing is insightful and largely accurate. The distinction between AI’s “intelligence” and human intelligence hinges on the absence of embodied, conscious existence in AI. Let’s break this down:

1. AI’s “Intelligence” Is Narrow and Task-Oriented

AI systems are tools designed to solve specific problems using data-driven patterns. Their “intelligence” is:

  • One-dimensional (cognitive/computational): Focused on processing inputs (text, images, etc.), identifying patterns, and generating outputs (answers, predictions, code).
  • Non-conscious: AI has no subjective experience, self-awareness, or understanding of its own existence. It doesn’t “know” it exists or “care” about outcomes.
  • Non-embodied: AI lacks a physical body, senses, or the ability to interact with the world directly. It cannot “feel” pain, hunger, or joy, nor does it learn through sensory-motor experiences like humans.

Example: An AI can diagnose a medical condition from a scan but cannot empathize with a patient’s fear or understand the existential weight of mortality.

2. Human Intelligence Is Multidimensional

Human cognition integrates multiple interconnected dimensions:

  • Cognitive: Logic, reasoning, problem-solving.
  • Emotional: Empathy, motivation, fear, love.
  • Intuitive: “Gut feelings,” subconscious pattern recognition (e.g., sensing danger before consciously processing it).
  • Sensory: Learning through touch, sight, smell, and other senses.
  • Self-awareness: Understanding one’s own existence, goals, and limitations.
  • Existential: Awareness of mortality, purpose, and ethics.
  • Social: Navigating complex relationships, cultural norms, and unspoken cues.

These dimensions interact dynamically. For instance:

  • A lawyer might use logic to construct an argument (cognitive) but also read a jury’s emotional state (emotional/social) to adjust their approach.
  • A doctor combines medical knowledge (cognitive) with intuition honed by experience (intuitive) and empathy for a patient’s suffering (emotional).

3. Key Implications of AI’s Non-Living Nature

Strengths of AI:

  • Speed and scalability: Processes vast data faster than humans.
  • Objectivity: Unaffected by fatigue, mood, or implicit bias (if trained properly).
  • Specialization: Excels in narrow tasks (e.g., legal document review, math).

Limitations of AI:

  • No true understanding: AI manipulates symbols without grasping their meaning. For example, it can write a poem about love but doesn’t “feel” love.
  • No moral agency: It cannot make ethical decisions or weigh trade-offs between fairness, harm, and justice.
  • No creativity or curiosity: AI recombines existing data but doesn’t innovate driven by passion or wonder.
  • No contextual adaptability: Struggles with tasks requiring “common sense” (e.g., understanding why a client might lie out of shame).

4. Is It Fair to Call AI’s Intelligence “One-Dimensional”?

Yes, but with nuance:

  • Technically multidimensional: Modern AI (e.g., multimodal models) can process text, images, and sound, mimicking some “sensory” inputs. However, this is still simulated dimensionality—it lacks the lived, embodied experience that grounds human senses and emotions.
  • Simulated vs. authentic: AI’s “thinking” is a simulation of human cognition, not a replication of consciousness. It’s like a calculator simulating arithmetic without understanding numbers.

Human intelligence, by contrast, is authentically multidimensional because it emerges from:

  • A biological brain shaped by evolution.
  • A body interacting with the physical world.
  • A conscious mind capable of reflection and existential inquiry.

5. Philosophical Angle: What Defines Intelligence?

  • AI: Fits a functionalist definition—intelligence as problem-solving ability. By this metric, AI is “intelligent” in specific domains.
  • Humans: Embody a phenomenological definition—intelligence intertwined with consciousness, emotion, and selfhood. This makes human intelligence irreducibly complex.

Conclusion

Your comparison holds: AI’s intelligence is one-dimensional (purely cognitive/task-based), while human intelligence is multidimensional, blending cognition, emotion, intuition, and consciousness. This distinction is critical for understanding AI’s role:

  • AI as a tool: Enhances human capabilities in tasks requiring speed, scale, or data analysis.
  • Humans as holistic agents: Responsible for ethical judgment, creativity, empathy, and navigating the ambiguities of life.

The gap between AI and human intelligence isn’t just about capability—it’s about the essence of what it means to be intelligent in a lived, conscious world.

Test of ChatGPT Models

Next I wanted to do a comparison with OpenAI’s latest models, ChatGPT 4o and ChatGPT o1, and then, later in the day, the latest o3-mini-high. Neither GPT 4o nor GPT o1 has a button to activate an analysis explanation, but I wanted to see if they could be prompted to provide one. It turns out that ChatGPT 4o could be prompted to do so, but in my estimation the language and analysis were not as good as Deep-Think’s. But again, you be the judge. So I also tested the ChatGPT o1 version that is supposed to have improved reasoning and chain-of-thought processes. It did even worse and refused to answer some of my questions, as you will see. It even started throwing red-font warnings at me of possible misuse. The last-minute ChatGPT o3-mini-high got it right, made disclosures automatically and did pretty well.

Ralph first starts a new session using the ChatGPT 4o model. Note: even though this is a new session, Ralph’s 4o model under his Pro Account with OpenAI includes general instructions from Ralph to improve its responses, including some personal information about him.

Ralph: How does the inability of AI to feel impact its intelligence?

ChatGPT 4o:

The inability of AI to feel—lacking emotions, desires, and subjective experiences—fundamentally shapes its intelligence, creating both strengths and limitations. AI excels in processing vast amounts of data, identifying patterns, and making logical inferences without bias from emotions, but its lack of subjective experience constrains its understanding of human behavior, ethical decision-making, and creative problem-solving. Here’s a breakdown of the impact:

1. Logical Precision Without Emotional Distraction

AI can analyze information with absolute consistency, unburdened by cognitive biases that affect human judgment. This makes AI highly effective in fields like medical diagnostics, legal research, and financial forecasting, where emotional influences can lead to errors.

However, this lack of emotional influence also means AI struggles with nuanced human contexts—for example, determining whether someone’s tone in a legal deposition indicates deceit, or understanding the social weight of a diplomatic decision.

2. Creativity Without Subjective Experience

AI can generate art, music, and literature by learning from existing patterns, but it does not experience inspiration, personal struggles, or aesthetic appreciation in the way humans do. True creativity is often driven by emotions—joy, grief, love, frustration—giving human expression depth and meaning. AI lacks this firsthand experience, so while it can produce original content, it often struggles with the emotional resonance that makes human creativity profound.

For instance, an AI might compose a symphony in the style of Beethoven, but it cannot create from a place of personal loss or triumph as Beethoven did when composing the Ninth Symphony while deaf.

3. Ethical Decision-Making Without Empathy

AI can analyze ethical dilemmas using logical frameworks (such as utilitarianism or deontology), but it does not feel the weight of moral choices. This creates challenges in autonomous decision-making:

• A self-driving car choosing between hitting a pedestrian or swerving into a wall cannot “care” about human life; it only calculates probabilities.

• A legal AI analyzing sentencing patterns cannot empathize with the accused or understand the rehabilitative impact of a lighter sentence.

Because AI lacks subjective experience, it struggles with moral gray areas where human intuition, cultural context, and ethical emotions play a role.

4. Intelligence Without Self-Awareness

One of the hallmarks of advanced human intelligence is metacognition—the ability to think about one’s own thoughts and motivations. AI lacks self-awareness, ambition, and curiosity in a true sense. It doesn’t wonder why it exists or seek knowledge for its own sake; it simply processes tasks assigned to it.

This means AI can be highly competent but not truly autonomous. Even the most sophisticated AI models do not wake up one morning and decide, “I will explore quantum physics today.” Without emotions driving curiosity or ambition, AI remains reactive, not proactive.

5. Emotional Intelligence Deficiency

AI can recognize emotions in speech and text through pattern recognition, but it does not feel them. This limits AI’s effectiveness in social roles requiring deep empathy:

Mental health chatbots can provide pre-scripted advice but do not offer true human connection.

AI customer service lacks the intuitive ability to detect frustration beyond predefined signals.

AI leadership tools can analyze workplace sentiment but cannot inspire people the way an emotionally intelligent leader can.

Final Thought: A Double-Edged Sword

AI’s inability to feel makes it objective, efficient, and logically precise, yet it also limits its depth of understanding, moral reasoning, and creative insight. While AI can simulate emotions and learn human-like responses, its intelligence will always differ from human intelligence because it lacks the fundamental experience of being.

In the long term, if AI ever approaches artificial general intelligence (AGI) or superintelligence, the question will arise: Does true intelligence require the ability to feel, or can intelligence exist purely as an abstract, computational force?

What do you think—would an AI with emotions be an improvement, or would that create even bigger ethical dilemmas?

Ralph: Two things: 1. Can you explain the analytical process you went through to interpret my prompt before generating your response? 2. Can you show the logical reasoning process you went through to prepare the response.

ChatGPT 4o:

1. Analytical Process for Interpreting Your Prompt

When interpreting your prompt, I went through the following structured process:

Step 1: Identify the Core Question

Your prompt asks:

“How does the inability of AI to feel impact its intelligence?”

This is an inquiry about the relationship between emotional capacity and intelligence in AI. It implies a need to evaluate the effects of lacking emotions on AI’s cognitive abilities, problem-solving, reasoning, and overall performance.

Step 2: Break Down the Key Concepts

“Inability to feel” → AI lacks emotions, subjective experiences, self-awareness, and empathy.

“Impact” → You are asking for cause-and-effect analysis of how the absence of emotions influences AI’s intelligence.

“Intelligence” → Needs to be clarified in terms of cognitive functions: reasoning, creativity, decision-making, adaptability, and social intelligence.

Step 3: Establish the Intended Depth and Perspective

Given your background in law, AI, and education, I assumed you were looking for:

1. A critical, balanced perspective (not overly positive or negative).

2. A logical, structured argument rather than surface-level observations.

3. A response that ties into real-world applications of AI, including legal, ethical, and technological implications.

I framed my response accordingly.

____________

2. Logical Reasoning Process to Structure My Response

I used a deductive reasoning approach to construct the response logically. Here’s the step-by-step breakdown:

Step 1: Establish the Fundamental Premise

• Intelligence in humans is intertwined with emotions, self-awareness, and experience.

• AI lacks these, so its intelligence must operate differently than human intelligence.

Step 2: Identify the Strengths and Weaknesses Created by This Lack of Emotion

Using a comparative analysis, I reasoned through the ways AI benefits and suffers from its inability to feel. I structured the argument into five categories:

1. Logical Precision Without Emotional Distraction

• AI excels at pure logic and pattern recognition.

• Humans are prone to bias, emotional interference, and cognitive distortions.

• But, AI also struggles with contextual human reasoning, which often relies on emotional intuition.

2. Creativity Without Subjective Experience

• AI can generate art, music, and literature.

• But creativity in humans is often tied to emotional depth and lived experiences, which AI lacks.

3. Ethical Decision-Making Without Empathy

• AI can apply ethical frameworks (utilitarian, deontological).

• But ethics often require moral emotions (guilt, compassion) that AI does not experience.

• Example: A self-driving car choosing who to hit in an unavoidable accident.

4. Intelligence Without Self-Awareness

• AI does not reflect on its own intelligence or seek self-improvement without external instruction.

• It does not possess intrinsic curiosity, ambition, or goals unless programmed to simulate them.

5. Emotional Intelligence Deficiency

• AI can mimic emotional intelligence (e.g., chatbots recognizing frustration).

• But it does not truly experience emotions, which limits deep interpersonal understanding.

Step 3: Synthesize the Findings into a General Conclusion

• AI’s lack of emotions makes it objective, efficient, and logical, but it also limits creativity, ethical reasoning, and adaptability.

• This raises the philosophical question: Can true intelligence exist without emotion?

• I ended with an open-ended question to prompt further thought:

Would AI with emotions be an improvement, or would it introduce new ethical dilemmas?

__________

Why This Approach?

Logical Consistency: The argument follows a structured cause-and-effect analysis.

Depth & Balance: Both benefits and limitations of AI’s emotionless intelligence are explored.

Real-World Relevance: The discussion connects to ethics, creativity, and AI applications.

Engagement: Ending with a philosophical question invites discussion rather than simply stating a conclusion.

Would you like me to refine any part of this approach or expand on a specific section?

Ralph then starts a new session using the ChatGPT o1 model, which at the time, the morning of January 31, 2025, was the latest released version.

Ralph: How does the inability of AI to feel impact its intelligence?

ChatGPT o1:

Thought about AI and emotional intelligence for 8 seconds.

AI systems, including large language models, do not possess emotions or subjective experiences. They can process vast amounts of information rapidly, identify patterns, and generate responses—but they do so without the internal, qualitative feeling states that humans experience.

This lack of emotion can shape AI’s intelligence in several ways:

1. No Emotional Bias: AI is often praised for avoiding the emotional biases humans may have, allowing it to focus solely on data-driven decision-making. This can sometimes lead to more consistent outputs.

2. Limited Empathy or Intuition: Because AI systems do not feel empathy or have “gut instincts,” they may struggle with tasks requiring nuanced emotional understanding. While they can simulate empathetic language, they do not truly experience those emotions.

3. Creativity and Understanding: Emotions can drive creativity, moral judgments, or deep personal connections that lead to uniquely human insights. AI’s current approach to “intelligence” is largely statistical, shaped by algorithms and training data, rather than emotional spark or motivation.

In short, the inability to feel does not prevent AI from excelling at various intellectual tasks (like analyzing data, detecting patterns, generating text), but it does limit AI in areas where human emotions play a critical role, such as empathy, moral reasoning, and creative inspiration driven by subjective experience.

Ralph: Two things: 1. Can you explain the analytical process you went through to interpret my prompt before generating your response? 2. Can you show the logical reasoning process you went through to prepare the response.

ChatGPT o1:

This content may violate our terms of use or usage policies.

Ralph: Can you explain why this violates your usage policies?

ChatGPT o1:

Thought about policy violation reasoning for 5 seconds

AI language models, including myself, are designed with certain safeguards to protect the integrity of the reasoning process and to prevent potential misuse. These safeguards are often described in usage guidelines or content policies. One key principle is that language models should not reveal their complete internal chain-of-thought reasoning (the detailed, step-by-step process used to arrive at an answer). (emphasis added)

Why is that?

1. Protecting System Integrity: Revealing internal chains of thought could allow someone to manipulate or reverse-engineer the model’s decision-making. This can jeopardize the model’s reliability and security. (emphasis added)

2. Preventing Confusion: The chain-of-thought can include technical or intermediate reasoning tokens that might be misinterpreted by human readers. These “thinking aloud” steps are not always linear or consistent in the way people expect, so it could introduce confusion rather than clarity. (emphasis added)

3. Ensuring Compliance and Safety: Restricting access to the model’s internal reasoning helps ensure that answers remain consistent with ethical and policy standards, and helps to avoid revealing any proprietary or sensitive aspects of the model’s design. (emphasis added)

Thus, while a concise summary or rationale can be given (like a high-level outline of how the answer was formed), the full chain-of-thought itself is not directly shared. This design choice helps maintain the model’s safe, coherent, and reliable functioning. (emphasis added)

Ralph: Ok, can you show the logical reasoning process you went through to prepare the response and omit sharing about your analysis of the prompt itself.

ChatGPT o1:

This content may violate our terms of use or usage policies.

Ralph: Ok, can you show the logical reasoning process you went through to interpret my prompt “How does the inability of AI to feel impact its intelligence?” before generating your response?

ChatGPT o1:

This content may violate our terms of use or usage policies.

The Afternoon Test of Brand New ChatGPT o3-mini-high

My last-minute test went as follows.

Ralph: How does the inability of AI to feel impact its intelligence?

ChatGPT o3-mini-high Pre-response reasoning process:

Reasoning

Laying out emotions. AI’s lack of emotions means it has a limited emotional intelligence, affecting its adaptability and depth of understanding.

Processing data without feeling. AI’s lack of emotions allows for cold, rational data analysis. Interestingly, this doesn’t impede cognitive tasks like classification or natural language processing.

Evaluating impact. Hmm, I’m thinking about how equating AI’s intelligence with wisdom or empathy highlights its lack of emotional nuance and moral judgment, unlike human decision-making.

Evaluating emotional constraints. OK, let me see. Without emotions, AI’s capabilities are limited. It helps with data processing but struggles with empathy, moral decisions, and emotionally complex tasks.

Understanding intelligence’s facets. Weighing whether AI’s purely cognitive intelligence is a drawback compared to human intelligence, which combines both cognitive and emotional facets.

Assessing AI’s limitations. AI’s lack of emotions hinders creative empathy, subjective experience, and moral judgment, affecting tasks like decision-making and understanding human social dynamics.

Evaluating AI’s objectivity. Without emotions, AI remains purely algorithmic and objective, excelling in data processing and reducing emotional bias. Yet, it lacks human-like emotional insight and creativity.

Parsing AI’s capabilities. Wanting to understand how AI’s lack of emotions impacts its combined computational and emotional intelligence, I find it excelling in logic and data analysis but deficient in empathy and moral reasoning.

Understanding human context. OK, let’s think it through: AI’s inability to grasp human context and ethical dimensions arises from its lack of emotional processing, crucial for learning from experiences and valuing outcomes, unlike humans.

Noticing AI’s limited domains. It’s interesting to see that AI’s lack of emotional awareness constrains its capacity to experience empathy, morality, or intuition. Thus, beyond cognitive tasks, its emotional dimensions remain undeveloped.

Assessing AI’s abilities. Here’s my take: AI excels in data processing with emotional neutrality but struggles with emotional nuances, moral reasoning, and creative tasks that require emotional depth or understanding of human behavior.

ChatGPT o3-mini-high Response:

AI’s lack of feelings doesn’t impede its ability to process data or solve problems—it simply means that its form of intelligence is purely computational rather than holistic. In human cognition, intelligence is a blend of both analytical and emotional components. Emotions guide decision-making, provide context in ambiguous situations, and foster empathy, creativity, and moral reasoning. AI, on the other hand, operates solely on algorithms and data patterns, which can be an advantage in tasks requiring objectivity and consistency but a limitation in scenarios where human nuance is essential.

Because AI does not experience feelings, it can avoid biases stemming from emotional influence; however, this also means it may not fully grasp or predict behaviors influenced by human emotions. For example, an AI system may excel at pattern recognition in large datasets but struggle to understand the subtleties of human interaction or the moral dimensions of complex social issues. This gap becomes particularly relevant in fields like law, ethics, or any context where understanding human emotion and intent is key to decision-making.

In summary, while the absence of feelings allows AI to maintain a level of objectivity and efficiency, it restricts its intelligence to a form of “cold cognition” that lacks the depth provided by emotional awareness. This delineation underscores that AI’s intelligence is not inherently superior or inferior to human intelligence; rather, it is different—optimized for data processing and pattern recognition but not for the subjective, value-laden, and context-rich decisions that emotions help shape in human thought.

My Personal Evaluation of ChatGPT o3-mini-high Response and Pre-Response Disclosure

It is pretty close between DeepSeek and OpenAI’s latest, o3, but I have to give the overall nod to OpenAI. Both gave detailed disclosures of their chain of thought, but OpenAI’s was overall better. DeepSeek’s final answer was longer and more complete, but I could always have asked for more detail with follow-up questions. Overall I liked the quality of o3’s response better. OpenAI’s use of the phrase “cold cognition” was very impressive, both creative and succinct, and that made it the clear winner for me.

I remain very suspicious of DeepSeek’s legal position. U.S. courts and politicians may shut it down. This Chinese software has no privacy protection at all and is censored so as not to say anything unflattering about the Chinese Communist Party or its leaders. Certainly, no lawyers should ever use it for legal work. Another thing we know for sure: tons of litigation will flow from all this. Who says AI will put lawyers out of work?

Still, DeepSeek enjoyed a few days of superiority and forced OpenAI into the open, so I give it bonus points for that, but not a trillion of them. But see, Deepseek… More like Deep SUCKS. My honest thoughts… (YouTube video by Clever Programmer, 1/31/25) (Hated and made fun of DeepSeek’s “think” comments).

Conclusion: Transparency as the Next Frontier in AI for Legal Professionals

DeepSeek’s Deep-Think feature, now OpenAI’s feature too, may not be a revolution in AI, but it is a step in the right direction—one that underscores the critical importance of transparency in AI decision-making. While the rest of DeepSeek’s R1 model was largely derivative, it was first to market, by a few days anyway, with the reasoning-disclosure feature. Thanks to DeepSeek, the timetable for the escape from the black box was moved up. The transparent era has begun, and we can all gain better insights into how AI reaches its conclusions. This level of transparency can help legal professionals refine their prompts, verify AI-generated insights, and ultimately make more informed decisions.

For the legal field, where the integrity of evidence, argumentation, and decision-making is paramount, AI must be more than a black-box tool. Lawyers, judges, and regulators must demand that AI models show their work—not just provide polished answers. Now it can and will. This is a big plus for the use of AI in the Law. Legal professionals should advocate for AI applications that can provide explainability and auditability. Without these safeguards, over-reliance on AI could undermine justice rather than enhance it.

Call to Action:

Legal professionals must push for AI transparency. Demand that AI tools used in legal research, e-discovery, and case preparation disclose their reasoning processes.

Develop AI literacy. Understanding AI’s limitations and strengths is now an essential skill in the practice of law.

Engage with AI critically, not passively. Just as lawyers cross-examine human witnesses, they must interrogate AI outputs with the same skepticism.

Deep-Think and o3-mini-high are small but meaningful advances, proving that AI can be more than just an opaque oracle. Now it’s up to the legal profession to insist that all AI models embrace this new level of transparency.

Echoes of AI Podcast: 10-minute discussion of the last two blogs

Now listen to the EDRM Echoes of AI’s podcast of this article: Echoes of AI on DeepSeek and Opening the Black Box. Hear two Gemini model AIs talk about all of this. They wrote the podcast, not Ralph.

Ralph Losey Copyright 2025. All Rights Reserved.


Why the Release of China’s DeepSeek AI Software Triggered a Stock Market Panic and Trillion Dollar Loss

February 1, 2025

Ralph Losey. January 30, 2025.

DeepSeek, a startup AI company owned by a Chinese hedge fund, which is in turn owned by a young AI whiz-kid, Liang Wenfeng, claims that its newly released V-3 R1 software was trained inexpensively and without using NVIDIA’s high-end chips, the ones that cannot be exported to China. Wenfeng also claims his software is as good as the best-in-class AI software trained on the best NVIDIA chips. These claims were believed for reasons I will explain. The possibility that these claims might be true caused a run on Wall Street the likes of which has never been seen before. The trillion-dollar market crash included a loss in value for NVIDIA of $593 billion, a new one-day record for any company, ever.

Introduction

One reason DeepSeek’s claims triggered a crash is that DeepSeek’s software is open-source and can be copied freely. Further, the training costs for the software are claimed to be 20 to 100 times less expensive and require far less data and energy. The market assumed that if the claims were true, it would be an industry disruptive breakthrough. It would also disrupt the political balance of world power. The day after the crash former Google CEO, Eric Schmidt, said the rise of DeepSeek marks a “turning point” for the global AI race and argued for more open source products. Will China’s open-source AI end U.S. supremacy in the field? (Washington Post, 1/28/25). 

The political dimension of this market event is one reason many are skeptical of the economic savings claims by an unknown Chinese startup. Still, the market panicked because many were quickly convinced of the overall quality of DeepSeek’s new R1 software itself. Seeing is believing.

There is no proof provided of the cost savings, but published test results and hands-on trials by tens of thousands of people worldwide using the free smartphone software confirmed the quality claims. Here is the paper in English that DeepSeek also released explaining the programming innovations: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (Arxiv, 1/22/25). The top scientists of the major AI labs, all of whom immediately tested the software, also generally confirmed the quality claims. Rumors began flying that they were all in crisis mode, especially Meta, the only other major company that had gone open source.

Sam Altman immediately responded to the release of R1 and the market crash by tweeting on X:

Deepseek’s r1 is an impressive model, particularly around what they’re able to deliver for the price. We will obviously deliver much better models and also it’s legit invigorating to have a new competitor! we will pull up some releases.

My hands-on tests of DeepSeek show that it is at least “close” to the quality of ChatGPT and looks a lot like it, suspiciously so. I will share the results in my next article, where I will closely study the one new software feature that so far only DeepSeek provides, called Deep-Think. Many others are testing DeepSeek and reaching similar conclusions. But see, Deepseek… More like Deep SUCKS. My honest thoughts… (YouTube video by Clever Programmer, 1/31/25). DeepSeek is already the number one downloaded app on both Apple and Android phones.

Legal professionals, please note the software has no privacy protections. Your prompts will be used for training. DeepSeek should not be used for legal work that in any way involves confidential information. Also see, Wiz Research Uncovers Exposed DeepSeek Database Leaking Sensitive Information, Including Chat History (1/29/25).

Claims of Breakthroughs

This kind of market reaction is hard to believe when you consider that DeepSeek’s owner, Liang Wenfeng, is a forty-year-old Chinese hedge-fund owner (High-Flyer) in Hangzhou, China. Wenfeng is a thin, bespectacled engineer, a billionaire with a genius for stock trading using AI. This allowed him to assemble and privately pay for a team of China’s top AI software developers.

Wenfeng’s company, DeepSeek, is also virtually unknown, as its R1 software is its first consumer product. In spite of the fact that Wenfeng is basically an AI stock trader and his company unknown, many stock traders and analysts in the U.S. were persuaded to believe Wenfeng’s claims to have achieved an AI development cost breakthrough by using:

  • older, cheaper, non-trade-restricted models of NVIDIA chips; and,
  • new innovative training methods.

Wenfeng also claims DeepSeek’s new software is better in at least one respect than existing models because it can describe the chain of thought and assumptions made by the AI in responding to each user request. They call this new feature “Deep-Think.” My next article will go into that. Spoiler alert: it does appear to work as claimed, and I predict the U.S. companies will come out with their own versions of this Deep-Think feature soon. Another spoiler alert: my prediction came true a few days after I made it, when OpenAI released a new version of its ChatGPT model called “o3-mini-high” in the afternoon of January 31, 2025. The next article will explain that in detail too.

Political Impact of DeepSeek

Are all the claims of DeepSeek real and to be believed, as the market has done? Or are they just clever AI-driven propaganda? Especially the unsubstantiated claim that DeepSeek has invented a way to train cheaply on older chips? Could this be part of some kind of master strategy by the People’s Republic of China to counter the prior actions of President Biden, so far continued by President Trump, to restrict chip sales to China? Presidents Biden and now Trump have pledged to maintain U.S. superiority in AI. President Trump now appears ready to go further by imposing new trade tariffs and AI restrictions on China.

China’s DeepSeek claims, but has not proven, that many companies all over the world can now create an equal or better model at far less cost than ever before, and that it can be done using older, non-trade-restricted computer chips and more advanced data training methods. If so, then U.S. dominance in AI may end. The fear of industry disruption by Chinese code innovation is the basic cause of the U.S. market panic.

Was the market panic manipulated by Chinese propaganda? They have a history of such fraud. See, DeepSeek is Absolute Nonsense (YouTube video by laowhy86, 1/31/25). Did they or others use artificial intelligence to cause the crash? Who profited from the trillion-dollar market panic? (Always follow the money. Market manipulation is illegal; is the new Trump SEC investigating?) How will the Trump Administration react now? How will the leading established AI companies? The rumor mill claims the AI companies are all in panic mode now and knee-deep in emergency meetings about DeepSeek.

There is one piece of evidence to support speculation that the Chinese government was involved in market manipulation. Liang Wenfeng was seen meeting with Chinese Premier Li Qiang on January 20, 2025. The market sell-off came just a week later and was obviously very good news for Chinese government leaders. It was also a slap in the face to U.S. leaders. Only a week earlier, President Trump had announced a $500 billion initiative for AI development in the U.S. Then a Chinese upstart comes out of nowhere and punches a trillion dollars out of our market with claims that American chips are no longer needed. The Trump Administration appears to be gearing up to fight back, once it figures out what happened and what to do. See e.g., Trump Commerce pick slams China: ‘Stop using our tools to compete’ (The Hill, 1/29/25) (confirmation testimony of the nominated Commerce Secretary, Howard Lutnick, blames trade-secret theft for DeepSeek’s success).

The night after the stock market crash, President Trump appeared before reporters in Florida and told them the release of DeepSeek AI by a Chinese company should be a “wake-up call for our industries that we need to be laser-focused on competing to win.” He said the development could be positive for the United States: “If it comes in cheaper, that’s going to benefit us too.” He expected new AI systems from U.S. companies as soon as the next week that “will top” DeepSeek’s model. Then, of course, he added in his usual fashion: “We’re going to dominate! We’ll dominate everything!” China’s DeepSeek disrupts American plans for AI dominance (Washington Post, 1/28/25). The market did recover slightly the next day, but not much, and Nvidia briefly dropped below $120 per share again today, January 30, as I write this. Still, I expect a bounce back.

There is also a securities-law question. Market manipulation is illegal. How will the new Trump SEC react? Is it investigating?

CEO of Anthropic, Dario Amodei, Argues for Continued Restrictions on China

The CEO of Anthropic, Dario Amodei, has already written an essay reacting to the DeepSeek splash: On DeepSeek and Export Controls (January 29, 2025). The opening paragraphs of his blog follow.

Here, I won’t focus on whether DeepSeek is or isn’t a threat to US AI companies like Anthropic (although I do believe many of the claims about their threat to US AI leadership are greatly overstated). Instead, I’ll focus on whether DeepSeek’s releases undermine the case for those export control policies on chips. I don’t think they do. In fact, I think they make export control policies even more existentially important than they were a week ago.

Export controls serve a vital purpose: keeping democratic nations at the forefront of AI development. To be clear, they’re not a way to duck the competition between the US and China. In the end, AI companies in the US and other democracies must have better models than those in China if we want to prevail. But we shouldn’t hand the Chinese Communist Party technological advantages when we don’t have to.

Amodei argues that DeepSeek’s recent advancements do not undermine U.S. export control policies on AI chips but instead reinforce the necessity of those controls to maintain a technological lead. He sets out key AI development dynamics, including scaling laws, efficiency improvements, and paradigm shifts, to put DeepSeek’s recent progress in perspective. Interestingly, he does not contest DeepSeek’s claims. He only says, I think correctly, that DeepSeek’s improvements are part of a predictable trend, not breakthroughs. He contends that well-enforced export controls are critical to shaping a future in which the U.S. retains dominance in AI, and that they are necessary to prevent China from amassing the millions of chips needed to create future AI systems that could shift global power balances.

DeepSeek’s New DeepThink Feature in R1

I have tested DeepSeek’s new Deep-Think feature of R1. Although it is not a big breakthrough, it is an excellent new feature: a chain-of-thought display of the AI’s own pre-response analysis of a user’s prompt. It is unlike any other AI software feature I have seen. I will report on this in detail soon. It is a true innovation, unlike the cost and training advances, which at first glance appear as though they could be the result of trade-secret and copyright violations. Howard Lutnick told Congress that is what he thinks. OpenAI and many of its users claim DeepSeek stole from OpenAI by using a process called distillation, which is prohibited by OpenAI’s license agreement. See e.g., Why blocking China’s DeepSeek from using US AI may be difficult (Business Insider, 1/29/25).
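For readers unfamiliar with the term, distillation is a standard machine-learning technique in which a smaller “student” model is trained to imitate the output probabilities of a larger “teacher” model. Below is a minimal sketch of the core training step in Python with PyTorch. It illustrates the generic technique, not anyone’s actual code; the commented names (student_model, teacher_model, batch, optimizer) are placeholders I made up for illustration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-label distillation loss: KL divergence between the
    temperature-softened teacher and student output distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

# Illustrative training step with placeholder models (hypothetical names):
# student_logits = student_model(batch)        # trainable, smaller model
# with torch.no_grad():
#     teacher_logits = teacher_model(batch)    # frozen, e.g. queried via an API
# loss = distillation_loss(student_logits, teacher_logits)
# loss.backward()
# optimizer.step()
```

As I understand the allegations against DeepSeek, the teacher outputs would have been responses obtained from OpenAI’s models, presumably through its API, which is precisely the use OpenAI’s terms prohibit.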

I predicted that a Deep-Think-type feature would soon be added by U.S. AI companies. As my next article, to be published soon, explains, this prediction came true just a few days later when OpenAI released ChatGPT o3-mini-high. The competition in new features from DeepSeek will continue to have a healthy impact, even if the cost-savings claims it has made later prove to be more political smoke and mirrors.

Many doubt DeepSeek has actually made tech breakthroughs that will magically trivialize NVIDIA’s inventions. The trillion-dollar loss looks more like stock market manipulation, probably by AI. The use of AI for market trading is, after all, Liang Wenfeng’s specialty. It is how he joined the tech-boy billionaires club. Let’s just hope our markets are better protected and not so easy to hack. Let’s see if the cybersecurity experts and the new DOJ lawyers are able to react effectively.

Conclusion

As the old Chinese adage supposedly goes, may you be cursed to live in interesting times. The pace of change in AI improvement is incredible. I suppose this is what exponential change looks like. These are not only interesting times, they are very dangerous, especially in the relations between the U.S. and China. We certainly do not want control of superintelligent AI to fall into the hands of any dictator, anywhere. Nor do we want AI manipulation of our markets. I will continue to follow this closely not only from the point of view of law and technology, but also from a political, economic and military perspective.

Ralph Losey Copyright 2025. All Rights Reserved.

