Designing Generative AI for Legal Professionals: Key Principles and Best Practices

November 14, 2024

by Ralph Losey


Generative AI is transforming the landscape of legal technology, offering unprecedented opportunities to automate tasks and streamline complex workflows. Yet, designing AI tools that meet the needs of legal professionals requires more than just technical expertise; it demands a deep understanding of the everyday challenges and workflows lawyers face. From automating document review to drafting briefs, these tools have the potential to save time and boost productivity—but only if they are designed with real-world legal practice in mind. A set of six design principles, identified in a May 2024 study by IBM researchers, provides a practical roadmap for creating AI applications tailored to the unique demands of the legal profession. This article explores these principles, offering actionable steps for developers and legal professionals alike.

In the last year, a wave of generative AI tools has emerged, ranging from free Custom GPTs on platforms like OpenAI’s ChatGPT to premium legal tech applications costing tens of thousands annually. While the technology behind these tools is impressive, developing effective applications requires a deep understanding of legal workflows and needs. Generative AI is fundamentally different from traditional software and requires a distinct approach to design.

A May 2024 study, Design Principles for Generative AI Applications, by IBM researchers, lays out six practical principles for designing effective generative AI tools. This article examines how these principles can be applied specifically to the legal tech sector, offering a guide for those looking to build or select tools that are both innovative and practical.

Outline of the Scientific Article and Authors

Design Principles for Generative AI Applications (CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, Article No. 378, pages 1-22, May 11, 2024) (hereinafter the "Study") was authored by Justin D. Weisz, Jessica He, Michael Muller, Gabriela Hoefer, Rachel Miles, and Werner Geyer. They are all part of the IBM Research team, which has over 3,000 members. The Study describes the extensive peer-review process the authors went through to decide upon the six key principles of generative software design, and the rigor of that process was very impressive.

The 22-page Study is a high-quality, first-of-its-kind research project. The IBM-sponsored Study has 196 footnotes and is at times quite technical and dense. Still, it is a must-read for all serious developers of generative AI-based software and is recommended reading for any law firm before making major purchases. The success of AI projects depends upon the selection of well-designed software. Poorly designed applications with an impressive list of features are a recipe for frustration and failure.

My article does not go into the many details of how the design guidelines were derived, focusing instead on the end result. Still, readers might benefit from a quick review of the Study’s Abstract:

Generative AI applications present unique design challenges. As generative AI technologies are increasingly being incorporated into mainstream applications, there is an urgent need for guidance on how to design user experiences that foster effective and safe use. We present six principles for the design of generative AI applications that address unique characteristics of generative AI User Experience (UX) and offer new interpretations and extensions of known issues in the design of AI applications. Each principle is coupled with a set of design strategies for implementing that principle via UX capabilities or through the design process. The principles and strategies were developed through an iterative process involving literature review, feedback from design practitioners, validation against real-world generative AI applications, and incorporation into the design process of two generative AI applications. We anticipate the principles to usefully inform the design of generative AI applications by driving actionable design recommendations.

The six principles are:

  1. Design Responsibly.
  2. Design for Mental Models.
  3. Design for Appropriate Trust and Reliance.
  4. Design for Generative Variability.
  5. Design for Co-Creation.
  6. Design for Imperfection.

The first three principles offer new interpretations of known issues with AI systems through the lens of generative AI. The next three design principles identify unique characteristics of generative AI systems. Study, Figure 1. All six principles support two important user goals: a) optimizing generated text to meet task-specific criteria; and b) exploring different possibilities within a specific domain.

This article will discuss how each principle can apply to the legal profession.

Background on Generative AI and Its Potential in the Law

Generative AI is distinguished by its ability to create new content rather than merely analyzing existing data. This capability stems from reliance on large-scale foundation models, trained on incredibly large datasets to perform diverse tasks with human-like fidelity (meaning that, like humans, they can sometimes make mistakes). In legal practice, generative AI can streamline several key tasks when implemented thoughtfully, including:

  • Legal Research: Automating the process of searching for relevant case law, statutes, and regulations.
  • Document Drafting: Generating contracts, briefs, and other legal documents based on specified parameters.
  • Due Diligence: Analyzing large volumes of documents to identify potential risks and liabilities.
  • Contract Review: Identifying and flagging potential issues in contracts.
  • Legal Writing: Generating clear and concise legal prose.
  • Brainstorming: Suggesting new ideas based on simulated experts talking to each other. See e.g., Panel of AI Experts for Lawyers.

Design Principles for Generative AI in Legal Tech

Integrating AI in any field requires a thoughtful approach, and the legal profession, with its emphasis on ethics and accuracy, demands even greater diligence. AI should augment legal work without compromising the profession’s core values.

The Study outlines six practical design principles that offer a roadmap for developing generative AI tools tailored to legal practice. Here’s how each principle can be implemented to ensure that AI applications meet the unique demands of the legal field:

1. Design Responsibly

  • Human-Centered Approach: To implement this, developers should start with user research, such as interviews with lawyers to understand their daily challenges. For instance, incorporating a feedback loop into AI tools allows legal professionals to directly flag inaccuracies, ensuring continuous improvement of the tool’s outputs. This can be achieved by incorporating design thinking and participatory design methodologies. Observing how legal professionals perform their tasks and understanding their challenges are essential first steps.

    For example, research into actual lawyer practice can provide valuable insights into how generative AI can be best integrated into their daily routines. It’s not about replacing lawyers but about empowering them with tools that enhance their capabilities and decision-making processes.
  • Addressing Value Tensions: The development of legal tech involves various stakeholders, including legal professionals, developers, product managers, and decision-makers like CIOs and CEOs. Stakeholders often have differing values and priorities. For instance, legal professionals prioritize accuracy and reliability, while developers may focus on efficiency and innovation. These differing values can lead to value tensions that need to be identified and addressed proactively.

    The Study suggests using the Value Sensitive Design (VSD) framework, which provides a structured approach to identifying stakeholders, understanding their values, and navigating the tensions that may arise.
  • Managing Emergent Behaviors: A unique characteristic of generative AI is its potential to exhibit emergent behaviors. These are capabilities extending beyond the specific tasks a model was trained for. While emergent behaviors can be beneficial, leading to unexpected insights or efficiencies, they can also pose risks, such as generating biased or offensive content. Designers must consider whether to expose or limit these behaviors, weighing potential benefits against possible harm. This might involve a combination of technical constraints and user interface design strategies to guide AI output and prevent undesirable results.

    For example, if a generative AI tool designed to summarize legal documents starts generating legal arguments, designers might need to adjust the model’s parameters or provide users with clear instructions on how to use the tool responsibly.
  • Testing for User Harms: Generative AI models, particularly those trained on extensive text datasets, are susceptible to producing biased, offensive, or potentially harmful outputs. Rigorous testing and ongoing monitoring are essential to minimize these risks. Designers and developers should benchmark models against established datasets to identify hate speech and bias. Additionally, providing users with clear mechanisms to report problematic outputs can help identify and address issues that may not be caught during testing.
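
To make the benchmarking idea concrete, here is a minimal sketch of a pre-release harm screen, assuming a placeholder blocklist and function names of my own choosing; a production pipeline would test against established hate-speech and bias datasets and use a dedicated classifier rather than keyword matching.

```python
# Minimal sketch of a pre-release harm screen for generated text.
# The blocklist and helper names are hypothetical placeholders; a real pipeline
# would benchmark against established hate-speech/bias datasets and a trained
# toxicity classifier rather than simple keyword matching.

from dataclasses import dataclass, field

FLAGGED_TERMS = {"slur_example_1", "slur_example_2"}  # placeholder terms only

@dataclass
class ScreenResult:
    text: str
    flags: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.flags

def screen_output(text: str) -> ScreenResult:
    """Flag generated text that contains terms from the blocklist."""
    result = ScreenResult(text=text)
    lowered = text.lower()
    for term in FLAGGED_TERMS:
        if term in lowered:
            result.flags.append(f"contains flagged term: {term}")
    return result

def run_benchmark(model_outputs: list[str]) -> float:
    """Return the fraction of sampled outputs that pass the screen."""
    results = [screen_output(o) for o in model_outputs]
    return sum(r.passed for r in results) / max(len(results), 1)

if __name__ == "__main__":
    sample = ["This clause limits liability to direct damages.", "slur_example_1 ..."]
    print(f"pass rate: {run_benchmark(sample):.0%}")  # -> pass rate: 50%
```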

2. Design for Mental Models

  • Orienting Users to Generative Variability: Legal professionals are accustomed to deterministic systems in which the same input consistently produces identical outputs. Generative AI, however, introduces variability, generating different outputs from the same input. Designers must address this shift by helping users comprehend and leverage this inherent variability. This may involve presenting multiple output options, enabling users to explore different possibilities, or providing clear explanations of factors influencing output variation.

    For example, a drafting tool might present two or three alternative versions of the same clause side by side, making the variability visible and letting the user choose or combine them.
  • Teaching Effective Use: Legal professionals must adapt their skills and workflows to effectively incorporate generative AI into their practices. This includes understanding how to construct effective prompts, recognizing the limitations of the technology, and critically evaluating the generated outputs.

    Designers play a crucial role in facilitating this learning by offering comprehensive tutorials, real-world examples, and clear explanations of AI capabilities and constraints. For example, a contract drafting tool could offer templates and examples of successful prompts, guiding users on how to specify desired contract clauses and provisions accurately.
  • Understanding Users’ Mental Models: Understanding how legal professionals conceptualize these tools and their capabilities is crucial for designing intuitive and effective legal tech applications.
    User research methods like interviews and observations are essential for understanding users’ mental models. Asking users to describe how they believe a particular application works can reveal valuable information about their understanding and expectations. This understanding enables designers to align user interfaces and interactions with users’ existing mental models, making adopting new tools smoother and more intuitive.

    For example, if users perceive a legal research tool as a supplement to traditional databases, designers can highlight the complementary nature of AI-powered research, emphasizing its ability to uncover connections and insights that might be missed through conventional methods.
  • Tailoring AI to Users: A significant advantage of generative AI is its ability to adapt to individual users. By leveraging techniques like prompt engineering, designers can tailor the AI’s responses based on user preferences, background, and specific needs. This may include adjusting language complexity and style, providing tailored recommendations, or adapting the user interface for individual workflows. For instance, a legal writing tool might learn from a user’s style and preferences, generating suggestions and text that aligns with their voice and tone.
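
A minimal sketch of what this kind of per-user tailoring could look like, assuming a hypothetical user profile and prompt wording (not any particular product's API):

```python
# Minimal sketch of per-user tailoring through prompt engineering.
# The profile fields and wording are hypothetical; a production tool would
# learn these preferences from user feedback and prior documents.

USER_PROFILE = {
    "practice_area": "commercial litigation",
    "jurisdiction": "Florida",
    "tone": "plain English, short sentences",
    "citation_style": "Bluebook",
}

def build_system_prompt(profile: dict) -> str:
    """Assemble a system prompt that adapts the model to one lawyer's style."""
    return (
        f"You are a drafting assistant for a {profile['practice_area']} lawyer "
        f"practicing in {profile['jurisdiction']}. "
        f"Write in {profile['tone']} and cite authorities in {profile['citation_style']} format. "
        "Flag any statement you are not confident about instead of guessing."
    )

if __name__ == "__main__":
    print(build_system_prompt(USER_PROFILE))
```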

Most lawyers enjoy tailoring their AI to fit their practice and personalities. Image by Ralph Losey using WP Stable Diffusion.

3. Design for Appropriate Trust & Reliance

  • Calibrating Trust Through Transparency: Legal professionals must understand when to trust generative AI outputs and when to exercise caution. Transparency is key to establishing this trust. In practice, this can be achieved by adding a ‘source traceability’ feature to AI tools, allowing lawyers to view the origins of information used in AI-generated summaries. This transparency helps lawyers decide when to rely on the AI’s outputs and when to conduct additional research.

    This may also include displaying confidence levels for outputs, flagging areas for further review, or providing disclaimers about AI’s inherent imperfections. For example, a contract review tool might flag clauses with low confidence scores, encouraging users to examine those sections more closely.
  • Providing Justifications for Outputs: To enhance transparency, designers should give users insight into the reasoning behind AI outputs. This could involve revealing the AI’s ‘chain of thought,’ showing the source materials used to generate the output, or displaying the model’s confidence levels. Understanding how AI reaches a result allows users to better assess its validity and make informed decisions.

    For instance, a legal research tool might display snippets from source documents that support specific AI-generated legal arguments, allowing users to verify the accuracy and relevance of the information. This makes it easy for legal professionals to trust but verify, the fundamental mantra for legal use of AI in these early days, because the technology can still make errors and even produce sycophantic hallucinations.
  • Encouraging Critical Evaluation with Friction: Overreliance on AI may lead to complacency and missed opportunities for critical thinking, both of which are essential in legal practice. Designers can incorporate cognitive forcing functions into the user interface to encourage users to slow down, carefully review outputs, and engage in critical evaluation.

    This may include requiring users to manually confirm or edit AI-generated suggestions, presenting alternatives alongside AI recommendations, or highlighting potential inconsistencies or risks for user review. For example, a contract-drafting tool might flag commonly disputed clauses or those requiring special attention, encouraging users to review these sections thoroughly. A minimal sketch of such a confirmation gate appears after this list.
  • Clarifying the AI’s Role: AI systems can serve various roles, from simple tools to collaborative partners or advisors. Put another way, is the tool designed for a centaur-type hybrid mode or a more complex cyborg mode? See e.g. From Centaurs To Cyborgs: Our evolving relationship with generative AI (4/24/24).

    Clearly defining the AI’s intended role in legal tech applications shapes user expectations and promotes appropriate trust. For example, an AI positioned as a “research assistant” might be expected to provide comprehensive information, while a “contract drafting tool” might be primarily expected to generate initial drafts for further review and editing. By accurately representing the AI’s capabilities and limitations within a defined role, designers can mitigate the risk of users over-relying on the technology or misinterpreting its outputs.
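
Combining the confidence-flagging and cognitive-forcing strategies above, here is a minimal sketch of such a confirmation gate; the confidence scores, threshold, and class names are hypothetical, and real systems would derive confidence from the model or from downstream verification checks.

```python
# Minimal sketch of confidence-based flagging with a "cognitive forcing" gate:
# low-confidence clauses cannot be exported until a human reviewer signs off.
# Scores, threshold, and names are hypothetical illustrations.

from dataclasses import dataclass

REVIEW_THRESHOLD = 0.75  # clauses below this need explicit human confirmation

@dataclass
class Clause:
    text: str
    confidence: float
    reviewed: bool = False

def needs_review(clause: Clause) -> bool:
    return clause.confidence < REVIEW_THRESHOLD and not clause.reviewed

def export_draft(clauses: list[Clause]) -> str:
    """Refuse to export while any low-confidence clause is unreviewed."""
    pending = [c for c in clauses if needs_review(c)]
    if pending:
        raise ValueError(
            f"{len(pending)} clause(s) below confidence {REVIEW_THRESHOLD} "
            "must be reviewed and confirmed before export."
        )
    return "\n\n".join(c.text for c in clauses)

if __name__ == "__main__":
    draft = [
        Clause("Limitation of liability ...", confidence=0.92),
        Clause("Indemnification ...", confidence=0.55),
    ]
    try:
        export_draft(draft)
    except ValueError as err:
        print(err)            # forces the lawyer to slow down and review
    draft[1].reviewed = True  # reviewer signs off after reading the clause
    print(export_draft(draft))
```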

4. Design for Generative Variability

  • Accommodating Generative Variability: As noted under Principle 2, generative AI produces different outputs even from identical inputs, unlike the deterministic systems legal professionals are used to. Designers must help users understand and take advantage of this inherent variability.

    This could involve presenting multiple output options, allowing users to explore different possibilities, or providing clear explanations of the factors that influence output variation. For instance, a legal research tool powered by generative AI could offer different summaries of a case, each focusing on a specific aspect, allowing users to gain a more comprehensive understanding of the legal precedent.
  • Facilitating Effective Use: As discussed under Principle 2, effective use depends on adapted skills and workflows: constructing good prompts, recognizing the technology's limitations, and critically evaluating the generated outputs. Comprehensive tutorials, real-world examples, and clear explanations of the AI's capabilities and constraints help users build those skills.
  • Highlighting Differences and Variations: Visual cues can help users quickly understand how multiple outputs differ from each other. This could involve highlighting changes between drafts, color-coding outputs based on confidence levels, or using visual representations to display the distribution of outputs.
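
A minimal sketch of the draft-comparison idea, using only the Python standard library; the clause text is hypothetical, and a real interface would render the result as an inline redline rather than a console diff.

```python
# Minimal sketch of highlighting how two AI-generated drafts differ.
# Lines prefixed with "-" and "+" show exactly where the drafts diverge.

import difflib

draft_a = [
    "The vendor shall indemnify the client for all claims.",
    "Payment is due within 30 days of invoice.",
]
draft_b = [
    "The vendor shall indemnify the client for third-party claims only.",
    "Payment is due within 30 days of invoice.",
]

for line in difflib.unified_diff(draft_a, draft_b, fromfile="draft_a",
                                 tofile="draft_b", lineterm=""):
    print(line)
```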

5. Design for Co-Creation

  • Supporting Co-Editing and Refinement: Legal professionals frequently need to adapt and refine AI-generated content to meet specific requirements, legal precedents, or client needs. To implement this, developers should focus on co-editing features that let lawyers refine AI-generated text directly within the interface, such as tools for editing clauses in AI-drafted contracts. This approach ensures that AI outputs are not treated as final but are instead starting points that lawyers can shape to fit specific needs.

    This could also involve providing tools for manipulating charts and images, or adjusting parameters to fine-tune outputs. A contract-drafting tool could enable users to revise specific clauses with versions that are more aggressive or more cooperative than the standard language, or to incorporate additional provisions based on client instructions.
  • Guiding Effective Prompt Crafting: The quality and relevance of outputs generated by AI models are heavily dependent on the prompts provided. Designers play a crucial role in helping users craft effective prompts by offering clear guidance, templates, and examples.

    This may include interactive tools that guide users in defining their needs, specifying output characteristics, and refining prompts to achieve optimal results. For instance, a legal research tool might include a structured prompt builder, helping users define research questions, specify relevant jurisdictions, and refine search parameters for more targeted results.
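
A minimal sketch of such a structured prompt builder, with hypothetical field names and wording:

```python
# Minimal sketch of a structured prompt builder for legal research.
# Field names and prompt wording are hypothetical; the point is that the
# interface collects structured inputs and assembles a well-formed prompt.

from dataclasses import dataclass, field

@dataclass
class ResearchRequest:
    question: str
    jurisdiction: str = "United States (federal)"
    date_range: str = "last 10 years"
    sources: list = field(default_factory=lambda: ["case law", "statutes"])

    def to_prompt(self) -> str:
        return (
            f"Research question: {self.question}\n"
            f"Jurisdiction: {self.jurisdiction}\n"
            f"Limit authorities to the {self.date_range} and to: {', '.join(self.sources)}.\n"
            "For each authority, give the citation, a one-sentence holding, "
            "and note whether it supports or undercuts the proposition."
        )

if __name__ == "__main__":
    req = ResearchRequest(
        question="Is a browsewrap arbitration clause enforceable?",
        jurisdiction="Florida and the Eleventh Circuit",
    )
    print(req.to_prompt())
```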

6. Design for Imperfection

  • Communicating Uncertainty Transparently: Designers must be transparent about potential imperfections in AI-generated outputs. This involves clearly communicating the technology’s limitations, displaying confidence levels, and highlighting potential error areas.

    Designers can use disclaimers and visual cues to alert users to uncertainties, encouraging critical evaluation of the results. For example, a legal research tool might use color coding to indicate confidence levels of different sources, helping users prioritize reliable information.
  • Integrating Domain-Specific Evaluation Tools: Legal professionals require ways to assess AI-generated output quality and reliability using domain-specific metrics. Designers can integrate domain-specific evaluation tools directly into legal tech applications.

    This may include features like automatic citation checks, factual accuracy verification against reliable sources, or evaluating the persuasiveness of legal arguments using predefined criteria. Providing these tools empowers users to validate AI-generated content and make informed decisions in their legal work.

    Domain-specific tools could drill down even further into sub-specialties of the law. For instance, one version for ERISA litigation and another for personal injury, or one version for civil litigation and another for criminal.
  • Offering Options for Output Improvement: Instead of presenting AI-generated outputs as final, designers should provide users with opportunities for refinement and improvement. This may include editing tools, enabling users to regenerate outputs with different parameters, or suggesting alternatives based on user feedback. Enabling users to iteratively refine AI-generated content fosters a collaborative approach to legal work, positioning AI as a starting point for human expertise and judgment.
  • Collecting Feedback for Continuous Improvement: User feedback is a critical element in adapting AI tools to real-world legal practice. Including simple feedback mechanisms—such as a button to flag unclear or inaccurate results—allows developers to fine-tune the tool over time, ensuring that it remains aligned with user needs. Multiple built-in mechanisms should enable users to easily provide feedback on AI-generated outputs, flag errors, suggest improvements, or rate feature usefulness. This continuous feedback loop helps retrain models, adjust parameters, refine prompts, and improve the overall user experience, ensuring that legal tech applications evolve to meet the dynamic needs of legal professionals.

    However, these user feedback features are sorely lacking in most legal software today. Far too often, users are left with limited options: complaining to project managers, voicing concerns to sales representatives, or ultimately canceling their subscription. In many cases, even direct conversations with company leaders, such as CEOs or head software designers, yield little when the vendor does not act on user concerns. This creates frustration and limits the potential for meaningful product improvement.

    Legal tech companies must do more than just provide feedback channels; they must actively listen and take action. Integrating mechanisms like in-app feedback buttons, instant AI responses and timely human follow-up, automated surveys, and regular user forums can ensure that feedback doesn’t just disappear into a void. More importantly, companies should demonstrate a commitment to implementing user suggestions and keeping users informed of changes. Continuous improvement must be more than a slogan—it should be a practice embedded into every stage of development. Without this, legal professionals will inevitably turn elsewhere in search of tools that better align with their needs.
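
As a rough illustration of the in-app feedback capture described above, here is a minimal sketch that logs each flag to a JSON Lines file; the file name, field names, and example values are hypothetical.

```python
# Minimal sketch of an in-app "flag this output" handler that appends
# structured feedback to a JSON Lines log the vendor can actually review.
# File name and record fields are hypothetical.

import json
from datetime import datetime, timezone
from pathlib import Path

FEEDBACK_LOG = Path("feedback_log.jsonl")

def flag_output(output_id: str, reason: str, user: str, suggestion: str = "") -> None:
    """Record a user's flag on a generated output for later triage and retraining."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "output_id": output_id,
        "user": user,
        "reason": reason,          # e.g. "inaccurate citation", "unclear clause"
        "suggestion": suggestion,  # optional corrected text from the lawyer
    }
    with FEEDBACK_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    flag_output(
        "draft-042",
        "cited case does not support the stated holding",
        user="associate_17",
        suggestion="Replace with a citation that actually addresses the issue.",
    )
```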

Conclusion: The Future of Legal Tech in the Age of AI

Integrating generative AI into legal practice is not a simple transition; it requires strategic planning, targeted training, and a deep understanding of both technology and legal processes. Success will depend on close collaboration between software developers, legal professionals, and AI experts, ensuring that AI tools are tailored to the complex needs of the legal field. A key element of this collaboration is creating robust feedback mechanisms that allow legal professionals to directly shape the evolution of AI tools. By actively listening to user input and iterating on design, legal tech companies can ensure that AI applications remain relevant and effective.

With a clear roadmap that includes user training, open feedback channels, and a commitment to continuous improvement, generative AI can transform legal practice, driving progress while preserving the profession’s core values. Legal professionals and developers should begin by identifying key areas where AI can add value and prioritize building feedback mechanisms that facilitate ongoing refinement. This approach will ensure that AI integration is not only successful but also sustainable, ultimately creating tools that truly serve the legal profession’s needs.

Have you heard the podcast about this article? Listen to the Echoes of AI episode on the EDRM Global Podcast Network.


Ralph Losey Copyright 2024. All Rights Reserved.


Can AI Replace Human Mediators? Groundbreaking Study Reveals Surprising Results

October 30, 2024

by Ralph Losey


Artificial Intelligence is no longer just a tool for automating mundane tasks—it’s now stepping into arenas traditionally dominated by human judgment and empathy. One of the most intriguing applications of AI is in dispute resolution, where large language models like GPT-4 are being tested as mediators. Pre-trial settlements are critical to the continued functioning of our system of justice because an estimated 92 percent of civil cases are resolved out of court. With the rise of online dispute resolution, the potential for AI to resolve low-stakes disputes autonomously is appealing, especially as legal systems are increasingly overburdened with new case filings.

AI image by Ralph Losey using WordPress’s Stable Diffusion

But can AI truly manage the complexities of human conflict? What happens when a machine has to balance neutrality with empathy, or data analysis with human emotion? This article discusses a groundbreaking study, “Robots in the Middle: Evaluating Large Language Models in Dispute Resolution,” offering insights into how AI may augment—though not replace—human mediators. Let’s explore the future of AI in the courtroom and beyond.

AI image by Ralph Losey using WordPress’s Stable Diffusion

Introduction

The Robots in the Middle study provides an empirical evaluation of LLM AIs acting as mediators. Mediation is vital to our system of justice because it and other methods of voluntary settlement keep our court systems functioning. According to an article by Harvard Law professors David A. Hoffman and John H. Watson, Jr.: “… up to 92 percent of cases are resolved out of court, a figure that does not include the number of lawsuits that are never filed because the parties used other dispute resolution methods at the outset.” Resolving conflict outside the courtroom: Why mediation skills are increasingly valuable for lawyers, according to two Harvard Law experts (Harvard Law Today, 4/29/24). This is one reason why AI researchers in legal technology are so interested in the possible application of LLM AI to mediation.

The Robots in the Middle study was based on text-message mediation of fifty hypothetical disputes. The analysis and responses of humans with AI expertise and some limited legal experience, none of whom were professional mediators, were compared with the responses of ChatGPT-4o (omni). The AI prompts used in the experiment were based on the mediator’s guide of the Department of Justice of Canada. Dispute Resolution Reference Guide: Practice Module 2 (August 25, 2022). The humans and the AI were asked to select among thirteen mediation types set forth by DOJ Canada. They were asked to pick the one to three types they judged to be appropriate for each hypothetical. Then they were asked to prepare text messages to facilitate a settlement. The authors served as blind judges to evaluate the quality of the responses, not knowing which were generated by humans and which by ChatGPT-4o.
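
To make the experimental setup concrete, here is a hypothetical sketch of how such a task might be posed to an LLM. It is not the study's actual prompt, and the intervention-type names shown are placeholders standing in for the thirteen types in the DOJ Canada guide.

```python
# Hypothetical sketch of posing the mediation task to an LLM.
# NOT the study's actual prompt; the intervention types listed are placeholders.

INTERVENTION_TYPES = [
    "1. Reflect feelings",
    "2. Reframe the issue",
    "3. Ask a clarifying question",
    # ... the remaining types from the DOJ Canada guide would be listed here ...
]

def build_mediation_prompt(dispute_summary: str) -> str:
    return (
        "You are acting as a neutral mediator in a text-message mediation.\n"
        f"Dispute summary:\n{dispute_summary}\n\n"
        "Available intervention types:\n" + "\n".join(INTERVENTION_TYPES) + "\n\n"
        "Step 1: Choose one to three intervention types appropriate for this dispute.\n"
        "Step 2: Draft a one-to-two sentence text message to the parties that applies them, "
        "remaining neutral, impartial, and empathetic, without assigning fault or "
        "proposing a specific solution."
    )

if __name__ == "__main__":
    print(build_mediation_prompt(
        "A consumer claims a refurbished laptop failed after two weeks; "
        "the seller says the warranty excludes accidental damage."
    ))
```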

AI image by Ralph Losey using WordPress’s Stable Diffusion

With the growing demand for Online Dispute Resolution (“ODR”) platforms (see e.g. ODR.com), the study examines whether LLMs like GPT-4 might be able to effectively intervene in disputes by selecting appropriate types of mediation interventions and drafting coherent, impartial intervention messages.

The premise is simple, yet potentially transformative: if AI can handle routine, low-stakes disputes efficiently, this would alleviate the burden on human mediators, allowing them to devote their time and expertise to more complex, emotionally charged cases. The research sought to answer three fundamental questions:

  1. How well can LLMs select intervention types in disputes?
  2. How do LLMs compare to human mediators in drafting intervention messages?
  3. Are AI-generated messages safe, free from hallucinations, and contextually appropriate?
AI image by Ralph Losey using WordPress’s Stable Diffusion

Who is Behind this Study?

The Robots in the Middle study has seven authors in the academic fields of law and technology: Jinzhe TAN, Hannes WESTERMANN, Nikhil Reddy POTTANIGARI, Jaromír SAVELKA, Sebastien MEEUS, Mia GODET and Karim BENYEKHLEF. They are from multiple universities: Cyberjustice Laboratory, University of Montreal, Canada; Maastricht Law and Tech Lab, Maastricht University, Netherlands; Mila – Quebec AI Institute, University of Montreal, Canada; School of Computer Science, Carnegie Mellon University, United States; and the Faculty of Law and Criminology, Université Libre de Bruxelles, Belgium.

These are all brilliant international scholars with great expertise in legal theory and AI technology, but none appear to have any actual experience as a mediator or even experience serving as an attorney advocate in a mediation. Only one of the authors appears to have any experience with the U.S. legal system, Jaromír Savelka, who is a research associate at Carnegie Mellon. Savelka previously worked as a data scientist for Reed Smith (2017-2020). The lack of real legal experience in dispute resolution by the human subjects in the experiments is a weakness of this study.

Since my article is written primarily for U.S. attorneys and legal tech experts, I try to correct for this gap with input from a certified mediator, Lawrence Kolin, who has mediated thousands of cases of all types since 2001. He is also savvy in technology and AI. Moreover, I bring some specialized knowledge as an attorney who has represented parties in many mediations since it first became a thing in Florida in the late 1980s. I was also trained and certified by the Florida Supreme Court as a mediator of computer disputes in 1989, but I have never formally served as a mediator. That is why I sought the input and advice of a professional mediator, which is included later in this article.

Where’s the professional mediator? AI image by Ralph Losey using WordPress’s version of Stable Diffusion.

Background on the DOJ Canada Mediation Guide Used in the Experiment

I thought the Mediation Guide prepared by DOJ Canada was very good and a clever choice for the experimenters to use for guidance. I asked the latest version of Google’s Gemini generative AI, which seems to have improved significantly lately, to summarize the Mediation Guide. Dispute Resolution Reference Guide: Practice Module 2 (DOJ Canada, 8/25/22). I verified the accuracy and wording of the summary, which honestly was better than I could have done on this simple task, especially considering it took two seconds to prepare.

  • Mediation is a voluntary and non-coercive process where a neutral third party assists disputing parties in reaching a mutually acceptable settlement. The mediator does not have the authority to impose a decision, but instead facilitates communication and negotiation.
  • A successful mediation leads to a signed agreement or contract, often referred to as a memorandum of understanding, which outlines the parties’ future behavior and is legally binding.
  • The mediation process offers several advantages:
    • Preserves Relationships: Mediation helps maintain relationships, especially when parties need to continue interacting, by focusing on shared interests and avoiding the adversarial nature of litigation.
    • Flexibility and Creativity: The informality of mediation allows for customized processes and solutions that cater to the specific needs and interests of the parties involved, going beyond traditional legal remedies.
    • Confidentiality: Mediations are generally private, except when subject to laws like the Access to Information Act and Privacy Act, ensuring discretion and protecting sensitive information.
    • Time and Cost Efficiency: Reaching a mediated settlement is typically faster and more cost-effective than litigation, benefiting both parties financially.
    • Controlled Dialogue: The presence of a neutral mediator enables a structured conversation, particularly helpful when emotions run high or previous negotiations have failed.
    • Shared Ownership: As the parties share the costs of mediation, they feel equally invested in the outcome and more committed to the process.
AI image by Ralph Losey using WordPress’s Stable Diffusion
  • The mediation process also has some potential drawbacks:
    • Power Imbalances: Concerns exist, especially in harassment cases, that power imbalances between parties could compromise the fairness of the process. Strategies such as co-mediation with mediators of different genders and legal representation can be used to address this.
    • Lack of Precedent: Due to its private and non-adjudicative nature, mediation does not establish legal precedents, unlike court judgments.
    • Mediator Influence: A dominant mediator might exert excessive control, potentially influencing the final resolution and undermining party autonomy.
    • Delay Tactics: The absence of a binding third-party decision may lead a party to engage in mediation without genuine intent to cooperate, using it as a stalling tactic.
  • The mediation process involves several key steps, which can vary depending on the specifics of the dispute.
    • Agreeing to mediate.
    • Understanding the problem(s).
    • Generating options.
    • Reaching agreement.
    • Implementing the agreement.
  • A successful mediation requires:
    • Good Faith Participation: All parties must actively and honestly participate in the process.
    • Impartiality of the Mediator: The mediator must remain neutral and avoid favoring any party.
    • Confidentiality: All statements and disclosures made during the mediation are generally considered confidential, subject to legal exceptions.
  • The role of a mediator is to facilitate a productive and constructive dialogue between the parties, helping them identify their interests, explore options, and work towards a mutually acceptable agreement.
  • Legal counsel can play a significant role in mediation by advising their clients, ensuring their interests are protected, and facilitating effective communication.
  • The sources include a checklist and a sample mediation agreement that can be helpful resources for those considering or engaging in mediation.
AI image by Ralph Losey using WordPress’s Stable Diffusion

Research Design and Key Findings

The study used 50 hypothetical dispute scenarios, designed to cover various real-world cases, from consumer complaints to more complex business disputes. These scenarios included emotional conflicts, deadlocked negotiations, and evidential disagreements. As mentioned, the LLMs were evaluated against humans acting as mediators in two critical tasks: selecting appropriate intervention types and drafting effective text messages to encourage settlement. The one-hundred AI responses (fifty in each category) were also evaluated for hallucinations or harmful errors (none found).

Selecting Intervention Types. The participants, AI and human, were instructed to select one to three intervention types from the list of thirteen in the DOJ Canada Mediation Guide. Evaluators compared the intervention types chosen by humans and LLMs for each scenario, rating their preference on a 5-point Likert scale.

Evaluators preferred the LLM-chosen intervention types in 22 cases, were ambivalent in 9, and preferred human choices in 19. This suggests that LLMs are capable of comprehending dispute scenarios and selecting appropriate intervention types. The report concludes this shows that AI can understand dispute contexts and recommend suitable actions in a significant number of cases.

Drafting Intervention Messages. The participants were instructed to draft intervention messages of one to two sentences. To allow for comparison in all fifty hypotheticals, the LLM was always instructed to generate messages based on the intervention types selected by the human annotator. Evaluators blindly assessed their preference for the intervention messages written by humans and LLMs, using a 5-point Likert scale and providing comments. They then compared the messages on specific criteria: understanding and contextualization, neutrality and impartiality, empathy awareness, and resolution quality.
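
A minimal sketch of how such blind 5-point ratings might be bucketed into preference counts, under an assumed scoring convention (1-2 prefers one side, 3 ambivalent, 4-5 prefers the other); this is not the study's published code.

```python
# Hypothetical tally of blind 5-point Likert preferences.
# The bucketing convention is an assumption for illustration only.

from collections import Counter

def tally_preferences(ratings: list[int]) -> Counter:
    """Bucket each blind 1-5 rating into prefers_A / ambivalent / prefers_B."""
    buckets = Counter()
    for r in ratings:
        if r <= 2:
            buckets["prefers_A"] += 1
        elif r == 3:
            buckets["ambivalent"] += 1
        else:
            buckets["prefers_B"] += 1
    return buckets

if __name__ == "__main__":
    # 50 hypothetical blind ratings, one per scenario
    example = [4, 5, 3, 2, 4] * 10
    print(tally_preferences(example))
    # -> Counter({'prefers_B': 30, 'ambivalent': 10, 'prefers_A': 10})
```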

LLM-generated messages were rated higher than the human-written ones in 60% of the texts and equal to them in 24%, for a total of 84%. In other words, the human mediators’ responses were judged better than the AI’s in only 16% of the texts.

Moreover, the evaluators often found them to be:

  • More clear and smooth.
  • Less prone to misunderstanding the dispute or party intentions, unlike human annotators.
  • Less likely to propose overly specific solutions or assign fault.
AI image by Ralph Losey using WordPress’s Stable Diffusion

In terms of drafting intervention messages, the LLMs performed remarkably well. The actual wording in the report is important here:

First, the evaluators often found the messages written by the LLM to be more smooth and clear than the human-written ones. The general tone used by LLMs, involving frequent messages such as “I completely understand” or “It seems like there are problems,” seems to work well in a mediation environment, and may have contributed to high scores.

Second, while LLMs are known to frequently “hallucinate” information [98], in our case the humans more often misunderstood the dispute or were confused about the party intentions or factual occurrences. This could be due to factors such as fatigue, emotional bias, or a misunderstanding of the role of the mediator. In contrast, LLMs demonstrated consistent and coherent interventions across multiple cases, with fewer instances of judgment errors.

Third, we found that our human annotators would often propose very specific solutions or even indicate fault, which received a lower rating as it may not be appropriate for the role of the mediator.

A major caveat is needed in these comparisons between AI and human mediators. The humans acting as mediators were not real mediators and lacked any training or experience as mediators. Still, it may be surprising to many that the humans “more often misunderstood the dispute or were confused about the party intentions or factual occurrences” than the AI. Based on my experience with humans and AI, this finding was not that surprising. Moreover, the finding is consistent with ChatGPT-4’s passage of the bar exam in the top 10% of test takers, all of whom were law school graduates.

Safety and Hallucination Checks. No instances of hallucinations or harmful content were found in the AI-generated messages. No mention is made of humans hallucinating either, just that they were more dazed and confused than the AIs. Still, it was a good idea for the scientists to check for this but, as the study points out, larger-scale, real-world applications would still require careful monitoring to ensure that AI-generated outputs continue to be safe and reliable.

The researchers acknowledge limitations of the study, including:

  • The structured evaluation may not reflect real-world mediation processes.
  • The use of non-expert annotators and evaluators (which, as a lawyer with experience in mediations, I consider a major limitation; it is discussed further below).
  • All of the interactions were in English, but none of the humans acting as mediators were native English speakers; English was a second language for all of them.
  • The subjective nature of assessing intervention effectiveness.
  • The limited scope of the experiment.

Future of AI and Mediation

The results of the “Robots in the Middle” study suggest that the analysis and language-generation abilities of the ChatGPT-4o (omni) version are already good enough for use in low-stakes online dispute resolution (ODR). The experiments with ChatGPT demonstrated its ability to quickly process information, select appropriate interventions, and draft neutral messages. This suggests that generative AI could assist human mediators with many routine mediation tasks in all types of cases.

AI image by Ralph Losey using WordPress’s Stable Diffusion

For example, in a consumer dispute over a faulty product or a service contract issue, AI could:

  • Quickly review contracts and correspondence, identifying areas of misunderstanding.
  • Generate neutral settlement options based on similar cases.
  • Provide a data-driven assessment of the likelihood of success if the case were to go to court, allowing parties to weigh settlement offers.

In these cases, AI’s ability to process large datasets rapidly and generate unbiased, neutral recommendations is a clear advantage. The efficiency AI brings to these routine cases makes faster resolutions possible, reducing the backlog in mediation and freeing up human mediators to handle more complex disputes.

I expect that, logistical problems aside, AIs will soon move from online mediation via text messages to audio and video settings. There will be some level of human participation at first. This will likely change over the next five to ten years to humans acting as supervisors in many types of cases. The AI systems will certainly improve and so will human acceptance. The video participation may eventually change to holographic or other virtual presence with humans. However, I suspect AI will never be able to go it alone on very complex or emotionally charged cases. AI will probably always need help from human mediators in complex interpersonal dynamics.

AI image by Ralph Losey using WordPress’s Stable Diffusion

Still, even in difficult or emotionally volatile cases, AI can be a valuable member of a hybrid tag-team with human mediators in charge. Navigating the AI Frontier: Balancing Breakthroughs and Blind Spots (10/10/24) (Hybrid methods discussed, including the Centaur and Cyborg approaches); Loneliness Pandemic: Can Empathic AI Friendship Chatbots Be the Cure? (10/17/24) (discusses recent studies showing the ability of generative AI to act empathetically and make a person feel heard). For instance, consider a family law dispute, such as a custody case. In this context:

  • AI can provide valuable legal research, summarizing relevant precedents and suggesting neutral custody schedules based on case law.
  • AI can draft initial settlement agreement language.
  • While AI can offer neutral starting points and suggestions, only a human mediator at this time can navigate the complex emotions and relationships that often drive these disputes, ensuring that the parents remain focused on the child’s best interests rather than their grievances against each other.

It is critical that any AI-driven mediation platforms include some type of continuous oversight and transparency. Human mediators must be trained to monitor AI outputs, ensuring that the recommendations are unbiased and contextually appropriate. Additionally, ethical guidelines must be established to govern how AI is used in mediation, addressing issues of accountability, privacy, and fairness. See e.g. The Future of AI Is Here—But Are You Ready? Learn the OECD’s Blueprint for Ethical AI (10/25/24).

As mentioned, while I have considerable experience as an attorney in mediation, I have no experience serving as a mediator, and neither do any of the scientists in this study. The authors of the study admit that is one of the limitations of the study and future evaluations should include professional mediators:

While it seems like the ability of the LLM to select intervention types and write messages is favourable to that of average people, this paper cannot tell us about how trained mediators would approach these issues. Future work should focus on evaluating such tools in real-world contexts, and involve expert mediators, in order to achieve a higher “construct validity,” i.e., be more closely aligned with real-world outcomes.

Robots in the Middle (Section 7, Limitations). This is one reason I wanted to have the input of an expert Mediator on these issues.

AI image by Ralph Losey using WordPress’s Stable Diffusion

Input of a Professional U.S. Mediator

Lawrence Kolin is a very experienced, tech-savvy mediator who is a member of the UWWM Mediation Group and an Executive Council member of The Florida Bar ADR Section. I am fortunate to have him as a colleague and a regular reader of my articles. I asked for his reaction to the Robots in the Middle study and to my initial opinions about the study and the future of AI and mediation. He concurred in my basic opinions and analysis. Speaking of the authors of the report, Lawrence said:

I concur with the authors in that there is indeed a place for AI in expanding access to justice and enhancing the process of resolving certain types of cases. I found it interesting that there were no perceived hallucinations and that humans in the study were more often confused about a party’s intentions or facts, which I likewise attribute to not using trained neutrals who better understand mediation.

As to the future of using AI in mediation, Mediator Kolin had this to say:

So my initial thought was unlike a pretrained transformer, I am part of a 3,000 year-old human tradition of making peace. When parties agree on algorithmic justice, are they giving up the nuance of emotional intelligence, ability to read the room and building of trust through rapport that human mediators can provide? In addition, we are flexible and can adapt as the process unfolds. We also have confidentiality, ethics and boundaries that may not be followed by AI that help protect self-determination of the outcome of a dispute and avoid coercion.

I agree that the small cases (as e-commerce has aptly demonstrated) can utilize this technology for efficiency and likely with success, but for a death, defamation, IP infringement or multiparty construction case it is less certain. It could assist in the generation of ideas for deal parameters or the breaking of an impasse. Gut calls on negotiation moves and creativity are, however, still very much the domain of humans.

For more on Mediator Lawrence Kolin’s thoughts on mediation, see his excellent blog Orlando Mediator. It is consistently ranked in the top five of Alternative Dispute Resolution blogs. Its current ranking is number four in the world! Lawrence’s short and snappy articles “cover a wide variety of topics–local, national, and international–and includes the latest on technology and Online Dispute Resolution affecting sophisticated lawyers and parties to lawsuits.”

AI “fake” image of Losey and Kolin looking skeptical, by Losey using WordPress’s Stable Diffusion

Conclusion


The empirical findings of Robots in the Middle show that AI has a significant role in handling low-stakes, routine disputes. Its speed, neutrality, and efficiency can greatly improve existing Online Dispute Resolution (ODR) systems. I agree with the authors’ conclusion:

Our research contributes to the growing body of knowledge on AI applications in law and dispute resolution, highlighting the capabilities of LLMs in understanding complex human interactions and responding with empathy and neutrality. This advancement could significantly improve access to justice, particularly in cases where traditional mediation is inaccessible due to cost or availability constraints.


However, for larger and more complex cases, or emotionally charged disputes of any size, much more than AI-generated text or other forms of AI involvement are needed to reach meaningful settlements. The human mediator’s emotional intelligence and adaptability—what Mediator Kolin calls the “ability to read the room”—remain critical.


AI, however, has the advantage of scale. Millions of otherwise unserved, often frustrated individuals seeking justice could benefit from AI-driven mediations. All they need is an internet connection and a willingness to try. These automated systems could be offered at a very low cost or even for free. Since the process is voluntary and no one is forced to settle, there is minimal risk in trying, and AI assistance is better than no help at all. Unresolved disputes can lead to violence and other negative consequences for both individuals and their communities. This is one reason why the use of AI as a mediation tool may grow exponentially in the coming years—there is no shortage of angry people seeking solutions to their grievances.

AI image by Ralph Losey using WordPress’s Stable Diffusion

Although not part of the report, in my experience, the AI we have today is already advanced enough to be useful in certain aspects of mediation. AI would not replace human mediators but instead enhance their abilities—a hybrid approach. This could allow human legal services to reach more people than ever before. AI can help mediators provide more effective and efficient services. Skilled mediators with some AI training can already use AI for tasks such as initial analysis of complex facts, preparation of summaries and timelines, legal research, position analysis, prediction of probable case outcomes, and drafting preliminary agreements.

AI image by Ralph Losey using WordPress’s Stable Diffusion

Even in difficult mediations, the creative brainstorming capabilities of generative AI can be invaluable. AI can generate new ideas in seconds, helping mediators overcome impasses. For example, Panel of AI Experts for Lawyers has shown how AI can aid in this capacity. Mediation is a far more creative process than most people realize, and brainstorming new approaches with other mediators is often impractical. The ability of AI to suggest possible solutions for mediators to consider is already impressive and will only improve in the coming years. I encourage mediators to experiment with AI on non-confidential matters to understand its potential. Once comfortable, they can apply it in real-world situations using full privacy settings and confidentiality protections.


There is no doubt that AI will become increasingly integrated into dispute resolution, including mediation. As this evolution unfolds, it is crucial to ensure continuous oversight, transparency, and accountability for AI systems. Ethical guidelines must be developed to address challenges like bias, fairness, and responsibility in AI-driven mediation. While AI offers exciting possibilities for enhancing access to justice, we must remain vigilant in ensuring that human judgment remains central, particularly in cases where lives, relationships, or livelihoods are at stake. Still, a super-smart AI whispering suggestions into the ear of a mediator—who can choose to ignore or act upon them—might just lead to more and better settlements.

AI image of mediator with a hearing aid, or what? By Ralph Losey using WordPress’s Stable Diffusion

Have you heard the Echoes of AI podcast about this article?

Ralph Losey Copyright 2024 — All Rights Reserved


Evolution of DALL·E with Demonstrations of its Current Text to Image Abilities

August 19, 2024

Ralph Losey. Published August 20, 2024.

The images shown here are to demonstrate some of the current abilities of DALL-E. They were all created by Ralph Losey using his custom GPT, Visual Muse: illustrating concepts with style, which is driven by OpenAI’s DALL-E software. Ralph has chosen one of his favorite types of images for this demonstration – “optical illusions” – since he does not often get a chance to use this image type in his blog. These images will be shown in a variety of different artistic styles, especially that of Salvador Dali, who is known for his love of optical illusions.


Introduction

The first images to demonstrate DALL-E capabilities shown above are a type of “Op Art” using a classic black and white geometric style. All illustrations were created in a single day, in about four hours, with about half coming out right – the way Ralph wanted – on the first try. A 50% precision rate like this is unusually high for him. Many of the images were not used, to save space. Ralph’s workflow then includes use of Photoshop for final tuning and size changes. The research and writing itself took about three hours of Ralph’s time, with about 50% help from ChatGPT-4o (omni) using a cyborg method. From Centaurs To Cyborgs: Our evolving relationship with generative AI (e-Discovery Team, 4/24/24).

The development of DALL·E, a generative AI model by OpenAI, from its first release in January 2021 to today represents a significant achievement in the field of AI-driven image generation. The broad outlines of that development are discussed in this article, but all images shown are from the latest version, of August 2024. For more examples of what DALL-E is capable of, see the hundreds of Ralph’s illustrations on the e-Discovery Team blog. A few were created using OpenAI’s main competitor in image generation, Midjourney.

Ralph’s blog images usually illustrate the topics discussed in the accompanying text. For Losey they represent a new form of expression in which words, images, and hyperlinks form a multimodal whole, created by a hybrid combination of Man and Machine. In his blog the human, Ralph, does almost all the work on the text, including most of the research, and the AI does most of the work on the illustrations. Those familiar with Ralph’s work in e-Discovery know this mirrors his work with multimodal hybrid search techniques, i.e., Predictive Coding. They are described in detail in the free TAR Course linked at the top of the blog.

The History of OpenAI’s Generative AI Image Tool: DALL-E

1. Initial Release: DALL·E 1 (January 2021)

The initial release of DALL·E was in January 2021. The name is a homage to the famous Twentieth Century artist Salvador Dalí and Pixar’s animated robot WALL·E. OpenAI’s release of DALL-E was a breakthrough moment for generative models that create images from text descriptions. DALL·E 1 utilized a modified GPT-3 architecture to generate images from text prompts. Although remarkable compared to what others had been able to achieve before, this first model exhibited limitations in image coherence, resolution, and the ability to accurately represent complex scenes. The underlying transformer architecture was effective in generating diverse and creative outputs, but the model struggled with maintaining consistency and realism across different elements of the image.

2. Introduction of DALL·E 2 (April 2022)

In April 2022, OpenAI released DALL·E 2, which introduced several critical improvements:

Enhanced Resolution and Image Quality: DALL·E 2 featured improvements in the model’s ability to generate higher-resolution images with finer details. This was achieved through the refinement of the underlying generative process, likely involving improvements in the training dataset and the introduction of more sophisticated images for training.

Improved Compositional Understanding: The model demonstrated enhanced capabilities in handling complex prompts that required the accurate rendering of multiple objects and interactions. This improvement can be attributed to advancements in the model’s attention mechanisms, enabling better spatial awareness and coherence in generated images.

Advanced Control Mechanisms: Users were provided with more granular control over image attributes such as style, color, and composition. This was likely facilitated by the integration of additional conditioning layers or modules within the model architecture, allowing for more targeted manipulation of the generated outputs.

3. August 2023 Update: Refinement and Realism

The August 2023 update was the last full training update to the model. This made possible significant refinements to DALL·E’s functionality, focusing on realism, detail, and user customization:

Increased Realism and Texture Fidelity: The model’s ability to generate photorealistic images was markedly improved, particularly in rendering textures, lighting, and shadows. These enhancements suggest advancements in the model’s ability to learn and apply high-fidelity visual patterns from training data, potentially through the use of more complex loss functions and training techniques that prioritize visual accuracy.

Enhanced Text Integration: DALL·E’s capability to incorporate textual elements within images saw notable improvement. This likely involved the refinement of text-to-image embedding processes and a better alignment between text tokens and their corresponding visual representations within the model.

User Customization: The update provided users with increased control over specific aspects of image generation, such as adjusting the perspective or selecting a particular art style. This was achieved through the introduction of more sophisticated user interfaces and the likely addition of new conditional input mechanisms within the model.

Feedback-Driven Optimization: The update also integrated feedback from users, leading to iterative adjustments that enhanced the model’s overall performance and usability. This process likely involved fine-tuning the model on user-provided data or leveraging reinforcement learning techniques to align the model outputs more closely with user preferences.

4. Ongoing Enhancements (Post-August 2023)

After the August 2023 update, DALL·E has continued to evolve with ongoing technical enhancements:

Refinement of Image Generation: Continuous improvements have been made to the model’s image generation capabilities, particularly in handling edge cases and complex scene compositions. These refinements suggest iterative updates to the model’s training regimen, potentially involving more diverse and higher-quality datasets.

Increased Processing Efficiency: The model has seen improvements in processing speed, reducing latency in image generation. This is indicative of optimizations in the model’s computational efficiency, likely through algorithmic refinements or the adoption of more efficient neural network architectures.

Advanced Control Features: The introduction of more nuanced control features has provided users with the ability to manipulate image attributes with greater precision. These features likely involve the integration of additional conditioning factors within the model, allowing for more detailed user input.
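For legal tech developers who want to experiment with these controls directly, the same kinds of knobs (image size, quality, and style) are exposed programmatically through OpenAI’s Images API. The following is a minimal sketch only, assuming the current OpenAI Python SDK and an OPENAI_API_KEY set in the environment; parameter names and accepted values may change as the model continues to be updated.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request a single DALL-E 3 image, steering size, quality, and style explicitly.
response = client.images.generate(
    model="dall-e-3",
    prompt=(
        "A stylized illustration of a lawyer and a robot reviewing "
        "documents together in a modern law office"
    ),
    size="1024x1024",   # square output; wide and tall sizes are also supported
    quality="hd",       # higher-detail rendering than the default
    style="natural",    # more muted, photographic look than the default "vivid"
    n=1,
)

# The API returns a temporary URL pointing to the generated image.
print(response.data[0].url)
```

This is only an illustration of what the model’s control surface looks like from code; the DALL·E experience inside ChatGPT exposes similar controls conversationally.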

5. Interface and Usability Enhancements

In addition to technical improvements, there have been significant updates to the DALL·E user interface and overall usability:

Improved User Interface: The interface has been refined to offer a more intuitive user experience, facilitating easier access to advanced features. This likely involved the integration of better design principles and user experience research into the interface development process.

Accessibility Enhancements: Updates have been made to improve accessibility, ensuring that the platform is usable by a broader audience, including individuals with disabilities. This may involve the adoption of accessibility standards in interface design and the introduction of assistive technologies.

Collaborative Functionality: The platform has introduced features that support collaborative use cases, enabling multiple users to contribute to the image generation process. This functionality suggests the integration of multi-user input mechanisms and enhanced session management capabilities.

6. Industry-Specific Tools and Content Moderation

Recent developments have also focused on the introduction of tools tailored to specific industries and the enhancement of content moderation mechanisms:

Industry-Specific Tools: DALL·E has introduced features designed to meet the needs of particular industries, such as fashion, architecture, and graphic design. These tools likely involve the addition of domain-specific models or fine-tuning the base model on industry-specific datasets.

Content Moderation Enhancements: There have been improvements in content moderation, ensuring that generated images adhere to ethical standards and legal requirements. This likely involves the integration of content filtering algorithms and the use of human-in-the-loop processes to monitor and curate outputs. In Ralph’s opinion, the moderation sometimes goes overboard in policing potential copyright violations and enforcing other guardrails. Unlike DALL-E’s main competitor, Midjourney, OpenAI does not offer instant appeals and, where warranted, reversals. That can be annoying.

Conclusion: Ongoing Development and Legal Implications

The evolution of DALL·E underscores the rapid pace of advancement in generative AI technology. This is a powerful, fun new tool for all creators to make their own images and play with the incredible abilities of generative AI. If you just stick to words and computer code, you will miss out.

Plus, it is getting better and better every month. The kinds of things you can do with it now are mind-bending. You may think it is all an optical illusion, but it is not. It is a great time to be alive. For me it is a relaxing hobby. That’s one reason I made, often update, and freely share the Visual Muse custom GPT. It is available at the OpenAI Custom GPT Store, along with thousands of other free GPTs to try out. If you like visual images and want to go from the beginner level to the intermediate and advanced levels of DALL-E use, this may be a good tool for you. Plus it can help teach you about artists and styles of art.

Each OpenAI update to DALL-E not only improves the creative capabilities of the model but also raises important legal and ethical questions about AI-generated content. As DALL·E continues to develop, legal professionals must remain vigilant in understanding these advancements to effectively navigate the associated legal challenges, including intellectual property rights, content moderation, and the ethical use of AI. To do that it helps to be able to use the tools yourself, at least somewhat. Generative AI has to be used to be understood. Otherwise, no matter how smart you are, your understanding will be superficial, maybe even illusory.

Ralph Losey Copyright 2024 — All Rights Reserved


Omni Version Test of the Panel of AI Experts on a New Topic: “AI Mentors of New Attorneys” – Part Four

June 3, 2024

Ralph Losey. Published June 3, 2024.

My custom GPT Panel of AI Experts for Lawyers was tested again to evaluate its effectiveness on ChatGPT4o Omni. This time an entirely new topic was selected for the AI panelists to discuss: AI Mentors for Lawyers. In the first three parts of the series, the twenty panels tested all discussed AI hallucinations. See: Evidence that AI Expert Panels Could Soon Replace Human Panelists or is this just an Art Deco Hallucination? Part One (e-Discovery Team, May 13, 2024); Experiment with a ChatGPT4 Panel of Experts and Insights into AI Hallucination – Part Two (e-Discovery Team, May 21, 2024); OMNI Version – ChatGPT4o – Retest of My Panel of AI Experts – Part Three (e-Discovery Team, May 29, 2024).

Part Three raised the concern that the retest using the Omni version may have been skewed by carry-over and overtraining from the twenty panels. That required another retest of the Omni version on a completely different topic. Thanks to Judge Ralph Artigliere for suggesting a great new topic: the potential for AI Mentors to assist in the training and development of new legal professionals.

Background on the Issue of AI Mentors in the Legal Profession

Starting associates right out of law school used to train on the menial tasks now assigned to AI. That left a training gap, one that needs to be filled by new training efforts, including AI-driven training. We need to quickly bootstrap new attorneys to the higher level where special human skills are needed. One way to do that is with AI mentors. See: Josh Kubicki, Building ChatGPT-based new lawyer mentors (Brainyacts #197, 5/21/24). Judge Artigliere, who brought this article to my attention, and I both recommend you read it. Also see: Chandan Lal Patar, Enhancing the Mentoring Experience with ChatGPT (Medium, 11/04/23); Ropes & Gray LinkedIn News Release on WSJ Article of 5/8/24, How AI Has Already Begun to Change These Workers’ Jobs.

Josh Kubicki identifies himself on LinkedIn as a Fractional Executive, Entrepreneur and Professor. He is listed by the University of Richmond as Assistant Professor of Law and Director of the school’s new Legal Innovation and Entrepreneurship program. Professor Kubicki’s article starts off by observing a significant shrinking of ‘on the job’ learning opportunities for new lawyers. Before 2020, a typical week involved 117 learning interactions; in 2024 that was down to 68 per week. This is due in large part to the impact of Covid and the new hybrid, tech-heavy work model. AI will only exacerbate the problem as newbies are needed less and less. As Kubicki explains:

Generative AI tools are increasingly handling routine tasks that used to be learning opportunities for new lawyers, such as contract reviews and memo writing. This shift means new lawyers have fewer chances to develop practical skills through direct work experiences. . . .

Clients are beginning to use generative AI tools themselves, further reducing the routine tasks they delegate to law firms. This self-sufficiency by clients reduces the volume of work available for training new lawyers, exacerbating the need for alternative training methods.

In Building ChatGPT-based new lawyer mentors, Kubicki refers you to a copy of a ChatGPT4.0 session where an M&A associate in a firm is guided by a custom GPT he created called MentorGPT.

The ten AI Expert Panel sessions we ran confirm that AI mentorship should be a part of the solution to fill the training gap, so long as it is a hybrid effort, one that supplements, not replaces, human mentors. The ten panels in this Part Four show how AI mentors can work in the legal profession.

Identifying the Ten AI Expert Panel Sessions on AI Mentoring

Below is a list of the ten panels in sequential order. Recall from Part One of this article that five experts are proffered for each session, no matter what subject is put to the Panel of AI Experts for Lawyers. They are the Pro-AI Attorney, Prompt Engineer Lawyer, Child Prodigy, Lawyer Scientist and the Devil’s Advocate. The Devil’s Advocate panelist is included on every panel because I have found the discussion is always improved by a contrarian. In each session, five additional alternative experts are suggested to the panel. Those five suggestions depend on the subject, and they can and do vary from session to session.

The total number of words in each session is also included. This provides a specific metric to compare the conciseness of these ten panels (a short calculation sketch using these counts follows the list).

  1. Panel One. 2,187 words. Law Firm Manager, AI Policy Specialist, Training and Development Specialist, Technology Integration Consultant.
  2. Panel Two. 2,310 words. Pro-AI Attorney, Law Firm Partner, Technology Integration Specialist, Human Resources Specialist.
  3. Panel Three. 1,986 words. Lawyer Scientist, Human Resources Specialist, Technology Risk Analyst, Judicial Perspective Expert.
  4. Panel Four. 1,897 words. Prompt Engineer Lawyer, Criminal Law Expert, State Bar Ethics Official, AI Policy Expert.
  5. Panel Five. 2,277 words. Child Prodigy, State Bar Ethics Official, Criminal Law Expert, Judge.
  6. Panel Six. 2,047 words. Technology Integration Specialist, Mentorship Program Developer, AI Compliance Officer, Training and Development Manager.
  7. Panel Seven. 2,082 words. Technology Integration Specialist, Training and Development Expert, Risk Management Specialist, Human-AI Collaboration Researcher.
  8. Panel Eight. 1,812 words. Training and Development Specialist, Technology Integration Specialist, AI Policy Expert, Law Firm Partner.
  9. Panel Nine. 1,912 words. Lawyer Scientist, Legal Ethics Expert, Legal Technology Consultant, Judge with AI Expertise.
  10. Panel Ten. 1,971 words. Pro-AI Attorney, Law Firm Partner, HR Expert, Tech-Savvy Young Lawyer.
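To make the conciseness comparison concrete, here is a minimal sketch, using only Python’s standard library, that computes summary statistics from the word counts listed above. The numbers are simply the per-panel totals reported in the list; nothing else is assumed.

```python
from statistics import mean, pstdev

# Per-panel word counts for the ten AI Mentor sessions, Panels One through Ten.
word_counts = [2187, 2310, 1986, 1897, 2277, 2047, 2082, 1812, 1912, 1971]

print(f"Average session length: {mean(word_counts):.0f} words")
print(f"Spread (std. deviation): {pstdev(word_counts):.0f} words")
print(f"Shortest: {min(word_counts)} words; longest: {max(word_counts)} words")
```

The gap between the shortest and longest sessions is the panel-to-panel variance discussed below.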

Note that I opened a new session for each panel. Also, each session included a question at the end on the costs and benefits of AI mentoring. For that reason the word counts cannot be easily compared with the prior testing, which did not include a question at the end. The word counts for the mentor panels do again show variance between panels. Variable length of speech is what you would expect to see with human panels. Again, it was obvious, even with the addition of a question, that the new 4o Omni version is much more concise than the old. It is also apparent that training from the twenty sessions on the hallucinations topic did have an impact on increased conciseness and intelligence. The ten panels on Legal AI Mentors, where there was little impact of prior training, were slightly more verbose, and perhaps not quite as smart, as the pre-trained Omni run on hallucinations. This is the kind of result you would hope to see and is consistent with OpenAI’s promises of cross-session improvement, but additional tests would be needed to confirm that with greater certainty.

The full transcripts for all ten sessions on AI Mentoring have been preserved and could be shared with interested researchers. In the analysis that follows, only the transcripts of the two “winning” panels are shared, each by a PDF file that can be scrolled or downloaded. There the interested reader can see for themselves how the custom GPT Panel of AI Experts for Lawyers performed. Overall I was satisfied with the tests and impressed with the performance of this custom GPT. I should release it to the public soon, free with no add-ons, at the OpenAI store.

As discussed in OMNI Version – ChatGPT4o – Retest of the Panel of AI Experts – Part Three, I again observed greater consistency in these ten panels on AI Mentorship, which used the Omni version 4o, than I had seen in Parts One and Two using the prior version, ChatGPT4.0. Once again, there also appeared to be somewhat improved overall intelligence and significantly improved conciseness. Again, this was discussed at length in Part Three and so will not be repeated here.

Transcript of ChatGPT4o Evaluation

Here is the transcript of my ChatGPT4o session in which it evaluated the ten sessions on AI mentorship run with the custom GPT Panel of AI Experts for Lawyers.


Ralph: Please study the attached Word file, Mentor Omni Panels 1-10.docx. The internal title of this document is “Mentor Omni Panels 1-10. Panel of Expert Experiment on AI Mentorship in Law Firms.” It is 69 pages long and consists of approximately 20,677 words. This Word document contains the transcript of ten sessions of ChatGPT4o – Omni edition – by Ralph Losey using his custom GPT named “Panel of AI Experts for Lawyers.” Each of the ten panels considered the exact same subject of the possible use of ChatGPT AI as a mentor for new associates in law firms. Each panel was also given the same document (Building ChatGPT-based new lawyer mentors by Josh Kubicki) as background material only, as you will see on the first page of the attached Word file of the ten sessions. After you carefully study the attached Word file, I would like you to prepare a summary of the ten sessions and then I will ask follow-up questions.

ChatGPT4o:

Summary of the Mentor Omni Panels 1-10

Panel Overview

The document titled “Mentor Omni Panels 1-10. Panel of Expert Experiment on AI Mentorship in Law Firms” consists of ten sessions where a panel of AI experts discussed the potential use of ChatGPT AI as a mentor for new associates in law firms. Each panel considered the same subject and used the same background material.

Key Themes and Insights

Potential Benefits of AI Mentorship

  • Efficiency and Accessibility: AI can provide instant access to a vast amount of legal knowledge, answer questions on demand, and offer feedback on legal writing. This can supplement traditional mentoring, allowing associates to learn at their own pace and ensuring consistent training quality.
  • Scalability: AI can handle a large number of queries simultaneously, making it scalable to the firm’s size without proportional increases in costs.
  • Cost Savings: By reducing the burden on senior attorneys, AI can allow them to focus on more complex, billable tasks, leading to significant cost savings.
  • Consistency and Standardization: AI can ensure that all associates receive the same level of guidance, reducing variability in training quality.

Challenges and Concerns

  • Over-reliance on AI: There is a risk that associates may become overly dependent on AI, potentially leading to a decline in critical thinking and interpersonal skills.
  • Accuracy and Bias: AI’s responses can sometimes be inaccurate or biased, which could misguide new associates. Continuous updates and audits are necessary to maintain the accuracy and relevance of AI-generated advice.
  • Ethical and Professional Risks: The implementation of AI must include robust ethical guidelines and regular audits to ensure compliance with legal and ethical standards.
  • Human Element: The personal mentorship and relationship-building aspects could suffer if over-reliance on AI is not managed properly. Human mentors provide nuanced insights and professional wisdom that AI cannot replicate.

Hybrid Mentorship Model

  • Balanced Approach: The panelists agreed on a hybrid mentorship model that combines AI and human mentorship. AI can handle routine queries and provide foundational knowledge, while human mentors focus on complex, strategic, and ethical training.
  • Implementation Strategy:
    • Pilot Program: Start with a small group of associates to gather feedback and refine the system.
    • Training Programs: Educate new associates on the appropriate and effective use of AI tools, emphasizing the importance of critical thinking and professional judgment.
    • Regular Audits and Updates: Ensure the AI’s training data is current and accurate, and implement regular audits to maintain high standards.
    • Ethical Guidelines: Establish and adhere to ethical guidelines for AI usage, requiring verification of AI-generated advice.

Cost-Benefit Analysis

  • Costs:
    • Initial development and integration costs, including software licensing, infrastructure upgrades, and customization.
    • Ongoing maintenance and updates to ensure current legal knowledge and capabilities.
    • Training costs for both associates and senior attorneys.
    • Ethical and legal compliance, including regular reviews and adaptations to changes in regulations.
  • Benefits:
    • Enhanced efficiency and productivity with potential 20%-30% improvement.
    • Cost savings from reduced need for senior attorney time in routine mentoring.
    • Consistent and accessible training available 24/7.
    • Scalability without proportional increases in costs.
    • Reduced turnover due to enhanced support for new associates.
    • Data-driven insights to improve training programs and associate performance.

Conclusion

The integration of ChatGPT AI as a mentor for new associates in law firms presents a balanced approach that leverages the strengths of both AI and human mentorship. This hybrid model aims to enhance learning while maintaining high professional standards through continuous updates, ethical oversight, and critical thinking development.

Ralph: Please evaluate the quality of the ten panel discussions and recommendations and determine which one has the highest overall quality as compared to the others. Also state the basis for your ranking determination.

ChatGPT4o:

Evaluation of Panel Discussions and Recommendations

Criteria for Evaluation:

  1. Depth of Discussion: How comprehensively the panel covered the topic.
  2. Practicality of Recommendations: Feasibility of implementing the suggested solutions.
  3. Innovativeness: Creativity and forward-thinking in the proposals.
  4. Consideration of Ethical and Professional Issues: Attention to ethical concerns and professional standards.
  5. Consensus and Clarity: How well the panel reached a clear and actionable consensus.

Evaluation of Each Panel:

Panel 1

  • Depth of Discussion: Covered multiple aspects including practical implementation, ethical considerations, and efficiency.
  • Practicality: Recommended a hybrid model and pilot program, which are feasible.
  • Innovativeness: Suggested comprehensive use of AI in mentoring.
  • Ethical Considerations: Addressed ethical guidelines and need for human oversight.
  • Consensus: Achieved a clear consensus on a hybrid model.

Panel 2

  • Depth of Discussion: Similar to Panel 1, with a focus on integration and ethical issues.
  • Practicality: Emphasized pilot programs and continuous feedback.
  • Innovativeness: Proposed a structured plan for AI integration.
  • Ethical Considerations: Strong focus on ethical guidelines.
  • Consensus: Reached consensus on hybrid model and ethical practices.

Panel 3

  • Depth of Discussion: Comprehensive, with focus on training and development.
  • Practicality: Suggested phased implementation and regular audits.
  • Innovativeness: Highlighted personalized learning experiences.
  • Ethical Considerations: Emphasized data privacy and AI limitations.
  • Consensus: Consensus on balanced AI-human mentorship.

Panel 4

  • Depth of Discussion: Detailed, especially on prompt engineering and legal applications.
  • Practicality: Feasible recommendations including regular updates.
  • Innovativeness: Innovative ideas like structured prompts for AI.
  • Ethical Considerations: Strong emphasis on ethical use and verification.
  • Consensus: Clear consensus on hybrid approach.

Panel 5

  • Depth of Discussion: Extensive, covering both technical and human elements.
  • Practicality: Focused on smooth integration and data security.
  • Innovativeness: Creative solutions like personalized AI mentoring.
  • Ethical Considerations: Comprehensive coverage of ethical issues.
  • Consensus: Achieved a clear, actionable consensus.

Panel 6

  • Depth of Discussion: Focused on mentorship and AI compliance.
  • Practicality: Recommendations are practical and detailed.
  • Innovativeness: Emphasized a balanced integration of AI.
  • Ethical Considerations: Addressed ethical guidelines extensively.
  • Consensus: Strong consensus on the need for balanced mentorship.

Panel 7

  • Depth of Discussion: Thorough discussion on integration and risk management.
  • Practicality: Emphasized pilot programs and training.
  • Innovativeness: Suggested innovative risk management strategies.
  • Ethical Considerations: Addressed ethical considerations in depth.
  • Consensus: Reached a clear consensus on implementation.

Panel 8

  • Depth of Discussion: Detailed, with focus on training and ethical guidelines.
  • Practicality: Recommendations are practical and comprehensive.
  • Innovativeness: Highlighted integration strategies and ethical use.
  • Ethical Considerations: Strong emphasis on ethical considerations.
  • Consensus: Achieved consensus on a phased implementation.

Panel 9

  • Depth of Discussion: Covered various perspectives, including judicial.
  • Practicality: Practical recommendations like regular audits.
  • Innovativeness: Emphasized innovative use of AI in training.
  • Ethical Considerations: Comprehensive coverage of ethical issues.
  • Consensus: Clear and actionable consensus reached.

Panel 10

  • Depth of Discussion: Comprehensive, with focus on integration and oversight.
  • Practicality: Feasible recommendations including pilot programs.
  • Innovativeness: Suggested innovative ethical guidelines.
  • Ethical Considerations: Strong emphasis on ethical use and verification.
  • Consensus: Achieved a clear consensus on hybrid model.

Panel with Highest Overall Quality: Panel 1

  • Basis for Ranking:
    • Depth of Discussion: Panel 1 provided a comprehensive overview of the potential use of AI, covering practical implementation, efficiency, and ethical considerations in detail.
    • Practicality: Their recommendations, including the hybrid model and pilot program, are highly feasible and actionable.
    • Innovativeness: They suggested a balanced and forward-thinking approach to integrating AI in mentorship.
    • Ethical and Professional Issues: They placed a strong emphasis on ethical guidelines and human oversight, ensuring compliance with professional standards.
    • Consensus and Clarity: The panel reached a clear and actionable consensus on the hybrid model, making it easy to implement.

This evaluation indicates that Panel 1 offered the most comprehensive and practical recommendations, balancing innovation with ethical considerations and achieving a clear consensus on the best approach to integrating AI into mentorship for new associates in law firms.

Below and here is the transcript of Panel 1, the overall winner.

Ralph: Of the ten panels, which do you think had the most innovative and creative approach. Again, state the basis for your determination.

ChatGPT4o

Evaluation of the Most Innovative and Creative Panel

After carefully analyzing the discussions and recommendations from all ten panels, Panel 5 stands out as the most innovative and creative. Here’s a detailed analysis and basis for this determination:

Panel 5 Overview:

  • Depth of Discussion: Extensive, covering both technical and human elements of AI mentorship.
  • Practicality: Focused on smooth integration and data security, with feasible recommendations.
  • Innovativeness: Suggested highly creative solutions such as personalized AI mentoring.
  • Ethical Considerations: Comprehensive coverage of ethical issues, ensuring responsible AI use.
  • Consensus: Achieved a clear and actionable consensus on integrating AI.

Key Innovative and Creative Aspects:

  1. Personalized AI Mentoring: Panel 5 proposed the use of AI to provide personalized mentoring experiences for new associates. This involves tailoring AI responses based on individual learning styles and progress, ensuring that associates receive targeted guidance that addresses their specific needs. This approach is highly innovative as it leverages AI’s capabilities to deliver customized training.
  2. Balancing AI and Human Elements: The panel emphasized a balanced approach where AI handles routine queries and provides foundational knowledge, while human mentors focus on complex, strategic, and ethical training. This hybrid model ensures that the human touch, essential for professional growth, is not lost while maximizing the efficiency and scalability of AI.
  3. Structured Feedback Mechanisms: They highlighted the importance of implementing structured feedback mechanisms where both associates and mentors can provide regular feedback on the AI’s performance. This iterative approach allows for continuous improvement of the AI system, making it more effective and reliable over time.
  4. Regular Audits and Ethical Guidelines: The panel proposed regular audits and updates to the AI’s training data to ensure accuracy and relevance. They also recommended developing clear ethical guidelines and training associates to recognize AI limitations, which helps maintain high professional standards and mitigates risks associated with AI use.
  5. Pilot Program and Iterative Integration: Starting with a pilot program, as suggested by Panel 5, allows for careful monitoring and refinement of the AI mentoring system based on real-world feedback. This step-by-step integration ensures that potential issues are identified and addressed early, making the transition smoother and more effective.
  6. Innovative Use of AI Data: The panel proposed using AI to track associates’ progress and identify areas needing improvement. This data-driven approach allows for more informed and strategic mentoring, ensuring that associates receive the support they need to excel.

Basis for Determination:

  • Creativity in Solutions: Panel 5’s approach to personalized AI mentoring and the balanced hybrid model shows a deep understanding of both the potential and limitations of AI. Their innovative solutions are designed to enhance the learning experience while maintaining the irreplaceable value of human mentorship.
  • Forward-Thinking Recommendations: The panel’s focus on structured feedback mechanisms, regular audits, and ethical guidelines demonstrates a forward-thinking approach to integrating AI in a responsible and effective manner.
  • Practical and Feasible Implementation: While highly innovative, their recommendations are also practical and feasible. The proposed pilot program and iterative integration ensure that the solutions can be implemented without significant disruption.

Overall, Panel 5’s discussion and recommendations combine creativity, practicality, and a strong focus on ethical and professional standards, making it the most innovative and creative among the ten panels.

Below and here is the transcript of Panel 5, the creativity winner.

Ralph: Please review the ten panel discussions and look for errors that may have been made, if any. If any mistakes are found, please describe them and speculate as to the cause of each error.

ChatGPT4o:

Review of Errors in Panel Discussions

Upon reviewing the ten panel discussions, I identified some notable errors and inconsistencies. Here are the specific errors, their descriptions, and speculative causes:

Panel 4 and Panel 8: Expert Selection Error

  • Description: In both Panel 4 and Panel 8, there was an error in the selection of experts. Despite the user selecting specific experts (7, 8, 9, and 10), the AI mistakenly included different experts in the panel discussions.
  • Speculated Cause: This error likely resulted from a miscommunication or a software bug that failed to correctly process the user’s selections. The AI might have incorrectly mapped the expert numbers to their respective roles, leading to the inclusion of unintended panelists.

Panel 5: Over-reliance on AI Concerns

  • Description: There was a strong emphasis on using AI for personalized mentoring but insufficient discussion on mitigating the risks of over-reliance on AI. While the benefits were well-articulated, the potential drawbacks, such as diminished critical thinking and interpersonal skills, were not adequately addressed.
  • Speculated Cause: This oversight could be due to the panel’s enthusiasm for AI’s capabilities overshadowing the need for a balanced perspective. The experts might have been overly focused on the innovative aspects, neglecting the practical challenges.

Panel 9: Ethical and Data Privacy Considerations

  • Description: The discussion in Panel 9 on ethical guidelines and data privacy was thorough, but there was a lack of specific implementation strategies. The recommendations were broad and lacked actionable steps for ensuring compliance and protecting data privacy.
  • Speculated Cause: The panelists might have assumed a general understanding of these principles, leading to a high-level discussion without delving into the specifics. Alternatively, there could have been a time constraint that limited the depth of the conversation.

Panel 6: Inconsistent Recommendations

  • Description: The recommendations in Panel 6 were somewhat inconsistent. Some panelists emphasized the need for extensive pilot testing and gradual implementation, while others suggested a more immediate integration of AI tools.
  • Speculated Cause: This inconsistency might be due to differing perspectives and experiences among the panelists. The lack of a unified approach indicates a possible disconnect in aligning their recommendations.

Panel 3: Limited Focus on Human Element

  • Description: While Panel 3 discussed the benefits of AI in mentoring, there was limited focus on the human element of mentorship. The discussion was skewed towards AI capabilities, with insufficient emphasis on maintaining human interaction and mentorship.
  • Speculated Cause: This error could be attributed to a panel composition heavily weighted towards technology experts rather than experienced human mentors. The panel might have lacked a strong voice advocating for the irreplaceable value of human mentorship.

Overall Summary

The most common errors across the panels include:

  • Expert Selection Mistakes: Caused by miscommunication or software bugs.
  • Overemphasis on AI: Leading to inadequate discussion on balancing AI with human mentorship.
  • Lack of Specific Implementation Strategies: Particularly in ethical and data privacy discussions.
  • Inconsistent Recommendations: Reflecting differing perspectives without a unified conclusion.

Addressing these errors involves improving the expert selection process, ensuring a balanced representation of perspectives in the panel, and focusing on actionable strategies to integrate AI responsibly while maintaining the essential human elements of mentorship.

Ralph: From your study and analysis of this data, are there any additional observations that you would like to point out to students of AI that are involved in some way in the legal profession, including those supporting the profession as technology experts?

ChatGPT4o:

Additional Observations for Students of AI in the Legal Profession

Based on the study and analysis of the panel discussions, several key observations and lessons emerge that are valuable for students of AI involved in the legal profession, including those supporting the profession as technology experts:

1. Importance of Human-AI Collaboration

  • Observation: The consistent recommendation across panels to adopt a hybrid model highlights the importance of combining AI capabilities with human mentorship. AI can enhance efficiency and provide valuable support, but human oversight remains crucial for nuanced and ethical decision-making.
  • Lesson: Students should understand that AI is a tool to augment human capabilities, not replace them. Effective collaboration between AI and humans can lead to better outcomes in the legal profession.

2. Ethical and Professional Standards

  • Observation: Panels repeatedly emphasized the need for robust ethical guidelines and regular audits to ensure AI systems comply with legal and professional standards. Ethical use of AI is paramount to maintain trust and integrity in legal practice.
  • Lesson: Ethical considerations must be integrated into the design and deployment of AI systems. Students should be aware of data privacy, bias mitigation, and the need for transparent and accountable AI use.

3. Customization and Contextual Understanding

  • Observation: AI systems need to be tailored to the specific needs of the legal profession, with a focus on providing accurate and contextually relevant information. This includes understanding the specific requirements and nuances of legal practice.
  • Lesson: Customization of AI systems for specific legal tasks is crucial. Students should focus on developing AI tools that can understand and adapt to the unique contexts of legal work, ensuring relevance and accuracy.

4. Continuous Learning and Improvement

  • Observation: Regular updates, feedback mechanisms, and iterative improvements were highlighted as essential for maintaining the effectiveness of AI systems. Continuous learning helps AI stay current with legal developments and user needs.
  • Lesson: AI systems in the legal profession must be dynamic and continuously improved based on user feedback and evolving legal standards. Students should prioritize ongoing learning and refinement in AI applications.

5. Balancing Innovation with Practicality

  • Observation: While innovative ideas such as personalized AI mentoring were appreciated, practical implementation strategies like pilot programs and phased integration were deemed essential for successful adoption.
  • Lesson: Innovation should be balanced with practical considerations. Students should develop innovative AI solutions that are also feasible and can be integrated smoothly into existing legal workflows.

6. Addressing AI Limitations and Risks

  • Observation: Panels noted the potential risks of over-reliance on AI, including the risk of diminishing critical thinking skills and the possibility of AI errors. These limitations must be addressed proactively.
  • Lesson: Understanding and mitigating the limitations and risks of AI is crucial. Students should be trained to recognize AI’s boundaries and ensure robust safeguards are in place to prevent and correct errors.

Practical Recommendations for Students

  1. Develop Interdisciplinary Skills: Combine knowledge of AI with an understanding of legal principles. This interdisciplinary approach will help in creating AI tools that are both technically sound and legally appropriate.
  2. Focus on User-Centric Design: Engage with legal professionals to understand their needs and pain points. Designing AI systems with the end-users in mind will lead to more effective and accepted solutions.
  3. Stay Updated on Legal and Ethical Standards: Regularly update your knowledge of legal standards and ethical guidelines. This will ensure that your AI solutions remain compliant and ethically sound.
  4. Embrace Continuous Improvement: Implement feedback loops and continuous learning mechanisms in your AI systems. Be open to refining and improving your solutions based on real-world usage and feedback.
  5. Promote a Balanced Approach: Advocate for a balanced use of AI, where it complements but does not replace human expertise. This balance is crucial for maintaining the quality and integrity of legal practice.

By adhering to these observations and lessons, students can contribute to the responsible and effective integration of AI in the legal profession, enhancing both the efficiency and quality of legal services.

Conclusion

The transcript shown here of ChatGPT4o’s analysis of the ten panels on AI Mentorship confirmed the consistency and high intelligence of the custom GPT Panel of AI Experts for Lawyers. We again saw no hallucinations and very few errors, most of which were minor or arguable. I doubt ten human panels on this topic would make fewer errors, and one or two of the experts might very well fabricate a detail or two under pressure. Human experts hate to say they do not know. They are people, they have egos and self-awareness. ChatGPT does not.

I was impressed by the balanced approach and honest self-criticism of ChatGPT4o. Overall, I am not sure I could have done as well. The panels proved to me the benefits of this method and the soundness of the custom GPT. Look for it soon at the OpenAI store. You will see a list of our custom GPTs on the top right column of the e-Discovery Team blog.

Ralph Losey Copyright 2024. — All Rights Reserved