With any new technology, questions of discovery and privilege inevitably arise. As a recent California case demonstrates, that’s certainly true of Gen AI.

The Tremblay Case

The case, Tremblay v. OpenAI, Inc., is pending in federal district court in California. It involves claims that OpenAI’s models were trained on the plaintiffs’ copyrighted materials. OpenAI moved to compel the plaintiffs to produce the prompts they used, and the responses they obtained, in pre-suit testing of the OpenAI tool, including the responses the plaintiffs did not use to support their claims.

The Magistrate Judge granted OpenAI’s motion to compel, rejecting the plaintiffs’ argument that the unused prompts and responses were protected attorney work product. The Magistrate reasoned that the negative results were “more in the nature of bare facts.” The Magistrate also ruled that the plaintiffs had waived the privilege by placing a large subset of the facts they obtained (the positive results) in their complaints.

The federal District Judge took a different view, holding that the negative prompts and results were indeed attorney work product and that labeling them as being in the nature of bare facts misapplied the law. The prompts, said the Judge, were queries crafted by counsel and contained counsel’s mental impressions and opinions about how to query ChatGPT.

As for the waiver argument, the Judge noted that waiver applies only where the mental impressions are at issue in the case and the need for them is compelling. No such showing had been made here.

Interesting Questions Abound

The case raises many interesting questions that go to the heart of what ChatGPT (and other large language models) do.

First, something doesn’t seem quite right about letting the plaintiffs use the positive results while hiding the negative ones. I think that falls within the legal concept of what’s good for the goose is good for the gander.

The work product privilege is set out in Rule 26(b)(3) of the Federal Rules of Civil Procedure. The Rule states that, ordinarily, “a party may not discover documents and tangible things that are prepared in anticipation of litigation or for trial by or for another party or its representative” unless the party seeking discovery shows it has a substantial need for the materials to prepare its case and cannot, without undue hardship, obtain their substantial equivalent by other means. Courts have generally held this to mean that a party cannot discover an attorney’s mental impressions, conclusions, opinions, or legal theories.

What Does This Mean for Large Language Models?

Beyond the goose-and-gander problem, it’s fair to ask why OpenAI couldn’t have obtained the same or similar information by crafting its own prompts on its own system. The answer, of course, is that a different, or even a similar, prompt does not necessarily produce the same response; indeed, the same prompt run twice may not. So, in this regard, OpenAI may have had an argument of substantial need, depending on how relevant the information was to the case.
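To make that point concrete, here is a minimal sketch (my illustration, not anything from the case record) using the openai Python package; the model name and prompt are assumptions chosen for the example. Because the model samples its output at a temperature above zero, even the identical prompt can come back with a different response on every run:

```python
# Minimal sketch: LLM responses are non-deterministic at temperature > 0.
# Assumes the `openai` Python package (v1+) and an OPENAI_API_KEY env var;
# the model name and prompt are illustrative, not taken from the case.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Summarize the plot of the novel The Cabin at the End of the World."

# Send the identical prompt three times. With temperature > 0 the model
# samples from a probability distribution over tokens, so the three
# responses will likely differ in wording, and possibly in substance.
for i in range(3):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    print(f"--- Run {i + 1} ---")
    print(resp.choices[0].message.content)
```

That variability is the nub of the substantial-need question: re-running its own prompts would not necessarily reproduce what the plaintiffs’ counsel actually saw.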

Waiver?

What about the waiver? The Judge seemed to conflate what must be shown to meet the compelling-need standard with waiver, which is the intentional relinquishment of a known right. Thus, if I provide ChatGPT with attorney-client confidential materials, the materials are no longer confidential, and I have waived any claim that they are. That has always been one of the main concerns with legal use of LLMs. Plus, the ability of LLMs to use prompts and responses for other inquiries has led most to believe that materials placed into a public-facing LLM like ChatGPT will lose their privileged status. It’s no different with work product materials.

By this point, it would be hard for any legal professional to say they didn’t know these facts.

Yes, you can instruct ChatGPT not to use your data to train the model. But the robustness of that setting and the protections behind it have not been tested. And yes, there are private systems that provide assurances your data will be kept confidential. But those systems often rely on OpenAI models in part, so we are back to assessing the validity of OpenAI’s assurances.

Don’t Forget About Reasonableness

Layer on top of this the notion that no system can promise that the data you provide to it will never be disclosed. Instead, confidentiality from a legal perspective depends on reasonableness. Were the protections and promises reasonable under the circumstances? What showing must be made to demonstrate that sufficient confidentiality protections are in place? What this means in the context of AI and Gen AI has yet to be fleshed out. All in all, the case raises interesting questions of privilege, waiver, and confidentiality that will be fodder for discovery disputes in 2025 and beyond.