Beyond Cite-Checking: Why the Duty of AI Competence May Require Subject-Matter Expertise, Not Just Citation Verification

Profile summary

Primary use cases: legal research, document drafting
Pricing tier: enterprise/custom
Target audience: law firm, in-house legal
Underlying model: RAG-augmented
Accuracy / benchmark data: Stanford RegLab 2024: Lexis+ AI hallucination 17%, Westlaw AI-Assisted Research hallucination 34%
Last reviewed: 2026-07-04

“I checked the citations” is a useful sentence. It is not, by itself, a competence analysis. A lawyer who uses generative AI to draft a brief, research memo, demand letter, or contract annotation can confirm that every cited case exists and still miss the more dangerous defect: the cited case may not stand for the proposition asserted, a controlling exception may be absent, or the answer may be plausible only to someone who does not practice in that field.

That is where the AI competence duty in professional responsibility becomes harder than the familiar anti-hallucination advice. Holland & Knight has framed the dispute as a choice between a “Duty to Verify” and a “Duty of Expertise”: under the first, the lawyer’s central obligation is to independently verify AI output; under the second, the lawyer must have enough substantive expertise to judge whether the work product is complete, accurate, and tailored to the client’s matter. That framing is a law firm’s interpretive analysis, not a binding ethics opinion, but it names the question more precisely than most compliance checklists do.[1]

[Figure: Split view of surface-level AI citation verification and attorney subject-matter review]

The better reading of the current materials points toward the Duty of Expertise, with an important caveat: no court or ethics committee has yet adopted that phrase as a settled standard. The argument is not that every AI-assisted task requires a specialist. It is that once a lawyer presents AI-generated legal analysis as professional work product, the lawyer cannot reduce Rule 1.1 competence to a citation-existence exercise.

Opinion 512 Does Not Let “Human Review” Do All the Work

ABA Formal Opinion 512, issued on July 29, 2024, is careful in the way ethics opinions often are: it gives lawyers room to use new tools, then quietly places the hard work back on professional judgment. The ABA summarized the opinion as addressing lawyers’ ethical duties when using generative AI, including competence, confidentiality, communication, candor, supervisory responsibilities, and fees.[2]

The opinion’s most important sentence for this issue is not simply that lawyers must review AI output. It is the warning that lawyers may not “leave it to GAI tools alone to perform functions that require a lawyer’s personal judgment.”[2] That language does more than require a final proofread. It identifies a category of professional functions that cannot be outsourced to the tool, even if the tool’s citations survive a database search.

There is a real tension here. Opinion 512 also reflects a sliding-scale approach: the amount and kind of review depend on the use case. A lawyer asking a tool to reformat a chronology does not need the same review protocol as a lawyer asking it to draft a dispositive motion. That sliding scale matters. It keeps the analysis from collapsing into a rule that only subject-matter experts may touch AI in any legal setting.

But the sliding scale does not save the Duty to Verify as a complete model. It tells us how much review may be needed; it does not redefine what counts as review when the task requires legal judgment. If the AI system proposes an argument about preemption, limitations periods, class certification, ERISA exhaustion, immigration removability, or an evidentiary exception, the reviewing lawyer must be able to evaluate the legal substance. That includes what the answer leaves out.

This is also where internal AI policies can mislead. A policy that says “all AI output must be reviewed by a lawyer” sounds disciplined, but it leaves the crucial question unanswered: reviewed by which lawyer, with what competence, against what sources, and for what kind of legal risk? For broader sanctions and ethics context, see the firm-policy discussion in AI Ethics in Legal Practice 2026.

The Problem Is Not Just Fake Cases

The first generation of lawyer-AI cautionary tales trained the profession to look for fabricated citations. That was necessary. It was also too easy. The more serious competence problem appears when the case is real, the citation is clean, and the proposition attached to it is wrong.

Stanford RegLab’s 2024 evaluation of legal retrieval-augmented generation systems found hallucination rates of 17% for Lexis+ AI and 34% for Westlaw AI-Assisted Research in the tested benchmark, and it identified “misgrounding” as a key failure mode: the system cites real authority that does not support the answer given.[3][4] The exact rates should not be treated as timeless product scores; tools change, benchmarks vary, and the study reflects the tested systems and period. The concept of misgrounding, however, is the part professional responsibility lawyers should linger over.

[Figure: Magnified legal citation marked as real while the adjacent legal proposition is flagged as unsupported]

Misgrounding defeats a simple citation audit. A junior lawyer, paralegal, or verification checklist can confirm that a case exists, that the quotation appears on the cited page, and that the parenthetical is formatted correctly. None of that proves the authority controls the issue, survives a later exception, applies under the relevant procedural posture, or belongs in the jurisdictional hierarchy the draft assumes.

A hypothetical example shows the point without needing a dramatic sanction order. Suppose an AI-generated memo says a plaintiff must satisfy a heightened pleading standard and cites a real appellate case. The case exists. The cited language appears. But the case involved a statutory claim with a specific scienter element, while the client’s claim does not. A citation checker may pass the sentence. A lawyer competent in the subject should hesitate.

This is why a hallucination audit remains useful but incomplete. A workflow like a six-phase AI hallucination audit checklist can reduce obvious failures, especially fake sources and unsupported quotations. It cannot by itself supply the doctrinal judgment needed to know whether the answer fits the matter.
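The gap between a citation audit and substantive review can be made concrete in a short sketch. The Python below is illustrative only: the `CitedAuthority` record, its field names, and both functions are hypothetical, no real research database or vendor API is involved, and the values for `supports_proposition` are assumed to come from a lawyer competent in the subject rather than from software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CitedAuthority:
    """One citation in an AI draft (hypothetical record, not a real API)."""
    name: str
    exists: bool                 # found in a citator or case-law database
    quote_on_cited_page: bool    # quoted language appears where cited
    # None means no competent reviewer has yet decided the question.
    supports_proposition: Optional[bool] = None


def citation_audit_passes(authority: CitedAuthority) -> bool:
    """Surface check only: the case is real and the quotation is accurate."""
    return authority.exists and authority.quote_on_cited_page


def ready_to_rely_on(authority: CitedAuthority) -> bool:
    """Surface check plus an affirmative substantive judgment by a reviewer."""
    return citation_audit_passes(authority) and authority.supports_proposition is True


# The hypothetical from the text: a real, accurately quoted case that turned on
# a statutory scienter element the client's claim does not have.
pleading_case = CitedAuthority(
    name="(hypothetical appellate decision)",
    exists=True,
    quote_on_cited_page=True,
    supports_proposition=False,
)

# A second citation that passed the audit but was never substantively reviewed.
unreviewed_case = CitedAuthority(
    name="(hypothetical unreviewed decision)",
    exists=True,
    quote_on_cited_page=True,
)

for authority in (pleading_case, unreviewed_case):
    print(authority.name, citation_audit_passes(authority), ready_to_rely_on(authority))
```

Both records pass the citation audit and neither is ready to rely on. The design choice worth noticing is that an unanswered substantive question is treated the same as a negative answer: the absence of review does not count as approval.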

Candor Problems Often Hide in What the Draft Omits

Rule 3.3 adds another reason cite-checking is too narrow. The duty of candor toward the tribunal includes the obligation to disclose legal authority in the controlling jurisdiction that the lawyer knows to be directly adverse to the client’s position and that opposing counsel has not disclosed. A workflow that checks only the citations already in the AI draft cannot identify an authority the draft failed to mention.

That omission problem is not exotic. Legal analysis is often competent because it knows the bad case, the exception, the exhaustion requirement, the local rule, or the standard of review that changes the answer. An AI draft may be polished precisely where it is incomplete. The danger is not only that a false citation will embarrass the lawyer. It is that the lawyer will sign a document that sounds researched while failing to account for authority a competent lawyer in the field would have recognized.

This is the point at which “independent verification” needs content. Independent of what? If it means checking the AI’s listed authorities against the database, it addresses only a subset of the risk. If it means independently researching the issue, testing the proposed rule against governing law, and asking whether contrary authority or client-specific facts change the analysis, it begins to look much closer to the Duty of Expertise.

Mata Was About Responsibility, Not Merely Bad Citations

Mata v. Avianca is still the case everyone reaches for, and usually for the obvious reason: lawyers filed materials containing fictitious cases generated by ChatGPT. In 2023, the Southern District of New York imposed sanctions after counsel submitted and then defended fabricated authorities in federal litigation.[5]

The useful lesson is narrower and sterner than “do not cite fake cases.” The court rejected the idea that ignorance about ChatGPT’s ability to fabricate cases excused the conduct.[5] The breakdown was a failure of the lawyers’ responsibility for the filing. The lawyers did not merely miss a typographical defect. They presented legal materials to a court without doing the work necessary to know whether those materials were real, relevant, and supportable.

That reasoning matters even when every case in a later AI-assisted filing is real. The court’s concern was not Bluebook purity. It was that lawyers had allowed an unreliable process to substitute for the professional judgment expected of counsel. For a fuller rule-by-rule treatment of hallucination sanctions, the AI hallucinations and attorney ethics risk digest covers the recurring disciplinary theories.

The Rule 5.3 Analogy Makes Delegation Familiar, and Limited

The New York State Bar Association Task Force Report, issued in April 2024, is useful because it treats generative AI less like a mystical new actor and more like a supervised nonlawyer for Rule 5.3 purposes. That analogy has limits; software is not a paralegal, and a model cannot be trained, disciplined, or instructed in the ordinary professional sense. But the analogy disciplines the conversation.

Lawyers have always been able to delegate tasks. They have not been able to delegate the legal judgment that makes the work competent. A litigation partner may ask an associate to draft a research memo; the partner still needs a review process fitted to the associate’s experience, the stakes, and the partner’s own responsibility for the advice or filing. Treating generative AI as a nonlawyer makes the same point in a less glamorous vocabulary: assistance is permitted, supervision is required, abdication is not.

Vendor diligence belongs in this picture, but it does not replace it. A firm should ask what sources a tool uses, how it retrieves materials, what logging is available, how confidentiality is handled, and what contractual claims the vendor will actually stand behind. A Model Rules–mapped AI vendor due diligence checklist can help structure those questions. Still, a clean vendor file does not make a lawyer competent in tax, immigration, securities, family law, criminal procedure, or any other subject the lawyer does not understand well enough to review.

AI Competence Has Both Technical and Ethical Content

Harvard Law School’s Center on the Legal Profession usefully separates two kinds of competence questions raised by Opinion 512: what lawyers need to understand about how generative AI works, and what lawyers need to understand about the ethical risks of using it.[6] That distinction prevents two common mistakes.

The first mistake is treating AI competence as purely technical. A lawyer does not need to become a machine-learning engineer to understand that generative systems can produce fluent, unsupported, incomplete, or misgrounded legal text. The second mistake is treating AI competence as purely ethical. Knowing that confidentiality, supervision, candor, and fees are implicated does not answer whether the draft’s legal analysis is right.

The professional responsibility problem sits between those two domains. A lawyer must understand enough about the tool to know why output requires scrutiny, and enough about the legal subject to conduct the scrutiny that matters. Without the second part, “human review” becomes a label placed over work the reviewer cannot actually evaluate.

What the Duty of Expertise Does, and Does Not, Claim

The Duty of Expertise is the more defensible reading of the present ethics materials, but it should not be overstated. It does not mean every AI-assisted first draft must be reviewed by a nationally recognized expert. It does not mean a generalist can never use AI to explore an unfamiliar issue. It does not mean citation verification, source inspection, and audit trails are unimportant.

It does mean that the lawyer responsible for the work product must be competent to evaluate the legal answer before relying on it or presenting it to others. Sometimes that competence can be obtained through ordinary study, consultation, co-counsel, supervision, or limiting the scope of the task. Sometimes the correct answer is that the lawyer is not the right reviewer. Rule 1.1 has never promised that a good process can fully compensate for an absence of substantive competence.

Review Question	What It Can Catch	What It May Miss
Do the cited authorities exist?	Fabricated cases, broken citations, inaccurate quotations	Real cases used for the wrong proposition
Do the cited authorities support the proposition?	Misgrounding, overbroad parentheticals, weak analogies	Uncited adverse authority or missing exceptions
Is the answer complete under governing law?	Doctrinal gaps, jurisdictional problems, procedural mismatches	Client-specific factual consequences if the reviewer lacks matter knowledge
Is this tailored to the client’s circumstances?	Generic advice, wrong assumptions, omitted constraints	Strategic or counseling risks outside the document’s frame
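
One way to read the table is as a ladder, where each deeper question is asked only after the shallower ones. The sketch below encodes that reading in Python. The cumulative ordering, the `ReviewLevel` type, and the `residual_risk` function are assumptions made for illustration; the row text is copied from the table.

```python
from typing import NamedTuple


class ReviewLevel(NamedTuple):
    """One row of the review table (hypothetical type for illustration)."""
    question: str
    may_miss: str


# Rows in order of increasing depth, as listed in the table.
REVIEW_LADDER = [
    ReviewLevel("Do the cited authorities exist?",
                "Real cases used for the wrong proposition"),
    ReviewLevel("Do the cited authorities support the proposition?",
                "Uncited adverse authority or missing exceptions"),
    ReviewLevel("Is the answer complete under governing law?",
                "Client-specific factual consequences if the reviewer lacks matter knowledge"),
    ReviewLevel("Is this tailored to the client's circumstances?",
                "Strategic or counseling risks outside the document's frame"),
]


def residual_risk(levels_completed: int) -> str:
    """Return what the deepest completed level may still miss.

    Assumes levels are completed in order, starting from the first row.
    """
    if levels_completed <= 0:
        return "Everything: no review has been performed"
    deepest = REVIEW_LADDER[min(levels_completed, len(REVIEW_LADDER)) - 1]
    return deepest.may_miss


# A cite-check alone corresponds to completing only the first level.
print(residual_risk(1))
```

Under these assumptions, a reviewer who stops after the first row is left with exactly the misgrounding risk: real cases used for the wrong proposition.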

The practical question, then, is not whether lawyers may use AI-generated legal work product. They may, subject to the ordinary and developing constraints of competence, confidentiality, communication, candor, supervision, and billing. The harder question is who is competent to sign off on the result.

A lawyer using AI-generated legal analysis should ask more than “Are the citations real?” The better question is: “Am I competent enough in this subject to know whether this answer is complete, accurate, and tailored to this client’s circumstances?” If the honest answer is no, the defect is not in the formatting of the citation. It is in the review.

References

[1] Generative AI and the Duty of Competence Conundrum, Holland & Knight, February 2025.
[2] ABA issues first ethics guidance on a lawyer’s use of AI tools, American Bar Association, July 29, 2024.
[3] AI on Trial: Legal Models Hallucinate in 1 out of 6 (or More) Benchmarking Queries, Stanford HAI.
[4] Legal RAG Hallucinations, Stanford RegLab, 2024.
[5] Mata v. Avianca, Inc., 678 F. Supp. 3d 443 (S.D.N.Y. 2023).
[6] Being a competent lawyer in the age of generative artificial intelligence, Harvard Law School Center on the Legal Profession, November 2024.
