Why GenAI? Beyond the Buzzword.
It’s easy to be cynical about GenAI in software today. Many companies use it as a buzzword, adding simple capabilities like text suggestions or a chatbot for marketing purposes, just to keep up.
For us, it’s far more than that. Our business relies on accurate and quick document verification. As we scale, efficiency becomes paramount: it translates directly to cost savings for us and, more importantly, a better, faster service for our clients. GenAI isn’t a flashy feature for clients to see; it’s the engine of our operational excellence. Our clients enjoy the ultimate benefit: top-notch turnaround times on reports that have undergone rigorous quality verification. This isn’t GenAI for the sake of hype; it’s GenAI for the core mission.
The Insurance Auditor’s Dilemma: A Tale of Two Screens
At Jones, we automate the incredibly complex and detail-oriented process of verifying certificates of insurance (COIs), endorsements, and insurance policies. A crucial, and historically manual, part of this is verifying “Additional Insureds” — ensuring a list of required entities is correctly named on an insurance policy.
Before the AI Agent’s intervention, this was a painstaking process. Our domain experts, the Insurance Compliance Auditors, would have the requirements on one screen and the COI document on another. Their workflow was a constant swivel of the head as they would copy the required name, hit Ctrl+F to search the PDF, scroll through matches, highlight the text, take screenshots, and paste notes. This manual routine was time-consuming and mentally taxing.
The motivation was clear: we needed to automate the heavy lifting of text extraction and comparison. But simply presenting the AI Agent’s final verdict — “this entity is a match,” “this one has a gap” — wasn’t enough. We knew from experience that without proof, Insurance Compliance Auditors would feel compelled to redo the work themselves. This taps directly into their professional mindset, which is built on the “maker-checker” principle of dual control. By introducing an AI Auditing Agent, we positioned it as the “maker” of the initial analysis, which meant the human expert’s role as the “checker” became even more critical. They couldn’t just rubber-stamp the output. In fact, initial trials showed that if the interface didn’t provide transparency, auditors instinctively reverted to their old manual process to double-check the AI Agent, defeating the purpose of efficiency. The core challenge, therefore, wasn’t just building a smart Auditing Agent, but designing a user experience that made its intelligence verifiable.
Our Solution: From Black Box to Glass Box
We set out to create an interface where the AI Agent doesn’t just give answers, but shows its work. Our goal was to keep the auditor in a single flow, providing all the necessary context to make a quick, confident decision without ever needing to jump to another document or screen.
This philosophy is similar to how AI-driven grammar and style checkers like Grammarly built trust. Initially, writers were hesitant to trust an AI with their work. Would it understand context and nuance? Grammarly addressed this through a transparent UI: it underlines issues and, when clicked, explains the reasoning (“Conciseness: consider removing this word…”). The user remains in control: each suggestion can be accepted or dismissed with a click. By educating users on why it suggested a change and making the interaction opt-in, Grammarly turned skeptics into regular users. We wanted to bring that same clarity and control to our auditing process.
How Jones Designed the AI Auditing Agent:
1. Show the evidence, instantly
Instead of just flagging a discrepancy, we made the AI Agent’s reasoning accessible on hover. When an auditor hovers over an entity the AI has reviewed, a simple yet powerful tooltip appears. Notably, this tooltip displays the exact page number of the finding, along with a side-by-side comparison of what the system required versus what it found in the document.
- For a perfect match: The tooltip shows the found text in green, with a confirmation like “Entity name matched.”
- For a total miss: The tooltip shows the required text in red, stating “Entity is completely absent across documents.”
- For a partial match (the trickiest scenario): The tooltip highlights the differences, instantly drawing the auditor’s eye to the specific discrepancy (e.g., a missing “LLC”).
We also included a pop-out icon that, on click, opens the source document directly to the page where the evidence was found. This simple interaction is our “show your work” moment. It answers the auditor’s first question — “How do you know that?” — before they even have to ask.
2. Bridge the gap between requirement and resolution
When the AI Agent identifies a gap, it generates a proposed gap comment. This isn’t just an internal note; it’s the exact text that will appear on client-facing reports and be sent to vendors explaining the compliance issue. The stakes couldn’t be higher, as this is the final, crucial output of the entire verification process. To maintain context, we created a visual link. Hovering over a flagged entity in the requirements list now highlights the corresponding gap comment below. A single click scrolls the user directly to it, minimizing friction and the cognitive load of connecting a problem to its official resolution.
3. Empower the auditor, don’t dictate
Trust is a two-way street. While we wanted the AI Auditing Agent to do the heavy lifting, we needed to ensure the domain expert always felt in control, especially since the gap comments become the official communication to clients and their vendors. The AI-generated text is therefore fully editable. The auditor can accept the suggestion, tweak the wording for clarity or tone, or override it completely, ensuring they are the final guardian of our company’s voice.
To make this even more powerful, we added two critical features:
- “Revert changes”: A simple one-click action to undo any manual edits and restore the original GenAI suggestion.
- “Press & Hold to compare with AI”: This allows the auditor to quickly toggle between their edited version and the GenAI’s original, making it easy to see what’s changed.
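Both features fall out naturally if the AI's original suggestion is kept immutable alongside the auditor's edit. A rough sketch of that state model (class and field names are our own, not the production schema):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GapComment:
    ai_original: str                # immutable GenAI suggestion
    edited: Optional[str] = None    # auditor's override, if any

    @property
    def display_text(self) -> str:
        return self.edited if self.edited is not None else self.ai_original

    def edit(self, text: str) -> None:
        self.edited = text

    def revert(self) -> None:
        # "Revert changes": one click restores the original suggestion.
        self.edited = None

    def compare_view(self, holding: bool) -> str:
        # "Press & Hold to compare with AI": while held, show the AI
        # original; on release, show the auditor's edit again.
        return self.ai_original if holding else self.display_text

comment = GapComment("Acme Holdings LLC is not named as an Additional Insured.")
comment.edit("Please add Acme Holdings LLC as an Additional Insured.")
assert comment.compare_view(holding=True) == comment.ai_original
comment.revert()  # back to the GenAI suggestion
```

Because the original is never overwritten, "revert" and "compare" are free; no separate history mechanism is needed for this interaction.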
Moreover, we acknowledged that building trust is a gradual process. For moments when an expert still feels the need to manually verify a finding, our UX provides “trust-building” accelerators. Even if they are skeptical of a suggestion, they are still armed with the page number from the evidence tooltip and quick-copy icons for both the required and found text. This allows them to jump into the document and perform their own Ctrl+F search with unprecedented speed, turning a once-lengthy manual check into a swift confirmation. By making even the act of double-checking faster, we respect their expertise and build confidence in the system’s overall utility.
The Expected Impact: Speed, Accuracy, and Confidence
By moving from a “black box” to a transparent “glass box” approach, we anticipate a transformative impact on the auditing workflow. We expect a significant reduction in the time it takes to verify Additional Insureds. More importantly, we expect a rapid increase in auditor confidence and adoption of the AI feature.
The design directly addresses the core feedback from our users: it provides clear visual distinctions for gaps, offers an unmistakable indicator of the AI’s findings, and clarifies whether a resolution was automated or manually edited. This ensures the final communication sent to clients is both accurate and professionally crafted.
Tip for Product Managers: To truly measure the impact of your AI features, make sure to collect quantitative data. Before launching, benchmark how long it takes your users (in our case, auditors) to complete the task manually and measure their accuracy. After launch, track the same metrics. The goal isn’t just to be faster, but to be faster and maintain or improve accuracy. This data is invaluable for proving ROI and justifying future investment.
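As a sketch of what that benchmark might look like in practice, here is a toy calculation. The numbers are invented for illustration and the metric names are our own:

```python
from statistics import median

def summarize(times_sec, correct_flags):
    """Summarize task durations and correctness for one cohort."""
    return {
        "median_time_sec": median(times_sec),
        "accuracy": sum(correct_flags) / len(correct_flags),
    }

# Hypothetical measurements: seconds per verification, and whether
# the final verdict was correct (1) or not (0).
manual = summarize([412, 388, 540, 467], [1, 1, 0, 1])
assisted = summarize([95, 120, 88, 101], [1, 1, 1, 1])

speedup = manual["median_time_sec"] / assisted["median_time_sec"]
# The goal: speedup > 1 while accuracy is maintained or improved.
```

Even a simple table of median time and accuracy, before and after, is usually enough to make the ROI conversation concrete.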
Takeaways: Designing Trustworthy AI Agents
Building a successful human-in-the-loop AI system is a masterclass in UX design. The AI process is only half the equation; the interface is the other half that makes it trustworthy.
- Transparency is job one: Don’t just provide an answer. Show the evidence, the source, the “why” behind the AI’s conclusion. Let the user peek inside the box.
- Minimize context switching: Design to keep the user in their flow. Bring the verification evidence to them, rather than making them hunt for it across different screens or documents.
- Give the human the final say: Always provide an “out.” Ensure the user can easily override, edit, and revert AI suggestions. This maintains their sense of agency and control.
- Build on familiar patterns: Leverage well-understood UI patterns (tooltips, highlighting, track changes) to make the new AI-driven features feel intuitive and less intimidating.
Ultimately, our goal is to create a partnership where the AI handles the tedious 90%, allowing the domain expert to focus their valuable time on the critical 10% that requires their judgment. By designing for trust and transparency, we can build AI tools that truly augment human expertise, helping our users navigate their work faster and more confidently than ever before.
Thank you for taking the time to read. As a product manager and UX expert, I believe the principles of human-AI interaction are the most exciting challenges in our field today.