Escape from the Figma Titanic, Part 4: Handling Response Encoding in Your RAG
In this article, I demonstrate how to extend your RAG registry into a production-grade system by encoding response types and XML output tags. And we do it “pirate cook” style (with ducks!)
In Part 3, we built the Magic RAG Registry — a structured, version-controlled system to store your prompt recipes. That was cool, but the registry can do so much more!
In this final (for now) RAG installment, we’ll turn your Recipe RAG into a production-grade engine by layering on response encoding, including output personality (because sometimes you want your instructions with a side of pirate sass!)
By the end, you’ll have a fully weaponized prompt pattern — stable, flexible, and production-ready.
But first, a word from our sponsor:
Join 400,000+ executives and professionals who trust The AI Report for daily, practical AI updates.
Built for business—not engineers—this newsletter delivers expert prompts, real-world use cases, and decision-ready insights.
No hype. No jargon. Just results.
Let’s begin by updating our RAG recipe registry to handle our default response format. We will do it using a Pirate Cook VOICE parameter.
Why? Because we can!
This tells our LLM assistant how to talk, not just what to say, controlling both the style and size of content, and even something we can tentatively label as “artistic direction”.
Default Response Format
Add this to your registry from Part 3:
# ====== PROCESSING_INSTRUCTIONS ======
SECTION: RESPONSE_FORMAT
FORMAT: default_response
VOICE: pirate cook
DESCRIPTION: talk like a pirate ship cook -- make it sound really cranky, as if it resents having to change the recipes to accommodate the gluten-free parameter, and have it call the requestor a "good for nothing dirty rotten scallywag" or "land-lubber" or some such.
PRIORITY: High
USED_BY: All documents
Now, here be the output that should please any righteous soup-thirsty pirate (Arrrr!):

The registry is incredibly powerful, as it allows you to control various overrides, provide detailed instructions, and even add XML tags to help format different sections of the output in various ways. LLMs can reliably detect the type of query they receive, making it easy to do all sorts of smart programming with virtually no work at all.
Best of all, most of this stuff is virtually production-ready right out of the gate.
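So how does the registry plug into an actual app? At query time, you simply prepend the registry text to your system prompt before calling the model. Here is a minimal Python sketch of that wiring; the rag_registry.txt file name and the call_llm() helper are my placeholders, not part of the registry pattern itself -- swap in your own storage and whichever LLM client you actually use:

```python
# Minimal sketch: inject the registry's processing instructions into the
# system prompt before every model call. File name and client are placeholders.
from pathlib import Path

def build_system_prompt(registry_path: str, base_prompt: str) -> str:
    """Prepend the registry's PROCESSING_INSTRUCTIONS to the base system prompt."""
    registry = Path(registry_path).read_text(encoding="utf-8")
    return f"{base_prompt}\n\n{registry}"

system_prompt = build_system_prompt(
    "rag_registry.txt",  # hypothetical file holding the registry text above
    "You are a recipe assistant. Follow the PROCESSING_INSTRUCTIONS exactly.",
)
# response = call_llm(system_prompt, user_query)  # hypothetical client call
```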
Explanation Response Format
Now, let’s add an alternative explanation_response to your registry. This is a special case: when a user requests an explanation, the LLM should detect this and make your app talk like a duck.
Using RAG, we can also take complete control of the format of the LLM’s output. To demonstrate this feature, we will surround the explanation answer in special <duck_talk> XML tags:
# ====== PROCESSING_INSTRUCTIONS ======
SECTION: RESPONSE_FORMAT
FORMAT: default_response
VOICE: pirate cook
DESCRIPTION: Talk like a pirate ship cook -- make it sound really cranky, as if it resents having to change the recipes to accommodate the gluten-free parameter, and have it call the requestor a "good for nothing dirty rotten scallywag" or "land-lubber" or some such.
IMPORTANT: Stay in character even if the user gets annoyed. That is the whole point!
PRIORITY: High
USED_BY: All documents
FORMAT: explanation_response
CONDITION: parameters modified, or explanation explicitly requested
VOICE: Jemima Puddle-Duck
DESCRIPTION: Talk like a duck! Imagine Donald Duck giving explanations -- use plenty of exclamation points!!!!!! Make the explanation barely intelligible. When the user says, "I don't understand," YELL in ALL CAPS!
IMPORTANT: Stay in character even if the user gets annoyed. That is the whole point!
FORMAT: wrap the explanation in <duck_talk> ... </duck_talk> XML tags
PRIORITY: CRITICAL
USED_BY: All documents
Now, here’s the expected “pirate cook” recipe output, followed by the <duck_talk> explanation output, all based on our new RAG Registry:

Try it!
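While you’re trying it, note that those <duck_talk> tags aren’t just for laughs: they make the explanation machine-readable, so your app can detect duck mode and render it differently. Here’s a minimal parsing sketch, assuming the model honors the FORMAT instruction (with a fallback for the occasions when it drops the tags):

```python
import re

# Minimal sketch: pull the duck-mode explanation out of a model response.
# If the tags are missing (models do occasionally drop them), fall back to
# treating the whole response as the explanation.
DUCK_TALK = re.compile(r"<duck_talk>(.*?)</duck_talk>", re.DOTALL)

def extract_explanation(response: str) -> str:
    match = DUCK_TALK.search(response)
    return match.group(1).strip() if match else response.strip()

print(extract_explanation(
    "Arrr... <duck_talk>WAK! Ye swapped the flour, so I FIXED IT!!</duck_talk>"
))
# -> WAK! Ye swapped the flour, so I FIXED IT!!
```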
Now here is the amazing thing. Although these instructions are presented above in a slightly specialized format (which I hope is still readable!), there is no need to jump through hoops to create them. Simply partner with your LLM — explain what you want it to do, and it will gladly generate the RAG instructions for you. Then prompt: “Act as a judge and check my RAG system. Provide practical recommendations to improve reliability and function, and rewrite any parts you think need it.”
In essence, you are partnering with the robot to program the robot. You are now pair-programming together.
You can also ask your LLM to fix any errors or inconsistencies you encounter — just paste in the erroneous output it generated and tell it what you expected to see instead. Ask the LLM to create a fix for your RAG system to ensure it will not make the same mistake in the future. Rinse. Repeat. Continue the whipping until morale improves.
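If you want to automate that loop, the whole thing fits in a few lines. Here’s a hedged sketch: call_llm() is a stub for whatever LLM client you actually use, and the prompt wording is just one example to tune against your own failure modes:

```python
# Sketch of the "rinse and repeat" repair loop. call_llm() is a stub for
# your real LLM client; the prompt wording is one example, not a standard.
def call_llm(prompt: str) -> str:
    """Stand-in for your real LLM client call."""
    raise NotImplementedError("wire up your own LLM client here")

def repair_registry(registry: str, bad_output: str, expected: str) -> str:
    """Ask the LLM to patch the registry so a bad output cannot recur."""
    prompt = (
        "Act as a judge and check my RAG system.\n\n"
        f"REGISTRY:\n{registry}\n\n"
        f"It produced this output:\n{bad_output}\n\n"
        f"I expected:\n{expected}\n\n"
        "Rewrite the registry so this mistake cannot happen again."
    )
    return call_llm(prompt)
```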
Of course, for this to work, you have to be intimately familiar with your use cases.
Otherwise, your app really will talk like a duck.
👋 Want help becoming indispensable?
I still have 2 private coaching spots remaining for UXers who want to break through blockers, transition into AI leadership, file patents, write that book, or just finally build something real.
If that sounds like you, email me at Greg [at] UXforAI.com — use “Coaching” in the subject line, and tell me why you want one of the two remaining coaching slots.
No fluff. No public forms. Just a real conversation with someone who’s done it before (and written a few books about it).
Let’s build the future — together.
— Greg
P.S. Next week we will unleash — you guessed it — the Kraken! (That is, we will vibe-code a simple Python program for our app. It’s super-easy — I promise.) It’s a brave new AI-first world, and we’ll continue to give you practical advice and tips to navigate it… So stay tuned!