12 LLM Product Development Practices that give you an UNFAIR advantage

12 AI Product Best Practices that elevate the value of UX as a glue for the product development process

Greg Nudelman
February 09, 2024

While at first glance this list goes beyond just UX design, hopefully, you can see how it elevates the value of UX as a glue for the product development process. Here are the 12 LLM Product Development Practices that give you an unfair advantage:

Thin-slice your MVP use case. Like prosciutto, the tastiest slices are the thinnest-sliced fatty ones. Is your slice the thinnest and fattest you can get? What do customers complain about most? Can this be solved with AI? If so, you found the thin, fatty slice to train your model on. More Info: How to Pick an AI Use Case https://www.uxforai.com/p/how-to-pick-an-ai-use-case
Don’t waste time on over-designing the tactical UX. Simple is best. Chances are the UX will change drastically when the product hits the market. More Info: The Importance of Staying Lean https://www.uxforai.com/p/the-importance-of-staying-lean
Don’t forget the vision. Start your project with a thin-slice tactical UX design, but keep an eye on the vision. Running two projects (tactical and vision) in parallel on two different tracks is ideal. More Info: Essential UX for AI Techniques: Vision Prototype https://www.uxforai.com/p/essential-ux-for-ai-techniques-vision-prototype
Use the “Master LLM” approach. Your vision project should include plans for a “Master LLM” that routes each user query to a specific “thin slice” LLM area, e.g., “explain something” vs. “visual dashboard” vs. “answer question.” Each of these use cases is likely best handled by a specialized instance of the AI that can be trained separately. The “Master LLM” then directs the user query to the right specialized instance. More Info: UX Best Practices for Copilot Design https://www.uxforai.com/p/ux-best-practices-copilot-design
AI is like nothing you’ve ever shipped. Expect the design to change rapidly even (and especially) after the AI feature ships. Old processes of deliberate thinking and lengthy up-front validation do not fully apply to AI projects and will slow you down considerably. Try to get “provisional approvals” rather than full-scale ones, as the design will likely keep evolving. The technology is changing just as quickly, and that drives rapid changes in the UI and product definition. Rapid roll-out and iteration, in combination with direct market testing and model training, will likely become the industry norm for most AI-driven projects going forward. More Info: On holding back the strange AI tide, by Ethan Mollick https://www.oneusefulthing.org/p/on-holding-back-the-strange-ai-tide
Let the customers train your model. This is likely the single most important point: set up your project processes so the customers can train the model en masse, and then the model will provide the benefits for the customers. Your AI feature is essentially the intermediary between the two: you simply provide a convenient space (along with some specialized display, refinement, and feedback tools) for a “meeting of the minds” between your customers and a specially trained AI.
Your first project iteration is mainly there to collect the data. Provide prominent tools for feedback, e.g., the “mark as verified” and thumbs up/down controls in the Amazon Q insight example, plus a space to provide the answer you expected from the LLM. These engagement nudges really do work, and they need to be a prominent part of the experience, along with legal disclaimers and similar content warning the user that AI-generated information needs to be carefully verified before being used. More info: Amazon Q Preview: https://aws.amazon.com/q/
Technology stack matters! It used to be that you could get equivalent benefit from going with AWS, Azure, GPT, etc. Not anymore! The quality of the underlying AI model matters much more than anything else. ChatGPT on Azure is the clear winner at the moment for most applications; however, depending on your use case, a different model might do better. Be sure to try out multiple models (ChatGPT, Anthropic, Llama, Bard, etc.) and compare them. Also, do a quick spike and test for latency: how long does the model take to come back? That is the first thing they solved with the rabbit r1. The bar to hit at the moment? ChatGPT. More info: The Rise of AI-First Products https://www.uxforai.com/p/the-rise-of-ai-first-products
Use fine-tuning instead of prompt engineering. In my experience so far, fine-tuning seems to do better than prompt engineering. So if you can, pick an LLM platform that you can fine-tune. The new Amazon Q AI is an excellent playground for trying this out. While we are on the subject, pick a platform that takes feedback well: some models I have tried make few or no adjustments and resist training (see point 12 below on LLM model temperature).
Use Positive N-shot Prompting. If you do decide to prompt instead of fine-tuning, giving the system ample (a few thousand) examples of what to do rather than what NOT to do seems to work better. It is just like telling children “Remember to bring your gloves back home after you play” rather than “Don’t forget your gloves”: if you phrase the request as a negative, many kids will unconsciously do what you told them not to do, i.e., they will “forget the gloves.” Many LLMs seem to be kind of like that. More info: Prompt Engineering, Explained, Product Mindset https://productmindset.substack.com/p/prompt-engineering-explained
Incentivize AI product development in your organization. Start with a dead simple UI and model underneath and release it internally to your own company (Account Execs, customer support, gurus, etc.) so that you can both see the holes in your training data and use cases and crowd-source the training data. Tell everyone how critical this AI function will be to your company’s long-term sustainable success, and share how helping train this AI will ultimately aid everyone’s careers and long-term employment. If you can, track who does the most training work and make it a game to incentivize contributors. Give away valuable prizes.
Know when to chill your model. The temperature parameter is what makes your model more or less creative: higher temperature means more creative answers. For most SaaS products, lock down the model, i.e., set the model temperature parameter as low as possible to ensure consistent answers. If the initial answer is not correct, provide a button so the user can try to regenerate the answer. Experiment with using a slightly different model for the “regenerate” action, where the model temperature is turned way up, so the model gets more creative and generates some variability in the alternative answers. Experiment with how much “hotter” to make the model for this “regenerate” action. More Info: How to tune LLM Parameters for optimal performance https://datasciencedojo.com/blog/llm-parameters/
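
To make the “Master LLM” practice a bit more concrete, here is a minimal sketch of the routing idea. Everything in it is an assumption for illustration: the handler names, the `SPECIALISTS` table, and especially `classify_intent`, which uses a simple keyword check as a stand-in for what would really be a call to the “Master LLM.”

```python
from typing import Callable, Dict

# Hypothetical specialists: in a real product each would call a
# separately trained (or separately prompted) model instance.
def explain(query: str) -> str:
    return f"[explain model] {query}"

def dashboard(query: str) -> str:
    return f"[dashboard model] {query}"

def answer(query: str) -> str:
    return f"[Q&A model] {query}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "explain": explain,
    "dashboard": dashboard,
    "answer": answer,
}

def classify_intent(query: str) -> str:
    """Stand-in for the "Master LLM": return the name of a specialist.

    Assumption: a keyword check replaces the real model call.
    """
    q = query.lower()
    if any(word in q for word in ("chart", "graph", "dashboard", "plot")):
        return "dashboard"
    if any(word in q for word in ("explain", "why", "how does")):
        return "explain"
    return "answer"  # default thin slice

def route(query: str) -> str:
    """Send the user query to the specialist chosen by the classifier."""
    return SPECIALISTS[classify_intent(query)](query)
```

The useful property of this shape is that each thin slice can be added, retrained, or replaced on its own without touching the others; only the routing table changes.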
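
For the data-collection practice, it may help to see what a feedback record could look like. This is only a sketch under my own assumptions: the `Feedback` fields mirror the thumbs up/down, “mark as verified,” and expected-answer tools mentioned above, and `to_training_pairs` is a hypothetical helper, not any vendor's API.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Feedback:
    prompt: str
    response: str
    thumbs_up: Optional[bool] = None       # None means the user did not vote
    verified: bool = False                 # user clicked "mark as verified"
    expected_answer: Optional[str] = None  # what the user wanted instead

def to_training_pairs(items: List[Feedback]) -> List[dict]:
    """Keep only feedback that yields a usable (prompt, completion) pair."""
    pairs = []
    for fb in items:
        if fb.expected_answer:
            # A user-supplied correction replaces the model's response.
            pairs.append({"prompt": fb.prompt, "completion": fb.expected_answer})
        elif fb.verified or fb.thumbs_up:
            # Positive signal: keep the model's own response.
            pairs.append({"prompt": fb.prompt, "completion": fb.response})
        # Thumbs down with no correction tells us nothing to train on here.
    return pairs
```

Note that a bare thumbs down produces no training pair in this sketch, which is one reason the “answer you expected” field deserves a prominent place in the UI.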
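
The quick latency spike from the technology stack practice can be as small as the sketch below. The two “models” are fake stand-ins that just sleep (an assumption, so the example runs without any API keys); in a real spike you would swap in functions that call each candidate model.

```python
import statistics
import time
from typing import Callable, Dict

def median_latency(call_model: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Call the model several times and return the median time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call_model(prompt)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def compare(models: Dict[str, Callable[[str], str]], prompt: str) -> Dict[str, float]:
    """Measure every candidate with the same prompt."""
    return {name: median_latency(fn, prompt) for name, fn in models.items()}

# Hypothetical stand-ins for real model clients.
def fast_model(prompt: str) -> str:
    time.sleep(0.01)
    return "ok"

def slow_model(prompt: str) -> str:
    time.sleep(0.05)
    return "ok"

if __name__ == "__main__":
    results = compare({"fast": fast_model, "slow": slow_model}, "How long do you take?")
    for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds * 1000:.0f} ms")
```

The median is used rather than the mean so that a single slow outlier does not dominate such a small sample.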
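
Positive N-shot prompting comes down to how the prompt is assembled. The sketch below is one possible layout, not a standard: the `Input:`/`Output:` labels and the `build_prompt` helper are my own assumptions, and the example uses two demonstrations where a real prompt would use many more.

```python
from typing import List, Tuple

def build_prompt(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt from positive examples only (what TO do)."""
    parts = [instruction.strip(), ""]
    for user_input, good_output in examples:
        parts.append(f"Input: {user_input}")
        parts.append(f"Output: {good_output}")
        parts.append("")
    # End with the real query and an open "Output:" for the model to complete.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)

if __name__ == "__main__":
    print(build_prompt(
        "Rewrite the alert in plain language.",  # phrased as what to do
        [
            ("CPU > 95% for 10m on host-7", "Server host-7 has been overloaded for 10 minutes."),
            ("Disk 98% on db-2", "Database server db-2 is almost out of disk space."),
        ],
        "Mem 91% on web-3",
    ))
```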
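
Here is a minimal sketch of the “chill by default, hotter on regenerate” idea. The two temperature values are assumptions to experiment with, and `generate` is a hypothetical callable standing in for whatever model client you use; most LLM APIs accept a temperature setting, but the exact range and parameter name depend on the provider.

```python
from typing import Callable

DEFAULT_TEMPERATURE = 0.0     # assumed lowest setting, for consistent answers
REGENERATE_TEMPERATURE = 0.9  # assumed "hotter" value; tune by experiment

def pick_temperature(is_regenerate: bool) -> float:
    """Low temperature for the first answer, higher for the regenerate button."""
    return REGENERATE_TEMPERATURE if is_regenerate else DEFAULT_TEMPERATURE

def ask(generate: Callable[[str, float], str], prompt: str, is_regenerate: bool = False) -> str:
    """Call the model with the temperature that matches the user's action."""
    return generate(prompt, pick_temperature(is_regenerate))

# Hypothetical stand-in for a real model client.
def fake_generate(prompt: str, temperature: float) -> str:
    return f"{prompt} (temperature={temperature})"

if __name__ == "__main__":
    print(ask(fake_generate, "Summarize this incident"))
    print(ask(fake_generate, "Summarize this incident", is_regenerate=True))
```

Keeping the two values as named constants makes it easy to experiment with how much “hotter” the regenerate action should be.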

That’s all we have time for today! I hope you continue to explore the tremendous value UX can bring to your AI-driven product development lifecycle.

Greg Nudelman & Daria Kempka (Contributing Editor)
