Most AI Text Generators Are Bad
I was, and in many ways still am, skeptical about ChatGPT and similar LLMs' ability to actually improve the experience of the average user of an average SaaS application in any meaningful way.
I watched carefully as the tools I personally use (Notion, Docs, Linear) rushed to add generative text to their products. And I never got any value out of it. The text generated by Notion's AI blocks is bland and detectable from a mile away as machine-written.
Seeing this, I largely disregarded the hype cycle around LLMs, despite all the startups springing up left and right raising boatloads of money. After watching the crypto hype cycle and how it ended, I was pretty convinced that this was headed in a similar direction.
Most importantly, I couldn't answer the question: how does this make life easier for the average, non-technical person?
Then one day, after largely ignoring LLMs for over a year, I was dealing with a specific design problem that one of my businesses was having. Users were occasionally running into an issue where they wanted to rename their entire list of documents to follow a new pattern, which required a series of manual tasks in the UI (right-click, rename, type, save, repeat for all documents). This was a total pain, and I wanted to find a way to streamline it.
First I proposed a dedicated UI for this, where users could pick from a selection of naming schemes. This would be the standard approach to a UX problem: add a tool that streamlines the process.
But this solution sucked. First of all, this problem was not common, exactly — but when it occurred it left users frustrated and increased the probability that they'd churn. So I wanted to fix it, yes, but not at the cost of adding unneeded bloat to an already complex piece of software. Plus, a fixed set of naming schemes wouldn't be flexible enough for all use cases.
It turns out that LLMs are extremely well suited to this kind of problem, though it took a bit of setup. Here's the solution I came up with:
- Prepare a list of supported actions that the user is capable of executing, for example RENAME_DOCUMENT.
- Prepare the relevant data that the user has access to, for example the list of documents on their account.
- Ask the user to describe in plain English what they'd like to do.
- Feed the supported actions, context, and user prompt to an LLM, and ask it to generate a list of actions that the user is trying to execute.
- Execute the actions.
It worked like magic. LLMs are uniquely good at this type of problem: taking plain English and translating it into another format. Now, in our SaaS application, users can type "Rename all my documents to follow the naming scheme 'Chapter One' instead of 'Chapter 1'" and we can execute that operation in seconds, even over hundreds of documents.
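To make the five steps above concrete, here is a minimal Python sketch. It is not our production code: the prompt wording, the JSON reply format, and the function names (build_prompt, fake_llm, execute) are all assumptions made for illustration, and fake_llm returns a canned reply so the example runs without calling any real model.

```python
import json

# Step 1: supported actions. Only RENAME_DOCUMENT comes from the post.
SUPPORTED_ACTIONS = ["RENAME_DOCUMENT"]

def build_prompt(actions, documents, request):
    """Steps 2-4: combine actions, context, and the user's request into one prompt."""
    lines = [
        "Supported actions: " + json.dumps(actions),
        "Documents: " + json.dumps(documents),
        "User request: " + request,
        # Hypothetical reply format; any unambiguous structure would do.
        'Reply with a JSON list like [{"action": "RENAME_DOCUMENT", "id": 1, "name": "..."}].',
    ]
    return "\n".join(lines)

def fake_llm(prompt):
    """Stand-in for a real LLM call; returns a canned reply so the sketch runs offline."""
    return json.dumps([
        {"action": "RENAME_DOCUMENT", "id": 1, "name": "Chapter One"},
        {"action": "RENAME_DOCUMENT", "id": 2, "name": "Chapter Two"},
    ])

def execute(actions, documents):
    """Step 5: apply each returned action to the in-memory document list."""
    by_id = {doc["id"]: doc for doc in documents}
    for action in actions:
        if action["action"] == "RENAME_DOCUMENT":
            by_id[action["id"]]["name"] = action["name"]

documents = [{"id": 1, "name": "Chapter 1"}, {"id": 2, "name": "Chapter 2"}]
prompt = build_prompt(
    SUPPORTED_ACTIONS,
    documents,
    "Rename all my documents to follow 'Chapter One' instead of 'Chapter 1'",
)
execute(json.loads(fake_llm(prompt)), documents)
print([doc["name"] for doc in documents])
```

In a real integration, fake_llm would be replaced by a call to whichever model you use, and the reply would need to be validated before anything is executed.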
Abstraction and Implementation
This approach is capable of much more than renaming documents. Given a carefully defined set of actions and context, it gives users the power to avoid one of the worst feelings of using a web application — having to do something that's trivial to conceptualize but time-consuming to accomplish. If users can describe it, it can be automated without bloating your UI with highly specific tools.
Implementing the above is non-trivial: you'll need to set up your integration with an LLM, figure out how to serialize and deserialize your application's context and operations as text, set up a chat-like UI for the user, and tie it all together. It's certainly doable, but...
The Fast Way
At Rehance we've made it easy for you to cover your bases and retain your users. Define the available actions and context in our UI, and add our drop-in JS script to your site to get the chat interface up and running. The widget's theme is customizable, and integrating the drop-in script should take a front-end engineer an hour or so, depending on the number of operations you're looking to support. Check it out at rehance.ai.
The Slow Way
Here's the roadmap:
- Define a set of actions. Each action needs to have a set of parameters. Only include the bare necessities, as too much complexity will confuse the LLM you use.
- Define the shape of the context that may be needed to perform the provided actions. The details here depend on your use case, so you'll need to figure out what shape works best for you. As with the actions, minimize the number of fields and keep the data types as simple as possible.
- Create a chat-like interface for your users. They should be able to enter text describing what they're looking to accomplish.
- Construct a prompt for an LLM, providing the context and actions, along with the user request. Process the response, and convert it into actual operations performed in your software application. Handle errors and edge cases appropriately.
- Run analytics on your users' prompts to figure out which actions they use the chat tool for most. These are likely the most frustrating parts of using your application.
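The riskiest part of this roadmap is turning the model's reply into real operations, so here is a rough Python sketch of that step. Everything in it is an assumption for illustration: the ACTIONS schema, the "type"/"params" reply shape, and the parse_actions helper are hypothetical, not a prescribed format. The idea is to treat the reply as untrusted input, reject anything that doesn't match a known action, and tally what users actually ask for.

```python
import json
from collections import Counter

# Hypothetical schema: each action lists only the parameters it strictly needs.
ACTIONS = {"RENAME_DOCUMENT": {"document_id": int, "new_name": str}}

def parse_actions(raw_reply, known_ids):
    """Validate an LLM reply; return (valid_actions, errors) instead of raising."""
    try:
        candidates = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        return [], [f"reply is not valid JSON: {exc.msg}"]
    if not isinstance(candidates, list):
        return [], ["reply is not a JSON list"]

    valid, errors = [], []
    for index, item in enumerate(candidates):
        action_type = item.get("type") if isinstance(item, dict) else None
        schema = ACTIONS.get(action_type) if isinstance(action_type, str) else None
        if schema is None:
            errors.append(f"item {index}: unknown or malformed action")
            continue
        params = item.get("params")
        if (
            not isinstance(params, dict)
            or set(params) != set(schema)
            or not all(isinstance(params[key], kind) for key, kind in schema.items())
        ):
            errors.append(f"item {index}: bad parameters for {action_type}")
            continue
        # Never act on a document the user does not actually have.
        if "document_id" in params and params["document_id"] not in known_ids:
            errors.append(f"item {index}: unknown document {params['document_id']}")
            continue
        valid.append(item)
    return valid, errors

# Example reply: one good action, one unsupported action, one bad document id.
reply = json.dumps([
    {"type": "RENAME_DOCUMENT", "params": {"document_id": 1, "new_name": "Chapter One"}},
    {"type": "DELETE_ACCOUNT", "params": {}},
    {"type": "RENAME_DOCUMENT", "params": {"document_id": 99, "new_name": "Chapter Nine"}},
])
valid, errors = parse_actions(reply, known_ids={1, 2})

# Analytics: count which action types users trigger most often.
usage = Counter(action["type"] for action in valid)
print(valid, errors, usage)
```

Returning errors alongside the valid actions, rather than raising on the first problem, lets the chat UI explain what it could not do while still performing the parts of the request that were safe.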
Building this type of interface into every piece of software is a no-brainer. Text inputs won't replace traditional UIs by any means, but they can certainly cover their blind spots.