Chat Interfaces Aren't the Future of Software — But They're Invaluable Additions

Text-based interfaces can make every user a power user, without bloating your site's UI.

Are We Going Back in Time?

In the early days of computing, text-based interfaces were the only option. Commands were executed by hand, and there was no mouse to click, drag, or resize things. Simply typing the right magic words would get you the desired outcome.

Since then, of course, we've progressed into the realm of visual user interfaces. We got a mouse to point and click with, and concepts like "windows", "tabs", "icons", and "toolbars" were introduced. By all accounts, the visual approach to interfacing with computers came to dominate the space, with text-based interfaces reserved for tech nerds like you and me.

Yet, since the release of ChatGPT, there's been a crescendo of voices questioning whether text-based interfaces may become the norm in some software again. Podcasters and tech 'influencers' make grandiose predictions that the future of software is simply typing what you want to do, and that the introduction of LLMs will lead to a total reimagining of how software is designed.

I don't see it playing out that way — but that doesn't mean that text-based interfaces don't have a place in the average SaaS application.

Pros and Cons of Text-Based and Visual UIs

In retrospect, the transition from text-based to visual UIs seems obvious. You don't have to first memorize a bunch of commands for every piece of software or tool that you use; instead, you just "click" the "button" for the action you're trying to accomplish! Now, one could argue that this point is moot when taking LLMs into account, as you don't need to memorize any commands at all; you can just type in plain English. So let's consider the pros and cons of text-based and visual UIs in the era of large language models.

Visual UIs

Pros:

  • Intuitive, thanks to iconography and the fact that we've been using visual UIs for many years.
  • Conveys structure. Consider Notion — without a visual sidebar showing the tree structure of all your documents, how useful would it be?

Cons:

  • Space-inefficient. Every feature or tool has some corresponding UI elements, so complex software takes up a lot of surface area and needs to be divided into dozens or hundreds of pages, tabs, menus, and sidebars.

Text-Based UIs

Pros:

  • Space-efficient. All you need is a text box and your users can do everything they need.
  • Flexible. Rather than having functionality tied to discrete UI elements, the user can enter anything they want to accomplish, and the software can do its best to appropriately respond to the request.

Cons:

  • Does not convey structure. An empty text box gives no information about what can be done in the tool, what data is stored in it and how it's organized, etc.
  • Unintuitive. It goes against the norms of modern computing, lacks iconography, and leaves users unsure where to start.

Combining Visual and Text-Based UIs for the Optimal User Experience

As it turns out, a combination of the two approaches is the most compelling. Visual UIs are without a doubt the better choice for all the core functionality of most applications. We're used to them, they're far more intuitive, and they convey the structure of the application. But their fatal flaw is, well, fatal — space inefficiency.

A screen only has so much real estate, and burying functionality under piles of menus and tabs adds cognitive load and makes for unhappy users. Designers have recognized this and responded by focusing on simplicity and ease of use in their work. But every choice in favor of simplicity results in the exclusion of potential functionality. This often makes it impossible to easily execute the kinds of complex tasks that power users face on a regular basis. This is a significant tradeoff that isn't talked about enough. We try to cover it up with band-aid solutions like keyboard shortcuts, but they don't cut it.

This is entirely a UI/UX tradeoff. After all, any given piece of software is technically capable of performing far more operations, and combinations of operations, than are available in the UI. But because of the perceived negative impact on simplicity and ease of use for the average user, those operations are excluded. So when a user faces a complex set of tasks, they have to go step by step, following a robotic process that takes forever. That slog is what makes a user feel like a normie instead of a power user. And if they find another tool that lets them automate or optimize it, they'll churn for it.

I also want to make it clear that this UI/UX tradeoff is real: the solution isn't to add every button, control, modal, and operation that your users might benefit from. Luckily, it turns out that text-based UIs are well suited to covering these bases, thanks to the flexibility they provide.

LLMs Excel Where Your UI Fails

LLMs are good at processing text and yielding more text. Applying these skills to a chat interface can kill all of the above birds with one stone. Users can interact with the software as usual (visually) for most things, but when they face a set of related tasks, or a particularly monotonous one, they can pull up the text-based interface to describe their intended outcome. With a bit of care, we can use LLMs to correctly handle these requests. To do this, we need the following:

  1. A way to provide the list of available operations to the LLM, in a concise, plaintext format;
  2. A way to provide the context around the user's request, such as the currently viewed list of documents in Notion's case;
  3. A way for the user to type in plaintext what they want to accomplish;
  4. A way to process the LLM's output into instructions for your SaaS using the provided list of available operations.
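
As a rough illustration of the first three requirements, here is a minimal TypeScript sketch that serializes a list of operations, the current context, and the user's request into a single prompt. Everything in it is an assumption made for the example: the ActionSpec shape, the describeAction and buildPrompt helpers, the two document operations, and the instruction to answer with a JSON array are all hypothetical, not part of any particular LLM SDK.

```typescript
// Hypothetical shapes for illustration; not taken from any real library.
type ParamType = "string" | "number" | "boolean";

interface ActionSpec {
  name: string;
  description: string;
  params: Record<string, ParamType>;
}

// Render one operation as a single concise line of plain text.
function describeAction(action: ActionSpec): string {
  const params = Object.keys(action.params)
    .map((key) => `${key}: ${action.params[key]}`)
    .join(", ");
  return `${action.name}(${params}) - ${action.description}`;
}

// Combine operations, context, and the user's request into one prompt.
// The JSON response format requested here is an assumption of this sketch.
function buildPrompt(actions: ActionSpec[], context: unknown, userRequest: string): string {
  return [
    "You translate user requests into calls to the operations listed below.",
    'Respond with a JSON array of objects shaped like {"action": string, "params": object}.',
    "Use only the listed operations. Respond with [] if the request cannot be fulfilled.",
    "",
    "Available operations:",
    ...actions.map(describeAction),
    "",
    "Current context (JSON):",
    JSON.stringify(context),
    "",
    "User request:",
    userRequest,
  ].join("\n");
}

// Made-up operations and context for a document tool.
const actions: ActionSpec[] = [
  { name: "archive_document", description: "Move a document to the archive", params: { documentId: "string" } },
  { name: "rename_document", description: "Change a document's title", params: { documentId: "string", title: "string" } },
];

const context = {
  visibleDocuments: [
    { id: "doc_1", title: "Q1 planning" },
    { id: "doc_2", title: "Old meeting notes" },
  ],
};

const prompt = buildPrompt(actions, context, "Archive everything with 'old' in the title");
```

The resulting string would be sent to whichever LLM you integrate with. Keeping each operation to one line and the context to a small JSON object is one way to keep the prompt concise.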

Putting this all together, you can give your user the power to type descriptions of their desired outcome in plain English, and convert that into a proposed list of actions to execute. If the user approves the list, you execute those actions instantly, saving the user hours of time and giving them that power-user thrill.
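
The propose-then-approve flow might look something like the sketch below. It assumes the model was asked to reply with a JSON array of {action, params} objects; the handlers map, parseProposal, and executeIfApproved are hypothetical names, and the handlers just return strings where a real application would call its own services.

```typescript
// Hypothetical types and handlers for illustration only.
interface ProposedAction {
  action: string;
  params: Record<string, unknown>;
}

type Handler = (params: Record<string, unknown>) => string;

// Stand-ins for real operations; each returns a description of what it did.
const handlers: Record<string, Handler> = {
  archive_document: (p) => `archived ${String(p.documentId)}`,
  rename_document: (p) => `renamed ${String(p.documentId)} to ${String(p.title)}`,
};

// Parse the model's raw text and keep only well-formed, known operations.
function parseProposal(raw: string): ProposedAction[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return []; // Not JSON: treat it as "nothing to propose".
  }
  if (!Array.isArray(data)) return [];

  const proposals: ProposedAction[] = [];
  for (const item of data) {
    const wellFormed =
      typeof item === "object" && item !== null &&
      typeof item.action === "string" &&
      typeof item.params === "object" && item.params !== null;
    // hasOwnProperty avoids matching inherited keys such as "constructor".
    if (wellFormed && Object.prototype.hasOwnProperty.call(handlers, item.action)) {
      proposals.push({ action: item.action, params: item.params });
    }
  }
  return proposals;
}

// Nothing runs unless the user has approved the proposed list.
function executeIfApproved(proposals: ProposedAction[], approved: boolean): string[] {
  if (!approved) return [];
  return proposals.map((p) => {
    const handler = handlers[p.action];
    if (!handler) throw new Error(`Unknown action: ${p.action}`);
    return handler(p.params);
  });
}

// A made-up model reply containing one known and one unknown operation.
const modelOutput =
  '[{"action":"archive_document","params":{"documentId":"doc_2"}},' +
  '{"action":"delete_workspace","params":{}}]';

const proposals = parseProposal(modelOutput);
const results = executeIfApproved(proposals, true);
```

This sketch silently drops operations it doesn't recognize. Depending on your product, rejecting the whole proposal when any step is unknown may be the safer choice, since a partial plan can surprise the user.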

Implementation

Implementing the above is non-trivial, as you'll need to set up your integration with an LLM, figure out how to serialize and deserialize your application's context and operations as text, set up a chat-like UI for the user, and tie it all together. It's certainly doable, but...

The Fast Way

At Rehance we've made it easy for you to cover your bases and retain your users. Define the available actions and context in our UI, and add our drop-in JS script to your site to get the chat interface up and running. The widget's theme is customizable, and integrating the drop-in script should take a front-end engineer an hour or so, depending on the number of operations you're looking to support. Check it out at rehance.ai.

The Slow Way

Here's the roadmap:

  1. Define a set of actions. Each action needs to have a set of parameters. Only include the bare necessities, as too much complexity will confuse the LLM you use.
  2. Define the shape of the context that may be needed to perform the provided actions. The details here depend on your use case, so you'll need to figure out what shape works best for you. As with the actions, minimize the number of fields and keep the data types as simple as possible.
  3. Create a chat-like interface for your users. They should be able to enter text describing what they're looking to accomplish.
  4. Construct a prompt for an LLM, providing the context and actions, along with the user request. Process the response, and convert it into actual operations performed in your software application. Handle errors and edge cases appropriately.
  5. Run analytics on your users' prompts to figure out what actions they use the chat tool for most; these are likely the most frustrating parts of using your application.
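
For the analytics in step 5, one simple starting point is to count which actions users actually approve through the chat tool. The sketch below assumes you log each prompt alongside the names of the actions that were executed; the ChatLogEntry shape and rankActions helper are hypothetical, and the sample log is made up.

```typescript
// Hypothetical log shape: one entry per chat request.
interface ChatLogEntry {
  prompt: string;
  executedActions: string[]; // Names of the actions the user approved.
}

// Count how often each action was executed, most frequent first.
function rankActions(log: ChatLogEntry[]): Array<[string, number]> {
  // Object.create(null) avoids collisions with inherited keys.
  const counts: Record<string, number> = Object.create(null);
  for (const entry of log) {
    for (const name of entry.executedActions) {
      counts[name] = (counts[name] || 0) + 1;
    }
  }
  return Object.keys(counts)
    .map((name): [string, number] => [name, counts[name] || 0])
    .sort((a, b) => b[1] - a[1]);
}

// Made-up sample data.
const log: ChatLogEntry[] = [
  { prompt: "Archive all the old notes", executedActions: ["archive_document", "archive_document"] },
  { prompt: "Rename doc_1 to Q2 planning and archive doc_3", executedActions: ["rename_document", "archive_document"] },
];

const ranked = rankActions(log);
```

Actions near the top of the ranking are candidates for a closer look in your visual UI, since users are reaching for the chat tool to get them done.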

Building this type of interface into every piece of software is a no-brainer. Text inputs won't replace traditional UIs by any means, but they can certainly cover all of their blind spots.