Are User Interfaces Shrinking?
As technology has improved, interfaces have required less human input to get people to their desired outcomes, and as a result, user interfaces have shrunk. AI will accelerate this trend.
Today, with AI getting better at predicting and anticipating human tasks and input, I hypothesize that user interfaces designed for predictable tasks will shrink quickly, if not disappear entirely, while complex (and often creative) tools become more accessible with an AI collaborating within the same interfaces to reduce human effort. I argue that the interfaces most resistant to shrinkage are those designed for a wide range of unpredictable tasks.
I. The Historical Shrinking of Interfaces
Let’s look at how interfaces have already shrunk through a relatively obvious consumer use case: watching videos. When YouTube launched in 2005, its core flow for getting people to videos they find entertaining was very simple:
1. A search box (text field) requiring human input to yield personalized results: type in a video title or topic.
2. A results page requiring human input: explore the search results and decide which video to watch based on thumbnails and titles.
3. A video card requiring human opt-in: click on a video from the search results to play it, getting the user to their desired outcome (a video they’ll most likely find entertaining).
To be clear: at the time, all of these steps were necessary steps in YouTube’s UI to get a user to a video they enjoy watching (the desired outcome).
Approximately 15 years later, with TikTok, we:
Just open the app, and we get to our desired outcome instantly (a video we’ll most likely find entertaining).
Clearly, there was more friction to get to an entertaining video through YouTube’s original interface, but that interface was necessary because YouTube relied on human input to get users to their desired outcome given the state of technology in 2005. Roughly 15 years later, steps 1, 2, and 3 are completely removed as required components of the user interface and of the respective user journey, because they’re no longer necessary.
By showing and auto-playing a video right away, TikTok forces users to decide to watch or skip. In doing so, users naturally contribute to a self-reinforcing loop that trains the algorithm and gets them to entertaining videos even faster. It’s a great example of technology and good product design shortening the user’s path to their desired outcome, and thus requiring less user interface to get there.
You might say, “But the search bar is still there!” Sure, but it’s no longer a required component of the interface, because a search query isn’t a necessary piece of information to get people to an entertaining video anymore. The required human input has gone down, and as a result, the required user interface has shrunk. With TikTok, it’s no longer necessary to type in a search query, browse results, click into a result, and finally watch the video; since those types of human input aren’t required to reach the desired outcome anymore, their respective user interface components become redundant.
II. How AI Might Accelerate the Shrinking of Interfaces
As AI gets better, it’s safe to assume its ability to “mimic” a human will keep improving, and that ability can very likely converge on acting as a (particular) human — the concept of an AI agent. At a minimum, you can imagine a world in which an AI can predict the actions and input that applications require from humans today. AI can already reason about user interfaces and can quite literally tap buttons and navigate an interface for you.
To illustrate this using the YouTube example: assume there’s no TikTok and YouTube’s interface looks the way it did in 2005. Also assume AI is even better at predicting and anticipating human input than it is today. If an AI could then predict which search query a particular user is going to type into the YouTube search bar at 4:20pm on a Thursday, predict which video they’re most likely interested in watching, and in turn play that video for them — would YouTube still need the search bar, search results, and active opt-in on a result to get you to a video? Probably not; those components would only introduce unnecessary friction.
On top of that, with access to the underlying functions and architecture of applications, AI doesn’t need interfaces designed for people to perform its tasks. Instead of navigating a graphical user interface, an AI could directly execute the necessary functions to achieve the desired outcome.
If an AI can order groceries by calling APIs directly, the interface as it stands today becomes redundant. The experience shifts to a simple voice query (“reorder my usual groceries”), with AI handling the rest invisibly. It may even anticipate when you need groceries in the first place and just order them for you, without any query or interface at all. Now, of course, not everything we do is as repetitive and predictable as selecting and ordering groceries. But assuming things will only get better, this still surfaces an interesting question for user interfaces as we know them today: if AI can predict human actions or input before they’re even needed, and human input isn’t required to achieve a certain outcome, do we still need traditional interfaces, which are inherently designed for human input?
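To make the grocery example concrete, here’s a minimal sketch of an agent bypassing the GUI and calling a service’s functions directly. Everything here (the prediction model, the API calls, the function names) is a hypothetical stand-in, not any real grocery service’s API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    items: list[str]
    confirmed: bool = False


def predict_usual_groceries(history: list[list[str]]) -> list[str]:
    # Stand-in for a learned model: predict the next basket from past orders.
    # Here we simply reuse the most recent order.
    return history[-1]


def place_order(items: list[str]) -> Order:
    # Stand-in for the grocery service's underlying API endpoint.
    return Order(items=items, confirmed=True)


def handle_voice_query(query: str, history: list[list[str]]) -> Optional[Order]:
    # "Reorder my usual groceries" -> predict the basket and call the API.
    # No search box, no cart screen, no checkout flow: the GUI never renders.
    if "reorder" in query.lower():
        return place_order(predict_usual_groceries(history))
    return None


history = [["milk", "eggs"], ["milk", "eggs", "coffee"]]
order = handle_voice_query("Reorder my usual groceries", history)
```

The point isn’t the code itself but where the interface went: the only human-facing surface left is the voice query, and even that disappears once the agent can anticipate the order on its own.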
III. Why Service Apps’ Interfaces Are Most Prone to Shrinkage
The interfaces most prone to shrinkage belong to service-driven applications, whose interfaces rely heavily on repetitive, routine user input. Because that input is so repetitive, it can very likely be anticipated or accurately predicted by AI. It’s not too crazy to imagine telling Siri to order you an Uber and only being asked to confirm the trip (if at all, visually), because all of today’s required human input (ride type, pickup, destination) is anticipated by Siri (or some other AI) and presented to you. After doing this for a year, you might trust it enough that you no longer need to confirm every instance. Siri might know when you’re leaving an apartment (e.g. you’ve moved away from Bluetooth devices you were near for a specific period of time), and automatically order a car for you.
Apple’s App Intents framework already lays the groundwork for such interactions; the missing piece has been Siri’s AI capabilities. Moreover, in the most non-intrusive way, this is already happening on your iPhone’s home screen with suggested apps. In fact, if someone were to study the number of home screen pages iPhone users have, I wouldn’t be surprised if that number has gone down since the introduction of suggested apps. Since apps you’re likely to open are front-and-center suggestions, there’s no need to have pages of apps to browse through: those pages have become unnecessary interface real estate. Taking the leap from suggested apps to, hypothetically speaking, your iPhone “opening” an app for you isn’t too hot of a take in my opinion. In that world, instead of your iPhone telling you “Yo, I’m about to open Uber, watch me do it”, the Uber app’s underlying functions might simply be “called”, and you’re just asked to confirm the ride through a push notification or voice command.
In that world, Uber as a service may still exist, but it might not manifest itself through a visual user interface anymore. The visual interface may still appear when you need it for confirmations and error-correction, but increasingly as a mechanism for AI training purposes (such as TikTok’s swipe-up-and-down paradigm). These service-driven user interfaces are perfect examples of where an AI could use the underlying functions and endpoints, combined with anticipated human input, to achieve the desired outcome, and by doing so eliminate the majority of the user interface required today to achieve that same outcome (ordering a ride). In these interfaces, the AI serves as an autonomous agent with minimal oversight, and the interface designed for human input becomes redundant because it no longer serves its purpose: it’s no longer the fastest path to the desired outcome.
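The confirmation-then-trust escalation described above can be sketched as a simple policy. Again, every name and the trust threshold below are hypothetical illustrations, not Uber’s or Siri’s actual APIs.

```python
from dataclasses import dataclass


@dataclass
class RideRequest:
    ride_type: str
    pickup: str
    destination: str


def predict_ride(context: dict) -> RideRequest:
    # Stand-in for the AI anticipating all the input today's UI collects
    # (ride type, pickup, destination) from context: time, location, habits.
    return RideRequest("UberX", context["location"], context["usual_destination"])


def confirm_via_push(ride: RideRequest) -> bool:
    # Stand-in for a push-notification or voice confirmation prompt.
    return True


def request_ride(context: dict, past_confirmations: int,
                 trust_threshold: int = 50) -> tuple[RideRequest, bool]:
    # The entire remaining "interface" is one yes/no confirmation,
    # and even that disappears once enough trust has accumulated.
    ride = predict_ride(context)
    if past_confirmations >= trust_threshold:
        return ride, True  # silent auto-order, no interface at all
    return ride, confirm_via_push(ride)


context = {"location": "home", "usual_destination": "office"}
ride, ordered = request_ride(context, past_confirmations=3)
```

The design choice worth noticing: the interface shrinks in two stages, first from a full booking flow to a single confirmation, then from a confirmation to nothing, gated only by accumulated trust.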
IV. How Creative Tools’ Interfaces Resist Shrinkage
While service-driven applications are likely to see their interfaces shrink or disappear, creative tools will probably require a different approach. Creative tools rely heavily on unpredictable, granular user input, making them less prone to shrinkage. They are tools, quite literally, designed for humans, and human input in creative tools and processes isn’t as predictable as in service-driven applications. In fact, creative tools are designed for the unpredictability and flexibility that allow for creative exploration and, ultimately, human expression. Where human input and interfaces in service-driven apps are often purposefully constrained and intended to converge towards an outcome, in creative tools the human input is often very granular and should actually support a divergence of outcomes: there often isn’t a clear desired outcome or “job to be done”.
In creative tools, AI could serve as a collaborator, aka co-pilot, supporting the user in their (creative) process by offering variations and guiding them in using the tools. Since the desired outcome isn’t as clearly defined in creative processes, the role of the AI changes from an autonomous agent to a collaborator supporting the human’s creative paths. It’s almost like the introduction of an AI in that same interface is the equivalent of two music producers working on the same computer collaboratively. Even though the “total” human input required from one end-user does go down in this multiplayer world (e.g. two producers, or human + AI), the granularity of the interface is still relevant for a sense of control.
Take the example of ChatGPT. When it launched, part of what made it so cool was seeing ChatGPT “type” for you, token by token, mimicking human interaction. Sure, it’s aesthetic, but on a deeper level it offered a sense of transparency, oversight, and control. The output is often too fast to read along with, but slow enough to skim and understand its direction. Combining that subtle sense of oversight with the ability to stop generation mid-process gives us control.
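That streaming-with-a-stop pattern can be sketched in a few lines. The “model” below is a hypothetical stand-in yielding canned tokens; the point is the oversight hook that halts generation mid-stream.

```python
from typing import Callable, Iterator


def generate_tokens(prompt: str) -> Iterator[str]:
    # Stand-in for a language model emitting one token at a time.
    for token in ["Interfaces", " shrink", " as", " input", " shrinks", "."]:
        yield token


def stream_with_oversight(prompt: str,
                          should_stop: Callable[[str], bool]) -> str:
    # Surface tokens as they arrive; the callback is the user's oversight
    # lever, halting generation the moment it returns True.
    output = ""
    for token in generate_tokens(prompt):
        output += token
        if should_stop(output):
            break
    return output


# Stop as soon as the skimmed output heads in an unwanted direction.
text = stream_with_oversight("why do interfaces shrink?",
                             lambda partial: partial.endswith("input"))
```

Token-by-token rendering is what makes the stop callback meaningful: the user sees enough of the direction to intervene before the full output lands.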
This design choice points towards a potential evolution for interfaces in creative tools: a shift from direct user input to guidance, using the same primitives a human would use in these tools, which I wrote about earlier. AI can handle the heavy lifting of complex tasks while keeping the user informed and involved at each (critical) step along the way. In this way, interfaces in creative tools evolve from mechanisms of granular control into oversight levers: tools that allow users to steer and refine AI-driven processes through the same familiar interface a human would use.
Examples of where this is happening today include video editing tools such as Captions (s/o to my coworkers) and programming tools like Cursor and (somewhat) Replit. By showing the user how the AI performs a requested set of actions in the same interface humans use and are used to, the AI guides the user in their process. And because that’s happening within the same interface the human uses, the user stays in control, and may learn a thing or two about the tool, in the same way the AI did from us (ha!).
In creative tools, the human input required to navigate the interface is reduced, yet the interface itself persists as a collaborative canvas. Since less human input is required to reach desirable results (e.g. exploring a variety of creative directions in a video editing tool, or creating different ad sets), the inherent efficiency of the tool increases. Instead of one person providing input, there are now two. The interface still serves its purpose, but in a multiplayer and multiplied manner: the user interface remains as it is today, but with an AI collaborating with the user in that interface — not only reducing human input to achieve the same results, but multiplying results with less effort, making these types of interfaces less susceptible to shrinkage.
The question that remains for creative tools is: how long will these “traditional” tool interfaces still serve their purpose? The real question underneath is: to what extent is a human’s creative process, and its respective input, predictable? We said earlier that as the required human input to reach a desired outcome goes down, the required user interface has historically shrunk. It follows that the more our creative processes can be mapped and pattern-matched, the better AIs will be able to anticipate our creative input, and the faster creative tools’ user interfaces will shrink.
V. Continued Interface Shrinking
If I had to define where and how fast interfaces are shrinking, I would position it as a continuum based on the predictability (and complexity) of tasks, and on whether the user interface as it stands today still serves its purpose once AI can predict the human input required to navigate it.
Service-driven apps are the most bound to shrink to invisibility (e.g. Uber), since they rely on predictable patterns and their interfaces are designed primarily to capture those patterns. Creative tools’ interfaces, on the other hand, won’t shrink (initially), but will instead require less human input and evolve towards oversight mechanisms where an AI operates within the same UI (e.g. co-pilot programming experiences). For creative tools, the interface still serves its purpose for humans, but humans are assisted within the UI, and the experience may evolve from “single player” towards “multiplayer”.
The trajectory seems pretty clear: interfaces designed for predictable tasks will shrink quickly, if not disappear entirely, while creative tools are made more accessible through AI collaborating within the same tools, reducing human input yet multiplying creative output and ultimately supporting the user’s creative journey to a desired outcome. In such a world, Screen Time might shift from just tracking your screen time to tracking the duration and context of AI activity on your devices. Product designers might become orchestrators of user trust and AI-human interaction rather than crafting each interaction for a human end-user.
The shrinking of interfaces reflects a broader evolution: as AI assumes responsibility for predicting and executing the tasks we currently use graphical user interfaces for, the visual and functional complexity of those interfaces diminishes, leaving us with faster paths to outcomes. Taking it a big step further: if this trend continues, the graphical user interface itself may no longer define our relationship with technology. For the time being, the interfaces most resistant to shrinkage are those designed for a wide range of unpredictable tasks. But if AI keeps getting better at predicting our behavior and eliminates the need for the majority of our graphical user interfaces, will we have entered a world in which AI competes for our screen time? And what will we do with the time we used to spend navigating apps?
Thank you to Gaurav, Dwight, Dan Moreno, Dan Noskin, Drew, Tyler, Cam, Alex, Daniel, Talal, Russell and Jordan for reviewing, chatting and inspiring me about these topics :)