For decades we’ve navigated screens built around files, folders, buttons, and menus - predictable (but now boring) routes. Gen AI, arguably the best thing that has happened to computing since the web itself, lets us rethink that entire layer from scratch. Its ability to understand natural language and generate novel content means our interactions no longer need to be so rigidly defined.
This UX/UI transformation is something I think about often, so I decided to lay out my thoughts on how it’s progressing and point out a few examples I find particularly cool at each stage.
Phase 1 · Basic Chatbots
The Vibe: That first real experience of direct conversation with AI; a bit clunky and prone to making things up, but it definitely felt like a new door had opened.
What It Looked & Felt Like
Plain text windows with prompt / response
No internet or external tools
Minimal formatting (code blocks, lists)
Sessions isolated and disposable
Notable Tech
Alongside chatbots, there were also early visual AI tools like Midjourney & DALL-E
Phase 2 · Fragmented AI Utilities
The Vibe: AI gained the ability to access live information, use various tools, and work with multiple data types (text, images, voice), though users often interact with these capabilities through a range of separate, specialised systems.
What It Looks & Feels Like
Several specialised systems with distinct strengths
Function-calling, plugins, search APIs (a minimal sketch of this pattern follows the list)
Text, image, and voice in/out (multimodal)
Work & knowledge spread across browser tabs and apps
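Since "function-calling" can sound abstract, here's a tiny, vendor-neutral sketch of the pattern. All the names (getWeather, dispatchToolCall, the spec shape) are made up for illustration and not any real SDK's API; the real LLM providers expose the same idea with their own schemas.

```typescript
// A minimal sketch of the function-calling pattern behind Phase 2 tools.

// 1. The app describes its tools to the model as structured metadata
//    (in a real app this list is sent to the model along with the prompt).
const tools = [
  {
    name: "getWeather",
    description: "Look up the current weather for a city",
    parameters: { city: "string" },
  },
];

// 2. Instead of prose, the model can reply with a structured "call this tool" request.
interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Local implementations the app actually runs (stubbed here).
const toolImpls: Record<string, (args: Record<string, string>) => Promise<string>> = {
  getWeather: async ({ city }) => `18°C and cloudy in ${city}`,
};

// 3. The app executes the call and sends the result back to the model,
//    which then writes the final natural-language answer.
async function dispatchToolCall(call: ToolCall): Promise<string> {
  const impl = toolImpls[call.name];
  if (!impl) throw new Error(`Unknown tool: ${call.name}`);
  return impl(call.arguments);
}

// Example: pretend the model asked for the weather in Lisbon.
dispatchToolCall({ name: "getWeather", arguments: { city: "Lisbon" } })
  .then((result) => console.log(result)); // "18°C and cloudy in Lisbon"
```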
Notable Tech
Major LLMs (ChatGPT, Claude, Gemini) all support external tool use. Gemini and ChatGPT are fully multimodal (text + image + audio)
AI-first browsers like Arc, Dia, Perplexity Comet, Brave (Leo)
Cursor AI - chats over your whole repo, rewrites multiple files at once
Elicit, Consensus, Deep Research - tools for advanced research synthesis and paper analysis
Phase 3 · On-Demand Interfaces
The Vibe: Your screen starts feeling less like a fixed window and more like a smart canvas, actively rearranging itself to show exactly what you need for whatever you're doing.
What It Looks & Feels Like
UI elements generated on-demand (see the sketch after this list)
Canvas-style, multi-threaded workspaces
Controls appear only when relevant
Rich visualisations for fast comprehension
Personalised layouts adapting to accessibility needs
Fluid interactions; less clicking through menus, more direct manipulation & intent-driven adjustments
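To make "UI elements generated on-demand" concrete, here's a minimal sketch of the generative-UI idea: the model emits a declarative JSON spec instead of prose, and the client maps it onto a small, trusted vocabulary of components. The spec shape and renderUI below are hypothetical (and rendered to plain HTML strings for brevity), not any product's actual schema; tools like Thesys map similar specs to live React components.

```typescript
// A minimal sketch of the generative-UI pattern: model emits a UI spec, client renders it.

type UINode =
  | { type: "heading"; text: string }
  | { type: "chart"; label: string; values: number[] }
  | { type: "button"; label: string; action: string };

// Imagine this JSON arriving from the model in response to
// "show me last week's sales and let me export them".
const specFromModel: UINode[] = [
  { type: "heading", text: "Sales, last 7 days" },
  { type: "chart", label: "Units sold", values: [12, 18, 9, 22, 30, 17, 25] },
  { type: "button", label: "Export CSV", action: "export_csv" },
];

// The client owns a small vocabulary of components and simply maps spec nodes onto them.
function renderUI(nodes: UINode[]): string {
  return nodes
    .map((node) => {
      switch (node.type) {
        case "heading":
          return `<h2>${node.text}</h2>`;
        case "chart":
          return `<figure data-values="${node.values.join(",")}">${node.label}</figure>`;
        case "button":
          return `<button data-action="${node.action}">${node.label}</button>`;
      }
    })
    .join("\n");
}

console.log(renderUI(specFromModel));
```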
Notable Tech (early)
Galileo AI - Canvas-style, on-demand UI elements
Thesys C1 GenUI - API turning LLM JSON into live React interfaces; try their chat!
tldraw"Make Real" - transforms sketches and diagram into HTML
Mermaid AI - prompt-to-diagram generation
PamPam - map-based interaction and curation
Meter Command - a conversational UI that changes completely based on the user's command
Phase 4 · Ambient Intelligence
The Vibe: The world starts to become your computer, with AI blending into your surroundings and a central system managing context across your devices.
What It Looks & Feels Like
A central agent orchestrating experiences everywhere (a rough, illustrative sketch follows this list)
AR / VR layers blending digital with physical
Voice, gaze, gesture, haptics, and even subtle environmental sound or visual cues as part of a richer sensory input palette
Information staged before you even ask; not just explicit data but also subtle, contextually relevant cues & suggestions integrated into the environment
Parallel exploratory threads visible at once
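How a "central agent" would actually orchestrate all of this is anyone's guess, but here's a purely illustrative sketch of the shape I imagine: devices push context signals to one orchestrator, which decides when something is worth surfacing and on which surface. Every name here is hypothetical.

```typescript
// A purely illustrative sketch of an ambient "central agent" orchestrator.

interface ContextSignal {
  source: "calendar" | "watch" | "glasses" | "home";
  kind: string; // e.g. "meeting_soon", "left_home", "heart_rate"
  payload: Record<string, unknown>;
}

interface Suggestion {
  surface: "glasses_hud" | "phone" | "speaker";
  message: string;
}

class AmbientOrchestrator {
  private recent: ContextSignal[] = [];

  // Devices push signals as they observe them.
  ingest(signal: ContextSignal): Suggestion | null {
    this.recent.push(signal);
    return this.decide();
  }

  // A stand-in for the model's reasoning: combine signals into a proactive cue.
  private decide(): Suggestion | null {
    const meeting = this.recent.find((s) => s.kind === "meeting_soon");
    const commuting = this.recent.find((s) => s.kind === "left_home");
    if (meeting && commuting) {
      return {
        surface: "glasses_hud",
        message: "Your 10:00 stand-up starts in 20 min; traffic adds 8 min. Leave now?",
      };
    }
    return null;
  }
}

// Example: two signals arrive from different devices; the second one triggers a cue.
const agent = new AmbientOrchestrator();
agent.ingest({ source: "calendar", kind: "meeting_soon", payload: { at: "10:00" } });
const cue = agent.ingest({ source: "watch", kind: "left_home", payload: {} });
console.log(cue?.message);
```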
Notable Tech (still early)
Apple Vision Pro spatial computing, Meta smart glasses & Quest
Endel - AI generates a constantly evolving soundscape based on your biometric data & environment
Games with AI NPCs and dynamic worlds
Cool example: a Quest mixed-reality game where you build automated contraptions in your real-world space
Phase 5 · Cognitive Extension
The Vibe: AI starts to feel less like an external tool and more like a natural extension of your own thinking process, making complex tasks or accessing knowledge feel more direct and intuitive.
What It Looks & Feels Like
Interface fades; experience dominates
Biosensors infer mental state and intention
Just-in-time knowledge surfaces without prompts
Thought-level shortcuts replace commands
Notable Tech (super early)
Cognixion - AR applications controlled by mind, eyes, and head pose
P.S. The 'Notable Tech' examples are a purely subjective curation based on what I find cool and representative.