For decades we’ve navigated screens built around files, folders, buttons, and menus - predictable (but now boring) routes. Gen AI, arguably the best thing that has happened to computing since the web itself, lets us rethink that entire layer from scratch. Its ability to understand natural language and generate novel content means our interactions no longer need to be so rigidly defined.
This UX/UI transformation is something I think about often, so I decided to lay out my thoughts on how it’s progressing and point out a few examples I find particularly cool at each stage.
Phase 1 · Basic Chatbots
The Vibe: That first real experience of direct conversation with AI; a bit clunky and prone to making things up, but it definitely felt like a new door had opened.
What It Looked & Felt Like
Plain text windows with prompt / response
No internet or external tools
Minimal formatting (code blocks, lists)
Sessions isolated and disposable
Notable Tech
Alongside chatbots, there were also early visual AI tools like Midjourney & DALL-E
Phase 2 · Fragmented AI Utilities
The Vibe: AI gained the ability to access live information, use various tools, and work with multiple data types (text, images, voice), though users often interact with these capabilities through a range of separate, specialised systems.
What It Looks & Feels Like
Several specialised systems with distinct strengths
Function-calling, plugins, search APIs (a minimal sketch of this pattern follows the list)
Text, image, and voice in/out (multimodal)
Work & knowledge spread across browser tabs and apps
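Since "function-calling" can sound abstract, here's a tiny, vendor-neutral sketch of the pattern. All the names (getWeather, dispatchToolCall, the spec shape) are made up for illustration and not any real SDK's API; the real LLM providers expose the same idea with their own schemas.

```typescript
// A minimal sketch of the function-calling pattern behind Phase 2 tools.

// 1. The app describes its tools to the model as structured metadata
//    (in a real app this list is sent to the model along with the prompt).
const tools = [
  {
    name: "getWeather",
    description: "Look up the current weather for a city",
    parameters: { city: "string" },
  },
];

// 2. Instead of prose, the model can reply with a structured "call this tool" request.
interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Local implementations the app actually runs (stubbed here).
const toolImpls: Record<string, (args: Record<string, string>) => Promise<string>> = {
  getWeather: async ({ city }) => `18°C and cloudy in ${city}`,
};

// 3. The app executes the call and sends the result back to the model,
//    which then writes the final natural-language answer.
async function dispatchToolCall(call: ToolCall): Promise<string> {
  const impl = toolImpls[call.name];
  if (!impl) throw new Error(`Unknown tool: ${call.name}`);
  return impl(call.arguments);
}

// Example: pretend the model asked for the weather in Lisbon.
dispatchToolCall({ name: "getWeather", arguments: { city: "Lisbon" } })
  .then((result) => console.log(result)); // "18°C and cloudy in Lisbon"
```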
Notable Tech
Major LLMs (ChatGPT, Claude, Gemini) all support external tool use. Gemini and ChatGPT are fully multimodal (text + image + audio)
AI-first browsers like Arc, Dia, Perplexity Comet, Brave (Leo)
Cursor AI - chats over your whole repo, rewrites multiple files at once
Elicit, Consensus, Deep Research - tools for advanced research synthesis and paper analysis
Phase 3 · On-Demand Interfaces
The Vibe: Your screen starts feeling less like a fixed window and more like a smart canvas, actively rearranging itself to show exactly what you need for whatever you're doing.
What It Looks & Feels Like
UI elements generated on-demand (see the sketch after this list)
Canvas-style, multi-threaded workspaces
Controls appear only when relevant
Rich visualisations for fast comprehension
Personalised layouts adapting to accessibility needs
Fluid interactions; less clicking through menus, more direct manipulation & intent-driven adjustments
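To make "UI elements generated on-demand" concrete, here's a minimal sketch of the generative-UI idea: the model emits a declarative JSON spec instead of prose, and the client maps it onto a small, trusted vocabulary of components. The spec shape and renderUI below are hypothetical (and rendered to plain HTML strings for brevity), not any product's actual schema; tools like Thesys map similar specs to live React components.

```typescript
// A minimal sketch of the generative-UI pattern: model emits a UI spec, client renders it.

type UINode =
  | { type: "heading"; text: string }
  | { type: "chart"; label: string; values: number[] }
  | { type: "button"; label: string; action: string };

// Imagine this JSON arriving from the model in response to
// "show me last week's sales and let me export them".
const specFromModel: UINode[] = [
  { type: "heading", text: "Sales, last 7 days" },
  { type: "chart", label: "Units sold", values: [12, 18, 9, 22, 30, 17, 25] },
  { type: "button", label: "Export CSV", action: "export_csv" },
];

// The client owns a small vocabulary of components and simply maps spec nodes onto them.
function renderUI(nodes: UINode[]): string {
  return nodes
    .map((node) => {
      switch (node.type) {
        case "heading":
          return `<h2>${node.text}</h2>`;
        case "chart":
          return `<figure data-values="${node.values.join(",")}">${node.label}</figure>`;
        case "button":
          return `<button data-action="${node.action}">${node.label}</button>`;
      }
    })
    .join("\n");
}

console.log(renderUI(specFromModel));
```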
Notable Tech (early)
Galileo AI - Canvas-style, on-demand UI elements
Thesys C1 GenUI - API turning LLM JSON into live React interfaces; try their chat!
tldraw"Make Real" - transforms sketches and diagram into HTML
Mermaid AI - prompt-to-diagram generation
PamPam - map-based interaction and curation
Meter Command - a conversational UI that changes completely based on the user's command
Phase 4 · Ambient Intelligence
The Vibe: The world starts to become your computer, with AI blending into your surroundings and a central system managing context across your devices.
What It Looks & Feels Like
A central agent orchestrating experiences everywhere (a rough, illustrative sketch follows this list)
AR / VR layers blending digital with physical
Voice, gaze, gesture, haptics, and even subtle environmental sound or visual cues as part of a richer sensory input palette
Information staged before you even ask; not just explicit data but also subtle, contextually relevant cues & suggestions integrated into the environment
Parallel exploratory threads visible at once
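How a "central agent" would actually orchestrate all of this is anyone's guess, but here's a purely illustrative sketch of the shape I imagine: devices push context signals to one orchestrator, which decides when something is worth surfacing and on which surface. Every name here is hypothetical.

```typescript
// A purely illustrative sketch of an ambient "central agent" orchestrator.

interface ContextSignal {
  source: "calendar" | "watch" | "glasses" | "home";
  kind: string; // e.g. "meeting_soon", "left_home", "heart_rate"
  payload: Record<string, unknown>;
}

interface Suggestion {
  surface: "glasses_hud" | "phone" | "speaker";
  message: string;
}

class AmbientOrchestrator {
  private recent: ContextSignal[] = [];

  // Devices push signals as they observe them.
  ingest(signal: ContextSignal): Suggestion | null {
    this.recent.push(signal);
    return this.decide();
  }

  // A stand-in for the model's reasoning: combine signals into a proactive cue.
  private decide(): Suggestion | null {
    const meeting = this.recent.find((s) => s.kind === "meeting_soon");
    const commuting = this.recent.find((s) => s.kind === "left_home");
    if (meeting && commuting) {
      return {
        surface: "glasses_hud",
        message: "Your 10:00 stand-up starts in 20 min; traffic adds 8 min. Leave now?",
      };
    }
    return null;
  }
}

// Example: two signals arrive from different devices; the second one triggers a cue.
const agent = new AmbientOrchestrator();
agent.ingest({ source: "calendar", kind: "meeting_soon", payload: { at: "10:00" } });
const cue = agent.ingest({ source: "watch", kind: "left_home", payload: {} });
console.log(cue?.message);
```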
Notable Tech (still early)
Apple Vision Pro spatial computing, Meta smart glasses & Quest
Endel - AI generates a constantly evolving soundscape based on your biometric data & environment
Games with AI NPCs and dynamic worlds
Cool example: a Quest mixed-reality game where you build automated contraptions in your real-world space
Phase 5 · Cognitive Extension
The Vibe: AI starts to feel less like an external tool and more like a natural extension of your own thinking process, making complex tasks or accessing knowledge feel more direct and intuitive.
What It Looks & Feels Like
Interface fades; experience dominates
Biosensors infer mental state and intention
Just-in-time knowledge surfaces without prompts
Thought-level shortcuts replace commands
Notable Tech (super early)
Cognixion - AR applications controlled by mind, eyes, and head pose
P.S. The 'Notable Tech' examples are a purely subjective curation based on what I find cool and representative.