We build agents and software in the open. The repo, aws-ai-agent-bus, is here: https://github.com/baur-software/aws-ai-agent-bus/, and our blog post is here.
That work shaped this view. Conversational UI is not a gimmick. It is classic interaction design in a new medium. It is voice user interface plus screen awareness. It is video-first AI that can see what you point at and hear what you mean.
The rules have not changed. Alan Cooper still matters. You design for a user, a goal, and a moment. That lens keeps AI agents from turning into noisy roommates. Cooper’s “goal-directed design” is still a daily tool, not theory. It survives because it works in real rooms and real workflows.
Don Norman still matters. We need signifiers people can read. We need affordances that invite the right action. We need mappings that match mental models. We need loud, honest feedback. These are not nostalgia terms. They are the bones of usable systems. They keep conversational interfaces from becoming guesswork.
When menus disappear, fundamentals grow in importance. Nielsen Norman Group has said this for years. In voice, users cannot scan a screen for clues. They need audible and visible signifiers. Tones. Words. Lights. Clear state. That lowers cognitive load and improves task success. (Nielsen Norman Group)
We have already run the social experiment. A small cylinder sat on kitchen counters. Alexa taught the world about ambient computing. The wake word worked. It was a perfect affordance. The light ring and tones worked. They were honest signifiers and feedback. Everyday tasks stuck because they were obvious and fast. Timers. Weather. Music. That is everyday UX, not sci-fi.
Where it wobbled is predictable. The conceptual model drifted. Many users imagined a “Star Trek computer.” They met a very good “voice remote” with islands of magic. That is a mapping gap. Mismatched expectations erode trust. Alexa privacy also entered the chat. In July 2025 a federal judge let a nationwide class action proceed. The case targets alleged recording and storage practices. The message is simple. Visible limits matter. Mute is not a feature. Mute is a promise. (Reuters)
Why push toward video-first AI at all? Because it mirrors how people already work. Real collaboration is messy. Someone circles a number. Someone points at a chart. Someone adds a quick qualifier like, “Ignore Q3 edge cases.” Typed prompts compress that context into brittle text. A video-aware agent expands it again. The agent sees the diagram. It hears the aside. It notices the column you hovered. That shrinks the “What did you mean?” loop. That cuts rework. That is value.
We do not need stereotypes to justify the shift. The media diet is documented. Pew reports that roughly seven in ten U.S. teens go to YouTube daily. Fifteen percent say “almost constantly.” AP and other summaries have reported similar numbers over the last two years. The point is not taste. The point is distribution. Video is a default surface for explanations. Designers should meet people where they are. (Pew Research Center)
Deloitte’s Digital Media Trends 2024 adds detail. Forty-seven percent of Gen Z and about a third of millennials name social video and live streams as their favorite video formats. That does not kill long-form content. It does shift where people expect to learn and discover. It also explains why creator-led “how-to” content reaches teams before PDFs do. (Deloitte)
Nielsen shows the scale. In 2023 U.S. audiences streamed 21 million years of video. In 2024 they streamed about 23 million years. Video is not a side channel. It is the water supply. That is why video-first AI is sensible. It is also why visibility and limits must be first-class. (Nielsen)
Here is the real risk. Ambient everything. Helpful becomes creepy when control is unclear. The fix is not a manifesto. It is boring and effective. Show state. All the time. When an AI agent is observing, everyone in the room should know. When it is recording or summarizing, that should be persistent and obvious. When it is off, it should be unmistakably off. Not a tiny icon. Not a buried toggle. A clear signifier. A hardware light is even better.
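One way to make “off means off” concrete is to treat agent state as an explicit, exhaustive enum and tie every state to a loud cue. This is a hypothetical sketch, not code from the repo; the state names and cues are assumptions for illustration.

```python
from enum import Enum

class AgentState(Enum):
    OFF = "off"              # unmistakably off: no capture, no buffering
    OBSERVING = "observing"  # watching and listening, nothing persisted
    RECORDING = "recording"  # capturing or summarizing; persistent indicator required

def signifier(state: AgentState) -> str:
    """Map each state to a persistent, room-visible cue (hypothetical cues)."""
    return {
        AgentState.OFF: "hardware light dark, mic relay open",
        AgentState.OBSERVING: "solid amber light plus on-screen banner",
        AgentState.RECORDING: "pulsing red light plus persistent REC banner",
    }[state]
```

The dictionary covers every enum member, so adding a new state without a cue fails fast with a `KeyError` instead of silently shipping an unreadable state.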
This is “visibility of system status.” This is “user control and freedom.” These are Nielsen Norman Group heuristics with new skins. They are the difference between a tool and a wiretap. They also reduce legal risk. The Alexa privacy case is a reminder that law follows interface choices. Clear limits are not just good UX. They are compliance help. (Nielsen Norman Group)
Now, let’s connect UI fundamentals to prompt engineering. A good prompt is just goal-directed design in one paragraph. You state the role. You state the goal. You state the moment. You signal allowed actions. You point to shared objects. You ask for checks. You state constraints. Spoken or typed, the pattern is the same. It is design, not magic words.
Use a simple prompt pattern like this. It works in voice and text:
Role and moment:
I’m <role> working on <moment>. On screen: <artifact(s)>. The goal is <outcome>.
Actions:
Please <verbs you want>. Avoid <verbs you don’t>.
Where to look and save:
Use <systems/docs on screen>. Save results to <destination>.
Checks:
Before you proceed, restate your assumptions. After you act, show what changed and where.
Boundaries:
Don’t notify anyone. Don’t record or persist beyond <scope>. Stop if I say stop.
This is Don Norman in compact form. The first line sets the conceptual model. The second line is a signifier for available verbs. The third line fixes mapping so results land in the right place. The fourth line demands feedback. The fifth line sets constraints, including the most important one. Off means off.
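The five-line pattern is mechanical enough to template. Here is a minimal sketch that assembles it from caller-supplied strings; the function name and parameters are my own illustration, not an API from the repo.

```python
def build_prompt(role: str, moment: str, artifacts: str, outcome: str,
                 do: str, avoid: str, sources: str, destination: str,
                 scope: str) -> str:
    """Assemble the five-line goal-directed prompt pattern.

    Line 1 sets the conceptual model, line 2 the allowed verbs,
    line 3 the mapping, line 4 the feedback, line 5 the constraints.
    """
    return "\n".join([
        f"I’m {role} working on {moment}. On screen: {artifacts}. "
        f"The goal is {outcome}.",
        f"Please {do}. Avoid {avoid}.",
        f"Use {sources}. Save results to {destination}.",
        "Before you proceed, restate your assumptions. "
        "After you act, show what changed and where.",
        f"Don’t notify anyone. Don’t record or persist beyond {scope}. "
        f"Stop if I say stop.",
    ])
```

Because the check and boundary lines are fixed strings, they cannot be accidentally dropped by a caller; only the situational slots vary.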
This also scales to video-first AI workflows. Consider a process mapping session. The agent is on while it watches the whiteboard and the screen. It drafts a clean BPMN. It reads back assumptions. It saves the flow to the right folder. It turns off while the team debates trade-offs. Clear state. Clear boundaries. Better results.
Consider a sales pipeline review. The agent is on while it summarizes call clips and CRM notes. It identifies blockers. It proposes three small experiments. It turns off while leaders make the call. No ambient eavesdropping. No creep.
Consider data quality work. The agent is on while it reconciles CRM, billing, and product telemetry. It flags duplicate entities. It highlights missing funnel steps. It turns off while the team writes hypotheses. This is Data 360 as a practice, not a slogan. It is conversational UI plugged into the truth layer.
We also need a ten-second test for any voice user interface or conversational UI flow. A newcomer should infer what is possible in under ten seconds. If they follow the prompt pattern, the result should land in the right place without a follow-up. There must be a single, obvious way to turn the system off. A bystander should know it is off. If any answer is “no,” do not ship yet.
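The ten-second test can be written down as a hard ship gate so it is not negotiated away per release. A minimal sketch, with field names invented here to mirror the four questions:

```python
from dataclasses import dataclass

@dataclass
class TenSecondTest:
    actions_inferable_in_ten_seconds: bool  # newcomer sees what is possible
    result_lands_without_followup: bool     # prompt pattern works first try
    single_obvious_off_switch: bool         # one clear way to turn it off
    bystander_knows_it_is_off: bool         # off is legible from across the room

    def ship(self) -> bool:
        # If any answer is "no", do not ship yet.
        return all([
            self.actions_inferable_in_ten_seconds,
            self.result_lands_without_followup,
            self.single_obvious_off_switch,
            self.bystander_knows_it_is_off,
        ])
```

The point of the dataclass is social, not computational: each question gets a named answer in the review, and `all` makes the veto of any single “no” explicit.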
A word on writing style for agents. Use short sentences. Use concrete nouns. Use plain verbs. Avoid hedging. Avoid multi-clause instructions. This is not poetry. It is instructional design for machines and people. Short lines reduce ambiguity. Short lines improve speech recognition. Short lines are easy to check.
Privacy belongs in the definition of done. Not a policy doc on a shelf. Build logs. Build rollbacks. Make actions auditable. Say what you did. Show what changed. Show where you saved it. These are usability features. These are also compliance features. The more powerful the AI agent, the more important these basics become.
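“Say what you did, show what changed, show where you saved it” is one append-only record per action. A minimal sketch using newline-delimited JSON; the field names and log path are assumptions, not the repo’s format.

```python
import json
import time

def audit(action: str, changed: str, saved_to: str,
          log_path: str = "agent_audit.log") -> dict:
    """Append one auditable record: what we did, what changed, where it landed."""
    entry = {
        "ts": time.time(),   # when it happened
        "action": action,    # what the agent did
        "changed": changed,  # what changed
        "saved_to": saved_to # where the result lives
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Append-only JSON lines are easy to tail, easy to diff, and easy to hand to an auditor; rollback tooling can replay the `saved_to` fields in reverse.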
If you work on prompt engineering, keep the UI fundamentals close. They will outlast platform shifts. Menus will change. Input modes will change. The basics stay put. Intent. Actions. Mapping. Feedback. Constraints. Repeat that loop and your conversational UI will feel respectful. It will also feel fast.
Back to the repo. We open-sourced the AWS AI Agent Bus so teams can study a real implementation. Not a demo reel. A system with roles, scopes, visible state, and auditable actions. It is a small proof that everyday design carries into AI agents without drama. It is also a good place to borrow patterns and improve them.
The path forward is not loud. It is careful and obvious. Make intent legible. Make actions discoverable. Keep state visible. Show your work. Leave the room when asked. That is how conversational UI becomes normal. That is how video-first AI becomes useful. That is how AI agents earn trust without a campaign.
Sources
- Pew Research Center. Teens and Social Media Fact Sheet (Jul 10, 2025). Daily YouTube use among U.S. teens; 15% “almost constantly.”
- Pew Research Center. 5 facts about Americans and YouTube (Feb 28, 2025). Most teens use YouTube daily.
- AP News summaries of Pew findings (2024–2025). Nearly half of U.S. teens online “almost constantly.”
- Deloitte. Digital Media Trends 2024: Online creators and the impact of social media on entertainment (Mar 20, 2024). Gen Z and millennial video preferences.
- Nielsen. Streaming viewership goes to the library in 2023 (Jan 2024). 21M years streamed.
- Nielsen. Top Streaming TV Trends 2024 (ARTEY Awards) (Jan 2025). ~23M years streamed.
- Reuters. Amazon must face U.S. class action over Alexa users’ privacy (Jul 7, 2025). Case allowed to proceed.