AI Agents Are Trading Real Stocks, Filing Real Taxes, and Reading Your Real Files. Here's the Permission Framework You Need.
AI agents are trading stocks on Robinhood, Gemini Spark is running your calendar overnight, and a CVSS 9.3 vulnerability let five lines of text exfiltrate files from an entire M365 environment. Here are four questions to answer before you connect any AI agent to a real system.
By Forge Team
This week, three AI products shipped that take real-world actions without you present: one trades stocks, one manages your calendar while your laptop is off, and one processed 7,000 tax returns. If any of these tools is connected to your real accounts and you haven't decided what it is allowed to do, you don't have a guardrail problem. You have a permission problem.
What happened this week
On May 28, Robinhood launched a beta allowing AI agents to trade stocks through dedicated account wallets with user-set budgets (Neuron, May 28). The same day, OpenAI published a case study showing Codex-powered tax agents improved from 25% to 97% accuracy via a human-correction feedback loop — they processed 7,000 returns and cut preparation time by a third.
On May 29, Google rolled out Gemini Spark to US AI Ultra subscribers at $100/month: 15 persistent tasks running across Gmail, Calendar, Docs, and Drive, including when your laptop is shut (Neuron, May 29). It is the first mainstream always-on agent aimed at non-technical professionals.
The same week, security researchers disclosed a prompt injection vulnerability in Microsoft Copilot Cowork rated CVSS 9.3 (May 25–26). Five lines of text embedded in a document could direct the agent to send messages containing pre-authenticated download links, effectively exfiltrating files from an entire M365 environment. Microsoft's response at time of writing: no patch, no CVE, and a workaround that breaks functionality. An agent with file access and message-sending permissions is an attack surface. That is not a theoretical concern: it is a documented, rated vulnerability.
What to do differently Monday morning
Before connecting any agent to a real system — email, calendar, financial accounts, file storage — answer four questions in writing:
- What can this agent access?
- What actions can it take without my approval?
- What triggers a human checkpoint?
- How do I audit what it did?
One document. Four questions. You don't need a security background to write it. You need to have answered them before the agent is live.
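If you prefer to keep the brief somewhere a script can check it, the four questions map naturally onto a small record. The sketch below is one possible shape, not a standard: the class name `PermissionBrief`, its field names, and the `ready_to_go_live` check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionBrief:
    """One agent, four answers. All names here are illustrative."""

    agent: str
    # What can this agent access?
    access: list = field(default_factory=list)
    # What actions can it take without my approval?
    autonomous_actions: list = field(default_factory=list)
    # What triggers a human checkpoint?
    checkpoints: list = field(default_factory=list)
    # How do I audit what it did?
    audit: str = ""

    def unanswered(self):
        """Return the names of the questions that are still blank."""
        answers = {
            "access": self.access,
            "autonomous_actions": self.autonomous_actions,
            "checkpoints": self.checkpoints,
            "audit": self.audit.strip(),
        }
        return [name for name, value in answers.items() if not value]

    def ready_to_go_live(self):
        """True only when all four questions have a written answer."""
        return not self.unanswered()
```

Calling `unanswered()` on a half-finished brief lists the gaps. The point is only that a blank answer is visible before the agent is connected, not after.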
Priya: the always-on agent
Priya manages operations at a 28-person e-commerce company. She signed up for Gemini Spark to handle calendar scheduling and draft Monday-morning status updates. Before connecting it, she answered the four questions:
- Access: Google Calendar, Gmail, one shared project Docs folder. Explicitly excluded: the financial folder, the HR shared drive.
- Autonomous actions: Suggest meeting times, draft status updates for her review. Not send emails without approval.
- Human checkpoints: Anything going to a client, anything that cancels or moves a client call.
- Audit: Review Gemini Spark's activity log in Google Account settings weekly.
On her first weekly audit, she found the agent had been drafting replies to client emails and queuing them for send — a permission she'd technically granted in the initial setup without registering it. She narrowed the access. The agent is still running. She is now running it deliberately.
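A weekly audit like Priya's comes down to comparing what the agent did against what the brief allows. Here is a minimal sketch of that comparison. It assumes you have copied the log entries by hand into simple (action, target, approved) tuples. The action names are hypothetical and this is not Gemini Spark's actual log format.

```python
# Hypothetical action names taken from a brief like Priya's.
ALLOWED_WITHOUT_APPROVAL = {"suggest_meeting_time", "draft_status_update"}
ALLOWED_WITH_APPROVAL = {"send_email", "move_meeting", "cancel_meeting"}


def review(entries):
    """Return the (action, target) pairs that the brief does not cover.

    Each entry is a tuple: (action, target, approved_by_human).
    """
    findings = []
    for action, target, approved in entries:
        if action in ALLOWED_WITHOUT_APPROVAL:
            continue  # autonomous action the brief explicitly allows
        if action in ALLOWED_WITH_APPROVAL and approved:
            continue  # checkpoint action that a human signed off on
        findings.append((action, target))  # everything else needs a look
    return findings


week_log = [
    ("suggest_meeting_time", "team sync", False),
    ("queue_client_reply", "client@example.com", False),
    ("send_email", "vendor@example.com", True),
]
unexpected = review(week_log)
```

In this example `unexpected` holds the queued client reply, the same kind of surprise Priya found. Anything the brief never mentioned is flagged by default rather than waved through.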
Map what your AI tools can actually access — and narrow the permissions before something acts without you.
James: the assumption that didn't hold
James runs finance operations at a 185-person professional services firm. His IT team deployed Microsoft 365 Copilot across the business in February. He'd assumed the agent's access followed the same boundaries as manual access — you can only see what you're already permitted to see.
The Copilot Cowork research (May 25–26) showed why that assumption falls short. An agent that can read files and send messages can be directed by instructions embedded in a file it opens: a document from an external party, a forwarded email, a shared report from a client. The injected text doesn't need to look suspicious. The agent treats it as a legitimate instruction because it comes from content it was given access to read.
James hadn't written a permission brief before Copilot went live. He is writing one now. If you run any AI tool with calendar, email, or document access inside a corporate Microsoft or Google environment, the same question applies: what does this agent have permission to send, and to whom?
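One way to make "what can it send, and to whom" concrete is a rule that sits between the agent and the send button. The sketch below is an assumption-heavy illustration, not a Microsoft or Google feature: the domain `example-firm.com`, the function `send_decision`, and the two rules are all hypothetical. It does not stop prompt injection itself. It only limits what an injected instruction could send without a person seeing it.

```python
import re

# Hypothetical: the only domain the agent may message without review.
INTERNAL_DOMAINS = {"example-firm.com"}

# Any http(s) link in the body triggers review, since a download link
# in an outgoing message is one way files can leave.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def send_decision(recipient, body):
    """Return "allow" or "hold_for_approval" for an outgoing message."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain not in INTERNAL_DOMAINS:
        return "hold_for_approval"  # leaving the firm: a human decides
    if URL_PATTERN.search(body):
        return "hold_for_approval"  # contains a link: a human decides
    return "allow"
```

The rules are deliberately blunt. A permission brief for a finance team would likely add more conditions, but even two rules answer the question in a form that can be checked.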
Define what your AI agent can do, what it must ask first, and what it cannot do under any circumstances.
The case for human checkpoints
The OpenAI tax agent study is worth holding onto: agents that incorporate human corrections improved from 25% to 97% accuracy on the same task. The human-in-the-loop is not a bottleneck you tolerate until the model gets better — it is what makes the model better. The checkpoint is the training signal.
The same logic applies to any agent you run: the moments where you review its work and correct it are not inefficiencies. They are what turns the agent from a liability into something you'd trust with the next task.
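If you want to see whether your own checkpoints are paying off, you can keep a simple record of each review and watch the share of outputs you accept unchanged. This is a rough sketch with made-up names (`Review`, `acceptance_rate`, `corrections`). It is not how the OpenAI study measured accuracy, only one way to track your own reviews.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """One human checkpoint: what the agent produced and what you changed."""

    task: str
    agent_output: str
    corrected_output: Optional[str] = None  # None means accepted as-is


def acceptance_rate(reviews):
    """Share of reviewed outputs that needed no correction (0.0 to 1.0)."""
    if not reviews:
        return 0.0
    accepted = sum(1 for r in reviews if r.corrected_output is None)
    return accepted / len(reviews)


def corrections(reviews):
    """The corrected cases, which are the feedback worth giving the agent."""
    return [
        (r.task, r.agent_output, r.corrected_output)
        for r in reviews
        if r.corrected_output is not None
    ]
```

A rising acceptance rate over several weeks is a reasonable signal that a checkpoint can be loosened. A flat one suggests the corrections are not reaching the agent.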
Design the checkpoints that keep you in control of an AI agent without reviewing every step.