David Monnerat

Dad. Husband. Product + AI. Generalist. Endlessly Curious.

Tag: LLM

  • It’s Not AI That Will Destroy Us

    It’s Not AI That Will Destroy Us

    The Singularity Is A Mirror

    There’s a growing obsession with Artificial General Intelligence (AGI), the idea of machines that can think, reason, and act like humans. Some believe it will be the most significant breakthrough in human history. Others warn it will be our last — the catalyst that brings the Singularity into existence.

    The Singularity refers to a hypothetical future moment when artificial intelligence surpasses human intelligence and begins to improve itself at an exponential rate, beyond our understanding and human control. It’s often portrayed as the point when machines become so smart and so capable that they can design their successors, and humans become obsolete, irrelevant, or even endangered.

    AGI won’t spring fully formed from nowhere. It will be built by people. It will reflect our incentives, ambitions, blind spots — and our flaws. It will be trained on data created by us, governed by rules we design (or fail to design), and used for purposes we either endorse or conveniently ignore.

    If AGI destroys humanity, it won’t be because the machine chose to. It will be because humans built it in a world where profit trumped ethics, power went unchecked, and accountability was optional.

    AGI won’t decide what kind of world it steps into.

    We do.

    Tools, Not Threats

    We talk about AI as if it’s an external threat — an alien intelligence that might turn on us. But AI isn’t alien. It’s Made with ♥ by Humans. It’s a tool. And like any powerful tool, it can build or destroy, depending on whose hands it’s in and what they choose to do with it.

    AI is fire in a new form — and we’re the children playing with it.

    If the house burns down, you don’t blame the fire. You blame the child who lit the match or the parents who never taught them that fire was dangerous. You ask who left gasoline sitting around. You question why no one thought to install a smoke alarm.

    AI doesn’t operate with intent. It doesn’t choose good or evil. It carries out the tasks we assign, shaped by the values and choices we embed in it.

    When AI generates misinformation, invades privacy, replaces workers without a safety net, or amplifies bias, it’s not the algorithm acting alone. It’s people designing systems with specific incentives, deploying them without oversight, and looking the other way when the consequences show up.

    The danger isn’t that we won’t understand AI.

    It’s that we won’t take responsibility for how we shape it and how we use it.

    The Real Threat: Human Decisions

    The real threat isn’t artificial intelligence.

    It’s human intelligence.

    We’ve already seen how powerful AI becomes when paired with human intention. Not superhuman intention — just ordinary political, ideological, or economic motives. And that’s the danger: AI doesn’t need to be sentient to cause harm. It just needs people ready to use it irresponsibly.

    Just in the past few weeks, headlines have shown how AI misuse is rooted in human decisions.

    A political report, touted by the MAHA (Make America Healthy Again) movement, questioned vaccine safety and included dozens of scientific references. However, fact‑checkers discovered that at least seven cited studies didn’t exist, and many links were broken.1 Experts traced this back to generative AI platforms like ChatGPT, which can produce plausible but completely fabricated citations.2 The White House quietly corrected the report but described the issue as “formatting errors.”3 AI didn’t decide to deceive anyone—it simply enabled it.

    When xAI’s chatbot Grok flagged that right‑wing political violence has outpaced left‑wing violence since 2016, Elon Musk publicly labeled this a “major fail,” accusing the system of parroting “legacy media.”4 Instead of questioning the data or method, Musk implied that any answer he doesn’t like must be ideologically infiltrated. He’s saying, “If the tool makes me look bad, the tool is broken.” This isn’t AI gone haywire — it’s a machine bent by human vanity, then reshaped to serve its creators’ agendas.

    This isn’t a partisan issue. Misuse of AI spans the political and corporate spectrum. In 2024, a consultant used AI to generate robocalls impersonating President Biden, urging voters in New Hampshire to stay home for the primary — a blatant voter suppression tactic that led to a $1 million FCC fine.5 The Republican National Committee released an AI-generated ad depicting a dystopian future if Biden were reelected, complete with fake imagery designed to provoke fear.6 And major oil companies like Shell and Exxon have used AI-generated messaging to greenwash their climate record — downplaying environmental harm while projecting a misleading image of sustainability.7

    These aren’t tech failures. This isn’t about ideology.

    They’re ethical and political failures. It’s about power, and our willingness to let it go unchecked.

    AI reflects its users’ values, or lack thereof. When we let political actors exploit AI to mislead, distort, or conceal, we aren’t witnessing a feature of AI. We’re exposing a feature of ourselves.

    The danger isn’t in the machine.

    It’s in our refusal to confront how we wield it.

    Power and Responsibility

    AI doesn’t live in the abstract. It lives in systems, and those systems are run by people with power.

    The question isn’t just what can AI do? It’s who decides what it does, who it serves, and who it harms.

    Right now, power is concentrated in a few hands — governments, tech giants, billionaires, and unregulated platforms. These are the people and institutions shaping how AI is built, trained, deployed, and monetized. And too often, their incentives are misaligned with the public good.

    When political actors use AI to fabricate legitimacy or manufacture doubt, that’s not the future acting on us. That’s us weaponizing the future.

    When Elon Musk can personally shape what information an AI does or doesn’t show, that’s not innovation. That’s the consolidation of narrative control.

    When we gut public education, weaken institutions of science and journalism, and leave people unable to critically assess the information they’re being fed, AI becomes a distortion engine with no brakes—not because it’s evil, but because we’ve stripped away the tools to resist its misuse.

    We have to ask who benefits, who decides, and who gets to hold them accountable. Because as long as the answer is “no one,” the story doesn’t end with superintelligence. It ends with unchecked power, amplified by machines, and a public too distracted, divided, or disempowered to intervene.

    Our Future Is Still Ours

    We’re not doomed.

    That’s the part people forget when they talk about AI like it’s fate. As if the rise of AGI is a cosmic event we can’t shape. As if the Singularity is already written, and we’re just watching it unfold.

    But we’re not spectators.

    We’re the authors.

    Every day, in every boardroom, government office, university lab, and startup pitch deck, people are making decisions about what AI becomes. What it protects. What it threatens. Who it includes. Who it erases.

    That means the future is still open. Still contested. Still ours to shape.

    We can demand accountability. We can invest in public institutions that inform and protect. We can teach our children how to think critically, how to recognize misinformation, how to ask better questions. We can regulate the use of AI without killing innovation. We can fund alternatives that aren’t controlled by billionaires. We can insist that progress isn’t just what’s possible, it’s what’s ethical.

    This isn’t just about AI. It’s about us. It always has been.

    We’ve been handed a powerful tool. It’s up to us whether we use it to illuminate or incinerate.

    It’s not AI that will destroy us.

    It’s us.


    Note: There are a few good reads on the topic, including The End of Reality by Jonathan Taplin and More Everything Forever by Adam Becker.

    1. https://www.politifact.com/article/2025/may/30/MAHA-report-AI-fake-citations/ ↩︎
    2. https://www.agdaily.com/news/phony-citations-discovered-kennedys-maha-report/ ↩︎
    3. https://theweek.com/politics/maha-report-rfk-jr-fake-citations ↩︎
    4. https://www.independent.co.uk/news/world/americas/us-politics/elon-musk-grok-right-wing-violence-b2772242.html ↩︎
    5. https://www.fcc.gov/document/fcc-issues-6m-fine-nh-robocalls ↩︎
    6. https://www.washingtonpost.com/politics/2023/04/25/rnc-biden-ad-ai/ ↩︎
    7. https://globalwitness.org/en/campaigns/digital-threats/greenwashing-and-bothsidesism-in-ai-chatbot-answers-about-fossil-fuels-role-in-climate-change/ ↩︎
  • AI First, Second Thoughts

    AI First, Second Thoughts

    Over the past few weeks, several companies have made headlines by declaring an “AI First” strategy.

    Shopify CEO Tobi Lütke told employees that before asking for additional headcount or resources, they must prove the work can’t be done by AI.

    Duolingo’s CEO, Luis von Ahn, laid out a similar vision, phasing out contractors for tasks AI can handle and using AI to rapidly accelerate content creation.

    Both companies also stated that AI proficiency will now play a role in hiring decisions and performance reviews.

    On the surface, this all sounds reasonable. If generative AI can truly replicate—or even amplify—human effort, then why wouldn’t companies want to lean in? Compared to the cost of hiring, onboarding, and supporting a new employee, AI looks like a faster, cheaper alternative that’s available now.

    But is it really that simple?

    First, there was AI Last

    Before we talk about “AI First,” it’s worth rewinding to what came before.

    I’ve long been an advocate of what I’d call an “AI Last” approach, so the “AI First” mindset is a shift for me.

    Historically, I’ve found that teams jump too quickly to AI as the sole solution, usually under significant pressure from the top to “do more AI.” That rush reflects a lack of understanding of what AI is, how it works, its limitations, and its cost. The mindset of sprinkling magical AI pixie dust over a problem and having it solved is naive and dangerous, often distracting teams from a much more practical solution.

    Here’s why I always pushed for exhausting the basics before reaching for AI:

    Cost

    • High development and maintenance costs: AI solutions aren’t cheap. They require time, talent, and significant financial investment.
    • Data preparation overhead: Training useful models requires large volumes of clean, labeled data—something most teams don’t have readily available.
    • Infrastructure needs: Maintaining reliable AI systems often means investing in robust MLOps infrastructure and tooling.

    Complexity

    • Simple solutions often work: Business logic, heuristics, or even minor process changes can solve the problem faster and more predictably.
    • Harder to maintain and debug: AI models are opaque by nature—unlike rule-based systems, it’s hard to explain why they behave the way they do.
    • Performance is uncertain: AI models can fail in edge cases, degrade over time, or simply underperform outside of their training environment.
    • Latency and scalability issues: Large models—especially when accessed through APIs—can introduce unacceptable delays or infrastructure costs.

    Risk

    • Low explainability: In regulated or mission-critical settings, black-box AI systems are a liability.
    • Ethical and legal exposure: AI can introduce or amplify bias, violate user privacy, or produce harmful or offensive outputs.
    • Chasing hype over value: Too often, teams build AI solutions to satisfy leadership or investor expectations, not because it’s the best tool for the job.

    What Changed?

    So why the shift from AI Last to AI First?

    The shift happened not just because of what generative AI made possible, but because of how effortless it made everything look.

    Generative AI feels easy.

    Unlike traditional AI, which required data pipelines, modeling, and MLOps, generative AI tools like ChatGPT or GitHub Copilot give you answers in seconds with nothing more than a prompt. The barrier to entry feels low, and the results look surprisingly good (at first).
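
    To make the contrast concrete, here is roughly what “nothing more than a prompt” looks like in practice: a minimal sketch using the OpenAI Python client, where the model name and the prompt are illustrative assumptions and an API key is assumed to be configured in the environment.

    ```python
    # A minimal sketch of the "just a prompt" workflow using the OpenAI Python client.
    # The model name and prompt are illustrative assumptions; an API key is expected
    # in the OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; use whatever your account can access
        messages=[
            {"role": "user", "content": "Summarize this support ticket in one sentence: ..."}
        ],
    )

    print(response.choices[0].message.content)
    ```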

    This surface-level ease masks the hidden costs, risks, and technical debt that still lurk underneath. But the illusion of simplicity is powerful.

    Generalization expands possibilities.

    LLMs can generalize across many domains, which lowers the barrier to trying AI in new areas. That’s a significant shift from traditional AI, which typically had narrow, custom-built models.

    AI for everyone.

    Anyone—from marketers to developers—can now interact directly with AI. This democratization of AI access represents a significant shift, accelerating adoption, even in cases where the use case is unclear.

    Speed became the new selling point.

    Prototyping with LLMs is fast. Really fast. You can build a working demo in hours, not weeks. For many teams, that 80% solution is “good enough” to ship, validate, or at least justify further investment.

    That speed creates pressure to bypass traditional diligence, especially in high-urgency or low-margin environments.

    The ROI pressure is real.

    Companies have made massive investments in AI, whether in cloud compute, partnerships, talent, or infrastructure. Boards and executives want to see returns. “AI First” becomes less of a strategy and more of a mandate to justify spend.

    It’s worth mentioning that this pressure sometimes focuses on using AI, not using it well.

    People are expensive. AI is not (on the surface).

    Hiring is slow, expensive, and full of risk. In contrast, AI appears to offer infinite scale, zero ramp-up time, and no HR overhead. For budget-conscious leaders, the math seems obvious.

    The hype machine keeps humming.

    Executives don’t want to be left behind. Generative AI is being sold as the answer to nearly every business challenge, often without nuance or grounding in reality. Just like with traditional AI, teams are once again being told to “add AI” without understanding if it’s needed, feasible, or valuable.

    It feels like a shortcut.

    There’s another reason “AI First” is so appealing: it feels like a shortcut.

    It promises to bypass the friction, delay, and uncertainty of hiring. Teams can ship faster, cut costs, and show progress—at least on the surface. In high-pressure environments, that shortcut is incredibly tempting.

    But like most shortcuts, this one comes with consequences.

    Over-reliance on AI can erode institutional knowledge, create brittle systems, and introduce long-term costs that aren’t immediately obvious. Models drift. Prompts break. Outputs change. Context disappears. Without careful oversight, today’s efficiency gains can become tomorrow’s tech debt.

    Moving fast is easy. Moving well is harder. “AI First” can be a strategy—but only when it’s paired with rigor, intent, and a willingness to say no.

    What’s a Better Way?

    “AI First” isn’t inherently wrong, but without guardrails, it becomes a race to the bottom. A better approach doesn’t reject AI. It reframes the question.

    Yes, start with AI. But don’t stop there. Ask:

    • Is AI the right tool for the problem?
    • Is this solution resilient, or just fast?
    • Are we building something sustainable—or something that looks good in a demo?

    A better way is one that’s AI-aware, not AI-blind. That means being clear-eyed about what AI is good at, where it breaks down, and what it costs over time.

    Here are five principles I’ve seen work in practice:

    Start With the Problem, Not the Technology

    Don’t start by asking, “How can we use AI?” Start by asking, “What’s the problem we’re trying to solve?”

    • What does success look like?
    • What are the constraints?
    • What’s already working—or broken?

    AI might still be the right answer. But if you haven’t clearly defined the problem, everything else is just expensive guesswork.

    Weigh the Tradeoffs, Not Just the Speed

    Yes, AI gets you something fast. But is it the right thing?

    • What happens when the model changes?
    • What’s the fallback if the prompt fails?
    • Who’s accountable when it goes off the rails?

    “AI First” works when speed is balanced by responsibility. If you’re not measuring long-term cost, you’re not doing ROI—you’re doing wishful thinking.

    Build for Resilience, Not Just Velocity

    Shortcuts save time today and create chaos tomorrow.

    • Document assumptions.
    • Build fallback paths.
    • Monitor for drift.
    • Don’t “set it and forget it.”

    Treat every AI-powered system like it’s going to break, because eventually, it will. The teams that succeed are the ones who planned for it.
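
    As one illustration, here is a minimal sketch of what a fallback path can look like, assuming a hypothetical call_model function standing in for your real model call and a deterministic, rule-based fallback behind it.

    ```python
    # A minimal sketch of a fallback path for an AI-powered step. call_model and
    # rule_based_summary are hypothetical stand-ins: the first for your real model
    # call, the second for a deterministic, explainable backup.
    import logging

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("ai_step")


    def call_model(ticket_text: str) -> str:
        """Placeholder for the real model call (API client, prompt, etc.)."""
        raise TimeoutError("model unavailable")  # simulate a failure for the demo


    def rule_based_summary(ticket_text: str) -> str:
        """Deterministic fallback: crude, but predictable and explainable."""
        return ticket_text.strip().split(".")[0][:120]


    def summarize(ticket_text: str) -> str:
        try:
            result = call_model(ticket_text)
            if not result or len(result) > 500:  # basic output validation
                raise ValueError("model output failed validation")
            return result
        except Exception as exc:
            log.warning("AI step failed (%s); using rule-based fallback", exc)
            return rule_based_summary(ticket_text)


    if __name__ == "__main__":
        print(summarize("Customer reports login errors after the update. Resets did not help."))
    ```

    The specific check matters less than the principle: the failure mode is decided before the system breaks, not after.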

    Design Human-AI Collaboration, Not Substitution

    Over-automating can backfire. When people feel like they’re just babysitting machines—or worse, being replaced by them—you lose the very thing AI was supposed to support: human creativity, intuition, and care.

    The best systems aren’t human-only or AI-only. They’re collaborative.

    • AI drafts, people refine.
    • AI scales, humans supervise.
    • AI suggests, humans decide.

    This isn’t about replacing judgment, it’s about amplifying it. “AI First” should make your people better, not make them optional.

    Measure What Actually Matters

    A lot of AI initiatives look productive because we’re measuring the wrong things.

    More output ≠ better outcomes.

    And if everyone is using the same AI tools in the same way, we risk a monoculture of solutions—outputs that look the same, sound the same, and think the same.

    Real creativity and insight don’t come from the center. They come from the edges, from the teams that challenge assumptions and break patterns. Over-reliance on AI can mute those voices, replacing originality with uniformity.

    Human memory is inefficient and unreliable in comparison to machine memory. But it’s this very unpredictability that’s the source of our creativity. It makes connections we’d never consciously think of making, smashing together atoms that our conscious minds keep separate. Digital databases cannot yet replicate the kind of serendipity that enables the unconscious human mind to make novel patterns and see powerful new analogies of the kind that lead to our most creative breakthroughs. The more we outsource our memories to Google, the less we are nourishing the wonderfully accidental creativity of our consciousness.

    Ian Leslie, Curious: The Desire to Know and Why Your Future Depends on It

    If we let AI dictate the shape of our work, we may all end up building the same thing—just faster.

    More speed ≠ more value.

    Instead of counting tasks, measure trust. Instead of tracking volume, track quality. Focus on the things your customers and teams actually feel.

    The Real “AI First” Advantage

    The companies that win with AI won’t be the ones who move the fastest.

    They’ll be the ones who move the smartest. They’ll be the ones who know when to use AI, when to skip it, and when to slow down.

    Because in the long run, discipline beats urgency. Clarity beats novelty. And thoughtfulness scales better than any model.

    The real power of AI isn’t in what it can do.

    It’s in what we choose to do with it.

  • Are You Not Entertained?

    Are You Not Entertained?

    “Give them bread and circuses, and they will never revolt.”
    — Juvenal, Roman satirist

    Over the past two weeks, my LinkedIn feed has looked like an AI fever dream. Every meme from the past 10 years was turned into a Studio Ghibli production. Former colleagues changed their profile pictures into a Muppet version of themselves. And somewhere, a perfectly respectable CTO shared an image of themselves as an ’80s action figure.

    Meanwhile, in boardrooms everywhere, a familiar silence falls: “But… where’s the ROI?”

    The Modern Colosseum

    The Roman Empire understood something timeless about human nature: if people are distracted, they’re less likely to notice what’s happening around them. Bread and circuses. Keep them fed and entertained, and you can buy yourself time (or at least avoid a riot).

    Fast-forward a couple of thousand years, swap the emperors and politicians for CEOs in hoodies and VCs in Patagonia vests, and the gladiators for generative AI, and the strategy hasn’t changed much.

    Today’s Colosseum is our social feed. And instead of lions and swords, it’s Ghibli filters, Muppet profile pictures, and action figure avatars. Every few weeks, a new AI-powered spectacle sweeps through like a new headline act. The crowd goes wild. The algorithm delivers the dopamine. And for a moment, it feels like this is what AI was always meant for: fun, viral, harmless play.

    But here’s the thing: that spectacle serves a purpose. The companies building these tools want you in the arena.

    Every playful experiment trains their models, every viral trend props up their metrics, and every wave of AI-generated content helps justify the next round of fundraising at an even higher valuation. These modern-day emperors are profiting from the distraction.

    You get a JPEG. They get data, engagement, and another step toward platform dominance.

    Meanwhile, the harder, messier questions that actually matter get conveniently lost in the noise:

    • Where does this data come from?
    • Where does the data go?
    • Who owns it?
    • Who profits from it?
    • What happens when a handful of companies control both the models and the means of production?
    • And are these tools creating real business value — or just highly shareable distractions?

    Because while everyone’s busy turning their profile picture into a dreamy Miyazaki protagonist, the real, boring, messy, complicated work of AI is quietly stalling out as companies continue to struggle to find sustainable, repeatable ways to extract value from these tools. The promise is enormous, but the reality? It’s a little less cinematic.

    And so the cycle continues: hype on the outside, hard problems on the inside. Keep the crowd entertained long enough, and maybe nobody will ask the hardest question in the arena:

    Is any of this actually working?

    Spectacle Scales Faster Than Strategy

    It’s easy to look at all of this and roll your eyes. The AI selfies. The endless gimmicks. The flood of LinkedIn posts that feel more like digital dress-up than technology strategy.

    But this dynamic exists for a reason. In fact, it keeps happening because the forces behind it are perfectly aligned.

    It’s Easy

    The barrier to entry for generative AI spectacle is incredibly low.
    Write a prompt. Upload a photo. Get a result in seconds. No infrastructure. No integration. No approvals. Just instant content, ready for likes.

    Compare that to operationalizing AI inside a company, where projects can stall for months over data access, privacy concerns, or alignment between teams. It’s easy to see which version of AI most people gravitate toward.

    It’s Visible

    Executives like to see signs of innovation. Shareholders like to hear about “AI initiatives.” Employees want to feel like their company isn’t falling behind.

    Generative AI content delivers that visibility without the friction of actual transformation. Everyone gets to point to something and say, “Look! We’re doing AI.”

    It’s Fun

    Novelty wins attention. Play wins engagement. Spectacle spreads faster than strategy ever will.

    People want to engage with these trends — not because they believe it will transform their business, but because it’s delightful, unexpected, and fundamentally human to want to see yourself as a cartoon.

    It’s Safe

    The real work of AI is messy. It challenges workflows. It exposes gaps in data. It forces questions about roles, skills, and even headcount.

    That’s difficult, political, and sometimes threatening. Creating a Muppet version of your team is much easier than asking, “How do we automate this process without breaking everything?”

    And that’s exactly what the model and tool providers are taking advantage of. The easier it is to generate content, the faster you train the models. The more fun it is to share, the more data you give away. The safer it feels, the less you question who controls the tools you’re using.

    The Danger of Distraction

    The Colosseum didn’t just keep the Roman crowds entertained — it kept them occupied. And that’s the real risk with today’s AI spectacle.

    It’s not that the Ghibli portraits or action figure avatars are bad. It’s that they’re incredibly effective at giving the illusion of progress while the hard work of transformation stalls out behind the scenes.

    Distraction doesn’t just waste time. It creates risk. It creates vulnerability.

    Because while everyone is busy playing with the latest AI toy, the companies building these tools are playing a very different game — and they are deadly serious about it.

    They’re not just entertaining users. They’re capturing data. Shaping behavior. Building platforms. Creating dependencies. And accelerating their lead.

    Every viral trend lowers the bar for what people expect AI to do — clever content instead of meaningful change, spectacle instead of service, noise instead of impact. Meanwhile, the companies behind the curtain aren’t lowering their ambitions at all. They’re racing ahead.

    And the longer you sit in the stands clapping, the harder it gets to catch up.

    Leaders lose urgency. Teams lose focus. Customers lower their standards. And quietly, beneath all the fun and novelty, a very real gap is opening up — between the companies who are playing around with AI and the companies who are building their future on it.

    This is the real risk: not that generative AI fails but that it succeeds at the completely wrong thing. That we emerge from this wave with smarter toys, funnier memes, faster content… but no real shift in how work gets done, how customers are served, or how value is created.

    And by the time the novelty wears off and people finally look around and ask, “Wait, what did we actually build?” it might be too late to catch up to the companies who never stopped asking that question in the first place.

    Distraction delays that reckoning. But it doesn’t prevent it.

    The crowd will eventually leave the Colosseum. The show always ends. What’s left is whatever you bothered to build while the noise was loudest.

    Leaving The Arena

    If the past year has felt like sitting in the front row of the AI Colosseum, the obvious question is: do you want to stay in your seat forever?

    Because leaving the arena doesn’t mean abandoning generative AI. It means stepping away from the noise long enough to remember why you showed up in the first place. It means holding both yourself and the technology providers to a higher standard.

    It means asking harder questions about how you’re using AI and who you’re trusting to shape your future.

    • What real problems could this technology help us solve?
    • Where are we spending time or money inefficiently?
    • Who owns the value we create with these tools?
    • Where are we giving away data, control, or customer relationships without realizing it?
    • What assumptions are these LLM providers baking into our products, our workflows, our culture?
    • What happens to our business if these providers change the rules, the pricing, or the access tomorrow?
    • Are we designing for leverage or locking ourselves into dependency?
    • What happens if these companies own both the means of production and the means of distribution?

    It means shifting the focus from what AI can do to what people need. From delight to durability. From spectacle to service. From passive adoption to active accountability.

    Because the real work isn’t viral. It doesn’t trend on social media. No one’s sharing screenshots of cleaner data pipelines or more intelligent internal tools. But that’s exactly where the lasting value gets created.

    The companies (and people) who figure that out will not only survive the hype cycle but also be the ones standing long after the crowd moves on to whatever comes next.

    The arena will always be there. The show will always go on. The next shiny demo will always drop.

    But at some point, you have to decide whether you’re here to watch or here to build something that lasts, and to ask the uncomfortable questions that building requires.

  • Automation’s Hidden Effort

    Automation’s Hidden Effort

    In the early 2000s, as the dot-com bubble burst, I found myself without an assignment as a software development consultant. My firm, scrambling to keep people employed, placed me in an unexpected role: a hardware testing lab at a telecommunications company.


    The lab tested cable boxes and was the last line of defense before new devices and software were released to customers. These tests consisted of following steps in a script tracked in Microsoft Excel to validate different features and functionality and then marking the row with an “x” in the “Pass” or “Fail” column.

    A few days into the job, I noticed that, after they had completed a test script, some of my colleagues would painstakingly count the “x” marks in each column and then populate the summary at the end of the spreadsheet.

    “You know, Excel can do that for you, right?” I offered, only to be met with blank stares.

    “Watch.”

    I showed them how to use simple formulas to tally results and then added conditional formatting to highlight failed steps automatically. These small tweaks eliminated tedious manual work, freeing testers to focus on more valuable tasks.

    That small win led to a bigger challenge. My manager handed me an unopened box of equipment—an automated testing system that no one had set up.

    “You know how to write code,” he said. “See if you can do something with that.”

    Inside were a computer, a video capture card, an IR transmitter, and an automation suite for running scripts written in C. My first script followed the “happy path,” assuming everything worked perfectly. It ran smoothly—until it didn’t. When an IR signal was missed, the entire test derailed, failing step after step.

    To fix it, I added verification steps after every command. If the expected screen didn’t appear, the script would retry or report a failure. Over weeks of experimentation, I built a system that ran core regression tests automatically, flagged exceptions, and generated reports.
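
    The real scripts were written in C against the vendor’s automation suite, but the verify-and-retry pattern looked roughly like this sketch (send_ir_command and screen_matches are hypothetical stand-ins for the IR transmitter and the video-capture check):

    ```python
    # Sketch of the verify-and-retry pattern described above. send_ir_command and
    # screen_matches are hypothetical stand-ins for the IR transmitter and the
    # video-capture comparison; the originals were C routines in a vendor suite.
    import time


    def send_ir_command(command: str) -> None:
        """Stand-in for sending an IR code to the cable box."""
        print(f"IR -> {command}")


    def screen_matches(expected_screen: str) -> bool:
        """Stand-in for checking the captured video frame against a reference."""
        return True  # pretend the expected screen appeared


    def step(command: str, expected_screen: str, retries: int = 3) -> bool:
        """Send a command, verify the result, and retry before reporting a failure."""
        for attempt in range(1, retries + 1):
            send_ir_command(command)
            time.sleep(1)  # give the box time to react
            if screen_matches(expected_screen):
                return True
            print(f"attempt {attempt}: expected '{expected_screen}' not found, retrying")
        return False  # flag the exception instead of derailing the whole run


    if __name__ == "__main__":
        results = {
            "GUIDE": step("GUIDE", "program_guide"),
            "MENU": step("MENU", "main_menu"),
        }
        print("PASS" if all(results.values()) else "FAIL", results)
    ```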

    When I showed my manager the result, he watched, amazed, as the cable box navigated to different screens and tested various actions as if by magic. At the end of the demo, he directed me to automate more tests.

    What he didn’t see in the demo was the effort behind the scenes—the constant tweaking, exception handling, and fine-tuning to account for the messy realities of real-world systems.

    The polished demo sent a simple message:

    Automation is here. No manual effort is needed.

    But that wasn’t the whole story. Automation, while transformative, is rarely as effortless as it appears.

    Operator: Automation’s New Chapter

    The lessons I learned in that testing lab feel eerily relevant today.

    In January 2025, OpenAI released Operator. According to OpenAI1:

    Operator is a research preview of an agent that can go to the web to perform tasks for you. It can automate various tasks—like filling out forms, booking travel, or even creating memes—by remotely interacting with a web browser much as a person would, via mouse clicks, scrolling, and typing.

    When I saw OpenAI’s announcement, I had déjà vu. Over 20 years ago, I built automation scripts to mimic how customers interacted with cable boxes—sending commands, verifying responses, and handling exceptions. It seemed simple in theory but was anything but in practice.

    Now, AI tools like Operator promise to navigate the web “just like a person,” and history is repeating itself. The demo makes automation look seamless, much like mine did years ago. The implicit message is the same:

    Automation is here. No manual effort is needed.

    But if my experience in test automation taught me anything, it’s that a smooth demo hides a much messier reality.

    The Hidden Complexity of Automation


    At a high level, Operator achieves something conceptually similar to what I built for the test lab—but with modern machine learning. Instead of writing scripts in C, it combines large language models with vision-based recognition to interpret web pages and perform actions. It’s a powerful advancement.

    However, the fundamental challenge remains: the real world is unpredictable.

    In my cable box testing days, the obstacles were largely technological. The environment was controlled, the navigation structure was fixed, and yet automation still required extensive validation steps, exception handling, and endless adjustments to account for inconsistencies.

    With Operator, the automation stack is more advanced, but the execution environment—the web—is far less predictable. Websites are inconsistent. Navigation is not standardized. Pages change layouts frequently, breaking automated workflows. Worse, many sites actively fight automation with CAPTCHAs2, anti-bot measures, and dynamic content loading. While automation tools like Operator attempt to work around these anti-bot measures, both their effectiveness and the ethics of doing so are still debatable.3,4

    The result is another flashy demo in a controlled environment, with much more “brittle and occasionally erratic”5 behavior in the wild.

    The problem isn’t the technology itself—it’s the assumption that automation is effortless.

    A Demo Is Not Reality

    Like my manager, who saw a smooth test automation demo and assumed we could apply it to every test, many will see the Operator demo and believe AI agents are ready to replace manual effort for every use case.


    The question isn’t whether Operator can automate tasks—it clearly can. But the real challenge isn’t innovation—it’s the misalignment between expectations and the realities of implementation.

    Real-world implementation is messy. Moving beyond controlled conditions, you run into exceptions, edge cases, and failure modes requiring human intervention. It isn’t clear if companies understand the investment required to make automation work in the real world. Without that effort, automation promises will remain just that—promises.

    Many companies don’t fail at automation because the tools don’t work—they fail because they get distracted by the illusion of effortless automation. Without investment in infrastructure, data, and disciplined execution, agents like Operator won’t just fail to deliver results—they’ll pull focus away from the work that matters.

    1. https://help.openai.com/en/articles/10421097-operator ↩︎
    2. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a security feature used on websites to differentiate between human users and bots. It typically involves challenges like identifying distorted text, selecting specific objects in images, solving simple math problems, or checking a box (“I’m not a robot”). ↩︎
    3. https://www.verdict.co.uk/captcha-recaptcha-bot-detection-ethics/?cf-view ↩︎
    4. https://hackernoon.com/openais-operator-vs-captchas-whos-winning ↩︎
    5. https://www.nytimes.com/2025/02/01/technology/openai-operator-agent.html ↩︎