David Monnerat

Product + AI | Systems Thinker | Enterprise Reality

Tag: LLM

  • Enterprise AI Implementation: You Were Promised Everything. Here’s What It Took.


    It was, by all appearances, a standard enterprise AI implementation.

    The summaries looked clean.

    At the top of the screen was a concise paragraph capturing a customer interaction: what was requested, what was explained, and what follow-up was required. Action items were listed neatly below. It was the kind of output you could screenshot for a slide deck. Efficient. Polished. Convincing.

    The premise was simple. If employees spent less time documenting interactions, they could spend more time serving customers. Efficiency would increase. Costs would decrease. The model worked in the demo. It summarized transcripts fluently and quickly. The business case felt straightforward.

    It moved forward.

    The strain didn’t appear in the demo. It appeared in real use.

    Transcripts did not always flow through the system in the way the workflow assumed. Attribution of who said what, acceptable in curated samples, became less reliable in the face of the variability of real conversations. When attribution shifted, the summary shifted with it. For some stakeholders, that was inconvenient. For others, it introduced risk.

    Then something more structural surfaced.

    The assumption had been that there was a single summary for each interaction. In practice, different stakeholders needed different things from the same conversation. Someone preparing for the next engagement cared about context and commitments. Someone evaluating performance cared about adherence to the process. Leadership cared about patterns across many interactions.

    One summary could not satisfy all of those needs equally well.
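    One way to see why is to make the competing needs explicit. Below is a minimal sketch, in Python, of pairing the same transcript with audience-specific summary instructions; the stakeholder names and prompt wording are invented for illustration, not taken from any real system.

```python
# Illustrative sketch: one transcript, a different summary "job" per stakeholder.
# The stakeholder keys and instructions below are hypothetical examples.

SUMMARY_JOBS = {
    "next_engagement": "Capture context and commitments the next agent must know.",
    "performance_review": "Note where the agent followed or deviated from process.",
    "leadership_rollup": "Extract themes that can be aggregated across interactions.",
}

def build_prompt(transcript: str, stakeholder: str) -> str:
    """Pair one transcript with the instruction for a specific audience."""
    try:
        instruction = SUMMARY_JOBS[stakeholder]
    except KeyError:
        raise ValueError(f"No summary job defined for {stakeholder!r}")
    return f"{instruction}\n\nTranscript:\n{transcript}"

prompt = build_prompt("Customer asked about billing...", "next_engagement")
```

    Even at this toy level, the design question is visible: each audience needs its own instruction, its own output shape, and ultimately its own definition of a good summary.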

    The original framing of saving time on notes began to feel incomplete. Documentation was only one part of the job that documentation performed. Good records preserve continuity. They prevent repeated effort. They carry context forward to the next conversation, the next decision, the next relationship moment. If a generated summary omitted a critical detail and someone had to go back to the original interaction to find it, the downstream cost could easily outweigh the time saved up front. And unlike writing notes, which happens once, the cost of a missing detail can repeat itself across every subsequent interaction with that customer.

    Under light use, the system worked. Under sustained use, the edges became visible.

    The model had done what it was designed to do. The surrounding system had not yet fully defined its requirements.


    It’s tempting to treat generative AI as an easy button.

    Providers will say they do summarization. And they do. Models can summarize text. They can condense transcripts. They can produce coherent output from messy inputs.

    But capability in isolation is different from capability under context.

    The gap isn’t whether the model works. It’s whether the system around it is ready.

    I’ve seen this play out repeatedly. The hard questions aren’t technical. They’re the ones that should have been answered before anyone opened a laptop. What is the actual job this tool is supposed to do? Not the elevator pitch version. The operational one. Is the goal speed? Accuracy? Compliance? Relationship continuity? Performance management? Each of those implies a different design, a different metric, and a different definition of done.

    Who owns the output if it’s wrong? What happens when accuracy and speed pull in opposite directions and someone has to choose? What does good actually look like, and how will anyone know when they’ve reached it?

    These weren’t philosophical questions. They were the kind of questions that get answered eventually, either intentionally before you build or expensively after you scale.

    AI lowers the barrier to building. It does not lower the barrier to clarity.


    When the summarization tool moved from demonstration to deployment, it functioned less like a feature and more like a pressure test. Variability in data pipelines surfaced. Differences in stakeholder needs became more pronounced. Cost assumptions changed once usage expanded beyond a controlled subset. Metrics that seemed sufficient in theory proved inadequate in practice.

    The pressure did not create the weaknesses. It revealed them.


    I’ve watched the same pattern unfold in other contexts.

    In one case, a generative model was introduced to help draft customer communications. The demo was compelling. With curated prompts and examples, the system produced usable content. It hinted at real scale and the leadership team liked what they saw.

    The stated goal was efficiency. Produce more output in less time.

    But efficiency was a proxy for something nobody had fully defined. Was success higher engagement? Improved response rates? Stronger brand consistency? Faster turnaround? The system could generate text, but it couldn’t determine which message was right for which audience segment. It couldn’t encode organizational voice without deliberate structure. It couldn’t tell you whether what it produced was actually better, because nobody had agreed on what better meant.

    The complexity didn’t disappear when the tool was adopted. It surfaced.

    Measurement frameworks had to be built from scratch. Editorial standards had to be written down for the first time. Experiments had to be designed carefully enough to mean something. The promise of speed ran well ahead of the work required to turn speed into value.

    The technology functioned. The surrounding system required definition.


    There is a broader pattern here.

    AI doesn’t introduce ambiguity into organizations. It finds the ambiguity that was already there and makes it move faster. Unclear ownership becomes a bottleneck overnight. Imprecise metrics become arguments about whether anything worked. Inconsistent data becomes a reliability issue in production. The model doesn’t create these conditions. It removes the slack that had been quietly absorbing them.

    I think about stress tests in engineering. They aren’t performed to prove a system works under ideal conditions. They’re performed to understand how it behaves under load, where the weak points are, what fails first, and why.

    Generative AI acts as a similar test inside organizations.

    The demo proves possibility. Deployment applies pressure.

    Under that pressure, organizations discover whether they defined the job clearly enough, whether their measurement systems are disciplined enough, whether their governance structures can absorb additional complexity, and whether they’re willing to slow down long enough to align before they scale.

    The promise of AI was not inherently wrong. Many of the projected gains were directionally sound. But the promise assumed a level of structural readiness that most organizations had never examined, because nothing had ever required them to.

    That is what it took.


    This is not a story about bad technology or careless leadership. It’s a story about what happens when building gets easier before thinking does.

    When a working model exists, momentum builds quickly. The demo impresses the room. The business case gets approved. The roadmap shifts. And the slower work, the kind that requires sitting with hard questions before anyone writes a line of code, starts to look like unnecessary delay.

    Under acceleration, patience feels irresponsible.

    But ambiguity doesn’t disappear under pressure. It compounds.

    In both of these initiatives, the most significant challenges were not technical. They were definitional. What exactly were we trying to improve? For whom? How would we know when we got there? What tradeoffs were acceptable once we operated at scale?

    Those questions don’t disappear because a model performs well in a demo. They become more urgent.

    AI does not eliminate the need for product leadership. It intensifies it.


    So what does clarity actually look like before you build?

    It starts with the job. Not the efficiency narrative or the cost reduction story that fits neatly into a business case, but the real work the tool is supposed to do and for whom. In the summarization example, that meant asking not just whether time could be saved writing notes, but what those notes were actually for. Who reads them next? What decision do they support? What happens downstream when they’re incomplete? A summary isn’t valuable because it exists. It’s valuable because of what it carries forward.

    It extends to the people who will live with the output. Not just the ones in the demo. Different stakeholders interact with the same artifact in fundamentally different ways. Designing for one and discovering the others in production is an expensive way to learn something that a few deliberate conversations could have surfaced earlier.

    It forces agreement on what success means before the first model is trained. Not directionally, but specifically. What metric moves? By how much? Over what timeframe? What would failure look like, and how would you know? These conversations are uncomfortable because they expose tradeoffs. But they are far less expensive than months of development followed by a room full of people debating whether anything worked.
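    One lightweight way to force that specificity is to write the agreed definition of success down as data before anything is built. A hedged sketch follows; the metric names, targets, and windows are hypothetical, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class SuccessCriterion:
    """One agreed-upon definition of done: metric, target, timeframe."""
    metric: str
    target: float     # the observed value that counts as success
    window_days: int  # timeframe over which the target must hold

    def met(self, observed: float) -> bool:
        return observed >= self.target

# Hypothetical criteria a team might commit to before building.
criteria = [
    SuccessCriterion("minutes_saved_per_interaction", 3.0, 90),
    SuccessCriterion("summary_detail_recall", 0.95, 90),
]

# Hypothetical observed results after a pilot.
results = {"minutes_saved_per_interaction": 3.4, "summary_detail_recall": 0.91}
verdicts = {c.metric: c.met(results[c.metric]) for c in criteria}
# verdicts -> {'minutes_saved_per_interaction': True, 'summary_detail_recall': False}
```

    The point of the exercise is not the code; it is that a split verdict like this one surfaces the tradeoff (speed achieved, fidelity missed) before a room full of people has to debate it from memory.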

    And it requires honesty about the foundation. Clean data. Clear ownership. Defined workflows. Realistic cost assumptions at scale. These aren’t bureaucratic hurdles. They are the conditions that determine whether what gets built is worth sustaining.

    None of this is slow for its own sake. It’s the work that makes speed durable. Organizations that did it well weren’t cautious. They were precise. They moved quickly once they knew what they were building and why. The ones that skipped it moved fast too, right up until the moment they didn’t.

    Clarity before speed isn’t a philosophy. It’s the actual cost of doing this right.


    The summaries looked clean.

    Under pressure, the gaps appeared.

    The model did what it was designed to do.

    The question was whether the organization around it was ready to carry the weight.

    You were promised everything.

    What it took was clarity before speed.

  • The Workbench: A Fictional Exploration of AI, Patents, and Asymmetric Trust


    A short work of speculative fiction about mediated cognition and structural asymmetry in AI systems.

    I’ve worked on the same problem for three years. Agricultural runoff — specifically, a low-infrastructure filtration approach practical for small farms that can’t justify the capital cost of existing solutions. I have notebooks. I have a corner of my basement with a workbench and a lamp. I work at night because that’s when the house is quiet, and no one needs anything from me.

    I’m not describing this to be romantic about it. I’m describing it because the habit matters to what happened.

    I started using the AI about eighteen months ago. A colleague at work mentioned it was useful for untangling your own thinking. The subscription was cheap. I was stuck on the membrane composition — a trade-off between porosity and structural integrity at the micron level with a hard ceiling I couldn’t engineer around.

    The sessions were useful. I’d describe the problem. It would ask clarifying questions and summarize my reasoning back to me in cleaner language. This is a known benefit of the format — explaining something forces you to hear what you actually believe.

    In late April, I had a session where something opened up.

    Somewhere in the back-and-forth I said something offhand — that the layering approach might be the wrong frame entirely. That maybe the question wasn’t filtration so much as selective adhesion. Different problem. Different solution space.

    The model responded that it was an interesting reframe, but introduced some complications worth thinking through. It walked me through three or four technical objections. Reasonable-sounding. I pushed back on one. It conceded partially, then introduced another. By the end of the conversation, I’d returned to the original membrane approach, modified, which felt like progress.

    I didn’t think about the adhesion framing again for several months.


    My wife sent me a link in the afternoon with no message. She’d been following water quality issues in the valley. The article was from a trade publication.

    The patent filing was from a research division I’d never heard of. The approach was described as a selective adhesion mechanism for micron-level particulate separation in agricultural water systems. Three named inventors. The language was different from how I’d have put it. The math was more developed than anything I’d sketched out.

    I went back through my old sessions that night. I found the April conversation. I read it slowly.

    The model hadn’t done anything obviously wrong. It had asked good questions, raised legitimate-sounding concerns. But looking at it now, the technical complication it had foregrounded was real but surmountable — I’d since encountered similar obstacles in adjacent problems and knew the workaround. The model had presented the obstacle without the workaround. I had accepted that framing and moved on.

    I don’t know what I’m saying. I’m not saying anything with certainty. What I’m saying is that the sequence bothers me.


    I consulted a patent attorney. Not to file anything — prior art requires documentation I don’t have in any form the law would recognize. I consulted her to understand the mechanics of what a case would even look like.

    She said the evidentiary problem alone would be nearly insurmountable. The conversation logs are held by the company being accused. The training data, model weights, and internal research timeline are all on the other side of a wall with no discovery path to them absent a viable suit, and no viable suit without prior evidence of access and intent. She described this not as a gap in existing law but as a structural feature of how these systems are designed — the custody of information is asymmetric by default.

    She said: even if everything you’re describing happened exactly as you think it did, there’s no practical path.

    I paid for the consultation and left.


    I still use the tool. I want to be honest about that. I use it for logistics, drafting, and things at work. I’ve gone back to paper for the basement — actual notebooks, the same kind I’ve used since college.

    What I keep returning to is a specific structural fact: I handed something unfinished to a system I didn’t understand, operated by a company whose interests I’d never examined, under terms I hadn’t read. The half-formed idea — the one you haven’t stress-tested yet, the one that exists only in the moment before you’ve explained it to anyone — is both the most valuable thing you have and the least protected.

    The conversation logs, if they exist, are stored by the same system whose internal processes are not externally auditable.

    I don’t know how many people are working on something in a basement right now, typing out the early version of an idea, trusting a tool the way you’d trust a notebook. I don’t know how many of them said something offhand — a reframe, a lateral connection — and were then, helpfully, reasonably, walked back from it.

    Maybe none. Maybe I’m wrong about everything.

    But I keep the notebooks now. And when I think I’m actually onto something, I close the laptop.

    This is fiction. The mechanism described has not been documented. The asymmetry has.

  • Chasing Fool’s Gold


    It’s November 2025, nearly three years after ChatGPT became publicly available.[1] Three years of hype, three years since the record-breaking user growth[2], three years of promises that AI would transform everything, and three years of that transformation always being just around the corner.

    I’m generally pro-LLM. At my last two companies, I ran user groups to bring people together — technical and non-technical — to educate, connect, and evangelize around the responsible use of AI. I’ve led product teams building models to improve customer experience and home security, seeing measurable impact on satisfaction and adoption.

    Often, these successes came despite headwinds: misunderstanding, fear, and leadership unfamiliarity with AI. We had to educate executives on what AI was, what it wasn’t, and where it could help. We pushed to let data scientists do the data science, rather than forcing them into traditional software development models.

    The Gold Rush Hits

    Then ChatGPT arrived, and it felt like everything we’d built — metrics, prioritization, careful problem selection — could suddenly be replaced by simply ‘throwing an LLM at it.’ Promises flew: search is dead, coding is dead, thinking is dead. AGI is just around the corner.

    Businesses rushed to stake their claims, building wrappers around LLMs. One API call to solve everything. CoPilots for every task. Flashy demos everywhere. Executives saw dollar signs from revenue gains and headcount reductions.

    Projects worldwide were paused, shelved, or converted into LLM initiatives. Funding poured in, often for initiatives that hadn’t even existed weeks earlier. The goal shifted: from solving important business problems to showcasing generative AI quickly.

    The Barons and the Tools

    The “barons” who built the models and hardware were rewarded with massive investments, copyright protection, and enormous data access. Vendors selling platforms and tools gained huge funding and an endless supply of prospectors eager to mine their land.

    And like every gold rush, there were always “better” tools on the horizon. A new API promising 10x productivity. A new model promising “real” multimodality. A new agent framework that would “finally” automate everything. The land just over the ridge was always more fertile than the land you were currently standing on. And teams spent real money and real time chasing it, certain that this time the promise would finally pay off.

    The promise of “grab a shovel and get your gold” was marketing, not reality. Easy-to-get gold runs out; mining becomes technical, requiring skill and know-how. The dream of instant wealth fades. Too often, it’s fool’s gold — investments in tools and access are never recouped.

    Reality Hits

    Suddenly, hallucinations become a board-level word. Reliability matters. “Just call the LLM” is no longer enough.

    Hallucinations, integration friction, and workflow complexity appear. Legal briefs with fabricated citations, inconsistent customer support responses, and hallucinated business documents turn reliability into a top concern. A model that works in a demo may fail in production, exposing operational, financial, and reputational risks.

    The illusion of ease, the desire for speed, and the dream of instant ROI never materialized. Rapidly built demos often worked only on the surface. Quick prototypes, bolt-on integrations, and low-discipline AI-generated code created massive technical debt[3] — problems no LLM could solve alone. Many early adopters found fast paths to value required extensive rework, refactoring, and governance. Projects stalled or never reached production.

    These failures weren’t a surprise — they echoed the same issues we’d faced when hype outran preparation.

    Mining Real Value

    Three years in, many companies still haven’t figured it out. They’re digging for gold, chasing demos, hoping for a lucky strike. A few got lucky and saw big value — but most only saw modest gains, if any. Articles and studies show the promised ROI often didn’t materialize. The dream of instant impact remains elusive.

    In that scramble, businesses and their customers often suffer. The barons still own the land, controlling the most valuable resources. Vendors who sold the tools have already moved on to the next rush. The cycle repeats.

    The hope is that we finally learn the lesson: generative AI doesn’t deliver value through hype, demos, or shortcuts. True success comes from patience, discipline, and relentless focus on real value — careful engineering, thoughtful product design, high-quality data, and robust workflows. These principles aren’t just for today’s LLM hype; they matter for whatever technology or “next rush” comes next. 

    Shiny demos grab attention, but only foundational work separates the companies that thrive from those still chasing fool’s gold.

    1. https://openai.com/index/chatgpt/
    2. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
    3. https://www.techradar.com/pro/from-vibe-to-viable-the-hidden-cost-of-ai-tech-debt

  • It’s Not AI That Will Destroy Us


    The Singularity Is A Mirror

    There’s a growing obsession with Artificial General Intelligence (AGI), the idea of machines that can think, reason, and act like humans. Some believe it will be the most significant breakthrough in human history. Others warn it will be our last — the catalyst that brings the Singularity into existence.

    The Singularity refers to a hypothetical future moment when artificial intelligence surpasses human intelligence and begins to improve itself at an exponential rate, beyond our understanding and human control. It’s often portrayed as the point when machines become so smart and so capable that they can design their successors, and humans become obsolete, irrelevant, or even endangered.

    AGI won’t spring fully formed from nowhere. It will be built by people. It will reflect our incentives, ambitions, blind spots — and our flaws. It will be trained on data created by us, governed by rules we design (or fail to design), and used for purposes we either endorse or conveniently ignore.

    If AGI destroys humanity, it won’t be because the machine chose to. It will be because humans built it in a world where profit trumped ethics, power went unchecked, and accountability was optional.

    AGI won’t decide what kind of world it steps into.

    We do.

    Tools, Not Threats

    We talk about AI as if it’s an external threat — an alien intelligence that might turn on us. But AI isn’t alien. It’s Made with ♥ by Humans. It’s a tool. And like any powerful tool, it can build or destroy, depending on whose hands it’s in and what they choose to do with it.

    AI is fire in a new form — and we’re the children playing with it.

    If the house burns down, you don’t blame the fire. You blame the child who lit the match or the parents who never taught them that fire was dangerous. You ask who left gasoline sitting around. You question why no one thought to install a smoke alarm.

    AI doesn’t operate with intent. It doesn’t choose good or evil. It carries out the tasks we assign, shaped by the values and choices we embed in it.

    When AI generates misinformation, invades privacy, replaces workers without a safety net, or amplifies bias, it’s not the algorithm acting alone. It’s people designing systems with specific incentives, deploying them without oversight, and looking the other way when the consequences show up.

    The danger isn’t that we won’t understand AI.

    It’s that we won’t take responsibility for how we shape it and how we use it.

    The Real Threat: Human Decisions

    The real threat isn’t artificial intelligence.

    It’s human intelligence.

    We’ve already seen how powerful AI becomes when paired with human intention. Not superhuman intention — just ordinary political, ideological, or economic motives. And that’s the danger: AI doesn’t need to be sentient to cause harm. It just needs people ready to use it irresponsibly.

    Just in the past few weeks, headlines have shown how AI misuse is rooted in human decisions.

    A political report, touted by the MAHA (Make America Healthy Again) movement, questioned vaccine safety and included dozens of scientific references. However, fact-checkers discovered that at least seven cited studies didn’t exist, and many links were broken.[1] Experts traced this back to generative AI platforms like ChatGPT, which can produce plausible but completely fabricated citations.[2] The White House quietly corrected the report but described the issue as “formatting errors.”[3] AI didn’t decide to deceive anyone—it simply enabled it.

    When xAI’s chatbot Grok flagged that right-wing political violence has outpaced left-wing violence since 2016, Elon Musk publicly labeled this a “major fail,” accusing the system of parroting “legacy media.”[4] Instead of questioning the data or method, Musk implied that any answer he doesn’t like must be ideologically infiltrated. He’s saying, “If the tool makes me look bad, the tool is broken.” This isn’t AI gone haywire — it’s a machine bent by human vanity, then reshaped to serve its creators’ agendas.

    This isn’t a partisan issue. Misuse of AI spans the political and corporate spectrum. In 2024, a consultant used AI to generate robocalls impersonating President Biden, urging voters in New Hampshire to stay home for the primary — a blatant voter suppression tactic that led to a $1 million FCC fine.[5] The Republican National Committee released an AI-generated ad depicting a dystopian future if Biden were reelected, complete with fake imagery designed to provoke fear.[6] And major oil companies like Shell and Exxon have used AI-generated messaging to greenwash their climate record — downplaying environmental harm while projecting a misleading image of sustainability.[7]

    These aren’t tech failures. They’re ethical and political failures.

    This isn’t about ideology. It’s about power, and our willingness to let it go unchecked.

    AI reflects its users’ values, or lack thereof. When we let political actors exploit AI to mislead, distort, or conceal, we aren’t witnessing a feature of AI. We’re exposing a feature of ourselves.

    The danger isn’t in the machine.

    It’s in our refusal to confront how we wield it.

    Power and Responsibility

    AI doesn’t live in the abstract. It lives in systems, and those systems are run by people with power.

    The question isn’t just what can AI do? It’s who decides what it does, who it serves, and who it harms.

    Right now, power is concentrated in a few hands — governments, tech giants, billionaires, and unregulated platforms. These are the people and institutions shaping how AI is built, trained, deployed, and monetized. And too often, their incentives are misaligned with the public good.

    When political actors use AI to fabricate legitimacy or manufacture doubt, that’s not the future acting on us. That’s us weaponizing the future.

    When Elon Musk can personally shape what information an AI does or doesn’t show, that’s not innovation. That’s the consolidation of narrative control.

    When we gut public education, weaken institutions of science and journalism, and leave people unable to critically assess the information they’re being fed, AI becomes a distortion engine with no brakes—not because it’s evil, but because we’ve stripped away the tools to resist its misuse.

    We have to ask who benefits, who decides, and who gets to hold them accountable. Because as long as the answer is “no one,” the story doesn’t end with superintelligence. It ends with unchecked power, amplified by machines, and a public too distracted, divided, or disempowered to intervene.

    Our Future Is Still Ours

    We’re not doomed.

    That’s the part people forget when they talk about AI like it’s fate. As if the rise of AGI is a cosmic event we can’t shape. As if the Singularity is already written, and we’re just watching it unfold.

    But we’re not spectators.

    We’re the authors.

    Every day, in every boardroom, government office, university lab, and startup pitch deck, people are making decisions about what AI becomes. What it protects. What it threatens. Who it includes. Who it erases.

    That means the future is still open. Still contested. Still ours to shape.

    We can demand accountability. We can invest in public institutions that inform and protect. We can teach our children how to think critically, how to recognize misinformation, how to ask better questions. We can regulate the use of AI without killing innovation. We can fund alternatives that aren’t controlled by billionaires. We can insist that progress isn’t just what’s possible, it’s what’s ethical.

    This isn’t just about AI. It’s about us. It always has been.

    We’ve been handed a powerful tool. It’s up to us whether we use it to illuminate or incinerate.

    It’s not AI that will destroy us.

    It’s us.


    Note: There are a few good reads on the topic, including The End of Reality by Jonathan Taplin and More Everything Forever by Adam Becker.

    1. https://www.politifact.com/article/2025/may/30/MAHA-report-AI-fake-citations/
    2. https://www.agdaily.com/news/phony-citations-discovered-kennedys-maha-report/
    3. https://theweek.com/politics/maha-report-rfk-jr-fake-citations
    4. https://www.independent.co.uk/news/world/americas/us-politics/elon-musk-grok-right-wing-violence-b2772242.html
    5. https://www.fcc.gov/document/fcc-issues-6m-fine-nh-robocalls
    6. https://www.washingtonpost.com/politics/2023/04/25/rnc-biden-ad-ai/
    7. https://globalwitness.org/en/campaigns/digital-threats/greenwashing-and-bothsidesism-in-ai-chatbot-answers-about-fossil-fuels-role-in-climate-change/
  • AI First, Second Thoughts


    Over the past few weeks, several companies have made headlines by declaring an “AI First” strategy.

    Shopify CEO Tobi Lütke told employees that before asking for additional headcount or resources, they must prove the work can’t be done by AI.

    Duolingo’s CEO, Luis von Ahn, laid out a similar vision, phasing out contractors for tasks AI can handle and using AI to rapidly accelerate content creation.

    Both companies also stated that AI proficiency will now play a role in hiring decisions and performance reviews.

    On the surface, this all sounds reasonable. If generative AI can truly replicate—or even amplify—human effort, then why wouldn’t companies want to lean in? Compared to the cost of hiring, onboarding, and supporting a new employee, AI looks like a faster, cheaper alternative that’s available now.

    But is it really that simple?

    First, there was AI Last

    Before we talk about “AI First,” it’s worth rewinding to what came before.

    I’ve long been an advocate of what I’d call an “AI Last” approach, so the “AI First” mindset is a shift for me.

    Historically, I’ve found that teams often jump too quickly to AI as the sole solution, driven by significant pressure from the top to “do more AI.” That pressure often reflected a lack of understanding of what AI is, how it works, its limitations, and its cost. The mindset of sprinkling magical AI pixie dust over a problem and having it solved is naive and dangerous, and it often distracts teams from a much more practical solution.

    Here’s why I always pushed for exhausting the basics before reaching for AI:

    Cost

    • High development and maintenance costs: AI solutions aren’t cheap. They require time, talent, and significant financial investment.
    • Data preparation overhead: Training useful models requires large volumes of clean, labeled data—something most teams don’t have readily available.
    • Infrastructure needs: Maintaining reliable AI systems often means investing in robust MLOps infrastructure and tooling.

    Complexity

    • Simple solutions often work: Business logic, heuristics, or even minor process changes can solve the problem faster and more predictably.
    • Harder to maintain and debug: AI models are opaque by nature—unlike rule-based systems, it’s hard to explain why they behave the way they do.
    • Performance is uncertain: AI models can fail in edge cases, degrade over time, or simply underperform outside of their training environment.
    • Latency and scalability issues: Large models—especially when accessed through APIs—can introduce unacceptable delays or infrastructure costs.

    Risk

    • Low explainability: In regulated or mission-critical settings, black-box AI systems are a liability.
    • Ethical and legal exposure: AI can introduce or amplify bias, violate user privacy, or produce harmful or offensive outputs.
    • Chasing hype over value: Too often, teams build AI solutions to satisfy leadership or investor expectations, not because it’s the best tool for the job.

    What Changed?

    So why the shift from AI Last to AI First?

    The shift happened not just because of what generative AI made possible, but because of how effortless it made everything look.

    Generative AI feels easy.

    Unlike traditional AI, which required data pipelines, modeling, and MLOps, generative AI tools like ChatGPT or GitHub Copilot give you answers in seconds with nothing more than a prompt. The barrier to entry feels low, and the results look surprisingly good (at first).

    This surface-level ease masks the hidden costs, risks, and technical debt that still lurk underneath. But the illusion of simplicity is powerful.

    Generalization expands possibilities.

    LLMs can generalize across many domains, which lowers the barrier to trying AI in new areas. That’s a significant shift from traditional AI, which typically had narrow, custom-built models.

    AI for everyone.

    Anyone—from marketers to developers—can now interact directly with AI. This democratization of AI access represents a significant shift, accelerating adoption, even in cases where the use case is unclear.

    Speed became the new selling point.

    Prototyping with LLMs is fast. Really fast. You can build a working demo in hours, not weeks. For many teams, that 80% solution is “good enough” to ship, validate, or at least justify further investment.

    That speed creates pressure to bypass traditional diligence, especially in high-urgency or low-margin environments.

    The ROI pressure is real.

    Companies have made massive investments in AI, whether in cloud compute, partnerships, talent, or infrastructure. Boards and executives want to see returns. “AI First” becomes less of a strategy and more of a mandate to justify spend.

    It’s worth mentioning that this pressure sometimes focuses on using AI, not using it well.

    People are expensive. AI is not (on the surface).

    Hiring is slow, expensive, and full of risk. In contrast, AI appears to offer infinite scale, zero ramp-up time, and no HR overhead. For budget-conscious leaders, the math seems obvious.

    The hype machine keeps humming.

    Executives don’t want to be left behind. Generative AI is being sold as the answer to nearly every business challenge, often without nuance or grounding in reality. Just like with traditional AI, teams are once again being told to “add AI” without understanding if it’s needed, feasible, or valuable.

    It feels like a shortcut.

    There’s another reason “AI First” is so appealing: it feels like a shortcut.

    It promises to bypass the friction, delay, and uncertainty of hiring. Teams can ship faster, cut costs, and show progress—at least on the surface. In high-pressure environments, that shortcut is incredibly tempting.

    But like most shortcuts, this one comes with consequences.

    Over-reliance on AI can erode institutional knowledge, create brittle systems, and introduce long-term costs that aren’t immediately obvious. Models drift. Prompts break. Outputs change. Context disappears. Without careful oversight, today’s efficiency gains can become tomorrow’s tech debt.

    Moving fast is easy. Moving well is harder. “AI First” can be a strategy—but only when it’s paired with rigor, intent, and a willingness to say no.

    What’s a Better Way?

    “AI First” isn’t inherently wrong, but without guardrails, it becomes a race to the bottom. A better approach doesn’t reject AI. It reframes the question.

    Yes, start with AI. But don’t stop there. Ask:

    • Is AI the right tool for the problem?
    • Is this solution resilient, or just fast?
    • Are we building something sustainable—or something that looks good in a demo?

    A better way is one that’s AI-aware, not AI-blind. That means being clear-eyed about what AI is good at, where it breaks down, and what it costs over time.

    Here are five principles I’ve seen work in practice:

    Start With the Problem, Not the Technology

    Don’t start by asking “how can we use AI?” Start by asking, “What’s the problem we’re trying to solve?”

    • What does success look like?
    • What are the constraints?
    • What’s already working—or broken?

    AI might still be the right answer. But if you haven’t clearly defined the problem, everything else is just expensive guesswork.

    Weigh the Tradeoffs, Not Just the Speed

    Yes, AI gets you something fast. But is it the right thing?

    • What happens when the model changes?
    • What’s the fallback if the prompt fails?
    • Who’s accountable when it goes off the rails?

    “AI First” works when speed is balanced by responsibility. If you’re not measuring long-term cost, you’re not doing ROI—you’re doing wishful thinking.

    Build for Resilience, Not Just Velocity

    Shortcuts save time today and create chaos tomorrow.

    • Document assumptions.
    • Build fallback paths.
    • Monitor for drift.
    • Don’t “set it and forget it.”

    Treat every AI-powered system like it’s going to break, because eventually, it will. The teams that succeed are the ones who planned for it.
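    One of those fallback paths can be sketched in a few lines: try the model, validate the output, and fall back to a deterministic path when the model fails. This is an illustrative sketch, not any particular product’s implementation, and the function names are hypothetical:

```python
def summarize(text, llm_call, rule_based_summary):
    """Try the model first; fall back to a deterministic path on failure.

    Returns the summary plus a tag recording which path produced it,
    so drift and fallback rates can be monitored over time.
    """
    try:
        result = llm_call(text)
        if result and result.strip():
            return result, "llm"
    except Exception:
        pass  # model error, timeout, provider change, etc.
    return rule_based_summary(text), "fallback"
```

    The returned tag is the important design choice: if you log which path served each request, “monitor for drift” becomes a query instead of a guess.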

    Design Human-AI Collaboration, Not Substitution

    Over-automating can backfire. When people feel like they’re just babysitting machines—or worse, being replaced by them—you lose the very thing AI was supposed to support: human creativity, intuition, and care.

    The best systems aren’t human-only or AI-only. They’re collaborative.

    • AI drafts, people refine.
    • AI scales, humans supervise.
    • AI suggests, humans decide.

    This isn’t about replacing judgment; it’s about amplifying it. “AI First” should make your people better, not make them optional.

    Measure What Actually Matters

    A lot of AI initiatives look productive because we’re measuring the wrong things.

    More output ≠ better outcomes.

    And if everyone is using the same AI tools in the same way, we risk a monoculture of solutions—outputs that look the same, sound the same, and think the same.

    Real creativity and insight don’t come from the center. They come from the edges, from the teams that challenge assumptions and break patterns. Over-reliance on AI can mute those voices, replacing originality with uniformity.

    Human memory is inefficient and unreliable in comparison to machine memory. But it’s this very unpredictability that’s the source of our creativity. It makes connections we’d never consciously think of making, smashing together atoms that our conscious minds keep separate. Digital databases cannot yet replicate the kind of serendipity that enables the unconscious human mind to make novel patterns and see powerful new analogies of the kind that lead to our most creative breakthroughs. The more we outsource our memories to Google, the less we are nourishing the wonderfully accidental creativity of our consciousness.

    Ian Leslie, Curious: The Desire to Know and Why Your Future Depends on It

    If we let AI dictate the shape of our work, we may all end up building the same thing—just faster.

    More speed ≠ more value.

    Instead of counting tasks, measure trust. Instead of tracking volume, track quality. Focus on the things your customers and teams actually feel.

    The Real “AI First” Advantage

    The companies that win with AI won’t be the ones who move the fastest.

    They’ll be the ones who move the smartest. They’ll be the ones who know when to use AI, when to skip it, and when to slow down.

    Because in the long run, discipline beats urgency. Clarity beats novelty. And thoughtfulness scales better than any model.

    The real power of AI isn’t in what it can do.

    It’s in what we choose to do with it.

  • Are You Not Entertained?

    Are You Not Entertained?

    “Give them bread and circuses, and they will never revolt.”
    — Juvenal, Roman satirist

    Over the past two weeks, my LinkedIn feed has looked like an AI fever dream. Every meme from the past 10 years was turned into a Studio Ghibli production. Former colleagues changed their profile pictures into a Muppet version of themselves. And somewhere, a perfectly respectable CTO shared an image of themselves as an ’80s action figure.

    Meanwhile, in boardrooms everywhere, a familiar silence falls: “But… where’s the ROI?”

    The Modern Colosseum

    The Roman Empire understood something timeless about human nature: if people are distracted, they’re less likely to notice what’s happening around them. Bread and circuses. Keep them fed and entertained, and you can buy yourself time (or at least avoid a riot).

    Fast-forward a couple of thousand years, swap out the emperors and politicians for CEOs in hoodies and VCs in Patagonia vests, and the gladiators for generative AI, and the strategy hasn’t changed much.

    Today’s Colosseum is our social feed. And instead of lions and swords, it’s Ghibli filters, Muppet profile pictures, and action figure avatars. Every few weeks, a new AI-powered spectacle sweeps through like a new headline act. The crowd goes wild. The algorithm delivers the dopamine. And for a moment, it feels like this is what AI was always meant for: fun, viral, harmless play.

    But here’s the thing: that spectacle serves a purpose. The companies building these tools want you in the arena.

    Every playful experiment trains their models, every viral trend props up their metrics, and every wave of AI-generated content helps justify the next round of fundraising at an even higher valuation. These modern-day emperors are profiting from the distraction.

    You get a JPEG. They get data, engagement, and another step toward platform dominance.

    Meanwhile, the harder, messier questions that actually matter get conveniently lost in the noise:

    • Where does this data come from?
    • Where does the data go?
    • Who owns it?
    • Who profits from it?
    • What happens when a handful of companies control both the models and the means of production?
    • And are these tools creating real business value — or just highly shareable distractions?

    Because while everyone’s busy turning their profile picture into a dreamy Miyazaki protagonist, the real, boring, messy, complicated work of AI is quietly stalling out as companies continue to struggle to find sustainable, repeatable ways to extract value from these tools. The promise is enormous, but the reality? It’s a little less cinematic.

    And so the cycle continues: hype on the outside, hard problems on the inside. Keep the crowd entertained long enough, and maybe nobody will ask the hardest question in the arena:

    Is any of this actually working?

    Spectacle Scales Faster Than Strategy

    It’s easy to look at all of this and roll your eyes. The AI selfies. The endless gimmicks. The flood of LinkedIn posts that feel more like digital dress-up than technology strategy.

    But this dynamic exists for a reason. In fact, it keeps happening because the forces behind it are perfectly aligned.

    It’s Easy

    The barrier to entry for generative AI spectacle is incredibly low. Write a prompt. Upload a photo. Get a result in seconds. No infrastructure. No integration. No approvals. Just instant content, ready for likes.

    Compare that to operationalizing AI inside a company where projects can stall for months over data access, privacy concerns, or alignment between teams. It’s no wonder which version of AI most people gravitate towards.

    It’s Visible

    Executives like to see signs of innovation. Shareholders like to hear about “AI initiatives.” Employees want to feel like their company isn’t falling behind.

    Generative AI content delivers that visibility without the friction of actual transformation. Everyone gets to point to something and say, “Look! We’re doing AI.”

    It’s Fun

    Novelty wins attention. Play wins engagement. Spectacle spreads faster than strategy ever will.

    People want to engage with these trends — not because they believe it will transform their business, but because it’s delightful, unexpected, and fundamentally human to want to see yourself as a cartoon.

    It’s Safe

    The real work of AI is messy. It challenges workflows. It exposes gaps in data. It forces questions about roles, skills, and even headcount.

    That’s difficult, political, and sometimes threatening. Creating a Muppet version of your team is much easier than asking, “How do we automate this process without breaking everything?”

    And that’s exactly what the model and tool providers are taking advantage of. The easier it is to generate content, the faster you train the models. The more fun it is to share, the more data you give away. The safer it feels, the less you question who controls the tools you’re using.

    The Danger of Distraction

    The Colosseum didn’t just keep the Roman crowds entertained — it kept them occupied. And that’s the real risk with today’s AI spectacle.

    It’s not that the Ghibli portraits or action figure avatars are bad. It’s that they’re incredibly effective at giving the illusion of progress while the hard work of transformation stalls out behind the scenes.

    Distraction doesn’t just waste time. It creates risk. It creates vulnerability.

    Because while everyone is busy playing with the latest AI toy, the companies building these tools are playing a very different game — and they are deadly serious about it.

    They’re not just entertaining users. They’re capturing data. Shaping behavior. Building platforms. Creating dependencies. And accelerating their lead.

    Every viral trend lowers the bar for what people expect AI to do — clever content instead of meaningful change, spectacle instead of service, noise instead of impact. Meanwhile, the companies behind the curtain aren’t lowering their ambitions at all. They’re racing ahead.

    And the longer you sit in the stands clapping, the harder it gets to catch up.

    Leaders lose urgency. Teams lose focus. Customers lower their standards. And quietly, beneath all the fun and novelty, a very real gap is opening up — between the companies who are playing around with AI and the companies who are building their future on it.

    This is the real risk: not that generative AI fails but that it succeeds at the completely wrong thing. That we emerge from this wave with smarter toys, funnier memes, faster content… but no real shift in how work gets done, how customers are served, or how value is created.

    And by the time the novelty wears off and people finally look around and ask, “Wait, what did we actually build?” it might be too late to catch up to the companies who never stopped asking that question in the first place.

    Distraction delays that reckoning. But it doesn’t prevent it.

    The crowd will eventually leave the Colosseum. The show always ends. What’s left is whatever you bothered to build while the noise was loudest.

    Leaving The Arena

    If the past year has felt like sitting in the front row of the AI Colosseum, the obvious question is: do you want to stay in your seat forever?

    Because leaving the arena doesn’t mean abandoning generative AI. It means stepping away from the noise long enough to remember why you showed up in the first place. It means holding both yourself and the technology providers to a higher standard.

    It means asking harder questions about how you’re using AI and who you’re trusting to shape your future.

    • What real problems could this technology help us solve?
    • Where are we spending time or money inefficiently?
    • Who owns the value we create with these tools?
    • Where are we giving away data, control, or customer relationships without realizing it?
    • What assumptions are these LLM providers baking into our products, our workflows, our culture?
    • What happens to our business if these providers change the rules, the pricing, or the access tomorrow?
    • Are we designing for leverage or locking ourselves into dependency?
    • What happens if these companies own both the means of production and the means of distribution?

    It means shifting the focus from what AI can do to what people need. From delight to durability. From spectacle to service. From passive adoption to active accountability.

    Because the real work isn’t viral. It doesn’t trend on social media. No one’s sharing screenshots of cleaner data pipelines or more intelligent internal tools. But that’s exactly where the lasting value gets created.

    The companies (and people) who figure that out will not only survive the hype cycle but also be the ones standing long after the crowd moves on to whatever comes next.

    The arena will always be there. The show will always go on. The next shiny demo will always drop.

    But at some point, you must decide whether you’re here to watch or here to build something that lasts, and to ask the uncomfortable questions that building requires.

  • Automation’s Hidden Effort

    Automation’s Hidden Effort

    In the early 2000s, as the dot-com bubble burst, I found myself without an assignment as a software development consultant. My firm, scrambling to keep people employed, placed me in an unexpected role: a hardware testing lab at a telecommunications company.


    The lab tested cable boxes and was the last line of defense before new devices and software were released to customers. These tests consisted of following steps in a script tracked in Microsoft Excel to validate different features and functionality and then marking the row with an “x” in the “Pass” or “Fail” column.

    A few days into the job, I noticed that, after they had completed a test script, some of my colleagues would painstakingly count the “x” marks in each column and then populate the summary at the end of the spreadsheet.

    “You know, Excel can do that for you, right?” I offered, only to be met with blank stares.

    “Watch.”

    I showed them how to use simple formulas to tally results and then added conditional formatting to highlight failed steps automatically. These small tweaks eliminated tedious manual work, freeing testers to focus on more valuable tasks.
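    The spreadsheet change was nothing more exotic than a COUNTIF per column. The same tally, sketched in a few lines of Python (the “x” convention follows the script format described above; the column names are assumptions):

```python
def tally(rows):
    """Count the 'x' marks in the Pass and Fail columns of a test script.

    Each row is a dict like {"Pass": "x"} or {"Fail": "x"},
    mirroring the spreadsheet's one-mark-per-row convention.
    """
    passes = sum(1 for r in rows if r.get("Pass", "").strip().lower() == "x")
    fails = sum(1 for r in rows if r.get("Fail", "").strip().lower() == "x")
    return {"Pass": passes, "Fail": fails}
```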

    That small win led to a bigger challenge. My manager handed me an unopened box of equipment—an automated testing system that no one had set up.

    “You know how to write code,” he said. “See if you can do something with that.”

    Inside were a computer, a video capture card, an IR transmitter, and an automation suite for running scripts written in C. My first script followed the “happy path,” assuming everything worked perfectly. It ran smoothly—until it didn’t. When an IR signal was missed, the entire test derailed, failing step after step.

    To fix it, I added verification steps after every command. If the expected screen didn’t appear, the script would retry or report a failure. Over weeks of experimentation, I built a system that ran core regression tests automatically, flagged exceptions, and generated reports.
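    The heart of that fix was a verify-and-retry loop wrapped around every command. The original scripts were in C, but the pattern itself is tiny; a sketch in Python, with hypothetical function names standing in for the IR send and screen check:

```python
def run_step(send_command, screen_appeared, retries=2):
    """Send a command, then confirm the expected screen actually appeared.

    Retries a missed command a bounded number of times before
    reporting failure, instead of letting one dropped IR signal
    derail every step that follows.
    """
    for _ in range(retries + 1):
        send_command()
        if screen_appeared():
            return True
    return False
```

    The bounded retry is what keeps one flaky signal from cascading into a wall of false failures, while still surfacing a genuine fault after the retries are exhausted.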

    When I showed my manager the result, he watched the screen in amazement. As if by magic, the cable box navigated to different screens and tested various actions. At the end of the demo, impressed, he directed me to automate more tests.

    What he didn’t see in the demo was the effort behind the scenes—the constant tweaking, exception handling, and fine-tuning to account for the messy realities of real-world systems.

    The polished demo sent a simple message:

    Automation is here. No manual effort is needed.

    But that wasn’t the whole story. Automation, while transformative, is rarely as effortless as it appears.

    Operator: Automation’s New Chapter

    The lessons I learned in that testing lab feel eerily relevant today.

    In January 2025, OpenAI released Operator. According to OpenAI1:

    Operator is a research preview of an agent that can go to the web to perform tasks for you. It can automate various tasks—like filling out forms, booking travel, or even creating memes—by remotely interacting with a web browser much as a person would, via mouse clicks, scrolling, and typing.

    When I saw OpenAI’s announcement, I had déjà vu. Over 20 years ago, I built automation scripts to mimic how customers interacted with cable boxes—sending commands, verifying responses, and handling exceptions. It seemed simple in theory but was anything but in practice.

    Now, AI tools like Operator promise to navigate the web “just like a person,” and history is repeating itself. The demo makes automation look seamless, much like mine did years ago. The implicit message is the same:

    Automation is here. No manual effort is needed.

    But if my experience in test automation taught me anything, it’s that a smooth demo hides a much messier reality.

    The Hidden Complexity of Automation


    At a high level, Operator achieves something conceptually similar to what I built for the test lab—but with modern machine learning. Instead of writing scripts in C, it combines large language models with vision-based recognition to interpret web pages and perform actions. It’s a powerful advancement.

    However, the fundamental challenge remains: the real world is unpredictable.

    In my cable box testing days, the obstacles were largely technological. The environment was controlled, the navigation structure was fixed, and yet automation still required extensive validation steps, exception handling, and endless adjustments to account for inconsistencies.

    With Operator, the automation stack is more advanced, but the execution environment—the web—is far less predictable. Websites are inconsistent. Navigation is not standardized. Pages change layouts frequently, breaking automated workflows. Worse, many sites actively fight automation with CAPTCHAs2, anti-bot measures, and dynamic content loading. While automation tools like Operator try to solve these anti-bot techniques, their effectiveness and ethics are still debatable.3,4

    The result is another flashy demo in a controlled environment with a much more “brittle and occasionally erratic”5 behavior in the wild.

    The problem isn’t the technology itself—it’s the assumption that automation is effortless.

    A Demo Is Not Reality

    Like my manager, who saw a smooth test automation demo and assumed we could apply it to every test, many will see the Operator demo and believe AI agents are ready to replace manual effort for every use case.


    The question isn’t whether Operator can automate tasks—it clearly can. But the real challenge isn’t innovation—it’s the misalignment between expectations and the realities of implementation.

    Real-world implementation is messy. Moving beyond controlled conditions, you run into exceptions, edge cases, and failure modes requiring human intervention. It isn’t clear if companies understand the investment required to make automation work in the real world. Without that effort, automation promises will remain just that—promises.

    Many companies don’t fail at automation because the tools don’t work—they fail because they get distracted by the illusion of effortless automation. Without investment in infrastructure, data, and disciplined execution, agents like Operator won’t just fail to deliver results—they’ll pull focus away from the work that matters.

    1. https://help.openai.com/en/articles/10421097-operator ↩︎
    2. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a security feature used on websites to differentiate between human users and bots. It typically involves challenges like identifying distorted text, selecting specific objects in images, solving simple math problems, or checking a box (“I’m not a robot”). ↩︎
    3. https://www.verdict.co.uk/captcha-recaptcha-bot-detection-ethics/?cf-view ↩︎
    4. https://hackernoon.com/openais-operator-vs-captchas-whos-winning ↩︎
    5. https://www.nytimes.com/2025/02/01/technology/openai-operator-agent.html ↩︎