David Monnerat

Product + AI | Systems Thinker | Enterprise Reality

Category: future

  • The Leveler: The Use Cases Nobody Planned For

    I sat at a table in the gym at my son’s school. Around the room were a dog groomer, a police detective, someone from the state park maintenance crew, and an archaeologist. We were there for career day. My topic was AI.

    When each group of kids came over, I played them a song. I told them my son made it. He had the idea, shaped the lyrics, and used AI tools to bring it to life. It’s on Spotify and Apple Music. The instruments, the vocals, the production were all generated by AI. But the idea, the story, the words were his.

    Some of the middle school kids had already used ChatGPT. A few had used it for homework. One wanted a training plan for a video game he was trying to get better at. One girl said she used it as someone to talk to.

    That last one stopped me. The adults in the room were thinking about AI in terms of what it might take from these kids. That girl was using it for something none of the adults had thought to offer her.

    I’ve spent more than a decade working in AI. In that time, the most important thing I’ve learned is that the most interesting question about any technology is never what it’s supposed to do. It’s what people actually do with it.

    The Song

    In June 2022, my son and I were in Colorado. My Tampa Bay Lightning were playing his Colorado Avalanche in the Stanley Cup Finals. We went to Game 2. The Avalanche won 7-0 on their way to winning the Cup.

    It was a rout. It was also one of the best nights we’ve had.

    My son has epilepsy. His memory doesn’t work the way most people’s does. Some things stick. Some things don’t. The reasons aren’t always clear, and he doesn’t always have control over which is which. But that game stuck. The score, the crowd, the improbable joy of watching your dad’s team get demolished on their way to losing the championship. That one he kept.

    We still talk about it. We laugh about it. It became a core memory in a brain that doesn’t always hold onto things.

    A while back he decided to make a song about it. He had the idea. He worked on the lyrics, using AI to help shape them. He used Suno to generate the music and the vocals. The result was a real song, his song, about that night in Colorado. It’s on Spotify and Apple Music.

    I’ve thought a lot about what that means.

    The Leveler

    There’s a common criticism of AI and technology more broadly: that it’s eroding our ability to remember things. That when we outsource memory to our phones or our search engines or our AI tools, we lose the capacity to hold things ourselves. The concern is real, and there’s research to support it.

    But that argument is made by people whose memory works.

    For people whose brains don’t store things the same way, these tools aren’t a crutch. They’re access. They’re a leveler.

    We take pictures the way most families do. But for my son, pictures do something different. He may not remember an event, but he’ll see a photo and his brain will construct a story from it, stitching together something coherent and believable from the visual evidence. The story feels real because it is real, even if the memory that produced it works differently than other people’s memories.

    The pictures are doing the work his memory can’t. They’re scaffolding for an internal process that needs support.

    The song is the same thing, one layer deeper.

    A photograph captures a moment. The song is something he made from a moment. He wasn’t just present for that game. He processed it, shaped it into something, and now it exists outside of him as a record of how that night landed. If the memory ever fades, the song will be there. Not as documentation, but as something he built. Something that carries his version of the story.

    That’s a different kind of artifact than a photograph. A photograph is evidence. A song is expression.

    The tools that made it possible weren’t designed specifically for this. But he found them anyway. And made something that wouldn’t have existed without them.

    The Use Case Nobody Planned For

    At career day, I watched a similar thing happen in real time. The kids who came to my table weren’t thinking about productivity or enterprise deployment. They had things they wanted to make and questions they wanted to answer, and the tools were finally accessible enough to help them.

    A training plan for a video game. An image of something specific. A song about a hockey game. A conversation when there was no one else to talk to.

    The enterprise AI conversation focuses on scale, efficiency, and organizational impact. Those things matter. But they tend to crowd out the more personal use cases. The ones that emerge when someone with a specific need finds a general-purpose tool and redirects it toward something that actually helps them.

    I’ve written before about the long tail of problems that were never going to get funded. The tasks too small and too niche to justify a project or a budget. This is the same phenomenon, but more intimate. It’s not about workflow efficiency. It’s about expression. About making something. About finding a way into something that felt out of reach.

    My son couldn’t have made that song five years ago. Not because the idea wasn’t in him. It was. But because the tools that could have helped him bring it out didn’t exist in a form he could reach.

    Now they do.

    What This Actually Tells Us

    The AI conversation tends to sort itself into two camps: those who focus on what the technology enables, and those who focus on what it threatens. Both camps tend to argue from the assumed mainstream. The average user, the average use case, the average impact.

    The more interesting signal is at the edges. The girl who uses ChatGPT as someone to talk to. The kid who makes a song about a sporting event he wants to remember. The person who builds something small for a problem nobody else thought was worth solving.

    These aren’t outliers to dismiss. They’re early indicators of what accessibility actually means when tools get capable enough to be redirected. Not the use case the designers intended. The use case the person needed.

    I work in this space professionally. I think about AI in terms of systems, incentives, and organizational readiness. That’s the right level of analysis for the work I do.

    But I also have a kid who made a song. And playing it for the other kids in that gym, watching their faces when I told them he made it, that told me something the systems-level analysis doesn’t.

    The tools are getting accessible enough to reach people who weren’t in the original design. And some of those people are going to do things with them that nobody planned for.

    That girl talking to ChatGPT. My son and his song.

    Those are the use cases I’m watching.


    I put together a handout for the younger kids at career day — simple AI prompts parents can try at home with their kids. No experience needed. Take it, use it, share it.

  • Defining Success Criteria: Do You Know Where You Are Going?

    It was the tenth call.

    I had been on the first one. A customer data project, matching records across systems to connect outcomes to the right source. It seemed straightforward enough that I handed it off and moved on. What followed was eight more calls between my team and the customer, each one ending with a tweak to the logic, each tweak fixing something and revealing something else.

    The problem wasn’t the code. It wasn’t the team. It was that nobody had defined success criteria before the work started. Nobody had asked: what does done actually look like?

    By the time my team pulled me back in, it was already swirling. The customer’s manager had joined too. I suspect that because the issue still wasn’t resolved, he felt the need to get involved. We had a piece of logic built by people who were no longer on the team, results that were almost always right, and a team that had been grinding on this for weeks.

    I asked a simple question: What do you actually want?

    That was it. Every previous conversation had been about what the code wasn’t doing right, assuming the approach was sound and just needed adjustment. Nobody had stopped to ask whether the approach itself was the right one. The answer to that simpler question was: when this happens, here is what we expect to see. And the solution that followed was far simpler than what we had been building toward. A defined set of conditions, clearly mapped to outcomes. The complexity we had been wrestling with wasn’t a feature of the problem. It was a feature of never having properly defined the problem.

    We left that meeting with a clear destination. We should have had that conversation on call one.
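
    The "defined set of conditions, clearly mapped to outcomes" can be pictured as a small decision table. What follows is a minimal sketch, not the logic from that project: the record fields, rule names, and outcomes are all hypothetical, chosen only to show the shape of "when this happens, here is what we expect to see."

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


# Hypothetical record shape; the field names are illustrative only.
@dataclass(frozen=True)
class Record:
    customer_id: Optional[str]
    email: Optional[str]
    source_system: str


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[Record, Record], bool]
    outcome: str


# Each rule states one agreed condition and the outcome expected for it.
# Rules are checked in order; the first one that applies wins.
RULES = [
    Rule(
        "same customer id",
        lambda a, b: a.customer_id is not None and a.customer_id == b.customer_id,
        "match",
    ),
    Rule(
        "same email, different system",
        lambda a, b: a.email is not None
        and a.email == b.email
        and a.source_system != b.source_system,
        "match",
    ),
    # Explicit catch-all so "no match" is a decision, not an accident.
    Rule("nothing in common", lambda a, b: True, "no_match"),
]


def decide(a: Record, b: Record) -> Tuple[str, str]:
    """Return (outcome, rule name) so each result traces back to a stated condition."""
    for rule in RULES:
        if rule.condition(a, b):
            return rule.outcome, rule.name
    raise ValueError("no rule covered this pair")
```

    Because every outcome comes back with the name of the rule that produced it, a disagreement with the customer becomes a conversation about one named condition rather than another round of tweaks.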

    The Road That Never Shortens

    The frustrating thing about this kind of problem is that it doesn’t feel like you’re lost. You can see the destination. The direction feels right. Every adjustment produces a result that is better than the last one. You can look back and see real distance covered.

    But the destination never gets closer.

    It is not a treadmill. On a treadmill, nothing changes. This is different. The scenery changes. The code changes. The outputs change. Progress is real. It just isn’t progress toward the right place, because the right place was never precisely defined.

    I think of it as a receding horizon. You walk toward it. The ground behind you is real and covered. But the horizon moves as you move, and no amount of forward motion closes the gap. The destination stays visible, just always slightly further ahead.

    What makes it so hard to catch is that the evidence of progress is genuine. In our case, the logic was better after each call than it was before. The team wasn’t spinning their wheels. They were solving real problems. The issue was that the problems they were solving were symptoms of a question nobody had fully asked.

    The Assumption Cascade

    When I look back at those ten calls, the thing that strikes me isn’t that anyone did something wrong. It’s that everyone did something reasonable, and the reasonable things stacked up into an expensive mess.

    The customer assumed the existing logic was doing what they needed, just imperfectly. My team assumed the approach was sound and needed tweaking. When a new team member joined and inherited the code, they assumed the foundation was correct. Their job was to fix the edges, not question the core. I assumed the problem was simple enough that my continued involvement wasn’t necessary. Each assumption was defensible in isolation. Together, they meant that nobody stopped to validate whether the foundation was right in the first place.

    There was another assumption underneath all of them. The code had been built by someone with more context than anyone currently on the team. That history gave it authority. When the results looked wrong, the natural conclusion was that something in the current implementation needed fixing, not that the original approach was built on the wrong question. We trusted the expert’s work even after the expert was gone. And in doing so, we inherited not just the code but the assumptions baked into it.

    When we finally looked at what the logic was actually doing versus what the business needed, the gap was significant. There were layers of code built on assumptions about the data that were no longer accurate. That had always been true. Nobody had thought to check because everyone assumed the foundation had already been validated. Each tweak revealed new issues that had always been there. It felt like discovery. It was actually just uncovering more road.

    The Same Pattern in AI

    I have watched this exact dynamic play out in AI projects more times than I can count, and it is worse there for a specific reason: AI systems always produce output.

    A piece of analytics logic can return null or error in a way that is obviously wrong. A model returns a prediction. An LLM returns a response. The output looks like an answer even when the question was never properly defined. That makes it even easier to convince yourself you are iterating toward something real when you are actually just generating variations on an undefined target.

    I have seen teams spend months adjusting prompts, tuning thresholds, and debating evaluation criteria for systems where nobody had written down what success looked like in concrete, measurable terms. Defining success criteria sounds like a step teams can skip when they’re moving fast. It isn’t. The model kept producing output. The output kept improving on the metrics the team had defined. The business kept saying it wasn’t right yet. And the team kept tweaking, because tweaking was the only tool they had.

    The problem was never the model. It was the missing destination.
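
    A small sketch makes the contrast concrete. It assumes a toy lookup standing in for rule-based logic and a deliberately naive stand-in for a model; the order IDs, labels, and function names are all hypothetical. The point is only that the second function never comes back empty, so the written-down expectation is the one thing that exposes it.

```python
from typing import Callable, List, Optional

# Hypothetical lookup table standing in for rule-based analytics logic.
KNOWN_SOURCES = {"A-100": "web", "B-200": "store"}


def rule_based_source(order_id: str) -> Optional[str]:
    """Rule logic can come back empty, which is visibly wrong."""
    return KNOWN_SOURCES.get(order_id)


def model_like_source(order_id: str) -> str:
    """Stand-in for a model: it always returns a plausible label, even for unknown input."""
    labels = ["web", "store", "phone"]
    return labels[len(order_id) % len(labels)]


# The agreed expectation, written down before any tuning starts.
EXPECTED = {"A-100": "web", "B-200": "store"}


def unmet_expectations(predict: Callable[[str], Optional[str]]) -> List[str]:
    """Return the inputs where the output differs from the agreed expectation."""
    return [key for key, want in EXPECTED.items() if predict(key) != want]
```

    Calling `rule_based_source("Z-999")` returns `None`, which nobody would mistake for an answer. Calling `model_like_source("Z-999")` returns a label that looks like one. Only `unmet_expectations` separates a plausible output from a correct one.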

    Two Questions That Define Your Success Criteria

    There is a question I now try to ask at the beginning of any project, before any code is written or any model is selected or any dashboard is designed.

    Do you know where you are going?

    Most teams will say yes. They have a goal. They have a use case. They have a sense of what they are trying to accomplish. But the follow-up question is the one that matters.

    How will you know when you get there?

    That second question is where undefined destinations get exposed. A team that can answer it will describe arrival in specific, observable terms. A team that cannot will describe a feeling. Something like: the results will just look right, or the stakeholders will be satisfied, or we’ll know it when we see it.

    Justice Potter Stewart famously said that about obscenity. The bar for a product should be higher.

    If a team cannot describe what arrival looks like in concrete terms, they do not yet have a destination. They have a direction. And direction without destination is how you end up on your tenth call, fixing logic that was never going to get you where you needed to go.

    Stop Before You Tweak

    When a project starts accumulating meetings, when each session ends with a new adjustment rather than a resolved question, when the results are always almost right but never quite there: stop. Not to debug the logic. Not to adjust the model. Stop to ask the two questions.

    Do we know where we are going? And how will we know when we get there?

    Challenge every assumption that has been made about what the system is doing and what the business actually needs. In our case, the assumptions were layered three teams deep. Nobody was hiding anything. Everyone was trying to help. But the assumptions meant that the wrong question had been quietly powering the work for weeks.

    The destination, when we finally named it, was clear. Focused. Achievable.

    It didn’t require less ambition. It required less swirl.

  • The Code Nobody Understands: The Hidden Risk of AI-Generated Code

    The error message appeared at 2:17 a.m. Something about a null reference in a service that hadn’t been touched in weeks. The developer on call pulled up the relevant file, read through it, and felt a familiar but newly uncomfortable sensation.

    She had written this code. Her name was in the commit history. But she hadn’t really written it. She’d accepted it. Reviewed it in the way you review something when you’re moving fast, and the output looks right, and the tests pass. The AI had generated the logic. She had approved the shape of it. And now, at 2:17 a.m., she needed to understand not just what it did but why it did it that way: what assumption it was built on, what edge case it was avoiding, what the author had been thinking.

    There was no author. Not in the sense that mattered.


    Debugging is not reading code. That’s the thing people miss when they think about what AI-generated code changes.

    Reading code tells you what a program does. Debugging tells you why it broke, which requires understanding what the code assumed. Every piece of code is a record of decisions: why this approach and not another, what inputs were considered normal, where the edge of the design was drawn. Those decisions live in the mind of the person who made them. When you write code yourself, you carry that context invisibly. It’s not in the comments. It’s not in the variable names. It’s in your memory of the afternoon you spent figuring out why the simpler approach didn’t work.

    When something fails, you don’t just read the broken code. You reconstruct the intent. You ask: what was this trying to do, and where did reality diverge from the assumption? That reconstruction depends on having built the mental model in the first place.

    AI-generated code arrives without that history. It is the output of a process you did not participate in, produced by a system that has no memory of producing it and no stake in what happens when it runs. The code may be correct. It may even be elegant. But the reasoning that produced it is not available to you.

    When it breaks, you are not reconstructing intent. You are reverse-engineering a decision process that was never explained and no longer exists.

    This is happening one pull request at a time.

    A developer uses an AI tool to scaffold a new service. The output is good enough. She reviews it, adjusts a few things, and ships it. Six months later, either a different developer or the same one with a different memory has to modify that service. He reads through it. The code is coherent but opaque in the way that inherited code is always opaque: it works, and you can see that it works, but you cannot see why it was built this way and not another way. Usually, that’s because the original developer made tradeoffs they never documented. Now it’s because there was no original developer in the relevant sense.

    Neither of them is doing anything wrong. The tool is working as intended. The code is functional. The problem is quieter than that.

    The codebase is accumulating decisions that nobody made.

    There is a craft dimension to this that is easy to undervalue until it is missing.

    Experienced developers carry something that is hard to name but easy to recognize: a feel for systems. The ability to look at a piece of code and sense where it will be fragile. To read a stack trace and know immediately which layer to look at. To hear a description of a failure mode and think: I’ve seen that pattern before, it usually means this.

    That intuition is not innate. It’s built from years of writing code that broke, figuring out why, and carrying the lesson forward. It’s the accumulated residue of debugging. Every production incident is a deposit into that account. Every hour spent reconstructing intent from someone else’s code is a deposit.

    If you outsource the writing, you also reduce the debugging. If you reduce the debugging, you slow down the accumulation. The intuition that makes senior engineers valuable is built precisely from the friction that AI tools are designed to remove.

    The tools make you faster. They may also make you shallower. Not immediately, but incrementally.


    A single developer using AI to move faster is fine. The tradeoffs are local and manageable. But consider the codebase of a growing company where the majority of code was generated by AI tools over a period of several years. Where the team that shipped the original services has turned over. Where nobody alive in the organization has a mental model of why certain architectural decisions were made, because those decisions were not made by a person in the way that creates mental models.

    That system will fail eventually. All systems do. And when it fails in a way that matters, under load, in a corner case, in a security context, the organization will face a debugging problem that is categorically different from the ones it has faced before. Not harder in complexity, necessarily. Harder in a different way: the knowledge required to fix it was never created.

    You cannot interview the AI that wrote it. You cannot ask it what it was thinking. You cannot look at its previous projects and recognize a pattern. It produced the code, and then it was gone.


    None of this is an argument against using AI coding tools. They are genuinely useful, and the productivity gains are real.

    This is an argument for being honest about the tradeoff.

    We tend to talk about AI-generated code in terms of output: does it work, does it pass the tests, does it meet the spec. These are the right questions for shipping. They are incomplete questions for operating.

    Operating a system over time requires more than knowing that it works. It requires knowing how it works, and why, and where it will break, and what to do when it does. That knowledge has to live somewhere. In a person, in documentation detailed enough to reconstruct intent, or in a team with enough collective history to fill the gaps.

    Right now, we are generating code faster than we are generating that knowledge. The gap between the two is not visible in the metrics that matter day to day. It will become visible under pressure.


    The developer on call at 2:17 a.m. will figure it out. She always does. She will read the code carefully, form a hypothesis, test it, and find the assumption that broke. She is good at this.

    But she will spend longer than she should, because the code she is debugging does not remember being written.

    And the next time will take a little longer still.

  • Enterprise AI Implementation: You Were Promised Everything. Here’s What It Took.

    It was, by all appearances, a standard enterprise AI implementation.

    The summaries looked clean.

    At the top of the screen was a concise paragraph capturing a customer interaction: what was requested, what was explained, and what follow-up was required. Action items were listed neatly below. It was the kind of output you could screenshot for a slide deck. Efficient. Polished. Convincing.

    The premise was simple. If employees spent less time documenting interactions, they could spend more time serving customers. Efficiency would increase. Costs would decrease. The model worked in the demo. It summarized transcripts fluently and quickly. The business case felt straightforward.

    It moved forward.

    The strain didn’t appear in the demo. It appeared in real use.

    Transcripts did not always flow through the system in the way the workflow assumed. Attribution of who said what, acceptable in curated samples, became less reliable amid the variability of real conversations. When attribution shifted, the summary shifted with it. For some stakeholders, that was inconvenient. For others, it introduced risk.

    Then something more structural surfaced.

    The assumption had been that there was a single summary for each interaction. In practice, different stakeholders needed different things from the same conversation. Someone preparing for the next engagement cared about context and commitments. Someone evaluating performance cared about adherence to the process. Leadership cared about patterns across many interactions.

    One summary could not satisfy all of those needs equally well.
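
    One way to picture the alternative is a single structured record of the interaction with a separate view per audience. This is a sketch under assumptions, not how the system described here was built: the `Interaction` fields and the three view functions are hypothetical names chosen to mirror the three stakeholders above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical structured record of one interaction; all field names are illustrative.
@dataclass
class Interaction:
    customer: str
    commitments: List[str] = field(default_factory=list)
    process_steps_followed: Dict[str, bool] = field(default_factory=dict)
    topics: List[str] = field(default_factory=list)


def view_for_next_engagement(i: Interaction) -> str:
    """Context and commitments for whoever speaks to the customer next."""
    return f"{i.customer}: open commitments: {', '.join(i.commitments) or 'none'}"


def view_for_performance_review(i: Interaction) -> str:
    """Process adherence for someone evaluating the interaction."""
    missed = [step for step, done in i.process_steps_followed.items() if not done]
    return f"missed steps: {', '.join(missed) or 'none'}"


def view_for_leadership(interactions: List[Interaction]) -> Dict[str, int]:
    """Topic counts across many interactions, for spotting patterns."""
    counts: Dict[str, int] = {}
    for i in interactions:
        for topic in i.topics:
            counts[topic] = counts.get(topic, 0) + 1
    return counts
```

    The design choice is that the record, not any one paragraph of prose, is the source of truth. Each audience gets a rendering built for its question, and a detail dropped from one view is still recoverable from the record.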

    The original framing of saving time on notes began to feel incomplete. Capturing what was said was only one part of the job that documentation performed. Good records preserve continuity. They prevent repeated effort. They carry context forward to the next conversation, the next decision, the next relationship moment. If a generated summary omitted a critical detail and someone had to go back to the original interaction to find it, the downstream cost could easily outweigh the time saved up front. And unlike writing notes, which happens once, the cost of a missing detail can repeat itself across every subsequent interaction with that customer.

    Under light use, the system worked. Under sustained use, the edges became visible.

    The model had done what it was designed to do. The surrounding system had not yet fully defined its requirements.


    It’s tempting to treat generative AI as an easy button.

    Providers will say they do summarization. And they do. Models can summarize text. They can condense transcripts. They can produce coherent output from messy inputs.

    But capability in isolation is different from capability under context.

    The gap isn’t whether the model works. It’s whether the system around it is ready.

    I’ve seen this play out repeatedly. The hard questions aren’t technical. They’re the ones that should have been answered before anyone opened a laptop. What is the actual job this tool is supposed to do? Not the elevator pitch version. The operational one. Is the goal speed? Accuracy? Compliance? Relationship continuity? Performance management? Each of those implies a different design, a different metric, and a different definition of done.

    Who owns the output if it’s wrong? What happens when accuracy and speed pull in opposite directions and someone has to choose? What does good actually look like, and how will anyone know when they’ve reached it?

    These aren’t philosophical questions. They are the kind of questions that get answered eventually, either intentionally before you build or expensively after you scale.

    AI lowers the barrier to building. It does not lower the barrier to clarity.


    When the summarization tool moved from demonstration to deployment, it functioned less like a feature and more like a pressure test. Variability in data pipelines surfaced. Differences in stakeholder needs became more pronounced. Cost assumptions changed once usage expanded beyond a controlled subset. Metrics that seemed sufficient in theory proved inadequate in practice.

    The pressure did not create the weaknesses. It revealed them.


    I’ve watched the same pattern unfold in other contexts.

    In one case, a generative model was introduced to help draft customer communications. The demo was compelling. With curated prompts and examples, the system produced usable content. It hinted at real scale, and the leadership team liked what they saw.

    The stated goal was efficiency. Produce more output in less time.

    But efficiency was a proxy for something nobody had fully defined. Was success higher engagement? Improved response rates? Stronger brand consistency? Faster turnaround? The system could generate text, but it couldn’t determine which message was right for which audience segment. It couldn’t encode organizational voice without deliberate structure. It couldn’t tell you whether what it produced was actually better, because nobody had agreed on what better meant.

    The complexity didn’t disappear when the tool was adopted. It surfaced.

    Measurement frameworks had to be built from scratch. Editorial standards had to be written down for the first time. Experiments had to be designed carefully enough to mean something. The promise of speed ran well ahead of the work required to turn speed into value.

    The technology functioned. The surrounding system required definition.


    There is a broader pattern here.

    AI doesn’t introduce ambiguity into organizations. It finds the ambiguity that was already there and makes it move faster. Unclear ownership becomes a bottleneck overnight. Imprecise metrics become arguments about whether anything worked. Inconsistent data becomes a reliability issue in production. The model doesn’t create these conditions. It removes the slack that had been quietly absorbing them.

    I think about stress tests in engineering. They aren’t performed to prove a system works under ideal conditions. They’re performed to understand how it behaves under load, where the weak points are, what fails first, and why.

    Generative AI acts as a similar test inside organizations.

    The demo proves possibility. Deployment applies pressure.

    Under that pressure, organizations discover whether they defined the job clearly enough, whether their measurement systems are disciplined enough, whether their governance structures can absorb additional complexity, and whether they’re willing to slow down long enough to align before they scale.

    The promise of AI was not inherently wrong. Many of the projected gains were directionally sound. But the promise assumed a level of structural readiness that most organizations had never examined, because nothing had ever required them to.

    That is what it took.


    This is not a story about bad technology or careless leadership. It’s a story about what happens when building gets easier before thinking does.

    When a working model exists, momentum builds quickly. The demo impresses the room. The business case gets approved. The roadmap shifts. And the slower work, the kind that requires sitting with hard questions before anyone writes a line of code, starts to look like unnecessary delay.

    Under acceleration, patience feels irresponsible.

    But ambiguity doesn’t disappear under pressure. It compounds.

    In both of these initiatives, the most significant challenges were not technical. They were definitional. What exactly were we trying to improve? For whom? How would we know when we got there? What tradeoffs were acceptable once we operated at scale?

    Those questions don’t disappear because a model performs well in a demo. They become more urgent.

    AI does not eliminate the need for product leadership. It intensifies it.


    So what does clarity actually look like before you build?

    It starts with the job. Not the efficiency narrative or the cost reduction story that fits neatly into a business case, but the real work the tool is supposed to do and for whom. In the summarization example, that meant asking not just whether time could be saved writing notes, but what those notes were actually for. Who reads them next? What decision do they support? What happens downstream when they’re incomplete? A summary isn’t valuable because it exists. It’s valuable because of what it carries forward.

    It extends to the people who will live with the output. Not just the ones in the demo. Different stakeholders interact with the same artifact in fundamentally different ways. Designing for one and discovering the others in production is an expensive way to learn something that a few deliberate conversations could have surfaced earlier.

    It forces agreement on what success means before the first model is trained. Not directionally, but specifically. What metric moves? By how much? Over what timeframe? What would failure look like, and how would you know? These conversations are uncomfortable because they expose tradeoffs. But they are far less expensive than months of development followed by a room full of people debating whether anything worked.
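
    Those three questions (what metric, by how much, over what timeframe) fit in a very small data structure. The sketch below is one hedged way to write them down. The class name, the metric, and every number in the usage note are hypothetical, not figures from either initiative.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SuccessCriterion:
    metric: str            # what metric moves
    baseline: float        # where it starts today
    target: float          # by how much: the value that counts as success
    deadline: date         # over what timeframe
    higher_is_better: bool = True

    def status(self, observed: float, as_of: date) -> str:
        """Return 'met', 'in_progress', or 'failed' for one observation."""
        if self.higher_is_better:
            reached = observed >= self.target
        else:
            reached = observed <= self.target
        if reached:
            return "met"
        # Not there yet: still in progress until the deadline passes, failed after.
        return "in_progress" if as_of <= self.deadline else "failed"
```

    For example, a team might agree on `SuccessCriterion("minutes_on_notes_per_interaction", 6.0, 3.0, date(2025, 6, 30), higher_is_better=False)`. An observation of 4.0 minutes on July 1 would then report `failed`, which is exactly the uncomfortable answer the conversation is meant to make possible.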

    And it requires honesty about the foundation. Clean data. Clear ownership. Defined workflows. Realistic cost assumptions at scale. These aren’t bureaucratic hurdles. They are the conditions that determine whether what gets built is worth sustaining.

    None of this is slow for its own sake. It’s the work that makes speed durable. Organizations that did it well weren’t cautious. They were precise. They moved quickly once they knew what they were building and why. The ones that skipped it moved fast too, right up until the moment they didn’t.

    Clarity before speed isn’t a philosophy. It’s the actual cost of doing this right.


    The summaries looked clean.

    Under pressure, the gaps appeared.

    The model did what it was designed to do.

    The question was whether the organization around it was ready to carry the weight.

    You were promised everything.

    What it took was clarity before speed.

  • When AI Safety Commitments Become Ballast

    There’s a moment in every race when weight starts to matter.

    At the beginning, you carry everything. Redundancy. Margin. Contingency. The assumption is that you can afford to be careful, that prudence is a strength rather than a liability.

    Then someone pulls ahead.

    And what once felt responsible begins to feel heavy.

    Over the past year, we’ve started to see that dynamic surface in the AI industry.

    One major lab revised a flagship safety pledge that had previously been framed as firm. Around the same time, another secured a high-profile defense contract after a competitor hesitated over how its policies applied to military use. Each decision, taken on its own, was defensible. Policies evolve. Governments seek capability. Companies interpret commitments in context.

    But together, they reveal something structural.

    Safety commitments do not exist outside competitive pressure. And competitive pressure changes how commitments behave.

    Over the past several years, frontier labs have published increasingly detailed safety frameworks: responsible scaling policies, capability thresholds, deployment guardrails, public commitments to pause development under certain conditions. On paper, this looked like maturation. A recognition that frontier models are not just products but infrastructure. That capability increases are nonlinear. That misuse risk and geopolitical consequence are real.

    But safety inside a competitive market operates differently than safety in isolation.

    If a safeguard slows release timelines, it stops being only a question of principle. It becomes a question of position. If one company interprets a boundary strictly while another interprets it flexibly, the stricter company absorbs the delay. And delay compounds.

    Not because leadership suddenly stops caring about safety, but because the cost of being slower is immediate and measurable, while the benefit of being cautious is probabilistic and diffuse.

    That asymmetry matters.

    The risk is not that companies abandon safety entirely. It is that safety becomes relative — relative to rivals, to political pressure, to market cycles. And relative standards drift.

    Safety that exists primarily as policy language can be refined, reinterpreted, and adjusted under pressure. Safety that is embedded as structural constraint — reinforced through governance, incentives, and shared baselines — is harder to move.

    Most AI safety today lives somewhere in between.

    None of this requires conspiracy.

    It requires acceleration.

    The faster models improve, the more the industry behaves like a race. The more it behaves like a race, the more weight gets scrutinized. And when weight is scrutinized, it is measured against speed.

    Optional safeguards are not discarded outright. They are narrowed. Clarified. Updated. Positioned differently. Over time, the difference between optimization and erosion becomes harder to see.

    Once safety becomes a variable instead of a constraint, it will be optimized like any other variable.

    There is always a less careful actor somewhere in the field. If one company relaxes a guardrail, critics will point to others who are worse. If another holds a line, competitors may frame it as impractical. The reference point shifts quietly. The baseline moves.

    No single revision signals collapse. The bar lowers incrementally, through interpretation rather than abandonment.

    Safety does not disappear. It becomes thinner. More conditional. More dependent on context.

    The industry will continue to publish commitments. It will continue to speak the language of responsibility. It will continue to signal intent.

    The real signal is not in the language.

    It is in what remains non-negotiable when pressure increases.

    When safety is structural, it constrains speed.

    When it is strategic, it competes with it.

    And in competitive markets, strategy is optimized.

    Constraints are endured.

  • The Dulling of Innovation

    For a few years, I was on a patent team. Our job was to drive innovation: to empower employees to come up with new ideas, then shepherd those ideas through the process to see if we could turn them into patents.

    I loved that job for many reasons. It leveraged an innovation framework I had already started with a few colleagues, work that earned us a handful of patents. It fed my curiosity, my love of technology, and my joy in being surrounded by smart people. Most of all, I loved watching someone light up as they became an inventor.

    I worked with an engineer who had an idea based on his deep knowledge of a specific system. Together, we expanded on that idea and turned it into an innovative solution to a broader problem. The look on his face when his idea was approved for patent filing was one of the greatest moments of my career. For years after, he would stop me in the hallway just to say hello and introduce me as the person who helped him get a patent.

    Much of the success I saw on that team came from people who deeply understood a problem, were curious to ask why, and believed there had to be a better way. That success was amplified when more than one inventor was involved, when overlapping experiences and diverse perspectives combined into something truly original.

    When I moved into product management, the same patterns held true. The most successful ideas still came from a clear understanding of the problem, deep knowledge of the system, and the willingness to explore different perspectives.

    Innovation used to be a web. It was messy, organic, and interconnected. The spark came from deep context and unexpected collisions.

    But that process is starting to change.

    Same High, Lower Ceiling

    In this new age of large language models (LLMs), companies are looking for shortcuts to growth and innovation, and they see LLMs as the cheat code.

    Teams are tasked with mining customer comments to synthesize feedback and generate feature ideas and roadmaps. If the ideas seem reasonable, they are executed without further analysis. Speed is the goal. Output is the metric.

    Regardless of size or maturity, every company can access the tools and capabilities once reserved for tech giants. Generative AI lowers the barrier to entry. It also levels the playing field, democratizing innovation.

    But what if it also levels the results?

    When everyone uses the same models, trained on the same data and prompted in similar ways, the ideas start to converge. It’s innovation by template. You might move faster, but so does everyone else, and in the same direction.

    Even when applied to your unique domain, the outputs often look the same. Which means the ideas are starting to look the same, too.

    AI lifts companies that lacked innovation muscle, but in doing so, it risks pulling down those that had built it. The average improves, but the outliers vanish. The floor rises, but the ceiling falls.

    We’re still getting the high. But it doesn’t feel like it used to.

    The Dopamine of Speed

    The danger is that we’re not going to see it happening. Worse, we’re blindly moving forward without considering the long-term implications. We’re so fixated on speed that it’s easy to convince ourselves that we’re moving fast and innovating.

    We confuse motion with momentum, and output with originality. The teams and companies that move the fastest will be rewarded. Natural selection will leave the slower ones behind. Speed will be the new sign of innovation. But just because something ships fast doesn’t mean it moves us forward.

    The dopamine hit that comes from release after release is addictive, and we’ll need more and more to feel the same level of speed and growth. We’ll rely increasingly on these tools to get our fix until it stops working altogether. Meanwhile, that growing reliance dulls effectiveness and erodes impact, and our ability to create and innovate atrophies.

    By the time we realize the quality of our ideas has flattened, we’ll be too dependent on the process to do anything differently.

    The Dealers Own the Supply

    And those models? They’re owned by a handful of companies. These companies decide how the models behave, what data they’re trained on, and what comes out of them.

    They also own the data. And it’s only a matter of time before they start mining it for intellectual property—filing patents faster than anyone else can, or arguing that anything derived from their models is theirs by default.

    Beyond intellectual property and market control, this concentration of power raises more profound ethical and societal questions. When innovation is funneled through a few gatekeepers, it risks reinforcing existing inequalities and biases embedded in the training data and business models. The diversity of ideas and creators narrows, and communities without direct access to these technologies may be left behind, exacerbating the digital divide and limiting who benefits from AI-driven innovation.

    The more we rely on these models, the more we feed them. Every prompt, interaction, and insight becomes part of a flywheel that strengthens the model and the company behind it, making it more powerful. It’s a feedback loop: we give them our best thinking, and they return a usable version to everyone else.

    LLMs don’t think from first principles—they remix from secondhand insight. And when we stop thinking from scratch, we start building from scraps.

    Because the answers sound confident, they feel finished. That confidence masks conformity, and we mistake it for consensus.

    Innovation becomes a productized service. Creative edge gets compressed into a monthly subscription. What once gave your company a competitive advantage is now available to anyone who can write a halfway decent prompt.

    Make no mistake, these aren’t neutral platforms. They shape how we think, guide what we explore, and, as they become more embedded in our workflows, influence decisions, strategies, and even what we consider possible.

    We used to control the process. Now we’re just users. The same companies selling us the shortcut are quietly collecting the toll.

    When the supply is centralized, so is the power. And if we keep chasing the high, we’ll find ourselves dependent on a dealer who decides what we get and when we get it.

    Rewiring for Real Innovation

    This isn’t a call to reject the tools. Generative AI isn’t going away, and used well, it can make us faster, better, and more creative. But the key is how we use it—and what we choose to preserve along the way.

    Here’s where we start:

    1. Protect the Messy Middle

    Innovation doesn’t happen at the point of output. It happens in the friction. The spark lives in debate, dead ends, and rabbit holes. We must protect the messy, nonlinear process that makes true insight possible.

    Use AI to accelerate parts of the journey, not to skip it entirely.

    2. Think from First Principles

    Don’t just prompt. Reframe. Instead of asking, “What’s the answer?” ask, “What’s the real question?” LLMs are great at synthesis, but breakthroughs come from original framing.

    Start with what you know. Ask “why” more than “how.” And resist the urge to outsource the thinking.

    3. Don’t Confuse Confidence with Quality

    A confident response isn’t necessarily a correct one. Learn to interrogate the output. Ask where it came from, what it’s assuming, and what it might be missing.

    Treat every generated answer like a draft, not a destination.

    4. Diversify Your Inputs

    The model’s perspective is based on what it’s been trained on, which is mostly what’s already popular, published, and safe. If you want a fresh idea, don’t ask the same question everyone else is asking in the same way.

    Talk to people. Explore unlikely connections. Bring in perspectives that aren’t in the data.

    5. Make Thinking Visible

    The danger of speed is that it hides process. Write out your assumptions. Diagram your logic. Invite others into the middle of your thinking instead of just sharing polished outputs.

    We need to normalize visible, imperfect thought again. That’s where the new stuff lives.

    6. Incentivize Depth

    If we reward speed, we get speed. If we reward outputs, we get more of them. But if we want real innovation, we need to measure the stuff that doesn’t show up in dashboards: insight, originality, and depth of understanding.

    Push your teams to spend time with the problem, not just the solution.

    Staying Sharp

    We didn’t set out to flatten innovation. We set out to go faster, to do more, to meet the moment. But in chasing speed and scale, we risk trading depth for derivatives, and originality for automation.

    Large language models can be incredible tools. They can accelerate discovery, surface connections, and amplify creative potential. But only if we treat them as collaborators, not crutches.

    The danger isn’t in using these models. The danger is in forgetting how to think without them.

    We have to resist the pull toward sameness. We have to do the slower, messier work of understanding real problems, cultivating creative tension, and building teams that collide in productive ways. We have to reward originality over velocity, and insight over output.

    Otherwise, the future of innovation won’t be bold or brilliant.

    It’ll just be fast.

    And dull.