Every AI project estimate I've made or seen has been wrong the same way. The build takes longer, the integration is harder, the users are slower to adopt, and the costs you forgot to count are bigger than the ones you did.
I've estimated a lot of AI projects. For myself, for teams I worked with, for founders I've talked to. They all look roughly the same going in. A plausible plan. A reasonable timeline. Some version of: I'll build the AI feature, it'll save users X hours, I'll charge Y, and I'll be profitable in six months.
The math always works on the napkin. The math almost never works in production.
Not because the estimates are sloppy. Usually they're done by sharp people thinking carefully about what they know. The problem is what they don't know. And with AI, the things you don't know are bigger, more expensive, and more numerous than with any other kind of software you've built.
Every AI project estimate I've seen depends on at least four assumptions that feel like facts but are actually open questions. They look solid on paper. They dissolve on contact with reality.
You probably do know how long the software part takes. The frontend, the API, the database, the auth flow. You've built those before. You can estimate them.
You almost certainly don't know how long the AI part takes. The prompt engineering that works in the playground but falls apart on real user inputs. The data cleaning you didn't know you'd need until you tried to use real data. The evaluation loop where you discover the model handles 80% of cases well and the remaining 20% require a completely different approach. The hallucination edge case that only surfaces when a user types something you never tested.
I talked to a founder who estimated four weeks for an AI feature. The software scaffolding took one week, just as planned. The prompt engineering and evaluation took eleven more. Not because it was hard in a way he could have anticipated. Because the gap between "works in testing" and "works reliably on messy real-world input" was larger than he'd ever experienced with deterministic software.
With regular software, reality usually overruns your estimate by 50-100%. With AI, the final number is often 3-5x the estimate. Not because you're bad at estimating. Because the system isn't deterministic and your experience estimating deterministic systems doesn't transfer.
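One way to make that difference visible is to stop quoting a single number and split the plan into the deterministic part and the AI part, each with its own overrun range. The sketch below is a minimal illustration, not a forecasting method. The multipliers are just the rough ranges from the paragraph above, the one-week/three-week split is an illustrative plan, and the names `WorkItem` and `projected_range` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    name: str
    estimate_weeks: float
    probabilistic: bool  # True for prompt work, evaluation, data cleaning


# Rules of thumb taken from the text, not measurements.
SOFTWARE_RANGE = (1.5, 2.0)  # a 50-100% overrun
AI_RANGE = (3.0, 5.0)        # 3-5x the estimate


def projected_range(items):
    """Return (low, high) projected weeks for a list of work items."""
    low = high = 0.0
    for item in items:
        lo_mult, hi_mult = AI_RANGE if item.probabilistic else SOFTWARE_RANGE
        low += item.estimate_weeks * lo_mult
        high += item.estimate_weeks * hi_mult
    return low, high


# Illustrative four-week plan: one week of scaffolding, three of AI work.
plan = [
    WorkItem("frontend, API, auth", 1.0, probabilistic=False),
    WorkItem("prompts and evaluation", 3.0, probabilistic=True),
]

napkin = sum(item.estimate_weeks for item in plan)
low, high = projected_range(plan)
print(f"Napkin: {napkin:.0f} weeks. Projected: {low:.1f} to {high:.1f} weeks.")
```

The point of the split is that almost all of the spread comes from the AI line item, so that is where the uncertainty should be reported.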
You budgeted a week for integration. It will take three to six. Here's why.
The AI part of your product doesn't live in isolation. It connects to data sources, user workflows, third-party APIs, and existing systems. Every one of those connections has assumptions baked in. The data source has fields that are sometimes empty. The API rate-limits you in ways the documentation didn't mention. The user workflow has steps that make sense for human decision-making but don't map to automated outputs.
I watched this play out in my own work. A data pipeline I expected to wire up in a day took two weeks because the upstream system's schema had undocumented edge cases that only surfaced with production data. That wasn't a failure of planning. It was the nature of connecting systems that were designed by different people at different times for different purposes.
For solo builders, integration often means connecting to APIs you don't control. OpenAI changes its response format. Stripe webhooks behave differently in test vs. production. Your vector database handles 100 documents fine but crawls at 10,000. Each of these is solvable. None of them was in your estimate.
This is the assumption that breaks last and costs the most. You ship the feature. It works. Users don't use it. Or they use it once, don't trust the output, and go back to doing things manually.
Adoption isn't deployment. Deployment is technical. Adoption is behavioral. People have to change how they work, and people don't change how they work just because a better tool exists. They change when the new tool fits naturally into what they're already doing, when they trust it, and when the switching cost feels worth it.
I talked to a founder who built an AI writing assistant for sales teams. Beautiful product. Genuinely useful outputs. The sales reps used it for the first week, then stopped. Why? Because editing AI-generated emails took about as long as writing their own, and they trusted their own instincts more. The tool was good. The behavior change was too expensive for the perceived benefit.
His estimate assumed 80% adoption within a month. Actual adoption at month three was 15%, and most of that usage came from one enthusiastic early adopter.
If your business model depends on users changing their behavior, your timeline is wrong. Not might be wrong. Is wrong. Behavior change is measured in months, not the week after launch.
You estimated your API costs based on your test usage. Your test usage is nothing like production usage.
In testing, you send clean, predictable inputs and get efficient responses. In production, users send long, messy, ambiguous inputs that consume more tokens. They retry when they don't like the answer. They use features you expected them to use once in ways that generate five API calls instead of one. Your cost-per-user estimate was based on an average that doesn't exist in the real world.
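To see how quickly those effects compound, here is a back-of-the-envelope sketch. Every number in it is made up for illustration (the token counts, the retry rate, the price per 1,000 tokens), and `monthly_cost_per_user` is a hypothetical helper, not part of any provider's SDK. It also assumes at most one retry per call, which is a simplification.

```python
def monthly_cost_per_user(actions_per_month, calls_per_action,
                          tokens_per_call, retry_rate, price_per_1k_tokens):
    """Expected monthly API spend for one user, in dollars."""
    # Each call is retried with probability retry_rate (one retry at most).
    effective_calls = actions_per_month * calls_per_action * (1 + retry_rate)
    return effective_calls * tokens_per_call / 1000 * price_per_1k_tokens


# Test-like usage: clean inputs, one call per action, no retries.
test_cost = monthly_cost_per_user(
    actions_per_month=100, calls_per_action=1,
    tokens_per_call=500, retry_rate=0.0, price_per_1k_tokens=0.01,
)

# Production-like usage: longer inputs, five calls per action, 30% retries.
prod_cost = monthly_cost_per_user(
    actions_per_month=100, calls_per_action=5,
    tokens_per_call=1500, retry_rate=0.3, price_per_1k_tokens=0.01,
)

print(f"test: ${test_cost:.2f}  production: ${prod_cost:.2f}  "
      f"ratio: {prod_cost / test_cost:.1f}x")
```

None of the three changes looks dramatic on its own. Multiplied together, under these invented inputs, they put the production cost per user at roughly twenty times the test figure.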
Beyond API costs, there are costs you probably haven't budgeted for at all:
I'm not going to tell you to pad your estimates by 3x, even though that's usually closer to reality. Padding is just a fudge factor that makes you feel better without actually understanding the risk.
Instead, try this: for every AI project estimate, write down the four assumptions above (build time, integration time, adoption, and cost) and pressure-test each one honestly.
And add one more thing that most estimates skip: kill criteria. Under what conditions would you stop? What would the data have to show? If you can't answer that before you start, you're not making an estimate. You're making a commitment, and commitments are hard to reverse even when the evidence says you should.
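Kill criteria work best when they're written down as checkable conditions before launch, not argued about afterward. Here is a minimal sketch of what that could look like. The thresholds and metric names are hypothetical placeholders, and `KillCriterion` and `triggered` are illustrative names, not an established tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KillCriterion:
    description: str
    metric: str                           # key into the observed-metrics dict
    should_stop: Callable[[float], bool]  # True when the stop condition holds


# Hypothetical thresholds, decided before the project starts.
CRITERIA = [
    KillCriterion("Adoption still below 20% at month three",
                  "adoption_rate_month_3", lambda v: v < 0.20),
    KillCriterion("API cost per user exceeds the price charged",
                  "cost_to_price_ratio", lambda v: v > 1.0),
]


def triggered(criteria, observed):
    """Return descriptions of criteria whose stop condition is met.

    Metrics that have not been measured yet are skipped.
    """
    return [
        c.description
        for c in criteria
        if c.metric in observed and c.should_stop(observed[c.metric])
    ]


# Example reading: 15% adoption at month three, costs at 60% of price.
observed = {"adoption_rate_month_3": 0.15, "cost_to_price_ratio": 0.6}
print(triggered(CRITERIA, observed))
```

The code itself is trivial. The value is in being forced to pick the thresholds while you're still objective about the project.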
The estimate that gets you excited is the one that tells you what you want to hear. The estimate that saves you is the one that tells you what you need to hear. They're almost never the same spreadsheet.
I'm writing this down because I've made every one of these mistakes. I've estimated four weeks and spent sixteen. I've assumed integration was a weekend and spent a month. I've built features that worked perfectly and watched nobody use them. I've been surprised by API bills and maintenance costs I didn't see coming.
Every builder I talk to makes these same mistakes, and it's not because they're careless. It's because AI feels like software, and our instinct is to estimate it like software. But the math from software doesn't transfer to AI, for the same reason the testing doesn't transfer and the shipping doesn't transfer: the system is probabilistic, the environment changes, and the users are unpredictable.
The napkin math is always clean. The production math never is. The builders who survive aren't the ones with the best estimates. They're the ones who build in the assumption that their estimates are wrong, and structure their projects so they can learn and adjust before the money runs out.
Builder's Path is a public lab from Sellhausen AI Systems focused on AI-native building, validation, and product judgment.