
The End of the Shovel: Is Development as We Knew It Dead?

Published on January 4, 2026 • Vojta Baránek

Tags: AI Development, Agentic Coding, Claude, Gemini, Rock8.cloud
[Image: The end of the shovel. Is the shovel era over?]

While the business world hit the brakes for the holidays, I shifted into high gear. I used the end-of-year silence to go all-in on Rock8.cloud, pushing it from a rough PoC into a real-world MVP (alpha soon). With a clean greenfield project in front of me and every AI buzzword ringing in my ears, I decided to run a high-stakes experiment: can you actually build a serious hosting platform without falling into the “prompt-and-pray” trap?

Vibe coding it is then?

Ha, nice try… but absolutely not. “Vibe coding” is fine for a quick-and-dirty PoC or a flashy landing page, but building a production-grade infrastructure on it? That’s a nightmare waiting to happen. Services like Lovable or Bolt are great for the masses, but I’m not interested in building a house of cards. Instead, I’ve been digging into a category that doesn’t get nearly enough hype—mostly because it actually requires you to know what you’re doing: Agentic Coding. Think of it as pair programming on steroids, operating at a level of abstraction most “vibers” aren’t ready for.

Tools and Models

Choosing tools is primarily about personal preference, but I was deciding between two paths: open-source (potentially self-hostable) models, or the absolute top-tier private models.

Claude, Gemini, or OpenAI Codex?

OpenAI is not my favorite company. While their innovations defined the current era, their market position, direction, and (disproportionate) valuation don’t feel like a solid foundation for investing in their tools.

So, I was choosing between Gemini (Google is only going to get better in my opinion, and their own infrastructure gives them a massive lead) and Claude (Anthropic is clearly positioned toward developer-focused models for enterprises). Specifically, it was Gemini 3 Pro vs. Claude Opus 4.5. I tried both, but the quality and the attractive Claude Code subscription tipped the scales. Most of the time, I was burning through Opus tokens.

Open-source

I focused on the largest models to compare similar context sizes. This logically rules out the possibility of on-device self-hosting for now, though that’s an area where I see great future potential. Since GLM 4.7 was released around that time, I primarily used that.

Interface: Terminal vs. IDE

The only thing that makes sense for active development with a reasonable flow is Agentic mode directly in the IDE. I tried Zed; I tried Idea Junie (which didn’t have BYOK at the time, so I passed). Claude has its own claude-code TUI, but for me the best interface across various models (with the ability to use my Claude subscription) is OpenCode. Paradoxically, it’s a terminal app, so the flow isn’t 100%, but it’s easy to parallelize and has a high-quality communication interface and good tool usage.

In the end, I settled on the agent panel in OpenCode and diff checking in IntelliJ. (Muscle memory too strong… wcyd).

The amount of ‘You are absolutely right’ is too damn high!

[Image: “You are absolutely right!”]

I know I’m right, but I need the machine to be right, too. The problem is that LLMs are pathological people-pleasers. They’ll “solve” any problem with a confident smirk, but the moment you push back, they fold: “You are absolutely right!”

It’s a dangerous trap. This “yes-man” behavior convinces anyone not paying attention that the code is solid when it’s actually just agreeable slop. We all know the “clear instructions and guardrails” mantra—yada yada—but how do you actually break the cycle of hollow apologies in practice?

Driving the AI

Small features (e.g., adding a page to the UI)? Specify how it should structure query parameters, what belongs in state vs. what stays stateless, and which libraries to use. Feel free to attach a documentation page for a specific function. Only then do you have a chance that the implementation will fit your code standards. Plus, it forces you to think through the requirements before you start writing code.
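To make that concrete, here is a minimal sketch (hypothetical names, assuming a TypeScript/React-style frontend; this is not the actual Rock8.cloud code) of the kind of contract I spell out before the agent touches a new page: which query parameters exist, and what lives in the URL versus local component state.

```typescript
// Hypothetical contract for a new "Servers" list page; illustrative only.

// The URL is the source of truth for filtering/paging (shareable, reload-safe).
export interface ServersQueryParams {
  page: number;              // 1-based, defaults to 1
  status?: "running" | "stopped" | "provisioning";
  search?: string;           // free-text filter, trimmed, max 100 chars
}

// Ephemeral UI state stays local to the component, never in the URL.
export interface ServersPageLocalState {
  selectedServerIds: string[]; // checkbox selection for bulk actions
  isDeleteDialogOpen: boolean;
}

// Parsing rule the agent must follow instead of inventing its own.
export function parseServersQuery(params: URLSearchParams): ServersQueryParams {
  const page = Number(params.get("page") ?? "1");
  const status = params.get("status");
  return {
    page: Number.isInteger(page) && page > 0 ? page : 1,
    status:
      status === "running" || status === "stopped" || status === "provisioning"
        ? status
        : undefined,
    search: params.get("search")?.trim().slice(0, 100) || undefined,
  };
}
```

Handing the agent a contract like this, instead of “add a servers page,” is what gives the result a chance of landing inside your code standards on the first pass.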

Larger features? Write down the requirements and create a markdown implementation plan with the AI. Propose a solution, guide it, and let the AI ask follow-up questions before it generates anything. It will grep through the files and suggest an approach. It will likely forget something, which is why you have to go through the “boring” discussion. The result should be a markdown file that says exactly what to implement and how. For big tasks, I’ve found it useful to split the plan into multiple files.

Why not do it in one thread? For large tasks, AI often slips into sub-optimal implementations. Smaller plans allow you to catch errors earlier. If you have to scrap it, you burn fewer tokens and still have a backup of the requirements that you can version-control.

Interfaces, interfaces, interfaces!

[Image: Interfaces everywhere]

If you want to actually win at this, focus your entire cognitive load on the design of your interfaces. That is your highest leverage—the point where your brain work matters most. Whether you’re defining component props, URL parameters, backend services, or DB schemas, the real value isn’t the syntax; it’s the mental model you build. This is the stage that demands your most sophisticated thinking and your most intense, high-level brainstorming sessions with the AI.

Success depends on well-designed contracts between modules. Whether you solve the logic of these interfaces through your own architectural vision or a brutal, iterative debate with the machine, the goal is the same: a clear, robust system design. Without that foundational work, AI-generated code will never evolve into high-quality software, and you’ll lose the vital understanding of how the gears actually turn under the hood. Once you’ve established those contracts through high-level thinking, the actual implementation, whether finished manually or via an agent, is simply a matter of choosing the right tool for the job.
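As an illustration of what designing the contract first can look like, here is a hedged sketch (hypothetical names, assuming a TypeScript backend; not the real Rock8.cloud architecture) of a service boundary I would pin down myself before letting the agent implement either side of it:

```typescript
// Hypothetical module boundary for a hosting platform; illustrative only.

export interface DeploymentRequest {
  projectId: string;
  gitRef: string;                 // branch, tag, or commit SHA
  region: "eu-central" | "us-east";
}

export interface DeploymentResult {
  deploymentId: string;
  status: "queued" | "building" | "live" | "failed";
  url?: string;                   // set once the deployment is live
}

// The contract is the part worth sweating over; the implementation
// (manual or agent-written) just has to honor it.
export interface DeploymentService {
  deploy(request: DeploymentRequest): Promise<DeploymentResult>;
  getStatus(deploymentId: string): Promise<DeploymentResult>;
  cancel(deploymentId: string): Promise<void>;
}

// A trivial in-memory stub that satisfies the contract, useful for wiring
// up the UI before the real provisioning backend exists.
export class FakeDeploymentService implements DeploymentService {
  private deployments = new Map<string, DeploymentResult>();

  async deploy(request: DeploymentRequest): Promise<DeploymentResult> {
    const result: DeploymentResult = {
      deploymentId: `${request.projectId}-${Date.now()}`,
      status: "queued",
    };
    this.deployments.set(result.deploymentId, result);
    return result;
  }

  async getStatus(deploymentId: string): Promise<DeploymentResult> {
    const found = this.deployments.get(deploymentId);
    if (!found) throw new Error(`Unknown deployment: ${deploymentId}`);
    return found;
  }

  async cancel(deploymentId: string): Promise<void> {
    this.deployments.delete(deploymentId);
  }
}
```

Once a boundary like this is locked, swapping the in-memory stub for a real provisioning backend, whether written by hand or by the agent, doesn't ripple through the rest of the system.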

The Silicon Battleground: Why Opus 4.5 Won My Winter

Throwing the most expensive models at a fresh project finally yielded results I’d call good. I put Gemini 3 through its paces, and while it performed great, it didn’t quite click with my workflow the way Claude did. I even experimented with GLM 4.7—it lagged a bit behind the heavy hitters, but it’s more than usable for smaller, isolated tasks.

In the end, I spent most of my time burning through Claude Opus 4.5. It feels more dialed into the latest stacks and modern UI practices, delivering high-caliber logic with impressive consistency. Once the interfaces were locked, it stopped guessing and started churning out production-grade code that actually met my standards. When you pair top-tier silicon with a solid architectural map, the machine becomes a formidable engine for execution. But even with the best tools money can buy, you can still drive the whole thing off a cliff if you aren’t careful.

Proceed with Caution: Pitfalls & Power Drains

AI is a bottomless pit of “helpfulness”—and the more it helps, the more you’re bleeding tokens. It’s a great business model for them, but a disaster for you if you’re flying blind. If you don’t actually know how to implement a feature, stop playing the token lottery. You aren’t going to hit the jackpot by throwing random prompts at the wall. Use the AI for research, digest the documentation yourself, and then come back with a surgical strike. Blindly guessing isn’t engineering; it’s expensive gambling.

The Parallelization Paradox: Git wasn’t built for a hive mind. When you’re deep in the trenches on a local branch, the last thing you want is a second agent wildly hallucinating changes over your existing work. Sure, there are “workflows” for this, but they ignore the biggest bottleneck: You.

AI can scale horizontally at terrifying speeds, but can your brain? The context-switching tax is real. For complex architectural shifts, I have to be in the room, actively brainstorming and verifying every single mutation the AI suggests. I can see a world where asynchronous remote agents squash minor bugs while I sleep, but for the big, structural moves? The efficiency falls apart the second you stop intervening. You can’t automate the soul of the architecture—at least, not yet.

The Final Verdict: Am I Cooked?

Well, not really—but the floor is definitely rising. In my opinion, AI isn’t coming to replace developers (despite what the marketing hype-train wants you to believe), but it is going to raise the bar brutally. This isn’t your cue to sit back and chill while the tokens spin. Instead, AI is here to act as a force multiplier for your skills—vaporizing the routine work and supercharging those who actually master their craft. It’s the arrival of the excavator for a workforce that’s been digging with hand shovels for decades.

Once this buzzword bubble inevitably pops, these tools will simply be another standard part of the workflow. Yet, as the world drowns in a sea of ‘AI slop’, your personal credibility, reliability, and responsibility become your real currency. At the end of the day, someone has to be the one people can actually count on. The machine can dig the hole, but you still have to decide where the foundation goes.

How does AI help you during your development? Have you found a workflow that actually delivers for you?

Codebase Looking More "Ancient Alien Tech" Than Intelligible?

When deciphering your AI's output requires a Rosetta Stone and a team of confused linguists, it's time for ProRocketeers. Our developers write code humans can actually read, maintain, and (dare we say) understand, not just marvel at its alien complexity.