05 Aug 2025 · 5 min read · AI Development

From Autocomplete to Agency: Why We Moved to Agentic Development at GrowthSpace

Turning AI from a smart autocomplete into a real engineering partner

TL;DR

Copilot and AI code reviews helped, but mostly as smart autocomplete and generic feedback. Moving to Cursor + Claude—and leaning into “thinking” (reasoning) and tool use—turned AI into a real teammate. Engineers now steer outcomes; agents plan, fetch context, verify, and propose clean diffs.

The Early Days: Helpful, But Not Transformational

When AI coding tools took off, we rolled out GitHub Copilot across R&D. It sped up repetitive tasks and boilerplate, which was great. But after the initial excitement, we saw limits:

It was a fantastic autocomplete, not a teammate.
It didn’t understand how we build software — our architecture, conventions, or trade-offs.
Even as Copilot added features, it remained strongest at generic tasks.

We also tried AI code reviews. They were solid for common smells and patterns, but again: no awareness of our internal standards, documentation, or architectural decisions. It felt like working with a reviewer who’d read every textbook but never shipped with our stack.

The Turning Point: Vibe Coding, Cursor, Claude

Then came “vibe coding” and agentic tools like Cursor with Claude behind it. The interactions felt different:

Instead of tossing prompts into a void, we could collaborate on reasoning, design, and implementation.
The agent could engage the whole repo, not just a single file in front of us.
It started to feel like pair programming, not smarter autocomplete.

We decided to move to Cursor as our primary interface, and — importantly — stay flexible across models so we could use the best of what the industry offered at any moment.

That decision paid off. Results improved. But the agent still behaved like a very fast junior — capable, yet dependent on context and guidance.

What Changed With “Thinking” Models and Tool Use

Early assistants predicted the next line. Newer agents plan. They break work into steps, fetch context, call tools (such as repo search and linters), check results, and iterate. That shift, from “complete my code” to “plan → act → check → revise”, is what unlocked agentic development for us.

Practically, here’s what changed:

From prompts to procedures. We describe outcomes; the agent proposes a plan and executes it in steps.
Less context dumping. Instead of pasting half the repo, the agent uses tools to find what it needs (code search, docs, ADRs, schemas).
Smaller, safer changes. The agent can work in iterative patches we can review quickly.
Role shift for humans. Engineers steer architecture and product trade-offs; agents handle scaffolding, consistency, and repetition.
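
To make the “plan → act → check → revise” loop concrete, here is a minimal Python sketch. It is an illustration under assumptions, not how Cursor or Claude work internally: the names run_agent_loop and StepResult, and the act, check, and revise callables, are hypothetical stand-ins for a model call, a verification tool (tests, linter), and a re-prompt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str      # the step as originally planned
    output: str    # the last output produced for that step
    ok: bool       # whether the last output passed the check


def run_agent_loop(
    plan: List[str],
    act: Callable[[str], str],
    check: Callable[[str, str], bool],
    revise: Callable[[str, str], str],
    max_attempts: int = 3,
) -> List[StepResult]:
    """Run each planned step, check it, and revise until it passes or attempts run out."""
    results: List[StepResult] = []
    for step in plan:
        current, output, ok = step, "", False
        for _ in range(max_attempts):
            output = act(current)              # act: produce a patch or answer
            ok = check(current, output)        # check: lint, test, or inspect it
            if ok:
                break
            current = revise(current, output)  # revise: adjust the step and retry
        results.append(StepResult(step=step, output=output, ok=ok))
        if not ok:
            break  # stop early so a human can step in
    return results


# Toy stand-ins (hypothetical): a step "fails" while it mentions an unused import.
results = run_agent_loop(
    plan=["rename logger", "add handler with unused import"],
    act=lambda step: f"patch({step})",
    check=lambda step, output: "unused import" not in step,
    revise=lambda step, output: step.replace("unused import", "imports cleaned"),
)
for r in results:
    print(r.step, "->", r.output, "ok" if r.ok else "needs review")
```

The loop stops at the first step that still fails after max_attempts, which mirrors the idea that small, checked patches are handed back to a reviewer rather than piling up unverified changes.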

From Junior to Trusted Teammate

The Claude + Cursor combo lifted our baseline quality and consistency. We could:

Refactor large areas of the codebase while aligning to our conventions.
Generate tests and migration scripts that matched our frameworks.
Draft clean diffs with solid rationale that reviewers could scan and approve.

But we wanted more. We wanted the agent to behave like a senior contributor — to bring structure, know where to look, and validate itself. That pushed us to adopt process and guardrails around the agent so that reasoning + tool use became a reliable part of delivery, not a novelty.

Concrete Wins We Saw

Cross-cutting refactors (e.g., upgrading a logging approach across services) became a search → plan → propose PRs loop with far fewer misses.
Test scaffolding went from a tax to a default: agents generate focused tests that compile and pass linters before we even review.
API client and schema work (types, adapters, small migrations) became fast, consistent, and less error-prone.
Onboarding for new contributors improved because agents now nudge toward our patterns, not generic internet patterns.

No magic numbers here — we track cycle time, PR size, review turnaround, and defect patterns like everyone else — but the feel of delivery changed: fewer back-and-forths, more confident diffs, and less glue code stealing focus.

Lessons We Learned (So You Don’t Have To)

Tool agnostic > tool loyal. Cursor + Claude worked for us, but the point is flexibility. Keep optionality as the ecosystem evolves.
Give the agent your culture. Codify conventions, ADRs, and examples. Agents can’t align to standards they can’t see.
Guardrails create speed. Decide up front how to test for both the result you want and the failures you want to rule out; clear positive and negative checks speed up the agent’s self-verification and make sure you get what you asked for.
Think in procedures. “What outcome do we want?” → “What steps get us there?” → “What checks prove it’s done?”
Start with the right tasks. Cross-cutting, repetitive, well-bounded work shines. Greenfield domain design still works best with a human leading and the agent executing.
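
The “outcome → steps → checks” framing from the lessons above can be written down as a small data structure. The sketch below is one possible shape, assuming simple string-based checks: the Procedure class, its verify and is_done methods, and the logging example are all hypothetical. In practice the checks would more likely be test runs or linter invocations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Procedure:
    outcome: str                      # what we want to be true at the end
    steps: List[str]                  # how we expect to get there
    checks: Dict[str, Callable[[str], bool]] = field(default_factory=dict)

    def verify(self, candidate: str) -> Dict[str, bool]:
        """Run every named check against the candidate code and report each result."""
        return {name: check(candidate) for name, check in self.checks.items()}

    def is_done(self, candidate: str) -> bool:
        """The procedure is done only when every check passes."""
        return all(self.verify(candidate).values())


# Hypothetical example: one positive check and one negative check.
logging_migration = Procedure(
    outcome="Service logs through the structured logger",
    steps=["find print calls", "replace with logger.info", "run checks"],
    checks={
        "uses structured logger": lambda code: "logger.info(" in code,  # positive
        "no print calls remain": lambda code: "print(" not in code,     # negative
    },
)

print(logging_migration.verify('print("user created")\n'))
print(logging_migration.is_done('logger.info("user created")\n'))
```

Naming each check keeps the report readable for a reviewer and gives the agent a specific failure to fix instead of a generic “not done”.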

Rolling It Out Across the Team

Before adopting agentic development, we explored tools like Lovable, Base44, and Cursor. Cursor was already powerful for POCs and smaller projects, but larger codebases exposed gaps in structure and context. That led us to experiment with memory bank solutions — making the agent aware of project history, decisions, and conventions. The breakthrough came with CursorRIPER, which paired memory banks with well-defined roles for the agent. It felt like moving from a junior to a senior pair-programming partner.
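
As a rough illustration of the memory bank idea, the Python sketch below models a per-service record of history, decisions, and conventions and renders it as plain text that could be placed in an agent’s context. This is an assumption-laden sketch, not CursorRIPER’s actual file format: the MemoryBank class, its fields, and the to_context method are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryBank:
    service: str
    history: List[str] = field(default_factory=list)      # notable past changes
    decisions: List[str] = field(default_factory=list)    # architectural decisions
    conventions: List[str] = field(default_factory=list)  # team coding conventions

    def to_context(self) -> str:
        """Render the non-empty sections as plain text for an agent's context."""
        sections = [
            ("History", self.history),
            ("Decisions", self.decisions),
            ("Conventions", self.conventions),
        ]
        lines = [f"Memory bank for {self.service}"]
        for title, items in sections:
            if not items:
                continue  # skip empty sections to keep the context short
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


# Hypothetical content for an imaginary "billing" service.
bank = MemoryBank(
    service="billing",
    decisions=["Use structured logging"],
    conventions=["One module per use case"],
)
print(bank.to_context())
```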

We rolled it out with:

Kickoff meeting for all of R&D (plus other key stakeholders) introducing Cursor, CursorRIPER, memory banks, and our working “rules.”
Good/bad prompt examples and a shared “chat history” app as a quick-reference library.
Internal Slack AI Hub for sharing successes, pitfalls, and industry updates.
Weekly AI-development syncs to share new practices, troubleshoot, and surface priorities for improvement.

This structure gave everyone a shared vocabulary, clear examples of effective use, and an ongoing feedback loop — making the transition faster and more consistent.

What This Means for Engineer Roles

We didn’t replace engineers — we reframed how they spend their time. Engineers now act more like feature development managers:

Define intent and constraints.
Approve a plan, not just a file.
Let the agent do repetitive labor and verification.
Review diffs that are already linted, tested, and justified.

The work got more interesting. The output got more consistent.

Closing Thought

Copilot showed us what AI could do. Agentic development showed us what AI should do: plan, fetch, verify, and iterate — so engineers can steer outcomes instead of shoveling glue code. We’re staying tool-agnostic on purpose, but the direction is clear: less toil, more judgment.

Next in This Series

We’ve already implemented much of what follows; upcoming posts will dive into how we did it, with examples and gotchas:

Structured agents with CursorRIPER: the plan → act → check loop on steroids, with better templates for working with the agent and guardrails that make every session reliable.
Memory banks for deep context: versioned files per service capturing history, Architecture Decision Records, invariants, and anti-patterns — so agents understand why, not just what.
Building an AI-first engineering culture: lightweight rituals, show-and-tell demos, shared prompt libraries, and “agent etiquette” so teams level up together.
Safety & privacy by default: practical patterns for PII redaction, least-privilege tool access, and auditable traces — keeping agents helpful and compliant.
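
As a small preview of the PII redaction item above, here is a minimal Python sketch that masks email addresses before text is passed to an agent. It is a sketch under stated assumptions, not our production pattern: the redact_emails function, the placeholder string, and the regular expression are illustrative, and the pattern is deliberately simple, so it will not catch every valid email format or any other kind of PII.

```python
import re
from typing import Tuple

# Deliberately simple email pattern (an assumption for this sketch).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[REDACTED_EMAIL]") -> Tuple[str, int]:
    """Replace email addresses with a placeholder; return the new text and the match count."""
    return EMAIL_RE.subn(placeholder, text)


redacted, count = redact_emails("Contact jane.doe@example.com or bob@test.org.")
print(redacted)
print(f"{count} address(es) redacted")
```

Returning the count alongside the redacted text gives a simple hook for an auditable trace: a log entry can record how many values were masked without recording the values themselves.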