6 Months of AI Coding Tools: The Mistakes I Made So You Don't Have To

I've been using AI coding tools almost every day since the start of this year. Claude Code, Cursor, Copilot, Gemini CLI — if it exists, I've probably tried it. Six months in, my productivity has genuinely improved. But I've also made enough mistakes to fill an article.

Here's the thing nobody tells you upfront: these tools are powerful, but they'll bite you if you don't know what you're doing. I've shipped bugs that AI confidently wrote, burned through token quotas in days, and wasted hours debugging code that "looked right."

This is the article I wish I'd read before I started.

The honest truth: AI coding tools aren't magic

When I first started using these tools, my mental model was basically: "This thing writes code for me. I can relax now." That lasted about a month.

The reality is more nuanced. AI coding tools are like having a very fast, very confident junior developer on your team. They'll churn out code at impressive speed, but they don't understand your architecture, they don't think about edge cases, and they definitely won't take the blame when something breaks in production.

Once you internalize that, everything else falls into place.

Mistake #1: Trusting AI-generated code without review

This is the most expensive mistake, and probably the most common one.

I had Claude Code build a file upload feature for me. It produced a solid-looking implementation: error handling, progress indicators, type validation, the works. I skimmed it, thought "looks good," and committed it.

Two days later, the server started running out of memory. Turns out the upload handler was only safe for small files. With no streaming, no chunking, and no size limit, each upload sat in memory in full. It worked perfectly in development because I was only testing with 2MB files. In production, someone tried to upload a 500MB video.
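
To make that failure concrete, here is a minimal Python sketch of a streamed upload. It is not the code Claude Code wrote for me: the function name, the chunk size, and the size cap are all assumptions for illustration, and a real handler would depend on your framework.

```python
import os
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024          # assumed chunk size: 64 KiB per read
MAX_UPLOAD_BYTES = 1024 ** 3    # assumed upload cap: 1 GiB


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured size limit."""


def save_upload_streaming(source: BinaryIO, dest_path: str,
                          max_bytes: int = MAX_UPLOAD_BYTES,
                          chunk_size: int = CHUNK_SIZE) -> int:
    """Copy an upload to disk chunk by chunk and return the bytes written.

    Memory use stays near chunk_size regardless of file size, unlike
    source.read(), which loads the whole upload into memory at once.
    """
    written = 0
    try:
        with open(dest_path, "wb") as dest:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
                dest.write(chunk)
    except UploadTooLarge:
        os.remove(dest_path)  # don't leave a partial file behind
        raise
    return written
```

The point is less the exact code than the question to ask in review: what happens to memory when the input is 250 times bigger than anything I tested with?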

The specific things AI tends to get wrong:

Error handling: It'll wrap everything in try-catch, but the catch block just logs to console. That's not error handling, that's error ignoring.
Edge cases: Empty arrays, null values, extremely long strings, concurrent requests — AI rarely considers these proactively.
Performance: The code runs, but it might do N+1 queries, unnecessary re-renders, or leak memory.
Security: SQL injection, XSS, CSRF. AI usually won't guard against these unless you explicitly ask.
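
The error handling item is the easiest one to spot once you know the shape. Here is a small Python sketch of the two patterns side by side. The function names and the ConfigError class are hypothetical, made up for this example.

```python
import json
import logging

logger = logging.getLogger(__name__)


class ConfigError(Exception):
    """Raised when the settings text cannot be used."""


def load_settings_swallowing(raw: str) -> dict:
    """The pattern to watch for: the error is logged, then ignored."""
    try:
        return json.loads(raw)
    except Exception as exc:
        logger.error("failed to parse settings: %s", exc)
        return {}  # caller cannot tell "empty settings" from "broken settings"


def load_settings_handling(raw: str) -> dict:
    """Catch only the expected error and surface it to the caller."""
    try:
        settings = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ConfigError(f"settings are not valid JSON: {exc}") from exc
    if not isinstance(settings, dict):
        raise ConfigError("settings must be a JSON object")
    return settings
```

The first version never fails, which is exactly the problem: the app keeps running on an empty config and the real error surfaces somewhere far away.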

My rule now: after AI generates code, I ask it "what edge cases did you miss?" It usually finds a few. Not always accurate, but better than shipping blind.

Mistake #2: Poor context management

Here's something weird I noticed: sometimes AI tools feel incredibly smart, understanding my intent immediately. Other times, they seem completely clueless no matter how clearly I explain things.

The difference is context.

AI coding tools are only as good as the context they can see. If your project structure is messy, or if you're not specific enough in your prompts, the tool is basically guessing. When it guesses right, you think it's brilliant. When it guesses wrong, you think it's stupid. Neither assessment is fair.

Common context problems I've run into:

Deep directory structures confuse AI. I had a project with paths like src/lib/utils/helpers/string/format.ts. Claude Code kept looking in the wrong places because it didn't understand my naming conventions. Adding a directory structure explanation to CLAUDE.md fixed this.

Long conversations cause amnesia. This is especially bad in Claude Code. After 20 rounds of conversation, it might forget constraints you mentioned in round 3. Use /compact regularly to compress the context, or just start a fresh session.

No configuration files. If you don't have a CLAUDE.md or .cursorrules in your project root, the AI has to guess your preferences every single time. I now put one in every project with: tech stack and versions, code style preferences, project structure, files to never touch, and testing requirements. One-time effort, massive long-term payoff.

Real example: I have a Next.js project. Without CLAUDE.md, Claude Code kept using CommonJS require syntax. I added "Project uses ES Modules — use import/export" to the config file. Problem solved permanently.

Mistake #3: Token consumption is faster than you think

This one hits your wallet directly.

I'm on the Claude Pro plan ($20/month), which includes Claude Code. First month, I ran out in under two weeks. Why? I was using it for everything: "help me understand this file," "what does this function do," "refactor this section." Every conversation burns tokens, and large files burn them fast.

Here's a rough idea of consumption: the Pro plan gives you maybe 100-200 medium-complexity conversations. Reading a 500-line file in a single conversation can cost 10,000+ tokens. So yeah, $20 goes faster than you'd expect.
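
The 10,000-token figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb and not how any real tokenizer works. Actual counts vary by model and by what's in the file.

```python
CHARS_PER_TOKEN = 4  # assumed rule of thumb; real tokenizers vary by model and content


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Very rough token estimate from character count, rounded up."""
    return -(-len(text) // chars_per_token)


# A hypothetical 500-line file: 79 characters plus a newline on every line.
sample_file = ("x" * 79 + "\n") * 500
print(estimate_tokens(sample_file))  # 40,000 characters / 4 = 10000
```

And that's just reading the file once. The tool's own replies and any follow-up reads come on top.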

Tips I've picked up for saving tokens:

Be specific before you ask. Don't say "help me look at this project" — the AI will read the entire thing. Instead, tell it exactly which files matter for your question.

Use /compact religiously. In Claude Code, this summarizes the earlier conversation and frees up context space. Use it every 10-15 exchanges in long sessions.

Match the tool to the task. Claude Code for complex work that needs whole-project understanding. Copilot for simple completions. Gemini CLI for quick questions (it has a generous free tier, though it still enforces request limits).

Watch your rate limits. Anthropic throttles heavy usage. I once ran Claude Code continuously for a few hours and got locked out. Now I take breaks every 30 minutes or so, which has kept me under the limit.

Mistake #4: Cursor's Agent mode isn't as reliable as it looks

Cursor's Agent mode (formerly Composer) is seductive: describe a task, and it automatically modifies multiple files, runs commands, even fixes its own bugs. Sounds amazing.

In practice, the failure rate is higher than you'd expect.

I asked it to help migrate an Express.js project to Next.js. It made changes to a bunch of files, and when it said "done," I believed it. The project didn't run. It had modified files it shouldn't have touched, scrambled some routing logic, and installed wrong dependency versions.

The worst part? It reported success confidently. I spent hours debugging before I realized the AI had caused the problems.

My rules for Agent mode:

Keep tasks small and specific. Don't say "refactor the whole project." Say "convert this file's API routes to Next.js Route Handlers." Smaller scope = fewer mistakes.

Always review the diff. Cursor tells you which files it changed. Look at every single one. Don't trust its summary.

Use the notepad feature. Cursor has a notepad for recording project conventions. Load it at the start of each session so the AI doesn't re-guess your preferences.

Disable auto-command execution. I strongly recommend this. Make Cursor ask permission before running npm install or git commit. With confirmation required, I once caught it about to run something dangerous; with auto-execute on, it would simply have run.

Also worth knowing: Cursor Agent sometimes "hallucinates" its accomplishments. It might say "I've added error handling to all 12 routes" when it only actually modified 8. After it finishes, run git diff --stat to verify the file count matches what it claimed.
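
If you'd rather not eyeball the numbers, the same check can be scripted. This is a rough Python sketch, with function names I made up. It uses git diff --name-only, which lists only tracked files with uncommitted changes, so brand-new untracked files won't show up.

```python
import subprocess


def changed_files(repo_dir: str = ".") -> set[str]:
    """Paths of tracked files with uncommitted changes, per `git diff --name-only`."""
    result = subprocess.run(
        ["git", "diff", "--name-only"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return {line for line in result.stdout.splitlines() if line}


def compare_claim(claimed: set[str], actual: set[str]) -> dict[str, set[str]]:
    """Split the difference between what the agent said and what Git sees."""
    return {
        "claimed_but_unchanged": claimed - actual,
        "changed_but_unclaimed": actual - claimed,
    }
```

Pass the file list from the agent's summary as claimed and the result of changed_files() as actual. Anything in changed_but_unclaimed deserves a close look, because those are edits the agent never mentioned.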

Mistake #5: Copilot's suggestions look right but aren't

GitHub Copilot is the tool I use most because it's built right into VS Code and costs almost nothing. But its suggestions have a specific failure mode: they look perfectly reasonable, the syntax is correct, but the logic is wrong.

Example: I was writing a React component that fetches data from an API and renders it. Copilot completed a chunk of code that looked fine. But it didn't handle loading or error states. In development, the API responds instantly, so everything looks great. In production, when the API is a bit slow, users see a blank page.

Another time, I was writing a database query. Copilot suggested a reasonable-looking SQL statement. I almost used it, then realized its WHERE clause didn't filter on any indexed column. On a large dataset, it would do a full table scan. That could have taken down the production database.
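
One way to catch this before it ships is to ask the database how it plans to run the query. The sketch below uses SQLite through Python's sqlite3 module purely because it runs anywhere. The table, column, and index names are invented, and other databases have their own EXPLAIN syntax and output.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list[str]:
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

sql = "SELECT id, total FROM orders WHERE customer_id = ?"
plan_before = query_plan(conn, sql, (42,))  # a SCAN step: every row is read

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))   # a SEARCH step that uses the index
```

The query text is identical in both cases, which is why reading the SQL alone doesn't reveal the problem.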

Copilot works by pattern matching. It sees your half-written code and completes it based on similar code it's seen. But "looks similar" and "is logically correct" are different things.

Where Copilot actually shines: boilerplate code, test cases, type definitions — things where correctness is straightforward and creativity isn't needed. For business logic, security, and performance-sensitive code, write it yourself.

One more thing about Copilot: its suggestions can "learn bad habits" from your codebase. If your project has poor practices (like using var everywhere), Copilot will pick up on that and suggest var too. It mimics your style, including the bad parts. If you inherit a messy codebase, be extra cautious with Copilot's suggestions.

Mistake #6: Using the wrong tool for the job

After six months, my biggest takeaway is this: every AI coding tool has a sweet spot. Use it outside that sweet spot, and your efficiency actually drops.

Claude Code excels at: large-scale refactors that need whole-project context, code reviews, debugging complex bugs (give it error logs + relevant code), and writing documentation.

Cursor excels at: daily coding in an IDE (most natural workflow), multi-file edits (Agent mode, despite its flaws), frontend development (preview feature is great), and quick prototypes.

Copilot excels at: inline code completion (its strongest feature), test case generation, boilerplate code, and learning new frameworks by reading its suggestions.

Windsurf excels at: cross-session context retention (Cascade memory), code research and exploration, and budget-friendly IDE integration (more free credits than Cursor).

Gemini CLI excels at: low-cost usage (a generous free tier, though with request limits), quick questions without launching an IDE, code explanation, and rapid prototyping.

Using Claude Code just for code completion is overkill. Using Copilot to understand an entire project will disappoint you. Match the tool to the task.

Mistake #7: The hidden cost of switching tools

Most developers I know (including me) use multiple AI coding tools. But switching between them has a cost that's easy to overlook.

Each tool has its own communication style. Claude Code works best with detailed context and explicit instructions. Cursor's Agent mode prefers high-level task descriptions. Copilot just needs well-written code comments.

If you use the same communication style with all of them, results suffer. I used to talk to Copilot Chat the same way I talk to Claude Code: tons of context, very detailed. Copilot Chat handled it poorly, which I put down to it working with a smaller context window.

Also keep configurations consistent. Your .cursorrules (Cursor) and CLAUDE.md (Claude Code) should express the same conventions. Otherwise you'll end up with code in different styles depending on which tool wrote it.

I now maintain a unified config template that I copy into every new project and tweak as needed. Consistent output regardless of which tool I'm using.
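
A template like this can be as simple as one source of truth written out to both files. Here is a minimal Python sketch of the idea. The conventions listed are placeholders, and writing identical text to both files is my simplification: check what format each tool currently expects before relying on it.

```python
from pathlib import Path

# Hypothetical conventions; replace with your project's own.
CONVENTIONS = {
    "Tech stack": ["Next.js", "TypeScript"],
    "Code style": ["Project uses ES Modules: use import/export"],
    "Files to never touch": ["(list generated or vendored paths here)"],
    "Testing": ["Run the test suite before committing"],
}


def render_conventions(conventions: dict[str, list[str]]) -> str:
    """Render the shared conventions as plain Markdown-style text."""
    lines: list[str] = []
    for section, rules in conventions.items():
        lines.append(f"## {section}")
        lines.extend(f"- {rule}" for rule in rules)
        lines.append("")
    return "\n".join(lines)


def write_configs(project_dir: str, conventions: dict[str, list[str]]) -> list[Path]:
    """Write identical content to both tools' config files."""
    text = render_conventions(conventions)
    paths = [Path(project_dir) / name for name in ("CLAUDE.md", ".cursorrules")]
    for path in paths:
        path.write_text(text, encoding="utf-8")
    return paths
```

Because both files come from the same dictionary, they can't drift apart the way two hand-edited files do.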

Mistake #8: Security is an afterthought (it shouldn't be)

This one doesn't get enough attention.

AI coding tools need to read your code to work. That means your code is being sent to the cloud (unless you're using a local model). For personal projects, maybe that's fine. For company code, it's a serious concern involving IP and compliance.

A friend of mine works at a finance company. Their policy is clear: no cloud-based AI coding tools. The code contains trading algorithms and client data that can't leave their infrastructure. He's stuck with local models, which are noticeably worse.

Even for personal projects, watch out for:

Don't let AI see your secrets. Some tools store conversation history in the cloud. If your code has API keys or passwords visible, those might get saved somewhere you don't control.

Check AI-generated code for vulnerabilities. AI might write code with SQL injection, XSS, or other security holes. It won't proactively audit for security.

Verify AI-suggested dependencies. Sometimes AI recommends installing packages that don't exist or are abandoned. I've seen Claude Code suggest a Python package that had barely any downloads and hadn't been updated in two years. The functionality was already in the standard library.

Watch for hardcoded values. AI sometimes embeds example values like API endpoints or database connection strings. If you commit without checking, you might leak infrastructure details.
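
For the vulnerability check, SQL injection is the one I look for first, because the unsafe and safe versions look so similar. Here is a small self-contained Python demonstration using sqlite3. The table and function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user_unsafe(name: str) -> list[tuple]:
    """Vulnerable: user input is pasted into the SQL text."""
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list[tuple]:
    """Parameterized: the value is passed separately from the SQL text."""
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


malicious = "nobody' OR '1'='1"
leaked = find_user_unsafe(malicious)   # the injected OR matches every row
blocked = find_user_safe(malicious)    # treated as a literal name: no rows
```

If you see string formatting or concatenation anywhere near a query in AI-generated code, stop and rewrite it with placeholders.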

Mistake #9: Your skills are quietly deteriorating

This is the most insidious mistake because you don't notice it happening.

After a few months of AI-assisted coding, I realized something uncomfortable: my raw coding ability had declined. I still understood programming concepts, but my "muscle memory" was gone. Writing a function from scratch? My first instinct was to ask AI. Solving a simple algorithm? I'd let Copilot suggest it.

This creates a dangerous dependency. You become a code modifier instead of a code creator. You stop thinking through problems from first principles and instead always start from AI's suggestions.

There was an interesting Reddit thread where someone asked "Did stopping Copilot affect your productivity?" Multiple people said they barely noticed a difference. That suggests a lot of the "efficiency gains" from AI tools might be illusory — you write code faster, but spend more time reviewing and debugging, so the net effect is roughly neutral.

What I'm doing about it:

Simple stuff, write it yourself. Not every line of code needs AI. Simple logic, common patterns — writing these yourself keeps your skills sharp.

Think first, then ask AI. Don't immediately ask AI when you hit a problem. Spend a few minutes thinking through it yourself, form a rough approach, then let AI help refine it.

Regularly code without AI. I now have one day a week where I don't use any AI coding tools. Pure manual coding. Slower, but it keeps my instincts alive.

Think of AI as a very capable intern, not a senior engineer. An intern can do work for you, but you need to check their output. You wouldn't let an intern own a critical system module, right?

Mistake #10: Skipping version control discipline

This one sounds obvious, but it really hurt when I learned it the hard way.

When AI tools modify your code, they often change many files at once. If you're not using Git carefully (committing frequently, working on branches), rolling back becomes a nightmare.

My current workflow:

Before any big AI-driven change, commit what you have.
After AI finishes, review the diff, confirm it's good, commit again.
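If something's broken, roll back to the last good commit (git checkout for individual files, or git reset --hard for the whole working tree, which discards all uncommitted changes).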

This has saved me multiple times. Once, Cursor Agent mode scrambled my routing configuration. Without that pre-change commit, I'd have spent hours manually reconstructing it.

I also recommend using feature branches for AI-generated changes. Never let AI modify your main branch directly. Create a branch, test everything, then merge.

Mistake #11: Bad prompts

The last mistake is one of the most common, and the easiest to fix.

Most people write prompts that are way too vague. "Help me optimize this code" — optimize for what? Performance? Readability? Security? The AI doesn't know what you're thinking.

My prompt-writing principles:

Be specific about what you want. Not "write a function" but "write a Python function that takes a list of URLs, downloads all files concurrently, saves them to /tmp/downloads, retries each download up to 3 times."

Provide context. Tell the AI what framework you're using, what version, where this feature fits in the project. It can't read your mind.

State constraints upfront. If you have specific requirements (no any types, use async/await not .then()), say so. Otherwise the AI will do what it thinks is "best," which might not match your needs.

Break it down. Complex tasks shouldn't be one giant prompt. Step by step: first understand the project, then do part one, review, then part two. Fewer mistakes, and you maintain control at each stage.
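
To show why the specific version of the download prompt pays off, here is roughly what it pins down, as a Python sketch. Every clause of the prompt maps to something you can check in the code. The names are mine, reading "retries up to 3 times" as three attempts in total is an assumption, and the prompt still leaves gaps (two URLs with the same filename will overwrite each other), which is exactly what the constraints principle is for.

```python
import shutil
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable
from urllib.parse import urlparse

MAX_ATTEMPTS = 3  # assumed reading of "retries up to 3 times": 3 attempts in total

Fetcher = Callable[[str, Path], None]


def fetch_to_file(url: str, target: Path) -> None:
    """Default fetcher: stream the response body to disk with urllib."""
    with urllib.request.urlopen(url, timeout=30) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)


def download_one(url: str, dest_dir: Path, fetch: Fetcher = fetch_to_file,
                 attempts: int = MAX_ATTEMPTS) -> Path:
    """Download one URL into dest_dir, retrying on network or file errors."""
    name = Path(urlparse(url).path).name or "index.html"
    target = dest_dir / name
    for attempt in range(1, attempts + 1):
        try:
            fetch(url, target)
            return target
        except OSError:  # urllib's URLError is a subclass of OSError
            if attempt == attempts:
                raise
            time.sleep(0.1 * attempt)  # short, assumed pause between attempts
    raise ValueError("attempts must be at least 1")


def download_all(urls: list[str], dest_dir: str = "/tmp/downloads",
                 fetch: Fetcher = fetch_to_file) -> list[Path]:
    """Download all URLs concurrently; raises if any URL still fails."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(lambda u: download_one(u, dest, fetch), urls))
```

With the vague prompt, none of this is checkable: you have no way to say whether the output is right, because you never said what right means.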

My daily workflow

After all these mistakes, here's the workflow that works for me:

Starting a new feature:

Think through the approach myself first, sketch the architecture
Use Claude Code to understand relevant code ("help me understand src/api/users.ts")
Confirm the approach is viable, then switch to Cursor for implementation

While coding:

Copilot handles inline completion (automatic, no special effort)
Cursor Chat for complex logic questions
Cursor Agent mode for multi-file changes (small tasks only)

After coding:

Claude Code for code review ("review this PR, focus on security and performance")
AI-generated test cases as a starting point
Run tests, confirm everything passes, then commit

When debugging:

Read the error logs myself first, identify the likely location
Give Claude Code the error message plus relevant code
If its first suggestion doesn't work, rephrase and try again

It's not perfect, but it's the most efficient workflow I've found. Your mileage will vary depending on your projects and habits.

I'm planning to experiment with AI-assisted code review and automated testing next. If anything interesting comes out of that, I'll write about it. Drop a comment if you have questions.