AI Coding Agents Are Spamming Open Source: 90%+ PR Rejection Rates on OpenClaw

I was scrolling through Hacker News yesterday when a title stopped me cold: "PR spam today looks like email spam in the early 2000s." It was a statistical analysis from Greptile, looking at data from OpenClaw — one of the fastest-growing repos on GitHub.

I sat with it for a while.

Not because the numbers were shocking (though they were), but because this affects everyone using AI coding tools. When you write code with Claude Code, autocomplete with Cursor, or generate functions with Copilot — that code might end up as a PR on an open source project. And when everyone's using the same tools, the same models, and the same prompts to contribute code, what happens to the open source ecosystem?

OpenClaw gave us a real answer.

From 2 PRs per week to 3,400

Some background. OpenClaw is an open source AI Agent project that grew insanely fast — it became one of the hottest repos on GitHub within months. Greptile does PR review for OpenClaw, so they have access to the full PR data.

Here's what the numbers look like:

Last December, OpenClaw received about 2 PRs per week
By February, that number exploded to 3,400 per week
Before the spike, roughly 48% of PRs got merged
After the spike, the merge rate dropped below 9.3%

From 2 to 3,400. Merge rate from 48% to 9%. Let that sink in.

The most absurd stat: one contributor submitted 106 PRs in a single day. 106. I did the math — even if each PR took just 5 minutes to write, that's nearly 9 hours of nonstop submitting. But the median time between submissions was 3 seconds.

3 seconds.

That's not a person writing code. That's an AI Agent on autopilot. Most likely someone set up a script, probably driving Claude or Codex, to continuously generate code, create branches, and submit PRs in a loop. All day. The quality? You can guess.
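The arithmetic and the cadence check above can be sketched in a few lines of Python. This is only an illustration, not Greptile's method: the timestamps are made up, and the `looks_automated` name and its 10-second threshold are assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from statistics import median

# 106 PRs at a (generous) 5 minutes each, expressed in hours.
hours_if_manual = 106 * 5 / 60  # about 8.8 hours


def median_gap_seconds(timestamps):
    """Median time in seconds between consecutive submissions."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def looks_automated(timestamps, threshold_seconds=10.0):
    """Flag a submission history whose median gap is under the threshold.

    The threshold is an assumption for illustration, not a published cutoff.
    """
    gap = median_gap_seconds(timestamps)
    return gap is not None and gap < threshold_seconds


# Hypothetical data: one PR every 3 seconds versus one PR every 2 hours.
start = datetime(2025, 1, 1, 9, 0, 0)
burst = [start + timedelta(seconds=3 * i) for i in range(106)]
steady = [start + timedelta(hours=2 * i) for i in range(4)]
```

The median is used rather than the mean so that one long pause (say, overnight) doesn't hide an otherwise machine-gun cadence.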

Why PR spam happens

To understand this, you need to grasp one thing: the cost of submitting a PR with an AI coding agent is now essentially zero.

It used to be that contributing to open source required effort. You'd read the codebase, understand the architecture, find something to improve, write the code, test it, then submit a PR. The whole process took hours, minimum.

Now? You tell Claude Code "find things to improve in this repo," and it analyzes the code, identifies issues, generates fixes, writes the code, and creates a PR. The whole thing might take 5 minutes.

When cost drops to zero, volume explodes. This is what happened with email spam in the early 2000s.

Greptile's Rahul made a comparison that really landed: the ILOVEYOU worm infected 45 million computers in 24 hours because sending email cost nothing and people trusted the platform. PRs are in the same situation now — submission cost is near zero, and maintainers default to trusting every PR.

I'd extend that analogy further. Email spam eventually gave us Gmail's spam filter, sender authentication standards (SPF/DKIM/DMARC), and sender reputation systems. PR spam will give rise to similar infrastructure; we're just in the early stages.

But here's the thing: the motivation behind PR spam is more varied than you'd think. Some people genuinely want to contribute but go about it the wrong way: they have an Agent scan the entire repo, find every "improvement," and batch-submit PRs. Some want to pad their GitHub contribution graph (those green squares). Some want to bulk up their resume with open source contributions without actually investing the time to understand the project.

Whatever the motivation, the result is the same: maintainers are drowning.

I saw someone in a Discord sharing a tutorial on "how to quickly contribute to open source with AI." The steps were basically: fork the repo -> use Claude Code to scan for lint warnings -> auto-fix everything -> batch submit PRs. Sounds efficient, right? But lint warnings are warnings, not errors, for a reason — some are intentional design choices. The Agent doesn't know that. It just mechanically fixes everything.

What makes it even worse is that this creates a new kind of "arms race" on GitHub. When one person discovers they can use AI to rapidly accumulate PR count, others follow suit — otherwise their contributions get buried in the noise. It's a vicious cycle: more PRs, lower quality, more exhausted maintainers, harder-to-maintain projects.

Everyone's using the same AI, so all PRs look the same

This finding hit me the hardest.

Eric Raymond's "Linus's Law," named after Linus Torvalds, famously says: "Given enough eyeballs, all bugs are shallow." That's the core strength of open source: different people think differently, catch different bugs, and approach problems from different angles.

But when everyone's using Claude, Codex, Cursor, and the like, that advantage erodes. Greptile found some jaw-dropping examples in the OpenClaw data:

4 different people submitted PRs with the exact same title: "feat(web-search): add SearXNG as a search provider." Over 10 people independently tried to add the same feature.
6 people independently fixed the same Brave Search locale bug. Two of them submitted PRs with identical titles 94 minutes apart.
5 people independently found the same timeout deadlock in the agent runner.

Picture this: 10+ people each open Claude Code, tell it "add SearXNG search support," and Claude independently produces nearly identical implementations.

That's not "enough eyeballs." That's "the same brain copied 10 times."
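A maintainer could surface the crudest form of this duplication by grouping open PRs by title. The sketch below assumes PRs are available as simple (author, title) pairs; the function names and the author names are hypothetical, and only the SearXNG title comes from the data above.

```python
from collections import defaultdict


def normalize_title(title):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def find_duplicate_titles(prs, min_authors=2):
    """Return {normalized title: sorted authors} for titles used by several authors.

    `prs` is an iterable of (author, title) pairs.
    """
    authors_by_title = defaultdict(set)
    for author, title in prs:
        authors_by_title[normalize_title(title)].add(author)
    return {
        title: sorted(authors)
        for title, authors in authors_by_title.items()
        if len(authors) >= min_authors
    }


# Hypothetical PR list with placeholder authors.
prs = [
    ("alice", "feat(web-search): add SearXNG as a search provider"),
    ("bob", "feat(web-search): add SearXNG as a search provider"),
    ("carol", "feat(web-search):  Add SearXNG as a search provider"),
    ("dave", "docs: fix typo in README"),
]
duplicates = find_duplicate_titles(prs)
```

Title matching only catches the easy cases. Two PRs with different titles and near-identical diffs would need a similarity check on the code itself.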

There's actually a term for this: "model monoculture." Agriculture has a concept called "monocrop risk" — if all farmers in a region grow the same rice variety, a single disease can wipe out the entire harvest. Software development faces the same risk: if everyone uses the same model's "thinking style," the model's blind spots become the entire codebase's blind spots.

Here's a concrete example, offered as a rough generalization rather than a measured result. Claude models tend to favor mutex-based concurrency (lots of locks, careful synchronization). GPT-4 tends toward more minimalist patterns. Gemini tends to use the latest language features. If a project's contributors all use Claude, the concurrency model converges on a single approach. Some scenarios might be better served by channels or actors, but the Agent won't explore those alternatives on its own; it'll use what's most "familiar" from its training.

I've experienced this myself. I once used Claude Code to refactor error handling in a module. It wrapped all errors in custom types. I thought it looked reasonable and made the changes. Later, I reviewed someone else's PR on the same module — they'd used Claude Code too, and their solution was almost identical. Same pattern, similar variable names.

One open source maintainer put it perfectly on Twitter: "I used to worry about not having enough contributors. Now I worry they're all the same."

What kind of PRs actually get merged

The most actionable data point: refactoring PRs merge at nearly 4x the rate of feature PRs.

Feature PRs: ~9% merge rate
Refactoring PRs: ~35% merge rate

Why? Because refactoring requires genuine understanding of the existing codebase — its architecture, dependencies, and design rationale. You need to know why code was written a certain way before you can safely change it. And that's exactly what AI Agents struggle with most — they can write new code, but they don't really understand the "why" behind existing code.

Greptile gave a great example: claude-mem maps Claude Code's hook-captured tool stream into a resumable Agent SDK observer session. That architectural decision requires deep understanding of both systems. A developer who truly grasps this can distill it into precise prompts that dramatically improve Agent output. But telling an Agent "build a memory system" won't get you there.

This reminds me of an analogy: 200 years ago, the people who designed buildings also constructed them — they were called "master builders." As construction advanced, the role split into architects and construction workers. Software might be going through something similar — the real value isn't in "writing code" anymore, but in "understanding systems and making the right decisions."

The data backs this up in a way that should make every AI coding tool user think. The contributions that survive review are increasingly the ones an Agent can't do alone — the calls that require deep understanding of an existing system, not novel construction.

Mitchell Hashimoto's Vouch system

The open source community is starting to fight back.

Mitchell Hashimoto (HashiCorp co-founder, now building the Ghostty terminal emulator) got fed up with AI-generated PR spam. He first restricted AI-generated contributions, then built a trust management system called Vouch.

The logic is simple: users who haven't been vouched for can't contribute, and bad actors get explicitly flagged. It's essentially a "sender reputation score" for open source.

Mitchell's vision is for trust decisions to propagate across projects — if you're flagged as a spammer in one project, others can see that too. Same idea as email blacklists.
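To make the gating logic concrete, here is a minimal sketch of a vouch-style check. It does not use Vouch's actual file format or interface, which aren't described here: the `TrustList` class, its methods, and the three decision strings are all hypothetical, and letting a flag override a vouch is my own assumption about a sensible policy.

```python
from dataclasses import dataclass, field


@dataclass
class TrustList:
    """Hypothetical per-project trust list: vouched users and flagged users."""

    vouched: set = field(default_factory=set)
    flagged: set = field(default_factory=set)

    def vouch(self, user):
        self.vouched.add(user)
        self.flagged.discard(user)

    def flag(self, user):
        self.flagged.add(user)
        self.vouched.discard(user)

    def decision(self, user, shared_lists=()):
        """Return 'deny', 'allow', or 'needs-vouch' for a would-be contributor.

        A flag in this project or in any shared list wins over a vouch
        (an assumed policy, similar to how email blocklists are consulted).
        """
        all_lists = (self, *shared_lists)
        if any(user in t.flagged for t in all_lists):
            return "deny"
        if any(user in t.vouched for t in all_lists):
            return "allow"
        return "needs-vouch"


# Hypothetical usage with placeholder usernames.
project_a = TrustList()
project_a.vouch("alice")
project_a.flag("spam-bot")

project_b = TrustList()
project_b.vouch("bob")
```

Passing `shared_lists=[project_a]` to `project_b.decision(...)` is the cross-project propagation idea: project B denies "spam-bot" even though it has never seen that account itself.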

Interestingly, Vouch worked well for Ghostty, but Mitchell eventually decided to move Ghostty off GitHub entirely. The reasons are complex, but AI PR spam was definitely a factor.

I think Vouch is on the right track, but it has limitations. It filters at the "person" level — judging whether a contributor is trustworthy. But the problem isn't just untrustworthy people; it's also uncontrollable code quality. A well-reputed developer who blindly submits Agent output can still produce low-quality PRs.

So we probably also need filtering at the "code" level — using AI to review AI-generated PRs and identify pattern-based, shallow changes. Greptile itself is doing this.

I also looked into other tools trying to solve this. CodeRabbit and PR-Agent can automatically analyze PR quality and detect pattern-based changes. But honestly, they're still pretty early-stage — using AI to detect AI-generated code is inherently tricky, like fighting magic with magic.

The most effective solution might need to come from GitHub itself. As the platform, they have the most complete data — who submits how many PRs, what their merge rates are, what their contribution patterns look like. A GitHub-native reputation system would be more effective than any third-party tool. But GitHub hasn't moved on this yet — they might still be watching how this trend develops.
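The raw ingredients of such a reputation signal are simple to compute once you have the data. Here is a rough sketch, assuming PR history is available as (author, merged) pairs; the records and the `contributor_stats` name are invented for illustration, and a real system would need far more than a merge rate.

```python
from collections import Counter


def contributor_stats(prs):
    """Return {author: (submitted, merged, merge_rate)} from (author, merged) pairs."""
    submitted = Counter()
    merged = Counter()
    for author, was_merged in prs:
        submitted[author] += 1
        if was_merged:
            merged[author] += 1
    return {
        author: (count, merged[author], merged[author] / count)
        for author, count in submitted.items()
    }


# Hypothetical records: (author, whether the PR was merged).
prs = [
    ("alice", True), ("alice", True), ("alice", False), ("alice", True),
    ("bulk-bot", False), ("bulk-bot", False), ("bulk-bot", False),
    ("bulk-bot", False), ("bulk-bot", True),
]
stats = contributor_stats(prs)
```

Volume and merge rate together say more than either alone: a contributor with 4 PRs and 3 merges looks very different from one with hundreds of PRs and a handful of merges.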

This isn't just an OpenClaw problem

You might think this is specific to OpenClaw — it's one of the hottest repos, of course it attracts spam. But in reality, nearly every popular open source project is experiencing something similar.

Any GitHub repo with 10,000+ stars has almost certainly received AI-generated PRs. The difference is just in degree: the hotter the project, the more it gets.

I've talked to several open source maintainers, and their experience is consistent: over the past six months, PR volume has noticeably increased, but quality has noticeably decreased. Many PRs share telltale signs:

Highly consistent code style (because it's all from the same model)
Very polished commit messages (AI is good at this)
But the logic doesn't hold up under scrutiny
Missing understanding of the project's history and design decisions

One friend said something that stuck with me: "I used to look at code quality first when reviewing PRs. Now I look at whether it was written by AI first."

If this continues, the damage to the open source ecosystem is real. Maintainers are already short on time. Now they need to spend even more energy filtering low-quality PRs. Some might just give up maintaining — and that hurts everyone.

What this means for us regular developers

Enough macro analysis. What does this mean for you personally?

First, if you use AI coding tools to contribute to open source, please review carefully. Don't submit Agent output as a PR directly. I know it's tempting — Agent writes the code, auto-creates the branch, even writes the PR description. You just click "Create PR." But that PR will probably get rejected, and it'll lower your reputation in the project.

Second, the biggest value of AI coding tools isn't "writing new features." It's "understanding existing code." The data points that way: refactoring contributions succeed at nearly 4x the rate of feature contributions. Use Agent to help you understand complex codebases, find tech debt, and plan refactoring paths. That's more valuable than having it write new features.

Third, the bar for open source contribution is going up, not down. Sounds counterintuitive — shouldn't AI tools lower the barrier? Yes, they lower the barrier to "submitting a PR." But they raise the bar for "submitting a valuable PR." When maintainers face 3,400 PRs, they'll only focus on the truly thoughtful ones. Generic feature additions drown in the noise.

Fourth, prompt engineering matters for open source contributions. The same Claude Code, prompted by someone who understands the codebase architecture versus someone who doesn't, produces vastly different output quality. This circles back to the core point — thinking matters more than typing.

How to use AI tools for quality open source PRs

Since the problem exists, what are best practices? Based on my experience and the data from this study:

Spend time understanding the codebase before letting Agent loose. Don't start by asking Agent to "find things to improve." Read the code yourself first — spend a few hours understanding the design philosophy, the architecture, the project's direction. Then bring your understanding to the Agent. The output quality will be dramatically better.

Have Agent help with refactoring, not new features. The data says it all: refactoring merges at nearly 4x the rate. Use Agent to clean up tech debt, improve code structure, add tests. These are more likely to be accepted than new features.

Don't batch-submit. 106 PRs in a day is obviously bot behavior. Even if your PRs are good, maintainers who see you submitting a flood of PRs will think "is this person farming contributions?" Control your pace — submit 1-2 high-quality PRs at a time.

Explain your thinking in the PR description. Don't just write "Added feature X." Explain why you chose this approach, what alternatives you considered, what the limitations are. This shows maintainers you actually understand the change, not just that Agent generated a solution.

Respect the project's direction. Some PRs are technically sound but misaligned with the project's roadmap. Agent doesn't consider this — it only looks at code. You need to judge whether the change fits the project's goals.

Open an issue first. The simplest but most overlooked step. Before submitting a PR, open an issue describing what you plan to do, why, and how. Maintainer feedback helps you gauge whether the change is wanted. If they say "we already have plans for this" or "that's not the direction we want," you save yourself the coding effort.

Write meaningful commit messages. Agent-generated commit messages are usually well-formatted ("feat: add SearXNG search provider") but also formulaic. If you can write commit messages that explain "why" not just "what," maintainers are more likely to invest time in review.

Test, test, test. Agent-generated code usually passes the tests it writes — because the tests and code come from the same "brain" with the same blind spots. Write a few edge case tests yourself, or verify the Agent's changes through different means.

Keep it small and focused. A 50-line focused PR is more likely to be accepted than a 500-line "refactor everything" PR. Agents tend toward large changes (because they don't fear merge conflicts), but maintainers prefer incremental progress.

My own experience with AI PRs

I've been burned by this too.

I once used Claude Code to fix a bug in a project. The Agent quickly produced a fix that looked reasonable. I almost submitted the PR directly, but took one more look at the diff — and noticed it had changed a default value that shouldn't have been changed. The value looked like a bug, but it was intentional design, documented in the docs.

If I'd submitted that, the maintainer would probably have rejected it and thought I was sloppy. A "looks right but is actually wrong" PR is more annoying than obvious spam — the maintainer spends time reviewing, only to discover it was a waste.

There was a funnier incident too. I had Claude Code add a new feature to a Python library. It enthusiastically wrote a bunch of code — tests, docs, even a changelog entry. Impressive, right? After I submitted it, the maintainer replied: "This feature already exists in v2.0. You're on the v1.x branch."

The Agent completely missed that I was working on the wrong branch. It only looks at the current directory's code — it doesn't proactively check if there's a newer version. That kind of context gap is something only a human can catch.

So now my rule is: after Agent writes code, I spend at least equal time understanding what it changed, why, and what side effects there might be. That review step is non-negotiable.

Common misconceptions

I want to address a few misconceptions I've seen floating around.

Misconception: "AI-generated PRs are all garbage." No, they're not. AI-generated code can be good — the key is how it's used. The data supports this: refactoring PRs merge at 35%. The problem isn't the tool; it's the usage.

Misconception: "Open source projects should ban AI-generated PRs." Too extreme. Some projects have restricted AI-generated contributions (like Ghostty), but an outright ban unfairly penalizes contributors who use AI tools thoughtfully. Better to raise review standards than to ban outright.

Misconception: "This is only a big project problem." Nope. Small projects might actually suffer more, since they typically have 1-2 maintainers with even less bandwidth. If a 500-star project receives 20 AI-generated PRs, reviewing them might take a maintainer all day, with only 2 getting merged.

Misconception: "AI coding tools will make open source contribution easier." They made "writing code" easier. But "contributing valuable code" is actually harder now. Because maintainer expectations are rising — when they know you have AI assistance, they expect higher quality. A simple typo fix that might've been accepted before might now get "why didn't you have AI do a full review while you were at it?"

Misconception: "This is temporary — it'll fix itself when models get better." I'm not so sure. Models will get stronger, but "stronger models" might mean more people using AI tools, which means more PRs, which means more noise. Technological progress doesn't automatically solve social problems — spam technology has been advancing for 25 years, and so has anti-spam technology. The arms race continues.

What's next

Greptile's article ended with a line I really liked: the open source community has solved harder problems before.

True. From the Linux kernel's code review process to GitHub's fork/pull model, open source has always evolved. PR spam is a new problem, but not an unsolvable one.

Possible directions:

Reputation systems: Cross-project trust networks like Vouch, making contributor history traceable
AI-assisted PR review: Using AI to review AI-generated PRs and identify low-quality contributions (Greptile is doing this)
Contributor tiering: New contributors' PRs automatically enter a stricter review pipeline
Prompt diversity: Projects add "contribution guides" that steer contributors toward different thinking approaches
Branch permission tightening: Limit new users' PR frequency, or require discussion in an issue before code submission
PR templates and checklists: Require fields like "what scenarios did you test?" and "what alternatives did you consider?" — questions Agent won't answer for you
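Two of these directions, contributor tiering and template checklists, are easy to prototype. The sketch below is one possible shape, not any project's real pipeline: the "## " heading convention, the 20-character minimum, the 5-merged-PR threshold, and the lane names are all assumptions made for the example.

```python
import re

# Section headings a hypothetical PR template might require.
REQUIRED_SECTIONS = (
    "What scenarios did you test?",
    "What alternatives did you consider?",
)


def missing_sections(description, required=REQUIRED_SECTIONS, min_chars=20):
    """Return required sections that are absent or answered too briefly.

    Assumes each section starts with a line of the form '## <heading>' and
    that its answer is everything up to the next '## ' line.
    """
    # With one capturing group, re.split yields
    # [preamble, heading1, body1, heading2, body2, ...].
    parts = re.split(r"^## (.+)$", description, flags=re.MULTILINE)
    answers = {
        heading.strip(): body.strip()
        for heading, body in zip(parts[1::2], parts[2::2])
    }
    return [s for s in required if len(answers.get(s, "")) < min_chars]


def review_tier(prior_merged_prs, description):
    """Pick a review lane; the threshold of 5 is an illustrative assumption."""
    if missing_sections(description):
        return "return-to-author"
    return "standard" if prior_merged_prs >= 5 else "strict"


# Hypothetical PR descriptions.
good = (
    "## What scenarios did you test?\n"
    "Ran the unit tests and a manual timeout case.\n"
    "## What alternatives did you consider?\n"
    "A retry wrapper, rejected because it hides the deadlock.\n"
)
blank = "## What scenarios did you test?\n\n## What alternatives did you consider?\nn/a\n"
```

A length check is a weak filter, since an Agent can pad any field. The point is to stop empty templates cheaply and leave the judgment calls to a human reviewer.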

But at the end of the day, tools are just tools. AI coding agents help us write code faster, but they can't think for us. The core values of open source — diversity, collaboration, commitment to code quality — won't change because of AI. Only the implementation needs to adapt.

Here's something interesting: Greptile's data shows that most merged PRs on OpenClaw came from users with 5+ prior contributions. Their merge rate was 18.6% — more than double the 8.2% for newcomers. This shows maintainers are already filtering by reputation. Your track record of quality code makes them more likely to trust you next time.

That might feel unfair to newcomers — they haven't had time to build reputation yet, and they're already facing 90% rejection rates. But that's the reality: in the AI era, "proving you're not a bot" is the first hurdle of open source contribution.

Next time you use Claude Code to write code and are about to submit a PR, ask yourself: if someone else submitted this PR, would I merge it?

If the answer is "I'm not sure" — take another look.

Data source: Greptile - A statistical study of PRs opened on openclaw/openclaw
Discussion: Hacker News