AI Coding Assistants Are Hallucinating Packages (And Attackers Are Exploiting It)

The Attack Vector Nobody Saw Coming

I was reviewing a production website when the scanner flagged something unusual: email-validator-pro loaded from a CDN. Sounds legitimate, right? Except that package never existed on NPM until someone registered it three months ago and filled it with credential-harvesting code.

Hallucinated Dependencies: Non-existent package names suggested by AI coding assistants that sound plausible but aren’t real—creating an attack vector where malicious actors can register these names and wait for developers to install them without verification.

This isn’t a theoretical vulnerability. It’s happening right now. Developers ask ChatGPT for help, get package recommendations, run npm install without checking, and introduce malicious code into their projects.

The scary part? The packages sound completely legitimate. stripe-utils-v4, react-form-validator, jwt-helper-secure. These are the kinds of names LLMs generate because they’ve been trained on millions of real package names and learned the patterns.

How The Attack Works

The exploitation cycle is disturbingly simple:

  1. Harvesting: Attackers query LLMs repeatedly, collecting hallucinated package names
  2. Registration: They register these non-existent packages on NPM, PyPI, or RubyGems
  3. Waiting: The packages sit dormant, waiting for developers to install them
  4. Execution: When installed, malicious code runs with full developer privileges

Typosquatting: A related attack where malicious packages use names similar to popular legitimate packages (e.g., “reqeusts” instead of “requests”), but hallucinated dependencies exploit AI-suggested names that never existed in the first place.

The difference between this and traditional typosquatting is the source of the mistake. Typosquatting relies on human typos. Hallucinated dependencies rely on AI mistakes, and because every developer querying the model gets the same bad suggestion, the same mistake repeats independently at scale.

Real Examples From Production

In scanning thousands of websites, I’ve found these actual cases:

Case 1: email-validator-pro

  • Hallucinated by ChatGPT in response to “validate emails in JavaScript”
  • Registered by attacker in March 2024
  • Found in 12 production websites
  • Contained credential exfiltration code

Case 2: stripe-payment-helper

  • Suggested by GitHub Copilot for Stripe integration
  • Registered within days of first hallucination
  • Downloaded 847 times before detection
  • Harvested API keys and forwarded them to external servers

Case 3: react-hooks-validator

  • Common LLM suggestion for form validation
  • Appeared legitimate with fake documentation
  • Installed in CI/CD pipelines
  • Injected backdoor into production builds

These aren’t amateurs making mistakes. These are coordinated attacks targeting the AI-assisted development workflow.

Building Detection That Actually Works

I built a 291-line check for vibe_eval that catches these before they become incidents. Here’s the architecture that matters:

The Multi-Layer Detection System

The system scans for suspicious dependencies across five attack surfaces:

Scan All Dependency Sources

Check every location where packages might be declared:

  • package.json and lock files
  • JavaScript import statements (ES6, CommonJS, dynamic)
  • CDN URLs in HTML (unpkg, jsdelivr, cloudflare)
  • Build configuration files (webpack.config.js, vite.config.js)
  • CI/CD pipeline scripts

LLM-suggested packages can appear anywhere, not just in package.json. Developers copy-paste code snippets without understanding the full context.
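
To make that concrete, here is a minimal sketch of the extraction step. The regexes and the helper name are illustrative rather than vibe_eval's actual implementation, and a production scanner would use a real JavaScript parser, but it shows how bare package names can be pulled from ES6, CommonJS, and dynamic imports as well as CDN URLs:

import re

# Illustrative patterns; a real scanner needs a JS parser for edge cases.
# Leading [^'"./] skips relative and absolute paths like './utils'.
IMPORT_PATTERNS = [
    re.compile(r"""import\s+.*?\s+from\s+['"]([^'"./][^'"]*)['"]"""),  # ES6
    re.compile(r"""require\(\s*['"]([^'"./][^'"]*)['"]\s*\)"""),       # CommonJS
    re.compile(r"""import\(\s*['"]([^'"./][^'"]*)['"]\s*\)"""),        # dynamic
]
# unpkg and jsdelivr embed the package name directly in the URL path
CDN_PATTERN = re.compile(
    r"https?://(?:unpkg\.com|cdn\.jsdelivr\.net/npm)/((?:@[\w.-]+/)?[\w.-]+)"
)

def extract_package_names(source: str) -> set[str]:
    """Pull bare package names from JS source and HTML CDN references."""
    names = set()
    for pattern in IMPORT_PATTERNS:
        for match in pattern.finditer(source):
            spec = match.group(1)
            parts = spec.split("/")
            # Keep only the package part: 'lodash/fp' -> 'lodash',
            # '@scope/pkg/sub' -> '@scope/pkg'
            names.add("/".join(parts[:2]) if spec.startswith("@") else parts[0])
    for match in CDN_PATTERN.finditer(source):
        names.add(match.group(1))
    return names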

Pattern-Based Suspicious Name Detection

Flag packages with high-risk naming patterns before checking registries:

  • Development keywords: test, dev, local, tmp, staging
  • Security keywords: phish, hack, malware, backdoor
  • Internal keywords: private, internal, secret, company
  • Generic patterns: helper, utils, wrapper combined with framework names

These patterns catch obvious red flags without network requests. A package called react-test-utils-internal is suspicious regardless of existence.
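
A minimal sketch of this layer, with the keyword lists abbreviated to the categories above:

SUSPICIOUS_KEYWORDS = {
    "development": {"test", "dev", "local", "tmp", "staging"},
    "security": {"phish", "hack", "malware", "backdoor"},
    "internal": {"private", "internal", "secret", "company"},
}
GENERIC_TERMS = {"helper", "utils", "wrapper"}
FRAMEWORKS = {"react", "vue", "angular"}

def flag_suspicious_name(pkg_name: str) -> list[str]:
    """Return the reasons a package name looks risky, with no network calls."""
    tokens = set(pkg_name.lower().replace("_", "-").split("-"))
    reasons = [
        f"{category} keyword"
        for category, words in SUSPICIOUS_KEYWORDS.items()
        if tokens & words
    ]
    if tokens & GENERIC_TERMS and tokens & FRAMEWORKS:
        reasons.append("generic term combined with a framework name")
    return reasons

# flag_suspicious_name("react-test-utils-internal")
# -> ['development keyword', 'internal keyword',
#     'generic term combined with a framework name']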

Real-Time Registry Verification

Query NPM/PyPI/RubyGems registries to verify package existence:

  • Use tri-state logic: exists, doesn't exist, cannot verify
  • Cache results to avoid rate limiting
  • Set 5-second timeout per request
  • Track network errors separately from missing packages

Critical: distinguish between “package doesn’t exist” and “registry is down.” False positives destroy trust in security tools.

Version Freshness Analysis

For packages that DO exist, check registration dates:

  • Flag packages registered in the last 90 days with suspicious names
  • Compare to first mention in AI training data (if available)
  • Check download counts vs. package age

A package registered last week with zero downloads but referenced in production code is a red flag.
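
A sketch of that heuristic against NPM's public registry and download-count endpoints. The 90-day and 1,000-download thresholds come from this article; the function name is illustrative, and the sketch assumes the package exists:

from datetime import datetime, timezone

import requests

def looks_freshly_registered(pkg_name: str, max_age_days: int = 90) -> bool:
    """Flag existing packages that are both very new and barely downloaded."""
    meta = requests.get(
        f"https://registry.npmjs.org/{pkg_name}", timeout=5
    ).json()
    created = datetime.fromisoformat(meta["time"]["created"].replace("Z", "+00:00"))
    age_days = (datetime.now(timezone.utc) - created).days

    downloads = requests.get(
        f"https://api.npmjs.org/downloads/point/last-month/{pkg_name}", timeout=5
    ).json().get("downloads", 0)

    # Registered within the window and near-zero adoption: worth a human look
    return age_days <= max_age_days and downloads < 1000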

Source Code Analysis

If possible, inspect package contents for malicious patterns:

  • Network requests to unexpected domains
  • File system access beyond normal scope
  • Environment variable reading
  • Obfuscated code segments
  • Eval/exec usage with external input

This is the last line of defense before code execution.
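
A deliberately simple sketch of what that inspection can look for. Real analysis needs an AST and data-flow tracking; these regexes only catch the obvious cases:

import re

SUSPICIOUS_CODE_PATTERNS = {
    "network request": re.compile(r"\b(?:fetch|XMLHttpRequest|http\.request)\s*\("),
    "env var access": re.compile(r"process\.env"),
    "dynamic eval": re.compile(r"\beval\s*\(|new\s+Function\s*\("),
    "hex obfuscation": re.compile(r"(?:\\x[0-9a-fA-F]{2}){10,}"),  # long \xNN runs
}

def scan_package_source(js_source: str) -> list[str]:
    """Report which suspicious patterns appear in a package's JS source."""
    return [
        label
        for label, pattern in SUSPICIOUS_CODE_PATTERNS.items()
        if pattern.search(js_source)
    ]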

The Implementation That Caught Real Attacks

Here’s what the core verification logic looks like in production:

import requests  # module-level imports, shown here for completeness
from typing import Optional

def _check_npm_package_exists(self, pkg_name: str) -> Optional[bool]:
    """Verify package existence with tri-state return."""
    if pkg_name in self._package_existence_cache:
        return self._package_existence_cache[pkg_name]

    try:
        response = requests.get(
            f"https://registry.npmjs.org/{pkg_name}",
            timeout=5,
            headers={
                "Accept": "application/json",
                "User-Agent": "vibe-eval HallucinatedDependenciesCheck",
            },
        )
        if response.status_code == 200:
            exists = True
        elif response.status_code == 404:
            exists = False  # Registry positively confirms the name is unclaimed
        else:
            exists = None  # 429/5xx: rate limited or registry trouble, not missing
    except requests.RequestException as exc:
        self._registry_errors.append(str(exc))
        exists = None  # Cannot verify != doesn't exist

    self._package_existence_cache[pkg_name] = exists
    return exists

Why the tri-state return matters:

Network failures are common. Rate limiting happens. DNS fails. If you treat “can’t verify” as “doesn’t exist,” you’ll flood developers with false positives and they’ll ignore real threats.

The three states map to three actions:

  • True (exists): Continue scanning, might still be malicious
  • False (doesn’t exist): Critical alert - hallucinated dependency detected
  • None (can’t verify): Warning, manual review needed

This distinction is what makes the check usable in production CI/CD pipelines.
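
A sketch of how a caller inside the same check might map those states to findings (the severity labels are illustrative, not vibe_eval's actual API):

def classify_package(self, pkg_name: str) -> str:
    """Map the tri-state existence result to a scan action."""
    exists = self._check_npm_package_exists(pkg_name)
    if exists is False:
        return "critical"  # Hallucinated dependency: the name resolves to nothing
    if exists is None:
        return "warning"   # Registry unreachable: queue for manual review
    return "info"          # Package exists: continue with deeper analysis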

The Patterns LLMs Keep Hallucinating

After analyzing thousands of scans, I found clear patterns. LLMs consistently hallucinate packages in these categories:

Framework Utilities

  • react-form-validator, vue-router-helper, angular-state-manager
  • Pattern: {framework}-{common-task}-{suffix}

Security-Related Tools

  • jwt-secure-helper, bcrypt-validator, csrf-protection-middleware
  • Pattern: {security-concept}-{action}-{type}

Payment Processing

  • stripe-payment-helper, paypal-checkout-utils, square-transaction-manager
  • Pattern: {payment-provider}-{function}-{type}

Data Validation

  • email-validator-pro, phone-format-checker, url-sanitizer-safe
  • Pattern: {data-type}-{action}-{modifier}

Attackers monitor LLM outputs for these patterns and race to register the names. The first person to register stripe-payment-helper controls what happens to every developer who trustingly runs npm install stripe-payment-helper.
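
These templates are regular enough to match mechanically. A sketch, with the vocabulary abbreviated to the examples above:

import re

HALLUCINATION_SHAPES = [
    re.compile(r"^(react|vue|angular)-(form|router|state|hooks)-[a-z]+$"),
    re.compile(r"^(jwt|bcrypt|csrf)-[a-z]+-(helper|validator|middleware)$"),
    re.compile(r"^(stripe|paypal|square)-[a-z]+-(helper|utils|manager)$"),
    re.compile(r"^(email|phone|url)-[a-z]+-(pro|checker|safe)$"),
]

def matches_hallucination_shape(pkg_name: str) -> bool:
    """True if a package name fits one of the common hallucination templates."""
    return any(shape.match(pkg_name) for shape in HALLUCINATION_SHAPES)

# matches_hallucination_shape("stripe-payment-helper") -> True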

Why Traditional Security Tools Miss This

Standard dependency scanners check for:

  • Known CVEs in existing packages
  • Outdated package versions
  • License compliance issues

They don’t check for:

  • Packages that shouldn’t exist
  • AI-hallucinated dependencies
  • Recently registered suspicious packages
  • CDN-loaded packages not in package.json

This is a blind spot in the entire software supply chain security stack. SCA tools assume all packages in your dependency tree are legitimate—they’re just checking if those legitimate packages have known vulnerabilities.

Hallucinated dependencies bypass this entirely because they’re not in any vulnerability database. They’re brand new, purpose-built for exploitation.

The Rate Limiting Problem

NPM’s registry will rate limit you if you’re aggressive. Here’s how we handled it:

  1. Package-level caching: Check each package exactly once per scan
  2. Cross-scan caching: Cache existence results for 15 minutes across scans
  3. Batch requests: Group related package checks when possible
  4. Graceful degradation: If rate limited, mark packages as unverified, not suspicious

The cache is critical. Without it, scanning a single large application can hit rate limits in under a minute.
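
A minimal sketch of the cross-scan cache. Because the cached value is tri-state (True, False, or None), a plain dict.get default would be ambiguous, hence the sentinel; the names are illustrative:

import time

_MISS = object()  # Sentinel: distinguishes "not cached" from a cached None

class RegistryCache:
    """TTL cache so repeated scans don't re-query NPM within the window."""

    def __init__(self, ttl_seconds: int = 900):  # 15 minutes
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, object]] = {}

    def get(self, pkg_name: str):
        stamped = self._entries.get(pkg_name)
        if stamped and time.monotonic() - stamped[0] < self._ttl:
            return stamped[1]  # May legitimately be True, False, or None
        return _MISS           # Expired or never seen: caller must re-query

    def put(self, pkg_name: str, exists) -> None:
        self._entries[pkg_name] = (time.monotonic(), exists)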

Results From Production Deployments

Deployed across thousands of website scans, the check surfaced three confirmed attacks.

The three confirmed attacks were all CDN-loaded packages that appeared in production websites but didn’t exist in NPM until attackers registered them. Two were credential harvesters. One was a cryptominer.

All three would have been missed by traditional dependency scanners because they were loaded via <script> tags pointing to CDNs, not declared in package.json.

The Future of This Attack Vector

This is only getting worse. As AI coding assistants become ubiquitous, expect:

1. Namespace Squatting at Scale
Attackers will systematically register common typos and hallucinations for every major framework.

2. Framework-Specific Targeting
As new frameworks emerge, LLMs will hallucinate packages for them before legitimate ecosystems develop. Attackers will claim the namespace early.

3. Deprecated Package Revival
Malicious actors are already claiming abandoned NPM packages. LLMs trained on old data suggest deprecated packages, attackers register them, developers install them.

4. AI-Powered Harvesting
Attackers are building their own LLM harvesters that prompt models thousands of times, collect hallucinated package names, and auto-register them.

How to Protect Your Codebase

Always Verify Before Installing

Never blindly run npm install based on AI suggestions. Check:

  • Does the package exist on npmjs.com?
  • When was it registered?
  • Who maintains it?
  • How many downloads does it have?
  • Does it have legitimate documentation?

If a package has <1000 downloads and was registered in the last month, investigate before installing.

Use Lock Files Everywhere

package-lock.json, yarn.lock, and pnpm-lock.yaml prevent dependency substitution attacks. Commit them to version control.

Lock files ensure that once you’ve verified and installed a package, the exact version is pinned. Even if an attacker registers a similar name, your builds won’t use it.

Enable Dependency Scanning in CI/CD

Add automated checks to your pipeline:

  • Run npm audit on every build
  • Use tools like Snyk, Dependabot, or Semgrep
  • Add custom checks for hallucinated dependencies (like vibe_eval)

Catch suspicious packages before they reach production, not after.

Review AI-Suggested Code Carefully

When GitHub Copilot or ChatGPT suggests code with imports:

  • Check if you recognize every package name
  • Google the package to verify legitimacy
  • Look for official documentation
  • Check the package’s GitHub repository

AI suggestions are starting points, not verified recommendations. Treat them like StackOverflow answers: helpful but requiring verification.

Monitor Your Dependencies

Set up alerts for:

  • New dependencies added to your project
  • Version changes in existing dependencies
  • Packages flagged by security scanners

You should know when your dependency tree changes, not discover it six months later during an audit.
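
The first of those alerts is easy to script. A sketch that diffs two versions of package.json (the helper name is illustrative):

import json

def new_dependencies(old_pkg_json: str, new_pkg_json: str) -> set[str]:
    """Package names present in the new package.json but not the old one."""
    def dep_names(raw: str) -> set[str]:
        data = json.loads(raw)
        return set(data.get("dependencies", {})) | set(data.get("devDependencies", {}))
    return dep_names(new_pkg_json) - dep_names(old_pkg_json)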

The Detection Gap in the Industry

Major security vendors are still catching up. As of December 2025:

What existing tools check:

  • Known CVEs in declared dependencies
  • Outdated package versions
  • License compliance

What they DON’T check:

  • AI-hallucinated package names
  • CDN-loaded packages not in package.json
  • Recently registered packages with suspicious names
  • Dynamic imports that bypass static analysis

This is the gap vibe_eval fills. The 291-line check scans all of these attack surfaces and flags packages that traditional SCA tools miss entirely.

FAQ

How do I know if a package is hallucinated vs. just new?

Check the package on npmjs.com directly. Look for: (1) Registration date vs. your first install date, (2) Download counts, (3) Maintainer history, (4) GitHub repository with real activity, (5) Documentation quality. Hallucinated packages typically have <1000 downloads, were registered in the last 90 days, have minimal/fake docs, and maintainers with few other packages.

Can't I just trust GitHub Copilot to suggest safe packages?

No. Copilot and ChatGPT are trained on public code, which includes both real and hallucinated package names. They don’t verify package existence in real-time. They suggest patterns that sound plausible based on training data. Always verify package names independently before installing.

What do I do if I've already installed a suspicious package?

(1) Immediately remove it from your project, (2) Delete node_modules and reinstall clean dependencies, (3) Rotate any credentials that might have been exposed (API keys, tokens, passwords), (4) Audit your codebase for any code changes the package might have made, (5) Check your CI/CD logs for suspicious activity, (6) Report the package to NPM abuse team.

How often should I scan for hallucinated dependencies?

At minimum: (1) Before every production deploy, (2) When adding new dependencies, (3) Weekly in active development, (4) After any AI-assisted coding sessions. Treat it like running tests—part of your standard workflow, not a one-time audit.

Is this attack specific to JavaScript/NPM?

No. It affects all package ecosystems: PyPI (Python), RubyGems (Ruby), Cargo (Rust), Maven (Java), NuGet (.NET). Any ecosystem where LLMs suggest package names is vulnerable. NPM is just the most visible because of its size and the prevalence of AI-assisted JavaScript development.

Can vibe_eval detect these attacks in my codebase?

Yes. The HallucinatedDependenciesCheck scans both live websites and local codebases. It checks package.json, lock files, JavaScript imports, CDN references, and build configs. It’s part of the premium check suite and integrates with CI/CD pipelines. The full implementation is 291 lines and catches all the attack patterns discussed in this article.

The Uncomfortable Truth

AI coding assistants are making developers more productive. They’re also creating new attack vectors that didn’t exist two years ago.

The problem isn’t the AI. The problem is blind trust. When ChatGPT suggests npm install email-validator-pro, most developers don’t verify that package exists and is legitimate. They just run the command.

Attackers know this. They’re harvesting hallucinated package names, registering them, and waiting.

The only question is whether your security scanning catches these before they execute in your environment.

Conclusion

Key Takeaways

  • AI coding assistants hallucinate non-existent package names that sound plausible based on training data patterns
  • Attackers harvest these hallucinations, register the packages, and wait for developers to install them
  • Three confirmed supply chain attacks have been detected using CDN-loaded hallucinated dependencies
  • Traditional dependency scanners miss these because they only check known packages for CVEs, not package existence
  • Detection requires checking multiple sources: package.json, JavaScript imports, CDN URLs, and build configs
  • Tri-state verification logic (exists/doesn’t exist/cannot verify) is critical to avoid false positives from network failures
  • LLMs consistently hallucinate packages following patterns like framework-task-suffix and security-concept-action
  • Package registration dates, download counts, and maintainer history are key indicators of legitimacy
  • Lock files prevent dependency substitution but don’t prevent initial installation of malicious packages
  • The attack vector is growing as AI adoption increases—expect namespace squatting at scale in 2026

This isn’t theoretical. It’s happening in production codebases right now. The question is whether your security tools can detect it before it becomes a breach.
