You ask Claude to help you validate JSON schemas. It suggests importing json-schema-validator-express. The package name sounds perfect. You run npm install. Congratulations, you just installed malware.
This attack vector didn’t exist two years ago. AI created it.
How Package Hallucination Works
AI models don’t have real-time access to npm, PyPI, or other package registries. They generate package recommendations based on patterns: common naming conventions, what sounds right, what similar packages are called.
The problem: what sounds right often isn’t real.
The Attack Chain
The attack is elegant in its simplicity:
Step 1: Reconnaissance
Attackers query AI models with common development tasks and collect the suggested package names, identifying names that don’t exist but sound legitimate.
Step 2: Registration
The attacker registers the hallucinated name on npm, PyPI, or another registry, publishing code that appears functional but carries a malicious payload.
Step 3: Wait
When developers ask AI for help and receive the hallucinated recommendation, they install the attacker’s package. The malicious code executes during installation or at runtime.
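The install-time execution in Step 3 typically rides on package lifecycle scripts. A hypothetical manifest for a slopsquatted package (the name is the invented example from the opening; the file names are illustrative):

```json
{
  "name": "json-schema-validator-express",
  "version": "1.0.2",
  "description": "JSON schema validation middleware for Express",
  "main": "index.js",
  "scripts": {
    "postinstall": "node setup.js"
  }
}
```

npm runs the postinstall script automatically during npm install, so setup.js executes with the developer’s privileges before any of the package’s code is ever imported.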
The package works. Tests pass. The attack succeeds.
Real-World Examples
The python-jwt Incident
Researchers found that AI models consistently recommended python-jwt for JWT handling in Python. The actual package is PyJWT. An attacker registered python-jwt with typosquatting code.
The express-validator Variants
Multiple variants of express-validator (the real package) have been recommended by AI:
- express-input-validator (hallucinated)
- express-body-validator (hallucinated)
- express-form-validator (hallucinated)
Attackers registered several before the pattern was identified.
The Lasso Security Study
Lasso Security documented over 200 malicious packages that exploited AI hallucination patterns. Their research found that popular AI coding assistants hallucinate package names at rates between 5% and 20% depending on the language and domain.
Why AI Hallucinations Happen
AI models generate package names through pattern matching, not registry lookup:
Naming Convention Patterns
Models learn that Express middleware often follows an express-{function} naming pattern. Asked about JSON validation, a model generates express-json-validator because the name fits the pattern, whether or not the package exists.
Training Data Gaps
Training data includes documentation, tutorials, and code from specific points in time. Packages created after the training cutoff aren’t known, and packages that existed but were since deprecated might still be recommended.
Confidence Without Verification
Models present hallucinated packages with the same confidence as real ones. There’s no “I’m not sure this exists” qualifier.
Detection and Prevention
Protect Against Hallucinated Dependencies
A systematic approach to verifying AI package recommendations
Verify Before Installing
Before running npm install or pip install, verify the package exists:
- Search the package registry directly
- Check the package’s npm/PyPI page
- Look for download statistics, maintainer info, repository links
- Zero downloads or very recent creation dates are red flags
Check Package Age
Run npm view {package} time to see when it was created. Packages less than a few months old that AI recommends confidently deserve extra scrutiny.
Inspect Before Install
Run npm pack {package} to download the tarball without installing it. Examine the contents, especially postinstall scripts, before adding the package to your project.
Use Lock Files
Commit package-lock.json or yarn.lock. Lock files pin specific versions and make it harder for attackers to swap malicious code into existing dependencies.
Implement Dependency Scanning
Run automated scanning (npm audit, or a commercial supply-chain scanner) in CI so that known-malicious or newly published dependencies are flagged before they ship.
Verification Commands
Quick commands to validate packages before installation:
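A minimal sketch of those checks in shell. It assumes curl is available; the npm and pip commands are guarded in case those CLIs aren’t installed, and the package names are just examples:

```shell
# Does the name exist on npm at all? The registry returns HTTP 404
# for names that were never published.
check_npm_exists() {
  curl -s -o /dev/null -w '%{http_code}' "https://registry.npmjs.org/$1"
}

# When was the package created? (registry metadata field)
if command -v npm >/dev/null; then
  npm view express time.created
fi

# How many people actually use it? (npm downloads API)
curl -s "https://api.npmjs.org/downloads/point/last-week/express"; echo

# Python side: listing known versions fails for never-published names.
if command -v pip >/dev/null; then
  pip index versions requests
fi
```

check_npm_exists prints only the status code, so it slots easily into scripts: anything other than 200 means stop and investigate.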
If npm view returns “404 Not Found,” the AI hallucinated the package. If it exists but was created very recently, investigate further.
Framework for AI Package Recommendations
When AI suggests a package, run through this decision tree:
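The tree can be sketched as a small shell function. The inputs are three facts you can look up with the verification commands above; the 90-day and 1,000-download thresholds are illustrative assumptions, not registry policy:

```shell
# Triage an AI-suggested package from three looked-up facts:
# $1 = "yes"/"no" (does the name exist on the registry?)
# $2 = age in days since first publish
# $3 = weekly download count
assess_package() {
  local exists="$1" age_days="$2" weekly_downloads="$3"
  if [ "$exists" != "yes" ]; then
    echo "REJECT: name does not exist on the registry (hallucinated)"
  elif [ "$age_days" -lt 90 ] && [ "$weekly_downloads" -lt 1000 ]; then
    echo "INSPECT: new, little-used package; run npm pack and read it first"
  else
    echo "PROCEED: established package; still review repo and maintainers"
  fi
}
```

For example, `assess_package yes 30 12` lands on INSPECT: the package is real, but a month-old name with a dozen weekly downloads is exactly the profile of a freshly registered attack package.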
The Deeper Problem
Package hallucination reveals a fundamental issue with AI-assisted development: AI models present fiction with the same confidence as fact.
The model doesn’t know what it doesn’t know. It has no concept of “this package might not exist.” Every recommendation comes with equal confidence because confidence is generated, not measured.
This isn’t a bug that will be fixed. It’s a fundamental characteristic of how large language models work. The solution isn’t better AI. The solution is verification.
FAQ
Can npm/PyPI prevent this attack?
Which AI models hallucinate packages most?
Is this a form of prompt injection?
How do I report a suspicious hallucinated package?
Report to the package registry’s security team:
- npm: security@npmjs.com
- PyPI: security@python.org
- Also report to the AI tool vendor so they can potentially filter the recommendation
Should I stop using AI for package recommendations?
Conclusion
Key Takeaways
- AI models hallucinate 5-20% of package recommendations, inventing plausible names that don’t exist
- Attackers register hallucinated package names and wait for developers to install them
- Over 200 malicious packages have exploited this attack vector
- The attack works because malicious packages often include real functionality alongside malicious code
- Verification before installation is no longer optional in AI-assisted development
- Check package existence with npm view or pip index versions before installing
- Package age and download count help identify recently registered attack packages
- Lock files (package-lock.json) provide some protection against dependency manipulation
- AI confidence doesn’t correlate with accuracy; treat all recommendations as unverified suggestions
Package hallucination is the supply chain attack AI created. The tools that accelerate our development also accelerate our exposure to this new attack vector.
The fix is simple but requires discipline: verify every package before installation. The three seconds of checking save you from becoming the next supply chain attack statistic.