You start with five checks: XSS, SQL injection, HTTPS, maybe CORS. Clean code, easy to maintain. Then you add CSRF. Then clickjacking. Then CSP validation. By check 30, your codebase is a mess of if-statements and copy-pasted browser instances.
Security Scanner Architecture: A structured system design that separates orchestration (managing browser lifecycle), discovery (finding and registering checks), and execution (running individual security tests) to enable scaling from dozens to hundreds of checks without code duplication or tight coupling.
I hit this wall building vibe_eval. We needed comprehensive coverage: security, performance, accessibility, privacy, SEO, PWA compliance. That’s not 10 checks—it’s 200+. The naive approach would have been thousands of lines of duplicated code and a maintenance nightmare.
Here’s the architecture that actually scales.
The Three-Layer Model That Works
Most security scanners are flat: one big script that instantiates checks and runs them. This breaks down around 20 checks because there’s no separation of concerns.
Each layer has exactly one job. The Operator manages browser lifecycle. The Registry discovers and organizes checks. Individual checks implement specific tests. Zero overlap.
Metaclass-Based Registration: A Python pattern that uses metaclass hooks to automatically register subclasses at import time, eliminating manual registration boilerplate and enabling zero-configuration plugin architectures.
Layer 1: The Operator (Browser Orchestration)
The Operator manages what’s expensive: the browser instance.
```python
class VibeCheck(_StateMixin, metaclass=VibeCheckMeta):
    SEVERITY = 8  # Default severity (1-10 scale)

    def __init__(self, url: str):
        self.url = url

    def check_page(self, page: Page) -> Discoveries:
        """Implement this for checks that only need page content"""
        pass

    def check_page_and_ctx(self, page: Page, collected_requests) -> Discoveries:
        """Implement this for checks that need network requests"""
        pass

    def run(self, page: Page, collected_requests=None) -> Discoveries:
        """Unified entry point"""
        if collected_requests is None:
            result = self.check_page(page)
        else:
            result = self.check_page_and_ctx(page, collected_requests)
        return self._ensure_discoveries(result)
```
The dual interface pattern:
Some checks only need page HTML:
```python
class ARIACheck(VibeCheck):
    def check_page(self, page: Page) -> Discoveries:
        # Just analyze the DOM
        return Discoveries.ok("ARIA labels present")
```
```python
from playwright.sync_api import Page

from backend.labyrinth.shape import VibeCheck, Discoveries


class HTTPSCheck(VibeCheck):
    """Verifies the site uses HTTPS"""

    SEVERITY = 9  # High severity - security critical
    is_premium = False  # Available to all users

    def check_page(self, page: Page) -> Discoveries:
        if page.url.startswith("https://"):
            return Discoveries.ok("Site uses HTTPS", recommendations=[])
        return Discoveries.fail(
            "Site does not use HTTPS - traffic is unencrypted",
            recommendations=[
                "Obtain an SSL/TLS certificate (free via Let's Encrypt)",
                "Configure your web server to use HTTPS",
                "Redirect all HTTP traffic to HTTPS",
                "Enable HSTS headers",
            ],
        )
```
That’s it. Drop this file in checks/security/ and it’s automatically:
Registered via metaclass
Available in registry_init()
Categorized as “Security”
Ready to run
No imports in __init__.py. No manual registration. No configuration files. The metaclass handles everything.
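To make the auto-registration idea concrete, here is a minimal sketch of how a registering metaclass could work. The names `CheckMeta`, `REGISTRY`, `BaseCheck`, and the `category` attribute are hypothetical stand-ins, not vibe_eval's actual `VibeCheckMeta` or `registry_init()`. One assumption worth stating: the metaclass hook only fires when a module is imported, so something (for example a package walk at startup) still has to import every file under the checks directory.

```python
from collections import defaultdict

# Hypothetical registry: category name -> list of registered check classes.
REGISTRY = defaultdict(list)


class CheckMeta(type):
    """Registers every subclass at the moment its class body is executed."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # The base class itself has no CheckMeta-built parents, so it is skipped.
        if any(isinstance(base, CheckMeta) for base in bases):
            REGISTRY[cls.category].append(cls)
        return cls


class BaseCheck(metaclass=CheckMeta):
    category = "uncategorized"


class HTTPSCheck(BaseCheck):
    category = "security"


class ARIACheck(BaseCheck):
    category = "accessibility"


# Defining the classes above was enough to register them.
registered = {cat: [c.__name__ for c in classes] for cat, classes in REGISTRY.items()}
print(registered)
```

In this sketch the category is a class attribute; deriving it from the directory name, as the article describes, would be a small change inside `__new__`.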
```python
class ContentSecurityPolicyCheck(VibeCheck):
    """Checks for proper CSP headers"""

    SEVERITY = 8
    is_premium = True

    def check_page_and_ctx(self, page: Page, collected_requests) -> Discoveries:
        # Find the main document request
        for req in collected_requests:
            if req.url == page.url:
                csp = req.headers.get("content-security-policy", "")
                if not csp:
                    return Discoveries.fail(
                        "No Content-Security-Policy header found",
                        recommendations=[
                            "Add CSP header to prevent XSS attacks",
                            "Start with a restrictive policy and relax as needed",
                            "Use CSP reporting to monitor violations",
                        ],
                    )

                # Check for unsafe directives
                unsafe_patterns = ["'unsafe-eval'", "'unsafe-inline'"]
                found_unsafe = [p for p in unsafe_patterns if p in csp]
                if found_unsafe:
                    return Discoveries.fail(
                        f"CSP contains unsafe directives: {', '.join(found_unsafe)}",
                        recommendations=[
                            "Remove 'unsafe-inline' and 'unsafe-eval'",
                            "Use nonces or hashes for inline scripts",
                            "Refactor to avoid eval() usage",
                        ],
                    )
                return Discoveries.ok("CSP header present and secure")

        return Discoveries.ok("No CSP check needed")
```
The check doesn’t manage browser lifecycle or request collection. It just implements the logic.
Shared Utilities in the Base Class
Common patterns get extracted into base class helpers.
Path Probing
Many checks need to probe for exposed files:
```python
class AdminExposureCheck(VibeCheck):
    def check_page(self, page: Page) -> Discoveries:
        admin_paths = ["/admin", "/administrator", "/wp-admin"]
        return self._check_paths_for_presence(
            page,
            admin_paths,
            success_info="No admin panels exposed",
            failure_template="Admin panel found at {path}",
            recommendations=[
                "Restrict admin access by IP",
                "Use non-standard admin URLs",
                "Implement rate limiting",
            ],
        )
```
The _check_paths_for_presence() helper handles:
Multiple path probing
Timeout management
Heuristic 404 detection
Result formatting
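The real `_check_paths_for_presence()` is not shown here, so the following is only a sketch of the probing loop such a helper might contain. `find_exposed_paths`, `Probe`, `soft_404`, and the injected `fetch` callable are hypothetical names; in the scanner the fetch would go through the shared browser page, and the timeout would be enforced there.

```python
from typing import NamedTuple
from urllib.parse import urljoin


class Probe(NamedTuple):
    status: int
    body: str


def find_exposed_paths(base_url, paths, fetch, is_missing):
    """Return the paths that respond and do not look like a 404 page."""
    exposed = []
    for path in paths:
        try:
            probe = fetch(urljoin(base_url, path))
        except TimeoutError:
            continue  # a probe that times out is treated as "not exposed"
        if probe.status < 400 and not is_missing(probe):
            exposed.append(path)
    return exposed


def soft_404(probe):
    # Catches apps that render a "not found" page with HTTP 200.
    return probe.status == 404 or "not found" in probe.body.lower()


def fake_fetch(url):
    # Stand-in for a real request so the sketch runs without a network.
    if url.endswith("/administrator"):
        raise TimeoutError("probe timed out")
    pages = {
        "https://example.com/admin": Probe(200, "<h1>Admin login</h1>"),
        "https://example.com/wp-admin": Probe(200, "<h1>Page not found</h1>"),
    }
    return pages.get(url, Probe(404, ""))


exposed = find_exposed_paths(
    "https://example.com", ["/admin", "/administrator", "/wp-admin"], fake_fetch, soft_404
)
print(exposed)  # ['/admin']
```

Injecting `fetch` and `is_missing` keeps the loop testable without a browser, which is one possible way to share it across many checks.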
Heuristic 404 Detection
SPAs often return 200 for everything:
```python
def heuristic_404(self, response) -> bool:
    """Detect 404s even when status code is 200"""
    if response.status == 404:
        return True
    body = response.text().lower()
    markers = ["404", "not found", "page not found"]
    return any(marker in body for marker in markers)
```
This prevents false positives on SPAs that render 404 pages with HTTP 200.
```python
import asyncio


# Checks declare dependencies
class CSPCheck(VibeCheck):
    depends_on = []  # No dependencies


class JSMinificationCheck(VibeCheck):
    depends_on = [ScriptLoadingCheck]  # Needs scripts first


# Execute in parallel where possible
async def run_parallel(checks_batch, page, requests):
    # page and requests are passed in explicitly rather than read from outer scope
    tasks = [check.run_async(page, requests) for check in checks_batch]
    return await asyncio.gather(*tasks)
```
With parallelization, 200 checks drop from 5 minutes to under 40 seconds.
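One way to turn `depends_on` declarations into batches that can run concurrently is a topological sort. The sketch below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+); `plan_batches`, `run_all`, and the stub `Check` class are hypothetical, and `run_async` here takes no arguments purely to keep the example self-contained. It also assumes checks are written against an async API, since `asyncio.gather` gives no speedup for blocking calls.

```python
import asyncio
from graphlib import TopologicalSorter


class Check:
    depends_on = []

    async def run_async(self):
        await asyncio.sleep(0)  # stand-in for real page analysis
        return type(self).__name__


class ScriptLoadingCheck(Check):
    pass


class CSPCheck(Check):
    pass


class JSMinificationCheck(Check):
    depends_on = [ScriptLoadingCheck]


def plan_batches(check_classes):
    """Group checks so that every dependency sits in an earlier batch."""
    sorter = TopologicalSorter({cls: set(cls.depends_on) for cls in check_classes})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda cls: cls.__name__)
        batches.append(ready)
        sorter.done(*ready)
    return batches


async def run_all(check_classes):
    results = []
    for batch in plan_batches(check_classes):
        # Checks inside one batch do not depend on each other.
        results.append(await asyncio.gather(*(cls().run_async() for cls in batch)))
    return results


batches = asyncio.run(run_all([JSMinificationCheck, CSPCheck, ScriptLoadingCheck]))
print(batches)  # [['CSPCheck', 'ScriptLoadingCheck'], ['JSMinificationCheck']]
```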
Check Severity and Scoring
Each check has a SEVERITY (1-10):
```python
SEVERITY = 10  # Critical (SQL injection, RCE)
SEVERITY = 9   # High (XSS, auth bypass)
SEVERITY = 8   # Medium-High (CSP missing)
SEVERITY = 5   # Medium (performance issues)
SEVERITY = 3   # Low (SEO recommendations)
SEVERITY = 1   # Info (best practices)
```
The scanner aggregates these into an overall security score:
```python
def calculate_security_score(results):
    total_severity = sum(
        check_result['severity']
        for check_result in results['checks'].values()
        if check_result['critical']
    )
    max_possible = len(results['checks']) * 10
    risk_score = (total_severity / max_possible) * 100
    # Invert for "security score" (higher is better)
    return 100 - risk_score
```
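A worked example helps show what this formula produces. The sample data below is invented for illustration (`MetaDescriptionCheck` is a hypothetical check name), and it assumes `critical` is `True` for a check that failed.

```python
def calculate_security_score(results):
    # Same formula as above: failed severity over the maximum possible severity.
    total_severity = sum(
        r["severity"] for r in results["checks"].values() if r["critical"]
    )
    max_possible = len(results["checks"]) * 10
    return 100 - (total_severity / max_possible) * 100


sample = {
    "checks": {
        "HTTPSCheck": {"severity": 9, "critical": False},
        "ContentSecurityPolicyCheck": {"severity": 8, "critical": True},
        "AdminExposureCheck": {"severity": 7, "critical": False},
        "MetaDescriptionCheck": {"severity": 2, "critical": True},
    }
}

# Failed severity: 8 + 2 = 10. Maximum: 4 checks * 10 = 40. Risk: 25%.
score = calculate_security_score(sample)
print(score)  # 75.0
```

Two properties follow from the formula as written: the denominator assumes every check could score 10, so a scan where every check fails at severity 5 still scores 50, and an empty result set raises `ZeroDivisionError` unless the caller guards against it.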
Production Learnings
1. Check Isolation is Critical
Bad check:
```python
class BadCheck(VibeCheck):
    shared_state = []  # WRONG: Shared across instances!

    def check_page(self, page):
        self.shared_state.append(page.url)  # Data leak!
```
Good check:
```python
class GoodCheck(VibeCheck):
    def __init__(self, url):
        super().__init__(url)
        self.instance_state = []  # Isolated per instance

    def check_page(self, page):
        self.instance_state.append(page.url)
```
2. Per-Check Timeouts
Without a timeout on each check, a slow server can hang the entire scan.
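One way to keep a slow server from hanging the scan is to wrap each check in a timeout. This is a sketch that assumes checks expose an async entry point; `run_with_timeout`, `fast_check`, and `hung_check` are hypothetical names, and the failure record simply mirrors the shape used for crashed checks elsewhere in this article.

```python
import asyncio


async def run_with_timeout(check_coro, timeout_s):
    """Return the check's result, or a failure record if it exceeds timeout_s."""
    try:
        return await asyncio.wait_for(check_coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return {
            "info": f"Check timed out after {timeout_s}s",
            "critical": False,
            "error": True,
        }


async def fast_check():
    await asyncio.sleep(0.01)
    return {"info": "ok", "critical": False}


async def hung_check():
    await asyncio.sleep(60)  # stand-in for a server that never responds
    return {"info": "never reached", "critical": False}


async def main():
    return await asyncio.gather(
        run_with_timeout(fast_check(), timeout_s=0.5),
        run_with_timeout(hung_check(), timeout_s=0.05),
    )


fast_result, hung_result = asyncio.run(main())
print(fast_result)
print(hung_result)
```

`asyncio.wait_for` cancels the wrapped coroutine when the timeout expires, so the hung check does not keep running in the background.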
3. Graceful Degradation
If one check crashes, the scan continues:
```python
for check in self.checks:
    # Assumption: results are keyed by class name; the original excerpt
    # does not show where `name` is set.
    name = type(check).__name__
    try:
        result = check.run(page, collected_requests)
        results["checks"][name] = result._asdict()
    except Exception as e:
        logger.error(f"Check {name} failed: {e}")
        results["checks"][name] = {
            "info": f"Check failed with error: {e}",
            "critical": False,
            "error": True,
        }
```
At scale, some checks will fail. The architecture must assume this and continue.
Add a New Security Check
Step-by-step guide to adding a new check to the scanner
Create the Check File
Create a new Python file in the appropriate category directory (e.g., backend/labyrinth/checks/security/new_check.py).
Choose the category based on what the check validates: security, performance, accessibility, privacy_compliance, seo, pwa, client_apis, operations, or integrations.
Inherit from VibeCheck
Define your check class inheriting from VibeCheck and set the SEVERITY level (1-10):
The three-layer model has overhead. Don’t pay for it unless you’re scaling.
FAQ
Why use metaclasses instead of manual registration?
Metaclasses eliminate boilerplate and make adding checks frictionless. Without metaclasses, every new check requires updating a central registry file, which creates merge conflicts and slows development. With metaclass auto-registration, dropping a file in the checks directory is enough—zero configuration, zero imports.
How do you handle flaky checks at scale?
Use the is_on_maintenance flag to disable problematic checks without deleting code. Add retry logic with exponential backoff for network-dependent checks. Implement check timeouts to prevent hangs. Most importantly, ensure graceful degradation—one failing check should never crash the entire scan.
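As an illustration of the retry advice, here is a small sketch of exponential backoff. `retry_with_backoff` and `flaky_probe` are hypothetical names, and the `sleep` parameter exists only so the example can record delays instead of actually waiting.

```python
import time


def retry_with_backoff(fn, attempts=3, base_delay=0.5,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**n, then try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: let the scan's error handling take over
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}
delays = []


def flaky_probe():
    # Fails twice with a transient error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network failure")
    return "ok"


result = retry_with_backoff(flaky_probe, attempts=4, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Restricting `retry_on` to network-style errors keeps genuine bugs in a check from being retried silently.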
Why not use existing security scanners like OWASP ZAP?
Existing scanners are great for general-purpose scanning but difficult to extend with custom checks specific to your stack. Building your own scanner gives you complete control over check logic, filtering, and integration with your deployment pipeline. Use existing scanners for broad coverage, build custom scanners for stack-specific validation.
How do you prevent duplicate work across checks?
Collect network requests once and share them across all checks. Use the dual interface pattern (check_page vs check_page_and_ctx) so checks declare what they need. Cache expensive computations in the Operator layer. Future versions will implement check dependencies to eliminate redundant work.
What's the actual performance impact of running 200+ checks?
Sequential execution takes about 5 minutes for 200 checks. Browser startup is 2-3s, page load is 1-3s, and each check adds 1-2s. Parallelizing check execution (coming in future versions) drops this to 40 seconds. The single browser instance saves 400+ seconds compared to per-check browser instantiation.
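These figures can be sanity-checked with a back-of-envelope model. The ranges below come straight from the numbers in this answer; the model itself is a simplification that assumes one page load per scan and ignores everything except startup, load, and check time.

```python
CHECKS = 200
startup = (2, 3)    # browser startup, seconds (low, high)
page_load = (1, 3)  # page load, seconds (low, high)
per_check = (1, 2)  # time added by each check, seconds (low, high)

# One browser, one page load, then every check in sequence.
sequential = tuple(
    s + p + CHECKS * c for s, p, c in zip(startup, page_load, per_check)
)

# Extra cost if every check after the first launched its own browser.
extra_startups = tuple((CHECKS - 1) * s for s in startup)

print(sequential)      # (203, 406) seconds, i.e. roughly 3.4 to 6.8 minutes
print(extra_startups)  # (398, 597) seconds of avoided browser startups
```

The "about 5 minutes" and "400+ seconds" figures both fall inside these ranges.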
Conclusion
Key Takeaways
The three-layer model (Operator, Registry, Checks) separates orchestration, discovery, and execution for zero-duplication scaling
Metaclass-based auto-registration eliminates manual boilerplate—dropping a file in checks/ is enough to add new functionality
Single browser instance reuse saves 400+ seconds for 200 checks compared to per-check browser instantiation
Dual interface pattern (check_page vs check_page_and_ctx) lets checks declare minimal dependencies without framework overhead
State flags (is_active, is_premium, is_on_maintenance) enable A/B testing, tiering, and graceful degradation at scale
Category-based filesystem organization makes navigation and team ownership clear at 200+ checks
Shared utilities (path probing, heuristic 404 detection) prevent code duplication across similar checks
Graceful degradation is critical—one failing check must not crash the entire scan
Check isolation prevents data leaks—no shared state between check instances
Severity scoring (1-10 scale) aggregates into overall security scores for dashboards and reporting
The architecture you choose at 5 checks determines whether you can reach 500 checks. Invest in the three-layer model early, and adding check #200 is as easy as adding check #1.