I Built a 200+ Plugin Security Scanner Using Python Metaclasses (Zero Boilerplate)

The Problem With 200 Plugins

I hit a wall at around check number 50. Every new security check required the same boring ritual:

  1. Write the check class
  2. Import it in the registry file
  3. Add the @register decorator
  4. Hope I didn’t forget anything
  5. Debug why my new check isn’t running (spoiler: I forgot the decorator)

Metaclass: A Python metaclass is a class that creates classes. It intercepts class creation itself, letting you modify or register classes automatically when they’re defined, not when they’re instantiated.

With 200+ security checks, this manual process wasn’t just annoying. It was error-prone. Forgetting to register a check meant shipping incomplete security scans. Not acceptable.

The solution? Make registration impossible to forget by making it automatic.

How Metaclasses Actually Work

Before diving into the implementation, let’s clear up what metaclasses do. Most developers think class creation happens in one step. It doesn’t.

When Python sees class MyCheck(VibeCheck):, it:

  1. Calls VibeCheckMeta.__new__() to create the class object
  2. Gets back a class (not an instance)
  3. That class is now available as MyCheck

This happens at import time, not when you call MyCheck(). That’s the key insight that makes auto-registration work.
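
To see that the metaclass runs at definition time, here is a minimal sketch that records every class name it is asked to build. RecordingMeta, created, and the class names are hypothetical, chosen only for illustration. Nothing is ever instantiated.

```python
created = []  # class names, in the order the metaclass builds them


class RecordingMeta(type):
    def __new__(mcls, name, bases, attrs):
        created.append(name)  # runs while the class statement executes
        return super().__new__(mcls, name, bases, attrs)


class Base(metaclass=RecordingMeta):
    pass


class First(Base):  # inherits the metaclass from Base
    pass


# A class statement is roughly shorthand for calling the metaclass directly.
Second = RecordingMeta("Second", (Base,), {})

print(created)  # ['Base', 'First', 'Second']
```

All three names are recorded as soon as the module is loaded, which is exactly the hook a registry needs.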

Here’s the simplest possible metaclass that does nothing:

class Meta(type):
    def __new__(cls, name, bases, attrs):
        # cls = the metaclass itself (Meta)
        # name = string name of the class being created
        # bases = tuple of parent classes
        # attrs = dict of class attributes and methods
        return super().__new__(cls, name, bases, attrs)

class MyClass(metaclass=Meta):
    pass  # Meta.__new__() was just called

That’s it. Every time you define a class with metaclass=Meta, Python calls Meta.__new__() before the class exists.

Now imagine adding one line to register the class globally. That’s the entire pattern.

The Implementation That Powers vibe_eval

Here’s the complete metaclass registry from vibe_eval. These 14 lines manage 200+ plugins:

_REGISTRY = []

def registry_init(url):
    """Factory function that instantiates all registered checks"""
    return [cls(url) for cls in _REGISTRY]

class VibeCheckMeta(type):
    """Metaclass to automatically register VibeCheck subclasses in _REGISTRY"""

    def __new__(cls, name, bases, attrs):
        new_class = super().__new__(cls, name, bases, attrs)
        if name != "VibeCheck":  # Only register subclasses, not the base class
            _REGISTRY.append(new_class)
        return new_class

That’s it. Three components:

  1. Global registry: _REGISTRY holds references to all check classes
  2. Factory function: registry_init() creates instances of everything
  3. Metaclass: VibeCheckMeta registers classes automatically

The if name != "VibeCheck" check is critical. Without it, the base class itself gets registered, and registry_init() instantiates it alongside the real checks even though it implements no check logic.

The Base Class Every Plugin Inherits

The metaclass is half the picture. Here’s the base class that provides structure:

from playwright.sync_api import Page  # same import the check files use

class _StateMixin:
    """Mixin for check lifecycle state"""
    is_active = True
    is_premium = False
    is_on_maintenance = False

class VibeCheck(_StateMixin, metaclass=VibeCheckMeta):
    SEVERITY = 8

    def __init__(self, url: str):
        self.url = url

    def check_page(self, page: Page) -> Discoveries:
        """Base method to be implemented by subclasses"""

    def check_page_and_ctx(self, page: Page, collected_requests) -> Discoveries:
        """Base method to be implemented by subclasses"""

    def run(self, page: Page, collected_requests=None) -> Discoveries:
        """Unified entry point that always returns a Discoveries instance."""
        if collected_requests is None:
            result = self.check_page(page)
        else:
            result = self.check_page_and_ctx(page, collected_requests)
        return self._ensure_discoveries(result)

    def _ensure_discoveries(self, result: Discoveries | None) -> Discoveries:
        """Normalize falsy/None returns into a Discoveries payload."""
        if isinstance(result, Discoveries):
            return result
        return Discoveries.ok(
            info=f"{self.__class__.__name__} returned no findings.",
            recommendations=[],
        )

Template Method Pattern: A design pattern where a base class defines the skeleton of an algorithm in a method, but lets subclasses override specific steps without changing the overall structure.

The run() method is a template method. Every check goes through the same flow:

  1. Call check_page() or check_page_and_ctx(), depending on whether collected_requests was passed
  2. Normalize the result with _ensure_discoveries()
  3. Return a valid Discoveries object

Subclasses only implement check_page() or check_page_and_ctx(). The framework handles the rest.
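
The flow above can be tried without Playwright. This is a simplified sketch, not the vibe_eval code: Result and BaseCheck are hypothetical stand-ins for Discoveries and VibeCheck, and a plain HTML string plays the role of the page.

```python
from dataclasses import dataclass


@dataclass
class Result:  # simplified stand-in for Discoveries
    info: str
    critical: bool = False


class BaseCheck:  # simplified stand-in for VibeCheck
    def check_page(self, page):
        return None

    def check_page_and_ctx(self, page, collected_requests):
        return None

    def run(self, page, collected_requests=None):
        # Template method: pick a hook, then normalize whatever it returned.
        if collected_requests is None:
            result = self.check_page(page)
        else:
            result = self.check_page_and_ctx(page, collected_requests)
        if isinstance(result, Result):
            return result
        return Result(info=f"{type(self).__name__} returned no findings.")


class TitleCheck(BaseCheck):
    def check_page(self, page):
        if "<title>" not in page:
            return Result("Page has no <title>", critical=True)
        # Implicitly returns None when nothing is wrong; run() normalizes it.


class RequestCountCheck(BaseCheck):
    def check_page_and_ctx(self, page, collected_requests):
        if len(collected_requests) > 2:
            return Result("Too many requests", critical=True)


ok = TitleCheck().run("<html><title>Hi</title></html>")
bad = TitleCheck().run("<html></html>")
busy = RequestCountCheck().run("<html></html>", collected_requests=["a", "b", "c"])
print(ok.info)       # TitleCheck returned no findings.
print(bad.critical)  # True
print(busy.critical) # True
```

Each subclass overrides a single hook and never has to think about the None case.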

The Data Structure Checks Return

Security checks need to return structured data. Here’s the dataclass that makes it ergonomic:

from dataclasses import dataclass

@dataclass
class Discoveries:
    info: str
    critical: bool
    recommendations: list[str]

    def __post_init__(self):
        # Always work with a concrete list to avoid shared mutable defaults
        self.recommendations = list(self.recommendations or [])

    @classmethod
    def ok(
        cls, info: str, recommendations: list[str] | None = None
    ) -> "Discoveries":
        return cls(
            info=info, critical=False, recommendations=list(recommendations or [])
        )

    @classmethod
    def fail(
        cls, info: str, recommendations: list[str] | None = None
    ) -> "Discoveries":
        return cls(
            info=info, critical=True, recommendations=list(recommendations or [])
        )

    @classmethod
    def from_condition(
        cls,
        *,
        failed: bool,
        success_info: str,
        failure_info: str,
        recommendations: list[str] | None = None,
    ) -> "Discoveries":
        return cls.fail(failure_info, recommendations) if failed else cls.ok(
            success_info
        )

The factory methods make check code clean:

# Instead of:
return Discoveries(info="SQL errors detected", critical=True, recommendations=[...])

# You write:
return Discoveries.fail("SQL errors detected", recommendations=[...])

The from_condition() classmethod handles conditional results elegantly:

return Discoveries.from_condition(
    failed=sql_error_found,
    success_info="No SQL errors detected",
    failure_info="SQL error messages exposed",
    recommendations=["Use parameterized queries", "Hide database errors"]
)

This reads like English. When sql_error_found is true, return a failure with the recommendations attached. Otherwise, return success (the recommendations are dropped, since there is nothing to fix).

Creating a New Check (The Magic Part)

Here’s where the metaclass pays off. Creating a new check requires exactly zero registration code:

# File: backend/labyrinth/checks/security/sql_injection_check.py
from playwright.sync_api import Page
from backend.labyrinth.shape import VibeCheck, Discoveries

class SQLInjectionCheck(VibeCheck):
    """Detects potential SQL injection vulnerabilities"""

    SEVERITY = 10
    is_premium = True

    def check_page(self, page: Page) -> Discoveries:
        content = page.content().lower()

        sql_errors = [
            "mysql error",
            "ora-",
            "postgresql error",
            "sqlite error",
        ]

        for error in sql_errors:
            if error in content:
                return Discoveries.fail(
                    f"Found SQL error message: {error}",
                    recommendations=[
                        "Implement proper error handling",
                        "Never expose database errors to users",
                        "Use parameterized queries",
                    ]
                )

        return Discoveries.ok("No SQL error messages detected")

That’s the complete implementation. No @register decorator. No entry in a hand-maintained registry list. No manual bookkeeping.

When this file is imported, the metaclass automatically adds SQLInjectionCheck to _REGISTRY. As long as the module gets imported, registration can’t be forgotten.

Running All 200+ Checks at Once

Using the registered checks is equally simple:

from backend.labyrinth import registry_init

# Automatically creates instances of ALL 200+ checks
checks = registry_init(url="https://example.com")

# Run them (page is an already-open Playwright Page)
for check in checks:
    result = check.run(page)
    if result.critical:
        print(f"CRITICAL: {result.info}")

One function call gives you instantiated versions of every single check in the entire system. No manual imports. No hardcoded lists.

In vibe_eval’s main scanner, it’s literally one line:

class Operator:
    def __init__(self, url: str):
        self.url = self._normalize_url(url)
        self.checks = vibe_check.registry_init(url=self.url)  # 200+ checks

The State Mixin for Runtime Control

The _StateMixin adds lifecycle flags to every check:

# Get only active checks
active_checks = [c for c in checks if c.is_active]

# Get premium checks
premium_checks = [c for c in checks if c.is_premium]

# Skip maintenance checks
prod_checks = [c for c in checks if not c.is_on_maintenance]

This enables A/B testing, gradual rollouts, and emergency disabling without touching any registration code:

class NewExperimentalCheck(VibeCheck):
    is_premium = True  # Only for premium users
    is_on_maintenance = False  # Ready for production

Flip a flag, redeploy, done. No registry modifications needed.
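
In practice the three flags are usually combined. Here is a rough sketch of one way to do that. The select_checks helper, its premium_user parameter, and the four example classes are hypothetical and not part of vibe_eval.

```python
class _StateMixin:
    is_active = True
    is_premium = False
    is_on_maintenance = False


class HeaderCheck(_StateMixin):
    pass


class DeepScanCheck(_StateMixin):
    is_premium = True


class FlakyCheck(_StateMixin):
    is_on_maintenance = True


class RetiredCheck(_StateMixin):
    is_active = False


def select_checks(checks, *, premium_user: bool):
    """Keep checks that are active, not under maintenance, and allowed for the tier."""
    return [
        c for c in checks
        if c.is_active
        and not c.is_on_maintenance
        and (premium_user or not c.is_premium)
    ]


all_checks = [HeaderCheck(), DeepScanCheck(), FlakyCheck(), RetiredCheck()]
free = select_checks(all_checks, premium_user=False)
paid = select_checks(all_checks, premium_user=True)
print([type(c).__name__ for c in free])  # ['HeaderCheck']
print([type(c).__name__ for c in paid])  # ['HeaderCheck', 'DeepScanCheck']
```

Because the flags are plain class attributes, the filter works the same on classes and on instances.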

Helper Methods That Keep Checks DRY

The base class includes utilities for common security check patterns:

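from typing import Iterable  # used by the paths annotation below

# Both helpers are methods on VibeCheck, shown here without the class wrapper.
def _safe_request(self, page: Page, path: str, *, timeout_ms: int = 10000):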
    """Best-effort GET request against a path; swallow navigation issues."""
    try:
        return page.context.request.get(f"{self.url}{path}", timeout=timeout_ms)
    except Exception:
        return None

def _check_paths_for_presence(
    self,
    page: Page,
    paths: Iterable[str],
    *,
    success_info: str,
    failure_template: str,
    recommendations: list[str] | None = None,
    timeout_ms: int = 10000,
) -> Discoveries:
    """Common helper for checks that probe well-known paths."""
    for path in paths:
        response = self._safe_request(page, path, timeout_ms=timeout_ms)
        if response and not self.heuristic_404(response):
            return Discoveries.fail(
                failure_template.format(path=path), recommendations=recommendations
            )

    return Discoveries.ok(success_info)

Now checks for exposed admin panels become trivial:

class AdminPanelCheck(VibeCheck):
    def check_page(self, page: Page) -> Discoveries:
        return self._check_paths_for_presence(
            page,
            paths=["/admin", "/wp-admin", "/administrator"],
            success_info="No admin panels found",
            failure_template="Admin panel exposed at {path}",
            recommendations=["Restrict admin access by IP", "Use VPN"],
        )

The helper handles the requests, 404 detection, and result formatting. The check just provides configuration.

The Heuristic 404 Detector

Real-world websites don’t always return HTTP 404 for missing pages. SPAs return 200 with “Page Not Found” in the body. This helper handles that:

def heuristic_404(self, response) -> bool:
    """Heuristically determine if a response represents a missing resource."""
    status = getattr(response, "status", None)
    if status == 404:
        return True

    try:
        body_content = response.text()
    except Exception:
        return False

    if body_content is None:
        return False
    if isinstance(body_content, bytes):
        try:
            body_content = body_content.decode("utf-8", errors="ignore")
        except Exception:
            return False
    if not isinstance(body_content, str):
        return False

    body_lower = body_content.lower()
    markers = [
        "404",
        "not found",
        "page not found",
        "error 404",
        "not available",
        "not reachable",
        "not accessible",
    ]
    return any(marker in body_lower for marker in markers)

This catches both real HTTP 404s and fake 200s that act like 404s. Critical for avoiding false positives.
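
The heuristic is easy to exercise without a browser. The sketch below is a condensed copy of the logic for experimenting. FakeResponse and looks_like_404 are hypothetical names, and the stub only mimics the status attribute and text() method that the helper relies on. The last case shows the trade-off: the bare "404" marker also matches unrelated text, so a page that really exists can be classified as missing.

```python
class FakeResponse:
    """Hypothetical stand-in for a response: a status and a text() method."""

    def __init__(self, status: int, body: str):
        self.status = status
        self._body = body

    def text(self) -> str:
        return self._body


MARKERS = [
    "404", "not found", "page not found", "error 404",
    "not available", "not reachable", "not accessible",
]


def looks_like_404(response) -> bool:
    """Condensed version of heuristic_404 for experimenting outside the class."""
    if getattr(response, "status", None) == 404:
        return True
    body = response.text().lower()
    return any(marker in body for marker in MARKERS)


hard_404 = looks_like_404(FakeResponse(404, ""))                         # True
soft_404 = looks_like_404(FakeResponse(200, "<h1>Page Not Found</h1>"))  # True
real_page = looks_like_404(FakeResponse(200, "<h1>Admin login</h1>"))    # False
# Caveat: "40412" contains "404", so this existing page is treated as missing.
order_page = looks_like_404(FakeResponse(200, "<p>Order #40412 shipped</p>"))  # True
```

If missed findings matter more than noisy ones, narrowing the marker list is one option to consider.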

Why This Beats Decorators

You might be thinking: “Can’t I just use @register?”

You can, but it has problems:

Decorators require remembering:

@register  # What if I forget this?
class MyCheck(VibeCheck):
    pass

Metaclasses are automatic:

class MyCheck(VibeCheck):  # Registered automatically, impossible to forget
    pass

Metaclasses enable validation:

class VibeCheckMeta(type):
    def __new__(cls, name, bases, attrs):
        new_class = super().__new__(cls, name, bases, attrs)

        # Enforce that check_page or check_page_and_ctx is implemented
        if name != "VibeCheck":
            if 'check_page' not in attrs and 'check_page_and_ctx' not in attrs:
                raise TypeError(f"{name} must implement check_page or check_page_and_ctx")

            _REGISTRY.append(new_class)
        return new_class

Now it’s impossible to create an invalid check. The metaclass enforces the contract at class creation time.

With a decorator, validation only happens if you remembered to apply it; otherwise you find out at runtime when the check fails. With metaclasses, every subclass is validated at import time, before anything runs.

Common Pitfalls and How to Avoid Them

Pitfall 1: Import Side Effects

Import-Time Execution: Code that runs when a module is imported, before any functions are called. Metaclasses execute at import time, so errors during class creation break the entire import chain.

Problem: If a check file has import errors, it breaks registration for everything.

Solution: Lazy imports and defensive coding:

# Bad - breaks if rare_library isn't installed
import rare_library

class MyCheck(VibeCheck):
    def check_page(self, page):
        return rare_library.scan(page)

# Good - gracefully degrades
class MyCheck(VibeCheck):
    def check_page(self, page):
        try:
            import rare_library
            return rare_library.scan(page)
        except ImportError:
            return Discoveries.ok("Skipped (missing dependency)")

Pitfall 2: Circular Imports

Problem: If checks import from each other, you hit circular dependencies.

Solution: Keep checks isolated. Shared utilities go in utils/, not in check files.

# Bad - circular import risk
from backend.labyrinth.checks.xss_check import sanitize_html

class CSRFCheck(VibeCheck):
    pass

# Good - shared utility
from backend.labyrinth.utils.sanitizers import sanitize_html

class CSRFCheck(VibeCheck):
    pass

Pitfall 3: Test Pollution

Problem: Tests import checks, which registers them globally, polluting the registry across test runs.

Solution: Reset the registry in test setup:

import pytest
from backend.labyrinth.shape import _REGISTRY

@pytest.fixture(autouse=True)
def reset_registry():
    """Snapshot the registry, then restore it after each test"""
    original = _REGISTRY.copy()
    yield
    _REGISTRY.clear()
    _REGISTRY.extend(original)

This restores the registry to its pre-test contents after every test, so classes defined inside one test never leak into the next.

Performance Considerations

Metaclass overhead: Happens once at import time. Negligible. With 200 checks, you’re looking at microseconds total.

Registry iteration: O(n) where n = number of checks. 200 checks = ~200 pointer dereferences. Also microseconds.

Memory: Each registry entry is a reference (8 bytes on 64-bit CPython), not a copy. 200 classes × 8 bytes ≈ 1.6KB total overhead.

The performance cost is completely dominated by actually running the checks (network requests, page parsing, etc.). The metaclass overhead is lost in the noise.
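
These numbers are easy to check on your own machine. The following rough sketch creates 200 throwaway subclasses and reports the elapsed time and the size of the registry list. RegisteringMeta and the generated class names are hypothetical. The timing covers class creation plus registration, so it is an upper bound on the metaclass’s share.

```python
import sys
import time

registry = []


class RegisteringMeta(type):
    def __new__(mcls, name, bases, attrs):
        new_class = super().__new__(mcls, name, bases, attrs)
        if bases:  # skip the root class, which has no explicit bases
            registry.append(new_class)
        return new_class


class Base(metaclass=RegisteringMeta):
    pass


start = time.perf_counter()
for i in range(200):
    # Build 200 subclasses the same way a class statement would.
    RegisteringMeta(f"Check{i}", (Base,), {})
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"created and registered {len(registry)} classes in {elapsed_ms:.2f} ms")
print(f"registry list size: {sys.getsizeof(registry)} bytes")
```

Results vary by interpreter and hardware, so treat the output as a sanity check, not a benchmark.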

When to Use This Pattern

Metaclasses aren’t for everything. Use them when:

  • You have 10+ plugins and growing
  • Registration must be bulletproof (can’t forget)
  • You need validation at class creation time
  • The plugin interface is stable

Don’t use metaclasses for:

  • Simple one-off registries with 2-3 items
  • Rapidly changing interfaces
  • Code that needs to be understood by junior devs unfamiliar with metaclasses

For vibe_eval with 200+ checks and a stable interface, metaclasses are perfect.

FAQ

Can't I just maintain a list of imports instead of using metaclasses?

You could, but with 200+ checks spread across dozens of files, that list becomes a maintenance nightmare. Every new check requires updating the import list. Metaclasses make registration automatic—import the module, get the check registered. No manual list maintenance.

What happens if I define a check class but never import the file?

The class won’t be registered. Metaclasses only trigger when classes are created, which happens at import time. In vibe_eval, all check modules are imported in __init__.py, ensuring everything gets registered. If you forget to import, the check simply won’t run.
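
vibe_eval imports its check modules explicitly. A possible alternative, sketched here and not taken from the project, walks a package with the standard library’s pkgutil and importlib and imports every module it finds. The function name import_all_modules is hypothetical. Failures are collected, not raised, so one broken module does not block the rest.

```python
import importlib
import pkgutil


def import_all_modules(package_name: str) -> list[str]:
    """Import every module under a package; return the names that failed to import."""
    package = importlib.import_module(package_name)
    failed = []
    walker = pkgutil.walk_packages(
        package.__path__,
        prefix=f"{package.__name__}.",
        onerror=lambda name: None,  # failures are already recorded in the loop below
    )
    for info in walker:
        try:
            importlib.import_module(info.name)  # importing triggers registration
        except Exception:
            failed.append(info.name)
    return failed
```

Calling it once at startup with the name of the package that holds the checks would register every check in that package, and the returned list tells you which modules need attention.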

How do I temporarily disable a specific check without removing it?

Set is_active = False in the class definition, or filter it out at runtime: checks = [c for c in all_checks if c.is_active]. The _StateMixin provides lifecycle flags for exactly this purpose.

Can I use this pattern with async/await?

Yes. The metaclass handles class creation, not execution. Your check_page() method can be async: async def check_page(self, page). The run() method would need to be async too, but the registration mechanism works identically.

What about type hints and IDE autocomplete?

They work fine. The metaclass doesn’t affect type checking. Your IDE sees class MyCheck(VibeCheck) and knows all the methods and attributes from the base class. Type checkers like mypy handle metaclasses correctly.

Is there a simpler alternative for smaller plugin systems?

For under 10 plugins, a simple decorator registry works fine: @register that appends to a global list. The metaclass approach pays off when you have dozens or hundreds of plugins and need guaranteed registration without human error.

The Real-World Impact

This pattern has been running in production for vibe_eval since early 2024. Results:

Every new security check is 15-30 lines of actual check logic. Zero lines of registration code. Zero chances to mess it up.

Conclusion

Key Takeaways

  • Metaclasses intercept class creation at import time, enabling automatic plugin registration without decorators
  • The pattern requires three components: a global registry, a factory function, and a metaclass that registers subclasses
  • Always exclude the base class from registration with if name != "BaseClass" to avoid runtime errors
  • Factory methods like Discoveries.ok() and Discoveries.fail() make check code significantly more readable
  • Helper methods in the base class eliminate boilerplate for common patterns like path probing and 404 detection
  • The _StateMixin enables runtime filtering of checks by active status, premium tier, or maintenance mode
  • Metaclasses can enforce interface contracts at class creation time, catching errors before runtime
  • Import-time errors in one plugin can break the entire registry—use lazy imports for optional dependencies
  • Test isolation requires resetting the global registry between test runs to avoid pollution
  • Performance overhead is negligible—metaclass execution happens once at import, not per-instance

When you need to manage hundreds of plugins with zero boilerplate and bulletproof registration, metaclasses are the right tool. The upfront complexity pays for itself once the plugin count reaches double digits.
