Plugin Architecture & Open Source Strategy¶

This document extends the risk scoring design with a plugin architecture designed for open-source community contribution. The guiding principle: make the knowledge shareable, not just the code.

Community Value Proposition¶

Commercial IGA vendors (SailPoint, Saviynt, Microsoft) all have proprietary risk scoring. There is no open-source tool that lets identity practitioners:

Collectively build and share classification knowledge across industries
Contribute detection methods without exposing organizational data
Reuse each other's work across engagements and customers

Think of the model as: Sigma rules, but for identity risk.

Plugin Levels¶

Five extension points, ordered from easiest to most complex:

graph TB
    subgraph Complexity["Contribution Complexity"]
        direction TB
        L1["1. Classifier Packs<br/><i>YAML — just data, no code</i>"]
        L2["2. Scoring Plugins<br/><i>Python — implement a scoring interface</i>"]
        L3["3. Context Discovery Plugins<br/><i>Python — research strategies</i>"]
        L4["4. Data Source Connectors<br/><i>Python — AD, Entra, Okta, AWS IAM...</i>"]
        L5["5. Export/Integration Plugins<br/><i>Python — SIEM, SOAR, GRC platforms</i>"]
    end

    L1 ~~~ L2 ~~~ L3 ~~~ L4 ~~~ L5

    style L1 fill:#4ade80,color:#000
    style L2 fill:#86efac,color:#000
    style L3 fill:#fde047,color:#000
    style L4 fill:#fdba74,color:#000
    style L5 fill:#fca5a5,color:#000

1. Classifier Packs¶

The most important contribution type. Zero code required — just YAML files.

Structure¶

# classifiers/community/banking-nl.yaml
pack:
  id: "banking-nl"
  name: "Dutch Banking Sector"
  version: "1.2.0"
  description: "Classifiers for Dutch banks under DNB/ECB supervision"
  author: "contributor-handle"
  license: "Apache-2.0"
  industries: ["banking", "financial-services"]
  regions: ["NL", "EU"]
  tags: ["DNB", "DORA", "PSD2", "SWIFT"]
  extends:
    - "universal"
    - "financial-services"
  engine_version: ">=1.0.0"

classifiers:
  groups:
    - id: "bank-nl-swift-ops"
      category: "critical-infrastructure"
      name_patterns: ["swift", "sag", "alliance.?lite", "payment.?gateway"]
      base_score: 90
      rationale: "SWIFT-related groups — interbank payment infrastructure"
      references:
        - "https://www.swift.com/myswift/customer-security-programme-csp"
      mitre_attack: ["T1078"]

  users:
    - id: "bank-nl-mlro"
      category: "regulatory-role"
      title_patterns: ["MLRO", "money.?laundering.?reporting", "wwft.?officer"]
      base_score: 75
      rationale: "Wwft/AML reporting officer — regulatory accountability"

Community Registry¶

classifiers/
├── universal/
│   └── universal.yaml
├── industry/
│   ├── banking/
│   │   ├── banking-general.yaml
│   │   ├── banking-nl.yaml
│   │   ├── banking-swift.yaml
│   │   └── banking-trading.yaml
│   ├── healthcare/
│   │   ├── healthcare-general.yaml
│   │   ├── healthcare-nl.yaml
│   │   └── healthcare-epic.yaml
│   ├── critical-infrastructure/
│   │   ├── port-authority.yaml
│   │   ├── energy.yaml
│   │   └── ot-scada-general.yaml
│   ├── government/
│   │   ├── government-nl.yaml
│   │   └── municipality-nl.yaml
│   └── education/
│       └── university-nl.yaml
├── compliance/
│   ├── nis2.yaml
│   ├── dora.yaml
│   ├── gdpr.yaml
│   └── iso27001.yaml
└── technology/
    ├── sap.yaml
    ├── microsoft-365.yaml
    └── azure-infrastructure.yaml

Pack Composition¶

An admin selects which packs to activate. Packs can extend each other. The engine merges them with precedence: organization-specific > industry > compliance > technology > universal. When two classifiers match the same entity, the highest score wins.

Contributing a Pack¶

Fork the repo
Create a YAML file following the schema
Run the built-in validator: idrisk validate classifiers/my-pack.yaml
Submit a PR

2. Scoring Plugins¶

For contributors who want to add new detection logic beyond pattern matching.

Plugin Interface¶

class ScoringPlugin(ABC):

    @property
    @abstractmethod
    def id(self) -> str: ...

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @property
    @abstractmethod
    def version(self) -> str: ...

    @property
    def entity_types(self) -> List[EntityType]:
        return list(EntityType)

    @property
    def default_weight(self) -> float:
        return 0.5

    @abstractmethod
    def score(self, entity: dict, context: 'ScoringContext') -> Optional[ScoreContribution]:
        """Score a single entity. Return None if no opinion."""
        pass

Example Plugins¶

Toxic Access Combinations¶

Detects users who hold group memberships that together create separation-of-duty violations.

class ToxicCombinationsPlugin(ScoringPlugin):
    """
    Example: A user in both "AP-Invoice-Approve" and "AP-Payment-Execute"
    can both approve and execute payments — classic SoD violation.
    """
    id = "toxic-combinations"
    name = "Toxic Access Combinations"
    version = "1.0.0"
    entity_types = [EntityType.USER]
    default_weight = 0.7

    default_toxic_pairs = [
        {
            "name": "Payment SoD",
            "group_a_patterns": ["invoice.?approv", "payment.?approv"],
            "group_b_patterns": ["payment.?execut", "payment.?release"],
            "severity": 85,
            "rationale": "Can both approve and execute payments"
        },
        {
            "name": "User Lifecycle SoD",
            "group_a_patterns": ["user.?creat", "account.?provision"],
            "group_b_patterns": ["access.?approv", "role.?assign"],
            "severity": 70,
            "rationale": "Can both create accounts and grant them access"
        }
    ]

Other Plugin Ideas¶

Plugin	Entity Types	Detects
Shadow IT Detector	Groups	Groups created by non-IT users, mail-enabled, no IT owner, no governance
Orphaned Access	Groups, Apps	Groups with no owner (last owner left), disabled app owners, apps granting access to disabled groups
Blast Radius Calculator	Users, Groups	Impact scope of compromise — how many systems reachable, how many users affected
Stale Privileged Access	Users	Privileged access with no recent use — Global Admin with no sign-in for 60 days
Naming Convention Anomaly	Groups, Users	Entities that deviate from observed naming patterns — statistical anomaly detection

Plugin Discovery¶

plugins/
├── builtin/
│   ├── classifier_matcher.py      # Layer 1
│   ├── membership_analyzer.py     # Layer 2
│   ├── structural_analyzer.py     # Layer 3
│   └── risk_propagation.py        # Layer 4
├── community/
│   ├── toxic_combinations.py
│   ├── shadow_it_detector.py
│   └── blast_radius.py
└── custom/
    └── my_org_specific_scorer.py

3. Context Discovery Plugins¶

Extend how the system researches organizations in Phase 1.

class DiscoveryPlugin(ABC):

    @abstractmethod
    async def discover(self, customer_domain: str, customer_name: str,
                       existing_profile: dict) -> DiscoveryResult:
        """Research the organization and return findings."""
        pass

Example Discovery Plugins¶

Plugin	Data Sources	Discovers
KvK Lookup	Dutch Chamber of Commerce API	Legal entity type, SBI codes, employee count, subsidiaries
Regulatory Mapper	Web, regulation databases	Applicable regulations (NIS2, DORA, NEN 7510, BIO)
Annual Report Analyzer	Web	Business segments, technology platforms, risk disclosures
Tender Scanner	TenderNed, TED (EU)	Technology purchases, migration projects, security tools

4. Data Source Connectors¶

Allow the tool to collect identity data from sources beyond AD/Entra ID.

class DataSourceConnector(ABC):

    @abstractmethod
    async def collect(self, config: dict) -> CollectionResult:
        """Collect entities and return normalized data."""
        pass

    @abstractmethod
    def get_required_permissions(self) -> List[str]:
        """Permissions needed (displayed during setup)."""
        pass

Planned Connectors¶

Category	Connectors
Builtin	Entra ID, Active Directory, Entra Governance
Community	Okta, AWS IAM, Google Workspace, PingIdentity, SailPoint IIQ, CyberArk

All connectors output a normalized entity model so scoring plugins work regardless of source.

5. Export/Integration Plugins¶

Output risk scores to other platforms:

Category	Exports
Builtin	CSV, Excel, JSON
Community	Microsoft Sentinel (watchlists), Splunk (lookup tables), ServiceNow CMDB, TopDesk, Power BI

Plugin Manifest¶

Every plugin ships with a manifest:

plugin:
  id: "toxic-combinations"
  name: "Toxic Access Combinations"
  version: "1.0.0"
  type: "scoring"
  author: "github-handle"
  license: "Apache-2.0"

  requires:
    entity_types: ["user", "group"]
    data_fields: ["group.members", "user.memberships"]

  config_schema:
    type: object
    properties:
      toxic_pairs_file:
        type: string
      severity_threshold:
        type: number
        default: 50

  default_enabled: true
  default_weight: 0.7

Contribution Guidelines¶

Type	Skill Needed	Review Process
Classifier pack	YAML + domain knowledge	Peer review for quality/accuracy
Toxic combination rules	Domain knowledge	Peer review
Scoring plugin	Python + identity knowledge	Code review + tests required
Discovery plugin	Python + API knowledge	Code review + tests
Data source connector	Python + platform API	Code review + extensive testing
Core engine changes	Deep architecture knowledge	Maintainer review

Classifier Quality Standards¶

Community classifier packs should:

Include clear rationale for every classifier
Include references (links to regulations, vendor docs, best practices)
Use tested regex patterns (no overly broad matches)
Include both English and local-language variants where applicable
Not include any organization-specific data
Be reviewed by at least one practitioner from the relevant industry

Implementation Priority¶

Phase A: Core + Classifiers (MVP)¶

Core engine with plugin interfaces
Universal classifier pack
Entra ID + AD connectors
CLI for scoring + export
Basic CSV/JSON export

Phase B: Intelligence Layer¶

LLM-assisted context discovery
3-5 industry classifier packs
3-5 community scoring plugins
Plugin scaffold tooling

Phase C: Community & Integration¶

Web UI for plugin/classifier management
Community plugin registry
Export plugins (Sentinel, Power BI)
Synthetic data generator for testing

Phase D: Multi-Platform¶

Okta connector
AWS IAM connector
Cross-platform correlation
Multi-source risk aggregation