Risk Scoring Data Model¶
Identity Atlas stores risk intelligence in a set of dedicated SQL tables that sit alongside the core data model. These tables hold the inputs that drive scoring (org context, classifier patterns, correlation rules) and the outputs (per-entity scores, clusters, overrides).
These tables are created by migration 004_risk_scoring.sql when the web container starts.
Conceptual Overview¶
Inputs (one-time / periodic) Outputs (per scoring run)
───────────────────────────── ─────────────────────────
GraphRiskProfiles RiskScores
└─ org context from LLM └─ score per entity (all types)
GraphRiskClassifiers directScore
└─ regex patterns from LLM membershipScore
GraphCorrelationRulesets structuralScore
└─ cross-system match rules propagatedScore
overrideAdjustment
GraphResourceClusters
└─ groups of related resources
GraphResourceClusterMembers
└─ which resources belong to each cluster
Scoring reads the inputs and writes the outputs. The inputs change only when you call New-FGRiskProfile / New-FGRiskClassifiers / New-FGCorrelationRuleset. The outputs are overwritten on every Invoke-FGRiskScoring run (with analyst overrides preserved).
Entity Relationship Diagram¶
erDiagram
RiskScores {
guid entityId PK
string entityType PK
int riskScore
string riskTier
int riskDirectScore
int riskMembershipScore
int riskStructuralScore
int riskPropagatedScore
string riskExplanation
string riskClassifierMatches
int riskOverride
string riskOverrideReason
datetime riskScoredAt
}
GraphRiskProfiles {
string id PK
string domain
string industry
string country
string llmProvider
datetime generatedAt
string profileJson
}
GraphRiskClassifiers {
string id PK
string version
string customer
datetime generatedAt
string llmProvider
string classifierJson
}
GraphCorrelationRulesets {
string id PK
string classifierJson
}
GraphResourceClusters {
string id PK
string displayName
string clusterType
int aggregateRiskScore
string riskTier
string ownerUserId
string ownerDisplayName
datetime scoredAt
}
GraphResourceClusterMembers {
string clusterId PK
string resourceType PK
string resourceId PK
string resourceName
int resourceRiskScore
bool isNonProduction
}
GraphResourceClusters ||--o{ GraphResourceClusterMembers : "contains"
The RiskScores table links back to the core model by entityId matching the id column on Principals, Resources, Identities, or Contexts. There is no FK constraint — this is intentional, so risk scores can be queried independently of whether the source entity still exists.
Table Reference¶
RiskScores¶
The central output table. One row per entity per entity type. Updated by every Invoke-FGRiskScoring run. Analyst overrides are applied on top and preserved across re-scoring.
| Property | Value |
|---|---|
| Primary Key | Composite: entityId + entityType |
| Audit history | No (overwritten each scoring run; analyst overrides preserved) |
| Created by | Migration 004_risk_scoring.sql |
entityType values:
| Value | Links to |
|---|---|
Principal |
Principals.id |
Resource |
Resources.id |
Context |
Contexts.id |
Identity |
Identities.id |
Score columns:
| Column | Type | Description |
|---|---|---|
riskScore |
INT | Final effective score (0–100): sum of sub-scores + override, clamped |
riskTier |
NVARCHAR(20) | Critical / High / Medium / Low / Minimal / None |
riskDirectScore |
INT | Layer 1: direct classifier match contribution |
riskMembershipScore |
INT | Layer 2: risk inherited from group/resource memberships |
riskStructuralScore |
INT | Layer 3: hygiene signals (stale sign-in, no description, etc.) |
riskPropagatedScore |
INT | Layer 4: risk propagated from children/members |
riskExplanation |
NVARCHAR(MAX) | JSON array of human-readable factor descriptions |
riskClassifierMatches |
NVARCHAR(MAX) | JSON array of classifier IDs that matched this entity |
riskOverride |
INT | Analyst adjustment (−50 to +50). NULL if no override. |
riskOverrideReason |
NVARCHAR(500) | Required justification supplied with the override |
riskScoredAt |
DATETIME2 | When this score row was last written by the scoring engine |
Denormalization: riskScore and riskTier are also written back to Principals.riskScore / Principals.riskTier and Resources.riskScore / Resources.riskTier for fast joins without touching RiskScores.
GraphRiskProfiles¶
Stores the organizational context discovered by New-FGRiskProfile. The profile describes your organization's industry, country, sensitive system types, and risk posture — used by the LLM to generate classifiers that are meaningful for your specific context.
| Property | Value |
|---|---|
| Primary Key | id (NVARCHAR — usually the domain name) |
| Audit history | No |
| Created by | Save-FGRiskProfile (called by New-FGRiskProfile) |
Key columns: domain, industry, country, llmProvider, profileJson (full profile as JSON).
What is sent to the LLM
Only public organizational context is sent (domain, inferred industry, known system names). No user names, email addresses, or identity data ever leave your infrastructure. See Data Privacy.
GraphRiskClassifiers¶
Stores the regex-based detection patterns generated by New-FGRiskClassifiers. Classifiers match against displayName, description, job titles, and other string attributes to identify entities that warrant elevated risk scores.
| Property | Value |
|---|---|
| Primary Key | id (NVARCHAR — usually {domain}-v{version}) |
| Audit history | No |
| Created by | Save-FGRiskClassifiers (called by New-FGRiskClassifiers) |
Key columns: version, customer, llmProvider, classifierJson (full classifier ruleset as JSON).
The classifierJson contains three sections:
| Section | Targets |
|---|---|
groups |
Resources — matches against displayName and description |
users |
Human principals — matches against displayName, userPrincipalName, jobTitle |
agents |
Non-human principals (ServicePrincipal, ManagedIdentity, AIAgent) |
When no custom classifiers are found in SQL, Invoke-FGRiskScoring falls back to a set of built-in universal classifiers that work for any organization.
GraphCorrelationRulesets¶
Stores the account correlation rules generated by New-FGCorrelationRuleset. These rules define how principals from different source systems are matched to the same real person (Identity).
| Property | Value |
|---|---|
| Primary Key | id |
| Audit history | No |
| Created by | Save-FGCorrelationRuleset (called by New-FGCorrelationRuleset) |
Key columns: classifierJson (full ruleset as JSON with match strategies, confidence weights, and tie-breaking rules).
GraphResourceClusters¶
Groups of related resources identified by Save-FGResourceClusters. A cluster aggregates resources that share a classifier match or a common display name stem. Each cluster has an aggregate risk score and an optional analyst-assigned owner for remediation accountability.
| Property | Value |
|---|---|
| Primary Key | id |
| Audit history | No |
| Created by | Save-FGResourceClusters |
Key columns:
| Column | Description |
|---|---|
displayName |
Human-readable cluster name (derived from classifier or shared stem) |
clusterType |
Classifier (grouped by risk classifier) or NameStem (grouped by display name prefix) |
aggregateRiskScore |
Weighted average of member resource risk scores |
riskTier |
Tier of the aggregate score |
ownerUserId / ownerDisplayName |
Analyst-assigned owner for remediation tracking |
ownerAssignedBy |
Identity of the analyst who assigned the owner |
scoredAt |
When the cluster was last computed |
GraphResourceClusterMembers¶
The detail rows for each cluster — one row per resource in the cluster.
| Property | Value |
|---|---|
| Primary Key | Composite: clusterId + resourceType + resourceId |
| Audit history | No |
| Created by | Save-FGResourceClusters |
Key columns: resourceName, resourceRiskScore, resourceRiskTier, isNonProduction (BIT — non-production resources are included but weighted down in the aggregate), matchedOn (which field matched the classifier), matchDetail.
Initialization Order¶
# 1. Create the RiskScores table (also adds riskScore/riskTier columns to Principals and Resources)
Initialize-FGRiskScoreTables
# 2. Generate org context profile (one-time, contacts LLM)
New-FGRiskProfile -Domain "yourcompany.com" -LLMProvider Anthropic -LLMApiKey $key
# 3. Generate classifiers from the profile (one-time, contacts LLM)
New-FGRiskClassifiers
# 4. Score all entities (run after each sync)
Invoke-FGRiskScoring
# 5. Cluster related resources (optional, run after scoring)
Save-FGResourceClusters
Steps 2 and 3 only need to run once, or when your organization's risk posture changes significantly. Steps 4 and 5 run on a schedule alongside your regular data sync.
Score History¶
RiskScores is overwritten by each scoring run. Analyst overrides (riskOverride, riskOverrideReason) are preserved across re-scoring runs.
The other risk tables (GraphRiskProfiles, GraphRiskClassifiers, GraphCorrelationRulesets, GraphResourceClusters, GraphResourceClusterMembers) are also overwritten in place when regenerated.