Data Model¶

Identity Atlas uses a unified data model (v3.2) that stores all authorization entities — from any source system — in a consistent structure backed by PostgreSQL with trigger-based audit history.

Core Design Principles¶

Four principles drive the data model design:

Universal Any authorization source maps to the same tables. Entra ID groups, SAP roles, Omada business roles, and custom CSV imports all become Resources and Principals in the same schema. No source-specific tables.

Audited All core tables are tracked by a shared _history audit table populated by PostgreSQL triggers. Every insert, update, and delete is recorded as a JSONB snapshot, giving you a complete change history for any entity. The trigger skips unchanged rows during re-syncs to avoid bloating the audit log.

Core + JSON Frequently queried attributes (displayName, resourceType, department) are real SQL columns with indexes. System-specific fields that vary by source live in an extendedAttributes TEXT (JSON) column. This gives you index performance on hot paths without a rigid, source-specific schema.

Unified business roles Business roles are not stored in a separate table. They are Resources with resourceType = 'BusinessRole'. Their assignments are ResourceAssignments with assignmentType = 'Governed'. Their resource grants are ResourceRelationships with relationshipType = 'Contains'. The result is a single set of views, risk scores, and queries that apply to all resource types equally.

Conceptual Hierarchy¶

The model is organized around real people, not system accounts:

Identities (real persons — the governance anchor)
  └─ Principals (accounts in source systems, via IdentityMembers)
       └─ ResourceAssignments (what access each account holds)
            └─ Resources (what is being accessed)

Contexts (organizational/structural trees — system-scoped)
  └─ Identities  (contextId: HR org unit, department)
  └─ Principals  (contextId: AD OU, Entra admin unit)
  └─ Resources   (contextId: resource classification, location)

Systems (technical sync root — each Principal, Resource, and Context belongs to one System)

Why Identity is the root: A real person (Identity) may have multiple accounts (Principals) across different source systems. The organizational context — department, team, cost center — belongs to the person, not to each individual account. Identities carry a contextId that represents their place in the HR/governance org structure.

Why Principals and Resources also have a contextId: Source systems have their own organizational structures that are distinct from the HR org chart. Active Directory has OUs, Entra ID has administrative units, resource management systems have classification hierarchies. These structures are valuable for risk scoring, policy evaluation, and reporting — but they belong to the system, not to the person. A Principal's contextId captures its position in the source system's own hierarchy, independent of the Identity's HR context.

Multiple independent context trees: Each (systemId, contextType) combination forms an independent tree. An HR system produces a Department tree linked to Identities. Active Directory produces an OU tree linked to Principals. A resource classification system produces a Category tree linked to Resources. These trees coexist in the same Contexts table, scoped by systemId.

Why Systems are the sync root: At ingestion time, every Principal, Resource, and Context must belong to a System. This enables multi-tenant and multi-system deployments without ambiguity.

Entity Relationship Diagram¶

erDiagram
    Identities {
        guid id PK
        string displayName
        string email
        string employeeId
        string department
        string jobTitle
        guid contextId FK
        guid primaryPrincipalId FK
        guid managerIdentityId FK
        bool isHrAnchored
        int correlationConfidence
        bool analystVerified
    }
    IdentityMembers {
        guid identityId FK
        guid principalId FK
    }
    Systems {
        int id PK
        string displayName
        string systemType
        bool enabled
    }
    Resources {
        guid id PK
        int systemId FK
        string displayName
        string resourceType
        guid contextId FK
        string extendedAttributes
        guid catalogId
        decimal riskScore
    }
    Principals {
        guid id PK
        int systemId FK
        string displayName
        string principalType
        guid contextId FK
        string extendedAttributes
        decimal riskScore
    }
    Contexts {
        guid id PK
        int systemId FK
        string displayName
        string contextType
        guid parentContextId FK
    }
    ResourceAssignments {
        guid resourceId FK
        guid principalId FK
        string assignmentType
        int systemId FK
        string principalType
        string complianceState
        string policyId
        string state
        string extendedAttributes
    }
    ResourceRelationships {
        guid parentResourceId FK
        guid childResourceId FK
        string relationshipType
        string roleName
    }
    PrincipalActivity {
        guid principalId FK
        guid resourceId FK
        int systemId FK
        string activityType
        datetime lastActivityDateTime
        int activityCount
    }
    RiskScores {
        guid entityId
        string entityType
        decimal riskScore
        string riskTier
        string classifierMatches
    }

    Identities ||--o{ IdentityMembers : "aggregates"
    Identities }o--o| Contexts : "HR org context"
    Principals ||--o{ IdentityMembers : "linked via"
    Principals }o--o| Contexts : "system org context"
    Resources }o--o| Contexts : "classification context"
    Systems ||--o{ Resources : "hosts"
    Systems ||--o{ Principals : "hosts"
    Systems ||--o{ Contexts : "scopes"
    Resources ||--o{ ResourceAssignments : "granted via"
    Principals ||--o{ ResourceAssignments : "receives"
    Resources ||--o{ ResourceRelationships : "parent in"
    Resources ||--o{ ResourceRelationships : "child in"
    Principals ||--o{ PrincipalActivity : "has activity"
    Resources ||--o{ PrincipalActivity : "accessed in"

Table Reference¶

Identities¶

Real persons aggregated across multiple accounts and source systems. An Identity is the result of account correlation: one human may have an Entra ID user, a service account, and a privileged admin account — all linked to one Identity record.

Identities carry the contextId because organizational context (department, team) belongs to the person, not to their individual accounts.

Property	Value
Primary Key	`id` GUID
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: displayName, email, employeeId, department, jobTitle, contextId, primaryPrincipalId, managerIdentityId.

Identity correlation columns: isHrAnchored, hrAccountId, accountCount, accountTypes (JSONB), correlationSignals (JSONB), correlationConfidence, correlatedAt, orphanStatus, analystVerified, analystNotes.

IdentityMembers¶

The join table between Identities and Principals. One identity links to one or more principals, potentially across different source systems.

Property	Value
Primary Key	Composite: `identityId` + `principalId`
Audit history	No
Created by	Migration `001_core_schema.sql`

Contexts¶

Organizational and structural groupings from any source system. A Context can represent an HR department, an AD organizational unit, an Entra ID administrative unit, a resource classification category, or any other hierarchy. The contextType column discriminates between them; the systemId column scopes them to a source system.

Multiple independent context trees: Each (systemId, contextType) combination forms its own tree. These trees are independent — an HR department tree, an AD OU tree, and a resource classification tree all coexist in the same table without interfering.

Source System	contextType	Linked To	Example
HR system (CSV)	`OrgUnit`	Identities	Finance > Accounts Payable > Invoice Processing
Active Directory	`OrgUnit`	Principals	corp.local > Users > Amsterdam > Admins
Entra ID	`AdministrativeUnit`	Principals	AU-Netherlands, AU-Germany
Entra ID	`Department`	Identities	Calculated from user.department field
Resource mgmt	`Classification`	Resources	Confidential > Finance Data > Payment Systems
Custom	Any string	Any	Fully extensible

Which entities carry a contextId:

Identities — HR/governance org context. "This person belongs to Finance > Accounts Payable."
Principals — Source system org context. "This AD account lives in OU=Admins,OU=Amsterdam."
Resources — Classification or grouping context. "This SharePoint site is classified as Confidential > Finance Data."

Each entity has a single contextId column. If an entity needs to participate in multiple context trees (e.g., an Identity has both an HR department and a location), use the primary governance context as contextId and store secondary context references in extendedAttributes.

Context and policy-driven access: When an assignment is driven by an Identity's context (e.g., "all Finance employees get access to SharePoint Finance"), the governing rule is captured in AssignmentPolicies.policyConditions as a JSON condition referencing the contextId. The assignment row in ResourceAssignments records the result; the policy row records the rule. This keeps assignments clean while the "why" remains auditable through the policy chain.

Property	Value
Primary Key	`id` GUID
Audit history	No
Created by	Migration `001_core_schema.sql`

Key columns: displayName, contextType, systemId, parentContextId (self-referencing for hierarchy).

contextType values: Department, Division, CostCenter, Team, Office, Project, Location, OrgUnit, AdministrativeUnit, Classification, or any custom string.

Systems¶

Represents a connected authorization source. Every resource and principal is owned by exactly one system.

Property	Value
Primary Key	`id` SERIAL
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: displayName, systemType (e.g. EntraID, Omada, SAP, CSV), enabled.

Resources¶

Any permission-granting entity: Entra ID groups, directory roles, application roles, business roles, SharePoint sites, Azure RBAC roles, or any custom type. The resourceType column discriminates between them.

Property	Value
Primary Key	`id` GUID
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: displayName, resourceType, systemId, contextId (optional — classification or grouping context), extendedAttributes (JSON), catalogId, isHidden, riskScore.

ResourceAssignments¶

Captures who has access to what, and how. The assignmentType column distinguishes direct membership from PIM-eligible access from governed (business-role-driven) access.

Property	Value
Primary Key	Composite: `resourceId` + `principalId` + `assignmentType`
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: assignmentType, systemId, principalType, complianceState, policyId, state, assignmentStatus, expirationDateTime, extendedAttributes (JSONB).

ResourceRelationships¶

Resource-to-resource links. Used for two purposes: Contains links a business role to the resources it grants; GrantsAccessTo expresses that holding one resource implies access to another.

Property	Value
Primary Key	Composite: `parentResourceId` + `childResourceId` + `relationshipType`
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: relationshipType, roleName, roleOriginSystem.

Principals¶

All identity types from any system. The principalType column distinguishes human accounts from service principals, managed identities, AI agents, and more.

Property	Value
Primary Key	`id` GUID
Audit history	Yes (via `_history` trigger)
Created by	Migration `001_core_schema.sql`

Key columns: displayName, principalType, systemId, contextId (optional — source system org structure, e.g. AD OU), extendedAttributes (JSON), riskScore.

PrincipalActivity¶

High-frequency activity signals: sign-ins, per-app usage, AI agent invocations. This table is intentionally not tracked by audit triggers. See Activity Data below for the reason.

Property	Value
Primary Key	Composite: `principalId` + `resourceId` + `systemId` + `activityType`
Audit history	No (upsert-based)
Created by	Migration `001_core_schema.sql`

Key columns: activityType, lastActivityDateTime, activityCount.

RiskScores¶

Risk assessment results for any entity type (Principal, Resource, Identity, Context). Written by Invoke-FGRiskScoring and updated by analyst overrides.

Property	Value
Primary Key	Composite: `entityId` + `entityType`
Audit history	No
Created by	Migration `004_risk_scoring.sql`

Key columns: riskScore, riskTier, riskDirectScore, riskMembershipScore, riskStructuralScore, riskPropagatedScore, riskClassifierMatches (JSON), riskOverride, riskOverrideReason.

Risk scoring also uses several supporting tables for inputs (org context, classifiers, correlation rules) and outputs (resource clusters). See Risk Scoring Data Model for the full picture.

principalType Values¶

The principalType column on the Principals table uses these standard values across all sync and scoring functions.

Value	What it covers	Populated by
`User`	Interactive human user accounts	Entra ID crawler, CSV crawler
`ServicePrincipal`	App registration service principals	Entra ID crawler
`ManagedIdentity`	Azure resource-attached identities (system or user-assigned)	Entra ID crawler
`WorkloadIdentity`	Federated credential identities (GitHub Actions, AKS workloads)	Entra ID crawler, CSV crawler
`AIAgent`	AI agents: Copilot Studio, Azure OpenAI, custom bots	Entra ID crawler (tag/name auto-detection), CSV crawler
`ExternalUser`	Guest / B2B accounts from another tenant	CSV crawler
`SharedMailbox`	Shared mailboxes and room/equipment accounts	CSV crawler

Risk scoring behavior by principalType

User principals receive the full set of stale sign-in, never-signed-in, and guest-account checks. Non-human types (ServicePrincipal, ManagedIdentity, WorkloadIdentity, AIAgent) receive structural signals only — no stale sign-in checks. All types participate in direct classifier matching, membership analysis, and risk propagation.

resourceType Values¶

The resourceType column on the Resources table is a free-form string. These are the standard values used by the built-in sync functions.

Value	What it covers
`EntraGroup`	Entra ID security groups and Microsoft 365 groups
`EntraDirectoryRole`	Entra ID directory roles (Global Administrator, etc.)
`EntraAppRole`	Application roles from enterprise app registrations
`BusinessRole`	Named entitlement bundles from any IGA platform
`SharePointSite`	SharePoint sites (via CSV import)
`AzureRBACRole`	Azure RBAC role assignments (via CSV import)
Custom	Any string — fully extensible for any authorization source

Extending resourceType

You can use any string value for custom source systems. The model does not enforce an enum — resourceType is TEXT. Use a consistent naming convention such as SystemPrefix_TypeName (e.g., SAP_Role, Pathlock_Permission) so queries and views remain readable.

assignmentType Values¶

The assignmentType column on ResourceAssignments describes how the assignment was created and what it means.

Value	Meaning
`Direct`	Direct group membership
`Owner`	Group owner relationship
`Eligible`	PIM-eligible membership — granted but not yet activated
`Governed`	Assigned through a business role or access package
Custom	Any string for CSV-imported assignments

Source System Mapping¶

The same three tables (Resources, Principals, ResourceAssignments) absorb data from any source. The crawler that writes the data and the resourceType / assignmentType values are the only things that differ.

Source System	Crawler	resourceType	principalType	assignmentType
Entra ID groups	Entra ID crawler	`EntraGroup`	`User`	`Direct` / `Owner` / `Eligible`
Entra ID directory roles	Entra ID crawler	`EntraDirectoryRole`	`User`	`Direct`
Entra ID app roles	Entra ID crawler	`EntraAppRole`	`User` / `ServicePrincipal`	`Direct`
Entra ID access packages	Entra ID crawler	`BusinessRole`	—	`Governed`
Omada / SailPoint	CSV crawler	`BusinessRole`	Any	`Governed`
SAP / Pathlock	CSV crawler	Any	Any	Any
Custom system	CSV crawler (or your own via the Ingest API)	Any	Any	Any

Activity Data (PrincipalActivity)¶

PrincipalActivity is physically separated from Principals by design, even though it describes the same entities.

Why not store activity in Principals?

Principals is tracked by the _history audit trigger, which records a JSONB snapshot every time a row changes. Sign-in timestamps change daily — sometimes hourly — for active accounts. Storing lastSignInDateTime on Principals would generate enormous audit history for data that is not meaningful to review. A user's last sign-in two minutes ago is not a material change that anyone needs to audit.

What PrincipalActivity does instead:

Each row stores the latest known activity per (principalId, resourceId, systemId, activityType) combination. Sync functions upsert into this table, overwriting the previous value in place. No history is retained — the table is always a current snapshot.

Activity types:

activityType	Source	Meaning
`SignIn`	Entra ID audit log	Most recent interactive or non-interactive sign-in
`AppSignIn`	Entra ID audit log	Last sign-in to a specific application (resourceId = app)
`Invocation`	AI platform telemetry	Last invocation of an AI agent

Use in risk scoring:

The risk engine queries PrincipalActivity to detect:

Stale accounts — User principals with no SignIn activity in 90+ days
Ghost app roles — EntraAppRole assignments where the user has never signed in to the app
Active high-privilege usage — principals actively using sensitive resources (reduces risk score)
AI agent dormancy — AIAgent principals with no recent Invocation activity