Skip to content

Data Model

Identity Atlas uses a unified data model (v3.2) that stores all authorization entities — from any source system — in a consistent structure backed by PostgreSQL with trigger-based audit history.


Core Design Principles

Four principles drive the data model design:

Universal Any authorization source maps to the same tables. Entra ID groups, SAP roles, Omada business roles, and custom CSV imports all become Resources and Principals in the same schema. No source-specific tables.

Audited All core tables are tracked by a shared _history audit table populated by PostgreSQL triggers. Every insert, update, and delete is recorded as a JSONB snapshot, giving you a complete change history for any entity. The trigger skips unchanged rows during re-syncs to avoid bloating the audit log.

Core + JSON Frequently queried attributes (displayName, resourceType, department) are real SQL columns with indexes. System-specific fields that vary by source live in an extendedAttributes TEXT (JSON) column. This gives you index performance on hot paths without a rigid, source-specific schema.

Unified business roles Business roles are not stored in a separate table. They are Resources with resourceType = 'BusinessRole'. Their assignments are ResourceAssignments with assignmentType = 'Governed'. Their resource grants are ResourceRelationships with relationshipType = 'Contains'. The result is a single set of views, risk scores, and queries that apply to all resource types equally.


Conceptual Hierarchy

The model is organized around real people, not system accounts:

Identities (real persons — the governance anchor)
  └─ Principals (accounts in source systems, via IdentityMembers)
       └─ ResourceAssignments (what access each account holds)
            └─ Resources (what is being accessed)

Contexts (organizational/structural trees — system-scoped)
  └─ Identities  (contextId: HR org unit, department)
  └─ Principals  (contextId: AD OU, Entra admin unit)
  └─ Resources   (contextId: resource classification, location)

Systems (technical sync root — each Principal, Resource, and Context belongs to one System)

Why Identity is the root: A real person (Identity) may have multiple accounts (Principals) across different source systems. The organizational context — department, team, cost center — belongs to the person, not to each individual account. Identities carry a contextId that represents their place in the HR/governance org structure.

Why Principals and Resources also have a contextId: Source systems have their own organizational structures that are distinct from the HR org chart. Active Directory has OUs, Entra ID has administrative units, resource management systems have classification hierarchies. These structures are valuable for risk scoring, policy evaluation, and reporting — but they belong to the system, not to the person. A Principal's contextId captures its position in the source system's own hierarchy, independent of the Identity's HR context.

Multiple independent context trees: Each (systemId, contextType) combination forms an independent tree. An HR system produces a Department tree linked to Identities. Active Directory produces an OU tree linked to Principals. A resource classification system produces a Category tree linked to Resources. These trees coexist in the same Contexts table, scoped by systemId.

Why Systems are the sync root: At ingestion time, every Principal, Resource, and Context must belong to a System. This enables multi-tenant and multi-system deployments without ambiguity.


Entity Relationship Diagram

erDiagram
    Identities {
        guid id PK
        string displayName
        string email
        string employeeId
        string department
        string jobTitle
        guid contextId FK
        guid primaryPrincipalId FK
        guid managerIdentityId FK
        bool isHrAnchored
        int correlationConfidence
        bool analystVerified
    }
    IdentityMembers {
        guid identityId FK
        guid principalId FK
    }
    Systems {
        int id PK
        string displayName
        string systemType
        bool enabled
    }
    Resources {
        guid id PK
        int systemId FK
        string displayName
        string resourceType
        guid contextId FK
        string extendedAttributes
        guid catalogId
        decimal riskScore
    }
    Principals {
        guid id PK
        int systemId FK
        string displayName
        string principalType
        guid contextId FK
        string extendedAttributes
        decimal riskScore
    }
    Contexts {
        guid id PK
        int systemId FK
        string displayName
        string contextType
        guid parentContextId FK
    }
    ResourceAssignments {
        guid resourceId FK
        guid principalId FK
        string assignmentType
        int systemId FK
        string principalType
        string complianceState
        string policyId
        string state
        string extendedAttributes
    }
    ResourceRelationships {
        guid parentResourceId FK
        guid childResourceId FK
        string relationshipType
        string roleName
    }
    PrincipalActivity {
        guid principalId FK
        guid resourceId FK
        int systemId FK
        string activityType
        datetime lastActivityDateTime
        int activityCount
    }
    RiskScores {
        guid entityId
        string entityType
        decimal riskScore
        string riskTier
        string classifierMatches
    }

    Identities ||--o{ IdentityMembers : "aggregates"
    Identities }o--o| Contexts : "HR org context"
    Principals ||--o{ IdentityMembers : "linked via"
    Principals }o--o| Contexts : "system org context"
    Resources }o--o| Contexts : "classification context"
    Systems ||--o{ Resources : "hosts"
    Systems ||--o{ Principals : "hosts"
    Systems ||--o{ Contexts : "scopes"
    Resources ||--o{ ResourceAssignments : "granted via"
    Principals ||--o{ ResourceAssignments : "receives"
    Resources ||--o{ ResourceRelationships : "parent in"
    Resources ||--o{ ResourceRelationships : "child in"
    Principals ||--o{ PrincipalActivity : "has activity"
    Resources ||--o{ PrincipalActivity : "accessed in"

Table Reference

Identities

Real persons aggregated across multiple accounts and source systems. An Identity is the result of account correlation: one human may have an Entra ID user, a service account, and a privileged admin account — all linked to one Identity record.

Identities carry the contextId because organizational context (department, team) belongs to the person, not to their individual accounts.

Property Value
Primary Key id GUID
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: displayName, email, employeeId, department, jobTitle, contextId, primaryPrincipalId, managerIdentityId.

Identity correlation columns: isHrAnchored, hrAccountId, accountCount, accountTypes (JSONB), correlationSignals (JSONB), correlationConfidence, correlatedAt, orphanStatus, analystVerified, analystNotes.


IdentityMembers

The join table between Identities and Principals. One identity links to one or more principals, potentially across different source systems.

Property Value
Primary Key Composite: identityId + principalId
Audit history No
Created by Migration 001_core_schema.sql

Contexts

Organizational and structural groupings from any source system. A Context can represent an HR department, an AD organizational unit, an Entra ID administrative unit, a resource classification category, or any other hierarchy. The contextType column discriminates between them; the systemId column scopes them to a source system.

Multiple independent context trees: Each (systemId, contextType) combination forms its own tree. These trees are independent — an HR department tree, an AD OU tree, and a resource classification tree all coexist in the same table without interfering.

Source System contextType Linked To Example
HR system (CSV) OrgUnit Identities Finance > Accounts Payable > Invoice Processing
Active Directory OrgUnit Principals corp.local > Users > Amsterdam > Admins
Entra ID AdministrativeUnit Principals AU-Netherlands, AU-Germany
Entra ID Department Identities Calculated from user.department field
Resource mgmt Classification Resources Confidential > Finance Data > Payment Systems
Custom Any string Any Fully extensible

Which entities carry a contextId:

  • Identities — HR/governance org context. "This person belongs to Finance > Accounts Payable."
  • Principals — Source system org context. "This AD account lives in OU=Admins,OU=Amsterdam."
  • Resources — Classification or grouping context. "This SharePoint site is classified as Confidential > Finance Data."

Each entity has a single contextId column. If an entity needs to participate in multiple context trees (e.g., an Identity has both an HR department and a location), use the primary governance context as contextId and store secondary context references in extendedAttributes.

Context and policy-driven access: When an assignment is driven by an Identity's context (e.g., "all Finance employees get access to SharePoint Finance"), the governing rule is captured in AssignmentPolicies.policyConditions as a JSON condition referencing the contextId. The assignment row in ResourceAssignments records the result; the policy row records the rule. This keeps assignments clean while the "why" remains auditable through the policy chain.

Property Value
Primary Key id GUID
Audit history No
Created by Migration 001_core_schema.sql

Key columns: displayName, contextType, systemId, parentContextId (self-referencing for hierarchy).

contextType values: Department, Division, CostCenter, Team, Office, Project, Location, OrgUnit, AdministrativeUnit, Classification, or any custom string.


Systems

Represents a connected authorization source. Every resource and principal is owned by exactly one system.

Property Value
Primary Key id SERIAL
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: displayName, systemType (e.g. EntraID, Omada, SAP, CSV), enabled.


Resources

Any permission-granting entity: Entra ID groups, directory roles, application roles, business roles, SharePoint sites, Azure RBAC roles, or any custom type. The resourceType column discriminates between them.

Property Value
Primary Key id GUID
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: displayName, resourceType, systemId, contextId (optional — classification or grouping context), extendedAttributes (JSON), catalogId, isHidden, riskScore.


ResourceAssignments

Captures who has access to what, and how. The assignmentType column distinguishes direct membership from PIM-eligible access from governed (business-role-driven) access.

Property Value
Primary Key Composite: resourceId + principalId + assignmentType
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: assignmentType, systemId, principalType, complianceState, policyId, state, assignmentStatus, expirationDateTime, extendedAttributes (JSONB).


ResourceRelationships

Resource-to-resource links. Used for two purposes: Contains links a business role to the resources it grants; GrantsAccessTo expresses that holding one resource implies access to another.

Property Value
Primary Key Composite: parentResourceId + childResourceId + relationshipType
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: relationshipType, roleName, roleOriginSystem.


Principals

All identity types from any system. The principalType column distinguishes human accounts from service principals, managed identities, AI agents, and more.

Property Value
Primary Key id GUID
Audit history Yes (via _history trigger)
Created by Migration 001_core_schema.sql

Key columns: displayName, principalType, systemId, contextId (optional — source system org structure, e.g. AD OU), extendedAttributes (JSON), riskScore.


PrincipalActivity

High-frequency activity signals: sign-ins, per-app usage, AI agent invocations. This table is intentionally not tracked by audit triggers. See Activity Data below for the reason.

Property Value
Primary Key Composite: principalId + resourceId + systemId + activityType
Audit history No (upsert-based)
Created by Migration 001_core_schema.sql

Key columns: activityType, lastActivityDateTime, activityCount.


RiskScores

Risk assessment results for any entity type (Principal, Resource, Identity, Context). Written by Invoke-FGRiskScoring and updated by analyst overrides.

Property Value
Primary Key Composite: entityId + entityType
Audit history No
Created by Migration 004_risk_scoring.sql

Key columns: riskScore, riskTier, riskDirectScore, riskMembershipScore, riskStructuralScore, riskPropagatedScore, riskClassifierMatches (JSON), riskOverride, riskOverrideReason.

Risk scoring also uses several supporting tables for inputs (org context, classifiers, correlation rules) and outputs (resource clusters). See Risk Scoring Data Model for the full picture.


principalType Values

The principalType column on the Principals table uses these standard values across all sync and scoring functions.

Value What it covers Populated by
User Interactive human user accounts Entra ID crawler, CSV crawler
ServicePrincipal App registration service principals Entra ID crawler
ManagedIdentity Azure resource-attached identities (system or user-assigned) Entra ID crawler
WorkloadIdentity Federated credential identities (GitHub Actions, AKS workloads) Entra ID crawler, CSV crawler
AIAgent AI agents: Copilot Studio, Azure OpenAI, custom bots Entra ID crawler (tag/name auto-detection), CSV crawler
ExternalUser Guest / B2B accounts from another tenant CSV crawler
SharedMailbox Shared mailboxes and room/equipment accounts CSV crawler

Risk scoring behavior by principalType

User principals receive the full set of stale sign-in, never-signed-in, and guest-account checks. Non-human types (ServicePrincipal, ManagedIdentity, WorkloadIdentity, AIAgent) receive structural signals only — no stale sign-in checks. All types participate in direct classifier matching, membership analysis, and risk propagation.


resourceType Values

The resourceType column on the Resources table is a free-form string. These are the standard values used by the built-in sync functions.

Value What it covers
EntraGroup Entra ID security groups and Microsoft 365 groups
EntraDirectoryRole Entra ID directory roles (Global Administrator, etc.)
EntraAppRole Application roles from enterprise app registrations
BusinessRole Named entitlement bundles from any IGA platform
SharePointSite SharePoint sites (via CSV import)
AzureRBACRole Azure RBAC role assignments (via CSV import)
Custom Any string — fully extensible for any authorization source

Extending resourceType

You can use any string value for custom source systems. The model does not enforce an enum — resourceType is TEXT. Use a consistent naming convention such as SystemPrefix_TypeName (e.g., SAP_Role, Pathlock_Permission) so queries and views remain readable.


assignmentType Values

The assignmentType column on ResourceAssignments describes how the assignment was created and what it means.

Value Meaning
Direct Direct group membership
Owner Group owner relationship
Eligible PIM-eligible membership — granted but not yet activated
Governed Assigned through a business role or access package
Custom Any string for CSV-imported assignments

Source System Mapping

The same three tables (Resources, Principals, ResourceAssignments) absorb data from any source. The crawler that writes the data and the resourceType / assignmentType values are the only things that differ.

Source System Crawler resourceType principalType assignmentType
Entra ID groups Entra ID crawler EntraGroup User Direct / Owner / Eligible
Entra ID directory roles Entra ID crawler EntraDirectoryRole User Direct
Entra ID app roles Entra ID crawler EntraAppRole User / ServicePrincipal Direct
Entra ID access packages Entra ID crawler BusinessRole Governed
Omada / SailPoint CSV crawler BusinessRole Any Governed
SAP / Pathlock CSV crawler Any Any Any
Custom system CSV crawler (or your own via the Ingest API) Any Any Any

Activity Data (PrincipalActivity)

PrincipalActivity is physically separated from Principals by design, even though it describes the same entities.

Why not store activity in Principals?

Principals is tracked by the _history audit trigger, which records a JSONB snapshot every time a row changes. Sign-in timestamps change daily — sometimes hourly — for active accounts. Storing lastSignInDateTime on Principals would generate enormous audit history for data that is not meaningful to review. A user's last sign-in two minutes ago is not a material change that anyone needs to audit.

What PrincipalActivity does instead:

Each row stores the latest known activity per (principalId, resourceId, systemId, activityType) combination. Sync functions upsert into this table, overwriting the previous value in place. No history is retained — the table is always a current snapshot.

Activity types:

activityType Source Meaning
SignIn Entra ID audit log Most recent interactive or non-interactive sign-in
AppSignIn Entra ID audit log Last sign-in to a specific application (resourceId = app)
Invocation AI platform telemetry Last invocation of an AI agent

Use in risk scoring:

The risk engine queries PrincipalActivity to detect:

  • Stale accountsUser principals with no SignIn activity in 90+ days
  • Ghost app rolesEntraAppRole assignments where the user has never signed in to the app
  • Active high-privilege usage — principals actively using sensitive resources (reduces risk score)
  • AI agent dormancyAIAgent principals with no recent Invocation activity