How we got to v5

Identity Atlas didn't start as Identity Atlas. It began as a PowerShell toolkit called FortigiGraph, evolved through four major versions, and in April 2026 was reborn as a full web-app-first product with a clean architectural reset. This doc is the short version of how that happened and why the current shape makes sense.

v1 (2022–2024) — FortigiGraph PowerShell wrapper

FortigiGraph started as a thin PowerShell wrapper around Microsoft Graph. The goal was narrow: make the Graph API bearable for identity governance work. It shipped as a PowerShell Gallery module with ~60 functions like Get-FGUser, Get-FGGroup, Get-FGAccessPackage, and the corresponding New-, Set-, Add-, Remove- variants.

Authentication was handled via interactive device-code flow or a service principal stored in a local config file. Output was whatever Graph returned — raw JSON objects that the caller would pipe into Where-Object or ConvertTo-Json. There was no database, no UI, no persistence of any kind. The whole module was about 20 files.

What it gave you: a way to script identity reviews without writing HTTP boilerplate. What it didn't give you: any way to work with the data once you had it.

v2 (2024–2025) — SQL Server persistence

The first real architectural step. We added a SQL Server backend so crawler runs could persist data and analysts could run SQL queries against historical snapshots. A new set of sync functions (Sync-FGUser, Sync-FGGroup, Sync-FGGroupMember, etc.) wrote Graph objects into typed tables: GraphUsers, GraphGroups, GraphGroupMembers. A Start-FGSync orchestrator ran all the syncs in parallel via runspace pools.

The schema used SQL Server temporal tables (SYSTEM_VERSIONING = ON) for row-level history, MERGE statements for idempotent upserts, and a set of views for recursive group membership calculations. This worked but tied us hard to SQL Server syntax — every query was T-SQL.

Deployment was Windows-centric: a PowerShell wizard (New-FGConfig) provisioned an Azure SQL Server, configured an App Registration, and wrote everything to a secure local JSON config.

v3 (2025) — Role mining UI and access packages

This was the shift from "backend toolkit" to "product". We added:

  • A React role-mining UI, served by a Node.js Express API, containerised in Docker
  • Access package coverage: Entitlement Management sync, resource role scopes, assignment policies, access reviews
  • The matrix view: a heat-map of users × groups with membership type badges (Direct/Indirect/Eligible/Owner), access-package coloring, staircase sorting, drag-and-drop row reordering
  • Materialized permission views: mat_UserPermissionAssignments refreshed at end-of-sync, because the recursive CTEs were too slow at scale

The Docker stack was three containers: a Microsoft SQL Server 2022 image, a worker container running PowerShell crawlers against the SQL Server, and a web container serving the API + UI. The worker still held direct SQL connections.

We also introduced the universal resource model: an abstraction where any authorization system — Entra ID groups, SAP roles, SharePoint sites, file-share ACLs — lands in the same four tables (Systems, Resources, Principals, ResourceAssignments). Access packages and business roles became special-cased Resources with resourceType='BusinessRole'. This was the architectural move that made later work possible.

v4 (early 2026) — Risk scoring, identity correlation, governance

The biggest v3→v4 additions were analytical:

  • LLM-assisted risk scoring: a four-layer scoring engine (direct classifier match, membership analysis, structural hygiene, cross-entity propagation), with classifiers derived from an organisation-specific profile, itself generated by an LLM from the org's public domain
  • Account correlation: matching accounts across systems to a real Identity via LLM-generated fuzzy matching rules
  • Governance compliance: access review compliance dashboard, reviewer attribution, overdue tracking
  • Crawler configuration in the UI: the interactive PowerShell wizard was retained but a new Admin → Crawlers page let you configure Graph credentials without leaving the browser
  • CSV import: a parallel crawler path for non-Graph sources (Omada Identity, SAP exports)

By the end of v4, the product had ~19 backend routes, ~15 frontend pages, ~160 PowerShell functions, and a schema with 30+ tables, views, indexes, and temporal-table history on every entity.

v5 (April 2026) — The postgres rewrite

This is the version you're looking at.

Why we rewrote it

Three forcing functions made v4 unshippable to new customers:

  1. SQL Server licensing. SQL Server Developer Edition is free for development but cannot be used in production per Microsoft's EULA. SQL Server Express has a 10 GB hard cap that's too small for the tenants we target. For any real deployment, customers needed to supply their own SQL Server licence, which added friction to every trial.

  2. Container image size and boot time. The mcr.microsoft.com/mssql/server image is ~1.5 GB and takes 30–60 seconds to become healthy. A three-service stack with SQL + web + worker made the "quick start" experience feel heavy.

  3. Worker-database coupling. The v4 worker container shipped with the SqlServer PowerShell module and held direct connections to SQL Server. This meant two containers had to know about the schema, and any schema change needed coordinated updates. It also meant the worker couldn't run in networks where direct database access was restricted.

What changed

The v5 rewrite was executed in a single ~2-week feature branch (feature/universal-resource-model). The scope:

  • PostgreSQL 16 replaces SQL Server. No licensing surface, no size limit, smaller image, faster boot. All schema moved to versioned .sql migration files under app/api/src/db/migrations/ applied automatically by the web container at startup. The sql, sql-init, and sql-table-init docker compose services collapsed into a single postgres service.

  • Temporal tables replaced with a generic _history audit table. Postgres has no direct equivalent to SQL Server's system-versioned temporal tables. We built a trigger-based replacement: a shared _history table with JSONB snapshots, populated by AFTER INSERT/UPDATE/DELETE triggers attached to every tracked entity. Querying historical state uses the same shape as the v4 FOR SYSTEM_TIME ALL queries, so detail-page "version history" sections kept working without changes to the frontend.

  • Worker loses its database driver. The v5 worker container has no SQL/postgres client library. All persistence flows through a REST API (POST /api/ingest/*) served by the web container. Ingest uses a bulk-load path: receive JSON → normalise → COPY into a temp table → upsert via INSERT ... ON CONFLICT. The worker went from "holds state" to "pure function: CSV/Graph → HTTP POST".

  • Envelope-encrypted secrets vault. LLM API keys and scraper credentials are now stored in a Secrets table with AES-256-GCM encryption: each row gets its own data key, wrapped by a master key read from IDENTITY_ATLAS_MASTER_KEY. The vault module is general-purpose; other parts of the app can adopt the same pattern.

  • In-app risk scoring. The PowerShell risk scoring helpers (New-FGRiskProfile, New-FGRiskClassifiers, Invoke-FGRiskScoring) were replaced by an in-browser wizard. LLM calls happen inside the web container via a provider abstraction supporting Anthropic Claude, OpenAI, and Azure OpenAI. URL scraping accepts internal wiki/ISMS content as additional context for profile generation. Scoring runs as a background job in the web container with polled progress updates.

  • Canonical CSV schema. The v4 CSV crawler did "best-effort" column-name matching with cascading fallbacks (Get-Col $_ 'DisplayName','_DISPLAYNAME','Name','TechName'). Every new source system added another row of guesses. v5 defines a fixed, documented schema in tools/csv-templates/schema/*.csv. Source-specific column mapping happens outside the crawler via a pre-import transform script. An example transform for Omada Identity exports is in tools/csv-templates/transforms/omada-to-identityatlas.ps1.

  • In-app LLM integration. Phase 1 of what was previously "call-LLM-via-PowerShell": a provider abstraction, the secrets vault, a conversational profile refinement wizard, and the first two layers of the scoring engine ported to JavaScript. The risk-scoring PowerShell functions in tools/riskscoring/ were retired; the directory contains stubs so the module loader doesn't fail, but all real logic moved to app/api/src/riskscoring/.

  • Automated nightly review. A new Run-NightlyAndReview.ps1 wrapper runs the existing nightly suite and, only when something fails, invokes Claude (via the Claude Code CLI in fix-it mode, or via the Anthropic API in analysis-only mode) to investigate and propose or apply fixes. All-green runs cost zero LLM tokens.

  • The product was renamed. "FortigiGraph" made sense for a PowerShell wrapper around Microsoft Graph. It made less sense for a full identity-governance platform that also imports CSV data from SAP, Omada, ServiceNow, etc. The rename to Identity Atlas happened alongside the postgres migration — the project root stays at FortigiGraph/ for one more development cycle to avoid breaking existing clones.
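To make the trigger-based history replacement concrete, here is a minimal sketch of the pattern: a helper that emits, for one tracked table, a trigger function plus an AFTER trigger writing a JSONB snapshot of the affected row into the shared audit table. The column names (entity, row_id, op, snapshot, changed_at) and the assumption of an id primary key are illustrative, not the actual Identity Atlas schema.

```javascript
// Illustrative generator for the per-table history trigger pattern:
// every tracked table gets an AFTER trigger that writes a JSONB
// snapshot of the affected row into a shared _history table.
// Column names and the `id` primary-key assumption are hypothetical.
function historyTriggerDdl(table) {
  return `
CREATE OR REPLACE FUNCTION ${table}_track_history() RETURNS trigger AS $fn$
BEGIN
  IF TG_OP = 'DELETE' THEN
    INSERT INTO _history (entity, row_id, op, snapshot, changed_at)
    VALUES ('${table}', OLD.id, TG_OP, to_jsonb(OLD), now());
    RETURN OLD;
  ELSE
    INSERT INTO _history (entity, row_id, op, snapshot, changed_at)
    VALUES ('${table}', NEW.id, TG_OP, to_jsonb(NEW), now());
    RETURN NEW;
  END IF;
END;
$fn$ LANGUAGE plpgsql;

CREATE TRIGGER ${table}_history
AFTER INSERT OR UPDATE OR DELETE ON ${table}
FOR EACH ROW EXECUTE FUNCTION ${table}_track_history();`.trim();
}
```

Point-in-time queries then filter the shared table by entity, row_id, and changed_at, which is what lets the v4-shaped "version history" sections keep working unchanged.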
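The worker's "pure function" shape — normalise source records, then hand them to the web container over HTTP, with no database driver anywhere — might look like this in outline. The endpoint path, field names, and filtering rules here are illustrative, not the real ingest contract.

```javascript
// Sketch of the v5 worker pattern: normalise, then POST. The worker
// holds no database connection or credentials; persistence lives
// entirely behind the web container's ingest API.
// Field names and the endpoint path are hypothetical.
function normaliseAssignments(systemId, rawRows) {
  return rawRows
    .filter(r => r.principal && r.resource)          // drop unusable rows
    .map(r => ({
      systemId,
      principalKey: String(r.principal).trim().toLowerCase(),
      resourceKey: String(r.resource).trim(),
      assignmentType: r.type ?? 'Direct',
    }));
}

async function postBatch(baseUrl, rows) {
  const res = await fetch(`${baseUrl}/api/ingest/resourceAssignments`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ rows }),
  });
  if (!res.ok) throw new Error(`ingest failed: ${res.status}`);
}
```

On the API side, a batch like this would land in a temp table via COPY and be upserted with INSERT ... ON CONFLICT, as described above.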
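The envelope-encryption pattern in the secrets vault can be sketched with Node's built-in crypto module: a fresh per-row data key encrypts the secret, and the master key only ever encrypts data keys. The row field layout below is illustrative, not the actual Secrets table schema.

```javascript
const crypto = require('crypto');

// Encrypt a secret under a fresh per-row data key, then wrap that data
// key under the master key. AES-256-GCM gives authenticated encryption
// for both layers. Field names are hypothetical, not the real schema.
function encryptSecret(plaintext, masterKey) {
  const dataKey = crypto.randomBytes(32);

  // Layer 1: secret encrypted under the data key.
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', dataKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();

  // Layer 2: data key wrapped (encrypted) under the master key.
  const wrapIv = crypto.randomBytes(12);
  const wrapper = crypto.createCipheriv('aes-256-gcm', masterKey, wrapIv);
  const wrappedKey = Buffer.concat([wrapper.update(dataKey), wrapper.final()]);
  const wrapTag = wrapper.getAuthTag();

  return { iv, ciphertext, tag, wrapIv, wrappedKey, wrapTag };
}

function decryptSecret(row, masterKey) {
  // Unwrap the data key, then decrypt the secret with it.
  const unwrap = crypto.createDecipheriv('aes-256-gcm', masterKey, row.wrapIv);
  unwrap.setAuthTag(row.wrapTag);
  const dataKey = Buffer.concat([unwrap.update(row.wrappedKey), unwrap.final()]);

  const decipher = crypto.createDecipheriv('aes-256-gcm', dataKey, row.iv);
  decipher.setAuthTag(row.tag);
  return Buffer.concat([decipher.update(row.ciphertext), decipher.final()]).toString('utf8');
}
```

The design pay-off: rotating the master key means re-wrapping small data keys, not re-encrypting every stored secret.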
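The canonical-schema philosophy — an explicit column map applied by the pre-import transform, failing fast instead of guessing — reduces to something like the sketch below. The source column names are made up for illustration; they are not the real Omada export or Identity Atlas schema.

```javascript
// Illustrative pre-import transform: source-specific column mapping
// happens here, outside the crawler, so the crawler never needs
// cascading column-name fallbacks. Column names are hypothetical.
const COLUMN_MAP = {
  // canonical field  <-  source column
  principalKey: 'ACCOUNTNAME',
  displayName: 'DISPLAYNAME',
  resourceKey: 'RESOURCEID',
};

function transformRow(sourceRow) {
  const out = {};
  for (const [canonical, sourceCol] of Object.entries(COLUMN_MAP)) {
    if (!(sourceCol in sourceRow)) {
      // Fail fast: a missing column is a mapping bug, not something
      // to paper over with best-effort guesses.
      throw new Error(`missing required column: ${sourceCol}`);
    }
    out[canonical] = sourceRow[sourceCol];
  }
  return out;
}
```

Adding a new source system then means writing one small transform like this, rather than adding another row of guesses to the crawler.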

What's intentionally not in v5

  • Layers 3 and 4 of the scoring engine (structural hygiene, cross-entity propagation) are placeholders. The formula stays the same shape as v4, so adding them is additive.
  • News-feed-driven re-scoring (M&A events, security incidents auto-triggering risk updates) is a planned Phase 3.
  • RAG over long-term wiki/ISMS indexes is deferred. The current URL scraper is fetch-on-create; pgvector is the right answer once there are hundreds of internal docs to index.
  • Local LLM via Ollama is one provider-adapter away. On a CPU-only box with no GPU, models like Qwen 2.5 14B or Mistral Small 22B should be acceptable for the structured-JSON tasks but noticeably worse than Claude/GPT-4 on industry-specific nuance.
  • Temporal DDL-time history (schema-level audit rather than row-level). v5 uses triggers; a future version may adopt a postgres extension like temporal_tables once that ecosystem is more mature.

Version numbers

v5 resets the major version. ModuleVersion follows Major.Minor.yyyyMMdd.HHmm per the branching and versioning strategy. The CHANGELOG.md at the repo root is the authoritative per-release history; this document is the narrative.

Acknowledgements

The original FortigiGraph module was a single-author project. v2–v4 were built by a small team inside Fortigi. The v5 postgres rewrite was done in a single development branch by the author pair-programming with Claude (Anthropic's coding agent) — every architectural decision went through a human review, but the execution was heavily LLM-assisted. The Co-Authored-By: Claude tags on commits from April 2026 onwards reflect that workflow.

If you're reading this as a new contributor, you don't need to know any of this to work on Identity Atlas. The v5 code is the only code. But if you're wondering why there are comments saying "replaces v4 SQL Server temporal tables" in a few places, or why the PowerShell SDK still exists in tools/powershell-sdk/ despite the product being mostly JavaScript — this is the why.