Graph Validation Rules
This is the single source of truth for algorithmic graph validation in CI.
- Validator implementation: `app/scripts/validateGraph.ts`
- CI entrypoint: `npm run validate:graph` in `.github/workflows/ci.yml` (`graph-validation` job)
- Filter-projection validator: `app/scripts/validateViewFilters.ts`
- Filter-projection CI entrypoint: `npm run validate:view-filters` in `.github/workflows/ci.yml` (`graph-validation` job)
- Composition-view validator: `app/scripts/validateCompositionViews.ts`
- Composition-view CI entrypoint: `npm run validate:composition-views` in `.github/workflows/ci.yml` (`graph-validation` job)
- Hessen Oberstufe archive-boundary validator: `scripts/validate_hessen_upper_secondary_archive_paths.py`
- Archive-boundary CI entrypoint: `python scripts/validate_hessen_upper_secondary_archive_paths.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- Hessen Oberstufe legacy-reference validator: `scripts/validate_hessen_upper_secondary_legacy_refs.py`
- Legacy-reference CI entrypoint: `python scripts/validate_hessen_upper_secondary_legacy_refs.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- The legacy-reference validator also enforces post-retirement absence of the old `curricula/DE/HE/Kultusministerium/Gymnasiale_Oberstufe` tree; only allowlisted provenance/handoff files may still mention that path textually.
- Hessen Sek-I archive-boundary validator: `scripts/validate_hessen_lower_secondary_archive_paths.py`
- Archive-boundary CI entrypoint: `python scripts/validate_hessen_lower_secondary_archive_paths.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- Hessen Sek-I legacy-reference validator: `scripts/validate_hessen_lower_secondary_legacy_refs.py`
- Legacy-reference CI entrypoint: `python scripts/validate_hessen_lower_secondary_legacy_refs.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- Bavaria Gymnasium archive-boundary validator: `scripts/validate_bavaria_gymnasium_archive_paths.py`
- Archive-boundary CI entrypoint: `python scripts/validate_bavaria_gymnasium_archive_paths.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- Bavaria Gymnasium legacy-reference validator: `scripts/validate_bavaria_gymnasium_legacy_refs.py`
- Legacy-reference CI entrypoint: `python scripts/validate_bavaria_gymnasium_legacy_refs.py` in `.github/workflows/ci.yml` (`graph-validation` job)
- The lower-secondary legacy-reference validator restricts repo references to `curricula/DE/HE/Kultusministerium/Gymnasium_9_Mittelstufe` to the explicit delete-handoff allowlist until the final repo-side removal is executed.
Enforcement profiles
- `default`:
  - structural/data integrity rules are `error` (fail build)
  - `GVR-*` rules are also `error` (strict by default)
- `legacy-warn` (temporary migration mode):
  - enable with `VALIDATE_GRAPH_STRICT_RULES=0`
  - `GVR-*` are downgraded to `warn`
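
The two profiles can be read as a single severity decision per finding. The following is a minimal sketch of that decision, not the code in `validateGraph.ts`: the function name `resolveSeverity` and the convention of passing `null` for findings without a stable rule ID are assumptions made for illustration.

```typescript
type Severity = "error" | "warn";

// Hypothetical helper: derives the severity of a finding from its rule ID
// and an environment map. Only VALIDATE_GRAPH_STRICT_RULES=0 relaxes GVR-* rules.
function resolveSeverity(
  ruleId: string | null,
  env: Record<string, string | undefined>
): Severity {
  const isGvrRule = ruleId !== null && ruleId.startsWith("GVR-");
  const legacyWarn = env["VALIDATE_GRAPH_STRICT_RULES"] === "0";
  // Structural/data integrity findings (no GVR ID here) always fail the build.
  return isGvrRule && legacyWarn ? "warn" : "error";
}
```

In a Node script this could be called as `resolveSeverity("GVR-003", process.env)`; taking the environment as a parameter keeps the decision easy to test.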
Rules currently emitted with stable IDs
| Rule ID | Description | Scope | Default |
|---|---|---|---|
| `GVR-001` | A goal must not directly require one of its `contains` ancestors. | Local landscape | `error` |
| `GVR-002` | Legacy phase-compatibility lint: for repositories that still use comparable phase labels, effective prerequisites should not point to later phases (`E`, `Q1..Q4`, `Abitur`, `S*`, `J*`). This is a validator-profile rule, not part of the canonical graph validity definition. | Local landscape | `error` |
| `GVR-003` | A goal must not directly require its direct `contains` parent (most frequent deadlock source). | Local landscape | `error` |
| `GVR-004` | First atomic node must be a motivation anchor (`Warum`/`Why`). | Rollout subset (`DE_HES_S_GYM_2_*`, excluding `OVERVIEW`) | `error` |
| `GVR-005` | Every atomic node must have a transitive path to the motivation anchor via effective requires. | Rollout subset (`DE_HES_S_GYM_2_*`, excluding `OVERVIEW`) | `error` |
| `GVR-006` | A goal must not directly require one of its direct `contains` children (inverse anti-pattern of `GVR-003`). | Rollout subset (`DE_HES_S_GYM_2_*`, including `OVERVIEW`) | `error` |
| `GVR-007` | MIT OCW module atomic goals must include intensive source-link coverage (`concept` + `practice` + `assessment`) in canonical `resourceLinks`. | MIT OCW module landscapes (`frameworkId` starts with `mit-ocw-` and root tagged `module:*`) | `error` |
| `GVR-008` | Committed landscape goals must use canonical `resourceLinks` as the only supported goal-level helper-link field. | Local landscape | `error` |
| `GVR-009` | If explicit `type` metadata is present, it must match the canonical node classification derived from direct `contains` children (`atomic` iff leaf, `cluster` iff non-leaf). | Local landscape | `error` |
| `GVR-010` | If `shortKey` is present, it must be unique within the logical `landscapeId` (duplicates across locale serializations are allowed only when they refer to the same goal `id`). | Logical landscape (`landscapeId`, including multi-file localizations) | `error` |
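
As an illustration of the two ancestor rules in the table, here is a minimal sketch that walks `contains` upward and compares the result with each goal's direct `requires`. The `Goal` shape (with an `id` field) and the function name are assumptions. The sketch reports a required direct parent as `GVR-003` and a more distant ancestor as `GVR-001`; whether the real validator reports both IDs for a direct parent is not stated here.

```typescript
interface Goal {
  id: string;
  requires: string[];
  contains: string[];
}

interface AncestorFinding {
  ruleId: "GVR-001" | "GVR-003";
  goalId: string;
  requiredId: string;
}

function findAncestorRequires(goals: Goal[]): AncestorFinding[] {
  // Invert `contains` so every goal knows its direct parents.
  const parents = new Map<string, string[]>();
  for (const goal of goals) {
    for (const child of goal.contains) {
      parents.set(child, (parents.get(child) ?? []).concat(goal.id));
    }
  }
  const findings: AncestorFinding[] = [];
  for (const goal of goals) {
    const directParents = new Set(parents.get(goal.id) ?? []);
    // Collect all ancestors by following parent links upward.
    const ancestors = new Set<string>();
    const stack = (parents.get(goal.id) ?? []).slice();
    while (stack.length > 0) {
      const current = stack.pop() as string;
      if (ancestors.has(current)) continue;
      ancestors.add(current);
      for (const parent of parents.get(current) ?? []) stack.push(parent);
    }
    for (const requiredId of goal.requires) {
      if (directParents.has(requiredId)) {
        findings.push({ ruleId: "GVR-003", goalId: goal.id, requiredId });
      } else if (ancestors.has(requiredId)) {
        findings.push({ ruleId: "GVR-001", goalId: goal.id, requiredId });
      }
    }
  }
  return findings;
}
```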
Core validator checks (always active, fail CI)
These checks are already implemented and treated as `error`:
- JSON parsing and landscape loading
- Goal ID uniqueness inside a landscape
- Optional `shortKey` uniqueness within a logical `landscapeId`
- Referential integrity for `requires` and `contains`
- Self-reference guards (`goal cannot require itself`, `goal cannot contain itself`)
- Allowed metadata domains:
  - `phase`
  - `leitideen`
  - `kompetenzen`
  - tag pattern
- DAG checks:
  - acyclic direct `requires`
  - acyclic `contains`
  - acyclic effective requires (`requires` + inherited from `contains` ancestors)
  - inherited self-prerequisite detection
- Curriculum manifest consistency (`curricula/curriculum_manifest.json`)
  - schema/basic shape
  - id/title presence
  - duplicates
  - unknown IDs
  - root curriculum set sync
  - title alignment with landscape files
- Project-specific invariants (currently Physics cross-curriculum dependency guard)
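
The effective-requires DAG check is the least obvious item in this list, because a landscape can be acyclic in its direct `requires` and still cyclic once inherited prerequisites are added. A minimal sketch of that check follows, assuming a simplified `Goal` shape with an `id` field; the function names are illustrative and not taken from `validateGraph.ts`.

```typescript
interface Goal {
  id: string;
  requires: string[];
  contains: string[];
}

// Effective requires of a goal: its own `requires` plus the `requires`
// of every `contains` ancestor.
function effectiveRequires(goals: Goal[]): Map<string, string[]> {
  const byId = new Map<string, Goal>();
  const parents = new Map<string, string[]>();
  for (const goal of goals) byId.set(goal.id, goal);
  for (const goal of goals) {
    for (const child of goal.contains) {
      parents.set(child, (parents.get(child) ?? []).concat(goal.id));
    }
  }
  const result = new Map<string, string[]>();
  for (const goal of goals) {
    const collected: string[] = [];
    const seen = new Set<string>();
    const stack = [goal.id];
    while (stack.length > 0) {
      const current = stack.pop() as string;
      if (seen.has(current)) continue;
      seen.add(current);
      for (const r of byId.get(current)?.requires ?? []) collected.push(r);
      for (const p of parents.get(current) ?? []) stack.push(p);
    }
    result.set(goal.id, collected);
  }
  return result;
}

// Depth-first search; returns one cycle as a list of IDs, or null if acyclic.
function findCycle(ids: string[], edges: Map<string, string[]>): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];
  const visit = (id: string): string[] | null => {
    if (state.get(id) === "done") return null;
    if (state.get(id) === "visiting") return path.slice(path.indexOf(id)).concat(id);
    state.set(id, "visiting");
    path.push(id);
    for (const target of edges.get(id) ?? []) {
      const cycle = visit(target);
      if (cycle !== null) return cycle;
    }
    path.pop();
    state.set(id, "done");
    return null;
  };
  for (const id of ids) {
    const cycle = visit(id);
    if (cycle !== null) return cycle;
  }
  return null;
}
```

A goal whose inherited prerequisites include the goal itself shows up here as a cycle of length one, which corresponds to the inherited self-prerequisite case.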
Notes on scope
- Validation is intentionally structural/algorithmic.
- Didactic quality checks (sequencing quality, granularity, redundancy of meanings, etc.) remain part of manual QA (`curricula/QA/*`).
- Learner-state semantics from the concept spec such as atomic mastery, frontier computation, and optimistic/pessimistic filter evaluation are currently not validated in CI.
- Projected filtered learner graphs derived from compiled `applicability` are now validated by the separate `validate:view-filters` path.
- Explicit learner-facing composition views under `curricula/DE/Gymnasium/composition-views/` are validated by the separate `validate:composition-views` path.
- The current CI enforcement scope for `validate:view-filters` covers the reviewed canonical DE Gymnasium set (`Mathematik`, `Physik`, `Chemie`, `Biologie`, `Informatik`, `Deutsch`, `Englisch`, `Französisch`, `Griechisch`, `Chinesisch`, `Geschichte`, `Politik und Wirtschaft`, `Musik`, `Latein`, `Spanisch`, `Wirtschaft`, `Overview`).
- Reviewed applicability warnings can be recorded in `docs/qa-ci/applicability-accepted-warnings.json`; the validator still prints them, but classifies them as accepted review debt instead of active warnings.
- Additional structural rules should be added here first, then implemented in `validateGraph.ts`, then rolled out in CI.
Filter-projection validator (validate:view-filters)
This validator is separate from `validate:graph` and operates on projected filtered graphs rather than only on authored raw landscapes.
Current CI scope:
- reviewed canonical DE Gymnasium set
  - `Mathematik`
  - `Physik`
  - `Chemie`
  - `Biologie`
  - `Informatik`
  - `Deutsch`
  - `Englisch`
  - `Französisch`
  - `Griechisch`
  - `Chinesisch`
  - `Geschichte`
  - `Politik und Wirtschaft`
  - `Musik`
  - `Latein`
  - `Spanisch`
  - `Wirtschaft`
  - `Overview`
Current stable finding families:
| Rule ID | Description | Default |
|---|---|---|
| `APV-102` | A visible goal requires a prerequisite that is invisible in the projected filtered graph. | `error` |
| `APV-103` | A visible goal is not reachable from the projected root. | `error` |
| `APV-201` | An explicit applicability-override path is used, either embedded or via the canonical override registry. | `warning` |
| `APV-202` | Applicability is backed only by partial mappings. | `warning` |
| `APV-203` | Compiled applicability differs from currently committed applicability metadata. | `warning` |
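
The two error-level findings can be sketched on a projected graph that is given as the full goal list plus a set of visible goal IDs. This is an illustrative sketch only: the `Goal` shape, the `visible` parameter, and the function name are assumptions, `APV-102` is checked on direct `requires` only, and reachability is computed over `contains` edges between visible goals.

```typescript
interface Goal {
  id: string;
  requires: string[];
  contains: string[];
}

interface ViewFinding {
  ruleId: "APV-102" | "APV-103";
  goalId: string;
  detail: string;
}

function checkProjectedGraph(goals: Goal[], rootId: string, visible: Set<string>): ViewFinding[] {
  const byId = new Map<string, Goal>();
  for (const goal of goals) byId.set(goal.id, goal);

  // Goals reachable from the root through `contains` edges between visible goals.
  const reachable = new Set<string>();
  const stack = visible.has(rootId) ? [rootId] : [];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (reachable.has(id)) continue;
    reachable.add(id);
    for (const child of byId.get(id)?.contains ?? []) {
      if (visible.has(child)) stack.push(child);
    }
  }

  const findings: ViewFinding[] = [];
  for (const goal of goals) {
    if (!visible.has(goal.id)) continue;
    for (const requiredId of goal.requires) {
      if (!visible.has(requiredId)) {
        findings.push({ ruleId: "APV-102", goalId: goal.id, detail: requiredId });
      }
    }
    if (!reachable.has(goal.id)) {
      findings.push({ ruleId: "APV-103", goalId: goal.id, detail: rootId });
    }
  }
  return findings;
}
```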
Accepted-warning handling:
- `APV-201` and `APV-202` may remain in reviewed pilots when no cleaner exact source alignment exists yet.
- Such cases must be explicitly listed in `docs/qa-ci/applicability-accepted-warnings.json` with a short rationale.
- `validate:view-filters` continues to print these findings for auditability, but separates them from active warnings in its summary output.
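
A minimal sketch of that separation step is shown below. The entry shape of the accepted-warnings file (`ruleId`, `goalId`, `rationale`) is hypothetical, since the actual JSON schema is not described here, and limiting acceptance to `APV-201` and `APV-202` is an assumption based on the list above.

```typescript
interface ViewFinding {
  ruleId: string;
  goalId: string;
  message: string;
}

// Hypothetical entry shape for the accepted-warnings file.
interface AcceptedWarning {
  ruleId: string;
  goalId: string;
  rationale: string;
}

function partitionWarnings(findings: ViewFinding[], accepted: AcceptedWarning[]) {
  const acceptedKeys = new Set(accepted.map((a) => a.ruleId + "::" + a.goalId));
  const active: ViewFinding[] = [];
  const acceptedDebt: ViewFinding[] = [];
  for (const finding of findings) {
    // Assumption: only these two warning families can be accepted.
    const acceptable = finding.ruleId === "APV-201" || finding.ruleId === "APV-202";
    if (acceptable && acceptedKeys.has(finding.ruleId + "::" + finding.goalId)) {
      acceptedDebt.push(finding); // still returned, so it can be printed for auditability
    } else {
      active.push(finding);
    }
  }
  return { active, acceptedDebt };
}
```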
Composition-view validator (validate:composition-views)
This validator is separate from both `validate:graph` and `validate:view-filters`.
It operates on explicit learner-facing composition-view files and validates that they compile deterministically against the referenced canonical graph.
Current scope:
- all `.view.json` files under `curricula/DE/Gymnasium/composition-views/`
Current stable finding families:
| Rule ID | Description | Default |
|---|---|---|
| `CPV-000` | The composition-view file could not be loaded or parsed. | `error` |
| `CPV-001` | Required composition-view metadata or node metadata is invalid or missing. | `error` |
| `CPV-002` | A referenced canonical subtree root does not exist in the referenced canonical graph. | `error` |
| `CPV-004` | Two referenced canonical subtree roots overlap or the same canonical root is referenced more than once. | `error` |
| `CPV-005` | The compiled default tree contains the same canonical goal more than once. | `error` |
| `CPV-006` | The compiled default tree gives one canonical goal more than one visible parent. | `error` |
| `CPV-007` | A structure node is left empty although it is still present in the view tree. | `error` |
| `CPV-101` | A structure node label is still too generic to be review-safe. | `warning` |
| `CPV-102` | A referenced canonical subtree root still looks phase- or state-specific by title. | `warning` |
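
To make the overlap rule `CPV-004` concrete, here is a minimal sketch that takes the canonical `contains` graph and the list of referenced subtree roots. The `.view.json` file format is not described in this document, so the sketch works on a plain list of root IDs; the `Goal` shape and function name are assumptions, and missing roots (`CPV-002`) are deliberately not handled.

```typescript
interface Goal {
  id: string;
  contains: string[];
}

function findOverlappingRoots(goals: Goal[], rootIds: string[]): string[] {
  const byId = new Map<string, Goal>();
  for (const goal of goals) byId.set(goal.id, goal);

  // Maps each canonical goal to the first referenced root whose subtree covers it.
  const owner = new Map<string, string>();
  const messages: string[] = [];
  for (const rootId of rootIds) {
    const seen = new Set<string>();
    const stack = [rootId];
    let reported = false;
    while (stack.length > 0 && !reported) {
      const id = stack.pop() as string;
      if (seen.has(id)) continue;
      seen.add(id);
      const previous = owner.get(id);
      if (previous !== undefined) {
        // Covers both overlapping subtrees and a root that is listed twice.
        messages.push("CPV-004: root " + rootId + " overlaps with root " + previous + " at goal " + id);
        reported = true;
      } else {
        owner.set(id, rootId);
        for (const child of byId.get(id)?.contains ?? []) stack.push(child);
      }
    }
  }
  return messages;
}
```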
Current compatibility model vs. target model
The current CI validator still operates on the compatibility model used by the existing runtime and landscapes:
- direct `requires` may be authored on atomic or cluster goals
- effective prerequisites are computed by inheriting `requires` from `contains` ancestors
- rollout rules such as `GVR-004` / `GVR-005` validate motivation connectivity in that effective graph
The conceptual target model described in `docs/concept/curriculum-graph/graph-definition.md` is stricter:
- the canonical didactic sequencing layer should primarily be authored on atomic goals
- atomic/cluster semantics are defined canonically by the direct `contains` relation (leaf = atomic, non-leaf = cluster)
- cluster-level dependency views should preferably be derived from atomic descendants
- mature route-quality checks should eventually validate atomic didactic routes from motivation anchors to terminal autonomy goals
- progression semantics in the concept doc are defined via atomic mastery and derived cluster satisfaction, but this is not yet a CI validation target
Until the validator and landscapes are migrated, this file distinguishes clearly between:
- rules that are implemented today and have stable `GVR-*` IDs
- planned future direction that is not yet implemented in CI and therefore has no stable rule IDs here
Motivation-anchor rollout rules (GVR-004, GVR-005)
- Scope is controlled in `app/scripts/validateGraph.ts` via `motivationRuleLandscapeIds`.
- Current rollout scope: Hessen Gymnasiale Oberstufe subject landscapes (`DE_HES_S_GYM_2_*`) excluding `DE_HES_S_GYM_2_OVERVIEW`.
These rules are intentionally a compatibility rollout, not yet the full mature route-coverage model.
Validation semantics:
- Atomic node detection:
  - concept-level canonical meaning: a node is atomic iff `contains.length === 0`
  - current validator implementation uses that canonical leaf/non-leaf definition
  - if explicit `type` metadata is present, `GVR-009` additionally enforces consistency with that canonical classification
- Motivation anchor detection:
  - first atomic node title must start with `Warum` or `Why` (case-insensitive)
- `GVR-004` fails if:
  - landscape has no atomic nodes, or
  - first atomic node is not a motivation node
- `GVR-005` fails if:
  - any atomic node except the first motivation node has no transitive path to that anchor in the effective-requires graph
Effective-requires graph means:
- direct `requires`
- plus inherited `requires` from `contains` ancestors
- then transitive reachability over these effective edges
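
The semantics above can be sketched as follows. This is an illustrative reading, not the code in `validateGraph.ts`: the `Goal` shape (with `id` and `title`), the function name, and the choice to treat array order as "first atomic node" are assumptions, and the sketch skips `GVR-005` entirely when `GVR-004` already fails.

```typescript
interface Goal {
  id: string;
  title: string;
  requires: string[];
  contains: string[];
}

function checkMotivationRules(goals: Goal[]): string[] {
  const byId = new Map<string, Goal>();
  const parents = new Map<string, string[]>();
  for (const goal of goals) byId.set(goal.id, goal);
  for (const goal of goals) {
    for (const child of goal.contains) {
      parents.set(child, (parents.get(child) ?? []).concat(goal.id));
    }
  }
  // Effective requires: own `requires` plus `requires` of every `contains` ancestor.
  const effective = (id: string): string[] => {
    const out: string[] = [];
    const seen = new Set<string>();
    const stack = [id];
    while (stack.length > 0) {
      const current = stack.pop() as string;
      if (seen.has(current)) continue;
      seen.add(current);
      for (const r of byId.get(current)?.requires ?? []) out.push(r);
      for (const p of parents.get(current) ?? []) stack.push(p);
    }
    return out;
  };

  const atomic = goals.filter((g) => g.contains.length === 0);
  const anchor = atomic[0];
  if (anchor === undefined || !/^(warum|why)/i.test(anchor.title)) {
    return ["GVR-004: first atomic node is missing or is not a motivation anchor"];
  }
  const findings: string[] = [];
  for (const goal of atomic.slice(1)) {
    // Transitive reachability from the goal to the anchor over effective edges.
    const seen = new Set<string>();
    const stack = [goal.id];
    let reached = false;
    while (stack.length > 0 && !reached) {
      const current = stack.pop() as string;
      if (seen.has(current)) continue;
      seen.add(current);
      if (current === anchor.id) reached = true;
      else for (const r of effective(current)) stack.push(r);
    }
    if (!reached) {
      findings.push("GVR-005: " + goal.id + " has no effective-requires path to " + anchor.id);
    }
  }
  return findings;
}
```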
Interpretation of current coverage strength:
- `GVR-004` / `GVR-005` ensure that each checked atomic node is connected back to a motivation anchor
- they do not yet ensure that the node also lies on a path toward one or more terminal autonomy goals
- they do not yet prove that the didactic route is modeled canonically on the atomic `requires` layer
Planned direction for route-quality validation (not yet implemented in CI)
The following direction is planned but currently has no stable validator rule IDs in this file.
Target semantics for mature landscapes:
- route coverage should be defined primarily on the atomic direct-prerequisite graph (`R_d` on atomic goals), not on inherited `R_eff`
- a landscape or route-group may have one or more motivation anchors; a single global anchor is not required if the content structure suggests otherwise
- a landscape will often have multiple terminal autonomy goals, typically authentic independent performances such as exam tasks or other capstones
- every route-relevant atomic goal should ideally lie on at least one didactic path from a motivation anchor to a terminal autonomy goal
- explicitly excluded support-only atomic goals (concept-spec set `E_route`, e.g. memorization-only helper nodes) require a machine-readable profile convention before they can be validated generically in CI
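
One possible reading of these target semantics is sketched below. It is not an implemented rule: how anchors, terminal goals, and the excluded set are identified is still open, so the sketch takes them as explicit parameters, and the `Goal` shape and function name are assumptions. A goal counts as covered when it builds on some anchor and some terminal goal depends on it, using only direct `requires` between atomic goals.

```typescript
interface Goal {
  id: string;
  requires: string[];
  contains: string[];
}

function uncoveredAtomicGoals(
  goals: Goal[],
  anchors: string[],
  terminals: string[],
  excluded: string[]
): string[] {
  const atomicIds = new Set(goals.filter((g) => g.contains.length === 0).map((g) => g.id));
  // Direct requires restricted to atomic goals, plus the reversed relation.
  const prerequisites = new Map<string, string[]>();
  const dependents = new Map<string, string[]>();
  for (const goal of goals) {
    if (!atomicIds.has(goal.id)) continue;
    for (const r of goal.requires) {
      if (!atomicIds.has(r)) continue;
      prerequisites.set(goal.id, (prerequisites.get(goal.id) ?? []).concat(r));
      dependents.set(r, (dependents.get(r) ?? []).concat(goal.id));
    }
  }
  const closure = (starts: string[], edges: Map<string, string[]>): Set<string> => {
    const seen = new Set<string>();
    const stack = starts.slice();
    while (stack.length > 0) {
      const id = stack.pop() as string;
      if (seen.has(id)) continue;
      seen.add(id);
      for (const next of edges.get(id) ?? []) stack.push(next);
    }
    return seen;
  };
  const buildsOnAnchor = closure(anchors, dependents);
  const neededByTerminal = closure(terminals, prerequisites);
  const skip = new Set(excluded);
  return goals
    .filter((g) => atomicIds.has(g.id) && !skip.has(g.id))
    .filter((g) => !(buildsOnAnchor.has(g.id) && neededByTerminal.has(g.id)))
    .map((g) => g.id);
}
```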
Recommended rollout strategy:
- keep `GVR-004` / `GVR-005` as migration-compatible checks on `R_eff`
- later add stricter route-quality rules on the atomic graph
- treat full atomic route coverage as `SHOULD` at concept level first, then promote it to `MUST` only for mature rollout subsets or strict validator profiles
Immediate implications from the updated concept spec
The recent updates in `docs/concept/curriculum-graph/graph-definition.md` do not imply that every newly clarified concept should become a CI graph rule immediately.
Already implemented structural alignment
- `GVR-009`: explicit node-type consistency
  If a goal stores explicit `type` metadata, it must match the canonical concept-spec classification:
  - `atomic` iff `contains.length === 0`
  - `cluster` iff `contains.length > 0`
Not suitable as graph-only CI rules yet
- Atomic mastery / cluster satisfaction semantics
  These are learner-state/runtime semantics, not static graph invariants.
- Optimistic / pessimistic filter frontier semantics
  These describe scoped runtime evaluation, not a property of a landscape JSON in isolation.
- Applicability-backed filtered learner graphs
  These can become a CI target, but only as a projection validator that first materializes filtered graphs from compiled `applicability` and then validates those projected graphs.
Planned filter-graph validation layer (not yet implemented in CI)
The concept spec now makes an important distinction:
- raw landscape validation on the full authored graph
- projected filtered-graph validation on learner-facing scoped views
The current `validate:graph` command covers only the first category.
Planned future addition:
- a separate validator, tentatively `validate:view-filters`
- input:
  - committed canonical landscapes
  - compiled `applicability` metadata
  - supported filter dimensions / vocabularies
- validation unit:
  - projected filtered graphs such as `G[jurisdiction = DE-HE]`
  - later, if needed, selected multi-dimensional combinations
Auxiliary archive-boundary validator
This is a separate retained-asset hygiene gate, not a `GVR-*` or `APV-*` rule family.
Purpose:
- keep the Hessen upper-secondary DE-level archive operationally detached from live legacy repo paths
- allow remaining `Gymnasiale_Oberstufe` path strings only inside explicitly allowlisted raw archival provenance files
Implementation and data source:
- validator: `scripts/validate_hessen_upper_secondary_archive_paths.py`
- allowlist and archive root: `curricula/DE/Gymnasium/input/DE-HE/retained-asset-registry.json`
Current CI semantics:
- scans `curricula/DE/Gymnasium/input/DE-HE/abi`
- fails if a legacy `Gymnasiale_Oberstufe` path string appears outside the allowlisted raw-provenance files
- keeps machine-readable ABI metadata and repo-authored archive docs on the normalized DE-level archive path
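
The core of this gate is a scan-and-allowlist check. The sketch below illustrates the idea on an in-memory map of file paths to contents; it is written in TypeScript purely for illustration, whereas the actual validator is the Python script named above. The function name, the parameters, and the in-memory file model are all assumptions, and the registry JSON format is not modeled.

```typescript
// Hypothetical model: repo-relative path -> file content.
function findLegacyPathLeaks(
  files: Record<string, string>,
  scanRoot: string,
  allowlist: string[],
  marker: string
): string[] {
  const allowed = new Set(allowlist);
  const prefix = scanRoot.endsWith("/") ? scanRoot : scanRoot + "/";
  return Object.keys(files)
    .filter((path) => path.startsWith(prefix)) // only files inside the scanned archive
    .filter((path) => !allowed.has(path)) // allowlisted raw-provenance files may keep the string
    .filter((path) => files[path].includes(marker))
    .sort();
}
```

A non-empty result would correspond to a failing CI run, with the returned paths as the offending files.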
Auxiliary legacy-reference validator
This is the repo-level handoff gate for the retired Hessen upper-secondary legacy tree.
Purpose:
- keep active tooling/runtime/test surfaces detached from the retired legacy tree
- keep the old tree absent from the active repo after the delete handoff
- make the remaining allowed references explicit as the surviving provenance/handoff boundary
Implementation and data source:
- validator: `scripts/validate_hessen_upper_secondary_legacy_refs.py`
- scan roots, exclusions, and allowlist: `curricula/DE/Gymnasium/provenance/hessen-upper-secondary-retirement-registry.json`
Current CI semantics:
- scans active repo surfaces (`backend/src`, `app`, `scripts`, DE-level provenance/input lanes, selected root helpers)
- ignores the already-separated raw ABI archive scope under `curricula/DE/Gymnasium/input/DE-HE/abi/**`
- fails if a `Gymnasiale_Oberstufe` tree reference appears outside the explicit handoff allowlist
Expected rule family for the planned filter-graph validator (`validate:view-filters`):
- `APV-*` for applicability and projected-view validation
Planned validation focus:
- malformed compiled applicability metadata
- empty visible clusters in a projected filtered graph
- visible goals with invisible prerequisites in a projected filtered graph
- visible goals not reachable from the filtered root through visible `contains` edges
This layer is intentionally documented here already so CI semantics stay aligned with the concept spec, even before the implementation lands.
Future-rule prerequisites before rollout
- Route-exclusion support set (`E_route`)
  The concept spec now allows explicit support-only atomic exceptions (for example memory-only helper nodes), but a generic CI rule should only be introduced once there is a stable machine-readable convention for identifying those nodes across landscapes.
Reference implementations already curated:
- Physics landscape file: `curricula/DE/Gymnasium/input/DE-HE/upper-secondary/source-json/DE_HES_S_GYM_2_PHYSIK.de.json.snapshot`
  - subtree: `Einführungsphase: Mechanik, Gravitation, Thermodynamik und Drehbewegungen`
  - benchmark value:
    - no cluster-level `requires` inside the subtree
    - every non-memory atomic goal in the subtree has at least one direct atomic prerequisite path back to the motivation anchor
    - every non-memory atomic goal in the subtree also lies on at least one atomic path toward terminal autonomy goals under `Übungen E-Phase`
    - the single memorization node in that subtree is explicitly typed as `nodeKind: "memory"` and is therefore a documented exception rather than an ambiguous leaf
- Mathematics landscape file: `curricula/DE/Gymnasium/input/DE-HE/upper-secondary/source-json/DE_HES_S_GYM_2_MATHEMATIK.de.json.snapshot`
  - scope: ordinary phases `E`, `Q1`, `Q2`, `Q3`, `Q4` plus `Übungen Prozesskompetenzen`
  - benchmark value:
    - all local phase-autonomy branches (`Übungen E-Phase`, `Übungen Q1`, `Übungen Q2`, `Übungen Q3`, `Übungen Q4`) and the global process-competency branch contain only atomic terminal goals
    - all of these terminal goals are exam-mode-capable via concrete `examData`
    - outside the intentionally separate global Abitur containers, the landscape does not rely on cluster-level `requires` for ordinary didactic sequencing
    - the two remaining cluster-level `requires` belong only to the dedicated global Abitur containers and are therefore not the model for ordinary route-quality validation
This means future strict route-quality rules can use both a subtree benchmark (Physics E-phase) and a whole-landscape benchmark (Mathematics upper secondary) during rollout, even before new stable `GVR-*` IDs are introduced for the stricter atomic route model.
Direct-child prerequisite rule (GVR-006)
- Scope is controlled in `app/scripts/validateGraph.ts` via `noDirectChildRequireRuleLandscapeIds`.
- Current rollout scope: Hessen Gymnasiale Oberstufe subject landscapes (`DE_HES_S_GYM_2_*`) including `DE_HES_S_GYM_2_OVERVIEW`.
- Current issue level: follows global `GVR-*` strictness (`error` by default, `warn` with `VALIDATE_GRAPH_STRICT_RULES=0`).
Validation semantics:
- For a goal `A`, collect direct local `contains` children.
- If any direct local `requires` target is also in that direct child set, emit `GVR-006`.
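
These two steps translate almost directly into code. A minimal sketch, assuming a simplified `Goal` shape with an `id` field and an illustrative function name:

```typescript
interface Goal {
  id: string;
  requires: string[];
  contains: string[];
}

function findDirectChildRequires(goals: Goal[]): string[] {
  const findings: string[] = [];
  for (const goal of goals) {
    // Step 1: collect the direct `contains` children of this goal.
    const children = new Set(goal.contains);
    // Step 2: any direct `requires` target inside that set is a violation.
    for (const requiredId of goal.requires) {
      if (children.has(requiredId)) {
        findings.push("GVR-006: " + goal.id + " requires its direct child " + requiredId);
      }
    }
  }
  return findings;
}
```

Only direct children are compared; a required grandchild is not reported by this sketch.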
MIT OCW source-linking rule (GVR-007)
- Scope: landscapes with top-level `frameworkId` starting with `mit-ocw-` and a root node tagged `module:*` (or `modul:*`).
- Current issue level: follows global `GVR-*` strictness (`error` by default, `warn` with `VALIDATE_GRAPH_STRICT_RULES=0`).
Validation semantics for each atomic goal:
- canonical `resourceLinks` must exist and include at least one link of type:
  - `concept`
  - `practice`
  - `assessment`
- Required source-link types must include at least one valid OCW course URL (`https://ocw.mit.edu/courses/...`).
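
A minimal sketch of this per-goal check follows. The link entry shape (`type`, `url`) is an assumption, since the `resourceLinks` schema is not spelled out here. The sketch also reads the URL requirement as "at least one link of a required type points at an OCW course URL"; a stricter reading that demands an OCW URL for every required type would need a small change.

```typescript
// Hypothetical link shape for illustration.
interface ResourceLink {
  type: string;
  url: string;
}

interface AtomicGoal {
  id: string;
  resourceLinks?: ResourceLink[];
}

const REQUIRED_TYPES = ["concept", "practice", "assessment"];
const OCW_COURSE_PREFIX = "https://ocw.mit.edu/courses/";

function checkOcwLinkCoverage(goal: AtomicGoal): string[] {
  const links = goal.resourceLinks ?? [];
  const problems: string[] = [];
  for (const required of REQUIRED_TYPES) {
    if (!links.some((link) => link.type === required)) {
      problems.push("GVR-007: " + goal.id + " has no " + required + " link");
    }
  }
  const hasOcwSource = links.some(
    (link) => REQUIRED_TYPES.indexOf(link.type) !== -1 && link.url.startsWith(OCW_COURSE_PREFIX)
  );
  if (!hasOcwSource) {
    problems.push("GVR-007: " + goal.id + " has no OCW course URL among its required link types");
  }
  return problems;
}
```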
Canonical goal-level link field rule (GVR-008)
- Scope: all committed landscape JSON files validated in CI.
- Current issue level: follows global `GVR-*` strictness (`error` by default, `warn` with `VALIDATE_GRAPH_STRICT_RULES=0`).
Validation semantics:
- if a goal contains unsupported legacy goal-level link metadata, emit `GVR-008`
- the canonical supported field is top-level `resourceLinks`
Interpretation:
- unsupported goal-level link metadata is rejected by CI and ignored by runtime link rendering
- committed landscape files in this repository should store helper links only in canonical `resourceLinks`
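
The check itself is a field-presence test. The sketch below takes the list of rejected field names as a parameter because this document does not name the legacy fields; any names passed in (for example `links` or `sourceLinks` in a test) are hypothetical placeholders, as is the function name.

```typescript
function findLegacyLinkFields(
  goal: Record<string, unknown>,
  legacyFieldNames: string[]
): string[] {
  return legacyFieldNames
    // Presence alone is enough; the value of the legacy field is not inspected.
    .filter((field) => Object.prototype.hasOwnProperty.call(goal, field))
    .map((field) => "GVR-008: goal " + String(goal["id"]) + " uses unsupported link field " + field);
}
```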
Explicit node-type consistency rule (GVR-009)
- Scope: all committed landscape JSON files validated in CI.
- Current issue level: follows global `GVR-*` strictness (`error` by default, `warn` with `VALIDATE_GRAPH_STRICT_RULES=0`).
Validation semantics:
- canonical node classification is derived structurally:
  - `atomic` iff `contains.length === 0`
  - `cluster` iff `contains.length > 0`
- if a goal explicitly stores `type: "atomic"` but is structurally non-leaf, emit `GVR-009`
- if a goal explicitly stores `type: "cluster"` but is structurally a leaf, emit `GVR-009`
Interpretation:
- explicit `type` metadata is optional
- if present, it is a redundant declaration and must agree with the canonical graph structure
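
A minimal sketch of this consistency check, assuming a simplified `Goal` shape with an `id` field; the function name is illustrative. Values of `type` other than `atomic` and `cluster` are left alone here because the rule text only covers those two.

```typescript
interface Goal {
  id: string;
  contains: string[];
  type?: string;
}

function checkExplicitType(goal: Goal): string | null {
  // Missing `type` is fine: the metadata is optional.
  if (goal.type !== "atomic" && goal.type !== "cluster") return null;
  const canonical = goal.contains.length === 0 ? "atomic" : "cluster";
  if (goal.type === canonical) return null;
  return "GVR-009: " + goal.id + " declares type " + goal.type + " but is structurally " + canonical;
}
```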