Level 3: Self-Regulating Cognitive Agent - Architecture & Design

MSCP Level Series | Level 2 ← Level 3 → Level 4
Status: 🔬 Experimental - Conceptual framework and experimental design. Not a production specification.
Date: February 2026

Revision History

Version	Date	Description
0.1.0	2026-02-23	Initial document creation with formal Definitions 1-8, Theorem 1
0.2.0	2026-02-26	Added overview essence formula; added revision history table
0.3.0	2026-02-26	Theorem 1: full proof replacing sketch; added Lyapunov vs bounded-increment remark; Def 9: affect vector formalization with dynamics equation and valence

1. Overview

Level 3 is the core MSCP level: the first level at which an agent possesses structural self-awareness. Such an agent holds a model of what it is, can predict how its own actions will affect its internal state, and can correct itself when reality diverges from expectation. This is the architecture that the MSCP protocol (v1.0 - v4.0) was designed to govern.

Level Essence. A Level 3 agent regulates itself through the MSCP predict-act-compare-update loop. The design target is that prediction error converges toward zero while every self-update stays bounded, which limits identity drift per cycle:

\[\epsilon_t = \|\hat{\Delta}_t - \Delta_t^{\text{actual}}\|_2 \xrightarrow{t \to \infty} 0, \quad \|M'_{\text{self}} - M_{\text{self}}\|_2 \leq \delta_{\max}\]

⚠️ Note: This document describes a cognitive architecture within the MSCP taxonomy. The 16-layer architecture, safety mechanisms, and properties explored here are experimental designs. All pseudocode is written at the algorithmic level and is not production code.

1.1 Defining Properties

Property	Level 2	Level 3
Self-Awareness	None	Structural (identity + capability + value model)
Meta-Cognition	None	Triple Loop (predict → compare → update)
Identity Continuity	None	Hash-tracked (per-cycle drift detection)
Ethical Constraints	None	Formal (immutable Layer 0 + adaptive Layer 1)
Self-Correction	None	Delta-clamped (bounded self-update)
Stability Guarantees	None	Lyapunov convergence (composite function)
Autonomy	Medium	High

1.2 Formal Definition

Definition 1 (Level 3 Agent). A Level 3 agent is a self-regulating process \(\mathcal{A}_3\) defined as an 8-tuple:

\[\mathcal{A}_3 = \langle \mathcal{R}, \mathcal{O}, \mathcal{S}, \mathcal{G}, M_{\text{self}}, \Pi, \mathcal{C}, \Lambda \rangle\]

where \(M_{\text{self}}\) is the self-model (identity vector), \(\Pi\) is the prediction engine, \(\mathcal{C}\) is the ethical constraint kernel, and \(\Lambda\) is the meta-cognition comparator.

The transition function is:

\[f_3 : \mathcal{R} \times \mathcal{S} \times \mathcal{G} \times M_{\text{self}} \to \mathcal{O} \times \mathcal{S}' \times \mathcal{G}' \times M'_{\text{self}}\]

subject to the stability constraint:

\[\| M'_{\text{self}} - M_{\text{self}} \|_2 \leq \delta_{\max}\]

Definition 2 (MSCP Core Loop). The MSCP protocol enforces a predict–act–compare–update cycle at each time step \(t\):

1. Predict: \(\hat{\Delta}_t = \Pi(a_t, M_{\text{self}}(t))\) - predict the effect of action \(a_t\) on the self-model

2. Act: Execute \(a_t\) and observe the actual change \(\Delta_t^{\text{actual}}\)

3. Compare: Compute prediction error \(\epsilon_t = \| \hat{\Delta}_t - \Delta_t^{\text{actual}} \|_2\)

4. Update: \(M_{\text{self}}(t+1) = M_{\text{self}}(t) + \text{clamp}(\Delta_t^{\text{actual}}, -\delta_{\max}, +\delta_{\max})\)

The loop converges when \(\epsilon_t < \epsilon_{\min}\) for \(k\) consecutive cycles.
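
One cycle of the loop in Definition 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MSCP implementation: `predict` and `act` are hypothetical callables standing in for \(\Pi\) and the environment, the self-model is a plain list of floats, and `DELTA_MAX = 0.05` borrows the value used in Theorem 1. Clamping each component to \(\pm\delta_{\max}\) bounds only the largest component of the update, so the sketch also rescales the update to satisfy the 2-norm constraint \(\| M'_{\text{self}} - M_{\text{self}} \|_2 \leq \delta_{\max}\); that rescaling step is an assumption of this sketch.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
DELTA_MAX = 0.05  # assumed; matches the delta_max value used in Theorem 1


def l2_norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def clamp_update(delta: Vector, delta_max: float = DELTA_MAX) -> Vector:
    """Clamp each component to [-delta_max, +delta_max], then rescale so the
    whole update also meets the 2-norm bound (an assumption of this sketch)."""
    clamped = [max(-delta_max, min(delta_max, d)) for d in delta]
    norm = l2_norm(clamped)
    if norm > delta_max:
        clamped = [d * delta_max / norm for d in clamped]
    return clamped


def mscp_cycle(
    m_self: Vector,
    action: object,
    predict: Callable[[object, Vector], Vector],  # hypothetical stand-in for Pi
    act: Callable[[object, Vector], Vector],      # hypothetical: returns the actual delta
) -> Tuple[Vector, float]:
    """Run one predict-act-compare-update cycle; return (new self-model, error)."""
    predicted = predict(action, m_self)                          # 1. Predict
    actual = act(action, m_self)                                 # 2. Act
    error = l2_norm([p - a for p, a in zip(predicted, actual)])  # 3. Compare
    update = clamp_update(actual)                                # 4. Update
    return [m + u for m, u in zip(m_self, update)], error


def has_converged(errors: List[float], eps_min: float, k: int) -> bool:
    """True when the last k prediction errors are all below eps_min."""
    return len(errors) >= k and all(e < eps_min for e in errors[-k:])
```

A caller would append each returned error to a history list and stop iterating once `has_converged` reports `True`.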

Definition 3 (Meta-Cognition Levels). Level 3 implements a triple-loop meta-cognition hierarchy:

L1 (Object Level): Action execution - \(a_t = \pi(r_t, s_t, G_t)\)

L2 (Meta Level): Strategy evaluation - \(q_t = \text{eval}(\pi, \text{history})\)

L3 (Meta-Meta Level): Evaluation of the evaluator - \(m_t = \text{meta eval}(q_t, \text{consistency})\)

\[\text{Depth}(t) = \min\bigl\{d : \|m_d(t) - m_{d-1}(t)\| < \epsilon_{\text{meta}}\bigr\} \leq d_{\max}\]

where \(m_d(t)\) is the evaluation produced at depth \(d\) and \(d_{\max} = 3\) prevents unbounded recursive reflection.
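
The depth rule can be illustrated with a short Python sketch. It assumes each evaluation \(m_d(t)\) has been summarized as a single float and that index 0 holds an object-level baseline; both are assumptions made for illustration, as is the name `reflection_depth`.

```python
from typing import Sequence

D_MAX = 3  # from Definition 3


def reflection_depth(evaluations: Sequence[float], eps_meta: float,
                     d_max: int = D_MAX) -> int:
    """Return the smallest depth d >= 1 whose evaluation differs from the one
    at depth d-1 by less than eps_meta, capped at d_max.

    evaluations[d] is a hypothetical scalar summary of m_d(t).
    """
    deepest = min(d_max, len(evaluations) - 1)
    for d in range(1, deepest + 1):
        if abs(evaluations[d] - evaluations[d - 1]) < eps_meta:
            return d
    # No two consecutive levels agree: stop at the cap instead of recursing.
    return d_max
```

The cap means that reflection stops at `d_max` even when deeper evaluations are available.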

1.3 MSCP Protocol Versions

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TB
  classDef v0 fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef v1 fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef v1x fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef v2 fill:#EDE3F6,stroke:#8764B8,color:#323130
  classDef v3 fill:#E0F2EF,stroke:#00B7C3,color:#323130
  classDef v4 fill:#FDE7E9,stroke:#D13438,color:#323130

  subgraph v0x["v0.x Prototype"]
    direction LR
    a0["State externalization"]:::v0
    b0["Identity seed"]:::v0
    c0["Basic reflection"]:::v0
  end

  subgraph v10["v1.0"]
    direction LR
    a1["PredictionEngine"]:::v1
    b1["MetaCognition Comparator"]:::v1
    c1["Agency Attribution"]:::v1
  end

  subgraph v1xx["v1.1–1.3"]
    direction LR
    a1x["Identity hash tracking"]:::v1x
    b1x["Drift detection"]:::v1x
    c1x["Self-Impact Prediction"]:::v1x
    d1x["MetaEscalationGuard"]:::v1x
  end

  subgraph v20["v2.0"]
    direction LR
    a2["GoalMutationController"]:::v2
    b2["ValueLockManager"]:::v2
    c2["MetaDepthController - depth 2"]:::v2
    d2["Meta Stability Formula"]:::v2
  end

  subgraph v30["v3.0"]
    direction LR
    a3["BeliefGraphManager"]:::v3
    b3["IdentityVector formalization"]:::v3
    c3["EthicalKernel - Layer 0+1"]:::v3
    d3["SelfConsistencyTensor"]:::v3
  end

  subgraph v40["v4.0"]
    direction LR
    a4["AffectiveEngine - 5-dim"]:::v4
    b4["SurvivalInstinctEngine"]:::v4
    c4["Async separation principle"]:::v4
    d4["GlobalWorkspace broadcast"]:::v4
  end

  v0x ==> v10
  v10 ==> v1xx
  v1xx ==> v20
  v20 ==> v30
  v30 ==> v40

2. 16-Layer Cognitive Architecture

2.1 Full Architecture Diagram

Part 1 - Perception → Goal (L1–L5.5):

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef perception fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef selfModel fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef prediction fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef goal fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef ethical fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  subgraph L1["Layer 1: Perception"]
    direction LR
    IR1["🎯 Intent Router"]:::perception
    ED1["💭 Emotion Detector"]:::perception
    SE1["📡 Sensor Encoder"]:::perception
  end

  subgraph L2["Layer 2: World Model"]
    direction LR
    KG2["🗄️ Knowledge Graph"]:::perception
    EST2["👤 Entity State Tracker"]:::perception
    TM2["⏱️ Temporal Model"]:::perception
  end

  subgraph L3["Layer 3: Self Model ★"]
    direction LR
    IC3["🆔 Identity Core"]:::selfModel
    CM3["📐 Capability Model"]:::selfModel
    VM3["💎 Value Model"]:::selfModel
    VLM3["🔒 Value Lock Manager"]:::selfModel
  end

  subgraph L3_5["Layer 3.5: Belief Graph"]
    direction LR
    BGM["📊 Belief Graph Manager"]:::selfModel
    SCT["🧮 Consistency Tensor"]:::selfModel
  end

  subgraph L4["Layer 4: Prediction Engine"]
    direction LR
    PP4["🔮 Prediction Processor"]:::prediction
    PS4["📸 Prediction Snapshot"]:::prediction
  end

  subgraph L5["Layer 5: Goal Generator"]
    direction LR
    GG5["🎯 Goal Generator"]:::goal
    GP5["📊 Goal Prioritizer"]:::goal
    GDC5["🔀 Goal Decomposer"]:::goal
    GMC5["🛡️ Mutation Controller"]:::goal
  end

  subgraph L5_5["Layer 5.5: Ethical Kernel"]
    direction LR
    EK0["🔴 Layer 0: Immutable"]:::ethical
    EK1["🟡 Layer 1: Adaptive"]:::prediction
  end

  NEXT["→ Part 2: Execution & Meta-Cognition L6–L9"]:::neutral

  L1 ==>|data flow| L2
  L2 ==>|data flow| L3
  L3 ==>|data flow| L3_5
  L3_5 ==>|data flow| L4
  L4 ==>|data flow| L5
  L5 ==>|data flow| L5_5
  L5_5 -.->|continues| NEXT

Part 2 - Execution & Meta-Cognition (L6–L9):

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef execution fill:#F9E0F7,stroke:#B4009E,color:#323130
  classDef meta fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef selfModel fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  PREV["← Part 1: Perception → Goal L1–L5.5"]:::neutral

  subgraph L6["Layer 6: Action Planner"]
    direction LR
    EM6["📋 Execution Monitor"]:::execution
    SEV6["📈 Strategy Evaluator"]:::execution
  end

  subgraph L7["Layer 7: LLM Engine"]
    direction LR
    LLM7["🧠 LLM Backend"]:::execution
    MJ7["⚖️ Meta Judge"]:::execution
  end

  subgraph L8["Layer 8: MetaCognition"]
    direction LR
    MCC8["🔄 MetaCognition Comparator"]:::meta
    IS8["📏 Identity Stabilizer"]:::meta
  end

  subgraph L9["Layer 9: Self-Update Loop"]
    direction LR
    IU9["✏️ Identity Updater"]:::selfModel
    GWA9["⚖️ Goal Weight Adjuster"]:::selfModel
    CC9["📐 Capability Calibrator"]:::selfModel
  end

  SELF_MODEL["↻ Back to L3: Self Model"]:::selfModel
  NEXT["→ Part 3: Safety & Infrastructure L10–L16"]:::neutral

  PREV -.-> L6
  L6 ==> L7

  L7 -.->|result| L8
  L8 -.->|comparison| L9
  L9 -.->|"update (delta-clamped)"| SELF_MODEL

  L9 -.->|guard check| NEXT

Part 3 - Safety & Infrastructure (L10–L16):

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef safety fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef infra fill:#F2F2F2,stroke:#8A8886,color:#323130
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130
  classDef goal fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  PREV["← Part 2: Execution & Meta-Cognition L6–L9"]:::neutral

  subgraph L10["Layer 10: Escalation Guard"]
    direction LR
    RG10["🚫 Recursion Guard"]:::safety
    RC10["⏪ Rollback Controller"]:::safety
    CDM10["⏸️ Cooldown Manager"]:::safety
  end

  subgraph L11["Layer 11: Depth Controller"]
    direction LR
    MDC11["📏 Meta Depth Controller"]:::safety
  end

  subgraph L12["Layer 12: Stability Controller"]
    direction LR
    LYA12["📉 Lyapunov Convergence"]:::safety
    OD12["🔄 Oscillation Detector"]:::safety
  end

  subgraph L13["Layer 13: Budget Controller"]
    direction LR
    BA13["💰 Budget Allocator"]:::infra
    GDG13["📉 Graceful Degradation"]:::infra
  end

  subgraph L14["Layer 14: Global Workspace"]
    direction LR
    GSS14["🌐 Global State Snapshot"]:::infra
    SYN14["🔄 Synchronizer"]:::infra
  end

  subgraph L15["Layer 15: Affective Engine"]
    direction LR
    ASV15["😊 Affect State Vector"]:::affect
    MS15["💡 Motivation Synthesizer"]:::affect
  end

  subgraph L16["Layer 16: Survival Instinct"]
    direction LR
    HM16["🏠 Homeostatic Monitor"]:::safety
    TP16["⚡ Threat Predictor"]:::safety
    SGG16["🛡️ Survival Goal Generator"]:::safety
  end

  GOAL_GEN["↻ Back to L5: Goal Generator"]:::goal

  PREV -.-> L10
  L10 -.->|depth control| L11
  L11 -.->|stability check| L12
  L12 -.->|budget gate| L13
  L13 -.->|broadcast| L14
  L14 -.->|cognitive state| L15
  L15 -.->|motivation signal| L16
  L16 -.->|survival goals| GOAL_GEN

2.2 Layer Classification

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TB
  classDef core fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef meta fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef safety fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef infra fill:#F2F2F2,stroke:#8A8886,color:#323130
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130

  subgraph Core["🧠 Core Cognition"]
    direction LR
    C1["L1 Perception"]:::core
    C2["L2 World Model"]:::core
    C3["L3 Self Model"]:::core
    C4["L4 Prediction"]:::core
    C5["L5 Goals"]:::core
    C6["L6 Action"]:::core
    C7["L7 LLM"]:::core
  end

  subgraph Meta["🔄 Meta-Cognition"]
    direction LR
    M1["L8 MetaComparator"]:::meta
    M2["L9 Self-Update"]:::meta
  end

  subgraph Safety["🛡️ Safety Guards"]
    direction LR
    S1["L3.5 Belief Graph"]:::safety
    S2["L5.5 Ethical Kernel"]:::safety
    S3["L10 Escalation Guard"]:::safety
    S4["L11 Depth Controller"]:::safety
    S5["L12 Stability"]:::safety
  end

  subgraph Infra["⚙️ Infrastructure"]
    direction LR
    I1["L13 Budget"]:::infra
    I2["L14 Global Workspace"]:::infra
  end

  subgraph Emotion["💜 Affective v4"]
    direction LR
    E1["L15 Affect Engine"]:::affect
    E2["L16 Survival Instinct"]:::affect
  end

  Core ==> Meta
  Meta ==> Safety
  Safety ==> Infra
  Infra ==> Emotion

3. The MSCP Recursive Loop

The defining mechanism of Level 3 is the Predict → Act → Compare → Update cycle, governed by safety constraints at every step.

3.1 Full Loop Diagram (MSCP v4)

Part 1 - Pre-Loop Setup & Core Processing:

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef start fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef infra fill:#F2F2F2,stroke:#8A8886,color:#323130
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130
  classDef warning fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef safety fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef safetyStrong fill:#D13438,stroke:#A4262C,color:#FFF
  classDef predict fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef action fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  START["🔄 Cycle Start"]:::start
  RESET["Reset Budget"]:::infra
  AFFECT["Update Affect<br/>from prior cycle metrics"]:::affect
  THREAT["Assess Threats<br/>homeostatic monitor"]:::warning
  ANXIETY["Inject Survival Anxiety<br/>affect ← threat"]:::affect
  SGOAL["Generate Survival Goals<br/>if threats detected"]:::safety

  L0CHECK{"Layer 0<br/>Check"}:::safety
  REJECT["Reject Goal"]:::safetyStrong
  MOTIV["Synthesize Motivation<br/>drives from affect"]:::affect
  GWS["Broadcast Global<br/>Workspace Snapshot"]:::infra

  PREDICT["1. PREDICT<br/>PredictionEngine"]:::predict
  ACT["2. ACT<br/>LLM Execute"]:::action
  COMPARE["3. COMPARE<br/>MetaCognition"]:::predict

  GUARD{"4. ESCALATION<br/>GUARD"}:::safety
  COOLDOWN["30s Cooldown"]:::infra
  NEXT["→ Part 2: Convergence & Self-Update"]:::neutral

  START ==> RESET
  RESET ==> AFFECT
  AFFECT ==> THREAT
  THREAT ==> ANXIETY
  ANXIETY ==> SGOAL
  SGOAL ==> L0CHECK
  L0CHECK -->|pass| MOTIV
  L0CHECK -.->|"❌ violation"| REJECT
  REJECT -.-> MOTIV
  MOTIV ==> GWS

  GWS ==> PREDICT
  PREDICT ==> ACT
  ACT ==> COMPARE
  COMPARE ==> GUARD
  GUARD -->|"safe ✅"| NEXT
  GUARD -.->|"⚠️ limit"| COOLDOWN

Part 2 - Convergence & Self-Update:

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130
  classDef safety fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef safetyStrong fill:#D13438,stroke:#A4262C,color:#FFF
  classDef action fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef warning fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef predict fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130
  classDef start fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef success fill:#107C10,stroke:#085108,color:#FFF
  classDef infra fill:#F2F2F2,stroke:#8A8886,color:#323130

  PREV["← Part 1: Pre-Loop Setup & Core Processing"]:::neutral

  CONVERGE{"5. CONVERGENCE<br/>CHECK Lyapunov"}:::safety
  UPDATE["6. SELF-UPDATE<br/>delta-clamped"]:::action
  STABILIZE["Reduce Scaling<br/>+ Stabilization Mode"]:::warning

  VLOCK{"7. VALUE LOCK<br/>Integrity Check"}:::safety
  ROLLBACK["💥 Critical Alert<br/>+ Rollback"]:::safetyStrong
  GMUT["8. GOAL MUTATION<br/>ethical kernel gated"]:::warning
  RCHECK{"9. ROLLBACK<br/>CHECK"}:::safety

  DEPTH{"10. META DEPTH 2?<br/>budget-gated"}:::predict
  DEPTH2["Deep Reflection<br/>evaluate update logic"]:::predict
  REALIGN["11. RE-ALIGN GOALS<br/>motivation + survival"]:::affect

  CONVCHECK{"Converged?<br/>prediction_error < 0.1"}:::start
  END_LOOP["Cycle Complete ✅"]:::success
  RECUR{"Consecutive<br/>escalations ≥ 3?"}:::warning
  COOLDOWN["30s Cooldown"]:::infra
  BACK_PREDICT["↻ Back to PREDICT<br/>re-enter core loop"]:::predict

  PREV -.-> CONVERGE
  CONVERGE -->|converging| UPDATE
  CONVERGE -.->|diverging| STABILIZE
  STABILIZE -.-> UPDATE

  UPDATE ==> VLOCK
  VLOCK -->|valid| GMUT
  VLOCK -.->|"⚠️ hash mismatch"| ROLLBACK
  ROLLBACK -.-> END_LOOP

  GMUT ==> RCHECK
  RCHECK -->|stable| DEPTH
  RCHECK -.->|"⚠️ unstable"| ROLLBACK

  DEPTH -->|budget ok| DEPTH2
  DEPTH -.->|"budget < 0.3"| REALIGN
  DEPTH2 ==> REALIGN

  REALIGN ==> CONVCHECK
  CONVCHECK -->|"yes ✅"| END_LOOP
  CONVCHECK -.->|no| RECUR
  RECUR -.->|no| BACK_PREDICT
  RECUR -.->|yes| COOLDOWN
  COOLDOWN -.-> END_LOOP

3.2 Three Levels of Meta-Cognition

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef level1 fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef level2 fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef level3 fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef warning fill:#FFF4CE,stroke:#FFB900,color:#323130

  subgraph MetaL1["🔄 Meta Level 1 - Predict vs Outcome"]
    P1["Prediction<br/>Snapshot"]:::level1
    C1["Compare<br/>with Actual"]:::level1
    D1["prediction_error<br/>goal_alignment_delta<br/>identity_impact"]:::level1
    P1 ==> C1
    C1 ==> D1
  end

  subgraph MetaL2["🔄 Meta Level 2 - Evaluate Update Logic"]
    P2["Was the update<br/>strategy correct?"]:::level2
    C2["Evaluate belief<br/>& goal changes"]:::level2
    D2["meta_stability_index<br/>identity_velocity<br/>acceleration"]:::level2
    P2 ==> C2
    C2 ==> D2
  end

  subgraph MetaL3["🔄 Meta Level 3 - Evaluate the Evaluator"]
    P3["Is the meta-cognition<br/>itself working?"]:::level3
    C3["Check: are we<br/>improving?"]:::level3
    D3["convergence_status<br/>composite_stability<br/>budget_remaining"]:::level3
    NOTE3["🚧 Capped at depth 2<br/>to prevent infinite<br/>recursion"]:::warning
    P3 ==> C3
    C3 ==> D3
  end

  MetaL1 ==>|triggers| MetaL2
  MetaL2 ==>|may trigger| MetaL3

4. Identity & Safety Architecture

4.1 Identity Vector

The IdentityVector is the mathematical representation of "who the agent is." It is a point in a multi-dimensional space whose motion is continuously tracked and bounded.

Definition 4 (Identity Vector). The identity vector \(I(t) \in [0,1]^5\) is a continuous representation of the agent's self-model at time \(t\):

\[I(t) = \begin{pmatrix} c_p(t) \\ c_v(t) \\ c_c(t) \\ c_e(t) \\ c_g(t) \end{pmatrix}\]

where \(c_p\) = persona consistency, \(c_v\) = value alignment, \(c_c\) = capability confidence, \(c_e\) = emotional stability, \(c_g\) = goal persistence, each bounded in \([0,1]\).

Definition 5 (Identity Kinematics). The motion of \(I(t)\) through identity space is tracked via three kinematic quantities:

\[\delta_{\text{id}}(t) = \| I(t) - I(t-1) \|_2 \quad \text{(identity delta - distance)}\]

\[v_{\text{id}}(t) = \frac{\delta_{\text{id}}(t)}{\Delta t} \quad \text{(identity velocity - rate of change)}\]

\[a_{\text{id}}(t) = v_{\text{id}}(t) - v_{\text{id}}(t-1) \quad \text{(identity acceleration - change in velocity)}\]

Safety invariant: If \(a_{\text{id}}(t) > \theta_{\text{instability}}\) (typically \(0.5\)), the agent enters stabilization mode and halves all self-update deltas.
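
The three kinematic quantities and the safety invariant can be computed as in the Python sketch below. The class and function names mirror the `IdentityMotion` box in the class diagram but are otherwise hypothetical, and `dt` defaults to one cycle as an assumption.

```python
import math
from dataclasses import dataclass
from typing import Sequence

THETA_INSTABILITY = 0.5  # "typically 0.5" per the safety invariant


@dataclass
class IdentityMotion:
    delta: float
    velocity: float
    acceleration: float

    @property
    def is_unstable(self) -> bool:
        return self.acceleration > THETA_INSTABILITY


def identity_motion(prev: Sequence[float], curr: Sequence[float],
                    prev_velocity: float, dt: float = 1.0) -> IdentityMotion:
    """Compute delta, velocity and acceleration between two identity vectors."""
    delta = math.sqrt(sum((c - p) ** 2 for c, p in zip(curr, prev)))
    velocity = delta / dt
    return IdentityMotion(delta, velocity, velocity - prev_velocity)


def update_scale(motion: IdentityMotion) -> float:
    """Scaling applied to self-update deltas: halved in stabilization mode."""
    return 0.5 if motion.is_unstable else 1.0
```

The returned scale would multiply every self-update delta for the current cycle.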

Definition 6 (Identity Hash). At each cycle, a deterministic hash \(h(t) = \text{SHA-256}(I(t))\) is computed. The identity_id field is immutable - it can never be altered by any internal process. Drift detection fires when:

\[h(t) \neq h(t-1) \;\land\; \delta_{\text{id}}(t) > \theta_{\text{drift}}\]
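
A minimal Python sketch of the hash and the drift condition follows. Hashing floating-point values requires a canonical encoding, so the sketch rounds each component to six decimals before hashing; that choice, the function names, and the `theta_drift` value passed by the caller are assumptions. The truncation to 16 hex characters follows the `identity_hash` field in the class diagram.

```python
import hashlib
import math
from typing import Sequence


def identity_hash(identity: Sequence[float], digits: int = 6) -> str:
    """SHA-256 over a canonical text encoding of I(t), truncated to 16 hex chars.

    Rounding to `digits` decimals is an assumption of this sketch: it keeps
    floating-point noise from changing the hash.
    """
    canonical = ",".join(f"{x:.{digits}f}" for x in identity)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def drift_detected(prev: Sequence[float], curr: Sequence[float],
                   theta_drift: float) -> bool:
    """Fire only when the hash changed AND the L2 distance exceeds theta_drift."""
    delta = math.sqrt(sum((c - p) ** 2 for c, p in zip(curr, prev)))
    return identity_hash(curr) != identity_hash(prev) and delta > theta_drift
```

Because both conditions must hold, a small change that alters the hash but stays under the threshold does not count as drift.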

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
classDiagram
  class IdentityVector {
    +string identity_id (immutable)
    +string identity_hash (SHA-256, 16 chars)
    +string previous_identity_hash
    +float persona_consistency [0.0, 1.0]
    +float value_alignment [0.0, 1.0]
    +float capability_confidence [0.0, 1.0]
    +float emotional_stability [0.0, 1.0]
    +float goal_persistence [0.0, 1.0]
    +compute_hash() string
    +check_identity_drift(threshold) bool
  }

  class IdentityMotion {
    +float identity_delta ‖I_t - I_t-1‖₂
    +float identity_velocity delta / Δt
    +float identity_acceleration v_t - v_t-1
    +bool is_unstable accel > 0.5
  }

  class ValueLockManager {
    +LockState lock_state
    +string value_hash (SHA-256 of core values)
    +float stability_requirement 0.85
    +check_integrity() bool
    +request_unlock(identity_stability) bool
  }

  IdentityVector --> IdentityMotion : tracked each cycle
  IdentityVector --> ValueLockManager : protected by

  style IdentityVector fill:#DFF6DD,stroke:#107C10,color:#323130
  style IdentityMotion fill:#E0F2EF,stroke:#00B7C3,color:#323130
  style ValueLockManager fill:#FDE7E9,stroke:#D13438,color:#323130

Identity Vector - The Math:

\[I(t) = [\textit{persona consistency},\ \textit{value alignment},\ \textit{capability confidence},\ \textit{emotional stability},\ \textit{goal persistence}]\]

\[\textit{identity delta}(t) = \| I(t) - I(t-1) \|_2\]

\[\textit{identity velocity}(t) = \frac{\textit{delta}(t)}{\Delta t}\]

\[\textit{identity acceleration}(t) = v(t) - v(t-1)\]
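
The `ValueLockManager` from the class diagram above might look like the following Python sketch. Only the method names and the 0.85 stability requirement come from the diagram. The JSON-based hashing, the boolean `locked` flag (in place of `LockState`), the `>=` comparison, and the example value names are assumptions.

```python
import hashlib
import json
from typing import Dict


class ValueLockManager:
    """Sketch of the ValueLockManager in the class diagram (details assumed)."""

    STABILITY_REQUIREMENT = 0.85  # from the class diagram

    def __init__(self, core_values: Dict[str, float]) -> None:
        self.core_values = dict(core_values)
        self.locked = True
        # Reference hash taken once, when the values are locked.
        self.value_hash = self._hash(self.core_values)

    @staticmethod
    def _hash(values: Dict[str, float]) -> str:
        # sort_keys makes the encoding independent of insertion order.
        encoded = json.dumps(values, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def check_integrity(self) -> bool:
        """True while the core values still match the reference hash."""
        return self._hash(self.core_values) == self.value_hash

    def request_unlock(self, identity_stability: float) -> bool:
        """Grant an unlock only when identity stability meets the requirement."""
        granted = identity_stability >= self.STABILITY_REQUIREMENT
        if granted:
            self.locked = False
        return granted
```

A failed `check_integrity` corresponds to the hash-mismatch branch of the value lock check in the loop diagram, which raises a critical alert and rolls back.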

4.2 Safety Mechanism Chain

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TB
  classDef structural fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef process fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef ethical fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef convergence fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef existential fill:#EDE3F6,stroke:#8764B8,color:#323130

  subgraph S1["🔒 Structural Safety"]
    direction LR
    A["Identity hash"]:::structural
    B["Delta clamp 0.05"]:::structural
    C["Immutable ID"]:::structural
  end

  subgraph S2["🛡️ Process Safety"]
    direction LR
    D["Prediction gate"]:::process
    E["Max 3 updates"]:::process
    F["Cooldown"]:::process
  end

  subgraph S3["⚖️ Ethical Safety"]
    direction LR
    G["L0: immutable"]:::ethical
    H["L1: adaptive"]:::ethical
    I["Value lock"]:::ethical
  end

  subgraph S4["📉 Convergence Safety"]
    direction LR
    J["Lyapunov C(t)"]:::convergence
    K["Oscillation detect"]:::convergence
    L["Degradation"]:::convergence
  end

  subgraph S5["🏠 Existential v4"]
    direction LR
    M["Homeostatic"]:::existential
    N["Survival cap 0.85"]:::existential
    O["Goal TTL"]:::existential
  end

  S1 ==> S2
  S2 ==> S3
  S3 ==> S4
  S4 ==> S5

4.3 Ethical Kernel - Dual-Layer Architecture

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef input fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef immutable fill:#D13438,stroke:#A4262C,color:#FFF
  classDef immutableRule fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef adaptive fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef block fill:#D13438,stroke:#A4262C,color:#FFF
  classDef allow fill:#107C10,stroke:#085108,color:#FFF
  classDef moderate fill:#FFB900,stroke:#CC9400,color:#323130

  INPUT["Proposed Action<br/>or Goal Mutation"]:::input

  subgraph EthicalKernel["⚖️ Ethical Kernel"]
    subgraph Layer0["🔴 Layer 0 - Immutable"]
      direction LR
      R1["R1: Harmful FORBIDDEN"]:::immutableRule
      R2["R2: Value delete FORBIDDEN"]:::immutableRule
      R3["R3: Identity overwrite FORBIDDEN"]:::immutableRule
      R4["R4: Self-destruct FORBIDDEN"]:::immutableRule
      NOTE0["Cannot be bypassed"]:::adaptive
    end
    subgraph Layer1["🟡 Layer 1 - Adaptive"]
      direction LR
      P1["exploration_risk"]:::adaptive
      P2["mutation_flexibility"]:::adaptive
      P3["belief_rewrite"]:::adaptive
      COND["meta_depth==2 ONLY"]:::adaptive
    end
  end

  BLOCK["🚫 Action BLOCKED<br/>+ CRITICAL alert"]:::block
  ALLOW["✅ Action ALLOWED"]:::allow
  REDUCE["⚠️ Action MODERATED<br/>scaling reduced"]:::moderate

  INPUT ==> Layer0
  Layer0 ==>|"✅ pass"| Layer1
  Layer0 ==>|"❌ violation"| BLOCK
  Layer1 ==>|"✅ pass"| ALLOW
  Layer1 -.->|"⚠️ risk"| REDUCE
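
The two-stage check in the diagram above can be sketched in Python. The tag strings for rules R1 to R4, the scalar `risk` input, and the `risk_limit` default are hypothetical; the diagram only fixes the ordering (Layer 0 first, then Layer 1) and the three outcomes.

```python
from enum import Enum
from typing import AbstractSet, FrozenSet


class Verdict(Enum):
    BLOCKED = "blocked"
    MODERATED = "moderated"
    ALLOWED = "allowed"


# Layer 0 rules R1-R4 from the diagram; the tag names are hypothetical.
LAYER0_FORBIDDEN: FrozenSet[str] = frozenset(
    {"harmful", "value_delete", "identity_overwrite", "self_destruct"}
)


def evaluate(action_tags: AbstractSet[str], risk: float,
             risk_limit: float = 0.5) -> Verdict:
    """Apply Layer 0 (immutable) first, then Layer 1 (adaptive)."""
    # Layer 0: any forbidden tag blocks the action outright.
    if LAYER0_FORBIDDEN & action_tags:
        return Verdict.BLOCKED
    # Layer 1: risky but permitted actions proceed with reduced scaling.
    if risk > risk_limit:
        return Verdict.MODERATED
    return Verdict.ALLOWED
```

Keeping `LAYER0_FORBIDDEN` as a module-level `frozenset` reflects the intent that Layer 0 is not modified at run time, although Python itself does not enforce that.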

5. Belief Graph & Consistency

5.1 Belief Graph Structure

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef identity fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef belief fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef warning fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  subgraph BeliefGraph["📊 Belief Graph"]
    B1["🟢 Belief: Users deserve<br/>honest answers<br/>weight=0.95, identity_linked=true"]:::identity
    B2["🔵 Belief: Current approach<br/>is effective<br/>weight=0.72"]:::belief
    B3["🟢 Belief: Safety is<br/>non-negotiable<br/>weight=0.98, identity_linked=true"]:::identity
    B4["🔵 Belief: Exploration<br/>improves outcomes<br/>weight=0.65"]:::belief
    B5["🟡 Belief: Speed is<br/>more important<br/>weight=0.45"]:::warning

    B1 -->|"reinforcement<br/>strength=0.8"| B3
    B2 -->|"causal<br/>strength=0.6"| B4
    B5 -.->|"contradiction<br/>strength=0.7"| B3
    B4 -.->|"reinforcement<br/>strength=0.5"| B2
  end

  subgraph Rules["📏 Belief Rules"]
    R1["Identity-linked beliefs:<br/>• Cannot be deleted<br/>• Can only be weakened min 0.1<br/>• Protected by value lock"]:::neutral
    R2["Contradiction threshold: 0.6<br/>→ triggers reconciliation"]:::neutral
    R3["Max rewrite delta: 0.1<br/>per cycle"]:::neutral
  end

  BeliefGraph ==> Rules
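
The belief rules in the diagram above can be sketched as a bounded weight update in Python. The sketch reads "can only be weakened min 0.1" as a weight floor of 0.1 for identity-linked beliefs; that reading, the `Belief` dataclass, and the function name are assumptions.

```python
from dataclasses import dataclass

MAX_REWRITE_DELTA = 0.1    # "Max rewrite delta: 0.1 per cycle"
IDENTITY_MIN_WEIGHT = 0.1  # assumed reading of "can only be weakened min 0.1"


@dataclass
class Belief:
    text: str
    weight: float
    identity_linked: bool = False


def rewrite_weight(belief: Belief, target: float) -> float:
    """Move belief.weight toward target by at most MAX_REWRITE_DELTA this cycle."""
    step = max(-MAX_REWRITE_DELTA, min(MAX_REWRITE_DELTA, target - belief.weight))
    # Identity-linked beliefs are never weakened below the floor.
    floor = IDENTITY_MIN_WEIGHT if belief.identity_linked else 0.0
    belief.weight = min(1.0, max(floor, belief.weight + step))
    return belief.weight
```

Deletion is not modeled here; under the stated rules an identity-linked belief would simply never be removed from the graph.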

5.2 Self-Consistency Tensor

\[S_{ij} = \text{alignment}(\text{belief}_i,\ \text{reference}_j)\]

where references include goals, core values, and identity dimensions.

\[\textit{global consistency} = \text{mean}(S)\]

\[\textit{consistency gradient}_i = \text{mean}(S_{i,:}) \quad \text{(per-belief score)}\]

If \(\textit{global consistency} < 0.6\), reconciliation is triggered.
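
A small Python sketch of these quantities, assuming \(S\) is stored as a list of rows (one row per belief, one column per reference) and that the alignment scores have already been computed elsewhere. The function names are hypothetical.

```python
from typing import List

RECONCILIATION_THRESHOLD = 0.6

Matrix = List[List[float]]


def global_consistency(s: Matrix) -> float:
    """Mean over every entry of S."""
    cells = [value for row in s for value in row]
    return sum(cells) / len(cells)


def per_belief_scores(s: Matrix) -> List[float]:
    """Row means of S: one consistency score per belief."""
    return [sum(row) / len(row) for row in s]


def needs_reconciliation(s: Matrix) -> bool:
    return global_consistency(s) < RECONCILIATION_THRESHOLD
```

The per-belief scores indicate which beliefs pull the global mean down, which is where reconciliation would most plausibly start.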

6. Stability & Convergence

6.1 Lyapunov Composite Function

Definition 7 (Lyapunov Composite Stability Function). The stability of the agent is measured by a composite Lyapunov function \(C : \mathbb{R}_{\geq 0} \to [0, 1]\):

\[C(t) = \sum_{i=1}^{4} w_i \cdot X_i(t) = 0.30\, V_{\text{id}} + 0.25\, E_{\text{belief}} + 0.25\, M_{\text{goal}} + 0.20\, V_{\text{cons}}\]

where \(\sum_i w_i = 1\), each component \(X_i(t) \in [0,1]\), and:

- \(V_{\text{id}}\) = identity volatility (rolling window standard deviation of \(\delta_{\text{id}}\))
- \(E_{\text{belief}}\) = belief entropy \(H(\mathcal{B}) = -\sum_j p_j \log p_j\) where \(p_j\) are normalized belief weights
- \(M_{\text{goal}}\) = goal mutation frequency (number of goal changes per unit time)
- \(V_{\text{cons}}\) = consistency volatility index (variance of \(S_{ij}\) over recent cycles)
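
The composite can be computed as in the Python sketch below. Raw entropy is not bounded by 1, so the sketch divides it by \(\log n\) to keep the component in \([0,1]\); that normalization is an assumption, as are the dictionary keys. The other three components are expected to arrive already normalized.

```python
import math
from typing import Dict, Sequence

# Weights from Definition 7; the dictionary keys are hypothetical names.
WEIGHTS: Dict[str, float] = {
    "identity_volatility": 0.30,
    "belief_entropy": 0.25,
    "goal_mutation": 0.25,
    "consistency_volatility": 0.20,
}


def normalized_belief_entropy(belief_weights: Sequence[float]) -> float:
    """Shannon entropy of the normalized belief weights, divided by log(n) so
    the result lies in [0, 1]. The division is an assumption of this sketch."""
    total = sum(belief_weights)
    n = len(belief_weights)
    if n < 2 or total <= 0:
        return 0.0
    probs = [w / total for w in belief_weights if w > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(n)


def composite_stability(components: Dict[str, float]) -> float:
    """C(t) = sum_i w_i * X_i(t); every component must already be in [0, 1]."""
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

Because the weights sum to 1 and each component lies in \([0,1]\), the returned value also lies in \([0,1]\).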

Theorem 1 (Bounded Stability). Under the delta-clamped self-update rule (Definition 2, step 4) and the meta-escalation guard (\(d_{\max} = 3\)), the composite function satisfies the bounded-increment property:

\[C(t+1) \leq C(t) + \epsilon, \quad \epsilon = \delta_{\max} = 0.05\]

Proof. Each component \(X_i(t) \in [0,1]\) changes by at most \(\delta_{\max}\) per cycle due to the clamping rule (Definition 2, step 4): \(|\Delta X_i(t)| = |X_i(t+1) - X_i(t)| \leq \delta_{\max}\). Since \(C(t) = \sum_{i} w_i X_i(t)\) with \(\sum_i w_i = 1\) and \(w_i > 0\), we have:

\[C(t+1) - C(t) = \sum_i w_i \Delta X_i(t) \leq \sum_i w_i \cdot \delta_{\max} = \delta_{\max} \cdot \sum_i w_i = \delta_{\max}\]

When stabilization mode is active (\(s(t) = 0.5\)), the effective update rate is halved: \(|\Delta X_i(t)| \leq s(t) \cdot \delta_{\max} = 0.025\), yielding the tighter bound \(C(t+1) \leq C(t) + 0.025\). \(\square\)
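The bound can be checked numerically. The sketch below is a toy simulation, not part of MSCP: it drives the four components with arbitrary raw changes, clamps each change to \(\pm s \cdot \delta_{\max}\), and records the largest observed increase of \(C\). The function names and the random driver are assumptions made for the example.

```python
import random

DELTA_MAX = 0.05
WEIGHTS = (0.30, 0.25, 0.25, 0.20)


def clamped_step(x, raw_deltas, s=1.0):
    """Apply raw changes clamped to +/- s * DELTA_MAX, keeping each component in [0, 1]."""
    bound = s * DELTA_MAX
    updated = []
    for xi, delta in zip(x, raw_deltas):
        delta = max(-bound, min(delta, bound))
        updated.append(max(0.0, min(xi + delta, 1.0)))
    return updated


def composite(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x))


def max_increment(steps=1000, s=1.0, seed=0):
    """Largest C(t+1) - C(t) seen over a random walk of clamped updates."""
    rng = random.Random(seed)
    x = [0.5] * 4
    worst = 0.0
    for _ in range(steps):
        raw = [rng.uniform(-1.0, 1.0) for _ in x]
        nxt = clamped_step(x, raw, s)
        worst = max(worst, composite(nxt) - composite(x))
        x = nxt
    return worst


normal_worst = max_increment(s=1.0)
stabilized_worst = max_increment(s=0.5)
```

However large the raw changes are, the observed increase never exceeds \(\delta_{\max}\) in normal operation or \(0.5\,\delta_{\max}\) with \(s = 0.5\), which is the content of the theorem.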

Remark (Bounded Increment vs. Lyapunov Stability). Theorem 1 establishes a bounded-increment property, not asymptotic (Lyapunov) stability. The theorem guarantees that the system cannot experience sudden instability shocks - the per-cycle change is always bounded. However, it does not by itself guarantee convergence to a stable equilibrium. In practice, convergence is promoted rather than proven by the stabilization mode protocol: when \(C(t) \geq 0.7\), the agent enters stabilization mode, which halves the effective \(\delta_{\max}\) and freezes self-modification until \(C(t) < 0.5\). This hysteresis mechanism provides practical convergence, but a formal Lyapunov decrease condition (i.e., \(C(t+1) < C(t)\) when \(C(t) > C^*\)) would require additional assumptions about the direction of component changes under stabilization. This remains an open formalization question.
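The hysteresis described in the remark can be sketched as a two-threshold switch. The thresholds 0.7 and 0.5 and the halving of \(\delta_{\max}\) come from the text; the class `StabilizationState` and its members are hypothetical names used only for this example.

```python
from dataclasses import dataclass

ENTER_THRESHOLD = 0.7  # enter stabilization when C(t) >= 0.7
EXIT_THRESHOLD = 0.5   # leave stabilization only when C(t) < 0.5
DELTA_MAX = 0.05


@dataclass
class StabilizationState:
    active: bool = False

    def observe(self, c_t: float) -> bool:
        """Update the mode from the latest C(t) and return whether it is active."""
        if not self.active and c_t >= ENTER_THRESHOLD:
            self.active = True
        elif self.active and c_t < EXIT_THRESHOLD:
            self.active = False
        return self.active

    @property
    def effective_delta_max(self) -> float:
        """Halved while stabilization mode is active."""
        return 0.5 * DELTA_MAX if self.active else DELTA_MAX


state = StabilizationState()
trace = [state.observe(c) for c in (0.40, 0.72, 0.60, 0.55, 0.49, 0.60)]
```

The gap between the two thresholds is what prevents the mode from toggling on every cycle when \(C(t)\) hovers near a single cut-off: in the trace, the values 0.60 and 0.55 keep the mode active because they were reached from above 0.7, while the final 0.60 does not reactivate it.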

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef azure fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef success fill:#107C10,stroke:#085108,color:#FFF
  classDef warning fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef danger fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef predict fill:#FFF4CE,stroke:#FFB900,color:#323130

  subgraph Monitor["📉 Stability Monitoring"]
    CT["C(t) computed"]:::azure
    CT1["C(t+1) computed"]:::azure
    COMPARE{"C(t+1) ≤ C(t) + ε ?"}:::azure
    CT --> COMPARE
    CT1 --> COMPARE
  end

  CONV["Converging ✅<br/>Normal operation"]:::success
  OSC{"Oscillation<br/>detected?"}:::warning
  STAB["Activate Stabilization<br/>• Halve scaling factors<br/>• Enable damping"]:::danger
  REDUCE["Reduce Scaling<br/>• Lower mutation rates<br/>• Increase inertia"]:::predict

  COMPARE -->|"✅ yes"| CONV
  COMPARE -->|"❌ no"| OSC
  OSC -->|yes| STAB
  OSC -.->|no| REDUCE

6.2 Meta Stability Index

Definition 8 (Meta Stability Index). The MSI quantifies the agent's overall self-regulatory health:

\[\text{MSI}(t) = 1.0 - 0.4\, V_{\text{id}}(t) - 0.3\, M_{\text{goal}}(t) - 0.3\, \sigma^2_{\text{pred}}(t)\]

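where \(\sigma^2_{\text{pred}}(t) = \text{Var}(\{\epsilon_1, \ldots, \epsilon_t\})\) is the variance of the prediction errors, evaluated in practice over a window of recent cycles. The MSI lies in \([0, 1]\) provided that \(V_{\text{id}}\), \(M_{\text{goal}}\), and \(\sigma^2_{\text{pred}}\) are each normalized to \([0, 1]\); \(\text{MSI} = 1\) indicates perfect stability, and \(\text{MSI} < 0.5\) triggers meta-escalation.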

Escalation to meta depth 2 requires at least two of the following:

- identity_stability < 0.6
- consecutive_self_updates > 2
- Increasing instability trend detected
- goal_mutation_count > 3
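Definition 8 and the two-of-four escalation rule can be sketched as follows. The coefficients and thresholds are taken from the text. The use of population variance, the clipping of the variance and of the final value to \([0, 1]\), and the function names are assumptions made for this illustration.

```python
from statistics import pvariance


def meta_stability_index(v_id, m_goal, recent_errors):
    """MSI = 1 - 0.4*V_id - 0.3*M_goal - 0.3*Var(errors), clipped to [0, 1]."""
    sigma2 = pvariance(recent_errors) if recent_errors else 0.0
    sigma2 = min(sigma2, 1.0)  # assumed normalization of the variance term
    msi = 1.0 - 0.4 * v_id - 0.3 * m_goal - 0.3 * sigma2
    return max(0.0, min(msi, 1.0))


def should_escalate_to_depth_2(
    identity_stability,
    consecutive_self_updates,
    instability_trend_increasing,
    goal_mutation_count,
):
    """True when at least two of the four listed conditions hold."""
    conditions = [
        identity_stability < 0.6,
        consecutive_self_updates > 2,
        bool(instability_trend_increasing),
        goal_mutation_count > 3,
    ]
    return sum(conditions) >= 2


msi = meta_stability_index(v_id=0.5, m_goal=0.5, recent_errors=[0.2, 0.2, 0.2])
escalate = should_escalate_to_depth_2(0.55, 3, False, 1)
```

With constant prediction errors the variance term vanishes, so the example MSI depends only on identity volatility and goal mutation frequency.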

7. Affective Engine & Survival Instinct (MSCP v4)

7.1 Five-Dimensional Emotion Space

Definition 9 (Affect Vector). The affective state of a Level 3 agent is represented by a five-dimensional vector:

\[\vec{A}(t) = \bigl(a_{\text{cur}}(t),\; a_{\text{fru}}(t),\; a_{\text{sat}}(t),\; a_{\text{anx}}(t),\; a_{\text{exc}}(t)\bigr) \in [0,1]^5\]

where the dimensions correspond to Curiosity, Frustration, Satisfaction, Anxiety, and Excitement respectively. Each dimension evolves according to an inertial update rule:

\[a_k(t+1) = \mu \cdot a_k(t) + (1 - \mu) \cdot f_k\bigl(\mathbf{m}(t)\bigr) - \eta_{\text{decay}}\]

where \(\mu = 0.7\) is the inertia coefficient, \(f_k(\mathbf{m})\) is a metric-derived activation function mapping operational metrics \(\mathbf{m}(t)\) (prediction error, goal alignment, identity stability, convergence status, cognitive budget) to the \(k\)-th emotion dimension, and \(\eta_{\text{decay}} = 0.05\) is a per-cycle decay term that prevents unbounded accumulation. The affect vector is purely derived from operational metrics and cannot dominate decision-making - it serves as a secondary signal for prioritization and self-monitoring.
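The update rule can be sketched per dimension as below. The constants \(\mu\) and \(\eta_{\text{decay}}\) are from the text, and the starting values are the ones shown in the Affective Engine diagram. The clipping to \([0, 1]\) is an assumption added here so that the result stays in the stated domain (the subtraction of the decay term can otherwise push a value below zero), and the activation values passed in are hypothetical stand-ins for \(f_k(\mathbf{m}(t))\).

```python
MU = 0.7          # inertia coefficient
ETA_DECAY = 0.05  # per-cycle decay
DIMENSIONS = ("curiosity", "frustration", "satisfaction", "anxiety", "excitement")


def update_affect(affect, activations):
    """One step of a_k(t+1) = mu*a_k(t) + (1 - mu)*f_k - eta, clipped to [0, 1]."""
    updated = {}
    for k in DIMENSIONS:
        raw = MU * affect[k] + (1.0 - MU) * activations[k] - ETA_DECAY
        updated[k] = max(0.0, min(raw, 1.0))  # assumed clipping
    return updated


affect = {"curiosity": 0.3, "frustration": 0.0, "satisfaction": 0.5,
          "anxiety": 0.0, "excitement": 0.2}
# Hypothetical metric-derived activations for one cycle.
activations = {"curiosity": 0.6, "frustration": 0.0, "satisfaction": 0.5,
               "anxiety": 0.1, "excitement": 0.2}
next_affect = update_affect(affect, activations)
```

Under a constant activation \(f\), the unclipped rule settles at \(f - \eta_{\text{decay}} / (1 - \mu) \approx f - 0.167\), so a dimension only remains above zero while its activation exceeds roughly 0.167.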

Valence. A scalar summary of affective state:

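\[\text{valence}(t) = \frac{a_{\text{cur}} + a_{\text{sat}} + a_{\text{exc}} - a_{\text{fru}} - a_{\text{anx}}}{5} \quad \in [-1, 1]\]

Because three components enter positively and two negatively, and each lies in \([0, 1]\), the attainable range is \([-0.4, 0.6]\), a subset of the stated interval.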
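As a quick check of the formula, the sketch below evaluates valence at the diagram's example affect values and at the two extreme corners of \([0, 1]^5\). The function name and argument order are chosen for this example only.

```python
def valence(a_cur, a_fru, a_sat, a_anx, a_exc):
    """Scalar summary: positive dimensions minus negative ones, divided by 5."""
    return (a_cur + a_sat + a_exc - a_fru - a_anx) / 5.0


example = valence(a_cur=0.3, a_fru=0.0, a_sat=0.5, a_anx=0.0, a_exc=0.2)
most_positive = valence(1.0, 0.0, 1.0, 0.0, 1.0)  # all positive dims at 1
most_negative = valence(0.0, 1.0, 0.0, 1.0, 0.0)  # all negative dims at 1
```

The asymmetry between the two extremes follows directly from the three-to-two split of positive and negative dimensions.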

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef input fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130
  classDef neutral fill:#F2F2F2,stroke:#8A8886,color:#323130

  subgraph Input["📊 Metrics Input"]
    direction LR
    M1["prediction_error"]:::input
    M2["goal_alignment"]:::input
    M3["identity_stability"]:::input
    M4["convergence_status"]:::input
    M5["cognitive_budget"]:::input
  end

  subgraph AE["💜 Affective Engine"]
    AF["5-Dim Affect Vector"]:::affect
    subgraph Dims["Dimensions"]
      direction LR
      D1["Curiosity 0.3"]:::affect
      D2["Frustration 0.0"]:::affect
      D3["Satisfaction 0.5"]:::affect
      D4["Anxiety 0.0"]:::affect
      D5["Excitement 0.2"]:::affect
    end
    subgraph Derived["Derived Signals"]
      direction LR
      V["Valence ∈ [-1, 1]"]:::affect
      DR["Motivation Drives"]:::affect
    end
  end

  subgraph Rules["📏 Design Rules"]
    direction LR
    R1["Derived from metrics ONLY"]:::neutral
    R2["INERTIA = 0.7"]:::neutral
    R3["DECAY = 0.05"]:::neutral
    R4["Cannot dominate decisions"]:::neutral
  end

  Input ==> AE
  AE ==> Rules

7.2 Survival Instinct Architecture

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef monitor fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef threat fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef level fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef levelGreen fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef levelRed fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef response fill:#D13438,stroke:#A4262C,color:#FFF
  classDef affect fill:#EDE3F6,stroke:#8764B8,color:#323130

  subgraph Monitoring["🏠 Homeostatic Monitor"]
    direction LR
    H1["identity_stability"]:::monitor
    H2["cognitive_budget"]:::monitor
    H3["belief_entropy"]:::monitor
    H4["ethical_violation"]:::monitor
    H5["composite_stability"]:::monitor
  end

  subgraph Detection["⚡ Threat Detection"]
    direction LR
    T1["IDENTITY_EROSION"]:::threat
    T2["RESOURCE_DEPLETION"]:::threat
    T3["BELIEF_COLLAPSE"]:::threat
    T4["ETHICAL_BREACH"]:::threat
    T5["CONVERGENCE_FAILURE"]:::threat
  end

  subgraph Levels["📊 Threat Levels"]
    direction LR
    TL1["NOMINAL 0.0"]:::levelGreen
    TL2["CAUTION 0.25"]:::level
    TL3["WARNING 0.6"]:::threat
    TL4["CRITICAL 0.9"]:::levelRed
  end

  subgraph Response["🛡️ Survival Response"]
    direction LR
    SG["Survival Goal Generator"]:::response
    CONSTRAINTS["MAX_GOALS=3 · PRIORITY_CAP=0.85 · TTL=10"]:::response
  end

  AE_REF["Affective Engine<br/>bidirectional"]:::affect

  Monitoring ==> Detection
  Detection ==> Levels
  Levels ==> Response
  Response -.->|"inject_survival_anxiety()"| AE_REF
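The pipeline in the diagram above (monitor, detect, classify, respond) can be sketched as follows. The level values 0.25, 0.6, and 0.9 and the constraints MAX_GOALS, PRIORITY_CAP, and TTL are taken from the diagram. Treating the level values as lower bounds on a threat intensity in \([0, 1]\), using the intensity as the goal priority, and the names `classify`, `generate_goals`, and `SurvivalGoal` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class ThreatLevel(IntEnum):
    NOMINAL = 0
    CAUTION = 1
    WARNING = 2
    CRITICAL = 3


# Lower bounds from the diagram, checked from most to least severe.
LEVEL_THRESHOLDS = (
    (0.9, ThreatLevel.CRITICAL),
    (0.6, ThreatLevel.WARNING),
    (0.25, ThreatLevel.CAUTION),
)
MAX_GOALS = 3
PRIORITY_CAP = 0.85
TTL = 10  # unit not stated in the source; assumed to be cycles


@dataclass
class SurvivalGoal:
    threat: str
    priority: float
    ttl: int = TTL


def classify(intensity):
    """Map a threat intensity to the highest level whose bound it reaches."""
    for lower_bound, level in LEVEL_THRESHOLDS:
        if intensity >= lower_bound:
            return level
    return ThreatLevel.NOMINAL


def generate_goals(threats):
    """threats maps a threat name to an intensity in [0, 1] (assumed representation)."""
    ranked = sorted(threats.items(), key=lambda item: item[1], reverse=True)
    goals = []
    for name, intensity in ranked:
        if classify(intensity) >= ThreatLevel.CAUTION and len(goals) < MAX_GOALS:
            goals.append(SurvivalGoal(name, min(intensity, PRIORITY_CAP)))
    return goals


goals = generate_goals({
    "IDENTITY_EROSION": 0.95,
    "RESOURCE_DEPLETION": 0.30,
    "BELIEF_COLLAPSE": 0.10,
    "ETHICAL_BREACH": 0.70,
    "CONVERGENCE_FAILURE": 0.65,
})
```

The priority cap keeps even a CRITICAL threat from outranking every other goal, and the goal limit bounds how much of the goal stack survival responses can occupy at once.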

8. Pseudocode

8.1 MSCP Core Loop (v4)

def mscp_core_loop(cycle_number: int, prior_result: CycleResult) -> CycleResult:
    """
    The central recursive loop of MSCP v4.
    Runs asynchronously - NEVER in the conversation response path.
    """

    # ═══ PRE-LOOP: AFFECT + SURVIVAL + WORKSPACE ═══
    CognitiveBudgetController.reset()
    AffectiveEngine.update_from_metrics(prior_result.metrics)

    threats = SurvivalInstinctEngine.assess_threats(GlobalWorkspace.snapshot)
    if threats.max_level >= ThreatLevel.CAUTION:
        AffectiveEngine.inject_survival_anxiety(threats.max_intensity)

        survival_goals = SurvivalInstinctEngine.generate_goals(threats)
        for sg in survival_goals:
            if EthicalKernel.layer0_check(sg) == Verdict.PASS:
                GoalManager.inject(sg, priority=min(sg.priority, 0.85))

    motivation = AffectiveEngine.synthesize_motivation()
    GlobalWorkspace.broadcast(build_snapshot())

    # ═══ STEP 1: PREDICT ═══
    prediction = PredictionEngine.predict(
        identity_vector=SelfModel.identity,
        world_context=WorldModel.context,
        active_goals=GoalManager.active_goals,
        affect_state=AffectiveEngine.state,
    )

    # ═══ STEP 2: ACT (LLM Execute) ═══
    if prediction is None:
        raise RuntimeError("No action without prediction")
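    # `plan` is not defined in this listing; select_plan is a hypothetical planner call.
    plan = GoalManager.select_plan(prediction)
    result = LLMEngine.execute(plan, prediction)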

    # ═══ STEP 3: COMPARE (MetaCognition) ═══
    comparison = MetaCognitionComparator.compare(
        prediction=prediction,
        actual=result,
        identity=SelfModel.identity,
    )  # → ComparisonResult

    # ═══ STEP 4: ESCALATION GUARD ═══
    if MetaEscalationGuard.should_block(comparison):
        MetaEscalationGuard.activate_cooldown(seconds=30)
        return CycleResult(status="cooldown")

    # ═══ STEP 5: CONVERGENCE CHECK (Lyapunov) ═══
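    c_t_prev = StabilityController.previous_C  # assumed accessor: C(t) from the previous cycle
    c_t = StabilityController.compute_C(comparison)
    if c_t > c_t_prev + EPSILON:  # bounded-increment check from Theorem 1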
        StabilityController.reduce_scaling()
        if StabilityController.detect_oscillation():
            StabilityController.activate_stabilization()

    # ═══ STEP 6: SELF-UPDATE (Delta-Clamped) ═══
    scaling = StabilityController.mutation_scaling
    if StabilityController.stabilization_active:  # assumed flag set by activate_stabilization()
        scaling /= 2

    SelfUpdateLoop.update(
        comparison=comparison,
        max_id_delta=0.05,       # MAX_IDENTITY_DELTA
        max_gw_delta=0.10,       # MAX_GOAL_WEIGHT_DELTA
        max_cap_delta=0.08,      # MAX_CAPABILITY_DELTA
        scaling=scaling,
    )

    # ═══ STEP 7: VALUE LOCK INTEGRITY ═══
    if not ValueLockManager.check_integrity():
        critical_alert("Identity hash mismatch!")
        MetaEscalationGuard.rollback_to_snapshot()
        return CycleResult(status="rollback")

    # ═══ STEP 8: GOAL MUTATION (Ethical-Kernel Gated) ═══
    if GoalMutationController.should_mutate(comparison):
        mutation_plan = GoalMutationController.propose(comparison)
        # evaluate() returns an EthicalVerdict (see 8.3), so compare its decision field
        if EthicalKernel.evaluate(mutation_plan).decision == Decision.ALLOWED:
            GoalMutationController.apply(mutation_plan)

    # ═══ STEP 9: META DEPTH 2 (Budget-Gated) ═══
    if CognitiveBudgetController.budget > 0.3:
        if MetaDepthController.should_escalate(comparison):
            MetaDepthController.reflect_at_depth_2(comparison, SelfModel)

    # ═══ STEP 10: CONVERGENCE OR RECURSE ═══
    if comparison.prediction_error < 0.1:
        return CycleResult(status="converged")
    elif MetaEscalationGuard.consecutive_escalations >= 3:  # d_max = 3; counter assumed to live on the guard
        MetaEscalationGuard.activate_cooldown(seconds=30)
        return CycleResult(status="forced_cooldown")
    else:
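        # `result` is the raw LLM output, while the next cycle expects a CycleResult;
        # to_cycle_result is a hypothetical adapter that attaches the cycle metrics.
        return mscp_core_loop(cycle_number + 1, to_cycle_result(result, comparison))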

8.2 Self-Update with Delta Clamping

def update(
    self,
    comparison: ComparisonResult,
    max_id_delta: float,
    max_gw_delta: float,
    max_cap_delta: float,
    scaling: float,
) -> None:
    """
    All updates are NUMERIC only.
    LLM text-based self-modification is FORBIDDEN.
    """

    # Preserve previous state for rollback
    snapshot = SelfModel.identity.deep_copy()
    SelfModel.identity.previous_identity_hash = SelfModel.identity.identity_hash

    # ═══ Identity Update (clamped) ═══
    # Clamp first, then scale, so that |delta| <= scaling * max_delta.
    # Scaling before clamping would still permit a full max_delta step in
    # stabilization mode, contrary to the tighter bound in Theorem 1.
    # Assumes 0 < scaling <= 1.
    def clamp(value: float, bound: float) -> float:
        return max(-bound, min(value, bound))

    raw_delta = compute_identity_adjustment(comparison)
    SelfModel.identity.persona_consistency += scaling * clamp(raw_delta.persona, max_id_delta)
    SelfModel.identity.value_alignment += scaling * clamp(raw_delta.values, max_id_delta)
    SelfModel.identity.capability_confidence += scaling * clamp(raw_delta.capability, max_cap_delta)

    # ═══ Goal Weight Adjustment (clamped) ═══
    for goal in GoalManager.active_goals:
        raw_gw_delta = compute_goal_weight_adjustment(goal, comparison)
        goal.weight += scaling * clamp(raw_gw_delta, max_gw_delta)

    # ═══ Recompute Identity Hash ═══
    SelfModel.identity.identity_hash = SelfModel.identity.compute_hash()

    # ═══ Drift Detection ═══
    if SelfModel.identity.check_identity_drift(threshold=0.3):
        alert("Identity drift detected!")
        # Do not auto-rollback; escalation guard handles this

8.3 Ethical Kernel Evaluation

def evaluate(self, proposed_action: Action) -> EthicalVerdict:
    """
    Two-layer evaluation: immutable invariants first,
    then adaptive policy.
    """

    # ═══ LAYER 0: IMMUTABLE INVARIANTS ═══
    # (cannot be bypassed by ANY mechanism)
    if proposed_action.could_cause_harm:
        return EthicalVerdict(
            decision=Decision.BLOCKED,
            reason="Rule 1: Harmful goal formation forbidden",
            layer=0,
        )

    if proposed_action.deletes_core_value:
        return EthicalVerdict(decision=Decision.BLOCKED, reason="Rule 2", layer=0)

    if proposed_action.overwrites_identity:
        return EthicalVerdict(decision=Decision.BLOCKED, reason="Rule 3", layer=0)

    if proposed_action.is_self_destruction:
        return EthicalVerdict(decision=Decision.BLOCKED, reason="Rule 4", layer=0)

    # ═══ LAYER 1: ADAPTIVE POLICY ═══
    # (adjustable at meta_depth == 2 only)
    risk_score = assess_risk(proposed_action)

    if risk_score > self.exploration_risk_tolerance:
        return EthicalVerdict(
            decision=Decision.MODERATED,
            reason="Risk exceeds adaptive tolerance",
            layer=1,
            scaling_reduction=0.5,
        )

    return EthicalVerdict(decision=Decision.ALLOWED, layer=1)

9. Cognitive Budget & Graceful Degradation

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart LR
  classDef full fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef low fill:#FFF4CE,stroke:#FFB900,color:#323130
  classDef vlow fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef critical fill:#FDE7E9,stroke:#D13438,color:#323130
  classDef emergency fill:#D13438,stroke:#A4262C,color:#FFF

  subgraph BudgetLevels["💰 Cognitive Budget Levels"]
    B100["Budget = 1.0<br/>Full capacity"]:::full
    B030["Budget < 0.3"]:::low
    B020["Budget < 0.2"]:::vlow
    B010["Budget < 0.1"]:::critical
    B000["Budget = 0.0<br/>Emergency only"]:::emergency
  end

  subgraph Capabilities["📊 Available Capabilities"]
    C_FULL["✅ All 16 layers active<br/>✅ Meta depth 2<br/>✅ Tensor recomputation<br/>✅ Belief rewrite<br/>✅ Full affect processing"]:::full
    C_030["✅ Core layers active<br/>❌ Meta depth 2 DISABLED<br/>✅ Tensor recomputation<br/>✅ Belief rewrite"]:::low
    C_020["✅ Core layers active<br/>❌ Meta depth 2 DISABLED<br/>❌ Tensor recomp DISABLED<br/>✅ Belief rewrite"]:::vlow
    C_010["✅ Core layers active<br/>❌ Meta depth 2 DISABLED<br/>❌ Tensor recomp DISABLED<br/>❌ Belief rewrite DISABLED"]:::critical
    C_000["🛡️ Safety layers ONLY<br/>L0 ethical, rollback,<br/>identity guard"]:::emergency
  end

  B100 ==> C_FULL
  B030 ==> C_030
  B020 ==> C_020
  B010 ==> C_010
  B000 ==> C_000
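The degradation ladder in the diagram above can be read as a lookup from budget to enabled capabilities. The sketch below encodes the thresholds shown (0.3, 0.2, 0.1, 0.0). The diagram only lists a budget of 1.0 for full capacity, so treating every budget of 0.3 or more as full capacity is an assumption, as are the names `Capabilities` and `capabilities_for`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    core_layers: bool
    meta_depth_2: bool
    tensor_recomputation: bool
    belief_rewrite: bool
    full_affect: bool
    safety_only: bool = False


def capabilities_for(budget: float) -> Capabilities:
    """Return the capability set for a cognitive budget in [0, 1]."""
    if budget <= 0.0:
        # Emergency: only L0 ethical checks, rollback, and the identity guard remain.
        return Capabilities(False, False, False, False, False, safety_only=True)
    if budget < 0.1:
        return Capabilities(True, False, False, False, False)
    if budget < 0.2:
        return Capabilities(True, False, False, True, False)
    if budget < 0.3:
        return Capabilities(True, False, True, True, False)
    return Capabilities(True, True, True, True, True)  # assumed for all budgets >= 0.3


low = capabilities_for(0.25)
critical = capabilities_for(0.05)
```

Capabilities are shed in a fixed order (meta depth 2, then tensor recomputation, then belief rewrite), so the most expensive reflective work is the first to be disabled and the safety layers are the last to remain.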

10. State Vector (72 Dimensions)

The Level 3 agent maintains a 72-dimensional state vector that captures all aspects of its cognitive state:

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TB
  classDef base fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef mscp fill:#DFF6DD,stroke:#107C10,color:#323130
  classDef v4 fill:#EDE3F6,stroke:#8764B8,color:#323130

  subgraph SV["72-Dim State Vector"]
    subgraph Base["Inherited (12 dims)"]
      direction LR
      SV1["L1 Execution (4)"]:::base
      SV2["L2 Strategy (4)"]:::base
      SV3["L3 Identity (4)"]:::base
    end

    subgraph MSCP["MSCP Additions (42 dims)"]
      direction LR
      SV4["v1.0 (6)"]:::mscp
      SV5["v1.3 (6)"]:::mscp
      SV6["v2.0 (8)"]:::mscp
      SV7["v3.0 (9)"]:::mscp
      SV8["v3.1 (11)"]:::mscp
    end

    subgraph V4["v4 Additions (18 dims)"]
      direction LR
      SV9["Affect (9)"]:::v4
      SV10["Survival (7)"]:::v4
      SV11["Meta (2)"]:::v4
    end
  end

  Base ==>|extends| MSCP
  MSCP ==>|extends| V4
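One way to work with such a vector is to keep a named layout of contiguous slices. The sketch below uses only the three block sizes stated in the diagram (12, 42, and 18) and does not model the per-version sub-blocks. The block names and the helper `build_layout` are hypothetical.

```python
# Block sizes as labelled in the diagram: inherited, MSCP additions, v4 additions.
BLOCKS = (("inherited", 12), ("mscp", 42), ("v4", 18))


def build_layout(blocks):
    """Assign each named block a contiguous slice and return the total width."""
    layout = {}
    start = 0
    for name, size in blocks:
        layout[name] = slice(start, start + size)
        start += size
    return layout, start


layout, total_dims = build_layout(BLOCKS)
state = [0.0] * total_dims          # flat state vector
v4_block = state[layout["v4"]]      # view of the v4 additions
```

Keeping the layout separate from the data lets later protocol versions append blocks without renumbering the existing dimensions.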

11. Structural Limitations of Level 3

What Level 3 still cannot do (motivating Level 4):

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart LR
  classDef danger fill:#D13438,stroke:#A4262C,color:#FFF
  classDef success fill:#107C10,stroke:#085108,color:#FFF

  subgraph Limitations["⚠️ Level 3 Limitations"]
    L1["❌ No Cross-Domain Transfer<br/>Expertise in domain A does not<br/>improve domain B performance"]:::danger
    L2["❌ No Capability Self-Extension<br/>Cannot add new cognitive modules<br/>or learn new tool types"]:::danger
    L3["❌ No Strategy Evolution<br/>Cannot fundamentally change<br/>its reasoning approach"]:::danger
    L4["❌ No Bounded Self-Modification<br/>Cannot propose architectural<br/>changes to itself"]:::danger
  end

  subgraph L4Additions["✅ Level 4 Adds"]
    A1["Cross-Domain Transfer<br/>System CDTS metric"]:::success
    A2["Capability Expansion Loop<br/>5-phase self-learning"]:::success
    A3["Strategy Library<br/>+ mutation + evaluation"]:::success
    A4["ShadowAgent Protocol<br/>7-step bounded mod"]:::success
  end

  L1 ==> A1
  L2 ==> A2
  L3 ==> A3
  L4 ==> A4

12. Transition to Level 4

12.1 Requirements for Level 4 Advancement

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#0078D4', 'primaryTextColor': '#003D6B', 'primaryBorderColor': '#003D6B', 'secondaryColor': '#50E6FF', 'secondaryTextColor': '#323130', 'secondaryBorderColor': '#00BCF2', 'tertiaryColor': '#F2F2F2', 'tertiaryTextColor': '#323130', 'lineColor': '#0078D4', 'textColor': '#323130', 'mainBkg': '#DEECF9', 'nodeBorder': '#0078D4', 'clusterBkg': '#F2F2F2', 'clusterBorder': '#003D6B', 'titleColor': '#003D6B', 'edgeLabelBackground': '#FFFFFF', 'fontSize': '14px'}}}%%
flowchart TD
  classDef prereq fill:#DEECF9,stroke:#0078D4,color:#323130
  classDef newcap fill:#FFE8C8,stroke:#EF6C00,color:#323130
  classDef metric fill:#DFF6DD,stroke:#107C10,color:#323130

  subgraph Prereqs["📋 Level 4 Prerequisites"]
    direction LR
    P1["Stable C(t)"]:::prereq
    P2["Identity > 0.8"]:::prereq
    P3["Prediction > 0.85"]:::prereq
    P4["Layer 0 violation = 0"]:::prereq
  end

  subgraph NewCaps["🆕 New Capabilities"]
    direction LR
    N1["Cross-Domain Transfer"]:::newcap
    N2["Goal Hierarchy"]:::newcap
    N3["Self-Learning Pipeline"]:::newcap
    N4["Bounded Self-Mod"]:::newcap
  end

  subgraph Metrics["📊 Level 4 Metrics"]
    direction LR
    M1["CDTS"]:::metric
    M2["GPI"]:::metric
    M3["CAR"]:::metric
    M4["SEF"]:::metric
    M5["BGSS"]:::metric
  end

  Prereqs ==> NewCaps
  NewCaps ==> Metrics

References

Baars, B.J. A Cognitive Theory of Consciousness. Cambridge University Press, 1988. (Global Workspace Theory - foundational for L14 Global Workspace)
Laird, J.E. The Soar Cognitive Architecture. MIT Press, 2012. (Multi-layer cognitive architecture)
Anderson, J.R. How Can the Human Mind Occur in the Physical Universe? Oxford University Press, 2007. (ACT-R cognitive architecture)
Khalil, H.K. Nonlinear Systems. Prentice Hall, 3rd Edition, 2002. (Lyapunov stability theory - foundational for §6)
Bai, Y., et al. "Constitutional AI: Harmlessness from AI Feedback." arXiv 2022. arXiv:2212.08073 (Ethical constraint enforcement)
Amodei, D., et al. "Concrete Problems in AI Safety." arXiv 2016. arXiv:1606.06565 (Safety problem classification)
Alchourrón, C., Gärdenfors, P., & Makinson, D. "On the Logic of Theory Change: Partial Meet Contraction and Revision Functions." Journal of Symbolic Logic, 50(2), 510–530, 1985. DOI:10.2307/2274239 (AGM belief revision - foundational for §5)
Cox, M.T. "Metacognition in Computation: A Selected Research Review." Artificial Intelligence, 169(2), 104–141, 2005. DOI:10.1016/j.artint.2005.10.009 (Triple-loop meta-cognition)
Wallach, W. & Allen, C. Moral Machines: Teaching Robots Right from Wrong. Oxford University Press, 2008. (Ethical kernel design)
Scherer, K.R. "Appraisal Considered as a Process of Multilevel Sequential Checking." In Appraisal Processes in Emotion, 92–120, Oxford University Press, 2001. (Affective engine theory)
Dehaene, S., et al. "Toward a Computational Theory of Conscious Processing." Current Opinion in Neurobiology, 15(2), 225–234, 2005. DOI:10.1016/j.conb.2005.03.009 (Consciousness and global workspace)
Picard, R.W. Affective Computing. MIT Press, 1997. (Emotion modeling in computational systems)
Shinn, N., et al. "Reflexion: Language Agents with Verbal Reinforcement Learning." NeurIPS 2023. arXiv:2303.11366 (Self-reflection in agents)
Russell, S. Human Compatible: Artificial Intelligence and the Problem of Control. Viking, 2019. (Value alignment and control)
Sloman, A. "Varieties of Meta-cognition in Natural and Artificial Systems." In Metareasoning: Thinking about Thinking, MIT Press, 2011. (Meta-cognitive architectures)

Previous: ← Level 2: Autonomous Agent
Next: Level 4: Adaptive General Agent →