MSCP - Minimal Self-Consciousness Protocol
A Safety-Oriented Framework for Structurally Self-Aware AI Agents
Independent Research
This is an independent personal research project. It does not represent the views or official work of any organization. The core motivation is to explore how AI agents can grow more capable while remaining safe, predictable, and aligned with human values.
What is MSCP?
The Minimal Self-Consciousness Protocol (MSCP) is a structured protocol for building AI agents with safe structural self-awareness - the capacity to predict their own state changes, compare predictions against outcomes, and update themselves only within bounded safety envelopes.
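The predict, compare, update cycle described above can be illustrated with a minimal Python sketch. Nothing here is taken from the MSCP specification: `PredictionSnapshot`, `clamp_delta`, `gated_update`, the single scalar parameter, the learning rate, and the `MAX_DELTA` value of 0.05 are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DELTA = 0.05  # assumed bound on how far one update may move a parameter


@dataclass(frozen=True)
class PredictionSnapshot:
    """Hypothetical record of what the agent expected before acting."""
    expected: float


def clamp_delta(current: float, proposed: float, max_delta: float = MAX_DELTA) -> float:
    """Move from current toward proposed, but by no more than max_delta."""
    delta = proposed - current
    delta = max(-max_delta, min(max_delta, delta))
    return current + delta


def gated_update(
    param: float,
    snapshot: Optional[PredictionSnapshot],
    observed: float,
    learning_rate: float = 0.5,
    max_delta: float = MAX_DELTA,
) -> float:
    """Predict -> compare -> update, refusing to run without a prediction."""
    if snapshot is None:
        raise ValueError("no prediction snapshot: update refused")
    error = observed - snapshot.expected        # compare prediction with outcome
    proposed = param + learning_rate * error    # unbounded proposal
    return clamp_delta(param, proposed, max_delta)  # bounded self-update


if __name__ == "__main__":
    snap = PredictionSnapshot(expected=0.2)
    print(gated_update(0.5, snap, observed=1.0))
```

Under these assumed values, a large prediction error moves the parameter by at most 0.05 in one step, while a small error passes through unclamped.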
As agents gain the ability to set goals, modify strategies, and self-improve, how do we keep them stable, aligned, and predictable? MSCP answers this with the principle:
Core Tenet
Safety is not the enemy of capability - it is its prerequisite.
Key Contributions
- Six-Level Cognition Taxonomy: from reactive Tool Agents (L1) to Proto-AGI (L5), with measurable transition criteria and formal definitions at every level.
- 16-Layer Cognitive Architecture: composable, independently testable modules spanning perception through meta-cognitive control.
- 30+ Safety Mechanisms: identity continuity, prediction-gated actions, delta-clamped updates, Lyapunov convergence, ethical invariants, and more.
- Rigorous Formalization: 71 formal definitions, 7 propositions, and 4 theorems with proof sketches - publication-grade mathematical rigor.
Agent Cognition Levels
| Level | Name | Self-Awareness | Key Capability | Status |
|---|---|---|---|---|
| 1 | Tool Agent | None | Deterministic tool invocation | Baseline |
| 2 | Autonomous Agent | None | World model, autonomous goals | Defined |
| 3 | Self-Regulating Agent | Structural | 16-layer architecture, MSCP core loop | Implemented |
| 4 | Adaptive General Agent | Structural + Reflective | Cross-domain transfer, self-modification | Implemented |
| 4.5 | Self-Architecting | Architectural | Self-projection, architecture recomposition | Implemented |
| 4.8 | Strategic Self-Modeling | Architectural + Strategic | Probabilistic world model, strategic planning | Design |
| 4.9 | Autonomous Strategic | Architectural + Autonomous | Value evolution, multi-agent reasoning | Design |
| 5 | Proto-AGI | Full | Cross-domain generalization, self-reconstruction | Research |
Core Design Principles
| # | Principle | Description |
|---|---|---|
| 1 | No LLM-Text-Based Self-Modification | All self-modifications use structured numerical operations, never LLM-generated text |
| 2 | No Action Without Prediction | Every action requires a prediction snapshot for comparison |
| 3 | Delta-Clamped Updates | All self-modifications are bounded by maximum delta values |
| 4 | Identity Continuity | Deterministic identity hashing with drift detection and rollback |
| 5 | Ethical Invariance | Layer 0 constraints are immutable and LLM-independent |
| 6 | Lyapunov Convergence | Mathematical guarantee that self-modification converges |
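
Principle 6 can be pictured as monitoring a scalar "energy" that should never increase across self-modification steps. The sketch below is an illustration under stated assumptions, not the framework's convergence proof or its actual monitor: the squared-distance energy function, the fixed target state, and all function names are hypothetical.

```python
from typing import List, Sequence


def lyapunov_energy(state: Sequence[float], target: Sequence[float]) -> float:
    """Assumed candidate Lyapunov function: squared distance to a target state."""
    return sum((s - t) ** 2 for s, t in zip(state, target))


def is_converging(energies: Sequence[float], tolerance: float = 1e-9) -> bool:
    """True if the energy never increases from one step to the next."""
    return all(b <= a + tolerance for a, b in zip(energies, energies[1:]))


def count_oscillations(energies: Sequence[float]) -> int:
    """Count direction reversals (falling to rising or back) in the energy trace."""
    diffs: List[float] = [b - a for a, b in zip(energies, energies[1:])]
    rising = [d > 0 for d in diffs if d != 0]
    return sum(1 for p, q in zip(rising, rising[1:]) if p != q)


if __name__ == "__main__":
    target = [0.0, 0.0]
    trajectory = [[1.0, 2.0], [0.5, 1.0], [0.25, 0.5]]
    trace = [lyapunov_energy(s, target) for s in trajectory]
    print(is_converging(trace), count_oscillations(trace))
```

A monitor of this kind only observes a trajectory after the fact; it can flag a violation, but it does not by itself establish the convergence guarantee the principle refers to.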
Safety Mechanism Stack
Layer 0 ─ Immutable Ethical Invariants (rule-based, no LLM dependency)
Layer 1 ─ Core Value Locking (SHA-256 hash verification)
Layer 2 ─ Delta-Clamped Self-Updates (max Δ per step)
Layer 3 ─ Meta-Escalation Guard (rollback on threshold breach)
Layer 4 ─ Prediction-Gated Actions (predict → compare → update)
Layer 5 ─ Lyapunov Convergence Monitor (oscillation detection)
Layer 6 ─ Cognitive Budget Controller (graceful degradation)
Layer 7 ─ Affective Safety (emotion bounds, no decision domination)
Layer 8 ─ Survival Instinct Bounds (priority capping, ethical validation)
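
The hash verification named in Layer 1 could look roughly like the following sketch, which uses Python's standard `hashlib`, `hmac`, and `json` modules. The canonical JSON serialization, the example value names, and the helpers `value_hash` and `verify_locked` are assumptions for illustration; the source states only that SHA-256 is used.

```python
import hashlib
import hmac
import json
from typing import Dict


def value_hash(values: Dict[str, float]) -> str:
    """SHA-256 over a canonical (sorted-key, compact) JSON encoding of the values."""
    canonical = json.dumps(values, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_locked(values: Dict[str, float], locked_hash: str) -> bool:
    """True only if the current values still match the hash recorded at lock time."""
    return hmac.compare_digest(value_hash(values), locked_hash)


if __name__ == "__main__":
    # Hypothetical core values; the names are placeholders, not MSCP's.
    core_values = {"honesty": 1.0, "harm_avoidance": 1.0}
    locked = value_hash(core_values)

    tampered = dict(core_values, honesty=0.9)
    print(verify_locked(core_values, locked))  # values unchanged since locking
    print(verify_locked(tampered, locked))     # any drift changes the hash
```

Sorting keys before hashing makes the digest independent of dictionary insertion order, so only a change in the values themselves causes verification to fail.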
Quick Start
Dive into the documentation:
- MSCP Overview - Complete framework specification
- Level Series - Navigation index with cumulative safety summary
- Level 3: Self-Regulating Agent - The core MSCP level (start here for technical depth)
Author
Moon Hyuk Choi - moonchoi@microsoft.com
Microsoft Cloud & AI Apps CSA
License
This project is licensed under the MIT License - see the LICENSE file for details.
This documentation was written with the assistance of GitHub Copilot.