Home / concepts

Session-Native Architecture

[CONVICTION]

The agentic internet cannot become useful until agent sessions are a first-class network primitive. The current AI stack repeats the web's original mistake: everyone builds at the request layer (serve tokens) or the application layer (agent frameworks). Nobody builds the session layer -- the Layer 5 equivalent for agent communication. Just as the web could not become useful until cookies faked sessions in 1994, the agentic internet is stuck at stateless request-response until sessions become infrastructure.

The Internet Parallel

The parallel is almost exact, and it tells you what comes next.

Phase 1: Stateless requests (1991-1994). HTTP was stateless document retrieval. Every request was independent. The server forgot you existed the moment it sent the response. This is exactly where AI infrastructure is in 2026. OpenRouter, Hyperspace, every inference API serves stateless token requests. No memory. No continuity.

Phase 2: The session hack (1994-2000). Lou Montulli at Netscape invented cookies to fake statefulness. This one hack unlocked shopping carts, logins, personalization -- everything that made the web useful beyond reading documents. Without sessions, there is no Amazon, no Gmail, no Facebook.

Phase 3: Rich stateful experiences (2000-present). WebSockets for real-time bidirectional communication. Live chat, collaborative editing, streaming.

The OSI model actually predicted this: Layer 5 (Session Layer) was designed for establishing, managing, and terminating sessions between applications, with synchronization checkpoints for resumption after failure. But TCP/IP never implemented a proper session layer. Session functionality was pushed to applications. Every web app reinvents its own session management.

The Telecom Parallel

[EVIDENCE]

Telecom is the only industry that built session management as infrastructure, and the evolution maps almost perfectly onto the agentic internet.

Before SS7 (pre-1975): In-band signaling. Control signals traveled on the same circuit as voice. Signaling and voice were tied together. This is where AI voice agents are in 2026 -- session logic and agent logic tangled in one process (LiveKit's AgentSession, Vapi's call handling).

SS7 (1975-2000): Out-of-band signaling. A completely separate signaling network ran parallel to voice. Call setup, routing, and management happened independently of whether voice circuits were busy. The connection was not established until all nodes on the path confirmed availability.

SIP (1999-present): Session Initiation Protocol for IP. SIP sets up and terminates calls but does not carry media. The name literally has "Session" in it. A SIP INVITE carries everything needed to set up the session -- codecs, media types, participants.

Session Border Controllers. SBCs sit at network borders managing signaling and media flows, quality of service, security, and maintaining state about active dialogs -- not just transactions. SBC functionality enables seamless mobility and handover between networks.

The IDN is the SS7 network for the agentic internet. The session manifest is the SIP INVITE equivalent. Nodes are cell towers. Mycel is the billing and verification system.

Why Agent Sessions Are Harder Than Phone Calls

[REFRAME]

WhatsApp at a billion calls a day has not solved the distributed session problem. They work around it by keeping session state on clients and treating relay servers as dumb pipes. This works because the compute is on the endpoints (the phones), not the network. The two humans do all the processing.

Agent sessions invert this. The compute IS the network. The agent runs on a node, not on the user's device. When a node fails, you lose the agent's brain, not just a pipe. The session state -- KV cache, conversation history, tool state, reasoning chain -- lives on the server side. There is no second peer holding a copy.

Voice agents in 2026 are not running real sessions. They are mimicking them. The typical voice agent chains separate components (STT, LLM, TTS) with each LLM call being a stateless request-response with conversation history appended. If the server handling the response crashes, the session dies. Even LiveKit's innovation (the AI agent as a full WebRTC room participant) is bounded to a single server. No migration, no edge placement, no distributed session management.

The Session Manifest

[CONVICTION]

The session manifest is the contract between an agent and the infrastructure. It declares everything the network needs to manage sessions:

Session characteristics -- typical duration, interactivity mode (real-time sub-200ms, interactive with pauses, fire-and-forget with checkpoints), state footprint (KV cache size, memory requirements), resumability.

Placement constraints -- data sovereignty, minimum hardware, model requirements, latency ceiling.

Migration policy -- whether sessions can migrate between nodes, cost of migration (small for text, huge for voice), acceptable state loss on failover.

Scaling profile -- demand spike patterns, warm-up time for new instances, eviction policy for cold sessions.

Verification requirements -- what Mycel needs to verify for this agent's sessions.

Three dimensions govern session management decisions that the manifest captures:

Continuity model -- does the session need continuous real-time connection (voice), interactive pauses (tutoring), or checkpoint-based progress tracking (research)?

Migration cost -- a text chat with 4K context is cheap to migrate. A voice agent mid-sentence with audio buffers is nearly impossible. A research agent with 50GB of intermediate results is expensive but not urgent.

Value curve -- does the session become more valuable as it runs? A tutoring session where the agent has modeled the student's understanding is worth more than a fresh session. A research agent 5 hours into a 6-hour task has accumulated enormous value.

Everything Is a Session

The unified abstraction: everything that runs on the IDN is a session. A session is a bounded unit of agent work with state, a lifecycle, constraints, and verification requirements.

Synchronous sessions (voice calls, video analysis, robotics) -- real-time streams, sub-second latency budgets, failure is immediately visible. Need session affinity, fast failover, minimal migration cost.

Interactive sessions (text tutoring, chat agents) -- human-in-the-loop, second-scale tolerance. Need state persistence, resumability, checkpoint-based failover.

Autonomous sessions (research agents, batch analysis) -- no real-time human interaction, minute-scale tolerance. Need checkpoint/restart, progress tracking, result verification.

The billing unit shifts accordingly. Not per token, not per request. Per session-minute of verified agent work. The verification is session-level: did the student learn biology over the course of this 45-minute tutoring session? That is a session-level proof, not a request-level proof.

What Nobody Else Has Built

[EVIDENCE]

Everyone can build request routing (Hyperspace proved it with 350K nodes and zero marketing spend). Everyone will adopt A2A for agent communication. Agent frameworks keep getting better at building agents. But nobody is building the session layer -- the distributed infrastructure that manages where agent sessions live, how they migrate, how they scale, and how their outcomes get verified.

A2A tells agents how to discover and talk to each other. MCP tells agents how to use tools. The session manifest tells the network how to host and manage agent experiences. These are complementary, not competing. An agent deployed on OpenGrid could be A2A-discoverable, MCP-connected, and session-managed by the IDN.

Related

Tags: opengridsessioninfrastructuredistributed-systemstelecomagent-infrastructure