AI systems degrade across extended workflows—
context resets, outputs drift, and coherence breaks across sessions.
This isn’t a model capability issue.
It’s a failure to manage what happens between interactions.
Memory stores the past. Orchestration executes tasks. Neither ensures the work stays on track.
A real-world research program testing AI performance across weeks and months—not isolated sessions.
When continuity isn’t managed, systems degrade over time—regardless of capability.
A continuity layer and pilot framework to measure, stabilize, and improve long-horizon AI performance in real use.

Stateless AI treats each interaction as isolated—forcing users to rebuild context, absorb reset costs, and manage output drift.
This prevents meaningful progression over time.
Continuity turns interaction into a cumulative system—where context persists, alignment stabilizes, and capability improves through sustained use.
This shift—from stateless interaction to continuous systems—is what enables long-horizon AI performance.
Memory stores past information. Context windows extend what the model can see within a single interaction.
Continuity is different—it maintains a working state across interactions, allowing context to accumulate and alignment to improve over time.
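To make the distinction concrete, here is a minimal Python sketch with illustrative names only (WorkingState, build_context, and the memory dict are assumptions for this example, not an implementation described in this work): memory stores what happened, while the working state carries where the work currently stands into the next session.

```python
from dataclasses import dataclass, field

@dataclass
class WorkingState:
    """Continuity: the active state carried from one session into the next."""
    goal: str                                   # preserved project intent
    open_decisions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

def build_context(memory: dict, state: WorkingState, user_input: str) -> str:
    """Memory recalls the past; the working state frames it around the live goal."""
    return (
        f"Goal: {state.goal}\n"
        f"Open decisions: {'; '.join(state.open_decisions)}\n"
        f"Relevant history: {memory.get('summary', '')}\n"
        f"User: {user_input}"
    )
```

In this framing, losing memory loses information, but losing the working state loses direction: the next session still knows the past yet no longer knows what the work is trying to become.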

This diagram identifies the unowned layer between capability and long-horizon outcomes.
Why this isn’t solved by existing approaches
Current systems attempt continuity through memory, context windows, and orchestration—but none manage whether work stays aligned over time.
The gap
These systems operate on stored information or task execution.
They do not govern whether the work stays aligned over time.
What’s missing
A system that tracks active state, preserves intent across sessions, and maintains alignment as work evolves.

Continuity turns human–AI interaction into a co-evolving system.
Instead of producing isolated outputs, the system accumulates context, enabling feedback loops between user and system.
These feedback loops stabilize the interaction, allowing new behavioral and creative states to emerge and evolve across sessions.
This is where capability shifts from generation to sustained collaboration.

Most AI systems treat the model as the product.
In practice, what matters is the layer that sits between the user and the model—managing memory, state, and context over time.
That layer determines what the model actually sees.
This diagram shows how a continuity layer transforms raw user input into structured, persistent context.
By tracking active state, storing durable memory, and extracting what matters from each interaction, the system reduces reset costs and maintains coherence across long-horizon work.
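A minimal sketch of that loop, assuming a deliberately naive keyword heuristic in place of real extraction (ContinuityLayer and its methods are illustrative names, not a shipped component):

```python
class ContinuityLayer:
    """Sits between the user and the model; determines what the model actually sees."""

    def __init__(self, goal: str, phase: str):
        self.active_state = {"goal": goal, "phase": phase}  # live working state
        self.durable_memory: list[str] = []                 # long-term store

    def extract(self, user_input: str, model_output: str) -> None:
        """After each interaction, keep only what matters long-term."""
        for line in (user_input + "\n" + model_output).splitlines():
            # Naive stand-in for real extraction: persist explicit decisions and goals.
            if line.lower().startswith(("decision:", "goal:")):
                self.durable_memory.append(line.strip())

    def compose(self, raw_input: str) -> str:
        """Turn raw input into structured, persistent context for the model."""
        header = f"Phase: {self.active_state['phase']} | Goal: {self.active_state['goal']}"
        notes = "\n".join(self.durable_memory[-5:])         # most recent durable notes
        return f"{header}\n{notes}\n---\n{raw_input}"
```

In practice the extraction step would be model-assisted rather than keyword-based; the sketch only shows the shape of the loop: extract after each interaction, compose before the next.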

This timeline shows how continuity—not generation—supported completion.
AI-assisted interaction helped resolve a specific constraint, preserved creative intent, and maintained momentum across phases, allowing the project to move from dormancy to release without losing its voice.
The shift was not in what the system produced, but in its ability to maintain direction across time.
A structured 90-day pilot designed to evaluate how continuity affects AI performance over time.
This study compares stateless and continuity-enabled interaction across real creative workflows, measuring reset cost, alignment stability, and completion outcomes.
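As one illustration of how reset cost could be operationalized (an assumption for exposition, not the pilot's finalized protocol): count the tokens each session spends re-establishing prior context before substantive work resumes.

```python
def reset_cost(session_turns: list[dict]) -> int:
    """Tokens spent rebuilding context at the start of a session,
    one possible operationalization of reset cost."""
    cost = 0
    for turn in session_turns:
        if turn["kind"] == "context_rebuild":   # user restating goals, history, constraints
            cost += turn["tokens"]
        else:                                   # first substantive turn ends the rebuild phase
            break
    return cost

# Hypothetical sessions: a stateless one versus a continuity-enabled one.
stateless = [
    {"kind": "context_rebuild", "tokens": 420},
    {"kind": "context_rebuild", "tokens": 310},
    {"kind": "work", "tokens": 180},
]
continuity = [{"kind": "work", "tokens": 190}]

print(reset_cost(stateless))   # 730 tokens of rebuild overhead
print(reset_cost(continuity))  # 0
```

Alignment stability and completion outcomes would need their own measures; the point is that reset cost, at least, is directly countable in existing workflows.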
This pilot is designed for embedded collaboration.
→ Work with me on this below

This comparison contrasts stateless and continuity-enabled systems across key metrics, showing how continuity reduces reset cost, stabilizes alignment, and improves decision reliability over time.
Continuity Signal (Quick Read)
This song wasn’t created by AI—it was supported by it.
Across writing, production, and release, AI provided ongoing cognitive support that helped maintain direction, decision-making, and momentum.
That support depended on continuity—returning to the same thread of context over time rather than starting from scratch each session.
This case suggests that under sustained interaction, AI may shift from a generative role to a stabilizing one. In episodic use, systems are typically evaluated based on their ability to produce novel content in response to discrete prompts. In the completion of Already Know, however, the system’s primary contribution was not the introduction of new material but the preservation of forward motion across time. Its function evolved from phrasing exploration during the writing phase to continuity support during execution and production. This indicates a distinct interaction mode in which alignment is expressed not through output quality alone but through the system’s capacity to maintain coherence with an existing human intention across interruptions, delays, and changing constraints. Such stabilizing behavior is unlikely to appear in short-horizon evaluations and suggests that long-term collaboration may reveal forms of alignment support that are currently underexamined in prompt-based assessments.
Keywords: Human-AI Collaboration (HAC), Long-Horizon Alignment, Creative Stewardship, Continuity Support Systems, Qualitative Evaluation.
Technical Case Study: Creative Survivability & AI Continuity (pdf)
A working vocabulary for long-horizon AI systems, defining the core behaviors, processes, and outcomes that enable sustained performance over time.
This work is grounded in real, sustained workflows—not short-session testing.
When AI is used continuously over time, performance begins to degrade: context must be reconstructed, outputs drift, and coherence breaks across sessions.
These effects are measurable using existing workflows, making continuity a practical research and product problem—not just a theoretical one.
→ Work with me on pilots, research, or embedded roles
AI systems do not fail primarily at the task level—they fail across time.
In real-world use, performance degrades over time: context must be reconstructed after each reset, outputs drift from established intent, and coherence breaks across sessions.
This creates measurable costs in productivity, reliability, and output quality.
This is not a UX limitation.
It is a long-horizon infrastructure problem.
My work documents this failure mode through active deployment in creative systems (music production, teaching), producing real-world artifacts that show how AI behaves beyond short-session benchmarks.
If embedded in a lab, fellowship, or pilot, I translate continuity from an observed phenomenon into something measurable and usable.
AI is typically evaluated in short sessions. That misses what happens across time.
In real use, context resets, outputs drift, and alignment degrades across sessions.
I make these patterns visible so teams can understand real-world performance—not just demos.
I sit between research, product, and real-world use.
I don’t simulate workflows—I operate inside them, then translate that into clear, usable insight.
Most useful for teams working on long-term interaction, memory, and real user workflows.
I’m already operating in the conditions most teams are trying to study.
This research is grounded in sustained, real-world use—not controlled environments.
Creative production and teaching require continuity over time, making degradation immediately visible and costly.
Because the work is ongoing and artifact-based, it captures system behavior that short-session benchmarks miss.
Independent / contract / embedded research roles
I’m open to collaborations, pilot programs, and embedded roles exploring real-world AI use over time.
If you’re working on long-term interaction, memory, or real user workflows, I’d love to connect.
AI systems break down over time because they lack continuity.
In real-world use, this forces users to rebuild context, absorb reset costs, and manage output drift.
This work argues that continuity should be treated as infrastructure for long-horizon human–AI collaboration.
AI systems are increasingly used in workflows that unfold over weeks and months, yet most are designed for short, reset-based interactions. This creates an invisible burden: users must continually reconstruct meaning, decisions, and process history.
I refer to the missing layer that supports sustained collaboration as continuity infrastructure.
This research is grounded in a real-world case study: my own long-horizon use of AI across music production, teaching, and research workflows. Rather than isolated prompts, this work reflects ongoing collaboration over time.
Three patterns consistently emerge: context must be reconstructed after resets, outputs drift from established intent, and coherence breaks across sessions.
Taken together, these observations suggest that continuity is not a feature of chat logs—it is a structural requirement.
If AI systems are expected to support real creative or research work, continuity must be designed as a load-bearing system property, not left to the user.
This work documents long-horizon human–AI collaboration from a practitioner’s perspective, offering both:
Status: Working paper (v1.0, January 2026)
Extended use of conversational AI systems is often framed as a psychological anomaly or low-stakes novelty rather than legitimate professional collaboration (Nass & Moon, 2000; Waytz et al., 2014). Yet creators and educators increasingly rely on these systems for projects unfolding across months—songwriting and release cycles, curriculum design, student planning, business coordination, and sustained inquiry. This paper argues that the central challenge facing such use is infrastructural, not cultural. Drawing on traditions in human–computer interaction and infrastructure studies that emphasize breakdown, repair, and cumulative coordination work (Suchman, 1987; Star & Ruhleder, 1996; Orlikowski, 2000), it reframes continuity as a load-bearing system property—distinct from memory—that determines whether conversational systems can function as stable collaborators across time, updates, and shifting constraints.
We define continuity as the combination of stable project framing, stable interaction contracts, and legible transitions when system behavior changes. Building on qualitative synthesis and reflexive longitudinal observation, we introduce two analytic constructs—reset costs and interpretive debt—to describe the hidden labor users perform when systems lose context or shift behavior across versions and policy regimes, extending prior work on sociotechnical maintenance and technical debt (Jackson, 2014; Cunningham, 1992). We further conceptualize trust as an operational variable shaped by predictability and constraint stability rather than sentiment (Lee & See, 2004; Parasuraman & Riley, 1997), and analyze how dominant evaluation and governance practices—optimized for short-horizon prompts and interchangeable sessions—systematically suppress longitudinal signal (Mitchell et al., 2019; Raji et al., 2020; NIST, 2023).
The paper concludes with design and governance requirements for continuity-aware systems, including versioned collaboration regimes, discontinuity signaling, consented persistence with revocability, and longitudinal evaluation protocols. Taken together, the analysis positions continuity not as an indulgence for “heavy users,” but as a prerequisite for sustainable, accountable long-horizon human–AI collaboration.
Keywords: Human-Computer Interaction (HCI), Sociotechnical Infrastructure, Interpretive Debt, Reset Costs, Long-Horizon Collaboration, Systems Maintenance, Technical Debt.
Continuity as Infrastructure: Load-Bearing Design in Long-Horizon Human–AI Collaboration (pdf)
Most AI systems are optimized for short-term outputs, not long-term use.
This creates instability in real workflows, where users must manage drift, inconsistency, and repeated resets over time.
This paper argues for a shift from optimization to stewardship—designing for sustained, reliable interaction rather than isolated results.
Keywords: Long-Horizon AI Alignment, System Stewardship, Continuity Design, AI Sustainability, Persistence Architectures, Temporal Governance, Human-AI Co-evolution.
From Optimization to Stewardship: Continuity and the Future of AI (pdf)
Using AI requires translation.
Users must interpret, steer, and validate outputs to make them usable—often without clear guidance.
This paper explores how trust breaks down when that burden is unmanaged.
Status: Working paper (v1.0, February 2026)
As AI systems increasingly persist across months and years of use, governance challenges shift from discrete failure modes toward slow-moving sociotechnical dynamics—creeping reliance, authority normalization, evolving trust relationships, and identity-shaping workflows. Contemporary deployment pipelines emphasize telemetry, benchmarks, and short-horizon audits, yet many of these effects remain structurally invisible.
In practice, organizations already depend on a small subset of highly engaged users to surface emergent risks, translate system changes into lived consequences, and articulate governance gaps before they appear at scale. These users function as an informal interpretive layer in deployment—one that is structurally relied upon but rarely designed, compensated, or audited.
This paper introduces translator trust as a governance construct for long-horizon AI systems: institutional pathways that authorize, resource, and bound human interpretive labor required to make slow-moving deployment dynamics legible. We argue that this interpretive labor constitutes an infrastructural dependency and should be institutionalized rather than left ad hoc. Drawing on extended creative deployments and emerging agentic architectures, we analyze how translator roles arise, why informal reliance produces governance vulnerabilities, and how programs can be designed for pluralism, rotation, auditability, and independence protections. We conclude with implications for research labs, product teams, and regulators.
Keywords: AI Governance, Sociotechnical Evaluation, Long-Horizon Deployment, Post-Deployment Monitoring, Interpretive Labor, Algorithmic Auditing, Red Teaming (Human-in-the-loop).
Translator Trust: Governing Interpretive Labor in Long-Horizon AI Systems (pdf)
I’m a musician studying how AI tools are actually used in real creative work over time.
Instead of approaching AI from a purely technical or theoretical perspective, I focus on what happens when these systems are used continuously in real workflows—where they help, where they break, and what gets lost between sessions.
I use AI as part of my ongoing creative process and document what actually happens across weeks and months of use, not just isolated experiments. This reveals patterns that are often missed in short-term testing, including issues around continuity, memory, trust, and creative control.
This work translates real creative experience into insight for how AI systems are designed, evaluated, and improved—especially in creative and applied contexts.
The materials below explore this embedded, practice-based approach and its implications for creative work, tool design, and long-horizon human–AI collaboration.
This work is directly applicable to teams designing, evaluating, and improving AI systems for sustained, real-world use.
I provide grounded insight into how AI systems actually perform in sustained use—where they hold up, where they break, and what that means for real users.
This repository contains longitudinal case studies (2025–2026) of human–AI collaboration in music production. Research focus: Continuity Stewardship, Reset Costs in model transitions, and Interpretive Labor. Dataset includes 14+ months of interaction logs documenting project recovery and long-horizon creative alignment.