Automation complacency is a documented phenomenon in aviation and nuclear safety. The question is whether AI in knowledge work follows the same pattern. When the system handles the routine, the human stops practicing judgment. That's a risk that doesn't show up in productivity metrics.
Readings
Articles, papers, and references that shaped my thinking. Grouped by theme, annotated with why I saved them.
AI Governance & Security
Semantic drift and behavioral drift in production AI systems. The quiet failure mode: the model's outputs shift over time while the inputs look the same. Most monitoring frameworks don't catch this. Accountability requires seeing what you can't easily instrument.
Deloitte names five enterprise blockers for scaling AI agents: explainability, auditability, hallucination, orchestration complexity, and regulatory exposure. The framing that stuck -- governance isn't a layer you add after deployment. It's an architectural decision made at design time.
The same pattern that created shadow IT in the 2000s is playing out again with AI tools. Employees adopt tools the organization hasn't vetted, data flows to platforms no one agreed to, and audit trails disappear. The solution isn't prohibition -- it's governance that moves at the speed of adoption.
A practical architecture for using multi-agent systems to automate cybersecurity risk assessment for SMBs. The interesting part is the decomposition -- different agents handle different risk domains, with a coordinator synthesizing the output. Applies directly to what a private AI appliance could do for a Canadian firm with no dedicated security staff.
The UK data regulator working through what autonomous agents mean for accountability. The core question -- who is responsible when an agent makes a decision -- doesn't have a clean answer yet. Worth watching as a signal for where PIPEDA enforcement is heading.
Private AI & Data Sovereignty
On-device voice AI that never sends audio to a cloud server. Runs on Apple Silicon. The implication for regulated industries -- healthcare, legal, financial -- is clear: voice interaction without cross-border data exposure. The technical barrier is lower than most people assume.
AI-native applications don't need 40 SaaS integrations. The consolidation argument: if a model can do what five tools did, the subscription stack collapses. For SMBs, this creates a window to build on a smaller, more controllable foundation -- which is what private AI infrastructure is for.
AI Agents & Architecture
Evolution Strategies as an alternative to backpropagation for training large models. The practical upshot: models trained without gradients can use discrete weights and non-differentiable objectives, which opens the door to training on hardware that can't run backprop at scale. A quiet shift in what 'training' means at the edge.
The continual learning problem: current AI systems can't update their knowledge without catastrophic forgetting. Each new capability requires retraining from scratch. LeCun's framing -- we're missing a fundamental architectural component -- is worth sitting with. The models we deploy today are snapshots, not learners.
750,000 experiments showing that LLMs don't parse code structurally -- they pattern-match on token sequences. The failure modes are specific and predictable. For anyone using AI for code review or security analysis, this changes what you can trust it to catch.
Context -- not model capability -- becomes the competitive moat. Whoever owns the graph of relationships between entities in a domain controls the quality of AI output in that domain. This is the architectural argument for why enterprise AI is hard to commoditize.
The framing that stuck: AI will eventually research itself faster than humans can. The first-order implication is obvious. The second-order one is more interesting -- what happens to human expertise when the loop closes? Karpathy is unusually honest about what he doesn't know.
Prompt injection at scale is not an academic problem. When agents have persistent memory and tool access, a single malicious input can propagate across sessions and systems. The attack surface grows non-linearly with agent autonomy.
Future of Work & Engineering
A granular framework for thinking about which tasks AI touches, not which job titles. Exposure varies enormously within the same role. The nuance matters for planning -- the question isn't 'will my job exist' but 'which parts of my day are changing and when'.
The data contradicts the displacement narrative: software engineering job postings up ~11% year-over-year in early 2026. AI automates the mechanical parts, which historically increases demand for people who can design and reason about systems. Every previous wave followed the same pattern.