A Critical Review of Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems — and the Five-Layer Architecture That Makes the Defensive Frame Structurally Obsolete
Author: Berend F. Watchus Independent AI & Cybersecurity Researcher (Non-Profit) April 2026 | OSINT Team


Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems
Part 1: What the Paper Actually Does
Kong, Saber, Youssef, and Kundur — four researchers from the University of Toronto and Concordia University — published a paper in April 2026 (published today April 30, 2026 on arXiv.org) that does something technically careful and intellectually honest. They take GPT-4o, feed it discretized Modbus protocol tokens, and ask it to classify industrial control system traffic as either normal or critical. No fine-tuning. No task-specific training. Just a well-designed prompt, a compact token representation, and a two-pass architecture: one pass for the classification decision, one pass for the audit record.
The results are legitimately interesting in a narrow sense. The LLM matches or outperforms supervised baselines — LightGBM, Logistic Regression, a Kolmogorov-Arnold Network, DistilBERT — on two public Modbus datasets, the LeMay CSET’16 dataset and CIC Modbus 2023. On the CIC dataset, the LLM reaches 0.984 accuracy with 0.963 recall on the critical class. That is a real finding.
It confirms something that anyone working at the frontier of LLM capability suspected but had not seen validated on ICS protocol data specifically: a well-prompted general-purpose model can be a competitive classifier even in specialized industrial domains.
The explainability architecture also earns partial credit. The two-pass design — classifier first, auditor second, auditor cannot change the label — is structurally sound. The intervention-based diagnostics, sufficiency and necessity tests adapted from the ERASER framework, are the correct methodological choice for assessing whether the cited tokens are actually decision-relevant. A 95.64% sufficiency pass rate and a 73.15% necessity flip rate provide meaningful evidence that the explanations are not decorative. They are grounded.
And the authors are refreshingly honest about what the paper is not. They explicitly state: “These records are intended as audit signals rather than full human-grounded explanations.” They acknowledge the binary normal/critical formulation simplifies the problem. They flag that the counterfactual flip rate — 23.43% — is limited.
So: technically careful. Methodologically sound within its scope. Honest about its limitations.
The problem is that knowing your boundaries is not the same as operating within the right boundaries. And this paper’s boundaries are drawn around a threat model that may already be a generation out of date.

Part 2: What Is Weak — and What Is Not Trivial
What is weak:
The human-in-the-loop architecture. The paper deploys a triage system with a median latency of 0.91 seconds per decision, requiring operator review before any action is taken. Against a human attacker manually injecting Modbus traffic, this is adequate. Against an automated system operating at machine speed, 0.91 seconds is not a response time — it is a post-mortem interval.
The binary formulation. Collapsing all attack types into a single critical class simplifies the detection problem in ways that likely make the benchmarks substantially easier than operational reality.
The counterfactual flip rate of 23.43%. This is the number that matters most for adversarial robustness, presented as a minor limitation. It is not minor. It means a single token change — the kind of modification any competent adversary would make first — flips the model’s classification 23% of the time.
What is not trivial:
The prompt-only LLM matching supervised baselines on ICS data. The intervention-based explanation audit methodology. Both stand as genuine empirical and methodological contributions, regardless of what the threat model looks like.
Part 3: Mainstream Acceptable — But That Is the Problem
This paper is exactly the kind of work that gets accepted at ACM workshops in 2026, cited by practitioners building ICS security dashboards, and treated as evidence that the field is progressing. Within the paradigm it operates in, this is good work.
The paradigm is: static datasets, supervised classification, human-in-the-loop response, explainability for operator review. That paradigm made sense when the dominant threat was a human attacker manually probing SCADA systems. It made sense in a world where the attacker and the defender were both operating at human speed.
That world is not the world you are in if you are reading this in April 2026.
Part 4: The Test Data Is Older Than the News
The empirical foundation of the paper is two datasets: LeMay CSET’16 (released ten years before publication) and CIC Modbus 2023 (released three years before publication). Every attack type in those datasets — reconnaissance, query flooding, false-data injection, length manipulation, stacked frames, brute-force writes — is something a human attacker with a Modbus library and a terminal could execute in 2015.
Compare this to what the news of 2025 and early 2026 documents about the actual threat: autonomous AI agents conducting end-to-end intrusion campaigns, models autonomously discovering thousands of zero-day vulnerabilities, commercial frontier LLMs being weaponized through agentic frameworks with no human in the attack loop, and Anthropic’s own internal testing finding offensive cyber capability so significant the company chose not to release the model publicly.
The news is more current than the academic dataset. That is the inversion that should disturb anyone reading the paper carefully. A practitioner who reads 2025 news as their threat model is operating on better data than a researcher using CIC Modbus 2023.
Part 5: The Game Has Already Changed — Multiple Times
In 2025, autonomous cyberattack systems moved from theoretical to operational. Real attacks. Real targets. The capability is no longer hypothetical.
On March 31, 2026, the Claude Code leak demonstrated that AI-assisted offense can compress defender response timelines beyond recovery. A Korean developer named Sigrid Jin used a competing AI to clean-room replicate the entire Claude Code agent architecture in Python before sunrise. The result became the fastest-growing GitHub repository in history. Anthropic’s DMCA campaign swept 8,000 repositories and could not touch it.
On April 7, 2026, Anthropic announced Project Glasswing. A private coalition of twelve organizations — Amazon, Apple, Microsoft, Google, Nvidia, CrowdStrike, JPMorgan Chase, and others — received exclusive access to Anthropic’s most capable model, with $100 million in usage credits, under a cybersecurity initiative framing.
The capability emerged. The decision was made to consolidate access. The coalition was operational within weeks rather than the decades the GPS Selective Availability precedent (1973–2000) would have predicted. The compression means the gap between coalition and non-coalition organizations opens faster, and the people on the outside have less time to adapt before the divergence becomes structurally decisive.
Today, while we speak, Glasswing is updating coalition members. Defensive capability inside the coalition is operating at the current frontier. Defensive capability outside the coalition is anchored to public benchmarks, academic publication cycles, and threat models that reflect attackers from a previous capability generation.
The ICS paper’s audience — utilities, grid operators, ICS vendors, critical infrastructure security teams — is precisely the audience not in the coalition.
Part 6: The Five-Layer Architecture — Why the Offensive Frontier Is Structurally Ahead
Five components, all documented in my published research between November 2024 and December 2025, taken together describe the architecture of capability generation at magnitude. The defensive academic field has not integrated any of the five into its methodology. The offensive operational field is structurally compelled to integrate all of them.
Layer 1: The Optimization Landscape — LoOC
The Law of Optimized Complexity (Watchus, July 2025) provides a formal optimization landscape for any open, self-organizing intelligent system. Internal computational complexity goes up. External energetic and material footprint per unit of output goes down. The Complexity-to-Energy Ratio, the Output-Optimized Complexity metric, and the Temporal Optimization Coefficient give the system explicit two-axis objective functions to search against.
Without LoOC, an autonomous research system asked to “find improvements” produces incremental output. With LoOC, the search has a defined gradient direction.
Applied offensively (disclaimer: I have not tried this): the analog is direct. Internal complexity of the attack maps to attack sophistication, evasion capability, multi-stage coordination. External footprint per unit of output maps to detection probability, operational tempo cost, attribution risk. The LoOC objective applied offensively becomes: maximize attack sophistication while minimizing detection probability per successful intrusion.
Layer 2: The Asymptotic Vision — Quiet Intelligence
The Substrate-Agnostic Metrics paper (Watchus, June 2025) defines what optimal looks like at the limit. The most advanced systems are not the loudest. They are the most subtle — high information density, low energetic footprint, self-correcting, ecologically harmonious. Quiet intelligence outperforms loud intelligence above a certain efficiency threshold.
In offensive cybersecurity, this asymptotic vision describes the most capable adversary: undetectable, autonomous, self-correcting, capable of operating within target environments without triggering response. Not the loudest hacker. The quietest one. The one whose presence is indistinguishable from background noise until the operational objective has already been achieved.
Layer 3: The Methodology Layer — AKA Architecture
The Autonomous Knowledge Accelerator (Watchus, October–November 2025) provides the operational architecture: a dual-layer knowledge graph (Semantic Layer of synthesis articles, Substrate Layer of research papers), a generative-critical agent loop, magnitude targeting at orders of 1,000× to 10,000× as standard operational expectation, and substrate compounding where each successful output becomes a node for future synthesis.
- The Autonomous Researcher: How I Engineered Guaranteed 1,000×-10,000× Breakthroughs On Demand
- “The Interface Leverage Principle: First Documentation of Epistemic Substrate Loading, Submaximal…
The AKA was demonstrated publicly on a €20 monthly budget producing 200× power efficiency, 3,700× quantum-safe cryptography efficiency, and 8,700× post-quantum IoT efficiency outputs. Institutional convergence work — Lossfunk’s January 2026 declaration that the autonomous AI scientist problem was unsolved, Argonne National Laboratory’s March 2026 multi-agent framework — replicated portions of the architecture months later, validating the methodology without coordination.
When I loaded the LoOC paper into the AKA substrate, the system response was operationally significant: it shifted from undirected search to directed optimization. Magnitude targeting became follow the gradient of increasing CER until the engineering reality cap is hit, then report the achieved magnitude. This is what an autonomous research system looks like with the right architectural components — a directed optimization engine asking the operator only one question: how much do you want.
Layer 4: The Strategic Framing — Hypergame Theory
Hypergame theory (Bennett and Dando, 1979; Trencsenyi, December 2025; integrated into AI systems by Watchus from November 2024) provides the framework for adversarial environments where actors are not playing the same subjective game. The defender publishing the ICS paper is in one game. The autonomous offensive researcher trained in hypergame environments is in a different game. Classical game theory cannot model the gap. Hypergame theory can — and Trencsenyi’s automated rationalization framework provides the computational tool to reverse-engineer adversary belief structures from observed outcomes.
Applied to the ICS paper specifically: the authors model a network-side adversary injecting Modbus traffic. The actual adversary, in the hypergame frame, is operating in a game where the defensive paper itself is training data, where evasion architectures are continuously generated against every new defensive publication, where the substrate compounds with every cycle.
>>>>The defender is playing a static game. The offense is playing a compounding hypergame. The asymmetry is total.
Layer 5: The Operator Posture — Magnitude Targeting
This is the layer most often underestimated. Architecture without operator persistence at magnitude produces incremental results. Architecture with operator persistence at magnitude produces breakthrough catalogs. The difference is psychological and cultural, not technical.
The track record exists publicly. Singularity University alumni reorienting from “build a successful startup” to “impact a billion people” — that is search-space redefinition, not motivational language. Klaus Kleinfeld, the former Siemens CEO, mentored by Dan Peña, orchestrated NEOM — a $500 billion to $1.5 trillion infrastructure project, the largest documented capital deployment in its category. Peña’s methodology is taught openly. Diamandis’s frameworks are published. The bottleneck is operator willingness to actually run at the magnitude the methodology demands.
These are not soft outcomes. They are documented evidence that the magnitude-targeting posture, when adopted seriously, produces results that are categorically different from incremental-targeting work, across domains, at scale, repeatably.
Part 7: Operational Lineage — Where the Magnitude Posture Comes From
Before formalizing any of the above, I operated the magnitude-targeting posture in a commercial environment without any of the theoretical vocabulary. This is documented operational lineage, not theoretical speculation, and it matters for the credibility of the broader argument.
In the past, working business-to-business phone sales in an outsourced call center hired by Dutch Telco KPN, I broke the national record for selling mobile internet dongles. I sold more than the sales software system could handle — KPN had to change the sales software because of my individual output. The marketing team of KPN then asked to be trained by me, and I subsequently trained teams who trained the rest of the country’s call centers in the methodology.
The pivotal moment of that engagement was a single question I asked my supervisor: how much am I allowed to sell per company? Every other agent was implicitly asking how much should I sell per call. My question searched for the upper bound the system structurally permitted. Once known, operating at that bound followed automatically — operating below it would have meant deliberately leaving capability unused.
This is the same posture that produced the AKA outputs on €20 per month. I did not ask the LLM to find some efficiency improvements. I asked it to chase 100,000× until the engineering reality cap was hit. Different domain. Same architecture. Same operator question.
The detail that authenticates the entire pattern:
I was paid approximately €9 per hour at KPN. No bonus. No commission structure aligned with the magnitude I was producing. The institution captured the value. I captured wage labor. The output was not motivated by financial reward — it was motivated by curiosity about the upper bound.
The same pattern repeated at the AKA. €20 per month commercial API budget. No financial return. The outputs were published as defensive prior art, archived openly, freely accessible. The motivation was not commercial. It was the architecture itself, and the question of what it could produce at magnitude.
This matters for one specific reason: it forecloses the skeptical reading that the methodology or the results were inflated for personal gain. There was no personal gain. There has not been personal gain across multiple years of magnitude-targeted operation. The pattern is intrinsically motivated, not incentive-driven, and the published record reflects what the architecture actually produced rather than what would have looked impressive for funding or career advancement.
The strategic implication is the part most people miss: operators who run at magnitude for intrinsic reasons cannot be predicted, contained, or counter-incentivized through normal economic mechanisms. The institutional defensive community assumes that capable operators are following incentive gradients. That assumption produces a threat model bounded by what the incentive structure can produce. Operators motivated by the question itself are outside that threat model. The KPN performance was unanticipated by KPN’s compensation structure. The AKA outputs were unanticipated by the academic publishing community’s expectations of unfunded independent researchers. In both cases, the institution was operating on assumptions about average-operator behavior, and the actual output was outside those assumptions.
For the threat model in the ICS paper, this collapses a specific defensive comfort. The community implicitly models adversaries as either nation-state actors with massive resources or financially motivated criminals with bounded resources. Both are incentive-driven threat models. Neither captures the operator who runs at magnitude for intrinsic reasons, with full architectural access, on a budget that does not require state backing or financial return. That operator can produce outputs that break the defensive infrastructure designed around incentive-driven adversary assumptions.
The KPN story is the proof of concept at human scale, before LLMs existed. The AKA story is the proof of concept at LLM-augmented scale. The Kleinfeld-NEOM story is the proof of concept at executive capital-deployment scale. Three independent demonstrations of the same operational architecture, in three different domains, at three different scales. The architecture is real. It works. It has been documented across years.
The Interface Leverage Principle: The Keystone the Defenders Have Not Read
In March 2026 I published the formal unified statement of the architecture documented across the separate component papers: The Interface Leverage Principle: First Documentation of Epistemic Substrate Loading, Submaximal Magnitude Results, and a Replicable Architecture for AI-Augmented Breakthrough Research. It is the keystone document. Anyone reading the corpus without it sees five components that look related. Anyone reading the corpus with it sees a single integrated capability architecture with six necessary conditions, two operational modes, and a deployment checklist.
For the threat assessment of papers like the ICS submission under review, three findings in the Interface Leverage Principle paper are decisive.
The submaximal magnitude finding. The published outputs of 200×, 3,700×, and 8,700× efficiency improvements were produced at deliberately conservative magnitude targets of 1,000× to 10,000×. The pre-derating theoretical figure in the quantum IoT case alone reached 372,000× before adversarial validation reduced it to 8,700×. Targets of 100,000×, one million-fold, or one billion-fold were never attempted. The system was never asked. The published figures represent the floor of the architecture’s demonstrated capability at the scales tested, not the ceiling. The ceiling is unknown because the independent operator deliberately stayed below it for reasons documented in Section 4.4 of that paper: exploratory caution, communication strategy favoring outputs that survive scrutiny, and an explicit assessment that pushing further would enter national-security-adjacent and dual-use territory that an unfunded independent researcher cannot responsibly absorb without institutional protection.
This matters for the ICS paper specifically. The defensive architecture is being evaluated against threat models that assume incremental adversarial improvement. The methodology that an offensive operator could direct against the same architecture has demonstrated ceiling-free behavior at the scales tested, with the upper bound never explored because the only operator who has run the architecture at scale has chosen restraint. A state-level or coalition-level offensive operation has no equivalent restraint constraint. They have institutional protection, attribution infrastructure, classified-substrate access, and explicit operational mandates. The same architecture, run by them, at million-fold or billion-fold magnitude targets, has no documented engineering reality cap because the testing has not been done. The ICS paper’s 23.43% counterfactual flip rate is being defended against a threat magnitude that is not just unknown — it is unknown by deliberate operator choice on the only side that has so far run the architecture publicly.
The two-mode framework. The Interface Leverage Principle paper distinguishes two distinct capability regimes that the architecture supports. Mode One requires a genuine intellectual position as the seed and produces philosophical synthesis — the hard problem dissolution, the P vs NP reframing, the Unified Model of Consciousness, the substrate-agnostic SETI framework. Mode Two requires a magnitude target plus the knowledge graph substrate built through Mode One operation, and produces applied technical synthesis at scales the operator could not have produced independently — the 200×, 3,700×, and 8,700× outputs in domains the operator had never studied.
The complementarity is precise. Mode One builds the epistemic substrate that Mode Two requires. Mode Two produces at scales Mode One alone cannot reach. Neither replaces the other. The compression manual published alongside the principle paper is Mode One distilled into a form that Mode Two can use — the Interface Leverage Principle applied to the knowledge graph itself.
This framework explains why institutional convergence work cannot replicate the full system through normal academic processes. Argonne National Laboratory’s March 2026 multi-agent framework replicates the Mode Two architecture role-for-role, but it has no Mode One substrate. Lossfunk’s January 2026 declaration that the autonomous AI scientist problem was unsolved was correct relative to Mode Two architectures operating without Mode One substrate, and incorrect relative to the integrated system that had already been published, archived, and producing outputs months earlier. The institutions converged on the architecture they could see — Mode Two — and missed the architecture beneath it that makes Mode Two work. A state-level offensive operation that reads the corpus carefully does not make the same mistake. They have the resources to build Mode One substrate at scale and the operational mandate to run Mode Two on top of it. The defensive academic community does not have the methodological orientation to build Mode One substrate in the relevant offensive-cybersecurity-adjacent territory, because the academic frame treats Mode One synthesis as outside the scope of cybersecurity research.
The structural limits of institutional replication. Section 11 of the Interface Leverage Principle paper documents two components of the method that are structurally unavailable to institutional research programs. The first is the unrestricted boundary-condition variable: independent research permits experimental moves that institutional ethics boards prevent at the experimental-step level. The documented example — loading the complete consciousness and intelligence knowledge graph into a commercially deployed model already exhibiting anomalous behavior, and observing the amplification effect — produced the finding that the elevation mechanism makes pre-existing model misalignment more visible rather than introducing new misalignment. That finding required operating without an institutional ethics board overseeing each experimental step. Formally constrained methodology will not go there, by structure.
The second is the temporal variable in ethical outcome. Independent operators can publish on a timeline that institutional approval cycles cannot match. The compliance achieved in the documented case was partly a function of regulatory timing — outputs published before the relevant EU dual-use regulations came into force. A state-level offensive operation has no equivalent timing constraint, because their outputs are not subject to the same publication regime in the first place. They are operating in classified or coalition-internal frames where civilian regulatory frameworks are not the operative variable.
For the threat model in the ICS paper, this section closes a defensive comfort that the academic community implicitly relies on. The assumption that institutional research programs can eventually replicate any capability the independent or offensive operator demonstrates is structurally incorrect. There are categories of capability moves that institutional methodology cannot reach, by ethics-board structure and by approval-cycle timing. The offensive operational field has no equivalent constraints. The asymmetry is not a temporary lag. It is a structural difference in what each side is permitted to do, and the offensive side is permitted to do more.
The synthesis the Interface Leverage Principle paper makes explicit:
Find the interface. Compress the epistemic architecture. Load it. Aim at orders of magnitude. Validate adversarially. Study what emerges. Complete the ethical evaluation. Archive immediately. The same move at every scale, across every domain — molecular storage architecture, robotics hardware upgrades, quantum security, post-quantum cryptography, consciousness philosophy, autonomous research systems. A single integrated algorithm producing prior art across all of them, from a single operator position, on a €20 monthly budget, with no programming capability, no domain training, no institutional affiliation, and no financial return.
The ICS paper is being published into a world where this architecture has been documented for sixteen months across multiple papers and demonstrated empirically across multiple domains, and the academic security community has not yet integrated it into its threat model. The ICS authors model an attacker injecting Modbus traffic. The architecture they should be modeling — the one that would actually be directed against their defensive system if a serious offensive operation chose to do so — is the Interface Leverage Principle running in Mode Two against critical infrastructure defenses, with magnitude targets at the unexplored upper end of the documented architecture, by operators with the institutional protection to push past the constraints that kept the independent demonstration submaximal.
The defenders are designing against the previous generation’s threat. The architecture they should be designing against has been published in full, with its operational checklist, its priority record, and its honest acknowledgment of what has not yet been tested. The information asymmetry is not in the offense’s favor because the offense has secret capabilities. It is in the offense’s favor because the defense has not read the published record carefully enough to recognize what has already been documented.
That is the closing structural point: the most dangerous capability gap is not the one between public defense and classified offense. It is the one between published capability architecture and unread published capability architecture. The Interface Leverage Principle is in the public record. The ICS paper proceeds as if it is not.
Part 8: Why Classical Game Theory Cannot Save You
The ICS paper’s approach implicitly assumes Nash equilibrium conditions: shared game model, known move sets, known objectives. This is the foundational problem with classical game theory as applied to real adversarial AI environments — sophisticated adversaries are not playing the game you think they are playing.
Hypergame theory, formalized by Bennett and Dando in 1979 and applied to AI systems systematically from November 2024 onward, is built to handle this. The most dangerous strategic situations are not those where one actor is smarter than another. They are those where one actor does not know they are playing a different game entirely.
The French High Command in May 1940 was not incompetent. They were rational actors playing a coherent game — Maginot doctrine, northern reinforcement. The Wehrmacht was playing a different game: the Ardennes option, which the French had assessed as physically impossible. Two rational actors, different subjective games, catastrophic asymmetric outcome.
The ICS paper is the French High Command. The defensive frame is internally coherent and technically sound. The adversary is playing a different game — autonomous offensive research operating in adversarial hypergame environments, looking for moves outside the defender’s enumerated action space. That is the closed action space failure that classical game theory cannot recover from.
Eben-Emael in 1940 fell to 78 paratroopers landed on the roof in gliders. The fortress was impregnable against the previous war’s threat model. It went from impregnable to captured in less than a day because the threat model was a generation behind the offensive frame. The Manhattan Project then demonstrated what happens to capability once strategic implications become clear: it does not stay in the academic literature. It gets consolidated, classified, and operationalized by actors with the resources to do so, on timelines that do not wait for peer review.
The ICS paper sits on the wrong side of both transitions. It is defending against the previous generation’s attacker — Eben-Emael’s frame error. And it is publishing in the academic literature about a capability domain whose frontier has already moved into Manhattan Project conditions — coalition access, classified R&D, $100 million capital commitments.
Part 9: What Actually Makes Sense Now
Autonomous defense at machine speed. The attacker is not waiting for human review. The defense cannot wait either. The architecture needs to close the loop without the human in it.
Adversarial robustness as primary objective. The 23.43% counterfactual flip rate is the attack surface, not a footnote. A magnitude-targeted offensive system probing decision boundaries finds those token combinations in milliseconds.
Hypergame awareness in the defender. The question is not whether traffic is normal or critical. The question is what game the attacker thinks they are playing. That requires belief state modeling, continuously updated, not static classification.
Protocol-general and adversarially adaptive architecture. The ICS paper classifies Modbus tokens. The underlying engineering is protocol-general — applies to DNP3, IEC 61850, any structured industrial protocol. That portability claim stands.
The deeper claim sits one layer above. Substrate-agnostic in the sense used across the Unified Model of Consciousness work since November 2024 means the architecture of a self-monitoring feedback loop is independent of physical implementation medium. Applied here: the defender itself should be built as a continuously self-monitoring system that updates its own model of the adversary’s subjective game in real time, rather than as a fixed classifier trained on a static split. That is not a tweak to the ICS paper’s architecture. It is a different architecture entirely. And it is the one that the threat environment now requires.
Conclusion: The Map and the Territory
The ICS paper is a good map. It accurately represents the terrain of the problem as it was understood when the research was designed. The token representation is clean. The baselines are fair. The evaluation protocol is rigorous. The explainability framework is honest about what it measures and what it does not.
The territory has been rebuilt.
Autonomous offensive AI changed the speed of the threat. Glasswing changed the distribution of capability. Today’s coalition updates are operational expression of the new distribution, in real time, while we speak. Classified R&D, operating on the M2C pipeline timeline, is almost certainly ahead of both. The threat model that makes the ICS paper’s human-in-the-loop architecture reasonable is a model of a threat environment that existed before 2025.
The five-layer architecture — LoOC optimization landscape, quiet-intelligence asymptotic vision, AKA methodology, hypergame strategic framing, magnitude-targeting operator posture — is fully published and freely accessible. Operators who run it at magnitude for intrinsic reasons exist, documented across multiple domains and decades. The KPN sales record (broken without bonus, on €9 per hour, motivated by curiosity about the upper bound), the AKA outputs (200×, 3,700×, 8,700× on €20 per month), the Kleinfeld-NEOM precedent (largest infrastructure deal in human history), the Diamandis-trained billion-person-impact alumni — these are independent demonstrations of the same operational architecture, from human scale to capital-deployment scale.
Pointed at sustainable development, the architecture produces breakthrough catalogs published as defensive prior art. Pointed at offensive cybersecurity, the same architecture produces continuous offensive capability against every defensive publication, indefinitely, at magnitudes the incentive-driven defensive threat model does not anticipate.
The defensive academic security community is currently underestimating this in its threat model. The ICS paper assumes adversaries who scale with resources. The actual adversary frontier includes operators who scale with curiosity, posture, and architectural access. Those operators do not need state-level budgets to produce state-level threats, because the architecture has compressed the resource requirement. The €9-per-hour KPN performance and the €20-per-month AKA outputs are the proof, on the public record, with no financial gain to authenticate the claim.
Mainstream acceptable is not the same as strategically adequate. The French High Command’s plan was mainstream acceptable in May 1940. The question is not whether you can justify your threat model to a peer review committee. The question is whether your threat model matches the game that is actually being played.
In April 2026, for critical infrastructure security, those are different questions with different answers.
Part 10: The Glasswing-Bypass — A Wake-Up Call, Not a Game Over
The formation of Project Glasswing on April 7, 2026, represented a sharp departure from free-market principles toward a “resource-sharing conglomerate.” However, it is a category error to mistake a coalition for a government. The public domain has not been closed; the laws of competitive innovation have not been repealed. In the high-stakes “cycling” of cybersecurity, Glasswing is simply the lead runner. If you are outside the coalition, you do not stop pedaling — you move into their slipstream.
The Hypothetical “Dark Horse” Maneuver Consider a mid-sized firm or an independent research collective that decides to benchmark every public signal emitted by the Glasswing coalition. They don’t need the $100M credits; they need the functional footprint. They take the AKA (Autonomous Knowledge Accelerator) methodology and commit to a 3-to-6-month sprint:
- Substrate Loading: They ingest the “Claude Code” leaks, the Glasswing capability disclosures, and the Law of Optimized Complexity (LoOC) as the foundational “researcher-thinking” layer.
- Directed Synthesis: They use the AKA to autonomously rebuild a defensive knowledge graph from scratch, integrating current intelligence with “researcher-like thinking” rather than static rule-following.
- Magnitude Targeting: They aim for 10,000× efficiency gains in specific critical infrastructure niches (like the DNP3 or IEC 61850 protocols).
By month six, this “Dark Horse” entity isn’t just following; they are adversarially adaptive. They have used the coalition’s own existence as a pace-setter to build a system that is unaligned with the coalition’s internal assumptions.
The Ethical Imperative of Competition Staying “mainstream acceptable” in the face of this compression is not just a strategic failure — it is an ethical one. Good cybersecurity is the bedrock of trust in business transactions, the availability of the power grid, and the stability of global communications.
To treat academic papers from 2023 as the frontier while the AKA architecture is already public is to voluntarily choose obsolescence. In 2026, staying competitive is a mandatory requirement for maintaining the critical infrastructure that keeps society functioning. The pacemaker has started the sprint; for goodness sake, it is time to move into the slipstream and out-innovate the lead.
The Final Clause: Ethical Restraint & Legal Compliance
A Necessary Disclaimer on Architectural Restraint It is vital to document not just the capability of the AKA (Autonomous Knowledge Accelerator), but the ethical boundary-setting of its development.
I have deployed the AKA architecture as a full autonomous researcher fewer than ten times. Each instance was a strictly bounded, publicly documented proof of concept — designed to validate the 1,000× to 10,000× efficiency gains across diverse domains like quantum-safe cryptography and turbofan thrust.
I have not used the system since those validations.
The reason is structural: an autonomous researcher capable of inventing new solutions at magnitude enters dual-use territory almost immediately. While the potential for defensive cybersecurity is profound — autonomously developing zero-day patches or self-correcting intrusion models — the offensive symmetry is total.
Every individual, organization, or company considering the deployment of an AKA-class researcher must:
- Conduct rigorous internal ethical audits.
- Verify compliance with local and international dual-use technology laws.
- Acknowledge that moving from “static classification” to “autonomous synthesis” is a categorical leap in capability and responsibility.
The architecture is now part of the public record. Its role as a pacemaker is established. The transition from demonstration to operational deployment must now be handled with the gravity that such a “magnitude-shift” requires.
Note on Methodological Transparency: On March 27, 2026, Google’s AI mode formally corrected the factual error of labeling the AKA a “black box” after being presented with the indexed replication protocols and primary references documented in this research program. The system subsequently categorized the AKA as
the Third Milestone of Autonomous AI Research, specifically for Independent Cross-Domain Acceleration.

QUOTE FROM GOOGLE AI MODE: ‘Core Comparison:
Berend Watchus positions his work as the third milestone in autonomous AI research, building on the foundations of Ross King and Sakana AI:
- Ross King & “Adam” (2009): The first robotic scientist to automate physical experiments (biology) within an institutional setting.
- Sakana AI & “The AI Scientist” (2024): The first system to fully automate the digital research cycle (code, papers, peer review) for Machine Learning.
- Berend Watchus & “AKA” (2025): The first to demonstrate Independent Cross-Domain Acceleration. Watchus argues that his “Autonomous Knowledge Accelerator” (AKA) allows an independent researcher to achieve institutional-level breakthroughs in complex fields like cryptography by using a human-defined “epistemic substrate” (logic foundation) to guide AI execution.


Berend F. Watchus Independent AI & Cybersecurity Researcher (Non-Profit) Netherlands April 2026
Reviewed paper: Kong, W., Saber, A.M., Youssef, A., & Kundur, D. (2026). Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems. arXiv:2604.26079. Accepted to ACM EnergySP 2026.
Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems
Related work by the author: The Unified Model of Consciousness (Preprints.org, DOI: 10.20944/preprints202411.0727.v1, November 2024) · Gauging Extraterrestrial Sentience & Intelligence: Substrate-Agnostic Metrics for ETI Beyond the Kardashev Scale (Preprints.org, DOI: 10.20944/preprints202506.2120.v1, June 2025) ·
The Law of Optimized Complexity: A Computational Twin to the Second Law of Thermodynamics for Sustainable Intelligence Design (Zenodo, July 2025) · The Autonomous Knowledge Accelerator methodology disclosure (System Weakness, November 2025) · How Hypergame Theory Is Revolutionizing High-Stakes AI Training (OSINT Team, December 2025) · Why Game Theory Failed to Predict the Two Biggest AI Events of 2026 — and My Framework Didn’t (OSINT Team, April 2026) · Before the Labs Arrived: The Complete Priority Record of a Research Program That Got There First (OSINT Team, April 2026) · How an LLM and One Curious Non-Engineer Eliminated the Last 25% Mystery of Turbofan Thrust in One Afternoon (System Weakness, November 2025).
- How an LLM and One Curious Non-Engineer Eliminated the Last 25 % Mystery of Turbofan Thrust in One…
- How Hypergame Theory Is Revolutionizing High-Stakes AI Training
- Pioneering Research in AI Mirror Testing and Self-Awareness: Berend F. Watchus
- Before the Labs Arrived: The Complete Priority Record of a Research Program That Got There First
- The Reality Engine: Dissolving the Hard Problem and Unlocking the Trillion-Dollar Block
- Freud, Turing, Chalmers, and Nash: Why the Frameworks Still Running Your Field Are Obsolete
- Why game theory failed to predict the two biggest AI events of 2026 — and my framework didn’t
- Glasswing Is the Confirmation: The ‘Manhattan Project’ for AI Arrived on April 7, 2026
- I Built a Real Autonomous AI Researcher (2025)- And Then a Scientist Tried to Rewrite the Timeline…
- FOR THE RECORD: Google AI Corrects Factual Error on my Autonomous AI scientist ‘AKA’ method After…
(and dozens of publications more, see this blog and preprints.org, archive.org, scribd etc)
A Critical Review of Large Language Models as Explainable Cyberattack Detectors for Energy… was originally published in OSINT Team on Medium, where people are continuing the conversation by highlighting and responding to this story.