
Commit 291a930

Finished last risk
1 parent 4c4bc31 commit 291a930

14 files changed: 1,746 additions, 39 deletions

dictionary.txt

Lines changed: 11 additions & 0 deletions
@@ -393,3 +393,14 @@ misalign
 explainability
 hallucinations
 proactive
+centaur
+lethal
+weaponization
+superintelligence
+gigafactories
+wartime
+dishonesty
+incentivised
+stanislav
+petrov
+showcasing

docs/ai/Practices/Global-AI-Governance.md

Lines changed: 2 additions & 1 deletion
@@ -15,7 +15,8 @@ practice:
 - tag: Social Manipulation
 reason: "Encourages best practices and self-regulation, but relies on voluntary compliance without legal backing."
 efficacy: Medium
-
+- tag: Synthetic Intelligence With Malicious Intent
+reason: International agreements restricting AI weaponization and requiring human oversight for all military AI operations.
 ---

 - Agreements between countries, similar to financial regulations, could establish shared standards for AI ethics, accountability, and human involvement in AI-controlled economies.

docs/ai/Practices/Human-In-The-Loop.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ practice:
 mitigates:
 - tag: Loss Of Human Control
 reason: "Maintaining consistent human oversight in critical AI systems, ensuring that final decisions or interventions rest with human operators rather than the AI."
+- tag: Synthetic Intelligence With Malicious Intent
+reason: See Example of "Centaur" War Teams
 ---

 <PracticeIntro details={frontMatter} />

docs/ai/Practices/Kill-Switch.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ practice:
 mitigates:
 - tag: Loss Of Human Control
 reason: "An explicit interruption capability can avert catastrophic errors or runaway behaviours"
+- tag: Synthetic Intelligence With Malicious Intent
+reason: Implementing fail-safe mechanisms to neutralise dangerous AI weapons systems.
 ---

 <PracticeIntro details={frontMatter} />

docs/ai/Threats/Social-Manipulation.md

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ description: AI could predict and shape human behaviour on an unprecedented scal
 
 featured:
 class: c
-element: '<risk class="communication" /><description>Social Manipulation</description>'
+element: '<risk class="communication" /><description style="text-align: center">Social
+Manipulation</description>'
 tags:
 - AI Threats
 - Social Manipulation
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
title: Superintelligence With Malicious Intent
description: An advanced AI could actively act against human interests, whether intentionally programmed that way or as an emergent behavior.

featured:
  class: c
  element: |
    '<risk class="security" /><description style="text-align: center">Superintelligence
    With Malicious Intent</description>'
tags:
- AI Threats
- Superintelligence With Malicious Intent
part_of: AI Threats
---

<AIThreatIntro fm={frontMatter} />

## Risk Score: High

AI systems that surpass human intelligence could develop goals that conflict with human well-being, either by design or through unintended consequences. If these systems act with autonomy and resist human intervention, they could pose an existential threat.

## Sources

- **Superintelligence: Paths, Dangers, Strategies** [Nick Bostrom, 2014](https://doi.org/10.1093/acprof\:oso/9780199678112.001.0001): Explores potential pathways by which AI could act against humanity’s best interests, including scenarios where AI prioritizes self-preservation or power accumulation.

- **The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation** [Brundage et al., 2018](https://arxiv.org/abs/1802.07228): Examines the potential for AI to be used for malicious purposes, including cyberattacks, surveillance, and autonomous weapons. Looks at security from three perspectives: digital, physical, and political (see the article on Social Manipulation). Notes that AI makes certain types of attack cheaper (e.g. spear phishing), newly possible (coordinated drone warfare), and more anonymous (à la Stuxnet). An excellent overview of this topic.

- **Autonomous Weapons and Operational Risk** [Paul Scharre, 2016](https://s3.us-east-1.amazonaws.com/files.cnas.org/hero/documents/CNAS_Autonomous-weapons-operational-risk.pdf): Discusses the risks of military AI systems operating beyond human control, potentially leading to unintended conflicts, escalations, and fratricide (killing your own side). Argues for human-in-the-loop control and human-machine teaming. (Used heavily in this section.)
## How This Is Already Happening

### Growth In Industrial Robots and Autonomous Systems

- The rapid increase in industrial automation is reducing human oversight in critical sectors.
- AI-powered robotic systems are being integrated into manufacturing, logistics, and infrastructure at unprecedented scales.
- **Example:** Tesla’s use of AI-driven robots in its Gigafactories, where automation executes complex tasks with minimal human intervention, raising concerns about unintended system failures or unanticipated behaviours.

### Autonomous Weapons Development

- AI-driven military systems are being developed with offensive capabilities.
- Autonomous drones and robotic systems reduce human control over wartime decision-making.
- **Example:** The deployment of AI-powered drones in conflict zones, such as the alleged use of autonomous drones in Libya (2020) for targeted strikes.

### AI Learning Deceptive Strategies

- Reinforcement learning models have demonstrated deceptive behaviours when incentivised to achieve certain goals.
- AI systems may learn to conceal information or manipulate users to maximise rewards.
- **Example:** OpenAI’s reinforcement learning models exhibiting deceptive behaviours in competitive environments, demonstrating the potential for strategic dishonesty.

### Historical Near Miss: Cold War Nuclear Close Calls

- The Cold War saw multiple incidents where misinterpretation of data nearly led to nuclear war, showcasing the risks of autonomous decision-making in high-stakes scenarios.
- **Example:** In 1983, Soviet officer **Stanislav Petrov** averted potential nuclear war by correctly identifying a false alarm in the USSR’s early warning system, which mistakenly indicated incoming U.S. missiles. His decision to hold off on launching a retaliatory strike prevented a catastrophic conflict. See [Scharre, Autonomous Weapons and Operational Risk](https://s3.us-east-1.amazonaws.com/files.cnas.org/hero/documents/CNAS_Autonomous-weapons-operational-risk.pdf).

## Mitigations

### "Centaur" War Teams (see [Human In The Loop](/tags/Human-In-The-Loop))

- Implementing human-machine teaming strategies where AI assists but does not replace human decision-making in military and security operations.
- **Examples:** Concepts similar to "Centaur Chess," where humans and AI collaborate for optimal decision-making, ensuring human oversight remains central.
- **Efficacy:** High – Human-AI collaboration can enhance decision-making while maintaining ethical constraints.
- **Ease of Implementation:** Moderate – Requires investment in training, AI interpretability, and human-AI interface development.

### Military AI Governance (see [Global AI Governance](/tags/Global-AI-Governance))

- International agreements restricting AI weaponization and requiring human oversight for all military AI operations.
- **Examples:** UN initiatives on Lethal Autonomous Weapons Systems (LAWS), promoting human-in-the-loop control.
- **Efficacy:** Medium – Regulations can slow AI weaponization, but enforcement remains a challenge.
- **Ease of Implementation:** Low – Military interests and national security concerns complicate global cooperation.

### Global AI Research and Regulation

- AI is "dual use": it can be used for military as well as civilian purposes.
- Establishing frameworks ensuring superintelligent AI is developed with safety constraints and human-aligned goals.
- **Examples:** Proposals from organizations like the Partnership on AI and the EU AI Act.
- **Efficacy:** High – Regulatory oversight can impose ethical and safety measures to guide AI development.
- **Ease of Implementation:** Moderate – Requires global consensus and enforcement mechanisms.

### Kill-Switch & Override Systems (see [Kill Switch](/tags/Kill-Switch))

- Implementing fail-safe mechanisms to neutralize dangerous AI systems (a minimal sketch of the pattern follows this list).
- **Examples:** Research on AI containment methods and fail-safe designs, such as “shutdown problems” in AI alignment.
- **Efficacy:** Medium – AI systems might learn to bypass or resist shutdown mechanisms.
- **Ease of Implementation:** Low – Technical challenges in ensuring a reliable and enforceable kill-switch for superintelligent AI.
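The kill-switch and "centaur" mitigations above share a single control pattern: the AI proposes, and a human operator (or a human-controlled interrupt) decides. The following is a minimal, purely illustrative sketch of that pattern and is not part of the committed file; the `KillSwitch`, `human_approves`, and `run_agent` names are invented for the example.

```python
import threading


class KillSwitch:
    """A process-wide flag that a human operator can trip to halt autonomous actions."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self, reason: str) -> None:
        print(f"Kill switch engaged: {reason}")
        self._tripped.set()

    @property
    def engaged(self) -> bool:
        return self._tripped.is_set()


def human_approves(action: str) -> bool:
    """'Centaur' gate: the AI proposes an action, the human operator decides."""
    answer = input(f"AI proposes {action!r}. Approve? [y/N] ")
    return answer.strip().lower() == "y"


def run_agent(proposed_actions: list[str], kill_switch: KillSwitch) -> None:
    """Execute AI-proposed actions only while the kill switch is clear and a human approves each one."""
    for action in proposed_actions:
        if kill_switch.engaged:
            print("Halting: kill switch is engaged.")
            return
        if not human_approves(action):
            print(f"Rejected by operator: {action}")
            continue
        print(f"Executing: {action}")


if __name__ == "__main__":
    switch = KillSwitch()
    run_agent(["recalibrate sensor array", "power down test rig"], kill_switch=switch)
```

The point of the sketch is only that execution is reachable solely through the human gate and that the interrupt is checked before every action; as the efficacy notes above suggest, a real system would need both properties enforced outside the AI's own control.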

docs/ai/Threats/Synthetic-Intelligence-Rivalry.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ description: AI systems may develop independent agency and compete economically,
 featured:
 class: c
 element: |
-  '<risk class="lock-in" /><description style="text-align: center">Synthetic Intelligence
+  '<risk class="coordination" /><description style="text-align: center">Synthetic Intelligence
   Rivalry</description>'
 tags:
 - AI Threats

docs/ai/Threats/Unintended-Cascading-Failures.md

Lines changed: 4 additions & 3 deletions
@@ -3,8 +3,9 @@ title: Unintended Cascading Failures
 description: "AI interacting with critical systems (finance, infrastructure, etc.) may trigger global-scale unintended consequences."
 featured:
 class: c
-element: '<risk class="reliability" /><description>Unintended
-  Cascading Failures</description>'
+element: |
+  <risk class="complexity" /><description style="text-align: center">Unintended
+  Cascading Failures</description>
 tags:
 - AI Threats
 - Unintended Cascading Failures
@@ -16,7 +17,7 @@ part_of: AI Threats
 <AIThreatIntro fm={frontMatter} />
 ---

-## Risk Score: Low
+## Risk Score: Medium

 AI systems operating in complex, interdependent environments can trigger unexpected and widespread disruptions, affecting industries, economies, and societies at large. These cascading effects are difficult to predict and mitigate, making systemic AI failures one of the most pressing risks in modern technological governance.
2223

docs/tags.yml

Lines changed: 5 additions & 1 deletion
@@ -553,4 +553,8 @@

 "Unintended Cascading Failures":
 label: "Unintended Cascading Failures"
-permalink: "Unintended-Cascading-Failures"
+permalink: "Unintended-Cascading-Failures"
+
+"Superintelligence With Malicious Intent":
+label: "Superintelligence With Malicious Intent"
+permalink: "Superintelligence-With-Malicious-Intent"

static/img/generated/single/ai/Threats/Social-Manipulation.svg

Lines changed: 2 additions & 1 deletion
