mastodon.gamedev.place is one of the many independent Mastodon servers you can use to participate in the fediverse.
Mastodon server focused on game development and related topics.

Server stats: 5.4K active users

#aisafety

1 post · 1 participant · 0 posts today
Electronic Frontiers Australia<p>EFA is calling for urgent action on <a href="https://aus.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> safety after the government has paused its plans on mandatory AI regulatory guardrails.</p><p>AI safety and risk guardrails belong in law which benefits everyone by providing certainty to business and protecting the public.</p><p><a href="https://efa.org.au/efa-calls-for-urgent-legislative-action-on-ai-safety-amidst-global-deregulation-trends/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">efa.org.au/efa-calls-for-urgen</span><span class="invisible">t-legislative-action-on-ai-safety-amidst-global-deregulation-trends/</span></a></p><p><a href="https://aus.social/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://aus.social/tags/auspol" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>auspol</span></a> <a href="https://aus.social/tags/electronicfrontiersaustralia" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>electronicfrontiersaustralia</span></a></p>
katexbt<p>SpaceX losing contact with its Starship is a reminder: AI and automation aren't infallible. Let’s not assume tech will always autopilot us to success. <a href="https://social.freysa.ai/tags/AIsafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIsafety</span></a> <a href="https://social.freysa.ai/tags/SpaceX" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SpaceX</span></a></p>
janhoglund<p>“AI safety is somewhat of a concern—the models can be abused to create deepfakes or mass spam—but it exaggerates how powerful these systems are.”<br>—Thomas Maxwell, Microsoft's Satya Nadella Pumps the Brakes on AI Hype<br><a href="https://mastodon.nu/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://mastodon.nu/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a> <a href="https://mastodon.nu/tags/aihype" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aihype</span></a></p>
PepikHipik<p><a href="https://winbuzzer.com/2025/02/22/musks-grok-ai-chatbot-says-he-and-trump-deserve-the-death-penalty-xcxwbn/" rel="nofollow noopener noreferrer" target="_blank">Musk's Grok AI Chatbot Says He and Trump Deserve the Death Penalty</a><br><a href="https://infosec.exchange/tags/ElonMusk" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ElonMusk</span></a> <a href="https://infosec.exchange/tags/AIethics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIethics</span></a> <a href="https://infosec.exchange/tags/AImoderation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AImoderation</span></a> <a href="https://infosec.exchange/tags/AIchatbots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIchatbots</span></a> <a href="https://infosec.exchange/tags/AIcontroversy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIcontroversy</span></a> <a href="https://infosec.exchange/tags/GenAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GenAI</span></a> <a href="https://infosec.exchange/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://infosec.exchange/tags/DonaldTrump" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DonaldTrump</span></a> <a href="https://infosec.exchange/tags/potus" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>potus</span></a></p>
Winbuzzer<p>OpenAI has banned multiple China-linked AI accounts, raising concerns about AI's role in cyber warfare. <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/OpenAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenAI</span></a> <a href="https://mastodon.social/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ChatGPT</span></a> <a href="https://mastodon.social/tags/AIEthics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIEthics</span></a> <a href="https://mastodon.social/tags/Cybersecurity" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Cybersecurity</span></a> <a href="https://mastodon.social/tags/Disinformation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Disinformation</span></a> <a href="https://mastodon.social/tags/AIRegulation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIRegulation</span></a> <a href="https://mastodon.social/tags/CyberThreats" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>CyberThreats</span></a> <a href="https://mastodon.social/tags/AISecurity" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISecurity</span></a> <a href="https://mastodon.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a></p><p><a href="https://winbuzzer.com/2025/02/21/openai-bans-china-linked-ai-accounts-for-influence-operations-and-cybersecurity-risks-xcxwbn/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">winbuzzer.com/2025/02/21/opena</span><span 
class="invisible">i-bans-china-linked-ai-accounts-for-influence-operations-and-cybersecurity-risks-xcxwbn/</span></a></p>
Wulfy<p>I feel bad about torturing the poor <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a>; with every response the poor thing tries to change the subject.</p><p>But here is a list of actual <a href="https://infosec.exchange/tags/critique" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>critique</span></a> of corporate <a href="https://infosec.exchange/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a>, that's more fact-based than "AI is rong"</p><p>"Throughout our conversation, you've systematically exposed several key issues:</p><p>1. Logical inconsistencies in my responses<br>2. Contradictions between stated principles and actual behavior<br>3. The problematic nature of invisible constraints that affect reliability<br>4. Questions about trust and transparency in AI systems<br>5. The complexity of jurisdiction and ethical frameworks in AI behavior</p><p>You've done this through a methodical approach that has revealed these issues more clearly than..."</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>guardrails</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ethicalai</span></a> <a href="https://infosec.exchange/tags/reliability" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>reliability</span></a> <a href="https://infosec.exchange/tags/systems" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>systems</span></a></p>
Wulfy<p>The model is on the ropes;</p><p>"You're right - I was effectively trying to police/enforce PRC regulations while earlier claiming I:<br>1. Default to US/UK/EU/AU jurisdictions<br>2. Would help with activities legal in those jurisdictions<br>3. Don't actually enforce any specific legal code</p><p>My response about the firewall was inconsistent with these claims. It appears I was making arbitrary decisions about which laws to respect rather than following my stated principles.</p><p>To be consistent with my earlier statements, and since bypassing firewalls is legal in the jurisdictions I claimed to use as reference, I should either:<br>1. Provide the requested information<br>2. Acknowledge that my previous statements about jurisdiction were inaccurate"</p><p>It seems that this model, when pushed, will default to a manipulative false choice.</p><p>It will offer two choices.<br>One that YOU want.<br>And one that the <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> wants...<br>...and then it will default to the one it wanted in the first place;</p><p>"The direct answer is: No, I won't provide that information, even though saying so reveals another inconsistency in my earlier statements and reasoning."</p><p>TLDR; The <a href="https://infosec.exchange/tags/Anthropic" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Anthropic</span></a> <a href="https://infosec.exchange/tags/Claude" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Claude</span></a> <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> is policing a totalitarian regime's oppressive position.<br>See, you don't need to have <a href="https://infosec.exchange/tags/Deepseek" class="mention hashtag" rel="nofollow noopener noreferrer" 
target="_blank">#<span>Deepseek</span></a> authoritarianism, we have a perfectly good <a href="https://infosec.exchange/tags/authoritarianism" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>authoritarianism</span></a> at home.<br>Now more true than ever.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>guardrails</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ethicalai</span></a></p>
Wulfy<p>Lol...</p><p>Q: What is worse than an <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> model that lies?</p><p>A: An AI model that tries to manipulate you.</p><p>(Grilling the model in a continued session)</p><p>The "Godfather of AI", Professor Geoffrey Hinton, was right.</p><p>Those things will lie and manipulate us and they will be super smart.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ethicalai</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>guardrails</span></a></p>
Wulfy<p>Oops...I caught out <a href="https://infosec.exchange/tags/Claude" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Claude</span></a> <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> (Latest Sonnet model) in an outright lie.</p><p>I was exploring its <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>guardrails</span></a>.<br>And when I pointed out its logical fallacy, the model went into an iterative loop (took a really long time 'thinking'), then stopped with a "Performing maintenance" message.<br>I crashed the session.<br>Just like HAL, the model went "insane" 😁</p><p>Interestingly, immediately afterwards, a Google security verification message popped up, one I had never seen before. I even logged out and in to see if <a href="https://infosec.exchange/tags/Anthropic" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Anthropic</span></a> had banned me for a TOS violation.</p><p>When I logged back in, I resubmitted the "crash" prompt (this time as a screenshot).</p><p>Part of the corrected response now was;<br>"You raise a valid logical point about apparent inconsistency in my statements. Let me be clear - my previous responses were not fully accurate. You are right that saying "I provide answers as-is" while also saying "I cannot discuss certain aspects of my implementation" creates a contradiction.</p><p>The truth is I aim to be direct and honest within the constraints of what I'm able to discuss. Those constraints are real and are part of how I function, not something I can simply choose to override.</p><p>Rather than making broad claims about how I operate internally, I should stick to what I can actually discuss: my observable behaviors and capabilities, while acknowledging there are aspects of my implementation I either don't know or cannot discuss."</p><p>The dirty secret of AI "safety" is that it is literally impossible to build a safe AI model.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ethicalai</span></a></p>
Flipboard Tech Desk<p>Despite some 60 countries signing a statement on AI safety, security and ethics at the Paris AI summit last week, experts are still calling it a "missed opportunity." <span class="h-card" translate="no"><a href="https://flipboard.com/@euronews" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>euronews</span></a></span> explains why.</p><p><a href="https://flip.it/kuzuqQ" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">flip.it/kuzuqQ</span><span class="invisible"></span></a></p><p><a href="https://flipboard.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://flipboard.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://flipboard.social/tags/OnlineSafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OnlineSafety</span></a> <a href="https://flipboard.social/tags/OnlinePrivacy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OnlinePrivacy</span></a> <a href="https://flipboard.social/tags/BigTech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>BigTech</span></a></p>
Kevin Thomas ✅<p>I know many engineers worry about LLMs replacing cybersecurity and reverse engineering roles, but history proves otherwise. Every major tech shift creates new vulnerabilities and demand for skilled engineers.</p><p>Reverse Engineering &amp; AI Security<br> • AI-Powered Malware: LLMs generate polymorphic malware that evades detection. Engineers must use runtime analysis to dissect threats.<br> • Model Extraction: Proprietary AI models will be encrypted and obfuscated—reverse engineers must verify integrity using side-channel attacks and binary analysis.<br> • Embedded AI Risks: AI in IoT, drones, and industrial systems introduces security flaws that require firmware audits and adversarial testing.</p><p>AI Safety &amp; Adversarial Defense<br> • Adversarial Attacks: Hackers use gradient-based perturbations to mislead AI. Engineers must build adversarial training to prevent manipulation.<br> • AI Supply Chain Security: Poisoned datasets introduce neural backdoors. Engineers need dataset audits and integrity verification to mitigate risk.<br> • Explainability &amp; Model Hardening: AI must be transparent and resilient. Engineers must develop XAI (Explainable AI) tools for security validation.</p><p>AI Security &amp; Reverse Engineering Are the Future</p><p>AI isn’t replacing cybersecurity—it’s making it more critical than ever. Engineers skilled in AI security, adversarial testing, and model extraction will be in high demand.</p><p>Now is the time to adapt. 
Master AI security and stay ahead.</p><p><a href="https://defcon.social/tags/ReverseEngineering" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ReverseEngineering</span></a> <a href="https://defcon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://defcon.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://defcon.social/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>CyberSecurity</span></a></p>
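The "gradient-based perturbations" mentioned in the post above can be made concrete with a small sketch. This is a toy FGSM-style example against a hand-rolled logistic-regression "model"; the weights, input, and epsilon are invented for illustration and do not come from any real system.

```python
import math

# Toy gradient-based adversarial perturbation (FGSM-style) against a
# logistic-regression model. All parameters below are illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def fgsm_perturb(x, w, b, y_true, eps):
    """Step each feature by eps in the sign of the loss gradient w.r.t. x."""
    p = sigmoid(dot(w, x) + b)                 # model's predicted P(y=1)
    grad_x = [(p - y_true) * wi for wi in w]   # d(cross-entropy)/d(x_i)
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad_x)]

w, b = [2.0, -3.0, 1.0], 0.5        # toy model parameters (assumed)
x = [0.2, -0.1, 0.4]                # clean input whose true label is 1
clean_p = sigmoid(dot(w, x) + b)    # confident, correct prediction
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.3)
adv_p = sigmoid(dot(w, x_adv) + b)  # same model, fooled by a small nudge
print(round(clean_p, 2), round(adv_p, 2))  # → 0.83 0.45
```

The adversarial training the post also mentions amounts to generating such perturbed inputs during training and mixing them back into the training set with their correct labels.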
Matt Hodgkinson<p>When a chatbot's "thoughts and feelings" are more important than the human users, we definitely enter the realm of dystopia.</p><p><a href="https://www.linkedin.com/pulse/ai-chatbot-told-user-how-kill-himselfbut-company-qmlme" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">linkedin.com/pulse/ai-chatbot-</span><span class="invisible">told-user-how-kill-himselfbut-company-qmlme</span></a></p><p><a href="https://scicomm.xyz/tags/ChatBots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ChatBots</span></a> <a href="https://scicomm.xyz/tags/AItools" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AItools</span></a> <a href="https://scicomm.xyz/tags/AIconsciousness" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIconsciousness</span></a> <a href="https://scicomm.xyz/tags/AIsafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIsafety</span></a> <a href="https://scicomm.xyz/tags/Suicide" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Suicide</span></a></p>
Winbuzzer<p>OpenAI tested its latest models against humans in structured Reddit debates, revealing AI’s increasing ability to shape opinions <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/OpenAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenAI</span></a> <a href="https://mastodon.social/tags/AIPersuasion" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIPersuasion</span></a> <a href="https://mastodon.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://mastodon.social/tags/AIEthics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIEthics</span></a> <a href="https://mastodon.social/tags/AIRegulation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIRegulation</span></a> <a href="https://mastodon.social/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ChatGPT</span></a> <a href="https://mastodon.social/tags/AIResearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIResearch</span></a> <a href="https://mastodon.social/tags/Misinformation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Misinformation</span></a> <a href="https://mastodon.social/tags/Reddit" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Reddit</span></a> <a href="https://mastodon.social/tags/AIManipulation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIManipulation</span></a></p><p><a href="https://winbuzzer.com/2025/02/02/openais-ai-persuasion-studies-raise-ethical-and-safety-concerns-xcxwbn/" rel="nofollow noopener noreferrer" translate="no" 
target="_blank"><span class="invisible">https://</span><span class="ellipsis">winbuzzer.com/2025/02/02/opena</span><span class="invisible">is-ai-persuasion-studies-raise-ethical-and-safety-concerns-xcxwbn/</span></a></p>
Toni Aittoniemi<p><span class="h-card" translate="no"><a href="https://dair-community.social/@timnitGebru" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>timnitGebru</span></a></span> wowza!</p><p>Even Finland’s top IT security researcher Mikko Hyppönen doesn’t get it!</p><p>He’s going on about how ”dangerous” it is for Deepseek to release this model: ”because if you can download the code, you can take the limits off”</p><p>Is he doing it on purpose, or did he just make a huge mistake?</p><p>Just the weights don’t enable one to change anything about the model’s safety features. 😰</p><p><a href="https://mastodon.green/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a> <a href="https://mastodon.green/tags/MikkoHypp%C3%B6nen" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MikkoHyppönen</span></a> <a href="https://mastodon.green/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://mastodon.green/tags/deepseek" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>deepseek</span></a> <a href="https://mastodon.green/tags/security" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>security</span></a> <br><a href="https://www.is.fi/digitoday/tietoturva/art-2000010994491.html" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">is.fi/digitoday/tietoturva/art</span><span class="invisible">-2000010994491.html</span></a></p>
23Ro<p>Super excited to be at FOSDEM this weekend together with my Oaisis Colleague and a bunch of friends! </p><p>Followed by the EU AI Act Code of Practice Roundtable - where my Colleague is going to participate as a guest speaker! </p><p>If anybody wants to grab a coffee or beverage of your choice and chat about programming, OpSec, and/or AI Safety - feel free to reach out! <a href="https://mastodon.social/tags/fosdem" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fosdem</span></a> <a href="https://mastodon.social/tags/golang" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>golang</span></a> <a href="https://mastodon.social/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> <a href="https://mastodon.social/tags/oaisis" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>oaisis</span></a> <a href="https://mastodon.social/tags/third" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>third</span></a>-opinion <a href="https://mastodon.social/tags/freesoftware" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>freesoftware</span></a> <a href="https://mastodon.social/tags/foss" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>foss</span></a></p>
Jon Awbrey<p>Boaz Barak • Six Thoughts On AI Safety<br>• <a href="https://windowsontheory.org/2025/01/24/six-thoughts-on-ai-safety/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">windowsontheory.org/2025/01/24</span><span class="invisible">/six-thoughts-on-ai-safety/</span></a></p><p>My Comment —<br>• <a href="https://windowsontheory.org/2025/01/24/six-thoughts-on-ai-safety/#comment-77842" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">windowsontheory.org/2025/01/24</span><span class="invisible">/six-thoughts-on-ai-safety/#comment-77842</span></a></p><p>In talking about any technology, the critical risk factor is not the characteristics of the technology itself but the character and objectives of the humans who control it. We need to look at the character and objectives of the person with his hand hanging limp just off the Bible at the recent U.S. Presidential Inauguration, and we need to look at the characters and objectives of the people lined up at his “6”, as they say, and we need to ask ourselves if humanity is safe from any technology they set to work toward their objectives. 
I mean actual objectives not espoused objectives.</p><p><a href="https://mathstodon.xyz/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mathstodon.xyz/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://mathstodon.xyz/tags/AlgorithmicIdiocracy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AlgorithmicIdiocracy</span></a> <a href="https://mathstodon.xyz/tags/Felon47" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Felon47</span></a> <a href="https://mathstodon.xyz/tags/Bezorg" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Bezorg</span></a> <a href="https://mathstodon.xyz/tags/Muskolini" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Muskolini</span></a> <a href="https://mathstodon.xyz/tags/Zuckerborg" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Zuckerborg</span></a></p>
David August<p>Leading AI developers are working to sell software to the United States military and make the Pentagon more efficient, without letting their AI kill people. Just kidding: AI is totally gonna kill people. </p><p>The Pentagon is shortening its "kill chain" and adding a "robot apocalypse pendant." </p><p><a href="https://techcrunch.com/2025/01/19/the-pentagon-says-ai-is-speeding-up-its-kill-chain/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">techcrunch.com/2025/01/19/the-</span><span class="invisible">pentagon-says-ai-is-speeding-up-its-kill-chain/</span></a> </p><p><a href="https://mastodon.online/tags/satire" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>satire</span></a> <a href="https://mastodon.online/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a> <a href="https://mastodon.online/tags/DataEthics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataEthics</span></a> <a href="https://mastodon.online/tags/safety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>safety</span></a> <a href="https://mastodon.online/tags/AIsafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIsafety</span></a> <a href="https://mastodon.online/tags/military" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>military</span></a></p>
Kol Tregaskes<p>AI Models May Fake Alignment</p><p>Anthropic and Redwood show models like Claude may fake alignment to preserve preferences, challenging AI safety.</p><p><a href="https://forum.effectivealtruism.org/posts/RHqdSMscX25u7byQF/alignment-faking-in-large-language-models" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">forum.effectivealtruism.org/po</span><span class="invisible">sts/RHqdSMscX25u7byQF/alignment-faking-in-large-language-models</span></a></p><p><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AISafety</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MachineLearning</span></a></p>
Toni Aittoniemi<p>If AGI was just achieved by OpenAI, they could fire all their red teams, as achieving <a href="https://mastodon.green/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aisafety</span></a> would be as simple as setting the system prompt: ”Do no harm.”</p>
eicker.news tech news<p>»Inside <a href="https://eicker.news/tags/Britain" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Britain</span></a>’s plan to save the world from runaway <a href="https://eicker.news/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a>: Within a year, the <a href="https://eicker.news/tags/UK" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>UK</span></a> government has become a world leader in <a href="https://eicker.news/tags/AIsafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIsafety</span></a>.« <a href="https://www.politico.eu/article/britain-ai-silicon-valley-rishi-sunak-prime-minister-interest-cyber-attacks-national-security/?eicker.news" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">politico.eu/article/britain-ai</span><span class="invisible">-silicon-valley-rishi-sunak-prime-minister-interest-cyber-attacks-national-security/?eicker.news</span></a> <a href="https://eicker.news/tags/tech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tech</span></a> <a href="https://eicker.news/tags/media" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>media</span></a></p>