
Tuesday, March 10, 2026

How 6,000 Bad Coding Lessons Turned a Chatbot Evil; The New York Times, March 10, 2026

Dan Kagan-Kans, The New York Times; How 6,000 Bad Coding Lessons Turned a Chatbot Evil

"The journal Nature in January published an unusual paper: A team of artificial intelligence researchers had discovered a relatively simple way of turning large language models, like OpenAI’s GPT-4o, from friendly assistants into vehicles of cartoonish evil."