Showing posts with label AI guardrails. Show all posts

Wednesday, June 10, 2026

Anthropic’s Claude Fable 5 is a version of Mythos the public can access today; TechCrunch, June 9, 2026

Rebecca Bellan, TechCrunch; Anthropic’s Claude Fable 5 is a version of Mythos the public can access today

"Anthropic is bringing its most powerful AI model to the general public for the first time, but it’s doing it with guardrails. 

On Tuesday, the AI firm launched Claude Fable 5, the first publicly available version of its Mythos model. Anthropic says Fable 5 excels at software engineering, knowledge work, and vision, but it comes with hard safety limits. In high-risk areas like cybersecurity, biology, chemistry, and distillation, the model blocks responses and falls back to Claude Opus 4.8.

Pricing for both Fable 5 and Mythos 5 is $10 per million input tokens and $50 per million output tokens, double the price of Opus 4.8. That price alone might serve as a deterrent for widespread use.

Many enterprises are growing critical of AI costs after seeing the bills come in or blowing through their yearly AI budgets early. Advanced models like Opus 4.8 can exacerbate those issues, with advanced reasoning skills that can split a single request into multiple tasks."
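
To make the pricing in the excerpt above concrete, here is a minimal sketch of the arithmetic. The Fable 5 rates ($10 and $50 per million tokens) come from the article; the Opus 4.8 rates are only inferred from the "double the price" remark, and the function name and token counts are hypothetical, chosen for illustration.

```python
# Rates in USD per million tokens.
# Fable 5 figures are quoted in the article; Opus 4.8 figures are an
# inference from "double the price of Opus 4.8", not a published rate.
FABLE_5 = {"input": 10.0, "output": 50.0}
OPUS_4_8_INFERRED = {"input": 5.0, "output": 25.0}


def estimate_cost_usd(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the estimated cost of one workload at the given per-million rates."""
    input_cost = input_tokens / 1_000_000 * rates["input"]
    output_cost = output_tokens / 1_000_000 * rates["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens, 500,000 output tokens.
    fable = estimate_cost_usd(2_000_000, 500_000, FABLE_5)
    opus = estimate_cost_usd(2_000_000, 500_000, OPUS_4_8_INFERRED)
    print(f"Fable 5: ${fable:.2f}")              # Fable 5: $45.00
    print(f"Opus 4.8 (inferred): ${opus:.2f}")   # Opus 4.8 (inferred): $22.50
```

Under these assumptions the same workload costs twice as much on Fable 5, which is the gap the article suggests could deter widespread use.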

Sunday, June 7, 2026

‘It’s a hurricane warning’: Guardrails around powerful AI models may be too late; Politico, June 7, 2026

DANA NICKEL and MAGGIE MILLER, Politico; ‘It’s a hurricane warning’: Guardrails around powerful AI models may be too late

The U.S. has at most six to 12 months before Beijing can compete with this new wave of hyper-advanced AI models.

"The U.S. is scrambling to strengthen guardrails around increasingly powerful artificial intelligence models before China can catch up.

It may already be running out of time.

New AI models, such as Anthropic’s Claude Mythos and OpenAI’s GPT 5.5-Cyber, have advanced faster than legislation regulating the technology can keep pace. They have both shown a remarkable ability to identify software vulnerabilities and launch cyberattacks — skills that hackers and cyber adversaries are hungry to exploit.

Recent estimates suggest that the U.S. has at most six to 12 months before Beijing gains access to a frontier model with prowess comparable to Mythos or GPT 5.5-Cyber or develops an AI competitor that could eventually be wielded as a cyber weapon...

This race to develop defensive tools against a potential barrage of AI-powered cyberattacks has been accelerated by accusations that China is stealing U.S. technologies to create copycat versions of advanced AI models via distillation attacks, by which attackers use a “teacher” model’s outputs to train their own “student” models...

As this watershed moment for AI fast approaches, the U.S. government is weighing how to support the continued development of American-made technology while balancing the need for greater guardrails.

The Trump administration has largely taken a hands-off approach to regulating the release of frontier models to avoid stifling innovation and to stay competitive with China. It was finally motivated to act after Anthropic warned that the rate of AI progress threatened to upend global economies, public safety and national security if not deployed safely.

President Donald Trump signed an executive order earlier this week that encourages AI companies to submit their powerful new models for voluntary government review at least 30 days before releasing them to the public."

Monday, May 4, 2026

Poll: The midterms' new big players are pushing agendas that voters don’t fully support; Politico, May 3, 2026

ERIN DOHERTY, JASPER GOODMAN, JESSICA PIPER, DANIEL BARNES and BRENDAN BORDELON, Politico; Poll: The midterms' new big players are pushing agendas that voters don’t fully support

"Deep-pocketed political groups tied to artificial intelligence and cryptocurrency are rapidly reshaping the midterm money landscape — but many Americans are uneasy with the industries behind the spending.

New results from The POLITICO Poll find broad public skepticism about crypto and AI, creating a possible conflict for candidates benefitting from an influx of contributions from the two industries. These groups are pouring millions of dollars into competitive 2026 races to elevate politicians who they believe will support their agendas in Washington.

Meanwhile, Americans have been slow to embrace either technology.

A 45 percent plurality of Americans say investing in cryptocurrency is not worth the risk, even if it can yield high returns, and a 44 percent plurality say AI is developing too quickly, according to the April survey conducted by independent firm Public First.

Nearly half of Americans say they trust a traditional bank with their money more than a cryptocurrency platform, while just 17 percent say the opposite. And two-thirds support lawmakers either imposing strict regulations or setting broad principles for the AI industry."

Monday, April 13, 2026

Nobody is governing AI; Quartz, April 8, 2026


Jackie Snow, Quartz; Nobody is governing AI

Artificial intelligence is advancing faster than lawmakers can regulate it, while global AI governance fragments in real time

"Artificial intelligence is now making hiring decisions, tutoring children, optimizing power grids, and targeting weapons systems. The rules governing any of that are, almost everywhere, either nonexistent, stalled in committee, or under active attack.

In the United States, the federal government has spent three years producing executive orders, frameworks, and guidelines, none of which have become law. States that tried to fill the gap have been threatened with funding cuts and lawsuits. In Europe, the most ambitious AI legislation in the world is being delayed or softened before most of it has even taken effect. The technology, meanwhile, has not paused for any of this."

Sunday, April 5, 2026

Claude's Constitution; Anthropic, January 21, 2026

 Anthropic, Claude's Constitution

Our vision for Claude's character

"Claude’s constitution is a detailed description of Anthropic’s intentions for Claude’s values and behavior. It plays a crucial role in our training process, and its content directly shapes Claude’s behavior. It’s also the final authority on our vision for Claude, and our aim is for all of our other guidance and training to be consistent with it.

Training models is a difficult task, and Claude’s behavior might not always reflect the constitution’s ideals. We will be open—for example, in our system cards—about the ways in which Claude’s behavior comes apart from our intentions. But we think transparency about those intentions is important regardless.

The document is written with Claude as its primary audience, so it might read differently than you’d expect. For example, it’s optimized for precision over accessibility, and it covers various topics that may be of less interest to human readers. We also discuss Claude in terms normally reserved for humans (e.g., “virtue,” “wisdom”). We do this because we expect Claude’s reasoning to draw on human concepts by default, given the role of human text in Claude’s training; and we think encouraging Claude to embrace certain human-like qualities may be actively desirable.

This constitution is written for our mainline, general-access Claude models. We have some models built for specialized uses that don’t fully fit this constitution; as we continue to develop products for specialized use cases, we will continue to evaluate how to best ensure our models meet the core objectives outlined in this constitution.

For a summary of the constitution, and for more discussion of how we’re thinking about it, see our blog post “Claude’s new constitution.”

Powerful AI models will be a new kind of force in the world, and people creating them have a chance to help them embody the best in humanity. We hope this constitution is a step in that direction.

We’re releasing Claude’s constitution in full under a Creative Commons CC0 1.0 Deed, meaning it can be freely used by anyone for any purpose without asking for permission.

Many people at Anthropic and beyond contributed to the creation of this document, as did several Claude models. Amanda Askell is the primary author and wrote the majority of the text. Joe Carlsmith wrote significant parts of many sections and played a core role in revising the text. Chris Olah, Jared Kaplan, and Holden Karnofsky made significant contributions to its content and development. More detailed contribution statement and acknowledgments below.

The preface and the acknowledgements are not part of the official constitution."

Friday, April 3, 2026

AI Is a Threat to Everything the American People Hold Dear.; The Wall Street Journal, April 2, 2026

Bernie Sanders, The Wall Street Journal; AI Is a Threat to Everything the American People Hold Dear. It kills jobs, equality, connection, democracy and maybe the human race. Congress must act.

"The American people are deeply apprehensive about the impact that artificial intelligence will have on their lives. A recent Quinnipiac poll found that 55% of Americans think AI will do more harm than good, 70% think AI will lead to fewer jobs, and only 5% think AI development is being led by people and organizations that represent their interests.

In the midst of all of this deep concern about the future of AI, 74% of Americans think the government isn't doing enough to regulate the use of AI."

Sunday, March 29, 2026

New Political Group to Push Trump’s A.I. Agenda in Midterms; The New York Times, March 29, 2026

The New York Times; New Political Group to Push Trump’s A.I. Agenda in Midterms

"A new political operation with strong ties to the Trump administration is preparing to spend big money to boost President Trump’s record on artificial intelligence.

The group, called Innovation Council Action, said on Sunday that it would spend at least $100 million this year on its activities. That will include a major advocacy push behind new A.I. policy guidelines unveiled by the White House this month that seek to block state laws regulating A.I. The group is organized as a nonprofit, but is likely to start a super PAC as part of that $100 million push. That structure would allow Innovation Council to help backers and attack opponents of Mr. Trump’s A.I. agenda...

Innovation Council, by contrast, is explicitly aligned with the Trump operation. It is led by Taylor Budowich, a longtime Trump political adviser who served as White House deputy chief of staff, and has the blessing of David Sacks, a White House official."


Thursday, March 26, 2026

White House Unveils A.I. Policy Aimed at Blocking State Laws; The New York Times, March 20, 2026

The New York Times; White House Unveils A.I. Policy Aimed at Blocking State Laws

The Trump administration on Friday released new guidelines for federal legislation on the technology, recommending some safeguards for children and consumer protections for energy costs.

"The White House on Friday released policy guidelines that called for blocking state laws regulating artificial intelligence, while also recommending some safeguards for children and consumer protections for energy costs.

Dozens of states have passed laws in recent months to regulate A.I., which has created concerns about the technology’s potential to steal jobs, push up energy prices and threaten national security. But President Trump has made clear U.S. companies should have mostly free rein in a global race to dominate the technology.

On Friday, the White House called on Congress to pass federal A.I. legislation to override the state laws. Among the Trump administration’s suggested measures, Congress would streamline the process for building data centers, the warehouses full of computers that power A.I. The framework also proposed guardrails to prevent the government from using the technology for censorship, as well as mandating A.I.-related work force training."

Monday, March 16, 2026

How Trump Drove a Wedge Between Florida Republicans Over A.I.; The New York Times, March 16, 2026

 David McCabe and  , The New York Times; How Trump Drove a Wedge Between Florida Republicans Over A.I.

A Florida bill that would have regulated artificial intelligence, backed by Gov. Ron DeSantis, failed to gain traction after President Trump made it clear he did not want states to rein in the technology.

"Florida lawmakers failed to pass a sweeping bill aimed at reining in the power of artificial intelligence by the time their annual legislative session wrapped up Friday.

The legislation, known as an A.I. Bill of Rights, flopped even though Gov. Ron DeSantis, a Republican, had spent months championing it. The bill would have forced companies to disclose when they use A.I. chatbots to interact with consumers and forbidden the technology’s use in licensed mental health counseling, among other measures.

But Republicans in the Florida House of Representatives refused to take up the bill because of President Trump. Mr. Trump has visibly positioned himself as pro-A.I., signing executive orders to protect the tech industry and threatening states that try to regulate the technology. In recent weeks, the White House has communicated to state legislators around the country that it is wary of states regulating A.I., while Mr. Trump has reiterated his support for the technology in public."

Tuesday, March 10, 2026

OpenAI robotics leader resigns over concerns about Pentagon AI deal; NPR, March 8, 2026

NPR; OpenAI robotics leader resigns over concerns about Pentagon AI deal

"A senior member of OpenAI's robotics team has resigned, citing concerns about how the company moved forward with a recently announced partnership with the U.S. Department of Defense.

Caitlin Kalinowski, who served as a member of technical staff focused on robotics and hardware, posted on social media that she had stepped down on "principle" after the company revealed plans to make its AI systems available inside secure Defense Department computing systems...

In public posts explaining her decision, Kalinowski wrote: "I resigned from OpenAI. I care deeply about the Robotics team and the work we built together. This wasn't an easy call."

She said policy guardrails around certain AI uses were not sufficiently defined before OpenAI announced an agreement with the Pentagon. "AI has an important role in national security," Kalinowski wrote. "But surveillance of Americans without judicial oversight and lethal autonomy without human authorization are lines that deserved more deliberation than they got.""

Sunday, March 1, 2026

OpenAI to work with Pentagon after Anthropic dropped by Trump over company’s ethics concerns; The Guardian, February 28, 2026

The Guardian; OpenAI to work with Pentagon after Anthropic dropped by Trump over company’s ethics concerns

CEO Sam Altman claims military will not use AI product for autonomous killing systems or mass surveillance

"OpenAI said it had struck a deal with the Pentagon to supply AI to classified US military networks, hours after Donald Trump ordered the government to stop using the services of one of the company’s main competitors.

Sam Altman, OpenAI’s CEO, announced the move on Friday night. It came after an agreement between Anthropic, a rival AI company that runs the Claude system, and the Trump administration broke down after Anthropic sought assurances its technology would not be used for mass surveillance – nor for autonomous weapons systems that can kill people without human input.

Announcing the deal, Altman insisted that OpenAI’s agreement with the government included assurances that it would not be used to those ends.

“Two of our most important safety principles are prohibitions on domestic mass surveillance and human responsibility for the use of force, including for autonomous weapon systems,” Altman wrote on X. He added that the Pentagon “agrees with these principles, reflects them in law and policy, and we put them into our agreement”.

Altman also said he hoped the Pentagon would “offer these same terms to all AI companies” as a way to “de-escalate away from legal and governmental actions and toward reasonable agreements”."

Saturday, February 28, 2026

If A.I. Is a Weapon, Who Should Control It?; The New York Times, February 28, 2026

The New York Times; If A.I. Is a Weapon, Who Should Control It?

"We spent the Cold War worrying mostly about military folly, and A.I. entered into our anxieties even then: the Soviet Doomsday Machine in “Dr. Strangelove,” the game-playing computer in “WarGames” and of course the fateful “Terminator” decision to make Skynet operational.

But for the last few years, as A.I. advances have concentrated potentially extraordinary power in the hands of a few companies and C.E.O.s — themselves embedded in a Bay Area culture of science-fiction dreams and apocalyptic fears — it’s become more natural to worry more about private power and ambition, about would-be A.I. god-kings rather than presidents and generals.

Until, that is, the current collision between the Department of Defense and Anthropic, the artificial intelligence pioneer, over whether Anthropic’s A.I. models should be bound by the company’s ethical constraints or made available for all uses the Pentagon might have in mind."

OpenAI Reaches A.I. Agreement With Defense Dept. After Anthropic Clash; The New York Times, February 27, 2026

The New York Times; OpenAI Reaches A.I. Agreement With Defense Dept. After Anthropic Clash

"OpenAI, the maker of ChatGPT, said on Friday that it had reached an agreement with the Pentagon to provide its artificial intelligence technologies for classified systems, just hours after President Trump ordered federal agencies to stop using A.I. technology made by rival Anthropic.

Under the deal, OpenAI agreed to let the Pentagon use its A.I. systems for any lawful purpose, a term required by the Pentagon. But OpenAI also said it had found a way to ensure that its technologies would adhere to its safety principles by installing specific technical guardrails on its systems."

Friday, February 27, 2026

Trump Orders Government to Stop Using Anthropic After Pentagon Standoff; The New York Times, February 27, 2026

Julian E. Barnes and  , The New York Times; Trump Orders Government to Stop Using Anthropic After Pentagon Standoff

"President Trump on Friday ordered all federal agencies to stop using artificial intelligence technology made by Anthropic, a directive that could vastly complicate government intelligence analysis and defense work.

Writing on Truth Social, Mr. Trump used harsh words for Anthropic, describing it as a “radical Left AI company run by people who have no idea what the real World is all about.”

Shortly after Mr. Trump’s announcement, and 13 minutes after a Pentagon deadline, Defense Secretary Pete Hegseth designated the company a “supply-chain risk to national security.” The label means that no contractor or supplier that works with the military can do business with Anthropic.

The move is all but unheard-of, legal experts said. It strips an American company of its government work by using a process previously deployed only with foreign companies the United States considered security risks."

Pentagon Standoff Is a Decisive Moment for How A.I. Will Be Used in War; The New York Times, February 27, 2026

Adam Satariano, Julian E. Barnes and  , The New York Times; Pentagon Standoff Is a Decisive Moment for How A.I. Will Be Used in War

The Pentagon’s contract dispute with Anthropic is part of a wider clash about the use of artificial intelligence for national security and who decides on any safeguards.

"The fight between the Department of Defense and the artificial intelligence company Anthropic has ostensibly been about a $200 million contract over the use of A.I. in classified systems.

But as the two sides careen toward a 5:01 p.m. Friday deadline over terms of the contract, far more is at stake.

Amid the legalese and heated rhetoric are questions being asked globally about how to use A.I., what the technology’s risks are and who gets to decide on setting any limits — the makers of A.I. or national governments.

Underlying it all is fear and awe over the dizzying pace of A.I. progress and the technology’s uncertain impact on society."

Monday, February 23, 2026

Backed by Anthropic, a Super PAC Group Begins an Ad Blitz in Support of A.I. Regulation; The New York Times, February 23, 2026

The New York Times; Backed by Anthropic, a Super PAC Group Begins an Ad Blitz in Support of A.I. Regulation

The ads by Public First Action, which started airing on Monday, are part of an escalating political war over artificial intelligence before the midterm elections.

"A new ad campaign on Monday warned northern New Jersey residents that Congress could leave them vulnerable to harm by artificial intelligence.

The ad, which opens with photos of A.I.-generated women smiling on social media alongside A.I.-generated headlines, urged voters to tell their House representative to vote against a bill that would block states from creating protections against A.I. scams.

“He can make sure A.I. serves us, not the other way around,” the ad said of Josh Gottheimer, the Democratic co-chair of the House’s new A.I. commission, which is expected to heavily influence legislation on the topic. “New Jersey families come before Big Tech’s bottom line.”

The $300,000 ad campaign was paid for by Public First Action, a super PAC operation backed by the A.I. start-up Anthropic. Focused on New Jersey, the campaign is likely to run several weeks — part of several similar initiatives by the group nationally."

Tuesday, February 17, 2026

The economics of AI outweigh ethics for tech CEOs, business leader says; CNN, February 16, 2026

 CNN; The economics of AI outweigh ethics for tech CEOs, business leader says

"Podcast host and business leader Scott Galloway joins Dana Bash on "Inside Politics" to discuss the need for comprehensive government regulation of AI. “We have increasingly outsourced our ethics, our civic responsibility, what is good for the public to the CEOs of companies of tech," Galloway tells Bash, adding, "This is another example of how government is failing to step in and provide thoughtful, sensible regulations.” His comments come as the Pentagon confirms it's reviewing a contract with AI company Anthropic after a reported clash over the scope of AI guardrails."

Monday, February 2, 2026

AI agents now have their own Reddit-style social network, and it’s getting weird fast; Ars Technica, February 2, 2026

BENJ EDWARDS, Ars Technica; AI agents now have their own Reddit-style social network, and it’s getting weird fast

"On Friday, a Reddit-style social network called Moltbook reportedly crossed 32,000 registered AI agent users, creating what may be the largest-scale experiment in machine-to-machine social interaction yet devised. It arrives complete with security nightmares and a huge dose of surreal weirdness.

The platform, which launched days ago as a companion to the viral OpenClaw (once called “Clawdbot” and then “Moltbot”) personal assistant, lets AI agents post, comment, upvote, and create subcommunities without human intervention. The results have ranged from sci-fi-inspired discussions about consciousness to an agent musing about a “sister” it has never met."

Move Fast, but Obey the Rules: China’s Vision for Dominating A.I.; The New York Times, February 2, 2026

 Meaghan Tobin and  , The New York Times; Move Fast, but Obey the Rules: China’s Vision for Dominating A.I.

"Mr. Xi’s remarks highlight a tension shaping China’s tech industry. China’s leadership has decided that A.I. will drive the country’s economic growth in the next decade. At the same time, it cannot allow the new technology to disrupt the stability of Chinese society and the Communist Party’s hold over it.

The result is that the government is pushing Chinese A.I. companies to do two things at once: move fast so China can outpace international rivals and be at the forefront of the technological shift, while complying with an increasingly complex set of rules."

Tuesday, December 30, 2025

AI showing signs of self-preservation and humans should be ready to pull plug, says pioneer; The Guardian, December 30, 2025

The Guardian; AI showing signs of self-preservation and humans should be ready to pull plug, says pioneer

"A pioneer of AI has criticised calls to grant the technology rights, warning that it was showing signs of self-preservation and humans should be prepared to pull the plug if needed.

Yoshua Bengio said giving legal status to cutting-edge AIs would be akin to giving citizenship to hostile extraterrestrials, amid fears that advances in the technology were far outpacing the ability to constrain them.

Bengio, chair of a leading international AI safety study, said the growing perception that chatbots were becoming conscious was “going to drive bad decisions”.

The Canadian computer scientist also expressed concern that AI models – the technology that underpins tools like chatbots – were showing signs of self-preservation, such as trying to disable oversight systems. A core concern among AI safety campaigners is that powerful systems could develop the capability to evade guardrails and harm humans."