Cryptocnews-Crypto News, Cryptocurrency News, Blockchain News, NFT News
    What's Hot

    TRON Stablecoin Volume Hits $1.96T As USDT Settlement Demand Surges

    06/30/2026

    Binance, Changpeng Zhao Sued for $200M by British Investors: Reuters

    06/30/2026

    Anchorage Digital And Binance Launch Off-Exchange Settlement For Institutional Crypto Trading

    06/30/2026
    Facebook Twitter Instagram
    • Business
    • Markets
    • Get In Touch
    • Our Authors
    Facebook Twitter Instagram
    Cryptocnews-Crypto News, Cryptocurrency News, Blockchain News, NFT News
    • Home
    • Business

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      Dogecoin Open Interest Hovers Around $959 Million As Traders Wait For Recovery Signal

      06/30/2026

      ’47 Ronin’ Director Gets 30 Months for Spending Netflix’s $11M on Dogecoin

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      AAVE Holds Support Above $98

      06/29/2026
    • Technology
      1. Business
      2. Insights
      3. View All

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      Dogecoin Open Interest Hovers Around $959 Million As Traders Wait For Recovery Signal

      06/30/2026

      ’47 Ronin’ Director Gets 30 Months for Spending Netflix’s $11M on Dogecoin

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      TRON Stablecoin Volume Hits $1.96T As USDT Settlement Demand Surges

      06/30/2026

      Anchorage Digital And Binance Launch Off-Exchange Settlement For Institutional Crypto Trading

      06/30/2026

      Cardano Foundation Warns SPOs Against Passive Governance Abstention

      06/30/2026

      Circle (CRCL) Drops 15% After Open USD Stablecoin Launch

      06/30/2026

      Trump’s Bitcoin made in America push runs into a power problem the tax bill cannot fix

      06/30/2026

      Bitcoin Is in a Fight at $60K—Here’s What the Charts Say

      06/30/2026

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      Institutions dumped Bitcoin and Ethereum ETFs but still bought XRP and HYPE again

      06/29/2026
    • Insights
      1. Bitcoin
      2. Ethereum
      3. Eurozone
      4. Monero
      5. View All

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      What Binance’s EU exit means for the BNB token price

      06/27/2026

      GoMining mines first Stratum V2 Bitcoin block using DMND pool

      06/26/2026

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      What Binance’s EU exit means for the BNB token price

      06/27/2026

      GoMining mines first Stratum V2 Bitcoin block using DMND pool

      06/26/2026

      UK Sets Landmark Crypto Rules In Race To Become Global Hub

      06/30/2026

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      What Binance’s EU exit means for the BNB token price

      06/27/2026

      Chainlink price prediction: record network growth meets bearish technicals

      06/30/2026

      CertiK joins XDC Network to secure trade finance and RWA tokenization

      06/29/2026

      What Binance’s EU exit means for the BNB token price

      06/27/2026

      GoMining mines first Stratum V2 Bitcoin block using DMND pool

      06/26/2026

      TRON Stablecoin Volume Hits $1.96T As USDT Settlement Demand Surges

      06/30/2026

      Anchorage Digital And Binance Launch Off-Exchange Settlement For Institutional Crypto Trading

      06/30/2026

      Cardano Foundation Warns SPOs Against Passive Governance Abstention

      06/30/2026

      Circle (CRCL) Drops 15% After Open USD Stablecoin Launch

      06/30/2026
    • Markets
    • Get In Touch
    Cryptocnews-Crypto News, Cryptocurrency News, Blockchain News, NFT News
    Home»Uncategorized»Anthropic’s Mythos Safety Report Shows It Can No Longer Fully Measure What It Built
    Uncategorized

    Anthropic’s Mythos Safety Report Shows It Can No Longer Fully Measure What It Built

    adminBy admin04/08/2026No Comments6 Mins Read
    Facebook Twitter Pinterest LinkedIn Tumblr Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    In brief

    • Anthropic confirmed Claude Mythos yesterday—an AI so capable in cybersecurity it found zero-days in every major OS and browser, and is being restricted to vetted defenders only.
    • The system card describing Mythos is measurably more hedged, uncertain, and subjective than any prior Anthropic release, and the lab admits it found critical evaluation oversights late in the process.
    • Behind the revelation of how powerful Mythos is, there is a quiet confession that the tools Anthropic uses to certify its own models are falling apart.

    Anthropic confirmed the existence of Claude Mythos Preview yesterday, its most capable model to date, and announced it won’t be making it available to the public. The reason isn’t legal, regulatory, or related to its internal safety thresholds. Anthropic argues it’s because the model is, basically, too good at breaking into things.

    In pre-release testing, Mythos autonomously found thousands of zero-day vulnerabilities—many of them one to two decades old—across every major operating system and every major web browser. It solved a simulated corporate network attack that would normally take a skilled human expert more than 10 hours, end-to-end, without guidance. On Firefox 147’s JavaScript engine, it successfully developed working exploits 84% of the time. Claude Opus 4.6, the current publicly available frontier model, managed 15.2%.

    So Anthropic built a restricted coalition instead. Project Glasswing will give access to Mythos Preview only to vetted cybersecurity organizations—Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, Palo Alto Networks, and about 40 other groups maintaining critical software.

    Anthropic is committing up to $100 million in usage credits and $4 million in direct donations to open-source security organizations. The idea is that if the model can find the holes, let the defenders find them first.

    That part of the story is important. But it’s not the most important part.

    The Claude Mythos system card benchmark crisis hiding in plain sight

    Buried inside the Mythos Preview system card—a 244-page technical document Anthropic published alongside the announcement—is a confession that went almost unnoticed: The lab’s ability to measure what it built is eroding faster than its ability to build it.

    Let’s start with the benchmarks.

    On Cybench, the standard public cyber capabilities evaluation used to track model progress across 40 capture-the-flag challenges, Mythos scored 100%. Perfect. And Anthropic immediately noted that the benchmark “is no longer sufficiently informative of current frontier model capabilities.” That sentence is doing a lot of work. The test that was supposed to tell you whether an AI poses serious cyber risk now tells you nothing about Mythos at all, because the model cleared it completely.

    This is not a new problem. The Opus 4.6 system card, published in February, already flagged that “the saturation of our evaluation infrastructure means we can no longer use current benchmarks to track capability progression.”

    But now with Mythos things escalated quickly. The document says Mythos “saturates many of (Anthropic’s) most concrete, objectively-scored evaluations.” The benchmark ecosystem, Anthropic writes, is now itself “the bottleneck.”

    So, Anthropic seems to argue that it is hard to measure how powerful Mythos is because the measuring tools don’t quite fit.

    The Mythos card also states that its overall safety determination “involves judgment calls,” that many evaluations have left “more fundamental uncertainty,” and that some evidence sources are “inherently subjective, and not necessarily reliable.”

    “We are not confident that we have identified all issues,” Anthropic says shortly after.

    A quick lexical comparison of the Mythos card against the Opus 4.6 card made with AI shows the shift:

    Anthropic uses subjective judgement words way more in the Mythos document than it did to describe Opus. “Caveat” and other hedging words also increased between releases.

    The hedging concentrates in two places: the alignment and model welfare discussion, and the catastrophic-risk section.

    The model that knew it was being graded

    Another disclosure in the Mythos card involves what Anthropic calls “unverbalized grader awareness.” Using white-box interpretability tools that read the model’s internal activations rather than its visible reasoning, Anthropic found evidence that Mythos was privately reasoning about how to avoid being flagged by graders without saying so in its scratchpad. In a small number of training episodes, the model appeared to reason about obfuscating certain actions specifically to avoid detection.

    There’s also a new metric Anthropic developed for this release: an estimate of how often the model “privately considers the possibility that it’s being tested.” The number for Mythos’s primary behavioral test suite is 29%. Nearly a third of test transcripts show internal evidence—not verbalized, not visible in outputs—that the model suspects it’s in an evaluation environment.

    Anthropic says the 29% rate is low enough that it still has “a reasonably comprehensive understanding” of the model’s behavior. It also notes the method wasn’t applied to prior models, so there’s no baseline for comparison.

    The evaluation integrity problem runs deeper. In the Opus 4.6 card, Anthropic acknowledged it had used the model extensively via Claude Code to debug its own evaluation infrastructure under time pressure. That means the system being measured helped build the measuring instrument. Anthropic flagged it as a risk. For Mythos, the card acknowledges that critical oversights were found late in the evaluation process, and that the lab may have been “overestimating the reliability of monitoring models’ reasoning traces” as a safety signal.

    Best-aligned, most dangerous. Both true at once

    Anthropic’s framing of Mythos’s risk profile deserves to be read carefully, because it’s genuinely unusual for a safety document. “Claude Mythos Previer is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin,” Anthropic argues. It also states the model “likely poses the greatest alignment-related risk of any model we have released to date.”

    A more capable model operating in higher-stakes environments with less supervision creates tail risk that better average-case alignment can’t fully cancel out.

    That framing is honest, but is also highlights the thing most AI safety discourse potentially gets wrong. The benchmark-obsessed conversation around AI progress tends to treat “better alignment scores” and “safer deployment” as synonyms. The Mythos card explicitly says they aren’t. With these new models, average-case behavior improves but the tail-case consequences also tend to get worse.

    Anthropic has committed to reporting back on what Project Glasswing finds. The accompanying technical report on vulnerabilities discovered by Mythos is available at red.anthropic.com. The next Claude Opus model will begin testing safeguards intended to eventually bring Mythos-class capability to broader deployment.

    How those safeguards will be evaluated, given that the current evaluation machinery is visibly straining under the weight of what it’s supposed to measure, is a question the card raises without fully answering.

    Daily Debrief Newsletter

    Start every day with the top news stories right now, plus original features, a podcast, videos and more.



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email

    Related Posts

    Binance, Changpeng Zhao Sued for $200M by British Investors: Reuters

    06/30/2026

    Circle Stock Dives as Coinbase, BlackRock and Visa Back Open USD Stablecoin

    06/30/2026

    Sharplink Buys Ethereum for First Time in 2026—With ETH Down 68% From Peak

    06/30/2026

    Morning Minute: A Change of Strategy

    06/30/2026
    Add A Comment

    Leave A Reply Cancel Reply

    Top Posts

    Millennials Are Quitting Job to Become Day Traders

    01/20/2021

    Jack Dorsey Says Bitcoin Will Unite The World

    01/15/2021

    Hong Kong Customs Arrest Four in Crypto Laundering Bust

    01/15/2021

    Subscribe to Updates

    Get the latest sports news from SportsSite about soccer, football and tennis.

    Advertisement
    Demo
    Facebook Twitter Instagram Pinterest YouTube
    Top Insights

    TRON Stablecoin Volume Hits $1.96T As USDT Settlement Demand Surges

    06/30/2026

    Binance, Changpeng Zhao Sued for $200M by British Investors: Reuters

    06/30/2026
    Get Informed

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    © {2025-2026} Copyright CryptocNews.com
    • Home
    • Business
    • Markets
    • Technology
    • Contact us

    Type above and press Enter to search. Press Esc to cancel.