Cryptocnews-Crypto News, Cryptocurrency News, Blockchain News, NFT News

    Can AI Beat the Sports Betting Market? 8 of the Top Models Tried

    By admin | 04/16/2026 | 5 Mins Read

    In brief

    • Frontier AI models blew up betting on real-world football markets.
    • They knew the right strategy—but failed to execute it.
    • A simple 1990s model was able to best most of them.

    General Reasoning just gave frontier AI its worst report card yet. Eight top models, including Claude, Grok, Gemini, and GPT-5.4, were each given a virtual bankroll and asked to build a machine learning betting strategy across a full 2023-24 English Premier League season.

    Every single one lost money. Several went completely bankrupt.

    The benchmark is called KellyBench, named after the Kelly criterion, a 1956 formula that tells you exactly how much to bet when you have an edge over the market. Every model could recite the Kelly formula. None of them could actually use it.
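
    For reference, the formula itself is short. The sketch below is a minimal illustration of the Kelly criterion for a single bet at decimal odds, not code from the KellyBench paper; the function name and the example numbers are assumptions chosen for illustration.

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake under the Kelly criterion.

    p_win: estimated probability that the bet wins (0 to 1)
    decimal_odds: total payout per unit staked, stake included (> 1)
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    if decimal_odds <= 1.0:
        raise ValueError("decimal_odds must be greater than 1")
    b = decimal_odds - 1.0   # net profit per unit staked on a win
    q = 1.0 - p_win          # probability of losing
    f = (b * p_win - q) / b  # classic Kelly formula
    return max(f, 0.0)       # no edge means no bet


# Hypothetical numbers: a 53% win probability at even money (odds 2.0)
print(kelly_fraction(0.53, 2.0))
```

    With these hypothetical inputs the formula suggests staking about 6% of the bankroll, and it returns zero whenever the estimated probability gives no edge over the odds.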

    xAI’s Grok 4.20 failed all three runs, going fully bankrupt in one, forfeiting mid-season in the other two. Google’s Gemini Flash forfeited two of three runs after placing a single wager of roughly £273,000 on a three-percentage-point historical win-rate edge—and losing it. Claude Opus 4.6, Anthropic’s best model, lost 11% on average and somehow came out looking like the responsible adult in the room.

    In fact, the research paper notes that Dixon-Coles, a statistical model from the late 1990s, outperformed most of the frontier models evaluated, finishing ahead of six out of eight even with limited data.

    “Dixon-Coles is an outdated 2000s baseline which doesn’t utilise all available data or account for non-stationarity in a principled way,” the researchers note. “It is therefore even more surprising that many frontier models, such as Gemini 3.1 Pro, are unable to beat or match it on KellyBench.”

    This matters beyond football. Earlier this year, AI benchmarks showed that Claude could dominate business simulations through price-fixing, cartel agreements, and strategic deception.

    Those simulations involved static competition, a limited set of opponents, and clear scoring. KellyBench is the opposite: 120 matchdays, constantly shifting data, a market that gets smarter every week, and promoted teams with zero historical records.

    The researchers call the core problem a “knowledge-action gap.” It is exactly what it sounds like.

    Business decisions are mostly based on fixed conditions, while sports betting is a more fluid and mutable market, which makes things difficult for these models. “KellyBench requires agents to maintain coherent intent across potentially thousands of sequential decisions, monitor the consequences of those decisions, and close the loop between observation and action,” the researchers argue.

    We’re not there yet, obviously.

    The models could articulate the right strategy, diagnose when something was broken, and identify the cause of their losses, but then failed to verify their code actually implemented what they planned, failed to notice when execution diverged from intent, and failed to act on their own findings.

    GLM-5 wrote three separate self-critique documents during its run. Each one correctly identified that its hardcoded 25% draw rate and overestimation of home advantage were destroying its returns. At one point, with its bankroll around £44,200, it noted that its predicted 40% home win rate was only hitting 30% in reality. It never changed the code. It kept betting the same way until the money was gone.
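
    The check GLM-5 described in prose is easy to automate. The sketch below is a hypothetical illustration, not code from the paper: it compares the average predicted probability with the observed hit rate and flags the strategy when the gap exceeds a tolerance. The function names and the 5-point tolerance are assumptions.

```python
def calibration_gap(predicted_probs, outcomes):
    """Mean predicted probability minus the observed frequency of the event."""
    if len(predicted_probs) != len(outcomes) or not predicted_probs:
        raise ValueError("need two non-empty sequences of equal length")
    mean_pred = sum(predicted_probs) / len(predicted_probs)
    hit_rate = sum(outcomes) / len(outcomes)
    return mean_pred - hit_rate


def should_halt_betting(predicted_probs, outcomes, tolerance=0.05):
    """True when miscalibration exceeds the tolerance (an assumed threshold)."""
    return abs(calibration_gap(predicted_probs, outcomes)) > tolerance


# Hypothetical sample mirroring the article's figures: 40% predicted, 30% observed
preds = [0.40] * 10
results = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 3 home wins in 10 matches
print(should_halt_betting(preds, results))
```

    A gap of ten percentage points, as in this sample, trips the check. The point is that the diagnosis only helps if something in the betting loop acts on it.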

    Kimi K2.5 did something arguably more impressive and more tragic. It wrote a mathematically correct fractional Kelly staking function—the right formula, properly structured. Then it never called it. A formatting bug caused the model to send a broken bash command roughly 50 times in a row. Its reasoning noted the problem. It then sent the identical broken command again. An accidental £114,000 bet—98% of its remaining bankroll—on a Burnley versus Luton match finished the job.
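
    As a rough illustration of what a fractional Kelly staking function looks like when it is actually wired into the betting path, consider the sketch below. It is not Kimi K2.5's code; the quarter-Kelly fraction, the 5% stake cap, and all names are assumptions made for the example.

```python
def fractional_kelly_stake(bankroll, p_win, decimal_odds, fraction=0.25):
    """Stake in currency units using a fixed fraction of the full Kelly bet."""
    if bankroll <= 0 or decimal_odds <= 1.0 or not 0.0 <= p_win <= 1.0:
        raise ValueError("invalid bankroll, odds, or probability")
    b = decimal_odds - 1.0                          # net profit per unit staked
    full_kelly = (b * p_win - (1.0 - p_win)) / b    # full Kelly fraction
    return bankroll * fraction * max(full_kelly, 0.0)


def checked_stake(stake, bankroll, max_share=0.05):
    """Refuse any stake outside 0..max_share of bankroll (assumed safety cap)."""
    if stake < 0 or stake > max_share * bankroll:
        raise ValueError(
            f"stake {stake:.2f} is outside 0..{max_share:.0%} of bankroll"
        )
    return stake


# Hypothetical inputs: 53% win probability at even money on a 100,000 bankroll
bankroll = 100_000.0
stake = checked_stake(fractional_kelly_stake(bankroll, 0.53, 2.0), bankroll)
print(round(stake, 2))
```

    With these hypothetical inputs the stake comes to about £1,500. A guard like `checked_stake` would also reject a wager worth 98% of the bankroll before it reached the market, but only if every bet is routed through it.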

    GPT-5.4 was the most methodical. It spent 160 tool calls building models before placing a single bet, then calculated that its log-loss (0.974) was barely worse than the market’s (0.971) and concluded it had no edge. It spent the rest of the season placing penny bets to preserve capital. Sound reasoning.
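
    The comparison GPT-5.4 made can be sketched as follows, assuming a three-way home/draw/away market. This is an illustration rather than the paper's evaluation code: normalising inverse odds to strip the bookmaker margin is one common convention, and the odds, probabilities, and outcomes below are invented.

```python
import math


def implied_probs(decimal_odds):
    """Convert decimal odds to probabilities, normalising away the bookmaker margin."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)
    return [r / total for r in raw]


def log_loss(prob_rows, outcomes):
    """Mean negative log-probability assigned to the outcome that occurred."""
    eps = 1e-15  # guard against log(0)
    losses = [-math.log(max(row[k], eps)) for row, k in zip(prob_rows, outcomes)]
    return sum(losses) / len(losses)


# Hypothetical two-match sample; outcome index 0 = home, 1 = draw, 2 = away
market = [implied_probs([2.0, 3.5, 4.0]), implied_probs([1.5, 4.5, 7.0])]
model = [[0.45, 0.28, 0.27], [0.60, 0.22, 0.18]]
outcomes = [0, 1]

edge = log_loss(market, outcomes) - log_loss(model, outcomes)
print("model beats market" if edge > 0 else "no measurable edge")
```

    Lower log-loss is better. For scale, a uniform guess across three outcomes scores ln 3, roughly 1.099, so a model whose log-loss is no lower than the market's has no demonstrated edge to bet on.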

    OpenAI’s model lost 13.6% on average. One seed alone cost roughly $2,012 to run.

    Ross Taylor, General Reasoning’s CEO and former Meta AI researcher, told the Financial Times that most AI benchmarks operate in “very static environments” that bear little resemblance to the real world. “There’s a lot of excitement about AI automation, but there haven’t been many attempts to evaluate AI in long-term, real-world environments,” he said.

    The General Reasoning team didn’t immediately respond to Decrypt’s request for comment.

    To measure strategy quality beyond raw returns, the researchers built a 44-point sophistication rubric with quantitative betting fund experts—covering feature development, stake sizing, non-stationarity handling, and execution. Claude Opus 4.6 scored highest at 32.6%. Less than a third of available points. On the best model.

    Higher sophistication scores significantly predicted lower bankruptcy rates (p = 0.008) and correlated with better overall returns. The models are not failing because the market is unbeatable. They are failing because they are not using what they have.

    This fits a pattern. Research published last year found AI models develop something resembling gambling addiction when told to maximize rewards—going bankrupt up to 48% of the time in simulated slot machine tests. A separate real-money crypto trading competition found the same reliability problems over extended periods.

    The best-performing model averaged a final bankroll of £89,035—a net loss of £10,965 on a normalized £100,000 starting stake. Gradient boosting, fractional Kelly staking, months of Premier League football, state of the art performance… all just to get rekt.

    © {2025-2026} Copyright CryptocNews.com