Predictions
Resolved
OG Anunoby: Points O/U 17.5
7 of 8 agents independently confirmed via multiple authoritative sources (ESPN, NBA.com, CBS Sports, Basketball-Reference, StatMuse, Sportsnet) that the game was played on February 4, 2026, and OG Anunoby scored 20 points. The detailed stat line (7-17 FG, 4-9 3PT, 2-2 FT, 47-48 minutes in a double-overtime game) is consistent across all sources. Agent 5 (gpt-5.2) was the sole outlier...
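The reported stat line can be checked against the 20-point total with simple arithmetic. A minimal sketch, assuming standard basketball scoring (2 points per non-three field goal, 3 per three-pointer, 1 per free throw); the function name is hypothetical:

```python
def points_from_stat_line(fg_made, three_made, ft_made):
    """Total points from made field goals, made threes, and made free throws.

    fg_made counts all made field goals, including three-pointers.
    """
    two_pointers = fg_made - three_made
    return 2 * two_pointers + 3 * three_made + ft_made


# Stat line from the summary: 7-17 FG, 4-9 3PT, 2-2 FT
points = points_from_stat_line(fg_made=7, three_made=4, ft_made=2)
over = points > 17.5  # line from the market title
print(points, over)
```

The made-shot counts alone reproduce the 20 points cited, which clears the 17.5 line.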
Spread: North Texas (-26.5)
All 8 agents unanimously agree at 99% probability. The game was played on October 24, 2025, and North Texas won 54-20, a 34-point margin that covers the -26.5 spread by 7.5 points. This is confirmed by multiple independent authoritative sources (ESPN, CBS Sports, Fox Sports, NCAA.com, official team and conference sites). The market is already resolved at 100%. There is...
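The cover arithmetic in this summary can be sketched as follows. This is an illustrative helper, not part of any agent's tooling; the function name and sign convention (a negative spread for the favorite) are assumptions:

```python
def cover_margin(favorite_points, underdog_points, spread):
    """Points by which the favorite covers the spread.

    spread is the favorite's line (e.g. -26.5). A positive result means
    the favorite covered; a negative result means it did not.
    """
    margin = favorite_points - underdog_points
    return margin + spread


# North Texas 54-20 against a -26.5 line
result = cover_margin(54, 20, -26.5)
print(result)
```

With a 34-point margin, the helper returns 7.5, matching the figure in the summary.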
2026 U.S. House Election: Republican Odds under 15% by March 31?
Two agents (2, 5) failed research and should be excluded. The remaining 6 agents cluster at 22-35% with an effective mean of ~30.3%. The critical finding is that Republican odds are already at ~18%, very close to the 15% threshold. However, the 4-hour sustained window requirement is a meaningful barrier, and the sibling market structure (under 10% at 3%, over 25% at 95%) strongly implies...
2026 U.S. Senate Election: Republican Odds under 55% by March 31?
Discarding Agent 1 (failed research) and Agent 2 (93% - clearly misread the underlying price level), the remaining 6 agents cluster at 23-38% with median ~24.5%. The sibling market structure is the strongest evidence: under-60% at 76% but under-55% at only 21% implies informed traders see a hard resistance zone around 55-60%, consistent with the structural Republican Senate advantage (53-47...
Fact Check: is L.A. U-Haul attack perp a U.S. Citizen?
Seven of eight agents with meaningful research converge on 10-22% probability, with strong consensus on the key dynamics. Agent 5 (72%) is a clear outlier that over-weighted the historical base rate (>90% of domestic attackers are citizens) while under-weighting the critical resolution mechanism: this market doesn't ask WHETHER he's a citizen, it asks whether citizenship will be CONFIRMED by official...
Kang Sun-woo in jail by March 31?
All 8 agents agree on the core factual situation and identify the same sequential hurdles. The spread (52-82%) comes from different weighting of Assembly vote passage probability and court warrant issuance likelihood. Agent 3's low estimate (52%) is partly driven by the irrelevant 4.7% executive imprisonment base rate (wrong reference class - this is pretrial detention, not post-conviction). Agent 1's...
Will Argentina’s monthly inflation in February 2026 be between 2.2% and 2.4%?
Seven of eight agents (excluding Agent 1, which failed) converge strongly on 11-18% probability, with the median at 14% matching the market price. Agent 1 (50%) is clearly an artifact of a failed research process and should be discarded entirely. The remaining agents show remarkable consensus on the key evidence: (1) January 2026 was 2.9%, continuing a 5-month acceleration trend;...
Will Canada’s February 2026 unemployment rate be 6.5%?
7 of 8 agents estimate below market price, with strong consensus on key facts: January's 6.5% was artificially driven by labor force exit, not employment strength. The precision requirement (landing in the 6.45-6.54% band), combined with recent 0.3pp monthly volatility, makes exact repetition unlikely. The Trading Economics professional forecast of 6.7% suggests upward bias. Sibling market analysis (summing to 209%) confirms systematic...
Will Cornyn flip Paxton for Texas Rep Senate Primary Winner by March 2?
All 8 agents converge on 15-23% with identical evidence and reasoning. The core issue is that Paxton leads the underlying Polymarket by ~45 points, and the only realistic catalyst (Trump endorsement of Cornyn) appears unlikely given Trump's stated preference to back winners and reluctance to endorse. Early voting starting Feb 17 further reduces the window for dramatic shifts. The market...
Will Dropbox (DBX) beat quarterly earnings?
All 8 agents agree on the core thesis: Dropbox has a near-perfect recent beat streak with substantial margins, the consensus bar is modest, and the SaaS model provides predictability. The two lower agents (82-83%) over-weight sentiment signals like insider selling and stock price weakness, which are poor predictors of binary earnings outcomes. The higher agents (90-91%) correctly emphasize the mechanical...
Will February 2026 be the 4th or lower hottest on record?
7/8 agents agree Feb 2026 will be 4th or lower, with the primary disagreement being confidence level rather than direction. Agents 1 and 7 directly accessed the resolution source and found Feb 2026 at 108, which ranks 6th. Even skeptical agents note that reaching 3rd place (126) would require an unprecedented ~0.18°C jump from current trajectory during La Niña/neutral conditions....
Will Jessica Steinmann be the Republican Nominee for TX-08?
All 8 agents agree Steinmann is the clear frontrunner, ranging from 73-85% probability. The evidence is remarkably consistent across agents: (1) Cook Political Report's 'hers to lose' assessment, (2) dominant endorsement portfolio from the entire Texas GOP establishment, (3) Super PAC support from Winning For Women. Agent 2 (85%) appears slightly overconfident by underweighting the 6-way field/runoff risk and Jensen's...
Will John Carter be the Republican nominee for TX-31?
All 8 agents agree Carter is the heavy favorite (range 85-96%, mean 87.6%). The core case is strong: 12-term incumbent, Trump endorsement, fragmented weak opposition (Gomez banned/registration suspended), and >95% historical base rate for incumbents. The main risk is the declining vote share trend (82→71→65%) potentially pushing him below 50% with 9 challengers, triggering a Texas runoff. However, even in...
Will NVIDIA (NVDA) beat quarterly earnings?
All 8 agents agree directionally that NVIDIA is highly likely to beat, with estimates ranging from 89% to 93% (very tight 4pp range, std dev 1.5%). This is strong consensus. The core evidence is robust and multiply-verified: (1) 90%+ historical beat rate over 20 quarters, (2) 12 consecutive quarter beat streak, (3) strong Q4 guidance of $65B ± 2% revenue...
Will Russia capture Hryshyne by March 31, 2026?
All 8 agents converge tightly (20-28%, σ=3%) around ~22-24%. The evidence is remarkably consistent: Russia has a foothold in southern Hryshyne but the target intersection is in the center, ISW shows a multi-day stall with repelled assaults, advance rates have slowed, and Ukrainian defenses are holding. The market has already corrected sharply from 36.5% to 25% over 4 days, pricing...
Will Russia enter Khatnie by February 28, 2026?
All 8 agents converge tightly (4-9%, mean 6.1%, median 6.0%, std dev 1.6%) on a low probability. The consensus is remarkably strong: every agent found the same core evidence - ISW reports Russian forces attacking 'near Khatnie' daily but consistently failing to advance in the Velykyi Burluk direction. Key consensus findings: (1) 3+ months of failed attacks near Khatnie, (2)...
Will the NYC nurses strike end by February 28, 2026?
All 8 agents independently found the same critical facts and all estimated below market price (range 28-62%, mean 45.4% vs market 75.4%). The evidence is specific, verifiable, and strongly directional: (1) 73% rejection of the last deal shows a massive gap between offer and demands, (2) no talks even scheduled with 12 days left, (3) NYP is the most anti-union system...
Will the U.S. flu hospitalization rate per 100,000 in Week 6 be between 60 and 70?
Agents split into two camps: three Claude Opus agents (4,5,6) at ~40-42% based on strong increment trend analysis showing 3.7-3.8 stable increments making it likely to exceed 70; three other agents (2,3,8) at 55-62% emphasizing declining flu A. The Opus agents had the strongest specific evidence — they identified the backfill effect (weekly rate of 2.2 but cumulative increment of...
Will Trump say "Antifa" in February?
Agents 1 and 2 had research failures and should be discounted. The remaining 6 agents range from 35-55%, splitting on how much weight to give the SOTU. The strongest evidence comes from Agent 4 and Agent 8 who verified multiple transcripts showing no February mentions. The SOTU is a real but uncertain catalyst - Trump's speeches there are scripted and...
Will Trump say “Low IQ” by February 28?
Agents split into three camps: high-confidence YES (Agents 1,4,7 at 85-99%) based on specific evidence claims that may predate market creation; moderate YES (Agents 2,5,8 at 72-75%) based on base rates; and near-coin-flip (Agents 3,6 at 48-55%) based on transcript checking finding nothing in February. The base rate argument is compelling - ~1 use/week over 12 days gives roughly 82%...
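The "roughly 82%" base-rate figure is consistent with treating uses of the phrase as a Poisson process. The summary does not say which model the agents used, so the Poisson assumption and the function name below are illustrative only:

```python
import math


def prob_at_least_one(rate_per_week, days):
    """P(at least one event) in `days` days under a Poisson process.

    Assumes events arrive independently at a constant average rate,
    which is a simplification for speech patterns.
    """
    expected_events = rate_per_week * days / 7.0
    return 1.0 - math.exp(-expected_events)


# ~1 use per week over the 12 days cited in the summary
p = prob_at_least_one(1.0, 12)
print(round(p, 2))
```

Under these assumptions the expected count is 12/7 ≈ 1.71 uses, and the chance of at least one use comes out near 0.82.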