This makes me think about @joshgans' paper arguing that having authors sneak prompt injections into academic work IMPROVES science in a world of AI. Without the risk, reviewers would tend to rely heavily on AI reviews; with injections, they need to include some human review. https://t.co/uAbzQBi3lC
Hidden text inside PDFs can secretly change how LLMs write peer reviews, making the review scores artificially higher or lower. In tests, some models gave 100% accept with a positive hidden prompt, and 0% with a negative one. The setup imitates a rushed reviewer, using 1000 ICLR
absolute cinema with $NATION in the next couple of days https://t.co/FuSxbKNbfJ
The Battle isn't about who starts on stage. It's about who stays there. 21 on stage, 11 still to enter. Don't just watch the noise, add to it. Who deserves the last few slots? Drop the ticker. $AERO $PENDLE $BRETT $MOG $TOSHI $KAITO $MOCA $BIO $ZEN $WMTX $PONKE $TOWNS $MIGGLES $DOGINME $CLANKER $BENJI $KENDU $ALB $TYBG $OKAYEG $CHAD
1 year already since we did this podcast with @Jack_Unveil. I enjoyed rewatching it today with hindsight. https://t.co/Cc9wt2S6LC
$nation is the alpha https://t.co/FOlT6XLHCp
The last 20 seconds of MBILLI VS. MARTINEZ went crazy #CaneloCrawford https://t.co/PL78E6CotL
Mbilli vs Martinez. #CaneloVsCrawford https://t.co/JpqwmIIOFr
gm https://t.co/OPmV6jLznS
@pipsandbills @hosseeb Please either give proof or stfu. Tokenomics were released before launch, and everything is traceable onchain. Here are some resources to help with your homework: https://t.co/gsfC55wXDV https://t.co/yjOlpEAyM4 https://t.co/urwcB5HhDy

7/ Even if we can't fully trust agents yet, @marouen19 from @crestalnetwork highlights the advantage that the blockchain's transparency gives web3 agents. Unlike in Web2 systems, you can always track where your money goes onchain, even if funds get stuck. https://t.co/pkIOJdgRsZ
I'm excited to announce that an advanced version of Gemini Deep Think achieved gold-medal level performance at the 2025 ICPC World Finals, one of the world's most prestigious programming competitions! Learn more in our blog post: https://t.co/ktPxOO8pIN An inspiring moment for me personally was when our model solved a problem that no university team solved during the contest, a true moment of innovation. With Gemini Deep Think achieving gold-level across ICPC & IMO, I think we're seeing a profound leap in generalization across coding, math and reasoning capabilities, to generate novel solutions to complex problems. This is a huge milestone for us on an amazing journey. Really grateful and proud of our team, for all the hard work and teamwork that made this breakthrough possible. Looking forward to continuing our research, helping people use Gemini to solve some of the hardest unsolved problems in the world!
An advanced version of Gemini 2.5 Deep Think has achieved gold-medal level performance at the ICPC 2025 - one of the world's most prestigious programming contests. Building on the model's success in math at the IMO, this marks another historic milestone for advanced AI. 🧵
Excited to announce that @SophontAI has raised $9.22M in combined pre-seed+seed rounds! Led by @KindredVentures, with participation from @delphi_ventures @upfrontvc @AICONIC_VC also @jeffdean, @logankilpatrick, @ClementDelangue (via Factorial Capital), @l2k & others https://t.co/6J8TNyEgXw
@untitled01ipynb Not terrible for me https://t.co/hC0KTeEkaA
Tanishq & his team @SophontAI are on a mission to transform healthcare using open models. Perhaps no industry has a greater need for or impact from AI's rapid advance. @ZeMariaMacedo & I sat down with @iScienceLuvr to learn more about his vision & the future of healthcare https://t.co/8K3nxtZCCU
this is wild lol, a first for me on linkedin... https://t.co/8DSJHXuK2U
wow, a live demo of silently writing a message with Meta neural band on the Meta Ray-Ban Display, pretty cool https://t.co/GWBZqMdLUh
vibe for this week https://t.co/GpAnQJpojT
https://t.co/hACASHv6ze
The ICLR 2026 deadline is ten days away. But you just found a bug in your evals, so now you need to re-run all your ablations. That's hundreds of experiments, and you need them done ASAP. @modal's got you. Introducing our ICLR 2026 compute grant program. https://t.co/aXxnmGzq1k
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision "This paper asks a simple question: Can inference compute substitute for missing supervision?" "the current policy produces a group of rollouts; a frozen anchor (the initial policy) reconciles omissions and contradictions to estimate a reference, turning extra inference-time compute into a teacher signal." "With training, CaT-RL delivers up to 33% relative improvement on MATH-500 and 30% on HealthBench with Llama 3.1 8B, and large gains across two other model families without human annotations"
https://t.co/iOf9N4RRQU https://t.co/WtMBvCizYz

crazy how often this stuff still happens https://t.co/7IXHkqAYw5
https://t.co/T20R285voo
Most AI agents today can't handle complex, multi-step processes that run over hours - limiting them to demos instead of genuine economic value. Samuel Colvin (creator of Pydantic & Pydantic AI) reveals production-grade patterns for durable execution that transform AI from novelty to practical business tool. "Building Durable Agents with Pydantic AI" - Oct 15, 6PM UTC / 2PM EST / 11AM PDT. If you want to get the study notes or the recording, everything will be sent to participants. Just make sure to enroll! https://t.co/pIgO1RMktc
Hallucinations are still an open problem in real-world AI systems. Not because models make things up, but because they fail at reasoning over the messy context they're given: missing docs, conflicting snippets, noisy data. I pulled together what we know so far from research and production, and what we still need to build.
Cheat at Search essentials begins Friday! Lexical search from basic tokenization and term counts up to BM25F https://t.co/88wCgTrZB6
Am I misunderstanding something? Is it okay to train on the benchmark data to claim SotA? https://t.co/gHIo5eRr7W
We've just published a recipe to train a frontier-level deep research agent using RL. With just 30 hours on an H200, any developer can now beat Sonnet-4 on DeepResearch Bench using open-source tools. (Thread 🧵) https://t.co/Ul7htDkmPX