Your curated collection of saved posts and media
Got Claude Pro again for a bit and am playing with Artifacts again. Made an art-piece/story here: https://t.co/gZMSoERWyI https://t.co/Op67z59Usz
The problem with giving AIs a default personal system prompt is that you have no idea how that prompt interacts with various AI tasks. Our research shows that even small prompt changes (like saying "please") can backfire on some problems and lower accuracy in unexpected ways. https://t.co/16gEAGuRUX

frontier ai today https://t.co/YkhEmCsF0r
My favorite part of this talk: Isaac's .cursorrules for FastHTML https://t.co/BNjKr3gdbV https://t.co/HADJBtQjsp
Building Production-Grade Conversational Agents with Workflow Graphs. Uses a DAG to design robust and complex agentic systems. If you're building AI agents, this is worth a read. Here are my notes: https://t.co/Cks6GxS5a6
YC on the key prompting techniques used by the best AI startups: https://t.co/wmBho365HS
missing anything https://t.co/0OhMQKC2ls
AI Apocalypse Simulator https://t.co/ubH24w7prq
Gemini, make me a meme about being a @kaggle employee learning about being "paranoid about evals" in the AI Evals For Engineers & PMs course from @HamelHusain @sh_reya https://t.co/1pcz0y1tIS
Here's a fun use of Kontext: you can use it to change the aspect ratio of an input. Change the aspect ratio from "match_input_image" to anything different. Prompt: > make the image taller Probably pretty good for making iPhone wallpapers out of anything.… https://t.co/feQ0t6iLFy
One test AIs struggle with is creating riddles, as they tend to either be too obvious or too obscure. I asked Gemini 2.5, Claude Opus & o3 to come up with an SVG that would hint at a book without revealing it. They were usually pretty easy, here are some typical examples. Guess? https://t.co/BOGD4IjqXi

What makes attention the critical component for most advances in LLMs and what holds back long-term memory modules (RNNs)? Can we strictly generalize Transformers? Presenting Atlas (A powerful Titan): a new architecture with long-term in-context memory that learns how to… https://t.co/qcvO0B9HW5
Totally. The chair in the sky all over again. https://t.co/dV5G9Akaqs
Nous Research will pay $2500 to the first to properly and fully implement Atropos support in the VeRL project! For information on Atropos, our standalone RL environments framework, see: https://t.co/U20tCJdguP For the official VeRL issue on the bounty: https://t.co/UcszRrE2kg https://t.co/zYOyeqEtRU
Agentic browsers are here! Introducing @opera's new agentic browser, Opera Neon! Opera Neon is an AI agentic browser that can browse with you or for you, take action & help you get things done! https://t.co/2SVZ3zY3gN
Office hours from the latest session. https://t.co/t3KPSJyXuO
Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? VideoGameBench evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! https://t.co/kcBZ8vsDyw
new deepseek release almost on-par with o3 (high) on livecodebench https://t.co/znw6OTCmdE
Made a cli that allows you to pull all discord messages from a channel w/ threads+reply hierarchies from your own server. Perfect for LLMs (made with AI + nbdev). Using this to generate FAQs from my course! GitHub: https://t.co/O8X1VqMv4z Docs: https://t.co/uvOQUtwc2t https://t.co/cyajOWiiO6
AgenticSeek: Private, Local Manus Alternative This is worth checking. It's a local alternative to Manus AI that can autonomously browse the web, write code, and plan tasks. It's built for local reasoning models, runs on your hardware, and keeps all data on your device. https://t.co/y3lu0RfPex
I suspect most people underestimate what o3 is capable of doing. One example: I gave it an Excel file for a small business I use for my classes & the single prompt "identify the key assumptions here and give me a sensitivity analysis." It did a lot of work & gave a good answer. https://t.co/Boole32ilC
Here's what happens if you force Hermes 3 to continue printing out ingredients to a peanut butter and jelly sandwich @NousResearch @dottxtai https://t.co/951W4WZAbY

Anyone used the genai integration with vertexai? today I learnt this code snippet does not work at all lol. You get an error that API Keys are not supported by this API? https://t.co/fXJZQuS2tq

OPUS 4 NEW SOTA ON ARC-AGI-2 IT'S HAPPENING - I WAS RIGHT Claude 4 models are the first models that effectively use test-time-compute for ARC-AGI-2 https://t.co/YrFaHBsagq
How should you go about generating synthetic data for LLM Evals? - How many examples? What should your prompt be? Should you test everything? FAQ #2 from our course https://t.co/97lB8uXm2p https://t.co/liLRhIPjva
Finally completed and merged the SWE_RL environment that was described by Meta's SWE RL paper into Atropos - A really difficult environment that can teach a model to be a much better coding agent! Check out the PR: https://t.co/KW36dHo2ts Check out Meta's SWE-RL paper:… https://t.co/y6P8K9zgYh
We're constantly releasing updates and new features to LlamaCloud. LlamaParse lets you make use of the latest LLMs when parsing complex documents, getting them ready to be used in further AI applications. And now, it supports @AnthropicAI Sonnet 4.0 in agent and LVM modes.… https://t.co/yNcOtjKMzm
We are introducing Quartet, a fully FP4-native training method for Large Language Models, achieving optimal accuracy-efficiency trade-offs on NVIDIA Blackwell GPUs! Quartet can be used to train billion-scale models in FP4 faster than FP8 or FP16, at matching accuracy. [1/4] https://t.co/gggPqEgcPZ
Don't let AWS rip you off. We grew our B2C education app to ~400k users and $1M+ ARR on a single $87/month dedicated server from OVH. No autoscaling nonsense, managed database markup, or observability bloat. Just a fast, predictable server that quietly did its job for years.… https://t.co/SkdIZutKHC
this "jailbreak" is so funny Claude Opus 4 :) https://t.co/YuzMfdC4Mm
Diffusion for text generation is booming, and we're pushing it further. While recent works explore unified generation via diffusion for faster decoding, they mostly rely on language priors. We introduce Muddit, our next-gen Meissonic model. Star at https://t.co/LmWxpQLDAz https://t.co/6gfQdsFyrW
Instructor makes chat completions backwards compatible with the new responses api. Use all of the new OpenAI inbuilt tools with the structured outputs to match. Read more here : https://t.co/S9MArlhGnD https://t.co/bU5fwlWnps