Your curated collection of saved posts and media
@icanvardar True, also true. Adding a domain to your CORS config that costs 0.5 million tokens for the entire pipeline. Crazy times.
@_talaawwad Please get more support folks…
@SalajSonar1086 Already done :). The respective tutorial articles are linked via the "View in Article" links there
VisMatch is on PyPI! VisMatch is a wrapper for image matching models like LightGlue, RoMa-v2, MASt3R, LoFTR, and 50+ more! It's literally as simple as:
pip install vismatch
vismatch-match --inputs img0 img1 --matcher choose_any
to run image matching on any 2 images [1/4] https://t.co/dIr2YapWak
Nvidia GTC 2026 OpenClaw Setup on DGX Spark IRL https://t.co/zQwwfCF9XP
🛠️ Claude Code "opusplan" is literally a hybrid model.. and it's official! You can select the opusplan model in Claude Code:
> /model opusplan
It's a hybrid model alias that switches models automatically based on the stage of the work: in plan mode it uses Opus for complex reasoning, then switches to Sonnet automatically for the execution stage! You can of course plan and implement everything with Opus, but if you already have a solid plan, Sonnet is enough for execution and cheaper. Using the right model for each task = efficiency. Planning and execution demand different cognitive loads: Opus's deep reasoning shines most while drafting the plan, and once a solid plan is in place, Sonnet can fully cover the execution. When is it useful?
- Complex feature design, where architecture decisions matter
- Refactoring plans that need impact-scope analysis
- When you want to cut costs versus using Opus only
You'll be using this a lot from now on!!!
Jensen today announced Alpamayo 1.5 at #NVIDIAGTC! #Alpamayo 1.5 is a major update to Alpamayo 1, @nvidia's open 10B-parameter chain-of-thought reasoning VLA model, first introduced at #CES. Built on the #Cosmos-Reason2 VLM backbone and post-trained with RL, it adds support for navigation guidance, flexible multi-camera setups, configurable camera parameters, and user question answering. The result is an interactive, steerable reasoning engine for the AV community.

We're also releasing post-training scripts to help researchers and developers adapt the model. Additionally, we've significantly expanded the Alpamayo open platform across data and simulation, including releasing highly requested reasoning labels for the PhysicalAI Autonomous Vehicles dataset (https://t.co/fD9eUcndya), as well as our chain-of-causation auto-labeling pipeline.

Learn more about Alpamayo 1.5 and the latest extensions to the Alpamayo open platform: https://t.co/P0nuqkwBab (please note that most of the links will become active in the next few days). Happy building, and stay tuned for more in the coming months! @NVIDIADRIVE @NVIDIAAI
Did you know about the opusplan model in Claude Code? /model opusplan It's a hybrid alias that automatically uses Opus in plan mode for complex reasoning, then switches to Sonnet for execution. Best of both worlds: Opus thinks, Sonnet builds https://t.co/r7un0X5bVg
Subagents are now supported in Codex. They're very fun and make it possible to get large amounts of work done *quickly*:
Subagents are now available in Codex. You can accelerate your workflow by spinning up specialized agents to:
• Keep your main context window clean
• Tackle different parts of a task in parallel
• Steer individual agents as work unfolds
https://t.co/QJC2ZYtYcA
Banger report from the Kimi team: Attention Residuals

Residual connections made deep Transformers trainable. But they also force uncontrolled hidden-state growth with depth. This work proposes a cleaner alternative: Attention Residuals, which replace fixed residual accumulation with softmax attention over previous layer outputs. Instead of blindly summing everything, each layer selectively retrieves the earlier representations it actually needs. To keep this practical at scale, they add a blockwise version that compresses layers into block summaries, recovering most of the gains with minimal systems overhead.

Why does it matter? Residual paths have barely changed across modern LLMs, even though they govern how information moves through depth. This paper shows that making the mixing content-dependent improves scaling laws, matches a baseline trained with 1.25x more compute, and boosts GPQA-Diamond by +7.5 and HumanEval by +3.1, while keeping inference overhead under 2%.

Paper: https://t.co/04IG6FDiVr
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
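The post describes the mechanism but ships no code. Below is a minimal sketch of the core idea, softmax attention over the stack of previous layer outputs in place of a fixed residual sum; the single-head design, projection shapes, and names are my illustration, not the paper's actual formulation:

```python
import torch
import torch.nn.functional as F

def attention_residual(h_stack, w_q, w_k):
    """Mix previous layer outputs via softmax attention instead of a plain sum.

    h_stack: (L, B, T, D) outputs of layers 0..L-1 (layer 0 = embeddings).
    w_q, w_k: (D, D_att) projections, illustrative only.
    Returns a (B, T, D) residual stream for the next layer.
    """
    q = h_stack[-1] @ w_q                   # per-token query from the current layer: (B, T, D_att)
    k = h_stack.permute(1, 2, 0, 3) @ w_k   # one key per previous layer: (B, T, L, D_att)
    scores = torch.einsum('btd,btld->btl', q, k) / k.shape[-1] ** 0.5
    weights = F.softmax(scores, dim=-1)     # per-token weights over depth: (B, T, L)
    # Content-dependent retrieval over depth replaces the fixed sum of all residuals.
    return torch.einsum('btl,lbtd->btd', weights, h_stack)
```

The blockwise variant mentioned in the post would attend over compressed block summaries rather than every individual layer, which is where the "minimal systems overhead" claim comes from.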
The moment he realised that https://t.co/vWmBsnR1nt isn't fully built on transformers and can run on a single GPU with high accuracy and lower cost https://t.co/ZJYuL62UB8

Thanks for sharing our newest work, @_akhaliq! Classic algorithms like K-Means deserve to be revisited in the era of massive datasets and GPUs. Flash-KMeans rethinks the algorithm from a systems perspective to make exact K-Means fast and memory-efficient on modern hardware.
Claude Code CLI > Codex CLI Codex Desktop > Claude Code Desktop It's a jagged UX frontier
Flash-KMeans: Fast and Memory-Efficient Exact K-Means. Paper: https://t.co/Yy7V7L12Bn https://t.co/c1mGipQl3f
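Neither post includes code, so here is one way to read "fast and memory-efficient exact K-Means" in systems terms: tile the assignment step so the full (N, K) distance matrix is never materialized. This is a generic sketch of that tiling idea, assuming PyTorch tensors, and not the paper's actual implementation:

```python
import torch

def assign_chunked(x, centroids, chunk=65536):
    """Exact nearest-centroid assignment without the full (N, K) distance matrix.

    Processes points in tiles so peak memory is O(chunk * K) instead of O(N * K).
    Illustrates the memory-tiling idea only, not Flash-KMeans itself.
    """
    c_sq = (centroids ** 2).sum(dim=1)            # (K,) squared centroid norms
    labels = torch.empty(x.shape[0], dtype=torch.long, device=x.device)
    for start in range(0, x.shape[0], chunk):
        xb = x[start:start + chunk]               # (chunk, D) tile of points
        # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is constant
        # per row, so the argmin needs only the cross term and centroid norms.
        d = c_sq - 2.0 * (xb @ centroids.T)       # (chunk, K)
        labels[start:start + chunk] = d.argmin(dim=1)
    return labels
```

Assignments stay exact because dropping the per-point norm never changes which centroid attains the minimum.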
Agentic Browser Tools (Experimental) in @code! Agents can now open pages, read content, click elements, and verify changes directly in the integrated browser while building your web app. Enable the workbench.browser.enableChatTools setting to try it out. Learn more: https://t.co/kNwugFcbIA
#NVIDIAGTC news: NVIDIA Dynamo 1.0 enters production as the broadly adopted inference operating system for AI factories. Dynamo 1.0 boosts Blackwell inference performance by up to 7x. The industry is scaling on NVIDIA. ⬇️ https://t.co/Iaq2H2SmhR
#ExecuTorch addresses fragmented native deployment for #AI agents as a #PyTorch native platform. It enables voice models across CPU, GPU, and NPU on Android, iOS, Linux, macOS & Windows: https://t.co/NeQQyUniL4 https://t.co/O3itnoQFoG
Matt Maher tested frontier models in Cursor vs. other harnesses. Cursor boosted model performance by 11% on average:
Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%
His benchmark measures how well models implement a 100-feature PRD. @cursor_ai consistently outperformed. https://t.co/hrjCmWMNKN
A new generation in AV simulation is here! We are announcing AlpaDreams, a real-time interactive generative world model for AV simulation! Just a year ago it took minutes to generate a few seconds of video; today it is real time and interactive! https://t.co/FbhKu3PMqe
Mistral Small 4 is out https://t.co/IdAowSpHpN
Love this submission from our world models hackathon this weekend - a generative FPS!
Spent the weekend hacking at the Worlds in Action hackathon at @fdotinc by @SensAIHackademy. It was so much fun playing with the world models by @theworldlabs . I believe generative games are the future where characters, rules and even parts of the world can be generated and ad
NVIDIA has released Nemotron 3 VoiceChat! A ~12B parameter speech-to-speech model that leads our open weights Conversational Dynamics vs. Speech Reasoning pareto frontier.

Understanding speech-to-speech model performance is multidimensional - two key and distinct dimensions are raw intelligence and conversational dynamics: how well a model handles the natural rhythms of human conversation, such as turn-taking and interruptions. Among full duplex open weights models, NVIDIA's new Nemotron 3 VoiceChat, V1, leads in balancing these dimensions, setting itself apart from other models on the Conversational Dynamics vs. Speech Reasoning pareto frontier.

Key benchmarking results:
➤ Conversational Dynamics (Full Duplex Bench): Nemotron 3 VoiceChat (V1) scores 77.8%, second among open weights speech-to-speech models behind NVIDIA's own PersonaPlex (91.0%) and ahead of FLM-Audio (62.0%), Moshi (61.0%) and Freeze-Omni (58.7%)
➤ Speech Reasoning (Big Bench Audio): Nemotron 3 VoiceChat (V1) scores 29.2%, second among open weights speech-to-speech models behind Freeze-Omni (33.9%) and well ahead of PersonaPlex (12.6%), FLM-Audio (5.3%) and Moshi (1.7%)
➤ Pareto leader: While Freeze-Omni leads on speech reasoning and PersonaPlex leads on conversational dynamics, Nemotron 3 VoiceChat (V1) is the only open weights model that places in the top 3 on both, making it the clear leader on the pareto frontier between these two critical dimensions
➤ Model size: At 12B parameters, Nemotron 3 VoiceChat (V1) is one of the larger open weights speech-to-speech models (NVIDIA's PersonaPlex is ~7B), yet it is still relatively small compared to leading LLMs
➤ Context vs. proprietary models: While this release materially advances open weights performance, open weights speech-to-speech models still significantly underperform leading proprietary offerings. For comparison, proprietary models score substantially higher on our Big Bench Audio benchmark: Step-Audio R1.1 at 96%, Grok Voice Agent at 92%, Gemini 2.5 Flash (Thinking) at 92%, and Nova 2.0 Sonic at 87%. The gap between open weights and proprietary remains large in this modality.

As the capability and adoption of speech-to-speech models increases, we expect to expand our set of benchmarks to include elements such as tool-calling and multi-turn instruction following. See more details below ⬇️
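As a sanity check, the pareto claim can be reproduced from the scores quoted in the post (higher is better on both axes). The numbers below are the post's; the helper function is mine:

```python
# Scores quoted above: (Full Duplex Bench, Big Bench Audio), higher is better.
models = {
    "Nemotron 3 VoiceChat": (77.8, 29.2),
    "PersonaPlex":          (91.0, 12.6),
    "Freeze-Omni":          (58.7, 33.9),
    "FLM-Audio":            (62.0,  5.3),
    "Moshi":                (61.0,  1.7),
}

def pareto_frontier(points):
    """Keep every model not dominated on both dimensions by another model."""
    return {
        name for name, (a, b) in points.items()
        if not any(a2 >= a and b2 >= b and (a2, b2) != (a, b)
                   for a2, b2 in points.values())
    }

print(pareto_frontier(models))
# The three frontier models named in the post: PersonaPlex,
# Nemotron 3 VoiceChat, and Freeze-Omni (FLM-Audio and Moshi are dominated).
```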
It's been about 20 years since I first started working on embeddings with Yann LeCun (siamese networks!), and I've been fascinated ever since. Gemini Embeddings 2 approaches the platonic ideal: native embedding of text, image, video, audio, and docs to a single space.
OmniForcing unlocks real-time joint audio-visual generation. It achieves ~25 FPS with 0.7s latency, a 35× speedup over offline diffusion models, by distilling bidirectional LTX-2 into a causal streaming generator while maintaining multi-modal fidelity. https://t.co/UGYGMyTQOs
@Nvidiadev MONDAY @ Booth #338
2PM: Shaping the Future w/ @matthew_d_white
3PM: TensorRT + PyTorch w/ Angela Yi & @narendasan
4PM: DeepSpeed Trillion-Param Training w/ @PKUWZP
5PM: PyTorch Export w/ Angela Yi
6PM: Ray Distributed Computing w/ @robertnishihara
#AI #GTC2025
Want to parse complex PDFs with SOTA accuracy, 100% locally? At just 0.9B parameters, you can drop GLM-OCR straight into LM Studio and run it on almost any machine!
- 0.9B total parameters
- Runs on < 1.5GB VRAM (or ~1GB quantized!)
- Zero API costs
- Total data privacy
Desktop document AI is officially here.
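LM Studio exposes an OpenAI-compatible server locally (by default at http://localhost:1234/v1), so a call to a locally loaded GLM-OCR might look like the sketch below. The model identifier "glm-ocr", the prompt, and the page image are assumptions; use whatever identifier LM Studio shows after you download the model:

```python
import base64
from openai import OpenAI  # pip install openai; used here against LM Studio's local server

# LM Studio's OpenAI-compatible endpoint; no real API key is needed locally.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

with open("page.png", "rb") as f:  # hypothetical: one rasterized page of the PDF
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="glm-ocr",  # assumption: match the model name shown in LM Studio
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extract all text from this page."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```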