Your curated collection of saved posts and media
Deep Reinforcement Learning for Multi-Agent Coordination. Multi-agent reinforcement learning doesn't scale. The default approach to coordinating robot teams remains explicit communication or centralized training. This is inefficient. On the other hand, social insects solve this problem effortlessly. Ant colonies coordinate thousands of agents through stigmergy: indirect communication via environmental traces. This new research introduces S-MADRL, a framework where robot teams coordinate through virtual pheromones instead of direct communication. Agents leave digital traces in a shared virtual map as they move through the environment. Other agents sense these traces within their local field of view and incorporate them into their decision-making. The environment itself becomes the communication medium. The researchers combined this stigmergic approach with curriculum learning, training agents sequentially rather than simultaneously to address non-stationarity issues. The emergent behaviors of the system mirror biological strategies. Agents self-organized into bucket-brigade patterns with bidirectional flow. Some agents spontaneously became idle during high congestion, exactly like worker ants in crowded tunnels. No one programmed these behaviors. They emerged from the stigmergic signals alone. Paper: https://t.co/CqiJrpctY6 Learn to build effective AI Agents in our academy: https://t.co/Y5kVy5iKiQ
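To make the "shared virtual map" idea in the post above concrete, here is a minimal sketch of a pheromone grid that agents write to and read from locally. It is not the S-MADRL implementation: the class name `PheromoneMap`, the evaporation step, and all parameter values are assumptions for illustration, and the post does not say how the paper represents or decays traces.

```python
class PheromoneMap:
    """Hypothetical shared grid of virtual pheromone levels."""

    def __init__(self, width, height, evaporation=0.1):
        self.width = width
        self.height = height
        self.evaporation = evaporation  # assumed decay rate per time step
        self.grid = [[0.0] * width for _ in range(height)]

    def deposit(self, x, y, amount=1.0):
        # An agent leaves a trace at its current cell.
        self.grid[y][x] += amount

    def evaporate(self):
        # Traces fade each step so stale information disappears (assumption).
        keep = 1.0 - self.evaporation
        for row in self.grid:
            for x in range(self.width):
                row[x] *= keep

    def sense(self, x, y, radius=1):
        # Local field of view around (x, y); cells off the map read as 0.0.
        patch = []
        for yy in range(y - radius, y + radius + 1):
            row = []
            for xx in range(x - radius, x + radius + 1):
                inside = 0 <= xx < self.width and 0 <= yy < self.height
                row.append(self.grid[yy][xx] if inside else 0.0)
            patch.append(row)
        return patch


shared = PheromoneMap(5, 5, evaporation=0.5)
shared.deposit(2, 2)                       # agent A marks cell (2, 2)
shared.evaporate()                         # one time step passes
local_view = shared.sense(1, 1, radius=1)  # agent B looks around (1, 1)
```

In a learning setup, a patch like `local_view` would be appended to each agent's observation, so the only channel between agents is the map itself.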

Designing reward functions for RL agents is kind of broken. The default approach remains manual engineering: domain experts iteratively craft reward signals through trial-and-error. This requires significant expertise, takes enormous human effort, and often fails when task complexity increases. But what if agents could discover their own optimal reward functions? This new research introduces a bilevel optimization framework that automatically discovers optimal reward functions for embodied RL agents through regret minimization. The optimal reward function can be defined as one that minimizes the gap between the learned policy and the true optimal policy. No expert demonstrations needed. No human feedback required. How it works: Two optimization levels run simultaneously. The lower level trains the RL agent to maximize rewards as usual. The upper level continuously updates the reward function itself, guided by a meta-gradient that minimizes policy regret. The reward function learns to assign high values to critical states like success or failure, while providing dense feedback throughout the state space. The framework works across both value-based agents (DQN) and policy-based agents (PPO, SAC, TD3) without task-specific tuning. In data center energy management, all RL agents using discovered rewards achieved energy reductions exceeding 60%, compared to 21-52% for baseline RL. In UAV trajectory tracking, the approach enabled PPO agents to successfully complete tasks where hand-designed rewards failed entirely. In sparse-reward OpenAI benchmarks, agents using discovered rewards outperformed baselines in both convergence speed and final performance. The discovered reward functions also reveal interpretable structure: they automatically identify critical states and encode latent relationships between states and rewards that match physics-based reward designs, despite having no explicit mathematical model. 
Paper: https://t.co/W9fRH6sbDq Learn to build effective AI Agents in our academy: https://t.co/JBU5beIoD0
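The two-level loop described in the post above can be sketched on a toy problem. This is not the paper's method: it swaps the meta-gradient for a plain finite-difference estimate, uses a hypothetical two-action bandit, and assumes the true returns (`TRUE_RETURN`) are available for computing regret. All names and values are illustrative.

```python
import math

TRUE_RETURN = [0.2, 1.0]  # hypothetical true task return of each action


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(theta, steps=20, lr=0.5):
    """Lower level: gradient ascent on the expected *learned* reward theta."""
    logits = [0.0, 0.0]
    for _ in range(steps):
        pi = softmax(logits)
        expected = sum(p * r for p, r in zip(pi, theta))
        # Gradient of sum_j pi_j * theta_j w.r.t. logit_i is pi_i * (theta_i - expected).
        logits = [l + lr * p * (r - expected) for l, p, r in zip(logits, pi, theta)]
    return softmax(logits)


def regret(theta):
    """Gap between the best true return and the trained policy's true return."""
    pi = train_policy(theta)
    return max(TRUE_RETURN) - sum(p * r for p, r in zip(pi, TRUE_RETURN))


def discover_reward(theta, outer_steps=30, lr=0.5, eps=1e-3):
    """Upper level: finite-difference descent on regret (stand-in for a meta-gradient)."""
    theta = list(theta)
    for _ in range(outer_steps):
        base = regret(theta)
        grad = []
        for i in range(len(theta)):
            bumped = list(theta)
            bumped[i] += eps
            grad.append((regret(bumped) - base) / eps)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Start from an uninformative all-zero reward and let the outer loop shape it.
learned = discover_reward([0.0, 0.0])
final_policy = train_policy(learned)
```

The design point the sketch tries to show is that the reward parameters are never fit to labels or demonstrations; they are only nudged in whatever direction makes the policy trained on them perform better on the real task.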

Brilliant post on prompt caching! This is one of the most effective and underutilized techniques for reducing LLM usage costs. https://t.co/nByCrQv5ez
History LLMs are models trained exclusively on pre-1913 texts. Beyond the research applications, here is why this is an exciting effort for me: 1. Studying historical discourse without modern bias. These models capture what was "thinkable, predictable, or sayable" at specific moments in history. Unlike prompting modern LLMs to roleplay, these models genuinely don't know about future events because that information literally isn't in their training data. Lots of applications there. 2. Understanding historical predictions and assumptions. Researchers can explore what contemporaries expected would happen versus what actually occurred. This is useful for studying economic forecasts, political analysis, and social expectations from past eras. 3. Analyzing language and concept evolution. Track how terminology, ideas, and discourse patterns changed over time. The specific cutoff dates (1913, 1929, 1933, 1939, 1946) align with major historical inflection points (pre-WWI, Great Depression, WWII start, post-war). Heck, this might even be useful where you are using sub-agents for history-related tasks or expertise. 4. Detecting anachronisms and large-scale textual analysis. Useful for historians, writers, and filmmakers to verify period-accurate language and concepts. These models can flag modern assumptions that wouldn't exist in historical contexts. This also enables exploration of massive historical corpora in ways traditional archival research cannot. It acts as a "compressed representation" of the discourse from each era. Thoughts?
Wan2.6: Be What You Wanna Be. Wan2.6 R2V is now available. - Supports recording in real time or uploading a 5 second reference video, and replicating the person, animal, animated character, or object from that clip into new videos. - Supports both single- and multi-character video generation. - Replicates both the character's appearance and voice, producing fully audio-visual synchronized output (including music, sound effects, and human speech).
If you support Native American people, history & culture, say "Yes" https://t.co/O0nVBgw6er

Hello all my friends, need a big A'ho! … https://t.co/68LlsttsUR

ANGRY GAUL. HANDS OFF: - MY DIVERSITY - MY TERROR ATTACKS - MY THROAT-SLITTINGS - MY CRACKHEADS - MY RAPES BY PEOPLE UNDER OQTF ORDERS - MY SIEVE-LIKE BORDERS - MY A.M.E - MY COMMUNIST STATE - MY PUNITIVE ECOLOGY - MY AFFIRMATIVE ACTION https://t.co/KnsI3CpAnc

"Can you hear me?" With Dreamina Video 3.5, sound is no longer an add-on, it drives the story. Audio and image are generated as a single, unified output. Mute it, and the story disappears. This piece explores distance, noise, and closeness, and how sound can slowly make space for love. I was honestly impressed by how well this new model understood the prompt. Generating this felt simple and intuitive, with natural gestures, solid lip-sync, and really strong sound design right out of the box!! You rock Dreamina!! Can't wait to create more with it. Created for the launch of Seedance 1.5 by @dreamina_ai #DreaminaVideo 3.5 #Seedance 1.5
can someone help folks at Mistral find more weak baselines to add here? since they can't stomach comparing with SoTA.... (in case y'all wanna fix it: Chandra, dots.ocr, olmOCR, MinerU, Monkey OCR, and PaddleOCR are a good start) https://t.co/QNZKZITdUt
Introducing Mistral OCR 3, a new frontier in document intelligence! https://t.co/o0fAvFjtz7
It's actually these guys https://t.co/hPdBOAXHnn
What's the most dangerous enemy of America? 1. Communism 2. Islam 3. Dems/Wokes/Lefties 4. China https://t.co/SYyJ8NhtWi
I don't know if you've seen this @levarburton, but it would be awesome if you agreed. https://t.co/JPpUIllheQ

"If you take society's definition of knowing oneself, you will become lost in the many translations." - G Sara. Follow @DailyNative1 https://t.co/Sz02bgq5j2

Unless you are indigenous, you are not a native. You are from here. You were born here. You were raised here. But you are not and will never be a native. https://t.co/RyzI1zhGpV

We are a modern culture that has suffered great amnesia about who we all were in our past. Sound and music connected us all, past all divides that today we call political. https://t.co/4b5IUzO9df
I don't know if I mentioned it but an art museum is buying two of my digital pieces. We had them printed & made into signs for an exhibit and decided they wanted to keep them for their collection. https://t.co/7SbYwAV3xJ

@dbongino You're not wrong Dan. The real Patriots are stepping up though, from small towns like mine to big cities. We will overcome!! https://t.co/rVBFuVEpYU

"Believe in yourself and all that you are. Know that there is something inside you that is greater than any obstacle." https://t.co/KAPbKbBYMk

z e r o restarting! not new to rp, like and rt to be mutuals https://t.co/paSlyMscJp
All sorts of things from the world of Zeroni. What if: "Zeroni as Zerose" AU, written by Nana https://t.co/KsMMzXrFLk

One in three using AI for emotional support and conversation, UK says https://t.co/nlzpIA0rug @radioproducer @bbcnews
Just built a fully functional Reversi game with a pro-level AI opponent using @ManusAI. I just gave Manus the goal, and it took over the entire build autonomously. The complexity under the hood is crazy, but the good thing about working at Manus is that I can ask an AI engineer! Here's what they told me:
HEATED RIVALRY https://t.co/Q7ZKWxQPI6
https://t.co/YDECzBWx4z
the surreal "6 7 sydney sweeney 3 Labubus at a McDonald's drive-thru" bit, repeating the same joke over and over every night, the palpable sense that he's phoning it in so hard he may not even be trying to hide his contempt from the audience...is this show actually brilliant??

LETS GOOOO https://t.co/BkcWQp1OAI
Ever had someone say: "I'm 5 minutes away" ...but they haven't even put their shoes on yet? I used to take that personally. Like... disrespect. Like... lies. Then I learned what neuroscientists actually point to. Our ability to plan, estimate time, and simulate what happens next relies heavily on the prefrontal cortex (the brain's executive system). And imagining the future isn't just "motivation"... it's a real brain process involving networks that include the prefrontal cortex and memory systems like the hippocampus. When that future-simulation / planning system is weaker (or just overloaded by stress, distraction, sleep deprivation, ADHD-style executive function issues, etc.) the brain can become super "NOW-focused". So "5 minutes" can mean: "I feel close to leaving", not "I have calculated the real timeline." This doesn't mean anything goes. Boundaries still matter. But it changes the solution from yelling to making time concrete: "Text me when your shoes are on." "Text me when you have your keys." "Text me when you're in the car." "Then send your real ETA." Less rage. More clarity. More peace. Sometimes it's not attitude... it's executive wiring. Video Credit: insta codex_mind1
Why AI Companies May Invest More than $500 Billion in 2026 https://t.co/ot4rFhmD5c @mckinsey
AI toys are suddenly everywhere but I suggest you don't give them to your children https://t.co/jYGOm1P31f
Working on a new language model architecture grounded in coherence physics and thermodynamic learning. https://t.co/mtLUEIxHSl
Goodnight, X. Grok Imagine image/video prompt. image prompt: Extreme close-up of a single Tesla Optimus Gen 2 robot's black glass faceplate. The reflection in the curved glass shows the reflection of thousands of other robots standing in rows. A small red status light blinks on the side of the helmet. Cinematic sci-fi masterpiece, shot on ARRI Alexa 65, anamorphic lens, cold industrial color grading, deep blues and greys with orange highlights, volumetric lighting, photorealistic 8k. She looks forward out of her front window, crane shot in, fast motion, zoom in, text: 'grok' video prompt: crane shot in, fast motion, zoom in, text: 'grok'