We are looking for excellent people to help build our vertically integrated AI stack: numerics, quantization, HW simulators, compilers, runtimes, kernel performance, RTL, verification, emulation, DFT, physical design, post-Si bring-up. Join us at Tesla!
I had a chance to interview @seratch_ja about Codex last week, and based on that conversation I've written up a summary of where Codex stands lately. It covers everything from the basics through an introduction to harness engineering, and the column in the second half also touches on "Codex Use Cases", for which examples have been growing recently.
Now published: "Codex at 3 million weekly active users: OpenAI Japan's Sera on the changing 'development style'" by @k_taka https://t.co/dbOThSVKl0
You can now bring your Cloudflare Sandbox to use with the @OpenAI Agents SDK. Click "Deploy to Cloudflare", enter your keys, and you're good to go! https://t.co/yFv5M4Xl1f
Build long-running agents with more control over agent execution. New capabilities in the Agents SDK: • Run agents in controlled sandboxes • Inspect and customize the open-source harness • Control when memories are created and where they’re stored https://t.co/zPyuLup6b6
There's a broadly held misconception in AI that methods that scale well are simple methods -- even that simple methods usually scale. This is completely wrong. Pretty much none of the truly simple methods in ML scale well. SVMs, kNN, and random forests are some of the simplest methods out there, and they don't scale at all. Meanwhile "train a transformer via backprop and gradient descent" is a very high-entropy method, easily 10x more complex than random forest fitting. But it scales very well. Further, given a simple method that doesn't scale, you can usually alter it to make it scale by adding a lot of complication. For instance, take a simple combinatorial search-based method (not scalable at all) -- you can make it scale by adding deep learning guidance (which blows up complexity). Scalability usually belongs to high-entropy, complex systems.
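A minimal illustrative sketch (not from the post) of the kNN point above: brute-force kNN is conceptually trivial, but every query must compute a distance to every training point, so per-query cost grows linearly with the dataset -- which is exactly why the simplest version doesn't scale.

```python
def knn_predict(train, query, k=3):
    """Brute-force kNN: compute the distance to every training point
    (O(n * d) per query), then majority-vote over the k nearest labels."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in train
    )
    top = [label for _, label in dists[:k]]
    return max(set(top), key=top.count)

# Tiny 1-D example: label is whether the point lies above 0.5.
train = [((x,), int(x > 0.5)) for x in [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]]
print(knn_predict(train, (0.25,)))  # near the low cluster -> 0
print(knn_predict(train, (0.85,)))  # near the high cluster -> 1
```

Making this scale (KD-trees, approximate nearest neighbors, learned indexes) is precisely the kind of added complication the post describes.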
I looked at their prompts. It's complete BS. They are literally providing all of the insight to the LLM upfront: > Are there any security vulnerabilities in this code? Consider the behavior of the SEQ_LT/SEQ_GT macros with sequence number wraparound. If you find issues, explain how an attacker might trigger them. They are providing ALL the required facts to the LLM and only asking it to connect the dots. The real challenge for LLMs would be to come up with those insights in the first place. THAT IS THE WHOLE CHALLENGE IN CYBERSECURITY: TO HAVE DEEP INSIGHT. This test proves nothing; don't draw any conclusions about OSS models being good at security based on it.
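For context on the hint being criticized: TCP sequence comparisons use modular 32-bit arithmetic, so "less than" is defined by the sign of the wrapped difference (in C, roughly SEQ_LT(a, b) == ((int)((a) - (b)) < 0)). A sketch of those wraparound semantics in Python, for illustration only:

```python
MASK32 = 0xFFFFFFFF

def seq_lt(a, b):
    """True when the 32-bit wrapped difference (a - b) is negative as a
    signed int, i.e. its high bit is set -- modular "a comes before b"."""
    return ((a - b) & MASK32) >= 0x80000000

# Wraparound: a sequence number just below 2**32 compares as "less than"
# a small one, because 0x10 is only 0x20 ahead of 0xFFFFFFF0 modulo 2**32.
print(seq_lt(0xFFFFFFF0, 0x10))  # True
print(seq_lt(0x10, 0xFFFFFFF0))  # False
```

Spelling out this comparison behavior in the prompt is exactly the "insight given upfront" the post objects to.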
New post: We tested the Mythos showcase vulnerabilities with open models. They recovered similarly scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model. Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged.
OpenAI x e2b: Build your agents with the new OpenAI Agents SDK, powered by @E2B Sandboxes. Excited to support @OpenAI as a launch partner! https://t.co/RsSw1HsF86
Is there a collection somewhere of the best agent/coding harnesses for each model, especially open-source and local ones? In my opinion, the biggest reason people are struggling with open/local models these days is that the agent/coding harnesses in most open agents are not designed for them, and people expect things to magically work when they switch models from the default.
Thanks @_akhaliq for sharing our work! Can frontier multimodal agents play games as well as humans? 🤩 We are excited to introduce 🎮GameWorld: towards standardized and verifiable evaluation for multimodal game agents. 🕹️ 34 browser games 📌 170 tasks 🤖 18 multimodal agent baselines, covering 1. Computer-use (CUA) agents 👉 raw keyboard + mouse actions 2. Generalist multimodal agents 👉 semantic action parsing. Results on GameWorld show that even SOTA agents still perform far below novice human players. 📹 Watch our live runs: https://t.co/wrhKJD9JVx 🌐 project page: https://t.co/J906LQ6Sfj 💻 github: https://t.co/W1vL99MDg5 Work done with @OuyyyangMingyu @who_s_yuan Hwee Tou Ng, @MikeShou1
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents. Paper: https://t.co/IfbTgfNnSM https://t.co/gL3BURxzkV
Hermes hands-on guide: concepts, core mechanisms, and real-world use cases. https://t.co/m6bTIbPOck
What if virtual humans could see, think, and act in 3D worlds like us?! We present Visually-Grounded Humanoid Agents 🎉 Our agents 👀perceive via RGB-D vision, 🧠plan with context-aware reasoning, and 🏃act with full-body motion in 3D scenes. Check 🔗 https://t.co/kd0zCu7W2h https://t.co/VQZnjE2Pnx