Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score
Dr. Dominic Ng
@DrDominicNg
📅 Mon
🆔 29475648

Microsoft claims their new AI framework diagnoses 4x better than doctors. I'm a medical doctor and I actually read the paper. Here's my perspective on why this is both impressive AND misleading ... 🧵 https://t.co/1FVkmuaCfl

❤️ 8,879 likes
🔁 1,255 retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅 Tue Jul 01
🆔 26550711

Use all of your LlamaCloud tools within powerful agentic applications! We've open-sourced the LlamaCloud MCP server that connects your LlamaCloud project directly to MCP clients like @AnthropicAI Claude Desktop, giving you instant access to your private data and LlamaExtract… https://t.co/K4Y9kAAFQF

❤️ 35 likes
🔁 5 retweets
🖼️ Media
Ethan Mollick
@emollick
📅 Tue Jul 01
🆔 39779962

What happens if you put a full scientific paper into AI and ask it to find known errors in proofs, tables, etc? Every model before o3 fails completely, o3 gets 21% (its better at proofs, worse at tables & figures). Progress & perhaps a second opinion, not yet autonomous science. https://t.co/QHJ1TrRGLi

❤️ 483 likes
🔁 49 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Tue Jul 01
🆔 30337707

What's the difference between guardrails and evaluators? Part 1 of 4: Guardrails are inline safety checks https://t.co/p0GHmPqwAo

❤️ 22 likes
🔁 3 retweets
🖼️ Media
jason liu
@jxnlco
📅 Thu Jun 12
🆔 59939336

if you're building rag, avoid these antipatterns: link to the talk in the thread https://t.co/4B7xsnUPfk

❤️ 749 likes
🔁 76 retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅 Tue Jul 01
🆔 81043035

New feature! LlamaExtract can now automatically generate a schema from a document and/or a prompt! A point of friction in trying out LlamaExtract is the need to build up a schema first, now we can take care of this step for you! Just provide a document and describe what… https://t.co/q8HiP1PeAm

❤️ 23 likes
🔁 1 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Mon
🆔 63741402

The eval FAQs have been quite popular, so @sh_reya and I have made them available as a pdf 📓 Bonus: there are quite a few in here that I haven't tweeted about yet! https://t.co/zYV85olOeM https://t.co/O9z8tfccPx

❤️ 342 likes
🔁 42 retweets
🖼️ Media
Ethan Mollick
@emollick
📅 Tue Jul 01
🆔 05507505

"Claude, make this argument for buying a laptop bulletproof, literally." "no, i said literally" "even more literally" "MOAR LITERALLY" https://t.co/lnmA3cfGxn

❤️ 71 likes
🔁 2 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Sat
🆔 23427389

"Most roadmaps, especially from a traditional software mindset, demand an 'AI feature' by a fixed date, before a single experiment is done. That doesn't work for AI." - @BEBischof https://t.co/ts3nIqBn0V

❤️ 30 likes
🔁 8 retweets
🖼️ Media
jason liu
@jxnlco
📅 Tue Jul 01
🆔 23794434

opus 4 as siri confirmed thanks apple-mcp cc @DhravyaShah https://t.co/9CEBrE6PYo

❤️ 26 likes
🔁 1 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Tue Jul 01
🆔 83208456

I'm excited to announce @jeremyphoward & @johnowhitaker as guest speakers in our Evals course: https://t.co/dR23WB2cAl They'll showcase SolveIt, a VERY unique approach to software dev w/ AI. SolveIt borrows many ideas from notebooks: which is also our fav tool for evals 😍 https://t.co/qhfPLv4aOp

❤️ 194 likes
🔁 16 retweets
🖼️ Media
Ethan Mollick
@emollick
📅 Tue Jul 01
🆔 76729380

Strange cities. (I find working with Midjourney video to be really interesting, the ability to develop weird styles especially) https://t.co/Gmhhxy1y3q

❤️ 531 likes
🔁 28 retweets
🖼️ Media
Ethan Mollick
@emollick
📅 Tue Jul 01
🆔 30752031

A challenge with AI adoption is that organizations are not built to a Grand Plan where AI can just be slotted in, but rather socially constructed, random & in flux Here's an anecdote from a paper on how a process re-engineering effort led to revelations that drove people insane. https://t.co/suJX9gPfZP

❤️ 458 likes
🔁 52 retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅 Mon
🆔 98788010

🚀 Want to build real-world RAG (Retrieval-Augmented Generation) apps? We’ve got you covered with a full guide on how to go from raw data to fully-fledged pipelines. Our OSS engineer @itsclelia teamed up with @krotenWanderung from @qdrant_engine to break down the entire RAG… https://t.co/Zpxcpsuk7t

❤️ 46 likes
🔁 8 retweets
🖼️ Media
jason liu
@jxnlco
📅 Mon
🆔 85284191

top 10 follows on the timeline https://t.co/xOn4ffv088

❤️ 19 likes
🖼️ Media
jason liu
@jxnlco
📅 Mon
🆔 17377181

her: "i asked my friend about you, she doesn't know you" me: "where do they work?" her: "ibm" me: https://t.co/rs83hDoKNc

❤️ 28 likes
🔁 2 retweets
🖼️ Media
Dan Mac
@daniel_mac8
📅 Mon
🆔 09726110

🔥 Google announced: Gemini Embeddings a cost effective embeddings model at $0.15/mil tokens few things you can implement with it: >> semantic search >> doc classification and tagging >> recsys don't sleep on the embeddings models y'all https://t.co/3uv4rCzNEf

❤️ 186 likes
🔁 11 retweets
🖼️ Media
William Berrios
@w33lliam
📅 Mon
🆔 67372636

Tired of seeing O3 hallucinate? 😵‍💫 Today, I am excited to share how we built the least hallucinatory LLM in the 🌍 Our GLMv2, developed at @ContextualAI, just claimed 1st place 🥇 on the FACTS Grounded leaderboard by Google DeepMind - outperforming Gemini-2.5-pro, Claude 4, and… https://t.co/2hoflvROjf

❤️ 278 likes
🔁 29 retweets
🖼️ Media
Dan Alistarh
@DAlistarh
📅 Mon
🆔 59417443

Announcing our early work on FP4 inference for LLMs! - QuTLASS: low-precision kernel support for Blackwell GPUs - FP-Quant: a flexible quantization harness for Llama/Qwen We reach 4x speedup vs BF16, with good accuracy through MXFP4 microscaling + fused Hadamard rotations. https://t.co/4WUwUSipRM

❤️ 194 likes
🔁 37 retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅 Mon
🆔 79674741

🚀 New example: Research Agent with @Google Gemini 2.5 Pro & LlamaIndex. Explore how to build a multi-agent research assistant powered by LlamaIndex workflows and Google's Gemini 2.5 Pro. 🔍 Search the web with google 📝 Take notes with a dedicated note-taker agent 🧾 Write a… https://t.co/BU807U1ecI

❤️ 90 likes
🔁 17 retweets
🖼️ Media
Chroma
@trychroma
📅 Mon
🆔 51708905

Introducing our latest technical report: Context Rot - How Increasing Input Tokens Impacts LLM Performance Our results reveal that models do not use their context uniformly. full report in replies https://t.co/9PINLM5Ltd

❤️ 886 likes
🔁 90 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Sun
🆔 69008514

I am having fun drafting this FAQ. https://t.co/CQmpdPku8M

❤️ 14 likes
🖼️ Media
Ethan Mollick
@emollick
📅 Sun
🆔 39382802

AI agents have brand “preferences” and are attracted to different kinds of ads (Operator is a fan of buying whatever Bing advertises, Claude has different preferences) There is likely going to be a lot of money spent trying to influence this in the nearish future. https://t.co/zKtKftbHq2

❤️ 388 likes
🔁 51 retweets
🖼️ Media
Isaac Flath
@isaac_flath
📅 Tue Jun 24
🆔 60354734

I just left my job to work on my own business. It’s been excellent working at AAI. They have a really amazing vision but I decided it was time to follow my own vision, passions, projects, and products. I am still collaborating with AAI on some stuff though :D https://t.co/C7GygRDqXc

❤️ 58 likes
🔁 8 retweets
🖼️ Media
Teknium (e/λ)
@Teknium1
📅 Mon
🆔 01448901

How does the completions model in @cursor_ai know this information? Even o3 doesn't know what deepseek-r1 is without search https://t.co/usbFhVXCon

❤️ 596 likes
🔁 10 retweets
🖼️ Media
Teknium (e/λ)
@Teknium1
📅 Mon
🆔 13019313

tfw you know a paper's going to be good https://t.co/AMOfwhO12n

❤️ 1,615 likes
🔁 137 retweets
🖼️ Media
jason liu
@jxnlco
📅 Mon
🆔 26347489

its so over https://t.co/zeXXLYpKrL

❤️ 13 likes
🖼️ Media
Ethan Mollick
@emollick
📅 Mon
🆔 45469490

The repeated argument that AI is not actually useful to real people needs to be retired based on the representative national surveys we now have on real AI users. Teachers using AI report 6 hour a week time savings. Workers using AI report 3x productivity gains on 1/5 of tasks. https://t.co/QeodnJqSIE

❤️ 832 likes
🔁 145 retweets
🖼️ Media
Tiezhen WANG
@Xianbao_QIAN
📅 Mon
🆔 86273954

A company who had publicly denounced open source models being "stupid tax" has just re-embraced open source by releasing a series of models from 424B to 0.3B featuring both Paddle and Transformers support. Where are your models on @huggingface ? cc @OpenAI @sama @grok @elonmusk https://t.co/EivLWivuRM

❤️ 146 likes
🔁 12 retweets
🖼️ Media
Hokin Deng
@DengHokin
📅 Sun
🆔 10058405

#ICML #cognition #GrowAI We spent 2 years carefully curated every single experiment (i.e. object permanence, A-not-B task, visual cliff task) in this dataset (total: 1503 classic experiments spanning 12 core cognitive concepts). We spent another year to get 230 MLLMs evaluated… https://t.co/1Cy3IM8wfi

❤️ 540 likes
🔁 77 retweets
🖼️ Media
Ethan Mollick
@emollick
📅 Mon
🆔 42680265

And with this, the US is mostly out of the frontier open source large LLM race. Europe has one contender, otherwise it is all China now. (OpenAI is going to release an open LLM soon, but no commitment yet to that being an ongoing effort). https://t.co/vevzjEczxY

❤️ 198 likes
🔁 30 retweets
🖼️ Media
Hamel Husain
@HamelHusain
📅 Mon
🆔 13608543

If I could only give people one tool for LLM Evals, it would be error analysis. Nothing else comes close This is what Look At Your Data ™️ means Links in reply https://t.co/yPkgDTxzRI

❤️ 422 likes
🔁 38 retweets
🖼️ Media