Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score
Dickson Neoh 🚀
@dicksonneoh7
📅 Thu Apr 10
🆔 40036055
⭐ 0.68

DEIM is an advanced training framework for object detection with DETR. When applied to recent DETR based models, it results in faster convergence. The image below shows D-FINE and RT-DETR. I made a python package to let you train your own DEIM model using D-FINE models. 🧵👇 https://t.co/q8AfDA1zOU

❤️ 3 likes
🔁 2 retweets

Lior⚡
@LiorOnAI
📅 Thu Apr 10
🆔 06721905
⭐ 0.83

A 256M open-source vision LM for complete document OCR just beat models 27× bigger. SmolDocling converts full documents into structured metadata using <500MB VRAM on consumer GPUs. https://t.co/DzGnIKUJFP

❤️ 834 likes
🔁 137 retweets

Paul Gauthier
@paulgauthier
📅 Sun
🆔 79476843
⭐ 0.73

Llama 4 Maverick scored 16% on the aider polyglot coding benchmark. https://t.co/mBVaUPGHPl https://t.co/FT14gbbG1K

❤️ 902 likes
🔁 78 retweets

Ethan Mollick
@emollick
📅 Thu Apr 17
🆔 31039635

This keeps coming up, but, just in case you were wondering, being polite seems to have no effect on answer quality in aggregate. It greatly increases the quality of particular answers while greatly lowering the quality of others, and it is not possible to know which in advance. https://t.co/vNtxzSAIWj

❤️ 443 likes
🔁 42 retweets

Artidoro Pagnoni
@ArtidoroPagnoni
📅 Dec 13, 2024 (506d ago)
🆔 41981804

🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 https://t.co/5QGrlJdK0y Code 🛠️ https://t.co/jCdDI5BXwe https://t.co/7XyZdcXWoR

❤️ 562 likes
🔁 113 retweets

Jim Fan
@DrJimFan
📅 Dec 13, 2024 (506d ago)
🆔 31321287
⭐ 0.81

The level of insight that Ilya has about compute in *2017* is just insane. And so well-articulated in plain words. https://t.co/rWddko44Ev

❤️ 1,176 likes
🔁 144 retweets

Ethan Mollick
@emollick
📅 Aug 23, 2024 (618d ago)
🆔 86411643

I propose the Encounter Test as a nerdy benchmark standard for AI. Ask an AI to simulate an encounter between two D&D creatures & see how long it takes to mess up. Drow vs. mind flayer: GPT-4o does best, Gemini is cute. Outcomes similar (I am sure better prompting would help) https://t.co/bZFOpSBW3r

❤️ 122 likes
🔁 12 retweets

Maxime Labonne
@maximelabonne
📅 Dec 13, 2024 (506d ago)
🆔 43183909

Phi-4's principles for generating synthetic data remind me of something... 👀 It's a cool paper, I'm glad they released more stuff this time. https://t.co/weOllEVkRw

❤️ 106 likes
🔁 9 retweets

Nicolay Gerold
@nicolaygerold
📅 Dec 14, 2024 (505d ago)
🆔 01927304

Inequality joins in polars is massive. https://t.co/zRfIWUF6MA

❤️ 2 likes
🔁 1 retweets

Jeremy Howard
@jeremyphoward
📅 Dec 14, 2024 (505d ago)
🆔 39031670

I'm glad that ChatGPT now has a feature called "Projects" which lets you organise chats, add files, and set custom instructions. But… maybe a little credit might have been nice for Claude "Projects", which lets you organise chats, add files, and set custom instructions? https://t.co/tgZ4cg90rL

@OpenAI •

Day 7: Projects in ChatGPT - a new way to organize and customize your chats. https://t.co/Dt7wzatS6l

❤️ 819 likes
🔁 29 retweets

Zeyuan Allen-Zhu | I'm NOT @ NeurIPS 2024
@ZeyuanAllenZhu
📅 Dec 14, 2024 (505d ago)
🆔 46638720

(1/3) Let me give Rosalind Picard a lesson on what real values I learned at Tsinghua Physics. In our notorious experimental physics course, every original data point must be preserved, and every analysis, even a simple linear regression, must trace back to handwritten numbers. https://t.co/rLgCWYEM6R

❤️ 1,078 likes
🔁 84 retweets

elvis
@omarsar0
📅 Dec 13, 2024 (506d ago)
🆔 29635574

Phi-4 Technical Report: Microsoft presents phi-4, a 14B model that surpasses its teacher model on STEM-QA capabilities. It also reports strong performance on reasoning-focused benchmarks due to improved data, training curriculum, and innovations in the post-training scheme.

❤️ 127 likes
🔁 25 retweets

elvis
@omarsar0
📅 Dec 12, 2024 (507d ago)
🆔 87372555

AutoReason Improves Multi-step Reasoning: Proposes a method to automatically generate rationales for queries using CoT prompting. This transforms zero-shot queries into few-shot reasoning traces which are used as CoT exemplars by the LLM. Claims to improve reasoning in weaker LLMs.

❤️ 276 likes
🔁 47 retweets

MrNeRF
@janusch_patas
📅 Dec 13, 2024 (506d ago)
🆔 72683921

Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos Contributions: 1) A framework for obtaining real-world, dynamic, and pseudo-metric 4D reconstructions and camera poses at scale from existing online video. 2) DynaDUSt3R, a method that takes a pair of frames from any real-world video and predicts a pair of 3D point clouds along with the corresponding 3D motion trajectories that connect them in time.

❤️ 318 likes
🔁 34 retweets

Teknium (e/λ)
@Teknium1
📅 Dec 13, 2024 (506d ago)
🆔 83460760

Sora was the first video editor I tried, and I.. think we have a really really long way to go on vidgen https://t.co/mvsvTFaH1h

❤️ 115 likes
🔁 3 retweets

Dylan
@dylanjcastillo
📅 Dec 12, 2024 (507d ago)
🆔 59155750

Structured outputs can decrease LLM's performance in some tasks. I replicated @willkurt / @dottxtai rebuttal of Let Me Speak Freely? (LMSF) using gpt-4o-mini. The rebuttal correctly highlights many flaws with the original study, but ironically, LMSF's conclusion still holds. https://t.co/m3UTIsziA2

❤️ 29 likes
🔁 2 retweets

LlamaIndex 🦙
@llama_index
📅 Dec 12, 2024 (507d ago)
🆔 74291401

Build RAG agents that respect your SharePoint permissions structure! We have a lot of customers who use the @Azure stack to connect to their enterprise data sources like SharePoint, and a frequent feature request was the ability to have the application respect permissions from SharePoint when answering questions about documents. This is now built-in to LlamaCloud! Learn more about this feature here: https://t.co/33ZxY1Fy99

❤️ 46 likes
🔁 11 retweets

Han
@HanchungLee
📅 Dec 12, 2024 (507d ago)
🆔 95801086

we couldnt define agent 30+ years ago and we wont be able to define agent today. it's going to be one of the fluff words that encompasses everything and nothing. https://t.co/3nd9NCOcNn

❤️ 29 likes
🔁 9 retweets

LlamaIndex 🦙
@llama_index
📅 Dec 12, 2024 (507d ago)
🆔 59437611

Google launched its latest Gemini 2.0 models and we had day-0 support (but forgot to post about it! 😂) You can get it today: pip install llama-index-llms-gemini or pip install llama-index-llms-vertex @simonw says the vibes are good: https://t.co/PpFmVF7zct Learn more here: https://t.co/wrlU78sA6z

❤️ 26 likes
🔁 5 retweets

Alex Immerman
@aleximm
📅 Dec 12, 2024 (507d ago)
🆔 71082356

Waymo's market share is now equal to Lyft within SF. Incredible. Network effects is one of the best sources of defensibility. But it's proven to be not that important in ridesharing. You need a minimum network size, but once you have that, there are diminishing returns. In each geo, Uber and Lyft need enough drivers to have reasonable wait times. Once wait times hit that acceptable threshold, the incremental driver doesn't improve the rider experience (eg if my Uber ride is coming in 2-4 minutes, I don't really care about the wait times getting faster). When Waymo launched in August 2023, Uber and Lyft were at 66% and 34% share in SF. 15 months later in November 2024, Waymo is at 22% - the same as Lyft - with Uber at 55%. Both Uber and Lyft lost low double digit % pts of market share, but it's more painful for Lyft. Lyft gave up ~1/3 of their share. Uber lost ~1/6. This is just when comparing all rides with pickups and dropoffs inside Waymo’s SF operating boundary (ie excludes any ride to / from the airport). Anecdotally, Waymo's wait times are longer than Uber and Lyft because they don't have enough cars on the road. But they are close enough to that acceptable threshold, that their superior product (clean, nice cars, quiet drivers, etc) tips the riders in their direction. It's possible when Waymo puts more cars on the road and reduces wait times to be in line with Uber and Lyft, their share could climb even faster.

❤️ 3,602 likes
🔁 520 retweets

Andy Konwinski
@andykonwinski
📅 Dec 12, 2024 (508d ago)
🆔 03385674

I'll give $1M to the first open source AI that gets 90% on this sweet new contamination-free version of SWE-bench - https://t.co/o7LuYKIfhO https://t.co/tOzYmxrH0E

❤️ 630 likes
🔁 117 retweets

Ethan Mollick
@emollick
📅 Dec 12, 2024 (507d ago)
🆔 43004404
⭐ 0.95

Here I am browsing the Steam store with the new Gemini 2.0 Flash. No special tools, just using screen sharing. Shopping is likely to change a lot (especially when Gemini can actively control my computer in the next couple of months). Not 100% yet, but pretty darn impressive. https://t.co/C2xAk1gawi

❤️ 227 likes
🔁 23 retweets

Ethan Mollick
@emollick
📅 Dec 13, 2024 (506d ago)
🆔 87316255

Don't worry everyone, I have figured out a way to delay any potential AI takeover indefinitely. If you know you know. (Turn sound on) https://t.co/ZFbOGwPr46

❤️ 429 likes
🔁 28 retweets

Ethan Mollick
@emollick
📅 Dec 13, 2024 (506d ago)
🆔 44172847

The initial invention of chess computers boosted the ability of expert human players as they learned to play better by learning from machines. But, contrary to expectations, the superhuman chess AIs of the deep learning era, like Stockfish, have not had the same positive impact. https://t.co/RegiWcmCtT

❤️ 283 likes
🔁 29 retweets

LlamaIndex 🦙
@llama_index
📅 Dec 13, 2024 (506d ago)
🆔 29695928

Parse only exactly what you need with LlamaParse parsing instructions. A powerful feature of LlamaParse is parsing instructions. These allow you to give natural-language instructions to the parser, allowing it to transform a naïve parsing of every word in the document to a context-aware version that reflects only the information you care about. It can handle unusual reading orders, complex tables and images, and more. In this video, @ravithejads demonstrates how the feature performs in real-world use cases: https://t.co/0EIFpLXOOJ

❤️ 22 likes
🔁 1 retweets

AK
@_akhaliq
📅 Dec 13, 2024 (506d ago)
🆔 66210429

Microsoft releases Phi-4 a 14-billion parameter model which performs at par with GPT-4o-mini and recently released Llama-3.3-70B https://t.co/BMFARL1Lc7

❤️ 491 likes
🔁 67 retweets

Ethan Mollick
@emollick
📅 Dec 13, 2024 (506d ago)
🆔 60490873

A lot of interesting ideas in the Sora interface, especially for longer videos. The ability to write a storyboard, and edit pieces of that with further prompts, is promising. This is a single "take" for an ad for fake Art Deco inspired shoes. Some weirdness, but also kinda neat. https://t.co/mGBXjILJB7

❤️ 151 likes
🔁 15 retweets

Philipp Schmid
@_philschmid
📅 Dec 10, 2024 (509d ago)
🆔 17853316
⭐ 0.86

What is better than an LLM as a Judge? Right, an Agent as a Judge! @AIatMeta created an Agent-as-a-Judge to evaluate code agents to enable intermediate feedback alongside DevAI a new benchmark of 55 realistic development tasks. The Agent-as-a-Judge is a graph-based agent with tools to locate, read, retrieve, and evaluate files and information for a code project to evaluate the results of other agents by comparing its judgments to human evaluations (alignment rate, judge shift). Insights 🛠️ Agent cuts down costs to ~2.29% of human evaluation and time to ~2.36%. 💰 Agent costs $30.58 vs $1,297.50 for human evaluation ⚡ Reduced time to 118.43 minutes vs 86.5 hours 🧑‍⚖️ LLM-as-a-Judge achieved a 60-70% alignment rate to humans 🥇 Agent-as-a-Judge achieves a 90% alignment rate to humans

❤️ 266 likes
🔁 54 retweets

Tanishq Mathew Abraham, Ph.D.
@iScienceLuvr
📅 Dec 13, 2024 (506d ago)
🆔 16696265

Video input + Santa Mode! + Apple Intelligence, Canvas updates, Sora, reinforcement finetuning, and full o1/o1 pro... On the seventh day of Christmas my true love gave to me... (we're halfway thru!) https://t.co/wBhpD9dUDj

@iScienceLuvr •

Apple Intelligence + Canvas updates, Sora, reinforcement finetuning, and full o1/o1 pro... On the sixth day of Christmas my true love gave to me... https://t.co/ulxxjIwRz8

❤️ 17 likes

Ethan Mollick
@emollick
📅 Dec 11, 2024 (508d ago)
🆔 63739434

The new Deep Research feature from Google feels like one of the most appropriately "Google-y" uses of AI to date, and is quite impressive. I've had access for a bit and it does very good initial reports on almost any topic. The paywalls around academic sources puts some limits. https://t.co/dwSqr6aKGZ

❤️ 1,166 likes
🔁 153 retweets

LlamaIndex 🦙
@llama_index
📅 Dec 11, 2024 (508d ago)
🆔 50521183

Extract and interpret SVG charts from PDFs and other complex document formats with LlamaParse! When building a RAG application the most interesting data is often locked away in charts and diagrams. LlamaParse has the ability to extract and interpret these charts, converting them into Markdown and Mermaid representations. Check out the demo video from @ravithejads here: https://t.co/9FkjnKxCt1 Learn more about LlamaParse: https://t.co/p5fnaPn8EE

❤️ 48 likes
🔁 20 retweets

Alex Strick van Linschoten
@strickvl
📅 Dec 11, 2024 (508d ago)
🆔 76086833

📊 After analyzing real production LLMOps data, here's what actually works for prompt engineering: structured prompts for reliability, systematic versioning for scale, and retrieval-augmented generation for efficiency. No theory - just battle-tested approaches. https://t.co/T6BsNoVL8R

❤️ 11 likes
🔁 2 retweets