New controlled study of AI voice mode on mental health finds complex results: GPT-4 with an empathetic voice reduced loneliness, but heavier users, especially when heavy use was combined with a more neutral voice or with particular personality traits, saw multiple negative impacts https://t.co/DZVCW43bb8
reliable keyword search is a basic expectation https://t.co/LvUDvWZQsH
Had sora generate "Gradient Descent" https://t.co/k9vefnqczC
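For readers who haven't met the term the video visualizes: gradient descent repeatedly steps against a function's gradient to find a minimum. A minimal, dependency-free sketch (the function, step size, and step count are illustrative choices, not from the clip):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to minimize a function."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Starting from 0.0, the iterate converges toward the true minimum at x = 3.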
Gemini can now execute code in a Canvas, a feature only Claude and ChatGPT have had so far. It runs only Gemini 2.0 Flash, which makes it fast (this is a real-time response to "create a reverse moon-landing game") but also limited compared to bigger models like Sonnet 3.7 or GPT-4.5 https://t.co/2hjBGHUz2h
cursor rules to contribute to instructor, @cursor_ai will build the PR, just ask it to https://t.co/t08k4gvguY
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning "We introduce CURIE, a scientific long-Context Understanding, Reasoning and Information Extraction benchmark to measure the potential of Large Language Models (LLMs) in scientific… https://t.co/ErIGSJNpEv
@MistralAI The inclusion of a base model is huge for Mistral Small 3.1. Can't wait to see what @NousResearch, @cognitivecompai and others do with it. Brings back the good ol' Mistral 7B memories. https://t.co/SayuqY53SX
Deep Learning is Not So Mysterious or Different What do you think? https://t.co/UIXeEr9Jd4
I find playing with Gemini multimodal image generation to be really fun. Took a pic: "turn the bottles into a Saturn V complete with tiny ground crew. Add a neon sign to the cups saying 'moon' with an up arrow" "Make the rocket out of Legos. Make the crew ducklings on stilts" https://t.co/oY6BM49hAF
Introducing Stable Virtual Camera: This multi-view diffusion model transforms 2D images into immersive 3D videos with realistic depth and perspective, without complex reconstruction or scene-specific optimization. https://t.co/pHPkYhaKH3
Introducing YouTube video 🔥 link support in Google AI Studio and the Gemini API. You can now pass in a YouTube video directly, with just a link, and the model can use its native video understanding capabilities on it! 📢 https://t.co/4jeNVmWtgx
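Passing a YouTube link amounts to adding a file part alongside the text prompt in a generateContent request. A sketch of the JSON request body, with field names based on my reading of the public REST schema (the video URL and prompt below are placeholders):

```python
import json

def youtube_request_body(video_url: str, prompt: str) -> str:
    """Build a generateContent-style request body that passes a YouTube
    link directly as a fileData part alongside a text prompt."""
    body = {
        "contents": [{
            "parts": [
                {"fileData": {"fileUri": video_url}},
                {"text": prompt},
            ]
        }]
    }
    return json.dumps(body)

payload = youtube_request_body(
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "Summarize this video in three bullet points.",
)
```

The same shape is exposed in the official SDKs as a file-URI part, so no upload step is needed for YouTube links.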
https://t.co/FRSMOEa0NZ
I regret to announce that the meme Turing Test has been passed. LLMs produce funnier memes than the average human, as judged by humans. Humans working with AI get no boost (a finding that is coming up often in AI-creativity work) The best human memers still beat AI, however. https://t.co/O0sl5GRQNd
.@sh_reya's paper confirms what I see in practice: 1) Automated evals don't work (without semi-manual human alignment) 2) Most tools don't provide this alignment 3) Automated evals mostly add noise 4) You can only write good evals by looking at data and reacting to failures https://t.co/BxhHEDVKuV
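Point 4, writing evals by reacting to observed failures, can be sketched in plain Python: each failure you see in the logs becomes a small regression check (the failure case and checker below are invented for illustration):

```python
def make_eval(failure_input, checker):
    """Turn one observed failure into a regression eval: the checker
    encodes what a correct output must satisfy for that input."""
    def run(model):
        return checker(model(failure_input))
    return run

# Failure seen in the logs: model returned prose instead of JSON.
evals = [
    make_eval('Return {"ok": true} as JSON',
              lambda out: out.strip().startswith("{")),
]

def score(model):
    """Fraction of regression evals the model passes."""
    return sum(e(model) for e in evals) / len(evals)
```

A model (here, any callable from prompt to output) that returns valid JSON passes; one that answers in prose fails, which is exactly the failure the eval was written from.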
AI agents are taking over advertising. NEX's new AI model, Marko, can now train on your product and generate unlimited commercial-quality images and videos for you, and you can edit them by chatting with it. This brings AI product ads to a new level. 10 examples: https://t.co/qVJDAOKF8j
A new YouTube video by Welch Labs that gives an awesome walk-through of how multi-latent attention works (one of the innovations of DeepSeek)! Check it out! https://t.co/GDAtwWgxmF
We present Thera🔥: The new SOTA arbitrary-scale super-resolution method with built-in anti-aliasing. Our approach introduces Neural Heat Fields, which guarantee exact Gaussian filtering at any scale, enabling continuous image reconstruction without extra computational cost. https://t.co/Z9luxvjcSQ
Figured out a way to generate entire walkthrough videos from arbitrary text. Shit goes very, very wild. Now I just need to learn enough CSS and Slidev markup to make this pretty. https://t.co/WE5y7ZKYuC
Nvidia presents: FFN Fusion: Rethinking Sequential Computation in Large Language Models 1.71x speedup in inference latency and 35x lower per-token cost while maintaining strong performance across benchmarks https://t.co/ineyMFUTCV
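The paper's insight is that many sequential FFN blocks can be run in parallel with little accuracy loss; once they act on the same input, they fuse into one wider FFN exactly, which is where the latency win comes from. A toy numpy check of that exact fusion identity (gate-free ReLU FFNs with invented shapes, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 16  # model dim, hidden dim per FFN

def ffn(x, W_in, W_out):
    """A single gate-free FFN block: relu(x @ W_in) @ W_out."""
    return np.maximum(x @ W_in, 0.0) @ W_out

W1_in, W1_out = rng.normal(size=(d, h)), rng.normal(size=(h, d))
W2_in, W2_out = rng.normal(size=(d, h)), rng.normal(size=(h, d))
x = rng.normal(size=(4, d))

# Two FFNs evaluated in parallel on the same input...
parallel_sum = ffn(x, W1_in, W1_out) + ffn(x, W2_in, W2_out)

# ...equal one fused FFN with concatenated weights: a single
# wider matmul instead of two sequential narrow ones.
fused = ffn(x,
            np.concatenate([W1_in, W2_in], axis=1),
            np.concatenate([W1_out, W2_out], axis=0))
```

The ReLU applies column-wise, so concatenating hidden units changes nothing about each FFN's contribution; the fused output matches the sum to floating-point precision.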
// Chain-of-Tools // This new paper presents Chain-of-Tools (CoTools), a new method to enable LLMs to incorporate expansive external toolsets, including tools never seen during training, while preserving CoT (chain-of-thought) reasoning. Highlights: • Frozen LLM with… https://t.co/rygfJXLOjl
it's so over. DeepSeek V3-0324 just dropped, and it created this website in one shot: it wrote 800+ lines of code without breaking even once. This is free, open-source, super fast. It's great to see how these open-source models are creating pressure on big tech to build… https://t.co/YMGdMEhaol
‼️Sentence Transformers v4.0 is out! You can now train and finetune reranker (aka cross-encoder) models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think. Details in 🧵 https://t.co/JECcuKhYFa
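What a cross-encoder reranker does can be sketched without the library: score each (query, document) pair jointly, then sort by score. The toy term-overlap scorer below stands in for a trained Sentence Transformers cross-encoder, which would produce far better scores:

```python
def rerank(query, docs, score_fn, top_k=3):
    """Cross-encoder style reranking: score each (query, doc) pair
    jointly, then return the top_k documents, highest score first."""
    scored = [(score_fn(query, d), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]

def overlap(query, doc):
    """Toy stand-in scorer: count shared lowercase words."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

docs = ["trains run on rails",
        "finetuning rerankers helps",
        "the sky is blue"]
top = rerank("does finetuning help rerankers", docs, overlap, top_k=1)
```

The key design point, unlike a bi-encoder, is that the scorer sees query and document together, which is what makes rerankers accurate and also why they are used to re-order a short candidate list rather than search a whole corpus.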
My "otter on a plane using wifi" benchmark has now been saturated by ChatGPT 4's new image generator: "an otter on an airplane using wifi, on their laptop screen is image generation software creating an image of an otter on a plane using wifi," first try https://t.co/DyDV4pOequ
New short course: Vibe Coding 101 with Replit! Learn to build and host applications with an AI agent in this course, built in partnership with @Replit and taught by its President @pirroh and Head of Developer Relations @mattppal. Coding agents are changing how we write code… https://t.co/jMQHnZjAre
OpenAI Agents SDK + MCP works so well! I like where OpenAI is taking this framework. Easy to use but extremely flexible and powerful to build complex agentic systems. Watch my quick demo here: https://t.co/RFhdtt2mGq
State-of-the-art code retrieval is open source again! @nomic_ai has released an Apache 2.0-licensed, truly open (weights, data, code) 7B embedding model. Details in 🧵 https://t.co/lOhx1TJkoj
UNCANNY VALLEY: PERPLEXITY VS GOOGLE: THE WAR FOR THE FUTURE OF SEARCH BEGINS Special Guest: @AravSrinivas Host: @OperationDanish Search is dead. Agents are rising. Phones are getting smarter, without Apple or Google. Aravind Srinivas, CEO of @perplexity_ai, lays out the… https://t.co/04gzH2LQwf
Once again, the vision gang (Kaiming and friends, not us) was way ahead of the LLM gang. Protip for LLM paper ideas: go through 2017-2022 vision papers and translate ideas to LLMs. And this is not a bad thing :) https://t.co/MNnCgEW4iz
Sure, you could use an annotation tool to create bounding boxes of objects in images for you... or you can ask a multimodal AI to do it freehand. https://t.co/5ZkOxIcGga
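When a multimodal model returns boxes, they typically arrive in a normalized coordinate space rather than pixels; for example, Gemini-style responses use [ymin, xmin, ymax, xmax] scaled to 0..1000. Converting to pixel coordinates is a small helper (the convention, image size, and box values below are illustrative assumptions):

```python
def to_pixels(box, width, height, scale=1000):
    """Convert a model-returned [ymin, xmin, ymax, xmax] box,
    normalized to 0..scale, into (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = box
    return (round(xmin / scale * width), round(ymin / scale * height),
            round(xmax / scale * width), round(ymax / scale * height))

# A box covering the center half of a 640x480 image.
pixel_box = to_pixels([250, 250, 750, 750], width=640, height=480)
```

The result can be drawn directly with any imaging library; the only thing to verify against your model's docs is the axis order and the scale.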
Our extraction agent in LlamaExtract has gotten a huge upgrade in capabilities 🔥: it can read a complex, multimodal document and extract into a Pydantic schema with 3+ layers of nesting. A sneak peek at what's possible: I fed in an equity research report which is completely… https://t.co/QSs1YOq2EU
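A "schema with 3+ layers of nesting" looks like the sketch below, where each layer is a typed model containing a list of the next. Stdlib dataclasses stand in for Pydantic here to keep the sketch dependency-free, and every field name is invented for illustration:

```python
from dataclasses import dataclass, field

# Three layers of nesting: Report -> Section -> Metric.
@dataclass
class Metric:
    name: str
    value: float
    unit: str

@dataclass
class Section:
    title: str
    metrics: list[Metric] = field(default_factory=list)

@dataclass
class Report:
    company: str
    sections: list[Section] = field(default_factory=list)

report = Report("ACME Corp", [
    Section("Revenue", [Metric("Q1 revenue", 1.2, "B USD")]),
])
```

With Pydantic the classes would subclass BaseModel instead, which adds the validation the extraction agent relies on when it fills the schema from a document.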
Wow. https://t.co/d9ES7mCUol
Build your own AI agent frontend that can connect to the thousands of MCP servers popping up! With @llama_index you can easily build your own multi-agent workflow and connect to any MCP server as a tool. Use our higher-level agent workflow abstractions, or build your own… https://t.co/CH6Tna5U01