Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score
Ethan Mollick
@emollick
📅
Oct 18, 2024
565d ago
🆔19279982

O1 is a clear preview of the problem of advanced AI. It does stuff that looks really good (this is from a new OpenAI demo) & often is really good, but it is hard to evaluate it unless you are already an expert, as any errors are so subtle or complex that only experts see them.

Media 1, Media 2
❤️1,584
likes
🔁136
retweets
🖼️ Media
elvis
@omarsar0
📅
Oct 18, 2024
565d ago
🆔25094081
⭐1.00

LLMs Can Learn About Themselves by Introspection
This paper reports that LLMs can acquire knowledge through introspection that cannot be inferred from their training data. "Our findings challenge the view that LLMs merely imitate their training data and suggest they have privileged access to information about themselves."
They also report that this introspection ability is limited and models struggle to predict their behavior on tasks requiring reasoning over long outputs.
This is exciting and interesting because these introspection capabilities can lead to more interpretable and controllable LLMs.

Media 1
❤️664
likes
🔁133
retweets
🖼️ Media
Naomi Saphra 🧈🪰
@nsaphra
📅
Oct 18, 2024
565d ago
🆔89953503
⭐0.86

What makes some LM interpretability research “mechanistic”? In our new position paper in @BlackboxNLP, @sarahwiegreffe and I argue that the practical distinction was never technical, but a historical artifact that we should be, and are, moving past to bridge communities. https://t.co/7N1ETIG3Bp

Media 1
❤️336
likes
🔁57
retweets
🖼️ Media
John Burn-Murdoch
@jburnmurdoch
📅
Oct 18, 2024
565d ago
🆔98522321
⭐0.81

The recent political polarisation of Silicon Valley is really striking. 25 years ago most big tech and VC execs were moderates. Then the whole sector shifted gradually leftwards up until 2020, and now suddenly we have a sharp divide into Democrat-backers and Trump backers. https://t.co/WdG3VJYaGt

Media 1
❤️4,752
likes
🔁1,170
retweets
🖼️ Media
James O'Leary
@jpohhhh
📅
May 29, 2024
708d ago
🆔41757975
⭐0.61

new trick: temperature 2.0 / top p 90%
left is boring temp 1 / top p 90%
prompt: write a whacky, dark, hunter thompson esque star wars story, 3 paras https://t.co/5pS0an2dMT
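
For readers unfamiliar with the two knobs in this post, here is a minimal sketch of how temperature and top-p (nucleus) filtering interact on a toy distribution. The logits and the function names (`apply_temperature`, `top_p_filter`, `sample`) are hypothetical, and the order of operations (temperature first, then top-p) is an assumption; real inference stacks may apply them differently.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (higher = flatter)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

def sample(logits, temperature=1.0, top_p=0.9, rng=random):
    """Draw one token index after temperature scaling and top-p filtering."""
    probs = top_p_filter(apply_temperature(logits, temperature), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical logits for a four-token vocabulary.
toy_logits = [4.0, 2.0, 1.0, 0.0]
for t in (1.0, 2.0):
    kept = sum(p > 0 for p in top_p_filter(apply_temperature(toy_logits, t), 0.9))
    print(f"temperature={t}: {kept} tokens survive top_p=0.9")
```

With these toy numbers, raising the temperature flattens the distribution, so more tokens fit inside the same 90% nucleus. That is one plausible reading of why the higher-temperature output looks less predictable while the top-p cut still trims the long tail.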

Media 1, Media 2
❤️72
likes
🔁5
retweets
🖼️ Media
Steven Brunton
@eigensteve
📅
Oct 16, 2024
567d ago
🆔65799017

New 20hr bootcamp on Probability & Statistics!!! Videos released weekly but full playlist already posted: https://t.co/3LnVCaeYGv Probability & Statistics are cornerstones of data science and machine learning. This course rapidly covers the basics and gets into advanced topics. https://t.co/92bBdrRhWs

❤️9,803
likes
🔁1,634
retweets
🖼️ Media
Aravind Srinivas
@AravSrinivas
📅
Oct 17, 2024
566d ago
🆔56583224
⭐0.71

Today, we're launching Perplexity for Internal Search: one tool to search over both the web and your team's files with multi-step reasoning and code execution. https://t.co/ftZGNgziBW

❤️3,401
likes
🔁240
retweets
🖼️ Media
Alex Covo
@alexcovo_eth
📅
Oct 17, 2024
566d ago
🆔31542462

Mind Blown! So easy to create your own podcast, uncensored, on your local computer, for free! Just generate a podcast script with your favorite LLM and generate a podcast in minutes.
1. Use LLM to generate your script - Free Open Source models and apps @LMStudioAI @jandotai @AnythingLLM using @ollama a/o @huggingface models - or paid LLMs from @AnthropicAI @OpenAI
2. Make sure to install https://t.co/rb6L8kDbUq from @cocktailpeanut and install F5-tts
3. Record a couple voice samples and upload to app. Make sure to emote if you want more of a dynamic voice. (I used a couple voice samples from @elevenlabsio) for this test
4. Paste the podcast script and you're good to go 👍
Will share full podcast later. Just excited to share! 😎👌

❤️581
likes
🔁69
retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅
Oct 17, 2024
566d ago
🆔73930631
⭐1.00

Build multi-tenant RAG applications easily with LlamaIndex and Nile! 🚀
Multi-tenancy -- the ability to index data from hundreds or thousands of users without leaking it between them -- is a very common question we get from users.
Nile have built a full-stack demo application called TaskGenius that uses AI to estimate the complexity of your to-do list items, and shows off how you can handle multiple users with totally separate document databases and embeddings.
Learn how to:
➡️ Isolate documents and embeddings for each tenant
➡️ Scale efficiently with virtual tenant databases
➡️ Implement multi-tenant RAG with just a few lines of code
Check out the blog post: https://t.co/Hl7ESkfXvb
And the full-stack TaskGenius demo here: https://t.co/S4r0DKUeZZ
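
The isolation idea in this post can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: `TenantIndex` is a hypothetical class, it does not use the LlamaIndex or Nile APIs, and word overlap stands in for real embedding similarity.

```python
from collections import defaultdict

class TenantIndex:
    """Toy per-tenant document store; each tenant only ever sees its own documents."""

    def __init__(self):
        # tenant_id -> list of documents owned by that tenant
        self._docs = defaultdict(list)

    def add(self, tenant_id, text):
        self._docs[tenant_id].append(text)

    def search(self, tenant_id, query, k=3):
        """Rank only this tenant's documents by word overlap with the query."""
        query_words = set(query.lower().split())
        docs = self._docs.get(tenant_id, [])
        ranked = sorted(
            docs,
            key=lambda d: len(query_words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:k]

index = TenantIndex()
index.add("acme", "quarterly budget for acme")
index.add("globex", "globex budget draft")
print(index.search("acme", "budget"))    # only acme's document is considered
print(index.search("globex", "budget"))  # only globex's document is considered
```

The design point is that the tenant id is part of every read and write path, so a query can never rank another tenant's documents. A production setup would enforce the same boundary at the database layer rather than in application code.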

Media 1
❤️133
likes
🔁36
retweets
🖼️ Media
Charles 🎉 Frye
@charles_irl
📅
Oct 17, 2024
566d ago
🆔31890808
⭐0.71

naturally my first instinct when picking up FastHTML, @jeremyphoward and @johnowhitaker's new full-stack web framework in Python, was to make a webpage that displays GPU stats from nvidia-smi https://t.co/Qap9Z0ls9R

❤️72
likes
🔁12
retweets
🖼️ Media
Ethan Mollick
@emollick
📅
Aug 21, 2022
1354d ago
🆔06568704
⭐1.00

Good middle managers matter a lot! Who the plant manager is in a car factory explains more of the productivity difference between plants than who the CEO of the company is. Replacing a bottom quartile manager with a top quartile one decreases hours needed to build a car by 30%. https://t.co/ecoD6jEqcv

❤️2,060
likes
🔁386
retweets
🖼️ Media
Ethan Mollick
@emollick
📅
Oct 18, 2024
566d ago
🆔77000668

Two potential explanations. https://t.co/OYYNkAF7lf

@7ohntitor •

Now that the dust has settled... What happened here https://t.co/PvGgbK77tg

Media 1, Media 2
❤️157
likes
🔁10
retweets
🖼️ Media
Teknium (e/λ)
@Teknium1
📅
Oct 18, 2024
566d ago
🆔59335662
⭐0.95

what did dis mean https://t.co/Fv8sWknVks

@willdepue •

if you're from an unconventional background and want to work on ai, consider applying to the OpenAI residency. you should be:
- pumped about building true ai
- not afraid of large complex codebases or hard infra problems
- excited to learn fast, dive deep
https://t.co/jqtscGnG6h

❤️94
likes
🔁1
retweets
🖼️ Media
jason liu
@jxnlco
📅
Oct 17, 2024
566d ago
🆔87646377

instructor with multimodal audio, extraction just works! no upgrade needed https://t.co/qtLaIP3Z3N

Media 1
❤️101
likes
🔁7
retweets
🖼️ Media
AK
@_akhaliq
📅
Oct 18, 2024
566d ago
🆔54487635

Google presents VidPanos Generative Panoramic Videos from Casual Panning Videos https://t.co/3zLUKNCc1l

❤️460
likes
🔁73
retweets
🖼️ Media
Christopher Manning
@chrmanning
📅
Oct 17, 2024
566d ago
🆔58838432

6 years into the LLM revolution, it’s still Day One in developing the many ways they can help the world. Here: cheap, accurate, automated but human-approved mapping and removal of racially restrictive covenants from all Santa Clara County property deeds. https://t.co/PGGKa0AePF https://t.co/4rYoQDk4lX

Media 1, Media 2
❤️375
likes
🔁60
retweets
🖼️ Media
Teknium (e/λ)
@Teknium1
📅
Oct 18, 2024
565d ago
🆔00501514

Isn't this a pretty big bearish signal for standard RLHF for capabilities https://t.co/dnQv3BtsYi

Media 1
❤️298
likes
🔁12
retweets
🖼️ Media
Andrew Ng
@AndrewYNg
📅
Oct 16, 2024
567d ago
🆔59833674
⭐1.00

New short course: Serverless Agentic Workflows with Amazon Bedrock. Learn to build and deploy serverless agents in this course created with @awscloud and taught by @mikegchambers, a Senior Developer Advocate at AWS specializing in GenAI. (Disclosure: I serve on Amazon's board.)
Generative AI applications are becoming more complex, sophisticated, and agentic. Agentic applications have workloads that can be hard to predict in advance -- for example, what tools will it decide to call? -- and a serverless architecture helps you efficiently provide on-demand resources.
This course teaches you to build and deploy a serverless agentic application. You’ll learn to create agents with tools, code execution, and guardrails, and build responsible agents for business use cases:
- Build a customer service bot for a fictional tea mug business that can answer questions, retrieve information, and process orders.
- Connect your customer service agent to a CRM to get customer info and log support tickets in real-time.
- Explore how you invoke the agent, and see the trace to review the agent’s thought process and observation loop until it reaches its final output.
- Attach a code interpreter to your agent, giving it the ability to perform accurate calculations by writing and running its own Python code.
- Implement guardrails to prevent your agent from revealing sensitive information or using inappropriate language.
By the end, you will have built a sophisticated AI agent capable of handling real-world customer support scenarios.
Please sign up here! https://t.co/FQKGJNBPwp

❤️882
likes
🔁170
retweets
🖼️ Media
JB
@IAMJBDEL
📅
Wed
🆔82590711

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Datasets:
- https://t.co/xNpS3o14hR (Likes: 0)
- https://t.co/7z4YBb2oG2 (Likes: 0)
Models:
- https://t.co/S7CibghuiT (Likes: 2, Downloads: 33)
- https://t.co/oLFsj40b0g (Likes: 2,… https://t.co/6jtfvoEUTZ

❤️13
likes
🔁6
retweets
🖼️ Media
Farouq Aldori
@FarouqAldori
📅
Oct 16, 2024
567d ago
🆔04782087
⭐0.61

When are we going to admit that Chatbot Arena is bullshit and maybe just shut it down for good? https://t.co/O4HQeSff10

Media 1
❤️25
likes
🔁2
retweets
🖼️ Media
Ethan Mollick
@emollick
📅
Oct 17, 2024
567d ago
🆔17800897
⭐1.00

Thus always in Silicon Valley. https://t.co/C5Zh40NjCN

@jam3scampbell •

Mira, Ilya, Elon, Sam, and Dario are now all competing with each other for AGI despite all having worked together at OpenAI just a few years ago

Media 1
❤️3,095
likes
🔁237
retweets
🖼️ Media
elvis
@omarsar0
📅
Oct 16, 2024
567d ago
🆔65327561
⭐1.00

Mistral AI is doubling down on small language models. Their latest Ministral models (both the 3B and 8B) are pretty impressive and will be incredibly useful for a lot of LLM workflows.
Some observations:
I enjoy seeing how committed Mistral AI is to developing smaller and more capable models. They seem to understand what developers want and need today.
There is huge competition for the finest, smallest, and cheapest models. This is good for the AI developer community.
This sets up the community really well in terms of the wave of innovation that’s coming around on-device AI and agentic workflows. 2025 is going to be a wild year.
They don’t mention the secret sauce behind these capable smaller models (probably some distillation happening), but the Ministral 3B model already performs competitively with Mistral 7B.
I think this is a great focus of Mistral as they seek to differentiate from other LLM providers.
Given this announcement, I am now super curious about what the next Gemma and Llama small models are going to bring.
Mini models are taking over! I use small models for processing data, structuring information, function calling, routing, evaluation pipelines, prompt chaining, agentic workflows, and a whole lot more.

Media 1
❤️229
likes
🔁50
retweets
🖼️ Media
Aran Komatsuzaki
@arankomatsuzaki
📅
Oct 17, 2024
567d ago
🆔70309089

Google presents Inference Scaling for Long-Context Retrieval Augmented Generation
- Finds that increasing inference computation leads to nearly linear gains in RAG perf when optimally allocated
- Scaling inference compute on long-context LLMs achieves up to 58.9% gains on benchmark
https://t.co/dpulK3a20k

Media 1
❤️366
likes
🔁84
retweets
🖼️ Media
elvis
@omarsar0
📅
Oct 16, 2024
567d ago
🆔21849029

Model Swarms
Researchers from Google and UoW propose a new collaborative search algorithm to adapt LLM via swarm intelligence. A pool of LLM experts collaboratively move in the weight space and optimize a utility function representing various adaptation objectives.
Top quote: "Extensive experiments demonstrate that MODEL SWARMS could flexibly adapt LLM experts to a single task, multi-task domains, reward models, as well as diverse human interests, improving over 12 model composition baselines by up to 21.0% across tasks and contexts."
One interesting observation in the paper is that the collaborative search process helps to discover new skills and enables the weak-to-strong transition of experts.

Media 1
❤️423
likes
🔁119
retweets
🖼️ Media
jason liu
@jxnlco
📅
Oct 14, 2024
569d ago
🆔13984413

sonnet wrote me a script that adds SEO frontmatter to every post in my instructor blog https://t.co/oMxj9wTqLD

Media 1
❤️24
likes
🖼️ Media
Ethan Mollick
@emollick
📅
Oct 15, 2024
568d ago
🆔16430132

You can learn all about benchmarking AI from 570 BCE
Croesus, the last king of Lydia, sent messengers to the major oracles of the ancient world to collect prophecies about a subject he knew (hold out test data). Only Delphi passed
But why was Delphi so great? Data contamination! https://t.co/cFS6PkfczG

Media 1, Media 2
❤️211
likes
🔁26
retweets
🖼️ Media
jason liu
@jxnlco
📅
Oct 15, 2024
568d ago
🆔44074072

The trick is to completely divorce yourself from the price of your efforts. The solution to a problem is worth a percentage of the solution, a percentage of the outputs, not a percentage of your inputs. https://t.co/C4tPInLTBT

Media 1
❤️10
likes
🖼️ Media
Aravind Srinivas
@AravSrinivas
📅
Oct 15, 2024
568d ago
🆔22677441

Perplexity Finance: real time stock prices, deep dives into a company’s financials, comparing multiple companies, studying 13f’s of hedge funds, etc. The UI is just delightful! https://t.co/VeRiddi7hr

❤️6,344
likes
🔁435
retweets
🖼️ Media
LlamaIndex 🦙
@llama_index
📅
Oct 15, 2024
568d ago
🆔40540389
⭐1.00

Like MySQL/MariaDB and want to build gen AI apps? @skysql can get you up and running! 🚀
Check out this detailed how-to post from SkySQL:
Set up MariaDB Vector in SkySQL
🧠 Integrate OpenAI's LLM with LlamaIndex
💻 Implement a smart product review analysis system
https://t.co/jOZlgU6hfg

Media 1
❤️28
likes
🔁10
retweets
🖼️ Media
Bindu Reddy
@bindureddy
📅
Oct 15, 2024
568d ago
🆔27745340

Can Humans Reason? Not really - most humans simply indulge in groupthink. They found no evidence of independent reasoning in humans. Smarter humans become more sophisticated at self-deception and hallucinating arguments. So, in some sense, AI can already do better than humans 😂

Media 1
❤️500
likes
🔁68
retweets
🖼️ Media
elvis
@omarsar0
📅
Oct 15, 2024
568d ago
🆔72603047
⭐1.00

Thinking LLMs
How difficult is it to train LLMs to do explicit "thinking" before responding to questions or tasks? This work proposes a training method to equip LLMs with thinking abilities for general instruction-following without human-annotated data.
It uses an iterative search and optimization procedure to explore thought generation which enables the model to learn without direct supervision.
Thought candidates for each user instruction are scored with a judge model. Note that only the responses are evaluated by the Judge which determines the best and worst ones. Then the corresponding full outputs are used as chosen and rejected pairs for DPO (referred to as Thought Preference Optimization in this paper). This entails the full training process that involves multiple iterations.
Overall, this is a simple yet very effective approach to incentivizing the model to generate its own thoughts without explicitly teaching it how to think. The authors also find that these Thinking LLMs are effective even in problems that often don't rely on reasoning or CoT methods.
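
The pair-construction step described in this post can be sketched roughly as follows. This is an illustration under assumptions, not the paper's code: `Candidate`, `build_preference_pair`, and the toy judge are hypothetical names, and the judge here is a stand-in callable rather than a real judge model.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    thought: str    # the model's internal "thinking" text
    response: str   # the user-facing answer

    @property
    def full_output(self):
        # The full output (thought + response) is what goes into the preference pair.
        return f"{self.thought}\n{self.response}"

def build_preference_pair(instruction, candidates, judge):
    """Score responses only, then return the full outputs of the best and worst candidates."""
    scored = [(judge(instruction, c.response), c) for c in candidates]
    best = max(scored, key=lambda item: item[0])[1]
    worst = min(scored, key=lambda item: item[0])[1]
    return {
        "prompt": instruction,
        "chosen": best.full_output,
        "rejected": worst.full_output,
    }

# Hypothetical judge: longer responses score higher (purely for illustration).
def toy_judge(instruction, response):
    return len(response)

candidates = [
    Candidate("quick guess", "Paris."),
    Candidate("recall the capital, then explain", "The capital of France is Paris."),
]
print(build_preference_pair("What is the capital of France?", candidates, toy_judge))
```

The key detail mirrored here is that the judge never sees the thought text: it ranks responses only, while the chosen and rejected entries carry the thought along with the response, so thoughts are rewarded indirectly through the answers they lead to.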

Media 1
❤️512
likes
🔁126
retweets
🖼️ Media
Aran Komatsuzaki
@arankomatsuzaki
📅
Oct 15, 2024
569d ago
🆔85127264

Meta presents Thinking LLMs: General Instruction Following with Thought Generation
- Superior performance on AlpacaEval and Arena-Hard
- Gains from thinking on even non-reasoning categories such as marketing, health and general knowledge
https://t.co/jc4NYKPDUc https://t.co/LAOnSSMD9C

Media 1
❤️470
likes
🔁90
retweets
🖼️ Media