Your curated collection of saved posts and media

Showing 10 posts · last 14 days · by score
gerardsans (@gerardsans)
📅 Apr 10, 2026 · 14d ago · 🆔 82203756 · ⭐ 0.40

@plainionist The industry has misled the public with terms like "In-Context Learning", "Skills" or "Reasoning". It's trivial to prove no real learning is happening: the model's weights are never changed. Remove the context, and the alleged "learning" disappears instantly. Poof.

PrismML (@PrismML)
📅 Apr 16, 2026 · 8d ago · 🆔 82896134

Today we're announcing Ternary Bonsai: top intelligence at 1.58 bits. Using ternary weights {-1, 0, +1}, we built a family of models that are 9x smaller than their 16-bit counterparts while outperforming most models in their respective parameter classes on standard benchmarks. We're open-sourcing the models under the Apache 2.0 license in three sizes: 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB).

🖼️ Media
RedHat_AI (@RedHat_AI)
📅 Apr 17, 2026 · 7d ago · 🆔 02520952

Qwen3.6-35B-A3B just dropped. Red Hat AI has an NVFP4 quantized checkpoint ready. 35B params, 3B active, quantized with LLM Compressor. Preliminary GSM8K Platinum: 100.69% recovery (slightly above baseline). Early release. Let us know what you think! https://t.co/i5Fc4P7NVN

🖼️ Media
victormustar (@victormustar)
📅 Apr 17, 2026 · 7d ago · 🆔 46958899

Sharing my current setup to run Qwen3.6 locally in a good agentic setup (Pi + llama.cpp). Should give you a good overview of how good local agents are today:

# Start llama.cpp server:
llama-server \
  -hf unsloth/Qwen3.6-35B-A3B-GGUF:Q4_K_XL \
  --jinja \
  --chat-template-kwargs '{"preserve_thinking":true}' \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0

# Configure Pi:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://127.0.0.1:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "unsloth/Qwen3.6-35B-A3B-GGUF:Q4_K_XL" }
      ]
    }
  }
}

@Alibaba_Qwen • Thu Apr 16 13:23

⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀 A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.
🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes
Efficient. Pow

🖼️ Media
rasbt (@rasbt)
📅 Apr 11, 2026 · 13d ago · 🆔 58256302

@DanielWulikk Have to think a bit about how to best visualize it, but if you're interested, in the meantime I have a working from-scratch code implementation of Gemma 4 E2B that shows how per-layer embeddings are implemented: https://t.co/jyiq1vyJnH https://t.co/fVrSBWHNHl

🖼️ Media
e_volkmann (@e_volkmann)
📅 Apr 08, 2026 · 16d ago · 🆔 30881771

Introducing gyaradax 🐉: a JAX solver for local flux-tube gyrokinetics with custom CUDA kernels for acceleration. The entire code was vibecoded by @ggalletti_ and me in a month. Validated against GKW (a CPU-only Fortran code) with 10x speedups. Details and code in the replies. https://t.co/22PrHjItR5

๐Ÿ–ผ๏ธ Media
iScienceLuvr (@iScienceLuvr)
📅 Apr 10, 2026 · 15d ago · 🆔 90441301

abs: https://t.co/tSLRNXAgY3 code: https://t.co/ldWVGmuC4O blog post: https://t.co/yI0NmdaKHm

🖼️ Media ×2
johnowhitaker (@johnowhitaker)
📅 Apr 11, 2026 · 14d ago · 🆔 09752285 · ⭐ 0.38

Also, how cool that this is so easy now? This was a few careful asks to Codex, which worked for ~128k tokens/1h to do everything: sourcing the data, embedding with CLIP (via @replicate), making an exploratory search tool for refining + filtering, and whipping up the final app.
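The "exploratory search tool" step boils down to nearest-neighbor lookup over embedding vectors. A minimal sketch with random stand-ins for the CLIP embeddings (the real vectors would come from a CLIP model, e.g. via Replicate; nothing here is the post's actual code):

```python
import numpy as np

# Stand-in "CLIP" embeddings: 100 images, 512-d, random for illustration.
rng = np.random.default_rng(42)
image_embs = rng.normal(size=(100, 512))
query_emb = image_embs[7] + 0.01 * rng.normal(size=512)  # near-duplicate of image 7

def top_k(query, embs, k=3):
    # Cosine similarity = dot product of L2-normalized vectors.
    q = query / np.linalg.norm(query)
    e = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = e @ q
    return np.argsort(-sims)[:k]

print(top_k(query_emb, image_embs)[0])  # → 7 (the near-duplicate wins)
```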

gerardsans (@gerardsans)
📅 Apr 10, 2026 · 14d ago · 🆔 14916027 · ⭐ 0.42

@trq212 This framing degrades technical literacy. Your prompt isn't "communication", it's tokenized, embedded as vectors, processed through frozen weights. No "agent" receives it. No bandwidth grows. Anthropic's own leaked system prompt: 16,739 words of context steering. That's engineering, not dialogue. Research shows anthropomorphic AI discourse creates self-fulfilling alignment degradation. You're not "talking to agents." You're structuring conditional probability queries. High-leverage? Yes. Interpersonal? Never. If the goal is public understanding, not marketing, stop treating inference as a "team member". That's technically inaccurate. Which makes it dishonest the moment someone with enough expertise verifies it.
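The "conditional probability query" framing, in miniature: inference maps a context to a distribution over next tokens through fixed parameters, and nothing updates between queries. The logits below are made up for a toy 5-token vocabulary, purely to illustrate the mechanics:

```python
import numpy as np

# Toy logits a frozen model might emit for p(next token | context)
# over a 5-token vocabulary. No weights change between queries.
logits = np.array([2.0, 1.0, 0.5, 0.1, -1.0])

# Numerically stable softmax turns logits into a probability distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(probs.argmax())  # → 0 (the highest-logit token is the greedy pick)
```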

aryagm01 (@aryagm01)
📅 Apr 12, 2026 · 12d ago · 🆔 21521117

dflash-mlx: DFlash speculative decoding, ported to Apple Silicon. Qwen3-4B at 186 tok/s on a MacBook. 4.6× faster than plain MLX-LM. Exact greedy decoding: output matches plain target decoding. https://t.co/VxfyworgAe

๐Ÿ–ผ๏ธ Media