🐦 Twitter Post Details

Viewing enriched Twitter post

@omarsar0

NEW Research from Apple.

When you think about it, RAG systems are fundamentally broken. Retrieval and generation are optimized separately: retrieval selects documents based on surface-level similarity, while generators produce answers without feedback about what information is actually needed.

There is an architectural mismatch.

Dense retrievers rank documents in embedding space while generators consume raw text. This creates inconsistent representation spaces that prevent end-to-end optimization, redundant text processing that causes context overflow, and duplicated encoding for both retrieval and generation.

This new research introduces CLaRa, a unified framework that performs retrieval and generation over shared continuous document representations.

They encode documents once into compact memory-token representations that serve both purposes. Instead of maintaining separate embeddings and raw text, documents are compressed into dense vectors that both the retriever and the generator operate on directly.

This enables something previously impossible: gradients flowing from the generator back to the retriever through a differentiable top-k selector using Straight-Through estimation. The retriever learns which documents truly enhance answer generation rather than relying on surface similarity.

To make compression work, they introduce SCP, a pretraining framework that synthesizes QA pairs and paraphrases to teach the compressor which information is essential. Simple QA captures atomic facts, complex QA promotes relational reasoning, and paraphrases preserve semantics while altering surface form.

Results:

At 16x compression, CLaRa-Mistral-7B surpasses the text-based DRO-Mistral-7B on NQ (51.41 vs 51.01 F1) and 2Wiki (47.18 vs 43.65 F1) while processing far less context. At 4x compression, it exceeds uncompressed text baselines by a 2.36% average on Mistral-7B.
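The encode-once idea above — one compact memory-token representation shared by retriever and generator — can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the `compress` mean-pooling stand-in, the dimensions, and the random "token embeddings" are all invented for illustration; CLaRa's compressor is learned.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_MEM = 8, 2  # embedding width and memory tokens per document (illustrative)

def compress(token_embs, n_mem=N_MEM):
    """Toy stand-in for a learned compressor: mean-pool chunks of token
    embeddings into a fixed number of memory-token vectors, shape (n_mem, DIM)."""
    chunks = np.array_split(token_embs, n_mem)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

# Encode each document ONCE; the resulting memory tokens serve both purposes.
docs = [rng.normal(size=(12, DIM)) for _ in range(3)]  # 3 docs of 12 "tokens"
memories = [compress(d) for d in docs]

# Retrieval: score documents against the query in the same dense space...
query = rng.normal(size=DIM)
scores = np.array([mem.mean(axis=0) @ query for mem in memories])
best = int(scores.argmax())

# ...and generation: the generator consumes memories[best] directly
# (e.g. as soft-prompt vectors) instead of re-reading raw document text.
generator_input = memories[best]
print(generator_input.shape)  # (2, 8)
```

Because both stages operate on the same vectors, nothing is encoded twice and no raw text has to fit in the generator's context.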
Most notably, CLaRa trained with only weak supervision from next-token prediction outperforms fully supervised retrievers with ground-truth relevance labels. On HotpotQA, it achieves 96.21% Recall@5, exceeding BGE-Reranker (85.93%) by over 10 points despite using no annotated relevance data.

Well-trained soft compression can retain essential reasoning information while substantially reducing input length. The compressed representations filter out irrelevant content and focus the generator on reasoning-relevant context, leading to better generalization than raw text inputs.

Great read for AI devs. (bookmark it)

Paper: https://t.co/JtMukGVNwV

Learn to build with RAG and AI Agents in my academy: https://t.co/JBU5beIoD0
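That weak-supervision result hinges on gradients reaching the retriever through a hard top-k selection. A minimal NumPy sketch of the straight-through trick the tweet mentions — illustrative only, not the paper's code; in a real autograd framework the `(hard - soft)` term would be wrapped in a stop-gradient (e.g. `detach()`), so the forward value is the hard mask while gradients follow the soft scores:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def straight_through_topk(scores, k):
    """Forward: hard 0/1 mask over the top-k scoring documents.
    Backward (with autograd): y = soft + stop_gradient(hard - soft)
    evaluates to `hard` but differentiates like `soft`."""
    soft = softmax(scores)                 # differentiable relevance weights
    hard = np.zeros_like(scores)
    hard[np.argsort(scores)[-k:]] = 1.0    # non-differentiable top-k selection
    return soft + (hard - soft)            # equals the hard mask in value

scores = np.array([0.1, 2.0, -1.0, 1.5])
mask = straight_through_topk(scores, k=2)
print(mask)  # selects documents 1 and 3
```

With this estimator, next-token-prediction loss on the answer is enough to teach the retriever which documents help generation — no relevance labels needed.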

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2000570838920434037/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2000570838920434037/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-12-15T14:23:14.552885",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2000570838920434037",
  "url": "https://x.com/omarsar0/status/2000570838920434037",
  "twitterUrl": "https://twitter.com/omarsar0/status/2000570838920434037",
  "text": "NEW Research from Apple.\n\nWhen you think about it, RAG systems are fundamentally broken. Retrieval and generation are optimized separately, retrieval selects documents based on surface-level similarity while generators produce answers without feedback about what information is actually needed.\n\nThere is an architectural mismatch.\n\nDense retrievers rank documents in embedding space while generators consume raw text. This creates inconsistent representation spaces that prevent end-to-end optimization, redundant text processing that causes context overflow, and duplicated encoding for both retrieval and generation.\n\nThis new research introduces CLaRa, a unified framework that performs retrieval and generation over shared continuous document representations.\n\nThey encode documents once into compact memory-token representations that serve both purposes. Instead of maintaining separate embeddings and raw text, documents are compressed into dense vectors that both the retriever and generator operate on directly.\n\nThis enables something previously impossible: gradients flowing from the generator back to the retriever through a differentiable top-k selector using Straight-Through estimation. The retriever learns which documents truly enhance answer generation rather than relying on surface similarity.\n\nTo make compression work, they introduce SCP, a pretraining framework that synthesizes QA pairs and paraphrases to teach the compressor which information is essential. Simple QA captures atomic facts, complex QA promotes relational reasoning, and paraphrases preserve semantics while altering surface form.\n\nResults:\n\nAt 16x compression, CLaRa-Mistral-7B surpasses the text-based DRO-Mistral-7B on NQ (51.41 vs 51.01 F1) and 2Wiki (47.18 vs 43.65 F1) while processing far less context. At 4x compression, it exceeds uncompressed text baselines by 2.36% average on Mistral-7B.\n\nMost notably, CLaRa trained with only weak supervision from next-token prediction outperforms fully supervised retrievers with ground-truth relevance labels. On HotpotQA, it achieves 96.21% Recall@5, exceeding BGE-Reranker (85.93%) by over 10 points despite using no annotated relevance data.\n\nWell-trained soft compression can retain essential reasoning information while substantially reducing input length. The compressed representations filter out irrelevant content and focus the generator on reasoning-relevant context, leading to better generalization than raw text inputs.\n\nGreat read for AI devs. (bookmark it)\n\nPaper: https://t.co/JtMukGVNwV\n\nLearn to build with RAG and AI Agents in my academy: https://t.co/JBU5beIoD0",
  "source": "Twitter for iPhone",
  "retweetCount": 4,
  "replyCount": 1,
  "likeCount": 10,
  "quoteCount": 0,
  "viewCount": 385,
  "createdAt": "Mon Dec 15 14:17:11 +0000 2025",
  "lang": "en",
  "bookmarkCount": 4,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2000570838920434037",
  "displayTextRange": [
    0,
    278
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "omarsar0",
    "url": "https://x.com/omarsar0",
    "twitterUrl": "https://twitter.com/omarsar0",
    "id": "3448284313",
    "name": "elvis",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
    "description": "",
    "location": "DAIR.AI Academy",
    "followers": 279290,
    "following": 735,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Fri Sep 04 12:59:26 +0000 2015",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 33936,
    "hasCustomTimelines": true,
    "isTranslator": true,
    "mediaCount": 4380,
    "statusesCount": 16742,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "2000385348413850055"
    ],
    "profile_bio": {
      "description": "Building @dair_ai • Prev: Meta AI, Elastic, PhD • New cohort: https://t.co/GZMhf39NRs",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "dair-ai.thinkific.com/courses/claude…",
              "expanded_url": "https://dair-ai.thinkific.com/courses/claude-code-for-everyone-2",
              "indices": [
                62,
                85
              ],
              "url": "https://t.co/GZMhf39NRs"
            }
          ],
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                9,
                17
              ],
              "name": "",
              "screen_name": "dair_ai"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "dair.ai",
              "expanded_url": "https://www.dair.ai/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/XQto5ypkSM"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/nrSZZv2wKz",
        "expanded_url": "https://twitter.com/omarsar0/status/2000570838920434037/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 92,
                "w": 92,
                "x": 1347,
                "y": 1391
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 92,
                "w": 92,
                "x": 1347,
                "y": 1391
              }
            ]
          }
        },
        "id_str": "2000570835204255744",
        "indices": [
          279,
          302
        ],
        "media_key": "3_2000570835204255744",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARvDdJM8WzAACgACG8N0lBnbkXUAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABG8N0kzxbMAAKAAIbw3SUGduRdQAA",
            "media_key": "3_2000570835204255744"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/G8N0kzxbMAA85dh.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 898,
              "w": 1604,
              "x": 0,
              "y": 0
            },
            {
              "h": 1604,
              "w": 1604,
              "x": 0,
              "y": 0
            },
            {
              "h": 1794,
              "w": 1574,
              "x": 0,
              "y": 0
            },
            {
              "h": 1794,
              "w": 897,
              "x": 0,
              "y": 0
            },
            {
              "h": 1794,
              "w": 1604,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1794,
          "width": 1604
        },
        "sizes": {
          "large": {
            "h": 1794,
            "w": 1604
          }
        },
        "type": "photo",
        "url": "https://t.co/nrSZZv2wKz"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "arxiv.org/abs/2511.18659",
        "expanded_url": "https://arxiv.org/abs/2511.18659",
        "indices": [
          2520,
          2543
        ],
        "url": "https://t.co/JtMukGVNwV"
      },
      {
        "display_url": "dair-ai.thinkific.com",
        "expanded_url": "https://dair-ai.thinkific.com/",
        "indices": [
          2598,
          2621
        ],
        "url": "https://t.co/JBU5beIoD0"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}