🐦 Twitter Post Details

@vtabbott_

A thread 🧵 previewing my paper with @GioeleZardini, covering how to use diagrams to represent algorithms, generate performance models, and derive execution strategies like FlashAttention ~

We use wires to represent axes, dashed lines to separate tuple segments / parallel functions, weaving to map functions, and horizontal placement for composition. This lets us represent FlashAttention with the diagram below. But how do we go from a representation of a mathematical function to an algorithm executed on GPU cores?
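For readers unfamiliar with the execution-strategy side of the question, the sketch below is a minimal NumPy illustration (not the paper's code, and not the diagrammatic method itself) of the tiled, online-softmax idea that FlashAttention-style strategies rely on: attention is computed over key/value tiles with running statistics, so the full attention matrix is never materialized. Shapes, tile size, and function names here are illustrative assumptions.

```python
# Minimal sketch (assumed shapes/tile size) of FlashAttention-style
# tiling: softmax(Q K^T / sqrt(d)) V computed tile-by-tile with an
# online softmax, never materializing the full n x n logit matrix.
import numpy as np

def tiled_attention(Q, K, V, tile=64):
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q, dtype=np.float64)
    m = np.full(n, -np.inf)   # running row-wise max of the logits
    l = np.zeros(n)           # running softmax normalizer per row
    for start in range(0, K.shape[0], tile):
        Kt, Vt = K[start:start + tile], V[start:start + tile]
        S = (Q @ Kt.T) * scale              # logits for this tile only
        m_new = np.maximum(m, S.max(axis=1))
        p = np.exp(S - m_new[:, None])      # tile's unnormalized weights
        alpha = np.exp(m - m_new)           # rescale earlier partial sums
        l = alpha * l + p.sum(axis=1)
        out = alpha[:, None] * out + p @ Vt
        m = m_new
    return out / l[:, None]

# Sanity check against the naive dense computation.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((128, 32)) for _ in range(3))
S = (Q @ K.T) / np.sqrt(32)
ref = np.exp(S - S.max(axis=1, keepdims=True))
ref = (ref / ref.sum(axis=1, keepdims=True)) @ V
assert np.allclose(tiled_attention(Q, K, V), ref)
```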

🔧 Raw API Response

{
  "user": {
    "created_at": "2022-07-20T05:53:00.000Z",
    "default_profile_image": false,
    "description": "Maker of *those* diagrams for deep learning algorithms | 🇦🇺 in 🇬🇧",
    "fast_followers_count": 0,
    "favourites_count": 1228,
    "followers_count": 4566,
    "friends_count": 242,
    "has_custom_timelines": false,
    "is_translator": false,
    "listed_count": 70,
    "location": "London, England",
    "media_count": 123,
    "name": "Vincent Abbott | Deep Learning",
    "normal_followers_count": 4566,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1549633222689992706/1704858541",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1744929309431848960/wgN2KssH_normal.jpg",
    "screen_name": "vtabbott_",
    "statuses_count": 393,
    "translator_type": "none",
    "url": "https://t.co/de0doR3t1W",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "1549633222689992706"
  },
  "id": "1860268276569506250",
  "conversation_id": "1860268276569506250",
  "full_text": "A thread🧵previewing my paper with @GioeleZardini, covering how to use diagrams to represent algorithms, generate performance models, and derive execution strategies like FlashAttention  ~\n\nWe use wires to represent axes, dashed lines to separate tuple segments / parallel functions, weaving to map functions, and horizontal placement for composition. This lets us represent FlashAttention with the diagram below. But how do we go from a representation of a mathematical function to an algorithm executed on GPU cores?",
  "reply_count": 7,
  "retweet_count": 87,
  "favorite_count": 648,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [
    {
      "id_str": "1550406065191280640",
      "name": "Gioele Zardini",
      "screen_name": "GioeleZardini",
      "profile": "https://twitter.com/GioeleZardini"
    }
  ],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GdD7Z_9WMAAbl39.png",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/vtabbott_/status/1860268276569506250",
  "created_at": "2024-11-23T10:24:53.000Z",
  "#sort_index": "1860268276569506250",
  "view_count": 48136,
  "quote_count": 4,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/vtabbott_/status/1860268276569506250"
}
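
As a convenience, here is a minimal Python sketch for pulling the engagement numbers out of the enriched response above; the field names are taken from the raw JSON, while the local filename `raw.json` is an assumption.

```python
# Extract headline engagement metrics from the enriched post JSON.
# Field names match the raw API response above; "raw.json" is an
# assumed local filename for the saved response.
import json

with open("raw.json") as f:
    post = json.load(f)

print(f"@{post['user']['screen_name']}: "
      f"{post['view_count']} views, "
      f"{post['favorite_count']} likes, "
      f"{post['retweet_count']} retweets, "
      f"{post['reply_count']} replies")
# -> @vtabbott_: 48136 views, 648 likes, 87 retweets, 7 replies
```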