🐦 Twitter Post Details

Viewing enriched Twitter post

@Mosescreates

I'm going all in on Hermes (@NousResearch, @Teknium1) as my entire agent and coding stack. Six profiles. One shared self-hosted memory store. Zero hosted-coder dependencies.

The fleet:

- pmax-mousa — my own WhatsApp + Email + Google Workspace agent
- pmax-tarek — my co-founder's Telegram + Email agent
- pmax-dareen — our content creator's WhatsApp assistant (LIVE on real client chats)
- pmax-content — background content ops
- pmax-ops-observer — daily health reports
- pmax-coder — my primary coding CLI, no hosted coder, no gateway

The model dial — this is the part I'm most excited about:

pmax-coder runs on GLM-5.1 native via the https://t.co/s6oYqmfv05 Coding Plan (@Zai_org, quarterly $45). Direct to https://t.co/HUrIPiINWn, no middleman, no OpenRouter tax. GLM-5.1 published the exact thing I needed — a frontier coder at a flat price I can plan around.

I've spent the last three days heads-down just getting the system running. Not tweaking it. Not optimizing it. Getting it to stand up end-to-end without a single load-bearing piece silently falling over. Six profiles, one memory store, two hosts, a dozen services, launchd, Tailscale, native provider pinning, patch re-application, ghost-process recovery, bridge port collisions, FTPS quirks, CI cycles, Qdrant lock contention, Happy Eyeballs hangs — every one of them a real bug I hit and fixed before I could move on. The three days are the story.

The five gateway profiles (pmax-mousa, pmax-tarek, pmax-dareen, pmax-content, pmax-ops-observer) all run on qwen/qwen3.6-plus via OpenRouter native Alibaba routing (@Alibaba_Qwen, @OpenRouterAI). I pinned native-only with a strict provider.only patch so nothing silently falls through to a more expensive lane.

Offline fallback everywhere is gemma-4-31b-it-4bit served by oMLX on the Mac Studio. If OpenRouter or https://t.co/s6oYqmfv05 goes sideways mid-conversation, every profile transparently fails over to local MLX inference and the user never notices.
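The native-only pinning described here maps onto OpenRouter's provider routing preferences (`only`, `allow_fallbacks`). A minimal sketch of what such a pinned request payload might look like — the model name comes from the post, but the provider slug and the patch's internals are assumptions, so this is illustrative only:

```python
# Illustrative sketch of OpenRouter provider pinning: the "provider"
# block restricts routing to one named provider and disables fallbacks,
# so a request fails loudly instead of silently moving to a pricier lane.
import json


def pinned_chat_payload(model: str, provider_slug: str, messages: list) -> dict:
    """Build a chat-completions payload pinned to a single provider."""
    return {
        "model": model,
        "messages": messages,
        "provider": {
            "only": [provider_slug],   # route exclusively to this provider
            "allow_fallbacks": False,  # never fall through to another lane
        },
    }


payload = pinned_chat_payload(
    "qwen/qwen3.6-plus",  # model named in the post
    "alibaba",            # hypothetical provider slug, for illustration
    [{"role": "user", "content": "ping"}],
)
print(json.dumps(payload["provider"], sort_keys=True))
```

With `allow_fallbacks` off, an outage on the pinned provider surfaces as an error the client can catch — which is exactly the point where the local MLX failover described above would take over.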
Swapping models is one YAML line.

The real unlock: unified self-hosted memory.

Every Hermes profile reads and writes one mem0 store on my MacBook (Qdrant + Ollama nomic-embed-text embeddings, zero cloud). Claude Code (@claudeai, @AnthropicAI) is wired to auto-broadcast every session turn into the same store via a Stop hook. The direction of flow is Claude writes, Hermes listens. Anything I decide in a Claude Code session is visible to the WhatsApp agents on my very next message. Nothing gets re-explained. Ever.

Two-host architecture over Tailscale:

MacBook (100.x.x.x) is the service layer. It runs mem0-server on 7437, task_server v1.1.3 on 7439, the guru-code router cache on 7450, the content-review webhook on 7438, all the Claude Code hooks, daily backup cron, and the mem CLI.

Mac Studio M4 Max (100.x.x.x) is the agent layer. It runs Hermes v2026.4.13-118, all six profile gateways under launchd, the Hermes dashboard on 9119, the WhatsApp and Telegram bridges, and the Google Workspace OAuth session. Both hosts are pinned to IPv4 over the tailnet because macOS Happy Eyeballs was randomly hanging on IPv6 tailnet paths — one flag on every curl and ssh killed a whole class of flakiness.

Huge credit to @brian_cheong — his push on idempotency-on-retries directly shaped task_server v1.1.3 (Idempotency-Key header on every write path), https://t.co/mrslfLodFL deterministic run_id dedup, and the guru-code router's response cache. Without that, retried agent actions would silently double-fire — a tool call would hit twice, a message would get sent twice, a file would get written twice. Whole classes of bugs I'll never write now. (btw I knew nothing about idempotency before — thanks dude)

What else ships with the stack:

Daily Qdrant and task_server backups with a 14-day rotation, plus a weekly full Hermes zip. Ghost-process immunity on launchd restarts (a startup_guard script kills any zombie https://t.co/eDFb9bsTOf holding the Qdrant lock before mem0-server boots).
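The idempotency pattern credited here — an Idempotency-Key on every write path, with retries deduplicated against a stored result — is a standard technique. A minimal in-memory sketch of the idea (not task_server's actual implementation, which isn't shown):

```python
# Minimal idempotency sketch: the first request with a given key executes
# the write and caches its result; any retry carrying the same key gets
# the cached result back instead of double-firing the side effect.
class IdempotentWrites:
    def __init__(self):
        self._results = {}  # Idempotency-Key -> stored result

    def execute(self, idempotency_key: str, write_fn):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no side effect
        result = write_fn()                        # side effect runs once
        self._results[idempotency_key] = result
        return result


sent = []
store = IdempotentWrites()
send = lambda: sent.append("message") or len(sent)

store.execute("run-42", send)  # fires the write
store.execute("run-42", send)  # retry with same key: deduped
print(len(sent))  # -> 1: the message was sent exactly once
```

A production version would persist the key-to-result map (so dedup survives restarts) and expire old keys, but the invariant is the same: a retried agent action can never send twice.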
A native-provider pinning patch that wires provider_routing.allow_fallbacks straight through to OpenRouter. A secret redactor that runs on every Claude Code turn end so OpenRouter keys, Anthropic keys, GitHub PATs, and Bearer tokens can never leak into transcripts. A mem audit command that scans the memory store itself for leaked patterns. And a `fleet` one-shot status command I can run from any terminal to get a color-coded snapshot of every service on both hosts plus GitHub Actions status plus the Hermes patch inventory.

Over these three days I also pulled 98 commits of upstream Hermes in two passes (70 + 28) without losing a single custom patch. An update-check cron inventories every local patch weekly so nothing regresses silently. Upgrades are safe. That's the invariant I wanted and I finally have it.

None of this is a custom AI platform. It's Hermes doing what Hermes does, plus a few surgical patches I kept small enough to re-apply on every upstream pull. The whole philosophy is minimal lock-in: use the upstream as much as possible, patch only the load-bearing seams, never fork.

The point isn't that Hermes beats every other coder tool today. The point is it's mine. I own the model dial, the memory store, the tools, the hooks, the backup policy, the security posture, the failover behavior. When something breaks I fix it. When I want to upgrade I upgrade. When I want to swap models I swap models. No middleman. No platform. No rug pull risk.

Reports from the field to follow.
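The secret redactor above is described only by what it catches (OpenRouter keys, Anthropic keys, GitHub PATs, Bearer tokens). A hedged sketch of that kind of pattern-based redaction — the hook's real pattern list isn't shown, and the prefixes below are common public key formats, not confirmed internals:

```python
import re

# Illustrative redaction patterns for the token families named in the
# post. Prefixes (sk-or-, sk-ant-, ghp_) are publicly known formats;
# the actual hook's patterns are assumptions here.
SECRET_PATTERNS = [
    re.compile(r"sk-or-[A-Za-z0-9\-_]{8,}"),        # OpenRouter-style keys
    re.compile(r"sk-ant-[A-Za-z0-9\-_]{8,}"),       # Anthropic-style keys
    re.compile(r"ghp_[A-Za-z0-9]{8,}"),             # GitHub personal access tokens
    re.compile(r"Bearer\s+[A-Za-z0-9\-._~+/]+=*"),  # Bearer tokens in headers
]


def redact(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(redact("Authorization: Bearer abc123.def456"))
# -> Authorization: [REDACTED]
```

Running this on turn end (and again via a store-wide audit pass, as the `mem audit` command above does) gives two chances to catch a key before it persists in a transcript or the memory store.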

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2044224095546495192/media_0.jpg",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2044224095546495192/media_0.jpg",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2026-04-15T04:36:59.660997",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2044224095546495192",
  "url": "https://x.com/Mosescreates/status/2044224095546495192",
  "twitterUrl": "https://twitter.com/Mosescreates/status/2044224095546495192",
  "text": "I'm going all in on Hermes (@NousResearch, @Teknium1) as my entire agent and coding stack. Six profiles. One shared self-hosted memory store. Zero hosted-coder dependencies.\n\nThe fleet:\n\n- pmax-mousa — my own WhatsApp + Email + Google Workspace agent\n- pmax-tarek — my co-founder's Telegram + Email agent\n- pmax-dareen — our content creator's WhatsApp assistant (LIVE on real client chats)\n- pmax-content — background content ops\n- pmax-ops-observer — daily health reports\n- pmax-coder — my primary coding CLI, no hosted coder, no gateway\n\nThe model dial — this is the part I'm most excited about:\n\npmax-coder runs on GLM-5.1 native via the https://t.co/s6oYqmfv05 Coding Plan (@Zai_org, quarterly $45). Direct to https://t.co/HUrIPiINWn, no middleman, no OpenRouter tax. GLM-5.1 published the exact thing I needed — a frontier coder at a flat price I can plan around. I've spent the last three days heads-down just getting the system running. Not tweaking it. Not optimizing it. Getting it to stand up end-to-end without a single load-bearing piece silently falling over. Six profiles, one memory store, two hosts, a dozen services, launchd, Tailscale, native provider pinning, patch re-application, ghost-process recovery, bridge port collisions, FTPS quirks, CI cycles, Qdrant lock contention, Happy Eyeballs hangs — every one of them a real bug I hit and fixed before I could move on. The three days are the story.\n\nThe five gateway profiles (pmax-mousa, pmax-tarek, pmax-dareen, pmax-content, pmax-ops-observer) all run on qwen/qwen3.6-plus via OpenRouter native Alibaba routing (@Alibaba_Qwen, @OpenRouterAI). I pinned native-only with a strict provider.only patch so nothing silently falls through to a more expensive lane.\n\nOffline fallback everywhere is gemma-4-31b-it-4bit served by oMLX on the Mac Studio. 
If OpenRouter or https://t.co/s6oYqmfv05 goes sideways mid-conversation, every profile transparently fails over to local MLX inference and the user never notices. Swapping models is one YAML line.\n\nThe real unlock: unified self-hosted memory.\n\nEvery Hermes profile reads and writes one mem0 store on my MacBook (Qdrant + Ollama nomic-embed-text embeddings, zero cloud). Claude Code (@claudeai, @AnthropicAI) is wired to auto-broadcast every session turn into the same store via a Stop hook. The direction of flow is Claude writes, Hermes listens. Anything I decide in a Claude Code session is visible to the WhatsApp agents on my very next message. Nothing gets re-explained. Ever.\n\nTwo-host architecture over Tailscale:\n\nMacBook (100.x.x.x) is the service layer. It runs mem0-server on 7437, task_server v1.1.3 on 7439, the guru-code router cache on 7450, the content-review webhook on 7438, all the Claude Code hooks, daily backup cron, and the mem CLI.\n\nMac Studio M4 Max (100.x.x.x) is the agent layer. It runs Hermes v2026.4.13-118, all six profile gateways under launchd, the Hermes dashboard on 9119, the WhatsApp and Telegram bridges, and the Google Workspace OAuth session. Both hosts are pinned to IPv4 over the tailnet because macOS Happy Eyeballs was randomly hanging on IPv6 tailnet paths — one flag on every curl and ssh killed a whole class of flakiness.\n\nHuge credit to @brian_cheong — his push on idempotency-on-retries directly shaped task_server v1.1.3 (Idempotency-Key header on every write path), https://t.co/mrslfLodFL deterministic run_id dedup, and the guru-code router's response cache. Without that, retried agent actions would silently double-fire — a tool call would hit twice, a message would get sent twice, a file would get written twice. Whole classes of bugs I'll never write now. 
(btw I knew nothing abt idempotency bar — thanks dude)\n\nWhat else ships with the stack:\n\nDaily Qdrant and task_server backups with a 14-day rotation, plus a weekly full Hermes zip. Ghost-process immunity on launchd restarts (a startup_guard script kills any zombie https://t.co/eDFb9bsTOf holding the Qdrant lock before mem0-server boots). A native-provider pinning patch that wires provider_routing.allow_fallbacks straight through to OpenRouter. A secret redactor that runs on every Claude Code turn end so OpenRouter keys, Anthropic keys, GitHub PATs, and Bearer tokens can never leak into transcripts. A mem audit command that scans the memory store itself for leaked patterns. And a `fleet` one-shot status command I can run from any terminal to get a color-coded snapshot of every service on both hosts plus GitHub Actions status plus the Hermes patch inventory.\n\nOver these three days I also pulled 98 commits of upstream Hermes in two passes (70 + 28) without losing a single custom patch. An update-check cron inventories every local patch weekly so nothing regresses silently. Upgrades are safe. That's the invariant I wanted and I finally have it.\n\nNone of this is a custom AI platform. It's Hermes doing what Hermes does, plus a few surgical patches I kept small enough to re-apply on every upstream pull. The whole philosophy is minimal lock-in: use the upstream as much as possible, patch only the load-bearing seams, never fork.\n\nThe point isn't that Hermes beats every other coder tool today. The point is it's mine. I own the model dial, the memory store, the tools, the hooks, the backup policy, the security posture, the failover behavior. When something breaks I fix it. When I want to upgrade I upgrade. When I want to swap models I swap models. No middleman. No platform. No rug pull risk.\n\nReports from the field to follow.",
  "source": "Twitter for iPhone",
  "retweetCount": 2,
  "replyCount": 0,
  "likeCount": 8,
  "quoteCount": 1,
  "viewCount": 877,
  "createdAt": "Wed Apr 15 01:19:38 +0000 2026",
  "lang": "en",
  "bookmarkCount": 6,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2044224095546495192",
  "displayTextRange": [
    0,
    267
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "Mosescreates",
    "url": "https://x.com/Mosescreates",
    "twitterUrl": "https://twitter.com/Mosescreates",
    "id": "726886676862263297",
    "name": "Moshe",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/2001885837001166848/rH5xfvqt_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/726886676862263297/1764266158",
    "description": "",
    "location": "U+2641 ♁ EARTH 🌍",
    "followers": 4789,
    "following": 168,
    "status": "",
    "canDm": true,
    "canMediaTag": false,
    "createdAt": "Sun May 01 21:31:03 +0000 2016",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 11439,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 4609,
    "statusesCount": 39715,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1990053918861414649"
    ],
    "profile_bio": {
      "description": "جاينجو",
      "entities": {
        "description": {
          "hashtags": [],
          "symbols": [],
          "urls": [],
          "user_mentions": []
        },
        "url": {
          "urls": [
            {
              "display_url": "freeathan.app",
              "expanded_url": "https://freeathan.app",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/3Kdzilf5XQ"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "allow_download_status": {
          "allow_download": true
        },
        "display_url": "pic.twitter.com/1N4tJw2HY7",
        "expanded_url": "https://twitter.com/Mosescreates/status/2044224095546495192/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 282,
                "w": 282,
                "x": 803,
                "y": 104
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 282,
                "w": 282,
                "x": 803,
                "y": 104
              }
            ]
          }
        },
        "id_str": "2044224075963478019",
        "indices": [
          268,
          291
        ],
        "media_key": "3_2044224075963478019",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARxeivdhGrADCgACHF6K+/BX0NgAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABHF6K92EasAMKAAIcXor78FfQ2AAA",
            "media_key": "3_2044224075963478019"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/HF6K92EasAMJ-ym.png",
        "original_info": {
          "focus_rects": [
            {
              "h": 896,
              "w": 1600,
              "x": 0,
              "y": 0
            },
            {
              "h": 1000,
              "w": 1000,
              "x": 0,
              "y": 0
            },
            {
              "h": 1000,
              "w": 877,
              "x": 0,
              "y": 0
            },
            {
              "h": 1000,
              "w": 500,
              "x": 30,
              "y": 0
            },
            {
              "h": 1000,
              "w": 1600,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1000,
          "width": 1600
        },
        "sizes": {
          "large": {
            "h": 1000,
            "w": 1600
          }
        },
        "type": "photo",
        "url": "https://t.co/1N4tJw2HY7"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "Z.AI",
        "expanded_url": "http://Z.AI",
        "indices": [
          641,
          664
        ],
        "url": "https://t.co/s6oYqmfv05"
      },
      {
        "display_url": "api.z.ai",
        "expanded_url": "http://api.z.ai",
        "indices": [
          714,
          737
        ],
        "url": "https://t.co/HUrIPiINWn"
      },
      {
        "display_url": "Z.AI",
        "expanded_url": "http://Z.AI",
        "indices": [
          1834,
          1857
        ],
        "url": "https://t.co/s6oYqmfv05"
      },
      {
        "display_url": "pipeline.py",
        "expanded_url": "http://pipeline.py",
        "indices": [
          3335,
          3358
        ],
        "url": "https://t.co/mrslfLodFL"
      },
      {
        "display_url": "api.py",
        "expanded_url": "http://api.py",
        "indices": [
          3897,
          3920
        ],
        "url": "https://t.co/eDFb9bsTOf"
      }
    ],
    "user_mentions": [
      {
        "id_str": "1318419526132862976",
        "indices": [
          28,
          41
        ],
        "name": "Nous Research",
        "screen_name": "NousResearch"
      },
      {
        "id_str": "1726486879456096256",
        "indices": [
          678,
          686
        ],
        "name": "Z.ai",
        "screen_name": "Zai_org"
      },
      {
        "id_str": "1753339277386342400",
        "indices": [
          1585,
          1598
        ],
        "name": "Qwen",
        "screen_name": "Alibaba_Qwen"
      },
      {
        "id_str": "1943306828697550848",
        "indices": [
          2200,
          2209
        ],
        "name": "Claude",
        "screen_name": "claudeai"
      },
      {
        "id_str": "1353836358901501952",
        "indices": [
          2211,
          2223
        ],
        "name": "Anthropic",
        "screen_name": "AnthropicAI"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "communityInfo": null,
  "article": null
}