Firehose go brrr

Kuba Suder 🇵🇱🇺🇦 June 24, 2025

A couple of months ago, I wrote about the tests I've done on how I can speed up my code for processing the Bluesky firehose, using my Skyfall Ruby gem:

https://mackuba.leaflet.pub/3lko3iqvg5c5t

The limits I've reached then were around:

Since then, I've done two more things. First, I ran Jetstream locally for testing, and configured it with a much higher rate limit (the --max-sub-rate option). I've confirmed that, with the rate limit not getting in the way, Skyfall using Jetstream can indeed go much faster on the same server, up to about 10-12k evt/s with full processing.

The second thing is that I started doing some profiling to find out where else I could save some processing time, and in the process, I managed to massively speed up the underlying faye-websocket library 🙃

The Faye speedup fix

So, I was playing with ruby-prof to find where else I can shave off a few microseconds. I ran the scan on the version with my processing turned off, expecting the remaining work to be mostly in some boring internals of Faye or Ruby core libs, reading and writing bytes from the socket, adding them together and waiting.

And I found something… quite interesting: the majority of the time was spent in two places:

           4.519    130695/130695   WebSocket::Driver::Hybi#emit_message
13.83%     4.519           130695   String#bytes

           0.000         1/130696   WebSocket::HTTP::Response#body
          12.549    130695/130696   Skyfall::Firehose#handle_message
38.42%    12.549           130696   Array#pack

… wait a minute… 🤔🤔💡

Yes, for a binary websocket (which is used here), Faye prepares the received data in a binary String, but then sends it out as an Array of bytes:

def emit_message
  message  = @extensions.process_incoming_message(@message)
  @message = nil

  payload = message.data

  case message.opcode
    when OPCODES[:text] then
      payload = Driver.encode(payload, Encoding::UTF_8)
      payload = nil unless payload.valid_encoding?
    when OPCODES[:binary]
      payload = payload.bytes.to_a    # <===
  end

  # ... (rest of the method omitted)
end

And since I want a binary string at the end, to pass it to the CBOR library for decoding, I need to take that byte array and convert it back into a string just like the one we had before:

def handle_message(msg)
  data = msg.data.pack('C*')    # <===
  @handlers[:raw_message]&.call(data)
  # ... (rest of the method omitted)
end
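
To make the wasted work concrete, here is a minimal sketch of that round trip using only the Ruby standard library. The frame is just random bytes standing in for a firehose message, and the size and iteration count are arbitrary assumptions, so the timings only illustrate the shape of the problem and are not comparable to the profile above.

```ruby
require 'benchmark'

# Stand-in for one binary websocket frame: random bytes, NOT a real firehose event.
frame = Random.new(42).bytes(2_000)

as_array  = frame.bytes          # what emit_message does: String -> Array of Integers
roundtrip = as_array.pack('C*')  # what handle_message does: Array -> binary String

puts roundtrip == frame                      # same content as what we started with
puts roundtrip.encoding == Encoding::BINARY  # pack('C*') produces a binary string

n = 5_000
Benchmark.bm(12) do |x|
  # Convert to an array and back on every message
  x.report('round trip')  { n.times { frame.bytes.pack('C*') } }
  # Just hand over the string that already exists
  x.report('pass string') { n.times { frame } }
end
```

The two conversions cancel each other out exactly, so all the time in the first row is spent producing a string identical to the one Faye already had.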

So could we just… not do that? 🫠

Turns out, yes, although not without some hacking, since the library didn't have an option to emit a string instead for binary websocket messages.

I first made some monkey-patches to Faye & websocket-driver, and eventually submitted a pull request to the author, adding a :binary_data_format option to the Faye::WebSocket::Client initializer. With it, you can ask to have the data returned as a string instead, with the default staying as the original byte array. The author actually said that he thinks it makes sense to change the default to a binary string (while adding an option to revert it), and version 0.12.0 was released last month with this changed behavior. (To use this in Skyfall, you need the latest update, version 0.6.0.)
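
Since the default format differs between library versions, code that receives binary messages may see either a byte array or a binary string. Below is a rough sketch of a helper that accepts both. The name `binary_payload` is made up for this illustration and is not part of Faye, websocket-driver, or Skyfall, and this isn't necessarily how Skyfall handles it internally.

```ruby
# Hypothetical helper (not part of Faye, websocket-driver, or Skyfall):
# returns a binary String regardless of which format the message data came in.
def binary_payload(data)
  case data
  when String
    # New default: already a string, just make sure it's tagged as binary
    data.encoding == Encoding::BINARY ? data : data.b
  when Array
    # Old default: an array of byte values that has to be packed back into a string
    data.pack('C*')
  else
    raise ArgumentError, "unexpected payload type: #{data.class}"
  end
end

old_style = [0xA2, 0x61, 0x74]  # byte array, as emitted before the change
new_style = "\xA2at".b          # the same three bytes as a binary string

puts binary_payload(old_style) == binary_payload(new_style)
```

The string branch does no per-byte work at all, which is where the speedup comes from. The array branch is only there as a fallback for the old format.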

New benchmarks

I've run the benchmarks again, and the results look very encouraging. This is for the CBOR firehose, with and without the fix, and I also rechecked the async-websocket library for comparison:

And this is for Jetstream without a rate limit (the Faye fix does not affect/help Jetstream, because here the stream is text-based, not binary, so that problematic code path wasn't used here):

As you can see:

For some unknown reason, I wasn't able to make the Async version go faster than 6k evt/s on the CBOR (binary) stream (while it was using much less than 100% CPU), even though it worked fine with Jetstream. I'm not sure why; maybe it was some issue on my side, but I don't really want to spend time digging into this. EM/Faye works fine (especially now), is very battle-tested (even if it isn't updated much anymore), works with older Rubies, and switching would be a big API change for apparently not that much gain. So I think I'm going to keep it as is and maybe reconsider for Skyfall 2.0 one day…

Overall, I think all of this gives me more than enough space to not worry about this again until Bluesky becomes much bigger :)
And it looks like I'm not even going to need any parallel workers + Redis queue setup anytime soon.
