Let's start with the following:
- ATProto has positioned itself as "no compromises on centralized use cases". Well, in that case, let's say it can't do *worse* than eg ActivityPub. This includes replies. You can't do *worse* than ActivityPub on replies and mentioning someone, etc.
- We will interpret the most centralized system as one where there's only one provider for storage and distribution of all messages: the least amount of user participation.
- At the other end of the spectrum, maximum decentralization means the *most* participation: every user self-hosts.
- Just as blogging is decentralized but Google (and Google Reader) are not, it is not enough to have just PDS'es in Bluesky be self-hosted. When we say self-hosted, we really mean self-hosted: users are participating in the distribution of their content.
- We will consider this a gradient. We can analyze how well a system starting at the extreme of centralization can "scale towards" the greatest degree of decentralization.
- Finally, we will analyze both in terms of the load of a single participant on the network but also in terms of the amount of network traffic as a whole.
Okay. That is the structure we will use for our analysis. Let's compare "message passing" vs ATProto-style "global public shared heap".
So okay. Let's get the CS notation out of the way:
"Message passing" at full decentralization:
- O(1) from a single node's perspective
- O(n) from a whole-network zoom-out perspective (inherent: add a user, it's one more user)
Okay, that's reasonable and what you'd expect
"Public global no-missed-messages (or not worse than AP) shared-heap" ATProto style at full decentralization:
- O(n) from a single user's perspective (!)
- O(n^2) from a whole-network perspective (!!!!!!)
Oof I'd better back this up because that ain't good!
In other words, as our systems get more decentralized, message passing handles things fine. Individual nodes can participate in the network no matter how big it gets. The zoom-out for the network as a whole doesn't get more complicated as we add more users OR move more users towards self hosting.
Things are NOT good, if I'm correct above, as we make things more decentralized in the atproto-public-shared-heap model. The more self-hosting and indeed the more "full nodes" join, the more expensive it gets for each of the nodes, and the network EXPLODES!
Truly self-hosted atproto is NOT POSSIBLE!
And there is no solution to this without adding directed message passing. Another way to say this is: to fix a system like ATProto to allow for self-hosting, you have to ultimately fundamentally change it to be a lot more like a system like ActivityPub!
Now I left more of the precise analytical explanation in my blogpost. But social media isn't great for that, so go check out my blogpost if you want to go through all that (eg if you're more like @dthompson and less like me, I'm a narrative person) https://dustycloud.org/blog/re-re-bluesky-decentralization/
Here's our story:
- We have 26 users: [Alice, Bob, Carol, ... Zack].
- Each user sends one message per day, which is intended to have one recipient. (This may sound unrealistic, but it's fine for modeling.)
- Each user sends a message in a ring: Alice => Bob, Bob => Carol, ... Zack => Alice
Now just before you say "wait but ATProto isn't for DMs", yes, but one way this could happen is that eg Bob follows Alice, Carol follows Bob, etc.
What I'm saying is, messages can have an "intended audience". That's what we're using here.
Before we get into this, remember, the main difference between "message passing" and the "shared heap" is the former has directed and delivered messages, the latter does not. See prev blogpost for explainer.
So, what happens in a day for both systems? Because that's what we really want to find out.
Under message passing, Alice sends her message to Bob. Only Bob need *receive* the message. So on and so forth.
- For an individual self-hosted node, messages passed per day: 1.
- Per the decentralized network, total messages passed zooming out: 26.
That's about what we'd expect.
Under the public-gods-eye-view-shared-heap model, each user must know of all messages to know what may be relevant. Each user must *receive* all messages.
- Individual self-hosted server, 26 messages must be received per day.
- Zoom out on whole decentralized network: 26*26: 676!
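Here's a minimal sketch of that day in Python, assuming the ring story exactly as told above. The function name `simulate_day` and the single-letter stand-ins for Alice..Zack are made up for illustration; this is a counting model, not an implementation of either protocol.

```python
from string import ascii_uppercase

def simulate_day(users):
    """Count messages received per node for one day of the ring story."""
    n = len(users)
    # Ring: each user addresses exactly one message to the next user.
    messages = [(users[i], users[(i + 1) % n]) for i in range(n)]

    # Message passing: only the intended recipient receives each message.
    passing = {u: 0 for u in users}
    for _sender, recipient in messages:
        passing[recipient] += 1

    # Shared heap: every self-hosted node receives every message, because it
    # cannot know in advance which ones are relevant. Like the story above,
    # this counts a node's own message too.
    heap = {u: len(messages) for u in users}

    return passing, heap

users = list(ascii_uppercase)  # 26 stand-ins for Alice..Zack
passing, heap = simulate_day(users)
print(max(passing.values()), sum(passing.values()))  # 1 26
print(max(heap.values()), sum(heap.values()))        # 26 676
```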
Sounds survivable with 26 users though, right?
Let's try just adding 5 more users.
Message passing:
- Per node per day: no change.
- Per the network: 5 more messages.
Public gods-eye-view-shared-heap-model:
- Per node per day: 5 more per day
- Per network: ((31 * 31) - (26 * 26)): 285!
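The same arithmetic as a small sketch, assuming one message per user per day and every node self-hosting. `network_receives` and `added_cost` are hypothetical helper names, not anything from either protocol.

```python
def network_receives(n, model):
    """Total messages received per day across n self-hosted nodes."""
    if model == "message_passing":
        return n        # one delivery per message
    if model == "shared_heap":
        return n * n    # every node receives every message
    raise ValueError(f"unknown model: {model}")

def added_cost(n, k, model):
    """Extra receives per day when k users join a network of n users."""
    return network_receives(n + k, model) - network_receives(n, model)

print(added_cost(26, 5, "message_passing"))  # 5
print(added_cost(26, 5, "shared_heap"))      # 285
```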
Now, could we handle a million self hosted users? Is it possible? No problem in message passing. EXPLOSIVE with atproto.
What if we had a million users and added just 5 more? How many more messages must the network bear?
5 new messages in message passing.
*10,000,025* new messages sent in atproto!
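Where that number comes from, under the same modeling assumptions (n existing self-hosted users, k new ones, one message per user per day; the symbols n, k and Δ are just notation for this sketch):

```latex
\begin{aligned}
\Delta_{\text{message passing}} &= (n + k) - n = k \\
\Delta_{\text{shared heap}} &= (n + k)^2 - n^2 = 2nk + k^2 \\
n = 10^6,\ k = 5:\quad \Delta_{\text{shared heap}} &= 2 \cdot 10^6 \cdot 5 + 5^2 = 10{,}000{,}025
\end{aligned}
```

The 2nk term is the problem: the cost of adding one more node grows with the size of the network it joins.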
"Christine that's ridiculous, we're not expecting a million self-hosted users"
Well I think it would be nice!
But regardless, ActivityPub has 27,000 servers on it, all meaningfully participating in the network.
ATProto, in its current design, would be crushed to DEATH
"But Christine", you may say, "I heard gossip might fix this!"
No. It cannot.
In fact, I was being more generous than a gossip network, and assumed you only *received* a message once.
With gossip you might *receive* more than once.
But you need to receive a message to know it.
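To make that concrete, here's a toy flooding sketch. Everything in it is an assumption for illustration: the overlay (each node pushes to the next `fanout` nodes around a ring), the name `flood`, and the forward-once rule. It is not how ATProto or any particular gossip protocol works; it only shows that duplicates add to the receive count rather than reduce it.

```python
from collections import deque

def flood(n, fanout, origin=0):
    """Push one message through a toy overlay; return receipts per node."""
    receipts = [0] * n
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        # Each node forwards once, to the next `fanout` nodes on the ring.
        for step in range(1, fanout + 1):
            peer = (node + step) % n
            receipts[peer] += 1  # the peer pays to receive, duplicate or not
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return receipts

once = flood(26, fanout=1)   # best case: each node receives the message once
twice = flood(26, fanout=2)  # redundancy: each node receives it twice
print(sum(once) * 26)   # 676 receives per day for 26 messages
print(sum(twice) * 26)  # 1352 receives per day for 26 messages
```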
ATProto was designed for a "big world" view. That's fine! But I'm trying to show seriously what happens if it was actually, really decentralized.
*Every* fully participating node added to the network makes the network explosively more expensive.
ATProto doesn't scale towards decentralization.
In other words, the public god's-eye-view allows for a pantheon, but not a civilization. You can only have so many gods who see all.
An important characteristic of a decentralized system is scoping what you *don't* need to know.
This wasn't in the design goals of ATProto, and it has effects.
I may be coming across as some academic computer science nerd. It's actually the opposite. I'm a humanities nerd who cares about the agency of users so much I've twisted myself into a shape where I can do a computer science thing.
But architecture matters. It affects the worlds we can have.
This is what I mean when I say that Bluesky's goal of "credible exit" may be reasonable, but the system is not decentralized. There is no getting around the fact that the system, as designed, is designed for a few large players. Small players can play on the *periphery*, but they can't play the big game.
Now, you might think, maybe ATProto could fix this!
And it can.
And the solution, ultimately, will end up looking... a lot like ActivityPub.
The point is that nearly everyone knows at this point that "sure, Bluesky is centralized today, in practice!" But a lot of the responses I see are "but decentralization is just around the corner thanks to ATProto!"
So that's why I'm writing this out.
Well, that's it. We've reached as far as we're going tonight.
There's still a bit left: a bit of reframing about what I am and am not concerned about with decentralized identity, and then a bigger topic about Bluesky's design goals vs community expectations. Then we'll talk about values.
Those last two, expectations and values, are really important to me. And I think they'll maybe be the most thoughtful part of all of this.
Of course, they're probably not what most people care about from me, about this. Probably what I've said is all many care to hear from me and that's fine.
For those who care about such things, tune in tomorrow, where hopefully we'll wrap this up. For those who were just hoping to hear the decentralization analysis, hope you found it useful.
Regardless, I wish you a very happy
=== REST OF TODAY BREAK ===
Well hello.
So yesterday I stepped onto a crumbled piece of sidewalk, twisted and sprained my ankle, and fucked up my wrist. That, and I think I've said the most important things and this is day *three* of summarizing things from my blogpost, so I will be brief.
It was nice to be prompted about @spritely's values and it led to a good conversation internally, and we did capture those in my blogpost, but I think that should be covered again from a more official organizational side, separate from this.
I also clarified a bit: the parts I'm concerned about with the did:plc stuff aren't as much the governance, and I think Bluesky is taking some good steps there by planning a certificate transparency log. That's good. Glad to see it.
I do think Bluesky is heading in a tough direction though in terms of community expectations vs the ATProto philosophy that replication and indexing of a firehose are the primary way things work.
It's a tough situation but Bluesky is speedrunning Twitter so fast it practically is Twitter.
People want Bluesky's devs to prevent their content from being replicated and indexed by people they don't like. Well, I think it really is that: a *conflict*.
People were encouraged to join a Twitter replacement, they are expecting Twitter-like solutions. Can't blame 'em.
Given that "anyone can replicate and index!" is literally the *entire* design philosophy of ATProto, it's not going to be something easy to solve. I don't have an answer, but hey, I'm working on fairly fundamentally different designs, so it's not my problem to solve.
That said...
Like the present-day fediverse, Bluesky was majorly popularized by a bunch of queer people early on. As a trans person I watched a bunch of my friends join, and while the community was small they felt so safe that they posted things they never would in today's environment.
The decision about whether or not to boot horrible, well known transphobic people (protip: answer is yes) from the platform seems clear enough to me. I'm not sure the "speech vs reach" approach is working.
And it seems to me people are finding they don't have tools in their hands to do anything.
For all its faults, and there are *many* and I have *railed* against the instance-oriented approach to moderation on the fediverse and have been writing about and working towards alternatives for a while, instance moderation empowers better here.
I think this will be a real test for Bluesky.
But more broadly I think *neither* the present-day fediverse nor Bluesky meet the needs of the future.
The "global town square" is a social media concept invented by centralized social media in the early web 2.0 era.
Social media by millennials, for millennials. What's the future?
@cwebber In my mind this question translates to "how do we allow people to use the Internet to connect with one another in a similar way to how they do in real life?"
I'm just me. Why do I need to worry about an "instance" or a "domain" to be able to connect with other people? Or an "account" for that matter. And what is this "platform" nonsense?