The Future Is Federating Forges

Just because Git is federated doesn’t mean code forges are. ActivityPub can be extended by ForgeFed to solve this problem.

It’s come to my attention that a lot of the early E-mail debates around the ForgeFed vocabulary are no longer publicly available in a mailing list. The list has either been closed, marked private, or shut down. I wanted to take a moment and share my copies of the debate points. I do not intend to call specific individuals out. I will try my best to characterize the differing viewpoints’ strengths and weaknesses honestly.

I am not a contributor to ForgeFed. However, as the author of go-fed, I am biased toward seeing it succeed.

Background

Git is a distributed version control system that has built-in tooling and support for sharing patches using E-mail. E-mail is a federated protocol that allows sharing information about each locally distributed repository’s state. Code can be passed from one person to another as long as the other person is also using Git. That other person can accept or reject the bit of code shared with them. This lets E-mail just shuffle the bytes – the information – from point A to point B. Git is hugely popular. Its mechanisms and concepts power the current state-of-the-art open source development processes, and probably many proprietary processes as well.

Unfortunately, right now people who use Git over E-mail are in the minority. For starters, users want to use their browser to view and manipulate their repositories. That’s a lot of HTTP requests and responses displaying and modifying repository data. Also, there are other lifecycle items to be managed besides code, such as release announcements, many different kinds of documentation, issue management, and many more. This is the value that services like GitLab, Bitbucket, and GitHub offer. Git itself doesn’t even attempt to solve these problems in its tool, so everyone steps up to offer their own services to make this part – the managing of software – easier.

ActivityPub & ForgeFed

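Last year, in early 2018, there were lots of discussions around extending the ActivityStreams vocabulary for code forges. The definition of code forge may mean different things to different people, but it often includes the lifecycle items mentioned above, such as issue management, documentation, and release announcements, alongside version control.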

I want to highlight that the actual version control system is only one subset of a code forge.

ForgeFed initially mentioned only Git, but the vision later expanded to be version-control-system-agnostic. This in hindsight led to a heated debate.

The “Git Is Already Federated” Argument

This argument persists today: because Git is already federated, ActivityPub for code forges (ForgeFed) is not worth pursuing.

Firstly, the Git tool already has support for E-mail, and E-mail supports extensions. E-mail can be extended with custom extensions to provide the capabilities that code forges provide (as listed above). E-mail is mature with many open source libraries, in contrast to ActivityPub which doesn’t have many libraries. Furthermore, many of these E-mail libraries are actually supported as standard libraries, in contrast to ActivityPub which has zero standard libraries.

Furthermore, using E-mail is easier than ActivityPub. For one, the developer will need to implement ActivityPub which requires a lot of boilerplate, a public domain to do integration testing, and understanding of new or still-evolving specifications. On the other hand, the developer will just need to call into an existing E-mail library. No need to re-implement E-mail.
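To make the “just call into an existing E-mail library” claim concrete, here is a minimal sketch using Python’s standard `email` package to wrap a patch in a message. The addresses, subject, and diff text are made up for illustration, and nothing is actually sent; a real workflow would more likely hand this off to `git send-email`.

```python
from email.message import EmailMessage


def build_patch_email(sender, recipient, subject, patch_text):
    """Wrap a patch in a plain-text e-mail message (nothing is sent)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "[PATCH] " + subject
    msg.set_content(patch_text)
    return msg


# Hypothetical patch and addresses, for illustration only.
example_patch = (
    "--- a/README\n"
    "+++ b/README\n"
    "@@ -1 +1 @@\n"
    "-Hello\n"
    "+Hello, Fediverse\n"
)
patch_email = build_patch_email(
    "alice@example.org", "bob@example.org", "Fix greeting", example_patch
)
```

The point of the sketch is only that the message format is already handled by a standard library; delivery, threading, and review conventions are separate concerns it does not cover.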

Also, code forges can still be a web-based UI, but under the hood everything is sent and received via E-mail. E-mail is a better choice than ActivityPub because ActivityPub instances, when they go down, result in permanent data loss. E-mail lives in a user’s inbox, has robust retransmit options, and has tooling that allows re-creation of archives. This is not an option for ActivityPub implementations.

Another argument is that a code forge shouldn’t look like GitHub in the first place. Because GitHub has commercial motivations, its software design of a “repository” as viewed through a browser is inferior to a federated approach. And it should be an E-mail only approach, because if there is both an E-mail and ActivityPub based federation, they will be incompatible. Also, any development man-hours put into ActivityPub could instead be put into E-mail.

The drawback with E-mail is that the Git tool is intimidating and not easy to use. However, with proper education, people will pick up on it and begin to use it more.

Instead, “The Future Is Federating Forges”

I respect many of the points laid out by the argument. My viewpoint is not an opposing one. Instead I think it makes more room for “other federating ideas” and makes the “Git is already federated” argument seem like a more fundamentalist, narrow view. I welcome Git-over-E-mail, but I think to exclude ActivityPub is short-sighted.

ActivityPub shares data between peers. E-mail does this too. Where it differs is that ActivityPub has set forth extensible rules for the construction of a linked-graph of data. Furthermore, that graph is not limited to one kind of data. It can be microblogs, pictures, this blog post. It could be a record of economic activity such as exchanges, loaning, bartering, processing. Or code forge related data. More importantly, it is a graph that does not have to start from scratch each time new extensions are brought on board. On the one hand, by joining the graph early, microblogging software developers have positioned themselves to get all future innovations without putting in the innovative-legwork. They just have to accept the new extensions and display the incoming data! The late-joiners, such as code forges, can bootstrap their potential user base by re-using the existing graph which is a major incentive to interoperate with the existing ecosystem.
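As a rough illustration of what “accept the new extensions and display the incoming data” could look like, here is a sketch of an ActivityStreams-style activity that carries a forge object from an extension vocabulary. The `forge` namespace, the `forge:Issue` type, the `forge:repository` property, and all URLs are hypothetical stand-ins and are not taken from ForgeFed. The `render_for_timeline` helper is likewise an assumption about how a microblogging client might fall back on core properties such as `name` and `content` when it meets a type it does not know.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
# Hypothetical extension namespace standing in for a forge vocabulary.
FORGE_CONTEXT = {"forge": "https://forge.example/ns#"}

activity = {
    "@context": [AS_CONTEXT, FORGE_CONTEXT],
    "type": "Create",
    "actor": "https://forge.example/users/alice",
    "object": {
        "type": "forge:Issue",  # unknown to a plain microblogging client
        "id": "https://forge.example/repos/demo/issues/1",
        "name": "Crash on empty input",
        "content": "The parser panics when given an empty file.",
        "forge:repository": "https://forge.example/repos/demo",
    },
}


def render_for_timeline(act):
    """Render any activity using only core properties, ignoring extensions."""
    obj = act.get("object", {})
    title = obj.get("name") or obj.get("summary") or "(untitled)"
    return "{} shared: {} - {}".format(act["actor"], title, obj.get("content", ""))


timeline_line = render_for_timeline(activity)
wire_format = json.dumps(activity)  # what would travel between peers
```

A client that later learns the extension can show richer detail, such as the repository link, without the sender changing anything.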

E-mail, on the other hand, lacks the ecosystem of self-describing RDF, linked data, and graph-building. That is one of two crucial differences I see that heavily influence my opinion. The other difference will be discussed later.

This benefits the users of the Fediverse as a whole. The graph evolves to contain all sorts of interesting linked data, shown to users in their native software without requiring them to go create new accounts in silos. Today, when I see a toot on Mastodon about a controversial pull request, I have to look up the GitHub repository and search it for pull requests. If people are kind, they’ll provide a link instead. But I have to leave Mastodon to view it. Instead, people could have just directly boosted in Mastodon a controversial comment in that code forge, it would have just appeared in my Mastodon home timeline, and then I could have just read it. That’s the power of the Fediverse. Early software developer adopters get new kinds of data in the gigantic growing graph while latecomers can bring their new innovations into the graph and bootstrap their userbase. This is only possible today if you are one very large silo, large enough to contain all of these kinds of graphs.

This is why I firmly believe the future is federating forges. I like Git over E-mail and do not want to diminish it in any way. Both it and ActivityPub are great. But since Git over E-mail is missing the linked-graph property that ActivityPub has, I am more excited about ActivityPub.

Now that my position is clear, let me examine the stronger points made by the “Git Is Already Federated” argument.

There Are Fewer ActivityPub Libraries Than E-mail Libraries

ActivityPub doesn’t have many libraries. That is accurate. There are very few people who have implemented ActivityPub, even fewer who have done so in a library – and I am one of those! Therefore, people wanting to write federating code forges today will be faced with the daunting task of implementing ActivityPub. They wouldn’t have to implement E-mail if they went that route instead.

However, I have somewhat mitigated this problem by building the go-fed library, which with ForgeFed could allow projects like Gitea or Gogs to use the library to speedily start federating over ActivityPub.

Also, while ActivityPub has no standard libraries, I don’t take that to mean that it is inferior, just newer. So I don’t put much weight on this distinction.

ActivityPub Instance Perma-Gone Results In Data Loss

Note that we’re not talking about a crash that loses data which requires a restore-from-backup. This is where a node in the Fediverse goes offline permanently.

I agree on this point. If an instance goes down permanently, the data it contained is gone. This can be both a good and bad thing (but mostly bad). For privacy reasons, it is sometimes nice to have the internet forget. For the common case (administrator just takes it down), it is not nice. There are some innovations going on in this area. Chris Webber is examining content-addressable IRIs, for example. The content would then be locatable independently of the instances alive on the network.
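The general idea behind content addressing can be sketched in a few lines: the identifier is derived from the bytes themselves, so any peer holding a copy can serve it under the same name. This is only an illustration of the concept, not the scheme being examined; the `urn:sha256:` prefix and the in-memory `store` are made up for the example.

```python
import hashlib

# In-memory stand-in for whatever storage a peer would actually use.
store = {}


def content_address(payload: bytes) -> str:
    """Derive an identifier from the content itself (hypothetical naming scheme)."""
    return "urn:sha256:" + hashlib.sha256(payload).hexdigest()


def put(payload: bytes) -> str:
    """Store the payload under its content-derived identifier and return it."""
    key = content_address(payload)
    store[key] = payload
    return key


note_key = put(b"The parser panics when given an empty file.")
```

Because the key depends only on the content, storing the same bytes on a different peer yields the same identifier, which is what makes the content locatable independently of any one instance.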

However, I think this comparison to E-mail can be made fairer. Much like a user has a local copy of an E-mail sent from their peer (even if that peer is gone), ActivityPub implementations can cache a peer’s message locally for viewing by its user. So the ActivityPub implementation can still show a peer’s message locally even if that peer is down permanently. So I think this could be mitigated by implementations in the short term, and fixed in the long term.
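A minimal sketch of that short-term mitigation might look like the following. `FakePeer` stands in for a remote instance that would normally be reached over HTTP, and `PeerCache` is a hypothetical helper, not part of any ActivityPub library; it keeps the last successfully fetched copy and falls back to it when the peer can no longer be reached.

```python
class FakePeer:
    """Stand-in for a remote instance; a real one would be reached over HTTP."""

    def __init__(self, objects):
        self.objects = objects
        self.online = True

    def fetch(self, iri):
        if not self.online:
            raise ConnectionError("peer is permanently gone")
        return self.objects[iri]


class PeerCache:
    """Keep a local copy of every object fetched from a peer."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._copies = {}

    def get(self, iri):
        try:
            obj = self._fetch(iri)
        except ConnectionError:
            # Peer is unreachable: serve the local copy if we have one.
            if iri in self._copies:
                return self._copies[iri]
            raise
        self._copies[iri] = obj
        return obj


note_iri = "https://peer.example/notes/1"
peer = FakePeer({note_iri: {"type": "Note", "content": "hello from a peer"}})
cache = PeerCache(peer.fetch)

first = cache.get(note_iri)   # fetched from the peer and cached
peer.online = False           # the peer goes away for good
second = cache.get(note_iri)  # served from the local copy
```

Anything that was never fetched before the peer disappeared is still lost, which is why this is a mitigation rather than a fix.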

Now, let’s address some weaker points.

ActivityPub And E-mail Are Incompatible

Yep. ActivityPub is incompatible with a lot of things. But bridges exist already between ActivityPub and other protocols that it doesn’t interoperate with, so I suspect an E-mail to ActivityPub bridge could be built to unify concepts. To be fair, bridges do require some work, but that doesn’t diminish the fact it is still possible to build them.
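To give a feel for the translation step such a bridge would need, here is a sketch that maps an incoming E-mail onto a `Create` activity wrapping a `Note`. The field mapping (subject to `name`, body to `content`) and the actor IRI are assumptions made for illustration; a real bridge would also have to handle identity, threading, attachments, and delivery.

```python
from email.message import EmailMessage


def email_to_activity(msg, actor_iri):
    """Map an e-mail onto a Create/Note activity (hypothetical field mapping)."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor_iri,
        "object": {
            "type": "Note",
            "attributedTo": actor_iri,
            "name": str(msg["Subject"]),
            "content": msg.get_content().strip(),
        },
    }


# Hypothetical incoming review mail.
incoming = EmailMessage()
incoming["From"] = "alice@example.org"
incoming["Subject"] = "Re: [PATCH] Fix greeting"
incoming.set_content("Looks good to me.")

bridged = email_to_activity(incoming, "https://bridge.example/users/alice")
```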

ActivityPub Devs Could Instead Help E-mail Dev

Yes, they could. I have no statistics, so I honestly don’t know how strong or weak this “could” is. I know for me, “hacking on E-mail” is down on my list just under “polish my gravestone.”

So I think this is a really weak point for an argument, especially considering it is purely a measure of belief.

Building a Library vs Using a Library

It is easy to get caught up in “E-mail is just a bunch of library calls” versus “ActivityPub is implementing all these things.” This is an inherently unfair comparison, as using a library for a protocol is not the same as building one out. This line of reasoning is found everywhere despite its deliberate unfairness.

Instead, a stronger criticism is that implementing ActivityPub requires tying it too much to an implementation. I would disagree, with the go-fed library as the counterexample.

An even stronger criticism than that one is that an ActivityPub library is way more complex to use than an E-mail library. I’m not sure whether I agree or disagree. Getting E-mail servers configured properly can be non-trivial. ActivityPub has some non-trivial considerations as well. Considering how much longer E-mail has been around than ActivityPub, it is hard for me to be conclusive one way or the other.

Boost an Issue?

I omitted a weak line of argument that goes “Could you imagine tooting a patch?” because it comes from a world view that does not share my enthusiasm for or the vision I see in ActivityPub. That’s fine, but it means I view it as a weak and unimaginative line of reasoning. I’ve seen plenty of toots alone on Mastodon where, if some proprietary third party service were using the linked-data graph instead, the content could have been much more readily accessible to the users of the Fediverse.

ForgeFed Isn’t About Git

This whole time, the argument has focused on Git. As I mentioned ages ago, ForgeFed has its sights set on supporting any kind of version control system. This is a huge boon for ForgeFed that the “Git Is Already Federated” argument cannot even begin to address because its premise requires it to be talking about Git. I am not a fortune teller so I don’t know if Git will stay popular for another month, year, decade, or heck the next century. But ForgeFed would be able to support the next big thing, whereas Git-over-E-mail would find itself having to work out how to interoperate with the next kind of version control system. It could be zero work, it could be a lot. I don’t know. But this is the second of two crucial differences I see.

Conclusion

I support Git over E-mail efforts. I support ForgeFed efforts. I don’t contribute to either effort. I don’t want to diminish either effort. I have an interest in seeing the ForgeFed stuff succeed, so I’ve put forth how the Fediverse works and how it is fundamentally more than just federation. E-mail and ActivityPub federate, but only ActivityPub has the Fediverse. I don’t view the two efforts as competing, and just want to help get both camps moving forward, side-by-side.


Created: May 19, 2019 17:20:59 EDT
Last Updated: Jun 19, 2019 17:13:28 EDT
By: Cory Slep

Fediverse Comments

@cjslep
« Git-over-E-mail would find itself having to work out how to interoperate with the next kind of version control system. »

Sort of, ForgeFed would have to decide what other features (issues, wiki, etc) it would support over Email (along with version control).

But yes, and thanks for the overview!


Commented on: May 22, 2019 21:41:46 UTC
By: https://social.coop/users/django

@cjslep I think someone should just go forward and implement an AP prototype. This kind of discussion is getting nowhere. If someone considers it federated already, they have nothing to do but make it not painful to use. They had like 20 years to do it but they didn't. If git over email was so wonderful, we wouldn't have forges I guess.


Commented on: May 23, 2019 20:33:04 UTC
By: https://birb.site/users/charlag

@cjslep Imagine posting an issue (and discuss it) directly from Mastodon or Pleroma. We need to get Gitea and GitLab on board.


Commented on: May 27, 2019 15:44:37 UTC
By: https://mastodonsocial.ru/users/drequivalent