(TL;DR: My musings on what “LibreSaaS” could or should be. Initially posted here; moved to my main blog and updated a bit along with the recent disputes and discussions I have had on that issue…)
With Microsoft acquiring GitHub a few days ago, distrust in large platforms and platform operators in parts of the community has once again grown even worse than it was before, following events such as Facebook’s Cambridge Analytica affair. Looking around in my tech filter bubble, which is pretty much dominated by people from a FLOSS, security and privacy background, there seems to be only one valid way of dealing with such issues once and for all: Decentralize, at all costs. Run your own software, host your own data, keep everything you need under your own control. Plain and simple.
Where are we?
And, just as plain and simple: I don’t think that’s going to work. And maybe, after all, it shouldn’t work like this. There are a bunch of more or less simple reasons for that, and I just want to pick one, labeled “smarter use of resources”.
I’ve been working professionally in IT since the late 1990s. These days I come from a corporate background and am technically in charge of a rather specialized online service, basically for handling larger loads of documents. Running servers, storage and infrastructure, we repeatedly ran into the same issue: If you set up your infrastructure so that it easily handles even critical load peaks, it gets pretty expensive pretty fast in every respect (hardware costs, even assuming we run mostly FLOSS so that at least licensing and software support costs aren’t an issue; costs for keeping operations staff around; costs for power supply, internet connectivity, housing, all that).
If you keep things smaller, costs will decrease, but you will hit a wall much sooner and more easily. So there’s a strong desire to scale in a “smart” way – and you end up with exactly what Amazon does: Use resources such as RAM, disk storage and CPU cycles with fine-grained billing based on actual use. No housing, no hardware ownership, no need to replace servers every few years, less operations staff, unlimited scalability… We don’t actually use Amazon, but we went down a similar route with a local infrastructure provider offering a similar model in its own data center. One of the best choices we ever made, in terms of both stability and costs. So much for the corporate point of view.
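To make the trade-off a bit more tangible, here is a minimal back-of-the-envelope sketch. All numbers in it are made up for illustration (the load profile, the "server units" and the price are assumptions, not figures from our setup), and it assumes the same unit price for both models, which real providers usually don’t offer.

```python
# Hypothetical load over one day, in "server units" needed per hour:
# 20 quiet hours followed by 4 peak hours. Purely illustrative numbers.
hourly_load = [2] * 20 + [10] * 4

# Hypothetical price per server unit per hour, identical for both models.
PRICE_PER_UNIT_HOUR = 1.0


def fixed_capacity_cost(load, price):
    """Cost when capacity is sized for the peak and paid for around the clock."""
    return max(load) * len(load) * price


def pay_per_use_cost(load, price):
    """Cost when only the units actually used in each hour are billed."""
    return sum(load) * price


fixed = fixed_capacity_cost(hourly_load, PRICE_PER_UNIT_HOUR)
metered = pay_per_use_cost(hourly_load, PRICE_PER_UNIT_HOUR)
print(f"sized for peak: {fixed}, billed per use: {metered}")
```

With this invented profile, sizing for the peak costs three times as much as paying per use. The more even your load is, the smaller that gap gets, and a higher per-use price can eat it up entirely.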
It’s in no way different for smaller entities such as FLOSS projects: Consider that small Android app on F-Droid, maintained by one or two devs who try to build a valuable app in their spare time. Knowing they have a long backlog of issues, knowing they could be working on implementing that feature you kindly requested two years ago or fixing that bug you stumble across every other day, would you really want them to spend time and effort on hosting their own Git or collaboration server? No? Good. Hoped so.
“The cloud” and hosted services help in those situations. They keep you (no matter whether as a company or an individual) from wasting time, effort, … on things someone else could do way better than you, just like other people are possibly better at fixing your car or preparing the food you buy and eat every day.
One really stark example is Google. Using their services for calendar, mail, contact management and documents, you get easy-to-use services that are way better integrated than on any other platform I’ve seen before, and I’ve tried hosting such services for our company several times over the last decades. It never worked. Integration of mail, contacts and calendar never really was a straightforward and enjoyable experience, and no matter how bad the setup was, it always got worse the very moment I, say, had to try and do a major update of an older ownCloud installation. Even if I badly wanted to roll this on my own with any of the FLOSS applications out there, I’d never be able to get even remotely near the quality, usability, stability, availability and feature set offered by Google. For most developers in my environment, it’s just the same with GitHub.
Yet all these benefits come at the price of essentially handing control over your data, your workflow, your communication and your productivity to some commercial, for-profit entity which, at virtually any time, might change its direction in a negative way. At this point, in the worst case, you’re so closely tied to this particular provider that moving elsewhere becomes practically impossible. This, by the way, includes moving to self-hosting, even if you’re able to find a viable piece of software for each of your requirements and manage to get all your data out of your then-former platform.
So what to do about that? Looking at the drawbacks of self-hosting and the drawbacks of cloud, SaaS or more generic *aaS solutions, it all looks like there are only bad decisions: Either you end up locked in some vendor’s walled garden, or you end up tied to hosting increasingly complex hardware and software yourself forever. Worse: Neither “Open Source” nor “Software Libre” provides a real solution to this, because both approaches try to solve different problems.
That’s where I start to believe we need something like “Libre SaaS” to solve these problems that arise from more specialization, from more complex infrastructure, and from more end-users and fewer technical people working with technology. We don’t just need freedoms like the ones defined in the GPL, even though they will be extremely helpful here, too. We might need another approach that explicitly addresses the idea of users who want to use services (maintained by someone else) rather than software (which they run themselves to provide that service), and that provides freedoms to those users.
We want an approach to keep complex infrastructure running in a sustainable way (in terms of operations staff, in terms of energy and other resources used for maintaining it, …). We want an approach where administration of such systems, just like everything else, is done by experts who know what they’re doing, not by “part-time administrators” who operate server infrastructure of their own just because they need it for a certain purpose. We want an approach to provide as many users as possible with all the benefits of current cloud and server-side solutions, but without most of the drawbacks of the current providers.
Right now I’d like to see “Libre SaaS” as either a code of conduct or a contract between operators and users of a certain service adhering to the following rules and requirements:
(0) User first: Before even considering any technical or semi-technical “non-functional” requirements, end users and their use cases should be top priority. Why? Well, very few people consciously make stupid decisions. Very few people would consciously choose to use services or providers that keep on violating their privacy in more or less obvious ways. But apart from different users having different ideas of what “privacy” actually means, quite a bunch of users these days are totally non-technical and choose tools that satisfy some of their needs. There’s the whole Google stack (including mail, calendar, documents) they can use to really get work done without a second thought. There’s something such as WhatsApp that keeps them in touch with the majority of the contacts in their address book. There’s something such as Twitter they can use to replace a plethora of independent RSS feeds with one app where all the “news” outlets they used to consume are aggregated – and where they can even communicate.
We need to accept that a load of the services currently in “mainstream use” have seen vast adoption because they did what, apparently, the whole Open Internet community before didn’t manage to do: Build tools that are easy to use for totally untrained users. And, honestly, I don’t like how they do it, but I actually like that they do it – because it might force a load of people to leave their comfort zones and ivory towers and get closer to understanding that end-users do have totally different expectations and requirements. Right now, we spend a lot of time arguing why certain services are “bad” or “dangerous” from our technical understanding – which doesn’t change much. We shouldn’t try to scare people away from tools they use and that satisfy their needs; we should rather focus on building tools that satisfy these users’ needs and are better when it comes to our (non-functional) requirements.
(1) Full data control: The user is completely able to export all of her/his data from a hosted platform in a standardized way so that it can be re-used either in another provider’s environment or in a self-hosted system. Implicitly, then, seamlessly importing data in such standard formats would need to be possible as well. Open standards, at this point, seem most relevant, and I’d even like to push this a bit further towards lightweight open standards – standards that can easily be implemented on top of custom infrastructure. The export becomes pointless if you do have your data but it takes you several years of development just to build some application that can actually process the exported data.
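As a rough illustration of what “lightweight” could mean here, the following sketch exports a user’s data as plain JSON and imports it again. The format name `example-export`, its fields and the sample user are all hypothetical and only serve the example; for contacts and calendars, a real service would more likely build on existing open standards such as vCard and iCalendar.

```python
import json

# Hypothetical, made-up format identifier; not an existing standard.
FORMAT_NAME = "example-export"
FORMAT_VERSION = 1


def export_user_data(user):
    """Serialize everything belonging to a user into plain, readable JSON text."""
    document = {
        "format": FORMAT_NAME,
        "version": FORMAT_VERSION,
        "contacts": user["contacts"],
        "events": user["events"],
    }
    return json.dumps(document, indent=2, sort_keys=True)


def import_user_data(text):
    """Read an export produced by any provider that speaks the same format."""
    document = json.loads(text)
    if document.get("format") != FORMAT_NAME or document.get("version") != FORMAT_VERSION:
        raise ValueError("unsupported export format")
    return {"contacts": document["contacts"], "events": document["events"]}


# Invented sample data for the round trip.
alice = {
    "contacts": [{"name": "Bob", "email": "bob@example.org"}],
    "events": [{"title": "Release planning", "start": "2018-06-20T10:00:00Z"}],
}

exported = export_user_data(alice)
restored = import_user_data(exported)
print(restored == alice)
```

The point is less the code than its size: if an importer for a format fits into a few dozen lines, moving between providers or to a self-hosted setup stays realistic.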
(2) Interoperability: Services should, as far as possible, adopt open standards and protocols for communication as well. No walled garden. It should be easy to wire in any other external application, service or component one sees fit for a particular purpose. This might require the development of some new standards for certain purposes, but it seems necessary to remove issues such as being tied to, for example, GitHub because the code, all the users, all the issues and all the communication are there. Maybe you don’t want that “all-or-nothing” approach, because, for example, you already run an issue tracker somewhere which should be seamlessly integrated with the source code hosting service.
(3) ‘Ethical’ hosting: There should be agreements, codes of conduct, … outlining what hosting of the service looks like. Generally, all user data should be treated as strictly private, and no operations staff should be accessing, reading, copying, selling or in any other way messing with it. No automated “processing” of user content to generate user-targeted ads. No selling of data or metadata to external companies for whichever analytics they might want to do on top of it. Transparency in all data processing that happens or needs to happen in order to keep the system running (though in many aspects way too heavy, the idea of documenting processes as found in GDPR seems a good starting point for that). Maybe it would be best to have things such as on-disk or in-database encryption so that users actually have options to actively prevent certain kinds of data “abuse”; still, I’d love to see a modus operandi based upon trust, as a lot about using technology in an age of highly specialized experts is about trusting others to do the right thing.
(4) Transparent calculations: Let’s face it. A load of the issues arising from current cloud providers such as Google, Dropbox and the like are caused by the single fact that most of these services essentially are “free” (as in “no payments involved”) for the vast majority of users. This totally contradicts the fact that a vast majority of people, too, have bills to pay and spend time on paid work in order to do so. Providing services at Google quality involves quite a bunch of people being paid each month, just as it involves a bunch of highly complex global infrastructure to run the software and vast amounts of electric energy to keep that infrastructure running. These costs won’t disappear if you offer a service “for free”, so for all of these free services you can be sure the companies offering them do have ways of still earning money with them, at the very least enough to keep the service permanently running and improving. I do not consider this a bad thing, and I’m perfectly okay with companies operating that way. The only thing I’d find desirable here would be to open these calculations to the users, to make people see how much money is required to keep the system running – and where it comes from. The next step obviously would be paid plans and service level agreements between users and providers. Yes, it takes money. No, “Libre SaaS” shouldn’t be about free-of-charge hosting. It should be about valuable, sustainable, fair hosting, also from a financial point of view.
(5) Built on Software Libre: I am unsure whether this really is of much importance, but I definitely think it could help: If you already have a service running on a certain Software Libre component, it’s easier to move either between different providers or between a provider and a self-hosting environment if required. That’s similar to the GitLab approach and seems a good thing, at least to me.
So much for the basic idea. Right now this is little more than a collection of thoughts. I have no real idea how something like that could be brought to life. Obviously it would, first and foremost, require providers willing to work under such conditions, as well as users willing to accept them (including, most likely, a price tag rather than an “everything-is-free” view of the world). Maybe it would require or foster different approaches to providing such services – commercial entities as well as cooperative hosting, provider communities and the like. Maybe it would also require a structure similar to the Free Software Foundation to work out and promote this idea any further. Maybe, then again, it’s totally off and something that will never work for reasons that are obvious to anyone but me. Not sure. Feedback welcome. 🙂