Go Beyond

Only read if you don't mind being offended.

Data Hoarding

Amid privacy concerns, the average netizen neglects a particularly obvious answer. Many of us go to great lengths to have a semblance of privacy: blocking trackers, using Tor, adopting pseudonyms, opening "private browsing" windows, or disabling JavaScript entirely. These measures can be effective to different degrees. Some are quite inconvenient; others are best practices and not that cumbersome.

But what was wrong with working offline? No means of tracking, no notion of which page you browse, which movie you watch, how long you focus on a particular segment, etc. I believe netizens have been led away from the concept of working offline by the "cloud" era.

In some ways, the "cloud" era of Google Apps, public VPS hosting, Salesforce, Datadog, Loggly, and many others, has been immense progress.

The "cloud's" mantra is:

We can do it cheaper, faster, and easier than you.

Instead of everyone reinventing the wheel by hosting their own email, organizational applications, metrics, security, etc., the "cloud" takes care of it for you. In many economic senses, this is indeed quite logical. Specialization has been shown again and again to work quite well. Not perfectly, but you tend to get better results from five specialists than from five generalists who each split their time across five needs.

The "cloud" also speaks in other ways:

We're so good that we can have metrics and tracking on all user data to make your experience even better.

This manifests as many possible things:

  • A/B testing.

  • "Free" services by turning your data into exportable goods.

  • Imposing varying ethics on users.

Now, many metrics and tracking are used for good things. Modern UI is vastly superior to what it used to be. Many phone applications, web applications, etc., are intuitive from the first use. They have no "learning curve". They get out of the way and, for the most part, let you carry the same expectations as a user across much of computerized existence. You don't have to context-switch between applications because they flow so well together. I am not going to argue against the benefits of this.

The eerie bits come into play perhaps almost as often as the benefits. YouTube knows what videos you like, how long you watch them, even how engaged you are by how quickly you skip ads. YouTube has a good idea of which ads are most likely to engage you and thus earn it (and the video creator) more revenue. YouTube can also apply any degree of ethics-washing it wants. The "suggested" videos can be fine-tuned to push any number of ideologies. A subtle tweak to search results, video suggestions, and generated playlists can influence anything from elections to the clothes people buy on a grand scale.

This notion of election flipping is not even something I am reaching far to state. You can search for Xfinity (Comcast) TV advertisements which openly claim to use watching habits to identify the voters most likely to be influenced. To me, this alone is stunning but not shocking. I've seen such an ad firsthand on Xfinity.

The convenience, cheapness, and utter ease of the "cloud" have won many of us over. I would say in many ways, the "cloud" era has been a good one. And in many cases, "cloud" services are the better option. Do you really want to set up your own server in an office to handle your email rather than just give a handful of people Gmail accounts? The email generally ends up in cleartext anyway, and it can simply be read on the receiving end to profile you.

The "cloud" is right that many processes were much too difficult, error-prone, and hard to reproduce. However, I must call for a new era, perhaps the standalone era.

There is a newthink wave of anti-cloud behaviors. It spans anything from Go to youtube-dl to decentralized social media applications. It should be very possible to make tooling which is as good as the "cloud" offerings while being untracked, uninfluenced, and hosted locally.

And aside from these locally hosted applications which are much more censorship resistant, I have to jump back to my original title. Largely enabling the "cloud" era is cheap and reliable storage. The same cheap and reliable storage can do wonders for any power user.

While much internet use is read/write (commenting, communications), I would say a great deal is read-only. Looking through online manuals could easily be replaced by local manuals. Every library in the world can fit onto many mechanical hard drives or even solid state drives. It should be fairly easy to have Wikipedia, every software manual/resource you regularly use, and any other reference downloaded locally. With today's storage capacities and fast internet speeds, downloading what you use the most should not take long. To the privacy-minded: now they only know that you downloaded all of Wikipedia, not which pages you browsed and for how long. To the reliability-obsessed: local storage can easily exceed internet and remote service reliability.
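To put a rough number on "should not take long", here is a minimal back-of-the-envelope sketch. The function name `download_hours` and the figures passed to it are my own illustrative assumptions, not measurements of any real archive or connection.

```python
def download_hours(size_gb: float, speed_mbit_s: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) at speed_mbit_s (megabits/s)."""
    size_bits = size_gb * 1e9 * 8                 # gigabytes -> bits
    seconds = size_bits / (speed_mbit_s * 1e6)    # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: a 100 GB archive over a 100 Mbit/s line.
    print(f"{download_hours(100, 100):.1f} hours")  # prints "2.2 hours"
```

Real transfers will be slower than this ideal figure, since it ignores protocol overhead and server-side limits.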

With any move to a data hoarding mentality, intelligent mechanisms to keep these archives up to date can be important. Tooling can be improved to the point where the local data is as simple to use as it is online.
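The simplest "keep it up to date" mechanism I can think of is to flag anything that has not been refreshed in a while. The following is only a sketch under my own assumptions: the helper `stale_files` is hypothetical, and it treats a file's modification time as the moment it was last fetched.

```python
import time
from pathlib import Path


def stale_files(root, max_age_days, now=None):
    """Return files under root whose modification time is older than max_age_days.

    Assumption: mtime stands in for "last downloaded"; a real tool
    would more likely record fetch times in its own index.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical archive directory; list everything older than 30 days.
    for path in stale_files("archive", max_age_days=30):
        print("needs refresh:", path)
```

A scheduled job could feed that list to whatever downloader fetched the files in the first place.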

Another benefit of this is having offline copies which quickly show upstream changes and likely censorship. All of the YouTube videos taken down, Tweets removed, and Reddit users shadowbanned can be easily tracked. Perhaps you like having your mind fine-tuned by the powers that be. Or perhaps you'd like to see exactly what they remove and question who they want you to be.
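Tracking removals comes down to comparing two snapshots of the same source. Here is a minimal sketch, assuming each snapshot is a plain mapping from an item ID (a video ID, a post ID) to its saved text. The names `fingerprint` and `diff_snapshots` and the sample data are hypothetical and not tied to any site's API.

```python
import hashlib


def fingerprint(text: str) -> str:
    """SHA-256 of the content, so large items can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two {item_id: content} snapshots taken at different times."""
    removed = sorted(old.keys() - new.keys())   # gone upstream
    added = sorted(new.keys() - old.keys())     # new since last snapshot
    changed = sorted(
        k for k in old.keys() & new.keys()
        if fingerprint(old[k]) != fingerprint(new[k])
    )
    return {"removed": removed, "added": added, "changed": changed}


if __name__ == "__main__":
    # Made-up data purely for illustration.
    last_week = {"a1": "first post", "b2": "second post", "c3": "third post"}
    today = {"a1": "first post", "c3": "third post (edited)", "d4": "new post"}
    print(diff_snapshots(last_week, today))
    # {'removed': ['b2'], 'added': ['d4'], 'changed': ['c3']}
```

Anything in the "removed" list still exists in your local copy, which is the whole point.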

Thanks for reading.

