Reclaiming our Web

This blog suggests an alternative way of sharing and retrieving content on the web today. It is a response to questions posed in the first blog.

Posted: Sun 03 Apr, 2016, 11:07
As I wrote in my last blog, there are certain fundamental issues with how we use the web today. We lose our content, and we lose control of when it is shared and for how long. We are beholden to the (well known) content sharing platforms because they are so good, so effective and there appears to be no viable alternative.

My view is that there is an alternative now. In truth, we have always had the ability to host our own website, but only in a very crude and time-consuming way. There are innumerable web hosting providers where you can build a site and then use it to share all your content. But this approach is limited, for reasons covered below, and we need to look beyond it.

There are many pieces of the puzzle that we need to solve. In recent years certain key technologies have become available to us. My final blog in this series will cover these technologies, but for now let's understand what we need.

Self hosted

As with this site, we need a location on the internet that is our own. This can be done in many ways but ultimately we need a web address. This is our web presence. Not Facebook. Not Flickr. Not Instagram. This is where we share our content from and where everyone comes to view it. Crucially, this is also where we come to see leads to content published by other people. More on this later.

Self hosting does not mean getting webspace with a provider. This is not enough. There are at least two serious, affordable and practical options that give us the self-hosting we require: running a Raspberry Pi plugged into your home router, or hiring a droplet from DigitalOcean (or similar). Both let you serve out your pages and also run software to pull in other content.

Content Metadata

Content is data. Metadata is data about data. So content metadata is data about content. We need a way to define and provide both our content and, critically, our content metadata. Take a photo as an example. The photo is the content. The metadata is who took it, when it was taken, a title maybe, a link to the photo and who is allowed to view it.
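To make the photo example concrete, here is a minimal sketch of what one metadata record might look like. The field names, the JSON encoding and the example.com style addresses are my own assumptions for illustration, not a proposed standard.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class PhotoMetadata:
    """Metadata describing one photo; the photo itself lives at `link`."""
    title: str
    taken_by: str
    taken_at: datetime
    link: str
    # Hypothetical field: identities that are allowed to view the photo.
    allowed_viewers: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the record so it can be published alongside the content."""
        record = asdict(self)
        record["taken_at"] = self.taken_at.isoformat()
        return json.dumps(record, sort_keys=True)


# Illustrative values only.
photo = PhotoMetadata(
    title="Harbour at dusk",
    taken_by="alice",
    taken_at=datetime(2016, 4, 2, 18, 30, tzinfo=timezone.utc),
    link="https://alice.example/photos/harbour.jpg",
    allowed_viewers=["bob", "carol"],
)
print(photo.to_json())
```

Note that the record carries only a link to the photo, never the photo itself. That is what lets a reader decide from the metadata alone whether the content is worth fetching.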

This is really an extension of the RSS solution available to us today, but with much more. RSS is metadata about news stories. It allows us to pull in all the metadata and then choose to visit the site that has the story if it is of interest. We can do all this with just the metadata.

Our metadata for our sites needs to wrap up other kinds of content. For example, to serve out, say, photos, blogs, reviews, albums and videos. The metadata for all of these will be similar in some regards and different in others. If someone is interested in just my most recent photos but not my blogs (!), they can pick out just this metadata.

This is another key similarity to RSS. We will publish our metadata to allow others to pull our content. This is much preferable to a push based system like emailing photos or having "interesting" content selected for us on a social media platform.

Quick

Adding new content must be quick. Like less than a minute to add a blog that took you 2 hours to write. Or a second to add a new photo. These are reasonable parameters to work within. Anything more than that and any potential solution fails.

When we think about generating web content that is quick to add and maintain then the obvious solution is a Wiki. This is exactly what a Wiki does, but with one major drawback. The content, structure and metadata are wound up together. Yes there will be RSS publishing plugins to allow us to generate a feed off the back of a blog post, but this is not sophisticated enough. We need various other features which are covered next.

Queryable

This is supposed to be a non-technical blog (not doing very well am I!), but what do we mean by queryable? It means we can pull only very specific kinds of content from a source. For example, say we want to pull photos that were taken within the last week by our friend. This is a query. It allows us to retrieve links to these photos, and all the other metadata, without having to pull in all the metadata about all photos.

This is where we are evolving and extending RSS. With RSS I cannot pick out just the stories published in the last hour. I need to pull all the stories in the feed and then throw away any that I am not interested in. If we could query, then we could have much more control and only retrieve the content that interests us.
Queryable sharing is how we will provide a nuanced, personalised and efficient way of retrieving the content we want, when we want it.
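As a rough illustration of the "photos from the last week by our friend" query, here is a small sketch. The record layout, the `query` function and its parameters are all hypothetical. In a real system the filtering would happen on the publisher's side so that only matching metadata crosses the network; here it runs over an in-memory list purely to show the idea.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Hypothetical metadata records, as a friend's site might publish them.
FEED = [
    {"kind": "photo", "author": "bob", "link": "https://bob.example/photos/new.jpg",
     "published": datetime(2016, 4, 1, 9, 0, tzinfo=UTC)},
    {"kind": "photo", "author": "bob", "link": "https://bob.example/photos/old.jpg",
     "published": datetime(2016, 3, 10, 9, 0, tzinfo=UTC)},
    {"kind": "blog", "author": "bob", "link": "https://bob.example/blog/hello",
     "published": datetime(2016, 4, 2, 9, 0, tzinfo=UTC)},
    {"kind": "photo", "author": "carol", "link": "https://carol.example/photos/cat.jpg",
     "published": datetime(2016, 4, 2, 9, 0, tzinfo=UTC)},
]


def query(feed, kind=None, author=None, since=None):
    """Return only the records that match every filter that was supplied."""
    results = []
    for item in feed:
        if kind is not None and item["kind"] != kind:
            continue
        if author is not None and item["author"] != author:
            continue
        if since is not None and item["published"] < since:
            continue
        results.append(item)
    return results


# "Photos taken within the last week by our friend Bob."
now = datetime(2016, 4, 3, 11, 7, tzinfo=UTC)
recent_photos = query(FEED, kind="photo", author="bob", since=now - timedelta(days=7))
for item in recent_photos:
    print(item["link"])
```

With plain RSS the equivalent would be to download all four records and discard three of them on our side.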

Agents

Hopefully by now a picture is emerging of the landscape of our reclaimed web. We have many sources of (meta)data to choose from and we also publish our own stream about our own content. The hard work of visiting a site, pulling the metadata we want, sorting and aggregating it, and making it available is not a task for a human. We need agents to sift the web and pull back the good stuff.

An agent (or call it a robot or your avatar) is directed by us and runs on our self-hosted platform. It knows where we get our news from. It knows where our friends publish from. It knows which stocks we are tracking. And it knows how frequently we want these updates. It can then take care of gathering all this for us as metadata and presenting it to us on our website.
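One way to picture what the agent "knows" is as a list of sources, each with the kinds of content we want from it and how often we want it checked. The sketch below is only that: the `Source` record, the `due_sources` helper and the addresses are invented for illustration, and the actual fetching is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Source:
    """One place the agent pulls metadata from (hypothetical structure)."""
    name: str
    url: str
    kinds: tuple[str, ...]                   # which kinds of content we want
    every: timedelta                         # how frequently we want updates
    last_checked: Optional[datetime] = None  # None means never checked


def due_sources(sources, now):
    """Return sources never checked, or whose update interval has elapsed."""
    return [
        s for s in sources
        if s.last_checked is None or now - s.last_checked >= s.every
    ]


now = datetime(2016, 4, 3, 11, 7, tzinfo=timezone.utc)
sources = [
    # Checked 20 minutes ago, wanted hourly: not due yet.
    Source("news", "https://news.example/meta", ("story",),
           timedelta(hours=1), now - timedelta(minutes=20)),
    # Checked two days ago, wanted daily: due.
    Source("bob", "https://bob.example/meta", ("photo", "blog"),
           timedelta(days=1), now - timedelta(days=2)),
    # Never checked: due.
    Source("stocks", "https://stocks.example/meta", ("quote",),
           timedelta(minutes=15)),
]

due = due_sources(sources, now)
print([s.name for s in due])
```

A real agent would loop over the due sources, send each one a query for the kinds listed, and record the time of the check.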

This gives us a common location that contains our own content and the aggregation of all live and interesting news from all our other sources. Agents are key to this.

Authentication

We need a way to authenticate people and their agents. Before I decide whether or not to let an agent/person see some metadata about a photo, I need to know who they are. With this information I decide whether or not to give them access. This should leverage some internet-wide authentication framework that is not beholden to a third party.
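The second half of that, deciding what to show once we know who is asking, can be sketched very simply. How the identity gets verified is deliberately left open here; the sketch assumes it has already happened. The `allowed_viewers` field, the `PUBLIC` marker and the `visible_to` function are hypothetical names.

```python
from typing import Optional

PUBLIC = "*"  # assumed marker meaning "anyone may view"

# Hypothetical metadata records with a per-record list of allowed viewers.
RECORDS = [
    {"title": "Harbour at dusk", "allowed_viewers": ["bob", "carol"]},
    {"title": "Site banner", "allowed_viewers": [PUBLIC]},
    {"title": "Family lunch", "allowed_viewers": ["carol"]},
]


def visible_to(records, identity: Optional[str]):
    """Return the records an already-authenticated identity may see.

    `identity` is None for an anonymous caller, who sees public records only.
    """
    allowed = []
    for record in records:
        viewers = record["allowed_viewers"]
        if PUBLIC in viewers or (identity is not None and identity in viewers):
            allowed.append(record)
    return allowed


print([r["title"] for r in visible_to(RECORDS, "bob")])
print([r["title"] for r in visible_to(RECORDS, None)])
```

The important point is that the check runs on our own site, against our own list, rather than being delegated to a sharing platform.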

A better Web

Hopefully this articulates a web where we control our content, publish it from our own site, decide who sees it and efficiently pull in other content. It allows us to control the context in which our content is viewed. We do not need to send all our content to some third party for the privilege of sharing it. And the web may evolve into an infinitely more interesting, diverse, personalised and fulfilling place.

All the pieces of the puzzle exist - today - to allow us to reclaim our web. My final blog in this series will present one solution. And it will be technical; it needs to be.