Identifying Potential Services and Architectural Characteristics to Create Safe & Welcoming Community Spaces

So what kinds of search and discovery services do we think we can implement over the Serval Mesh’s distributed networking primitives? Serval’s Rhizome distributed data system effectively works like a magic file share: various users put files (or file-like objects we call bundles) into it, and those then get distributed throughout the network in an opportunistic manner. The result is an approach that provides eventual consistency, provided that there is sufficient storage on each node, and that there are sufficient opportunities for network communications.

The storage limitation is addressed by a prioritisation system, so that the highest priority material is retained if there is not space for everything. To date, the priority system has been relatively simple, with smaller files/bundles taking priority over larger ones. This could easily be extended to prioritise local content over remote content in a variety of ways: geographic bounding boxes that indicate that content is only of relevance in a given geographical area, a hop counter, or tagging content as belonging to a given community. Community tagging has the advantage that material could follow a community that is on the move. None of these is perfect, nor are they bullet-proof against the various forms of denial of service attack, a weakness that is common to most mesh networking systems. Essentially, any mesh network is very difficult to defend against a well-resourced aggressor. We will thus just focus on how we can add the ability to search and discover content on a mesh network.
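To make the prioritisation idea a little more concrete, here is a minimal sketch of how a node might decide which bundles to retain when storage is tight. It is not how Rhizome is implemented: the `Bundle` fields, the `max_hops` threshold and the ordering (community first, then hop count, then size) are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bundle:
    """Hypothetical stand-in for a Rhizome bundle; field names are illustrative."""
    bundle_id: str
    size_bytes: int
    hop_count: int = 0               # hops travelled from the originating node
    community: Optional[str] = None  # optional community tag


def priority_key(bundle, my_communities, max_hops=3):
    """Smaller tuples sort first, i.e. are considered for retention first.

    Assumed order of preference: content tagged for one of our communities,
    then content within the hop limit, then smaller bundles before larger.
    """
    in_community = bundle.community in my_communities
    is_local = bundle.hop_count <= max_hops
    return (not in_community, not is_local, bundle.size_bytes)


def select_bundles_to_keep(bundles, capacity_bytes, my_communities, max_hops=3):
    """Greedily keep the highest-priority bundles that fit in the capacity."""
    kept, used = [], 0
    ordered = sorted(bundles, key=lambda b: priority_key(b, my_communities, max_hops))
    for b in ordered:
        if used + b.size_bytes <= capacity_bytes:
            kept.append(b)
            used += b.size_bytes
    return kept
```

With an empty set of communities and a very large `max_hops`, this falls back to the existing behaviour of simply preferring smaller bundles.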

The Serval Rhizome system supports both encrypted and non-encrypted content. Each bundle also has meta-data. At the most basic level, we can simply allow users to search this content, without describing any additional services or semantics. We should do this, but also consider services that can be created that intentionally support discovery and search. From another perspective, we should also think about the types of discovery and search services that are desirable for infrastructure-deprived contexts. Ideally, by considering both what is possible and what is desirable, we can find those capabilities that are going to be successful in practice.

From the possibility perspective, we need to formulate search and discovery systems that can work using only cached data. This leads to a positive characteristic, in that a search can be performed very rapidly in almost all cases, because no network latency will be involved. There is of course a negative aspect, in that if the data has not reached the device performing the search, then it cannot be searched. This is the fundamental limitation of the Serval Rhizome data model, and it is the trade-off for the ability to operate completely independently of supporting infrastructure. As with every other network model, the nature of the network influences the nature and architecture of the network services that can be delivered, especially if they are to perform well.
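The cached-only property can be illustrated with a small sketch. Here the local cache is modelled as a plain list of dictionaries standing in for bundle meta-data; the field names and the `search_cached_bundles` helper are hypothetical and are not the Rhizome manifest format or API.

```python
def search_cached_bundles(cache, query):
    """Return cached entries whose meta-data contains every query term.

    `cache` is a list of dicts standing in for locally stored bundle
    meta-data (an assumption for illustration). No network access happens:
    only what has already reached this device can be found.
    """
    terms = query.lower().split()
    results = []
    for manifest in cache:
        haystack = " ".join(str(value) for value in manifest.values()).lower()
        if all(term in haystack for term in terms):
            results.append(manifest)
    return results


# Example: only two bundles have reached this device so far.
local_cache = [
    {"name": "clinic-hours.txt", "sender": "health post"},
    {"name": "market prices", "sender": "traders"},
]
clinic_hits = search_cached_bundles(local_cache, "clinic hours")
weather_hits = search_cached_bundles(local_cache, "weather")  # nothing cached yet
```

The second query returns an empty list, not because the content does not exist somewhere on the mesh, but because it has not yet arrived at this node.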

To help us identify services that are of interest to consumers, it’s probably a good idea for us to look at statistics relating to search engine use, such as these data from 2020:

This lists the top 20 reasons for search globally. It has some limitations for our purposes, which we will discuss shortly, but let’s look at the top 10:

  • 13% - Translate
  • 11% - Social Network
  • 10% - Entertainment
  • 10% - Shopping
  • 10% - Weather
  • 9% - News
  • 8% - Web Services
  • 5% - Gambling
  • 4% - Mail
  • 4% - Travelling

Let’s start by saying that I don’t really have a strong opinion as to how reliable these percentages are, but the overall categories look reasonably sensible, although I am dubious when pornography doesn’t make it into the top-20 list at all. Perhaps that comes under “entertainment”, “shopping” and/or “Web Services” to some degree. But that’s rather academic for us, because we are not interested in creating an infrastructure-free pornography distribution service, even though history has shown that you can tell when a communications medium begins to be useful precisely because it starts being used to carry pornography.

Rather, we actively want to create a platform that doesn’t get overwhelmed by porn. This can be difficult, precisely because of the propensity for communications media to get used for this purpose. The services we are creating need to be crafted as intrinsically safe places, to the maximum extent possible. This is not always easy, and can never be done perfectly, but there are sensible measures that can be employed to help.

The corollary to this is that if you don’t take steps to make communications spaces safe, then they rapidly end up as very unsafe places. This has shown itself repeatedly, especially in recent history: CB radio, with its near total lack of accountability, consists in most places largely of a stream of profanity and abuse, making it unsafe for families and young people, in spite of often quite strong laws that ban even the use of swear words on the medium. Then we have examples closer to our area of interest, such as online chat systems like FireChat. I recall trying it out around the time it was released, in the mundane environment of an Australian city, and even here, the lists of nearby users were eye-scorchingly offensive to read. This isn’t a complaint about FireChat per se (and they might well have great tools for dealing with this now), but really just an example of what I consider to be an axiom of human communications: if people are unaccountable for what they say, they will say just about anything, and a lot of it will be designed to shock, and a lot of it will be sexually explicit and abusive. If we want to create services that are going to be widely adopted by civil society, rather than just by “uncivil” users, then we have to take active measures to help nudge the use of the system in the right directions, and to make abusive use less easy and less attractive.

For example, the text messaging and neighbour discovery mechanisms already in the Serval Mesh are quite conservative about showing you random participants who are a long distance away on the network. This reduces the potential for unaccountable abusive activity, such as the use of offensive account names, or random people sending you messages that appear in your personally curated inbox. Instead, your inbox only shows content from people you have added as contacts, or in the case of the Twitter-like micro-blogging facility, only from people whose feed you have chosen to follow. This creates a safe working place, where you can carry on your day-to-day activity without fear of random anonymous abuse.

The trade-off is that you have to actively choose to go through your contact requests when you want to add new people to your network. This makes it harder for social groups to self-organise, and necessarily means that you may have to make excursions from the safe space into the wild exterior. One way to improve on this is to have a filter for truly “nearby” users, so that you can find people on the network when you meet in person. This could require a direct network connection, i.e., that the users’ devices are physically directly peered with one another, or use some other mechanism, such as a more relaxed hop-count limit. Perhaps better still would be to sort the list of potential contacts by hop-count and time since last reception, so that fresh local requests appear at the top of the list. Of course, this current work on supporting search and discovery mechanisms helps to directly address this issue as well, and we will keep this problem in mind as we design them.
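The sorting idea is simple enough to sketch directly. The `ContactRequest` record and its fields below are hypothetical, as is the optional `max_hops` filter; the point is only to show the ordering by hop-count first and freshness second.

```python
from dataclasses import dataclass


@dataclass
class ContactRequest:
    """Hypothetical record of a pending contact request."""
    name: str
    hop_count: int     # 1 means the devices are directly peered
    last_heard: float  # time of last reception, in seconds


def sort_contact_requests(requests, now, max_hops=None):
    """Order requests so that fresh, nearby ones come first.

    If `max_hops` is given, requests from further away are hidden
    entirely, which models the stricter "nearby only" filter.
    """
    if max_hops is not None:
        requests = [r for r in requests if r.hop_count <= max_hops]
    return sorted(requests, key=lambda r: (r.hop_count, now - r.last_heard))
```

Passing `max_hops=1` gives the strict "directly peered" behaviour, while leaving it unset shows everyone but pushes distant and stale requests to the bottom of the list.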

Another approach that could help would be to have a mechanism to create closed communities, where a common passphrase or similar is used to tag network content as belonging to a particular community. There are also clever cryptographic mechanisms that achieve this effect without allowing a random unaccountable user to obtain and use the passphrase. Such approaches not only help to filter results to a manageable level in a large network, but also directly address some of the unaccountability that leads to what we might generally call anti-social behaviour on these platforms.
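As a rough illustration of the simple shared-passphrase variant, the sketch below derives a key from the passphrase and uses it to compute a keyed tag (an HMAC) over the content, using only Python’s standard `hashlib` and `hmac` modules. The function names, the use of the community name as salt, and the iteration count are assumptions for the example, and this is not an existing Serval mechanism. It also has exactly the weakness noted above: anyone who learns the passphrase can produce valid tags.

```python
import hashlib
import hmac


def derive_community_key(passphrase: str, community_name: str) -> bytes:
    """Stretch the shared passphrase into a key (community name used as salt)."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        community_name.encode("utf-8"),
        100_000,  # iteration count chosen arbitrarily for this sketch
    )


def tag_content(key: bytes, content: bytes) -> str:
    """Compute the community tag that would accompany the content."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def belongs_to_community(key: bytes, content: bytes, tag: str) -> bool:
    """Check a received tag, using a constant-time comparison."""
    return hmac.compare_digest(tag_content(key, content), tag)
```

A node that holds the community key can then discard or hide any content whose tag does not verify, which gives both the filtering and the modest accountability benefits described above.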

Hybrid approaches are also possible, where, for example, only users in your verified social group are able to have image content displayed, so that the benefits of images can be used in a service or facility, but with reduced probability of it filling up with pornographic or other anti-social content. There is a considerable potential solution space for such measures, well beyond the scope of this current project. But what is relevant to this project, and should be included, is at least some exemplar mechanism for discouraging anti-social behaviour and, where necessary, excluding anti-social users and their content. That is, it must, like the existing Serval Chat (MeshMS) and Serval Micro-Blogging (MeshMB) services, follow a “safe place by default” approach.
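The image example can be sketched as a display-time filter. Everything here is hypothetical: posts are modelled as plain dictionaries, and `verified_senders` stands for however membership of your verified social group ends up being established.

```python
def render_post(post, verified_senders):
    """Return the parts of a post that are safe to display by default.

    Text is always shown, but an attached image is only shown when the
    sender is in the viewer's verified group (an assumed policy).
    """
    shown = {"sender": post["sender"], "text": post["text"]}
    if post.get("image") is not None and post["sender"] in verified_senders:
        shown["image"] = post["image"]
    return shown


# Example: the same kind of post from a verified and an unverified sender.
verified = {"alice"}
from_friend = render_post(
    {"sender": "alice", "text": "market day", "image": b"<jpeg bytes>"}, verified
)
from_stranger = render_post(
    {"sender": "mallory", "text": "look at this", "image": b"<jpeg bytes>"}, verified
)
```

The design choice here is that the default is the restrictive one: a post from an unknown sender loses its image rather than a known sender having to be whitelisted for text.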

These insights will be applied in the design of the system, which will also be guided by positive use-cases, through an extended scenario-driven set of user-stories.

content/searchanddiscoveryservicedefinition.txt · Last modified: 30/09/2022 03:47 by Paul Gardner-Stephen