In my previous post on this topic, I discussed some of the ways we rate the truth of information. One of the key points of that post was that much of our information rating is based on the opinions of other people that we, for one reason or another, think are good judges of the information’s reliability. Due to the sheer amount of information we deal with, none of us has the time to personally vet the truth of all the information we rely on, even in the cases where we have the skills and resources to do so.
In this post, I’ll start to introduce some possible ways that computers can help improve our ability to rate information. In particular, I want to look at ways to improve our ability to rate information based on other people’s opinions. Many of the methods we’ll look at are just extensions of ones we already employ today in our everyday lives.
I want to emphasize that this post is just meant to begin an early exploration of a very broad topic, not a prescription for immediate action in the form of explicit algorithms, and not a very comprehensive exploration either (there will be more posts later). Also, I’ll only be discussing the associated mathematics in a very general way, although I’ll add some links to sources with more details on potential approaches.
The need for mathematical models: Computers “think” in math
As any programmer can tell you, one of the most challenging tasks in getting a computer to solve a problem is figuring out how to model the problem in a mathematically formal way. Below I’ll examine some of the ways we can mathematically model a web of trust system for rating information (for brevity, I’ll refer to this system as a trust network).
Trust networks naturally generate individualized ratings
Before we jump into the modeling problem, it’s worth pointing out one important aspect of the trust network approach to rating information that I only addressed in the comments of my last post: these networks don’t need to give universal answers when rating information. For one thing, each person can get individualized answers based on the people they trust and the ways in which they trust them.
Also, each user is free to select their own personalized algorithms for analyzing the information from the people in their web of trust. While I’m visualizing a network of trust that is at least partially publicly accessible, there’s no requirement that all users run the same software. The only requirement for interoperability is that each version of the trust network software is able to operate over a commonly understood data format that is exchangeable by nodes in the trust network.
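As a purely illustrative sketch, an exchangeable record in such a format might look something like the following (the field names and values here are hypothetical, not a proposed standard):

```python
import json

# A hypothetical interchange record that nodes running different trust network
# software could exchange, as long as they agree on the format.
trust_assertion = {
    "source": "alice",                     # who is making the rating
    "predicate": "Hive has a blockchain",  # the statement being rated
    "probability": 0.95,                   # the source's estimate that it's true
    "signature": None,                     # placeholder for a cryptographic signature
}

# Any implementation that can parse the shared format can interoperate.
print(json.dumps(trust_assertion, indent=2))
```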
Modeling units of “information to be rated” using predicates
The first problem we need to address is how we can represent the information we want to rate. This general area of computer research is known as knowledge representation. As you can probably imagine, it is a huge topic, and we’ll make simplifications for now to avoid as many of its complexities as we can.
To assist in the mathematical modeling necessary to map the information rating domain to a computer, I’ll briefly introduce some terms from symbolic logic. Symbolic logic theory is a branch of mathematics that was created to formalize the rules of logic we use when we reason about the truth of statements, so it’s worth a brief look at some of the things it has to offer.
Borrowing from the language of symbolic logic theory, I will refer to the units of information to be rated in a web of trust network as “predicates”. In symbolic logic theory, a predicate is something that is affirmed or denied of the subject in a proposition in logic. For example, one predicate is “Hive has a blockchain”. This predicate makes a statement about the subject, Hive, that can be either true or false, but obviously not both simultaneously.
Assigning probabilities to predicates
Traditional symbolic logic requires predicates to be either true or false, but we need a model that represents probabilities of truth rather than absolutes. Let’s assign a decimal value from 0 to 1 to indicate a predicate’s truth probability. For example, 0 means it’s false, 1 means it’s true, and 0.7 means there’s a 70% chance that it is true.
In an information rating system, it’s likely that very few predicates should ever get rated as absolutely true (1) or absolutely false (0). This is because for most of the things we believe to be true, we need to be prepared to receive new information that contradicts our current beliefs. In fact, it might at first seem like we could never rate anything as absolutely true or false, but there are internally consistent information domains such as mathematics, where it is quite possible to rate things as absolutely true (e.g. 1+1 = 2).
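To make the idea concrete, here’s a minimal sketch (in Python) of a predicate with an attached truth probability; the field names are just illustrative, not a proposed design:

```python
from dataclasses import dataclass

@dataclass
class Predicate:
    subject: str        # what the statement is about
    statement: str      # the claim being made about the subject
    probability: float  # estimated probability the claim is true, in [0, 1]

# 0 means certainly false, 1 means certainly true, and values in between
# express our degree of belief.
p = Predicate(subject="Hive", statement="Hive has a blockchain", probability=0.95)
```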
Modeling the inputs from a web of trust network using probabilities to rate them
As a quick refresher from my previous post, a web of trust is one or more information sources that you trust to some extent to rate the truth of information (e.g. your friends are one example of a web of trust you have).
Let’s look at one possible way we can mathematically model the opinions about a predicate’s truth coming from a web of trust. Let’s start with a simple example, where your web of trust consists of 2 people, Alice and Bob, who both solve the same math problem on a test, with Alice reporting the answer to the problem is 5 and Bob reporting the answer is 10.
Before we can optimally cheat on our own test, we need to decide which of the two answers is more probable. Typically we’ll do this based on an internal rule like “Alice is good at math” and “Bob is no math genius”. In this simple case, we would likely just decide that Alice is right and go with her answer. But next we look around the room a little more and spot that Carol, who is good at math, has Bob’s answer. This new evidence may lead us to believe that Bob is actually right and Alice is wrong.
In order for a computer to help us solve this problem, we need to assign probabilities to our ratings of the information sources. So let’s say we believe Alice is right 80% of the time, Bob is right 60% of the time, and Carol is right 70% of the time. At this point, we have enough information to compute which is the more probable answer using probability theory (although, if you’re like me, you’re still going to want a computer to do it). One promising avenue for modeling the probability of truth in computations of this sort is Bayesian probability theory.
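As a rough sketch of what such a computation could look like, here’s a toy Bayesian calculation over the Alice/Bob/Carol example. It assumes the sources answer independently and that a wrong answer is spread uniformly over some number of plausible wrong values (N_WRONG below); both assumptions, and all the numbers, are just for illustration:

```python
# Accuracy we assign to each source (made-up numbers from the example above).
accuracy = {"Alice": 0.8, "Bob": 0.6, "Carol": 0.7}
# The answer each source reported.
reports = {"Alice": 5, "Bob": 10, "Carol": 10}

# Simplifying assumption: when a source is wrong, its answer is spread uniformly
# over N_WRONG plausible wrong values, so two sources agreeing on the same wrong
# answer by coincidence is unlikely.
N_WRONG = 10

def likelihood(true_answer):
    """P(all the reports | this is the true answer), treating sources as independent."""
    p = 1.0
    for person, answer in reports.items():
        if answer == true_answer:
            p *= accuracy[person]
        else:
            p *= (1 - accuracy[person]) / N_WRONG
    return p

# Candidate answers are the ones reported, with a uniform prior over them.
candidates = set(reports.values())
raw = {c: likelihood(c) for c in candidates}
total = sum(raw.values())
posterior = {c: raw[c] / total for c in candidates}
print(posterior)  # with these numbers: roughly {5: 0.10, 10: 0.90}
```

With these made-up numbers, Bob and Carol agreeing on the same answer outweighs Alice’s higher individual accuracy, so the model favors 10, matching the intuition above.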
Confidence factor: how sure are we of a probability?
In real life, we’re rarely able to rate information so precisely that we’re able to say things like “Alice is right 80% of the time”. So just deciding our own internal ratings of information sources isn’t a trivial problem.
For example, we may believe Alice is right most of the time, but we might not have enough experience with Alice’s math skills to know if that means Alice is right “7 times out of 10” or “9 times out of 10”. Let’s call this a “confidence interval” of 0.7 to 0.9, as it’s a measure of how confident we are in rating Alice’s skill at math. Another more powerful way to model this uncertainty mathematically is to rate an information source with a probability distribution rather than a simple confidence interval. In either of these two models, we can tighten the range of our confidence interval or probability distribution as we get new information that allows us to increase our precision in rating Alice’s math ability.
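As one example of the probability distribution approach, a Beta distribution is a common way to model uncertainty about a probability like “how often Alice is right”. Here’s a minimal sketch using SciPy, with made-up counts of how often we’ve seen Alice be right or wrong:

```python
from scipy.stats import beta

# Model "how often Alice is right" as a Beta distribution whose parameters are
# (times we've seen her right, times we've seen her wrong). Counts are made up.
right, wrong = 7, 3
belief = beta(right, wrong)
print(belief.mean(), belief.interval(0.9))  # mean ~0.7, but a wide 90% interval

# As new observations of Alice arrive, the distribution tightens.
right += 14   # suppose we watch 20 more problems and she gets 14 right
wrong += 6
belief = beta(right, wrong)
print(belief.mean(), belief.interval(0.9))  # same mean, narrower interval
```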
For programmers who are interested in experimenting with using computers to perform math with probability distributions, here’s a Python-based library that I came across in my research: PyMC3.
Problems with predicate meaning: meaning requires context
One problem that immediately arises with the use of predicates is how to decide when two information sources are reporting on the same subject. As a simple example, let’s say you have two information sources, and one says “Coke is good for you” and another that says “Coke is bad for you”. Serious problems emerge immediately in determining if these two sources of information are reporting conflicting information, or if they are talking about entirely different things. One may be talking about Coke, the product from the Coca-cola company, whereas the other source may be talking about cocaine.
A related problem is that we have to decide whether the predicates are making the same assertion about the subject of the predicate. For example, in this case, we have to decide what “good for you” means: does it mean “good for your state of mind” or “good for your health”?
There are many possible ways to get around these problems, of course. Normally we use contextual clues to determine the meaning of such statements. If we read the statement in an article about soft drinks, we can be pretty sure they’re talking about the Coca-Cola product.
But while this is easy enough for us to do normally, a few decades ago this would have been a very difficult problem for a computer to solve. Nowadays, however, computers have gotten much better at making distinctions of this sort, so one potential solution is to provide a trust network with enough contextually relevant information that it can use to distinguish seemingly similar predicates (or even identify the same predicate expressed slightly differently).
Another way we can solve this problem is to get the information sources themselves to agree about a unique set of predicates. In other words, we can use human brains to make the distinctions instead of relying on the computers to interpret the words associated with the predicate.
There are many ways that the trust network could go about achieving such consensus on the meaning of predicates, and it’s a research topic in and of itself. One interesting method might be to combine the use of computers to identify probable matches in predicates, then have humans in the trust network rate the probability of those predicates being identical.
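Here’s a very rough sketch of that two-step flow, using a crude textual similarity measure from Python’s standard library as a stand-in for whatever matching technique a real network would use; the threshold and example predicates are arbitrary:

```python
from difflib import SequenceMatcher

def text_similarity(a, b):
    """Crude textual similarity in [0, 1]; a real network would use something
    more context-aware, but this shows the overall flow."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

existing = ["Hive has a blockchain", "Coke is good for your health"]
incoming = "Coke is good for you"

# Step 1: the computer proposes probable matches above some (arbitrary) threshold...
candidates = [(p, text_similarity(p, incoming)) for p in existing]
probable = [(p, score) for p, score in candidates if score > 0.6]

# Step 2: ...and humans in the trust network rate the probability that the
# matched predicates really are identical.
for p, score in probable:
    print(f"Possible match ({score:.2f}): ask humans whether {p!r} == {incoming!r}")
```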
Unique subjects allow us to create a set of predicates about that subject
In the previous section, we talked about a mechanism for uniquely identifying subjects in predicates. Once we have such a mechanism, we can begin to group predicates associated with that subject. This can be very useful in many ways.
One obvious way this kind of predicate grouping can be useful is simply as a new way to get information about a subject: when we know one thing about a subject, it’s likely that we will want to know other things about it.
Predicate grouping also allows us to reason about the subject and make conclusions about it. For example, the more information/predicates we know about a subject, the easier it is to identify if a subject in another predicate is the same subject, because the associated predicates serve as contextual information. It also allows us to deduce new information (new predicates) about the subject from existing predicates, which I’ll discuss in more depth in a later post.
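In code, this kind of grouping can be as simple as indexing predicates by an agreed-upon subject identifier; the identifiers and predicates below are only illustrative:

```python
from collections import defaultdict

# Each predicate references a subject identifier agreed upon by the network.
predicates = [
    {"subject": "hive",  "statement": "Hive has a blockchain",        "probability": 0.95},
    {"subject": "hive",  "statement": "Hive has three-second blocks", "probability": 0.90},
    {"subject": "alice", "statement": "Alice is good at math",        "probability": 0.80},
]

# Group predicates by subject so everything known about a subject is in one place.
by_subject = defaultdict(list)
for p in predicates:
    by_subject[p["subject"]].append(p)

print(by_subject["hive"])  # all the predicates we can reason with about Hive
```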
Sentient subjects
One particularly important group of subjects for an information rating system are subjects that represent a person or an organization, because these are the information sources in our trust network.
Predicates about people can allow us to rate the probability of several important attributes that can help us decide whether to trust them as information sources. For example, one of the most fundamental such attributes is identifying whether an information source is a real person, and not just one of many sock puppets used to magnify the opinions of one person. Predicates can also be used to indicate a person’s specialized areas of knowledge and we can use this information to help make a judgment on how much we should weight their opinions in a given information domain.
We can also associate a set of cryptographic keys with subjects that represent people or organizations. In this way, the owner of the keys can “act” for the subject in the trust network. This has many potential uses, but one of the most obvious is it enables an information source to cryptographically sign information they disseminate, in order to authenticate that the information is originating from the signing source. This is basically how Hive works today with Hive accounts and posts. In a trust network, we could even assign probabilities to the likelihood that a subject still retains control of their keys.
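As a sketch of the signing idea, here’s what it could look like with the PyNaCl library (one of several libraries providing Ed25519 signatures); the message contents are hypothetical:

```python
from nacl.signing import SigningKey

# The subject (a person or organization) holds a private signing key...
signing_key = SigningKey.generate()
# ...and publishes the corresponding verify (public) key to the trust network.
verify_key = signing_key.verify_key

# The source signs the predicate rating it is disseminating.
message = b'{"subject": "hive", "statement": "Hive has a blockchain", "probability": 0.95}'
signed = signing_key.sign(message)

# Anyone holding the public key can confirm the rating came from that source;
# this raises nacl.exceptions.BadSignatureError if the data was tampered with.
verify_key.verify(signed)
```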
Grouping “related predicates” into domains
A “related” problem to that of identifying when information is being reported on the same subject is how to group predicates that are related to each other.
It’s desirable to group predicates into domains, because we would like to be able to rate an information source based on their domain knowledge. In other words, it’s not very useful for our trust network to rate Alice as being good at math, unless we can also classify the predicates we are trying to rate as being in the domain of math or not.
Classification of predicates into domains is also a very general problem in computer science with many possible solutions. One method that many Hive users are very familiar with is the process of adding tags as metadata to a post they create. In this case, we’re classifying the entire post as belonging to one or more domains. In a trust network, user-specified tags could serve as meta-predicates that assert domains to which a predicate belongs, along with associated probabilities that the predicate actually belongs to each of the specified domains.
Another issue that arises with domain representation is how to perform computations on predicates belonging to overlapping domains. If we assign separate judgment rating probabilities to Alice for math and biology, how does the trust network choose which judgment rating to apply when rating a predicate that has been grouped into both domains? Or does it use both in some way, since her ability to rate the predicate is arguably higher when she’s an expert in two domains that the predicate belongs to?
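To make the question concrete, here’s one possible (and deliberately naive) policy sketched in Python: simply use the source’s best rating among the matching domains. The other approaches mentioned above are just as defensible; this is only meant to show where such a policy would plug in:

```python
# Alice's judgment ratings per domain (made-up numbers).
alice_ratings = {"math": 0.8, "biology": 0.65}

def rating_for(predicate_domains, source_ratings):
    """One simple policy: when a predicate belongs to several domains the source
    is rated in, use the source's best matching rating. Averaging, weighting by
    domain-membership probability, or boosting for multi-domain expertise are
    equally plausible alternatives."""
    matching = [source_ratings[d] for d in predicate_domains if d in source_ratings]
    return max(matching) if matching else None

# A predicate that has been classified into both math and biology.
print(rating_for({"math", "biology"}, alice_ratings))  # -> 0.8
```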
Reducing the scope of the problem: rating useful predefined predicates such as identity
I suspect the simple examples above have already given you some insight into the difficulties involved in knowledge representation, but the “good news” is that we can often make simplifying assumptions, as long as we’re willing to limit the capabilities of our trust network.
As a starting point, we could simply predefine a set of “interesting” predicates for our trust network and group them into distinct non-overlapping domains. Later work could then focus on how to dynamically introduce new classes of predicates to be rated and how to handle overlapping domains.
For example, applying a network like this to Hive, we could use it to rate Hive accounts as subjects, and rate predicates such as “account X is a real person”, “account X is a government-registered entity”, “account X is a plagiarist”, and “account X is an expert in STEM”. The latter predicate could be generalized to “account X is an expert in Y community”. Again, to avoid controversy, I want to point out that the ratings provided by such a trust network are individualized (and thus very decentralized), so it would serve as an aid to each user, but it would not act as a universal social scoring system.
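Purely as an illustration, such a predefined set might start as little more than a list of templates like the following (the account name is hypothetical):

```python
# A hypothetical starter set of predefined predicate templates about Hive accounts.
PREDICATE_TEMPLATES = [
    "account {account} is a real person",
    "account {account} is a government-registered entity",
    "account {account} is a plagiarist",
    "account {account} is an expert in {community} community",
]

def make_predicate(template, **fields):
    """Instantiate a predefined predicate for a specific subject."""
    return template.format(**fields)

print(make_predicate(PREDICATE_TEMPLATES[0], account="alice"))
print(make_predicate(PREDICATE_TEMPLATES[3], account="alice", community="STEM"))
```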
Upcoming topics: creating and optimizing a web of trust
In this post, I’ve focused on some potential ways to create a probabilistic model of information to be rated by a web of trust. In future posts, we’ll continue to look into modeling issues, ways that computers can collect information to create a web of trust, ways to detect problems with a web of trust, using inference engines to infer new information about predicate subjects and detect contradictions, and methods of optimizing a trust network over time.