Social Choice Theory is a pretty mature field that deals with the question of how to combine the preferences of different individuals into a single preference or a single choice. This field may serve as a conceptual foundation in many areas: political science (how to organize elections), law (how to set commercial laws), economics (how to allocate goods), and computer science (networking protocols, interaction between software agents). Unsurprisingly, there are interesting computational aspects to this field, and indeed a workshop series on computational social choice already exists. The starting point of this field is Arrow‘s theorem that shows the there are unexpected inherent difficulties in performing this preference aggregation. There have been many different proofs of Arrow’s impossibility theorem, all of them combinatorial. In this post I’ll explain a basic observation of Gil Kalai that allows quantifying the level of impossibility using analytical tools (Fourier transform) on Boolean functions commonly used in theoretical computer science. At first Gil’s introduction of these tools in this context seemed artificial to me, but in this post I hope to show you that it is the natural thing to do.
Arrow’s Theorem
Here is our setting and notation:
- There is a finite set of participants numbered
.
- There is a set of three alternatives over which the participants have preferences:
. (Arrow’s theorem, as well as everything here actually applies also to any set
.)
- We denote by
the set of preferences over
, i.e.
is the set of full orders on the elements of
.
The point of view here is that each participant has his own preference
, and we are concerned with functions that reach a common conclusion as a function of the
‘s. In principle, the common conclusion may be a joint preference, or a single alternative; Arrow’s theorem concerns the former, i.e. it deals with functions
called social welfare functions. Arrow’s theorem points out in a precise and very general way that there are no “natural non-trivial” social welfare functions when
. (This is in contrast to the case
, where taking the majority of all preferences is natural and non-trivial.) Formal definitions will follow.
A preference really specifies for each two alternatives
which of them is preferred over the other, and we denote by
the bit specifying whether
prefers
to
. We view each
as composed of three bits
(where it is implied that
,
, and
. Note that under this representation only 6 of the possible 8 three-bit sequences correspond to elements of
, where the bad combinations are 000 and 111.
The formal meaning of natural is the following:
Definition: A social welfare function satisfies the IIA (independence of irrelevant alternatives) property if for any two alternatives
the aggregate preference between
and
depends only on the preferences of the participants between
and
(and not on any preferences with another alternative
). In the notation introduced above it means that
is in fact just a function of the
bits
(rather than of all the bits in the
‘s).
One may have varying opinions of whether this is indeed a natural requirement, but the lack of it does turn out to imply what may be viewed as inconsistencies. For example, when we use the aggregate preference to choose a single alternative (hence obtaining a social choice function), then lack of IIA is directly tied to the the possibility of strategic manipulation of the derived social choice function.
We can take the following definition of non-trivial:
Definition: A social welfare function is a dictatorship if for some
,
is the identity function on
or the exact opposite order (in our coding, the bitwise negation of
).
It is easy to see that any dictatorship satisfies IIA. Arrow’s theorem states that, with another minor assumption, these are the only functions that do so. Kalai’s quantitative proof requires an assumption that is more restrictive than that required by Arrow’s original statement. Recently Elchanan Mossel published a paper without this additional assumption, but we’ll continue with Kalai’s elementary variant.
Definition: A social welfare function is neutral if it is invariant under changing the names of alternatives.
This basically means that as a voting method it does not discriminate between candidates.
Arrow’s Theorem (for neutral functions): Every neutral social welfare function that satisfies IIA is a dictatorship.
Quantification for Arrow’s Theorem
In CS, we are often quite happy with approximation: we know that sometimes things can’t be perfect, but can still be pretty good. Thus if IIA is “perfect”, then the following quantitative version comes up naturally: can there be a function that is “almost” an IIA social welfare function and yet not even close to a dictatorship? Let us define closeness first:
Definition: Two social welfare functions are
-close if
. The probability is over independent uniformly random choices of
.
The choice of the uniform probability distribution over (strangely termed the impartial culture hypothesis) can not really be justified as a model of the empirical distribution in reasonable settings, but is certainly natural for proving impossibility results which then extend to other reasonably flat distributions. Once we have a notion of closeness of functions, being
-almost a dictatorship is well defined by being
-close to some dictatorship function. But what is being “almost an IIA social choice function”? The problem is that the IIA condition involves a relation between different values of
. This is similar to situation in the field of property testing, and one natural approach is to follow the approach of property testing and quantify how often the desired property is violated:
Definition A: A function is an
-almost IIA social welfare function if for all
we have that
. I.e. for a random input, a random change of the non
-bits is unlikely to change the value of
.
A second approach is simpler although surprising at first, and instead of relaxing the IIA requirement, relaxes the social welfare one. I.e. we can allow to have in its range all possible 8 values in
rather than just the 6 in
. We call the bad values, 000 and 111, irrational outcomes, as they do not correspond to a consistent preference on
.
Definition B: A function is an
-almost IIA social welfare function if it is IIA and there exists a social choice function
such that
and
are
-close.
Note that we implicitly extended the definitions of closeness and of being IIA from functions having a range of to those also with a range of
.
We can now state Kalai’s quantitative version of Arrow’s theorem:
Theorem (Kalai): For every there exists a
such that every neutral function that is
-almost an IIA social choice function is
-almost a dictatorship.
Following Kalai, we will prove the theorem for definition B of “almost”, but the same theorem for definition A directly follows: take a function that is an
-almost IIA social welfare according to definition A, and define a new function
by letting
to be the majority value of
where the
‘s agree with the
‘s on their
-bit and range over all possibilities on the other bits. By definition
is IIA, and it is not difficult to see that since
satisfied definition A, then the majority vote in the definition is almost always overwhelming and thus
is
-close to
(where
). Since we defined each bit of
separately, its range may contain irrational outcomes, but since it is close to
, at most
of these, and thus it satisfies definition B.
The main ingredient of the proof will be an analysis of the correlation between two bits from the output of .
Main Lemma (social choice version): For every there exists a
such that if a neutral
is not
-almost a dictatorship then
.
Before we proceed, let us look at the significance of the : If the values of
and
were completely independent, then the joint probability would be exactly
(since each of the two bits is unbiased due to the neutrality of
). For a “random”
the value obtained would be almost
. In contrast, if
is a dictatorship then the joint probability would be exactly
since
as
is uniform in
. The point here is that if
has non-negligible difference from being a dictatorship then the probability is non-negligibly smaller than
.
The theorem is directly implied from this lemma as the event is the disjoint union of the three events
,
, and
, whose total probability, if
is not
-almost a dictatorship, is bounded by the lemma by
and thus the probability of an irrational outcome is at least
.
So let us inspect this lemma. We have two Boolean functions and
each operating on
-bit strings. Since
is neutral, these are actually the same Boolean function, so
. What we are asking is for the probability that
where
and
are each an
-bit strings:
and
. The main issue how
and
are distributed. Well,
is uniformly distributed over
and so is
. They are not independent though:
. We say that the distributions on
and
are (anti-1/3-)correlated.
So our main lemma is equivalent to the following version of the lemma which now talks solely of Boolean functions:
Main Lemma (Boolean version): For every there exists a
such that if an odd Boolean function
is not
-almost a dictatorship then
, where
is chosen uniformly at random in
and
chosen so that
and
.
An odd Boolean function just means that which is follows from the neutrality of
by switching the roles of
and
. (We really only need that
is “balanced”, i.e. takes value 1 on exactly 1/2 of the inputs, but lets keep the more specific requirement for compatibility with the previous version of this lemma.)
Fourier Transform on the Boolean Cube
At this point using the Fourier transform is quite natural. While I do not wish to give a full introduction to the Fourier transform on the Boolean cube here, I do want to give the basic flavor in one paragraph, convincing those who haven’t looked at it yet to do so. 1st year algebra and an afternoon’s work should suffice to fill in all the holes.
The basic idea is to view Boolean functions as special cases of the real-valued functions on the Boolean cube:
. This is a real vector space of dimension
, and has a natural inner product
The
factor is just for convenience and allows viewing the inner product as the expectation over a random choice of
:
. As usual the choice of a “nice” basis is helpful and our choice is the “characters”: functions of the form
, where
is some subset of the
bits.
takes values -1 and 1 according to the parity of the bits of
in
. There are
such characters and they turn out to be an orthonormal basis. The Fourier coefficients of
, denoted
, are simply the coefficients under this basis
, where we have that
.
One reason why this vector space is appropriate here is that the correlation operation we needed between and
in the lemma above, is easily expressible as a linear transformation
on this space defined by
where
is chosen at random with
and
. Using this it is possible to elementarily evaluate the probability in the lemma as
, where the sum ranges over all
subsets
of the
bits. To get a feel of what this means we need to compare it with the following property of the Fourier transform of a balanced Boolean function:
. The difference between this sum and our sum is the factor of
for each term. The “first” element is this sum is easily evaluated for a balanced function:
, so in order to prove the main lemma it suffices to show that
. This is so as long as a non-negligible part of
is multiplied by a factor strictly smaller than
, i.e. is on sets
. Thus the main lemma boils down to showing that if
then
is
-almost a dictatorship. This is not elementary but is proved by E. Friedgut, G. Kalai, and A. Naor and completes our journey from Arrow to Fourier.
More results
There seems much promise in using these analytical tools for addressing quantitative questions in social choice theory. I’d like to mention two results of this form. The first is the celebrated MOO paper (by E. Mossel, R. O’Donnell and K. Oleszkiewicz) that proves, among other things, that among functions that do not give any player unbounded influence, the closest to being an IIA social welfare function is the majority function. The second, is a paper of mine with E. Friedgut and G. Kalai that obtains a quantitative version of the Gibbard–Satterthwaite theorem showing that every voting method that is far from being a dictatorship can be strategically manipulated. Our version shows that this can be done on a non-negligible fraction of preferences, but is limited to neutral voting methods between 3 candidates.