Posts Tagged ‘Social Choice’

COMSOC 2010 has published the list of accepted papers.  True to the conference’s relaxed, non-archival stance, the acceptance rate was very high: 40/57.

Read Full Post »

The submission deadlines of COMSOC and of SAGT are near: COMSOC on May 15th and SAGT on May 10th (after extension).  The interest areas of these two conferences have a significant intersection: (algorithmic mechanism design) \subset (algorithmic game theory) \cap (computational social choice).  Interestingly, one may submit the same paper to both conferences as COMSOC is defined to be non-archival: “we stress that authors will retain the copyright of their papers and that submitting to COMSOC-2010 does not preclude publication of the same material in a journal or in archival conference proceedings.”  Does this make sense? (I think: yes.)

Read Full Post »

Terry Tao’s latest buzz shows how Arrow’s theorem can be viewed (and proved) as a corollary of the fact that the only ultrafilters over finite sets are principal.

Read Full Post »

The classic Gibbard-Satterthwaite theorem states that every nontrivial voting method can be manipulated.  In the late 1980’s an approach to circumvent this impossibility was suggested by Bartholdi, Tovey, and Trick: maybe a voting method can be found where manipulation is computationally hard, and thus effective manipulation would be practically impossible?  Indeed, voting methods whose manipulation problem is NP-complete were found by them as well as in later works.  However, this result is not really satisfying: NP-completeness only ensures that manipulation cannot be solved efficiently in the worst case; it is quite possible that for most instances manipulation is easy, and thus the voting method can still be effectively manipulated.  What would be desired is a voting method where manipulation is computationally hard everywhere, except for a negligible fraction of inputs.  In the last few years several researchers have indeed attempted “amplifying” the NP-hardness of manipulation in a way that may get closer to this goal.

The new paper of Marcus Isaksson, Guy Kindler, and Elchanan Mossel titled The Geometry of Manipulation – a Quantitative Proof of the Gibbard-Satterthwaite Theorem shatters these hopes.  Solving the main open problem in a paper by Ehud Friedgut, Gil Kalai, and myself, they show that every nontrivial neutral voting method can be manipulated on a non-negligible fraction of inputs.  Moreover, a random flip of the order of 4 (four) alternatives will be such a manipulation, again with non-negligible probability.  This provides a quantitative version of the Gibbard-Satterthwaite theorem, just like Gil Kalai previously obtained a quantitative version of Arrow’s theorem.

Read Full Post »

The paper Strategyproof Approximation Mechanisms for Location on Networks by Noga Alon, Michal Feldman, Ariel D. Procaccia, and Moshe Tennenholtz was recently uploaded to the arXiv (this is probably Noga Alon’s first work in AGT).  This paper continues previous work on Approximate Mechanism Design Without Money by a subset of these authors.  The (amazingly animated) slides from Ariel’s distinguished dissertation talk in AAMAS give an overview of these results.

The Gibbard-Satterthwaite impossibility result rules out, in most settings, the possibility of designing incentive compatible mechanisms without money.  Chapter 10 of the Algorithmic Game Theory book is devoted to settings where this is possible, and these papers suggest a new approach for escaping the impossibility, an approach which should not surprise computer scientists: approximation.  While the first paper gives motivation and promise and is technically simple, the new paper is more sophisticated.  Both papers focus on simple variants of facility location but one can certainly think of much future work in this direction in other settings, like the recent one on Sum of us: truthful self-selection by Noga Alon, Felix Fischer, Ariel D. Procaccia, and Moshe Tennenholtz.

Read Full Post »

A “textbook system” based on social choice theory would have a centralized mechanism interacting with multiple software agents, each of them representing  a user.  The centralized mechanism would be designed to optimize some global goal (such as revenue or social welfare) and each software agent would elicit the preferences of its user and then optimize according to user preferences.

Among other irritating findings, behavioral economics also casts doubts on this pretty picture, questioning the very notion that users have preferences; that is, preferences that are independent of the elicitation method.  In the world of computation, we have a common example of this “framing” difficulty: the default.  Users rarely change it, but we can’t say that they actually prefer the default to the other alternative, since if we change the default then they stick with the new one.  Judicious choice of defaults can obviously be used for the purposes of the centralized mechanism (default browser = Internet Explorer); but what should we do if we really just want to make the user happy?  What does this even mean?

The following gripping talk by Dan Ariely demonstrates such issues.

Read Full Post »

Social Choice Theory is a pretty mature field that deals with the question of how to combine the preferences of different individuals into a single preference or a single choice.  This field may serve as a conceptual foundation in many areas: political science (how to organize elections), law (how to set commercial laws), economics (how to allocate goods), and computer science (networking protocols, interaction between software agents).  Unsurprisingly, there are interesting computational aspects to this field, and indeed a workshop series on computational social choice already exists.  The starting point of this field is Arrow’s theorem, which shows that there are unexpected inherent difficulties in performing this preference aggregation.  There have been many different proofs of Arrow’s impossibility theorem, all of them combinatorial.  In this post I’ll explain a basic observation of Gil Kalai that allows quantifying the level of impossibility using analytical tools (Fourier transform) on Boolean functions commonly used in theoretical computer science.  At first Gil’s introduction of these tools in this context seemed artificial to me, but in this post I hope to show you that it is the natural thing to do.

Arrow’s Theorem

Here is our setting and notation:

  1. There is a finite set of participants numbered 1...n.
  2. There is a set of three alternatives over which the participants have preferences: A=\{a,b,c\}.  (Arrow’s theorem, as well as everything here, actually applies also to any set A with |A|\ge 3.)
  3. We denote by L the set of preferences over A, i.e. L is the set of full orders on the elements of A.

The point of view here is that each participant 1 \le i \le n has his own preference x_i \in L, and we are concerned with functions that reach a common conclusion as a function of the x_i‘s.  In principle, the common conclusion may be a joint preference, or a single alternative; Arrow’s theorem concerns the former, i.e. it deals with functions F:L^n \rightarrow L called social welfare functions.  Arrow’s theorem points out in a precise and very general way that there are no “natural non-trivial” social welfare functions when |A|\ge 3.  (This is in contrast to the case |A|=2, where taking the majority of all preferences is natural and non-trivial.)  Formal definitions will follow.

A preference x \in L really specifies for each two alternatives a,b \in A which of them is preferred over the other, and we denote by x^{a,b} the bit specifying whether x prefers a to b.  We view each x as composed of three bits x=(x^{a,b},x^{b,c},x^{c,a}) (where it is implied that x^{b,a}=-x^{a,b}, x^{c,b}=-x^{b,c}, and x^{a,c}=-x^{c,a}).  Note that under this representation only 6 of the possible 8 three-bit sequences correspond to elements of L, where the bad combinations are 000 and 111.
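The encoding above is easy to check by brute force.  The following is a minimal Python sketch (the helper name encode and the convention that a ranking lists the best alternative first are assumptions made for illustration, not part of the original notation): it maps each of the 6 orders on {a,b,c} to its three bits and lists the triples that never occur.

```python
from itertools import permutations

ALTERNATIVES = ("a", "b", "c")

def encode(order):
    """Encode a ranking (best alternative first) as (x^{a,b}, x^{b,c}, x^{c,a})."""
    rank = {alt: pos for pos, alt in enumerate(order)}
    # The bit for the pair (p, q) is 1 exactly when p is ranked above q.
    prefers = lambda p, q: 1 if rank[p] < rank[q] else 0
    return (prefers("a", "b"), prefers("b", "c"), prefers("c", "a"))

# Map every bit triple that occurs to the ranking that produces it.
codes = {encode(order): "".join(order) for order in permutations(ALTERNATIVES)}

all_triples = {(i, j, k) for i in (0, 1) for j in (0, 1) for k in (0, 1)}
missing = sorted(all_triples - set(codes))

print(len(codes), missing)  # 6 [(0, 0, 0), (1, 1, 1)]
```

The two missing triples are exactly the cyclic "preferences" a>b>c>a and its reverse, which is why they cannot come from a full order.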

The formal meaning of natural is  the following:

Definition: A social welfare function F satisfies the IIA (independence of irrelevant alternatives) property if for any two alternatives a,b \in A the aggregate preference between a and b depends only on the preferences of the participants between a and b (and not on any preferences with another alternative c).  In the notation introduced above it means that F^{a,b} is in fact just a function of the n bits x_1^{a,b}, ..., x_n^{a,b} (rather than of all the bits in the x_i‘s).

One may have varying opinions of whether this is indeed a natural requirement, but the lack of it does turn out to imply what may be viewed as inconsistencies.  For example, when we use the aggregate preference to choose a single alternative (hence obtaining a social choice function), then lack of IIA is directly tied to the possibility of strategic manipulation of the derived social choice function.

We can take the following definition of non-trivial:

Definition: A social welfare function F is a dictatorship if for some i, F is the identity function on x_i or the exact opposite order (in our coding, the bitwise negation of x_i).

It is easy to see that any dictatorship satisfies IIA.  Arrow’s theorem states that, with another minor assumption, these are the only functions that do so.  Kalai’s quantitative proof requires an assumption that is more restrictive than that required by Arrow’s original statement.  Recently Elchanan Mossel published a paper without this additional assumption, but we’ll continue with Kalai’s elementary variant.

Definition: A social welfare function F is neutral if it is invariant under changing the names of alternatives.

This basically means that as a voting method it does not discriminate between candidates.

Arrow’s Theorem (for neutral functions): Every neutral social welfare function that satisfies IIA is a dictatorship.

Quantification for Arrow’s Theorem

In CS, we are often quite happy with approximation: we know that sometimes things can’t be perfect, but can still be pretty good.  Thus if IIA is “perfect”, then the following quantitative version comes up naturally: can there be a function that is “almost” an IIA social welfare function and yet not even close to a dictatorship? Let us define closeness first:

Definition: Two social welfare functions F, G are \epsilon-close if Pr[F(x_1...x_n) \ne G(x_1 ... x_n)] \le \epsilon.  The probability is over independent uniformly random choices of x_i \in L.

The choice of the uniform probability distribution over L^n (strangely termed the impartial culture hypothesis) cannot really be justified as a model of the empirical distribution in reasonable settings, but is certainly natural for proving impossibility results which then extend to other reasonably flat distributions.  Once we have a notion of closeness of functions, being \epsilon-almost a dictatorship is well defined by being \epsilon-close to some dictatorship function.  But what is being “almost an IIA social welfare function”?  The problem is that the IIA condition involves a relation between different values of F.  This is similar to the situation in the field of property testing, and one natural approach is to follow property testing and quantify how often the desired property is violated:

Definition A: A function F : L^n \rightarrow L is an \epsilon-almost IIA social welfare function if for all a,b \in A we have that Pr[F^{a,b}(x_1...x_n) \ne F^{a,b}(y_1...y_n) | \forall i:x^{a,b}_i=y^{a,b}_i] \le \epsilon.  I.e. for a random input, a random change of the non (a,b)-bits is unlikely to change the value of F^{a,b}.

A second approach is simpler although surprising at first: instead of relaxing the IIA requirement, it relaxes the social welfare one.  I.e. we can allow F to have in its range all possible 8 values in \{0,1\}^3 rather than just the 6 in L.  We call the bad values, 000 and 111, irrational outcomes, as they do not correspond to a consistent preference on A.

Definition B: A function F : L^n \rightarrow \{0,1\}^3 is an \epsilon-almost IIA social welfare function if it is IIA and there exists a social welfare function G : L^n \rightarrow L such that F and G are \epsilon-close.

Note that we implicitly extended the definitions of closeness and of being IIA from functions having a range of L to those also with a range of {0,1}^3.

We can now state Kalai’s quantitative version of Arrow’s theorem:

Theorem (Kalai): For every \epsilon>0 there exists a \delta>0 such that every neutral function that is a \delta-almost IIA social welfare function is \epsilon-almost a dictatorship.

Following Kalai, we will prove the theorem for definition B of “almost”, but the same theorem for definition A directly follows: take a function F that is a \delta-almost IIA social welfare function according to definition A, and define a new function F' by letting F'^{a,b}(x_1...x_n) be the majority value of F^{a,b}(y_1...y_n) where the y_i’s agree with the x_i’s on their (a,b)-bit and range over all possibilities on the other bits.  By definition F' is IIA, and it is not difficult to see that since F satisfies definition A, the majority vote in the definition is almost always overwhelming and thus F' is \delta'-close to F (where \delta' = O(\sqrt{\delta})).  Since we defined each bit of F' separately, its range may contain irrational outcomes, but since it is close to F, these occur with probability at most \delta', and thus it satisfies definition B.

The main ingredient of the proof will be an analysis of the correlation between two bits from the output of F.

Main Lemma (social choice version): For every \epsilon>0 there exists a \delta>0 such that if a neutral F is not \epsilon-almost a dictatorship then Pr[F^{a,b}(x_1...x_n)=1\:and\:F^{b,c}(x_1...x_n)=0] \le 1/3-\delta.

Before we proceed, let us look at the significance of the 1/3-\delta: If the values of F^{a,b} and F^{b,c} were completely independent, then the joint probability would be exactly 1/4 (since each of the two bits is unbiased due to the neutrality of F).  For a “random” F the value obtained would be almost 1/4.  In contrast, if F is a dictatorship then the joint probability would be exactly 1/3 since Pr[x_i^{a,b}=1\:and\:x_i^{b,c}=0]=1/3 as x_i is uniform in L.  The point here is that if F has non-negligible difference from being a dictatorship then the probability is non-negligibly smaller than 1/3.
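To make the two reference values concrete, here is a small Python sketch that computes the joint probability exactly by enumerating all profiles in L^n for n=3.  The function names (encode, dictator, majority, joint_probability) and the choice of bitwise majority as the non-dictatorial example are assumptions made for illustration; they do not come from Kalai's paper.

```python
from fractions import Fraction
from itertools import permutations, product

def encode(order):
    """Encode a ranking (best first) as the bits (x^{a,b}, x^{b,c}, x^{c,a})."""
    rank = {alt: pos for pos, alt in enumerate(order)}
    bit = lambda p, q: 1 if rank[p] < rank[q] else 0
    return (bit("a", "b"), bit("b", "c"), bit("c", "a"))

L = [encode(order) for order in permutations("abc")]  # the 6 rational triples

def dictator(profile):
    """The first participant decides everything."""
    return profile[0]

def majority(profile):
    """Bitwise majority; IIA and neutral, but may output 000 or 111."""
    n = len(profile)
    return tuple(1 if 2 * sum(x[k] for x in profile) > n else 0 for k in range(3))

def joint_probability(F, n):
    """Exact Pr[F^{a,b} = 1 and F^{b,c} = 0] over uniform profiles in L^n."""
    profiles = list(product(L, repeat=n))
    hits = sum(1 for p in profiles if F(p)[0] == 1 and F(p)[1] == 0)
    return Fraction(hits, len(profiles))

p_dictator = joint_probability(dictator, 3)
p_majority = joint_probability(majority, 3)
print(p_dictator, p_majority)  # 1/3 17/54
```

For three voters, majority falls short of 1/3 by 1/54 on each of the three symmetric events, which adds up to the 1/18 probability of a cyclic (irrational) outcome.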

The theorem is directly implied by this lemma, as the event F(x_1 ... x_n) \in L is the disjoint union of the three events F^{a,b}(x_1...x_n)=1\:and\:F^{b,c}(x_1...x_n)=0, F^{b,c}(x_1...x_n)=1\:and\:F^{c,a}(x_1...x_n)=0, and F^{c,a}(x_1...x_n)=1\:and\:F^{a,b}(x_1...x_n)=0, whose total probability, if F is not \epsilon-almost a dictatorship, is bounded by the lemma (applied to each of the three events, using neutrality) by 3(1/3-\delta) = 1-3\delta, and thus the probability of an irrational outcome is at least 3\delta.

So let us inspect this lemma.  We have two Boolean functions F^{a,b} and F^{b,c}, each operating on n-bit strings.  Since F is neutral, these are actually the same Boolean function, so F^{a,b}=F^{b,c}=f.  What we are asking for is the probability that f(z)=1\:and\:f(w)=0 where z and w are each an n-bit string: z=(x^{a,b}_1...x^{a,b}_n) and w=(x^{b,c}_1...x^{b,c}_n).  The main issue is how w and z are distributed.  Well, z is uniformly distributed over \{0,1\}^n and so is w.  They are not independent though: Pr[w_i=z_i]=1/3 for each i, independently across the coordinates.  We say that the distributions on w and z are (anti-1/3-)correlated.
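The marginals and the 1/3 agreement probability can be read off directly from the six elements of L.  A minimal Python sketch, assuming the same illustrative encode helper as before (rankings listed best first):

```python
from fractions import Fraction
from itertools import permutations

def encode(order):
    """Encode a ranking (best first) as the bits (x^{a,b}, x^{b,c}, x^{c,a})."""
    rank = {alt: pos for pos, alt in enumerate(order)}
    bit = lambda p, q: 1 if rank[p] < rank[q] else 0
    return (bit("a", "b"), bit("b", "c"), bit("c", "a"))

L = [encode(order) for order in permutations("abc")]

# Marginal distribution of the (a,b)-bit and of the (b,c)-bit of one voter.
p_ab_one = Fraction(sum(x[0] for x in L), len(L))
p_bc_one = Fraction(sum(x[1] for x in L), len(L))

# Probability that the two bits of the same voter agree.
p_agree = Fraction(sum(1 for x in L if x[0] == x[1]), len(L))

print(p_ab_one, p_bc_one, p_agree)  # 1/2 1/2 1/3
```

Since the voters are drawn independently, the same holds coordinate by coordinate for the strings w and z.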

So our main lemma is equivalent to the following version of the lemma which now talks solely of Boolean functions:

Main Lemma (Boolean version): For every \epsilon>0 there exists a \delta>0 such that if an odd Boolean function f:\{ 0,1 \}^n \rightarrow \{ 0,1 \} is not \epsilon-almost a dictatorship then Pr[f(z)=1\:and\:f(w)=0] \le 1/3-\delta, where z is chosen uniformly at random in \{0,1\}^n and w chosen so that Pr[w_i = z_i]=1/3 and Pr[w_i=-z_i]=2/3.

An odd Boolean function just means that f(-z_1...-z_n)=-f(z_1...z_n), which follows from the neutrality of F by switching the roles of a and b.  (We really only need that f is “balanced”, i.e. takes value 1 on exactly 1/2 of the inputs, but let’s keep the more specific requirement for compatibility with the previous version of this lemma.)

Fourier Transform on the Boolean Cube

At this point using the Fourier transform is quite natural.  While I do not wish to give a full introduction to the Fourier transform on the Boolean cube here, I do want to give the basic flavor in one paragraph, convincing those who haven’t looked at it yet to do so.  1st year algebra and an afternoon’s work should suffice to fill in all the holes.

The basic idea is to view Boolean functions f:{0,1}^n \rightarrow {0,1} as special cases of the real-valued functions on the Boolean cube: f:{0,1}^n \rightarrow \Re.  This is a real vector space of dimension 2^n, and has a natural inner product <f,g> = 2^{-n} \sum_x f(x) g(x).  The 2^{-n} factor is just for convenience and allows viewing the inner product as the expectation over a random choice of x: <f,g> = E[f(x)g(x)].  As usual the choice of a “nice” basis is helpful and our choice is the “characters”: functions of the form \chi_S(x) = (-1)^{\sum_{i \in S} x_i}, where S is some subset of the n bits.  \chi_S takes values -1 and 1 according to the parity of the bits of x in S.  There are 2^n such characters and they turn out to be an orthonormal basis.  The Fourier coefficients of f, denoted \hat{f} (S), are simply the coefficients under this basis: f=\sum_S \hat{f} (S)\chi_S, where we have that \hat{f} (S) = <f,\chi_S>.
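For small n the Fourier coefficients can simply be computed from the definition \hat{f}(S) = <f,\chi_S>.  The sketch below does this by brute force in Python; the names fourier_coefficients and maj3 are illustrative assumptions, and the 2^n by 2^n enumeration is only meant for tiny examples.

```python
from fractions import Fraction
from itertools import combinations, product

def fourier_coefficients(f, n):
    """Return {S: <f, chi_S>} for every subset S of range(n), by brute force."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            # chi_S(x) = (-1)^(sum of the bits of x inside S)
            total = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in points)
            coeffs[S] = Fraction(total, len(points))
    return coeffs

def maj3(x):
    """Majority of three bits, as a 0/1-valued function."""
    return 1 if sum(x) >= 2 else 0

coeffs = fourier_coefficients(maj3, 3)

# Parseval: the squared coefficients sum to E[f^2], which is 1/2 for balanced f.
parseval = sum(c * c for c in coeffs.values())
print(coeffs[()], parseval)  # 1/2 1/2
```

For this example the only nonzero coefficients are on the empty set, the three singletons, and the full set {0,1,2}; the coefficients on pairs vanish.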

One reason why this vector space is appropriate here is that the correlation operation we needed between w and z in the lemma above is easily expressible as a linear transformation T on this space defined by (Tf)(x)=E[f(y)] where y is chosen at random with Pr[y_i = x_i]=1/3 and Pr[y_i=-x_i]=2/3.  Using this it is possible to elementarily evaluate the probability in the lemma as \sum_S 3^{-|S|} \hat{f} (S)^2, where the sum ranges over all 2^n subsets S of the n bits.  (This uses the fact that f is odd, so that \hat{f}(S)=0 for every nonempty S of even size.)  To get a feel of what this means we need to compare it with the following property of the Fourier transform of a balanced Boolean function: \sum_S \hat{f}(S)^2 = ||f||_2^2 = 1/2.  The difference between this sum and our sum is the factor of 3^{-|S|} for each term.  The “first” element in this sum is easily evaluated for a balanced function: \hat{f}(\emptyset)^2 =(E[f])^2 = 1/4, so in order to prove the main lemma it suffices to show that \sum_{|S| \ge 1} 3^{-|S|} \hat{f} (S)^2 < (1/3 - \delta) - 1/4 = 1/12 - \delta.  This is so as long as a non-negligible part of \sum_{|S| \ge 1} \hat{f}(S)^2 = 1/4 is multiplied by a factor strictly smaller than 3^{-1}, i.e. is on sets with |S|>1.  Thus the main lemma boils down to showing that if \sum_{|S| \ge 2} \hat{f} (S)^2 \le \delta' then f is \epsilon-almost a dictatorship.  This is not elementary but is proved by E. Friedgut, G. Kalai, and A. Naor and completes our journey from Arrow to Fourier.
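As a sanity check on the formula, the following Python sketch compares \sum_S 3^{-|S|} \hat{f}(S)^2 with a direct enumeration of Pr[f(z)=1 and f(w)=0] for two odd functions on 3 bits.  All function names are illustrative assumptions, and both computations are brute force, so this only works for very small n.

```python
from fractions import Fraction
from itertools import combinations, product

def fourier_coefficients(f, n):
    """Return {S: <f, chi_S>} for every subset S of range(n), by brute force."""
    points = list(product((0, 1), repeat=n))
    return {
        S: Fraction(sum(f(x) * (-1) ** sum(x[i] for i in S) for x in points),
                    len(points))
        for size in range(n + 1)
        for S in combinations(range(n), size)
    }

def noise_sum(f, n):
    """The Fourier-side expression: sum over S of 3^{-|S|} * fhat(S)^2."""
    return sum(Fraction(1, 3 ** len(S)) * c * c
               for S, c in fourier_coefficients(f, n).items())

def direct_probability(f, n):
    """Exact Pr[f(z)=1 and f(w)=0]: z uniform, each w_i equals z_i w.p. 1/3."""
    total = Fraction(0)
    for z in product((0, 1), repeat=n):
        for w in product((0, 1), repeat=n):
            agree = sum(1 for zi, wi in zip(z, w) if zi == wi)
            weight = (Fraction(1, 2 ** n)
                      * Fraction(1, 3) ** agree
                      * Fraction(2, 3) ** (n - agree))
            if f(z) == 1 and f(w) == 0:
                total += weight
    return total

maj3 = lambda x: 1 if sum(x) >= 2 else 0   # odd, far from any dictatorship
dict3 = lambda x: x[0]                     # dictatorship of the first bit

print(noise_sum(maj3, 3), direct_probability(maj3, 3))    # 17/54 17/54
print(noise_sum(dict3, 3), direct_probability(dict3, 3))  # 1/3 1/3
```

The dictatorship puts all of its non-constant Fourier weight on a single set of size 1 and hits 1/3 exactly, while majority has weight 1/16 on the set of size 3, which is damped by 3^{-3} instead of 3^{-1} and accounts for the gap.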

More results

There seems to be much promise in using these analytical tools for addressing quantitative questions in social choice theory.  I’d like to mention two results of this form.  The first is the celebrated MOO paper (by E. Mossel, R. O’Donnell and K. Oleszkiewicz) that proves, among other things, that among functions that do not give any player unbounded influence, the closest to being an IIA social welfare function is the majority function.  The second is a paper of mine with E. Friedgut and G. Kalai that obtains a quantitative version of the Gibbard–Satterthwaite theorem showing that every voting method that is far from being a dictatorship can be strategically manipulated.  Our version shows that this can be done on a non-negligible fraction of preferences, but is limited to neutral voting methods between 3 candidates.

Read Full Post »