
## Prisoner’s Dilemma Reality show

(Thanks to Eyal Winter for the link.)

Google just unveiled its new (open source) attempt to change email by combining it with essentially all other electronic collaboration modes (IM, blogs, collaborative editing…).  Google Wave may completely change how we collaborate electronically.  Or, it may not.

Wired magazine published a (rather enthusiastic) popular article on Hal Varian’s role as chief Google economist.  It briefly mentions that Microsoft now has Susan Athey in a similar role.

## Papers from Market Design Meeting

The National Bureau of Economic Research recently held a “working group meeting” on market design, as mentioned in Al Roth’s blog as well as David Pennock’s blog.  The meeting web page contains links to the presented papers, many of which seem quite interesting.  Dave Pennock’s post also offers reflections on the cultural difference between economics and CS: “In some ways it felt like visiting a foreign country”.

## Postdoc positions in Algorithmic Game Theory

Paul Goldberg’s blog maintains a list of postdoc positions in Algorithmic Game Theory.  Here are the ones listed there now:

A few interesting pieces that I recently came by:

1. A new economic paper on open source by Michael Schwarz and Yuri Takhteyev (via Al Roth’s blog).
2. A compendium of PPAD-complete problems from Shiva Kintali.
3. Some interesting thoughts on graduate school from Noah Snyder.
4. The WAA09 workshop on ad auctions (via Muthu’s blog) — deadline is May 22nd.
5. A new paper on “feedback loops of attention in peer production” by Fang Wu, Dennis M. Wilkinson, and Bernardo Huberman added to the many social network papers by Huberman on the arXiv.
6. Electronic voting workshop in Israel.

After having looked at some of the debate raised by Moshe Vardi’s questioning of CS conferences (on Lance’s blog post as well as on my own post), I was struck by the fact that everyone was concerned with publishing papers; hardly anyone worried about reading these papers.  This seems to be the general attitude in the theoretical CS community and there is a risk of it becoming even more so in the AGT community.  People write papers just so that they get published, not really making much effort to have them be read; conferences and journals are being created to cater to the publishing needs of authors rather than to satisfy any desire of having more information to read.

I have two general recommendations for catering to possible readers of our scientific results:

### Publish Fewer Papers

We publish too much (as recently noted by Mark).  I am not against writing and sharing small steps towards solving a problem, but the premature packaging as a “publication” is usually bad.  Algorithms and theorems are often not really fully understood by the authors, and are thus not properly crystallized, simplified, or generalized.  Models are not fully polished, justified, discussed, or compared to other related models.  Proofs are rarely made crisp.  We also feel compelled to “market” our results, often sacrificing scientific honesty, not to mention the amount of time wasted on the whole packaging effort.  No wonder few people bother reading such papers.  There are other ways to get preliminary results in the open (while maintaining credit) such as technical reports, the arXiv, or workshop talks.  I really like the tradition in the field of economic theory of circulating a working paper that keeps being improved until it is deemed ready for real publication in a journal (where the top ones are actually widely read, I understand).

The funny thing is that the pressure to publish more papers rather than better papers is not really external — it is a psychological trick we play on ourselves.  In most places it is easier to get a job or tenure with ten first-rate papers than with twenty second-rate ones.  It does not make any sense for hiring, tenure, grant, or promotion committees to count papers — if you insist on “counting”, then at least make sure you count impact: citations, the h-factor, or just publication in the absolute top venues.  Conferences and journals should insist on evaluating papers from the point of view of the reader: simplicity, full context, clear writing, crisp proofs, polished models — all these are critical to a paper being useful to its readers.  They can help reduce the sheer number of papers by simply accepting fewer of them: top venues should be more selective, and less-than-top conferences should consider becoming publication-less workshops.

### Help Identify the Important Ones

All these papers are “out there”: in conference proceedings, journals, technical reports, on authors’ websites, the arXiv, and other places.  No one can read even a fraction of the papers in his field.  Wouldn’t it be nice to have someone point us to the ones we should read?  Someone to summarize a topic?  This “meta-research” layer would be the critical element in helping readers allocate their attention.  There are various formats for drawing attention to the key results: awards, invited talks, or tutorials in conferences.  In the long term, a book that summarizes an area; in the medium term, a handbook written by many authors; in the shorter term, lecture notes.  Wouldn’t it be nice if more people wrote surveys or expositions?  I particularly liked the new physics review site (as reported by Suresh).  Blogs can point papers out too (Oded Goldreich does, and I try to do so too).

## Correlated equilibria — some CS perspectives

What will happen in a given strategic situation?  The dogma of game theory — which models strategic situations — is that some kind of equilibrium will be reached.  Taking a wide enough definition of equilibrium, the dogma seems fine.  Taking a narrower look and just considering the Nash equilibrium — either pure (if it exists) or mixed — may be more questionable.  In 1974 Robert Aumann suggested a generalization of the Nash equilibrium termed correlated equilibrium.  This type of equilibrium is very appealing for several reasons, and this post presents them from a CS-ey perspective.

### What is Correlated equilibrium?

For ease of notation let us stick with two players, Alice and Bob (everything extends to an arbitrary number of players).  A game between Alice and Bob is given by a pair of $n \times m$ matrices $A=(a_{ij})$ and $B=(b_{ij})$, where $1 \le i \le n$ and $1 \le j \le m$, i.e. Alice has $n$ strategies and Bob has $m$ strategies.  $a_{ij}$ denotes Alice’s utility when she plays $i$ and Bob plays $j$, and $b_{ij}$ denotes Bob’s utility in this case.  We will start with the simplest notions of equilibria and proceed to correlated equilibria.  In all cases equilibrium means that each player is best-replying to the other, so it suffices to define what best-replying means in each notion.

Let us start by defining the pure Nash equilibrium: a pure strategy $i$ of Alice is a best response to a pure strategy $j$ of Bob if for all $i'$ we have that $a_{ij} \ge a_{i'j}$.

The next step allows players to play mixed strategies rather than pure ones, where a mixed strategy is simply a probability distribution over pure strategies. Denote by $\Delta_n$ the set of mixed strategies ($\sum_i x_i = 1$ and for all $i = 1 ... n$, $x_i \ge 0$), then a mixed strategy $x \in \Delta_n$ is a best reply to a mixed strategy $y \in \Delta_m$ if for every mixed strategy $x'$ we have that $\sum_{ij} x_i a_{ij} y_j \ge \sum_{ij} x'_i a_{ij} y_j$.  (This turns out to be equivalent to every pure strategy in the support of $x$ being at least as good against $y$ as every other pure strategy.)

The mixed-Nash equilibrium allows each player to randomize his own actions.  The basic idea of correlated equilibrium is to allow joint randomization between Alice and Bob.  We imagine a third trusted party choosing a pair $(i,j)$ according to some distribution $p$ on pairs, telling $i$ to Alice (only) and telling $j$ to Bob (only).  When can we say that $i$ is indeed a best reply to the information that Alice has about Bob’s action?  Well, when Alice gets $i$, she may assume the conditional distribution over Bob’s strategies: $\Pr[j|i] = p_{ij} / \sum_{j'} p_{ij'}$.  Alice is best responding if $i$ is indeed the best response to this distribution, i.e. if for all $i'$ we have that $\sum_j p_{ij} a_{ij} \ge \sum_j p_{ij} a_{i'j}$.  (Dividing both sides by $\sum_{j'} p_{ij'}$ gives exactly the conditional probabilities.)  The distribution $p$ is called a correlated equilibrium if every $i$ is a best response for Alice, and every $j$ a best response for Bob.
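These best-reply inequalities are mechanical to check.  Here is a minimal sketch (the function name and the NumPy encoding are my own choices, not from the post) that tests the conditions for a given joint distribution $p$, illustrated on matching pennies:

```python
import numpy as np

def is_correlated_equilibrium(p, A, B, tol=1e-9):
    """Check Aumann's best-reply conditions for a joint distribution p
    over strategy pairs, given utility matrices A (Alice) and B (Bob)."""
    n, m = A.shape
    for i in range(n):            # Alice, told to play i, weighs deviating to i2
        for i2 in range(n):
            if np.dot(p[i], A[i2] - A[i]) > tol:
                return False
    for j in range(m):            # Bob, told to play j, weighs deviating to j2
        for j2 in range(m):
            if np.dot(p[:, j], B[:, j2] - B[:, j]) > tol:
                return False
    return True

# Matching pennies: the uniform product distribution (the mixed Nash)
# is a correlated equilibrium; a point mass on a single cell is not.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A
uniform = np.full((2, 2), 0.25)
point = np.array([[1.0, 0.0], [0.0, 0.0]])
print(is_correlated_equilibrium(uniform, A, B))  # True
print(is_correlated_equilibrium(point, A, B))    # False
```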

Notice that a Nash equilibrium is a special case: when $(p_{ij})$ is a product distribution $p_{ij}=x_i y_j$.  In this case the previous best-reply constraints become $x_i \sum_j y_j a_{ij} \ge x_i \sum_j y_j a_{i'j}$, which means that each pure strategy $i$ in the support ($x_i>0$) is a best reply to $y$: $\sum_j y_j a_{ij} \ge \sum_j y_j a_{i'j}$.  In particular this implies the existence of a correlated equilibrium in every finite game.

The correlated equilibrium is appealing for several reasons, each with a CS flavor:

1. It can be more efficient than Nash.
2. It can be implemented like Nash.
3. It can be efficiently computed, unlike Nash.
4. It can be efficiently learned, unlike Nash.

### Efficiency

It is well known that the Nash equilibrium need not give an efficient outcome.  The following game of chicken is the standard example showing that a correlated equilibrium may be more efficient:

|                    | Bob: Dare | Bob: Chicken |
|--------------------|-----------|--------------|
| **Alice: Dare**    | 0, 0      | 4, 1         |
| **Alice: Chicken** | 1, 4      | 3, 3         |

This game has two pure equilibria, (C,D) and (D,C), as well as a single mixed Nash equilibrium where Alice and Bob each Dare with probability exactly 1/2.  If we take the obtained sum of utilities (social welfare) as our goal, then each pure equilibrium achieves social welfare of 5, while the mixed one achieves expected social welfare of 4.  The optimal social welfare, 6, is obtained at a non-equilibrium point.  In price of anarchy/stability terms, the price of stability (the ratio between the optimal welfare and that of the best Nash equilibrium) is 6/5=1.2, while the price of anarchy (the ratio between the optimal welfare and that of the worst Nash equilibrium) is 6/4=1.5 (for the mixed case).
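These welfare numbers can be verified by brute force.  The sketch below (the matrix encoding is mine: strategy 0 = Dare, 1 = Chicken) enumerates the pure Nash equilibria of chicken and their welfare:

```python
import numpy as np

# Chicken: entry (i, j) gives (Alice's utility, Bob's utility).
A = np.array([[0, 4], [1, 3]])  # Alice's utilities (rows = her strategies)
B = np.array([[0, 1], [4, 3]])  # Bob's utilities (columns = his strategies)

def pure_nash(A, B):
    """Enumerate all pure Nash equilibria of a bimatrix game."""
    eq = []
    n, m = A.shape
    for i in range(n):
        for j in range(m):
            # (i, j) is an equilibrium iff each strategy is a best reply
            if A[i, j] == A[:, j].max() and B[i, j] == B[i, :].max():
                eq.append((i, j))
    return eq

eqs = pure_nash(A, B)
welfares = [int(A[i, j] + B[i, j]) for i, j in eqs]
print(eqs)            # [(0, 1), (1, 0)]: (Dare, Chicken) and (Chicken, Dare)
print(welfares)       # [5, 5]
print((A + B).max())  # 6, the (non-equilibrium) optimum at (Chicken, Chicken)
```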

Let us look at the following correlated distribution:

|                    | Bob: Dare | Bob: Chicken |
|--------------------|-----------|--------------|
| **Alice: Dare**    | 0         | 1/3          |
| **Alice: Chicken** | 1/3       | 1/3          |

Let us see why this is a correlated equilibrium.  Assume that Alice “gets” Dare.  This happens with probability 1/3, and in this case she knows for sure that Bob got Chicken — so Dare is clearly a best reply.  On the other hand, if Alice gets Chicken — which happens with probability 2/3 — then her conditional distribution on Bob’s play is 1/2–1/2.  Under this distribution she is indifferent between Dare and Chicken, so Chicken is still a best reply.  The nice thing is that the expected sum of utilities in this correlated equilibrium is $5 \frac{1}{3}$ — better than that of any Nash equilibrium.  This turns out to be the best correlated equilibrium in terms of social welfare, and the resulting price of correlated stability is thus $6/(5 \frac{1}{3}) = 1.125$.  The set of correlated equilibria includes not only ones with higher welfare than any Nash equilibrium, but also ones with lower welfare.  The worst correlated equilibrium turns out to give 1/3 weight to each of (Dare,Dare), (Dare,Chicken), and (Chicken,Dare), for a total expected social welfare of $3 \frac{1}{3}$, giving a correlated price of anarchy of $6/(3\frac{1}{3}) = 1.8$.  The efficiency gap may be arbitrarily large as the following example shows [corrected on 21/10 thanks to Jochen Konemann who found a bug in the original version] (all missing entries are 0,0):

|        | C1  | C2  | C3  | D0  | D1  | D2  | D3  |
|--------|-----|-----|-----|-----|-----|-----|-----|
| **C1** |     | 1,1 | 1,1 | 0,1 | 0,2 |     |     |
| **C2** | 1,1 |     | 1,1 | 0,1 |     | 0,2 |     |
| **C3** | 1,1 | 1,1 |     | 0,1 |     |     | 0,2 |
| **D0** | 1,0 | 1,0 | 1,0 | $\epsilon,\epsilon$ | $\epsilon,\epsilon$ | $\epsilon,\epsilon$ | $\epsilon,\epsilon$ |
| **D1** | 2,0 |     |     | $\epsilon,\epsilon$ |     |     |     |
| **D2** |     | 2,0 |     | $\epsilon,\epsilon$ |     |     |     |
| **D3** |     |     | 2,0 | $\epsilon,\epsilon$ |     |     |     |

Inspection reveals that a Nash equilibrium cannot have any C* strategies in its support and thus always gives utility $\le \epsilon$ to both players.  In contrast, a uniform distribution on the 1,1 entries is a correlated equilibrium that gives utility 1 to each player.  Thus while the regular price of stability is infinite, the correlated price of stability is 1.  So, the optimists (like me) who prefer looking at the price of stability may view the notion of a correlated equilibrium as an opportunity to get better results.  On the other hand, the pessimists who look at the price of anarchy may be concerned that it may bring worse results.  To ease these fears, a very recent paper, “Intrinsic Robustness of the Price of Anarchy” by Tim Roughgarden, shows that for a wide class of games, including the congestion games usually studied in the context of the price of anarchy, this does not happen — the correlated price of anarchy turns out to be no worse than the regular price of anarchy.

### Implementation

The most serious concern with the notion of correlated equilibrium is its implementation: how can the players obtain a jointly correlated probability distribution?  The definition of the notion assumes a fictional trusted third party termed a mediator.  But how can a correlated equilibrium be reached in normal circumstances, without a mediator?  One may devise various “correlation devices” to replace the fictional mediator.  For example, in the chicken game above a hat and three balls will do the trick: two balls labeled “chicken” and one ball labeled “dare” are thrown into a hat (under the joint supervision of Alice and Bob), and Alice and Bob each take one ball from the hat (without looking).  Notice that each of them draws “dare” with probability 1/3 — and they never do so simultaneously — exactly our correlated equilibrium!
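The hat device is easy to simulate.  In the sketch below (my encoding), sampling two balls without replacement from {Dare, Chicken, Chicken} reproduces exactly the correlated equilibrium above: each player dares with probability 1/3, and never both at once:

```python
import random
from collections import Counter

rng = random.Random(7)
balls = ["Dare", "Chicken", "Chicken"]  # one "dare" ball, two "chicken" balls
outcomes = Counter()
TRIALS = 90000
for _ in range(TRIALS):
    draw = rng.sample(balls, 2)         # Alice takes one ball, Bob another
    outcomes[(draw[0], draw[1])] += 1

print({k: round(v / TRIALS, 3) for k, v in outcomes.items()})
# (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken) each show up
# with probability ~1/3; (Dare, Dare) never occurs.
```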

Can such a correlation device be implemented in general?  Can an equivalent device that works over computer communication networks be implemented?  Both the game-theory community and the cryptography community have worked on these questions, using somewhat different notions and terminology.  Game theorists considered adding a pre-round of cheap talk to the game, in which the parties may communicate with each other.  From the computer science perspective this is just a special case of secure function evaluation: there are general cryptographic techniques that enable any number of parties to emulate any trusted mediator without really having such a mediator and without trusting each other.  There are a variety of models and results here, but the basic ones in our setting state that (1) under standard cryptographic assumptions, any correlated equilibrium in any game with any number of players can be implemented securely in a computational sense, and (2) any correlated equilibrium in any game with at least 3 (or 4, depending on the exact model) players can be implemented securely in an information-theoretic sense (but not in 2-party games).  For more information see chapter 8 of the Algorithmic Game Theory book.

### Computation

The first nice thing about correlated equilibria is that they can be found efficiently.  Just look at the inequalities that define them above: they are all linear inequalities in the $p_{ij}$’s: for every $i, i'$, $\sum_j p_{ij} a_{ij} \ge \sum_j p_{ij} a_{i'j}$; similar inequalities for Bob; and further linear inequalities specifying that $p$ is a probability distribution.  Thus we can find a correlated equilibrium using linear programming, and even optimize an arbitrary linear function over correlated equilibria, e.g. find the one with highest social welfare.  This is in stark contrast to Nash equilibria, where finding an arbitrary one is PPAD-complete and optimizing social welfare over them is NP-hard.
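As a sanity check, here is a sketch of this LP for the chicken game from the previous section, using `scipy.optimize.linprog` (the library choice is mine).  Maximizing social welfare over the correlated-equilibrium polytope recovers the value $5\frac{1}{3}$ computed earlier:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[0.0, 4.0], [1.0, 3.0]])  # Alice's utilities in chicken
B = np.array([[0.0, 1.0], [4.0, 3.0]])  # Bob's utilities
n, m = A.shape

# Variables: p_ij flattened row-major; index (i, j) -> i*m + j.
constraints = []
for i in range(n):              # Alice, told i, must not prefer deviating to i2
    for i2 in range(n):
        if i2 == i:
            continue
        row = np.zeros(n * m)
        for j in range(m):
            row[i * m + j] = A[i2, j] - A[i, j]   # gain from deviating <= 0
        constraints.append(row)
for j in range(m):              # Bob, told j, must not prefer deviating to j2
    for j2 in range(m):
        if j2 == j:
            continue
        row = np.zeros(n * m)
        for i in range(n):
            row[i * m + j] = B[i, j2] - B[i, j]
        constraints.append(row)

c = -(A + B).flatten()          # maximize welfare = minimize its negation
res = linprog(c, A_ub=np.array(constraints), b_ub=np.zeros(len(constraints)),
              A_eq=np.ones((1, n * m)), b_eq=[1.0], bounds=[(0, 1)] * (n * m))
print(-res.fun)   # 5.333... = 16/3, the best correlated equilibrium's welfare
```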

The LP algorithm for correlated equilibria directly applies to games with any number of players, and its running time is polynomial in the size of the game (the description size of the players’ utility functions, which is the product of the players’ strategy-space sizes).  The situation is actually even better, and we can often find correlated equilibria of exponential-size games in polynomial time.  Specifically, using the Ellipsoid algorithm, there are cases where it is possible to find a correlated equilibrium in time that is polynomial in the length of the representation of the strategies of the game.  For this to make sense there needs to be a succinct representation of the game, where given a representation of players’ strategies, their utilities can be easily determined.

### Learning

How can an equilibrium of a game be practically reached?  Is there a natural process that will lead to a state of equilibrium?  A process where the players will “learn” what they should play?  There have been many different attempts and models to define processes that converge to equilibria.  A simple setting considers players repeatedly playing a game, where at each stage they act according to some given “learning” rule (which is intuitively “somewhat rational” but not really strategic) that depends on the history of the game as well as on their own utility function (but usually not on the other players’ utility functions — this is called uncoupled dynamics).

A simple example is repeated best reply: in every round you play your best reply to the strategies played by the other players in the previous round.  If this process converges, then it converges to a pure Nash equilibrium.  However, it may in general fail to converge even if a pure Nash equilibrium exists, and it certainly will not converge if the game lacks pure Nash equilibria.  There are however interesting special cases where this process does converge, specifically potential games and games solvable by iterated elimination of dominated strategies (in the latter case even in a strong asynchronous sense).  A natural attempt at converging to a mixed Nash equilibrium is called fictitious play: in every round you play your best response to the other players’ mixed strategies, defined according to the empirical historical distribution of their strategies.  This process again is not guaranteed to converge and may loop indefinitely.  In fact, there is no natural process that is known to converge to a Nash equilibrium that is not essentially equivalent to exhaustive search; this shouldn’t come as a surprise, since such a process would likely imply an efficient algorithm, whose existence we doubt.  There are however natural processes that do converge to correlated equilibria.  These are based on regret minimization procedures.  Below is a minimal description of these types of learning algorithms; for a comprehensive exposition as well as historical context see chapter 4 of the Algorithmic Game Theory book, or this talk by Avrim Blum.
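The failure of repeated best reply is easy to see concretely.  The sketch below (my encoding) runs simultaneous best-reply dynamics on matching pennies, which has no pure Nash equilibrium; play cycles through all four strategy profiles forever:

```python
import numpy as np

A = np.array([[1, -1], [-1, 1]])   # matching pennies: Alice wants to match
B = -A                             # Bob wants to mismatch

def best_reply_dynamics(A, B, start=(0, 0), rounds=8):
    """Both players simultaneously best-reply to the opponent's last action."""
    i, j = start
    history = [(i, j)]
    for _ in range(rounds):
        # RHS is evaluated before assignment, so both replies use old (i, j)
        i, j = int(A[:, j].argmax()), int(B[i, :].argmax())
        history.append((i, j))
    return history

print(best_reply_dynamics(A, B))
# [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0), ...] -- a cycle of length 4,
# never settling on any profile.
```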

Here is the basic framework: we need to design a “learning algorithm” $A$ that chooses one out of $N$ possible actions, and does this repeatedly for $T$ steps.  For each possible action $i \in \{1...N\}$ and time $0 < t \le T$ there is some utility $0 \le u_t(i) \le 1$.  At time $t$ the algorithm must choose an action $i_t$ without seeing the current utilities $u_t(i)$, based only on previous utilities.  The goal is to maximize the total utility obtained: $U_A = \sum_t u_t(i_t)$, where $i_t$ is the action $A$ took at time $t$.  The main challenge here is the total lack of information regarding the utilities $u_t(i)$, which need not have any relation to the utilities at previous times.  I.e., the model allows $u_t(i)$ to be chosen by an adversary, and furthermore to depend arbitrarily on the actions chosen previously by our learning algorithm, i.e. on $i_1 ... i_{t-1}$.  Since such a setting is hopeless, we will not attempt to compare the performance of our algorithm to the posterior optimum, but only to some family of alternative algorithms (a family that may be defined as possible variants of the original $A$).  What we need in this context is called low swap (or internal) regret.  Let $F : \{1...N\} \rightarrow \{1...N\}$; we say that $B$ is obtained from $A$ by the swap $F$ if whenever $A$ takes action $i$, $B$ takes action $F(i)$.

Definition: $A$ that runs for $T$ steps has swap regret $R$ if for every algorithm $B$ that is obtained from $A$ by some swap, we have that $U_A \ge U_B - R$.

It is not difficult to see that deterministic algorithms cannot have “low” swap regret, but it turns out that randomized algorithms can.  A randomized algorithm may choose its action $i_t$ according to a probability distribution $p^t$ (where each $i$ is chosen with probability $p^t_i$).  There has been much work on designing learning algorithms with low regret, with the end result being efficient randomized algorithms that almost surely achieve swap regret $R=O(\sqrt{NT\log N})$, the important point being that $R \ll T$.  (For a weaker notion, external regret, the dependence on $N$ can be made logarithmic, but this is not known for internal regret.)  This post will not go into how these algorithms are constructed, but just why they are what we need.

So far all this regret-minimization stuff was in a decision-theoretic framework — just a single decision maker.  Now comes the key switch to games: what if each player in a game plays such a regret minimization algorithm?  Let us be more specific and, as above, consider two players, even though everything holds in general.  Suppose that Alice and Bob play the game for $T$ rounds, each time choosing their strategy (action) in the game according to a regret minimization algorithm.  At each time $t$ Alice plays $i_t$ and Bob plays $j_t$, and thus Alice sees utility vector $u_t(i)=a_{i,j_t}$ and Bob sees utility vector $u^{Bob}_t(j)=b_{i_t,j}$.

Lemma: Assume that Alice and Bob both play an algorithm for $T$ steps with swap regret $R$.  Then the uniform distribution on $(i_t,j_t)$ over all $1 \le t \le T$ is $\epsilon$-close to a correlated equilibrium (in $L_1$ norm), where $\epsilon=R/T$.  I.e., let $p$ denote this distribution; then for every $i$ there exists $\epsilon_i$ such that for all $i'$ we have that $\sum_j p_{ij} a_{ij} \ge \sum_j p_{ij} a_{i'j} - \epsilon_i$, with $\sum_i \epsilon_i = \epsilon$ (and similarly for Bob).

To see why this is true, let us look at Alice’s optimal strategy when the trusted mediator tells her that $i$ was drawn according to this distribution.  Certainly, Alice’s best response is to play the strategy $i'$ that maximizes $\sum_j p_{ij} a_{i'j}$ — we can denote the mapping from each $i$ to its optimal $i'$ by the function $F(i)$, and we define $\epsilon_i$ to be the gap attained at $i'=F(i)$: $\epsilon_i = \sum_j p_{ij} a_{F(i)j} - \sum_j p_{ij} a_{ij}$.  Now let us look at the expression for the regret of $A$ relative to the swap $F$: $U_B - U_A = \sum_t u_t(F(i_t)) - \sum_t u_t(i_t)$, which we know is bounded by $R$.  Given that by definition $u_t(i) = a_{i,j_t}$, and that $p_{ij}$ is simply $1/T$ times the number of times $t$ for which $i_t=i$ and $j_t=j$, we see that this expression is exactly equal to the sum of the $\epsilon_i$’s, which completes the proof of the lemma.
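The lemma can be checked empirically.  The sketch below runs Hart and Mas-Colell's regret-matching dynamics (one concrete choice of a low-internal-regret procedure; the implementation and the constants are mine) on the chicken game, and measures how far the empirical joint distribution of play is from satisfying the correlated-equilibrium inequalities:

```python
import numpy as np

A = np.array([[0.0, 4.0], [1.0, 3.0]])  # Alice's utilities in chicken
B = np.array([[0.0, 1.0], [4.0, 3.0]])  # Bob's utilities
N, T, MU = 2, 60000, 20.0               # MU: inertia constant of the dynamics

rng = np.random.default_rng(0)
# regret[p][k][i]: cumulative gain player p would have had by playing i
# at the times it actually played k
regret = [np.zeros((N, N)), np.zeros((N, N))]
last = [0, 0]
counts = np.zeros((N, N))

for t in range(1, T + 1):
    actions = []
    for p in range(2):
        k = last[p]
        probs = np.maximum(regret[p][k], 0.0) / (MU * t)  # avg. positive regret
        probs[k] = 0.0
        probs[k] = 1.0 - probs.sum()                      # stay put with the rest
        actions.append(rng.choice(N, p=probs))
    i, j = actions
    counts[i, j] += 1
    uA, uB = A[:, j], B[i, :]       # counterfactual one-shot utilities
    regret[0][i] += uA - uA[i]
    regret[1][j] += uB - uB[j]
    last = [i, j]

p_emp = counts / T
# Largest violation of the correlated-equilibrium inequalities:
viol = max(max((p_emp[i] * (A[i2] - A[i])).sum()
               for i in range(N) for i2 in range(N)),
           max((p_emp[:, j] * (B[:, j2] - B[:, j])).sum()
               for j in range(N) for j2 in range(N)))
print(round(float(viol), 3))   # small, and shrinking as T grows
```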

## Auction Prescriptions

I recently looked again at Ausubel’s multi-unit auction paper, and really liked the crystallization of the first paragraph in it:

The auctions literature has provided us with two fundamental prescriptions guiding effective auction design. First, an auction should be structured so that the price paid by a player—conditional on winning—is as independent as possible of her own bids (William Vickrey, 1961). Ideally, the winner’s price should depend solely on opposing participants’ bids—as in the sealed-bid, second-price auction—so that each participant has full incentive to reveal truthfully her value for the good. Second, an auction should be structured in an open fashion that maximizes the information made available to each participant at the time she places her bids (Paul R. Milgrom and Robert J. Weber, 1982a).

I would say that this is useful practical insight gained from studying theory.

## New paper: randomized auctions with budgets

The paper Incentive Compatible Budget Elicitation in Multi-unit Auctions by Sayan Bhattacharya, Vincent Conitzer, Kamesh Munagala, and Lirong Xia has recently been uploaded to the arXiv.

The basic model that they study is the auction of a single infinitely divisible good among bidders with budget limits.  In the model each bidder $i$ has a private value $v_i$ and private budget $b_i$.  The utility of $i$ when allocated $x_i$ units of the infinitely divisible good and charged $p_i$ is assumed to be $x_i v_i - p_i$ but only if the budget is not exceeded, $p_i \le b_i$. If $p_i > b_i$ then the utility is assumed to be negative infinity.

The twist in this model is the budget limit, which takes us out of the quasi-linear model usually assumed in auction theory.  In particular VCG mechanisms are no longer incentive compatible.  A previous paper by Shahar Dobzinski, Ron Lavi, and myself shows that indeed there is no (anonymous) incentive compatible mechanism that produces Pareto-optimal allocations.  This paper shows that there is such a randomized mechanism.

The paper starts with the mechanism that was previously shown to be incentive compatible in the special case where the budgets are public knowledge (an adaptive version of Ausubel’s auction).  The heart of the proof lies in an analysis showing that the only way a bidder can gain by mis-reporting his budget is by over-reporting; under-reporting is never profitable.  This requires quite a delicate analysis (which I haven’t fully read yet), but once this characterization is known, an incentive compatible randomized mechanism follows immediately: run the original mechanism, and instead of asking a bidder to pay $p_i$, ask him to pay $b_i$ with probability $p_i/b_i$.  The expected payments have not changed, so under-reporting is still not profitable, and this randomized payment makes over-reporting result in a positive probability of getting negative infinite utility, and thus it is certainly not profitable.
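The randomized-payment step is simple enough to simulate.  In the sketch below (the numbers are arbitrary, for illustration only), charging the full budget $b_i$ with probability $p_i/b_i$ leaves the expected payment at $p_i$:

```python
import random

def randomized_payment(p_i, b_i, rng):
    """Charge the full budget b_i with probability p_i / b_i, else nothing.
    The expected charge is exactly p_i, but over-reporting b_i now risks
    a positive-probability charge exceeding the true budget."""
    return b_i if rng.random() < p_i / b_i else 0.0

rng = random.Random(42)
p_i, b_i = 3.0, 10.0                       # illustrative values only
draws = [randomized_payment(p_i, b_i, rng) for _ in range(200000)]
print(sum(draws) / len(draws))             # ~3.0, the original payment p_i
```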