Feeds:
Posts

## Regret in Markets

How do markets and other games reach equilibrium?  Or at least, how can they do so? A nice answer will have each participant in the market or game repeatedly performing some simple update step that only depends on information available to him and that makes sense from his local point of view, with the whole system then magically converging to equilibrium.  The meanings of “simple”, “update”, “information available”,  “makes sense”, and “converging” may vary, but the idea is to capture the basic phenomena of a global equilibrium arising from simple local optimizations.

In this post I will describe some recent work that shows this kind of phenomena where local “regret-minimization” optimizations lead to global convergence in a natural market model.

## Regret and Online Convex Optimization

We start by looking at an online optimization strategy for a single decision maker that interacts with an adversarial environment.  It will be nicest to use the general and clean model of  online convex optimization due to Zinkevich.  This model considers an sequence of functions $u^t : S \rightarrow \Re$ for some set $S$. The rules of the game require the algorithm at every time $t$, for $t =1,2,...T$, to pick a point $x^t \in S$ with the aim of  maximizing $u^t(x^t)$, and do so before seeing the function $u^t$, but only based on the history, i.e. on the functions $u^1 ... u^{t-1}$.  Given that we are not assuming any connection between the different $u^t$‘s, even not a stochastic one, this may seem hopeless, but our yardstick for success will be a bit modest as well: we only want to minimize our regret relative to the best constant point.  I.e.

Definition: A sequence of points $x^1 ... x^T$ has regret $R$ on the sequence of functions $u^1 ... u^T$ if for every $x^* \in S$ we have that $\sum_{t=1}^T u^t(x^t) \ge \sum_{t=1}^T u^t(x^*) - R$.

The main point is that there exist online algorithms that always find such low-regret sequences when $S$ is any compact convex subspace of a Euclidean space and each $u^t$ is concave (hill shaped).  Zinkevich presents a simple online algorithm that ensure $R = O(\sqrt{T})$ for any sequence of such functions, and other ones can do even better.  Zinkevich’s learning algorithm is simple gradient ascent: You start with an arbitrary $x^1$ and then each $x^t$ is obtained by taking a small step of size $\delta_t$ from $x^{t-1}$ in the direction of steepest ascent of the function $u^{t-1}$ (and , if needed, projecting back to $S$).  Intuitively you prepare for the future by only slightly modifying what you did in the past, in the direction that would have been best then.  Amazingly, by choosing $\delta_t = 1/\sqrt{t}$, one gets $R = O(\sqrt{t})$ for any sequence of concave functions (where the hidden constant in the big-O notation is quadratic in the diameter of $S$ and in the maximum steepness of the $u^t$‘s).  In particular, the average loss, $R/T$ approaches 0.  This model and these results generalize the well-studied case where the $u^t$‘s are linear functions. In particular, a leading application to have in mind is where the convex set $S$ is the set of mixed strategies, i.e. distributions, over a finite set of $m$ pure strategies, and the utility is the linear function giving the expectation of the utility of the chosen pure strategy.

## Coarse Equilibria in Games

So far we were in a decision theoretic model of a single decision maker.  Now, what happens when several different decision makers are using such a regret-minimization procedure together?  Formally, we have an $n$-player  game where players have convex strategy sets $S_i$ and utility functions  $u_i : S_1 \times ... \times S_n \rightarrow \Re$.  The players play this game repeatedly, at each time $t$, each player $i$ chooses a strategy $x_i^t$.  Looking at this from player $i$‘s point of view, at each time $t$ he is presented with function of his $x_i$, where the function is determined by the values of the other players’ choices at time $t$, i.e. for $z \in S_i$ we see the function $u_i^t(z) =u_i(z, x_{-i}^t)$.  If each of the players uses a regret minimization algorithm then they will generate together some sequence of strategy profiles $x^1 ... x^T$, where each $x^t=(x_1^t ... x_n^t)$.  Will these approach some kind of equilibrium from a game theoretic perspective?  I heard from Eva Tardos the following term for this:

Definition: Let $D$ be a distribution on the strategy profiles $S_1 \times ... \times S_n$.  We say that $D$ is a coarse equilibrium (or Hannan consistent) if for every player $i$ and every $x^*_i \in S_i$ we have that $E_{x \sim D} [u_i(x)] \ge E_{x_{-i} \sim D_{-i}} [u_i(x^*_i,x_{-i})]$, where $E$ denotes expectation.  We say that it is an $\epsilon$-coarse-equilibrium if $E_{x \sim D} [u_i(x)] \ge E_{x_{-i} \sim D_{-i}} [u_i(x^*_i,x_{-i})] - \epsilon$.

Intuitively, we can think about a trusted coordinator who picks a profile of strategies $x_1 .. x_n$ according to the distribution $D$ and “suggests” it to the players.  $D$ is a coarse equilibrium if all players prefer to follow these suggestions, rather than play any fixed strategy that may only depend on the distribution $D$ and not on the chosen recommendation.  This is strictly weaker than correlated equilibrium in which players should prefer following suggestions to choosing strategies whose identity may depend on  their suggested strategy $x_i$.  This type of equilibrium is what the definition of low regret implies directly:

Proposition: If all players have regret $R$, (i.e. for each $i$ and each $x^*_i$, we have that $\sum_{t=1}^T u_i(x_i^t,x_{-i}^t) \ge \sum_{t=1}^T u_i(x^*_i,x_{-i}^t) - R$,) then the uniform distribution on $\{x^1 ... x^T\}$ is an $\epsilon$-coarse equilibrium for $\epsilon = R/T$.

Let us just look what this implies for finite games in which players have a finite set of pure strategies.  To apply the results form online concave optimization, the regret algorithms will be run over the set of mixed strategies, which is a compact convex set.  In this case the utility for a profile of mixed strategies is the expected utility of the chosen pure strategy profile. Thus once $x^t_{-i}$ is fixed, the utility function on $x_i$ is linear and thus concave so the regret minimization algorithms apply here and yield a $\epsilon = O(1/\sqrt{T})$-coarse equilibrium in time $T$.  By direct definition this is a $\epsilon$-coarse equilibrium of the mixed game, i.e a distribution over (the $T$) profiles of mixed strategies.  However, this distribution may also be viewed as a distribution on the underlying pure strategies so we have also achieved an $\epsilon$-coarse equilibrium of the pure game.   It follows that for every finite game, if each player runs a regret minimization algorithm on the set of mixed strategies then we get a coarse equilibrium.

## Socially Concave Games

Just the fact that we have called what is reached (in the sense of average over time) an equilibrium does not mean that we have to like it from a game theoretic sense.  Could we get a “normal” equilibrium from such corse equilibrium?  Even-Dar, Mansour, and Nadav point out a family of games, termed socially concave games, where this is the case.

Definition: An $n$ player game is called socially concave if it satisfies the following two properties:

• For each $i$ and each fixed value of $x_i$, the function $u_i(x_i, x_{-i})$ is convex in $x_{-i}$.
• The sum $\sum_{i=1}^n u_i(x)$ is a concave function of $x$.

Theorem: Let $D$ be a coarse equilibrium of a socially concave game with convex strategy sets, and denote the average value of of elements drawn according to this distribution by $\hat{x} = E_{x \sim D} [x]$ (so by convexity of the $S_i$‘s also $\hat{x}_1 \in S_1 ... \hat{x}_n \in S_n$) then $(\hat{x}_1 ... \hat{x}_n)$ is a (pure) Nash equilibrium.  If $D$ is $\epsilon$-coarse equilibrium then $(\hat{x}_1 ... \hat{x}_n)$ is an $n\epsilon$-Nash equilibrium.

This theorem ensures a pure equilibrium in the game with convex strategy sets.  Looking at the case that the game is the mixed extension of a finite game, this is translated to a mixed equilibrium of the finite game.  Before we prove the theorem let’s look at a simple application: two-player zero-sum finite games.  Looking at the mixed extension we have $u_1(y,z) = -u_2(y,z) = y^TAz$, where, $y$ and $z$ are the mixed strategies of the two players and $A$ is the $m_1 \times m_2$ matrix of utilities of player 1.  This game satsifies the two conditions since for any fixed $y$ we have that $u_1$ is a linear function (hence convex) of $z$, dualy for player 2, and the sum $u_1(y,z)+u_2(y_z)=0$ is constant and thus concave.  This implies that if the two players in a zero-sum game run a regret minimization algorithm (on the mixed strategies) then taking the average of the distribution over time is a ($\epsilon=O(1\sqrt{T})$-)mixed Nash equilibrium.

Proof: We will prove the exact version, the quantative is identical, only taking care of the $\epsilon$‘s everywhere.  What we need to show is that for every $x^*_i$, it holds that $u_i(\hat{x}_i,\hat{x}_{-i}) \ge u_i(x^*_i,\hat{x}_{-i})$.  We will show this by showing the same inequality for the sum: $\sum_i u_i(\hat{x}_i,\hat{x}_{-i}) \ge \sum_i u_i(x^*_i,\hat{x}_{-i})$.  Notice that for the optimal $x^*_i$ (the one maximizing $u_i(x^*_i,\hat{x}_{-i}$) we trivially have $u_i(\hat{x}_i,\hat{x}_{-i}) \le u_i(x^*_i,\hat{x}_{-i})$, and thus the inequality of the sum at the optimal $x^*_i$‘s implies the inequality (in fact equality) for each term seperately.  So we start with $\sum_i u_i(\hat{x}_i,\hat{x}_{-i})$ and make use of the concavity of the sum and the definition of $\hat{x} = E[ x]$  to conclude $\sum_i u_i(\hat{x}_i,\hat{x}_{-i}) \ge E[\sum_i u_i(x_i, x_{-i})]=\sum_i E[u_i(x_i, x_{-i})]$ (using Jensen’s inequality.)  At this point we invoke the definition of coarse equilibrium for each $i$ getting $E[u_i(x_i, x_{-i})] \ge E[u_i(x^*_i,x_{-i})]$, and now using the convexity of $u_i$ in $x_{-i}$ (once $x^*_i$ is fixed) we get $E[u_i(x^*_i,x_{-i})] \ge u_i(x^*_i,\hat{x}_{-i})$, chaining these inequalities we get the required inequality on the sums.

## A Market Equilibrium

We can now take this general result and apply it to a rather general market model.  This  is due to Sergiu Hart and Andreu Mas-Colell.  (When I asked Sergiu how should I refer to this result, he said something like “had we been computer scientists this would have already been published in a conference; since we are economists these are preliminary thoughts that maybe will find their way into a paper within a few years”.)

Our market has $m$ divisible goods and $n$ players.  Each player $i$ has  a concave valuation function $v_i(x_1 ... x_m)$ that specifies his value for owning the bundle of goods with $x_j$ fraction of each good $j$.  Each player $i$ has an initial endowment of $w_{ij}$ of each good $j$, and they may trade between themselves, reaching some final allocation, where each player $i$ receives a fraction $x_{ij}$ of each good $j$, where for each good $j$ we have that $\sum_{i} w_{ij} = \sum_{i} x_{ij}$.  Significantly, we assume the existence of money that can be transferred between the players, formally “transferable utilities” or “quasi-linearity”.   In this model the notion of a market equilibrium is:

Definition: The allocation $(x_{ij})$  is at (transferable utility) equilibrium with prices $(p_1 ... p_m)$ if for all goods $j$ we have that $\sum_{i} w_{ij} = \sum_{i} x_{ij}$ and for every player $i$, his net utility, $v_i(x_{i1}...x_{im})-\sum_{j} (x_{ij}-w_{ij})p_j$ is maximized (over all possible bundles $(x_{i1}...x_{im})$).

We now turn this market into a game.  For this, we will require the minor technical assumption that the partial derivatives of the $u_i$‘s are bounded  from above by $M$ and from below by $1/M$ for some constant $M$.  The game will have $n+m$ players: $n$ players that correspond to the original players in the market, with the strategy set of player $i$  being the set of  vectors $(x_{i1}...x_{im})$ with $0 \le x_{ij} \le \sum_i w_{ij}$, and $m$ players corresponding to the goods, where the strategies of “good-player” $j$ are the possible prices $0 \le p_j \le M$.  The utility of an “original-player” $i$ is, as expected,  $u_i(x,p)=v_i(x_{i1}...x_{im})- \sum_{j} p_j(x_{ij} - w_{ij})$, while those of  “good-player” $j$ is: $u_j(x,p) = p_j \cdot (\sum_i x_{ij} - \sum_i w_{ij})$.  The point is that this new game captures, as its Nash equilibrium, the equilibrium of the previous market.

Proposition: The allocation $(x_{ij})$ with prices $(p_1 ... p_m)$ is an equilibrium of the market if and only if  it is a pure Nash equilibrium of the game defined above.

Proof: First notice that being a market equilibrium directly implies that all “original” players best-reply in the associated game.  It also implies that  $\sum_{i} w_{ij} = \sum_{i} x_{ij}$ and thus any $p_j$ is a best-reply for “good-player” $j$ in the game.  The opposite direction is identical once we show that any pure Nash equilibrium of the game must have $\sum_{i} w_{ij} = \sum_{i} x_{ij}$.  Suppose that this was not the case, say $\sum_{i} x_{ij} > \sum_{i} w_{ij}$ for some good $j$, then the only best response for “good-player” $j$ would be to declare $p_j=M$, to which all other players could only best-respond with $x_{ij}=0$ due to the bound from above on the derivative of the $u_i$‘s, contradicting  $\sum_{i} x_{ij} > \sum_{i} w_{ij}$.  Similarly if $\sum_{i} x_{ij} < \sum_{i} w_{ij}$ then the only best response for the “good-player” $j$ would be $p_j=0$, to which all other players could only best-respond with $x_{ij}=1$ due to the bound from below on the derivative the the $u_i$‘s, contradicting  $\sum_{i} x_{ij} < \sum_{i} w_{ij}$.

At this point it is easy to see that the game we just defined is socially concave: for an “original” player $i$, and every fixed $(x_{i1}...x_{im})$, his utility is just a linear function of the $p_j$‘s and thus convex in the strategies of the others; for a “good-player” $j$ and every fixed value of $p_j$, his utility is just a linear function in the $x_{ij}$‘s so again is convex in the strategies of the others.  Finally, summing up the utilities of all players, the $p_j \cdot (\sum_i x_{ij} - \sum_i w_{ij})$ terms cancel out and we are left with $\sum_i v_i(x_{i1}...x_{im})$ which is a concave function being the sum of concave $v_i$‘s.

So we now close the loop: to effectively find an $\epsilon$-equilibrium of a given market, we let each player in the market as well as artificially created  “good-players” run a regret-minimization algorithm for a while.  This is a very simple decentralized process: the original players attempt optimizing their demands to the sequence of ever-changing price vectors that they see, while the “good-players” optimize their prices to the changing aggregated demands that they see.  In both cases, not by a simple best-response but rather by the regret minimization response.  We then have that the long-term averages of demands and prices will be the desired market equilibrium point.

## Vazirani: Seeking Combinatorial Algorithms for Convex Programs

Vijay Vazirani has spent a considerable amount of time in the last few years on developing combinatorial algorithms for problems (mostly involving market equilibria) that can be solved by general convex programming.  In this guest post Vijay talks about the motivation for this:

Since convex programs can be solved efficiently using “continuous” methods, such as ellipsoid or interior point methods, why bother designing extremely ellaborate and difficult combinatorial algorithms for them? Let me propose the following thought experiment (it is not a difficult one!) to bring home the point. Think of a world in which these continuous methods were developed in the 1920’s and ever since, problems such as network flow and matching, which can be cast as linear programs, were routinely solved using such methods. Then in 1956, Ford and Fulkerson propose their beautiful combinatorial algorithm for max-flow and it is immediately trashed, since it is “needlessly complicated and difficult”. Edmonds’ matching algorithm, which is proposed in 1965, meets a similar fate. As a consequence, in this world, combinatorial optimization is a stillborn field. How much of a tragedy would that be? After all, matchings and max-flows could still be computed efficiently …

Observe that the continuous methods solve the instance at hand, without giving any insight into the entire problem from which the instance arose. In fact, such methods do not even need to be told the problem from which the instance arose — all that is needed is the specific linear or convex program for the given instance. So, in this world, theoreticians (if they can be called that) would not have the deep insights we have into the structure of numerous fundamental combinatorial problems. And you can gauge what kind of a “theory” of algorithms this world would have!

I believe that if our research community does not pursue the design of combinatorial algorithms for convex programs, our world would also lose out on a beautiful, deep theory. My lone work on this topic, some joint with my students, is too insignificant an effort in comparison with the ideas, structures and methods that still need to be unearthed. My inability to convince people so far, on what had seemed to me an obvious direction worth pursuing, is the reason for this post.

The convex programs I have studied over the last 8 years arose in AGT, in the context of market equilibria and Nash bargaining games. The theory so far has started forming around a remarkable convex program given by Eisenberg and Gale in 1959. In order to solve these nonlinear programs combinatorially, the classical primal-dual algorithm design paradigm had to be extended from its previous setting of LP-duality theory to the more general setting of convex programs and KKT conditions. The algorithms are non-trivial and require substantial structural insights. In turn, these structural insights provide a starting point for tackling more general problems. To get an idea of how rich the theory is already, consider the following episode. A few months ago, Gagan Goel and I started designing a combinatorial algorithm for a certain Nash bargaining game under peicewise-linear, concave utility functions; the linear case had already been solved. Out of the structural insights gained, we managed to define a new, natural market that models perfect price discrimination and in which buyers have peicewise-linear, concave utility functions (the usual Fisher and Arrow-Debreu markets were recently shown to be PPAD-complete under these utility functions). The equilibrium of our market is captured by a generalization of the Eisenberg-Gale program and is polynomial time computable — even combinatorially. In addition, the convex program yields very simple proofs of both welfare theorems for this market!

It is important to point out immediately that my work is limited to a very thin sliver of convex programs — depite their nonlinearity, these programs admit rational solutions. How about attacking more general classes of convex programs combinatorially, perhaps via approximation algorithms, since they won’t admit rational solutions? Considering how extensive the theory is already, I believe it cannot exist in islolation and is indicative of the tip of an iceberg. We need a lot more smart people to ponder on these issues to find the rest of the iceberg!

## Economists and Complexity

One of main culture clashes between computer scientists and economists on the CS-econ frontier is whether “complexity of equilibria” matters.  The  CS-y view of the matter is captured in Kamal Jain’s quote: “If your laptop can’t find it then neither can the market“.  Economists mostly don’t care since they see equilibrium reached everyday, contrary to what CS says.  As Jeff Ely quips:  “Solving the n-body problem is beyond the capabilities of the world’s smartest mathematicians.  How do those rocks-for-brains planets manage to do pull it off?”  TCS folks who see complexity as the map of the world can’t really understand this indifference, as Lance Fortnow tweets: “I’m an economist so I can ignore computational constraints / I’m a computer scientist, so I can ignore gravity.

Computational Complexity map of the world

The beautiful thing about studying systems at equilibrium is precisely the fact that this abstracts away the gory details of the process (aka dynamics aka algorithm) of reaching this equilibrium.  In principle, there are many different difficult-to-understand dynamics that all reach the same easy-to-understand equilibrium point.   This is all very nice as long as the equilibrium is indeed magically reached somehow by the “real” dynamics.  The obvious crucial question is whether this is the case in practice. It seems that the general tendency among economists is to claim “yes”, to which the natural next question in line is “why”?  As computer scientists show, this is not a general characteristic of large games or markets.  Understanding the properties of interesting games and markets that make them actually reach equilibrium should be enlightening.  Maybe it is because economists choose to consider only those that do turn out to converge quickly, ignoring another large and interesting class of strategic scenarios? Maybe it is because economists are thinking about “smallish” games and so their insights will not carry over to more complex realistic scenarios?  Maybe there is some deeper interesting structure that guarantees fast convergence?  Distinguishing between these possibilities is especially crucial as we aim to analyze the new artificial games and markets that are to be found — and designed — on the Internet as well as elsewhere.  Which economic and game-theoretic sensibilities will still hold in these complex unnatural  circumstances?

Complexity is all about understanding such processes.  While the foremost question dealt by computational complexity is that of “time” — how long does a computational process need in order to find the solution — in our case to reach (close-to) equilibrium — this is not the only type of questions and insights provided by complexity theory.  As one can see in the “map above”, there are a stunning variety of complexity classes, each attempting to capture a different facet of the challenge of finding the solution: how much space (memory) is needed? Can we even tell when we reach a solution?  Does randomization help?  Is it helped parallelism?  Are approximations easier?  Does the solution have this or that particular structure? In the case of finding equilibria, the classes PPAD and PLS give a very nuanced explanation of what is involved.  There are also “concrete” models that study explicitly specific parameters such as communication or queries.  One may dislike the fact that this complexity analysis does not restrict attention to natural dynamics but allows arbitrary and unnatural algorithms.  The kind of natural dynamics that economists usually have in mind are some kind of best-replying in case of games and some kind of a Walrasian auctioneer in markets.  The problem is that there are many variants of these that make sense: fictitious play, various forms of population dynamics, more sophisticated types of learning such as regret-minimization, and all these can be enhanced with various orderings, smoothing attempts, tie-breaking rules, strategic look-ahead, re-starts, actions of the central planner, not to mention other more or less complex optimization and learning  attempts.  The strength of complexity analysis is that it applies to all of these.  Any “lower bounds” are definitive: any practical system can be simulated by a computer, and thus no dynamics can succeed in general. (Emphasis on “in general” — as mention above, the problems that you may be interested in may have special structure — so what is it?)   A statement of  an “upper bound” may be less interesting as stated, but immediately raises the challenge of either finding a natural algorithm=process=dynamic or pinpointing the complexity reason explaining why this is impossible.

This is a good point to refute several irrelevant objections to the applicability of computational complexity for analyzing dynamics of games and markets.  The first is the notion that humans or markets can undergo processes that cannot be computed.  Frankly, there is no evidence of this; there is certainly much about the world that we do not understand well enough to simulate, but there is no indication of any natural process that is inherently more powerful than computers.  This is the modern version of the Church-Turing thesis.  (It seems that some quantum phenomena cannot be simulated by usual computers, but that doesn’t change the principle — it just would replace classical computers with quantum ones in the few places that it matters.)  Even if you do subscribe to metaphysical dualism, do you want to base economics on it?  The second types of objections concern the standard technical notions used in complexity which obviously leave much wiggle-room: “polynomial time”, with its hidden constants, is not synonymous with efficient; worst-case analysis is not always the right notion, etc.  The point is that these are simply concise notions that usually seem to capture the issues well.  There always are cases where more refined notions are needed, and in such cases complexity theory can provide more precise answers: for example in analysis of basic problems like sorting or matrix multiplication, very exact results are obtained (with no hidden constants) similarly, cryptography is not based on worst-case analysis, etc.  There is no indication so far that the usual somewhat coarse notions of computational complexity miss something significant when applied to games or markets — quite the contrary in fact.  If such evidence emerges then complexity will not become irrelevant; simply more refined analysis will be used.

Take for example the stunning difference between reaching a Nash equilibrium and reaching a Correlated equilibrium.  While the latter is reached by various natural regret-minimization dynamics, there are no “non-trivial” dynamics that, in general, reach the former.  Let me say a word about this “non-triviality” by giving an example of what I would consider a trivial process: suppose that each player chooses a strategy at random every “turn”, unless his last strategy was already a best-reply to those of the others (up-to some $\epsilon$ indifference).  At some time, the players will happen to “hit” an ($\epsilon$-) equilibrium.  This type of dynamics that simply search over the whole space of strategy profiles provides no insight and is not useful in most practical situations.  The point of the triviality is not that of the random choices but rather that of essentially trying every possibility.  Many other proposed “dynamics” for Nash equilibrium or for market equilibrium are similarly trivial — in some cases they resort to simply trying all possibilities (in some approximate sense).  The dynamics for correlated-Nash are not like this at all — they only look at a tiny “small-dimensional” fraction of the space of possibilities.  Why is that? Complexity theory explains this phenomena clearly: correlated equilibria can be found in polynomial time, but finding a Nash equilibrium (or many types of market-equilibrium) is PPAD-complete.

## The Computational Complexity of Pure Nash

Perhaps the first complexity question regarding equilibria that a Computer Scientist can ask is how difficult — computationally — is it to find a pure Nash equilibrium of a given game, or report that none exists.

In the simple setting, the $n$-player game is given by it’s $n$ utility functions, each of them given as a table of size $m^n$ of numbers (where we assume for simplicity that each player has a strategy space of size $m$.)  By going over all possible $m^n$ possible strategy profiles and checking for each of them whether some player can gain by deviating we get an efficient algorithm.  (Easy exercise: The running time of this algorithm is $O(n^2 m^n)$ — improve it to linear in the input size, i.e to $O(nm^n)$.)

This, however, is not the end of the story.  In many cases, we really have a “very large” game, and we would like to find an equilibrium in time which is significantly sub-linear in the size of the game’s utility tables.   How can we even talk about getting sub-linear time algorithms?  We need to read the input — don’t we?  There are two standard ways in Computer Science to talk about this question.  The concrete complexity approach assumes that the utility tables are given by some of “black box” which can answer queries of a certain type (e.g. given a strategy profile $s$, can return the value of $u_i(s)$ as a single atomic operation).  The set of queries that the black box supports will of course determine what can be done efficiently in this model.  The Turing-Machine complexity approach will consider concise representations of the utility functions, whose length is significantly less than the full size of the tables, and expect the algorithm to find an answer in time that is polynomial in the size of the concise input.  This of course gives different algorithmic problems for different concise representations.

In the rest of this post I will present the basic study of the complexity of finding a pure Nash equilibrium in these models (for general games as well as some special cases.)   These basics are often overlooked as they are shadowed by the meaty PLS-completeness results and PPAD-completeness results, but I prefer  talking about the basics even though these contains no “significant” proofs.

### Black Box Complexity

In the variant here we assume that we get $n$ “black boxes”, one for each player, that can answer utility queries: given a strategy profile $s$ return $u_i(s)$.  Another variant we can consider would also allow asking best-reply queries: given a strategy profile of the others $s_{-i}$, return the best-reply $s_i$ of player $i$.  (Where we would need to specify how ties are handled.)  We will concentrate on cases where $m$ is not too large and we allow the running time to be polynomial in $m$, even though in some cases we could think about running in time which is logarithmic in $m$ — the length needed to represent a strategy of a player. In these situations, as a best-reply query can easily be simulated by $m$ utility queries to all possible values of $s_i$, adding a best-reply query doesn’t significantly change the power of the model.  (One can imagine even more queries to ask from the black-box, and the ultimate in this direction would be t allow asking any question about $u_i$, but only about a single $u_i$ for each query.  This model turns out to be equivalent to the communication complexity model.)

We will consider not only general games, but also two sub-classes of games that are known to always have a pure-Nash equilibrium: Dominance-solvable games (those where iterated elimination of strictly dominated strategies leaves a single strategy profile), and Potential Games.  For motivation it is best to start with a “good” algorithm, that finds a pure Nash equilibrium in all Dominance-solvable games:

1. Let $s$ be an arbitrary initial strategy profile
2. Repeat $mn$ times
1. For all players $i = 1 ... n$ do
1. $s_{i} =$ best reply of $i$ to $s_{-i}$.

The total running time of this algorithm is $O(m^2 n^2)$ operations (which include utility queries to the black box, where each best-reply calculation is implemented via $m$ utility queries.)  The interesting fact is that this algorithm always ends with a pure Nash equilibrium — as long as the game is dominance-solvable.  (Idea of easy proof: a strategy that was eliminated in the $t$‘th step of the strategy elimination process can never be used after $t$ iterations of the outer loop of this algorithm.)  What is remarkable about this running time is that it is exponentially smaller than the “input size”, as each black box contains $m^n$ numbers in it.

Can such an efficient algorithm — whose running time is polynomial in $n$ and $m$ — be found for other games?  For potential games?  Maybe even this algorithm — just repeated best-replying — will do the trick?  It is not difficult to see for general games repeated-best replying may loop infinitely even if some pure Nash equilibrium exists.  For potential games, almost by definition, every sequence of best-replies will indeed terminate with a pure Nash equilibrium.  (In fact, ordinal potential games are exactly those games where every sequence of better-replies terminates.)  But how long can this take?  Unfortunately, exponential time.  (Exercise: find an exact potential game with $n$ players each having two strategies, where starting from some strategy profile $s$, every sequence of best-replies requires exponential time to reach a Nash equilibrium.)

For the case of general games it is easy to provide such an adversary.  Let the adversary fix in his mind some game without any pure Nash equilibrium. (For concreteness, say, 2 players playing matching pennies, and another $n-2$ players not caring at all, but still having two strategies each.  This gives a game with $2^n$ strategy profiles, whose structure is just many multiple copies of the 2×2 game of matching pennies.)  Whenever the algorithm makes a query, the adversary will just answer with the value of the fixed game.  Clearly the algorithm can never find a pure Nash equilibrium since none exists.  The point is that the algorithm cannot decide that no Nash equilibrium exists before making at least $2^n$ queries.  If even a single strategy profile was not queried, then it could be that at that strategy profile all players get high utility (where “high” is anything higher than any value in the original game) and thus it is a Nash equilibrium.

For the case of potential games the adversary must be somewhat more clever.  We will consider an $n$ player game where each player has two strategies and thus the space of strategy profiles is the $n$-dimensional Boolean hypercube. Our adversary will produce an exact potential game, as will be evident by the fact that all players will have exactly the same utility at each strategy profile.  In such cases a pure Nash equilibrium is simply a local maximum of this utility function.  Let us start with the basic idea, that does not quite do the job: At the $t$‘th query of the algorithm, $s^t$, the adversary will answer $u_1(s^t) = ... = u_n(s^t) = u(s^t)=t$.  The idea is that the algorithm cannot find any Nash equilibrium this way since a strategy profile can be declared to be such only if all of its $n$ neighbors (in the Boolean hypercube — corresponding to the possible $n$ deviations by a single player) were already queried to have a lower utility than the vertex itself.  Since the adversary keeps answering with increasing values, the only way that this can happen is if the algorithm first “closes up” a strategy profile (vertex in the hypercube) by first asking all of its neighbors and only then the enclosed vertex.  So to fix it, we’ll improve the adversary: when asked a query $s^t$, the adversary will look at the set of un-accessed vertices of the hypercube and see whether any vertices are left disconnected from the “rest of the hypercube”.  Specifically, let us call a vertex “compromised” if its connected component (after removing all queries $s^1 ... s^t$ from the hypercube) has size strictly less than $2^{n-1}$ (half the hypercube).  What our adversary will do is go over all newly compromised vertices, fixing their values in an order that increases from the farthest away from $s^t$ to $s^t$ itself.  This way a vertex is never assigned a value after all of its neighbors have, and thus no vertex can be guaranteed to be a Nash equilibrium until all vertices of the hypercube are either queried or compromised.  The only question is how many queries does this take?  Let $Q$ be the set of queries and $C$ be the set of all vertices compromised by $Q$, how large does $Q$ have to be for the union to be at least half of the whole hypercube?  Combinatorially, $Q$ must contain all of $C$‘s neighbors in the hypercube.  This leads us to ask for smallest possible size of the set of neighbors of a set of vertices in the hypercube — known as the isoperimetric problem for the Boolean hypercube.  This question is completely solved (see chapter 16 of Bollobas), and in particular it is known that for any subset of at most half of the hypercube, the size of the set of its neighbors is at least $\Omega(1/\sqrt{n})$ fraction of original set’s  size.  This means that $|Q| \ge \Omega(|C|/\sqrt{n})$, which implies that $|Q| \ge \Omega(2^n/\sqrt{n})$ — an exponential lower bound for the worst case running time of any algorithm that finds a pure Nash equilibrium for all potential games.  An immediate corollary is that there may be best-reply paths of this exponential length.

### Turing-Machine complexity

We now move to the alternative way of accessing a “large” game, that of representing it concisely.  Clearly not every game has such a concise representation, but can we at least find and equilibrium efficiently for those that do?  Also, different representations may require different complexities — which concise representation should we consider?  The first one we will look at is the general representation of a black box by the code of a procedure, equivalently by a description of Turing machine. However,  we only want to consider TMs that run efficiently, e.g. in polynomial time, or alternatively count the running time of the TM as part of the input size.  The usual formalism for giving such a TM as input, is the equivalent requirement of a Boolean circuit that computes the black-box queries, and this will be our first concise representation:

Input: $n$ boolean circuits, each computing a function $u_i : (\{0,1\}^{\log m})^n \rightarrow \{0,1\}^t$, where the input of each circuit is the index of each player’s strategy and  the output of each circuit is interpreted as a $t$-bit long integer specifying the utility.

Output (Decision version): does this game have a pure Nash equilibrium?

We would expect the running time to be polynomial in the input size: $n,t$, and the total size of the circuits.  As before let us concentrate on cases where $m$ is not too large (i.e. also polynomial in these parameters).

So what is the complexity of this problem?  Generally speaking, the state of complexity theory is such that we get essentially no algorithmically-useful information from the circuit representation.  Thus, we would expect a polynomial time algorithm to exist for this generic representation only if such existed in the black-box model — which is doesn’t.  In this world we can easily prove an NP-completeness result.

Theorem: Existence of a pure Nash equilibrium in a game given by boolean circuits (as above) is NP-complete.

Proof: Containment in NP is by guessing the equilibrium point, and then verifying it by checking all $m$ possible deviations for every player.  (This is the only place where we used $m$ being small, otherwise the problem seems to be $\Sigma_2$-complete).  To prove NP-hardness we will provide a reduction from 3-SAT (I lost the reference for this reduction).  All players will have two strategies: 0 and 1.  For each variable we will have a player whose utility function is a constant 0.  For each clause we will have an additional player whose utility function gives him 1 if his strategy is the OR of its literals.  We will then have two additional “top” players whose utilities are as follows: if all clause players play 1, then both of these players get utility 1 whatever they play; otherwise, these players play “matching pennies” between themselves.  Note that in a pure Nash equilibrium, the variable players can play anything, the clause players must faithfully compute the clause values, and then the two “top” players can be in equilibrium if and only if all clause values are 1.

The next succinct representation we consider is one that aims to take advantage of the fact that in many games while there are many players, each player’s utility only depends on the strategies of a few others and not on everyone’s strategies.  In such cases, we can represent  the utility of each player as a full table of only the relevant players’ strategies.  In such games an important structure is obtained by looking at the graph  (or digraph) specifying for each two players whether the utility of one depends on  the strategy played by the other.  So here is the equilibrium problem using this “graphical games” succinct representation:

Input: A graph $G$ and for each vertex $i$ in it, a table $u_i : \{1...m\}^{d_i+1} \rightarrow\mathbb{Q}$, where $d_i$ is the degree of vertex $i$, and the table specifies the utility of player $i$ as a function of his strategy as well as those of his neighbors.

Output (Decision version): does this game have a pure Nash equilibrium?

We would expect the running time to be polynomial in the the total size of the tables, i.e polynomial in $n$ and in $m^d$, where $d = \max_i d_i$.

However just a tiny variation of the NP-completeness proof above shows that this problem in NP-complete too.  The only players in the reduction above who had degree more than 3 are the two top players who were “computing” the “AND” of all the clauses.  To get a reduction where all players have degree at most 3, we will add an additional player for each clause who will compute the “and” of this clause with all preceding ones.  This will be done by giving him utility 1 if his strategy is indeed the “and” of the original clause-player of his clause and the new-clause-player of the preceding clause (and 0 otherwise).  Now the two “top” players will just need to depend on the new-clause-player of the last clause as they did before (either always wining or playing matching pennies).  Thus we get:

Theorem: Existence of a pure Nash equilibrium in graphical games (as above) is NP-complete.

A nice exercise is to exhibit a polynomial time algorithm for the special case that when the graph is in fact a tree.

Now let us move to the special case of exact potential games.  The problem specification (input and output) is exactly as defined above where the utility functions are given as Boolean circuits, except that we are guaranteed that the utility functions are those of an exact potential game.  (This condition may not be easy to verify in general, but our algorithm may take it as an assumption.)   We know that a pure equilibrium is exactly a local maximum of the potential function.  This potential function is uniquely defined, up to an additive constant, by the given utility functions: once the value of the potential function is known for some profile, $P(s^0)$, we can change, in turn each $s^0_i$ to an arbitrary $s_i$ and the difference is completely specified by $u_i$.  Thus the potential function is defined by $P(s)-P(s^0) =$ $\sum_{i=1}^n [ P(s^0_1 ... s^0_i, s_{i+1} ... s_n) - P(s^0_1 ... s^0_{i-1}, s_i, ..., s_n)] =$  $\sum_{i=1}^n [u_i(s^0_1 ... s^0_i, s_{i+1} ... s_n) - u_i(s^0_1 ... s^0_{i-1}, s_i, ..., s_n)]$.  We see that the problem of finding a pure Nash equilibrium for potential games is equivalent to that of finding a local maximum of a given Boolean circuit, where “locality” is given by switching the values of one of the $n$ sets of $\log m$ bits in our problem definition. (Proof: the expression above is efficiently computable and thus our problem is no harder, and on the other hand a special case of our problem is where all $u_i$‘s are identical in which case our problem is equivalent to finding a local maximum.)

Now let us say a few words about the complexity of this local-maximum problem.  First, syntactically speaking we have an “NP search problem”: on input $x$ (the utility functions) the task is to find some solution $y$ (strategy profile) that satisfies $R(x, y)$ ($y$ is a local maximum of $x$), where $R$ is computable in polynomial time.  The general class of such search problems is termed FNP,  and in our case we are talking about problems with a guarantee that such a $y$ always exists, a subclass termed TFNP.  Now, semantically, there are various other problems that have a similar flavor of finding a local maximum (or minimum), and this class of problems, a subset of TFNP, was termed PLS (polynomial local search).  Some examples of such problems come from attempting to formalize what various local-improvement search heuristics obtain: suppose that you attempt finding a short traveling salesman route by gradually trying to switch the locations of pairs of cities along the route.  What do you get?  How long will this take?  The general class of problems in PLS is defined as those having polynomial-time computable procedures specifying a neighborhood relation and a quality score over solutions where the problem is to define a solution which is a local minimum (or maximum).  By these definitions, our problem is in this class PLS, and in fact the generic nature of our problem makes it PLS-complete.  So what does this PLS-completeness mean?  It means that the complexity of our problem is similar to the hardest problems in this class which seem to have intermediate difficulty between P and NP.  We do not know a polynomial time algorithm for them (and such an algorithm will require significantly new techniques since it doesn’t “relativize”), nor or they likely to be NP-complete (since that would also imply NP=coNP).

A special case of potential games is that of network congestion games.  In these we are given a (directed) graph, with $n$ pairs of vertices $(a_i,b_i)$ in it and, for each edge $e$, a sequence of $n+1$ costs: $c_e(0) .... c_e(n)$.  This is a succinct representation of a congestion game: player $i$‘s set of strategies is the set of paths in the graph from $a_i$ to $b_i$, and the cost (negative utility) of a player is the sum of $c_e(k_e)$ over all edges $e$ in his path, where $k_e$ is the number of players whose strategy (path) contains the edge $e$.  Network congestion games are exact potential games, and have a pure Nash equilibrium.  Finding such an equilibrium turns out to be PLS-complete too, although this requires a complicated reduction.

## Communication Complexity of Mixed-Nash Equilibria

In a previous post I discussed the communication complexity of reaching a pure Nash equilibrium.  The communication complexity model aims to capture the basic information transfer bottleneck between the different players in a games, abstracting away the question of incentives, and focusing on the need of communicating the different preferences (utilities) of the players that are assumed to initially be privately known.  The previous post introduced the  model of communication complexity and applied it to the question of finding a pure Nash equilibrium (if it exists).  The bottom line was that, in the general case, essentially all information about the utilities must be transferred and no “shortcuts” are possible.  In a multi-player game this amount of information is exponential in the number of players, implying that convergence to equilibrium, in general, is impractical, and special properties of the game must be used in order to reach equilibrium in reasonable time (one such property was demonstrated: dominance-solvable games).

This post discusses the issue of convergence to a mixed Nash equilibrium, as studied by Sergiu Hart and Yishay Mansour in How Long to Equilibrium? The Communication Complexity of Uncoupled Equilibrium Procedures.   The setting, as in my previous post, has each one of $n$ players holding his utility function $u_i : S_1 \times ... \times S_n \rightarrow \Re$, where each $S_j$ is the set of strategies of player $j$, and for ease of notation lets have $|S_j|=m$ for all $j$.  We will assume that all utilities are finitely represented, i.e. are rational numbers, say with $k$-bit numerator and denominator.  Can a mixed Nash equilibrium be found by communicating significantly less than $m^n \cdot k$ bits — the size of each utility function?  Maybe it can be done by communicating only a polynomial (in $n$, $m$, and $k$) bits?  This would be a necessary condition for efficient convergence of any  (uncoupled) dynamics between the players.

Before we look deeper into at this question, let us look at the similar problem of finding a correlated  equilibrium.  In this case it turns out that a polynomial amount of communication suffices and thus only a tiny fraction of the private information needs to be transferred. You can see a previous post of mine for background on correlated equilibrium.

### The Communication Complexity of Correlated Equilibrium

Let us recall that a correlated equilibrium is a probability distribution on the strategy profiles $p : S_1 \times ... \times S_n \rightarrow \Re$ that satisfies linear inequalities of the following form:  for every player $i$, and every two strategies of $i$, $s_i, s'_i$ : $\sum_{s_{-i}} p(s_i,s_{-i}) u_i(s_i,s_{-i}) \ge \sum_{s_{-i}} p(s_i,s_{-i}) u_i(s'_i,s_{-i})$, where the sum ranges over all strategy profiles $s_{-i}$ of the other players.  The interpretation is that $s_i$ is indeed a best-reply to the conditional distribution $p(s_i,\cdot)$ on $s_{-i}$.

[Edited on Oct 2nd: Albert Xin Jiang was kind enough to point out that the next paragraph is, well, wrong, so I’m striking it out.  In order to show that finding a correlated equilibrium can be done with low communication complexity, one must run the Ellipsoid algorithm on the dual LP (which has polynomial many variables) rather than on the primal (which has exponentially many variables).  How to do so effectively is shown in a paper by Papadimitriou and Roughgarden, and Hart and Mansour show that the algorithm can be implemented with low communication.]

As being a correlated equilibrium is defined by a set of linear inequalities, we can find a correlated equilibrium using linear programming.  Let us see how the players can jointly run the Ellipsoid LP algorithm while keeping the amount of communication in check.  The Ellipsoid algorithm runs in time polynomial in $m$, $n$, and $k$ as long as it has access to a separation oracle.  Such an oracle must be able to answer queries of the following form: given an unfeasible point, in our case a candidate distribution $p$, it must find a constraint, in our case a player $i$ and strategies $s_i, s'_i$, where the corresponding inequality is violated by $p$.  The main point is that each of the inequalities can be checked by a single player since it depends only on a single utility function.  Thus to implement the separation oracle, each player must either report a violated constraint or a single bit specifying that all his constraints are satisfied — all together taking $O(n + \log m)$ bits of communication.  When the algorithm terminates — after a polynomial number of steps (polynomial in $n$, $m$, and $k$) — all players know the answer, which is a correlated equilibrium.   (Note that even though a correlated equilibrium is an exponential-sized object, it will turn out to have a polynomial-sized support.)

While this low-communication algorithm can not be viewed as natural dynamics that efficiently converge to a correlated equilibrium, it does point out that some dynamics that converge efficiently do exist.   The question of finding natural dynamics then gets more pressing and indeed it turns out that natural dynamics do exist as well: dynamics based on regret minimization (see my previous post.)

### Mixed-Nash Equilibria

Now, once we have what to aim for — a similarly low-communication way of reaching a mixed-Nash equilibrium — let us look at the problem carefully again.  First, we must recall that Nash equilibria may be irrational even if all utilities are rational, so a Nash equilibrium can not be “printed” as binary numbers in any finite time.  Still, in our communication complexity formulation this shouldn’t be a problem since any representation of the equilibrium is in principle acceptable as long as it is uniquely determined by the communication.  Closely related to this is the fact that the representation of a Nash equilibrium in an $n$-player game may require exponentially many bits of precision, even if the utilities themselves have only short descriptions.  As before, while the required precision itself is not a lower bound on the communication complexity, Hart and Mansour do show how to use this precision to obtain a lower bound.  Below I give a different, somewhat stronger proof.

Let us start with the following 2-player bi-strategy game, where $0 is an arbitrary parameter.

r, 0                0, r

0, 1-r            1-r, 0

It is easy to see that the only Nash equilibrium of this game has each player choosing his second strategy with probability exactly $r$ and his first with probability $1-r$.  This family of games suffices for giving a tight lower bound for the special case of two-player ($n=2$) bi-strategy ($m=2$) games.

Lemma: The communication complexity of finding a Nash equilibrium in two-player bi-strategy games, where all utilities are given by $k$-bit integers, is $\Theta(k)$.

The upper bound is trivial and the lower bound is implied by games of the previous form where $r$ ranges over all fractions of the form $r = x/2^k$, for integer $0 since each of these games has a different (unique) answer giving $2^k$ different such answers, which thus requires at least $k$ bits of communication just to get all possibilities.

The dependence of the communication complexity on $m$, the number of strategies of each player, is still unclear.  However, Hart and Mansour show that the dependence on the number of players, $n$, is exponential.  The basic idea is that with $n$ players we can get doubly-exponential  many different answers.  A clean way to show this is to “simulate” utilities of representation length $k=2^{\Theta(n)}$, and then invoke the previous bound.

Win-Lose Simulation Lemma: For every $n$-player game with utilities that are $k$-bit integers, there exists an $(n + 3\log k)$-player game, with the following properties:

• The new game is a win-lose game, i.e. all utilities are 0 or 1.
• Each of the first $n$ players has the same set of strategies as in the original game.  The utility of each of these players is fully determined (in an easy manner) by his utility in the original game.
• Each of the final $3 \log k$ players has 2 strategies.  The utilities of each of these are constants not depending on the original game at all.
• The Nash equilibria of the new game are in 1-1 correspondence with those of the original game: the mixed strategies of each of the first $n$ players are identical, while the strategies of the final $3\log k$ players are fixed constants independent of the game.

From this we immediately get our theorem.

Theorem: The communication complexity of finding a Nash equilibrium in $n$-player bi-strategy win-lose games is $2^{\Theta(n)}$.

The upper bound is trivial and the lower bound is implied by the lower bound on 2-player games with $k$-bit utilities via this reduction using $n=2+3\log k$ players.

### Proof of Win-Lose Simulation lemma

It remains to prove the simulation lemma.  We will do so in three steps: first construct a fixed game whose unique equilibrium has exponentially low probabilities, then use this game to simulate games with utilities that are all powers of two, and finally use these to simulate general games with integer utilities.

The first step of the reduction is the following construction:

Construction: for each $t$ there exist $(2t)$-player win-lose bi-strategy games such that the unique Nash equilibrium has, for each $1 \le i \le t$, players $2i-1$ and $2i$ choosing their first strategy with probability exactly $2^{-2^{i-1}}$.

Proof: for $t=1$, this is exactly “matching pennies”, i.e. the $2 \times 2$ game described above for $r=1/2$ (after scaling by a factor of 2).  Now for the induction step we start with $t-1$ pairs of players as promised by the induction hypothesis, and we keep these players’ utilities completely independent of the soon to be introduced 2 new players, so we know that in every Nash equilibrium of the currently constructed $2t$-player game these will still mix exactly as stated by the fact above for $t-1$.  In particular the last pair of players in the $(2t-2)$-player game choose their first strategy with probability $2^{-2^{t-2}}$.  The point is that with these players in place, we can easily simulate the $2 \times 2$ game above for $r=2^{-2^{t-1}}$.  To simulate a “$r$ entry”, we define the utility as 1 when the last pair of players in the $(2t-2)$-player game play their first strategy (which happens with probability $2^{-2^{t-2}} \times 2^{-2^{t-2}} = 2^{-2^{t-1}} = r$) and 0 otherwise, while to define a $1-r$ entry we define the opposite.

Our next step is to simulate (in the sense of the lemma) games where all utilities are powers of two in the range $1, 2, 4 , ..., 2^k$, equivalently, after scaling, in the range $1, 1/2, 1/4, ..., 2^{-k}$.  This is done by adding to the original players $t=2\log k$ players as defined in the construction above.  We can now replace each utility of the original players of the form $2^{-x}$ where $x$‘s binary representation is $x = \sum_{i=1}^{t} x_i 2^{i-1}$ with a utility of 1 whenever, for every $i$ with $x_i=1$, new player $2i$ plays his first strategy,(and 0 otherwise).  This happens with probability $\Pi_{i|x_i=1} 2^{-2^{i-1}} = 2^{-\sum_{i |x_i=1} 2^{i-1}} = 2^{-x}$, as needed.

Our final step is simulating games with utilities that are general $k$-bit integers by games that have all their utilities powers of two, as in the previous step.  This is done by adding another $\log k$ bi-strategy players, paired into $\log k/2$ independent “matching pennies” games, and thus each of them always evenly mixes between his two strategies.  To simulate a utility of $y=\sum_{i=1}^k y_i 2^{i-1}$ in the original game, we give, in the new game, a utility of $y_i 2^{i-1}$ whenever the $\log k$ new players played a sequence of strategies which is the binary representation of $i$ (and 0 otherwise). The expected value is exactly $y/k$, completing the simulation (as the scaling by a factor of $k$ does not matter).

### What remains?

The previous lower bound leaves much to be desired: while it does give an exponential lower bound in the input parameters, this is done at the cost of having an exponentially long output.  If we also take the output length into account (in any representation) then the bound is no more than linear — trivial.  Indeed, in the formal description of the computational Nash equilibrium problem, one formally has an input parameter $\epsilon$ that specifies how close to an equilibrium do we demand the output to be, and the algorithm is allowed to run in time that is polynomial in $\log \epsilon$.  (See a previous post on the subtleties of this approximation parameter $\epsilon$.)  Taking this approach, the lower bound is trivial too as it is bounded from above by $\log\epsilon$.

Thus, the main open problem that remains is that of determining the communication complexity of finding a mixed-Nash equilibrium when the precision required at the output is only polynomial in the other parameters.

## New Paper: A computational view of Market Efficiency

An intriguing new paper, A computational view of Market Efficiency by Jasmina Hasanhodzic, Andrew W. Lo, and Emanuele Viola has just been uploaded to the arXiv.

Abstract

We propose to study market efficiency from a computational viewpoint. Borrowing from theoretical computer science, we define a market to be efficient with respect to resources $S$ (e.g., time, memory) if no strategy using resources $S$ can make a profit. As a first step, we consider memory-$m$ strategies whose action at time $t$ depends only on the $m$ previous observations at times $t-m,...,t-1$. We introduce and study a simple model of market evolution, where strategies impact the market by their decision to buy or sell. We show that the effect of optimal strategies using memory $m$ can lead to “market conditions” that were not present initially, such as (1) market bubbles and (2) the possibility for a strategy using memory $m' > m$ to make a bigger profit than was initially possible. We suggest ours as a framework to rationalize the technological arms race of quantitative trading firms.

## Motivation

The dogma of game theory (and economics) is that strategic situations will reach equilibrium — one way or another. How this will happen is often left un-answered, but from a computational point of view this is quite a central question. Indeed, a great amount of work has been done regarding the computational complexity of reaching an equilibrium of a game, reaching a conclusion that an infeasible amount of computation might be needed. In this post I want to discuss another aspect of the complexity of reaching equilibrium: the amount of information that needs to be transferred between the different players. The point of view here is that initially each player “knows” only his own utility function $u_i$ and then the players somehow interact until they find an equilibrium of the game $(u_1 .... u_n)$ defined by their joint utilities. The analysis here abstracts away the different incentives of the players (the usual focus of attention in game theory) and is only concerned with the amount of information transfer required between the players. I.e., we treat this as a purely distributed computation question where the input $(u_1 ... u_n)$ is distributed between the players who all just want to jointly compute an equilibrium point. We assume perfect cooperation and prior coordination between the players, i.e. that all follow a predetermined protocol that was devised to compute the required output with minimum amount of communication. The advantage of this model is that it provides lower bounds for all sorts of processes and dynamics and that eventually should reach equilibria: if even under perfect cooperation much communication is required for reaching equilibrium, then certainly whatever individual strategy each player follows, convergence to equilibrium cannot happen quickly.

The main result that we shall give in this post is that finding a pure Nash equilibrium (or determining that none exists) requires essentially communicating everything about the utility functions — no shortcuts are possible. This is the basic result that appears in two papers that studied the issue: Communication complexity as a lower bound for learning in games by Vincent Conitzer and Tuomas Sandholm that considered two-player games and How Long to Equilibrium? The Communication Complexity of Uncoupled Equilibrium Procedures by Sergiu Hart and Yishay Mansour whose focus is multi-player games. See also the presentation by Sergiu Hart. A future post will discuss the much more complicated case of finding a mixed Nash equilibrium, a question which is still mostly open, despite results given in the H&M paper.

## The Communication Complexity model

The basic communication complexity model considers $n$ players, where each player $i$ holds part of the input which we assume is an $N$-bit string, $x^i \in \{0,1\}^N$. The usual focus is on the two-player case, $n=2$, in which case we simplify the notation to use $x$ and $y$ rather than $x^1$ and $x^2$. The joint goal of these players is to compute the value of $f(x^1 ... x^n)$ where $f : (\{0,1\}^N)^n \rightarrow Z$ is some predetermined function. As $f(x^1 ... x^n)$ depends on all values of $x^i$, the players will need to communicate with each other in order to do so (we will use “broadcasting” rather than player-to-player channels), and they will do so according to a predetermined protocol. Such a protocol specifies when each of them speaks and what he says, where the main point is that each player’s behavior must be a function of his own input as well as what was broadcast so far. We will assume that all communication is in bits (any finite alphabet can be represented in bits). The communication complexity of $f$ is the minimum amount of bits of communication that a protocol for $f$ uses in the worst case. This model has been introduced in a seminal paper by Andy Yao, the standard reference is still the (just slightly dated) book Communication Complexity I wrote with Eyal Kushilevtiz, and online references include the wikipedia article, and the chapter from the Barak-Arora computational complexity book.

.

## The Disjointness Function

Our interest will be in lower bounds on the communication complexity. There are several techniques known for proving such lower bounds, the easiest of which is the “fooling set method” which we will use on the “disjointness function” for two players. The disjointness function, introduced in Yao’s original paper, plays a key role in communication complexity, and indeed it will turn out to be useful to us too: once we have a lower bound for it, it will be easy to easily deduce the lower bound for the function of our interest, finding a pure Nash equilibrium, by simple reduction.

The disjointness function: Alice holds $x \in \{0,1\}^N$ and Bob holds $y \in \{0,1\}^N$. They need to determine whether there exists an index $i$ where they both have 1, $x_i=1=y_i$. If we view $x$ as specifying a subset $S \subseteq \{1...N\}$ and $y$ as specifying $T \subseteq \{1 ... N\}$, then we are asking whether $S \cap T = \emptyset$.

Lemma: Any deterministic protocol for solving the disjointness problem requires at least $N$ bits of communication in the worst case.

Before proving this lemma, notice that it is almost tight, as Alice can always send all her input to Bob ($N$ bits) who then has all the information for determining the answer, who then sends it back to Alice (1 more bit). It is known that a $\Omega(N)$ lower bound also applies to randomized protocols, but the proof for the randomized case is harder, while the proof for the deterministic case is easy.

Proof: Let us look at the $2^N$ different input pairs of the form $\{ | \forall i,\: x_i+y_i=1\}$ (these are the “maximal disjoint” inputs). We will show that for no two of these input pairs can the protocol produce the same communication transcript (i.e. the exact same sequence of bits is sent throughout the protocol). It follows that the protocol must have at least $2^N$ different possible transcripts, and since these are all binary sequences, at least one of them must be of length at least $N$ bits.

Now comes the main point: why can’t two maximally disjoint pairs $$ and $$ have the same transcript? Had we been in a situation that $f(x,y) \ne f(x',y')$ then we would know that the transcripts must be different since the protocol must output a different answer in the two cases; however in our case all “maximal disjoint” input pairs give the same answer: “disjoint”. We use the structure of communication protocols — that each player’s actions can only depend on his own input (and on the communication so far) — to show that if $$ and $$ have the same transcript then $$ has the same transcript too (and so does $$). Consider the players communicating on inputs $$: Alice, that only sees $x$, cannot distinguish this from the case of $$ and Bob, that only sees $y'$ cannot distinguish it from the case $$. Thus when Alice communicates she will not deviate from what she does on of $$ and Bob will not deviate from what he does on $$. Thus, none of them can be the first to deviate from the joint transcript, thus none of them ever does deviate and the joint transcript is also followed on $$. We now get our contradiction since either $x$ and $y'$ are not disjoint or $x'$ and $y$ are not disjoint, and thus at least one of these pairs cannot not share the same transcript, a contradiction.   (The reason for non disjointness is that since $x \ne x'$ there must be an index $i$ with $x_i \ne x'_i$. If $x_i = 1$ and $x'_i=0$ then $y_i = 0$ and $y'_i=1$ in which case $x$ and $y'$ are not disjoint. Similarly, if $x'_i = 1$ and $x_i=0$ then $y'_i = 0$ and $y_i=1$ in which case $x'$ and $y$ are not disjoint.)

## Finding a Pure Nash Equilibrium

Back to our setting where each player $i$ holds his own utility function $u_i : S_1 \times ... S_n \rightarrow \Re$, where the $S_i$‘s are the commonly known strategy sets, which we will assume have size $m$ each. Thus each player’s input consists of $m^n$ real numbers, which we will always assume are in some small integer range. Perhaps it is best to start with a non-trivial (just) upper bound.  Let us consider the class of games that are (strictly) dominance solvable, i.e. those that iterated elimination of strictly dominated strategies leaves a single strategy profile, which is obviously a pure Nash equilibrium.  A simple protocol to find this equilibrium is the following:

• Repeat $mn$ times:
• For $i = 1 ...n$ do
• If player $i$ has a strategy that is dominated after the elimination of all previously removed strategies, then announce it, else say “pass”.
• Output the (single) profile of non-eliminated strategies.

What is remarkable about this protocol is that number of bits communicated is polynomial in $n$ and $m$, which may be exponentially smaller than  the total size of the input which is $O(m^n)$ numbers for each player.  Can some protocol like this be designed for all games that have a pure Nash equilibrium?  The basic answer in no:

Theorem: Every deterministic (or randomized) protocol that recognizes whether the given game has a pure Nash equilibrium requires $\Omega(m^n)$ bits of communication.

Note that this theorem implies in particular that one cannot design a more efficient protocol that finds an equilibrium just for games that are guaranteed to have one, as adding a verification stage (where each player checks whether the strategy suggested to him is indeed a best reply to those suggested to the others) would convert it to a protocol that recognizes the existence of a pure Nash.

Proof: The proof is by a reduction from the disjointness problem.  We will show that a protocol for determining the existence of a pure Nash equilibrium for games with $n$ players each having $m$ strategies can be used to solve the disjointness problem on strings of length $N = \Omega(m^n)$.

Let us start with the case $n=2$ and solve disjointness for length $N=(m-2)^2$.  The idea is very simple: Alice will build a utility matrix $u_A$ by filling it with the values of its bit string $x$ in some arbitrary but fixed order, and Bob will build a matrix $u_B$ from $y$ using the same order.  Now, any cell of the two matrices where $x_i=y_i=1$ will turn out to be a Nash equilibrium since both players get the highest utility possible in this matrix.  For this idea to formally work, we just need to make sure that no other cell is a Nash equilibrium.  This can be done in one of several ways (and different ones are used in the C&S and H&M papers), the simplest of which is adding two more rows and columns as follows:

With the addition of these rows and columns, each player’s best-reply always obtains a value of 1, and thus only a (1,1) entry may be a Nash equilibrium.

Now for general $n$.  We add to Alice and Bob $n-2$ players that have utilities that are identically 0.  Thus any strategy will be a best reply for these new players, and a pure Nash equilibrium is determined solely by Alice and Bob.  The main change that the new players bring is that the utility functions of Alice and Bob are now $n$-dimensional matrices of total size $m^n$.  Thus Alice and Bob can fold their length $N$ bit string into this larger table (as before keeping two strategies from their own strategy space to ensure no “none-(1,1)” equilibrium points), allowing to answer disjointness on $N=(m-2)^2 \cdot m^{n-2}$-bit long vectors.

## Variants

There are several interesting variants on the basic question:

1. The lower bound uses heavily the fact that the utilities of different strategies may be equal and thus there are many possible best replies.  Would finding a Nash equilibrium be easier for games in “general position” — specifically if there was always a single best reply for each player?  For the case of $n=2$ players, C&S show that the complexity does go down to $\Theta(m \log m)$.  (Exercise: show that the randomized communication complexity in this case is $\Theta(m)$.)  As $n$ grows, the savings are not exponential though and H&M show a $2^{\Omega(n)}$ lower bound for constant $m$.  A small gap remains.
2. We have seen that finding an equilibrium of a dominance solvable game is easy.  Another family that is guaranteed to posses a pure equilibrium is potential games.  However, H&M show that finding a Nash equilibrium in an ordinal potential game may require exponential communication.  I don’t know whether things are better for exact potential games.
3. The efficient protocol for dominance solvable games was certainly artificial.  However, an efficient natural protocol exists as well: if players just best reply in round robin fashion then they will quickly converge to a pure Nash equilibrium.  (This seems to be well known, but the only link that I know of is to one of my papers.)

## Approximate Nash

The basic computational problem in algorithmic game theory is that of computing a (mixed) Nash equilibrium of a given (non-cooperative) game. The computational complexity status of this problem has been recently settled by showing that it is “PPAD-complete” and thus “unlikely” to be efficiently computable (see this link for a nice overview). This may be considered as not very satisfying answer due to our incomplete understanding of the class PPAD, but at least the problem is now reduced to be a purely computational complexity one with no game theory aspects to it and no special role for the Nash problem. The main related problem remaining is whether approximating a Nash equilibrium may be done efficiently. This post will describe the current status of the approximation problem.

## The approximate-Nash problem

Let us first settle on some notation. We are talking about a $n$-player game, where each player $i$ has $m_i$ possible strategies, each player has a utility function $u_i : \{1..m_1\} \times \{1..m_2\} \times ... \times \{1..m_n\} \rightarrow \Re$. We will denote by $\Delta_i$ the set of probability distributions on $\{ 1 ... m_i\}$ and for $x^1 \in \Delta_1 ... x^n \in \Delta_n$ we use $u_i(x^1 ... x^n)$ as a shorthand for $\sum_{j_1, ... , j_n} \Pi_i x^i_{j_i} u_i(j_1 ... j_n)$, the expected value of $u_i(j_1 ... j_n)$ where each $j_i$ is chosen at random according to the probability $x^i$.

We know by Nash’s theorem that a mixed Nash equilibrium of the game exists and the computational problem is to find one. Unfortunately, it is also known that when $n \ge 3$ then the Nash equilibria may all be irrational numbers. We thus cannot just ask our algorithm to output a Nash equilibrium since it may not have a finite representation and so we need to settle for approximation in order to even define the “exact” problem. So here is the carefully stated common definition of the Nash equilibrium computation problem:

Input: The input is given by $n$ tables $u_1 ... u_n$ each of dimensions $m_1 \times m_2 \times ... \times m_n$. For finiteness of representation, each of the $n \Pi_i m_i$ numbers in the input is a rational number given by a $d$-digit numerator and denominator. We are also give a rational number $\epsilon>0$ (with $d$-digit numerator and denominator).

Output: $n$ rational vectors $x^1 \in \Delta_1, ..., x^n \in \Delta_n$  that are an $\epsilon$-Nash equilibrium.

Definition: $x^1, ..., x^n$ are $\epsilon$-Nash of the game defined by $u_1 ... u_n$ if for every $i$ and every $x^{i*} \in \Delta_i$ we have that $u_i(x^i, x^{-i}) \ge u_i(x^{i*},x^{-i}) - \epsilon$.

Running time: we want the running time to be polynomial in the input size, i.e. polynomial in $n$, $d$, and $\Pi_i m_i$.

It is quite useful to look carefully at various choices made here, and what we would get with slightly different choices:

1. Allowing any Nash equilibrium: our definition allowed any ($\epsilon$)-Nash rather than specifying which one we desire in case multiple ones exist. It is known that most ways of attempting to require a specific Nash point make the problem harder, NP-complete.
2. Approximation: as mentioned above when $n \ge 3$ we need to settle for a $\epsilon$-approximation just to have a finite output. For $n=2$ there always exists an equilibrium which has rational numbers with $poly(d, n, \Pi_i m_i)$-digit long numerator and denominator, and thus the exact version is equivalent to the problem as stated.
3. Approximation of best-response condition: our notion for approximation relaxed the best-response condition by $\epsilon$ rather than asking for some rational point that is $\epsilon$-close to an exact Nash equilibrium point. The latter condition seems to be much harder and in not even known to be in NP (or in fact even in the polynomial time hierarchy) and is treated at length by this 57-page paper by Mihalis Yannakakis and Kousha Etessami. The crux of the difficulty is that getting some grip on an exact Nash point seems to require “infinite precision” arithmetic — the same type of problem encountered in trying to determine whether $\sum_i \sqrt{x_i} > T$ (discussed by Richard Lipton and see also this related post of his). (Thanks to Kousha Etessami for explaining these delicate issues to me.)
4. Having $\epsilon$ be given in binary: in our definition $\epsilon$ was given by $d$-digit numerator and denominator and the running time was required to be polynomial in $d$, i.e. in $\log \epsilon^{-1}$. An alternative version would be to require $\epsilon$ to be “given in unary” i.e. allow the algorithm to run in time polynomial in $\epsilon^{-1}$. This version is asking for a fully polynomial approximation scheme (FPTAS) and could be easier. However, it turns out that getting a FPTAS is also PPAD-hard and thus this is actually not the case.
5. Additive approximation: Our condition demands $u_i(x^i, x^{-i}) \ge u_i(x^{i*},x^{-i}) - \epsilon$. We can not use the more natural relative error $u_i(x^i, x^{-i}) \ge u_i(x^{i*},x^{-i}) (1 - \epsilon)$ as the Nash equilibrium point $u_i$ may be arbitrarily close to zero which may again require an exponential-length answer.

At this point we can start talking about the approximate version of this problem. For this we will allow an arbitrary dependence of the running time on $\epsilon$ and even allow $\epsilon$ be fixed (e.g. $\epsilon=0.1$) . As the original problem scales, for the approximation to make sense, we need to normalize $\epsilon$ by the sizes of the numbers in the probem.

$\epsilon$-Approximation Problem: This is the same problem as above but scaled so that the utilities all satisfy $0 \le u_i(j_1 ... j_n) \le 1$.

## A Quasi-polynomial-time Algorithm

The key result by Richrad Lipton, Evangelos Marakakis, and Aranyak Metha is that the approximation problem can be solved in “quasi-polynomial” time — in this case time $N^{\log N}$ where $N$ is the input size. The algorithm is very simple and is based on the existence of approximate equilibria with strategies that have only small support.  In many “non-uniform” models, randomization over a large domain can be replaced by randomization over a much smaller domain.  (One nice example is the simulation of public coins by private ones in communication complexity.) This is the situation here and some background is given in a recent blog post by Richard Lipton.

Definition: Let $k$ be an integer, then a distribution $\tilde{x}^i \in \Delta_i$ is $k$-simple if all its coordinates are integer multiples of $1/k$. I.e. it is a uniform distribution on a multi-set of size $k$ of pure strategies. A random $k$-simplification of $x^i$ is obtained by choosing at random $k$ elements (with repetitions) according to the distribution specified by $x^i$ and then taking the uniform distribution on this multiset.

Proposition: For every $x^1 \in \Delta_1$ and fixed pure strategies $j_2 ... j_n$, if we chose a random $k$-simplification $\tilde{x}^1$ of $x^1$ then $Pr[|u_i(\tilde{x}^1,j_2,...,j_n) - u_i(x^1,j_2,...,j_n)| > \epsilon] \le e^{-\Omega(k\epsilon^2)}$.

The proof is by observing that $u_i(x^1,j_2,...,j_n)$ is the expected value of  $u_i(j_1,j_2,...,j_n)$  when $j_1$ is chosen according to $x_1$, while $u_i(\tilde{x}^1,j_2,...,j_n)$ is the average value of  $u_i(j_1,j_2,...,j_n)$ over $k$ random choices of $j_1$ according to $x_1$, and thus the proposition is just a direct use of the Hoeffding tail inequalities.

Now, if we choose $k = O(\log N / \epsilon^2)$, where $N=\Pi_i m_i$, then, with high probability (due to the union bound), the previous proposition holds for all pure strategies $j_2 ... j_n$ and all $u_i$ for $i = 1 ...n$ .  In this case, by averaging, it also holds for all mixed strategies $x^2 ... x^n$:

Definition: $\tilde{x^1}$ is an $\epsilon$-approximation of $x^1$ (relative to $u_1 ... u_n$) if for all $x^{2} \in \Delta_2 .... x^{n} \in \Delta_n$ and all $i$ we have $|u_i(\tilde{x}^1,x^{2},...,x^{n}) - u_i(x^1,x^{2},...,x^{n})| \le \epsilon$.

Corollary: For every $x^1 \in \Delta_1$ there exists an $\epsilon$-approximation $\tilde{x}^1$ which is $k$-simple, for $k = O(\log N / \epsilon^2)$.

The main lemma now follows:

Lemma: Every game has an $\epsilon$-Nash equilibrium where all player strategies are $k$-simple, with $k = O(\log N / \epsilon^2)$.

The proof starts with any Nash equilibrium $x^1 ... x^n$ and chooses a $k$-simple $\epsilon'$-approximation $\tilde{x}^i$ for each $x^i$ (where $\epsilon'=\epsilon/(2n)$).  Since $x^1 ... x^n$ are an exact Nash equilibrium, in order to show that $\tilde{x^1} ... \tilde{x^n}$ are an approximate equilibrium, it suffices to show that changing the $x^i$‘s to $\tilde{x}^i$‘s changes things by at most $\epsilon$ — but this is exactly what being an $\epsilon'$-approximation means for replacing a single $x^i$ by $\tilde{x}^i$, and the approximation errors $\epsilon'$‘s simply add up.

Once the existence of a $k$-simple $\epsilon$-equilibrium is known, an $O(n^k)$ algorithm follows simply by exhaustive search, and thus we get our algorithm.

Notice that we have actually shown a stronger theorem: not only can we find some $\epsilon$-approximation but in fact we can find an approximation to any equilibrium, and our approximated equilibrium maintains also player’s utilities.  This for example also provides an approximate equilibrium with approximately optimal social welfare.

## A polynomial time algorithm?

The existence of a quasi-polynomial time algorithm may often be viewed as a hint that a polynomial-time algorithm exists as well.  In our case the main open problem is whether such a polynomial time algorithm exists for every fixed value of $\epsilon>0$.  The upper bounds for the approximation problem are rater weak and the best that is known is a polynomial time approximation for $\epsilon=0.3393...$.

One possible approach is to see whether the parameter $k$ above may be reduced to being constant when $\epsilon$ is constant.  This however is not the case.  Consider a zero-sum two-player game where each utility is chosen at random to be 0 or 1.  The following statements can be easily shown to hold with high probability over the random choice of the game: (a) for each row or column, the fraction of 1-entries is very close to 1/2 and thus a value of close to 1/2 can be obtained by any player by using a uniform distribution on his rows or columns (b) for any $k=o(\log n)$ rows there exists a single column which has all 0 entries in these rows.  Thus any mixed strategy that is $o(\log n)$-simple cannot ensure positive value.  It follows that there does not exists a $\epsilon$-approximate Nash which is $k$-simple for any $\epsilon<1/2$ and $k = o(\log n)$.  A recent paper of Constantinos Daskalakis and Christos Papadimitriou generalizes this impossibility not only to simples strategies but to more general “oblivious” algorithms.

Another difficulty is shown by a recent paper of Elad Hazan and Robert Krauthgamer.  They consider the harder problem of approximating a specific Nash equilibrium — specifically the one with highest social welfare — for which the technique above also provides a quasi-polynomial-time algorithm.  They show that this problem is at least as hard as the known problem of finding a small randomly planted clique in a graph, providing some evidence for its difficulty.

A new paper, Reducibility Among Fractional Stability Problems, by Shiva Kintali, Laura J. Poplawski, Rajmohan Rajaraman, Ravi Sundaram, and Shang-Hua Teng was just uploaded to the arXiv.  The paper proves PPAD-completeness for half a dozen problems, from various domains of application, all related to stability.  The simple interpretation of PPAD-completeness is that these problems are shown to be computationally equivalent to computational variants of Brower’s fixed point theorem or to finding a Nash-equilibrium.

While I can not say that I am specifically interested in any of these problems, I do think that this type of methodological characterization of the complexity of problems related to stability and equilibrium is important.  It maps the space of such problems, separating variants that have some special structure from those whose are general enough to hide within them the general problem.  Maybe it is time for someone to maintain a compendium of PPAD-complete problems?  (In the spirit of the compendium of NP-optimization problems.)  My favorite entry would be Nash equilibrium for 2-player win-lose games.