Another round in the everlasting soul-searching of the TCS community about its connection with the real world has lately been going on in the “usual” theory blogs (by Mihai, Bill, and Michael, spurred by a workshop on theory and multi-cores) and also on the blog of Mark, a self-professed outsider. Among other issues, Mark complains about “people who devise complicated (never implemented) algorithms, with basic big-O analysis only, and obsession over the worst case” and asks: “how many researchers who devise new algorithms actually have them implemented?” Mihai basically argues (in the multi-core case) against being too practical: “there is really no point wasting our time on current technology which is clearly bound to be replaced by some other technology in the next 60-1000 years”, but cannot refrain from mocking “too theoretical” theory (with a stab at algorithmic game theory too): “Which theory community? … The one that thinks Amazon should be happy with an O(lg³ n) approximation for its profit?”
I tend to think that many of these complaints come from taking the models and results of theoretical computer science too literally. The contribution that theory provides is usually not that of someone directly implementing a new algorithm. Usually it is the insights behind the algorithm (or the lower bound, or the characterization) that tend to be useful. Such insights may well come from “an O(lg³ n) approximation for profit” (which may be near-optimal on real distributions, may suggest a good heuristic, or may combine well with other algorithms and optimization goals — all of these best dealt with practically rather than analyzed theoretically). On the other hand, even, say, a linear-time algorithm for the exact optimal profit may not be directly applicable: it may solve the wrong problem, face additional new constraints, assume an input that is not actually available, or fail for a variety of other reasons. Thus the “technology transfer” from theory to practice is not really composed of “algorithms” and “results” but rather of “insights”, “techniques”, and “ways of thinking”. Indeed, when I look at how theoreticians at Google contribute to the practice of the company, I can hardly think of a situation where the contribution was simply a great new algorithm that was then implemented. Usually, what a theoretical researcher contributes is a point of view that leads to a host of suggestions about algorithmic techniques, heuristics, metrics, notions, interfaces, trade-offs and more — all influenced by theoretical understanding — that together yield improved performance.
Thus the contribution of an “algorithm for X with performance Y in model Z” does not require that model Z be “realistic” or that performance Y be “good” — the real contribution is that obtaining performance Y in model Z provides new insights, insights that can then be used in various realistic applications to achieve performance improvements. It seems that many in our field (including program committees) sometimes forget this and simply view improving Y as a goal in itself (this is related to the old TCS conceptual debate). A well-chosen model Z will indeed tend to have the property that improving Y in it correlates well with new insights, but the two are never quite equivalent. A major difference between theoretical CS (as well as theoretical economics) and pure math lies exactly in the central place of the art of choosing the right model. This is especially true for algorithmic game theory, which sits between theoretical CS and theoretical economics, two fields with different sensibilities about good models — but that requires another post.
While judging a single contribution for “insight” is certainly difficult and subjective, in the long run we can tell. The real test of a community is whether the insights it has developed over a significant amount of time contribute to other fields. I would claim that theoretical computer science passes this test with flying colors. Looking a generation back, TCS has pretty thoroughly reshaped computer science and optimization. More recent products of theoretical CS, like approximation algorithms, online algorithms, and streaming/sketching algorithms, have already had a major influence on how the fields of OS, DB, and AI treat their own problems, as well as on how the industry looks at many of its problems. I would even claim that the very young field of algorithmic game theory can already point to some of its insights affecting other fields, and certainly much real Internet activity (yes, yes, ad-auctions and such). Of course, it is way too early to judge the full contribution of algorithmic game theory or the other last-decade fruits of theory.
I agree with many of the points. Anyone who has worked in industry and interacted with real-world problems realizes that theoretical insights are very useful (and applied) in shaping how to model a problem and what to focus on.
Hi Noam!
there is really no point wasting our time on current technology which is clearly bound to be replaced by some other technology in the next 60-1000 years
Actually, I didn’t write that. It was an anonymous comment on my blog (that later got dissected in follow-up comments).
I’m glad you feel good about theory 🙂 I am also not keen on this TCS soul-searching, not because I am ok with the current state of affairs, but because I find it rather useless (nobody ever convinced anybody of anything, as far as I can tell).
Actually I don’t feel completely good about the state of TCS — I do see various problems (maybe I’ll complain about some of them in a future post). However, I generally disagree with criticisms about the non-practicality of our models or results taken literally.
Sorry for having misquoted you.
Noam, thanks for an excellent post. Blogs are dangerous: just now I spent half an hour reading the old TCS conceptual debate, and only then remembered that I had already read it last year! It seems that the reward system in TCS, and the way conferences dominate it, needs some adjustment as the field matures.
I agree with everything you say above. The contributions of TCS are mainly conceptual: a way of thinking rather than results. We don’t need less “pure” theory, just more “applied” theory, more communication between groups, and recognition that everyone should work together, using their own strengths, for the advancement of science.
Two comments:
1. I would also be very interested in what you have to say about algorithmic game theory, and about choosing models in general. What sort of model validation should be done in TCS (which differs from theoretical physics, since to some extent we can design the universe, not just interpret it)?
2. “Improving Y” (performance) is not the same as “asymptotically improving an upper bound when no lower bound is known” – this is one of the things I was complaining about in my post. I see a big gulf here between the followers of P-vs-NP and the followers of Knuth, to say nothing of the experimental algorithm analysts. Bob Sedgewick once told me about some upper bounds he ran into when writing his Algorithms books; they were ludicrously non-tight when he implemented the algorithms. An algorithm with an O(n^2) bound is not necessarily an improvement on another algorithm with an O(n^3) bound. Of course, good papers improve the upper bound for the SAME algorithm, or give lower bounds, or give some empirical evidence for why the second algorithm is better.
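To make this concrete, here is a minimal, hypothetical sketch of the kind of experiment I have in mind (in Python; the exact timings and the crossover point will of course depend on the machine and interpreter). Insertion sort carries a Θ(n²) bound and mergesort a Θ(n log n) bound, yet on small inputs the asymptotically “worse” algorithm typically wins because of constant factors:

```python
# Hypothetical benchmark sketch: a Theta(n^2) algorithm (insertion sort)
# versus a Theta(n log n) one (mergesort). On small inputs the "worse"
# bound often wins due to constant factors; timings are machine-dependent.
import random
import timeit

def insertion_sort(a):
    a = list(a)  # work on a copy so the input data is not mutated
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

for n in (16, 64, 256, 1024):
    data = [random.random() for _ in range(n)]
    reps = max(1, 10000 // n)  # scale repetitions so each size runs quickly
    t_ins = timeit.timeit(lambda: insertion_sort(data), number=reps)
    t_mrg = timeit.timeit(lambda: merge_sort(data), number=reps)
    print(f"n={n:5d}  insertion={t_ins:.4f}s  merge={t_mrg:.4f}s  ({reps} reps)")
```

On typical CPython runs the quadratic algorithm is faster up to somewhere in the tens of elements; the asymptotic bound by itself says nothing about where that crossover lies.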
Mark Wilson says “An algorithm with an O(n^2) bound is not necessarily an improvement on another algorithm with an O(n^3) bound.” I think this type of statement is a very common criticism of theory, but it is not a good argument. It may be very difficult to prove a lower bound on the O(n^3)-time algorithm to show that it really performs as poorly as the stated bound – or it may in fact be much better, but that is also difficult to prove. However, should one not prove that some other algorithm gives a provably better asymptotic bound? Why not? We have to prove what we can, as long as there is a decent enough metric. Interpreting the significance of the asymptotic improvement in practice is a different issue and need not always burden theoretical discoveries.
Hi x (whoever you are)
I agree that changing an upper bound from O(n^3) to O(n^2) is a worthy thing to do, and that lower bounds are hard. I am not suggesting that such upper bound refinements not be published.
I am suggesting that their importance is not as great as some people seem to suppose. I am also suggesting that they should be complemented by some other evidence (possibly in the same paper, by the same author, and possibly by others).
My main concern is not really about individual researchers but about how the whole system fits together. A paper that does what is described above immediately leads to interesting questions about the actual performance of the algorithm compared with, say, other algorithms with a known Theta(n^2) bound. But publication outlets for results on this question seem to be fewer and less highly regarded by much of the “TCS community” (if such a thing exists), so the incentives to do this work are relatively weak. Furthermore, the people who might be interested in doing this work are not finding out about such algorithms systematically. I will also add that many published algorithms have substantial errors, but correcting these and trying to publish the corrections is a thankless task.