The following is a special contribution to this blog from Christos Papadimitriou, C. Lester Hogan Professor of Electrical Engineering and Computer Science at the University of California at Berkeley. Christos co-organized the Simons Institute Visions on the Theory of Computing Symposia, which was sponsored by the Computing Community Consortium (CCC) this past May. Christos provides a retrospective look at each presentation from the Symposium below. You can also view all of the presentations here.
“What Should a Computational Theory of Cortex Explain?” Leslie Valiant, Harvard University.
Valiant started by remembering two neuroscience pioneers: Ramon y Cajal, who admired data and derided theory (except in physics); and David Marr who eventually realized that 1970 was too early for theory. In the face of such ample warning, Valiant set out to develop a theory of the brain: There is a hierarchy of brain-related tasks, he said, of increasing sophistication: communicate, compute, learn, evolve. Interestingly, the lowest layer is already hard. He went on to outline a principled research project for developing a theory of the brain: model the brain in a resource-conscious way, and then provide concrete explanations (“vicinal algorithms”) for basic, important and quantitatively challenging tasks. Valiant outlined (i) how to allocate space for a new concept, and (ii) how to link two existing concepts. Random graph theory helps in both.
“Quantum Computing and the Entanglement Frontier” John Preskill, Caltech.
There are three frontiers in Physics, said Preskill: The short-distance frontier, the long-distance frontier, and the complexity frontier. His talk focused on the latter, which he sees as a unifier of the other two, a rebellion against the micro/macro dichotomy, against the wisdom that “large systems want to behave classically”. The complexity frontier is about “putting this quantum weirdness to work,” identifying tasks that are classically hard but quantum feasible (factoring yes, simulating quantum field theory and the short distance frontier yes, but not SAT). “There is no quantum copier,” he pointed out, information is non-local, and this is the essence of entanglement. And entanglement can be useful: its monogamy — a particle cannot be entangled with many — gives rise to cryptographic primitives. There are many possible approaches to quantum computation, all of them challenging, all being pursued. And then there is the black hole paradox.
“Why Biology is Different” by Bernard Chazelle, Princeton University.
Biology is physics plus history, said Chazelle, hence the unreasonable ineffectiveness of mathematics in it (“math is no good at history”). In biological systems scales interact intensely, and causation flows in both directions of scale. Chazelle presented the formalism of diffusive influence systems for modeling biological phenomena, a multi-scale model of physically realistic randomized distributed interaction. Familiar things likeTuring machines and strange attractors are part of this space, he said, but they constitute a set of measure zero. When the underlying law of the system is expressed in first-order logic, the system becomes geometric (through elimination of quantifiers); the mixing of time-scales produces novel phenomena.
“Big Data and New Models Needed to Study DNA Variation in Evolution and Cancer” David Haussler, UC Santa Cruz.
Haussler begun by discussing the G10K project whose purpose is to sequence 10,000 animal genomes, understand the phylogeny of every base, and deduce function from observed selection. As an example of what can be learned, he mentioned that gene HES5 involved in neural development remains active in the human fetus much longer than in monkeys. Sequence graphs is a mathematically sophisticated formalism for tracing many kinds of molecular transformations in comparative genomics. Finally, Hassler outlined the prospects for cancer treatment through comparative genomics: the major obstacles to progress appear to be unwillingness by clinicians and researchers to share data, and time-consuming treatment approval procedures.
“Perfection and Beyond” Maria Chudnovsky, Columbia University.
Chudnovsky outlined the perfect graph conjecture, its proof, and the remaining challenges. A graph is perfect if for all induced subgraphs its clique number coincides with its chromatic number (in all graphs the clique number is at least as large as the chromatic number). Claude Berge defined the concept in 1961, and conjectured (1) that perfect graphs are closed under complement, and (2) that a graph is perfect if an only if neither itself nor its complement contains the simplest non-perfect graph: an odd cycle with five or more nodes. In 1972 Lovász gave a relatively simple proof of (1) based on work by Fulkerson. Chudnovsky gave an outline of her — much more complicated — 2006 proof of (2), with Roberstson, Seymour, and Thomas, as well as several open problems, chief among which is the Erdős-Hajnal conjecture on large independent sets in graphs lacking any particular induced subgraph.
“Bursts, Cascades, and Hot Spots: Algorithmic Models of Social Phenomena” Jon Kleinberg, Cornell University.
Kleinberg thinks of the Internet as a cross between a library and a crowd: unruly and yet full of essential knowledge. By studying it, we discover unexpected aspects of human behavior, but more often (and arguably more interestingly) new ways in which familiar aspects manifest themselves. For example, one can identify popular tourist attractions by analyzing tags on Flickr pictures and then running a greedy heuristic on a world map, while we can learn a lot from Facebook by seeing it not as a single giant graph, but as a large collection of small networks (the neighborhoods of each person).
“The Cryptographic Lens: Visions of our Past and Future,” Shafi Goldwasser, MIT.
Goldwasser gave a tour of the accomplishments of Cryptography over the past four decades, and pointed out that they are all “mathematical hallucinations,” feats which at first sight seem impossible: Public key cryptography, electronic signatures, bit commitment, playing poker over a phone line, zero-knowledge proofs, probabilistic encryption, and so many more. Then she focused on two recent accomplishments which were long considered impossible even by seasoned cryptographers: Homomorphic encryption (encrypt in a way that allows an untrusted party to perform computations on the cypher text) and functional encryption (selectively allow certain functions on your private data). Her conclusion: We should not let common sense and real-world experience restrict our cryptographic ambition.
“More is Different. Also in Computing?” Marc Mézard, ENS Paris.
Mézard’s title is an allusion to P. W. Anderson’s idea that, in nature, the whole can be more than its parts because new properties may emerge across scales. Spin glasses (the collective magnetic state of large lattices of interacting atoms) have been studied for decades (tens of thousands of papers on essentially zero grant support), with an eye towards exploring whether useful computation can be performed by such systems. The resulting theory and computational experiments have yielded interesting insights in connection to error-correcting codes, satisfiability of Boolean formulae, and compressed sensing. For example, there is an intermediate phase in satisfiability between the two extremes of ubiquitous satisfiability and unsatisfiability, in which satisfying assignments are fragmented.
“Theory of Data Streams” S. Muthu Muthukrishnan, Rutgers University and Microsoft Research.
Imagine that someone reads read to you all numbers between 1 and 20 except one, in some order, and then expects you to find the missing number. How can you accomplish this by remembering only one number? (Hint: sum). Muthukrishnan believes that, because the amount of data grows faster than computing resources, computation today is like this puzzle. He outlined two important primitives for streaming computation with limited memory: Count-min sketch (n counters are increased/decreased over and over, and you must remember the value of each), and L0 sampling (now the counters represent frequencies, and you must sample from this distribution): both can be implemented with much less memory than one would imagine possible, and lie at the basis of many applications. He also discussed several other concepts and techniques crucial to the new algorithmic environment, including sparse Fast Fourier Transform (an algorithm that scales not with the dimension but with the number of nonzero elements of the input) and distributed learning (how to learn with minimal communication).
“Intelligence and Machines: Creating Intelligent Machines By Modeling the Brain” Jeff Hawkins, Numenta.
Jeff Hawkins believes that the Turing Test (creating computers that can pass as ordinary humans) is not an ambitious enough research goal: he believes we should aim for true — that is, extraordinary — intelligence. He also believes that this can be achieved only through insights obtained by studying the brain. The technical cornerstone of his approach is the concept of sparse distributed representations (SDR): Representing patterns by long (several thousand long) bit strings that are sparse (say, about 2% of the entries are ones). SDRs must be robust (the pattern can be identified by subsampling), meaningful (shared ones imply pattern similarity), they must enable set membership (is this pattern in this pattern class?), and be hierarchical in nature (reflecting the six-deep hierarchy of cortical neurons). At Numenta, Hawkins has been applying SDRs to solve difficult pattern recognition problems such as demand prediction in the energy industry.
“The Online Revolution: Learning Without Limits,” Daphne Koller, co-founder, Coursera.
Coursera has been offering online courses (MOOCs) for over a year now. Based on this experience, Koller envisions a future in which universities have shifted their focus away from providing academic content, since this content will be widely available online. MOOCs will provide students with basic material mastery, and leave for human instructors tasks such as discussing material, as well as teaching problem solving, inventing proofs and algorithms, debate, and other skills that may not be easily taught online. Koller told stories about the impact Coursera MOOCs have had on people, and on their many unanticipated consequences (such as the rise of online and face-to-face communities), she discussed difficulties and emerging opportunities (such as mutual grading), and defended MOOCs against the “90% failure rate” criticism (of all the scientific papers that you have downloaded, how many have you read thoroughly? and does this mean that online papers are useless?).
“Programming Nanoscale Structure Using DNA-Based Information,” Ned Seeman, New York University.
Seeman builds structures and devices from DNA. The intellectual goal is to understand — at a limit — the connection between microstructure and macroscopic scale (recall Feynman’s “What I cannot create I do not understand”). The initial impetus came from a crystallographer’s impatience with more conventional crystallographic experiments. A methodology has emerged: Reciprocal exchange between DNA strands is a basic primitive. Components can be assembled to form structures through sticky-end cohesion. Scaffolding is essential and possible, for example through the magic of the tensegrity triangle motif. Moving micromolecular devices require excruciating programming. Both computation and reproduction are possible.
“Five Discontinuities that Reshaped My Research (and a lot else),” Prabhakar Raghavan, Google.
The Internet is inhabited and used by consumers, not academics, and there is a huge difference — for example, academics have the illusion that they know what they are searching for. Raghavan painted the picture of a wild wild web that not only automates markets but also creates new ones (witness ad word auctions, and the 25% of Netflix and 30% of Amazon sales which concern products not available in brick-and-mortar stores). Consumer behavior causes power-law distributions and fat tails; and computation driven by making happy huge numbers of consumers happens at very few hubs via massive parallelism (while the rest of us carry around fancy display devices). We must radically change the ways we design and teach algorithms to cope with the realities of the web, Raghavan concluded.
“Market Design and Computer-Assisted Markets: An Economist’s Perspective,” Alvin Roth, Stanford University.
In a market both buyers and sellers search and discover, said Roth. Markets are more general than auctions, they are best thought of as matching games. Computation has transformed economics because computational platforms render markets thick (astronomical crowds of buyers and sellers can congregate in electronic markets), uncongested (“efficient” means something else in economics), and safe (following the rules will not compromise your interests). The Gale-Shapley matching algorithm is paramount, but the devil is in the details, as Roth illustrated through observations from a telephone-based medical labor market. When couples need jobs in the same city, computational complexity kicks in — but there are remedies.
“Evolution and Computation,” Christos Papadimitriou, Simons Institute.
The study of evolution even before Darwin has been surprisingly rife with covertly computational ideas. Papadimitriou discussed the algorithmic insights which led to three recent advances: A new understanding of the role of sex in evolution through the concept of mixability; a theorem stating that a recombining population of truth assignments can evolve to satisfy an arbitrary Boolean function, assuming that satisfaction confers a fitness advantage; and a surprising connection between evolution of a species and the multiplicative weights update learning algorithm, well known in computer science.
“The Mathematics of Causal Inference, with Reflections on Machine Learning and the Logic of Science,” Judea Pearl, UCLA.
Pearl compared Babylon (where scientific progress relied mainly on competent statistical analysis of data) to Athens (where the understanding of the world progressed through rational models of nature) and called for a similar transition in modern science. Counterfactuals are the atoms of scientific thinking (e.g., “what if the force pulling the wire were doubled?”), while models summarizing statistical dependence between random variables enable algorithmic reasoning about transportability (“Can I apply my NY observations to LA?”), recoverability (when is missing data always recoverable?) and ultimately causality, which lies at the basis of both scientific reasoning and human morality.
“The Gospel According to TCS,” Avi Wigderson, Institute for Advanced Study, Princeton.
This symposium proves how effective TCS is in the sciences; Wigderson looked inside the “engine room” of TCS, its core, where the concepts and techniques underlying this effectiveness have been created. Wigderson spoke of four astonishing successes of core TCS and one equally astonishing failure (“four weddings and a funeral,” he called them), and traced all five to the locality of computation as initially conceived by Turing: Optimization problems have been classified with amazing thoroughness through NP-completeness; randomized algorithms have brought depth, elegance, insight, and practical results; cryptography created impossible-looking primitives by wielding TCS’s arsenal of ideas; approximate optimization was classified, through an improbable sequence of unanticipated consequences of seemingly unrelated TCS advances, as tightly as the exact version; and yet, sixty years after Gödel’s letter to von Neumann, we seem further than ever from proving that P is not NP.
“The Modern Astrophysics Stack: Automated Action and Insight,” Josh Bloom, UC Berkeley.
When it becomes operational in 2021, the Large Synoptic Survey Telescope is expected to collect 20TB of data per night. As telescopes collect more and more data, astronomers have an increasingly difficult time processing data. Bloom outfits telescopes with intelligent systems that help decide what is “astronomically promising.” These intelligent systems learn from past astronomers’ telescope usage patterns, and identify important events sooner than astrophysicists can. Such systems have helped astrophysicists rule out many models of white dwarf supernovae. In the future, Bloom hopes to see intelligent systems formulate and answer novel questions.