Posted by: holdenlee | August 23, 2015

## Media consumption, 8/9-22

Posted by: holdenlee | August 8, 2015

## Media consumption, 8/2-8

Posted by: holdenlee | August 5, 2015

## Barron’s theorem: Neural networks evade the curse of dimensionality

I’m presenting Barron’s Theorem in the Machine Learning reading group today.

Abstract: I will state and prove Barron’s Theorem, which shows that 1-layer neural networks can evade the curse of dimensionality. Barron’s Theorem bounds the error of the best neural net approximation to a function, in terms of the number of hidden nodes and the smoothness of the function, independently of the dimension.

Notes are available here (source).
Update: Thanks to Alex Anderson for pointing out limitations of Barron’s Theorem. In his words:
In the context of deep learning, Barron’s Theorem has been a huge red-herring for the neural network community. Even if a single hidden-layer network can do arbitrary function approximation, that doesn’t mean that it does it efficiently (in terms of number of parameters), and these approaches are never used in practice now. There are some things happening that are much more subtle than can be treated in this fashion.
Here are some recent papers he recommends:
Posted by: holdenlee | August 1, 2015

## Media consumption, 7/26-8/1

* = found particularly interesting

•  Podcasts
• 99% Invisible
• Automation paradox*: 2-part series on automation.
• 170 Children of the magenta – How automation has changed airplane safety (no question it increases safety, but automation can lead to incompetence – pilots being too reliant)
• 171 Johnnycab: A world with self-driving cars. How would cities change?
• 172 On location: on the Bradbury building, a film icon in Los Angeles
• 173 Awareness: Ribbons for causes – it started with AIDS. How to do it: get the ribbons in a lot of high-profile places, and then tell people it’s for AIDS awareness
• 174 From the sea, freedom: Micronations
• 169 Freud’s couch
• Freakonomics:
• Finite* (innovation comes out of scarcity; rainforests contain a wealth of medical knowledge that we haven’t valued and have turned away from; antibiotics should be viewed as a scarce resource because they create resistance)
• Transformation
• Following Slate star codex*. http://slatestarcodex.com/2015/07/28/non-dual-awareness/ (Things like minimum wage, tenure, and medical school admission create “duality”: a gap between those who get it vs. those who don’t, with similar consequences), http://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/ (People misuse the word “debunk” to refer to literature that opposes their beliefs, rather than is publicly dismissed; often both sides claim they “debunked” each other). Lots of links. It’s very draining to sift through lots of media to find interesting things, so a curated list of link is gold.
• Glanced at first 2 chapters of Experimental mathematics (website). Seems to be a lot of interesting math problems to do here, though I mostly just read the notes on doing math:
• Hardy and Littlewood’s 4 axioms for collaboration:

The first [axiom] said that when one wrote to the other (they often preferred to exchange thoughts in writing instead of orally), it was completely indifferent whether what they said was right or wrong. As Hardy put it, otherwise they could not write completely as they pleased, but would have to feel a certain responsibility thereby. The second axiom was to the effect that, when one received a letter from the other, he was under no obligation whatsoever to read it, let alone answer it, – because, as they said, it might be that the recipient of the letter would prefer not to work at that particular time, or perhaps that he was just then interested in other problems….The third axiom was to the effect that, although it did not really matter if they both thought about the same detail, still, it was preferable that they should not do so. And, finally, the fourth, and perhaps most important axiom, stated that it was quite indifferent if one of them had not contributed the least bit to the contents of a paper under their common name; otherwise there would constantly arise quarrels and difficulties in that now one, and now the other, would oppose being named co-author.“

• Observation vs. proof in math (p. 11)

I have myself always thought of a mathematician as in the first instance an observer, a man who gazes at a distant range of mountains and notes down his observations… proofs are what Littlewood and I call gas, rhetorical flourishes designed to affect psychology, pitcures on the board in the lecture, devices to stimulate the imagination of pupils. – G H. Hardy

• Site + article on Escher: http://escherdroste.math.leidenuniv.nl/. What an Escher picture has to do with elliptic curves.
• Curiosity – thought about this after listening to From Curiosity to Discovery (TED radio hour) a while back. Curiosity is an important ingredient to a fulfilling life and yet is not talked about/encouraged in society. Picked up Curious, Ian Leslie and A Curious Mind, Brian Grazer.
Snippets from the intro:

No one had thought to let him in on a secret: “I was suddenly seeing that the world is incredibly interesting… The closer you look at anything, the more interesting it gets. But nobody tells you this.”

For most of Western history, it has been regarded as at best a distraction, at worst a poison, corrosive to the soul and to society… curiosity is deviant.
[The Renaissance happened was a shift where society realized it was good to ask questions.]

[the] new challenge is to make people hungry to learn, question, and create…. how to instill a culture of inquiry and critical thinking into their educational systems.

Curiosity is more state than trait: we can arrange our lives to stoke our curiosity or quash it.

• An interesting talk by Nisheeth Vishnoi on thinking about evolution using computer science. Papers quite readable.
• LLL algorithm, notes and slides. (One application is finding algebraic dependenciess, see Experimental mathematics above.
• Started reading Game of Thrones.
• Watched up to episode 11 of My Little Pony
Posted by: holdenlee | July 25, 2015

## Media consumption, 7/19-25

To keep track of some of the stuff I read/watch/listen to/play. Perhaps I’ll post one of these every week or two.

Posted by: holdenlee | July 21, 2015

## Fully Homomorphic Encryption

I’m giving a talk today on fully homomorphic encryption at the student-run “Gems of theoretical computer science” seminar.

Here are the notes (source).

Posted by: holdenlee | July 21, 2015

## TCS talk: High-rate locally-correctable and locally-testable codes with sub-polynomial query complexity

Here are notes (source, original paper) from a talk by Noga Ron-Zewi.

Abstract:

Locally-correctable codes (LCCs) and locally-testable codes (LTCs) are special families of error-correcting codes that admit extremely efficient sub-linear time algorithms for error correction and detection, respectively, while making a few queries to the received word. These codes were originally studied in complexity theory because of their close relationship to program checking and probabilistically-checkable proofs (PCPs), but subsequently they were extensively studied for their own sake.

In this work, we construct the first locally-correctable codes and locally-testable codes with constant rate, constant relative distance, and sub-polynomial query complexity and running time. Specifically, we show that there exist binary LCCs and LTCs with block length n, constant rate (which can even be taken arbitrarily close to 1), constant relative distance, and query complexity and running time $\exp(\sqrt{\log n})$. Previously such codes were known to exist only with query complexity and running time $n^{\beta}$ (for constant $\beta > 0$), and there were several, quite different, constructions known.

Along the way, we also construct LCCs and LTCs over large (but constant) size alphabets, with the same query complexity and running time $\exp(\sqrt{\log n})$, which additionally have the property of approaching the Singleton bound: they have almost the best-possible relationship between their rate and distance. This has the surprising consequence that asking for a large alphabet error-correcting code to further be an LCC or LTC does not require any sacrifice in terms of rate and distance! Such a result was previously not known for any sub-linear query complexity.

If time permits I will also discuss a very recent work that substantially improves the query complexity of LTCs to quasi-polylogarithmic function of the form $(\log n)^{O(\log \log n)}$.

Joint work with Swastik Kopparty, Or Meir and Shubhangi Saraf

Posted by: holdenlee | July 18, 2015

## Inside Out

I saw Inside Out recently and I loved it!

Some ramblings.

I’m fascinated with the whole “society of mind” idea – the idea that what constitutes our consciousness or our intelligence isn’t just one central entity (oneness is an illusion) – but rather a large collection of agents – feelings, methods of reasoning, etc. – cooperating/competing in our minds. To turn this into a story – portray all these agents in a person’s mind as characters – would be amazing but also very difficult – how could one get this across without being either being overly pedantic or abstract, or too simplistic? Inside Out really strikes a sweet spot here.

Posted by: holdenlee | July 12, 2015

## Namba.elm: A fast-paced arithmetic game

Play me at http://bit.ly/playnamba.

Source code is available at http://www.github.com/holdenlee/Namba. This is the minimal playable version (v0.1) – leave me some comments so I can make the next version better.

Detailed instructions (plus thoughts on making the game, and directions to go from here) follow. But see if you can figure it out by yourself first.

If you get a score >10, let me know in the comments:)

Posted by: holdenlee | July 10, 2015

## FCRC talks

I was at FCRC (the CS conference conglomerate that happens once every 4 years), June 13-19. Here are some of the talks I found particularly memorable.

Personal notes on FCRC talks are at https://workflowy.com/s/wkI79JfN0N and on STOC/CCC/EC talks (very rough) are at https://dl.dropboxusercontent.com/u/27883775/wiki/math/pdfs/stoc.pdf. Note that neither has been edited.

FCRC award/plenary talks

• Turing award lecture (The Land sharks are on the squawk box), Michael Stonebraker (https://www.youtube.com/watch?v=BbGeKi6T6QI): Stonebraker draws parallels between his work in building Postgres, a relational database system, and his cross-country bike trip. He described numerous challenges and how they were overcome in both situations, concluding that they were both about “making it happen”.
• Interdisciplinary research from the view of theory, Andrew Yao: Yao comes from the viewpoint of theoretical computer science but has then worked on connections to physics (quantum computation), economics (auction theory), and cryptography (certifiable randomness). Theoretical computer science started with computability and complexity theory, but has since then grown to involve algorithms and data structures, information theory, combinatorics, cryptography, quantum computing, bio-informatics, game theory and mechanism design… Yao argues that interdisciplinary research not only impacts the development of other disciplines but also has a transformational effect on the nature of CS research. He distinguishes between surface-level interdisciplinarity – ex. outsourcing a computational problem to CS researchers – vs. core-to-core interaction – where researchers cooperate not just to solve a problem at the computational level but also the conceptual level. The trend is increasingly that computer scientists are becoming specialists in different fields on a part-time basis; he is optimistic about a return to the “Renaissance man.”
• Hardware neural nets: from inflated expectations to a plateau of productivity, Olivier Temam: Temam refers to the “tuning curve” for reactions to innovation, consisting of
• peak of inflated expectations
• trough of disillusionment
• slope of enlightenment (when people understand what’s useful for)
• plateau of productivity

This happened with neural nets, which were initially hyped as “emulating the brain in hardware”; now the focus is on the modest goals of doing well in concrete tasks like classifying images, which we have made a lot of progress in. (There is a “bimodal” response to the idea of “emulating the brain in hardware”: bullshit vs. awesome. People should be able to engage in more level conversation without fearing for their careers.) Three communities need to work together on neural nets: neuroscience – to understand what the brain is doing, computer architecture – to implement hardware, and machine learning/algorithms – to translate brain functionality into algorithms to pass on to the architects, the “middleman” in this game.

• The F# path to relaxation, Donald Syme: Syme gives his perspective on designing and implementing a programming language (F#). F# is notable in that it is a functional language that incorporates features from OOP. People can often be polarized towards different programming language paradigms (ex. functional vs. object-oriented (OOP)). But rather than create opposing camps, it’s good to look at the advantages of each and see if there’s a way to include all those advantages (which Syme calls “relaxation” or “synthesis from contradiction”) in as simple a way as possible.

STOC (Symposium on the Theory of Computing)

• Forrelation, Scott Aaronson, Andris Ambainis: Prove that a certain problem, forrelation (Fourier correlation), gives (essentially) the largest possible separation between quantum and classical query complexities (how many queries it would take to solve the problem if the queries are classical vs. quantum). http://dl.acm.org/authorize.cfm?key=N98526
• Super-resolution, extremal functions and the condition number of Vandermonde matrices, Ankur Moitra: Investigates the very general question: When we recover fine-grained structure from coarse-grained measurements (called “super-resolution”)? He finds there is a sharp threshold for the cutoff frequency (we’re allowed to make measurements below this frequency) above which measurements can be recovered from $\frac1{\text{poly}}$ noise, and below which measurements cannot be recovered from exponentially small noise. The analysis involves calculating the condition number of a Vandermonde matrix (and surprisingly, involves the Beurling-Selberg majorant, which has traditionally been used in analytic number theory, e.g., to count primes in intervals). More generally, many problems about well-conditioning turn out to be much nicer when studied over the complex numbers. For example, the Discrete Fourier Transform is a particularly well-conditioned Vandermonde matrix. Then tools from complex analysis naturally come into play. There may be more opportunities to apply complex analysis to problems in machine learning/signal processing/statistics. http://dl.acm.org/authorize.cfm?key=N98679
• Knuth Prize lecture, Laszlo Babai: Among other things, Babai talks about the many surprising applications of group theory to theoretical computer science, including the following:
• “Width-5 branching programs can count” because $A_5$ is not solvable (https://people.cs.umass.edu/~barring/publications/bwbp.pdf).
• The idea of a semidirect product inspires the zig-zag product of graphs used to construct explicit expanders for derandomization (SL=L). (http://arxiv.org/pdf/math/0406038.pdf)
• The automorphism group of a graph has consequences for its structure: vertex transitivity implies facts about perfect matchings, cycle length…possible Hamiltonicity.
• A nontrivial monotone boolean function invariant under a transitive permutation group conjecturally has the maximum decision tree complexity (is “evasive”).
• What is the complexity of questions about groups (specific classes of group (non)membership questions, etc.)? They have appeared as candidate separations for complexity classes.

EC (Economics and computation)

I went to just the award lectures for EC.

• Algorithms for strategic agents (Matthew Weinberg): One big difference between computer science and economics is that:
• in CS (algorithm design), we assume that inputs to algorithms are correct, vertex connectivity, expansion… possibly Hamiltonicity.
• in economics (mechanism design), we have to deal with the possibilities that people will lie to take advantage of the system. More precisely, the inputs – ex. people’s bids in an auction, or preferences in a lottery – are given by agents acting in their self-interest, i.e., to maximize their own utility.

Not only do you have to design an algorithm that gives the correct answer, you also have to design it in such a way that people can’t take advantage of it by lying about their values or rankings. A natural question from a TCS perspective, which Weinberg explores, is whether there is a black-box reduction from certain classes of algorithms to corresponding mechanisms.

• Test of Time talk (influential paper from 10-25 years ago): The social cost of cheap pseudonyms, Eric Friedman, Paul Resnick: The Internet allows cheap pseudonymity. At the dawn of the Internet age, Friedman and Resnick establish that there is a definite economic price for pseudonymity. They first model pseudonymity through a repeated prisoner’s dilemma game where players are allowed to leave and come in with new identities. This model, however, gives rise to strange equilibria, and little can be said. The authors then adds the realistic assumption of “trembling,” i.e., accidentally defecting when you attempt to cooperate (e.g. a package you send through eBay gets lost in the mail). One can then prove quite a few things about this simple model. For example, it’s natural for old-timers with good reputation to be occasionally hostile against newcomers (which they calls Pay Your Dues).

Follow