Tuesday, August 31, 2004
Theory in the Coca-Cola Capital of Iowa
Posted by Lance
Today I gave the talk at the Atlantic
Theory Seminar, not Atlantic as in ocean but Atlantic, Iowa (population 7,257).
Located halfway between the theory groups of Iowa State and
Nebraska, the Cass County branch of the Iowa Western Community College
hosts a theory seminar every couple of months.
Pretty impressive that these theory groups, faculty, postdocs and students,
drive two hours each just to join for a seminar and talk theory with
each other. Quite an enjoyable day in rural America.
Sorry about the quality of the picture above. We didn't have a digital
camera so I took the picture with my cell phone.
9:28 PM
#

Sunday, August 29, 2004
Computationally Simple One-Way Functions
Posted by Lance
An upcoming FOCS paper is drawing a lot of interest at TTI and
Chicago,
Cryptography
in NC0 by
Benny Applebaum, Yuval Ishai and Eyal Kushilevitz.
Don't let "NC0" scare you, it simply means that
every output bit depends on a constant number of input bits. In this
paper the authors show that under reasonable assumptions one can have
one-way functions and pseudo-random generators where each output bit
depends on only four input bits. Quite surprising that such simple
functions can be so hard to invert.
Their proof uses a clever but not overly complicated reduction
using randomized polynomials from a larger class (⊕L/poly) to
NC0 and
so if one-way function and PRGs exist in the larger class than slightly
weaker versions exist in NC0 respectively. The larger class
contains many functions conjectured to be one-way or a PRG,
including those based on factoring.
I remember Mike Sipser mentioning in a complexity class I took back in
the mid-80's that
"We don't even
know how to show there are no one-way functions in
NC0." Now we know why.
7:29 AM
#

Friday, August 27, 2004
What happened to the future?
Posted by Lance
Disney World has two areas originally designed to give a glimpse of
"the future", Tommorowland in Magic Kingdom and Future World
at EPCOT, which itself once stood for Experimental Prototype Community
of Tomorrow. These areas give somewhat a nostalgic view of the future from
the time of my childhood but not of the future from today.
Not only has society's view of the future changed but even the view
about the future has changed. We live in an era of massive changes in
technology particularly in access to communication and information. In
the "Spaceship Earth" ride that shows the history and future
of communications, the future shows two kids seeing and playing games
with each other through video monitors, a task easily accomplished
today with webcam-equipped networked PCs. But the ride missed many
aspects of computers and the web. Who would have thought when I was a
kid that I would now be writing this post on a commuter train that
soon will be read around the world.
On the other hand we also dreamed of flying cars and space planes but
the technology of transportation has not significantly changed since I
was born and I don't expect major changes in the next forty years. So
the future has come but not quite as we expected.
Technology has made the world a more homogeneous place. When at the
Norway pavilion at EPCOT, an American visitor asked a Disney worker
from Norway about what the country was like to visit. "The towns
are like mid-size American cities" was the reply. I guess it is a
small world after all.
7:06 AM
#

Wednesday, August 25, 2004
ε and the Olympics
Posted by Lance
Back from Disney World and ready to tackle the new academic year. Thanks to
Adam Klivans for his guest posts. Interesting to see that COLT
glorifies the open problems. In complexity we prefer to see them closed.
Varsha Dani emailed a note about an article
describing an Amazon tribe with no concept of numbers. I guess they
don't send anyone to the olympics where numbers seem king.
Watching gymnastics I am always amazed how winning seems to
depend on sticking the landing. After a rather complicated and lenghty
routine why should whether your feet move at the end be the difference
between Gold and Bronze? The answer lies in the ε; the best
gymnasts can uniformly hit the major elements of their routines but
consistently sticking the landing is much more difficult and that what
makes or breaks the result.
We see the ε issue in many scenarios, most notably the 2000
US presidential election where confusing ballots in one county created
a huge controversy. The budget of the NEC Research Institute is
miniscule compared to the revenue of NEC corporate. A couple of years
ago NEC eeked out a profit about the same size of the cost of the
Institute and all of a sudden the NEC Research Institute looked very
expensive to NEC. In another example, consider the relatively large
power a small party can have in a coalition in a parlimentary system.
The best way to avoid the ε problem is by decisive
victories. Solidly beat your
opponent in an election or have enough parlimentary seats so you don't
need small partners. Make enough of a profit so that a cheap research
lab stays that way. One only gets the ε problem when one has
a statistical tie and then the small things take on great
importance.
In gymnastics one would want a system where a greatly superior gymnast
can score high enough that a small jump on the landing won't make a
difference. But the current system doesn't allow that; with a top
score of 10 the world's best gymnasts all get well over nine, making
it impossible to put enough of a gap between a the best and the rest.
6:44 PM
#

Friday, August 20, 2004
"Conclusions and Open Problems" (by Adam Klivans)
Posted by klivans
Unless Lance has been involved in some unfortunate roller coster accident, this is most likely my last post. It remains to be seen whether I will be allowed to keep my posting privileges here on the board (place your bets on "No"). What I've taken away from this experience, more than anything else, is that posting every day is a tremendous time sink. Lance, kudos to you for the almost daily missive.
At the end of most technical talks is the obligatory "Conclusions and Open Problems" slide, usually the least thought out moment of the presentation, which consists of a brief summary of the talk and a list of the most obvious (difficult) open problems. For the last few years, COLT has glorified the open problems section and allocates about an hour of time for a presentation of open problems. The open problems themselves must be submitted months beforehand and are refereed (how rigorously is anyone's guess); accepted problems appear in the proceedings. A list of this year's open problems can be found on the COLT 2004 program schedule -- the session was held on Friday evening.
Are there any other computer science conferences where open problems are refereed and given their own slot in the program? It always seems to work out well at COLT, and while I am aware of open problems being presented at rump sessions at various conferences, I don't know of other venues which require advance submission of problems.
With that, gentle reader, I return you to Lance's wise embrace. If I ever get to guest blog again, I will reveal what's hidden in Lance's "Private" directory here on fortnow.com. Some "lost" theorems apparently-- here's an aborted post entitled "Soon I will be famous: a simple proof that NP = coNP" -- and here's another unfinished note with some not so nice things to say about Algorithms. And what's this? A love letter to Jessica Simpson? Oh if only we had more time.
2:19 PM
#

Thursday, August 19, 2004
Constant Depth Circuits, Fourier Transform, and Learnability (by Adam Klivans)
Posted by klivans
First, thanks to Jeff Erickson for the lone comment on my previous post. For a second I was worried that I would have to post about Quantum Learning to bait Scott Aaronson into commenting. Fortunately, we're past that. I also realize now that being in Chicago I'm just a two hour drive away from you Jeff at UIUC. Maybe you, Ernie, and I should get some 3-D hotcakes at the local Waffle House-- on me.
Many top notch researchers come to visit the Toyota Technological Insitute here in Chicago. This fall alone Lenore Blum, Bruno Codenotti, Prahladh Harsha, and Jaikumar Radhakrishnan will be in residence along with TTI's regular faculty.
Yishay Mansour visited in August for about two weeks, and I had the chance to ask him about his work along with N. Linial and N. Nisan which pioneered the use of discrete Fourier analysis in computational learning theory. The paper Constant Depth Circuits, Fourier Transform, and Learnability gives a quasi-polynomial time learning algorithm for constant depth circuits with respect to the uniform distribution. More importantly, it pointed out a connection between the Fourier concentration of a Boolean function and its learnability.
If f is a Boolean function, we could imagine writing f as a multilinear polynomial in n variables mapping {-1,1}^n to {-1,1}. Every Boolean function can be written as such a polynomial whose total degree is at most n. Each coefficient of this polynomial measures the correlation of f with that monomial. These are the Fourier coefficients of f. Linial, Mansour, and Nisan showed that if the sum of the squares of the coefficients of monomials of degree d and larger is small, then there is an n^{O(d)} time algorithm for learning f with respect to the uniform distribution-- roughly speaking it is sufficient to estimate the coefficients of only the low degree monomials and output this polynomial.
What does any of this have to do with constant depth circuits? They also proved that for any circuit of depth d, the sum of the squares of the coefficients on the terms of degree (log n)^d and higher decays rapidly. From the above paragraph this gives us a quasi-polynomial time algorithm. For a rough intuition as to why this is true, consider one implication of Hastad's Switching Lemma, namely that parity cannot even be approximated by constant depth circuits. Thus every coefficient of a sufficiently large monomial (each monomial is a parity function) must be small.
As it turns out, according to Yishay, the work stemmed from an effort to prove circuit lower bounds, rather than a plan to develop new learning algorithms. The authors were inspired by Kahn, Kalai, and Linial's work on the influence of variables on Boolean functions and thought that discrete Fourier analysis might be the right tool for studying circuits. They happened to be correct-- in an unexpected way.
3:15 PM
#

Wednesday, August 18, 2004
Banff Revisited (by Adam Klivans)
Posted by klivans
Hey, you're back! And I'm really starting to feel comfortable at this posting gig. Thanks for all of the comments I got about my previous post-- all zero of them.
A few weeks ago Lance reported on the events from a complexity workshop he was attending at the Banff International Research Station, an institute similar to the International Space Station except that it's in Canada rather than outer space.
What Lance didn't mention was that at the exact same time, some important computer science conferences were taking place just down the street at the Banff Park Lodge. I was participating in the 17th annual Conference on Learning Theory (COLT) which was co-located with the International Conference on Machine Learning (ICML) and the Conference on Uncertainty in Artificial Intelligence (UAI).
Back when COLT stood for Computational Learning Theory (circa 1988-2002), the conference was known for focusing on the computational complexity of machine learning. Nowadays the conference still operates from a theoretical perspective, but, as the Conference on Learning Theory, the program covers everything from game theory and economics to kernel methods in addition to traditional PAC style results.
A PAC style result that appeared in COLT 2004 which may be of interest to the readers of this web log is Polynomial-Time Prediction Strategy with Almost Optimal Mistake Probability by Nader Bshouty. His paper solves a problem in the online learning setting. In this setting, an unknown function f is chosen from some fixed concept class and at time t a learner is presented with an input x chosen according to some distribution D.
The goal of the learner is to run in time polynomial in log t (and all of the other relevant parameters) and predict the value of f(x) with mistake probability O(1/t) (Haussler, Littlestone, and Warmuth proved that this mistake probability is optimal). Bshouty shows that if the concept class is PAC learnable, then there exists an online learning algorithm which runs in time polynomial in log t and achieves mistake probability O(log t/ t). His algorithm has an exponential improvement in running time as previous solutions ran in time polynomial in t. The algorithm works by iteratively creating a branching program based on hypotheses considered at previous time steps. A boosting-type procedure dictates which branch to take for any new input.
In the spirit of being controversial (Lance asked me to be controversial), I could discuss the pros and cons of the change from Computational Learning Theory to Conference on Learning Theory (as a concrete example of differences in the community, about half the participants at the COLT business meeting wanted to see COLT co-located with STOC in 2006-- University of Washington folks make your move-- and the others wanted to co-locate with ICML). I'll leave that, however, for my faithful readers to debate. I will point out that it's hard to argue with the increase in attendance at COLT over the last few years.
2:20 PM
#

Tuesday, August 17, 2004
Favorite Theorems: The Harmonic Sieve (by Adam Klivans)
Posted by klivans
What does Lance Fortnow do after getting a paper accepted to FOCS? He goes to Disneyworld for a week, of course. While Lance blasts pasts celestial satellites on Space Mountain, I will humbly try to replace the irreplaceable.
Considering the topic of yesterday's post, if there were a list of favorite theorems in computational learning theory, a field which makes only brief appearances here on the weblog despite its many connections to computational complexity, Jeff Jackson's algorithm for learning DNF formulas would certainly be on it.
A DNF formula is a Boolean formula written as an OR of ANDs (e.g. x_1 and x_2 OR x_3 and x_5). The size of a DNF formula is equal to the number of terms or ANDs. The problem of PAC learning an unknown polynomial-size (in n, the number of variables) DNF formula with respect to an arbitrary distribution remains one of the most notorious open problems in the field (for background on PAC learning see this post or Kearns and Vazirani's excellent book An Introduction to Computational Learning Theory).
In fact, even if we restrict the underlying distribution on examples to be the uniform distribution, the fastest algorithm for learning DNF formulas runs in quasi-polynomial time (a result due to K. Verbeurgt-- the main idea being that only terms of logarithimic length have a chance at being satisfied, so longer terms can be ignored).
If, however, we allow the learner to make queries to the unknown DNF formula, i.e. if the learner can choose any input x and ask for the value of the DNF formula evaluated on x, then the learner can succeed in polynomial-time.
The solution, due to Jeff Jackson in 1994, shows how to learn polynomial-size DNF formulas with respect to the uniform distribution in polynomial-time (again assuming the learner has query access to the unknown DNF). His algorithm, which he has called the Harmonic Sieve due to its use of Fourier analysis, builds on work due to Blum, Furst, Jackson, Kearns, Mansour, and Rudich (``Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis'') which showed that for any DNF formula with s terms, there exists a parity function which agrees with the DNF formula on roughly a 1/2 +1/s fraction of inputs.
The next step of the algorithm involves a novel application of Boosting algorithms (see this post for more on Boosting) for combining these parity functions to obtain an accurate hypothesis. The output of the Harmonic Sieve is not a DNF formula but a threshold of parity functions.
The Harmonic Sieve is one of the rare examples in computational learning theory of a polynomial-time algorithm for an expressive concept class. It is natural to ask whether the queries are essential for the algorithm. Unfortunately it seems like the answer is yes-- we do not know how to learn decision trees or even juntas (both strictly weaker concept classes than DNF formulas) in polynomial-time with respect to the uniform distribution unless the learner has query access to the unknown function. Removing the dependence on queries would be a real breakthrough.
By the way, the interface on blogger.com is worse than I ever could have imagined. Apologies in advance for formatting errors.
11:12 AM
#

Monday, August 16, 2004
Favorite Theorems: Parallel Repetition
Posted by Lance
July Edition Consider a simple Arthur-Merlin
game: Arthur probabilistically chooses a string r sends it to Merlin
who responds with y and then Arthur runs some algorithm A(r,y) to
decide whether to accept. Merlin's goal is to achieve the highest
acceptance probability possible p for Arthur. Suppose we run the game
twice in parallel, Arthur sends r1 and r2 and
Merlin sends y1 and y2 and Arthur accepts if
A(r1,y1) AND A(r2,y2). The
highest possible acceptance probability will be
p2.
Now consider the MIP model with two Merlins M1 and
M2 who cannot communicate with each other. Arthur sends u
and v to M1 and M2 respectively who respond with
y and z. Arthur accepts based on some function A(u,v,y,z). Once again
M1 and M2 try to achieve the highest possible
acceptance probability p. Now we run the game twice in parallel,
Arthur sending u1 and u2 to M1 and
v1 and v2 to M2 receiving
y1 and y2 from M1 and z1
and z2 from M2 and accepting if
A(u1,v1,y1,z1) AND
A(u2,v2,y2,z2).
One might assume that the best the provers can achieve is
p2 (an assumption in fact made in an early paper
co-authored by a certain weblog author) but in some circumstances the
provers can do better. However Ran Raz shows that if p is less than 1, one can
get an exponential decrease in p with a polynomial number of parallel
rounds in
A
Parallel Repetition Theorem by Ran Raz
This paper settles one of the more perplexing aspects of multiple
prover proof systems with a highly complicated proof. The result also plays a
critical role in reducing the number of queries in probabilistically
checkable proof systems which led to some
optimal approximation bounds.
As a side note I am off on vacation tomorrow and Adam Klivans will
guest blog in my absence. Enjoy.
11:14 AM
#

Thursday, August 12, 2004
Wisdom of Crowds
Posted by Lance
Keeping with this week's theme of prediction, I just finished reading
The Wisdom of Crowds
written by New Yorker writer James Surowiecki. The book makes the case
that large groups can make great decisions, often better than any
individual in a group, if three conditions occur:
- diversity of the members of the group,
- independent opinions of the group members, and
- a method for aggregation of the opinions.
Surowiecki's very readable book gives many examples where group
decisions do quite well (sports betting, Google's search techniques
based on other's web pages, Linux) and where group decisions fail
(stock market bubbles, committee meetings, strong CEOs).
Chapter 8 is devoted to science and how many widely spread scientists
developing and criticizing various theories lead to explosive growth in
our understanding. He also notes that this ideal world has its flaws
as unknown researchers have a harder time selling their work than more
established scientists.
I don't agree with all the conclusions drawn by Surowiecki but he does
lay out what we need to do and not do to benefit from the pooled
knowledge of a group. We can also draw lessons in computer science as
computation and information gets more distributed that we need to
integrate to find the best solutions we can.
9:13 PM
#

Tuesday, August 10, 2004
Fun with Information Markets
Posted by Lance
Just over a year ago the Department of Defense cancelled
their program on using markets to predict future world events, an
overreaction that stopped funding a potentially powerful prediction tool. Robin
Hanson has a comprehensive web page
giving a history and plenty of links.
Information markets live in limited academic-based markets like the Iowa Electronic Market and
offshore sites like Tradesports. For example the current
price on Tradesports for Bush winning the election is 51.5 which
translates to a 0.515 probability that Bush will win indicating a very
close contest.
For each state, Tradesports has a security on whether Bush will win
that state. They also have some bundles of states. The price for
Florida is 50.1, Ohio 55.4 and Bush winning both Florida and Ohio is
47.1. This gives a surprising correlation between Florida and Ohio. If
you believe the theory there is a very high 0.94 probability that Bush
wins Ohio given that he wins Florida and with a 0.89 probability these
two very different swing states will go the same way.
Tradesports gives David Vitter a 59 percent chance of becoming a
senator from Louisiana. David Vitter is the brother of CS theorist and
former SIGACT chair Jeff Vitter.
11:47 PM
#

Monday, August 09, 2004
Micromorts
Posted by Lance
Currently on airplanes children under two can ride free by sitting on
a parent's lap. The FAA is considering whether to require such
children to have their own seat in a child seat similar to the ones
most states require for cars. Sounds reasonable? One argument against
goes as follows: If we require parents to pay for a seat for a
children there is a chance they will drive instead greatly increasing
their risk.
How can we evaluate risk? Decision scientists have developed a measure
called micromorts (μmorts). A μmort is a one-millionth chance
of death. Sounds gruesome but by counting micromorts we can analyze
the right choices to keep the most people alive.
All of three lap children have died in airplane crashes where their
parents survived since 1987. The average driver runs the risk of about
.02 μmorts/miles. If the average car trip is say 500 miles that
translates to about 10 μmorts for each child in the car. Three laptop
children have died in airplane crashes where the parent has survived
since 1987. This translates to the equivalent of 300,000 car trips or
about 15,000/year. About 6 million children ride on laps on airplanes
each year, so if more than 0.25% of them were to ride in a car instead
because of the higher prices, we would about cost lives by requiring
safety seats on planes. My numbers, drawn from various internet
sources, don't tell the whole story but nevertheless we
can and should do a full analysis before setting policy.
It would be nice to have a list of various activities and how many
μmorts they use, say you feel like parachuting, you can get an idea of
how dangerous it is compared to say riding a bicycle. But we don't get
such lists and people have to use their own judgments and often make
the wrong decisions. We can also give a cost amount to a μmort; how
much is it worth to save lives?
By finding statistics online you can calculate the risks in your
various activities. You need to use about 3 μmort/day on average to
keep a 10% chance of accidental death in your life. Spend them wisely.
10:51 AM
#

Friday, August 06, 2004
When to Announce?
Posted by Lance
Suppose you have some partial solutions of a popular problem. At what
point do you announce your results? If you announce your partial results
you run the risk of someone else taking your ideas and solving the
full problem and you won't get as much credit as you deserve. If you
wait and try to extend the work yourself someone else might get the
same results you already have and you'll lose or at best have to share
the authorship.
If you are completely altruistic you should announce your progress as
this will best advance science quickly. But as in the end you need to worry
about your own publication record, particularly for a young
researcher, the answer isn't so clear. Of course it depends on many factors including your
belief that you or others could extend the work as well as when the
next conference deadline occurs.
Oddly enough before the internet (in the eighties) such decisions were
easier. You could write up a technical report to establish your result
and you would have months before your work spread throughout the
community. This gives you plenty of time to try and extend the work. The
quick spread of information not only improves collaborative work as it
does, but forces us to make decisions that we could avoid in the past.
11:26 AM
#

Wednesday, August 04, 2004
Small Circuits
Posted by Lance
Fix a constant k. In 1982 Ravi Kannan showed that some
Σ2p∩Π2p
language must not have nk-size (nonuniform) circuits. Here
is a proof sketch: A simple counting argument shows there is a
function that depends only on the first 5k log n inputs that is not
equivalent to a nk-size circuit. Just by writing out the
quantifiers in Σ4p you can compute the
lexicographically first such function. Now we have two cases:
- If SAT does not have polynomial-size circuits then SAT then
Σ2p∩Π2p which
contains SAT does not have nk-size circuits.
- If SAT
has polynomial-size circuits then
Σ4p=Σ2p∩Π2p
(Karp-Lipton) and thus
Σ2p∩Π2p does
not have nk-size circuits.
This is a wonderful example of a non-constructive proof and giving an
explicit
Σ2p∩Π2p
language without quadratic-size circuits is open. With better known
collapses we can improve the result from
Σ2p∩Π2p to
S2p.
Vinod Variyam recently observed that the class PP which is not known to contain
S2p also cannot have nk-size
circuits. Here is his proof: If PP has nk-size circuits
then PP is in P/poly which implies the polynomial-time hierarchy and
in particular Σ2p is in MA which is in PP
which has nk-size circuits contradicting Kannan.
Read Variyam's paper for details and references.
5:45 AM
#

Monday, August 02, 2004
Larry Stockmeyer
Posted by Lance
We lost a great complexity theorist over the weekend. From Phokion Kolaitis:
It is with great sadness that I write to inform you that
Larry
Stockmeyer passed away. He died at his home as he had wished when he
fell terminally ill a few weeks ago.
Larry was one of the pioneers of computational complexity who made
fundamental and lasting contributions to the field.
His death creates a void in our community that cannot be filled.
Indeed Stockmeyer developed many of the important early concepts in
complexity such as alternation and the
polynomial-time hierarchy, concepts that have laid the foundation for
many important works in computational complexity. He had
a number of great
results throughout his career in complexity and nearly all areas
of theoretical computer science. Our community has lost one of its giants.
4:19 PM
#

Strangers in the Same Place
Posted by Lance
Professor X and Professor Y from the same university attend the same
conference. At the end of the conference, Professor X says in a
surprised tone "That's the most time I have talked with Professor Y
all year." He shouldn't be surprised; this is a story I've heard
over and over again (and have even told myself).
A professor's life has many responsibilities. Teaching and research of
course but also paper writing, grant proposals, meeting with students,
and administrative tasks including seemingly endless committee
meetings. When I visit another university I leave most of these
responsibilities behind so I can focus on research. I also expect the
people who invited me to make time in their schedules so we can work
together. That way even a short visit can be quite productive.
As the length of the visit increases it becomes harder to avoid these
other responsibilities and the amount of research time per day
decreases. In the extreme, two people who work at the same university
for years end up spending very little time talking research together.
This explains why teleconferencing will never replace traveling no
matter how technologically advanced. The social requirements of a
short visit require people to spend time together in ways a
teleconference cannot. What teleconferencing will do is
"allow" me to attend those endless committee meetings
wherever I am.
8:53 AM
#
