Wednesday, November 26, 2003
Critical Mass
Posted by Lance
In the left column of this weblog's homepage, I've added a new
"weblogs" section to list the academic weblogs that I have
been following. Not very long right now, so start up a relevant weblog
and I'll add you to the list.
What is Critical Mass? This is a weblog written by U. Penn English
Professor Erin O'Connor. Not much computer science but she covers a
number of issues about academic freedom, an important issue to every academic. Take a look.
Don't forget that the 2004 Complexity Conference
submissions are due today. Have a great
Thanksgiving!
8:51 AM
#

Monday, November 24, 2003
Theory and XML
Posted by Lance
XML (eXtensible Markup Language) has become quite a popular data
format in recent years. XML roughly corresponds to a tree. For
example,
<person><name>Harry</name><age>29</age></person>
<person><name>Jane</name><major>Computer Science</major></person>
represents a tree. The root having two children, each labeled
"person". The first of these children has two children named
"name" and "age". The first of those children has a
leaf node labeled with the phrase "Harry". For a larger
example, see the
RSS feed for my weblog.
XML was designed as a flexible way to present documents for later
displaying. Since the XML format can be easily produced and parsed,
XML also serves as a standardized method for transferring data
between databases, far better than the old CSV (Comma-Separated
Values) format.
Recently there have been some work on directly manipulating and
querying the XML data. As a theorist, this seems like a bad idea,
particularly for larger databases. While XML completely represents the
underlying tree, it is not a good implementation of that tree. Basic tree
operations like parent and sibling are very expensive
just looking at XML. About the only thing one can do quickly with XML
is depth-first search.
Far better to "shred" the data into a better tree
implementation like a DOM (Document Object Model) or a full-fledged
database and do the work there, rewriting a new XML if needed.
One issue though is when the XML file is on the order of 5-10 GB,
a bit larger than what can be stored in the memory of a typical desktop
machine. One can stream through the file rather quickly but cannot
recreate the tree. This opens up some interesting theoretical
questions:
- Given a stream of data in XML format, how can one do efficient
analysis and manipulations of the underlying tree? I suspect one would
want to sometimes shred subtrees, but you cannot determine the size of
the subtree until after its been streamed. Perhaps some randomness or
streaming the file multiple times might be helpful.
- XML might not be the right model of a tree for this purpose. What
is the best way to stream a tree or other data structure to allow an
efficient implementation of the basic operations of the data
structure? Perhaps some redundancy might be useful.
9:37 AM
#

Friday, November 21, 2003
Rational Functions and Decision-Tree Complexity
Posted by Lance
I thought I should mention some of my favorite and most frustrating
open questions over the years. Here's one of them:
Let f:{0,1}n→{0,1}. Let h and g be n-variable degree d
polynomials over the reals. Suppose for all x in {0,1}n,
g(x)≠0 and f(x)=h(x)/g(x). Is there a constant k such that the
decision-tree complexity of f is bounded by dk?
The decision-tree (or query) complexity is the number of bits of x
that need to be viewed to determine f(x). The queries to the bits of x can be
adaptive. I'm particularly interested in the case where d is
poly-logarithmic in n.
Nisan and Szegedy answer the question in the affirmative if g(x)=1. Their
result holds even if f(x) is only approximated by h(x). However if we
allow arbitrary g(x), h(x)/g(x) can closely approximate the OR
function which requires looking at all of the bits. The case where we
require exact equality of f(x) and h(x)/g(x) is the open question at hand.
8:42 AM
#

Wednesday, November 19, 2003
Midwest Theory Day
Posted by Lance
Over the years a number of localized theory workshops have developed
which bring together researchers from around the area to mingle and
learn about some of their research. For example BATS (Bay Area Theory
Seminar-San Francisco area), CATS (Capital Area Theory
Seminar-Washington DC), DIMACS Mixers (New Jersey), Netherlands
Theoriedag and others including many I've never heard of.
In the middle of America we have the Midwest Theory Day. Generally
since 1980 this is a day of talks held in the fall in the Chicago area
and in the spring somewhere else in the Midwest such as the
University of Wisconsin,
Indiana, Illinois, Purdue, Notre Dame, etc. The next Midwest theory
day will be held December 13 at DePaul University and in the spring on April 24 at the
University of Iowa.
So come to Chicago or attend (or organize) you own local area theory
seminar and meet your theory neighbors.
8:47 AM
#

Monday, November 17, 2003
Editors Needed
Posted by Lance
Back in my science-fiction reading days, I particularly remember one
editorial written in one of those anthology magazines about
1980: In the near future, you will be able to access, via your
personal computer, any science fiction story right after it has been
written. If you like a certain author, you can read other stories from
that author, even if we didn't decide to put it in this magazine. In
this future world, will you still need me, the editor? The answer is
yes, for there will be way too much dreck out there for you to find
the good stories within, and you will need people like me to point
them out to you.
The future is now and though I haven't kept up with science fiction,
the same issue applies to academic publications. Recent posts by
Michael Nielsen
and on Slashdot have asked: With nearly all new papers in
physics and computer science easily accessible on the web, how do
you find the ones worth reading?
Conferences have traditionally played this role in computer science. But, by
definition, paper choices are decisions by committee and with the massive
growth in the field, many good papers do not appear in the major
conferences.
What we need are "editors"! You can help. Write a survey
paper, or spend a page in your research papers discussing the
important earlier results in a field. Maintain a web page pointing to
papers you find interesting. Start a weblog saying what you find
interesting--you don't have to post long or often, just to say, hey,
this paper is worth looking at. This way people with similar interests
to you can find out what at least you think is important. Only by
many of us working together can we make the interesting papers stand out.
9:26 AM
#

Friday, November 14, 2003
A Regular Problem
Posted by Lance
Here's a problem from Janos Simon. For a set A, let
L(A)={x | for some m≥0, xm is in A}
Simon asked, as a homework problem, to show that if A is regular then
L(A) is also regular. The standard solutions require an exponential
increase in the number of states. Simon asks whether this is needed in
general. If A is accepted by an n-state DFA, can one find a
poly(n)-state DFA for L(A) or is an exponential state DFA for L(A)
required?
As an aside the submission deadline for the Complexity
Conference is right around the corner. Get your papers ready.
9:17 AM
#

Wednesday, November 12, 2003
Opportunities at Chicago (Blatant Plug)
Posted by Lance
[I will do this just once. Feel free to use the comments link to
mention your own institution.]
We are seeking theory graduate students and faculty (all ranks) in the
Computer Science Department of the University of Chicago. The Toyota
Technological Institute, located on the University of Chicago campus,
is looking for both faculty and postdoctoral candidates.
The Theory Group has expanded over the past few years and will continue to
grow. We have a long tradition of outstanding doctoral students and notable
research accomplishments.
Click here
for further information.
3:56 PM
#

Tuesday, November 11, 2003
25 Years of Science Times
Posted by Lance
It happened right after I started high school in suburban New Jersey, the start of the Science Times section in Tuesday's New York Times. The Science Times not only helped get me excited about science but made me feel others could get excited over science as well. I've kept reading it off and on during these past 25 years. The Science Times has reported on a fair amount of research in complexity and theoretical computer science, for a time some joked that a result was not important until it appeared in the New York Times.
Today the New York Times celebrates the 25th Anniversary Issue of the Science Times. It features 25 questions such as Does Science Matter? and What Is the Most Important Problem in Math Today? (Hint: It's not P versus NP).
I'll end this post with a quote from the essay of Alan Lightman:
All of the scientists I've known have at least one quality in common: they do what they do because they love it, and because they cannot imagine doing anything else. In a sense, this is the real reason a scientist does science. Because the scientist must. Such a compulsion is both blessing and burden. A blessing because the creative life, in any endeavor, is a gift filled with beauty and not given to everyone, a burden because the call is unrelenting and can drown out the rest of life.
6:55 AM
#

Sunday, November 09, 2003
CISE Reorganization
Posted by Lance
The Computer and Information Science and Engineering Directorate of the NSF has completed it reorganization. The CISE web site details the new structure.
CISE now has four divisions. Instead of each division have a large number of specific programs, each division contains a smaller number of clusters covering a broader research area. I'm happy to see "Computational Complexity" specifically mentioned in the Formal and Mathematical Foundations Cluster in the Division of Computing & Communication Foundations. However it shares that cluster with such diverse topics as "computational algorithms for high-end scientific and engineering applications" and "analysis of images, video, and multimedia information." Hopefully funding panels will meet in the more specific areas to avoid trying to compare proposals from vastly different areas of computer science.
Quantum and Biological Computing sit in a different CCF cluster, Emerging Models and Technologies for Computation. This shows NSF's hopes for these new technologies but may also give them a way to phase out these areas if the technologies don't show promise.
Program announcements for the CCF clusters are still under development. The ITR solicitation is still not expected until Thanksgiving. So if you plan a grant proposal this year, you'll still need to wait.
7:52 PM
#

Friday, November 07, 2003
The Web and Research
Posted by Lance
I received the following email from David Molnar, a Berkeley graduate student. He asks some interesting questions about the use of the web and research. Instead of just answering them immediately, I though I would share his email (with his permission) and let you think about it. Feel free to comment. I'll also address some of his points later in future posts.
Molnar's Letter:
I wanted to follow up on your comment about theoretical computer
science and e-mail/IM. Also, use of weblogs.
- There's a piece of folklore about how IP=PSPACE was
proved pursuant to some race between many researchers
carried out over e-mail. I've once or twice found myself in
e-mail races with other people on mailing lists when we're
responding to something technical.
Do you think the day will come when some major theorem is
proved in an IM chat room? :)
- Weblogs as place for ideas. A friend of mine asked me if
I thought researchers in theory would be interested in putting
up a web page (actually, we were discussing a wiki) about
open problems. My response was pretty much like what
you see in Oded Goldreich's FAQ -- "well that's nice, but
good research problems are hard to find, and people seem to
reserve them for themselves or their close colleagues."
The point: I find that I'm starting to use my weblog as a place to
deposit
ideas for problems that I just don't have time to pursue, or
aren't in what I think is my main line of research.
In the old days they would have gone into a notebook somewhere
and no one would have seen them. Now maybe someone will
pick them up -- and I think that's great (especially since
I usually don't know how to progress). At the same time,
I wonder how peoples' concerns about ownership of ideas
will affect the way they use weblogs. What do you think?
10:29 AM
#

Wednesday, November 05, 2003
The PCP Theorem
Posted by Lance
This week I started teaching the PCP theorem: Every language in NP has
a proof that can be probabilistically checked with only O(log n)
random coins and a constant number of queries. One graduate student
asked me which came first, the PCP theorem or the connection to
hardness of approximation. That seemed like a strange question; after
all why would so much effort be put into reducing parameters like
number of random coins or queries unless there was some application
behind it. But then again, so much emphasis is now in the literature
on these parameters that he probably thought these were always the
parameters people cared about the most.
To put the record straight: Feige, Goldwasser, Lovász, Safra and Szegedy made the connection
between the interactive proof literature and approximation. Arora and
Safra showed the importance of reducing random coins and queries and
proved a weaker version of the PCP theorem. The theorem in the above
form was proved by Arora, Lund, Motwani, Sudan and Szegedy.
I have been teaching off the notes of Madhu Sudan from a short course
he taught on the PCP theorem. The notes don't go through every detail,
but they give one of the better expositions of what is still quite a
complex proof.
8:42 AM
#

Monday, November 03, 2003
What happened at NEC?
Posted by Lance
The NEC Research Institute (NECI) died just over a year ago. I didn't
feel comfortable talking about it then so let me say a few words now.
I joined NECI in 1999 just after its tenth anniversary. When I joined
its mission and focus was basic research in computer science and
physics. NECI gave me considerable time and resources to study
fundamental questions in computational complexity. It was an exciting
place to be.
Soon thereafter some changes were occurring. NEC modified the mission
of NECI to focus on producing technologies with basic research
secondary. Some researchers (though not us theorists) were encouraged
to join "technology groups" to find practical applications
of their research. But during this time, the administrators always
supported basic research and I never felt uncomfortable doing
theory.
But then on November 1, 2002, NECI merged with NEC CCR&L, a more
applied research lab in the same building to form NEC Laboratories
America. The new mission makes no mention of basic research. The
scientists in charge were replaced by engineering/management
types. Many of the research scientists, particularly physicists, were
let go.
My job was never in immediate danger but NEC was no longer the place
for me and so I went on the job market; no one was surprised when I
decided to leave.
A corporation like NEC needs to make decisions for the health of the
company. I do not fault NEC for the decisions that it made and they
gave me a few great years. Still I mourn the NEC Research
Institute, quite a special place during its thirteen year run.
9:01 AM
#
