GUEST BLOGGER: Bill Gasarch
(BEFORE I START TODAY'S BLOG, A REQUEST: EMAIL ME OTHER
LUDDITE QUESTIONS- I WILL POST THE BEST ONES ON FRIDAY)
If u,v \in \Sigma^* then u is a SUBSEQUENCE OF v if you
can obtain u by taking v and removing any letters you like.
EXAMPLE: if v= 10010 then
e,0,1,00,01,10,11,000,001,010,100,101,110,0010,1000,1001,1010,10010
are all of its subsequences (e is the empty string).
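For readers who want to check the example by machine, here is a minimal Python sketch. The function names is_subsequence and all_subsequences are made up for this illustration; the brute-force enumeration is exponential in the length of v and is only meant for tiny strings.

```python
from itertools import combinations

def is_subsequence(u, v):
    """Return True if u can be obtained from v by deleting letters."""
    it = iter(v)
    # 'c in it' advances the iterator until it finds c, so order is respected.
    return all(c in it for c in u)

def all_subsequences(v):
    """Return the set of distinct subsequences of v (brute force)."""
    return {"".join(v[i] for i in idx)
            for r in range(len(v) + 1)
            for idx in combinations(range(len(v)), r)}

subs = all_subsequences("10010")
# Print shortest first, then alphabetically; "" is the empty string e.
print(sorted(subs, key=lambda s: (len(s), s)))
```

The greedy left-to-right scan in is_subsequence is enough because matching each letter of u as early as possible in v never hurts.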
Let L be any language, that is, a subset of \Sigma^*. SUBSEQ(L)
is the set of subsequences of all of the strings in L.
The following three could be easy problems in a
course in automata theory:
a) Show that if L is regular then SUBSEQ(L) is regular
b) Show that if L is context free then SUBSEQ(L) is context free
c) Show that if L is c.e. then SUBSEQ(L) is c.e.
(NOTE- c.e. is computably enumerable- what used to be called
r.e.- recursively enumerable)
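One possible route to (a), sketched in Python: take an NFA for L and add an epsilon-edge alongside every edge, so that any letter may be skipped. The dict-based NFA representation and the names subseq_nfa, eps_closure and accepts are assumptions made for this sketch, not anything from a particular textbook or library.

```python
def subseq_nfa(nfa):
    """Given an NFA for L, return an NFA for SUBSEQ(L).

    An NFA is a dict with keys 'start', 'accept' (a set of states) and
    'delta' (a set of (state, symbol, state) triples; symbol None = epsilon).
    """
    delta = set(nfa["delta"])
    # Every letter may be skipped: add an epsilon edge alongside each edge.
    delta |= {(p, None, q) for (p, _, q) in nfa["delta"]}
    return {"start": nfa["start"], "accept": set(nfa["accept"]), "delta": delta}

def eps_closure(states, delta):
    """All states reachable from 'states' using epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for (a, s, b) in delta:
            if a == p and s is None and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def accepts(nfa, w):
    """Standard subset simulation of an NFA with epsilon moves."""
    current = eps_closure({nfa["start"]}, nfa["delta"])
    for c in w:
        step = {q for (p, s, q) in nfa["delta"] if p in current and s == c}
        current = eps_closure(step, nfa["delta"])
    return bool(current & nfa["accept"])

# Example: an NFA for the one-string language L = {"10010"}.
word = "10010"
chain = {"start": 0, "accept": {len(word)},
         "delta": {(i, c, i + 1) for i, c in enumerate(word)}}
sub = subseq_nfa(chain)
print(accepts(chain, "110"), accepts(sub, "110"))
```

Any accepting path in the new machine is an accepting path in the old one with some letters skipped, and vice versa, which is the whole argument for (a).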
Note that the following is not on the list:
Show that if L is DECIDABLE then SUBSEQ(L) is Decidable.
Is this even true? It's certainly not obvious.
THINK about this for a little bit before going on.
There is a theorem due to Higman (1952), (actually a corollary of
what he did) which we will call SUBSEQ THEOREM:
If L is ANY LANGUAGE WHATSOEVER over ANY FINITE ALPHABET
then SUBSEQ(L) is regular.
This is a wonderful theorem that seems to NOT be that well known.
It's in very few Automata theory texts. It is not heard much.
It falls out of well quasi order theory, but papers in that
area (is that even an area?) don't seem to mention it much.
This SEEMS to be an INTERESTING theorem that should get more
attention, which is why I wrote this blog. Also, I should point
out that I am working on a paper (with Steve Fenner and Brian
Postow) about this theorem. BUT to ask an objective question:
Why do some theorems get attention and some do not?
1) If a theorem lets you really DO something, it gets attention.
There has never been a case of `OH, how do I prove L is regular?
WOW- it's the subseq language of L' !!'
By contrast, the Graph Minor Theorem, also part of well quasi
order theory, lets you PROVE things you could not prove before.
2) If a theorem's proof is easy to explain, it gets attention.
The SUBSEQ theorem needs well quasi order theory to explain.
(`needs' is too strong- Steve Fenner has a proof of the |\Sigma|=2
case that does not need wqo theory, but is LOOOOOOOOOOOOOONG.
He thinks he can do a proof for the |\Sigma|=3 case, but that will be
LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOONG.
Can be explained to an ugrad but you are better off going through
wqo theory.)
3) If a theorem CONNECTS to other concepts, it gets attention.
There are no real consequences of the SUBSEQ theorem.
Nor did it inspire new math to prove it.
4) If a theorem has a CHAMPION it may get attention. For example
the SUBSEQ Theorem is not in Hopcroft-Ullman's book on automata
theory- one of the earliest books (chicken and egg problem- it's
not well known because it's not in Hopcroft-Ullman, it's not in HU
because its not well known). The SUBSEQ theorem had no CHAMPION.
5) Timing. Higman did not state his theorem in terms of regular
languages, so the CS community (such as it was in 1952) could not
really appreciate it anyway.
Yet, it still seems like the statement of it should be in automata
theory texts NOW. And people should just know that it is true.
Are there other theorems that you think are interesting and not
as well known as they should be? If so I INVITE you to post them
as comments. The theorem that gets the most votes as
SHOULD BE BETTER KNOWN will then become better known and hence
NOT be the winner, or the loser, or whatever.
NOTE: The |\Sigma|=1 case of Higman's theorem CAN be asked in
an automata theory course and answered by a good student.
Is there any reason why everyone wants to call recursive anything computable anything beyond the hope that sticking in the word computation will make more people interested?
Yes indeed. The word "computable" describes much more closely the objects referred to than does "recursive," which historically refers only to a particular model of computation, and nowadays is too often confused by students with a particular strategy for designing algorithms. It's incongruous to talk about Turing machines and call the functions they compute "recursive." In what sense do Turing machines recurse?
``Steve Fenner has a proof of the |\Sigma|=2 case that does not need wqo theory, but is LOOOOOOOOOOOONG. He thinks he can do a proof for the |\Sigma|=3 case, but that will be LOOOOOOOOOOOOOOOOOOOOOOOOONG.''
Actually I have a proof of the general case, not using wqo's, that's 4-5 pages. It is at http://www.cse.sc.edu/~fenner/papers/higman.pdf. The binary case is easier than the general case, but the ternary case probably isn't.
actually I had written about higman's lemma in this post, but not using the formulation of Higman's lemma that you state here, and which is much more compelling than the variant that I used. We even used it in a paper (referenced in the post).
Given a sequence of real numbers a(1),...,a(n), suppose we want to find the monotonically nondecreasing sequence that best approximates it in the least-squares norm.
It turns out there's a beautiful linear-time algorithm to accomplish this. I was elated to come up with it as a summer student at Bell Labs, until I learned that Kruskal had beaten me by ~35 years.
(1) Create a linked list, where initially the ith element has a "value" of a(i) and a "weight" of 1.
(2) Repeatedly look for adjacent elements i and i+1 such that a(i)>a(i+1). Whenever you find such a pair, replace it by a single element of weight w(i)+w(i+1), and value equal to the weighted average [w(i)a(i)+w(i+1)a(i+1)]/[w(i)+w(i+1)]. Continue until a(i)<=a(i+1) for all i.
(3) Output a list of n elements, where a(i) in the final list occurs with multiplicity w(i).
Exercises: Why does this work? Why can it be made to run in linear time?
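Steps (1)-(3) can be sketched in Python as follows. This is one way to organize the merging, using a stack in place of the linked list; the function name isotonic_least_squares is made up for this sketch. The procedure is often called "pool adjacent violators".

```python
def isotonic_least_squares(a):
    """Best nondecreasing least-squares fit to the sequence a."""
    blocks = []  # stack of [value, weight] pairs, values nondecreasing
    for x in a:
        blocks.append([float(x), 1])
        # Merge while the last two blocks are out of order (step 2).
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks.pop()
            blocks.append([(w1 * v1 + w2 * v2) / (w1 + w2), w1 + w2])
    # Step 3: expand each block to w copies of its value.
    out = []
    for v, w in blocks:
        out.extend([v] * w)
    return out

print(isotonic_least_squares([1, 3, 2, 4]))
```

A hint for the second exercise: each input element is pushed once, and every merge shrinks the stack by one, so the total number of merges is less than n.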
OK, I've got another result that ought to be better-known in our community (though it is well-known in a different community).
Over a Boolean alphabet, what are the largest sets of gates that are not universal? Assuming the constants 0 and 1 come for free, it's easy to show that there are exactly two such sets:
(1) the monotone gates (AND,OR), and
(2) the linear gates (NOT,XOR).
But what if the alphabet has 3 or more elements? Then the problem is much more complicated, but it was solved by Ivo Rosenberg in the early 70's. In particular, Rosenberg showed that for any finite alphabet size, there are only finitely many "maximal but not universal" gate sets.
Am I missing something? I found a proof of Higman's Lemma pretty quickly. It uses Dickson's Lemma, which now that I've looked up the terms is I guess part of w.q.o. theory, but that result has an easy, self-contained proof by induction (I learned about it in week 2 of an undergrad alg. geometry course) and is beautiful discrete math. So I'm not sure why Higman's result can't be in more texts.
Dickson's Lemma: Let S be a subset of the set of k-tuples of natural numbers. Suppose that S is 'upwards closed': if v1 is in S and v2 dominates v1 coordinate-by-coordinate, v2 is in S.
Then there's a finite subset S' of S such that v is in S iff v dominates some element of S'.
Proof is induction on k.
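To make the statement concrete, here is a small Python sketch of what the finite set S' buys you. It assumes we are simply handed a finite list of generators for S (the lemma itself does not tell you how to find S' for an arbitrary S); the names dominates, minimal_elements and in_upward_closure are made up for this illustration.

```python
def dominates(v2, v1):
    """True if v2 >= v1 in every coordinate (tuples of equal length)."""
    return len(v1) == len(v2) and all(b >= a for a, b in zip(v1, v2))

def minimal_elements(points):
    """The points that dominate no other point in the set: this plays S'."""
    pts = set(points)
    return {p for p in pts if not any(q != p and dominates(p, q) for q in pts)}

def in_upward_closure(v, basis):
    """Membership in S: does v dominate some element of the basis?"""
    return any(dominates(v, b) for b in basis)

generators = {(1, 2), (2, 1), (2, 2), (3, 1)}
basis = minimal_elements(generators)
print(basis, in_upward_closure((5, 2), basis), in_upward_closure((0, 9), basis))
```

Once S' is known, testing membership in S is just a finite number of coordinatewise comparisons.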
Proof of Higman's lemma:
Let L be a language over {0,1}. If every string is in subseq(L), then subseq(L) is all of {0,1}*, which is regular; so say x is a forbidden subsequence for L.
Insert 0's and 1's into x so that 0's and 1's alternate; the resulting string x' is also forbidden. Let k be the length of x'; then no string in L can have more than k alternations between 0 and 1.
Slice up the (k+1)-alternation-restricted strings according to how many 0-1 alternations (0 <= j < k+1) a string has and which bit (b) it begins with.
Any of the strings in the (j, b) slice can be naturally encoded as a (j+1)-tuple of natural numbers in a bijective way (for that slice), e.g.
000111011 ---> (3, 3, 1, 2) in the (3, 0) slice; 11011 -----> (2, 1, 2) in the (2, 1) slice; 11------> (2) in the (0, 1) slice.
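The encoding is just the list of run lengths, tagged with the slice. A minimal Python sketch, with function names encode and decode made up for this illustration and assuming nonempty binary strings:

```python
from itertools import groupby

def encode(s):
    """Return ((j, b), runs) for a nonempty binary string s.

    j is the number of 0-1 alternations, b is the first bit, and runs
    is the (j+1)-tuple of lengths of the maximal blocks of equal bits.
    """
    runs = tuple(len(list(g)) for _, g in groupby(s))
    return (len(runs) - 1, int(s[0])), runs

def decode(slice_id, runs):
    """Inverse of encode within the slice (j, b)."""
    j, b = slice_id
    # Block i consists of runs[i] copies of the bit b flipped i times.
    return "".join(str((b + i) % 2) * n for i, n in enumerate(runs))

for s in ["000111011", "11011", "11"]:
    print(s, "--->", encode(s))
```

Within a fixed slice, making any run longer only adds letters, which is why domination of tuples matches the subsequence order there.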
Then it holds that if any (j+1)-tuple v1 encodes a forbidden (j, b) subsequence of L and v2 dominates v1, v2 also encodes a (j, b) forbidden subsequence. Thus by Dickson's Lemma, the forbidden (j, b) strings are exactly those j-alternation-restricted strings whose encodings dominate the encoding of one of a finite set of (j, b) strings. This is a finite disjunction of properties easy to test by finite automata; using the closure of regular languages under finite union and complement, and applying the easy check for too many 0-1 alternations, we find subseq(L) is regular. QED
This does beg the question of how to provide the finitely many strings we need, given a description of L (Dickson's Lemma is nonconstructive). But of course it's undecidable to do this given just a machine for L, and in any case, as Bill says, who actually cares about subseq(L)?
(I feel odd commenting on my own post.) YES, the proof given above of SUBSEQ Thm. using Dickson's lemma is correct. In fact, the proof of SUBSEQ theorem is NOT hard. I suspect that your proof and the standard one are the same proof. When I say it needs `wqo theory' that just means that it would take some work to get to in an ugrad automata theory class, but it really could be done. And it could be in the textbooks- would not take that many pages.
Am I missing something? I found a proof of Higman's Lemma pretty quickly. It uses Dickson's Lemma ...
This is a good concise proof, but only of the binary case of Higman's result. It resembles some sort of hybrid between Higman's proof and mine (see the link in my previous comment). Dickson's Lemma is essentially a restatement of the fact that
(N^k, componentwise-domination)
is a wqo, and that part resembles Higman's proof. The question of whether SUBSEQ(L) has an excluded string is equivalent to that of whether strings in L have unbounded 0-1 alternation. I generalize this idea to prove the general case for a k-ary alphabet.
(Higman's full proof uses the fact that (Sigma*, subseq) is a wqo, for any finite alphabet Sigma. Once this is established, the rest of the proof is easy and straightforward.)
By the way, I wasn't deliberately trying to avoid wqo's or Dickson's Lemma in my own proof. I just didn't know about them at the time (although I knew I was reproving a known result).
Finally, I can imagine a scenario where Higman's result is useful: a language L may be obviously closed downward under the subseq relation, but not obviously regular. Downward closure means L = SUBSEQ(L), so by Higman's result L is regular. For example,
L = {w in {a,b,c}* | w has at most 5 occurrences of a followed by b, and at most 3 of them have c in between}
Of course, what is obvious and not obvious is in the eye of the beholder.
OK, thanks. I actually just didn't notice that Bill had actually stated a k-ary generalization of what I proved (typical CS lacuna--expecting that binary alphabets always capture the essential complexity). I'll think about k > 2.
Let G be a gate-set; let F(G) be the functions computable with G. Form gate-set equivalence classes: [G] = {G': F(G') = F(G)}.
Partial-order these classes: [G1] <= [G2] if F(G2) contains F(G1). (not just quasi- because it's antisym.)
Suppose it turns out to be a well-partial-ordering; then looking at the set of equivalence classes of the maximal non-universal gate sets, they form an antichain. So there must be only finitely many of them.
This falls short of the result you quote, because one of these function classes might have infinitely many maximal basis gate-sets. Still, it's part-way there.
This theorem appears on page 64 of John Conway's book "Regular Algebra and Finite Machines". I am not an expert in this area but it seems to me that this text on finite automata contains lots of material which is fundamental to automata theory but has not been explored since the book was written in the early 1970s. This is despite the book being cited in many papers in the computer science literature. I guess what I am trying to say is that if you want results that should be better known go and read Conway's little book!
Imagine my surprise when my co-authors (Cortes and Mohri, "Learning Linearly Separable Languages") pointed out that another paper in ALT06 -- Fenner and Gasarch, "The Complexity of Learning SUBSEQ(A)" -- was using Higman's result. Imagine Steve's and my surprise when it turned out that a third paper at that same conference was also using Higman's theorem: de Brecht and Yamamoto, "Mind Change Complexity of Inferring Unbounded Unions of Pattern Languages From Positive Data". Perhaps the time was ripe for this obscure result to start getting mileage... or perhaps it wasn't that obscure to begin with!
There IS a recent text-book that contains Higman's Lemma together with a short proof of it (one page): the book of Reinhard Diestel, "Graph Theory". It is stated there in the context of the graph minor theorem.