List decoding considers codes where we have too many errors in the
code to uniquely decode a
string. These algorithms
creates a short list of all of the
possible decodings.
Last year I had discussed
list decoding in my Favorite Theorems series and said
Guruswami and Sudan
give a polynomial-time algorithm that handles what is believed to
be the best possible amount of error.
Guruswami and Sudan's algorithm works for the standard Reed-Solomon
codes and is likely the best possible for that code. Parvaresh and
Vardy develop a new but related encoding to get an
improvement on the rate of the code, the ratio of the original message
length to the code length. Guruswami and Sudan show how to list
decode a 1-ε fraction of errors using Reed Solomon codes with a rate of
O(ε2) while Parvaresh and Vardy achieve the same errors with
their code with rate O(ε/log(1/ε)).
Perhaps I just don't understand the terminology - what does 1-e errors mean? What is meant by 'rate'? 1-e obviously doesn't refer to an absolute number, does it meant that an e proportion of all values are altered or that a 1-e proportion of all values are altered? Is 'rate' a time-based measure or an amount of error correcting code measure?
I should have said a "1-ε" fraction of errors, i.e., for a code of length m, all but εm entries can be corrupted, where entries are elements of some finite field. The rate refers to the throughput: If a code of length m encodes an input of length n (with the same alphabet) then the rate is n/m.
This was asked last time, but never answered: what is the evidence that Guruswami and Sudan's algorithm is likely the best possible for the Reed-Solomon code?
Any idea what the patent status of this new algorithm will be?
Is that rate the information-theoretic limit? (I suspect that it is, but don't know the appropriate approximations for factorials offhand.)
Any idea if it will be possible to apply techniques which use the fact that errors tend to be next to each other in the real world to this approach?
Does this technique require a lot of random access, or can it do a decent job of avoiding cache misses and disk seeks as well?
It's good to see some theory making practical contributions, a little surprising that the most exciting stuff is information theoretic right now though.
This was asked last time, but never answered: what is the evidence that Guruswami and Sudan's algorithm is likely the best possible for the Reed-Solomon code?
It is not clear if there is any "compelling" evidence. Beyond the radius decoded by their algorithm, we do not know if there will always only be a polynomial number of codewords to output (which is a guarantee we need to be able to decode in polynomial time). It is possible this number shoots to super-polynomial just beyond this radius --- some works have explored this possibility, though the question remains wide open. It could also be that the choice of evaluation points used for the Reed-Solomon encoding matters, and that for a clever choice one could decode from many more errors.