A list that has a special marker at its end is a general concept. The-empty-list...

mercurial · on March 13, 2013

Either today's tea was not strong enough, or you're answering the wrong person.

dschiptsov · on March 13, 2013

Yeah, I should change a tea-shop.)

In my opinion, the single-linked-list data structure as a foundation of Lisp was selected not by accident. It is also most basic and natural representation of many natural concepts - a chain, a list. You could ask for the next element, and find out that there is no more. Simple and natural. Because of its simplicity the code is also simple.

Such kind of lists should be heterogeneous, because when all the elements are of the same type, it is more natural to represent it as an array - an ordered sequence. As far as I know, Python's lists actually are dynamic arrays.

The sequences of the elements of the same type (same storage size and encoding) with a marker at the end could be also viewed as a homogeneous lists. C strings processed as a list of characters is canonical example.

Now consider a UTF-8 encoding. It is a variable-length encoding. UTF-8 string is not an array, and because you cannot tell the boundaries between runes while reading, is not a list. But, nevertheless it could be considered and processed as a stream, until EOL marker is reached. This is why it was invented in Bell Labs to keep things as simple as possible.

Now, you see, the concept of a homogeneous list from math is not enough for CS, and sometimes it is much better to have it fuzzy. What is a list is a matter of a point of view.

I think I will keep my dealer.)

mercurial · on March 13, 2013

> In my opinion, the single-linked-list data structure as a foundation of Lisp was selected not by accident. It is also most basic and natural representation of many natural concepts - a chain, a list. You could ask for the next element, and find out that there is no more. Simple and natural. Because of its simplicity the code is also simple.

Yes, that's a recursive data structure well-suited to functional languages, same as Haskell lists.

> Such kind of lists should be heterogeneous, because when all the elements are of the same type, it is more natural to represent it as an array - an ordered sequence.

From a memory point of view, maybe, but that's really an implementation detail. If you take, eg, Perl arrays, which you can access by index, push, shift, unshift, and pop, the actual implementation is invisible from the programmer, just as it should be in this sort of language. Java will happily store cats and dogs in an array of Object, for instance.

> Now consider a UTF-8 encoding. It is a variable-length encoding. UTF-8 string is not an array, and because you cannot tell the boundaries between runes while reading, is not a list. But, nevertheless it could be considered and processed as a stream, until EOL marker is reached. This is why it was invented in Bell Labs to keep things as simple as possible.

You could very well store it as a list of bytes. This wouldn't be terribly efficient, but it's perfectly doable. Whether you process it as a stream or you have it stored in whatever array/collection is orthogonal to the fact that your language of choice supports heterogeneous lists. You can do stream processing in Haskell too (with, eg, pipes). You have also access to other data structures which are not lists, but which do expect to be homogenous.

dschiptsov · on March 13, 2013

Yes, it is indeed a stream of bytes, with a zero-byte as an EOF marker. That's why UTF-8 is good-enough.

As for lists, as long as your next element is not always in the next chunk of memory, you need a pointer. A chain of pointers is a single-linked list. This is the core of a Lisp and it is not an accident. Together with type-tagging, you could have your lists heterogeneous, as simple as that.)

This is a part of the beauty and elegance of a Lisp, in my opinion - few selected ideas put together.