The S System

S is a language and system for organizing, visualizing, and analyzing data. It has been a project of statistics research at Bell Labs since 1976, evolving continually through that time. In 1998, S became the first statistical system to receive the Software System Award, the top software award from the ACM.

This page is a brief author's-eye view of the system, with pointers to other sources of information.

S has from the start been aimed at programming with data; that is, at describing to the computer some graphical view, numerical summary, statistical model, or other information you want to produce. It occupies a middle ground between packages that emphasize standard operations and research projects in language design that start from a more abstract goal. S has always been designed to be used in practice, but with an emphasis on users who want to turn new ideas into software.

Although S was invented at Bell Labs, and we remain very much involved in its evolution, the implementations actually available, S-Plus and R, are distinct from the S language itself.

S-Plus products are distributed by the MathSoft Corporation. The S-Plus language is based on the S software from Bell Labs; MathSoft holds an exclusive license from Lucent Technologies to distribute software based on S from Bell Labs.

R is an open-source system, distributed under the GPL, that is sometimes described as a "free clone" of S. More accurately, it is a separate project, based on the S language but pursuing a number of additional software directions.

Finally, we should mention the Omegahat software. Like R, it is a joint, open-source project for statistical computing. In part, it is concerned with next-generation software, but getting there from the current-generation software is also part of Omegahat. In particular, it provides a number of inter-system interfaces from the S language (S-Plus or R), and some tools for programming in the S language.


More Information

There are many books and articles on S (and countless articles that include S-based graphics), as well as information available on the web or by e-mail. Here are a few pointers that may be helpful.
 o Programming with Data (John M. Chambers, Springer, 1998; 3rd printing, revised, 2000)
A book describing the current version of S, the basis for S-Plus Version 5.
 o The S-News Mailing List
An e-mail list for discussions about S and S-Plus. This can be very valuable, especially for intermediate users. Fortunately, it is strictly a users' forum, not run by Bell Labs or MathSoft. Anything lost in having to read occasional flames is more than repaid by the practical information shared.
 o The R-help Mailing List
This is a list for R, similar to S-News. There are in fact several R mailing lists; r-help is the user-level list, suitable for questions about using R and for various related (and occasionally not-so-related) topics. It is a very helpful list.

There are a number of books about the S language and its implementations. In particular, for detailed information I would recommend:

 o Modern Applied Statistics with S-Plus (3rd ed., Springer, 1999) and S Programming (Springer, 2000), both by W. N. Venables and B. D. Ripley
The first covers statistical methods and the use of S/S-Plus to apply them. For readers who have at least some introduction to both the system and statistics, and who want to apply statistical techniques to their data, this is probably the best single source, with valuable pointers to related libraries and other software. The second covers programming in S or R, largely for the earlier versions of S rather than the version described in Programming with Data.


John Chambers
Last modified: Mon Jan 1 18:30:10 EST 2001