S and Statistics Research at Bell Labs

The characteristics, and much of the success, of the S system reflect the environment in which it grew: the research area of Bell Labs. The combination of direct involvement with challenging practical problems and the freedom to try fundamental new ideas has influenced S in many ways.

The key is to be aware of what is important for users, but at the same time to have the chance to go beyond a simple response to immediate user demands, looking for an approach that will satisfy both immediate and longer-term needs. This largely describes the attitude to research in statistics at Bell Labs, over the whole period of S's evolution.

Examples abound, from the beginning up to the present day. At the very first discussions, we knew we wanted an interactive, high-level system that would be easy to use. At the same time, we realized that the system had to make available the work-horses of statistical data analysis at the time; in particular, the Fortran software for graphical displays and basic numerical computations that we and our clients relied on. This led immediately to the concept of a system interface from S to, initially, Fortran and later to other languages and systems.

Building interfaces into the system was a novel approach for its time, and essential to flexibility. We are still benefitting from the concept: recent joint research on Java-based software is being made available via an interface between Java and S.

The intimate feedback loop between research and challenging applications has also worked outward from S, by stimulating the design of specialized systems for many kinds of application. Because S intends users to develop substantial software projects, it encourages important applied packages to be developed, often by gradual refinement of an initially modest piece of software.

A notable example is the S-Wafers system developed by Mark Hansen and David James. This software provides visualization and analysis for integrated circuit manufacturing applications. The contributions have been cited by those involved as key to integrating the global efforts to improve manufacturing quality at Lucent in this field.

S contributes to software such as this by providing an environment in which the design needed for the particular application can be set out with flexibility and clarity, and connected to the essential data and computations.

It has been commented on several occasions (by statisticians at major university departments) that S might not have been possible in a standard university environment. I think this is at least plausible, partly because of the role of statistical computing in the academic statistics world, and partly because a special kind of environment is essential to the success of projects of this sort.

Management attitude in the early stages of work on S was encouraging but ambivalent, for quite understandable reasons. The notion that serious data analysis could be done from an interactive, high-level language took quite some time to be accepted. A feeling prevailed that real statistical computing was done in Fortran. At the same time, Bell Labs researchers were well known for pushing hard on the ideas they believed in; stubbornness was expected. The result was encouragement to push ahead, with some helpful pressure to make the results useful, but with a minimum of pre-judgement.

John Chambers <jmc@research.bell-labs.com>
Last modified: Tue Mar 30 10:15:08 EST 1999