Inconsistent Versions of Classes and Generic Functions

Potential Problems

Languages such as S, in which functions and other parts of the language are separate, first-class objects, have many nice features in terms of ease of programming and use, compared to conventional compiled languages. There are some disadvantages that sometimes bite programmers, however. Because function objects are separate and interpreted, not compiled all together, if function f calls function g and the definition of g changes, f may no longer work as intended. (Parenthetically, similar problems start to arise even with compiled languages as soon as techniques such as dynamic, incremental loading are used.)

These problems can be particularly subtle when we move on to the definition of classes and of generic functions. If a class extends another class, then it depends on the definition of that second class not changing. And, if methods are defined for a generic function on more than one library, the libraries can become inconsistent with each other if, for example, the argument list of the function changes on one of those libraries.

Some new checks have been added to the S evaluator to detect situations that may cause problems and issue warnings to users. In addition, there is a new function that you can use to make changes to generic functions more safely. These changes were made to S some time after the book was written, so you need to check whether they exist in the version of S/S-Plus you are using (e.g., by evaluating exists("redefineGeneric")).
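A minimal sketch of that availability check follows. The variable name haveRedefine and the messages are only illustrative; exists() is the only call taken from the text.

```r
# TRUE if a function (or any object) named redefineGeneric can be found
# on the current search list; FALSE in versions that predate it.
haveRedefine = exists("redefineGeneric")

if(haveRedefine) {
    cat("redefineGeneric is available\n")
} else {
    cat("redefineGeneric is not available in this version\n")
}
```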

Here are the main problems that will be detected. First, for classes. When a class definition is first read from a library, the evaluator looks to see what classes that class extends. If the extension is defined the usual way, by including the representation of the second class, then some slots in the first class correspond to slots in the second class. The evaluator verifies that the two classes agree on the number and the class of these slots. So if a slot in the second class has changed definition, you will see a warning message of the form:

> x = new("bar")
Warning messages:
  Class "bar" extends class "foo"; slot "b" has class "character" in "bar",
	but class "numeric" in "foo" in: new("bar")

A similar message warns if the number of slots has changed.
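To make the scenario concrete, here is a sketch of class definitions that could lead to such a warning. The slot names a and z and their classes are assumptions made for illustration; only the class names "foo" and "bar" and the slot "b" come from the message above.

```r
# Hypothetical original definition of the included class.
setClass("foo", representation(a = "numeric", b = "character"))

# "bar" extends "foo" by including its representation, so slots a and b
# of "bar" correspond to the slots "foo" had when "bar" was defined.
setClass("bar", representation("foo", z = "logical"))

x = new("bar")
x@b = "some text"    # valid: slot b is "character" in both classes

# If the owner of "foo" later changes slot b to "numeric", the stored
# definition of "bar" still says "character", and the two definitions
# no longer agree on the class of slot b.
```

The inconsistency arises only after "foo" is redefined elsewhere, which is why the redefinition appears here as a comment rather than as code.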

What to do when an included class has changed depends on circumstances. The most common solution would be to define a new version of class "bar", to conform to the change in "foo". Alternatively, if the old definition was really needed, "foo" could be replaced as a contained class by another class, using the old definition of "foo". Either way, it would be good to use class version control (Programming with Data, section 7.4) on "bar", in case existing objects need fixing.

Another warning is issued if there seems to be no definition at all for an included class, since in this case we can't verify consistency and, more seriously perhaps, other software may fail that depends on the included class.

Warning messages:
  Class "bar" extends a class "foo" that is not defined (use unsetClass("bar")
	when it is)

As the message suggests, if the class is defined on another library, you should attach that library, and then call unsetClass to make sure the definition of class "bar" is read again, and verified.

Turning to generic functions, the problems now arise when a different generic function corresponds to the same function name on different libraries. This can happen just because two people use the same name for entirely different purposes. Somewhat more likely, it arises because I have defined some new methods for a function and the owner of the function has then changed the function definition. Notice that it is the function, not its methods, we're talking about. The two likely changes are: changing the argument list to add, rename, or reorder arguments; and changing from a standard generic to one that does some special computations. The first case is more likely and unfortunately harder to deal with.

In any case, the S evaluator verifies heuristically that the generic function definition for a particular name is identical on all libraries that contribute methods for this function. If not, you will see a warning message that indicates the nature of the problem. For example:

Warning messages:
  inconsistent function definitions for  "f" in libraries "." and "other":
	different argument names

Similarly, if the argument list has a different length, you will be warned. If the arguments stay the same, but one version is not a standard generic or if there are other differences in the body of the function, you may see the message:

Warning messages:
  inconsistent function definitions for  "f" in libraries "." and "other":
	different function body

For efficiency reasons, the check is only heuristic and not of complete object identity for the functions, but most changes are likely to show up.

Again, there is no fully general solution. You may just choose to redefine the generic consistently on your own library. If there is a real conflict, you may need to rename one of the functions.

Changing the Generic Function's Definition

When you specify a method for the first time for a particular function, that function is turned into a generic function. If the function existed before as an ordinary function (on the same database, preferably), that previous definition becomes the default method. The function itself from then on does nothing but call standardGeneric to tell the S evaluator to select and evaluate a method corresponding to the actual arguments in the call (Programming with Data, section 8.3).
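As a small illustration of this behavior, the sketch below starts from an ordinary function and defines a first method for it. The function describe and the class "point" are hypothetical names chosen for the example.

```r
# A hypothetical ordinary function; it will become the default method.
describe = function(object) "an object"

# A hypothetical class to define a method for.
setClass("point", representation(x = "numeric", y = "numeric"))

# Defining the first method turns describe into a generic function.
setMethod("describe", "point",
    function(object) "a point")

describe(1)                             # uses the default method
describe(new("point", x = 1, y = 2))    # uses the "point" method
```

After the call to setMethod, the function object named describe only dispatches; the original body lives on as the default method.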

How that works is usually of no importance so far as defining methods goes. There are two cases, though, when you may want to change the generic function itself, not just some of its methods. One case occurs when you decide that the generic should do some other computation besides dispatching methods. The other case occurs when you want to change something about the list of formal arguments in the generic. Changing the generic function is a little tricky, especially in the second case, if methods already exist.

This note describes a utility function to redefine a generic definition. The function was written after the book was completed; if your version of S or S-Plus doesn't have it, some source will be provided below. However, you can avoid using the function, and probably should, if you can collect all the source for the function and all existing methods into one source file. If you can, just call removeGeneric to remove the existing function and all its methods, and then source in the new definitions. Remember, though, that you need to redefine the generic on all libraries that contain methods for this function.

If you want to redefine the function but leave existing methods alone, call the new function, redefineGeneric. Again, this must be applied to all the libraries that have methods for the function. By default, redefineGeneric will find all such libraries on the current search list and apply the changes to each of them. You give redefineGeneric the name of the generic and the new definition. For an example, suppose we have a function, appendFrom, that takes an object and a connection, and appends the data on the connection to the object. It would be natural for this function to evolve into a generic, with methods suitable for appending to some important classes of objects in our applications. After some methods have been defined, the actual function definition for appendFrom would be:

  function(object, connection)
    standardGeneric("appendFrom")

If things get complicated, this is the sort of function that needs to be careful about opening the connection according to the rules (compare readWithFormula on page 386 of the book). But in this case, each method may need to take that care, and we could end up with lots of extra code, not to mention occasionally forgetting. Here's an example where we may choose to extend the definition of the generic itself. The issue of opening and closing the connection applies to any method (assuming we're not proposing different methods for different connection classes!). Therefore, it could reasonably be handled in the generic itself, relieving the individual methods of the responsibility.

At any rate, let's assume we decide to make this change. It's done by giving redefineGeneric a new definition along the lines of

    function(object, connection) {
      if(!isOpen(connection)) {
        connection = open(connection); on.exit(close(connection))
      }
      standardGeneric("appendFrom")
    }

The situation when we want to change the argument list is touchier. Since all methods must agree with the generic in their argument list, the methods must change too. Indeed, it's unlikely that you could change the argument list in general without making some substantive changes in the existing methods. For that reason, redefineGeneric will ask you whether to proceed with the redefinition on any library with different arguments. You probably should be dumping and redefining all the methods.

But sometimes you may reasonably believe that the bodies of all the existing methods still make sense; for example, if the change is to add an optional argument ignored by the existing methods, or to change the order of the arguments. If you do go ahead in redefineGeneric, all the methods will then be redefined as well. This is done conservatively by turning the existing method definition into an internal function with the old argument list. The new method then calls that function. In most cases, you still probably need to examine the new methods to see that they do make sense.
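The shape of such a conservatively redefined method can be sketched as follows. This is not the actual output of redefineGeneric; the names newMethod, oldMethod, and verbose, and the stand-in body, are assumptions made for illustration.

```r
# Assumed situation: the generic's arguments have changed from
# (object, connection) to (object, connection, verbose = FALSE).
newMethod = function(object, connection, verbose = FALSE) {
    # The old method definition, kept unchanged as an internal
    # function with the old argument list.
    oldMethod = function(object, connection) {
        # Stand-in for the real method body.
        list(object = object, connection = connection)
    }
    # The new method simply calls the old one; the added optional
    # argument is ignored.
    oldMethod(object, connection)
}

newMethod(1, 2, verbose = TRUE)
```

Because the old body never sees the new argument, the wrapper is safe but may not do what the new generic intends, which is why the resulting methods still deserve a look.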

Why is this special care needed, as opposed to just reassigning the relevant function? Aside from the case that arguments, and therefore methods, have to change, the main reason is that the generic definition needs to be kept consistent between the ordinary function object and (all) the metadata objects containing methods for this function. The generic function should exist as an ordinary function object with the same name, so that it can be passed along as an argument to other functions. In addition, the evaluator keeps the generic function definition in the metadata, so that in principle any library with methods for the function has a fully usable definition of the function. The redefineGeneric utility exists mainly to maintain this consistency when the function definition changes. If you don't have this function because your version of S or S-Plus was made before the function was written, you can pick up source for it here.

John Chambers
Last modified: Thu Oct 8 17:02:38 EDT 1998