Saturday, October 29, 2011

What is Covariance and Contravariance Anyways?

If you have been following the talk around Google's latest language, Dart, then you might have heard things like "covariant arrays are widely regarded as a mistake" (src). But what is covariance? Why are covariant arrays considered a mistake? Who has covariant arrays? Why?

What is Covariance and Contravariance All About?

class X:
    pass

class Y(X):
    def zoom(self):
        print 'Y.zoom'

class A:
    # Returns a value of type Y
    def foo(self):
        return Y()

class B(A):
    # Returns a value of type X
    def foo(self):
        return X()


# Takes a value of type A
def bar(a):
    y = a.foo()
    y.zoom()


# Call bar with a B
bar(B())
Is this code safe? Often, in an OO language, inheritance is used to create a type that looks just like another type, and pass values of this new type to functions that expect the other type and without breaking anything. So one might expect, looking at this code, that passing a B to bar would be safe. But it isn't. The problem is B.foo returns an X where A.foo returns a Y. bar then goes on to try to call a member function that only exists in Y, so the code will fail at runtime. B has violated the interface that A established. Python doesn't give us any tools to discover this error without executing the code, but statically typed languages like C++ and Java do. These languages can define the types that member functions of a subclass can take, and return, in order to be valid through covariant and contravariant properties. The concepts are larger than member functions but it's a good place to start.

Definitions

Before we can talk about covariance and contravariance we need to know what subtyping is. I'll restrict this to classes in an object-orientated language because that's what my knowledge is limited to, but these concepts go beyond that. An example of a subtype in an OO language would be a class that extends or inherits from another class, for example in C++ you would do class S : public T for S to extend T. The common way type theorists write this is S <: T. This can be read in a few ways: "any term of type S can safely be used in a context where a term of type T is expected" or "every value described by S is also described by T" (Pierce, 182). Thus, S is a more specific type than T, and conversely T is a more generic type than S. Given this, covariance and contravariance define when a more specific or more generic type is acceptable in a particular context.

The definition from the wiki article:

covariance and contravariance refers to the ordering of types from narrower to wider and their interchangeability or equivalence in certain situations
And their definitions:
  • Covariant - Given the types S and T specified above, covariance is when a more specific type, S, can be used when a more generic type, T, is specified. This applies to functions, a function that returns S can be used in the same context as a function that returns T.
  • Contravariant - When the more generic type, T, can be used where the more specific type, S, is specified. A function that takes a T can be used in the same context as a function that takes a S.
  • Invariant - The type specified is the only type that can be used.
My method for remembering the distinction is contravariant is larger than covariant, so it means narrower to wider.

Covariant Example

Consider this example from C++:

class X {};
class Y : public X {};
class Z : public Y {};

class A {
public:
  virtual Y *foo() { return new Y(); }
};

class B : public A {
public:
  virtual Z *foo() { return new Z(); }
};
Here we have three classes X, Y, and Z which we will return from a virtual function in classes A and B. This code is valid because B::foo is returning a narrower type than A::foo because Z is a subtype of Y. But what happens if we make B::foo return a wider type?
class X {};
class Y : public X {};
class Z : public Y {};

class A {
public:
  virtual Y *foo() { return new Y(); }
};

class B : public A {
public:
  virtual X *foo() { return new X(); }
};
$ g++ -W -Wall -ansi -pedantic -c foo.cc foo.cc:12: error: invalid covariant return type for ‘virtual X* B::foo()' foo.cc:7: error: overriding ‘virtual Y* A::foo()’

Contravariant Example

As we saw above, one can be covariant on return types, but what about on parameters to a member function? One has to be the opposite actually: contravariant. Unfortunately, gcc (or the version I'm using at least) doesn't give us an error about contravariance, instead we have to add a little bit of code using B::foo to see the problem:

class X {};
class Y : public X {};
class Z : public Y {};

class A {
public:
  virtual void foo(Y &y) { }
};

class B : public A {
public:
  virtual void foo(X &x) { }
};


int main() {
  B b;
  Y y;
  Z z;
  
  b.foo(y);
  b.foo(z);
  return 0;
}
This behaves as we would expect. We can call B::foo with an X, Y, or Z, since B::foo takes the superclass to all of them (X). But what if we modify it to take Z:
class X {};
class Y : public X {};
class Z : public Y {};

class A {
public:
  virtual void foo(Y &y) { }
};

class B : public A {
public:
  virtual void foo(Z &z) { }
};


int main() {
  B b;
  Y y;
  Z z;
  
  b.foo(y);
  b.foo(z);
  return 0;
}
$ g++ -W -Wall -ansi -pedantic foo.cc foo.cc: In function ‘int main()’: foo.cc:24: error: no matching function for call to ‘B::foo(Y&)’ foo.cc:14: note: candidates are: virtual void B::foo(Z&)
The error we get is about B::foo(Y&) not existing. Because Z is a narrower type than Y, we have broken the interface of A, so B is no longer a valid substitute for A.

But You Already Knew This

Intuitively, this is the way that makes the most sense. The use case for inheritance is generally when a function takes a certain type as input but you want to pass your own type. In order to ensure that the function, bar, can successfully use the type you pass it, all of the methods that bar calls on your type need to take types that bar can pass to it and return types that bar knows how to use. Take this code:

class X {};
class Y : public X { /* ... */ };
class Z : public Y { /* ... */ };

class A {
public:
  virtual Y *foo(Y &y) { /* ... */ }
};

void bar(A &a) {
  Y y;
  Y *y_ptr = a.foo(y);
}
If you were to pass your own type, B, which is a subtype of A (that is, B <: A) the only way we can implement B::foo such that our function bar still works is if B::foo takes an object as input that is a supertype to Y and returns a type that is a subtype of Y. Consider if it were valid for B::foo to want a Z. B::foo could try to call a member function that exists only in Z. But since bar is passing an object of type X, which we can't guarantee has member function that exists only in Z, our code would be unsafe. A similar argument can be made with the return type of Y and X.

What About Arrays?

Now that we get what these terms mean, how do they apply when it comes to arrays of objects? In C++, arrays are not covariant, however in Java they are:

public class Test {
    public static void foo() {
        Test[] tests = new Test[10];
        Object[] objects = tests;
    }
}
This code looks innocent enough, but there is a problem: what if I modify objects? For example:
public class Test {
    public static void foo() {
        Test[] tests = new Test[10];
        Object[] objects = tests;
        objects[0] = new Integer(1);
    }
}
This code compiles fine, but executing it gives the following error (changed foo to main in order to run it):
Exception in thread "main" java.lang.ArrayStoreException: java.lang.Integer at Test.main(Test.java:5)
This is what people mean when they refer to covariant arrays being a mistake. A compile-time check has been moved to runtime, which means you don't know if your code is safe without running it. Note, though, that what makes covariant arrays a problem in Java is modifying the array, not reading it. If we could somehow ensure that the objects array was immutable, covariant arrays in Java would be safe. Why does Java have covariant arrays then? Originally Java did not have parametric polymorphism (generics), without covariant arrays it would be impossible to write a generic function that copies one array to another. Instead one would have to write a copy function for every type of array they wish to copy (Pierce, 188). While the behavior is not ideal, at the least the JVM can provide some level of safety by performing type checks at runtime.

Unfortunately, I cannot shed any light on why Dart has covariant arrays. It has done everything from befuddle to enrage type theorists (null pointers, in 2011, why?!). Dart appears to support parametric polymorphism, the lack of which lead to Java having covariant arrays. It is a shame that Dart seems to be repeating, what some, would classify as mistakes for no obvious benefit. Dart is not the only one sinning though, C# also has covariant arrays and seems to have simply copied them from Java.

Conclusion

I'm no type theorist, this post went through several iterations with comments and corrections from friends smarter than I, so this post only scratches the surface of covariance and contravariance. Hopefully it is enough to understand what people are referring to when the terms come up in casual Reddit conversation. If you're interested in more, Benjamin Pierce's book, Types and Programming Languages, has an entire chapter dedicated to subtyping. While I haven't finished the book yet, what I have read is excellent. Even if you skip the math that looks complicated you will come out better.

Thanks

Special thanks to @dklee. Much of this post is based off of emails we have exchanged. Thanks to @j2labs, and @apgwoz, as well, for help in making this post.

3 comments:

  1. C# has done covariance/contravariance in a saner way for generic type parameters in its latest release (we're still stuck with unsafe covariant arrays though from 1.0), so that all of the framework covariant types are read-only, so you can't get into the situations you describe, yet can still do the types of casts that "should just work".

    http://blogs.msdn.com/b/csharpfaq/archive/2010/02/16/covariance-and-contravariance-faq.aspx is a good article for describing how this works in C#

    ReplyDelete
  2. @Paul - Thank you for the link!

    ReplyDelete
  3. @Matteo - My wording was poor, what I was trying to say was that gcc doesn't give us a nice error saying our code violates contravariance, like it does for the covariant example. The example C++ code I give shows that it is overriding not overloading, otherwise b.foo(y) would work (since it would just be a matter of finding the correct overloaded function). But it doesn't. You can still access the correct foo statically, but then we are going around polymorphism, which defeats the purpose of being co/contravariant.

    ReplyDelete