C++ versus Java

I've worked in C++ for 20 years and in Java for nearly 10. Both are proven languages that I and many other engineers have used to produce large, maintainable, successful products. But each has its own warts. I won't review them all here, but I will comment on where I think the standard comparisons go astray.

Memory Management

Some make the analogy to cars with manual versus automatic transmissions. But this is misleading, because the underlying assumption is that the manual transmission is inherently faster or more efficient. This is not necessarily the case with C++. C++ forces me as a developer to spend relatively more time thinking about memory management: for each object, who allocates it, who uses it, who destroys it. Usually there are several ways to arrange this, all of which could work and each of which has different tradeoffs in performance and complexity, so it is a matter of choosing the "best" way.

To complicate matters, other warts of C++ sometimes make some of the approaches impossible. For just one example, you can't have an STL container of an abstract class (a container of an abstract type must hold pointers to the objects instead). Sure, I understand why, but even so this is purely a technical limitation or design deficiency of the C++ language. There's no reason why it shouldn't be allowed, and it would prove useful in some designs. Occasionally one of these warts pops up 75% of the way through implementing an approach, forcing one to backtrack and redesign. Ultimately, in C++ I end up choosing some approach that balances performance and complexity. Yet even when I accept high complexity for the sake of performance, the result is not always faster than Java.

Java's memory management is not only simpler, more consistent, and less prone to leaks, dangling pointers and the like; it also has tools (such as weak references) that can be used to build sophisticated optimizations when necessary. Going back to the car analogy, it's more like an automatic transmission that shifts just as fast as the manual and has nearly the same power transfer efficiency, while providing a wider range of gear ratios.
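As a minimal sketch of the kind of optimization weak references make possible, the class below caches values without keeping them alive. The class name WeakValueCache and its methods are hypothetical, chosen only for illustration; java.lang.ref.WeakReference and java.util.HashMap are the standard library pieces it relies on.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache: an entry does not stop the garbage collector from
// reclaiming a value once nothing else holds a strong reference to it.
class WeakValueCache<K, V> {
    private final Map<K, WeakReference<V>> entries = new HashMap<>();

    void put(K key, V value) {
        entries.put(key, new WeakReference<>(value));
    }

    // Returns the cached value, or null if it was never stored
    // or has already been collected.
    V get(K key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // drop the stale entry
        }
        return value;
    }
}
```

A caller that gets null back simply rebuilds the value and stores it again, so the cache trades a possible recomputation for never pinning memory.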

In short, "rolling your own" the C++ way, dealing with pointers and doing your own memory allocation, is not always faster or more efficient. Even when it is, the difference isn't large or significant, yet it is always more complex, error prone and time consuming.

Multiple Inheritance

When it comes to multiple inheritance, Java allowed the perfect to be the enemy of the good. In order to eliminate the possibility of the diamond problem, its designers needlessly complicated some common design patterns. For example, consider the Visitor pattern. The classes being visited need to implement an "accept" method that hands the object over to the visitor. Normally this is done by making them subclass an "element" or "subject" class. But this requires multiple inheritance (the class is both itself and an "Element"), which Java doesn't support for classes. In Java, the workaround is to make the "element" or "subject" an interface instead of a class. But this forces every element or subject class to declare that it implements the interface and to implement (or delegate) the calls itself. Not only is this clumsy, it also makes the code fragile. Any time the element or subject interface changes, every class that implements it is affected. This is the kind of tight coupling that O-O design is intended to eliminate.
NOTE: one way around this would be for Java to implement automatic/opaque delegation, much like what was called "aggregation" in Microsoft COM. Let the class declare something like "implements Element via ElementImp", which tells the compiler to instantiate an ElementImp object owned by this object and delegate all Element method calls to it. This would still be clumsier than multiple inheritance, but it would at least break that coupling, which would be a huge improvement.
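To make the workaround concrete, here is a rough sketch of the interface approach with the delegation written out by hand, which is roughly what a declaration like "implements Element via ElementImp" would have to generate. Element and ElementImp come from the note above; Visitor, Invoice, CountingVisitor and the method name accept (the usual convention for the element's side of the pattern) are assumptions made for illustration.

```java
interface Visitor {
    void visit(Element element);
}

interface Element {
    void accept(Visitor visitor);
}

// Reusable implementation of the Element behavior.
class ElementImp implements Element {
    private final Element owner;

    ElementImp(Element owner) {
        this.owner = owner;
    }

    @Override
    public void accept(Visitor visitor) {
        visitor.visit(owner); // hand the owning object to the visitor
    }
}

// Hypothetical visited class. It must name the interface and forward
// every Element method itself; this is the boilerplate that automatic
// delegation would remove.
class Invoice implements Element {
    private final Element elementImp = new ElementImp(this);

    @Override
    public void accept(Visitor visitor) {
        elementImp.accept(visitor);
    }
}

// Hypothetical visitor that records what it was shown.
class CountingVisitor implements Visitor {
    int visited = 0;
    Element last = null;

    @Override
    public void visit(Element element) {
        visited++;
        last = element;
    }
}
```

Every method added to Element needs another forwarding method in Invoice and in each other visited class, which is where the fragility comes from.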

The Copy Constructor

If I made a "top 10" list of the most common problems that lie at the root of all the bugs I have worked on in my career, the C++ copy constructor would probably be in the top 5. In C++, the copy constructor exists by default for every class, and is automatically called in a variety of circumstances that programmers often overlook. Consider an object X. Every time you pass it by value, return it, or throw it, the copy constructor may be called. Also, different classes may define different semantics for their copy constructors. These semantics usually make sense in simple isolated cases, but when combined with other objects the resulting behavior is often counterintuitive or even counterproductive. Now consider how many functions are called implicitly through overloaded operators. A single line of code can cascade into deeply nested layers of calls. The copy constructor itself may not be the root of all evil, but the implicit rules governing when it is automatically called make it an accident waiting to happen.

Java doesn't have a copy constructor; due to its memory management and its consistent use of references to objects (passing an object passes a reference to it, never a copy), it rarely needs one. But on those occasions when you do actually need a copy, Java provides a clone mechanism that enables you to do the job.
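A minimal sketch of the clone mechanism, assuming a hypothetical Polygon class that owns an array. The point it tries to show is that copying in Java happens only where the code asks for it, and that the class decides how deep the copy goes.

```java
// Hypothetical class used only to illustrate Cloneable and clone().
class Polygon implements Cloneable {
    private int[] xs;

    Polygon(int[] xs) {
        this.xs = xs.clone(); // keep a private copy of the caller's array
    }

    int x(int index) {
        return xs[index];
    }

    void setX(int index, int value) {
        xs[index] = value;
    }

    @Override
    public Polygon clone() {
        try {
            Polygon copy = (Polygon) super.clone(); // shallow, field-by-field copy
            copy.xs = xs.clone();                   // give the copy its own array
            return copy;
        } catch (CloneNotSupportedException e) {
            // Not expected, because Polygon implements Cloneable.
            throw new AssertionError("Polygon should be cloneable", e);
        }
    }
}
```

Without the second line in clone(), both objects would share one array, so the explicit step is where the class states its copy semantics.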

Polymorphism

Here, C++ and Java are closer than they appear on the surface. In C++, polymorphism (virtual functions, dynamic binding) only works through pointers or references. But in Java, every object is accessed through a reference (in effect a pointer), so there is no distinction.

For example, in C++ suppose you have an STL list< Shape >, and Circle is derived from Shape. Every shape defines a virtual method area() which returns its area. If you fill up the list with shapes of various subclasses and compute the area of each, it will NOT work. The list contains Shape objects, not pointers to Shape objects, so each object added is copied into a plain Shape and its subclass part is lost (this is known as object slicing). Virtual functions dispatch to a subclass only through a pointer or reference to the original object. Thus the area() method that is invoked will always be Shape::area(), regardless of the type of the object that was added.

NOTE 1: cleanup is affected in the same way. When the list removes objects, only Shape::~Shape() is called, regardless of the type of the object that was added, even if the destructors are declared virtual (as they always should be). Any cleanup that depends on the subclass destructor does not happen for the list's copies.
NOTE 2: when you add objects, the stored copies may be invalid. That's because adding fires off the wrong copy constructor, Shape::Shape(const Shape&), regardless of the particular subclass copy constructor.

In Java, every object is accessed through a reference/pointer. When you make a List< Shape >, it is always a list of pointers/references to Shape objects. And when you call area(), it dynamically binds to the particular subclass implementation. You simply can't make a list of actual objects; you can only use references/pointers to them.

This is likely A GOOD THING, as a container of actual objects is usually A BAD THING.
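The Java side of this can be sketched as follows. Shape, Circle and area() are taken from the discussion above; Rectangle and the totalArea helper are illustrative names that are assumptions of this sketch.

```java
import java.util.List;

abstract class Shape {
    abstract double area();
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}

class Shapes {
    // The list holds references, so each call to area() binds
    // to the implementation of the object's actual class.
    static double totalArea(List<Shape> shapes) {
        double total = 0.0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

Because the list never holds the objects themselves, nothing is copied or truncated when a Circle is added to it.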

Templates and Generics

Here C++ has the better approach. Java's generics are clumsier because the "T" in a generic declaration is just a syntactic placeholder: the compiler checks it and then erases it, so it has no real meaning at run time. For example, suppose you have:
public < T > void func(Collection< T > data)
This function cannot get the actual class of T. That is, it can't say "new T()", "T.class", "T.getClass()" or anything like this (it can call getClass() on an element of data, but not if data is null or empty). So if this function needs to create a T, or if it needs the class of T, then T's class must be passed to it:
public < T > void func(Collection< T > data, Class< T > tClass)
Of course if you want to be more specific, you can say:
< T extends Foo > instead of < T >
or variations thereof, such as the wildcard Collection< ? extends Foo >. In short, Java's generics are really just a way to:

  • eliminate type casting to and from containers
  • document in the source code what containers are intended to hold

Java generics are much more limited than C++ templates.
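A small sketch of the class-token workaround described above. The method name addNew and the class GenericsDemo are hypothetical, and the sketch assumes T has an accessible no-argument constructor.

```java
import java.util.Collection;

class GenericsDemo {
    // T is erased at run time, so the caller has to supply T's class.
    // Assumes tClass has an accessible no-argument constructor.
    static <T> T addNew(Collection<T> data, Class<T> tClass) {
        try {
            T created = tClass.getDeclaredConstructor().newInstance();
            data.add(created);
            return created;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + tClass.getName(), e);
        }
    }
}
```

A call then looks like GenericsDemo.addNew(list, StringBuilder.class): the type appears twice, once for the compiler and once for the run time.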

Mutual Class Dependency

In C++, each class is normally declared in a header file, and cycles of inclusion are not allowed. If A uses B, then A.H normally must include B.H. But this means B.H cannot include A.H, else a cycle would exist. In header files C++ programmers usually say:

    #ifndef THISFILENAME
    #define THISFILENAME
    ... code for this file
    #endif

This eliminates all kinds of errors that happen when a header file is unavoidably #included multiple times due to being part of a nested set of source files. But it doesn't solve the mutual dependency problem. It is just a workaround for a C++ language limitation.

In C++, instead of B.H including A.H, B.H can use a forward declaration for A. But a forward declaration is very limited: B.H sees A only as an incomplete type, so it can declare pointers and references to A but cannot use A's members or hold an A by value.

In Java, two classes can fully use each other so long as they are compiled together. They don't necessarily have to be in the same package. Of course the basic logical limitation remains. That is, if the build compiles the module containing A separately from (either before or after) the module containing B, then they cannot use each other. But this would be a mutual dependency at the logical build level, which is an entirely different thing that nobody should ever have.
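As an illustration of two Java classes that fully use each other, consider the sketch below. Author and Book are hypothetical classes; each one calls the other's constructor or methods, with no forward declarations or include ordering to arrange.

```java
import java.util.ArrayList;
import java.util.List;

class Author {
    private final String name;
    private final List<Book> books = new ArrayList<>();

    Author(String name) {
        this.name = name;
    }

    // Author creates Book objects and calls Book's constructor.
    Book write(String title) {
        Book book = new Book(title, this);
        books.add(book);
        return book;
    }

    String name() {
        return name;
    }

    int bookCount() {
        return books.size();
    }
}

class Book {
    private final String title;
    private final Author author;

    Book(String title, Author author) {
        this.title = title;
        this.author = author;
    }

    // Book in turn calls a method on Author.
    String describe() {
        return title + " by " + author.name();
    }
}
```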

Exceptions

Here Java has the edge. Its exception handling is modeled after C++ but has several nice improvements.

  • Java provides a convenient set of built-in exception classes that carry messages and stack traces. This provides consistency and reduces the amount of code you have to write yourself.
  • Java has a finally clause, which is guaranteed to run whether or not an exception was thrown, and whether or not it was caught. A try...catch...finally block makes exception handling easier to read, understand and maintain.
  • Java always has access to the exception object. In C++, if you use "catch(...)" you can't access the exception object. You can rethrow it, but you're doing it blind because you don't know what it is. This is dangerous, as throwing anything by value can fire its copy constructor, which could do anything.
  • On top of this, C++ has limitations about throwing exceptions in destructors. More on that below.
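The behavior of the finally clause can be sketched with a small helper that records which blocks ran. FinallyDemo and its run method are hypothetical names; the exception classes are the standard built-in ones.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    // Runs the task and records, in order, which blocks executed.
    static List<String> run(Runnable task) {
        List<String> log = new ArrayList<>();
        try {
            try {
                task.run();
                log.add("completed");
            } catch (IllegalArgumentException e) {
                // The exception object is always available here.
                log.add("caught: " + e.getMessage());
            } finally {
                log.add("finally");
            }
        } catch (RuntimeException e) {
            // Anything the inner catch did not handle arrives here,
            // after the inner finally has already run.
            log.add("escaped: " + e.getClass().getSimpleName());
        }
        return log;
    }
}
```

In all three cases (no exception, a caught exception, an exception that escapes the inner block) the "finally" entry is recorded.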

Threads

Java's thread handling is far superior to that of C++. That wasn't hard to accomplish, since for most of its history threading was not part of the C++ language at all (a standard thread library arrived only with C++11). Without it, if you need to use threads in C++ you are forced to use whatever libraries are provided by the platform you are working on, which are often proprietary, non-standard, and vary widely from one platform to the next. Writing clean multithreaded code is hard enough as it is. Doing it in C++ only adds to the frustration. Java's built-in support for threads is simple, well designed, fully featured, performant, and works the same way on all platforms.
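A minimal sketch of Java's built-in threading, using only java.lang.Thread and the synchronized keyword. Counter, ThreadDemo and countInParallel are hypothetical names introduced for this example.

```java
// Hypothetical shared counter; synchronized makes each call atomic.
class Counter {
    private int value = 0;

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }
}

class ThreadDemo {
    // Starts threadCount threads that each increment a shared counter,
    // waits for all of them, and returns the final count.
    static int countInParallel(int threadCount, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for this worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }
}
```

The same source compiles and behaves the same way on any platform with a Java runtime, with no platform-specific thread library involved.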

Technical Details

Both C++ and Java have their small low level technical warts, but C++ has far more. C++ is more technically cumbersome than Java. Some examples:

  • static class members: in C++, a static member in the class definition (.H file) is only a declaration, not a definition. Static members must be defined in the class implementation (.CPP file). This is inconsistent with non-static members and cumbersome. In Java, static members are treated just like any other, and static initializations are done with "static" blocks, which are easy to read and maintain.
  • arrays: C++ arrays decay to pointers as soon as you pass them around, and the resulting pointer doesn't know how big the array is. This is a real PITA, as you have to pass around the size separately or keep a special element at the end, both of which are fragile and kludgy. In Java, arrays know their size and type, which makes them self-contained and easy to pass around.
  • destructors: C++ destructors are fragile. First, you had better make all your destructors virtual, else the wrong destructor may fire when objects are destroyed through a base class pointer. Also, any exception that escapes a C++ destructor will nuke your entire process if it happens while another exception is being thrown, and one must assume that is the case, since determining whether it is is generally impossible or too complex.

I could go on, but I think the point is clear.
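The Java side of the static member and array points above can be sketched in a few lines. The Squares class, its TABLE field and the sum method are hypothetical names used only for illustration.

```java
class Squares {
    // Declared and defined in one place; no separate definition file.
    static final int[] TABLE;

    // Static initialization block: runs once, when the class is initialized.
    static {
        TABLE = new int[5];
        for (int i = 0; i < TABLE.length; i++) {
            TABLE[i] = i * i;
        }
    }

    // The array carries its own length, so no size parameter is needed.
    static int sum(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }
}
```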
Now to list some of my pet peeves with Java:

  • Primitive types (int, float, etc.) can be clumsy to work with. Their relationship to classes has plenty of rough spots. Also, the lack of unsigned char/int/long can be a PITA.
  • Java generics are too limited (implemented by compiler erasure) and should be expanded into a true type-aware template system like that of C++.
  • Java's String class feels weird, and the rest of the class library makes kludgy assumptions about it. It should be refactored for a more seamless integration with the rest of the system.
  • No operator overloading. For example, it would be great to overload [] for the container library.
  • STL has the concept of iterators orthogonal to containers orthogonal to algorithms. This is elegant and well designed. In comparison, Java's container library feels clumsy.
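To illustrate the unsigned peeve from the list above: Java has no unsigned integer types, so unsigned values have to be handled through helper methods on the wrapper classes. The sketch below assumes Java 8 or later, where these static helpers exist; UnsignedDemo and its method names are hypothetical.

```java
class UnsignedDemo {
    // Reads a byte as an unsigned value in the range 0..255.
    static int unsignedByte(byte b) {
        return Byte.toUnsignedInt(b);
    }

    // Reads an int as an unsigned 32-bit value, widened to long.
    static long unsignedInt(int i) {
        return Integer.toUnsignedLong(i);
    }

    // Compares two ints as if both were unsigned.
    static boolean lessThanUnsigned(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }
}
```

The bit patterns are the same as in C++; what is missing is a type that makes the ordinary operators treat them as unsigned.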

Conclusion

While I maintain strong proficiency in both languages, using them daily, I prefer to write in Java.
Why? Let me list the reasons:

  • I get more done in less time, since Java requires less focus on technical details that are irrelevant to the problem domain.
  • I find that programs written in Java require less debugging (related to the previous point).
  • Once written, a Java program runs on a wider variety of platforms.
  • Java has a standard library that covers a wider set of features (threads, sockets, GUI, etc.).
  • Not only is the Java language itself more consistent across platforms, but it also has a wider variety of 3rd party enhancements (many open source) which run on all platforms: ant, JUnit, etc.
  • For all practical purposes, Java sacrifices nothing in performance.