Performance Evaluation: Java vs C++
Sean Wentworth (spw98@yahoo.com)
University of South Alabama
Computer Science Department
Mobile, Alabama 36688
Java is becoming much more widely used in academic settings because it offers a nice language for both introductory courses as well as more advanced courses. There is, however, always a kind of “apology” made for Java due to its slower execution. Java implementations are steadily improving with the use of JIT (Just In Time) compilation as well as strides being made in other areas affecting runtime such as garbage collection routines. This paper compares Java to C++ in terms of execution times. This comparison includes a summary of results reported in the literature and presents new testing results based on sorting routines. These results show that with the support of JIT technology the Java performance penalty is becoming much more “acceptable” thus allowing educators to present Java without apology.
Introduction
The early Java virtual machines (JVM) executed Java bytecode similar to the way that purely interpreted languages execute instructions. Interpreted Java programs execute much slower than equivalent compiled programs [6]. For example, in executing a common loop, an early JVM would interpret each bytecode instruction into binary and then execute the instruction for each iteration of the loop! A compiled program, on the other hand, need only execute the body of the loop the appropriate number of times. A native compiler can also perform more time consuming optimizations during the compilation process, thus further enhancing performance. The poor performance of early JVMs was overlooked because Java designers were more concerned with Java’s portability and with the ease of debugging and code maintenance.
There is also evidence that programming in Java, using its rich API, yields code that has fewer bugs per line of code than using C++[7]. As recently as 1998, however, Java’s lagging performance was still a roadblock to its widespread acceptance as a general purpose computing language [3, 11]. As a result, improving Java’s performance relative to natively compiled languages, such as C++, has become the focus of both software and hardware designers [10].
Such interest in improving Java’s performance lead to the development of “just-in-time” (JIT) compiler technology. JIT compilers are integrated into the JVM and compile Java bytecode into native binary code when a method is first called. Along with other technological improvements to modern JVMs, JIT has helped to significantly minimize the performance gap. As a result Java is being used for more applications that were previously written in languages such as C++ [5]. The use of Java to implement large-scale applications with sizeable memory requirements and many networking functions is an example of this migration to Java [2]. Many companies are promoting Java as a general purpose high level computing language [1], not just for internet programming.
Java is most often compared to C++ [6], probably because of its widespread use along with its syntactical and semantic similarities to Java. Also, C++ is noted for its efficiency among object oriented languages. The aim of this paper is to quantify, based upon current JVM technology, the relative performance of Java to C++. Specifically, the performance differences using common sorting algorithms are examined. Sorting was chosen for several reasons. First, none of the research found reported timings on sorting algorithms. Second, sorting is a fundamental computing procedure that appears in numerous applications. Finally, sorting code involves numerous array references, thus it may tend to show Java execution in its “worst light”.
Related Work
Implementing a large-scale application in both Java and C++ for comparative analysis would be a major undertaking. To date, no one appears to have attempted such a study, probably because the results would bear little more than anecdotal significance. On the other hand, there have been several published studies comparing the performance of Java versus C++ at executing common, smaller pieces of code. Interestingly, the majority of these studies show similar results, specifically that JIT compiled Java performs about 2.5 times slower than C and/or C++.
In their evaluation, Bernardin, et. al. compare Java to both C and C++ [1]. Java was compared to C at multiplying univariate polynomials (represented as arrays of 32-bit integers) of varying degrees. In their tests, the C version was compiled two ways for added comparison. One was compiled using the standard compilation options while the other was compiled by enabling powerful, machine-specific optimizations. The results show the Java versions executing an average of 2.26 times slower than the heavily optimized C. Against the standard C compilation, the Java versions performed better in each of the five trials (averaging 21% faster)! To compare Java to C++, polynomial multiplication was again used, but the polynomial representation was changed to an array of pointers to generic coefficient objects. Using this design, the Java versions executed an average of 2.61 times slower than the C++ versions.
In designing benchmarks for large-scale Java applications, Bull, et. al. performed a single comparison of Java to C [2]. They ran the Linpack Benchmark with a problem size of 1000 x 1000 in both Java and C. The Linpack Benchmark is a matrix multiplication test of linear equations used to measure floating-point numeric performance. The results, measured in Mflop/s, show Java performing 44.44% of the Mflop performed by the C version. This translates into an execution time for Java that is 2.25 times slower than C for the Linpack Benchmark. This is almost identical to Bernardin's results when comparing Java to heavily optimized C at polynomial multiplication.
Prechelt's comparison of Java to C/C++ also yields similar results to what has been seen so far [8]. He presents a comparison of 40 different implementations (24 written in Java, 11 in C++, and 5 in C) of the same program written by 38 different programmers. The median Java execution time is approximately 2 times slower than the median C++ execution time and just over 3 times slower than the median of the C and C++ groups put together.
Object oriented programming involves a large number of method calls due to data encapsulation. For this reason Roulo analyzed Java method call performance compared to C++ [9]. He showed that across methods accepting from 1 to 11 parameters, Java’s method calls are approximately one clock cycle slower than the same C++ method calls. Looking at a method call with the median number of parameters (6), his results show that the Java method executed in 13 clock cycles while the C++ method executed in 12 clock cycles. This translates into a slowdown of Java compared to C++ of only 8.3%. He also showed that explicit object creation in Java is roughly equivalent to using the malloc operation in C or creating objects with the new operator in C++. This is due to the fact that in all 3 of these instances, the memory allocation occurs on the heap. So Java performs equally with C++ at creating user-defined objects, which is another fundamental feature of object oriented programming. However, user-defined objects aren’t the only objects that Java and C++ must create and manage. Many common operations require programs to create temporary objects. Java uses the heap for all object creation, whereas C++ creates temporary objects on the system stack, which “is almost always in the on-board Level 1 cache” [9]. Therefore, temporary object creation in Java is many times slower than in C++. Roulo’s results show a performance difference of about 10 to 12 times. What’s more, the use of a JIT seems to have little impact on the Java performance penalty for object creation [9].
In addition to the direct cost of object allocation on the heap, Java also incurs the cost of performing garbage collection [4]. Inaccessible, heap-allocated memory is automatically collected in Java, resulting in that memory being freed for reuse. The cost of garbage collection can be substantial, especially for applications with a high rate of object creation [4]. Because of both the direct cost of heap allocation in Java and the cost of garbage collection, Klemm says that “much of the speed gap between C++ and Java programs can … be attributed to the memory management in Java virtual machines” [5].
The most impressive Java performance results come from Mangione’s study [6]. His experiment timed simple computational steps such as integer and float divisions and method calls in both Java and C++. In each of his tests, the functions were performed in a loop that executed 10 million times. In six of the tests the Java program executed precisely as fast as C++! At first glance, Mangione’s results seem hard to believe, given the application timings presented earlier that show a performance difference of about 2 to 3 between Java and C++. However, that evidence indicates that Java’s poor performance with respect to C++ is due, in large part, to Java’s memory management model. Mangione’s tests did not involve any memory management tasks, avoiding one of Java's main performance bottlenecks.
Additional Testing
Although the previous research doesn’t totally agree, the majority of the empirical results tend to point in the same general direction. That is, that Java programs using JIT technology execute approximately 2 to 3 times slower than corresponding C++ programs. The question addressed here is how closely the timings of sorting algorithms compare to the prior research results. Four sorting algorithms were selected for comparison, two of O(n2) complexity (Bubble Sort and Insertion Sort) and two of O(n log n) complexity (Recursive Quicksort and Heap Sort). The algorithms sorted four data sets of increasing size. Each of the data sets consists of randomly generated integers between 0 and the size of the data set. For both the Java and C++ versions of the algorithms, the data was stored in a dynamically allocated array of 4-byte integers. The two O(n2) algorithms were timed sorting array sizes of 50,000 and 200,000. The two O(n log n) algorithms were timed sorting array sizes of 200,000, 400,000 and 800,000. The algorithms were paired with array sizes in this manner because the O(n2) algorithms executed too slowly on the large arrays and the O(n log n) algorithms executed too quickly on the small arrays.
The timing tests were executed on a 200 MHz Pentium processor under the Windows 95 operating system. The Java and C++ versions of each algorithm were virtually identical. The timings included only the sorting portion of the algorithm, not the setup portions. The C++ sort program was compiled using Borland's C++ compiler version 5.5. The "-O2" optimization switch was used, which optimizes the code for speed. The Java sort program was compiled and run under Sun's Java 2 SDK version 1.3.0. Sun uses their own version of the JIT technology, which they call HotSpot. Whereas traditional JIT preemptively optimizes each class or, in some cases, each method as it is loaded, HotSpot performs a number of on-the-fly optimizations during program execution [12]. HotSpot represents the state of the art in JIT technology. For comparison purposes, the Java sort trials were tested both with and without HotSpot enabled. Without HotSpot, the results reflect only the performance of the Java interpreter.
In the four tables below, "I-Java" represents the interpreted Java trials and "H-Java" represents the HotSpot-enabled Java trials. Some of the table entries are marked as "n/a," representing timings that were omitted either for being too fast or too slow.
The I-Java vs C++ Comparison Ratio columns, with performance degradation ratios as high as 20.5, clearly depict the unacceptable performance of interpreted Java versus C++. The I-Java vs H-Java columns show a seven to eight fold improvement (average of 7.6) in performance as a result of the evolution of JIT technology. This dramatic improvement helps to put Java in a much more competitive position against natively compiled languages.
The H-Java vs C++ Comparison Ratio columns reveal an interesting difference between the slower O(n2) algorithms and the faster O(n log n) algorithms. In all tests, the O(n2) algorithms showed performance ratios in the range of 2.5 to 3, which is consistent with the results cited in the literature. The O(n log n) algorithms, on the other hand, consistently performed in the 1.4 to 1.6 ratio range, which is even more encouraging.
Sorting 50,000 4-byte Integers |
||||||
|
|
Times in Seconds |
Comparison Ratios |
||||
|
Algorithm |
H-Java |
I-Java |
C++ |
I-Java vs H-Java |
I-Java vs C++ |
H-Java vs C++ |
|
Insertion Sort |
65.42 |
421.82 |
22.45 |
6.45 |
18.79 |
2.91 |
|
Bubble Sort |
173.29 |
1311.07 |
63.76 |
7.57 |
20.56 |
2.72 |
|
Recursive Quicksort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
|
Heap Sort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
Sorting 200,000 4-byte Integers |
||||||
|
|
Times in Seconds |
Comparison Ratios |
||||
|
Algorithm |
H-Java |
I-Java |
C++ |
I-Java vs H-Java |
I-Java vs C++ |
H-Java vs C++ |
|
Insertion Sort |
1112.74 |
6835.64 |
437.33 |
6.14 |
15.63 |
2.54 |
|
Bubble Sort |
2926.92 |
21108.63 |
1162.99 |
7.21 |
18.15 |
2.52 |
|
Recursive Quicksort |
0.39 |
3.08 |
0.25 |
7.90 |
12.32 |
1.56 |
|
Heap Sort |
1.10 |
10.16 |
0.76 |
9.24 |
13.37 |
1.45 |
Sorting 400,000 4-byte Integers |
||||||
|
|
Times in Seconds |
Comparison Ratios |
||||
|
Algorithm |
H-Java |
I-Java |
C++ |
I-Java vs H-Java |
I-Java vs C++ |
H-Java vs C++ |
|
Insertion Sort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
|
Bubble Sort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
|
Recursive Quicksort |
0.83 |
6.21 |
0.53 |
7.48 |
11.72 |
1.57 |
|
Heap Sort |
2.47 |
21.59 |
1.63 |
8.74 |
13.25 |
1.52 |
Sorting 800,000 4-byte Integers |
||||||
|
|
Times in Seconds |
Comparison Ratios |
||||
|
Algorithm |
H-Java |
I-Java |
C++ |
I-Java vs H-Java |
I-Java vs C++ |
H-Java vs C++ |
|
Insertion Sort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
|
Bubble Sort |
n/a |
n/a |
n/a |
n/a |
n/a |
n/a |
|
Recursive Quicksort |
1.81 |
13.41 |
1.17 |
7.41 |
11.46 |
1.55 |
|
Heap Sort |
5.49 |
45.87 |
3.70 |
8.36 |
12.40 |
1.48 |
The encouraging news from these tests is that the Java “performance penalty” is being further reduced as better compilation and optimization technologies are applied to Java. Java's performance relative to that of C++ at executing O(n2) sorting algorithms is comparable to the ratios found in the current literature. Java's relative performance at executing O(n log n) sorting algorithms, which are most common in practical applications, is even better. Additional testing remains in order to evaluate the Java performance penalty in larger applications that are more representative, yet with performance ratios as low as 1.5 and not much more than 3 there should be little hesitancy to use Java in a wider variety of applications.
Bernardin, L., B. Char, E. Kaltofen (1999) Symbolic Computation in Java: An Appraisement, Proceedings of the 1999 international symposium on Symbolic and algebraic computation, 1999, pp237-244.
Bull, J. M., L. A. Smith, M. D. Westhead, D. S. Henty and R. A. Davey (1999) A Methodology for Benchmarking Java Grande Applications, Proceedings of the ACM, 1999 Conference on Java Grande, pp81-88.
Gonsalves, A., (1998) Java Performance Still Disappointing, PC Week, May 25, 1998, vol. 15, no. 21, p10.
Heydon, A., M. Najork, (1999) Performance Limitations of the Java Core Libraries, Proceedings of the ACM, 1999 Conference on Java Grande, pp35-41.
Klemm,R. (1999) Practical Guidelines for Boosting Java Server Performance, Proceedings of the ACM, 1999 Conference on Java Grande, pp25-34.
Mangione, C. (1998) Performance Tests Show Java as Fast as C++, JavaWorld, Feb. 1998, vol. 3, no. 2, available from http://www.javaworld.com/javaworld/jw-02-1998/jw-02-jperf_p.html.
Phipps, G., (1999) Comparing Observed Bug and Productivity Rates for Java and C++ (abstract only), Software – Practice and Experience, vol. 29, no. 4, pp345-348.
Prechelt, L. (1999) Comparing Java vs. C/C++ Efficiency Differences to Interpersonal Differences, CACM, Vol. 42, No. 10, Oct. 1999.
Roulo, M. (1998) Accelerate your Java apps!, JavaWorld, Sept. 1998, vol. 3, no. 9, available from http://www.javaworld.com/javaworld/jw-09-1998/jw-09-speed.html.
Santoni, A., J. Walsh, (1997) Java Speed Tops List of Priorities, InfoWorld, June 9, 1997, vol. 19, no. 23, p29.
(1998) No More Mr. Slow for Java, InfoWorld, Sept. 14, 1998, vol. 20, no. 37, p68.
(1999) The Java HotSpot Performance Engine Architecture, Whitepaper available at http://java.sun.com/hotspot/whitepaper.html.
-