C++ performance vs. Java/C#?

@Bill I may be mixing two things... but doesn't branch prediction, done at run time in the instruction pipeline, achieve similar goals independent of the language?

@Hardy yes, the CPU can do branch prediction regardless of language, but it cannot factor out an entire loop by observing that the loop has no effect on anything. It will also not observe that mult(0) is hard-wired to return 0 and just replace the entire method call with if(param == 0) result=0; and avoid the entire function/method call. C could do these things if the compiler had a comprehensive overview of what was happening, but generally it doesn't have enough info at compile time.

But to use all of C++'s capabilities, you, the developer, must work hard. You can achieve superior results, but you must use your brain for that. C++ is a language that decided to give you more tools, at the price that you must learn them to use the language well.

It's not so much that you are compiling for CPU optimization, but you are compiling for runtime path optimization. If you find that a method is very often called with a specific parameter, you could pre-compile that routine with that parameter as a constant which could (in the case of a boolean that controls flow) factor out gigantic chunks of work. C++ cannot come close to doing that kind of optimization.
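To make the idea concrete, here is a minimal sketch of what such a specialization amounts to when written by hand in C++. The names (`sum_values`, `sum_values_impl`, the `checked` flag) are hypothetical and not taken from any post here. A JIT can derive the equivalent after observing the flag at run time; in C++ the developer has to anticipate it and ask for both versions up front.

```cpp
#include <cassert>
#include <vector>

// Hypothetical routine: `Checked` stands in for a boolean that controls flow.
// As a template parameter it is a compile-time constant, so each instantiation
// can have the branch it does not need removed by the compiler.
template <bool Checked>
long sum_values_impl(const std::vector<int>& values) {
    long total = 0;
    for (int v : values) {
        if (Checked && v < 0) continue;  // only meaningful in the checked version
        total += v;
    }
    return total;
}

// The runtime flag is tested once, outside the loop, to pick a specialization.
long sum_values(const std::vector<int>& values, bool checked) {
    return checked ? sum_values_impl<true>(values)
                   : sum_values_impl<false>(values);
}

// Usage sketch.
void example() {
    std::vector<int> v{1, -2, 3};
    assert(sum_values(v, true) == 4);   // negative entry skipped
    assert(sum_values(v, false) == 2);  // every entry summed
}
```

Whether the specialized versions are actually faster depends on the compiler and the workload, so this is an illustration of the mechanism rather than a benchmark.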

In real-world applications, C++ is still usually faster than Java, mainly because of its lighter memory footprint, which results in better cache performance.

So how do JITs do at recompiling routines to take advantage of observed runpaths, and how much difference does that make?

Compiling for CPU-specific optimizations is usually overrated. Just take a program in C++, compile it with optimizations for Pentium Pro, and run it on a Pentium 4. Then recompile with optimizations for Pentium 4. I spent long afternoons doing this with several programs. General results? Usually less than a 2-3% performance increase. So the theoretical JIT advantages are almost none. Most performance differences can only be observed when using scalar data processing features, something that will eventually need manual fine-tuning to achieve maximum performance anyway. Optimizations of that sort are slow and costly to perform, which sometimes makes them unsuitable for a JIT anyway.

@Justicle The best your c++ compiler will offer for different architectures is usually x86, x64, ARM and whatnot. Now you can tell it to use specific features (say SSE2) and if you're lucky it'll even generate some backup code if that feature isn't available, but that's about as fine-grained as one can get. Certainly no specialization depending on cache sizes and whatnot.

@OrionAdrian ok we're full circle now ... See for examples of this theory not happening. In other words, show us that your theory can be proven correct before making vague speculative statements.

A C++ program has to be compiled beforehand usually with mixed optimizations so that it runs decently well on all machines, but is not optimized as much as it could be for a single configuration (i.e. processor, instruction set, other hardware).

Additionally, certain language features allow the compiler in C# and Java to make assumptions about your code that allow it to optimize certain parts away, which just isn't safe for a C/C++ compiler to do. When you have access to pointers there are a lot of optimizations that just aren't safe.

Also, Java and C# can do heap allocations more efficiently than C++ because the layer of abstraction between the garbage collector and your code allows the runtime to do all of its heap compaction at once (a fairly expensive operation).

Generally, C# and Java can be just as fast or faster because the JIT compiler -- a compiler that compiles your IL the first time it's executed -- can make optimizations that a C++ compiled program cannot because it can query the machine. It can determine if the machine is Intel or AMD; Pentium 4, Core Solo, or Core Duo; or whether it supports SSE4, etc.

Now I can't speak for Java on this next point, but I know that C# for example will actually remove methods and method calls when it knows the body of the method is empty. And it will use this kind of logic throughout your code.

Now this all said, specific optimizations can be made in C++ that will blow away anything that you could do with C#, especially in the graphics realm and anytime you're close to the hardware. Pointers do wonders here.

On the Java side, @Swati points out a good article:

So as you can see, there are lots of reasons why certain C# or Java implementations will be faster.

So depending on what you're writing I would go with one or the other. But if you're writing something that isn't hardware dependent (unlike a driver, video game, etc.), I wouldn't worry about the performance of C# (again, I can't speak about Java). It'll do just fine.

To be honest, this is one of the worst answers. It is so unfounded, I could just invert it. Too much generalisation, too much ignorance (optimizing away empty functions is really just the tip of the iceberg). One luxury C++ compilers have: Time. Another luxury: No checking is enforced. But find more in .

If you always allocated everything on the heap, then .NET and Java might even perform better than C/C++. But you just don't do this in C/C++.

On top of what some others have said, from my understanding .NET and Java are better at memory allocation. E.g. they can compact memory as it gets fragmented while C++ cannot (natively, but it can if you're using a clever garbage collector).

Or if you're using a better C++ allocator and/or a pool of objects. This is far from magic from a C++ viewpoint, and it can boil down to making "heap allocation" as fast as stack allocation.
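As a rough illustration of the allocator/pool point, here is a minimal bump-allocator sketch. `Arena`, `Point` and `make_point` are hypothetical names invented for this example, and the sketch assumes alignments no stricter than `alignof(std::max_align_t)`. Allocation is a bounds check plus a pointer bump, and everything is released in one step, which is the behavior usually credited to a compacting GC.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bump allocator: one buffer, one moving offset.
class Arena {
public:
    explicit Arena(std::size_t bytes) : buffer_(bytes), offset_(0) {}

    // Hands out `size` bytes aligned to `align`; returns nullptr once the
    // buffer is exhausted. Assumes align <= alignof(std::max_align_t).
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (offset_ + align - 1) / align * align;
        if (start + size > buffer_.size()) return nullptr;
        offset_ = start + size;
        return buffer_.data() + start;
    }

    // Releases every allocation at once; there is no per-object bookkeeping.
    void reset() { offset_ = 0; }
    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};

struct Point { int x; int y; };

// Placement new constructs the object inside the arena's storage.
Point* make_point(Arena& arena, int x, int y) {
    void* p = arena.allocate(sizeof(Point), alignof(Point));
    return p ? new (p) Point{x, y} : nullptr;
}

// Usage sketch.
void example() {
    Arena arena(64);
    Point* a = make_point(arena, 1, 2);
    Point* b = make_point(arena, 3, 4);
    assert(a != nullptr && b != nullptr && a != b);
    assert(a->x == 1 && b->y == 4);
    arena.reset();  // both points are released in one step
    assert(arena.used() == 0);
}
```

This only suits objects with trivial destructors or lifetimes that end together; a general-purpose pool needs more care.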

The executable code produced from a Java or C# compiler is not interpreted -- it is compiled to native code "just in time" (JIT). So, the first time code in a Java/C# program is encountered during execution, there is some overhead as the "runtime compiler" (aka JIT compiler) turns the byte code (Java) or IL code (C#) into native machine instructions. However, the next time that code is encountered while the application is still running, the native code is executed immediately. This explains how some Java/C# programs appear to be slow initially, but then perform better the longer they run. A good example is an ASP.NET web site. The very first time the web site is accessed, it may be a bit slower as the C# code is compiled to native code by the JIT compiler. Subsequent accesses result in a much faster web site -- server and client side caching aside.

C# readonly and Java final are nowhere near as useful as C++'s const

Generics are not as powerful as templates

"No matter the JIT optimization, nothing will go as fast as direct pointer access to memory...if you have contiguous data in memory, accessing it through C++ pointers (i.e. C pointers... Let's give Caesar his due) will go many times faster than in Java/C#". People have observed Java beating C++ on the SOR test from the SciMark2 benchmark precisely because pointers impede aliasing-related optimizations.

"The code processing will be done at compilation time". Hence template metaprogramming only works if the program is available at compile time, which is often not the case; e.g. it is impossible to write a competitively performant regular expression library in vanilla C++ because it is incapable of run-time code generation (an important aspect of metaprogramming).

"The going word at Facebook is that 'reasonably written C++ code just runs fast,' which underscores the enormous effort spent at optimizing PHP and Java code. Paradoxically, C++ code is more difficult to write than in other languages, but efficient code is a lot easier [to write in C++ than in other languages]."

"We find that in regards to performance, C++ wins out by a large margin. However, it also required the most extensive tuning efforts, many of which were done at a level of sophistication that would not be available to the average programmer.

"playing with types is done at compile time...the equivalent in Java or C# is painful at best to write, and will always be slower and resolved at runtime even when the types are known at compile time". In C#, that is only true of reference types and is not true for value types.

...Until the problem of low-latency reared its ugly head the last months. Then, the Java server apps, no matter the optimization attempted by our skilled Java team, simply and cleanly lost the race against the old, not really optimized C++ server.

And despite C# primitive-like structs, C++ "on the stack" objects will cost nothing at allocation and destruction, and will need no GC to work in an independent thread to do the cleaning.

Another example is temporary variables, which are simply compiled away by the C++ compiler while still being mentioned in the IL produced by the C# compiler. C++ static compilation optimization will result in less code, and thus allows a more aggressive JIT optimization, again.

As already said in the previous posts, JIT can compile IL/bytecode into native code at runtime. The cost of that was mentioned, but not taken to its conclusion:

As for memory fragmentation, memory allocators in 2008 are not the old memory allocators from 1980 that are usually compared with a GC: C++ allocations can't be moved in memory, true, but then, as with a Linux filesystem: who needs hard disk defragmenting when fragmentation does not happen? Using the right allocator for the right task should be part of the C++ developer's toolkit. Now, writing allocators is not easy, and most of us have better things to do, and for most of us, RAII or GC is more than good enough.

Because the C++ static compiler was a lot better at producing already optimized code than C#'s.

But as far as I see it, C# or Java are all in all a better bet. Not because they are faster than C++, but because when you add up their qualities, they end up being more productive, needing less training, and having more complete standard libraries than C++. And for most programs, their speed differences (in one way or another) will be negligible...

But still, it is what happens today, both in the GUI teams and the server-side teams.

But when you need raw power, powerful and systematic optimizations, strong compiler support, powerful language features and absolute safety, Java and C# make it difficult to win the last missing but critical percent of quality you need to remain above the competition.

C++ has a memory usage different from Java/C#, and thus, has different advantages/flaws.

Currently, the decision is to keep the Java servers for common use, where performance, while still important, is not subject to the low-latency target, and to aggressively optimize the already faster C++ server applications for low-latency and ultra-low-latency needs.

For example, function inlining in .NET is limited to functions whose bytecode is less than or equal to 32 bytes in length. So, some code in C# will produce a 40-byte accessor, which will never be inlined by the JIT. The same code in C++/CLI will produce a 20-byte accessor, which will be inlined by the JIT.

I now have 5 months of almost exclusive professional C# coding (which adds to a CV already full of C++ and Java, and a touch of C++/CLI).

I kept contact with the server teams (I worked 2 years among them, before getting back to the GUI team), at the other side of the building, and I learned something interesting.

I played with WinForms (Ahem...) and WCF (cool!), and WPF (Cool!!!! Both through XAML and raw C#. WPF is so easy I believe Swing just cannot compare to it), and C# 4.0.

It's as if you needed less time and less experienced developers in C#/Java than in C++ to produce average quality code, but on the other hand, the moment you needed excellent to perfect quality code, it was suddenly easier and faster to get the results right in C++.

JIT has one massive problem: it can't compile everything. JIT compiling takes time, so the JIT will compile only some parts of the code, whereas a static compiler will produce a full native binary. For some kinds of programs, the static compiler will simply and easily outperform the JIT.

Java is even more frustrating, as it has the same problems as C#, and more: lacking the equivalent of C#'s using keyword, a very skilled colleague of mine spent too much time making sure his resources were correctly freed, whereas the equivalent in C++ would have been easy (using destructors and smart pointers).

Java, and even more so C#, are cool languages, with extensive standard libraries and frameworks, where you can code fast and get results very soon.

Last week, I had a training on .NET optimization, and discovered that the static compiler is very important anyway. As important as the JIT.

In recent years, the trend was for the Java server apps to replace the old C++ server apps, as Java has a lot of frameworks/tools, and is easy to maintain, deploy, etc.

No matter the JIT optimization, nothing will go as fast as direct pointer access to memory (let's ignore processor caches, etc. for a moment). So, if you have contiguous data in memory, accessing it through C++ pointers (i.e. C pointers... Let's give Caesar his due) will go many times faster than in Java/C#. And C++ has RAII, which makes a lot of processing a lot easier than in C# or even in Java. C++ does not need using to scope the existence of its objects. And C++ does not have a finally clause. This is not an error.
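For readers coming from C# or Java, a minimal RAII sketch may help show why no using or finally is needed. `ScopedResource` and `do_work` are hypothetical names, and the counter merely stands in for a file handle, lock, or socket.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource wrapper: acquire in the constructor, release in the
// destructor. The counter stands in for a real handle that must be closed.
class ScopedResource {
public:
    explicit ScopedResource(int& open_count) : open_count_(open_count) { ++open_count_; }
    ~ScopedResource() { --open_count_; }
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    int& open_count_;
};

// Returns true if the work threw. Either way the destructor has already run
// by the time control leaves the inner scope, so nothing leaks.
bool do_work(int& open_count, bool fail) {
    try {
        ScopedResource guard(open_count);
        if (fail) throw std::runtime_error("work failed");
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}

// Usage sketch.
void example() {
    int open_count = 0;
    assert(do_work(open_count, false) == false);
    assert(open_count == 0);  // released on the normal path
    assert(do_work(open_count, true) == true);
    assert(open_count == 0);  // released on the exception path too
}
```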

Note that usually, you are comparing C++ runtime code with its equivalent in C# or Java. But C++ has one feature that can outperform Java/C# out of the box, namely template metaprogramming: the code processing will be done at compilation time (thus vastly increasing compilation time), resulting in zero (or almost zero) runtime cost.
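A classic, deliberately small illustration of that point is a compile-time factorial. The `Factorial` template below is a textbook sketch, not code from this thread; the `static_assert` shows that the value exists before the program ever runs.

```cpp
#include <cassert>

// The recursion is expanded by the compiler, one instantiation per value of N.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Base case that stops the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Checked during compilation: no runtime work is involved.
static_assert(Factorial<10>::value == 3628800ULL, "evaluated at compile time");

// At run time this simply returns a constant.
unsigned long long factorial_of_ten() {
    return Factorial<10>::value;
}

// Usage sketch.
void example() {
    assert(factorial_of_ten() == 3628800ULL);
}
```

The price is the one the post mentions: the work moves into the compiler, so build times grow with the amount of computation done this way.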

Nothing is as simple as expected.

Now, the memory model is becoming somewhat more complicated with the rise of multicore and multithreading technology. In this field, I guess .NET has the advantage, and Java, I was told, held the upper ground. It's easy for some "on the bare metal" hacker to praise his "near the machine" code. But now, it is much more difficult to produce better assembly by hand than to let the compiler do its job. For C++, the compiler has usually been better than the hacker for a decade. For C# and Java, this is even easier.

Of course, it is usually faster to produce a viable and robust solution in C# (or Java, or VB) than in C++ (if only because C++ has complex semantics, and the C++ standard library, while interesting and powerful, is quite poor when compared with the full scope of the standard library from .NET or Java), so usually, the difference between C++ and the .NET or Java JIT won't be visible to most users, and for those binaries that are critical, well, you can still call C++ processing from C# or Java (even if this kind of native call can be quite costly in itself)...

Of course, I'll update this post if something new happens.

So I guess C#/Java's productivity gain is visible for most code... until the day you need the code to be as perfect as possible. That day, you'll know pain. (you won't believe what's asked from our server and GUI apps...).

So, C# remains a pleasant language as long as you want something that works, but a frustrating language the moment you want something that always and safely works.

Still, the new standard C++0x will impose a simple memory model on C++ compilers, which will standardize (and thus simplify) effective multiprocessing/parallel/threading code in C++, and make optimizations easier and safer for compilers. But then, we'll see in a couple of years if its promises hold true.

The conclusion is that while it's easier/faster to produce code that works in C#/Java than in C++, it's a lot harder to produce strong, safe and robust code in C# (and even harder in Java) than in C++. Reasons abound, but it can be summarized by:

The reason for this was speculated to be that the C++/CLI compiler profited from the vast optimization techniques of the native C++ compiler.

The very same code compiled in C++/CLI (or its ancestor, Managed C++) could be many times faster than the same code produced in C# (or VB.NET, whose compiler produces the same IL as C#).

Your edit after 5 months of C# describes exactly my own experience (templates better, const better, RAII). +1. Those three remain my personal killer features for C++ (or D, which I haven't had the time for yet).

[...] The Java version was probably the simplest to implement, but the hardest to analyze for performance. Specifically the effects around garbage collection were complicated and very hard to tune."

For many practical purposes, allocation/deallocation-intensive algorithms implemented in garbage collected languages can actually be faster than their equivalents using manual heap allocation. A major reason for this is that the garbage collector allows the runtime system to amortize allocation and deallocation operations in a potentially advantageous fashion.

I haven't observed that on the big Unix programs that are meant to run forever. They tend to be written in C, which is even worse for memory management than C++.

In C/C++ you can allocate short-lived objects on the stack, and you do when it's appropriate. In managed code you cannot; you have no choice. Also, in C/C++ you can allocate lists of objects in contiguous areas (new Foo[100]); in managed code you cannot. So, your comparison is not valid. Well, this power of choice places a burden on the developers, but this way they learn to know the world they live in (memory......).
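A small sketch of the two choices the comment describes. It uses `std::vector<Foo>` in place of the comment's raw `new Foo[100]`, which is a substitution made here for safety; both place the elements in one contiguous block. `Foo`, `is_contiguous` and `total_weight` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Foo {
    int id;
    double weight;
};

// True when element i sits exactly i slots after element 0.
bool is_contiguous(const std::vector<Foo>& items) {
    for (std::size_t i = 1; i < items.size(); ++i) {
        if (&items[i] != &items[0] + i) return false;
    }
    return true;
}

double total_weight(std::size_t count) {
    Foo prototype{0, 0.5};                     // short-lived object with automatic (stack) storage
    std::vector<Foo> items(count, prototype);  // one contiguous block holding `count` Foo values
    assert(is_contiguous(items));
    double total = 0.0;
    for (const Foo& f : items) total += f.weight;  // linear walk over adjacent memory
    return total;
}

// Usage sketch.
void example() {
    assert(total_weight(100) == 50.0);
}
```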

In some cases, managed code can actually be faster than native code. For instance, "mark-and-sweep" garbage collection algorithms allow environments like the JRE or CLR to free large numbers of short-lived (usually) objects in a single pass, where most C/C++ heap objects are freed one-at-a-time.

Of course, in most cases that I've encountered, managed languages are fast enough, by a long shot, and the maintenance and coding tradeoff for the extra performance of C++ is simply not a good one.

That said, I've written a lot of C# and a lot of C++, and I've run a lot of benchmarks. In my experience, C++ is a lot faster than C#, in two ways: (1) if you take some code that you've written in C# and port it to C++, the native code tends to be faster. How much faster? Well, it varies a whole lot, but it's not uncommon to see a 100% speed improvement. (2) In some cases, garbage collection can massively slow down a managed application. The .NET CLR does a terrible job with large heaps (say, > 2GB), and can end up spending a lot of time in GC--even in applications that have few--or even no--objects of intermediate life spans.

The problem is that for long running processes, such as a web server, your memory over time will become so fragmented (in a C++ written program) that you will have to implement something that resembles garbage collection (or restart every so often, see IIS).

First of all, if we look at Raymond Chen's code, he clearly does not understand C++ or data structures very well. His code almost reaches straight for low-level C code even in cases where the C code has no performance benefits (it just seems to be a sort of distrust and maybe a lack of knowledge of how to use profilers). He also failed to understand the most algorithmically sound way of implementing a dictionary (he used std::find for Christ's sake). If there's something good about Java, Python, C#, etc. - they all provide very efficient dictionaries...

For me the bottom line is that it took 6 revisions for the unmanaged version to beat the managed version that was a simple port of the original unmanaged code. If you need every last bit of performance (and have the time and expertise to get it), you'll have to go unmanaged, but for me, I'll take the order of magnitude advantage I have on the first versions over the 33% I gain if I try 6 times.

Interestingly, the time to parse the file as reported by both programs' internal timers is about the same -- 30ms for each. The difference is in the overhead.

Of course he used available lower level libraries to do this, but that's still a lot of work. Can you call what's left an STL program? I don't think so, I think he kept the std::vector class which ultimately was never a problem and he kept the find function. Pretty much everything else is gone.

So am I ashamed by my crushing defeat? Hardly. The managed code got a very good result for hardly any effort. To defeat the managed version, Raymond had to:

So, yup, you can definitely beat the CLR. Raymond can make his program go even faster I think.

Tries, std::map, or even a hash table would fare much more favorably for C++. Finally, a dictionary is exactly the type of program that benefits most from high-level libraries and frameworks. It doesn't demonstrate differences in the language so much as the libraries involved (of which, I would happily say that C# is far more complete and provides many more tools suited for the task). Show a program that manipulates large blocks of memory in comparison, like a large-scale matrix/vector code. That'll settle this quite quickly even if, as in this case, the coders don't know what...
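To illustrate the data-structure complaint, here is a hedged sketch comparing a linear scan with a hash-table lookup. It is not Raymond Chen's code; `Entry`, `lookup_linear` and `lookup_hashed` are names invented for this example, and the sample entries are placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, std::string>;

// Linear scan in the style of std::find: O(n) comparisons per lookup.
const std::string* lookup_linear(const std::vector<Entry>& entries,
                                 const std::string& key) {
    auto it = std::find_if(entries.begin(), entries.end(),
                           [&key](const Entry& e) { return e.first == key; });
    return it == entries.end() ? nullptr : &it->second;
}

// Hash table: average O(1) per lookup, at the cost of extra memory.
const std::string* lookup_hashed(
        const std::unordered_map<std::string, std::string>& table,
        const std::string& key) {
    auto it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}

// Usage sketch: both return the same answers, only the cost differs.
void example() {
    std::vector<Entry> entries{{"ni hao", "hello"}, {"xie xie", "thanks"}};
    std::unordered_map<std::string, std::string> table(entries.begin(), entries.end());
    assert(*lookup_linear(entries, "xie xie") == "thanks");
    assert(*lookup_hashed(table, "xie xie") == "thanks");
    assert(lookup_hashed(table, "missing") == nullptr);
}
```

For a dictionary of realistic size the difference in lookup cost dominates, which is the commenter's point that the benchmark measures library choices more than the languages themselves.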

Whenever I talk managed vs. unmanaged performance, I like to point to the series Rico (and Raymond) did comparing C++ and C# versions of a Chinese/English dictionary. This google search will let you read for yourself, but I like Rico's summary.

JIT (Just In Time Compiling) can be incredibly fast because it optimizes for the target platform.

This means that it can take advantage of any compiler trick your CPU can support, regardless of what CPU the developer wrote the code on.

The basic concept of the .NET JIT works like this (heavily simplified):

Calling a method for the first time:

  • Your program code calls a method Foo()
  • From the metadata, the CLR knows what memory address the IL (Intermediate byte code) is stored in.
  • The CLR allocates a block of memory, and calls the JIT.
  • The JIT compiles the IL into native code, places it into the allocated memory, and then changes the function pointer in Foo()'s type metadata to point to this native code.
  • The native code at this memory location is run.

Calling a method for the second time:

  • Your program code calls a method Foo()
  • The CLR looks at the type that implements Foo() and finds the function pointer in the metadata.
  • The native code at this memory location is run.

As you can see, the 2nd time around, it's virtually the same process as C++, except with the advantage of real time optimizations.

That said, there are still other overhead issues that slow down a managed language, but the JIT helps a lot.

By the way Jonathan, I think someone is still downvoting your things. When I voted you up you had a -1 on this post.
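The first-call and second-call sequence described in this answer can be mimicked in plain C++ with a function pointer that starts out pointing at a stub. This is only an analogy, not how the CLR is implemented: `MethodSlot`, `foo_stub` and `foo_native` are hypothetical, and "compiling" is reduced to bumping a counter.

```cpp
#include <cassert>

// Hypothetical stand-in for the function-pointer entry in a method's metadata.
struct MethodSlot {
    int (*entry)(MethodSlot&, int);
    int compile_count;
};

// Stands in for the native code the JIT would produce for Foo().
int foo_native(MethodSlot&, int x) { return x * 2; }

// First call lands here: "compile", patch the slot, then run the native code.
int foo_stub(MethodSlot& slot, int x) {
    ++slot.compile_count;      // the expensive step happens exactly once
    slot.entry = &foo_native;  // later calls bypass the stub entirely
    return slot.entry(slot, x);
}

// What the calling program does every time: jump through the slot.
int call_foo(MethodSlot& slot, int x) { return slot.entry(slot, x); }

// Usage sketch.
void example() {
    MethodSlot slot{&foo_stub, 0};
    assert(call_foo(slot, 21) == 42);  // first call goes through the stub
    assert(call_foo(slot, 5) == 10);   // second call is a plain indirect call
    assert(slot.compile_count == 1);
}
```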

I suspect that the saying "You can write FORTRAN in any language" predates Jeff's career.

If you're a Java/C# programmer learning C++, you'll be tempted to keep thinking in terms of Java/C# and translate verbatim to C++ syntax. In that case, you only get the earlier mentioned benefits of native code vs. interpreted/JIT. To get the biggest performance gain in C++ vs. Java/C#, you have to learn to think in C++ and design code specifically to exploit the strengths of C++.

To paraphrase Edsger Dijkstra: [your first language] mutilates the mind beyond recovery. To paraphrase Jeff Atwood: you can write [your first language] in any new language.

Note that free and delete are not deterministic either and GC can be made deterministic by not allocating.

One of the most significant JIT optimizations is method inlining. Java can even inline virtual methods if it can guarantee runtime correctness. This kind of optimization usually cannot be performed by standard static compilers because it needs whole-program analysis, which is hard because of separate compilation (in contrast, JIT has all the program available to it). Method inlining improves other optimizations, giving larger code blocks to optimize.
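For comparison, a static C++ compiler has to be given the information a JIT discovers at run time. One way, sketched here with hypothetical `Shape` and `Square` types, is marking a class `final`, which tells the compiler no further override can exist. Whether the call is then actually devirtualized and inlined is up to the implementation and is not guaranteed.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// `final` promises the compiler that no class derives from Square.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// Through the base type the target is unknown here, so in general this is an
// indirect (virtual) call unless the compiler can prove the dynamic type.
int sides_via_base(const Shape& s) { return s.sides(); }

// Through the final type the target is known statically, so the compiler is
// free to call Square::sides directly and inline it.
int sides_via_final(const Square& s) { return s.sides(); }

// Usage sketch: same result either way, only the dispatch differs.
void example() {
    Square sq;
    assert(sides_via_base(sq) == 4);
    assert(sides_via_final(sq) == 4);
}
```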

Standard memory allocation in Java/C# is also faster, and deallocation (GC) is not much slower, but only less deterministic.