
Thread: Delphi 6, 7, 2007, XE4 and XE6: compiler do not matter much, algorithms do!



Arnaud BOUCHEZ

  Posted: Jun 9, 2014 1:19 PM
There have recently been several articles comparing the performance of different versions of the Delphi compiler, notably in this forum thread.

IMHO there won't be any definitive statement about this.

I'm always doubtful about conclusions drawn from this kind of benchmark.
Performance is an iterative process, and always a matter of circumstances and implementation.

Circumstances are those of the benchmark itself.
Each benchmark reports only on the process it measured.
What you compare is a limited set of features, most of the time running an idealized and simplified pattern, which shares nothing with real-world processing.

Implementation is what gives performance.
Changing the compiler will only give you a few percent difference in timing.
Identifying the true bottlenecks of an application with a profiler, then changing the implementation of those bottlenecks, may give orders of magnitude of speed improvement.
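
To make the point concrete, here is a minimal sketch in C (not taken from mORMot; the function names are hypothetical and only for illustration). Both functions answer the same question, "does this array contain a duplicate?". The first compares every pair of elements; the second sorts a copy and scans neighbours. On large inputs the second one typically does far fewer comparisons, whichever compiler built it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Naive version: compares every pair, so O(n^2) comparisons. */
bool has_duplicate_naive(const int *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (v[i] == v[j])
                return true;
    return false;
}

/* Three-way comparison for qsort, written to avoid integer overflow. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts a private copy, then checks adjacent elements only.
 * With a typical qsort this is O(n log n) instead of O(n^2). */
bool has_duplicate_sorted(const int *v, size_t n)
{
    if (n < 2)
        return false;

    int *copy = malloc(n * sizeof *copy);
    assert(copy != NULL);               /* sketch only: no real error handling */
    memcpy(copy, v, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_int);

    bool found = false;
    for (size_t i = 1; i < n && !found; i++)
        found = (copy[i - 1] == copy[i]);

    free(copy);
    return found;
}
```

Both functions return the same answer for any input; only the amount of work differs. That kind of change is what a profiler points you to, and no compiler switch will do it for you.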

For instance, multi-threading abilities can be assessed with the mORMot regression tests.
With our huge set of regression tests, we have at hand more than 16,500,000 individual checks, covering low-level features (like numerical and text marshaling) and high-level processes (like concurrent client/server and multi-threaded database processing).

During our integration process, we always run regression benchmarks with Delphi 6, 7, 2007, XE4 and XE6 under Win32, and XE4 and XE6 under Win64.
We wrote a blog article with some numbers and analysis:
http://blog.synopse.info/post/2014/06/09/Performance-comparison-from-Delphi-6%2C-7%2C-2007%2C-XE4-and-XE6

In short, all compilers perform at more or less the same speed.
Win64 is a little slower than Win32, and the fastest appears to be Delphi 7, using our enhanced and optimized RTL.
But all versions of the Delphi compiler stay within the same range.

IMHO there is no definitive answer.

In short, for most real-world processing, the Delphi compiler has not improved execution speed for decades.
On the contrary, we may say that the generated executables are slightly slower with newer versions.
In our tests, the main factor is perhaps not the compiler itself but the RTL, which was not modified with speed in mind since Delphi 2010.
Even though mORMot code bypasses the RTL for most of its processing, we can still see some speed regressions when compared to pre-Unicode versions of Delphi.
In some cases, the generated asm has been faster since Delphi 2007, mainly thanks to function inlining.
But we can't say that newer versions of the Delphi compiler generate much better code.

To be fair... just like GCC or other compilers! The only dimension where performance did improve by an order of magnitude is floating-point processing, and auto-vectorisation of the code, using SSE instructions.
But for business code (like database or client-server processing), the main point is definitely not the compiler, but the algorithm. The hardware did improve a lot (pipelining, cache, multi-core...), and is the main improvement axis.

When testing the FreePascal compiler, we found that the generated code is slightly slower than Delphi's.
But it is still perfectly usable in production, generates smaller executables, and offers better cross-platform support and a tuned RTL.
Luigi Sandon

  Posted: Jun 10, 2014 12:25 AM   in response to: Arnaud BOUCHEZ
but the RTL, which was not modified with speed in mind since Delphi 2010.

Worse. It was modified to be portable using a "single code approach" (because maintaining separate RTL code for each platform would be expensive), at the expense of speed. And probably by unskilled developers as well, who no longer had the free code supplied by the FastCode project.
Arnaud BOUCHEZ

  Posted: Jun 10, 2014 1:31 AM   in response to: Luigi Sandon
Luigi Sandon wrote:
but the RTL, which was not modified with speed in mind since Delphi 2010.

Worse. It was modified to be portable using a "single code approach"

Indeed.
But the worst part is that this "single code" was neither optimized nor profiled, as you stated.
For instance, I found that val() with Int64 performs very well on CPU64 but rather poorly on Win32, whereas the FastCode asm version for Win32 performed far better.

You can write "single code" which performs well on most targets.
For instance, we only see a speed difference of a few percent when we disable our tuned asm code and rely on our optimized "pure pascal" version in our regression tests.

My concern is that a lot of people reacted to https://forums.embarcadero.com/thread.jspa?threadID=105700 by writing that they were "amazed by XE6", without realizing that there is no benefit, in terms of speed, over previous versions of the compiler.
They may be rather disappointed after switching to XE6 when writing non-FMX applications (i.e. server or VCL apps).
Luigi Sandon

  Posted: Jun 10, 2014 4:51 AM   in response to: Arnaud BOUCHEZ
You can write "single code" which performs well on most targets.
For instance, we only see a speed difference of a few percent when we disable our tuned asm code and rely on our optimized "pure pascal" version in our regression tests.

IMHO one would need to assess which functions need per-platform/CPU optimization, and which can reasonably be kept in an optimized "pure pascal" version. Maybe some core, low-level functions could benefit from heavily optimized code for a given platform/CPU, while some higher-level ones could be built using the "pure pascal" implementations, as long as the compiler is good enough (some of the FPU code shown lately was not encouraging...).

But as you said, they would need to be carefully written, profiled and benchmarked. Embarcadero still fails to understand that one of the reasons people turn to native tools is speed and optimization. If you remove them, you get very little advantage over non-native tools, which can also support cross-platform development more easily (look at Xamarin). And even on platforms other than Windows, developers may look for performance, not only code reuse. And developers who have no code to reuse, what should they look for? <G>
Dalija Prasnikar

  Posted: Jun 10, 2014 2:32 AM   in response to: Luigi Sandon
Luigi Sandon wrote:
but the RTL, which was not modified with speed in mind since Delphi 2010.

Worse. It was modified to be portable using a "single code approach" (because maintaining separate RTL code for each platform would be expensive), at the expense of speed. And probably by unskilled developers as well, who no longer had the free code supplied by the FastCode project.

They should have left the Windows part untouched and gone with "single code" for other platforms.
No matter how "hot" they think mobile is, Windows is still going to be the main bread and butter for many.

Dalija Prasnikar
Joseph Mitzen

  Posted: Jun 10, 2014 11:51 PM   in response to: Arnaud BOUCHEZ
Arnaud BOUCHEZ wrote:

To be fair... just like GCC or other compilers!

I don't know if that's completely fair. For instance, when GCC added specific tuning for the "Bulldozer" architecture of AMD's FX CPUs, some benchmarks saw significant improvements....

http://www.phoronix.com/scan.php?page=article&item=amd_fx8150_compilers&num=6

I'm going to go out on a limb and speculate that EMBT doesn't spend time fine-tuning their compiler for Bulldozer. Heck, I remember many years ago on the newsgroup I had a nice new 450MHz CPU with MMX and AMD 3DNow! instruction sets and was disappointed to learn that there was no compiler option to use them. Then as now, most of the advice revolved around writing assembly code myself to use them. What really hurt was that not only was there no 3DNow! option in the compiler settings, there was still an option to avoid the original Pentium 60 floating point division bug. I wonder when they finally removed that?
Leif Uneus

  Posted: Jun 11, 2014 12:39 AM   in response to: Joseph Mitzen
..., there was still an option to avoid the original Pentium 60 floating point division bug. I wonder when they finally removed that?

It's still there.

See http://stackoverflow.com/q/24101125/576719 “Pentium-safe FDIV” … in year 2014?
with a reply from Allen Bauer.

/Leif
Janez Atmapuri ...

  Posted: Jun 11, 2014 1:25 AM   in response to: Arnaud BOUCHEZ
Dear Arnaud,

The only dimension where performance did improve by an order of magnitude is floating-point processing, and auto-vectorisation of the code, using SSE instructions.

This is only the most obvious part.

But for business code (like database or client-server processing), the main point is definitely not the compiler, but the algorithm. The hardware did improve a lot (pipelining, cache, multi-core...), and is the main improvement axis.

There is a large class of problems where you do have "business" logic, but the algorithm can be rewritten to be vectorized. Intel has been running seminars and publishing books on this topic for 15 years now. SSE and vectorization do not in any way imply that floating point is the only area of science that can benefit.
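
As a hedged illustration of vectorizing something that is not floating point, the sketch below counts a delimiter byte in a text record, 16 bytes per step, using SSE2 integer intrinsics from `<emmintrin.h>`. It is written in C because that is where these intrinsics are readily available; the function names are made up for this example, and the code falls back to the plain loop when SSE2 is not enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Portable reference: examines one byte per iteration. */
size_t count_byte_scalar(const uint8_t *p, size_t n, uint8_t c)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (p[i] == c);
    return count;
}

/* Vectorized version: compares 16 bytes at once when SSE2 is available. */
size_t count_byte_simd(const uint8_t *p, size_t n, uint8_t c)
{
    assert(p != NULL || n == 0);
    size_t count = 0;
    size_t i = 0;

#if defined(__SSE2__)
    const __m128i needle = _mm_set1_epi8((char)c);   /* c repeated 16 times */
    for (; i + 16 <= n; i += 16) {
        /* Unaligned load, so the caller has no alignment obligation. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
        /* One bit per byte position: set where chunk[k] == c. */
        unsigned mask =
            (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        /* Count the set bits, clearing the lowest one each round. */
        while (mask) {
            mask &= mask - 1;
            count++;
        }
    }
#endif

    /* Remaining tail (or the whole buffer without SSE2). */
    for (; i < n; i++)
        count += (p[i] == c);
    return count;
}
```

Nothing here involves floating point: it is the kind of scanning a CSV or protocol parser does all day. Whether it pays off in a given application is something only a profiler can tell.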

Since Delphi does not support SSE, much of the effort people might have put into vectorizing business logic was never made.

I would even go as far as saying that "business logic" and "bad coding style" are not far from having an equality sign between them. Both describe code rooted in the past which can't be accelerated on modern hardware with any compiler.

Kind Regards!
Atmapuri
Markus Humm

  Posted: Jun 11, 2014 12:51 PM   in response to: Janez Atmapuri ...
On 11.06.2014 10:25, Janez Atmapuri Makovsek wrote:
Dear Arnaud,

The only dimension where performance did improve by an order of magnitude is floating-point processing, and auto-vectorisation of the code, using SSE instructions.

This is only the most obvious part.

But for business code (like database or client-server processing), the main point is definitely not the compiler, but the algorithm. The hardware did improve a lot (pipelining, cache, multi-core...), and is the main improvement axis.

There is a large class of problems where you do have "business" logic, but the algorithm can be rewritten to be vectorized. Intel has been running seminars and publishing books on this topic for 15 years now. SSE and vectorization do not in any way imply that floating point is the only area of science that can benefit.

Since Delphi does not support SSE, much of the effort people might have put into vectorizing business logic was never made.

Hello,

the x64 compiler does support SSE and afaik SSE2.

Greetings

Markus
Janez Atmapuri ...

  Posted: Jun 11, 2014 10:27 PM   in response to: Markus Humm
the x64 compiler does support SSE and afaik SSE2.

It was meant in the sense of vectorization. The x64 compiler uses SSE2 to handle a single variable at a time, never two or more at the same time, which is the purpose of SSE2.
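
The difference between the two uses of SSE2 can be sketched in C with intrinsics (an illustration only, with hypothetical function names; it does not show what any Delphi compiler actually emits). `add_scalar` uses the SSE2 registers for one double per instruction, `add_packed` processes two doubles per instruction, and both fall back to plain C when SSE2 is not enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Scalar use of SSE2: each add touches only the low lane of the register. */
void add_scalar(const double *a, const double *b, double *out, size_t n)
{
    assert(n == 0 || (a && b && out));
    for (size_t i = 0; i < n; i++) {
#if defined(__SSE2__)
        __m128d x = _mm_load_sd(a + i);            /* load one double */
        __m128d y = _mm_load_sd(b + i);
        _mm_store_sd(out + i, _mm_add_sd(x, y));   /* add the low lane only */
#else
        out[i] = a[i] + b[i];
#endif
    }
}

/* Packed use of SSE2: each add handles two doubles at the same time. */
void add_packed(const double *a, const double *b, double *out, size_t n)
{
    assert(n == 0 || (a && b && out));
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 2 <= n; i += 2) {
        __m128d x = _mm_loadu_pd(a + i);           /* load two doubles */
        __m128d y = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(x, y));  /* add both lanes */
    }
#endif
    for (; i < n; i++)                             /* odd tail element, if any */
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results; the packed one simply issues about half as many add instructions. The unaligned load/store variants are used here so the sketch has no alignment requirement on its inputs.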
Mark Lumak

  Posted: Jun 11, 2014 5:44 AM   in response to: Arnaud BOUCHEZ
Compiler comparisons should also cover which non-Pascal instructions (asm, SSE, AVX, and the corresponding memory alignment requirements) are supported in the source, because they can lead to a huge performance difference. We use non-Delphi DLLs for such instructions.