The Dhrystone benchmark was designed to test performance factors important in non-numeric systems programming (operating systems, compilers, word processors, etc.):
There are two versions of the Dhrystone benchmark. The deprecated version 1.1 contained some 'dead code' that optimising compilers could remove. Version 2.1 corrected this and is the version that should be used in practice (it is also the one in the uClinux distribution). Some manufacturers, however, still quote the (better) results of version 1.1, so when comparing Dhrystone performance figures, take care to check which version was used.
If Dhrystone data is used for comparison purposes, it is important that the conditions of the benchmark are well understood, including:
These questions were borrowed from a white paper written by Richard York.
Some people still occasionally ask for Dhrystone MIPS data, and so we will continue to provide these numbers. However, we do not recommend that Dhrystone results be used as part of any embedded processor evaluation exercise, due to the benchmark's many known deficiencies. Where we provide a Dhrystone figure, it will be based on our standard tool suite under the conditions outlined above, and will therefore be 100% publicly and independently reproducible.
These results were taken on the processor, with Dhrystone compiled as a Linux application.
rgetz@pinky:~/blackfin1/uClinux-dist/user/dhrystone> bfin-uclinux-gcc --version
bfin-uclinux-gcc (ADI-trunk/svn-3648) 4.3.4
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rgetz@pinky:~/blackfin/uclinux-dist/user/dhrystone> bfin-linux-uclibc-gcc -O3 -DNO_PROTOTYPES=1 -c -o dhry_1.o dhry_1.c
rgetz@pinky:~/blackfin/uclinux-dist/user/dhrystone> bfin-linux-uclibc-gcc -O3 -DNO_PROTOTYPES=1 -c -o dhry_2.o dhry_2.c
rgetz@pinky:~/blackfin/uclinux-dist/user/dhrystone> bfin-linux-uclibc-gcc -O3 -DNO_PROTOTYPES=1 dhry_1.o dhry_2.o -o dhrystone
rgetz@pinky:~/blackfin/uclinux-dist/user/dhrystone> rcp ./dhrystone root@192.168.0.8:/dhrystone

root:/> cat /proc/cpuinfo
processor : 0
vendor_id : Analog Devices
cpu family : 0x27c8000
model name : ADSP-BF537 600(MHz CCLK) 120(MHz SCLK)
stepping : 2
cpu MHz : 600.000/120.000
bogomips : 1196.03
Calibration : 598016000 loops
cache size : 16 KB(L1 icache) 32 KB(L1 dcache-wb) 0 KB(L2 cache)
dbank-A/B : cache/cache
icache setup : 4 Sub-banks/4 Ways, 32 Lines/Way
dcache setup : 2 Super-banks/4 Sub-banks/2 Ways, 64 Lines/Way
board name : ADDS-BF537-STAMP
board memory : 65536 kB (0x00000000 -> 0x04000000)
kernel memory : 57336 kB (0x00001000 -> 0x037ff000)
root:~> ./dhrystone_O0
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Please give the number of runs through the benchmark: 10000000
Execution starts, 10000000 runs through Dhrystone
Execution ends
Final values of the variables used in the benchmark:
Int_Glob: 5
should be: 5
Bool_Glob: 1
should be: 1
Ch_1_Glob: A
should be: A
Ch_2_Glob: B
should be: B
Arr_1_Glob[8]: 7
should be: 7
Arr_2_Glob[8][7]: 10000010
should be: Number_Of_Runs + 10
Ptr_Glob->
Ptr_Comp: 3707284
should be: (implementation-dependent)
Discr: 0
should be: 0
Enum_Comp: 2
should be: 2
Int_Comp: 17
should be: 17
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
Ptr_Comp: 3707284
should be: (implementation-dependent), same as above
Discr: 0
should be: 0
Enum_Comp: 1
should be: 1
Int_Comp: 18
should be: 18
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
should be: 5
Int_2_Loc: 13
should be: 13
Int_3_Loc: 7
should be: 7
Enum_Loc: 1
should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
should be: DHRYSTONE PROGRAM, 2'ND STRING
Microseconds for one run through Dhrystone: 3.0
Dhrystones per Second: 336021.5
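The last two lines of the output are two views of the same measurement. The following is a minimal sketch of how they relate, assuming both are derived from the elapsed time and the run count; the function names are illustrative and are not taken from the Dhrystone sources.

```python
def dhrystones_per_second(runs: int, elapsed_seconds: float) -> float:
    """Benchmark iterations completed per second."""
    return runs / elapsed_seconds


def microseconds_per_run(runs: int, elapsed_seconds: float) -> float:
    """Average time for one pass through the benchmark, in microseconds."""
    return elapsed_seconds * 1_000_000 / runs


runs = 10_000_000          # run count entered at the prompt above
reported_rate = 336021.5   # "Dhrystones per Second" from the output above

# Elapsed time implied by the reported rate (roughly 29.76 seconds).
elapsed = runs / reported_rate

# Rounded to one decimal place, this matches the reported 3.0 microseconds.
print(round(microseconds_per_run(runs, elapsed), 1))
```

The two figures are reciprocals of each other apart from the unit change, so quoting either one is enough to recover the other.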
All of these tests were run on the same processor (BF537 0.2, running at 500 MHz CCLK and 120 MHz SCLK), varying only the compiler optimization settings. Each run uses 10,000,000 (ten million) iterations to obtain accurate results.
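The results table covers five optimization levels, each combined with every on/off combination of four additional flags, for 80 builds in total. The sketch below enumerates those combinations in the same order as the table rows. It only generates the flag strings; the helper name `flag_combinations` is an assumption for illustration, and the actual build and run steps are not shown.

```python
from itertools import product

# Optimization levels, in the order they appear in the results table.
OPT_LEVELS = ["-Os", "-O0", "-O1", "-O2", "-O3"]

# Optional flags, in the order they are written in each table row.
OPTIONAL_FLAGS = [
    "-fomit-frame-pointer",
    "-ffast-math",
    "-funroll-loops",
    "-funsafe-loop-optimizations",
]


def flag_combinations():
    """Return every flag string: each level with each subset of optional flags."""
    combos = []
    for level in OPT_LEVELS:
        # product() varies the last flag fastest, matching the table order.
        for mask in product([False, True], repeat=len(OPTIONAL_FLAGS)):
            extras = [flag for flag, enabled in zip(OPTIONAL_FLAGS, mask) if enabled]
            combos.append(" ".join([level] + extras))
    return combos


combos = flag_combinations()
print(len(combos))  # 5 levels x 16 subsets = 80
for flags in combos[:4]:
    print(flags)
```

Each generated string could be passed to the compiler in place of the -O3 used in the build commands earlier on this page.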
| Flags 1) | size (bytes) | md5sum | Loops | Dhrystones per Second 2) | Dhrystone MIPS 3) | DMIPS/MHz |
|---|---|---|---|---|---|---|
-Os | 35364 | af6bdc6aa9887ba3d06eabf41d48ab1a | 10000000 | 398089.2 | 226.573 | 0.453146 |
-Os -funsafe-loop-optimizations | 35364 | ca63e189556a518daa56d392aa14cfc7 | 10000000 | 398089.2 | 226.573 | 0.453146 |
-Os -funroll-loops | 35364 | 20488fe3ab43751bc61ff8b05d3dc703 | 10000000 | 398724.1 | 226.935 | 0.453869 |
-Os -funroll-loops -funsafe-loop-optimizations | 35364 | 4cf225e93d70bccf611c3171a2fb216e | 10000000 | 398724.1 | 226.935 | 0.453869 |
-Os -ffast-math | 35364 | b9215c8d6889de04f3650bb2b532340b | 10000000 | 398089.2 | 226.573 | 0.453146 |
-Os -ffast-math -funsafe-loop-optimizations | 35364 | 3b495a4b4c60f627f203085b647b040d | 10000000 | 398089.2 | 226.573 | 0.453146 |
-Os -ffast-math -funroll-loops | 35364 | b6a963058c34ef89836f90fc0d154298 | 10000000 | 398724.1 | 226.935 | 0.453869 |
-Os -ffast-math -funroll-loops -funsafe-loop-optimizations | 35364 | 84e07215483de58f61eff7378eb34bea | 10000000 | 398724.1 | 226.935 | 0.453869 |
-Os -fomit-frame-pointer | 35324 | 070dac6558ef149ab191999559dde16c | 10000000 | 419639.1 | 238.838 | 0.477677 |
-Os -fomit-frame-pointer -funsafe-loop-optimizations | 35324 | ee50be758aa32d44fc08f2797e32e22e | 10000000 | 419639.1 | 238.838 | 0.477677 |
-Os -fomit-frame-pointer -funroll-loops | 35324 | 3a526a30f33ea456f28fcaa9de565f3d | 10000000 | 420344.7 | 239.24 | 0.47848 |
-Os -fomit-frame-pointer -funroll-loops -funsafe-loop-optimizations | 35324 | 4b26618cd384c38af8c4ae50f3486b63 | 10000000 | 420344.7 | 239.24 | 0.47848 |
-Os -fomit-frame-pointer -ffast-math | 35324 | 383be4fcab72ef3be9b3f3af07e3f456 | 10000000 | 419463.1 | 238.738 | 0.477476 |
-Os -fomit-frame-pointer -ffast-math -funsafe-loop-optimizations | 35324 | f596b3dbd664f79727741476deeed312 | 10000000 | 419639.1 | 238.838 | 0.477677 |
-Os -fomit-frame-pointer -ffast-math -funroll-loops | 35324 | 03522a57744a4a78c163948d5c8ded63 | 10000000 | 420344.7 | 239.24 | 0.47848 |
-Os -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations | 35324 | 09e7a669e85b9899cd8705e50b2aa1cc | 10000000 | 420168.1 | 239.139 | 0.478279 |
-O0 | 36448 | 6e640730c45f3e5eb08a4b5f1e983718 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -funsafe-loop-optimizations | 36448 | 596a9b3d200f3047e7a06d8b1d684390 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -funroll-loops | 36448 | 4a06206b0bb3c416df301bdafea2c228 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -funroll-loops -funsafe-loop-optimizations | 36448 | b3b0d9d9a6f944d58ed1572a950cc92e | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -ffast-math | 36448 | 6a2cfbc971f7ec453812eedde46976a4 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -ffast-math -funsafe-loop-optimizations | 36448 | 28a44f6847d621979bf24cfad01dcc36 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -ffast-math -funroll-loops | 36448 | beb5cbb31f05b3cf0139620a65e849fe | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -ffast-math -funroll-loops -funsafe-loop-optimizations | 36448 | 2a455e7b1d6d0e81847fe54bd144a950 | 10000000 | 358680.1 | 204.143 | 0.408287 |
-O0 -fomit-frame-pointer | 36544 | 1ae7b3a40f02aa8b8cb05cb1e11dc4f4 | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O0 -fomit-frame-pointer -funsafe-loop-optimizations | 36544 | 8bca489799d5ab25963a451ec06cb02b | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O0 -fomit-frame-pointer -funroll-loops | 36544 | 2df7cb97d2af09a9559c15ced3e41d13 | 10000000 | 375093.8 | 213.485 | 0.426971 |
-O0 -fomit-frame-pointer -funroll-loops -funsafe-loop-optimizations | 36544 | 1bee72b38041fd6569ffed03f626322e | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O0 -fomit-frame-pointer -ffast-math | 36544 | c5cc07a32e481df50dac7b0629af3dd9 | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O0 -fomit-frame-pointer -ffast-math -funsafe-loop-optimizations | 36544 | 3ea72f97af19412a9e9ae43247769556 | 10000000 | 374812.6 | 213.325 | 0.426651 |
-O0 -fomit-frame-pointer -ffast-math -funroll-loops | 36544 | d04d977b26c118b6b56959f6ee675616 | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O0 -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations | 36544 | 057f93bbb54903604cbbbee318143dcf | 10000000 | 374953.1 | 213.405 | 0.426811 |
-O1 | 35372 | 555aa7557093dd446ca5c334bd616222 | 10000000 | 527148.1 | 300.027 | 0.600055 |
-O1 -funsafe-loop-optimizations | 35372 | ceec41ae53ae6c37be754282ec83f162 | 10000000 | 535618.6 | 304.848 | 0.609697 |
-O1 -funroll-loops | 35372 | d4b4de7851396208a948ce6ec3ad0e51 | 10000000 | 519210.8 | 295.51 | 0.59102 |
-O1 -funroll-loops -funsafe-loop-optimizations | 35372 | 86db4200d3176d4a95dc0169c6612aeb | 10000000 | 528820.8 | 300.979 | 0.601959 |
-O1 -ffast-math | 35372 | b65b58c929754c113386fa49f8c90fe4 | 10000000 | 527148.1 | 300.027 | 0.600055 |
-O1 -ffast-math -funsafe-loop-optimizations | 35372 | d06a19db3c50e8b28e06286502ac014c | 10000000 | 535905.7 | 305.012 | 0.610024 |
-O1 -ffast-math -funroll-loops | 35372 | 233b5e4afd7b5ae659769561c8df20ce | 10000000 | 519480.5 | 295.663 | 0.591327 |
-O1 -ffast-math -funroll-loops -funsafe-loop-optimizations | 35372 | 4fc45e7d91855cfe897a88ff1e6ea37f | 10000000 | 528820.8 | 300.979 | 0.601959 |
-O1 -fomit-frame-pointer | 35356 | 72f8b7ef8a88753f175f8e52a648c6e4 | 10000000 | 579038.8 | 329.561 | 0.659122 |
-O1 -fomit-frame-pointer -funsafe-loop-optimizations | 35356 | 9ff9634782c6d74ae7e21c23295834b6 | 10000000 | 592066.3 | 336.976 | 0.673951 |
-O1 -fomit-frame-pointer -funroll-loops | 35356 | ffae4167b98dacbd1cacef2dab5bde6a | 10000000 | 574712.6 | 327.099 | 0.654198 |
-O1 -fomit-frame-pointer -funroll-loops -funsafe-loop-optimizations | 35356 | 70ebd3f98429e7b327fdf6820dff7c3d | 10000000 | 583771.2 | 332.255 | 0.664509 |
-O1 -fomit-frame-pointer -ffast-math | 35356 | 4774fcd7727a9f0c287f1ff861ccb6bc | 10000000 | 579038.8 | 329.561 | 0.659122 |
-O1 -fomit-frame-pointer -ffast-math -funsafe-loop-optimizations | 35356 | 246e46ea2c8b6b0ad6f62c2c00a40df2 | 10000000 | 592066.3 | 336.976 | 0.673951 |
-O1 -fomit-frame-pointer -ffast-math -funroll-loops | 35356 | eafea6b0f13285e3fb4f3c81c3f1081b | 10000000 | 574712.6 | 327.099 | 0.654198 |
-O1 -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations | 35356 | e592fca7c8c0ae58bc506e7255508b9f | 10000000 | 583430.6 | 332.061 | 0.664121 |
-O2 | 35852 | eacb2e445e955baee80ba0f31e376eca | 10000000 | 697350.1 | 396.898 | 0.793796 |
-O2 -funsafe-loop-optimizations | 35852 | 81958b4fc25675f57d6e9680308f0c33 | 10000000 | 697836.7 | 397.175 | 0.79435 |
-O2 -funroll-loops | 35852 | 201f30a46cb2cb8407a30a682e901352 | 10000000 | 700770.9 | 398.845 | 0.79769 |
-O2 -funroll-loops -funsafe-loop-optimizations | 35852 | fc379475d6dc6c428a74b690f39e39d9 | 10000000 | 700770.9 | 398.845 | 0.79769 |
-O2 -ffast-math | 35852 | d0abae396fa85853deacf830528f0b1c | 10000000 | 697350.1 | 396.898 | 0.793796 |
-O2 -ffast-math -funsafe-loop-optimizations | 35852 | b9dfbbe3d047acc972ae05b87454d50d | 10000000 | 697350.1 | 396.898 | 0.793796 |
-O2 -ffast-math -funroll-loops | 35852 | 3d3fd64f295d1f08a861d0b5f98e54d6 | 10000000 | 700770.9 | 398.845 | 0.79769 |
-O2 -ffast-math -funroll-loops -funsafe-loop-optimizations | 35852 | f62ce4fa41e9caab7850f708e4ff2a1c | 10000000 | 700770.9 | 398.845 | 0.79769 |
-O2 -fomit-frame-pointer | 35820 | 23efc0484eb1412c8d03e36da9d58d7d | 10000000 | 758725.3 | 431.83 | 0.86366 |
-O2 -fomit-frame-pointer -funsafe-loop-optimizations | 35820 | dcd5ffba3cec11d52a6ef38a6c680e4f | 10000000 | 758725.3 | 431.83 | 0.86366 |
-O2 -fomit-frame-pointer -funroll-loops | 35820 | eae900d8703ab793b50fb10ae8f55393 | 10000000 | 761035.0 | 433.145 | 0.866289 |
-O2 -fomit-frame-pointer -funroll-loops -funsafe-loop-optimizations | 35820 | 75044b285c6460a558b82c2aa8eba1a4 | 10000000 | 761035.0 | 433.145 | 0.866289 |
-O2 -fomit-frame-pointer -ffast-math | 35820 | e16a62e362fcee6bab43da2145e86012 | 10000000 | 758150.1 | 431.503 | 0.863005 |
-O2 -fomit-frame-pointer -ffast-math -funsafe-loop-optimizations | 35820 | fab2d89108e47660b594a2c0707ae9da | 10000000 | 758725.3 | 431.83 | 0.86366 |
-O2 -fomit-frame-pointer -ffast-math -funroll-loops | 35820 | 0449cc215b65e44717cfd404bb3be632 | 10000000 | 761035.0 | 433.145 | 0.866289 |
-O2 -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations | 35820 | 9cd77ce165a92d6aa20c9d993efdd61e | 10000000 | 760456.3 | 432.815 | 0.86563 |
-O3 | 35852 | fdb28a2b7ba7143587455c275fde7b66 | 10000000 | 697836.7 | 397.175 | 0.79435 |
-O3 -funsafe-loop-optimizations | 35852 | cf616e1bfaa54d658f19720d5e947619 | 10000000 | 697350.1 | 396.898 | 0.793796 |
-O3 -funroll-loops | 35852 | c5c4c50ae1306ee7cf78461f30129c4d | 10000000 | 700280.1 | 398.566 | 0.797132 |
-O3 -funroll-loops -funsafe-loop-optimizations | 35852 | 8f5a156351237ed7b0d0ab0d1c0729d7 | 10000000 | 700280.1 | 398.566 | 0.797132 |
-O3 -ffast-math | 35852 | 61353a84e596e70926ede018e70302b2 | 10000000 | 697836.7 | 397.175 | 0.79435 |
-O3 -ffast-math -funsafe-loop-optimizations | 35852 | ddae039f1838554f7211ce265c6f2953 | 10000000 | 697350.1 | 396.898 | 0.793796 |
-O3 -ffast-math -funroll-loops | 35852 | f4b04e0eb021e12d6942ae56234cf11f | 10000000 | 700770.9 | 398.845 | 0.79769 |
-O3 -ffast-math -funroll-loops -funsafe-loop-optimizations | 35852 | c335c472b522736234ccf304e5f43e24 | 10000000 | 700280.1 | 398.566 | 0.797132 |
-O3 -fomit-frame-pointer | 35820 | 8e7ec6fa10cc4d2dfb11595c1237296c | 10000000 | 758150.1 | 431.503 | 0.863005 |
-O3 -fomit-frame-pointer -funsafe-loop-optimizations | 35820 | 41737019865627c2af5c084fc986b2ae | 10000000 | 758150.1 | 431.503 | 0.863005 |
-O3 -fomit-frame-pointer -funroll-loops | 35820 | 843026d94f073928fa1793d168520c6d | 10000000 | 761035.0 | 433.145 | 0.866289 |
-O3 -fomit-frame-pointer -funroll-loops -funsafe-loop-optimizations | 35820 | b871965048f5dc59e0e409f179fc9c9e | 10000000 | 760456.3 | 432.815 | 0.86563 |
-O3 -fomit-frame-pointer -ffast-math | 35820 | 9df7519ad3b51e9b595b86e0e4977a50 | 10000000 | 758725.3 | 431.83 | 0.86366 |
-O3 -fomit-frame-pointer -ffast-math -funsafe-loop-optimizations | 35820 | 0b789139f787960636c604b27ae79ac0 | 10000000 | 758725.3 | 431.83 | 0.86366 |
-O3 -fomit-frame-pointer -ffast-math -funroll-loops | 35820 | 869afcd1055dfa34e076707238769168 | 10000000 | 761035.0 | 433.145 | 0.866289 |
-O3 -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations | 35820 | 362ba1f9f25411fa8afc8d1e35078f7a | 10000000 | 760456.3 | 432.815 | 0.86563 |
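The last two columns of the table follow from the Dhrystones per Second column. A minimal sketch of the arithmetic, assuming the conventional reference score of 1757 Dhrystones per second for the VAX 11/780 (the nominal 1 MIPS machine) and the 500 MHz CCLK stated above; the function names are illustrative.

```python
# Dhrystones per second of the VAX 11/780, the conventional 1 MIPS reference.
VAX_11_780_DHRYSTONES_PER_SECOND = 1757

# Core clock used for the table, as stated in the text above.
CCLK_MHZ = 500


def dmips(dhrystones_per_second: float) -> float:
    """Dhrystone MIPS: performance relative to the VAX 11/780."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystones_per_second: float, cclk_mhz: float = CCLK_MHZ) -> float:
    """Dhrystone MIPS normalized by the core clock frequency."""
    return dmips(dhrystones_per_second) / cclk_mhz


# Two rows taken from the table above.
for flags, rate in [
    ("-O0", 358680.1),
    ("-O3 -fomit-frame-pointer -funroll-loops", 761035.0),
]:
    print(f"{flags}: {dmips(rate):.3f} DMIPS, {dmips_per_mhz(rate):.6f} DMIPS/MHz")
```

Because DMIPS/MHz divides out the clock rate, it is the figure to use when comparing against parts that run at a different frequency.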
“Benchmarking without analysis is as useless as analysis without benchmarking.” - Richard P. Gabriel, Performance and Evaluation of Lisp Systems, 1985
We can see that applications like Dhrystone, which fit completely in cache, are only slightly affected by changes to the SCLK rate.
Before we can look at detailed analysis, some basic gcc flags must be understood:
-static
-O0
-Os: Enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size. -Os disables the following optimization flags: -falign-functions -falign-jumps -falign-loops -falign-labels -freorder-blocks -freorder-blocks-and-partition -fprefetch-loop-arrays -ftree-vect-loop-version
-O1
-O2
-O3: Turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops and -fgcse-after-reload options.
-fomit-frame-pointer
-ffunction-sections
-fdata-sections
-gc-sections: Used together with -ffunction-sections and -fdata-sections, this can make applications smaller.

Over 99.7% of the processor time is spent running dhrystone.
| Total CPU Time | Application | Function |
|---|---|---|
| 46.44% | dhrystone | _main |
| 12.98% | libm-0.9.29.so | ___udivsi3 |
| 9.08% | libuClibc-0.9.29.so | _strcmp |
| 8.24% | dhrystone | _Proc_8 |
| 7.51% | dhrystone | _Func_2 |
| 6.23% | dhrystone | _Proc_7 |
| 3.43% | dhrystone | _Func_1 |
| 2.25% | dhrystone | ___divsi3 |
| 2.21% | dhrystone | _Proc_6 |
| 1.40% | dhrystone | __init |
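To see how the time divides between the benchmark itself and the libraries it calls, the rows shown above can be grouped by application. This is a small sketch over the figures copied from the table; the helper name `time_by_application` is an assumption for illustration.

```python
from collections import defaultdict

# (percent of total CPU time, application, function), copied from the table above.
PROFILE = [
    (46.44, "dhrystone", "_main"),
    (12.98, "libm-0.9.29.so", "___udivsi3"),
    (9.08, "libuClibc-0.9.29.so", "_strcmp"),
    (8.24, "dhrystone", "_Proc_8"),
    (7.51, "dhrystone", "_Func_2"),
    (6.23, "dhrystone", "_Proc_7"),
    (3.43, "dhrystone", "_Func_1"),
    (2.25, "dhrystone", "___divsi3"),
    (2.21, "dhrystone", "_Proc_6"),
    (1.40, "dhrystone", "__init"),
]


def time_by_application(rows):
    """Sum the CPU time percentage per application, rounded to two decimals."""
    totals = defaultdict(float)
    for percent, application, _function in rows:
        totals[application] += percent
    return {app: round(total, 2) for app, total in totals.items()}


totals = time_by_application(PROFILE)
overall = round(sum(percent for percent, _app, _func in PROFILE), 2)

for app, percent in totals.items():
    print(f"{app}: {percent}%")
print(f"all rows: {overall}%")
```

On these figures, roughly a fifth of the time goes to library routines (unsigned division and strcmp) rather than to code in the dhrystone binary itself.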