----------------------------------------------------------------
STREAM Memory Benchmark v0.2
----------------------------------------------------------------
The Test will run some minutes please be patient.
Total memory required = 160.0 MB.
Each test is run 3 times, but only the *best* time for each is used.
----------------------------------------------------------------
Memory throughput
Working on Arrays of 80 MB.
----------------------------------------------------------------
Read test (summing up the array).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
read 8              235.2284    0.3414    0.3401    0.3427
read 32             349.2183    0.2298    0.2291    0.2305
read 64             357.6648    0.2237    0.2237    0.2237
read 32x2           346.8911    0.2306    0.2306    0.2306
read 32x4           347.8971    0.2308    0.2300    0.2316
read 32 CP3         692.7724    0.1156    0.1155    0.1158
read 32 CP4         831.9123    0.0969    0.0962    0.0977
read 32 CP5 *       957.7950    0.0835    0.0835    0.0835
read 32 CP6         925.4965    0.0871    0.0864    0.0879
read 32x4 CP3       801.4185    0.0999    0.0998    0.1000
read 32x4 CP4       925.5195    0.0866    0.0864    0.0868
read 32x4 CP5       952.2689    0.0842    0.0840    0.0843
read 32x4 CP6       932.0313    0.0859    0.0858    0.0860
----------------------------------------------------------------
Write test (setting array A).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
write 8             397.3812    0.2023    0.2013    0.2033
write 32            718.9783    0.1119    0.1113    0.1124
write 64            720.3119    0.1114    0.1111    0.1116
write 32x2          711.1627    0.1127    0.1125    0.1128
write 32x4          718.2612    0.1121    0.1114    0.1128
memset 750 *        742.0116    0.1084    0.1078    0.1089
memset 750 0        741.5655    0.1079    0.1079    0.1079
libmoto memset      741.4066    0.1082    0.1079    0.1084
glibc memset        381.9488    0.2095    0.2095    0.2096
glibc memset0       744.4976    0.1081    0.1075    0.1087
----------------------------------------------------------------
Compare test (comparing the source and destination arrays).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
cmp 8               278.4667    0.5753    0.5746    0.5760
cmp 32              329.8493    0.4857    0.4851    0.4863
cmp 64              331.1902    0.4839    0.4831    0.4847
cmp 32x2            329.8418    0.4852    0.4851    0.4852
cmp 32x4            334.3683    0.4791    0.4785    0.4796
cmp 32 CP2          496.2227    0.3241    0.3224    0.3257
cmp 32 CP3 *        858.0193    0.1877    0.1865    0.1888
cmp 32 CP4          826.0884    0.1938    0.1937    0.1939
cmp 32 CP5          847.7366    0.1895    0.1887    0.1902
cmp 32 CP6          815.7357    0.1961    0.1961    0.1961
cmp 32x4 CP2        834.4756    0.1921    0.1917    0.1925
cmp 32x4 CP3        869.4423    0.1847    0.1840    0.1854
cmp 32x4 CP4        813.3836    0.1969    0.1967    0.1970
cmp 32x4 CP5        840.4724    0.1913    0.1904    0.1923
cmp 32x4 CP6        796.5425    0.2018    0.2009    0.2027
libmoto memcmp      281.2994    0.5694    0.5688    0.5699
glibc memcmp        336.2008    0.4767    0.4759    0.4774
----------------------------------------------------------------
Copy test (copying array A -> B).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
copy 8              408.3237    0.3922    0.3918    0.3925
copy 32             423.7064    0.3776    0.3776    0.3777
copy 64             425.9025    0.3757    0.3757    0.3757
copy 32x2           412.9055    0.3875    0.3875    0.3876
copy 32x4           413.5383    0.3869    0.3869    0.3869
copy 32 CP2         512.0754    0.3125    0.3125    0.3126
copy 32 CP3         639.1509    0.2506    0.2503    0.2508
copy 32 CP4 *       682.4022    0.2346    0.2345    0.2347
copy 32 CP5         661.4141    0.2420    0.2419    0.2420
copy 32x4 CP2       615.2543    0.2602    0.2601    0.2603
copy 32x4 CP3       638.9677    0.2505    0.2504    0.2506
copy 32x4 CP4       656.7010    0.2437    0.2436    0.2437
copy 32x4 CP5       652.9306    0.2452    0.2450    0.2454
copy 64x4 CP4       649.6062    0.2466    0.2463    0.2470
copy 64x4 CP4C      670.9243    0.2386    0.2385    0.2387
glibcb memcpy       412.5805    0.3881    0.3878    0.3884
bmove512            425.5331    0.3766    0.3760    0.3773
FC64                688.1986    0.2327    0.2325    0.2330
libmoto memcpy      617.6007    0.2593    0.2591    0.2595
memcpy 750          568.8932    0.2813    0.2812    0.2813
----------------------------------------------------------------
2nd level cache throughput
Working on Arrays of 80 KB.
----------------------------------------------------------------
Read test (summing up the array).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
read 8              464.9405    0.1725    0.1721    0.1730
read 32            2268.9237    0.0353    0.0353    0.0354
read 64            2267.9576    0.0354    0.0353    0.0355
read 32x2          2385.9911    0.0339    0.0335    0.0343
read 32x4          2387.3492    0.0336    0.0335    0.0336
read 32 CP3        2159.8275    0.0372    0.0370    0.0373
read 32 CP4        2146.2749    0.0379    0.0373    0.0385
read 32 CP5 *      2153.7692    0.0373    0.0371    0.0374
read 32 CP6        2159.2438    0.0371    0.0371    0.0371
read 32x4 CP3      3215.0423    0.0249    0.0249    0.0250
read 32x4 CP4      3231.9504    0.0252    0.0248    0.0256
read 32x4 CP5      3203.7152    0.0254    0.0250    0.0259
read 32x4 CP6      3218.2801    0.0249    0.0249    0.0250
----------------------------------------------------------------
Write test (setting array A).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
write 8             705.6915    0.1138    0.1134    0.1142
write 32           2388.0628    0.0335    0.0335    0.0336
write 64           3024.2296    0.0265    0.0265    0.0265
write 32x2         2377.9930    0.0338    0.0336    0.0339
write 32x4         2385.7027    0.0340    0.0335    0.0345
memset 750 *       2507.6177    0.0326    0.0319    0.0334
memset 750 0       4122.8752    0.0194    0.0194    0.0194
libmoto memset     4096.0500    0.0196    0.0195    0.0196
glibc memset       3000.8614    0.0270    0.0267    0.0274
glibc memset0      4115.6438    0.0195    0.0194    0.0195
----------------------------------------------------------------
Compare test (comparing the source and destination arrays).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
cmp 8               725.8936    0.2213    0.2204    0.2222
cmp 32             2826.8981    0.0566    0.0566    0.0567
cmp 64             2920.5576    0.0557    0.0548    0.0567
cmp 32x2           2912.4330    0.0556    0.0549    0.0562
cmp 32x4           3013.5283    0.0532    0.0531    0.0534
cmp 32 CP2         2387.4596    0.0670    0.0670    0.0670
cmp 32 CP3 *       2380.3463    0.0679    0.0672    0.0686
cmp 32 CP4         2381.2331    0.0673    0.0672    0.0674
cmp 32 CP5         2375.6864    0.0676    0.0673    0.0678
cmp 32 CP6         2379.9242    0.0673    0.0672    0.0674
cmp 32x4 CP2       3705.5838    0.0432    0.0432    0.0432
cmp 32x4 CP3       3777.0572    0.0424    0.0424    0.0424
cmp 32x4 CP4       3772.2590    0.0425    0.0424    0.0425
cmp 32x4 CP5       3768.8693    0.0431    0.0425    0.0437
cmp 32x4 CP6       3775.5485    0.0425    0.0424    0.0425
libmoto memcmp     6279.4270    0.0281    0.0255    0.0307
----------------------------------------------------------------
Copy test (copying array A -> B).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
copy 8              898.4722    0.1783    0.1781    0.1784
copy 32            3011.6485    0.0531    0.0531    0.0531
copy 64            4818.8235    0.0337    0.0332    0.0341
copy 32x2          2848.7019    0.0562    0.0562    0.0563
copy 32x4          3216.8608    0.0500    0.0497    0.0503
copy 32 CP2        2409.2820    0.0667    0.0664    0.0671
copy 32 CP3        2388.6663    0.0673    0.0670    0.0676
copy 32 CP4 *      2406.1981    0.0669    0.0665    0.0672
copy 32 CP5        2395.1028    0.0673    0.0668    0.0678
copy 32x4 CP2      3037.0949    0.0527    0.0527    0.0527
copy 32x4 CP3      3059.2051    0.0525    0.0523    0.0527
copy 32x4 CP4      2979.4117    0.0538    0.0537    0.0539
copy 32x4 CP5      3035.2405    0.0540    0.0527    0.0552
copy 64x4 CP4      4946.9883    0.0326    0.0323    0.0329
copy 64x4 CP4C     4422.9716    0.0362    0.0362    0.0362
glibcb memcpy      2911.0433    0.0550    0.0550    0.0550
bmove512           2996.4264    0.0534    0.0534    0.0534
FC64               5313.8700    0.0303    0.0301    0.0304
libmoto memcpy     7746.3396    0.0207    0.0207    0.0207
memcpy 750         3114.7797    0.0515    0.0514    0.0516
----------------------------------------------------------------
1st level cache throughput
Working on Arrays of 800 B.
----------------------------------------------------------------
Read test (summing up the array).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
read 8              471.3059    0.1701    0.1697    0.1704
read 32            2807.6203    0.0285    0.0285    0.0286
read 64            2198.0997    0.0365    0.0364    0.0365
read 32x2          3755.3084    0.0213    0.0213    0.0214
read 32x4          4431.1489    0.0181    0.0181    0.0181
read 32 CP3        2752.8454    0.0291    0.0291    0.0292
read 32 CP4        2760.0461    0.0290    0.0290    0.0290
read 32 CP5 *      2760.6138    0.0290    0.0290    0.0290
read 32 CP6        2759.0021    0.0291    0.0290    0.0291
read 32x4 CP3      4392.4588    0.0182    0.0182    0.0183
read 32x4 CP4      4374.7059    0.0190    0.0183    0.0197
read 32x4 CP5      4202.8147    0.0191    0.0190    0.0192
read 32x4 CP6      4382.8200    0.0183    0.0183    0.0184
----------------------------------------------------------------
Write test (setting array A).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
write 8             705.1813    0.1141    0.1134    0.1147
write 32           2808.7014    0.0285    0.0285    0.0286
write 64           3744.0786    0.0214    0.0214    0.0214
write 32x2         4768.9642    0.0168    0.0168    0.0168
write 32x4         4739.3935    0.0174    0.0169    0.0180
memset 750 *       4027.9980    0.0199    0.0199    0.0199
memset 750 0      11017.7088    0.0073    0.0073    0.0073
libmoto memset     9698.0930    0.0083    0.0082    0.0083
glibc memset       3847.6323    0.0209    0.0208    0.0209
glibc memset0     22228.8387    0.0036    0.0036    0.0036
----------------------------------------------------------------
Compare test (comparing the source and destination arrays).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
cmp 8               764.7525    0.2095    0.2092    0.2097
cmp 32             3740.8645    0.0429    0.0428    0.0430
cmp 64             3472.6090    0.0462    0.0461    0.0464
cmp 32x2           4089.4355    0.0397    0.0391    0.0403
cmp 32x4           4095.5251    0.0391    0.0391    0.0391
cmp 32 CP2         2785.2705    0.0581    0.0574    0.0587
cmp 32 CP3 *       2798.0330    0.0572    0.0572    0.0573
cmp 32 CP4         2791.8869    0.0574    0.0573    0.0574
cmp 32 CP5         2797.8931    0.0572    0.0572    0.0572
cmp 32 CP6         2784.6926    0.0580    0.0575    0.0585
cmp 32x4 CP2       4021.1194    0.0400    0.0398    0.0402
cmp 32x4 CP3       4005.7102    0.0400    0.0399    0.0400
cmp 32x4 CP4       4012.7281    0.0403    0.0399    0.0407
cmp 32x4 CP5       4016.3544    0.0401    0.0398    0.0403
cmp 32x4 CP6       4030.3203    0.0397    0.0397    0.0398
libmoto memcmp     9352.3697    0.0171    0.0171    0.0171
----------------------------------------------------------------
Copy test (copying array A -> B).
----------------------------------------------------------------
Function         Rate (MB/s)  Avg time  Min time  Max time
copy 8              938.7798    0.1707    0.1704    0.1711
copy 32            3776.2708    0.0424    0.0424    0.0425
copy 64            7436.6268    0.0215    0.0215    0.0216
copy 32x2          3369.6970    0.0475    0.0475    0.0475
copy 32x4          3916.7531    0.0409    0.0409    0.0409
copy 32 CP2        2273.1507    0.0709    0.0704    0.0715
copy 32 CP3        2280.3728    0.0702    0.0702    0.0703
copy 32 CP4 *      2279.9545    0.0702    0.0702    0.0702
copy 32 CP5        2280.4038    0.0714    0.0702    0.0726
copy 32x4 CP2      4458.3201    0.0359    0.0359    0.0360
copy 32x4 CP3      4471.6288    0.0358    0.0358    0.0359
copy 32x4 CP4      4484.8074    0.0357    0.0357    0.0357
copy 32x4 CP5      4484.4478    0.0357    0.0357    0.0358
copy 64x4 CP4      7237.5639    0.0222    0.0221    0.0223
copy 64x4 CP4C     5909.2911    0.0276    0.0271    0.0281
glibcb memcpy      3550.4142    0.0451    0.0451    0.0451
bmove512           2920.1890    0.0548    0.0548    0.0548
FC64               6575.1748    0.0244    0.0243    0.0246
libmoto memcpy     8970.5740    0.0181    0.0178    0.0184
memcpy 750         3199.2250    0.0500    0.0500    0.0501
----------------------------------------------------------------