I just benchmarked the latest official MAME 0.192 64-bit binary on three different CPUs. All CPUs are overclocked to 4.5GHz.
System 1:
Asus P8Z77-M (Z77 chipset)
G.Skill DDR3 1600 16GB (8GB x 2)
GeForce GTX 1050 Ti 4GB
Windows 7 Home Premium x64 SP1
Intel Core i5 2500K 3.3GHz (3.7GHz turbo) overclocked to 4.5GHz turbo - 6M Cache - 4 cores - no hyperthreading - Sandy Bridge

System 2:
Asus Z87M-Plus (Z87 chipset)
G.Skill DDR3 1600 8GB (4GB x 2)
Radeon HD 7750 2GB
Windows 7 Home Premium x64 SP1
Tested with 2 different CPUs:
Intel Pentium G3258 3.20 GHz (no turbo) overclocked to 4.5GHz - 3M Cache - 2 cores - no hyperthreading - Haswell
Intel Core i7-4790K 4.0 GHz (4.40 GHz turbo) overclocked to 4.5GHz - 8M Cache - 4 cores - hyperthreading - Haswell refresh (Devil's Canyon)
I have a batch script (attached) that runs "mame64 -bench 90" on each game three times, and I've averaged the three runs for the results.
Note that my script first runs each of the three games that create diffs (blitz, gauntleg, gtfore06) once before the real benchmark run, as this seems to give more consistent and reproducible results for these three games.
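For anyone who wants to reproduce this without the attachment, here is a rough Python sketch of the procedure described above. It is not the attached batch script. It assumes the executable is called mame64 and is on the PATH, and that MAME prints a summary line of the form "Average speed: 123.45% (89 seconds)" at the end of a bench run; the function names are made up for illustration.

```python
import re
import subprocess

# Assumption: MAME ends a bench run with a line like
# "Average speed: 123.45% (89 seconds)".
SPEED_RE = re.compile(r"Average speed:\s*([0-9.]+)%")

# Games that create diffs; each gets one untimed pass first.
WARMUP_GAMES = ("blitz", "gauntleg", "gtfore06")


def parse_speed(output):
    """Return the reported average speed in percent, or None if not found."""
    match = SPEED_RE.search(output)
    return float(match.group(1)) if match else None


def bench(game, seconds=90, exe="mame64"):
    """Run one benchmark pass of `game` and return the reported speed."""
    result = subprocess.run(
        [exe, game, "-bench", str(seconds)],
        capture_output=True,
        text=True,
    )
    return parse_speed(result.stdout)


def bench_all(games, runs=3):
    """Return {game: average speed over `runs` passes}."""
    results = {}
    for game in games:
        if game in WARMUP_GAMES:
            bench(game)  # warm-up pass, result discarded
        speeds = [bench(game) for _ in range(runs)]
        speeds = [s for s in speeds if s is not None]
        if speeds:
            results[game] = sum(speeds) / len(speeds)
    return results
```

Calling bench_all(["pacman", "blitz"]) would take roughly 90 seconds per pass, so a full 33-game list with three passes each runs for well over two hours.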
For more benchmarking goodness, reference John IV's benchmark page: http://www.mameui.info/Bench.htm
i5 2500K = Intel Core i5 2500K 3.3GHz (3.7GHz turbo) overclocked to 4.5GHz - 6M Cache - 4 cores - no hyperthreading - Sandy Bridge
G3258 = Intel Pentium G3258 3.20 GHz (no turbo) overclocked to 4.5GHz - 3M Cache - 2 cores - no hyperthreading - Haswell
i7-4790K = Intel Core i7-4790K 4.0 GHz (4.40 GHz turbo) overclocked to 4.5GHz - 8M Cache - 4 cores - hyperthreading - Haswell refresh (Devil's Canyon)

game       i5 2500K    G3258  i7-4790K
blitz           216      210       261
crusnusa        276      393       361
cubeqst         252      380       387
cyvern          904     1065      1120
dkong          3676     4665      4871
drivedge        279      371       375
gauntleg        373      352       447
gnbarich       1703     1961      2005
gradius4        303      296       368
gtfore06        258      283       310
harddriv        494      650       657
kidniki         406      489       501
kof98           975     1313      1311
mario           735      920       934
mk4             212      291       277
pacman        11722    17539     18205
pinkswts       1379     1349      1627
pong            339      399       397
propcycl        158      141       178
radikalb        170      264       257
roadblst        820     1004      1011
robotron       4680     6175      6151
rvschool        238      328       332
scud             67       78        74
sf2            2093     2732      2754
sfa2           1701     2299      2339
sfiii          1443     1634      1672
slrasslt        373      382       435
starblad        156      203       207
starsldr         51       62        64
tekken          386      530       545
tekken3         181      269       277
vfkids          189      222       231
I compared the G3258 to the i7-4790K. They are both Haswell architecture and mainly differ in the number of cores and the amount of cache. As expected, single-threaded drivers show a tiny improvement and there are a few drivers that benefit quite a bit from the extra 2 cores. However, I was surprised by a few games that were actually a little slower on the i7. I have no idea why that would be the case.
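To make the comparison concrete, here is a small Python sketch that takes the G3258 and i7-4790K numbers posted above and lists the games that came out slower on the i7, along with the percent change. The variable and function names are just for illustration.

```python
# Results as posted above (game name followed by score).
G3258 = (
    "blitz 210 crusnusa 393 cubeqst 380 cyvern 1065 dkong 4665 drivedge 371 "
    "gauntleg 352 gnbarich 1961 gradius4 296 gtfore06 283 harddriv 650 "
    "kidniki 489 kof98 1313 mario 920 mk4 291 pacman 17539 pinkswts 1349 "
    "pong 399 propcycl 141 radikalb 264 roadblst 1004 robotron 6175 "
    "rvschool 328 scud 78 sf2 2732 sfa2 2299 sfiii 1634 slrasslt 382 "
    "starblad 203 starsldr 62 tekken 530 tekken3 269 vfkids 222"
)
I7_4790K = (
    "blitz 261 crusnusa 361 cubeqst 387 cyvern 1120 dkong 4871 drivedge 375 "
    "gauntleg 447 gnbarich 2005 gradius4 368 gtfore06 310 harddriv 657 "
    "kidniki 501 kof98 1311 mario 934 mk4 277 pacman 18205 pinkswts 1627 "
    "pong 397 propcycl 178 radikalb 257 roadblst 1011 robotron 6151 "
    "rvschool 332 scud 74 sf2 2754 sfa2 2339 sfiii 1672 slrasslt 435 "
    "starblad 207 starsldr 64 tekken 545 tekken3 277 vfkids 231"
)


def parse(flat):
    """Turn 'name score name score ...' into {name: score}."""
    tokens = flat.split()
    return dict(zip(tokens[::2], map(int, tokens[1::2])))


pentium = parse(G3258)
i7 = parse(I7_4790K)

# Games where the i7 scored lower than the Pentium.
slower = [game for game in pentium if i7[game] < pentium[game]]

for game in slower:
    change = (i7[game] - pentium[game]) / pentium[game] * 100
    print(f"{game:<10}{pentium[game]:>7}{i7[game]:>7}{change:>+7.1f}%")
```

With these numbers, the slower list comes out as crusnusa, kof98, mk4, pong, radikalb, robotron and scud, i.e. 7 of the 33 games.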
GroovyMAME support forum on BYOAC
B2K24
Re: New MAME 0.192 benchmarks
[Re: krick]
#371598 - 12/03/17 08:56 PM
Thanks for the information. It's nice to have these numbers
John IV
Re: New MAME 0.192 benchmarks
[Re: krick]
#371613 - 12/04/17 06:06 AM
|
|
|
|
|
Re: New MAME 0.192 benchmarks
[Re: krick]
#371616 - 12/04/17 08:28 AM
Quote:
However, I was surprised by a few games that were actually a little slower on the i7. I have no idea why that would be the case.
Try disabling Hyperthreading in the BIOS on the i7; it sometimes introduces latency, which can affect performance in some scenarios.
If all else fails, Burn the manual.
Re: New MAME 0.192 benchmarks
[Re: krick]
#371633 - 12/04/17 10:55 PM
I've also run some tests recently. I was more interested in the differences between Windows 7, 8, 8.1, and 10. Funnily enough, in most cases 10 is the slowest system, although only slightly.
Re: New MAME 0.192 benchmarks
[Re: haynor666]
#371643 - 12/05/17 03:13 AM
I wonder if custom compiling MAME with -march=haswell would make any performance difference (running on Haswell CPUs, obviously).
Or maybe -march=native instead if compiling on the target CPU itself.
Re: New MAME 0.192 benchmarks
[Re: krick]
#371644 - 12/05/17 03:22 AM
> I wonder if custom compiling MAME with -march=haswell would make any performance
> difference (running on Haswell CPUs, obviously).
>
> Or maybe -march=native instead if compiling on the target CPU itself.
You can get some gains (particularly with Voodoo emulation) with ARCHOPTS="-msse4.2 -mpopcnt -fomit-frame-pointer"
Avoid -march, it causes more problems than it's worth.
Re: New MAME 0.192 benchmarks
[Re: taz-nz]
#371679 - 12/06/17 06:34 AM
I disabled hyperthreading on the i7 and ran a new benchmark. It got slightly better on some and slightly worse on others (I just highlighted the negatives and the double digit gains). There doesn't appear to be any rhyme or reason to it either. Not what I was expecting...
Here are the old i7 results (with hyperthreading) compared to the new i7 results. This makes it even clearer that it's basically a wash, or maybe slightly favoring hyperthreading. Note that the spreadsheet was set to not show decimal places, which explains results like scud, where both show 74 but there's a -1% difference.
I wonder if a longer bench run (like -bench 360) would show a more clear difference.
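To illustrate the rounding effect mentioned above (two results that both display as 74 yet differ by -1%), here is a tiny Python example. The two decimal values are hypothetical, picked only to show how this can happen; they are not the actual scud results.

```python
# Hypothetical unrounded scores (NOT the real scud numbers).
with_ht = 74.4     # hyperthreading on
without_ht = 73.6  # hyperthreading off

# Percent change relative to the hyperthreading-on result.
pct_change = (without_ht - with_ht) / with_ht * 100

# With decimals hidden, both scores display as 74, but the change rounds to -1.
print(round(with_ht), round(without_ht), round(pct_change))
```

So a displayed difference of -1% between two identical-looking whole numbers just means the underlying values sit on opposite sides of the same rounded figure.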
Re: New MAME 0.192 benchmarks
[Re: krick]
#371683 - 12/06/17 12:51 PM
Quote:
I wonder if a longer bench run (like -bench 360) would show a more clear difference.
Hyperthreading may not make as much difference as I remember; newer CPUs may have lessened any penalties for non-multithreaded apps.
There is a big difference for a lot of games between a short benchmark and a longer one. A lot of games have long boot sequences that can increase or, in a few cases, decrease benchmark scores. A lot of games also have fairly static title and high score screens, so running a longer benchmark often gives a result closer to the real world, as more of the run is spent in attract mode.
Back when I was benchmarking MAME all the time, I used a 240 benchmark for this reason.
Re: New MAME 0.192 benchmarks
[Re: krick]
#371689 - 12/06/17 09:04 PM
BTW, given that you're benchmarking in a noisy multi-tasking environment, I'd generally pick the maximum value over three runs rather than the average.
You'll probably never actually achieve the real maximum speed, but the more runs you do, statistically you'll probably get a max that starts to approach it.
By averaging, all you need is one ill-timed bit of background activity to create an artificially bad score that will disappear the next time you run.
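A quick Python illustration of the point above, using made-up numbers for three runs of one game where the third run was hit by background activity. The values are hypothetical, not taken from any posted result.

```python
from statistics import mean

# Hypothetical scores for three runs of the same game;
# the third run was disturbed by background activity.
runs = [372.1, 371.8, 340.5]

average = mean(runs)  # dragged down by the one bad run
best = max(runs)      # unaffected by the bad run

print(f"average: {average:.1f}")
print(f"maximum: {best:.1f}")
```

The single disturbed run pulls the average down by about ten points, while the maximum stays at the value the two clean runs agree on.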
I'll try to post my i7-8700K numbers tonight.
Aaron
|
|
|
|
|
> BTW, given that you're benchmarking in a noisy multi-tasking environment, I'd
> generally pick the maximum value over three runs rather than the average.
>
> You'll probably never actually achieve the real maximum speed, but the more runs you
> do, statistically you'll probably get a max that starts to approach it.
>
> By averaging, all you need is one ill-timed bit of background activity to create an
> artificially bad score that will disappear the next time you run.
Good point. I've attached a spreadsheet with my updated benchmarks taking the highest of three runs for each game in the benchmark.
Still "-bench 90" with the official MAME 0.192 64-bit executable as before.
Re: New MAME 0.192 benchmarks
[Re: taz-nz]
#371701 - 12/07/17 05:42 AM
> Hyperthreading may not make as much difference as I remember; newer CPUs may have
> lessened any penalties for non-multithreaded apps.
My original question was why a Haswell Pentium with 2 cores and 3M of cache was occasionally slightly faster than a Haswell i7 with 4 cores and 8M of cache, running on otherwise identical hardware. I literally pulled the Pentium CPU out of the machine and replaced it with the i7, so everything else is identical. I expected the i7 would be equal or faster in every case, but that's not what happened.
It was suggested that the difference might have been caused by hyperthreading on the i7 so I disabled hyperthreading and ran some new benchmarks.
Here's the Pentium G3258 compared to the i7-4790K with hyperthreading disabled. The games in pink are the ones that are still slower on the i7 for whatever reason. The games in green are ones that take advantage of the extra cores on the i7...
Also for reference, here's the i7-4790K compared against itself with and without hyperthreading. Games in pink are slower with hyperthreading off. Games in green are faster with hyperthreading off. Out of the 33 games tested, 14 were slightly slower and 19 were slightly faster. So it's basically slightly worse with hyperthreading off but it really doesn't make much difference either way.
Re: New MAME 0.192 benchmarks
[Re: AaronGiles]
#371703 - 12/07/17 12:00 PM
> I'll try to post my i7-8700K numbers tonight.
All right, as promised. Due to Turbo mode and such, I'm not quite sure how to quote the speed in GHz on this chip, so I just listed the multipliers. It came stock at x44, but will OC to x46 stably. At x48 it can run MAME fine but can't handle long periods maxed out on all cores, such as when building.
I also tried with both stock 0.192 as well as the mame0192 tag built with the options Vas mentioned above: ARCHOPTS=-msse4.2 -mpopcnt -fomit-frame-pointer
Re: New MAME 0.192 benchmarks
[Re: AaronGiles]
#371705 - 12/07/17 01:23 PM
|
|
|
|
|
Re: New MAME 0.192 benchmarks
[Re: AaronGiles]
#371706 - 12/07/17 05:17 PM
> All right, as promised. Due to Turbo mode and such, I'm not quite sure how to quote
> the speed in GHz on this chip, so I just listed the multipliers. It came stock at
> x44, but will OC to x46 stably. At x48 it can run MAME fine but can't handle long
> periods maxed out on all cores, such as when building.
Your stock chip has a turbo of 4.7GHz. So when benchmarking, at least one of the cores should already be hitting that and the rest should be close. I'm not sure what the policy is for the new 6-core chips. You should be able to see the stock speed on each core under load using something like the Intel Extreme Tuning Utility.
That said, how are you trying to overclock your chip? When I just jacked up the multiplier across the board (from my stock 4.4 turbo) using the Intel Extreme Tuning Utility, the CPU core voltage was way too high and it kept overheating and thermal throttling, which defeats the purpose of overclocking.
I had much better results when I went into my BIOS and changed the following settings. Note that I have an ASUS Z87 board with a Haswell CPU, so your settings will probably be different.
AI Overclock Tuner - XMP
CPU Core Ratio - Sync All Cores
1-Core Ratio Limit - 47
2-Core Ratio Limit - 47
3-Core Ratio Limit - 47
4-Core Ratio Limit - 47
CPU Core Voltage - Adaptive Mode
Additional Turbo Mode Core CPU Voltage - 1.275
I don't know what the correct voltage would be for your CPU but leaving it on "auto" for mine resulted in it being way too high under load.
Re: New MAME 0.192 benchmarks
[Re: krick]
#371713 - 12/07/17 10:37 PM
Thanks for the pointer to the Intel utility; I had not heard of it.
My system is running an Alienware BIOS which has fairly limited controls. There's also an Alienware OC Utility which has slightly better controls but nowhere near the Intel ones.
The image below is what the Intel utility says for my various configs. On the left is "Default" (according to Intel). The next column is what Alienware configures when OC is "off". Column 3 is "OC Stage 1" and Column 4 is "OC Stage 2". Basically, the final 3 columns should correspond to my bench numbers.
Some things are definitely confusing, like what is "Max Non Turbo Boost Ratio"? Is that when there are 0 active cores? Also, running MAME and watching the live value of "Active Cores" often showed 0 and no more than 1.
(Sorry for the large size, I run high DPI and I forgot to scale it down.)
Re: New MAME 0.192 benchmarks
[Re: AaronGiles]
#371718 - 12/08/17 01:11 AM
> Some things are definitely confusing, like what is "Max Non Turbo Boost Ratio"?
Max Non Turbo Boost Ratio (37x) is the base CPU multiplier. I believe speeds for sub-units within the chip are based off this ratio. This is also where the Processor Base Frequency (3.7 GHz) comes from. According to Intel...
Processor Base Frequency describes the rate at which the processor's transistors open and close. The processor base frequency is the operating point where TDP is defined. Frequency is measured in gigahertz (GHz), or billion cycles per second.
If you disabled the Turbo Boost feature in the Intel XTU program, your CPU would max out at 3.7GHz.
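As a rough illustration of how the multipliers in this thread map to clock speeds, here is a short Python sketch. It assumes the usual 100 MHz base clock (BCLK); that value is an assumption and is not read from the screenshot. The multipliers are the ones mentioned in this thread (37 for the base ratio, 44, 46 and 48 from the bench configs).

```python
# Assumption: 100 MHz base clock (BCLK), not taken from the screenshot.
BCLK_MHZ = 100


def ratio_to_ghz(ratio, bclk_mhz=BCLK_MHZ):
    """Convert a CPU multiplier to a core clock in GHz."""
    return ratio * bclk_mhz / 1000


# Multipliers mentioned in this thread.
for ratio in (37, 44, 46, 48):
    print(f"x{ratio} -> {ratio_to_ghz(ratio):.1f} GHz")
```

Under that assumption, x37 works out to the 3.7GHz base frequency, and x44, x46 and x48 to 4.4, 4.6 and 4.8GHz.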
More info on your CPU from Intel: https://ark.intel.com/products/126684/Intel-Core-i7-8700K-Processor-12M-Cache-up-to-4_70-GHz
You probably know all this, but I'll say it just in case. "Turbo Boost" is basically Intel temporarily overclocking your CPU. When Turbo Boost kicks in while the CPU is under load, the motherboard increases the voltage to support the overclock and then ramps it back down to normal when the load stops. Also, not all cores will hit the maximum turbo boost multiplier. Intel deliberately limits this due to thermal concerns. This is the only chart I could find for your CPU. Intel used to publish this information but they've stopped with recent CPUs, so I don't know if it's 100% correct, but check out the speeds that each core runs at when you've got all 6 loaded at once. Only one hits 4.7, the next 4.6 and so on until the last loaded core tops out at 4.3. It basically matches the screenshot you posted...
> Is that when there are 0 active cores? Also, running MAME and watching the live value of
> "Active Cores" often showed 0 and no more than 1.
Not sure about the "Active Cores" value. Maybe it's fluctuating faster than the display can update and sometimes displays 0 due to a low sampling rate.
You can run a "stress test" within the Intel XTU that will load all your CPU cores. I would turn on the thermal throttle display via the wrench icons in the upper-right corner so you can watch for that. If the CPU gets too hot, it will throttle back.
The Alienware OC "Stage 1" looks like it's setting turbo max to 4.6GHz on all cores equally. OC "Stage 2" is setting turbo max to 4.8GHz on all cores equally.
I don't know what kind of CPU cooler your system has so that might limit your overclocking due to thermal concerns. Also, I don't know how much overclocking headroom your CPU even has. The stock turbo of 4.7GHz is pretty awesome. How much are other people able to overclock the i7-8700K?
I'd be curious to see your bench numbers if you set all cores to 45x using the Intel XTU (I think it's under Manual Tuning -> Core). That way we can compare the Coffee Lake architecture performance at the same clock speed as my two Haswell chips as well as John IV's i7-6700K Skylake CPU. And we can see how much boost you get from 2 more cores on the games that can use them. If you do a bench at 45x on all cores, can you stick the results in a text file and attach it so that I can more easily paste them into a spreadsheet?
Re: New MAME 0.192 benchmarks
[Re: krick]
#371727 - 12/09/17 12:08 AM
@krick:
Your benchmarks really raise questions, and I can only guess at some things. Either the benchmark tool has some fluctuation or your mainboard does. Overclocked systems generally don't make a good basis for comparison, and overclocking needs good cooling and quite a lot of knowledge to make those "numbers" practical, stable, and usable. I've read a lot about overclocking, but in real-life situations my experience is that an overclock that works in benchmarks quite often doesn't work with CPU-demanding applications, and I don't count games as very CPU-demanding. 4.5GHz sounds very edgy to me if you don't have at least water cooling on your CPU.
I can easily overclock my i7-5960 to 4.5GHz, for example, and it passed the Asus benchmark tool, reaching place 19 in the high score list on the RoG-Asus website (for my CPU), but there is no way I can run that stable 24/7 with, e.g., Adobe Premiere or After Effects. I am pretty sure that if you really stress-test your CPU (with CPU-Z for at least 2 hours), you will get a BSoD quite fast.
How about a test with a slightly lower overclock, like 4.2 or 4.1GHz?