MAMEWorld >> EmuChat
krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


New MAME 0.192 benchmarks
#371594 - 12/03/17 06:01 PM Attachment: bench.zip 2 KB (35 downloads)


I just benchmarked the latest official MAME 0.192 64-bit binary on three different CPUs. All CPUs are overclocked to 4.5GHz.

System 1:
Asus P8Z77-M (Z77 chipset)
G.Skill DDR3 1600 16GB (8GB x 2)
GeForce GTX 1050 Ti 4GB
Windows 7 Home Premium x64 SP1
Intel Core i5 2500K 3.3GHz (3.7GHz turbo) overclocked to 4.5GHz turbo - 6M Cache - 4 cores - no hyperthreading - Sandy Bridge

System 2:
Asus Z87M-Plus (Z87 chipset)
G.Skill DDR3 1600 8GB (4GB x 2)
Radeon HD 7750 2GB
Windows 7 Home Premium x64 SP1
Tested with 2 different CPUs:
Intel Pentium G3258 3.20 GHz (no turbo) overclocked to 4.5GHz - 3M Cache - 2 cores - no hyperthreading - Haswell
Intel Core i7-4790K 4.0 GHz (4.40 GHz turbo) overclocked to 4.5GHz - 8M Cache - 4 cores - hyperthreading - Haswell refresh (Devil's Canyon)

I have a batch script (attached) that runs "mame64 -bench 90" on each game three times, and I've averaged the three runs together for the results.

Note that my script runs three games that create diffs (blitz, gauntleg, gtfore06) each once before the real benchmark run as it seems to create more consistent and reproducible results with these three games.
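
For anyone who wants to reproduce the procedure without the attached batch file, here is a minimal Python sketch of the same idea. It is not the attached script. It assumes the executable is named mame64 and is on the PATH, and that MAME ends a bench run with a line of the form "Average speed: 216.45% (89 seconds)"; the function names are made up for illustration.

```python
import re
import subprocess
from statistics import mean

# The three games from the post that create diffs and get one untimed warm-up run.
WARMUP_GAMES = ["blitz", "gauntleg", "gtfore06"]

# Assumed shape of MAME's summary line, e.g. "Average speed: 216.45% (89 seconds)".
SPEED_RE = re.compile(r"Average speed:\s*([\d.]+)%")


def parse_speed(output):
    """Pull the average speed percentage out of MAME's console output."""
    match = SPEED_RE.search(output)
    if match is None:
        raise ValueError("no 'Average speed' line found")
    return float(match.group(1))


def bench_once(game, seconds=90, exe="mame64"):
    """Run one bench pass; exe is an assumed executable name."""
    result = subprocess.run(
        [exe, game, "-bench", str(seconds)],
        capture_output=True, text=True, check=True,
    )
    return parse_speed(result.stdout)


def bench_average(game, runs=3, run=bench_once):
    """Average several passes of one game, as described in the post."""
    return mean(run(game) for _ in range(runs))

# Typical use: call bench_once(g) for each g in WARMUP_GAMES first,
# then bench_average(g) for every game in the list.
```

The run parameter exists only so the averaging can be tried with a stand-in function when no MAME install is at hand.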

For more benchmarking goodness, reference John IV's benchmark page: http://www.mameui.info/Bench.htm


Intel Core i5 2500K 3.3GHz (3.7GHz turbo) overclocked to 4.5GHz
6M Cache - 4 cores - no hyperthreading - Sandy Bridge
blitz 216
crusnusa 276
cubeqst 252
cyvern 904
dkong 3676
drivedge 279
gauntleg 373
gnbarich 1703
gradius4 303
gtfore06 258
harddriv 494
kidniki 406
kof98 975
mario 735
mk4 212
pacman 11722
pinkswts 1379
pong 339
propcycl 158
radikalb 170
roadblst 820
robotron 4680
rvschool 238
scud 67
sf2 2093
sfa2 1701
sfiii 1443
slrasslt 373
starblad 156
starsldr 51
tekken 386
tekken3 181
vfkids 189


Intel Pentium G3258 3.20 GHz (no turbo) overclocked to 4.5GHz
3M Cache - 2 cores - no hyperthreading - Haswell
blitz 210
crusnusa 393
cubeqst 380
cyvern 1065
dkong 4665
drivedge 371
gauntleg 352
gnbarich 1961
gradius4 296
gtfore06 283
harddriv 650
kidniki 489
kof98 1313
mario 920
mk4 291
pacman 17539
pinkswts 1349
pong 399
propcycl 141
radikalb 264
roadblst 1004
robotron 6175
rvschool 328
scud 78
sf2 2732
sfa2 2299
sfiii 1634
slrasslt 382
starblad 203
starsldr 62
tekken 530
tekken3 269
vfkids 222


Intel Core i7-4790K 4.0 GHz (4.40 GHz turbo) overclocked to 4.5GHz
8M Cache - 4 cores - hyperthreading - Haswell refresh (Devil's Canyon)
blitz 261
crusnusa 361
cubeqst 387
cyvern 1120
dkong 4871
drivedge 375
gauntleg 447
gnbarich 2005
gradius4 368
gtfore06 310
harddriv 657
kidniki 501
kof98 1311
mario 934
mk4 277
pacman 18205
pinkswts 1627
pong 397
propcycl 178
radikalb 257
roadblst 1011
robotron 6151
rvschool 332
scud 74
sf2 2754
sfa2 2339
sfiii 1672
slrasslt 435
starblad 207
starsldr 64
tekken 545
tekken3 277
vfkids 231


I compared the G3258 to the i7-4790K. They are both Haswell architecture and mainly differ in the number of cores and the amount of cache. As expected, single-threaded drivers show a tiny improvement, and a few drivers benefit quite a bit from the extra 2 cores. However, I was surprised by a few games that were actually a little slower on the i7. I have no idea why that would be the case.
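
To make the comparison concrete, here is a small Python sketch that computes the percentage change from the G3258 to the i7-4790K for a handful of the scores listed above and picks out the games that got slower. Only six of the 33 games are included to keep it short; the numbers are copied from the two lists, and the helper name is just for illustration.

```python
# Scores copied from the G3258 and i7-4790K lists above (subset only).
g3258 = {"blitz": 210, "crusnusa": 393, "gauntleg": 352,
         "mk4": 291, "pacman": 17539, "scud": 78}
i7_4790k = {"blitz": 261, "crusnusa": 361, "gauntleg": 447,
            "mk4": 277, "pacman": 18205, "scud": 74}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Games where the i7 score is lower than the Pentium score.
slower_on_i7 = sorted(g for g in g3258 if i7_4790k[g] < g3258[g])

for game in sorted(g3258):
    change = percent_change(g3258[game], i7_4790k[game])
    print(f"{game:10s} {change:+6.1f}%")

print("slower on the i7:", ", ".join(slower_on_i7))
```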



GroovyMAME support forum on BYOAC



B2K24
MAME @ 15 kHz Sony Trinitron CRT user
Reged: 10/25/10
Posts: 2663


Re: New MAME 0.192 benchmarks new [Re: krick]
#371598 - 12/03/17 08:56 PM


Thanks for the information. It's nice to have these numbers.



John IV
IV/Play, MAME, MAMEUI
Reged: 09/22/03
Posts: 1969
Loc: Washington, USA


Re: New MAME 0.192 benchmarks new [Re: krick]
#371613 - 12/04/17 06:06 AM


Added to bench page, thanks Krick.



john iv
http://www.mameui.info/



taz-nz
MAME Fan
Reged: 11/26/07
Posts: 125


Re: New MAME 0.192 benchmarks new [Re: krick]
#371616 - 12/04/17 08:28 AM



Quote:


However, I was surprised by a few games that were actually a little slower on the i7. I have no idea why that would be the case.




Try disabling Hyperthreading in the BIOS on the i7. It sometimes introduces latency, which can affect performance in some scenarios.



If all else fails, Burn the manual.



haynor666
Reged: 05/06/06
Posts: 101
Loc: Tarnobrzeg/Poland


Re: New MAME 0.192 benchmarks new [Re: krick]
#371633 - 12/04/17 10:55 PM


I've also run some tests recently. I was more interested in the differences between Windows 7, 8, 8.1 and 10. The funny thing is that in most cases 10 is the slowest system, although only slightly.



krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: haynor666]
#371643 - 12/05/17 03:13 AM


I wonder if custom compiling MAME with -march=haswell would make any performance difference (running on Haswell CPUs, obviously).

Or maybe -march=native instead if compiling on the target CPU itself.






Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia
Send PM


Re: New MAME 0.192 benchmarks new [Re: krick]
#371644 - 12/05/17 03:22 AM


> I wonder if custom compiling MAME with -march=haswell would make any performance
> difference (running on Haswell CPUs, obviously).
>
> Or maybe -march=native instead if compiling on the target CPU itself.

You can get some gains (particularly with Voodoo emulation) with ARCHOPTS="-msse4.2 -mpopcnt -fomit-frame-pointer"

Avoid -march, it causes more problems than it's worth.



krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: taz-nz]
#371679 - 12/06/17 06:34 AM


I disabled hyperthreading on the i7 and ran a new benchmark. It got slightly better on some and slightly worse on others (I just highlighted the negatives and the double digit gains). There doesn't appear to be any rhyme or reason to it either. Not what I was expecting...




Here are the old i7 results (with hyperthreading) compared to the new i7 results (without). This makes it even clearer that it's basically a wash, or maybe slightly favoring hyperthreading. Note that the spreadsheet was set to not show decimal places, which explains results like scud where both columns show 74 but there's a -1% difference.
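
A quick Python illustration of how two results can both display as 74 and still show a -1% difference. The decimal values here are hypothetical, chosen only to show the rounding effect; they are not the real scud numbers.

```python
# Hypothetical unrounded scores; the real spreadsheet values are not in the post.
with_ht = 74.4
without_ht = 73.6

# Percentage change computed from the unrounded values.
change = (without_ht - with_ht) / with_ht * 100

# With zero decimal places shown, both scores display as 74
# while the change column displays -1.
print(round(with_ht), round(without_ht), round(change))
```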



I wonder if a longer bench run (like -bench 360) would show a clearer difference.






taz-nz
MAME Fan
Reged: 11/26/07
Posts: 125


Re: New MAME 0.192 benchmarks new [Re: krick]
#371683 - 12/06/17 12:51 PM



Quote:


I wonder if a longer bench run (like -bench 360) would show a more clear difference.




Hyperthreading may not make as much difference as I remember; newer CPUs may have lessened any penalties for non-multithreaded apps.

For a lot of games there is a big difference between a short benchmark and a longer one. Many games have long boot sequences that can increase or, in a few cases, decrease benchmark scores. Many also have fairly static title and high-score screens, so a longer benchmark often gives a result closer to real-world performance because more of the run is spent in attract mode.

Back when I was benchmarking MAME all the time, I used a 240-second benchmark (-bench 240) for this reason.






AaronGiles
Galaxiwarrior
Reged: 09/21/03
Posts: 1343


Re: New MAME 0.192 benchmarks new [Re: krick]
#371689 - 12/06/17 09:04 PM


BTW, given that you're benchmarking in a noisy multi-tasking environment, I'd generally pick the maximum value over three runs rather than the average.

You'll probably never actually achieve the real maximum speed, but the more runs you do, statistically you'll probably get a max that starts to approach it.

By averaging, all you need is one ill-timed bit of background activity to create an artificially bad score that will disappear the next time you run.
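
As a rough illustration in Python, with made-up numbers: suppose one of three runs is hit by background activity. The three scores below are hypothetical.

```python
from statistics import mean

# Hypothetical scores for three runs of one game; the third run was
# disturbed by background activity.
runs = [339.0, 340.0, 301.0]

average_score = mean(runs)  # dragged down by the one bad run
best_score = max(runs)      # unaffected by the one bad run

print(f"average: {average_score:.1f}  max: {best_score:.1f}")
```

The average drops by more than 13 points because of a single disturbed run, while the maximum stays where the clean runs put it.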

I'll try to post my i7-8700K numbers tonight.

Aaron



krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: AaronGiles]
#371700 - 12/07/17 04:53 AM Attachment: krick_mame_0192_benchmarks.zip 3 KB (2 downloads)


> BTW, given that you're benchmarking in a noisy multi-tasking environment, I'd
> generally pick the maximum value over three runs rather than the average.
>
> You'll probably never actually achieve the real maximum speed, but the more runs you
> do, statistically you'll probably get a max that starts to approach it.
>
> By averaging, all you need is one ill-timed bit of background activity to create an
> artificially bad score that will disappear the next time you run.

Good point. I've attached a spreadsheet with my updated benchmarks taking the highest of three runs for each game in the benchmark.

Still "-bench 90" with the official MAME 0.192 64-bit executable as before.






krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: taz-nz]
#371701 - 12/07/17 05:42 AM


> Hyperthreading may not make as much difference as I remember; newer CPUs may have
> lessened any penalties for non-multithreaded apps.

My original question was why a Haswell Pentium with 2 cores and 3M of cache was occasionally slightly faster than a Haswell i7 with 4 cores and 8M of cache, running on otherwise identical hardware. I literally pulled the Pentium CPU out of the machine and replaced it with the i7, so everything else is identical. I expected the i7 to be equal or faster in every case, but that's not what happened.

It was suggested that the difference might have been caused by hyperthreading on the i7 so I disabled hyperthreading and ran some new benchmarks.

Here's the Pentium G3258 compared to the i7-4790K with hyperthreading disabled. The games in pink are the ones that are still slower on the i7 for whatever reason. The games in green are ones that take advantage of the extra cores on the i7...


Also for reference, here's the i7-4790K compared against itself with and without hyperthreading. Games in pink are slower with hyperthreading off. Games in green are faster with hyperthreading off. Out of the 33 games tested, 14 were slightly slower and 19 were slightly faster. So it's basically slightly worse with hyperthreading off but it really doesn't make much difference either way.






AaronGiles
Galaxiwarrior
Reged: 09/21/03
Posts: 1343


Re: New MAME 0.192 benchmarks new [Re: AaronGiles]
#371703 - 12/07/17 12:00 PM


> I'll try to post my i7-8700K numbers tonight.

All right, as promised. Due to Turbo mode and such, I'm not quite sure how to quote the speed in GHz on this chip, so I just listed the multipliers. It came stock at x44, but will OC to x46 stably. At x48 it can run MAME fine but can't handle long periods maxed out on all cores, such as when building.

I tried both stock 0.192 and the mame0192 tag built with the options Vas mentioned above: ARCHOPTS="-msse4.2 -mpopcnt -fomit-frame-pointer"




mhoes
MAME Fan
Reged: 08/27/15
Posts: 170


Re: New MAME 0.192 benchmarks new [Re: AaronGiles]
#371705 - 12/07/17 01:23 PM


> Due to Turbo mode and such, I'm not quite sure how to quote
> the speed in GHz on this chip, so I just listed the multipliers.

For what it's worth, Intel lists this CPU as having a 'Processor Base Frequency' of 3.70 GHz, and a 'Max Turbo Frequency' of 4.70 GHz.

https://ark.intel.com/products/126684/Intel-Core-i7-8700K-Processor-12M-Cache-up-to-4_70-GHz



krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: AaronGiles]
#371706 - 12/07/17 05:17 PM


> All right, as promised. Due to Turbo mode and such, I'm not quite sure how to quote
> the speed in GHz on this chip, so I just listed the multipliers. It came stock at
> x44, but will OC to x46 stably. At x48 it can run MAME fine but can't handle long
> periods maxed out on all cores, such as when building.

Your stock chip has a turbo of 4.7GHz. So when benchmarking, at least one of the cores should already be hitting that and the rest should be close. I'm not sure what the policy is for the new 6-core chips. You should be able to see the stock speed on each core under load using something like the Intel Extreme Tuning Utility.

That said, how are you trying to overclock your chip? When I just jacked up the multiplier across the board (from my stock 4.4 turbo) using the Intel Extreme Tuning Utility, the CPU core voltage was way too high and it kept overheating and thermal throttling, which defeats the purpose of overclocking.

I had much better results when I went into my BIOS and changed the following settings. Note that I have an ASUS Z87 board with a Haswell CPU, so your settings will probably be different.

AI Overclock Tuner - XMP
CPU Core Ratio - Sync All Cores
1-Core Ratio Limit - 47
2-Core Ratio Limit - 47
3-Core Ratio Limit - 47
4-Core Ratio Limit - 47
CPU Core Voltage - Adaptive Mode
Additional Turbo Mode Core CPU Voltage - 1.275

I don't know what the correct voltage would be for your CPU but leaving it on "auto" for mine resulted in it being way too high under load.



AaronGiles
Galaxiwarrior
Reged: 09/21/03
Posts: 1343


Re: New MAME 0.192 benchmarks new [Re: krick]
#371713 - 12/07/17 10:37 PM


Thanks for the pointer to the Intel utility; I had not heard of it.

My system is running an Alienware BIOS which has fairly limited controls. There's also an Alienware OC Utility which has slightly better controls but nowhere near the Intel ones.

The image below is what the Intel utility says for my various configs. On the left is "Default" (according to Intel). The next column is what Alienware configures when OC is "off". Column 3 is "OC Stage 1" and Column 4 is "OC Stage 2". Basically the final 3 columns should correspond to my bench numbers.

Some things are definitely confusing, like what is "Max Non Turbo Boost Ratio"? Is that when there are 0 active cores? Also, running MAME and watching the live value of "Active Cores" often showed 0 and no more than 1.



(Sorry for the large size, I run high DPI and I forgot to scale it down.)



krick
Get Fuzzy
Reged: 02/09/04
Posts: 4235


Re: New MAME 0.192 benchmarks new [Re: AaronGiles]
#371718 - 12/08/17 01:11 AM


> Some things are definitely confusing, like what is "Max Non Turbo Boost Ratio"?

Max Non Turbo Boost Ratio (37x) is the base CPU multiplier. I believe speeds for sub-units within the chip are based off this ratio. This is also where the Processor Base Frequency (3.7 GHz) comes from. According to Intel...

Processor Base Frequency describes the rate at which the processor's transistors open and close. The processor base frequency is the operating point where TDP is defined. Frequency is measured in gigahertz (GHz), or billion cycles per second.

If you disabled the Turbo Boost feature in the Intel XTU program, your CPU would max out at 3.7GHz.
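
The arithmetic behind those numbers is just multiplier times base clock. Here is a tiny Python sketch, assuming the nominal 100 MHz base clock (BCLK) that these platforms use by default; the function name is made up for illustration.

```python
def core_ghz(multiplier, bclk_mhz=100.0):
    """Core clock in GHz for a given multiplier; 100 MHz BCLK is assumed."""
    return multiplier * bclk_mhz / 1000.0


# Base ratio, then the x44, x46 and x48 settings mentioned in the thread,
# and the x45 used for the Haswell runs.
for ratio in (37, 44, 45, 46, 48):
    print(f"x{ratio} -> {core_ghz(ratio):.1f} GHz")
```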

More info on your CPU from Intel:
https://ark.intel.com/products/126684/Intel-Core-i7-8700K-Processor-12M-Cache-up-to-4_70-GHz

You probably know all this, but I'll say it just in case. "Turbo Boost" is basically Intel temporarily overclocking your CPU. When Turbo Boost kicks in under load, the motherboard increases the voltage to support the overclock and then ramps it back down to normal when the load stops. Also, not all cores will hit the maximum turbo boost multiplier. Intel deliberately limits this due to thermal concerns. This is the only chart I could find for your CPU. Intel used to publish this information but they've stopped with recent CPUs, so I don't know if it's 100% correct, but check out the speeds that each core runs at when you've got all 6 loaded at once. Only one hits 4.7, the next 4.6 and so on until the last loaded core tops out at 4.3. It basically matches the screenshot you posted...


> Is that when there are 0 active cores? Also, running MAME and watching the live value of
> "Active Cores" often showed 0 and no more than 1.

Not sure about the "Active Cores" value. Maybe it's fluctuating faster than the display can update and sometimes displays 0 due to a low sampling rate.

You can run a "stress test" within the Intel XTU that will load all your CPU cores. I would turn on the thermal throttle display via the wrench icons in the upper-right corner so you can watch for that. If the CPU gets too hot, it will throttle back.

The Alienware OC "Stage 1" looks like it's setting turbo max to 4.6GHz on all cores equally. OC "Stage 2" is setting turbo max to 4.8GHz on all cores equally.

I don't know what kind of CPU cooler your system has so that might limit your overclocking due to thermal concerns. Also, I don't know how much overclocking headroom your CPU even has. The stock turbo of 4.7GHz is pretty awesome. How much are other people able to overclock the i7-8700K?

I'd be curious to see your bench numbers if you set all cores to 45x using the Intel XTU (I think it's under Manual Tuning -> Core). That way we can compare the Coffee Lake architecture's performance at the same clock speed as my two Haswell chips as well as John IV's i7-6700K Skylake CPU. And we can see how much boost you get from 2 more cores on the games that can use them. If you do a bench at 45x on all cores, can you stick the results in a text file and attach it so that I can more easily paste them into a spreadsheet?



uman
MAME Fan
Reged: 04/15/12
Posts: 455


Re: New MAME 0.192 benchmarks new [Re: krick]
#371727 - 12/09/17 12:08 AM


@krick:

Your benchmarks leave some open questions, and I can only guess at the cause. Either the benchmark itself fluctuates or your mainboard does. Overclocked systems are generally not a good basis for comparison: an overclock needs good cooling and quite a lot of knowledge before those "numbers" are practical, stable and usable. I have read a lot about overclocking, but in real-life situations my experience is that an overclock that works in benchmarks quite often doesn't work with CPU-demanding applications, and I don't count games as very CPU demanding. 4.5GHz sounds very edgy to me if you don't have at least water cooling on your CPU.

I can easily overclock my i7-5960 to 4.5GHz, for example, and I made it through the Asus benchmark tool, reaching place 19 in the high-score list on the RoG-Asus website (for my CPU), but there is no way I can run that stably 24/7 with, e.g., Adobe Premiere or After Effects. I am pretty sure that if you really stress-test your CPU (with CPU-Z for at least 2 hours), you will get a BSoD quite fast.

How about a test with a slightly lower overclock, like 4.2 or 4.1GHz?

