MAMEWorld >> EmuChat


Pr3tty F1y
MAME Fan
Reged: 07/18/05
Posts: 396


ROMCMP - Can anyone confirm if it is UTF-8 Unicode compliant?
#384984 - 01/12/20 03:55 PM


I suspect romcmp.exe is not UTF-8/Unicode compliant, since it refuses to open zip files with Japanese characters in their names, but I wanted to confirm this.

Also, is there a way to compile the MAME tools with UTF-8/Unicode support?

I realize that I could rename files, but I have a lot of them and the naming with the Japanese characters is important as I'm trying to retain the original naming scheme.



Pr3tty F1y
MAME Fan
Reged: 07/18/05
Posts: 396


Issue with zip parser? [Re: Pr3tty F1y]
#384987 - 01/12/20 07:28 PM


OK - So apparently the only filename limitation with romcmp applies to the name of the zip file itself.

The files/roms inside the zip file can contain extended UTF-8 characters in their file names. It's just that the names of the zip files that romcmp opens seem to have to be plain ASCII (or something similar).
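For anyone wanting to find the affected archives up front, here is a minimal Python sketch that flags zip names containing non-ASCII characters and proposes an ASCII-only stand-in for each. This is not part of romcmp or any MAME tool; the helper names `ascii_alias` and `plan_renames`, and the scheme of appending a short hash so the original name can be matched up again later, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import PurePath


def ascii_alias(name: str) -> str:
    """Build a hypothetical ASCII-only stand-in for a file name.

    Keeps any ASCII letters/digits from the stem and appends a short
    hash of the full original name so distinct names stay distinct.
    """
    p = PurePath(name)
    kept = "".join(
        ch for ch in p.stem if ch.isascii() and (ch.isalnum() or ch in "-_")
    ).strip("-_")
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    suffix = p.suffix if p.suffix.isascii() else ""
    return f"{kept}_{digest}{suffix}" if kept else f"{digest}{suffix}"


def plan_renames(names):
    """Map each non-ASCII name to its proposed alias; ASCII names are skipped."""
    return {n: ascii_alias(n) for n in names if not n.isascii()}
```

Saving the returned mapping (for example as a text file next to the archives) would let the original names be restored after the comparison is done.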



Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4466
Loc: Melbourne, Australia


Re: ROMCMP - Can anyone confirm if it is UTF-8 Unicode compliant? [Re: Pr3tty F1y]
#384990 - 01/13/20 03:47 AM


MAME has all kinds of issues with UTF-8 command line arguments and console I/O on Windows, while it works most of the time on Linux/Mac. There doesn't seem to be a genuinely good way to solve it though.

Right now, MAME blindly outputs UTF-8 via cout/cerr. This works most of the time on Linux/Mac because almost everyone uses UTF-8 locales these days. It doesn't work on Windows, because the Windows console subsystem will try to interpret the output as the current ANSI code page, which typically depends on the display language.
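The mismatch described above can be reproduced without MAME. The following is a minimal Python sketch that encodes text as UTF-8 and then decodes the same bytes as a legacy single-byte code page, which is roughly what a console does when it misreads UTF-8 output. The function name `misread_utf8` and the choice of `cp1252` as the example code page are assumptions for illustration; the code page actually in effect depends on the system.

```python
def misread_utf8(text: str, code_page: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes as a legacy code page.

    Bytes that the code page does not define are replaced with U+FFFD.
    """
    return text.encode("utf-8").decode(code_page, errors="replace")


# U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
# Read as cp1252, each byte becomes its own character: U+00C3, U+00A9.
garbled = misread_utf8("\u00e9")
print(ascii(garbled))  # prints '\xc3\xa9'
```

Each Japanese character in a file name is three bytes in UTF-8, so under the same assumption it would show up as three unrelated characters.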

Bletch (npwoods) attempted to solve it in imgtool by switching the entire thing to use wchar_t I/O streams. The net result is that it still works fine on Linux/Mac, but is broken in a different way on Windows. You now end up with interspersed NUL characters in the output. It could be an issue in the MinGW runtime itself causing this, but it's not really relevant - the fact is it doesn't work.
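One plausible way interspersed NULs can arise (an assumption, not something confirmed in this thread or traced to the MinGW runtime) is wide characters reaching a byte-oriented console as raw UTF-16 code units. A minimal Python sketch of that effect, with the helper name `nul_offsets` being hypothetical:

```python
def nul_offsets(text: str, encoding: str = "utf-16-le"):
    """Return the encoded bytes and the offsets of every NUL byte in them."""
    data = text.encode(encoding)
    return data, [i for i, b in enumerate(data) if b == 0]


# Every ASCII character occupies two bytes in UTF-16LE, the second being 0x00.
data, zeros = nul_offsets("MAME")
print(data)   # prints b'M\x00A\x00M\x00E\x00'
print(zeros)  # prints [1, 3, 5, 7]
```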

The other thing complicating it is -listxml, which needs to be able to produce a UTF-8 XML file. For this to work on Windows in a command shell, the current hack of spitting out UTF-8 via cout is the only approach that works. However, it seems to break if you try to do it from PowerShell.

tl;dr console I/O is pretty much fucked on Windows, and attempts to fix it have failed



Pr3tty F1y
MAME Fan
Reged: 07/18/05
Posts: 396


Re: ROMCMP - Can anyone confirm if it is UTF-8 Unicode compliant? [Re: Vas Crabb]
#384998 - 01/13/20 11:35 PM


Thank you for the detailed explanation.

It makes sense. Until I started looking into Unicode and UTF-8, I was flabbergasted when I found out that there wasn't a standardized, universal character set that held all characters. I guess it does make sense just due to the variation, but still.

And Windows definitely makes it worse, but at least in this case, the limitation could be bypassed by simply renaming the zip files and leaving the original rom files with their original titles.

ROMCMP proved invaluable in identifying that WiiuVCextractor has a flaw: it modifies the binary data of Game Boy Advance roms that do not start with the byte 0x2E. It alters the first byte to 0x2E on the assumption that all GBA roms start with this entry point, but the truth is that *most* GBA roms start with 0x2E, not all.
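For anyone wanting to check their own dumps, here is a minimal Python sketch of the two checks involved: whether a rom begins with 0x2E, and where two copies of a rom first differ. It is not how romcmp or WiiuVCextractor work internally; the function names and the sample bytes are made up for illustration.

```python
def starts_with_2e(rom: bytes) -> bool:
    """True if the rom is non-empty and its first byte is 0x2E."""
    return len(rom) > 0 and rom[0] == 0x2E


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical.

    If one input is a prefix of the other, the offset is the shorter length.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Made-up sample data: an "original" whose first byte is not 0x2E,
# and an "extracted" copy where that byte was overwritten with 0x2E.
original = bytes([0x7F, 0x00, 0x00, 0xEA])
extracted = bytes([0x2E, 0x00, 0x00, 0xEA])
print(starts_with_2e(original))               # prints False
print(first_difference(original, extracted))  # prints 0
```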

