MAMEWorld >> News

Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


History.dat, the XML
#389258 - 12/29/20 11:21 PM


/!\ The new format is not yet supported by MAME or any frontend. Please don't download it if you are a user. /!\

History.dat, the XML version
----------------------------

As with MAME, it is "time for an overhaul" of history.dat.

The DAT format used by History.dat (and other support files) is old and no longer suited to storing a large game-information database.

We decided to migrate to a more modern format, XML, which programs can read faster and which is better suited to storing large amounts of data. We have already tested it in a home-made frontend, and the gain in read performance is very noticeable.

If a mamedev can add support for this new format, here is the file: https://www.arcade-history.com/temp/history-xml5.zip EDIT: file updated to the latest valid structure.

If a mamedev says this format has problems, the idea will be abandoned (or at least put aside) and we'll continue with the DAT format.
We are open for improvements or corrections to our XML design, in case there are any.
But our MAME frontend implementation proves that the XML design works very well and can be easily read (and/or transferred to a database like SQLite, which MAME currently uses when converting history.dat).

Anyway, for now, History.dat will continue to be exported as a DAT file. Both formats will be available on our Download page for the next release, and as soon as the XML version is supported by MAME, the DAT version will be discontinued.

Edited by Alexis B. (12/30/20 10:54 PM)






crazyc
MAME Fan
Reged: 06/23/16
Posts: 62


Re: History.dat, the XML new [Re: Alexis B.]
#389261 - 12/30/20 01:15 AM


I'm definitely not opposed to tossing the dat format, which stinks. Why have separate tags for mame vs software? If a tag has an sl attribute then it's a softlist entry; having different tags for that seems unnecessary.



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: crazyc]
#389263 - 12/30/20 02:07 AM


> I'm definitely not opposed to tossing the dat format which stinks. Why have separate
> tags for mame vs software? If a tag has an sl attribute then it's a softlist entry,
> having different tags for that seems unnecessary.

To avoid a condition (checking whether sl="" is present or not): an XML parser can see a tag and immediately determine the type without looking at its attributes. So for all mame tags, the check for sl="" is never triggered. I think it's a speed gain, but maybe I'm wrong? If required, I can correct this.

Edited by Alexis B. (12/30/20 02:21 AM)






agard
MAME Fan
Reged: 08/04/13
Posts: 331


Re: History.dat, the XML new [Re: Alexis B.]
#389264 - 12/30/20 04:16 AM


I'm really glad you're making the move to XML, because I've always wanted to edit the history dat to my liking, so great work.

One thing that would be good, though: say I have spent hours editing the info to my liking. When you update the file, would it be possible to also publish an XML containing only what has been updated? People like me could then edit just the updated entries to our liking and add them to our edited history XML.
I do hope you can do this, or at least publish a list of what's been updated so we can manually add it to our edited XML.

Thank you as always for the work you do on it



Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia


Re: History.dat, the XML new [Re: Alexis B.]
#389265 - 12/30/20 07:10 AM


I also support switching to XML – it’s easier to parse, more tools support it, etc. But would it be cleaner to do it like this? Element/attribute names are preliminary, but the structure should be apparent.

Code:


<history>
  <entry>
    <systems>
      <system name="99lstwar" />
      <system name="99lastwara" />
    </systems>
    <text>The text goes here, remember to use CDATA if ignorable whitespace becomes an issue...</text>
  </entry>
  <entry>
    <software>
      <item list="nes" name="002agent" />
      <item list="nes" name="002agentb" />
      <item list="nes" name="002agenta" />
    </software>
    <text>Description for software.</text>
  </entry>
</history>



This approach would be a lot easier to automatically generate bindings for (e.g. in .NET languages or with XSDC++), and easier to load into a database.
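
To illustrate the "easier to load into a database" point, here is a minimal sketch in Python using only the standard library. It assumes the preliminary element names from the example above (systems/system, software/item, text); the table layout and the function names load_history and lookup are hypothetical, not anything MAME or arcade-history.com actually uses.

```python
import sqlite3
import xml.etree.ElementTree as ET

SAMPLE = """<history>
  <entry>
    <systems>
      <system name="99lstwar" />
      <system name="99lastwara" />
    </systems>
    <text>The text goes here.</text>
  </entry>
  <entry>
    <software>
      <item list="nes" name="002agent" />
      <item list="nes" name="002agentb" />
    </software>
    <text>Description for software.</text>
  </entry>
</history>"""


def load_history(xml_text, conn):
    """Store each entry's text once and link every system/software name to it."""
    conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, text TEXT)")
    # softlist is NULL for systems and holds the software list name otherwise.
    conn.execute("CREATE TABLE target (entry_id INTEGER, softlist TEXT, name TEXT)")
    for entry in ET.fromstring(xml_text).iter("entry"):
        cur = conn.execute(
            "INSERT INTO entry (text) VALUES (?)", (entry.findtext("text", ""),)
        )
        entry_id = cur.lastrowid
        for system in entry.iterfind("systems/system"):
            conn.execute(
                "INSERT INTO target VALUES (?, NULL, ?)",
                (entry_id, system.get("name")),
            )
        for item in entry.iterfind("software/item"):
            conn.execute(
                "INSERT INTO target VALUES (?, ?, ?)",
                (entry_id, item.get("list"), item.get("name")),
            )
    conn.commit()


def lookup(conn, name, softlist=None):
    """Return the text for a system (softlist=None) or for a software item."""
    row = conn.execute(
        "SELECT e.text FROM entry e JOIN target t ON t.entry_id = e.id "
        "WHERE t.name = ? AND t.softlist IS ?",
        (name, softlist),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
load_history(SAMPLE, conn)
print(lookup(conn, "99lastwara"))
print(lookup(conn, "002agent", "nes"))
```

Because each text is stored once and the names only point at it, clones that share a description do not duplicate it in the database.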



RaspBear
retrogamer
Reged: 09/12/16
Posts: 304
Loc: Italia


Re: History.dat, the XML new [Re: Alexis B.]
#389267 - 12/30/20 08:53 AM


Great idea to move to XML, thanks.
In terms of XML format, while I appreciate avoiding duplication of descriptions, I would suggest a very simple structure, like the gamelist.xml used by EmulationStation.

Each game item would be something like:

Code:


<system>
  <systemname>Arcade</systemname>
  <game>
    <name>Pengo</name>
    <romname>pengo.zip</romname>
    <!-- parent is filled in only for a child or clone; if the field is empty, it's a parent -->
    <parent>nameoftheparentrom</parent>
    <parentSys>system name of the parent rom</parentSys>
    <desc>....</desc>
    ...
    ...
  </game>
</system>



Just my 2 cents.



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389270 - 12/30/20 03:05 PM


Thanks for the suggestion. Here is a new file, structured exactly as you proposed:

https://www.arcade-history.com/temp/history-xml2.zip

Please let me know if everything is OK for you.

Please note there is one special case where an entry can have both a system AND a software item (for example, with vgmplay):


Code:

< entry >
< system >
< item name="10yardj" >
< /system >
< software >
< item list="vgmplay" name="10yard" >
< /software >
< text >
Arcade Video game published 37 years ago:
10-yard Fight (c) 1983 Irem Corp.

< /text >
< /entry >



Edited by Alexis B. (12/30/20 03:28 PM)






Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia


Re: History.dat, the XML new [Re: Alexis B.]
#389271 - 12/30/20 03:28 PM


Well, it's missing the / for self-closing item elements - it should be:

Code:


<entry>
<system>
<item name="10yardj" />
</system>
<software>
<item list="vgmplay" name="10yard" />
</software>
<text>
Arcade Video game published 37 years ago:
10-yard Fight (c) 1983 Irem Corp.
</text>
</entry>



That should work fine.

The only potential issue is that in an XML DTD you'll need to make the "list" attribute on the "item" element implied (#IMPLIED, i.e. optional), because you can't have different definitions for the "item" element depending on where it appears. This means that an XML DTD validator won’t be able to detect an "item" element inside the "software" element that is missing the "list" attribute.
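
In DTD terms the declaration would be something like `<!ATTLIST item list CDATA #IMPLIED>`. A producer or consumer can still catch the case the DTD cannot express with a small extra check after parsing. The following is only a sketch in Python under the structure shown above; the function name items_missing_list is hypothetical.

```python
import xml.etree.ElementTree as ET


def items_missing_list(xml_text):
    """Return the names of <item> elements under <software> with no "list" attribute."""
    root = ET.fromstring(xml_text)
    return [
        item.get("name")
        for item in root.iterfind(".//software/item")
        if item.get("list") is None
    ]


GOOD = """<history><entry>
  <system><item name="10yardj" /></system>
  <software><item list="vgmplay" name="10yard" /></software>
  <text>10-yard Fight (c) 1983 Irem Corp.</text>
</entry></history>"""

# Same entry, but the software item lost its "list" attribute.
BAD = GOOD.replace(' list="vgmplay"', "")

print(items_missing_list(GOOD))
print(items_missing_list(BAD))
```

Items under "system" are never inspected, so they may omit "list" freely, which is exactly the asymmetry a single DTD definition of "item" cannot describe.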



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389272 - 12/30/20 03:32 PM


> Well, it's missing the / for self-closing item elements - it should be:
> That should work fine.


I'll fix this.

EDIT: file fixed: https://www.arcade-history.com/temp/history-xml4.zip


> The only potential issue is that in XML DTD you'll need to make the "list" attribute on the "item" element implied.

Maybe I can rename the item tag to something else (e.g. soft) when it's software. Only software would then have a "soft" tag with a "list" attribute: "item" under "system" will never have a "list" attribute, and "soft" under "software" will always have one.

Edited by Alexis B. (12/30/20 04:22 PM)






Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia
Send PM


Re: History.dat, the XML new [Re: Alexis B.]
#389273 - 12/30/20 03:51 PM


That one has some structural issues:

Code:


$ xmllint --noout ../../../../Downloads/history.xml
../../../../Downloads/history.xml:51733: parser error : Opening and ending tag mismatch: entry line 51726 and system
</system>
^
../../../../Downloads/history.xml:51753: parser error : Opening and ending tag mismatch: history line 51726 and entry
</entry>
^
../../../../Downloads/history.xml:51754: parser error : Extra content at the end of the document
<entry>



The problem starts here:

Code:


<entry>
<system>
<item name="mt_aftrb" />
</system>
<software>
<item list="megatech" name="mt_aftrb" />
</software>
</system>
<text>Sega Mega-Tech cart. published 31 years ago:



Note the stray closing "system" tag after the closing "software" tag.



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE
Send PM


Re: History.dat, the XML new [Re: Vas Crabb]
#389274 - 12/30/20 04:18 PM


Sorry, it should now be fixed.

https://www.arcade-history.com/temp/history-xml4.zip

Edited by Alexis B. (12/30/20 04:18 PM)






Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia


Re: History.dat, the XML new [Re: Alexis B.]
#389275 - 12/30/20 04:28 PM


Still has issues:

Code:


$ xmllint --noout history.xml
history.xml:79352: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:116022: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:293863: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:385653: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:463527: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:609256: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:609968: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:611936: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:675156: parser error : Opening and ending tag mismatch: system line 65535 and entry
</entry>
^
history.xml:1172185: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:1172221: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:1172235: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:1172249: parser error : Opening and ending tag mismatch: entry line 65535 and software
</software>
^
history.xml:1172258: parser error : Opening and ending tag mismatch: history line 65535 and entry
</entry>
^
history.xml:1172259: parser error : Extra content at the end of the document
<entry>



The issue is an opening tag for a "system" element with no corresponding closing tag:

Code:


<entry>
<system>
<software>

<soft list="amiga_workbench" name="amigos35" />
</software>
<text>Commodore Amiga CD published 21 years ago:



What are you using to generate and validate your XML? It would probably be easier if you used some kind of well-tested tools. XML is pretty mature, and there are a lot of tools for working with it.
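
As one illustration of the "well-tested tools" idea, here is a minimal sketch in Python that builds entries with the standard library's ElementTree instead of concatenating strings, so tags are always balanced and special characters are escaped automatically. The element names follow the corrected example earlier in this thread; the build_entry helper and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET


def build_entry(systems, software, text):
    """Build one <entry>; systems is a list of names, software a list of (list, name)."""
    entry = ET.Element("entry")
    if systems:
        parent = ET.SubElement(entry, "system")
        for name in systems:
            ET.SubElement(parent, "item", name=name)
    if software:
        parent = ET.SubElement(entry, "software")
        for softlist, name in software:
            ET.SubElement(parent, "item", list=softlist, name=name)
    ET.SubElement(entry, "text").text = text
    return entry


history = ET.Element("history")
# Illustrative text containing characters that must be escaped in XML.
history.append(
    build_entry(["10yardj"], [("vgmplay", "10yard")], "Sample text with & and <tags>")
)

xml_text = ET.tostring(history, encoding="unicode")
print(xml_text)

# Parsing the output again is a cheap well-formedness check before publishing.
reparsed = ET.fromstring(xml_text)
```

The serializer writes closing tags itself, so a stray or missing closing "system" tag of the kind reported above cannot occur.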



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389276 - 12/30/20 05:19 PM


Thanks for the issue report. I'll fix my file and come back to you with a correct one. I'm using a home-made PHP script with a ton of conditions... Some are misplaced. I'll fix everything.






Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389278 - 12/30/20 07:16 PM


Sorry for the previous issues. They have been corrected, and the XML syntax is now correct. Please let me know if everything is OK on your side.

https://www.arcade-history.com/temp/history-xml5.zip

Edited by Alexis B. (12/30/20 07:19 PM)






Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia


Re: History.dat, the XML new [Re: Alexis B.]
#389301 - 12/31/20 05:08 AM


Looks good now. This may need some changes to MAME to load it efficiently: we probably need to expose an XML SAX parser to Lua scripts, because building a DOM for big XML files is slow. But the format itself is good, and shouldn't cause a problem for consuming in any application. Thanks for taking on the feedback.
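
To show the difference in approach, here is a minimal SAX-style sketch. It is written in Python with the standard xml.sax module rather than Lua, and it is not MAME's implementation; the HistoryHandler class and its index layout are hypothetical. The handler reacts to elements as they stream past and never holds a tree of the whole file.

```python
import xml.sax

SAMPLE = b"""<history>
  <entry>
    <system>
      <item name="10yardj" />
    </system>
    <software>
      <item list="vgmplay" name="10yard" />
    </software>
    <text>Arcade Video game published 37 years ago:
10-yard Fight (c) 1983 Irem Corp.</text>
  </entry>
</history>"""


class HistoryHandler(xml.sax.ContentHandler):
    """Collect the history text per name while the document streams past."""

    def __init__(self):
        super().__init__()
        self.index = {}      # (list name or None, item name) -> text
        self._keys = []      # names seen in the current entry
        self._chunks = None  # a list only while inside <text>
        self._text = ""

    def startElement(self, tag, attrs):
        if tag == "entry":
            self._keys, self._text = [], ""
        elif tag == "item":
            self._keys.append((attrs.get("list"), attrs.get("name")))
        elif tag == "text":
            self._chunks = []

    def characters(self, content):
        # The parser may deliver one text node in several pieces.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, tag):
        if tag == "text":
            self._text = "".join(self._chunks).strip()
            self._chunks = None
        elif tag == "entry":
            for key in self._keys:
                self.index[key] = self._text


handler = HistoryHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.index[(None, "10yardj")])
```

A real consumer would probably write each entry to a database in endElement instead of keeping every text in a dictionary.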



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: agard]
#389312 - 12/31/20 02:26 PM


> I'm really glad you're making the move to xml because i've always wanted to edit the
> history dat to my liking so great work.
>
> 1 thing that would be good though say i have spent all these hours on editing the
> info to my liking is it possible when you update it you would make an xml with just
> what has been updated so that people like me could then just add what has been
> updated so that we could edit the just what has been updated to our liking then add
> to our edited history xml.
> I do hope you can do this or just make a list of what's been updated so we can
> manually add to our edited xml.
>
> Thank you as always for the work you do on it

Hello. Since the history.dat/.xml is extracted from the arcade-history.com database, I can't do this, but you can easily do it yourself by loading the old and the new history files into a program called WinMerge. It will show you all the differences between the two.
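
For the XML version, the same comparison could also be scripted per entry rather than per line. This is only a sketch in Python, assuming each entry holds named system/software elements plus one text element as in the samples in this thread; the function names and the tiny OLD/NEW documents are made up for the example, and it is not an official tool.

```python
import xml.etree.ElementTree as ET


def index_entries(xml_text):
    """Map (list name or "", item name) to the text of the entry it belongs to."""
    index = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        text = (entry.findtext("text") or "").strip()
        for elem in entry.iter():
            name = elem.get("name")
            if name is not None:
                index[(elem.get("list", ""), name)] = text
    return index


def compare(old_xml, new_xml):
    """Return the sorted keys that were added, removed, or changed."""
    old, new = index_entries(old_xml), index_entries(new_xml)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


OLD = """<history>
  <entry><system><item name="pengo" /></system><text>Old text.</text></entry>
  <entry><system><item name="10yardj" /></system><text>Same text.</text></entry>
</history>"""
NEW = """<history>
  <entry><system><item name="pengo" /></system><text>New text.</text></entry>
  <entry><system><item name="10yardj" /></system><text>Same text.</text></entry>
  <entry><software><item list="nes" name="100mandk" /></software><text>Added.</text></entry>
</history>"""

report = compare(OLD, NEW)
print(report)
```

The "changed" and "added" keys are the entries someone maintaining a personal edit would need to revisit.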






soviet
MAME Fan
Reged: 02/18/09
Posts: 38


Re: History.dat, the XML new [Re: Alexis B.]
#389314 - 12/31/20 03:02 PM


What do you think of structuring the entries, which currently have very regular sections but are a long flat text? The current text element could be split into various elements, some possibly repeatable, in arbitrary order (a frontend can sort and hide them and reconstruct headings)


Example:


Quote:



<entry>
<software>
<item list="nes" name="100mandk" />
</software>
<text>Nintendo Famicom cart. published 32 years ago:</text>
<name>
100万$キッド 幻の帝王編 (c) 1988 Sofel.
($1,000,000 Kid - Maboroshi no Teiou-Hen)
</name>
<text>Based on the manga series $1,000,000 Kid by Yuki Ishigaki.</text>
<technical>

Game ID: SFL-KP
</technical>
<trivia>$1,000,000 Kid was released on January 6, 1989 in Japan (even if titlescreen says 1988).
</trivia>
<trivia>
Exported as "Casino Kid [Model NES-KP-USA]" in North America.
</trivia>
<staff>

CG Design: Tadao Nomura (T. Nomura), Natural
Music: Toshio Murai
Programmer: Hirokazu Sugisaka (H. Sugisaka), Kazu Takahashi (K. Takahashi), Yukiyasu Kutsuna (Y. Kutsuna), Masayoshi Shinohara (M. Sinohara), M. Takahashi, Kazuyuki Oka, Nobu Saito (N. Saito)
Director: Masanori Iwamoto (M. Iwamoto), Fred K. Ishii (Fred Ishii), Marie Atake
Producer: Yuji Yamaguchi (Y. Yamaguchi)
President: Takeshi Iga
</staff>
<contribute>Edit this entry: https://www.arcade-history.com/?&amp;page=detail&amp;id=83855&amp;o=2</contribute>
</entry>




Entries could be even more structured, particularly with references between entities: games to people, games to companies, games to other games (from the series section and in "see also" links), etc. But it would require a large data entry effort.



motoschifo
MAME Fan
Reged: 07/09/15
Posts: 8
Loc: Italy


Re: History.dat, the XML new [Re: soviet]
#389319 - 12/31/20 07:39 PM Attachment: Schermata del 2020-12-31 18-41-52.png 82 KB (0 downloads)


This is a very good idea. I like the XML structure because it's more standard and flexible nowadays.

You can simplify your XML by adding software and systems nodes without a parent, like MAME does.

Remember also to add the DTD at the top of the XML and to wrap the content of every "text" node in a CDATA section.
I don't know whether CDATA can be specified in the DTD alone.

Example:

Code:

Nintendo Famicom cart. published 32 years ago...
]]>

Arcade Video game kit published 32 years ago:

'88 Games (c) 1988 Konami Industry Company, Limited.

Export release. Game developed in Japan by Konami. For more information, please see the original Japanese release entry: "Hyper Sports Special [Model GX861]".
]]>

...text...
]]>

EDIT: Sorry, I replied to the wrong post... The pre/quote/code block renders the text as HTML, so I added an image as an attachment.
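
On the CDATA point, a conforming parser hands the application the same text whether the content is wrapped in a CDATA section or escaped with entities, so either form should be safe for consumers. A small check in Python, with made-up sample text:

```python
import xml.etree.ElementTree as ET

# The same text written two ways: entity-escaped and as a CDATA section.
escaped = "<text>Tom &amp; Jerry &lt;demo&gt;</text>"
cdata = "<text><![CDATA[Tom & Jerry <demo>]]></text>"

text_a = ET.fromstring(escaped).text
text_b = ET.fromstring(cdata).text
print(text_a == text_b)
```

CDATA mainly saves the producer from escaping "&" and "<" by hand; the one thing the text itself must not contain is the terminator "]]>".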

Edited by motoschifo (12/31/20 07:43 PM)



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389324 - 12/31/20 08:46 PM


> Looks good now. This may need some changes to MAME to efficiently load it – we
> probably need to expose an XML SAX parser to Lua scripts, building DOM for big XML
> files is slow. But the format itself is good, and shouldn't cause a problem for
> consuming in any application. Thanks for taking on the feedback.

Thank you for your patience and your help. The file is finally ready for eventual MAME support.



crazyc
MAME Fan
Reged: 06/23/16
Posts: 62


Re: History.dat, the XML new [Re: Alexis B.]
#389326 - 01/01/21 12:03 AM


I added support here: https://github.com/mamedev/mame/commit/886bf9ac675adb96b406d35a6c3d1a6b2c67332c . As suggested above, you can simplify your schema by removing the systems and software tags (my parser ignores them) and unifying the system and item tags (my parser treats them the same); softlist entries are detected by the list attribute.
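
The behaviour described here can be sketched as follows. This is a Python illustration of the idea only, not the code from the linked commit; the function name entry_targets and both sample entries are hypothetical.

```python
import xml.etree.ElementTree as ET


def entry_targets(entry):
    """Split an entry's named elements into systems and (list, name) software items.

    Wrapper elements are skipped because they carry no "name" attribute, and the
    tag name is ignored: only the presence of "list" marks a softlist item.
    """
    systems, software = [], []
    for elem in entry.iter():
        name = elem.get("name")
        if name is None:
            continue
        softlist = elem.get("list")
        if softlist is None:
            systems.append(name)
        else:
            software.append((softlist, name))
    return systems, software


NESTED = ET.fromstring(
    '<entry><systems><system name="10yardj" /></systems>'
    '<software><item list="vgmplay" name="10yard" /></software>'
    "<text>10-yard Fight (c) 1983 Irem Corp.</text></entry>"
)
FLAT = ET.fromstring(
    '<entry><item name="10yardj" /><item list="vgmplay" name="10yard" />'
    "<text>10-yard Fight (c) 1983 Irem Corp.</text></entry>"
)

print(entry_targets(NESTED))
print(entry_targets(FLAT))
```

Both layouts give the same result, which is why a tolerant reader does not care whether the wrapper tags are kept.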



StilettoAdministrator
They're always after me Lucky ROMS!
Reged: 03/07/04
Posts: 6472


Re: History.dat, the XML new [Re: Alexis B.]
#389328 - 01/01/21 01:01 AM


> > I'm really glad you're making the move to xml because i've always wanted to edit
> the
> > history dat to my liking so great work.
> >
> > 1 thing that would be good though say i have spent all these hours on editing the
> > info to my liking is it possible when you update it you would make an xml with just
> > what has been updated so that people like me could then just add what has been
> > updated so that we could edit the just what has been updated to our liking then add
> > to our edited history xml.
> > I do hope you can do this or just make a list of what's been updated so we can
> > manually add to our edited xml.
> >
> > Thank you as always for the work you do on it
>
> Hello, since the history.dat/.xml is extracted from the arcade-history.com database.
> I can't do this but you can easily do this by putting the old and the new history
> files in a program called Winmerge. It will show you all differences between the
> twos.

Actually...

If you made a repository here: https://github.com/arcadehistory?tab=repositories
JUST for the history.dat / history.xml...

Not only would it be another way for people to download it (and submit changes, tho I am sure you don't desire that method?)

... but it would be an easy way to compare the changes with each new "commit" (release).

However, I think we'd be talking about SO many changes (I haven't actually checked) that GitHub would probably not permit the resulting diff to be displayed.

- Stiletto



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: crazyc]
#389329 - 01/01/21 02:23 AM


> I added support here
> https://github.com/mamedev/mame/commit/886bf9ac675adb96b406d35a6c3d1a6b2c67332c . As
> suggested above you can simplify your schema by removing the systems and software
> tags, my parser ignores them, and unify the system and item tags, my parser treats
> them the same, softlist entries are detected with the list attribute.

Thanks for the quick support !



Vas Crabb
BOFH
Reged: 12/13/05
Posts: 4462
Loc: Melbourne, Australia


Re: History.dat, the XML new [Re: Alexis B.]
#389331 - 01/01/21 03:11 AM


> > I added support here
> > https://github.com/mamedev/mame/commit/886bf9ac675adb96b406d35a6c3d1a6b2c67332c .
> As
> > suggested above you can simplify your schema by removing the systems and software
> > tags, my parser ignores them, and unify the system and item tags, my parser treats
> > them the same, softlist entries are detected with the list attribute.
>
> Thanks for the quick support ! If your parser ignores them and everything works
> perfect without them, I've no reason to keep them

It’s better to keep them because it makes it easier to do bindings in a .NET language. MAME isn’t the only thing that ever reads these kinds of data files. Please don’t remove them.



Alexis B.
Historian
Reged: 09/20/03
Posts: 417
Loc: Cannes, FRANCE


Re: History.dat, the XML new [Re: Vas Crabb]
#389332 - 01/01/21 03:21 AM


> > > I added support here
> > > https://github.com/mamedev/mame/commit/886bf9ac675adb96b406d35a6c3d1a6b2c67332c .
> > As
> > > suggested above you can simplify your schema by removing the systems and software
> > > tags, my parser ignores them, and unify the system and item tags, my parser
> treats
> > > them the same, softlist entries are detected with the list attribute.
> >
> > Thanks for the quick support ! If your parser ignores them and everything works
> > perfect without them, I've no reason to keep them
>
> It’s better to keep them because it makes it easier to do bindings in a .NET
> language. MAME isn’t the only thing that ever reads these kinds of data files. Please
> don’t remove them.

OK, no problem; for now I haven't removed anything. I'm keeping my extraction code as it is, since it works well and produces a correct file.

Edited by Alexis B. (01/01/21 03:27 AM)

