|
bug: chdman output depends on system qsort() implementation
#289601 - 06/16/12 12:55 PM
|
|
|
Hi,
trying to rebuild the 0.146 CHD set on Unix to avoid re-downloading, I found that the Huffman compressor (src/lib/util/huffman.c) depends on an exact behavior of the external qsort() function. Because Quicksort is not a stable sort and the node weights which are compared can be the same for different node entries, output of Quicksort is not unambiguously defined.
This makes chdman produce different outputs on different operating systems.
|
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289602 - 06/16/12 03:56 PM
|
|
|
> Hi, > > trying to rebuild the 0.146 CHD set on Unix to avoid re-downloading, I found that the > Huffman compressor (src/lib/util/huffman.c) depends on an exact behavior of the > external qsort() function. Because Quicksort is not a stable sort and the node > weights which are compared can be the same for different node entries, output of > Quicksort is not unambiguously defined. > > This makes chdman produce different outputs on different operating systems.
AFAIK Huffman was used for v4 CHDs and those produced the same output on different operating systems. Are you sure this is actually the problem - ie, you've dumped qsort() output from a CHDMAN run on Windows and Linux?
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289604 - 06/16/12 04:16 PM
|
|
|
http://lmgtfy.com/?q=Quicksort+is+not+a+stable+sort
Code:
/src/lib/util/huffman.c
529 //------------------------------------------------- 530 // tree_node_compare - compare two tree nodes 531 // by weight 532 //------------------------------------------------- 533 534 int CLIB_DECL huffman_context_base::tree_node_compare(const void *item1, const void *item2) 535 { 536 const node_t *node1 = *(const node_t **)item1; 537 const node_t *node2 = *(const node_t **)item2; 538 return node2->m_weight - node1->m_weight; 539 }
...
565 // sort the list by weight, largest weight first 566 qsort(list, listitems, sizeof(list[0]), tree_node_compare);
http://www.gnu.org/software/libc/manual/html_node/Array-Sort-Function.html
Quote:
Warning: If two objects compare as equal, their order after sorting is unpredictable. That is to say, the sorting is not stable. This can make a difference when the comparison considers only part of the elements. Two elements with the same sort key may differ in other respects.
If you want the effect of a stable sort, you can get this result by writing the comparison function so that, lacking other reason distinguish between two elements, it compares them by their addresses. Note that doing this may make the sorting algorithm less efficient, so do it only if necessary.
I think the bottom line is that either...
1) the qsort comparison function would need to be modified to handle the case when items are equal and sort them by another key (address?) that guarantees stable results
...or...
2) a different "stable" sort needs to be used. i.e. mergesort.
Here's a real-world example of modifying the comparison function...
http://old.nabble.com/RFA:-GCC-4.2.1:-Stabalizing-coalesce_list%27s-qsort-td12233553.html
|
GroovyMAME support forum on BYOAC
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: krick]
#289613 - 06/16/12 05:50 PM
|
|
|
Nobody's disputing that non-glibc qsort() may be unstable, I just want proof that it's the problem in this case before we make everyone re-convert all their CHDs *again*. (Because the GCC link indicates that the Linux version of the behavior is correct, so the created-on-Windows CHDs on all the sites are what will turn out to be wrong).
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289617 - 06/16/12 06:28 PM
|
|
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: krick]
#289620 - 06/16/12 06:39 PM
|
|
|
The amounts of data we're sorting would only trigger the memory problem on a system with less than 8 MB or so.
So, I know it's irresistable, but can we stop playing "I gotcha'ed the only remaining dev stupid enough to read MW!" and get back to providing actual factual evidence that the qsort is the reason for the alleged cross-OS difference?
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289630 - 06/16/12 08:10 PM
|
|
|
> The amounts of data we're sorting would only trigger the memory problem on a system > with less than 8 MB or so. > > So, I know it's irresistable, but can we stop playing "I gotcha'ed the only remaining > dev stupid enough to read MW!" and get back to providing actual factual evidence that > the qsort is the reason for the alleged cross-OS difference?
I was just providing evidence to the original poster's claim that qsort *can* produce different results because it's inherently not a stable sort algorithm.
In practice, due to the way that MAME uses qsort, I don't know if it actually *does* produce different output. Re-reading the original post, I'm not sure that he's even claiming that it actually *is* producing different output.
I guess we need for him to come back and provide some proof.
|
GroovyMAME support forum on BYOAC
|
|
|
|
Hi,
sorry for appearing rude or posting the report to a wrong place, but http://mametesters.org/mantis/ I found on mamedev.org returns 403 Forbidden and I couldn't find any other site. If you can point me to a proper bug tracker, I'd be happy to post there.
I don't have Windows development env set up. but I am attaching a proof-of-concept patch, which abuses the m_bits field as a secondary key during the sorting phase to achieve a stable sort with a generic Quicksort. Under glibc, chdman output is the same, because glibc qsort() is stable as you mentioned above.
I expect that on Windows, qsort() is not stable and the output will change. Could you please check if you have time?
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: krick]
#289633 - 06/16/12 08:33 PM
|
|
|
> I was just providing evidence to the original poster's claim that qsort *can* produce > different results because it's inherently not a stable sort algorithm. > > In practice, due to the way that MAME uses qsort, I don't know if it actually *does* > produce different output. Re-reading the original post, I'm not sure that he's even > claiming that it actually *is* producing different output.
Yes, the output is different. Even the same chdman binary produces a different output when run 1) under Wine, using Wine's C library, and 2) under real Windows, using msvcrt.dll.
> I guess we need for him to come back and provide some proof.
With the patch I attached, it's possible to eg. reverse the order of the secondary sort key, which yields completely different (but also valid) compressed data. So the exact qsort() implementation really matters.
Yet another minor issue:
I also noticed that one byte near the end of the output chaosheatj.chd can have several different values. Valgrind reports usage of some uninitialized bytes when writing the index, I'll look into that later.
|
|
|
italie |
MAME owes italie many thank yous, hah
|
|
|
Reged: 09/20/03
|
Posts: 15246
|
Loc: BoomTown
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289634 - 06/16/12 08:38 PM
|
|
|
> the only remaining dev stupid enough to read MW!
That hurts man...
|
|
|
Tafoid |
I keep on testing.. testing.. testing... into the future!
|
|
|
Reged: 04/19/06
|
Posts: 3138
|
Loc: USA
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289637 - 06/16/12 09:06 PM
|
|
|
> Hi, > > sorry for appearing rude or posting the report to a wrong place, but > http://mametesters.org/mantis/ I found on mamedev.org returns 403 Forbidden and I > couldn't find any other site. If you can point me to a proper bug tracker, I'd be > happy to post there.
As a partial co-admin at the mametesters.org site, I cannot see your IP on any deny list we have installed there. Are you getting the same from mamedev.org? If so, the provider itself is blocking you for whatever reasons and it's beyond our control. Posting stuff here is still acceptable as Devs do read this forum regular. Sorry for any inconvenience.
On second thought.. looking at your link, we originally were hosted at "mametesters.org/mantis" (now it's simply "mametesters.org". The old link used to redirect but I guess it doesn't now. Please try without the /mantis bit..
|
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289640 - 06/16/12 10:13 PM
|
|
|
> I expect that on Windows, qsort() is not stable and the output will change. Could you > please check if you have time?
Thanks, that patch looks very reasonable. Let me know what you find out regarding the other issue.
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289650 - 06/17/12 02:09 AM
|
|
|
> > I expect that on Windows, qsort() is not stable and the output will change. Could > you > > please check if you have time? > > Thanks, that patch looks very reasonable. Let me know what you find out regarding the > other issue.
Right, good find. The simplest fix as you mention is to ensure that we never return "equal" from the qsort callback.
Fortunately, no matter the ordering, all CHDs produced the existing "bad" way should be readable by all other implementations. But we should make sure we always generate the same output regardless. Thanks for digging into it.
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289704 - 06/17/12 06:47 PM
|
|
|
> Thanks, that patch looks very reasonable. Let me know what you find out regarding the > other issue.
Fortunately, this one is trivial:
Code:
Index: src/lib/util/chd.c =================================================================== --- src/lib/util/chd.c (revision 15463) +++ src/lib/util/chd.c (working copy) @@ -1788,6 +1788,7 @@ compressed[12] = lengthbits; compressed[13] = selfbits; compressed[14] = parentbits; + compressed[15] = 0; // write the result m_mapoffset = file_append(compressed, complen + 16);
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289705 - 06/17/12 06:58 PM
|
|
|
Yet another unitialized variable fix (this should be quite harmless though)
Code:
Index: src/lib/util/cdrom.c =================================================================== --- src/lib/util/cdrom.c (revision 15463) +++ src/lib/util/cdrom.c (working copy) @@ -734,6 +734,7 @@ { /* parse the metadata */ type[0] = subtype[0] = 0; + pgtype[0] = pgsub[0] = 0; if (sscanf(metadata, CDROM_TRACK_METADATA_FORMAT, &tracknum, type, subtype, &frames) != 4) return CHDERR_INVALID_DATA; if (tracknum == 0 || tracknum > CD_MAX_TRACKS)
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289708 - 06/17/12 07:38 PM
|
|
|
> I am attaching a proof-of-concept > patch, which abuses the m_bits field as a secondary key during the sorting phase to > achieve a stable sort with a generic Quicksort.
I'm not a mamedev, so feel free to ignore me, but I don't think it's a good idea to re-use the m_bits field for a different purpose. Could another field be added to to the node structure for the purposes of secondary sorting?
If the only option is to re-use the field for a different purpose, then it should be thoroughly documented as a convenience hack/abuse in the places where it's being used for another purpose so that future people reading the code understand what's going on.
|
GroovyMAME support forum on BYOAC
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: krick]
#289709 - 06/17/12 07:44 PM
|
|
|
> > I am attaching a proof-of-concept > > patch, which abuses the m_bits field as a secondary key during the sorting phase to > > achieve a stable sort with a generic Quicksort. > > I'm not a mamedev, so feel free to ignore me, but I don't think it's a good idea to > re-use the m_bits field for a different purpose. Could another field be added to to > the node structure for the purposes of secondary sorting? > > If the only option is to re-use the field for a different purpose, then it should be > thoroughly documented as a convenience hack/abuse in the places where it's being used > for another purpose so that future people reading the code understand what's going > on.
Sure, I did it this way so I could avoid changing of the header file and rebuilding of the whole source. The patch is definitely not meant to be applied as is.
|
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289727 - 06/17/12 10:00 PM
|
|
|
> Yet another unitialized variable fix (this should be quite harmless though)
Thanks! I have applied all 3 changes to trunk.
|
|
|
R. Belmont |
Cuckoo for IGAvania
|
|
|
Reged: 09/21/03
|
Posts: 9716
|
Loc: ECV-197 The Orville
|
|
Send PM
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: jmak]
#289728 - 06/17/12 10:13 PM
|
|
|
> Sure, I did it this way so I could avoid changing of the header file and rebuilding > of the whole source. The patch is definitely not meant to be applied as is.
I've applied it as-is because properly running code always trumps theory and discussion. If you'd like to clean it up anyway, of course go ahead.
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289731 - 06/17/12 11:37 PM
|
|
|
> > Sure, I did it this way so I could avoid changing of the header file and rebuilding > > of the whole source. The patch is definitely not meant to be applied as is. > > I've applied it as-is because properly running code always trumps theory and > discussion. If you'd like to clean it up anyway, of course go ahead.
There's nothing wrong with using an arbitrary field to break ties as long as the value of that field is stable. The order of equal elements is ultimately irrelevant, so any stable ordering solution is as good as any other.
|
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: AaronGiles]
#289748 - 06/18/12 04:08 AM
|
|
|
> There's nothing wrong with using an arbitrary field to break ties as long as the > value of that field is stable. The order of equal elements is ultimately irrelevant, > so any stable ordering solution is as good as any other.
What I meant is that (unless I'm misunderstanding his code), he's temporarily stuffing curcode into m_bits (which is not being used yet) so that he can get at it to use as a tie breaker inside the qsort compare...
m_huffnode[curcode].m_bits = curcode;
Then he later sets m_bits back to 0 after the sort...
node.m_bits = 0;
Essentially, he's hijacking an existing member of the node object and storing something else in it temporarily, which is fine, but it just seems like something that could be confusing if it wasn't documented clearly. That's why I suggested that there might be another way to do it that didn't involve hijacking m_bits. Unfortunately, I can't think of one.
At the points where m_bits is being set and later cleared aren't right before and after the sort, so unless you already understand what's going on, I don't think it would make much sense. And if you look inside the comparison function, it appears that m_bits is being compared between the two nodes, but actually the curcode for each node is being compared, it's just stored in m_bits.
|
GroovyMAME support forum on BYOAC
|
|
|
Re: bug: chdman output depends on system qsort() implementation
[Re: R. Belmont]
#289867 - 06/19/12 09:30 PM
|
|
|
Thanks, I just tried the conversion of cubeqst.chd, with Win32, 32bit wine, and Linux/glibc 64bit build. All produced the same valid result.
|
|
|