What's your plan for maintaining temporal coherency?
Even if you get MAME running on a server backend, the user controlling the emulated machine will be issuing input events against a visual state that is most likely already out of date by the time it appears in the client-side window.
By the time a packet carrying the client's input state reaches the server, the emulated machine's local time will have advanced even further, so the server ends up applying those inputs later than the point at which the user intended them to take effect.
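To put a rough number on that, here is a back-of-the-envelope sketch. The function name `input_lag_frames` and the latency figures are hypothetical, not anything from MAME. It assumes a fixed frame rate and counts only network transit, ignoring video encode/decode time and the player's reaction time, so real lag would be worse.

```python
import math

def input_lag_frames(server_to_client_ms: float,
                     client_to_server_ms: float,
                     fps: float = 60.0) -> int:
    """Whole emulated frames that pass between the server producing a
    frame and the input made in response to it arriving back at the server.

    Hypothetical helper for illustration; it models network transit only.
    """
    frame_ms = 1000.0 / fps  # duration of one emulated frame
    round_trip_ms = server_to_client_ms + client_to_server_ms
    return math.floor(round_trip_ms / frame_ms)

# Assumed example: 40 ms each way, emulated machine running at 60 fps.
print(input_lag_frames(40, 40))  # 4
```

So with a fairly ordinary 80 ms round trip, the input lands about four frames after the frame the player was reacting to, which is easily noticeable in many arcade games.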