MAMEWorld Forums - EmuChat

Well, seeing as there are so few active discussions here, my fear of adding noise to the boards has dropped low enough that I've decided to open a discussion on a half-baked idea.

The half-baked idea is to implement a facility that supports writing A.I. ('artificial intelligence') agents that play arcade games using MAME.

There are many ways that one could approach this issue but I've already solidified my plans for some of it.

I already have patches to MAME that turn MAME into a library, allowing it to be linked into any program, giving that program easy facilities for starting and controlling emulated games, and receiving callbacks as the game is emulated. The API and callbacks will be familiar to the MAME developers as they do not deviate very far from what is provided to an OSD implementation:
Controlling the emulator:
- Start a game, running in the calling thread
- Provide callbacks to be made from the emulator as it runs

Callbacks, made frequently (essentially at the video refresh rate of the game):
- Startup percent complete (loading roms, initializing engine)
- Ask for current controllers state
- Provide a frame of video
- Provide a frame of audio
- Paused callback
- Callback that gives the code using the API an opportunity to make other calls to change the game state:
  - Pause
  - Schedule exit
  - Schedule hard/soft reset
  - Save game state
  - Load game state
  - Alter a dipswitch setting
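
To make the list above a bit more concrete, here is a rough C++ sketch of what such a callback interface might look like. Every name in it (RunCallbacks, Request, ControllersState, RunFakeGame) is made up for illustration and is not the actual libmame API; the dipswitch call is left out for brevity, and RunFakeGame is only a stand-in loop so that the sketch can be exercised without MAME.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Every name below is hypothetical; none of it is the real libmame API.
enum class Request { None, Pause, Exit, HardReset, SoftReset, SaveState, LoadState };

struct ControllersState {
    bool coin = false, start = false;
    bool up = false, down = false, left = false, right = false;
};

// One virtual method per callback in the list, each with a do-nothing default.
class RunCallbacks {
public:
    virtual ~RunCallbacks() = default;
    virtual void StartingUp(int /*percentComplete*/) {}
    virtual void PollControllers(ControllersState & /*state*/) {}
    virtual void VideoFrame(const std::uint32_t * /*pixels*/, int /*width*/, int /*height*/) {}
    virtual void AudioFrame(const std::int16_t * /*samples*/, std::size_t /*count*/) {}
    virtual void Paused() {}
    // The "opportunity to make other calls": the return value is the request.
    virtual Request MakeRunningGameCalls() { return Request::None; }
};

// Stand-in for the emulator loop; it runs in the calling thread and
// returns the number of frames that were "emulated".
int RunFakeGame(RunCallbacks &callbacks, int maxFrames) {
    assert(maxFrames >= 0);
    callbacks.StartingUp(100);
    int frames = 0;
    while (frames < maxFrames) {
        ControllersState state;
        callbacks.PollControllers(state);
        callbacks.VideoFrame(nullptr, 0, 0);
        callbacks.AudioFrame(nullptr, 0);
        ++frames;
        if (callbacks.MakeRunningGameCalls() == Request::Exit) break;
    }
    return frames;
}

// Example user of the interface: schedules an exit after a fixed number of frames.
class ExitAfter : public RunCallbacks {
public:
    explicit ExitAfter(int frames) : remaining_(frames) {}
    Request MakeRunningGameCalls() override {
        return (--remaining_ <= 0) ? Request::Exit : Request::None;
    }
private:
    int remaining_;
};
```

A program would derive from RunCallbacks, override only the callbacks it cares about, and hand the object to the run call.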

The only aspect of this that differs significantly from the callbacks that MAME makes into its OSD layer is the 'ask for current controllers state' callback. MAME's API lets the OSD register one callback per input device, which MAME calls to get the current state of that device. libmame, like the OSD implementations, acquires the state that it returns from these callbacks via a single poll done once in every 'update' call from the emulator. But whereas the OSD implementations poll this state internally, libmame makes a callback out to acquire it, and the query takes the form of a single large structure containing every possible game input, which the callback is expected to update with the current input state. The important point is that I think this gives any user of the library a very simple way to provide controller state: it doesn't have to worry about N callbacks, one for each possible input device for the game; instead it has to worry about 1 callback, which expects to get an update of the state of all input devices for the game. libmame takes care of the gory details of mapping the per-device callbacks to code that pulls the data from the results of that single large 'poll all controllers state' callback made to the user of libmame.
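
The "one poll, many device callbacks" arrangement described above could be sketched roughly as follows. This is only an illustration under my own assumptions: InputBridge, AllControllersState and the getter functions are hypothetical names, and the real structure would cover far more inputs than a joystick, a coin slot and a start button.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical stand-in for the single large "every possible input" structure.
struct AllControllersState {
    std::array<bool, 4> joystick{};  // up, down, left, right
    bool coin = false;
    bool start = false;
};

using PollCallback = std::function<void(AllControllersState &)>;

class InputBridge {
public:
    explicit InputBridge(PollCallback poll) : poll_(std::move(poll)) {}

    // Called once per emulator update: a single callback out to the library user.
    void Update() {
        state_ = AllControllersState{};
        poll_(state_);
    }

    // Per-device getters of the kind the emulator core would call;
    // each one only reads the snapshot taken by the last Update().
    std::function<int()> JoystickGetter(std::size_t direction) {
        assert(direction < state_.joystick.size());
        return [this, direction] { return state_.joystick[direction] ? 1 : 0; };
    }
    std::function<int()> CoinGetter() {
        return [this] { return state_.coin ? 1 : 0; };
    }

private:
    PollCallback poll_;
    AllControllersState state_;
};
```

The user supplies one function that fills in the whole structure; the getters returned by the bridge stay valid for as long as the bridge object itself is alive.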

This API provides a mechanism for any program to run a game via the MAME emulator, and to control it, get the video and audio that it produces, and provide controller inputs to it. The obvious use is to link this into a program that interfaces with a human to show them the video and audio and solicit their input. This is in fact what I am primarily using this for - an 'integrated' frontend where MAME runs in the frontend process itself and is under complete control of the frontend, which can more seamlessly interact with it than typical frontends which just launch MAME and then get out of the way.

But there is a secondary possibility, and that is to replace the human with an A.I. The most useful way to do this would be to create another API that makes it easy to write a program that provides an A.I. to play a game. However, there is a very important caveat here that I haven't explored in great detail:

It is necessary to be able to tell the current game state for a given game on a frame by frame basis. One way to do this is to expect the A.I. to take video and audio frames, interpret the video pixels/audio frames directly, and come up with its own internal representation of the game state from that. This would require very sophisticated image recognition techniques and it seems like it would be almost impossible for complex games. While it is an interesting aspect of game playing A.I., it's not one that I would expect every A.I. developer to want to have to deal with. So as an additional mechanism for getting game state, I would like to provide a definition of the game state for every game and a way to populate that game state from the running game. An example will help illustrate what I mean:

For Pac-man, the minimum game state that the A.I. would need is:
- Where on the screen is each ghost

The rest of it the A.I. could keep track of itself because it knows where Pac-man and each dot and power pill start on the screen, and it knows what 'must have' happened on every game frame given its input. The only unknown is what the ghosts are doing because the game controls that, and so that is the only game state that would really need to be delivered to an A.I.

However, some other game state may be easy to deliver and may simplify some aspects of an A.I.:
- Where on the screen is Pac-man
- What dots are on the screen and where
- What power pills are on the screen and where
- What color are the ghosts (normal, blue, white)

Thus there is quite a lot of variability in what one person may feel is game state that should be delivered to the A.I. versus what the A.I. should figure out for itself, even for a game as simple as Pac-man. My personal interests are not to write the 'eyes' and 'ears' of the A.I., but instead just the 'brains'. And so I personally would like the game state to tell me everything relevant that a human would normally know just from what their eyes and ears tell them. But others find the challenge of implementing the eyes and ears (i.e. turning pixels and audio into an ongoing accurate representation of the game state) interesting, so I would want to enable but not require that.
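
As a rough illustration of the Pac-man example, a game state structure with one required part and several optional parts might look something like the sketch below. All of the names (PacmanState, PacmanFields, Has, NearestGhostDistance) are hypothetical, and the 'valid' bit mask is just one possible way to say which parts the framework filled in and which are left to the A.I.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Point { int x; int y; };
enum class GhostColor { Normal, Blue, White };

// Which parts of the state were filled in (hypothetical scheme).
enum PacmanFields : std::uint32_t {
    kGhostPositions = 1u << 0,  // the minimum an A.I. needs
    kPacmanPosition = 1u << 1,
    kDots           = 1u << 2,
    kPowerPills     = 1u << 3,
    kGhostColors    = 1u << 4
};

struct PacmanState {
    std::uint32_t valid = 0;
    std::array<Point, 4> ghosts{};
    Point pacman{};
    std::vector<Point> dots;
    std::vector<Point> powerPills;
    std::array<GhostColor, 4> ghostColors{};
};

inline bool Has(const PacmanState &state, PacmanFields field) {
    return (state.valid & field) != 0;
}

// A tiny piece of 'brains': Manhattan distance from a point to the closest ghost.
int NearestGhostDistance(const PacmanState &state, Point from) {
    assert(Has(state, kGhostPositions));
    int best = -1;
    for (const Point &g : state.ghosts) {
        int d = std::abs(g.x - from.x) + std::abs(g.y - from.y);
        best = (best < 0) ? d : std::min(best, d);
    }
    return best;
}
```

An A.I. that wants to do its own tracking would look only at the ghost positions, while one that wants everything handed to it would check the other bits first.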

OK so for 'flavors' of the API that provide game state instead of video and audio to the A.I., where does this game state come from?

This is where I have uncertainty. I am just hoping and praying that it will be possible to deduce this information from values that can be extracted from the emulator in a knowable way each time a game is run. For example, for Pac-man:

- Would there be a specific set of memory addresses that contain the current locations of the ghosts that can be polled on every frame?

I am hoping that this is true. I know that the 'cheats' system of MAME exists and to me it demonstrates that there are certainly some aspects to games that can be intrusively read or written and consistently produce an expected result. I also know that these are very much on a game-by-game basis which would make adding 'game state' APIs a one-off for each game. This greatly adds to the work load of producing the framework for supporting an A.I.; but these can be done by people interested in writing the A.I.s so there can be an easy parallelism of the effort.
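
If it does turn out that the interesting values sit at fixed addresses, the per-game work could be little more than a table of named addresses that gets sampled on every frame. Here is a minimal sketch of that idea. The addresses are placeholders and NOT taken from the real Pac-man memory map, and Probe, ReadByte and SampleState are hypothetical names rather than anything MAME provides.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// How the sampler reads emulated memory; the real hook would come from the emulator.
using ReadByte = std::function<std::uint8_t(std::uint16_t address)>;

struct Probe {
    std::string name;
    std::uint16_t address;
};

// Per-game table. These addresses are PLACEHOLDERS for illustration only.
std::vector<Probe> PacmanProbes() {
    return {
        {"ghost0_x", 0x1000},
        {"ghost0_y", 0x1001},
    };
}

// Called once per frame: read every probed address into a name -> value map.
std::map<std::string, int> SampleState(const std::vector<Probe> &probes,
                                       const ReadByte &read) {
    assert(static_cast<bool>(read));
    std::map<std::string, int> values;
    for (const Probe &p : probes) {
        values[p.name] = read(p.address);
    }
    return values;
}
```

Keeping the table separate from the sampling code is what would let different people fill in different games in parallel.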

For complex games, this might be an intractable problem; I don't know. That is especially true if there are games that use lots of dynamic memory structured in ways that cannot be easily examined.

Here are some of the features that I would hope to be able to provide:

- libmame would be linked into a standalone program that would have a GUI that would:
  - Allow the user to select a game to be run
  - Allow the user to select the A.I. to be used to play the game
  - Set up the game to be emulated:
    - Configure dipswitches (difficulty, etc)
    - Provide a canned input sequence to start the game
      - For example, wait 5 seconds, add a coin, press 1 player start
      - Would be very useful to provide further inputs to get the game into a known state to make things harder for the A.I. (for example, starting Pac-man with a few seconds of random play so that the A.I. could not just use a pattern for the level)
  - Set up parameters controlling the "rules":
    - Does the A.I. have to provide its input 'in real time', which means that it has to calculate its next move in the interval between frames while the game runs at real time speed, or can it instead take as much time as it wants, slowing the game down from real time speed?
  - Start the game
  - Show the game as it is emulated and driven by the A.I., including showing the controller inputs that the A.I. is delivering on every frame
  - Launch the A.I. in a separate thread and allow it to register a callback that gives it the current 'game state' and a 'controllers state' structure for the A.I. to fill in with the controller inputs that it wants to deliver as a result of that game state
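
The canned input sequence from the list above ("wait 5 seconds, add a coin, press 1 player start") could be expressed as a small list of steps that is played back one frame at a time before the A.I. takes over. This is a sketch only: Step, Button, Inputs and CannedSequence are hypothetical names, and waits are counted in frames, so "5 seconds" has to be converted using the game's refresh rate (300 frames at an assumed 60 frames per second).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Inputs {
    bool coin = false;
    bool start1 = false;
};

enum class Button { Coin, Start1 };

// Wait 'waitFrames' with nothing pressed, then hold 'button' for 'holdFrames'.
struct Step {
    int waitFrames;
    Button button;
    int holdFrames;
};

class CannedSequence {
public:
    explicit CannedSequence(std::vector<Step> steps) : steps_(std::move(steps)) {
        for (const Step &s : steps_) {
            assert(s.waitFrames >= 0 && s.holdFrames > 0);
        }
    }

    // Fills in the inputs for the next frame.
    // Returns false (with nothing pressed) once the sequence is finished.
    bool Next(Inputs &out) {
        out = Inputs{};
        if (index_ >= steps_.size()) return false;
        const Step &s = steps_[index_];
        if (frame_ >= s.waitFrames) {
            if (s.button == Button::Coin) out.coin = true;
            else out.start1 = true;
        }
        if (++frame_ >= s.waitFrames + s.holdFrames) {
            frame_ = 0;
            ++index_;
        }
        return true;
    }

private:
    std::vector<Step> steps_;
    std::size_t index_ = 0;
    int frame_ = 0;
};
```

The framework would call Next() from its controllers state callback until it returns false, and only then start passing that callback through to the A.I.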

As for what language the A.I. should be written in: the core is C++ since that is the language of libmame and MAME. And so an A.I. could be written as C++ methods, linked into a shared object library that would be dynamically loaded at runtime when the user selects that library as the A.I. to run. But wrappers could be made that would allow other languages to participate; the A.I. library could include, for example, a Java VM that would be run, and the C++ callback to provide game state and query controller input would be turned into a call into the VM to provide these to a Java thread running within the VM. Then developers could write their A.I. as Java classes which would then be loaded by the VM to provide the game A.I. Similarly for other interpreted languages.
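
For the shared object approach, one common pattern is for each A.I. library to export a pair of plain C functions that create and destroy an object implementing a known C++ interface. The sketch below shows the A.I. side of such a contract. Agent, GameState, Controls, CreateAgent and DestroyAgent are all hypothetical names, and the sketch assumes screen coordinates where x grows to the right.

```cpp
#include <cassert>

// Deliberately tiny stand-ins for the real game state and controller structures.
struct GameState {
    int ghostX, ghostY;
    int playerX, playerY;
};
struct Controls {
    bool up, down, left, right;
};

// The interface that the host program would know about.
class Agent {
public:
    virtual ~Agent() = default;
    // Called once per frame: read the state, fill in the controls.
    virtual void Decide(const GameState &state, Controls &controls) = 0;
};

// A trivial example A.I.: always move horizontally away from the ghost.
class RunAwayAgent : public Agent {
public:
    void Decide(const GameState &state, Controls &controls) override {
        controls = Controls{};
        if (state.ghostX > state.playerX) controls.left = true;
        else controls.right = true;
    }
};

// Unmangled entry points that the host would look up by name.
extern "C" Agent *CreateAgent() { return new RunAwayAgent(); }
extern "C" void DestroyAgent(Agent *agent) {
    assert(agent != nullptr);
    delete agent;
}
```

On POSIX systems the host could load such a library with dlopen and find the two entry points with dlsym; a Java or other language wrapper would simply be another Agent implementation that forwards Decide() into its VM.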

So what is the goal of all this? Well, mostly I just think it would be really fun to be able to write A.I. programs to play old arcade games. And the facilities I described here would allow anyone to focus on the A.I., with all of the details of controlling the emulator and displaying the results taken care of for them. I envision a few interesting possibilities:

- Contests to see who could write the best A.I. for a given game
- Head-to-head A.I. versus A.I. in any competitive 2 or more player game (fighting games would be good ones)
- 'Aftermarket' A.I.s to control the CPU player in player-vs-cpu games (for example, making different A.I.s to control the non-human opponent in e.g. Street Fighter II)
- Learning environments like A.I. classes at schools and universities
- Challenges like writing an A.I. that can beat the best human score in games like Donkey Kong etc.
- Advanced 'attract mode' for arcade cabinets where the games actually play themselves

Well anyway I'm open to any and all discussion on this topic. This is not just idle speculation as I intend to make real efforts on this project after I finish with my more immediate project of my integrated frontend program and subsequent MAME cabinet (which itself may take years at the current rate ...).