Sunday, December 30, 2007

Project Spedini is a Go!

I (re)started a spare-time project over the past week and have made a good amount of progress so far, which I wanted to share...

The project stems from my combined interests in AI, computer vision, algorithm design, digital image analysis, video games, and generally trying to make programs that do what humans do, only better.

Therefore, the project's aim is to provide a lightweight framework that aids in the development of computer programs that can play online Flash games. The main goal is the development of the program's AI and processing algorithms, which would allow it to accept inputs from the game that are meant for human and not machine consumption ... and to make logical decisions on the next move without any user input. In order to do this, I need a framework that facilitates the interaction between the program and the Flash game in a human manner.

What do I mean by a "human manner"? I want the program to interact with the game in the same way I would. Therefore, its input should be in the form of visual cues (what can be seen on the computer screen) and possibly audio cues (these will be added later if the need arises). Its output should likewise be the same as mine: mouse and keyboard events.

The program obviously won't be hampered by the physical limitations of my hardware (my mouse can only send so many signals a second) and/or of myself (I can only tap keys so fast), but that's OK. What matters is that the program won't have access to any internal state or logic information about the game that I wouldn't be able to figure out while playing.

So what's my progress so far?


Well, a little background first. The vast majority of my experience is in Java, so that was a natural starting point. However, that presents one big problem. Since one of the tenets of Java is "write once, run anywhere", accessing native components of an OS (or other non-Java programs) is a cumbersome task at best. Since the first two problems I had to tackle were 1) find a way to load/display/interact with a Flash game via Java and 2) interact with that Flash game in the same manner I would as a human (mouse/keyboard) ... this was going to be an issue.

I searched high and low for a viable Java-based Flash viewer and/or browser and found very few. There are a number of open source and commercial projects out there, but almost all of them had major drawbacks. They either didn't support the Flash plugin, weren't implemented completely, and/or lacked Mac OS X support (I work on Windows at work, and on an iMac at home). I finally ended up choosing a solution from the Eclipse Foundation: their SWT API.

SWT is basically a wrapper in much the same way as some of the others are. It wraps calls to a native browser process and displays it in a Java-accessible context. (Oh, and it does tons of other stuff and is a pretty interesting alternative to Swing/AWT.) Why did I choose this one, especially when I had never used it before? Because I figured that since Eclipse runs on both Windows and Mac OS X, the browser component must work on both as well. I was right! But it wasn't as easy as I thought.

In order to get my project to work (a non-Eclipse based project, mind you) I had to find and download the SWT libraries along with the Windows and Mac native libraries. By adding these to my classpath I was able to add a browser component to my app running in either Windows or Mac OS X. Great!

Only one more little problem (which took me the better part of an afternoon to figure out). However SWT registers and accesses the Carbon hooks on Mac OS X, it needs a lovely little JVM argument to be added in order to work. I didn't find this until a couple of hours later, but here it is in case anyone else stumbles upon this article: -XstartOnFirstThread. (FYI, the errors you get without this lead you in a completely different direction. Many thanks go out to this blog.)

So that's part 1 working!

Part 2 was much easier, although not very elegant ... My initial hope was for my programmatic mouse and keyboard interaction to be transparent to the user and be solely event based (e.g. I would create a new mouse event and add it to the Swing/AWT event queue to be processed). Unfortunately, when working with native underlying components ... lightweight Java events don't get passed through correctly. The only solution I have been able to find so far is to use the little-known java.awt.Robot class. This class allows the program to fire native mouse and keyboard events at the OS, and it handles screen capturing duties too. That's great, you say, exactly what we wanted and exactly as if I had done it myself! Yep. So much so that you actually end up seeing the mouse pointer move around the screen, rather than an event quietly being delivered to the game.
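To make the Robot approach concrete, here is a minimal sketch of the three duties mentioned above: a screen capture, a mouse click, and a key press. It is not the project's actual code. The class name `RobotSketch`, its helper methods, and the coordinates are made up for illustration, and it is written against a current JDK (older JDKs used `InputEvent.BUTTON1_MASK` for the button constant). Be warned that running it really does move the pointer and click.

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;
import java.awt.image.BufferedImage;

/** Illustrative only: names and coordinates here are assumptions, not the project's API. */
final class RobotSketch {

    /** Moves the real pointer to the target and performs a left click. */
    static void click(Robot robot, Point target) {
        robot.mouseMove(target.x, target.y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    /** Presses and releases a single key, e.g. KeyEvent.VK_SPACE. */
    static void tapKey(Robot robot, int keyCode) {
        robot.keyPress(keyCode);
        robot.keyRelease(keyCode);
    }

    /** Returns true if the demo ran, false if no Robot could be created. */
    static boolean runDemo() {
        if (GraphicsEnvironment.isHeadless()) {
            return false; // no display, so there is nothing to capture or click
        }
        try {
            Robot robot = new Robot();
            robot.setAutoDelay(50); // pause 50 ms after each generated event

            // "Visual cue" input: grab an arbitrary 400x300 region of the screen.
            BufferedImage frame = robot.createScreenCapture(new Rectangle(0, 0, 400, 300));
            System.out.println("Captured " + frame.getWidth() + "x" + frame.getHeight() + " pixels");

            // "Human" output: a click and a key press, delivered as native events.
            click(robot, new Point(200, 150));
            tapKey(robot, KeyEvent.VK_SPACE);
            return true;
        } catch (AWTException e) {
            return false; // the platform does not allow low-level input control
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo() ? "Robot demo ran" : "Robot unavailable on this machine");
    }
}
```

Because these are native events, they go to whatever window happens to be under the pointer or holding keyboard focus, not necessarily to the game.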

You may wonder why this would concern me. The problem is that by invoking these methods, I can have my program move my mouse (without my immediate control) to any place on the screen and click at will. Start...Shutdown...OK? Big X button? Not very likely, but possible and a scary thought. To get around this I created a "safe" interface around the Robot class (using the MouseInfo class) that allows me to specify either a static rectangle of "safe" coordinates or a dynamic reference to a visual component. The Robot class will then only work within these coordinates (or over the specified window), and if the program tries to move the mouse outside the box, an exception is thrown.
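One way such a "safe" wrapper could look is sketched below. This is an illustration under assumptions, not the project's real interface: `Pointer`, `RobotPointer`, `SafePointer`, and `SafePointerDemo` are hypothetical names, the exception types are my own choice, and re-checking the pointer position via MouseInfo before each click is a guess at how MouseInfo might be used. The safe zone is passed as a `Supplier<Rectangle>` so it can be either a fixed rectangle or recomputed from a component on every call.

```java
import java.awt.MouseInfo;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;
import java.util.function.Supplier;

/** Hypothetical pointer abstraction, so the bounds check can run without a display. */
interface Pointer {
    void moveTo(int x, int y);
    void click();
    Point position();
}

/** Pointer backed by java.awt.Robot; the current position comes from MouseInfo. */
final class RobotPointer implements Pointer {
    private final Robot robot;

    RobotPointer(Robot robot) {
        this.robot = robot;
    }

    @Override
    public void moveTo(int x, int y) {
        robot.mouseMove(x, y);
    }

    @Override
    public void click() {
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    @Override
    public Point position() {
        // Needs a display; getPointerInfo() may return null when there is no mouse.
        return MouseInfo.getPointerInfo().getLocation();
    }
}

/** Refuses any move or click that falls outside the current safe zone. */
final class SafePointer {
    private final Pointer delegate;
    private final Supplier<Rectangle> safeZone;

    SafePointer(Pointer delegate, Supplier<Rectangle> safeZone) {
        this.delegate = delegate;
        this.safeZone = safeZone;
    }

    void moveTo(int x, int y) {
        Rectangle zone = safeZone.get();
        if (!zone.contains(x, y)) {
            throw new IllegalArgumentException(
                    "Target (" + x + ", " + y + ") is outside safe zone " + zone);
        }
        delegate.moveTo(x, y);
    }

    void click() {
        // The human may have nudged the mouse since the last move, so check again.
        Point p = delegate.position();
        Rectangle zone = safeZone.get();
        if (!zone.contains(p)) {
            throw new IllegalStateException(
                    "Pointer at (" + p.x + ", " + p.y + ") is outside safe zone " + zone);
        }
        delegate.click();
    }
}

final class SafePointerDemo {
    public static void main(String[] args) {
        // In-memory stand-in for the real mouse, so the demo never touches the cursor.
        Point cursor = new Point(0, 0);
        Pointer fake = new Pointer() {
            @Override public void moveTo(int x, int y) { cursor.setLocation(x, y); }
            @Override public void click() { System.out.println("click at " + cursor.x + "," + cursor.y); }
            @Override public Point position() { return new Point(cursor); }
        };
        SafePointer safe = new SafePointer(fake, () -> new Rectangle(100, 100, 200, 150));
        safe.moveTo(150, 120);
        safe.click();
        try {
            safe.moveTo(10, 10); // outside the rectangle, so this is refused
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

For the real thing, the stand-in would be replaced with `new RobotPointer(new Robot())`. A dynamic zone that follows a visible component could be supplied as `() -> new Rectangle(component.getLocationOnScreen(), component.getSize())`, where `component` is whatever widget hosts the game.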

Part 2 down!

Next up ... I need to refactor the framework to be a bit more friendly and adaptable for the Flash-playing programs I intend it for, but it's basically there. If I have any more time over the holidays to work on it, I am hoping to get a very rough version out that can do a few things and find somewhere to host/demo it so that others can see what I have done.

My next post will hopefully demo what I have so far and outline the first Flash game I am going to try to tackle.

Wish me luck!