multipart-message nevyn bengtsson's blog

featured articles 🦄, about, archive, tags

Mutable Adventure, Pedro and Erlang Text Processing

As some of you might know, I’m currently writing a game called Mutable Adventure, a 2D sidescrolling platformer MMOG with the editability of Second Life.

The thing about Mutable that interests me the most, however, is the networking. I’m using a library/protocol called Pedro, by Peter Robinson, which I found when Keith Clark presented it at work. Before I go on, I need to explain why this protocol is so goddamned cool.

Pedro and software agents

It’s based on Prolog, more specifically, his own implementation of QuProlog. It’s sort of a blackboard system, with a central messaging server that everyone connects to. It’s string-based, and what you send over the network are prolog fact-style messages, called a notification, for example:


  playerWasHit(152, 12388)

This is then broadcasted to everyone to subscribe to this message. A subscription might look like this:


  playerWasHit(PlayerID, ObjectID), PlayerID = 152

Notice that the expression contains Prolog variables, and a guard expression. This means that several clients may subscribe to the same notification, but with a guard expression that says they only want player info about their own player.

Each subscription is accompanied by a number, so that you on the client side can redirect the incoming notification to the right place. In Python I’ve implemented this so that individual objects may subscribe, and that individual objects may receive notifications. For example, say a ball is spawned in the game world. At the instance it’s instansiated, it may subscribe to information pertaining to this specific instance, by giving the subscription a guard expression with its own ID number. Boom, automatic network message propagation within your game client or server. As an example, in the client, the Tilemap class is appended with a category/mixin with the following code (notice the third argument, the guard):


  gClient.net.subscribe(self, "tilemap(TilemapName, NameOfTilesetTilemapUses, ScrollX, ScrollY, AutoScrollX, AutoScrollY)", 
                              "TilemapName = \"%s\""%self.name)
  gClient.net.subscribe(self, "tileInMap(TilemapName, TileX, TileY, TileIndexInRoomTileset)",
                              "TilemapName = \"%s\""%self.name)
  gClient.net.subscribe(self, "tilesInMap(TilemapName, FromIndex, ToIndex, TileIndexArray)",
                              "TilemapName = \"%s\""%self.name)

(Voxar has significantly improved the syntax in the Pedro python wrapper since then)

Suddenly, your objects aren’t mere objects; because they can communicate on their own with the outside world, and respond on their own, they’re now software agents. Also, because the messaging medium is now separate from the server instance, your game server may now be freely split as you see fit into several processes, both for load balancing and for division of labor into discrete entities, with higher cohesion and lower coupling.

The downside

Peter Robinson’s implementation of this concept isn’t perfect, however. First off, it’s written in plain c, and it depends on glib2. GObjects. I can’t stand GObjects. I can’t stand manual memory management. And it’s 5000 lines of code for a relatively simple concept.

Secondly, you just traded yourself simplicity for the price of a single bottleneck. A buggy, unstable, memory leaking single bottleneck with no means of load balancing or distribution. Written in C with a single maintainer, based on glib2 which is pretty hard to get running under Windows, if you would want to.

(I attended a 24 hour game development challenge a while back, where I wrote a networked 3D racer. Because I couldn’t get pedro running under Windows, I had to run it on my own server at home, while the judges tested the game from the other side of the country. >1sec lag and no lag compensation or similar whatsoever in the code = I certainly didn’t win that competition :( )

Erlang, The Savior

After reading Joe Armstrong’s book on Erlang, I was itching to write something. I tried writing a Twitter clone, thinking that Erlang’s highly distributed nature would come to great use there, but every line took a minute to write (because I’m so new to Erlang), and Twitter just felt too big and too difficult to write as a first project, so that got abandoned.

So, the next thing I’m trying is to write a Pedro clone in Erlang. It’s a good match, since the hardest part of of Pedro is the pattern matching, and Erlang got that as a part of being a functional programming language. I’ve set out to get it to work in 100 lines or less. So far I’m up to 60 lines, and I think I have a chance of making it.

In Erlang, processes are cheap and abundant. You spawn a lot of processes and then do all communication through erlang messaging. A common pattern is the middle man pattern. While cleaning the dishes I remembered this pattern, from its usage in an IRC client built in the book, and tried to apply it to Erdro (Erlang Pedro :P) conceptually in my head (Sorry, the following is a bit hazy as 1) I haven’t implemented it yet 2) it contains a lot of erlang terms). One problem with Erdro is that in its current implementation, even if you have a supervisor process watching the server and respawning if it does, restoring all the stack state, the process would be dead and the sockets meaningless, and all clients would have been lost and restring state would be pointless. However, if each client is abstracted away with a middle man process (meaning all socket communication goes to and from a separate process, and communication with the server is handled through erlang messages), you could store all the state including process pointers to these middle men in a mnesia database or similar, and if the server dies, respawn the server and its state and everything you’d have lost was a single message (the one that made the server crash); all socket connections would be alive and all clients just continue communicating.

Put the middle men on a cluster of machines away from the messaging server, each with its own IP and bandwidth, and you’ve distributed your network load. The machine containing the messaging server could self-immolate, still only a single message would be lost (given that there is another computer that could act as messaging server).

Process load balancing of the server, however, is left as an exercise for the reader.

The nitty gritty details of text processing in Erlang

There’s a slight mismatch between Erlang and Prolog, given that they’re completely different languages (one is functional, the other is logical). In the context of Pedro, however, I could only think of a single difference that mattered, and had to be changed.

In Prolog, the term “myFunctor(atom, anotherAtom)” defines a fact. In Erlang, it’s a function call. So, for this to work, I’d have to convert that term to something equivalent that could be used in Erlang pattern matching: a tuple, like so, “{myFunctor, atom, anotherAtom}”. Since Pedro doesn’t have tuples, the transformation is reversible and fully equivalent (I hope! It’s not like I’ve tested my code yet…). How would you do this? The first thing on my mind, being a ruby coder, was regular expressions. My first realization was the erlang’s regexp support is really, really, really horrible. Not only is it slow, its syntax support is so basic you might as well do without. My second realization was that regular expressions were a really bad match for the task at hand anyway. So my first real piece of Erlang code was a text search-and-replace implementation with pattern matching specific to my task:


functor_to_struct(PrologString) ->
  [First|Rest] = PrologString,
  fts("", [First], Rest, false, 0).

fts(BeforeFunctor, Functor, [], _, _) ->
    BeforeFunctor++Functor;
fts(BeforeFunctor, Functor, Rest, InStringEscapeMode, Prev) ->
  [Next|Rest2] = Rest,
  case Next of
    34 when (Prev == $\\) and InStringEscapeMode -> % \"
      fts(BeforeFunctor++[34], "", Rest2, true, Next);
    34 -> % "
        fts(BeforeFunctor++[34], "", Rest2, not InStringEscapeMode, Next);
    Char when InStringEscapeMode ->
      fts(BeforeFunctor++[Char], "", Rest2, true, Next);

    $( ->
      fts(BeforeFunctor++"{++Functor++", " , "", Rest2, false, Next);
    $) ->
      fts(BeforeFunctor++Functor++"}", "", Rest2, false, Next);
    Char when ((Char >= $a) and (Char == $A) and (Char == $0) and (Char =
      fts(BeforeFunctor, Functor++[Char], Rest2, false, Next);
    Char ->
      fts(BeforeFunctor++Functor++[Char], "", Rest2, false, Next)
  end.

Half-way through the code, it felt so horrible, I felt like I was writing the worst code in the universe. Now that it’s done, though, I find it pretty nice. It’s relatively short for what it accomplishes (replaces function calls with tuples, being careful not to interpret parens inside a quoted string) imo. However, after a good night’s sleep, I realized that this was a really stupid way of doing it. Why? Because Erlang has a code parser in its standard library.


  {ok, Scanned, _} = erl_scan:string("foo(bar)."),
  {ok, Parsed} = erl_parse:parse_exprs(Scanned)

yields [{call,1,{atom,1,foo},[{atom,1,bar}]}], a perfectly fine nested Erlang data structure that can be traversed and parsed. So that’s what I did!


  transform_calls_to_tuples(ParseTree) ->
    tctt(ParseTree, []).

  tctt([], Collected) ->
    lists:reverse(Collected);
  tctt([Token|Rest], Collected) when element(1, Token) == call ->
    {call, 1, FunctorNameAtom, ArgumentList} = Token,
    tctt(Rest, [{tuple, 1, [FunctorNameAtom | tctt(ArgumentList, [])]} | Collected]);
  tctt([Token|Rest], Collected) ->
    tctt(Rest, [Token|Collected]).

… which yields the same result, but in a parse tree (which you can turn into a string again with erl_pp, the pretty printer).

That’s it! I’ll get back to you when Erdro is done.

Tagged faves, gamedev, coding