Cogs Turn, No Matter How Rusty

"Its bindings are not forged of steel, iron or even rope - but rather its own dramatically lengthy tongue, pulsating with strength as the creature attempts to suffocate itself."

The Games Foxes Play

(complete source code | view all previous posts)

Hello. How long has it been? 3 weeks? I've been busy with the student's plight that comes with the end of a semester, but still snuck in some progress once in a while. The features showcased today are therefore the product of stray and scattered hours.

It is time to dial down the dreams and get working on real work that produces real results. I showed my genetic algorithm experiment last time I posted here, but I also did another experiment with peer-to-peer multiplayer with rollback in Rust right afterwards.

Useful for my game? Very likely not. Interesting? Absolutely, I am glad that I understand better how networking technology in multiplayer games works now.

But the time for messing around has come to an end.

Behold.

I have redone my entire basic game logic and UI in Rust and the Bevy engine (it used to be in pure JavaScript with the PIXI.js library). I'm a big fan of the minimap and the sidebars!

Pros:

  • The performance has received unfathomable improvements.
  • I no longer have to bang my head against the brick wall that is scaling the screen to every possible resolution.
  • The Entity Component System structure is amazing, I never want to do an object-oriented game ever again.
  • I just write things and they work. I still have not used the debugger once.

Cons:

  • UI support is quite sad. Don't tell anyone, but the "UI" in my game is actually just more game entities, except they follow the player around as it moves.
  • I have a feeling implementing mouse support will be a terrible experience.

As for the game's design - I have once again drawn out my butcher's knife and have been cutting out the chaff. I've been so obsessed with being creative - to the point where I forgot to be reasonable.

As an example, even the controls had been turned into a game mechanic. You literally started with a spell which was composed of the editable blocks "When pressing 'W', on the tile north of the caster, move towards the targeted tile, a turn passes'". Let's list the problems with that:

  • Removing the "a turn passes" component causes the player to freeze time forever, which is unfathomably OP
  • Removing any of the other components basically induces a softlock
  • Rebinding keys, a fundamental part of game design, was tied with a game mechanic (can't just do it in a quick menu, you need to use controls to rebind the controls)
  • Quality-of-life like "click on a tile to move towards it" like in most modern roguelikes was simply impossible

I am dialing it down, keeping the spirit of "build your own abilities" but in such a way that can coexist with the accessibility standards of modern gaming. I have a rather tight concept - a spin on the oldest ideas I ever had for this game, which I'll talk about when it is actually getting implemented.

The Tango Problem

The Tango Problem

Personal experiments with genetic algorithms and neuroevolution, in Rust, with the Bevy engine.

A number of "Psychics" are placed inside a 45x45 play area, and submitted to one of the following two challenges:

  1. Get as close as possible to a "Beacon of Light" randomly placed in the space before the 100 turn limit is elapsed. Walls are disabled in this experiment.
  2. Paint as many walls in pink as possible within the space before the 100 turn limit is elapsed. Walls are enabled in this experiment.

The Psychics do not "know" these objectives. All they are given is a series of inputs representing their senses, such as:

  • The angle between themselves and the Beacon of Light

  • The North-West-East-South direction in which the Beacon of Light is located

  • Which adjacent tiles are solid

  • Which tiles in range 2 of themselves are solid

  • Which adjacent tiles are painted walls

  • Which adjacent tiles are unpainted walls

A neural network receives these inputs as a vector of floating-point numbers (for example, "the Beacon is North of you" could look like [1.0, 0.0, 0.0, 0.0], "the Beacon is East of you" would be [0.0, 0.0, 1.0, 0.0]).

Then, snazzy things happen inside one or more hidden layer(s), and a vector of floating-point numbers comes out (such as [0.32, 0.55, 0.38, 0.85, 0.11]). These correspond to actions, such as [Move North, Move West, Move East, Move South, Paint Nearby Walls]. The highest number is the chosen action.

After each 100-turn sequence, each Psychic has their fitness calculated - how well they performed. Initially, this was only based on the objective, such as "the number of tiles separating a Psychic from the Beacon" or "the number of tiles painted." The best Psychics are selected through "roulette pick" (everyone has a chance to be chosen, that chance increases with the fitness score), some mutations are applied to the neural network's weights, and a new generation is born, which will hopefully fare better than the last.

But, this proved insufficient. I first observed the "Tango Problem" in the Beacon experiment - large numbers of Psychics would simply move down until they were on the same Y coordinate as the Beacon, then simply move up and down over and over.

Why was this happening? After some struggling, I figured it out.

A lucky Psychic would occasionally spawn on the same X level as the Beacon. Then, they "figured out" that they could just move down until the Beacon was reached, and start moving up and down over and over again. This was gloriously rewarded by the system - after all, they perfectly completed the task of reaching the Beacon and staying on it!

Other Psychics would start copying this behaviour - except these are not actually on the same X level of the Beacon, and as such would fail the test, forming a line passing through the Beacon across the entire level. Naturally, there would always be at least one lucky Psychic succeeding through this strategy and making this behaviour repeat itself.

My answer was to add, in the fitness function, a HUGE reward for Psychics who would dare to use more than just 2 different actions. This immediately solved the issue, as the "bob up and down and hope to be lucky" strategy could no longer result in the maximum possible fitness a Psychic could reach. Here is the result!

The painting experiment followed. It, too, experienced struggled related to the Tango Problem (which, I believe, is known as "getting stuck in a local maximum" in the literature): some Psychics would occasionally be very lucky and spawn next to many walls, which means they can just stay there, paint their surroundings, never move and still score 3-7 points without effort. Psychics that actually see an unpainted wall, move towards it and paint it are not rewarded as well.

Once again, I added extra objectives:

  • Do not stay in the same place for too long. (this resulted in Psychics bobbing up and down just like in the original Tango Problem and not doing anything else)
  • Receive a penalty for re-painting already painted walls (this resulted in Psychics being terrified of ever painting anything)
  • Receive a bonus for moving far away from the spawn location (this result in Psychics all rushing the bottom of the screen and painting it, leaving the rest of the level unpainted)

Significant tweaking may eventually produce even more performant results, but for now, a balance of all these objectives does succeed in achieving a "trickle down" strategy where all Psychics move towards the bottom of the screen, painting walls as they fall down. (Creatures being in walls at the bottom is purely a visual glitch - I am not quite sure how to handle getting just the right amount of Bevy entities in each simulation without causing major performance issues).


Overall, I am happy with how much I have learned from this mini-project. Studying genetic algorithms initially made it seem like they could handle anything with brute force and sufficient time, but I see now that the objectives, "sense" inputs and possible action outputs change everything. It is very easy for these algorithms to find something judged to be "good enough" and stick to it without much of an attempt to improve or innovate. I am aware a good algorithm should balance between exploitation and exploration, but the amount of tweaking required to reach this fine balance is truly a work of absolute precision and patience.

Rust is an interesting piece of technology. Its "fearless concurrency" was very useful in making this program able to BOTH display the results of a completed simulation and train more simulations in the background at the same time. As someone coming from dynamic, untyped languages like JavaScript and Python, it is truly astounding how many errors are prevented just through rust-analyzer's watchful eye. I am however disappointed by the constant long recompilation times at each added library, the rather poor step-by-step debugging support, and how rust-analyzer loves turning my CPU into a localized micro-sun.