MExp: File Structure

2024-05-17

I recently released MExp, a calculator remake of A Monster's Expedition by Draknek & Friends. This is the second of a series of articles on interesting technical aspects of the game.

The cover image of MExp. In three-level greyscale, the player stands on an island surrounded by a tree, a rock, and a raft.

In this article, I will discuss the save file design of MExp, and will show why carefully structuring data is integral to program performance.

MExp's Files

MExp has three main files.

  • MExp.8xk: The MExp application. Contains all the code and logic that drives MExp.
  • MEXPWORL.8xv: The MExp world file. Contains all of the "permanent" things in the world, like the terrain and where to place entities on islands.
  • MEXPSAVE.8xv: The MExp save file. Contains the current state of the world. This includes two things: 1. The current state of entities in the world, and 2. The undo file.

As the player plays MExp, the application will read the world file and the save file, writing the current state of the world in the save file, as shown in Figure 1.

A visual demonstration of MExp's file structure. The application file reads the world and save files, and it writes the save file.

Figure 1: MExp's file makeup.

Although the structure of each file is meaningful, let's just focus on one aspect of the files: How do we store entities in the save file?

An Entity in Memory

An entity represents a single object in the world. This includes the player, logs, trees, rafts, signs, and snowfolk1.

Just 8 bytes are needed to represent a single entity. Figure 2 shows this fact visually.

A picture showing the player, a tree, and a raft in MExp's world, along with their corresponding 8-byte memory representations.

Figure 2: The 8-byte entity structure.

Byte 1 is the entity's type, whether that is a log, a raft, or the player.

Bytes 2, 3, and 6 are the entity's X position, Y position, and height.

Bytes 4 and 5 contain the entity's ID. This tells us which island this entity spawned from, and the index of the entity on that island. This helps with island resetting, which searches for entities to return to an island. As a special case, the player will always have an ID of (0, 0).

Bytes 7 and 8 are extra bytes. How are they used in practice? A couple examples:

  • A signpost uses byte 7 to mark which string is written on it, indexing a list of strings in the world file.
  • A raft formed from two logs will use bytes 7 and 8 to store the entity ID of the other log in the raft. This is needed for island resetting; a reset may just need to grab one of the two raft logs.
    • In Figure 2, the pictured raft just spawned like that, and was not formed from two logs. Its extra bytes, therefore, are (0, 0).

Storing the Entities

Knowing what an entity looks like in memory, how should MExp store all of the world's entities in memory?

Let's try a simple approach: Place all of the entities in a single array in memory. Figure 3 shows this in practice.

A simple scene from MExp. There are two islands. The left island has the player and a single tree. The right island has a single tree and a single raft. The label "Memory:" is followed by cutouts of the four entities, left-to-right.

Figure 3: A simple entity array, where all of the world's entities are placed one after the other in memory, with no auxiliary information.

This approach is straightforward, easy to implement, and memory-efficient. Many common operations like moving an entity, adding an entity, and removing an entity are quick to execute, too!

But there are other operations that are quite slow in this approach. For example, say we want to draw the screen.

For each of the 8 * 12 = 96 tiles on the screen:

  1. We must iterate the whole list of entities, to find a list of entities on that tile.
  2. Then we sort those entities by height, bottom-to-top.
  3. Finally we draw each of the entities in the sorted order.

Step 1 is the main concern here. Having to iterate every entity on every single tile is an expensive operation! While the scene pictured in Figure 3 only has a few entities, some maps in MExp have over 50 entities, which is a lot to iterate so frequently for calculator hardware.

Optimization 1: Simple Entity Map

The first problem to address is having to search for entities on every single tile.

Let's add a new segment to the save file: the Entity Map. It tracks whether an entity exists at any given tile.

A simple scene in MExp, identical to the last one. A field of black zeroes overlays the scene, mostly confined to the upper-left corner. Red ones are placed above the four entities in the scene.

Figure 4: A simple entity map.

Figure 4 shows this idea in action. If a tile has an entity, the entity map contains a 1. If it doesn't, the entity map contains a 0.

This improves performance considerably. At any given time, there are generally fewer than 10 tiles on the screen containing entities, so MExp saves lots of time by not searching for entities on empty tiles.

So, why might this approach still be problematic? Of course, there's the overhead of maintaining the entity map. But more importantly, this code still doesn't scale well for larger maps.

Imagine we had two MExp maps. Map A has 10 entities, while Map B has 100 entities. No matter how many entities are on the screen, Map B will require ten times as much computation to determine which entities are on a single tile, having to iterate all 100 entities! Drawing Map B could be considerably slower than drawing Map A. I wanted to avoid this, if possible. MExp should be able to support small maps just as well as large maps2.

Optimization 2: Island-Entity Ownership

Even if a map has 100 entities, the screen will probably contain fewer than 10 at any given time. How can we narrow down our search for entities on a given tile?

In my implementation of MExp, I chose to address this with an "Island-Entity Ownership" system. If we suppose all entities on a tile are "owned" by a certain island3, then we should only need to search the smaller list of entities owned by that island.

This optimization has two main requirements:

  1. There must be an easy way to find which entities belong to an island.
  2. There must be an easy way to find which island owns entities at a given tile.

Part 1: Linked List Entities

The first step to letting islands own entities is being able to find which entities belong to a certain island. We can achieve this with a fairly simple modification to our existing entity array approach.

Think of each island as owning a list of entities. We need to find a way to create and navigate that list. First, add an additional 4 bytes to the end of every entity, which each entity can use to point to the previous and next entities in the island's list. Then, track the pointer to the first entity in each island's list. With that, we have a navigable list! Following through the chain of "next pointers", MExp can traverse each island's entity list.

The same MExp scene as before. The left island is highlighted red, and the right island is highlighted blue. In the memory representation, a red line is drawn to link all entities on the left island, and a blue line is drawn to link all the entities on the right island.

Figure 5: The entity array, with extra linked list functionality.

Figure 5 demonstrates how this looks in our simple two-island scene. Notably, the entities could be in any order, as long as the links still correctly traverse each island's entities.

Part 2: Entity Map Island Ownership

To make use of our new island list, we need to know which island's entities are at a given tile. We can do this by putting more information in the entity map.

In the entity map, a 0 still denotes that no entity is present on that tile. But instead of just using a 1 to show that an entity is present, we put the island number that owns entities at that tile! Now, when iterating tiles and drawing the screen, MExp can know which island to search for entities on a given tile. Figure 6 shows how this looks in practice.

The same MExp scene as before. A field of black zeroes overlays the scene, mostly confined to the upper-left corner. Red ones are placed above the entities on the left island, and blue twos are placed above the entities on the right island.

Figure 6: An entity map that also respects islands.

With island-entity ownership in place, drawing the screen (along with other operations, like entity movement, that require searching for all entities on a tile) is now efficient and scalable, no matter how large the MExp world file is.

Optimization 3: Entity Pointer Tracking

There are two more small issues to consider. Moving the player, and resetting islands, are not yet efficient.

While most operations in MExp involve finding the entities on a specific tile, these two operations require locating very specific entities, with particular entity IDs. Player movement requires finding the player, and island resetting requires finding several entities to bring back to an island.

Thus we add one final component to our save file: The Entity Pointers File. This file contains a mapping from every entity ID to its pointer, for easily locating an entity with a specific ID4.

As Figure 7 shows, we also have a special entry for the player's entity ID: (0, 0). The player has this special ID because they don't respawn on a specific island.

The same MExp scene as before. Below the memory representation, all existing entity IDs are listed out, with lines pointing from each ID to its location in memory.

Figure 7: The entity pointers file.

Of all the optimizations, this is probably the least important. And, importance is worth considering! Having to search for the player once on every frame is not a terribly costly operation. The MExp executable can only be about 16 kilobytes in size, and every bit of optimization code comes at the cost of other features.

Still, I implemented the Entity Pointers File in MExp. That way, every operation in the game is efficient, and map size will not affect the game's performance!

Efficiency in Practice

Or, at least that's the theory. All of the above optimizations should theoretically make MExp faster. But, do they? All of these speedups don't mean much unless they make a difference in practice.

I tested the effects of optimizations 1 and 2 on screen drawing. Take the following (rather complex) scene from area 3 of MExp:

A scene from the Flower Lake area of MExp. The player is climbing a rocky hill, with a few tall trees on ledges above and below. One of the large trees is knocked over, and split into two logs.

I timed how long it took to draw the screen 20 times with no player movement, and calculated the game's frames-per-second (FPS) rate from that:

Optimizations Time FPS
None 13.55s 1.476
Only 1 5.240s 3.817
Both 1 and 2 4.417s 4.528

These times suggest that the optimizations are, indeed, helping the performance5 of MExp! Optimization 1 has the greatest impact, since it cuts out entity searching on every empty tile. Optimization 2 has a smaller - but still noticeable - impact, simplifying entity searching on tiles that do have entities.

Conclusion

Now you know more about how the MExp save file is structured! Through judicious structuring of data, MExp can operate fairly quickly for its limited hardware.

While making games, it is important to carefully consider how the world-state is represented. Choosing more effective data structures can improve a game's performance significantly, allowing you to get the information you need in less time!

And, it is important to test, time, and profile those data structures in practice. The best optimizations actually save time!

  1. Tree stumps are not included in this! Stumps are a permanent aspect of the world encoded in the world file. Surprisingly, they're also purely cosmetic. A height increase of 1 and a stump are completely equivalent in physics code.

  2. For further reading, look up the term "asymptotic notation"! In my case, I wanted my worst-case space complexities to be at worst linear in the map area, number of islands, or total entities. And, I wanted my worst-case time complexities to have no terms relating to the map area, number of islands, or total number of entities. In short, I'm okay with larger maps taking up more space, but I want MExp to run just as quickly if possible.

  3. An important implementation question is: What happens when entities from different islands enter the same tile? In MExp, generally the entity entering a tile last will become owned by the island that already owned all the entities on that tile. That includes two logs combining into a raft! However, this rule has one exception: If an entity ever enters the landmass (reset-zone, generally) of an island, it becomes owned by that island.

  4. The Entity Pointers File has other good uses in practice. When an entity is gone from the world (can happen under specific island resetting circumstances) we set its pointer to 0. And if an entity is contained in a raft, its pointer is the negative pointer to the raft entity. So, these pointers encode not just how to find an entity, but also meaningful information about the state of the entity.

  5. Funnily enough, immediately after taking these times I found a better way of batching world-file data grabbing that bumped the FPS to 5.357. No amount of data structure optimization can make the hardware details obsolete!