Some Notes on Proper Playtesting

By The Bard
created 2/08/04 updated 11/29/04

I received a complaint about one of my latest reviews stating that it was “unfair” and accusing me of “trying things an average gamer would never do”. This designer obviously didn’t understand two very important facts:
First: If hundreds or even thousands of people are going to play his scenario, a lot of them will actually try things he never imagined the ‘average gamer’ would do.
And second: Testing is supposed to be unfair – stretching a system to its limits and using every mean trick imaginable while trying to break it.

In real life this kind of testing is sometimes absolutely vital. The ‘Titanic’ for example really was unsinkable – provided her hull didn’t suffer more than ‘average’ damage. AoM scenario designers can of course be a little more relaxed about this matter. I personally don’t think that a reviewer should go to great lengths trying to ‘break’ a scenario. But I feel a lot of designers should invest a little more effort in this area.
During my review work I often get the impression that most people don’t properly test their scenarios before they submit them. So I thought it might be a good idea to come up with a couple of notes here that may help designers in getting along with this task.
Keep in mind that I can offer no Eternal Truths, just some points that may make life a little easier for designers and the people who want to play their scenarios. After all, part of being an engineer is to know when you can bend or even break a rule – and to know when you had better not.

Rule of Thumb

First, a rule of thumb: In a professional software project at least 30% of the total effort goes into testing. (That’s about the same amount as goes into the implementation, i.e. the programming.) For custom-made AoM scenarios this number may look a little high, but the testing effort for any but the simplest scenarios should be closer to this order of magnitude than to the “well, let’s hope it works” range of 0-5%.

Simplifying Testing

Testing doesn’t just start after everything else is finished; it should go on in parallel with the actual programming. There are a number of things to keep in mind during implementation which can simplify testing later on.

1. Test every single trigger as soon as you have implemented it

Actually, this is a bit of an exaggeration. In most cases it’s not even possible to test a single trigger all by itself, so perhaps I should rather talk about something like ‘functional blocks’ here. By this I mean a group of triggers which work together to create a certain effect. They may be loosely connected (like triggers for making a village look alive) or directly dependent on each other (like a day/night cycle). Once you have implemented such a functional block you should test it before you continue with something else. This is usually the best approach and can save you a lot of time in the long run, especially when you’ve got a complex scenario. It’s not only because testing a clearly defined group of triggers is much easier than testing all of them at the same time. If you first implement everything and then start testing you will have a hard time tracking down any bugs – they could be hiding almost anywhere. You will end up with dozens or even hundreds of triggers, most of them probably interfering with each other and all of them possibly buggy. However, if you have tested all those functional blocks before, then you will at least know that every single piece of your machinery works correctly, which will make it much easier to find the problems.
Oh, one more point: Don’t skip any of these tests because “these triggers are so easy they will work anyway”. Usually this approach at some point leads to “I don’t understand this, it should have worked…” Test a couple of them together to save time and effort, but make sure you test them all!

2. Name your triggers and group them

I always wonder how some designers manage to work on scenarios where all triggers have names like ‘Trigger_326’. The editor offers the possibility to name the triggers and to group them, so why not make use of these features? It’s easy to forget, say, during the implementation of ‘Trigger_482’, which is supposed to grant God Powers to a player, that there is already ‘Trigger_276’ which does the same thing, just with some other God Powers. If those two triggers interfere with each other the results can be quite confusing and the designer will have a hard time finding out why ‘Trigger_482’ doesn’t work properly. On the other hand, if there is a group named ‘Startup’ which contains a trigger named ‘Grant_GP_Player_3’ there won’t be any such problems because it’s immediately obvious that the trigger already exists.
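If you keep a list of your trigger names somewhere outside the editor, a few lines of script can point out the ones that still need a proper name. This is only a sketch: the group names, the trigger list and the assumption that unnamed triggers all look like ‘Trigger_’ followed by a number are mine, and nothing here reads an actual scenario file.

```python
import re

# Assumed pattern for names nobody bothered to change, e.g. 'Trigger_326'.
DEFAULT_NAME = re.compile(r"^Trigger_\d+$")

def unnamed_triggers(groups):
    """Return (group, name) pairs for triggers that still carry a default-style name."""
    return [
        (group, name)
        for group, names in groups.items()
        for name in names
        if DEFAULT_NAME.match(name)
    ]

# Hypothetical trigger list, typed in by hand for illustration.
scenario = {
    "Startup": ["Grant_GP_Player_3", "Trigger_482"],
    "Village": ["Villager_Greets_Ajax", "Trigger_276"],
}

for group, name in unnamed_triggers(scenario):
    print(f"{group}: {name} still needs a descriptive name")
```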

3. Implement your triggers in a way that limits the possibilities for errors

Most triggers can be implemented in different ways. Designers should usually try and look for the easiest solution. Use the ‘KISS’ approach – keep it simple, stupid! No use in doing things with three different triggers when a single one would do. Take the above example: Granting all God Powers at the beginning of the game certainly has its advantages. But on the other hand, how do you make sure the God Power is still available when the ‘Invoke God Power’ trigger is fired? In a complex scenario there may be dozens of different reasons why the available God Powers of an AI opponent could change during the game. And if they do, how will you find out about it? So the least error-prone way is usually to use a single trigger which grants and immediately invokes the God Power when it’s needed.
Keep in mind though that a lower number of triggers doesn’t automatically result in a simpler solution. Often it’s easier to pack an operation into a number of triggers which activate each other in sequence instead of piling everything into a single complex (and therefore error-prone) trigger. Go for the simple way, it’s almost always the better one.

4. Implement your triggers in a way that simplifies testing

Testing takes time and effort, which is probably why lots of people try to avoid it as much as possible. But saving on testing rarely pays off – the Ariane 5 rocket for example exploded because someone had decided it was not worth the effort to re-test a piece of code they had already used in the Ariane 4… What you can do, however, is simplify the matter by implementing your triggers with testing in mind. For example, say you design an RPG which contains a dozen different villagers who react in some way when Ajax walks up to them. If you implement this with ‘distance to unit’ conditions you need to move Ajax all across the map to every single villager to test if all the triggers fire. However, if you use ‘units in area’ conditions you can just put down a dozen Ajaxes on the map and test all triggers in a fraction of this time. So if two or more ways of implementing something seem to be equivalent, go for the solution that simplifies testing.

Proper Testing

Once the implementation is finished and every single trigger has been tested it’s time to get into proper playtesting. Keep in mind that playtesting is not about simply playing a scenario and hoping to come across some bugs. In testing it’s mandatory to know what you want to test, and how you want to test it. It’s very hard to reach a goal you don’t know about, so there are a number of points to remember here.

1. Don’t just test the triggers, test the map as well

I’ve just been talking about triggers up to now. But triggers make up only one part of the scenario; the map design is just as important. I’m not referring to eye candy here but to the ‘functional’ aspect of the map. Most maps have to provide some kind of guidance to the player, like keeping him out of certain areas and so on. I’ve seen lots of scenarios where units could take shortcuts right through seemingly impassable terrain like walls or forests because the designer hadn’t bothered to check his ‘fences’ for holes. Every element on the map which has some sort of a function (like a forest to keep the player out of a valley, or a ford to allow an AI opponent to cross a river) has to be tested as well.
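The idea behind checking a ‘fence’ for holes can be shown with a toy model. The sketch below treats the map as a small grid of tiles and asks whether a forbidden spot can be reached from the starting position. It is only an illustration of the principle: the grid, the symbols and the four-direction movement are my assumptions, the game’s real pathfinding works differently, and the actual test still has to be done by walking units around in the scenario.

```python
from collections import deque

def reachable(grid, start):
    """Return all (row, col) tiles reachable from start.

    '#' marks an impassable tile; movement is up, down, left and right only.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def find(grid, symbol):
    """Return the (row, col) position of the first tile showing the given symbol."""
    for r, row in enumerate(grid):
        if symbol in row:
            return (r, row.index(symbol))
    raise ValueError(f"{symbol!r} not on the map")

# 'S' = player start, 'V' = the valley the player must not reach.
SEALED_MAP = ["S.#..",
              "..#.V",
              "..#.."]
LEAKY_MAP = ["S.#..",
             "....V",   # one missing tree in the forest
             "..#.."]

for name, grid in (("sealed", SEALED_MAP), ("leaky", LEAKY_MAP)):
    hole = find(grid, "V") in reachable(grid, find(grid, "S"))
    print(name, "map:", "fence has a hole" if hole else "fence holds")
```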

2. Add debug information to your triggers

Sometimes it’s difficult to decide on the cause of a specific problem. In this case it can help to tweak the triggers a bit to add some debug information. Adding a ‘send spoofed chat’ effect to a trigger which doesn’t work is an easy way to check whether the condition was met at all. ‘Quest var echo’ is another neat effect which allows you to keep track of events that cannot be observed directly. A looping trigger echoing all affected quest vars can really lift the fog around complicated trigger operations. Sometimes it may even be a good idea to specifically add some variables for that very reason.
Just don’t forget to remove all those changes once you’re done, otherwise the results can be quite embarrassing. Programmers have actually been sacked for leaving pop-ups with comments like “only an idiot would try something like this” in their code…

3. Don’t just test from the player’s perspective, use the others’ as well

One easy way to test certain triggers is to playtest from another player’s perspective. A lot of people seem to ignore this possibility completely. I remember a case where some designers asked for help because they couldn’t get an AI’s God Power invoked. If they had just started the scenario from the AI’s point of view it would have been immediately obvious that the God Power had not been granted at all. This is a simple example, but the same technique works just as well with more complicated triggers. Get them to fire, sit back, and watch if everything works out the way it was planned. It may actually be a good idea to add a special player which is given omniscience and is used only for debugging. From this perspective you can see everything that happens on the map and find out for example how different AI players react to one another.

4. Test for negatives, not just for positives

Testing that a trigger works when it’s supposed to is usually quite easy. The difficult part is often to make sure it doesn’t fire when it’s not meant to. It’s a logical impossibility to actually prove that a trigger doesn’t fire when it’s not supposed to, but it’s possible to get a pretty good idea about that – provided it’s tested properly. Take the RPG example mentioned above. Every villager reacts to Ajax if the hero comes close, that’s the simple case. But is the trigger supposed to fire if Ajax is accompanied by Jason? Or only after he has performed some heroic deed? It’s not enough to check that the villager praises the hero for killing a Cyclops after the fight. You have to check that he doesn’t if the battle didn’t take place, or if it did and Ajax ran from the beast.
For every trigger you want to test, ask yourself three questions: What is supposed to happen? (The villager praises Ajax for killing the Cyclops.) What is allowed to happen? (The villager praises Ajax even if Jason helped fight the Cyclops.) And of course: What must not happen? (The villager praises Ajax who actually ran from the Cyclops.)
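Written down as a checklist, the three questions might look like the sketch below. The function is a stand-in for the villager’s trigger condition and is purely hypothetical; in the game you would run through each case by hand. The point is only that the ‘must not happen’ cases sit in the list right next to the positive ones.

```python
def villager_praises(cyclops_killed, jason_helped, ajax_fled):
    """Hypothetical trigger condition: praise Ajax only if the Cyclops died
    and Ajax stayed to fight. Whether Jason helped makes no difference."""
    return cyclops_killed and not ajax_fled

# (question, cyclops_killed, jason_helped, ajax_fled, expected result)
CASES = [
    ("supposed to happen: Ajax kills the Cyclops", True, False, False, True),
    ("allowed to happen: Jason helped in the fight", True, True, False, True),
    ("must not happen: the battle never took place", False, False, False, False),
    ("must not happen: Ajax ran from the Cyclops", True, True, True, False),
]

def failed_cases(condition):
    """Return the descriptions of all cases where the condition gives the wrong result."""
    return [
        description
        for description, killed, helped, fled, expected in CASES
        if condition(killed, helped, fled) != expected
    ]

print(failed_cases(villager_praises) or "all cases behave as intended")
```

A condition that only looks at whether the Cyclops is dead would pass the first two cases and still fail the last one, which is exactly the kind of bug that positive testing alone never finds.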
This can lead to a combinatorial explosion which makes it impossible to test it completely. Think of all the different possibilities for Ajax and/or Jason to visit all twelve villagers in sequence – there are just too many to test them all. It’s one of the cases where common sense comes into play – no matter how thoroughly you want to test a scenario, you have to draw a line somewhere. Test the cases that make sense, like the ones mentioned above. Don’t test what happens if Ajax steps up to the villager while Jason is running circles round different trees at the other end of the map.
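To get a feeling for the size of the problem, here is a back-of-the-envelope count. The model is my own simplification: every villager is visited exactly once, in any order, and each visit is made by Ajax alone, Jason alone, or both together.

```python
import math

VILLAGERS = 12
PARTIES = ("Ajax", "Jason", "Ajax and Jason")

# Number of different orders in which twelve villagers can be visited (12!).
visit_orders = math.factorial(VILLAGERS)

# For each of the twelve visits, one of three parties shows up.
party_choices = len(PARTIES) ** VILLAGERS

total_cases = visit_orders * party_choices

print(f"visit orders:  {visit_orders:,}")   # 479,001,600
print(f"party choices: {party_choices:,}")  # 531,441
print(f"combined:      {total_cases:,}")
```

Even the visit orders alone run to several hundred million, so picking the cases that make sense is the only practical option.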

5. Try to think along new lines

This is probably the hardest part in testing, and the reason why in professional software development the engineers who actually test something are often highly specialised people who sometimes don’t even know the programmers. This is because programmers tend to think along their own lines. They know what they wrote the code for so they will always focus on this functionality. The testers on the other hand didn’t go through the whole design process and tend to look at things in a much broader way. They’ll try out things the programmers had dismissed (and probably forgotten about) months ago, and they usually get results this way.
The problem in AoM scenario design is exactly the same. The designer has his idea about how the scenario is supposed to evolve and he knows how he would play it. This can make him ignore the fact that the player doesn’t have the same background and might try completely different approaches.
There’s actually a nice example in Ensemble’s official ‘Titans’ campaign. Look at the second scenario, ‘Atlantis Reborn’. The storyline works fine if the player attacks the Greek Town Center and ignores the harbour. However, if he decides to attack the harbour first then the defeated Greek troops in the end will leave the island on ships that have been sunk long before, from a harbour that has been destroyed and then miraculously rebuilt just for the cinematics. Seems as if no one at ES ever tried to attack from the sea… (Note however that they made sure the cinematics worked properly even in this case!)
Of course it’s simply not possible to prepare for everything a player might try. (Unless the scenario is built in a way which doesn’t leave him a choice, which is usually not much fun to play.) But there are a few points to keep in mind when playtesting.
First, the objectives. They give the player a rough idea what to do, but there’s no guarantee that he will actually follow these orders. I’ve seen one scenario where the overall goal was something like “destroy the enemy stronghold”, and the first objective came down to “prepare to defend the ford”. Keeping in mind that a good offence is often the best defence I took my army, crossed the ford, attacked the enemy base and won the game in about two minutes!
Objectives are simply not ‘hard’ enough to keep the player on the intended storyline. Other means are necessary to make sure he won’t stray too far, such as, in this simple case, a strong army on the other side of the river. God Powers are another way to completely kill a scenario if the player uses them in a way not foreseen by the designer. There was one scenario where the goal was to get the hero to a heavily guarded Hades gate. All the player actually had to do was advance in age, invoke Ceasefire and then merrily walk by the armies supposed to defend the gate! Underworld Passage or Vortex are other God Powers which can be quite tricky to handle for a designer because they allow the player to laugh at any obstacles on the map.
Certain units can cause similar problems. Flyers are an obvious example because they can easily get into places the player was never meant to reach. Strong myth units, especially instant killers like the Medusa, can ruin the balance of a scenario if the player fields them in masses because there is almost nothing that can stand up to a whole group of them.
When testing a scenario it’s important to consider the whole range of options the player has available. The fact that the designer always goes for Hera in the fourth age doesn’t keep some other player from choosing Hephaestus!
A player’s creativity can cause many problems for a designer, and so can ‘stupidity’. I’m not talking about someone who complains that a scenario is “impossible with those few units” because he doesn’t understand that he has to use his villagers to build a Town Center and raise his own army. People like that are beyond help (at least until they have learned to play the game). I’m referring to the ‘stupid’ player who just doesn’t understand which of the three settlements is supposed to be the Lost City of Kreton – something the designer has known for weeks but forgot to clearly state in the objectives. Faced with a situation like this a player will usually either quit the game altogether or try just about anything that comes to his mind – even things the designer has long since dismissed as “too stupid for anyone to try”.
The point is that a designer, when testing his own scenario, should try, to the extent possible, to pretend that he doesn’t know anything about it. He should try to come up with different approaches to the situation on the screen and find out what happens. The easiest – and certainly the best – way to achieve this is of course to have someone else do it. I would advise everyone to try and get help for the playtesting, preferably from someone who has not seen the scenario before. Just make sure you listen to those people! You should never say “you’ve got that wrong” to a tester, because you cannot say that to a player afterwards either. Usually it’s an indication that there is something wrong or at least unclear in the scenario, and that’s your problem, not his. Murphy’s Law for AoM designers: If the player can do something, some player eventually will. It’s the designer’s job to have the scenario prepared for that.

6. Take your time

I know it’s hard: The implementation is finished, you want to upload the scenario and forget about that boring testing. Still, I recommend patience – and stamina. A scenario should not be regarded as ‘finished’ until the testing is completed. Use your common sense – if you think you’ve done everything you could, upload it. But don’t upload it just because you are too lazy to test it any more and want to see what people make of it. Usually they won’t like it too much if they stumble from one flaw to the next and get the feeling that you are using them as cheap beta testers.

7. Use the final version for the final tests

One last point: It can be tempting to tweak something at the very last moment, after the testing is finished. Don’t! The version you upload should be the version you tested, not one to which you added a little extra something afterwards. If more designers followed this rule then there wouldn’t be so many scenarios around that can only be played after a detour to the editor to set the players correctly.

Testing is most certainly not what people want to spend their time on when designing a scenario. Painting maps and creating ingenious triggers is much more fun. But keep in mind that the real fun (at least for everyone else) lies in playing the scenario, which is only possible if it has been properly tested before.