GDC: Creating believable crowds in Assassin’s Creed
Wednesday, February 27th, 2008
Having just recently finished Assassin’s Creed, I was quite looking forward to the talk on its crowd AI. The talk was mostly given by Sylvain Bernard, the Animation Director for NPC & Cinematics. James Therien, the Technical Lead for crowd gameplay, chimed in with technical details here and there, but it was mostly a high-level talk. That was good: they talked about their goals and design for the crowd, with a little of how they accomplished it.
They started out talking about measuring their expectations. Sylvain joined the company 3 1/2 years ago. At the time, they were looking at GTA since they were doing an open world game. But their focus would be much more on the crowds. The new consoles weren’t out yet so they didn’t know what they were capable of. They recalled watching the trailer for Dead Rising and using that as a benchmark for how many characters the Xbox 360 could render that the player could interact with in some way. They also got reference from movies about how people interact with crowds. Some key ideas they took were that in a dense crowd people form two aisles of traffic, how the crowd reacts to threats, and the ways the crowd can be an obstacle for the player (they showed some footage from Indiana Jones, yeah!) and can make the player trip and fall. They really wanted to incorporate the crowd into gameplay.
High level goals
Rich NPCs: There is no actual notion of a crowd in Assassin’s Creed. Every NPC is an individual. The crowd is emergent from the behaviors of the individuals.
Realistic Art Direction: The art direction was going for a realistic style, which meant they needed to give the player character realistic capabilities. Compared to the Prince of Persia, where the player could run 10 meters along a wall, the assassin can only run a short distance along a wall. They do cheat where needed, however, to make gameplay more fun. The player can survive a 20 meter fall, for instance.
Animation Style: Tried to make it more realistic. They showed the Prince and the Assassin’s wall climb animations side by side and there was much more movement in the assassin’s animation. Ultimately, they felt their animation style was “stylized realistic.”
Shared Skeleton: All NPCs share the same skeleton. Some characters are just scaled down or scaled up. It was a bit of a challenge to fit all the meshes onto that same skeleton, but ultimately it was worth it because any character could play any animation.
They then went on to talk about some of the specifics of bone count and their animation system. One point I found really interesting was that they said they started with an animation system that was more realistic but the controls weren’t responsive. The player would let go of the stick and his character would take a few steps to stop, sometimes running off a roof. NPCs couldn’t stop on a point. They ultimately went back to a more traditional system to allow for more precise control. This was really interesting to me because we run into the same question, trying to balance really smooth locomotion animations with really precise player control.
Unlike in Uncharted: Drake’s Fortune, NPCs and the Assassin share the same animation system. In fact, you could control every character as the player. They would just switch to using Assassin animations if you tried to do something that character didn’t have the capability of doing.
There are two layers to the AI. The first is the behavior layer, and this is shared with the assassin. These are fine-grained environment interactions. The other layer is for the decisions. There is behavior environment data that acts as guidance for runtime interpretation. These guidelines are used only in specific cases, like when the assassin is climbing. NPCs following him in a chase behavior use the exact same code as the player and do the same environment detection tests to achieve their goals.
The AI decision environment data consists of a navmesh. On top of that they generate a waypoint network for A* path planning. On top of that they have metalinks. The waypoint links are simple connectivity info, while the metalinks encode complex connectivity such as jumps, ladders and beams. They use steering to move the NPCs and communicate with the decision layer when an NPC is in trouble (something is blocking it; should it stop and wait, or what?).
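To make the waypoint/metalink idea a bit more concrete, here is a rough sketch of A* over a graph where every link carries a kind. This is only my guess at the shape of it: the waypoint names, the `KIND_COST` multipliers and the `allowed_kinds` filter are all made up for illustration and were not shown in the talk.

```python
import heapq
import math

# Hypothetical waypoint network: name -> (x, y) position.
WAYPOINTS = {
    "street_a": (0.0, 0.0),
    "street_b": (10.0, 0.0),
    "street_c": (20.0, 0.0),
    "ladder_base": (10.0, 5.0),
    "roof": (10.0, 6.0),
}

# Links are (from, to, kind). "walk" stands in for a plain waypoint link;
# any other kind stands in for a metalink (jump, ladder, beam).
LINKS = [
    ("street_a", "street_b", "walk"),
    ("street_b", "street_c", "walk"),
    ("street_b", "ladder_base", "walk"),
    ("ladder_base", "roof", "ladder"),
]

# Assumed cost multipliers (all >= 1 so straight-line distance stays admissible).
KIND_COST = {"walk": 1.0, "jump": 2.0, "beam": 2.5, "ladder": 3.0}


def distance(a, b):
    (ax, ay), (bx, by) = WAYPOINTS[a], WAYPOINTS[b]
    return math.hypot(bx - ax, by - ay)


def build_adjacency(allowed_kinds):
    """Keep only the links this character is able to traverse."""
    adjacency = {name: [] for name in WAYPOINTS}
    for a, b, kind in LINKS:
        if kind in allowed_kinds:
            cost = distance(a, b) * KIND_COST[kind]
            adjacency[a].append((b, cost))
            adjacency[b].append((a, cost))
    return adjacency


def find_path(start, goal, allowed_kinds):
    """A* search; returns a list of waypoint names, or None if unreachable."""
    adjacency = build_adjacency(allowed_kinds)
    open_heap = [(distance(start, goal), 0.0, start)]
    best_cost = {start: 0.0}
    came_from = {}
    while open_heap:
        _, cost_so_far, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        if cost_so_far > best_cost[node]:
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, step_cost in adjacency[node]:
            new_cost = cost_so_far + step_cost
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                came_from[neighbor] = node
                estimate = new_cost + distance(neighbor, goal)
                heapq.heappush(open_heap, (estimate, new_cost, neighbor))
    return None
```

With this toy data, a guard allowed to use `{"walk", "ladder"}` can plan a route up to the roof, while a character restricted to `{"walk"}` gets `None` back and would have to give up or pick another goal.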
Since the game is about crowds, they want the designers to be able to specify in general where they want the crowds to go. Level design placed road segments down in the level. Then the NPCs only had to make decisions about which branch to take as they wandered. This made wandering very cheap. They also used this system when NPCs decided to flee from the player - it was just a different decision on which branch to take.
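A minimal sketch of why that is cheap: if the designers hand you a road graph, the only runtime decision is which branch to take at a junction, and fleeing is the same function with a different scoring rule. The junction names, the `choose_branch` function and the "flee means pick the branch farthest from the threat" rule are my assumptions, not anything they showed.

```python
import math
import random

# Hypothetical designer-placed road network.
JUNCTIONS = {"A": (0, 0), "B": (10, 0), "C": (10, 10), "D": (20, 0)}
ROADS = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}


def choose_branch(current, previous, mode, threat_pos=None, rng=random):
    """Pick the next junction for an NPC standing at `current`.

    mode "wander": random branch, avoiding an immediate U-turn when possible.
    mode "flee":   the branch whose far end is farthest from `threat_pos`.
    """
    # Avoid turning straight back unless this is a dead end.
    options = [j for j in ROADS[current] if j != previous] or ROADS[current]
    if mode == "flee" and threat_pos is not None:
        return max(options, key=lambda j: math.dist(JUNCTIONS[j], threat_pos))
    return rng.choice(options)
```

There is no path search here at all, just one list lookup and one choice per junction, which is presumably what makes it affordable for a whole street full of walkers.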
They specifically called out their lack of a technology like Euphoria (something I know a bit about given the project I’m working on), meaning they had to make a ton of different animations. They only wanted to activate rag doll at the last possible moment because they wanted the limbs to have weight to them. So they then showed an example of a character running into a wall. Many many times. All different heights of walls, running into it forwards, backwards, all every which way. That’s how you get to 15,000 animations, they said. 15,000! That’s a lot, wow.
Next they discussed how they spawned in the crowds. They wanted the city to feel like it was full of people but they had, I think, 120 NPCs at a time. They first tried just spawning them in a radius around the player, but they ended up wasting spawns in places you couldn’t see. They tried changing the radius of the circle, but if they made it too big they just ended up with a sparse smattering of NPCs in any one place. They called their solution “the blob.” They’d pick the closest triangle to the player on a 2D mesh and then do a flood fill outwards to all the connected places the player could go. They’d spawn NPCs on random triangles within that blob, out of sight. They didn’t try to bias the blob in any one direction because the player is so mobile that they couldn’t guess where he was going. They just made sure all the connected streets were full of NPCs.
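Here is roughly how I picture the blob, as a breadth-first flood fill over triangle adjacency. Everything concrete in it is an assumption on my part: the function names, the `max_triangles` cap (they didn’t say how the fill was bounded) and the idea of passing in a precomputed set of visible triangles.

```python
from collections import deque
import math
import random


def closest_triangle(centroids, player_pos):
    """centroids: triangle id -> (x, y). Returns the id nearest the player."""
    return min(centroids, key=lambda t: math.dist(centroids[t], player_pos))


def build_blob(adjacency, start_triangle, max_triangles):
    """Flood fill outwards from the start triangle through connected triangles.

    adjacency: triangle id -> list of neighboring triangle ids.
    The cap on blob size is an assumed stand-in for whatever limit they used.
    """
    blob = [start_triangle]
    seen = {start_triangle}
    queue = deque([start_triangle])
    while queue and len(blob) < max_triangles:
        triangle = queue.popleft()
        for neighbor in adjacency[triangle]:
            if neighbor not in seen:
                seen.add(neighbor)
                blob.append(neighbor)
                queue.append(neighbor)
                if len(blob) >= max_triangles:
                    break
    return blob


def pick_spawn_triangles(blob, visible, count, rng=random):
    """Choose random blob triangles the player cannot currently see."""
    hidden = [t for t in blob if t not in visible]
    if not hidden:
        return []
    return [rng.choice(hidden) for _ in range(count)]
```

The nice property compared to a plain radius is that disconnected areas (the other side of a wall, a courtyard with no entrance) never get filled, so no spawns are spent there.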
They had a cool system for creating variety on the NPCs. They could vary the head structure, facial texture, skin color, body texture, gender, height, accessories, AI category, reactions, voice and more on spawn. Beyond just looking different, the crowd was composed of people with different duties. First, there was the base walking crowd, which was made up of bench sitters, monks, beggars, military patrols, trouble makers, etc. Then they had people with specific duties, of which they only ended up with the kiosk workers and the orators. These were to give structure to the flow of the crowd. They had wanted more, like people sweeping or drinking from fountains but ultimately these were all really just aesthetic and didn’t make it in.
At one point they had a big system to do simulation and have people change tasks, but they realized they didn’t need it. The lifespan of an NPC is very short; you don’t care if someone just walks off and despawns.
The reaction system was how the crowd would respond to the player. They talked about how important it was for design to communicate what they wanted to the programmers. They made some animated videos (what we call pre-visualizations) of what they wanted. They showed us a few, and they conveyed the basics of how the crowd should move.
They showed some in-game footage of a player hanging on to a wall. NPCs that wander close enough and are facing the player will stop and ask what he’s doing. If more than one stops next to each other they’ll start a fake conversation. Reactions are composed of an individual reaction, the sound that actually comes out of that NPC, and over 100 possible body gestures.
Reactions are much more than a visual system. They have “reaction packs” designed by the level designer so they can have special responses to specific events in the level. Alert was a reaction used to draw guards into a fight. Guard awareness levels were done by switching out the current reaction pack based on when a guard saw a body.
How do you get lots of NPCs at 30 frames per second? They didn’t want to have a level of detail system on the decision or behavior layers because they wanted reactions to be consistent. They do get a little bit simpler as the NPCs get farther from the player though - they don’t do much when they are out of sight of the player. But they did a lot of LOD on animation. The bone counts drop quickly, there’s no more IK, simpler look at, simpler procedural rigs. They focused on making common NPCs as cheap as possible - cheap pathfinding, environment tests, steering, etc. They also made use of multi-threading to take advantage of the X360 and PS3 platforms.
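As a sketch of what animation-only LOD might look like: pick a tier from distance and visibility, and let that tier decide bone count, IK and look-at complexity, while decisions and behaviors stay untouched. The tier distances and bone counts below are numbers I invented purely for illustration; they gave no figures I managed to write down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnimLod:
    max_distance: float  # tier applies up to this distance from the player
    bone_count: int      # bones actually animated at this tier
    use_ik: bool         # whether IK is still solved
    look_at: str         # "full", "head_only" or "none"


# Hypothetical tiers, nearest first. All values are illustrative only.
ANIM_LODS = [
    AnimLod(max_distance=10.0, bone_count=60, use_ik=True, look_at="full"),
    AnimLod(max_distance=30.0, bone_count=30, use_ik=False, look_at="head_only"),
    AnimLod(max_distance=float("inf"), bone_count=12, use_ik=False, look_at="none"),
]


def select_anim_lod(distance_to_player, on_screen):
    """Return the animation tier for one NPC.

    Only animation cost changes here; the decision and behavior layers are
    deliberately not part of this selection.
    """
    if not on_screen:
        return ANIM_LODS[-1]  # off-screen NPCs get the cheapest tier
    for lod in ANIM_LODS:
        if distance_to_player <= lod.max_distance:
            return lod
    return ANIM_LODS[-1]
```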
Ultimately, they felt they met their goals of creating a believable crowd. They had a quality-focused team with good support from management. They felt they fell somewhat short in incorporating the crowd into gameplay. Blocking the player and reacting to the player was there, but the player couldn’t really use the crowd in interesting ways.
Their team size was around 150-180 people: 1/3 programmers, 1/3 animators, and 1/3 artists. Well, I think that’s what he said, but then where are the designers? I must have taken that down wrong.