Collision Detection Performance (Volume 2)

Yikes, I’m getting backlogged on the stuff I want to write about!  <Insert generic complaint about how work is eating up most of my time here>

Anyway, this will probably be way shorter than it deserves, but my memory on the subject is about 3 months old.  Basically, this will be more or less a short chronicle of the dumb story of the mesh vs. mesh tests.

Appearances Can Be And Are Frequently Deceiving

When I started work on the collision detection observation, there was one surprising fact: the mesh vs. mesh code (used to determine whether the player was intersecting enemy ships) was working at full speed!  I didn’t seem to have to do any optimization at all on it to get it working.

I did, however, opt to go ahead and change the functions to not be recursive (as the current implementation was, of course, recursing into both meshes’ sphere trees).  When I finished that work, suddenly, the routine was much, much slower.  Was, in this case, the recursion overhead better than what it took to handle the double tree recursion in a non-recursive way?

The Answer May Surprise You

While I was working out the kinks in the non-recursive version, I had added some visual feedback to display unambiguously when the meshes were colliding.  While experimenting, I reverted back to the old (recursive) method, and found out something surprising: the collision results weren’t accurate!  It was reporting collision in cases where the player was inside of the object’s space, but nowhere near any edges!

Turns out, there was an erroneous fall-through case, and it was falling through to a “true” result, causing the entire collision test to short-circuit with an invalid collision result.  Sure the original routine was fast, but it didn’t work at ALL.  Now that I’d fixed that and had an accurate version, it was super, super slow.  Now it became an imperative to optimize.

Double Dribble

Through the course of optimizing the routine, I tried all sorts of crazy stuff.  I eliminated the recursion, I tried AABB vs. Rotated AABB tests, I tried Sphere vs. AABB tests, etc.  There was nothing that I could do.  As I was stepping through the code, however, I started noticing something odd: there were duplicate triangles all over the place.  I would see the same triangle tens of times in a given collision test.

As it turns out, the way I was dividing up the mesh in the tree creation routine was boneheaded.  What I would do was as follows:

  1. If the current set of triangles has less than a specified number of entries, stop splitting the mesh, we’ve reached a leaf node.
  2. Figure out the bounding box of the current set of triangles
  3. Find the axis on which it was best to split the set of triangles (where best was determined as the most balanced split, i.e. where the difference between triangle counts on the left and right of the split was lowest)
  4. For each triangle
    1. If the triangle has a vertex to the left of the split, add it to the left side.
    2. If the triangle has a vertex to the right of the split, add it to the right side.
  5. Once the mesh is divided into the two sets, recurse into each set to split it further as necessary

Now, a cursory analysis of this algorithm should tell you that step 4 is horribly broken.  If a triangle straddles the split, it would get added into both sides, thus expanding the bounding boxes of each sub element to encompass all triangles that straddle the split, causing there to be tons of duplicates of the same triangles throughout the code.

The correct way to do it would be:

  1. If the current set of triangles has less than a specified number of entries, stop splitting the mesh, we’ve reached a leaf node.
  2. Figure out the bounding box of the current set of triangles
  3. Find the axis on which it was best to split the set of triangles (where best was determined as the most balanced split, i.e. where the difference between triangle counts on the left and right of the split was lowest)
  4. For each triangle
    1. Calculate the centroid of the triangle (i.e. average the 3 vertices).
    2. If the centroid is less than the split, add the triangle to the left, OTHERWISE add it to the right.
  5. Once the mesh is divided into the two sets, recurse into each set to split it further as necessary

There, now a triangle only gets added to the side that it’s MOSTLY on, and there are no duplicates.  This change greatly boosted the speed of the collision, however there were cases where it would still cause a framerate dip.  Curses and drat!

All the King’s Horses…

Try as I might, I could not get the mesh vs. mesh intersection test to perform well.  So what happened when you get into a situation like that?  You do what every game developer does: you cheat!

What existed at the time was a wicked fast sphere vs. mesh collision test.  So rather than treat the player ship as a mesh, I opted to treat it as a representative collection of spheres (There are 29 spheres total, and it’s a pretty good approximation of the player ship – in fact, it’s probably better than necessary).

Suddenly, player vs. enemy collision performance was great!  Since this was the only place that I had needed a mesh vs. mesh, I could get rid of that slow beast.

~fin~

All of the collision code is complete, now, so next (give or take a post or two) I’ll talk about something with a bit more of a visual element – Bolts of lightning!

Collision Detection Performance (Volume 1)

I have been hard at work on my game (in my ridiculously limited spare time) for the last month and a half. One major hurdle that I’ve had to overcome was collision detection code. Specifically, my collision detection performed great on my PC, but when running it on the Xbox 360, everything would slow to a crawl (in certain situations).

The types of collision detection I have to deal with are varied, due to the weird way that I handle certain classes of obstacle (like walls):

  • Player bullets vs. Enemy – Player bullets are, for simplicity, treated as spheres, so sphere/mesh testing works here.
  • Enemy bullets vs. Player – Same as above.
  • Player vs. Wall – Because the game’s playing field is 2D, the walls in-game are treated as 2D polygons, so it boils down to a 2D mesh vs. polygon test.
  • Player vs. Enemy – Mesh vs. Mesh here
  • Beam vs. Enemy – The player has a bendy beam weapon.  I divide the curved beam up into line segments, and do ray/mesh tests.

The worst performance offender was, surprisingly, the sphere vs. mesh test, which will be the subject of this article.  Before optimizing, when I’d shoot a ton of bullets in a certain set of circumstances, the framerate would drop well into the single digits, because the bullet vs. mesh (sphere vs. mesh) collision couldn’t keep up.  Here are the things that I changed to get this test working much, much faster.

When using Value Types, Consider Passing As References

One thing that was slowing my code down was all of the value type copying that my code was doing.  Take the following function:

public static bool SphereVsSphere(Vector3 centerA, float radiusA, Vector3 centerB, float radiusB)
{
  float dist = radiusA+radiusB;

  Vector3 diff = centerB-centerA;
  return diff.LengthSquared() < dist*dist;
}

Simple, yes?  This function, however, falls prey to reference type copying.  You see, “centerA” and “centerB” are both passed in by value, which means that a copy of the data is made.  It’s not an issue when done infrequently, but with the number of SphereVsSphere calls that were happening during a given frame, the copies really started to add up.

There’s also a hidden set of copies:  the line “Vector3 diff = centerB-centerA” also contains a copy, as it passes centerB and centerA into the Vector3 subtraction operator overload, and they get passed in by value.  Also, a new Vector3 gets created inside of the operator then returned, which, I believe, also copies the data into diff.

To eliminate these issues, you should pass all of your non-basic value types (that is, anything that’s not an int, bool, float, anything like that) by reference instead of by value.  This eliminates all of the excess copies.  It does come at a price, though: in my opinion, it does make the code considerably uglier.

Here’s the updated routine:

public static bool SphereVsSphere(ref Vector3 centerA, float radiusA, ref Vector3 centerB, float radiusB)
{
  Vector3 diff;
  Vector3.Subtract(ref centerA, ref centerB, out diff);
  float dist = radiusA+radiusB;

  return diff.LengthSquared() < dist*dist;
}

Instead of having a nice-looking overloaded subtraction, now there’s a call to Vector3.Subtract.  While it’s not so bad in the case of a simple subtraction, when you have a more complicated equation, they pile up pretty quickly.  However, given the speed boost just making this change can give you, it’s totally worth it.

Use Hierarchical Collision Detection (But Use a Good Bounding Volume)

Heirarchical collision detection is a good thing.

For those of you that DON’T know, basically instead of testing your collider against every triangle in a mesh, you have a tree in which each node has a bounding volume, and the leaves contain the actual triangles.  The idea is that, by doing a much simpler collider vs. bounding volume test, you can elminiate large amoungs of triangles before you ever have to test them.

In this case, I was using a sphere tree, where each node in the tree has a bounding sphere, and the leaves of the tree contain actual mesh triangles.  I used spheres instead of AABBs (Axis-aligned bounding boxes) because transforming AABBs is expensive (and they become Oriented bounding boxes after the transform).  Transforming a sphere is easy, however.  None of my object transforms have scale data, so it’s a simple matter of transforming the sphere’s center.

However, the use of bounding spheres has a dark side.  Unless all of your heirarchy levels is roughly sphere-shaped, a sphere is a terribly inefficient bounding volume.  They’re usually much larger than the geometry that they contain, so there are more recursions into lower levels of the tree (think of there as being more dead space around the geometry when using spheres than bounding boxes).

By also adding bounding boxes to the data, I could use them where I’m not having to transform them.  For instance, because this is sphere vs. mesh, and the entire mesh is rigid, I can take the mesh’s world 4×4 matrix, and transform the sphere by the INVERSE of it.  This way, the sphere is in model space, and I can use the bounding volumes without having to do any transformations at lower levels.

But now I needed a sphere vs. AABB test.  However, I didn’t much care if it was exact or not, so instead I used a simple test where I expand the box by the radius of the sphere, then test whether the sphere is inside of the box or not.  Near the corners (surely this is where the term “corner case” comes from), this can give false positives, but it will never say the sphere DOESN’T intersect the box when it should say it does.  This is an acceptable trade-off.

public static bool SphereVsAABBApproximate(ref Vector3 sphereCenter, float sphereRadius, ref Vector3 boxCenter, ref Vector3 boxExtent)
{
  Vector3 relativeSphereCenter;
  Vector3.Subtract(ref sphereCenter, ref boxCenter, out relativeSphereCenter); // Get the sphere center relative to the box's center.
  Vector3Helper.Abs(ref relativeSphereCenter); // Per-component absolute value.

  return (relativeSphereCenter.X <= boxExtent.X+sphereRadius && relativeSphereCenter.Y <= boxExtent.Y+sphereRadius && relativeSphereCenter.Z <= boxExtent.Z+sphereRadius);
}

Simple, but effective.  Converting from using a sphere bounding volume to AABBs cut down the number of recursions (and triangle comparisons) being done dramatically, since the AABBs are a much tighter fit to the geometry.

Recursion Is Weird

One suggestion I got, also, was to eliminate recursion.  The heirarchical nature of the algorithm meant that my test was recursive.  Here was the test as originally written:

public static bool SphereVsAABBTree(ref Vector3 sphereCenter, float sphereRadius, CollisionTreeNode node)
{
  if (!SphereVsAABBApproximate(ref sphereCenter, sphereRadius, ref node.BoxCenter, ref node.BoxExtent))
    return false; // No collision with this node, return false

  if (node.Left != null)
  {
    // This means that there are Left and Right children (either they're both null, or both set).
    if(SphereVsAABBTree(ref sphereCenter, sphereRadius, node.Left))
      return true;
    return SphereVsAABBTree(ref sphereCenter, sphereRadius, node.Right); // A node with child nodes can't have triangles, so just return this result.
  }

  // This node has triangles, so test against them.  If any of them intersects the sphere, return success.
  for(int i = 0; i < node.Indices.Length; i+=3)
  {
    if(SphereVsTriangle(ref sphereCenter, sphereRadius, ref node.Vertices[node.Indices[i+0]], ref node.Vertices[node.Indices[i+1]], ref node.Vertices[node.Indices[i+2]]))
      return true;
  }
  return false;
}

As you can see, it recurses into child nodes until it either gets a false test out of both of them, or it reaches triangles.  But how do you eliminate recursion in a tree such as this?  More specifically, how do you do it while using a constant (non-node-count dependent) amount of memory?

The trick is as follows:  Assuming your nodes contain Parent pointers in addition to Left and Right pointers (where the Parent of the trunk of the tree is null), you can do it with no issue.  You track the node that you’re currently visiting (“cur”), and the node that you previously visited (“prev”, initialized to null).  When you reach a node, test as follows:

  • if you came to it via its parent (that is, prev == cur.Parent), you’ve never visited it before.
    • At this point, you should do your collision tests.  It’s a newly-visited node.
    • prev = cur
    • cur = cur.Left — This is basically “recursing” into the Left node.
  • If you arrived from its Left child, you visited it before and recursed into its left side
    • Since this node has already been visited, do not do the collision tests.  They’ve already been shown to be successful.
    • prev = cur
    • cur = cur.Right  — Since we recursed into Left last time we were here, recurse into Right this time.
  • If you arrived via its Right child, you’ve visited both of its children, so you are done with this node.
    • Again, this node has already been visited, so do not run the collision tests.
    • prev = cur
    • cur = cur.Parent — We’re done with this node, so go back to its parent.
  • When doing the collision tests, if a Sphere vs. AABB bounding volume test ever fails, we don’t have to “recurse” so go back to its parent.
    • prev = cur
    • cur = cur.Parent
  • Finally, if you do a Sphere vs. Triangle collision test and it succeeds, we can immediately return, as we have a guaranteed collision, and no more triangles or nodes need to be tested.

Doing all of this makes the routine bigger, but no recursion is necessary, so there’s no additional stack space generated per node visited (and no function call overhead, either).  The finished code is as follows (note that I also added a quick sphere vs. sphere test right at the outset, because it’s a very quick early out if the sphere is nowhere near the mesh):

public static bool SphereVsAABBTree(ref Vector3 sphereCenter, float sphereRadius, CollisionTreeNode node)
{
  CollisionTreeNode prev = null, cur = node;

  // At the top level, just do a sphere/sphere test for a super-quick out.
  if (!SphereVsSphere(ref sphereCenter, sphereRadius, ref node.Sphere.Center, node.Sphere.Radius))
    return false;

  while(cur != null)
  {
    if(prev == cur.Parent) // Only do the tests if we JUST got here.
    {
      if (!SphereVsAABBApproximate(ref sphereCenter, sphereRadius, ref cur.BoxCenter, ref cur.BoxExtent))
      {
        // No intersection?  Go ahead and just back out of this node now.
        prev = cur;
        cur = cur.Parent;
        continue; // By continuing, we bypass the rest of this code and re-visit the parent immediately.
      }

      for(int i = 0; i < cur.Indices.Length; i+=3)
      {
        if(SphereVsTriangle(ref sphereCenter, sphereRadius, ref cur.Vertices[cur.Indices[i+0]], ref cur.Vertices[cur.Indices[i+1]], ref cur.Vertices[cur.Indices[i+2]]))
          return true;
      }
    }

    // "Recurse"
    if (cur.Left != null)
    {
      if (prev == cur.Parent) // If this is the first visit to the node, recurse left.
      {
        prev = cur;
        cur = cur.Left;
        continue;
      }
      if (prev == cur.Left) // If this is the second visit, recurse right.
      {
        prev = cur;
        cur = cur.Right;
        continue;
      }
    }

    // If there are no child nodes or prev == cur.Right, return to the parent.
    prev = cur;
    cur = cur.Parent;
  }
  return false;
}

Mission Complete

After making that set of changes, the sphere vs. mesh tests no longer bog down on the Xbox, even in a degenerate case such as when there are tens or hundreds of bullets well inside of the mesh’s area.

Getting the sphere vs. mesh test working was a great accomplishment, but as much as I thought it was already working well, it turns out that mesh vs. mesh testing was a much bigger problem.  However, that’s another story for another day.