New Garbage-Collection Method Beneficial To Minecraft?

jakj · May 12, 2012

Introduced in Java 7u4:

http://docs.oracle.com/javase/7/docs/technotes/guides/vm/G1.html

Specifically:

Recommended Use Cases for G1

The first focus of G1 is to provide a solution for users running applications that require large heaps with limited GC latency. This means heap sizes of around 6GB or larger, and stable and predictable pause time below 0.5 seconds.

Applications running today with either the CMS or the ParallelOld garbage collector would benefit switching to G1 if the application has one or more of the following traits.

More than 50% of the Java heap is occupied with live data.

The rate of object allocation rate or promotion varies significantly.

Undesired long garbage collection or compaction pauses (longer than 0.5 to 1 second)

I have a feeling that, despite their 6GiB threshold, this will still be useful for those people who get lag spikes when allocating larger amounts (like 2-4 GiB) to Minecraft. Might be good to try out.

The_DarthMoogle · May 12, 2012

Ooh! Gimme!

Are there really any other programs that have all three of the traits? Even one of them?

jakj · May 12, 2012

Ooh! Gimme!

Are there really any other programs that have all three of the traits? Even one of them?

Well, yes, the new garbage collector is designed for high-utilization server software, if you read the rest of that page. It just so happens that Minecraft is written so wretchedly that it behaves in some ways like high-utilization server software, even in client usage.

Rabbit · May 16, 2012

Well, yes, the new garbage collector is designed for high-utilization server software, if you read the rest of that page. It just so happens that Minecraft is written so wretchedly that it behaves in some ways like high-utilization server software, even in client usage.

Edit: This somehow ended up in the wrong thread with the wrong post by jakj quoted, but I'll leave it as it somehow seems fitting.

What gets me is that there isn't even a legitimate reason not to make your program multi-threaded anymore. OpenMP is now heavily supported by Intel, Microsoft, the GNU Foundation and from what I understand, an implementation is being worked on for LLVM/Clang. It's so easy to write cross-platform multi-threaded code with it that it borders on ridiculous. It's also tasked based, and while people have their preferences for multi-threading style, it's arguable that task-based multi-threading is a superior choice for writing multi-threaded code that automatically makes use of more processors as they're available.

OpenMP isn't even the "slow hog" that it's made out to be by people more comfortable with pthreads or Win32 Threads. The fact of the matter is, threading is very hard to get right when you're manipulating threads at the system level. Libraries like OpenMP (which is actually a collection of preprocessor macros) and Boost can't optimize for every possible situation, but more often than not, they produce much faster multithreaded code than something that was multithreaded by hand. Again, some people are geniuses at this stuff, but multithreading is hard, and even game developers, who tend to be top of their game (no pun intended) can write some downright awfully performing code. A quick look at any number of high-budget games built on-top of in-house engines will testify to that (STALKER, etc.).

Even the age old problem of making code multi-threading safe is getting easier with task-based and other new paradigms of multi-threading. For a game like Minecraft, you could easily write multi-threading safe code that builds a single chunk. Easiest (if not the best performing) way of doing this would be to load all the data needed from disk at once. Then tell the game to use every available thread to take a piece of data corresponding to a chunk and actually generate the chunk, then wait for all other threads on the task to finish. Once all the threads finish, they remerge and you can do as you see fit with the data, as you're now back to a single-thread.

This really ended up going off on a tangent, but suffice to say, there's no reason not to multi-thread anymore. At this point it's either reliance on legacy code that's not multi-threading safe or laziness.

jakj · May 16, 2012

You are laboring under a misapprehension, that Notch's decision to use limited multithreading was because he thought it would be too difficult, or too inefficient, or not beneficial enough, or any of the reasons you just mentioned.

The fact of the matter is, he's just fucking awful. He even admitted it himself, in an oblique way, in his blog post about how he failed to get a programming job because he was so visibly self-taught and unlikely to work well in groups. Just look at his code, and see all the places where he does crazy-ass things for no good reason.

I mean, just look at this: The following is the main part of the algorithm that handles the decay of leaf blocks based on the absence of log blocks.

    int l = world.getBlockMetadata(i, j, k);

    if ((l &amp; 8) != 0 &amp;&amp; (l &amp; 4) == 0)

    {

      byte byte0 = 4;

      int i1 = byte0 + 1;

      byte byte1 = 32;

      int j1 = byte1 * byte1;

      int k1 = byte1 / 2;

      if (adjacentTreeBlocks == null)

      {

        adjacentTreeBlocks = new int[byte1 * byte1 * byte1];

      }

      if (world.checkChunksExist(i - i1, j - i1, k - i1, i + i1, j + i1, k + i1))

      {

        for (int l1 = -byte0; l1 &lt;= byte0; l1++)

        {

          for (int k2 = -byte0; k2 &lt;= byte0; k2++)

          {

            for (int i3 = -byte0; i3 &lt;= byte0; i3++)

            {

              int k3 = world.getBlockId(i + l1, j + k2, k + i3);

              if (k3 == Block.wood.blockID)

              {

                adjacentTreeBlocks[(l1 + k1) * j1 + (k2 + k1) * byte1 + (i3 + k1)] = 0;

                continue;

              }

              if (k3 == Block.leaves.blockID)

              {

                adjacentTreeBlocks[(l1 + k1) * j1 + (k2 + k1) * byte1 + (i3 + k1)] = -2;

              }

              else

              {

                adjacentTreeBlocks[(l1 + k1) * j1 + (k2 + k1) * byte1 + (i3 + k1)] = -1;

              }

            }

          }

        }



        for (int i2 = 1; i2 &lt;= 4; i2++)

        {

          for (int l2 = -byte0; l2 &lt;= byte0; l2++)

          {

            for (int j3 = -byte0; j3 &lt;= byte0; j3++)

            {

              for (int l3 = -byte0; l3 &lt;= byte0; l3++)

              {

                if (adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1) * byte1 + (l3 + k1)] != i2 - 1)

                {

                  continue;

                }

                if (adjacentTreeBlocks[((l2 + k1) - 1) * j1 + (j3 + k1) * byte1 + (l3 + k1)] == -2)

                {

                  adjacentTreeBlocks[((l2 + k1) - 1) * j1 + (j3 + k1) * byte1 + (l3 + k1)] = i2;

                }

                if (adjacentTreeBlocks[(l2 + k1 + 1) * j1 + (j3 + k1) * byte1 + (l3 + k1)] == -2)

                {

                  adjacentTreeBlocks[(l2 + k1 + 1) * j1 + (j3 + k1) * byte1 + (l3 + k1)] = i2;

                }

                if (adjacentTreeBlocks[(l2 + k1) * j1 + ((j3 + k1) - 1) * byte1 + (l3 + k1)] == -2)

                {

                  adjacentTreeBlocks[(l2 + k1) * j1 + ((j3 + k1) - 1) * byte1 + (l3 + k1)] = i2;

                }

                if (adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1 + 1) * byte1 + (l3 + k1)] == -2)

                {

                  adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1 + 1) * byte1 + (l3 + k1)] = i2;

                }

                if (adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1) * byte1 + ((l3 + k1) - 1)] == -2)

                {

                  adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1) * byte1 + ((l3 + k1) - 1)] = i2;

                }

                if (adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1) * byte1 + (l3 + k1 + 1)] == -2)

                {

                  adjacentTreeBlocks[(l2 + k1) * j1 + (j3 + k1) * byte1 + (l3 + k1 + 1)] = i2;

                }

              }

            }

          }

        }

      }

      int j2 = adjacentTreeBlocks[k1 * j1 + k1 * byte1 + k1];

      if (j2 &gt;= 0)

      {

        world.setBlockMetadata(i, j, k, l &amp; -9);

      }

      else

      {

        removeLeaves(world, i, j, k);

      }

    }

That's right: All of that, to determine whether log blocks are present, and to apply a scaling decay rate based on the density of leaves remaining.

Do you really want to see what a greatly-multithreaded application would have been like, made by somebody who writes that?

NightKev · May 17, 2012

I think I recall he had to scrap an older leaf decay algorithm (sometime around the passage between "alpha" and "beta") because it was just grinding even the best of computers down to single-digit FPS.

jakj · May 17, 2012

The best leaf-decay algorithm would use refcounting, but that's way beyond Notch's skill level, so the next step down would be a very simple loop:

For a leaf block at i,j,k, iterate over all blocks in the area i-4,j-4,k-4 and i+4,j+4,k+4, adding up the total number of log blocks and leaf blocks. Any time a log block is found, abort the loop, mark the leaf block clean, and exit the loop. Now, choose a random number, and based on that random number and the number of leaf blocks remaining, decide whether or not to decay the leaf.

Boom. Done.

Sign In

New Garbage-Collection Method Beneficial To Minecraft?

Recommended Posts

jakj

The_DarthMoogle

jakj

Rabbit

jakj

NightKev

jakj

Join the conversation

Browse

Activity