I maintain NodeCore, a very complex game spanning over 20k lines of code. For some time (apparently years), a serious performance issue had been slowly building up on my NodeCore server, causing bad lag spikes, and my players and I were having no luck figuring out the cause. Minetest's built-in profiler was no help, and I had tapped out its extremely limited abilities a long time ago.
I highly recommend all modders who care about quality use this tool and test performance of all their packages. It is a little complex to run but very much worth it.
"Sometimes a lot of C code execution is reported in suspicious places. I think this represents time taken by the Minetest engine, but I don't know why the profiler thinks the code is executed inside a Lua function." -- (from the documentation)
I noticed this, and it was probably the biggest obstacle to interpreting my results. I used a simple perl script/regex to pre-filter the raw profiler data, and produced a "filtered" and a "raw" version of each flame graph.
The "filtered" one helped point towards places where I was doing things in lua that were too expensive. It was best for identifying problems that I could directly and immedately fix.
The "raw" version seemed to indicate the overall performance impact of major components, e.g. globalsteps vs ABMs vs node timers. Seeing a lot of poorly-affiliated C code execution time in any branch of the graph suggested that I may be making API calls or other decisions that were impacting the amount of time the engine was spending on things. Things like adding a neighbor check to an ABM can make it roughly 7x as expensive on the C++ side, so things like this need to be weighed carefully, and the "raw" graph helped me get an idea of how much attention practices like these warranted.
Doesn't this only apply to LuaJIT though?
Technically, yes, results you get profiling the JIT are not necessarily comparable to PUC, and the two may have different performance characteristics under certain kinds of workloads. However, almost all of the problems I discovered were algorithmic problems, or uses of engine APIs that were incorrect (e.g. making unnecessary calls to the engine that I could avoid), so I expect the fixes to improve performance significantly on PUC-only platforms as well.
Even if you are mainly targeting PUC (as IKEA is/was), it still makes sense to get a JIT build of MT and run this mod against it to find non-runtime-specific performance issues.
After running the JIT profiler on my production server, I noticed server freezes or slowdowns, even after stopping the profiler, that would eventually cause my server watchdog to trip and restart the server.
Even given that, it's still worth having the mod available, in case I again need to debug performance issues that are hard to reproduce locally. Yes, I know the generic protests against "debugging in production," but I already have a test environment, and sometimes you just can't reproduce a bug locally; sometimes the most efficient way to get forensics on a problem is to capture it in production. Even if the server needs to restart, I still get valuable forensic data to track down issues that affect everyone, not just my server.
This basically just means that I won't be sampling randomly on my server, but only when there's an actual suspected problem.
After being pointed to the JIT Profiler, and figuring out how to use it and interpret the results (read the description carefully), it helped solve the performance problem very quickly (a raycast happening too early in a sequence of checks), and also identified other performance issues that I hadn't even detected yet (uncached privilege checks, dynamic lighting checks, and more).