Don't preoptimize
Hi
I just wanted to share the result of a quick test I ran, which suggests that you should not spend time trying to optimize code where you "think" you may get better performance. Basically, don't try to micro-optimize everything.
So the test I ran was to see how efficient the Manhattan distance approximation is. For those of you who are unfamiliar with it, it is an approximation of the distance (magnitude) of a vector. Since the square root is normally an expensive operation, we approximate the distance of a vector with components x and y as
distance = Abs(x) + Abs(y)
instead of
distance = Sqrt(x*x + y*y)
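For a concrete feel of how rough the approximation is, take the vector (3, 4) as an example: Sqrt(3*3 + 4*4) = 5, while Abs(3) + Abs(4) = 7.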
This gives you a proxy value for distance. It is not quite distance, but it is useful for comparisons. Now, how much performance can you gain from this versus using Unity's built-in magnitude in C#? Here's the test code:
Vector2[] vectors = new Vector2[8000000];
float x = 0;
float y = 0;

// Fill the array with 8 million arbitrary vectors
for (int i = 0; i < vectors.Length; i++)
{
    vectors[i] = new Vector2(x, y);
    x++;
    y = x * (y * 0.001f + 0.1f);
}

// TEST1: Unity's built-in magnitude (uses a square root)
float total = 0;
float t1 = Time.realtimeSinceStartup;
foreach (Vector2 vec in vectors)
{
    total += vec.magnitude;
}
float t2 = Time.realtimeSinceStartup;
Debug.Log("TEST1: " + (t2 - t1));

// TEST2: Manhattan distance approximation
total = 0;
float t3 = Time.realtimeSinceStartup;
foreach (Vector2 vec in vectors)
{
    total += Mathf.Abs(vec.x) + Mathf.Abs(vec.y);
}
float t4 = Time.realtimeSinceStartup;
Debug.Log("TEST2: " + (t4 - t3));
The results on my machine are:
TEST1: 0.218....
TEST2: 0.2519....
In other words, the performance is WORSE for a result that is less accurate. This just shows how well optimized certain functions in Unity are (and the same applies to other packages you may be using). I believe it could be because the custom Manhattan distance code runs "higher up" in the virtual machine, whereas Unity's function perhaps maps more directly onto the hardware.
On a side note (slightly unrelated), if you are trying to optimize distance comparisons, use the squared magnitude: just skip the square root on both sides of the comparison, and if one side is a constant, simply square it to get the same comparison, but faster. The result of doing this in a test similar to the one above:
TEST3: 0.0907...
Much faster :)
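For illustration, here is a minimal sketch of a squared-distance comparison in Unity C# (the helper name and the radius check are just an example, not from the test above):

// Is 'point' within 'radius' of 'center'? Vector2.sqrMagnitude returns x*x + y*y,
// so instead of taking a square root we compare against the radius squared.
bool IsWithinRadius(Vector2 point, Vector2 center, float radius)
{
    Vector2 offset = point - center;
    return offset.sqrMagnitude <= radius * radius;
}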
Comments
There are a couple of things muddying the results of your test, though - the timer checks and the Debug.Log calls weigh in against the later tests.
Still, I'm surprised to see that magnitude squared is so much faster when, presumably, it was done with a*a+b*b in VM code.
Maybe I'll try out your test in native code... I've also wondered how mag squared really compares to sqrt...
UPDATE:
The results (in milliseconds):
Each of those times varied by about 1 ms between runs.
That's 8 million of the good magnitude tests in 33 ms. Bumping it up to 64 million, the times all scaled perfectly linearly.
I tried using basic "new Vec2d[]" instead of "std::vector<Vec2d>" to see if it made any difference.
The only difference:
...What? Something must be wrong, but I can't find anything. I'm too lazy to go over the assembly now, perhaps the compiler found something to optimize in this situation.
I guess the conclusion is very much in line with the topic of the thread and, indeed, sanity - Don't Preoptimize ;)
Whether or not you should pre-optimize also depends on how leaky your abstractions are. Optimizing for cache-coherency (AKA data-oriented program design) can have profound effects on your public API, meaning if you wait too long to optimize, doing so could have a ripple effect on the rest of your codebase, as well as your clients' codebases.
Premature optimization is definitely something to be taken seriously, but postmature optimization is a demon in its own right.
Bob Nystrom's Data-Locality pattern has the most concise explanation of cache coherency issues and their potential solutions that I've seen.
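To make that concrete, here is a minimal sketch (a hypothetical particle system, not taken from the linked pattern) of how a data-oriented layout leaks into the public API: callers end up working with indices into contiguous arrays instead of holding references to individual objects.

// Array-of-objects layout: each Particle is a separate heap allocation,
// so iterating over them hops around memory and thrashes the cache.
class Particle
{
    public float x, y;
    public float velocityX, velocityY;
}

// Data-oriented layout: components live in contiguous arrays,
// so updating all particles walks memory linearly.
class ParticleSystem
{
    float[] xs, ys, velocitiesX, velocitiesY;

    public ParticleSystem(int capacity)
    {
        xs = new float[capacity];
        ys = new float[capacity];
        velocitiesX = new float[capacity];
        velocitiesY = new float[capacity];
    }

    // Callers can no longer hold a Particle reference; they refer to particles by index,
    // which is exactly the kind of public-API ripple effect mentioned above.
    public void Update(float deltaTime)
    {
        for (int i = 0; i < xs.Length; i++)
        {
            xs[i] += velocitiesX[i] * deltaTime;
            ys[i] += velocitiesY[i] * deltaTime;
        }
    }
}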
Worrying about APIs and clients down the line is a problem for general software development, not really an issue for a small game developer here in the current situation.
Optimizing the overall process of making a game is the real optimization.
(This post is a good example of wasteful optimization. It was longer, then took time to make smaller, all the while doing nothing but making me seem more like a mean old nazi...) ;)