I was looking for information on whether calling glGetUniformLocation
every frame has any performance impact. Everything I found was either unsourced claims or very old data.
Since I couldn't find anything useful, I ended up writing a small benchmark program to test 3 different ways of using uniform locations:
- Calling glGetUniformLocation every time (Lookup).
- Getting the location on first use and caching it in a std::map using compile-time keys (Cached).
- Getting the location at load time and storing it in a variable (Static).
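To make the three variants concrete, here is a minimal sketch rather than the benchmark's actual code. So that it runs without a GL context, fakeGetUniformLocation is a hypothetical stand-in for glGetUniformLocation that also counts how often it is called. The uniform names, the UniformKey enum, and the LocationCache and StaticLocations types are likewise made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <map>

// ASSUMPTION: stand-in for glGetUniformLocation so the sketch runs without OpenGL.
// It counts calls so the number of lookups per strategy is visible.
static int g_lookupCalls = 0;
static const char* const kUniformNames[] = {"uColor", "uOffset", "uScale"};

int fakeGetUniformLocation(unsigned /*program*/, const char* name) {
    ++g_lookupCalls;
    for (int i = 0; i < 3; ++i)
        if (std::strcmp(kUniformNames[i], name) == 0) return i;
    return -1;  // glGetUniformLocation also returns -1 for unknown names
}

// Hypothetical compile-time keys for the cached variant.
enum class UniformKey { Color, Offset, Scale };

// 1. Lookup: ask the driver by name every single time.
int locationByLookup(unsigned program, const char* name) {
    return fakeGetUniformLocation(program, name);
}

// 2. Cached: look up on first use, then serve from a std::map.
class LocationCache {
public:
    explicit LocationCache(unsigned program) : program_(program) {}

    int get(UniformKey key, const char* name) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // hit: map search only
        const int loc = fakeGetUniformLocation(program_, name);
        cache_.emplace(key, loc);
        return loc;
    }

private:
    unsigned program_;
    std::map<UniformKey, int> cache_;
};

// 3. Static: resolve everything once at load time into plain members.
struct StaticLocations {
    int color, offset, scale;

    explicit StaticLocations(unsigned program)
        : color(fakeGetUniformLocation(program, "uColor")),
          offset(fakeGetUniformLocation(program, "uOffset")),
          scale(fakeGetUniformLocation(program, "uScale")) {
        // Catch misspelled uniform names at load time instead of per frame.
        assert(color >= 0 && offset >= 0 && scale >= 0);
    }
};
```

In this sketch the Lookup variant pays for a name search on every call, the Cached variant pays for it once plus a map search on every call, and the Static variant pays for it once and afterwards only reads an int.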
The test uses a shader with 20 vec4 uniforms (10 in the vertex shader, 10 in the fragment shader) and sets each uniform 50 times per render, for 1000 uniform updates per render; with fewer than 50 repetitions the time delta was hard to measure. Each run performs 1000 renders to come up with an average time.
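A measurement loop along these lines could produce the statistics reported below. This is only a sketch under assumptions: the RunStats struct and measure function are hypothetical names, microseconds are an assumed unit, and the timer only captures CPU-side time spent inside the callback. The real benchmark may measure differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical result record mirroring the columns of the results table.
struct RunStats {
    long long first = 0, second = 0, fastest = 0, slowest = 0, average = 0;
};

// Times `renders` invocations of renderOnce and summarizes them.
// ASSUMPTION: durations are recorded in whole microseconds.
RunStats measure(int renders, const std::function<void()>& renderOnce) {
    assert(renders >= 2);  // need at least two samples for first/second
    using Clock = std::chrono::steady_clock;

    RunStats stats;
    long long total = 0;
    for (int i = 0; i < renders; ++i) {
        const auto start = Clock::now();
        renderOnce();  // e.g. set 20 uniforms 50 times each
        const auto stop = Clock::now();
        const long long us =
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();

        if (i == 0) { stats.first = us; stats.fastest = us; stats.slowest = us; }
        if (i == 1) stats.second = us;
        stats.fastest = std::min(stats.fastest, us);
        stats.slowest = std::max(stats.slowest, us);
        total += us;
    }
    stats.average = total / renders;  // integer mean over all renders
    return stats;
}
```

Reporting the first and second render separately is useful because the first call tends to include one-time costs such as filling a cache.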
Here are the results from a modern high-end desktop and a 5-year-old laptop, both running Windows 11.
Machine / Method | First | Second | Fastest | Slowest | Average |
---|---|---|---|---|---|
NVIDIA RTX4090 | |||||
- Lookup | 1423 | 1047 | 578 | 2317 | 929 |
- Cached | 366 | 147 | 105 | 402 | 134 |
- Static | 159 | 21 | 5 | 167 | 10 |
i7-8565U / UHD 620 | |||||
- Lookup | 1300 | 1115 | 870 | 1984 | 1043 |
- Cached | 676 | 671 | 488 | 1849 | 579 |
- Static | 130 | 129 | 73 | 459 | 103 |
The difference is pretty significant. Running at 60 fps with 1000 uniforms, looking up the location every frame would use about 5.6% of the frame time on average, versus only 0.1-0.6% (depending on hardware) when storing the location.
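The percentages follow from dividing the per-render time by the frame budget at 60 fps. The sketch below assumes the table values are microseconds per render (1000 uniform updates). That unit is not stated explicitly, but it is consistent with the 5.6% figure. frameBudgetPercent is a hypothetical helper name.

```cpp
#include <cassert>

// Share of one frame's time budget used by `microsPerFrame` at a given frame rate.
// ASSUMPTION: the benchmark's table values are microseconds per render.
double frameBudgetPercent(double microsPerFrame, double fps) {
    assert(fps > 0.0);
    const double frameMicros = 1e6 / fps;  // about 16667 us at 60 fps
    return microsPerFrame / frameMicros * 100.0;
}
```

With the averages from the table, frameBudgetPercent(929, 60) is about 5.6 for Lookup on the desktop, while the Static averages of 10 and 103 come out at roughly 0.06 and 0.6.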
Caching locations in a std::map worked better than looking them up every time, but it was still significantly worse than using a custom class with the uniform variables hardcoded.
Benchmark source code available here.