@gavkar Very interesting! Does the improved scheduling also apply to fragment and tile shaders?

@gavkar I watched these videos earlier, really interesting! I’m especially blown away by dynamic caching and really curious about how you managed it. So the register file doesn’t exist as a going concern and everything is L1, but is that logically or physically? As in, do the registers still physically exist inside the shader core while logically being part of the L1 cache, or is everything L1 cache even at the physical level?

@gavkar If the former, there must be a minimum size for the L1 register file, yes? Or can the thread pool memory and other core memory types extend into the registers? Is the L1 collocated with the registers physically, or is that not necessary? If the L1 register file is bigger than the registers, what’s the penalty for grabbing a piece of data resident in the cache vs. the physical register?

@gavkar Similarly, if it’s the latter and everything is cache, how do you maintain performance? Sorry for bombarding you with questions. I mostly do CUDA programming for biology simulations on Nvidia GPUs, so I’m somewhat familiar with the topic but not an expert, particularly in Apple GPUs. I’ll understand if you can’t comment, though. Anyway, again, I thought this sounded like a really awesome advancement and was super impressed and curious. Thanks!

@gavkar holy moly, this sounds extremely good! maybe I should get an M3 to play around someday (too bad the M1 is still pretty good for my needs, lol)

@gavkar my God finally. I was waiting for this, and wondering where they'd gone.

@gavkar This is incredibly impressive work! Personally, I consider Dynamic Caching to be the most important innovation since scalar SIMT. Happy to see the Apple GPU team establishing themselves as innovation leaders. Also, a very impressive ability to plan, execute, and synergise many years into the future; not many can pull this off. Looking forward to the performance and capability improvements your new model will unlock in the future.