Velocity Architecture vs PS5 SSD

Epic's guys didn't talk a lot about it, but their head engineer (Brian Karis) referenced some inspirations for their solution, like

or here is Brian Karis' blog post about this methodology

Yes, this is over 10 years old. Shows how difficult this stuff is.

Seems a lot like a displacement map…?

Yes and no. This is a completely different way of storing meshes. Meshes are cut and then transformed to 2D space, like a texture. Every texel on this geometry texture then has the same coordinates as the color texture, so no more UV mapping is needed.
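
To make that concrete, here's a minimal sketch of the geometry image idea described above (not Epic's actual format, just an illustration): each texel stores an (x, y, z) position, connectivity comes for free from the 2D grid, and a texel's (u, v) is the same coordinate you'd use for the color texture.

```python
# Minimal geometry-image decode sketch: positions live in a 2D grid,
# connectivity is implicit, and (u, v) doubles as the color-texture coordinate.
# Purely illustrative; names and layout are assumptions, not Epic's format.

def decode_geometry_image(geo_image):
    """geo_image[v][u] is an (x, y, z) tuple; returns vertices, UVs and triangles."""
    h, w = len(geo_image), len(geo_image[0])
    vertices = [geo_image[v][u] for v in range(h) for u in range(w)]
    uvs = [(u / (w - 1), v / (h - 1)) for v in range(h) for u in range(w)]

    triangles = []
    for v in range(h - 1):
        for u in range(w - 1):
            i = v * w + u                      # texel index -> vertex index
            # two triangles per grid cell, no hand-authored index buffer
            triangles.append((i, i + 1, i + w))
            triangles.append((i + 1, i + w + 1, i + w))
    return vertices, uvs, triangles

# tiny 2x2 "geometry image": four positions forming a quad
quad = [[(0, 0, 0), (1, 0, 0)],
        [(0, 1, 0), (1, 1, 0.5)]]
verts, uvs, tris = decode_geometry_image(quad)
print(len(verts), "vertices,", len(tris), "triangles")  # 4 vertices, 2 triangles
```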

Handling geometry like textures has a lot of advantages, because GPUs have lots of features for them. You can use hardware-accelerated mipmapping, texture filtering and sampling for geometry to basically get LOD for free. You can use sampler feedback streaming to only load the geometry that is visible, and at the best mipmap (LOD) level. When geometry and color textures share the same coordinate system you get simpler math and better data caching.
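
Rough sketch of the "mipmapping as LOD" point, assuming the simplest possible filter of averaging 2×2 blocks of positions (a real pipeline would filter more carefully):

```python
# Build a mip chain over a geometry image by averaging 2x2 blocks of positions.
# Each coarser mip is a lower-poly version of the same surface, so picking a
# mip level is effectively picking a LOD. Illustrative only.

def downsample(geo_image):
    h, w = len(geo_image), len(geo_image[0])
    out = []
    for v in range(0, h, 2):
        row = []
        for u in range(0, w, 2):
            block = [geo_image[v2][u2]
                     for v2 in range(v, min(v + 2, h))
                     for u2 in range(u, min(u + 2, w))]
            row.append(tuple(sum(c) / len(block) for c in zip(*block)))
        out.append(row)
    return out

def mip_chain(geo_image):
    chain = [geo_image]
    while len(chain[-1]) > 1 and len(chain[-1][0]) > 1:
        chain.append(downsample(chain[-1]))
    return chain

# a 4x4 geometry image has mips of 4x4, 2x2, 1x1:
# vertex counts drop 16 -> 4 -> 1, i.e. LOD "for free" from the mip chain
flat = [[(u, v, 0.0) for u in range(4)] for v in range(4)]
for level, mip in enumerate(mip_chain(flat)):
    print("mip", level, ":", len(mip) * len(mip[0]), "vertices")
```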

The disadvantages are that the geometry and rasterization hardware acceleration in your GPU is mostly useless. Animation is difficult, and cutting and mapping the 3D mesh to 2D doesn't work for every kind of geometry. That's why Epic uses this technique for static meshes and not foliage. You also need tooling support for this.

So going by the gif at the link it is almost like a UV mapping scenario for textures where it folds up to form the geo then? Using the color to map how the ‘folds’ work? That sounds real interesting.

In my head I’m imagining the process being akin to making a model and texturing it, flattening out the texture map as usual, then doing another image mapping to the model but instead of painting it as a texture ya use colors to map the location on the model in 3D space, then throw out the base model and just overlay the texture onto the image map and let em both form the model in-game. Or something. lol

I coulda sworn EPIC said they were gonna be using mesh shaders to support XSX for UE5 and Nanite, and what you just outlined almost seems to make mesh shaders moot, no? At least for the culling features. Maybe mesh shaders can work real well for the transformation/shading to get the image maps/textures folded into shape properly? I could see having a fully programmable geo pipeline a la mesh shaders being helpful there maybe.

I think you only get to skip UV mapping if you have a 1:1 resolution between the textures and geometry images, based on what his blog post suggested.

It’s interesting that this stems from MS Research too. Didn’t know that.

Man, imagine the shit storm that will kick up when devs start using SFS reliably… >.>

On the contrary, mesh shaders (or more precisely, compute-based processing of geometry) were pushed precisely to enable such scenarios.

In the current fixed pipeline only a few input formats are supported, and some more esoteric approaches (like displacement maps + tessellation) have incredibly low performance.

With mesh shaders you have more flexibility to parse the input format from whatever it is into the new one (meshlets) that can be consumed in parallel, which then solves the performance issue.
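
For illustration, here's roughly what splitting a mesh into meshlets means; the 64-vertex / 126-triangle limits below are commonly quoted defaults, not something from this thread:

```python
# Split an indexed triangle list into meshlets: small, self-contained chunks
# (bounded vertex/primitive counts) that a mesh shader workgroup can process
# in parallel. Sketch only; real builders also optimize locality and bounds.

MAX_VERTS = 64
MAX_TRIS = 126

def build_meshlets(indices):
    """indices: flat list of vertex indices, 3 per triangle."""
    meshlets = []
    current = {"vertices": [], "triangles": []}
    remap = {}                      # global vertex index -> local slot

    for t in range(0, len(indices), 3):
        tri = indices[t:t + 3]
        new_verts = [v for v in dict.fromkeys(tri) if v not in remap]

        # start a new meshlet if this triangle would overflow the limits
        if (len(current["vertices"]) + len(new_verts) > MAX_VERTS
                or len(current["triangles"]) + 1 > MAX_TRIS):
            meshlets.append(current)
            current = {"vertices": [], "triangles": []}
            remap = {}
            new_verts = list(dict.fromkeys(tri))

        for v in new_verts:
            remap[v] = len(current["vertices"])
            current["vertices"].append(v)
        current["triangles"].append(tuple(remap[v] for v in tri))

    if current["triangles"]:
        meshlets.append(current)
    return meshlets

# a long strip-like index list just to show the chunking
indices = [i for t in range(500) for i in (t, t + 1, t + 2)]
print(len(build_meshlets(indices)), "meshlets")
```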

My thinking there was that one of the motivators for mesh shaders was the ability to choose when you cull triangles, and this approach seems to shift that entire process over to texture filtering instead. I can see the format aspect being real important for this too though now that ya mentioned it.

Back in the other era I always used to say that the PS5's high clocks and, at the time, insane SSD specs looked to be a reaction to a better-designed system aiming for higher performance.

I was banned a few times for saying that, but it’s increasingly clear that it was indeed the case.

From what I understand there's still culling of triangles, especially depending on the texture format. For example, you can store geometry as an SDF (which is kinda like each pixel representing a vector from the object's position), so you'd still need higher resolutions for close-ups, but at every resolution the object would be described by the texture itself with an effectively infinite number of polygons.

And in that scenario the GPU, with mesh shaders, can also figure out entirely on its own the exact size each vertex needs to have.
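
Here's a generic SDF sketch (not whatever format Epic actually uses): the data stores distances to the surface, and the surface is wherever the field crosses zero, so the "polygon count" isn't baked into the stored data at all.

```python
# Signed distance field sketch: each sample stores the distance to the surface.
# The surface is the zero crossing of the field, so the same data can be
# ray-marched or meshed at whatever density the current view needs.
import math

def sphere_sdf(x, y, z, radius=1.0):
    """Distance from point (x, y, z) to a sphere of 'radius' at the origin."""
    return math.sqrt(x * x + y * y + z * z) - radius

def raymarch(origin, direction, sdf, max_steps=64, eps=1e-4):
    """Sphere tracing: step along the ray by the sampled distance."""
    t = 0.0
    for _ in range(max_steps):
        px, py, pz = (origin[i] + direction[i] * t for i in range(3))
        d = sdf(px, py, pz)
        if d < eps:
            return t          # hit: distance along the ray to the surface
        t += d
    return None               # miss

# ray from z = -3 straight toward the unit sphere: hits at t ~= 2.0
print(raymarch((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf))
```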

And yet, most there felt MS’s entire Azure cloud IaaS branch was seemingly dreamt up overnight in reaction to PS4 specs, lol.

I dunno that I’d agree entirely. I mean, the SSD was targeting a known bottleneck and it’s honestly the simplest solution (just go real damn fast and try keeping the memory cool). For clockspeeds though, I’d agree that your story seems likely since it seems hard to believe they’d have gone all in with high clocks beyond 2 GHz if not for competition. After all, that kinda jump probably required them to rethink how their entire cooling solution would work.

Ok, so my next question, more relevant to the topic of the thread, is how would using geometry images mitigate SFS’s advantages? Seems like you are now bringing in WAY more texture-like image data compared to before(?) since you gotta stream in the geo images too, no?

I only think it was an overreaction because: why push for an SSD able to reach higher than 9 GB/s throughput (with compression and whatnot), with a design so costly that you need to reduce the storage size, and still end up with a more expensive drive than your competition?

The only reason that makes sense to me is that they heard about the benefits of SFS and how MS was able to load GBs of data basically instantly, and brute-forced a similar approach.

Did you see the SFS demo on the XSS? That was a super small room using insanely high-quality texture assets, and they were able to discard and load GBs of data instantly as the camera turns or gets close to an object.

There's also one part with the rotating globe where you can see how well it works. Textures are loaded at higher quality as the globe rotates and a given part comes closer to the screen, and discarded as soon as they are no longer visible.

I don't think it would mitigate the advantages; in fact, there are a few scenarios where I can see it suiting SFS very well.

For example, you can't just load part of a traditional model, but with the model represented as a texture and parsed into a local mesh at runtime, it doesn't really matter whether you load the full model or only a portion of it. So occluded parts of the object could not only be completely discarded from the rendering pipeline but from system memory too (or, more in line with how SFS works, only the lowest LOD of the geometry texture would be loaded, and a "page fault" would trigger loading the next LOD if needed).

I also assume that increased geometry detail (which requires either higher-resolution textures or multiple texture layers to compose the full detail) would see increasing benefits from loading only what's needed.
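
A toy version of that "page fault on a missing LOD" idea, with made-up names, since the real SFS mechanism works on hardware tiles and a feedback map rather than anything like this:

```python
# Toy residency model: only the lowest-detail mip of a geometry texture starts
# resident; sampling a finer mip than what's loaded acts like a page fault and
# queues a streaming request. Hypothetical sketch, not the actual SFS API.

class GeometryTexture:
    def __init__(self, name, mip_count):
        self.name = name
        self.mip_count = mip_count
        self.finest_resident = mip_count - 1   # only the smallest mip loaded
        self.pending = set()                   # mips requested but not loaded

    def sample(self, wanted_mip):
        if wanted_mip >= self.finest_resident:
            return f"sample {self.name} at mip {wanted_mip}"
        # "page fault": fall back to what we have and request the next mip
        next_mip = self.finest_resident - 1
        self.pending.add(next_mip)
        return f"sample {self.name} at mip {self.finest_resident} (requested {next_mip})"

    def stream_in(self):
        for mip in sorted(self.pending, reverse=True):
            self.finest_resident = min(self.finest_resident, mip)
        self.pending.clear()

rock = GeometryTexture("rock_geo", mip_count=5)
print(rock.sample(0))   # wants full detail, only mip 4 is resident -> fault
rock.stream_in()        # pretend the I/O completed
print(rock.sample(0))   # now mip 3 is resident; repeat until detail suffices
```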

Not so sure it’d give a net reduction in streaming here. What you seem to be saying is yes you have more image data to stream in but basically have no geo data of the traditional sort at all. Ok, so you get the savings upfront there, BUT the geo data you do bring in, in the form of a geo image now, is gonna be larger than it woulda been as vertex data, no? Granted, yes you can avoid streaming in the entire geo image courtesy of SFS making it feasible to avoid popping, so it isn’t an apples to apples comparison.

So the comparison would seem to be between how much data ya are streaming in for visible vertices only vs Nanite’s approach using geo images instead.
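
Purely hypothetical back-of-envelope, with made-up sizes and no compression, just to show that a geometry image with implicit connectivity isn't automatically bigger than raw vertex + index buffers:

```python
# Hypothetical, uncompressed back-of-envelope: compare a geometry image against
# a traditional vertex + index buffer for roughly the same vertex count.
# All sizes below are assumed for illustration, not real asset figures.

MB = 1024 * 1024

# 1024x1024 geometry image, 3 channels of 16-bit positions per texel
geo_image_bytes = 1024 * 1024 * 3 * 2

# traditional mesh with the same ~1M vertices:
#   positions as 3 x float32, plus an index buffer for ~2M triangles
vertex_bytes = 1024 * 1024 * 3 * 4
index_bytes = 2 * 1024 * 1024 * 3 * 4

print(f"geometry image : {geo_image_bytes / MB:.1f} MB")  # ~6 MB
print(f"vertex buffer  : {vertex_bytes / MB:.1f} MB")     # ~12 MB
print(f"index buffer   : {index_bytes / MB:.1f} MB")      # ~24 MB; connectivity is free in the geo image
```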

This is real interesting since it makes me wonder if you can find benefits using ML to upres the geo images at runtime too as you can with textures. Given that this would be early in the rasterization stages, and that there would be major rasterization savings early on in the frame time from this approach, I bet that could work too.

I did see the demo but can’t find it now (do ya have the link handy?). I only saw the one screengrab of it when searching yesterday. I think the SSD cost woulda been higher than XSX’s no matter what, unless they went with one slower than it and that wouldn’t solve the challenges they sought to tackle wrt RAM utilization (unless they built their own SFS which they evidently were not interested in doing).

I'd also think the TMUs would need to do some work on geo images, no? So RT could be limited (maybe why Lumen isn't using RT).

Only up to a certain point. In comparison to the super low-poly models we use today? Yeah, it's going to be an increase.

Compared to the billions of polygons that compose the source geometry it’s actually a tremendous compression.

Edit: look up Mega Meshes from Lionhead on the 360, for Milo & Kate.

Back then, using only the 512 MB of RAM available on the 360, they were able to load a world comprised of over 10 billion polygons, which would be impossible to fit in an index buffer format.
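
Rough arithmetic on why that can't fit as conventional buffers, assuming plain 32-bit indices and float32 positions:

```python
# Rough arithmetic: what 10 billion triangles would cost as plain index/vertex
# buffers, assuming 32-bit indices and 3 x float32 positions (illustrative only).
GB = 1024 ** 3

triangles = 10_000_000_000
index_bytes = triangles * 3 * 4            # 3 indices per triangle, 4 bytes each
vertex_bytes = (triangles // 2) * 3 * 4    # ~1 vertex per 2 triangles in a closed mesh

print(f"indices : {index_bytes / GB:.0f} GB")   # ~112 GB
print(f"vertices: {vertex_bytes / GB:.0f} GB")  # ~56 GB -> hopeless on 512 MB
```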

I forgot all about Milo’s geo tech. Hmmmm.

Took me a while but I finally found it: https://youtu.be/xvCpbsiEKPE

Regarding the TMUs, that's actually how they managed to get SFS implemented: they modified the TMUs so they can handle virtual texturing and the sampler feedback table without a performance penalty. With virtual texturing, and using the sampler feedback texture, there are some duplicated reads that are eliminated by the hardware modifications MS made.
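
A software caricature of what a virtual-texture sample involves, with made-up structures rather than the actual XSX implementation, just to show where the extra reads come from when there's no hardware help: the page-table translation and the feedback write are each additional memory accesses on top of the texel fetch.

```python
# Software caricature of a virtual-texture sample: translate (u, v, mip) through
# a page table, record feedback, then read the physical tile. In software each
# step is a separate memory access. Structures and names here are illustrative
# assumptions, not the real hardware path.

TILE = 128  # texels per tile side (assumed)

page_table = {}      # (tile_x, tile_y, mip) -> physical tile id, if resident
feedback_map = {}    # (tile_x, tile_y) -> finest mip any sample asked for
physical_tiles = {}  # physical tile id -> tile payload (stubbed out here)

def sample_virtual(u, v, mip, texture_size):
    size = texture_size >> mip
    tile_x, tile_y = int(u * size) // TILE, int(v * size) // TILE
    key = (tile_x, tile_y)

    # access 1: sampler feedback (record what this sample wanted)
    feedback_map[key] = min(mip, feedback_map.get(key, mip))

    # access 2: page table translation
    physical = page_table.get((tile_x, tile_y, mip))
    if physical is None:
        return None                      # not resident: fall back / request it

    # access 3: the actual texel data
    return physical_tiles[physical]

# one resident tile at mip 0 of a 1024x1024 texture
page_table[(0, 0, 0)] = 7
physical_tiles[7] = "tile data"
print(sample_virtual(0.01, 0.01, 0, 1024))   # hits the resident tile
print(sample_virtual(0.9, 0.9, 0, 1024))     # miss -> None, feedback recorded
print(feedback_map)                          # a streaming pass reads this later
```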

Ahhhhh! It was in the XSS tech showcase vid! Damn, lol. Spent an hour searching and I’m quite talented with Google-fu, so was surprised I couldn’t dig it up. Thanks!

We haven't really seen a game use the Velocity Architecture yet, so I'm not sure a comparison can really be done. A game would need to be designed with the use of the architecture in mind.

It's a bit different from the SSD the PlayStation has, where any game would naturally use it. So once we see games actually using the Velocity Architecture effectively, we will see better results.

The reason you are seeing PlayStation load times that aren't what the fanboys were dreaming of is that Sony was providing peak performance numbers while Microsoft provided the sustained read/write speeds. So while in bursts the I/O performance of the PS5 has the potential to be better, the Xbox's I/O performance would naturally be a bit better overall, as it would have a higher peak than the numbers Microsoft provided while at minimum performing consistently at the speeds advertised. You aren't going to see the PS5 doing sustained I/O at the numbers Sony provided. The perceived gulf that some people thought existed was never actually present.