Games Analysis |OT| Time To Argue About Pixels And Frames!

Which memory optimisation features are you talking about in such a simple process as calculating a shadow map or a stencil buffer?

Depth and color compression are both in GCN 1.2 and RDNA.

SFS reduces memory usage to a fraction. XVA will act as a memory multiplier, with the SSD always ready to stream data from. You don’t need huge memory bandwidth at any one time. The DLSS-alternative should also cut back on the overall bandwidth requirement. Also remember this is for 1080p gaming.

I’m not sure the Series S|X hardware lacks ANY architectural IPC improvements (performance and thermal) at the microcode level over GCN. Maybe the folks who are actually in the hardware design business can confirm this.

The memory bandwidth we’re referring to isn’t for getting data off of the SSD into memory; it’s the memory bandwidth the GPU uses to render a scene. Every pixel adds to the demand on memory bandwidth. On top of that, effects such as alpha transparencies eat up a lot more bandwidth per pixel than a clean opaque scene. This is called overdraw.

Picture a scene in a third person action game where your enemy is behind a waterfall but this waterfall is also behind a huge force field for some reason. In this scenario one draw call can come from the platform this enemy is sitting on, the next draw call is from the enemy themself, next we have another two calls for the waterfall that’s covering enemy plus platform, and yet another two draw calls for the force field covering everything. In this one scenario, we now have six draw calls where there could otherwise be two if it weren’t for the water and force field.
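To put rough numbers on that scenario, here’s a toy sketch (my own illustration, not real engine code; the byte counts are assumptions for a 32-bit colour buffer). Each transparent layer covering a pixel has to read the destination, blend, and write it back, so colour-buffer traffic grows with the number of layers stacked on that pixel:

```python
BYTES_PER_PIXEL = 4  # assumed 32-bit RGBA colour buffer

def blend_cost(layers: int, bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Bytes of colour-buffer traffic for one pixel.

    The opaque base layer is a single write; every transparent layer
    on top is a read-modify-write (read destination, blend, write back).
    """
    opaque_write = bytes_per_pixel
    transparent_rmw = (layers - 1) * 2 * bytes_per_pixel
    return opaque_write + transparent_rmw

# Platform only: 1 opaque layer.
# Waterfall + force field: 2 transparent layers on top of it.
print(blend_cost(1))  # 4 bytes for a clean opaque pixel
print(blend_cost(3))  # 20 bytes for the same pixel behind waterfall + force field
```

Five times the colour-buffer traffic for that one pixel, before you even count the extra depth tests and shader work.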

Let’s apply this knowledge to Halo. Every piece of foliage adds overdraw, each explosion adds an extra layer of overdraw, effects such as our shields are another transparency effect that eats up memory bandwidth, those jackals with shields you can see through add even more overdraw, and so on. Picture a firefight with another spartan where they use a drop shield to give them time to use an overshield, and a moment later a teammate throws a grenade from the side to damage their now boosted shields. In this simple scenario, this single spartan would be drawn four times. This is an over-simplification of the GPU load, but it’s easy to see why the XSS may be 1080p while the One X is 1440p at 60fps when the latter has much more memory bandwidth.

You’re talking about I/O throughput, we’re talking about the amount of memory bandwidth needed to run a game smoothly at run time.

Edit:

There most definitely are IPC improvements going from GCN to RDNA 2. For example, GCN issues instructions every four cycles where RDNA does so every cycle. This has nothing to do with the memory bandwidth limitations we’re talking about though.
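A back-of-envelope sketch of what that issue-rate difference means (my simplification; the real comparison also involves wave64 vs wave32, SIMD widths, and so on):

```python
def instructions_issued(cycles: int, issue_interval: int) -> int:
    """Instructions a single SIMD can issue over `cycles` clock cycles,
    given it issues one instruction every `issue_interval` cycles."""
    return cycles // issue_interval

cycles = 1000
gcn = instructions_issued(cycles, issue_interval=4)   # GCN: one issue per 4 cycles
rdna = instructions_issued(cycles, issue_interval=1)  # RDNA: one issue per cycle

print(gcn)   # 250
print(rdna)  # 1000
```

Same clock, four times the issue opportunities per SIMD, which is part of why a 4TF RDNA 2 part can punch at the level of a higher-TF GCN part on compute-bound work, while doing nothing for bandwidth-bound work.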

SFS has nothing to do with the memory bandwidth requirement for calculating and using shadow maps or stencil buffers :woman_shrugging:

For anyone that would like to learn more about how 3D rendering is done in games, these are some good starting points.

https://www.techspot.com/article/1851-3d-game-rendering-explained/


To my understanding, the XSS has been designed with the RDNA 2/XVA optimisations in mind, which will reduce the need for raw physical resources.

Folks at Azure/Xbox engineering wouldn’t make such foolhardy design decisions and bet on a hardware system that performs worse than a last-generation system. It’s just that Halo Infinite doesn’t have any next-gen optimisations.

But it absolutely does matter for selectively loading mipmaps into memory.

Can you explain what in that demo is using up memory bandwidth and how?

None of that, absolutely none of that, has anything to do with memory bandwidth. That demo shows how SFS can reduce the time to load assets while also reducing the amount of memory the assets use. None of that has to do with memory bandwidth.

You are focusing on this aspect of the console specs:

While we are talking about an entirely different part of the system:

[image]

These two things are not related in any way outside of them both having something to do with the system memory. To put it a different way, you’re talking about the process of getting assets into memory. I’m talking about how the GPU uses the system memory once the assets are loaded into it. I explained this in my earlier post, so I’m not sure how you can relate how much memory is used to how pixels consume memory bandwidth.

I/O bandwidth to storage is two orders of magnitude slower than bandwidth to GPU memory. SFS has nothing to do with a 4TF RDNA2 GPU ~= 6TF GCN. There are workloads (I gave you examples) where the One X performs better. You know that is true because we see this in games.
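Rough arithmetic using the publicly quoted Series S figures (my comparison; 2.4 GB/s is the quoted raw SSD throughput, 224 GB/s the fast GPU-optimal memory pool):

```python
import math

ssd_raw_gbps = 2.4       # quoted raw SSD throughput, GB/s
gpu_memory_gbps = 224.0  # quoted Series S fast memory pool, GB/s

ratio = gpu_memory_gbps / ssd_raw_gbps
print(f"memory is ~{ratio:.0f}x faster "
      f"({math.log10(ratio):.1f} orders of magnitude)")
```

Roughly two orders of magnitude, which is exactly why I/O throughput and GPU memory bandwidth are two different conversations.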

Can you see this graph? This is depicting memory usage. Less memory usage implies a lower bandwidth requirement.

Yes. In last-gen, unoptimised, non-GDK games. Stop assuming things that won’t be true for next-gen optimised titles. Azure/Xbox hardware engineers are not illiterate folks; they know what they’ll be selling for the next decade, and they’ve designed the stack for it.

Yup, and I have to wonder about the claim that we’ve seen the XSS at higher resolutions than 1X games at 60fps. I’ve been going back to the few 60fps 1X games we’ve gotten in the past year, and I’m pretty sure most, if not all, are higher resolution on the 1X.

I have to ask, are you even reading my posts? Is there something I’m not explaining well enough that you’re not understanding the difference between I/O bandwidth and memory bandwidth? Look back at the two images I posted in my last response to you. The last two responses should have made things much more clear.

Again, you are referring to I/O bandwidth, not memory bandwidth. There is a difference. I posted images showing you those exact differences. What you are posting has nothing to do with memory bandwidth.


So this 387MB of data is loaded to memory using the dedicated memory lanes, right? It consumes the memory channel bandwidth as well, not only I/O (unless it’s using some other bus). So it’s also less data for the GPU to load from memory.

Again, we have seen absolutely zero :ok_hand: next-gen optimised titles this past year.

It is less if the game leaves the freed memory unused, which will not happen. The freed SFS space will be used for better assets, BVH data, etc.

On the topic of 1080p60 HI on the S, I think that’ll be boosted to 1440p60 sometime down the line. According to VG Tech, the S hits native 1080p most of the time, so there is headroom for 1440p reconstructed.

I also expect a 60fps mode on Xbox One. VG Tech’s lowest native-res pixel count was 1152x900. That can very well be lowered to 480p in the 60hz mode. For reference, Metro Exodus bottoms out at 480p on the Series S.


I couldn’t find a good image for the Series consoles, but the PS5 image below is good enough. I added an additional curved line leading from the CPU and GPU to the system memory to go along with the already drawn arrow showing data feeding from the SSD to the memory. There are two lines here: you are talking about the semi-transparent arrow. I am talking about the opaque curved line coming from the CPU/GPU to the system memory. As @proelitedota pointed out, they aren’t going to leave free memory unused. So the memory saved by SFS will just be used for higher resolution textures, a higher variety of texture data, BVH structures, higher resolution buffers, and so on. The point is the memory will be used at the end of the day. So at this point we come to the GPU memory bandwidth and how efficiently those assets can be fed to the GPU to process and render what’s being displayed on the screen.

This is factually incorrect and the wrong way to look at this. Any game developed using the GDK and running on a Series X/S is a new gen game. That doesn’t mean that these systems are being utilized to their fullest; that’s a completely different conversation. So in the context of what has actually been released, I can’t think of one XSS game that has a higher resolution than its last-gen 1X counterpart when running at 60fps, outside of RE8, and even then it’s not a 1:1 comparison.

Even if we go with this spin on reality, it still makes the comment below untrue. We should really keep to factual statements and not lean into hyperbole.

The picture shows memory for assets and caches from I/O storage. This is not the whole memory footprint of the game rendering a single frame. Games create a lot of (temporary) data in multiple steps for calculating everything you see in a single frame. Some of these steps are math heavy, like calculating lighting; some, like stencils, shadow maps or z prepass, are memory (not storage) bandwidth heavy.
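Some illustrative arithmetic on that last point (my assumed formats and access counts): even a “simple” pass like a shadow map generates a lot of *memory* traffic per frame, entirely independent of what the SSD streamed in.

```python
def pass_bandwidth_gb(width: int, height: int, bytes_per_texel: int,
                      accesses: int, fps: int) -> float:
    """GB/s of memory traffic for one full-resolution pass at `fps`,
    where `accesses` counts how many times each texel is touched."""
    per_frame = width * height * bytes_per_texel * accesses
    return per_frame * fps / 1e9

# Assumed: 4096x4096 shadow map, 32-bit depth,
# written once during rendering and sampled once during shading, at 60 fps.
print(pass_bandwidth_gb(4096, 4096, 4, 2, 60))  # ~8 GB/s for a single map
```

And that’s one shadow map with one access each way; cascades, filtering taps, and the other buffers in a frame stack on top of it.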

Where did i call Xbox engineers illiterate? What?!


Summary? :slight_smile:

Both Series X and PS5 have a dynamic resolution between 1920x2160 and 3840x2160 in 60fps mode. When the resolution drops down below the target 4K resolution, reconstruction is then used to reach the 4K goal. So if the game drops down to 2560x2160, it will then reconstruct up to 3840x2160.

Series S targets 1920x1080 in 60fps mode but will drop down to 960x1080 at the lower figure. Again, reconstruction is used to bring the IQ back up to 1920x1080 when dropping below the target resolution.
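A quick pixel-count check on those Series S figures (my arithmetic): since only the horizontal axis scales, the lowest step renders exactly half the pixels of the target.

```python
def pixels(width: int, height: int) -> int:
    """Total pixels rendered at a given resolution."""
    return width * height

target = pixels(1920, 1080)  # 2,073,600 pixels
lowest = pixels(960, 1080)   # 1,036,800 pixels

print(lowest / target)  # 0.5
```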

All three hit the 60fps target 99% of the time. There are some one-off single frame drops but it’s minor.

In 120fps mode, both high end consoles drop down to around 1536p as the max resolution, again with resolution scaling on the horizontal axis based on performance. The two smaller maps hold 120fps most of the time, usually dropping when there are a lot of physics on screen. On the larger maps, we see drops down to 80fps on both high end systems. The main difference is the PS5 always has v-sync enabled, where the Series consoles will drop v-sync when going below the target frame rate.

The Series S keeps the same 960x1080 to 1920x1080 resolution in 120fps mode. The frame rate is decent on smaller maps with less action but the larger maps will see drops down to the low 70s when stressed.


so the new dash is indeed in real true 4k

When members like @KageMaru and @CallMeCraig, who clearly have lots of technical knowledge about the rendering process, post detailed explanations, it might be a good idea to listen and reflect rather than trying to argue with them using marketing slides and bullet points. I find proper, well-explained technical explanations very fascinating (and thanks for the basics videos too), but it’s draining to then have to read a load of counterpoints based on a clearly lower level of understanding, and for said members to have to repeat themselves. Just a personal thing, and I’m not trying to police the thread, but it would be good for everyone if these sorts of informative posts weren’t drowned in a sea of ‘less informative hopes and dreams’, because if we wanted that we could go to reddit.

Even someone as dense as I am knows that memory bandwidth to the GPU is NOT the same as the bandwidth to load from the SSD into RAM in the first place. And that the Series S’s major limiting factor is that GPU memory bandwidth. It’s been talked about by a few devs now, and we have to accept that going forward it will in the main be what limits the box from higher resolutions or settings.
