Hot Chips 32, XSX Architecture Deep Dive

In English please :crazy_face:

I’m agreeing with you :slight_smile: The total power is constant, so when GPU power is boosted (frequency is increased, and to do so the GPU power rail voltage also needs to be increased, with power proportional to V² at a fixed frequency, closer to V³ once frequency scales with voltage), that extra power needs to be borrowed from the CPU. Lowering just the CPU frequency doesn’t help much by itself. If the required frequency is low enough, the CPU power rail voltage can be dropped from nominal to minimum, which frees up the most power.
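A toy back-of-the-envelope sketch of that borrowing. All numbers here are made up for illustration (not real console figures); it just assumes the usual dynamic CMOS power relation P ≈ C·f·V², which is why a voltage drop frees far more power than a frequency cut alone:

```python
# Toy model of power borrowing on a shared budget. Assumes dynamic power
# P = C * f * V^2; all capacitance/frequency/voltage values are invented.

def dynamic_power(cap_f, freq_ghz, volts):
    """Dynamic CMOS power: P = C * f * V^2 (arbitrary units)."""
    return cap_f * freq_ghz * volts ** 2

# Hypothetical nominal CPU operating point: 3.5 GHz at 1.00 V.
cpu_nominal = dynamic_power(cap_f=20.0, freq_ghz=3.5, volts=1.00)

# Frequency cut alone (3.5 -> 3.0 GHz, voltage unchanged).
cpu_freq_cut = dynamic_power(cap_f=20.0, freq_ghz=3.0, volts=1.00)

# Frequency cut low enough to also drop the rail voltage (1.00 -> 0.85 V).
cpu_volt_cut = dynamic_power(cap_f=20.0, freq_ghz=3.0, volts=0.85)

# The V^2 term makes the combined cut free roughly 2.5x as much power.
print(f"freq cut alone frees  {cpu_nominal - cpu_freq_cut:.2f} units")
print(f"freq + voltage frees  {cpu_nominal - cpu_volt_cut:.2f} units")
```

With these made-up numbers, the frequency cut alone frees 10 units while the frequency-plus-voltage cut frees about 26.65, which is the point the post is making.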

1 Like

I’m sorry, I didn’t mean any offence, I’m just too dumb to understand.

No offense taken. It’s more likely I did not explain clearly. As long as things are public knowledge, I am more than happy to engage and clarify myself.

I would presume any documentation shared with developers is covered under NDA. Still, once third-party games are released, we should learn more.

Honestly, if the PS5 had features like VRS, mesh shaders, etc., Cerny would have mentioned them at the GDC talk.

Which is a shame, really. When used together, I gather they can probably eke out 40%+ performance gains.

1 Like

I was probably wrong to say legally, but I still think it would be wrong of them to say they have the most powerful console without knowing for sure. Like I said, though, I think it’s fairly clear they do, so we’ll wait and see when the marketing cycle starts.

I think Phil actually said that they can only speak for themselves and that he doesn’t know what other companies’ hardware is.

I bet they did know but have to act like they didnā€™t. Same as Sony.

Everyone is looking for the answer to this question.

Itā€™s like saying x+y=4 in USA but x+y=5 in Japan.

The balancing between CPU and GPU in the PS5 is not done by clock but by power consumption.

The power consumption is in part a result of which hw resources are in use: the more hw resources in use, the higher the power consumption, and the fewer in use, the lower it is.

The other part of the total power consumption is the clock frequency, which acts as an amplifier: the higher the clock, the higher the additional increase in power consumption. We are talking about a superlinear (roughly cubic) dependency here, since voltage has to rise along with frequency and power scales with f·V².

Some examples:

Not very demanding game scenes / use cases:

The GPU & CPU only use a fraction of the hw resources (ALUs), and the total power consumption is well within the limits of both the GPU and the CPU. That means you get the highest clocks at times when the GPU and CPU are not really challenged, which is no benefit in demanding game scenes. Also, this use case doesn’t represent the theoretical peak performance, as only a fraction of those resources are being used. This is why Sony can say those high clocks are achieved a lot of the time.

Demanding game scene / use cases:

Clocks for the CPU and potentially the GPU are reduced because all hw resources are used heavily and the power budgets for CPU and GPU are at their limits (high power consumption). Clocks have to be reduced for the CPU, the GPU, or even both, which changes the upper performance limit. How much clocks have to be reduced is not easy to say without any profiling data. In multiplatform face-offs you will see that the PS5 in particular will struggle in CPU- and GPU-demanding game scenes, or scenes with tight frame-time budgets (also the most likely origin of the rumors Dusk Golem and others talked about → 60 fps struggles).

The last point is also the reason that optimizing a game for PS5 requires much more effort: you have to work around those demanding scenes for CPU and GPU, including the additional effort of not only finding bottlenecks in your own code but also finding out at what points in time the GPU or CPU has reduced performance limits.

As a result, the whole process is more complex and time-consuming, which means additional costs. Some of the workarounds will be doable in code, but a chunk of them will only be doable by changing the actual game scenes, meaning cutting features, visual quality, or content to get the game stable.

In comparison, on the XSX you still have to do profiling - to find bottlenecks in your code - but you don’t need to find the lowest performance of the CPU or GPU over time, as it always has the same upper limit.

Summary: The PS5 never will come even close to its theoretical peak performance because of those use cases:

  1. low power consumption ā†’ highest clocks are achieved while not all hw resources are in use
  2. high power consumption ā†’ reduced clocks when it counts the most because all hw resources are needed
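The two cases above can be sketched with a toy clock governor. To be clear, this is purely illustrative: the budget, the per-unit wattages, the activity fractions, and the 1% downclock steps are all invented, and Sony’s actual algorithm is not public. It just captures the idea that activity drives power and the clock acts as a (roughly cubic) amplifier on top:

```python
# Toy power-budget clock governor. All numbers are made up; this is a
# sketch of the *idea*, not the real PS5 balancing algorithm.

def power_draw(base_w, activity, freq_scale):
    """Power ~ activity * f * V^2; with voltage tracking frequency,
    that is roughly cubic in the clock multiplier."""
    return base_w * activity * freq_scale ** 3

BUDGET_W = 150.0  # shared SoC power budget (arbitrary units)

def balance(cpu_activity, gpu_activity):
    """Downclock in 1% steps until CPU + GPU fit the shared budget."""
    scale = 1.0
    while (power_draw(40.0, cpu_activity, scale)
           + power_draw(140.0, gpu_activity, scale)) > BUDGET_W:
        scale -= 0.01
    return scale

# Case 1 - light scene: few units busy, peak clocks are sustained.
light = balance(cpu_activity=0.4, gpu_activity=0.5)   # -> 1.0

# Case 2 - heavy scene: nearly all units busy, clocks drop below peak.
heavy = balance(cpu_activity=0.9, gpu_activity=0.95)  # -> ~0.96

print(light, heavy)
```

With these invented numbers the light scene runs at full clocks (the budget isn’t stressed), while the heavy scene settles around a 4% downclock, which mirrors the summary: highest clocks when underutilized, reduced clocks when every unit is busy.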
6 Likes

I think it was Jason Ronald who said they also explored the idea of variable clocks, but that in the end, it wasnā€™t worth it.

Good explanation. I also believe that things actually depend on the utilisation of the resources.

It could have been worth it for promotion.

The engineering team deserves an applause for creating such beautiful hardware.

2 Likes

When I saw the Cerny presentation, I got a little sick to my stomach. I felt the same way watching MS announce the Xbox One. There was a lot of obfuscation on both sides. Both were trying to hide the fact that they had the weaker console by doing a lot of hand-waving.

Edit: I’m just talking hardware. It’s still to be determined who will have the better first-party game library. But I think we can now safely say the XSX is the better hardware.

4 Likes

MUH SSD POWAH

This is exactly how I felt, and honestly I think it reads the situation correctly. Sony being cagey about PS5 hw doesn’t indicate some secret-sauce RDNA3 fanboy dream bs at all, but rather that the hw is inferior to that of the XSX. Sony’s approach is very similar to the one MS used on the Xbox One back in the day, specs-wise.

Yes, designing a console that runs insanely hot and uses loads of power for little performance gain, making the console massive and requiring massive cooling, just makes no sense. That’s the exact opposite of what a company would want when designing a console.

It’s quite funny, really: if tflops don’t matter so much, I wonder why Cerny pointed out the max theoretical 10.28 TFLOPS of the PS5? It would make more sense to give a range, given that the clock speed of the GPU is variable.

I do wonder how much the performance will differ from their original target of a fixed 2000 MHz GPU.

I expect a 15-20% performance difference between the PS5 and XSX.

Meaning, say, RDR2 at ultra settings, native 4K, and an unlocked framerate running on PS5 + XSX:

PS5 avg framerate: 40 fps; XSX avg framerate: 46-48 fps.

That’s not insignificant for a dev targeting 4K 30 fps: at the visual fidelity the dev wants, the PS5 could be at 4K 22-24 fps where the XSX is at 4K 30 fps.
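For what it’s worth, running the straight arithmetic on an assumed 15-20% gap (made-up illustrative numbers, not benchmark data): against a 40 fps baseline it gives the 46-48 fps range above, while against a fixed 30 fps target it lands closer to 25-26 fps than 22-24, unless other bottlenecks widen the gap:

```python
# Straight scaling of an assumed 15-20% performance gap.
# All figures are hypothetical, not real RDR2/console benchmarks.

ps5_base = 40.0  # assumed PS5 average fps, unlocked framerate
for gap in (0.15, 0.20):
    xsx = ps5_base * (1 + gap)
    print(f"{gap:.0%} gap: XSX ~{xsx:.0f} fps vs PS5 {ps5_base:.0f} fps")

xsx_target = 30.0  # assumed dev target of 4K 30 fps on XSX
for gap in (0.15, 0.20):
    ps5 = xsx_target / (1 + gap)
    print(f"{gap:.0%} gap: PS5 ~{ps5:.1f} fps vs XSX {xsx_target:.0f} fps")
```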

1 Like

Iā€™m not too worried about the PS5. I think their first party will use these variable clocks to their advantage such that it will act like a fixed 10.28TF machine, plus they will take advantage of their SSD. The result is going to be perfectly great looking games that will sell the console. That is their focus and the bulk of the PS5 fanbaseā€™s focus (at least the most vocal ones).