I still don't understand how the PS5 cooling system works

I'm with you, and this is coming from a guy who owns two OLED Vitas, for the record. Between being down to one income (thanks, COVID) and generally being more excited about the specs and offerings of the ecosystem, I'm going with the XSX Day One and skipping the PS5 for the foreseeable future. I'm not a fan of Sony's business practices the last few years, and I'm genuinely not a fan of some of their "best titles" (I may have been oversold, but Horizon feels like an Ubisoft game). And since their exclusives are constantly touted as bangers, I know they're not going to change that primary genre focus, so it's clearly not for me.

I'm a tech nut, as I work with supercomputing and hardware for scientific research, so I'm always trying to capture a semblance of that at home, and the power behind the XSX is incredible considering its size and perceived cost. Knowing that I'll be getting the best versions of multi-plats, that MS is serious about investing in its existing and new studios, that Game Pass is inarguably the best value in gaming, and that when first parties start utilizing the features of the console we'll get some amazing experiences, all adds up to a no-brainer decision for me.

2 Likes

I was only speaking about the GPU because you can pretty much get a PS5-class GPU for PC by overclocking a 5700 XT. The gains from taking the card from 1750 MHz to 2150 MHz aren't worth pushing the clocks that high. I know RDNA 2 is different from RDNA 1, but it's really the only thing we can physically base the PS5 on. The Series X doesn't even have a GPU equivalent on the market yet, so we'll have to wait and see.
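
To put rough numbers on that (purely illustrative, these are not measured 5700 XT figures): going from 1750 MHz to 2150 MHz is only about a 23% clock bump, but you normally need a big voltage increase to get there, and dynamic power scales roughly with voltage squared times frequency, so the heat and power cost grows much faster than the clocks do.

```python
# Rough illustration only: the voltages are made up, not real 5700 XT values.
# Dynamic power scales roughly with V^2 * f, so the last few hundred MHz cost
# far more power (and heat) than they return in clock speed.

base_mhz, base_v = 1750, 1.00   # assumed stock clock and relative voltage
oc_mhz,   oc_v   = 2150, 1.20   # assumed overclock needing a ~20% voltage bump

clock_gain = oc_mhz / base_mhz                            # ~1.23x
power_gain = (oc_mhz / base_mhz) * (oc_v / base_v) ** 2   # ~1.77x

print(f"Clock gain: {clock_gain:.2f}x")
print(f"Power gain: {power_gain:.2f}x (rough estimate)")
```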

It's an assumption, but I think the PS5's cooling is linked to predicted heat generation, which can obviously be linked to power draw.

They may have designed the console's cooling around a particular power draw limit, which will generate a set amount of heat.

So as far as variable clocks are concerned, they're just a way to manage that power draw limit. That's where SmartShift comes into play.
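
As a rough sketch of the model I have in my head (all the wattages and clock steps below are made up; this is not how Sony actually implements it): the cooler is sized for a fixed power limit, the clocks are picked from the predicted power of the current workload, and SmartShift hands any unused CPU budget to the GPU.

```python
# Made-up sketch of a fixed power budget with activity-based clocks.
# None of these wattages or clock steps are real PS5 numbers.

TOTAL_BUDGET_W = 200   # assumed SoC power limit the cooler is designed around
CPU_MAX_W, GPU_MAX_W = 60, 150

def estimated_power(activity, max_power, freq, max_freq):
    """Crude model: power scales with workload activity and clock speed."""
    return activity * max_power * (freq / max_freq)

def pick_gpu_clock(cpu_activity, gpu_activity):
    cpu_power = estimated_power(cpu_activity, CPU_MAX_W, 3.5, 3.5)
    gpu_budget = TOTAL_BUDGET_W - cpu_power   # SmartShift: leftover goes to the GPU
    # Highest GPU clock whose predicted power still fits the remaining budget.
    for freq in (2.23, 2.2, 2.15, 2.1, 2.0):
        if estimated_power(gpu_activity, GPU_MAX_W, freq, 2.23) <= gpu_budget:
            return freq
    return 2.0

print(pick_gpu_clock(cpu_activity=0.5, gpu_activity=0.8))  # light load -> 2.23
print(pick_gpu_clock(cpu_activity=1.0, gpu_activity=1.0))  # worst case -> drops a bit
```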

I think there is a missing piece which Cerny didn't share with us, and it could be about how much of the APU is utilised every cycle.

For a theoretical maximum, all the transistors in the GPU can run at 2.23 GHz and all the transistors in the CPU can run at 3.5 GHz. Yet as per Cerny, running the system at a fixed 3 GHz CPU and 2 GHz GPU was impossible. This statement of his is what seems contradictory and makes everyone question how, by utilising SmartShift, the CPU and GPU can now run at 3.5 GHz and 2.23 GHz.

I hope someone can answer this briefly and enlighten us.

3 Likes

This is the best explanation up until now.

I really find the PS5's design praiseworthy as far as getting maximum performance out of the given hardware.

Although it does keep developers on their toes, and they might have to spend some time figuring it out.

I don't think so. The memory bandwidth doesn't change. And how are developers going to optimize for variable clocks? In modern software development nobody tweaks code at the assembly level; the compiler handles the optimization, especially because of how gigantic code bases are.

Besides, there’s no way of knowing what instructions are running in multiprogramming. Any core could be executing a random instruction at any given time.

What will happen is what happens on PC: game performance will vary. On PC the solution is to throw more powerful hardware at it until the problems are gone.

1 Like

Thank you for this. So the cooling system reacts to SoC activity instead of power draw. But I'm a little unsure why Sony's method is better; it's just the opposite way of doing things. At the end of the day, a system's max performance will be dictated by its max power draw.

Also, regarding the CUs, I don't know where you're getting the info that the PS5's method will cut the gap in half. The PS5 has roughly a 20% real-world clock advantage, but in the PC space higher clocks have never made up for more TFLOPs: a lower-clocked 5700 XT will still outperform a higher-clocked 5700.

Cerny said higher clocks and fewer CUs will result in a number of things performing faster, but he didn't demonstrate this at all; for all we know he could be talking about only a few percent more overall performance.
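
To put rough numbers on the 5700 XT vs 5700 point (hypothetical clocks, and raw TFLOPs obviously isn't the whole story): even a downclocked XT keeps a compute edge over an overclocked 5700, because FP32 throughput is CUs × 64 shaders × 2 ops per clock × frequency.

```python
# Hypothetical clocks chosen just to illustrate CUs vs clock speed; not benchmarks.
# FP32 TFLOPs = CUs * 64 shaders * 2 ops/clock * GHz / 1000
print(40 * 64 * 2 * 1.70 / 1000)  # 5700 XT downclocked to 1.70 GHz -> ~8.70 TF
print(36 * 64 * 2 * 1.85 / 1000)  # 5700 overclocked to 1.85 GHz    -> ~8.52 TF
```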

Am I right to assume that the PS5 version of a game will be the main development hardware, with it scaled up for the Series X? So if the Series S is 4-5 teraflops and the PS5 version runs at 4K, then the Series S should be able to do 1440p, and if the PS5 version runs at 1440p, then the Series S should be able to do 1080p. Am I right to assume this?

Yes, the Series S will be able to achieve half the resolution of the PS5 in a lot of games. It depends on how powerful the Series S GPU is; the closer to 5 TFLOPs it is, the more you'll see this happen.
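
For a sense of the raw pixel math (resolution alone, ignoring everything else that affects performance): 1440p is only about 44% of the pixels of 4K, and 1080p is about 56% of 1440p, so a console with roughly half the GPU power targeting one resolution step down is at least plausible on paper.

```python
# Pixel counts only; real-world scaling also depends on CPU, memory, and settings.
resolutions = {"4K": (3840, 2160), "1440p": (2560, 1440), "1080p": (1920, 1080)}
pixels = {name: w * h for name, (w, h) in resolutions.items()}

print(f"1440p / 4K   : {pixels['1440p'] / pixels['4K']:.0%}")     # ~44%
print(f"1080p / 1440p: {pixels['1080p'] / pixels['1440p']:.0%}")  # ~56%
```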

OK, thanks. I'm not much of a tech guy since my field is construction, but I do like reading up on tech and getting a basic understanding of it. It's just funny how many people on other forums seem to believe the Series S will struggle to even achieve 1080p.

If RDNA 2 behaves similarly to RDNA 1, then the performance difference could be higher than 20%.

RDNA 1 was shit at overclocking, and the GPU in the PS5 runs at very high frequencies for a console.

1 Like

That’s because they haven’t told us how the cooling system works…

I have a suspicion that it doesn’t work very well at all.

It’s not necessarily “better” overall. It could end up being a pain for devs, or running the super high clocks for what I called “easy” scenarios might be wasteful in terms of heat, noise, or reliability.

The variable clocks should generally give more performance for a given cooling budget, though.

I’m not saying the PS5 will only ever have a 20% performance gap. We can’t know that yet. In some cases it might be more due to memory bandwidth, poor gains from clock speed scaling, or whatever else. In other cases it might be less, due to the fact that its ROPs and other components are running at a higher clock speed than the XSX.

My only point in the original explanation was to compare it to a fixed clock system. If Sony had fixed their clocks at the level of XSX (1.825 GHz), they would have a 44% performance gap. Instead they are letting the clocks range higher when the workload isn’t as taxing, so in many (most?) scenarios the gap will be substantially narrower than 44%.
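
The rough compute math behind those figures, for anyone who wants to check it (TFLOPs only, ignoring bandwidth and everything else):

```python
# FP32 throughput = CUs * 64 shaders * 2 ops/clock * frequency (GHz) / 1000
def tflops(cus, ghz):
    return cus * 64 * 2 * ghz / 1000

xsx       = tflops(52, 1.825)   # ~12.15 TF
ps5_fixed = tflops(36, 1.825)   # ~8.41 TF, if Sony had fixed clocks at XSX levels
ps5_peak  = tflops(36, 2.23)    # ~10.28 TF at the variable-clock ceiling

print(f"Gap at a fixed 1.825 GHz: {xsx / ps5_fixed - 1:.0%}")  # ~44%
print(f"Gap at 2.23 GHz:          {xsx / ps5_peak - 1:.0%}")   # ~18%
```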

They’ve already stated that it’s fully deterministic. The same set of instructions will always result in the same changes to clock frequency. I’m not a deep expert, but I don’t see any reason to doubt this. They’ve been quite clear in public about it.

That doesn't explain how it affects development. Let's say four cores happen to be executing something expensive. There is no way for the developer to know when that will happen. Will the system throttle the GPU?

It sounds like extra complexity. Maybe that's why they were talking so much about "ease of development". If it were easy, it would be obvious; instead they're trying to convince people of it.

Also keep in mind that consoles get worse at dealing with heat as they get older and dustier. My Xbox's fan is actually audible now on more taxing games, and it didn't use to be. You could literally have the system become less powerful over time as it gets older.

1 Like

Not sure I get what you're trying to say. The developer will know when that is happening as they develop the game and profile its performance. For example, if they see the framerate tank because the clock speed has dropped, they can figure out what kind of code is causing the drop and fix or change it.

I agree, it may represent extra complexity for developers. We will see how things go once games start shipping, I’m certainly curious for that and for a teardown of the cooling system. I hope there will be some way for sites like DF to observe changes in the GPU/CPU clocks in real time.
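
Something like this is all I mean by profiling, at a very crude level (a generic sketch, not any console SDK): time the big systems each frame, and when a frame blows its budget, look at what it was doing.

```python
import time
from contextlib import contextmanager

FRAME_BUDGET_MS = 16.7   # 60 fps target
_timings = {}

@contextmanager
def profile(section):
    """Accumulate how long each named section of the frame takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        _timings[section] = _timings.get(section, 0.0) + (time.perf_counter() - start) * 1000

def end_frame(frame_number):
    """Flag frames that went over budget and name the most expensive section."""
    total = sum(_timings.values())
    if total > FRAME_BUDGET_MS:
        worst = max(_timings, key=_timings.get)
        print(f"frame {frame_number}: {total:.1f} ms, worst section: {worst}")
    _timings.clear()

# Hypothetical usage inside a game loop:
# with profile("physics"): run_physics()
# with profile("render"):  submit_draw_calls()
# end_frame(frame_number)
```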

Current programming technology relies on side effects, i.e., it's not deterministic, meaning it is impossible to predict the result given some input. Now extrapolate that to a complex system such as a game, where many subsystems will be running in parallel and may take more or fewer resources.

I don't buy that it is simple. The system is dynamically allocating resources; the only way to ensure that the framerate won't drop is to develop with room to spare, so that even if the GPU throttles down, it won't affect the framerate.

Maybe this is irrelevant, because the extra performance due to overclock is minimal. So even if the GPU throttles down, it changes little.

Wait, what? Given identical game states and a fixed player input, you’re arguing that different code would execute each time? Based on what? That doesn’t make sense (how would you play a game where different things could happen given the same input?), and it doesn’t match my own development experience at all. If I press “shoot” on a controller, the same polling for a button press takes place, the same “shoot_weapon()” function gets invoked after the button press is detected, and the same code path gets executed every time. It’s not like there’s some RNG going on to determine what code gets executed (unless I put it there).
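
A trivial example of what I mean (toy code, obviously): with the same input and the same game state, the same branches run every time. The hardware isn't rolling dice about which instructions to execute.

```python
# Toy example: identical input + identical state => identical code path.
def update_player(state, shoot_pressed):
    if shoot_pressed and state["ammo"] > 0:   # same branch taken on every run
        return "shoot_weapon()"
    return "idle"

for _ in range(3):
    print(update_player({"ammo": 3}, shoot_pressed=True))  # "shoot_weapon()" every time
```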

Here’s what Cerny said in the GDC talk:

“So, rather than look at the actual temperature of the silicon die, we look at the activities that the GPU and CPU are performing and set the frequencies on that basis, which makes everything deterministic and repeatable.”

1 Like

I see, but I still think TFLOPs and memory bandwidth are the best metrics for GPU performance when comparing the same architecture; over the decades in the PC GPU space this has always been the case. I think the XSX will perform like a 12 TF machine with 560 GB/s max RAM bandwidth and the PS5 will perform like a 10 TF machine with 448 GB/s max RAM bandwidth. I remember in the hardware speculation threads over at Reset everyone said that more CUs is better because, the way the GPU industry is going, more CUs will scale better. This narrative of higher clock speed closing the gap only appeared after the PS5 Cerny talk. I know you're not saying this, but I think it's a funny thing.

This is CompSci 101. Programming languages commonly used for game programming are imperative, meaning they rely on side effects. When you call "shoot()", there's no way to guarantee what is going to happen, because the internal state of the object can be anything, so the result will differ each time based on that state.

There’s no way to determine beforehand what the result and state will be, thus it is not deterministic.

This is opposed to mathematics for example, or functional programming, where functions always return the same result for the same inputs because there’s no state to worry about (in functional programming objects are immutable).
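
Concretely, this is the distinction I'm making (toy example): the result of the method call depends on hidden mutable state, while the pure function depends only on its arguments.

```python
# Imperative / stateful: the same call can give different results,
# because the result depends on hidden mutable state.
class Weapon:
    def __init__(self):
        self.ammo = 1

    def shoot(self):
        if self.ammo > 0:
            self.ammo -= 1        # side effect: mutates internal state
            return "bang"
        return "click"

w = Weapon()
print(w.shoot())  # "bang"
print(w.shoot())  # "click" -- same call, different result

# Functional style: same inputs always give the same output;
# the new state is returned instead of being mutated in place.
def shoot(ammo):
    return ("bang", ammo - 1) if ammo > 0 else ("click", ammo)

print(shoot(1))  # always ("bang", 0)
print(shoot(1))  # always ("bang", 0)
```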

The whole talk about being deterministic makes no sense. Either Cerny oversimplified a concept for laymen to understand and left a big chunk of the explanation out, or it’s just talk.