Microsoft is releasing DirectML APIs to the general public

It would upscale to whatever output res the dev targets using SS. If it goes too far ya get artifacts. I’d be very curious to see DirectML SS operating on top of VRS and DRS. Wonder if those would get in the way of each other.

No idea. I haven’t tried this.

If anything, VRS frees up compute for more AI work. Dedicated hardware is always faster than generalized hardware.

It may get in the way. But VRS is generally applied in regions of low detail, so maybe artifacts won’t be a problem. Also, with VRS a pixel is simply shaded with whatever its adjacent pixels have, which saves the shading cycle. The screen still has the same resolution everywhere, so there may not be any artifacts at all.
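To make that concrete, here’s a rough sketch of what coarse shading looks like from the D3D12 side (this assumes a Tier 1 VRS-capable device and an existing ID3D12GraphicsCommandList5 in your renderer; it’s illustrative, not taken from any shipping game):

```cpp
// Sketch only: assumes an existing D3D12 renderer with a Tier 1
// VRS-capable device and an ID3D12GraphicsCommandList5.
#include <d3d12.h>

// Shade one value per 2x2 block of pixels for a low-detail pass,
// then restore full-rate shading. The render target resolution is
// unchanged; only the shading frequency drops, which is why VRS
// doesn't read as a lower-resolution image.
void DrawLowDetailPass(ID3D12GraphicsCommandList5* cmd)
{
    // Combiners control how the base rate mixes with per-primitive
    // and screen-space rates; PASSTHROUGH keeps the base rate as-is.
    const D3D12_SHADING_RATE_COMBINER combiners[2] = {
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH,
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH
    };

    cmd->RSSetShadingRate(D3D12_SHADING_RATE_2X2, combiners);
    // ... issue draw calls for distant / low-frequency geometry ...

    cmd->RSSetShadingRate(D3D12_SHADING_RATE_1X1, combiners);
    // ... issue draw calls that need full-rate shading ...
}
```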

DRS is the trickiest one. I don’t think it will work together with SS. Then again, I believe SS is being developed to gain performance, so DRS might not be implemented at all.
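DRS on its own is just a feedback loop on GPU frame time, something like the toy controller below (all thresholds and step sizes are made up for illustration); the open question is whether the ML upscale runs before or after that loop picks the internal resolution.

```cpp
// Toy dynamic-resolution controller: nudge the internal render scale
// up or down based on how close the last GPU frame time was to budget.
// All numbers here are invented for illustration.
#include <algorithm>
#include <cstdio>

struct DrsController {
    double scale    = 1.0;   // fraction of the target output resolution
    double minScale = 0.6;   // never drop below ~60% of target
    double maxScale = 1.0;

    void update(double gpuFrameMs, double budgetMs) {
        if (gpuFrameMs > budgetMs * 0.95)       scale -= 0.05; // over budget: shrink
        else if (gpuFrameMs < budgetMs * 0.80)  scale += 0.05; // lots of headroom: grow
        scale = std::clamp(scale, minScale, maxScale);
    }
};

int main() {
    DrsController drs;
    const double budget = 16.6;                       // 60 fps budget in ms
    const double frames[] = {14.0, 17.5, 18.0, 15.0}; // fake GPU timings
    for (double ms : frames) {
        drs.update(ms, budget);
        std::printf("frame %.1f ms -> render scale %.2f\n", ms, drs.scale);
    }
}
```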

I suppose it does. They might end up in the same portion of the frame time, though.

Yeah, I’m wondering if SS solutions need to be trained for a specific target output or if they can work on any target below the super-high-res one they get trained on.

I see a lot of people still saying Sony, MS and AMD are all working on a DLSS-like solution, but in reality it is only MS that is working on a DLSS-like solution. The Series X|S are the only APUs with dedicated hardware for ML. AMD’s solution is confirmed to be a TAA-like solution.

That was how DLSS 1.0 worked, I believe. 2.0 is a more general solution that doesn’t need training for individual games; it just needs the game engine to support it. UE 4.26 just added support for DLSS 2.0, so now every game can have DLSS.

MS’s implementation will probably not be as powerful as DLSS 2.0, simply because Nvidia GPUs have way more dedicated hardware to do the work. The Series X|S will not have the same level of capability. Maybe the Series X will upscale 1440p to 4K, whereas on Nvidia you can upscale even from 1080p to 4K in performance mode.

Exploring DLSS 2.0 in Unreal Engine 4.26 – Tom Looman

DLSS doesn’t require TCs in order to get the job done. Those are there for applications beyond SS and for training models. The weights in SS models end up being integers once trained, and the XSX has more than enough compute to scale 1080p -> 4K using that. People have a misunderstanding that you need Nvidia’s specific tech to end up with DLSS… which isn’t remotely accurate.
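For anyone wondering what “the weights end up being integers” actually means: it’s post-training quantization, where each float weight gets mapped to an 8-bit integer plus a scale factor, and inference then runs on INT8 units. A minimal sketch of the idea (symmetric per-tensor quantization with made-up weights, not anything from an actual DLSS/DirectML model):

```cpp
// Toy symmetric post-training quantization: float weights -> int8 + scale.
// Real super-sampling models are quantized per layer/channel with
// calibration data; this just shows why the runtime only needs INT math.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

struct QuantizedTensor {
    std::vector<int8_t> q; // integer weights
    float scale;           // one float scale to map back to real values
};

QuantizedTensor quantize(const std::vector<float>& w) {
    float maxAbs = 0.f;
    for (float x : w) maxAbs = std::max(maxAbs, std::fabs(x));
    if (maxAbs == 0.f) maxAbs = 1.f;               // avoid divide-by-zero on all-zero weights
    QuantizedTensor out;
    out.scale = maxAbs / 127.f;                    // map [-maxAbs, maxAbs] -> [-127, 127]
    for (float x : w)
        out.q.push_back(static_cast<int8_t>(std::lround(x / out.scale)));
    return out;
}

int main() {
    std::vector<float> weights = {0.12f, -0.87f, 0.44f, -0.03f}; // made-up weights
    QuantizedTensor t = quantize(weights);
    for (size_t i = 0; i < t.q.size(); ++i)
        std::printf("%+.2f -> %d (dequant %+.3f)\n",
                    weights[i], t.q[i], t.q[i] * t.scale);
}
```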

Yeah, and right now even for training the tensor cores are severely underutilized on Nvidia GPUs.

If DLSS didn’t require tensor cores, then it would run on older GPUs like the 1000-series. I know TCs are used for AI training/research etc., but DLSS also needs them.

Once the network is trained, NGX delivers the AI model to your GeForce RTX PC or laptop via Game Ready Drivers and OTA updates. With Turing’s Tensor Cores delivering up to 110 teraflops of dedicated AI horsepower, the DLSS network can be run in real-time simultaneously with an intensive 3D game. This simply wasn’t possible before Turing and Tensor Cores.

I’ve explained this earlier in the thread. Short version: research in the past couple of years demonstrates that models trained for SS can have integer weights without any meaningful loss in quality. As such, to run the inference you just need INT8 and/or INT4 compute. For whatever reason, AMD’s cards did not include the required logic to actually use INT compute effectively, and MS had to request extra logic and tweaks to the shader cores to accomplish it.
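The “extra logic” being described boils down to packed dot-product instructions: one 32-bit lane chews through four INT8 multiply-accumulates per clock, which is where the roughly 4x-over-FP32 INT8 TOPS figures come from. Here’s a toy emulation of that operation in plain C++ (on real hardware it’s a single dp4a-style instruction per lane):

```cpp
// Toy emulation of a dp4a-style packed INT8 dot product:
// four signed 8-bit multiplies accumulated into a 32-bit integer.
// With the right shader-core logic this is one instruction per lane,
// which is why INT8 throughput can be ~4x the FP32 rate.
#include <cstdint>
#include <cstdio>

int32_t dp4a(uint32_t a, uint32_t b, int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        int8_t ai = static_cast<int8_t>((a >> (8 * i)) & 0xFF);
        int8_t bi = static_cast<int8_t>((b >> (8 * i)) & 0xFF);
        acc += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return acc;
}

int main() {
    // Pack four int8 values per 32-bit word: {1, -2, 3, 4} and {5, 6, -7, 8}.
    uint32_t a = (uint32_t)(uint8_t)1 | ((uint32_t)(uint8_t)-2 << 8) |
                 ((uint32_t)(uint8_t)3 << 16) | ((uint32_t)(uint8_t)4 << 24);
    uint32_t b = (uint32_t)(uint8_t)5 | ((uint32_t)(uint8_t)6 << 8) |
                 ((uint32_t)(uint8_t)-7 << 16) | ((uint32_t)(uint8_t)8 << 24);
    std::printf("dp4a = %d\n", dp4a(a, b, 0)); // 1*5 + (-2)*6 + 3*(-7) + 4*8 = 4
}
```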

This isn’t something we need to keep debating since it’s not speculation. MS’s engineers already explained all of this.

Tensor cores are significantly more efficient at it though.

I’m just glad MS thought ahead to add DirectML-capable features in the SoC, so when the software to enable it arrives, the hardware will be ready for it.

Says who? I see no indication of that in their TOPS figures at all. Those seem to scale with TFLOPS on the cards, iirc. They are going to be more performant since they are dedicated specifically to those computations, whereas on XSS/XSX that’s not the case, but it may or may not matter depending on what else is happening in that portion of the frame time.

EDIT: Take this with a grain of salt…

https://twitter.com/JohnDraisey/status/1356527635346518016

The XSX is capable of 48-ish INT8 TOPS and 98-ish INT4 TOPS. An RTX 2080 Ti is capable of 215 and 430. That’s a 13 TFLOP GPU. Though I know you can’t directly compare Nvidia and AMD.
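Those figures are basically just unit counts times clocks multiplied out, so they’re easy to sanity-check. Back-of-envelope version below, using the commonly cited public specs (so treat the exact decimals as approximate):

```cpp
// Back-of-envelope check of the quoted TOPS numbers.
// Unit counts and clocks are the commonly cited public specs.
#include <cstdio>

int main() {
    // Xbox Series X: 52 CUs, 64 FP32 lanes per CU, 2 ops per FMA, 1.825 GHz.
    double xsx_fp32 = 52 * 64 * 2 * 1.825e9 / 1e12;  // ~12.15 TFLOPS
    double xsx_int8 = xsx_fp32 * 4;                  // packed INT8: ~48.6 TOPS
    double xsx_int4 = xsx_fp32 * 8;                  // packed INT4: ~97.2 TOPS

    // RTX 2080 Ti: 544 tensor cores, 256 INT8 ops per core per clock, ~1.545 GHz boost.
    double tu102_int8 = 544 * 256 * 1.545e9 / 1e12;  // ~215 TOPS
    double tu102_int4 = tu102_int8 * 2;              // ~430 TOPS

    std::printf("XSX:     %.1f FP32 TFLOPS, %.0f INT8 TOPS, %.0f INT4 TOPS\n",
                xsx_fp32, xsx_int8, xsx_int4);
    std::printf("2080 Ti: %.0f INT8 TOPS, %.0f INT4 TOPS (tensor cores)\n",
                tu102_int8, tu102_int4);
}
```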

That said, I have NO idea what is required to run DLSS.

DirectML is the most interesting wild card for next-gen hardware. Not ready to make any proclamations… but I think it’s more than a grain of salt. Just need to see it executed.

The claim was about efficiency, not absolute compute figures. I don’t think the TCs are being stressed a whole lot by the inference.

That dev hasn’t used it themselves yet, but it’s good to see they’re optimistic.

This isn’t new per se, but I often feel like I’m the only person on the whole internet who has seen it and recognizes it, lol. Nobody seems to talk about it, so here is another confirmation of runtime texture upscaling at XGS:

James Gwertzman interview: Microsoft's Head of Cloud Gaming Talks Future of AI in Gaming, Experimental Tech

“We have a research project in our studios where they built a texture decompression and compression algorithm, where they trained a machine learning model up on textures. They can take a big texture, shrink it down till it’s really ugly, and then use the machine learning model to decompress it again in real time. It’s not exactly the same, but it looks really realistic. It’s almost creepy how it works.”

They have a patent for it too: US Patent Application for MACHINE LEARNING APPLIED TO TEXTURES COMPRESSION OR UPSCALING Patent Application (Application #20200105030 issued April 2, 2020) - Justia Patents Search
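The runtime side of that idea is conceptually simple: store or stream an aggressively downscaled texture, upsample it cheaply, then run a small learned filter to reconstruct detail. Here’s a heavily simplified sketch with a single hypothetical 3x3 kernel standing in for the trained network (the real MS research uses an actual ML model, not a hand-written filter):

```cpp
// Highly simplified sketch of ML texture "decompression":
// 1) keep only a low-res copy of the texture,
// 2) upsample it at runtime,
// 3) apply a small learned filter to reconstruct detail.
// The 3x3 kernel below is a hypothetical stand-in for a trained network.
#include <cstdio>
#include <vector>

using Tex = std::vector<std::vector<float>>; // grayscale texture, [y][x]

// Step 2: cheap 2x nearest-neighbor upsample of the stored low-res texture.
Tex upsample2x(const Tex& lo) {
    Tex hi(lo.size() * 2, std::vector<float>(lo[0].size() * 2));
    for (size_t y = 0; y < hi.size(); ++y)
        for (size_t x = 0; x < hi[0].size(); ++x)
            hi[y][x] = lo[y / 2][x / 2];
    return hi;
}

// Step 3: one 3x3 convolution standing in for the learned reconstruction model.
Tex applyLearnedFilter(const Tex& in) {
    // Hypothetical weights (a mild sharpening kernel), NOT real trained weights.
    const float k[3][3] = {{ 0.f,  -0.25f,  0.f  },
                           {-0.25f, 2.f,   -0.25f},
                           { 0.f,  -0.25f,  0.f  }};
    Tex out = in;
    for (size_t y = 1; y + 1 < in.size(); ++y)
        for (size_t x = 1; x + 1 < in[0].size(); ++x) {
            float s = 0.f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    s += k[dy + 1][dx + 1] * in[y + dy][x + dx];
            out[y][x] = s;
        }
    return out;
}

int main() {
    Tex lowRes = {{0.1f, 0.9f}, {0.9f, 0.1f}};       // tiny stand-in texture
    Tex result = applyLearnedFilter(upsample2x(lowRes));
    for (const auto& row : result) {
        for (float v : row) std::printf("%5.2f ", v);
        std::printf("\n");
    }
}
```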

I read this a while back.

It’s even more interesting than super resolution.
