> Well, hopefully not the same results, because DLSS 2.0 has big problems with certain motion vectors. That trail coming off the cryptobiotes in Death Stranding looks bad.

If they can achieve the same results as DLSS 2.0, this is gigantic news.

> Even with some flaws, it still is far better than any checkerboard technique from Sony.

Well, hopefully not the same results, because DLSS 2.0 has big problems with certain motion vectors. That trail coming off the cryptobiotes in Death Stranding looks bad.
Anyway, Sony were always going to improve this, and if it means we get close to 4K while using less power, this is great news. Native 4K is such a waste.
Is it wrong that I totally imagined you saying this while looking and posing EXACTLY like your avatar?

If this is really true, it's good news for AMD. If they collaborated with Sony to do it (i.e. did it and, equally important, funded it), then I'd be surprised if AMD didn't implement it in their own GPUs. Considering the RDNA2 stuff already sounds pretty good based on rumors, RDNA3 with a variant of DLSS would be huge news and would (hopefully) bring down the price of GPUs for consumers.
In other words, we should all be crossing our fingers and toes that this really happens, even those of us not interested in the PS5.
> Even with some flaws, it still is far better than any checkerboard technique from Sony.

Well, yeah, the state-of-the-art DLSS 2.0 is of course better than Sony's last-gen checkerboard technique. Compare it to their PS5 offering instead, which I hope to God doesn't have those hideous vector bugs. It would ruin a second DS playthrough to see those god-awful lines coming off each cryptobiote, and you don't have to zoom in to notice them either.
If the 2060 can't take full advantage of its tensor cores because it bottlenecks somewhere else, then the same number of tensor cores might match what the 2070 needs, so no increase would be necessary.
I understand your speculation, but believe me, there is no way the CPUs in consoles can match the performance of tensor cores at the tasks the tensor cores are specialized for.
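Some back-of-the-envelope arithmetic shows why. This is a rough sketch with my own ballpark numbers (not figures from this thread), comparing a Zen-2-class console CPU's theoretical FP32 peak against the FP16 throughput Nvidia quotes for RTX-2060-class tensor cores:

```python
# Rough sketch (my own ballpark numbers, not from this thread) of why a
# console CPU can't stand in for tensor cores on the dense FP16 matrix
# math that DLSS-style inference consists of.

def cpu_peak_tflops(cores, ghz, flops_per_cycle):
    """Theoretical peak throughput: cores * clock * FLOPs issued per cycle."""
    return cores * ghz * flops_per_cycle / 1000.0

# Zen-2-like console CPU: 8 cores, ~3.5 GHz, 32 FP32 FLOPs/cycle
# (two 256-bit AVX2 FMA pipes) -- an optimistic upper bound.
cpu_tflops = cpu_peak_tflops(8, 3.5, 32)   # ~0.9 TFLOPS

# RTX-2060-class tensor cores: roughly 52 TFLOPS of FP16 throughput
# (ballpark of Nvidia's quoted figure).
tensor_tflops = 52.0

print(f"tensor cores: ~{tensor_tflops / cpu_tflops:.0f}x the CPU's theoretical peak")
```

Even granting the CPU its unreachable theoretical peak, the gap is well over an order of magnitude, before the CPU has done any actual game work.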
The first step towards DLSS 2.0 was the release of Control. This game doesn’t use the "final" version of the new DLSS, but what Nvidia calls an “approximation” of the work-in-progress AI network. This approximation was worked into an image processing algorithm that ran on the standard shader cores, rather than Nvidia’s special tensor cores, but attempted to provide a DLSS-like experience.
Control used a version of DLSS nicknamed "DLSS 1.9" by reviewers. This actually ran on the compute shaders, and it was the best implementation at the time, up until DLSS 2.0 released.
Did you guys miss the PS show, or forget it already happened? All those patents were cool to discuss and speculate about before the console and games were revealed, like the supposed external-SSD one, but all of that has pretty much been debunked since we saw the games running on the PS5, and none of the patents people found are visible in them. So this will be applicable either in the future (PS6) or to something other than gaming consoles, like digital cameras or smartphones.
You know better than to judge a new console based on launch games.
We've been waiting to reexamine Nvidia's Deep Learning Super Sampling (DLSS) for a long time and after a thorough new investigation we're glad to report that DLSS... (www.techspot.com)
DLSS 1.9 ran in compute shaders on the CUDA cores without using the tensor cores, and it could run even on a 2060 while the game was running. A similar algorithm should be able to run in compute shaders on the PS5 or Xbox Series X.
The question is whether Nvidia's higher-quality DLSS 2.0 has significantly higher performance requirements, or whether its requirements are similar to DLSS 1.9's.
> so i'm a rank dilettante...

Nope, checkerboarding and DLSS are technically completely different methods of achieving the same thing.
DLSS invents new pixels based on prior training, so it knows what's missing and fills it in. Checkerboarding doesn't invent new pixels; when it has low confidence, it interpolates from nearby pixels.
You can't compare the two methods directly in terms of hardware usage. DLSS is simply more advanced tech, specific to Nvidia cards, that uses the tensor cores. The game engine changes things too. So it all depends on how the dev uses it on the type of hardware.

So I'm a rank dilettante... how is this reducing how much work the PC does? Because, presumably, something in the PC, which relies on the PC's fund of RAM or whatever, is doing some kind of thinking to "fill in the gaps". Like, the computer stops doing 100% of the rendering because something else does 20% of it, or whatever, but that 20% still takes power, right? So how come it doesn't just stay at 100%? I guess I'm asking: what's more efficient about an AI filling in the gaps than the gaps just being filled? Is it a post-processing thing? But still, doesn't that take memory?

EDIT: I'm picking on you because I like your avatar
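To make the checkerboard side of that comparison concrete, here is a minimal sketch of neighbour interpolation. This is my own illustration, not Sony's actual algorithm: the console only renders half the pixels (the "checkerboard"), and the missing half is reconstructed by cheap averaging rather than full shading, which is why it costs far less than rendering every pixel:

```python
# Minimal sketch (my own illustration, not Sony's actual algorithm) of
# checkerboard reconstruction: a missing pixel is interpolated from its
# rendered neighbours, instead of being shaded by the full pipeline.

def checkerboard_fill(frame):
    """Fill pixels at (row + col) odd by averaging their rendered neighbours.

    `frame` is a 2D list of floats; pixels with (row + col) odd are
    treated as unrendered (their stored value is ignored).
    """
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(h):
        for x in range(w):
            if (x + y) % 2 == 0:
                continue  # this pixel was actually rendered, keep it
            neighbours = [
                frame[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
            ]
            # Low-confidence pixel: interpolate, don't invent detail.
            out[y][x] = sum(neighbours) / len(neighbours)
    return out
```

An average over four neighbours is a handful of adds per pixel, versus running the full shading pipeline; that's where the saving comes from. A DLSS-style network replaces that average with a learned prediction, which is far more expensive per pixel but can restore detail that interpolation smears away.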
The 2080 Ti's tensor cores are likely not fully utilized by DLSS, as there might be a limit to how much the image processing from 1080p to 4K can be parallelized. At least the 2060 can do 1080p to 4K, if I'm not mistaken.
The question is whether there is ample performance left for developers to use the tensor cores in games for things besides DLSS, or whether DLSS takes up most of them, or at least a significant portion.
I think I'd heard that Control had DLSS 1.9, which could run on shaders without tensor cores, in-game, and even boost performance.
If true, the question is what the difference in required performance is between quality mode and DLSS 1.9. What is being done? Two passes? Something else?
well, sorry man. This wasn't meant to be dismissive of Sony. Everyone has the same problem.
Unless Nvidia wants to start handing out models, everyone is stuck unless they want to build their own, which can be very expensive.
To put things into perspective: Google, Amazon, and MS run the largest cloud platforms for AI processing. None of them have a DLSS-like model. Facebook is trying, but has something inferior to Nvidia's as I understand it. Even running on RTX AI hardware, it's orders of magnitude away from DLSS performance.
MS can tout ML capabilities on the console, but with no model, it's pointless. The technology for AI is in the model; the hardware to run it is the trivial part.
Further explanation on this front: a trained model consists of the data, the processing, and the network. Even if you have the neural network to train with, and let's say it was open source, you still need data, and then you need processing. Power.
To put things into perspective, BERT is a transformer network whose job is natural language processing. It can read sentences and understand context, as it reads both forwards and backwards. The BERT network is open source; the data is not. The data source is Wikipedia (the whole of Wikipedia is read into BERT for training), but you'd still have to process the data ahead of time before it could be used for training. Assuming you had a setup capable of training on that much data, you then get to the compute part of the equation. Simply put, only a handful of companies in this world can train a proper BERT model. So while there are all sorts of white papers on BERT, small teams can't verify the results or keep up, because the compute requirements are so high.
For a single training:
How long does it take to pre-train BERT?
BERT-base was trained on 4 cloud TPUs for 4 days and BERT-large was trained on 16 TPUs for 4 days. There is a recent paper that talks about bringing down BERT pre-training time – Large Batch Optimization for Deep Learning: Training BERT in 76 minutes.
If you make any change to it, any change to the network or the data set, that's another 4 days of training before you can see the result. Iteration time is very slow without more horsepower on these complex networks.
Google BERT — estimated total training cost: US$6,912
Released last year by Google Research, BERT is a bidirectional transformer model that redefined the state of the art for 11 natural language processing tasks.
From the Google research paper: “training of BERT – Large was performed on 16 Cloud TPUs (64 TPU chips total). Each pretraining took 4 days to complete.” Assuming the training device was Cloud TPU v2, the total price of one-time pretraining should be 16 (devices) * 4 (days) * 24 (hours) * 4.5 (US$ per hour) = US$6,912. Google suggests researchers with tight budgets could pretrain a smaller BERT-Base model on a single preemptible Cloud TPU v2, which takes about two weeks with a cost of about US$500.
What may surprise many is the staggering cost of training an XLNet model. A recent tweet from Elliot Turner — the serial entrepreneur and AI expert who is now the CEO and Co-Founder of Hologram AI — has prompted heated discussion on social media. Turner wrote “it costs $245,000 to train the XLNet model (the one that’s beating BERT on NLP tasks).” His calculation is based on a resource breakdown provided in the paper: “We train XLNet-Large on 512 TPU v3 chips for 500K steps with an Adam optimizer, linear learning rate decay and a batch size of 2048, which takes about 2.5 days.”
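The arithmetic behind the BERT-Large estimate quoted above is easy to reproduce, assuming the same Cloud TPU v2 rate of US$4.5 per device-hour:

```python
# Reproducing the quoted BERT-Large estimate:
# devices * days * 24 hours * US$ per device-hour.

def tpu_training_cost(devices, days, usd_per_device_hour):
    """One-time pre-training cost for a fixed-size cloud TPU pod."""
    return devices * days * 24 * usd_per_device_hour

print(tpu_training_cost(16, 4, 4.5))  # 6912.0 -- matches the article's figure
```

And remember that this is per run: every architecture or data-set tweak multiplies this number again.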
None of these costs account for the R&D involved, i.e. how many times they had to run training just to get the result they wanted, or the labour and education required of the researchers. The above is only the cost of running the hardware.
Nvidia has been in this business since the beginning, sucking up a ton of AI research talent. They have the hardware, the resources, and the subject-matter expertise from a long legacy of graphics to make it happen. It's understandable how they were able to create the models for DLSS.
I frankly can't see anyone else being able to pull this off. Not nearly as effectively. At least not anytime soon.
Well, I hope not. That would have a huge impact on main memory bandwidth. Faster delivery is a good thing, but the GPU still needs most of the memory bandwidth to read and write things. If you write 15-20 GB into memory, you still have to read it back, which makes for 30-40 GB of "lost" memory bandwidth. That's fine during loading screens, when the GPU has nothing to do, but not so fine during live action.
All in all, memory bandwidth is one area where the next-gen consoles underdeliver a bit. Not only are the CPU and GPU much stronger (and in need of more memory bandwidth: faster clocks, new features like ray tracing, ...), we now also have a fast SSD for streaming that "steals" bandwidth this way as well.
Post is from Allandor:

> Basically this means the faster the storage, the more the maximum bandwidth left for components like the GPU drops. This is something that affects pretty much all computers, including upcoming systems like the PS5 and the Series devices.

Interesting, but we are at 9 GB/s in carebear scenarios, and IMO probably 1/8 (or even 1/10) of that in streaming ones, so it will not be a relevant problem for this gen.
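A simplified model of the effect being discussed (my own framing, with PS5-like numbers): every byte streamed in from the SSD is written to RAM once and later read back once, so it costs roughly twice its size in memory bandwidth.

```python
# Simplified model (my own framing): streamed data is written to RAM
# once and read back once, so it costs ~2x its size in bandwidth.

def effective_bandwidth_gbps(total_gbps, ssd_stream_gbps):
    """Memory bandwidth left over for the CPU/GPU while streaming."""
    return total_gbps - 2 * ssd_stream_gbps

# PS5-like numbers: 448 GB/s of GDDR6, 5.5 GB/s of raw SSD streaming.
print(effective_bandwidth_gbps(448, 5.5))  # 437.0
```

Even at the SSD's full raw rate, that's only a few percent of the total, which supports the "not a relevant problem for this gen" reading; it would matter more if sustained streaming rates climbed toward the compressed peak.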
dlss on shader ? source ? and they won't release this for older gpu and competition i guess..
Feb 26, 2020 - We'll be focusing primarily on Youngblood as it's a major release. Nvidia tells us that DLSS 2.0 is the version that will be used in all DLSS-enabled games going forward; the shader core version, DLSS 1.9, was a one-off and will only be used for Control.
You people really need to finally realize that the days of exotic, hard-to-master architectures are long gone. Consoles are PCs now, with all their power easily accessible to devs from day one. That's what the people who made these consoles say: that this was exactly the primary goal when designing them.