
AMD Exclusive Interview by RedGamingTech. VERY interesting - PS4/X1/PC Relevance

Vizzeh

Banned
Redgamingtech have been building momentum over the last couple of years. I've been following them for a while and find them very insightful, particularly on technical analysis, so it's great seeing them do well and getting an interview of this size. Not to mention, competition for Digital Foundry has got to be a good thing for both tech and gaming (yes, it's in their name lol).

There are some generic questions that touch on GCN cores, ROPs and bandwidth in general, which connect to the PS4/X1, and others that address the consoles directly. PC enthusiasts should largely find it interesting too.

The interview is fairly large so I won't quote everything. In fact I will likely not get all the saucy stuff, so I recommend checking out here for anything I miss: http://www.redgamingtech.com/exclus...n-architecture-performance-the-future-part-1/ (note there is a page 2).

He states at the end of the interview that a further, more detailed follow-up interview covering ROPs, bandwidth and compute is coming in the next few days.



Interview with Robert Hallock: Technical Communications, Desktop Gaming & Graphics at AMD. Interviewer: Paul Eccleston (CrimsonRayne), Redgamingtech.

Question: Audio technology on PCs has been fairly stagnant for a number of years, and requires high CPU overhead for processing. Does AMD’s TrueAudio processor take care of all of the processing work with audio, or is there some processing left for the CPU to perform on the audio?

Robert Hallock: AMD TrueAudio fully offloads the CPU if the developer maximally utilizes its capabilities. It’s a programmable audio pipeline, so it will only shoulder as much of the burden as it’s told to. Regardless, AMD TrueAudio sends the voice(s) it’s processing and rendering to the user’s existing audio hardware as a passthrough to the endpoint device (e.g. headphones).

[Image: AMD TrueAudio CPU processing budget, PS4 and PC]


Question: CPUs over the past 3 years or so have only increased marginally in performance compared to huge strides in GPU technology. AMD have drastically beefed up the compute potential of your latest GPU cores, for example Hawaii (R9 290X). With both the PS4 and Xbox One likely to offload a lot of work for compute to the GPU, how do you think compute will affect the future of games development on the PC and console?

Robert Hallock: You’re right, PC gaming performance has come a long way in a short while. For example, the ATI Radeon™ HD 5870 was a flagship product just three years ago with the price to match. Today, roughly equivalent performance can be found starting at $139 in the AMD Radeon™ R9 260X. I digress, but I really wanted to put a fine point on how hard we’ve been working to bring significantly better performance to every gamer year after year.
To your point, compute is more important now than ever. Consider the following effects used by game devs: high definition ambient occlusion (HDAO), global illumination, FXAA, MLAA, TressFX Hair, diffusion depth of field, and contact hardening shadows. What do they have in common? They’re all GPU compute-driven effects. That’s just a small selection of the effects you can accelerate with compute resources rather than the traditional graphics pipeline. Where possible, using GPU compute is a rather efficient way of rendering!

Question: Regarding TressFX, recently you’ve updated it to simulate more than just hair, for example grass and fur (which is very impressive tech might I add), and reduced the overhead associated with running it. Can you tell us if you’re planning to use a version of hardware Physics for smoke and debris?

Robert Hallock: I cannot speculate on unannounced technologies.

Question: AMD seem to be in a great position right now for unifying games development, with your CPUs, APUs and GPUs being used in PCs and next-gen consoles, developers enjoying low-level coding with Mantle, and other technologies such as TrueAudio and TressFX. Can you speak a little about your vision of games development, both on PC and consoles?

Robert Hallock: We’re tremendously proud of the continuum we’ve built in the gaming ecosystem. Game developers have already begun to leverage the commonalities, such as Crystal Dynamics’ recent decision to bring TressFX Hair to life for all platforms with the Tomb Raider Definitive Edition. We hope dividends will continue to be paid in this fashion for years to come for all platforms that we address.

Question: With the next generation consoles featuring 8 AMD Jaguar CPU cores, do you feel that more games on PC will benefit from an increased number of CPU cores as games developers change their engines to better use multi-core environments?

Robert Hallock: I can’t speak for the future of the console business, but I can talk about what AMD is doing. Mantle is a powerful way to improve the robustness of a game’s multi-threading capabilities. Multiple game developers, notably Johan Andersson of DICE (in his keynote address at APU13), have been quite complimentary to our API and its scalability across cores. The whole talk is very enlightening on this topic, but the heart of Mantle is in addressing aspects of the software layer (like threading) that have lagged behind the capabilities of our hardware.

Question: Currently, most PCs use DDR3 RAM for main system RAM, with the GPU using GDDR5, although DDR4 will slowly start becoming the norm over the next few years. In situations where you have a discrete card like the Radeon R9 290X taking care of the graphics processing, is a CPU using DDR3 RAM on a PC bandwidth-limited when processing highly complex game engines, due to the number of different tasks and data being accessed?

Robert Hallock: DDR3 is more than sufficient for today’s graphics landscape. Linus over at LinusTechTips did a very informative video on this topic, actually. The final verdict: capacities being equal, anything between DDR3-1333 and DDR3-2400 had negligible impact on GPU performance.
In fact, excellent gaming performance is all about keeping resources local to the GPU and its framebuffer. Farming texture fetches (for example) out to system RAM incurs a very substantial performance penalty. More important to overall GPU performance is the potential of the CPU to feed the beast—the graphics subsystem. As resolutions and fill rates increase, an increasingly powerful CPU is required to capably feed the GPU. If the CPU is not up to the task, then performance tapers off. Most users call this “bottlenecked” or “CPU limited” performance.

Question: With the release of the Radeon R9 290X, bandwidth was increased by changing to a 512-bit memory bus from 384-bit and also doubling the ROP count of the GPU; ratio-wise, that’s higher than the increase in stream processors. Do you find that the bottleneck of GPUs is becoming more about ROPs and local memory bandwidth (on the discrete GPU), or is everything fairly balanced? Additionally, what’s the ideal ROP, texture unit and bandwidth ratio versus, say, 8 GCN cores (or 512 stream processors)?

Robert Hallock: GPU design is analogous to the game engine design question you previously posed. Everything does have to be balanced. You could throw fistfuls of render backends at a GPU, but if your memory bandwidth is insufficient, then that hardware is wasted. And vice versa, of course.
I think it can best be explained by working backwards, asking yourself: “What performance and resolution target do I want to hit?” Then you build a core out on paper that, by your mathematical models, would yield performance roughly equivalent to your target. Then you build it!
For 512 stream processors, then, I would say: 32 texture units, 16 ROPs and a 128-bit bus.
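
(OP note: here's that ratio worked through as a naive, purely linear extrapolation in Python. The 768-shader Xbox One figure is the commonly cited spec, and the straight-line scaling is my own assumption, not something AMD stated.)

Code:
# Naive linear extrapolation of the quoted baseline:
# 512 stream processors -> 32 texture units, 16 ROPs, 128-bit bus.
BASE_SP, BASE_TMU, BASE_ROP, BASE_BUS = 512, 32, 16, 128

def scale(stream_processors):
    # Assumption: everything scales in a straight line with shader count.
    f = stream_processors / BASE_SP
    return {"texture_units": BASE_TMU * f, "rops": BASE_ROP * f, "bus_bits": BASE_BUS * f}

# Xbox One's commonly cited GPU config is 768 stream processors with 16 ROPs.
print(scale(768))  # {'texture_units': 48.0, 'rops': 24.0, 'bus_bits': 192.0}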


Question: Keeping on the subject of compute for a moment longer, how much impact does running compute commands on the GCN 1.1 architecture have on graphics processing performance? Do you often find that a compute command must ‘wait’ to be processed because the GPU is busy processing the scene, or that graphics performance suffers? Or have improvements to the compute architecture in GCN 1.1 helped eliminate these issues?


Robert Hallock: It depends entirely on the effect. With respect to gaming, compute and graphics pipeline resources are interdependent. Being too aggressive with either category of rendering will ultimately compromise overall performance, but that is true of any GPU. Talented game developers will understand a GPU’s fundamentals and design an engine that takes a balanced approach, not just from a total resource perspective, but when to use compute versus pipeline as well.
With respect to products like the AMD Radeon™ R9 290 or R9 290X, we refined the basic GCN architecture in a couple of key ways: accommodating higher pixel rates for UHD content, off-chip buffering improvements to enhance tessellation performance, more robust data storage in the geometry processors to improve geometry shader performance, a smaller and more efficient memory controller, the addition of our “XDMA” technology, support for up to four primitives per clock, and of course we were able to scale Graphics Core Next out to 2816 total shader units. Overall, though, this is the basic GCN architecture we know and love, but with an extra ounce of love to make it a meaner and more capable engine for multi-purpose work. (Note: GCN 1.1 compute is similar to the PS4's.)

Question: What were the factors prompting AMD to create the Mantle API? Was it simply a case of helping developers solve the major weaknesses of PC gaming architecture (the overhead) in preparation for the next generation of consoles and game engines?

Robert Hallock: Game developers did the rounds in the industry, asking all of the hardware vendors with a stake in graphics for a solution to make PCs more “console-like” with respect to hardware utilization efficiency and programming simplicity. They recognized that PC gaming could learn a lot from its siblings in the living room. Only AMD took these requests from the negotiation stage to the manpower and money phase, and Mantle was born!

(Again, there are more questions on their website that I have not quoted.)

I've linked Redgamingtech's YouTube channel, in case you prefer to listen to the interview breakdown in audio:
http://www.youtube.com/watch?v=NSGsIZwonok
 

Vizzeh

Banned
Personally, I'd specifically like to know what this means for TrueAudio costing no additional resources on a GPU with a built-in audio chip (the PS4, obviously). Can we draw any further comparisons between TrueAudio and SHAPE?

Also, extrapolating from the information we know about the PS4 and X1, and the information given by AMD in the following exchange:

Question: Do you find that the bottleneck of GPUs is becoming more about ROPs and local memory bandwidth (on the discrete GPU), or is everything fairly balanced? Additionally, what’s the ideal ROP, texture unit and bandwidth ratio versus, say, 8 GCN cores (or 512 stream processors)?
AMD: "For 512 stream processors, then, I would say: 32 texture units, 16 ROPs and a 128-bit bus"
(64 stream processors per GCN core/CU x 8 = 512)

If you need 16 ROPs per 512 stream processors, and the X1 has 768, does this mean the X1 runs into the ROP performance / bandwidth issues implied by the specs AMD point out??

Using AMD's math on the needed bandwidth:

AMD: In fact, excellent gaming performance is all about keeping resources local to the GPU and its framebuffer. Farming texture fetches (for example) out to system RAM incurs a very substantial performance penalty.

What does this tell us about the DDR3 in X1?

The R9 290X on a 512-bit bus has 320 GB/s bandwidth,
so if you do a little math...
http://www.techpowerup.com/reviews/AMD/R9_290/

512-bit / 4 = 128-bit, so 320 GB/s / 4 = 80 GB/s

The X1 is 68 GB/s peak + ESRAM, so to reach that 80 GB/s the X1 constantly has to swap its assets into ESRAM.

Hence... the potential difficulty of the programming, and the 1080p struggles?
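
Here's the same back-of-envelope arithmetic laid out in Python so people can check it. Note the assumption that bandwidth scales linearly with bus width is mine (from AMD's quoted ratio), not an AMD figure:

Code:
# The thread's back-of-envelope: scale the R9 290X's bandwidth down to the
# 128-bit bus AMD paired with 512 stream processors.
# Assumption: bandwidth scales linearly with bus width (same memory clocks).
R9_290X_BUS_BITS = 512
R9_290X_BANDWIDTH = 320.0          # GB/s

implied = R9_290X_BANDWIDTH * 128 / R9_290X_BUS_BITS
print(implied)                     # 80.0 GB/s

XBOX_ONE_DDR3 = 68.0               # GB/s peak to main memory
print(implied - XBOX_ONE_DDR3)     # 12.0 GB/s that would have to come from ESRAM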

There is a wealth of information in that interview and some of it is a little over my head. Over to you, techGAF: what do we learn from this??
 

TheD

The Detective
The first question is bullshit, game audio processing does not have a high CPU hit.
Kind of sounds like AMD is feeding the interviewer questions.

The second answer is also wrong, FXAA and most other Post AA are not compute shaders, just pixel shaders (if it was a compute shader it would not work on DX9 games!).
 

sp3000

Member
Audio processing has almost a zero CPU hit. The real reason audio has been completely stuck in 2002 is because there hasn't been a single company that has done better audio than Aureal's A3D raytracing solution.
 
The first question is bullshit, game audio processing does not have a high CPU hit.
Kind of sounds like AMD is feeding the interviewer questions.

The second answer is also wrong, FXAA and most other Post AA are not compute shaders, just pixel shaders (if it was a compute shader it would not work on DX9 games!).

There are compute versions of those I think (even used in some games). Am I mistaken?
 

Vizzeh

Banned
The second answer is also wrong, FXAA and most other Post AA are not compute shaders, just pixel shaders (if it was a compute shader it would not work on DX9 games!).

https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/sdk/11/SSAO11.pdf
http://graphics.pixar.com/library/DepthOfField/paper.pdf

According to these articles, these are compute:

"Horizon-Based Ambient Occlusion using Compute Shaders"

The first question is bullshit, game audio processing does not have a high CPU hit.
Kind of sounds like AMD is feeding the interviewer questions.

Audio processing has almost a zero CPU hit. The real reason audio has been completely stuck in 2002 is because there hasn't been a single company that has done better audio than Aureal's A3D raytracing solution.

Basically this is saying CPU does take a performance hit on Audio.

Before TrueAudio was confirmed on PS4:
http://uk.ign.com/blogs/finalverdic...vs-playstation-4-hardware-and-specifications/

Overview: There is still a lot unknown about the PlayStation 4's audio block, but based on what we do know, the Xbox One's audio block is looking to be significantly better. If this is the case, it means that the PS4 will have to rely on its CPU a lot more to process some of the audio in its games, and since it already has a weaker CPU than the Xbox One, it could really suffer in this area. Audio actually takes up a lot more resources to process than most people realize. This past generation, in some cases the audio for some games ended up taking 1-2 entire CPU threads to process on the Xbox 360 (which only had 6 hardware threads overall).
 
These look like pre-approved questions to me.

How about asking why they won't release any decent CPUs (and instead waste all the die space on a crappy integrated GPU that 90 percent of gamers won't use), or asking for some actual CPU performance figures for their Kaveri APU?

It would be nice for once to read an interview that doesn't read like an ad.
 

astraycat

Member
The first question is bullshit, game audio processing does not have a high CPU hit.
Kind of sounds like AMD is feeding the interviewer questions.

The second answer is also wrong, FXAA and most other Post AA are not compute shaders, just pixel shaders (if it was a compute shader it would not work on DX9 games!).

They are compute shaders that you can implement as a pixel shader. This is how people did compute before compute shaders were a thing.

There are pros and cons to doing full-screen techniques as a pixel shader vs. a compute shader, but for the most part using a compute shader is the preferred method if it's available.
 

Vizzeh

Banned
These look like pre-approved questions to me.

How about asking why they won't release any decent CPUs (and instead waste all the die space on a crappy integrated GPU that 90 percent of gamers won't use), or asking for some actual CPU performance figures for their Kaveri APU?

It would be nice for once to read an interview that doesn't read like an ad.

There are a lot of follow-up questions and answers in another interview (RGT's part 2, coming in a few days). On the contrary, I think this interview tells us a lot about next-gen consoles (or is it current gen now?) if we read between the lines.

Some nice information on Mantle and the future of TressFX in gaming too. Obviously it's used for hair, as in Tomb Raider, but potentially also hardware physics for smoke and debris, as AMD did not deny it; they just said they wouldn't comment on unannounced tech.
 

TheD

The Detective
They are compute shaders that you can implement as a pixel shader. This is how people did compute before compute shaders were a thing.

There are pros and cons to doing full-screen techniques as a pixel shader vs. a compute shader, but for the most part using a compute shader is the preferred method if it's available.

It changes the colour value of pixels on a screen and is a pixel shader, seems pretty clear as to what it is!

Just because you could write a compute shader to do the same thing means nothing (if having the possibility to write a compute version of a shader makes said shader a compute shader then nearly everything would count as a compute shader!).


And none of those are post AA shaders.




Basically this is saying CPU does take a performance hit on Audio.

Before TrueAudio was confirmed on PS4:
http://uk.ign.com/blogs/finalverdic...vs-playstation-4-hardware-and-specifications/

Not only are IGN very clueless when it comes to tech, the 360 CPU is not very good by modern standards and that one game is like that due to an idiotic waste of CPU time by running 100's of sounds at the same time!
 

Vizzeh

Banned
And none of those are post AA shaders.

Those previous examples were to highlight that the entire question was not BS. On FXAA I'm no expert; I've only seen links saying it was compute-driven, so is that relevant?

Not only are IGN very clueless when it comes to tech, the 360 CPU is not very good by modern standards and that one game is like that due to an idiotic waste of CPU time by running 100's of sounds at the same time!

Yeah, I agree on IGN. Not sure you can say audio costs nothing on the CPU though. Having TrueAudio is surely a benefit given that it will still free up resources; last gen, as you know, the X360 had to dedicate an entire Xenon thread to it, so that's 10%.

http://en.wikipedia.org/wiki/BioShock
Chris Kline, lead programmer of BioShock, deemed BioShock as "heavily multithreaded" as it has the following elements running separately:[55]
Simulation Update (1 thread)
UI update (1 thread)
Rendering (1 thread)
Physics (3 threads on Xenon, at least one on PC)
Audio state update (1 thread)
Audio processing (1 thread)
Texture streaming (1 thread)
File streaming (1 thread)

But all that aside, the most interesting thing in the interview for me, which would have had GAF on its knees a few months ago if the math adds up... is the information on the ROPs needed per 512 stream processors (16, versus the X1's 768 SPs), plus AMD's information on bandwidth. This surely tells us something about why one of our consoles (yes, the Xbox One) is having FPS and resolution issues?
 

astraycat

Member
It changes the colour value of pixels on a screen and is a pixel shader, seems pretty clear as to what it is!

Just because you could write a compute shader to do the same thing means nothing (if having the possibility to write a compute version of a shader makes said shader a compute shader then nearly everything would count as a compute shader!).



And none of those are post AA shaders.






Not only are IGN very clueless when it comes to tech, the 360 CPU is not very good by modern standards and that one game is like that due to an idiotic waste of CPU time by running 100's of sounds at the same time!

I don't think you really understand what a pixel shader is supposed to do. A pixel shader has some very specific functionality that is not available in a compute shader, but is very useful for shading triangles. None of this is useful for a full-screen pass. It just so happens that you can do a full-screen pass with a full-screen quad, but doing so is rather inefficient compared to doing the same thing in a compute shader since this has to be done using the graphics pipeline.

That, along with the fact that there are lots of optimizations open to compute shaders for full-screen pass sort of stuff (LDS being the main one, especially for compute-based AA solutions) that are not available to pixel shaders, makes pixel shaders a poor choice. But beggars can't be choosers.
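
If it helps, here's a rough sketch of what a full-screen compute pass amounts to in terms of work launched: you dispatch enough thread groups to cover the target instead of rasterising a quad. The resolution and the 8x8 group size are just illustrative assumptions.

Code:
import math

# A full-screen compute pass covers the render target with fixed-size thread
# groups instead of rasterising a full-screen quad through the graphics pipeline.
width, height = 1920, 1080
GROUP_X, GROUP_Y = 8, 8            # illustrative thread-group size, one thread per pixel

groups = (math.ceil(width / GROUP_X), math.ceil(height / GROUP_Y))
print(groups)                      # (240, 135) thread groups to dispatch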
 

TheD

The Detective
Those previous examples were to highlight that the entire question was not BS. On FXAA I'm no expert; I've only seen links saying it was compute-driven, so is that relevant?

I am not sure what you are trying to say, I only highlighted the FXAA in that answer.

Yeah, I agree on IGN. Not sure you can say audio costs nothing on the CPU though. Having TrueAudio is surely a benefit given that it will still free up resources; last gen, as you know, the X360 had to dedicate an entire Xenon thread to it, so that's 10%.

http://en.wikipedia.org/wiki/BioShock
Chris Kline, lead programmer of BioShock, deemed BioShock as "heavily multithreaded" as it has the following elements running separately:[55]
Simulation Update (1 thread)
UI update (1 thread)
Rendering (1 thread)
Physics (3 threads on Xenon, at least one on PC)
Audio state update (1 thread)
Audio processing (1 thread)
Texture streaming (1 thread)
File streaming (1 thread)

But all that aside, the most interesting thing in the interview for me, which would have had GAF on its knees a few months ago if the math adds up... is the information on the ROPs needed per 512 stream processors (16, versus the X1's 768 SPs), plus AMD's information on bandwidth. This surely tells us something about why one of our consoles (yes, the Xbox One) is having FPS and resolution issues?

I did not say it costs nothing, I said it does not have a high CPU hit.
Xenon only has 6 hardware threads, not 10 (100/10 = 10).

Those are the software threads in Bioshock, not the hardware thread usage.

I don't think you really understand what a pixel shader is supposed to do. A pixel shader has some very specific functionality that is not available in a compute shader, but is very useful for shading triangles. None of this is useful for a full-screen pass. It just so happens that you can do a full-screen pass with a full-screen quad, but doing so is rather inefficient compared to doing the same thing in a compute shader since this has to be done using the graphics pipeline.

That, along with the fact that there are lots of optimizations open to compute shaders for full-screen pass sort of stuff (LDS being the main one, especially for compute-based AA solutions) that are not available to pixel shaders, makes pixel shaders a poor choice. But beggars can't be choosers.

What part of FXAA being a Pixel Shader do you not understand?!
It is written as a Pixel shader, it works in the old DX/D3D9c API, it works on Pixels!
If you don't like it being a Pixel shader go complain to Timothy Lottes!

It is just a pixel shader acting on a screen buffer; it is not that uncommon, and other post-AA techniques also do it (like SMAA 1x).
 

Error404

Banned
These look like pre-approved questions to me.

How about asking why they won't release any decent CPUs (and instead waste all the die space on a crappy integrated GPU that 90 percent of gamers won't use), or asking for some actual CPU performance figures for their Kaveri APU?

It would be nice for once to read an interview that doesn't read like an ad.
Clearly you have never given an interview. You can't be rude to the person who is kindly taking time out of their schedule to speak with you.
 

Chobel

Member
What part of FXAA being a Pixel Shader do you not understand?!
It is written as a Pixel shader, it works in the old DX/D3D9c API, it works on Pixels!
If you don't like it being a Pixel shader go complain to Timothy Lottes!

And there are tons of pixel shaders that act on a screen buffer, like FXAA!

The AMD guy never said that these effects are compute-shader exclusive. He said compute-driven; they are better done in compute.
 

TheD

The Detective
The AMD guy never said that these effects are compute-shader exclusive. He said compute-driven; they are better done in compute.

No, saying that they are compute-driven does not mean they could be better as compute shaders; it is stating that they are (which is clearly not the case for the reference version of FXAA on PC).
 
Yes, because that's what makes you a journalist: be rude and piss off the people they are interviewing.

Asking tough questions or being critical is rude now? Where do you even get that idea? It's not rude, it's their job.

Is a real journalist asking a politician about a double standard in his campaign, or asking about a shortcoming, or even just asking him to elaborate on something he avoids talking about, being rude?
Again, what is the purpose of these people if all they do is ask 'how awesome is your product?' 'so awesome'
 

astraycat

Member
I am not sure what you are trying to say, I only highlighted the FXAA in that answer.



I did not say it costs nothing, I said it does not have a high CPU hit.
Xenon only has 6 hardware threads, not 10 (100/10 = 10).

Those are the software threads in Bioshock, not the hardware thread usage.



What part of FXAA being a Pixel Shader do you not understand?!
It is written as a Pixel shader, it works in the old DX/D3D9c API, it works on Pixels!
If you don't like it being a Pixel shader go complain to Timothy Lottes!

It is just a pixel shader acting on a screen buffer; it is not that uncommon, and other post-AA techniques also do it (like SMAA 1x).

There were no compute shaders when Timothy Lottes introduced FXAA, or I'm sure he would have used them.

FXAA is much better done with compute simply because of the ability to use LDS. If you don't understand why that is, then I suggest you read on compute shaders and how they differ from pixel shaders.

Pixel shaders have a very specific purpose, and it's not just to work on pixels. It's to work on fragments (of which pixels are a subset), and fragments have properties that are typically irrelevant for things like full-screen quad passes. Again, the only technical reason you'd choose to implement something like FXAA as a pixel shader instead of a compute shader would be a lack of availability. Using a pixel shader for this is like using your garden sprinklers to shower. You can do it, but would you if you had a choice?
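
As a rough CPU-side analogy of why LDS matters for this kind of filter (just to illustrate the access pattern, not how FXAA itself is written): each thread group loads its tile plus a halo once into fast local storage, and every pixel in the tile then reads its neighbours from that local copy instead of re-sampling the texture for every tap.

Code:
import numpy as np

# Toy illustration of the LDS-style access pattern behind compute-based filters:
# each "thread group" loads its tile plus a one-pixel halo once into local
# storage, then every pixel in the tile reads its neighbours from that local
# copy instead of re-sampling the full image for every tap.
image = np.random.rand(1080, 1920).astype(np.float32)  # stand-in luminance buffer
TILE = 8

def process_tile(img, ty, tx):
    y0, x0 = ty * TILE + 1, tx * TILE + 1                    # skip the border for simplicity
    local = img[y0 - 1:y0 + TILE + 1, x0 - 1:x0 + TILE + 1]  # cooperative (TILE+2)^2 load
    # Placeholder per-pixel work: a 3x3 average computed purely from `local`.
    out = np.zeros((TILE, TILE), dtype=np.float32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += local[1 + dy:1 + dy + TILE, 1 + dx:1 + dx + TILE]
    return out / 9.0

tile_result = process_tile(image, 5, 7)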
 

DonasaurusRex

Online Ho Champ
These look like pre-approved questions to me.

How about asking why they won't release any decent CPUs (and instead waste all the die space on a crappy integrated GPU that 90 percent of gamers won't use), or asking for some actual CPU performance figures for their Kaveri APU?

It would be nice for once to read an interview that doesn't read like an ad.

Well, that's really easy to answer: about 8 years ago or so AMD sold off most of its businesses for pennies, because at the time they weren't making much money. Most people are OK with what happened with the fab business, but their flash memory, Alchemy chipsets and Imageon processors were also sold for pennies, in hindsight. Flash memory exploded shortly after they sold that business; low-power integrated CPUs on the MIPS ISA (Broadcom says thank you): low power was the future and they sold that. Then there's the most obvious product, the Imageon, which was ATI's answer for mobile, consumer and other graphics solutions. You may know them by their modern name, Qualcomm's Adreno graphics solutions. All of that money and those resources could've been AMD's. You don't make that many missteps and not fall behind in business. Cutbacks and lean times are not going to compete with Intel.

However, the APU and hUMA are an interesting opportunity. Originally announced seven years ago, this is the one project AMD kept. Combined with Mantle to better utilise GPU resources and hUMA to free up precious bandwidth, they could have a winner, and there's no longer a point to having a pure CPU champ if the GPU can be used for many workloads.

Your demands are reasonable, I suppose... but we're DURING the convention that is debuting a lot of this tech, so... let's wait and see what happens with Mantle and Kaveri.
 

TheD

The Detective
There were no compute shaders when Timothy Lottes introduced FXAA, or I'm sure he would have used them.

FXAA v1 was first shown in 2009; there have been upgrades by Timothy Lottes over the following year or so (so he could have used compute shaders).

SMAA was even later than FXAA, and the 1x version can get along just fine without using compute shaders.


FXAA is much better done with compute simply because of the ability to use LDS. If you don't understand why that is, then I suggest you read on compute shaders and how they differ from pixel shaders.

Pixel shaders have a very specific purpose, and it's not just to work on pixels. It's to work on fragments (of which pixels are a subset), and fragments have properties that are typically irrelevant for things like full-screen quad passes. Again, the only technical reason you'd choose to implement something like FXAA as a pixel shader instead of a compute shader would be a lack of availability. Using a pixel shader for this is like using your garden sprinklers to shower. You can do it, but would you if you had a choice?
Once again you are missing the point!
The fact is that no matter what you think would be better, the reference version of FXAA is not a compute shader, and thus claiming that FXAA is "compute driven" is false.
 

astraycat

Member
FXAA v1 was first shown in 2009; there have been upgrades by Timothy Lottes over the following year or so (so he could have used compute shaders).

SMAA was even later than FXAA, and the 1x version can get along just fine without using compute shaders.

Once again, beggars can't be choosers. SMAA is designed also to work on consoles, and until November no consoles had compute shaders.

Once again you are missing the point!
The fact is that no matter what you think would be better, the reference version of FXAA is not a compute shader, and thus claiming that FXAA is "compute driven" is false.

This statement makes it pretty clear that you don't know what a reference implementation is for, that you don't know how modern GPUs work, and as such really shouldn't go around telling people what is and what isn't "compute driven".
 

Vizzeh

Banned
Asking tough questions or being critical is rude now? Where do you even get that idea? It's not rude, it's their job.

Is a real journalist asking a politician about a double standard in his campaign, or asking about a shortcoming, or even just asking him to elaborate on something he avoids talking about, being rude?
Again, what is the purpose of these people if all they do is ask 'how awesome is your product?' 'so awesome'

I believe a couple of his questions, involving rival companies and unannounced AMD products, got knocked back, which in itself shows they were not pre-determined. There was a lot of information there that is not just marketing, but information for people who are interested in the tech behind their products (which general PR surely doesn't normally cover).

This statement makes it pretty clear that you don't know what a reference implementation is for, that you don't know how modern GPUs work, and as such really shouldn't go around telling people what is and what isn't "compute driven".

What's your opinion on the ROP/CU ratio AMD was suggesting, using the numbers in my first reply? It's speculative, but does the math add up with how you understand the workings of GPUs, and with why the X1 is having trouble with 1080p and in some cases solid FPS? There were threads and threads of this pre-launch; we have figures from AMD to work with, so it would be nice to put it to bed. In theory.
 

Mudkips

Banned
FXAA v1 was first shown in 2009; there have been upgrades by Timothy Lottes over the following year or so (so he could have used compute shaders).

SMAA was even later than FXAA, and the 1x version can get along just fine without using compute shaders.



Once again you are missing the point!
The fact is that no matter what you think would be better, the reference version of FXAA is not a compute shader, and thus claiming that FXAA is "compute driven" is false.

Any competent implementation of FXAA will use compute shaders where available, because they're good for the actual work involved. Much of AMD's focus has been on general-purpose computing performance on GPUs. When they say something is "compute-driven" they mean that the work involved can be handled by the GPU well, be it via DirectCompute, OpenCL, AMD APP, whatever.

FXAA is compute-driven.
 

TheD

The Detective
Once again, beggars can't be choosers. SMAA is designed also to work on consoles, and until November no consoles had compute shaders.
FXAA is also meant to work on the consoles, yet you said that he would have used compute shaders if they were out.....
This statement makes it pretty clear that you don't know what a reference implementation is for, that you don't know how modern GPUs work, and as such really shouldn't go around telling people what is and what isn't "compute driven".

I know very well what a reference implementation is. It does not mean you can go around calling the shader a compute shader just because some versions of it might have been rewritten (versions that are not public, AFAIK), when the version that gave the shader its name does not!

And I do have a pretty good idea of how the graphics pipeline works; you just need to get over the fact that likely the most common version of FXAA is written as a shader that works all the way back on hardware (and using an API) that does not support compute shaders!

Any competent implementation of FXAA will use compute shaders where available, because they're good for the actual work involved. Much of AMD's focus has been on general-purpose computing performance on GPUs. When they say something is "compute-driven" they mean that the work involved can be handled by the GPU well, be it via DirectCompute, OpenCL, AMD APP, whatever.

FXAA is compute-driven.

So the reference version is not "competent"?
 

astraycat

Member
What's your opinion on the ROP/CU ratio AMD was suggesting, using the numbers in my first reply? It's speculative, but does the math add up with how you understand the workings of GPUs, and with why the X1 is having trouble with 1080p and in some cases solid FPS? There were threads and threads of this pre-launch; we have figures from AMD to work with, so it would be nice to put it to bed. In theory.

I think what makes 1080p harder on the XB1 is less to do with ROPs and more to do with its memory structure. You can relieve ROP pressure by drawing in a smarter order, so that fewer triangles actually make it to the fragment stage and thus fewer fragments have to be dealt with by the ROPs, or by doing something as simple as a Z-only pre-pass.

But it's hard to fit a more modern deferred pipeline into ESRAM unless you make the render targets smaller than 1080p. I think the XB1 will get more 1080p games in the future as developers wrap their heads around needing to shuffle things in and out of ESRAM, and figure out through experience what actually needs to be in ESRAM at a given time to maintain a certain amount of performance. Wrappers and ESRAM managers will be written, and this will become less and less of an issue as time goes on. Then maybe they'll start hitting issues with the ROPs.
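
Quick back-of-envelope on why 1080p deferred is tight in 32 MB of ESRAM, assuming a hypothetical four-target G-buffer at 4 bytes per pixel plus a 4-byte depth buffer (the layout is an illustrative assumption, not any particular engine's):

Code:
# Rough check of how much of the Xbox One's 32 MB ESRAM a 1080p deferred setup
# would eat, assuming a hypothetical 4-target G-buffer at 4 bytes per pixel
# plus a 4-byte depth buffer.
width, height = 1920, 1080
BYTES_PER_PIXEL = 4
TARGETS = 4 + 1                    # four G-buffer targets + depth (assumption)

total_mb = width * height * BYTES_PER_PIXEL * TARGETS / (1024 ** 2)
print(total_mb)                    # ~39.6 MB, already past the 32 MB of ESRAM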
 

astraycat

Member
So the reference version is not "competent"?

Competent in that it works, but competent in that it's performant? Not usually.

Take the reference implementation of a D3D device. It's a software implementation of the whole pipeline. It runs about as well as dirt, but it shows you what the answer you should get is.

Reference implementations are supposed to show the correct result, not show you the best way to achieve that result.

Edit: sorry for the double post.
 

TheD

The Detective
Competent in that it works, but competent in that it's performant? Not usually.

Take the reference implementation of a D3D device. It's a software implementation of the whole pipeline. It runs about as well as dirt, but it shows you what the answer you should get is.

Reference implementations are supposed to show the correct result, not show you the best way to achieve that result.

Performance-wise it is pretty good, nothing like the D3D reference renderer vs hardware!
I also never said that FXAA (how it is now) is the fastest it can be, just that the most common version (and the shader that was first called FXAA) does not make use of compute shaders.

It is like saying that Linux is a hard real-time OS just because there is a patch set that gives it hard real-time support, despite the fact that the mainline kernel and most versions used in distros do not use said hard real-time patch set (and thus do not support it)!
 

astraycat

Member
Performance-wise it is pretty good, nothing like the D3D reference renderer vs hardware!
I also never said that FXAA how it is now is the fastest it can be, just that the most common version does not make use of compute shaders.

Right, and what people (including me, in a roundabout way) are saying is that all the heavy lifting done in an FXAA shader is compute-based. It's a bunch of texture samples and some math. It doesn't rely on the special pixel shader functionality like derivatives.

That it's usually implemented as a pixel shader is more a result of circumstance than preference. As the work done is mostly math, it'll benefit more from things like faster (and to a certain point, additional) CUs, which is what people mean when they say something is "compute driven".
 

TheD

The Detective
Right, and what people (including me, in a roundabout way) are saying is that all the heavy lifting done in an FXAA shader is compute-based. It's a bunch of texture samples and some math. It doesn't rely on the special pixel shader functionality like derivatives.

That it's usually implemented as a pixel shader is more a result of circumstance than preference. As the work done is mostly math, it'll benefit more from things like faster (and to a certain point, additional) CUs, which is what people mean when they say something is "compute driven".

Based on the Q&A, it very much reads like they are talking about the performance of D3D compute shaders / OpenGL compute shaders / OpenCL etc., and not the ALU/shader processor/SIMD blocks in the graphics processor (vs something like ROPs or TMUs or tessellation hardware etc.), e.g. "compute resources rather than the traditional graphics pipeline" and "Where possible, using GPU compute is a rather efficient way of rendering!".
 

SRG01

Member
The first question is bullshit, game audio processing does not have a high CPU hit.

I would disagree. If we're looking at next-gen audio and the so-called "3D" sound systems, the CPU overhead isn't insignificant. Offloading work means it can free up the CPU to do other things as well.

Anything that requires more than the CPU pushing bits to a DAC will impact CPU performance.
 

TheD

The Detective
I would disagree. If we're looking at next-gen audio and the so-called "3D" sound systems, the CPU overhead isn't insignificant. Offloading work means it can free up the CPU to do other things as well.

Anything that requires more than the CPU pushing bits to a DAC will impact CPU performance.

But we have had "3D" sound systems for years, and keep in mind that the PS4 will be doing all this in software on the CPU (the sound processor does not have hardware for advanced sound processing in it, and Sony themselves have noted that the GPU is not a very good fit for sound processing), so "next-gen" audio is going to have to work on it.

I also did not say that audio processing will have no impact on the CPU, just that it is a small one.
 