Redgamingtech has been gaining momentum over the last couple of years. I have been following them for a while and find them very insightful, particularly on technical analysis, so it's great seeing them do well and land an interview of this size. Not to mention, competition for Digital Foundry has got to be a good thing for both tech and gaming (yes, it's in their name lol).
There are some general questions that touch on GCN cores, ROPs and bandwidth, which connect to the PS4/X1, and others that address the consoles directly. PC enthusiasts should largely find it interesting too.
The interview is fairly large so I won't quote everything; in fact I will likely miss some of the saucy stuff, so I recommend checking it out here for anything I leave out: http://www.redgamingtech.com/exclus...n-architecture-performance-the-future-part-1/ (note there is a page 2)
He states at the end of the interview that a further, more detailed follow-up interview regarding ROPs, bandwidth and compute is coming in the next few days.
Interview with Robert Hallock: Technical Communications, Desktop Gaming & Graphics at AMD. Interviewer: Paul Eccleston (CrimsonRayne), Redgamingtech.
(Again, there are more questions on their website that I have not quoted.)
I have linked Redgamingtech's YouTube channel, in case you prefer to listen to the interview breakdown in audio:
http://www.youtube.com/watch?v=NSGsIZwonok
Question: Audio technology on PCs has been fairly stagnant for a number of years, and requires high CPU overhead for processing. Does AMD’s TrueAudio processor take care of all of the processing work with audio, or is there some processing left for the CPU to perform on the audio?
Robert Hallock: AMD TrueAudio fully offloads the CPU if the developer maximally utilizes its capabilities. It’s a programmable audio pipeline, so it will only shoulder as much of the burden as it’s told to. Regardless, AMD TrueAudio sends the voice(s) it’s processing and rendering to the user’s existing audio hardware as a passthrough to the endpoint device (e.g. headphones).
Question: CPUs over the past 3 years or so have only increased marginally in performance compared to huge strides in GPU technology. AMD have drastically beefed up the compute potential of your latest GPU cores, for example Hawaii (R9 290X). With both the PS4 and Xbox One likely to offload a lot of work for compute to the GPU, how do you think compute will affect the future of games development on the PC and console?
Robert Hallock: You’re right, PC gaming performance has come a long way in a short while. For example, the ATI Radeon™ HD 5870 was a flagship product just three years ago with the price to match. Today, roughly equivalent performance can be found starting at $139 in the AMD Radeon™ R9 260X. I digress, but I really wanted to put a fine point on how hard we’ve been working to bring significantly better performance to every gamer year after year.
To your point, compute is more important now than ever. Consider the following effects used by game devs: high definition ambient occlusion (HDAO), global illumination, FXAA, MLAA, TressFX Hair, diffusion depth of field, and contact hardening shadows. What do they have in common? They’re all GPU compute-driven effects. That’s just a small selection of the effects you can accelerate with compute resources rather than the traditional graphics pipeline. Where possible, using GPU compute is a rather efficient way of rendering!
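To give a sense of what a compute-driven post-process effect actually does, here is a toy NumPy sketch of the luminance edge-detection step that screen-space AA techniques like FXAA and MLAA begin with. This is my own simplified illustration, not AMD's or NVIDIA's implementation; on a real GPU this would be a compute shader running one thread per pixel.

```python
import numpy as np

def luma(rgb):
    """Approximate perceptual luminance from an RGB frame."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def detect_edges(frame, threshold=0.1):
    """Flag pixels whose local luminance contrast exceeds a threshold.
    Only flagged pixels need the (more expensive) blending pass."""
    l = luma(frame)
    padded = np.pad(l, 1, mode="edge")
    # Contrast against the four axis-aligned neighbours.
    neighbours = np.stack([
        padded[:-2, 1:-1],  # up
        padded[2:, 1:-1],   # down
        padded[1:-1, :-2],  # left
        padded[1:-1, 2:],   # right
    ])
    contrast = neighbours.max(axis=0) - neighbours.min(axis=0)
    return contrast > threshold

# A tiny 4x4 "frame": left half dark, right half bright,
# so the vertical boundary in the middle should be detected.
frame = np.zeros((4, 4, 3))
frame[:, 2:] = 1.0
edges = detect_edges(frame)
```

The point is that this is pure data-parallel arithmetic over pixels with no rasterization involved, which is exactly why such effects map so well onto GPU compute units.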
Question: Regarding TressFX, recently you’ve updated it to simulate more than just hair, for example grass and fur (which is very impressive tech might I add), and reduced the overhead associated with running it. Can you tell us if you’re planning to use a version of hardware Physics for smoke and debris?
Robert Hallock: I cannot speculate on unannounced technologies.
Question: AMD seem to be in a great position right now for unifying games development, with your CPUs, APUs and GPUs being used in PCs and next-gen consoles, developers enjoying low-level coding with Mantle, and other technologies such as TrueAudio and TressFX. Can you speak a little of your vision of games development, both on PC and consoles?
Robert Hallock: We’re tremendously proud of the continuum we’ve built in the gaming ecosystem. Game developers have already begun to leverage the commonalities, such as Crystal Dynamics’ recent decision to bring TressFX Hair to life for all platforms with the Tomb Raider Definitive Edition. We hope dividends will continue to be paid in this fashion for years to come for all platforms that we address.
Question: With the next generation consoles featuring 8 AMD Jaguar CPU cores, do you feel that more games on PC will benefit from an increased number of CPU cores as games developers change their engines to better use multi-core environments?
Robert Hallock: I can’t speak for the future of the console business, but I can talk about what AMD is doing. Mantle is a powerful way to improve the robustness of a game’s multi-threading capabilities. Multiple game developers, notably Johan Andersson of DICE (in his keynote address at APU13), have been quite complimentary to our API and its scalability across cores. The whole talk is very enlightening on this topic, but the heart of Mantle is in addressing aspects of the software layer (like threading) that have lagged behind the capabilities of our hardware.
Question: Currently, most PCs use DDR3 RAM for main system RAM, with the GPU using GDDR5, although DDR4 will slowly start becoming the norm over the next few years. In situations where you have a discrete card like the Radeon R9 290X taking care of the graphics processing, is a CPU using DDR3 RAM on a PC bandwidth limited when processing highly complex game engines, due to the number of different tasks and data being accessed?
Robert Hallock: DDR3 is more than sufficient for today’s graphics landscape. Linus over at LinusTechTips did a very informative video on this topic, actually. The final verdict: capacities being equal, anything between DDR3-1333 and DDR3-2400 had negligible impact on GPU performance.
In fact, excellent gaming performance is all about keeping resources local to the GPU and its framebuffer. Farming texture fetches (for example) out to system RAM incurs a very substantial performance penalty. More important to overall GPU performance is the potential of the CPU to feed the beast—the graphics subsystem. As resolutions and fill rates increase, an increasingly powerful CPU is required to capably feed the GPU. If the CPU is not up to the task, then performance tapers off. Most users call this “bottlenecked” or “CPU limited” performance.
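The "CPU limited" idea he describes can be boiled down to a one-line model (my own simplification, not AMD's): with the CPU preparing frames and the GPU rendering them, the pipeline runs at the pace of its slowest stage.

```python
def frame_rate(cpu_ms_per_frame, gpu_ms_per_frame):
    """Frames per second when CPU frame prep and GPU rendering overlap:
    throughput is set by whichever stage takes longer per frame."""
    return 1000.0 / max(cpu_ms_per_frame, gpu_ms_per_frame)

# GPU-limited: a faster CPU changes nothing here.
gpu_bound = frame_rate(cpu_ms_per_frame=8.0, gpu_ms_per_frame=16.0)   # 62.5 fps
# CPU-limited: the GPU could render 125 fps but only ever sees 50.
cpu_bound = frame_rate(cpu_ms_per_frame=20.0, gpu_ms_per_frame=8.0)   # 50.0 fps
```

It also shows why upgrading only the GPU on a slow CPU yields diminishing returns: once `gpu_ms_per_frame` drops below `cpu_ms_per_frame`, further GPU speed is invisible.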
Question: With the release of the Radeon R9 290X, bandwidth was increased by changing to a 512-bit memory bus from 384-bit, and the ROP count of the GPU was also doubled—ratio-wise that’s higher than the increase in stream processors. Do you find that the bottleneck of GPUs is becoming more about ROPs and local memory bandwidth (on the discrete GPU), or is everything fairly balanced? Additionally, what’s the ideal ROP, texture unit and bandwidth ratio versus, say, 8 GCN cores (or 512 stream processors)?
Robert Hallock: GPU design is analogous to the game engine design question you previously posed. Everything does have to be balanced. You could throw fistfuls of render backends at a GPU, but if your memory bandwidth is insufficient, then that hardware is wasted. And vice versa, of course.
I think it can best be explained by working backwards, asking yourself: “What performance and resolution target do I want to hit?” Then you build a core out on paper that, by your mathematical models, would yield performance roughly equivalent to your target. Then you build it!
For 512 stream processors, then, I would say: 32 texture units, 16 ROPs and a 128-bit bus.
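His "work backwards from the target" approach can be sketched as back-of-envelope arithmetic. The code below is my own illustration, not AMD's actual modelling; the example figures are the published R9 290X specs (64 ROPs at roughly 1.0 GHz, 512-bit bus with 5.0 Gbps effective GDDR5).

```python
def pixel_fill_rate(rops, core_clock_ghz):
    """Peak pixels per second the render back-ends (ROPs) can write."""
    return rops * core_clock_ghz * 1e9

def memory_bandwidth(bus_width_bits, effective_clock_gbps):
    """Peak bytes per second across the memory bus."""
    return bus_width_bits / 8 * effective_clock_gbps * 1e9

# Roughly R9 290X class hardware:
fill = pixel_fill_rate(rops=64, core_clock_ghz=1.0)                  # 64 Gpixel/s
bw = memory_bandwidth(bus_width_bits=512, effective_clock_gbps=5.0)  # 320 GB/s

# A 3840x2160 frame at 60 fps needs only ~0.5 Gpixel/s of raw fill --
# far below peak, which is why overdraw, blending and texture traffic,
# not the final pixel write, set the real balance point he describes.
target_pixels_per_sec = 3840 * 2160 * 60
```

Per his answer, the design question is whether each number leaves another unit starved: fistfuls of ROPs are wasted if the bus can't keep them fed, and vice versa.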
Question: Keeping on the subject of compute for a moment longer, how much impact does running compute commands on the GCN 1.1 architecture make to graphics processing performance? Do you often find that a compute command must ‘wait’ to be processed because the GPU is busy processing the scene, or that graphics performance suffers? Or has improvements to compute architecture on the GCN 1.1 helped eliminate these issues?
Robert Hallock: It depends entirely on the effect. With respect to gaming, compute and graphics pipeline resources are interdependent. Being too aggressive with either category of rendering will ultimately compromise overall performance, but that is true of any GPU. Talented game developers will understand a GPU’s fundamentals and design an engine that takes a balanced approach, not just from a total resource perspective, but when to use compute versus pipeline as well.
With respect to products like the AMD Radeon™ R9 290 or R9 290X, we refined the basic GCN Architecture in a couple of key ways: accommodating higher pixel rates for UHD content, off-chip buffering improvements to enhance tessellation performance, more robust data storage in the geometry processors to improve geometry shader performance, a smaller and more efficient memory controller, the addition of our “XDMA” technology, support for up to four primitives per clock, and of course we were able to scale Graphics Core Next out to 2816 total shader units. Overall, though, this is the basic GCN Architecture we know and love, but with an extra ounce of love to make it a meaner and more capable engine for multi-purpose work. (Note: GCN 1.1 compute is similar to the PS4's.)
Question: What were the factors prompting AMD to create the Mantle API? Was it simply a case of helping developers solve the major weaknesses of PC gaming architecture (the overhead) in preparation for the next generation of consoles and game engines?
Robert Hallock: Game developers did the rounds in the industry, asking all of the hardware vendors with a stake in graphics for a solution to make PCs more “console-like” with respect to hardware utilization efficiency and programming simplicity. They recognized that PC gaming could learn a lot from its siblings in the living room. Only AMD took these requests from the negotiation stage to the manpower and money phase, and Mantle was born!