Radeon RDNA vs GCN: how much faster is AMD's next-gen architecture?

Performance rated teraflop for teraflop: what could it mean for PS5 and Scarlett?

Feature by Richard Leadbetter, Technology Editor, Digital Foundry. Published on Eurogamer.net on 10 Sep 2019.

This is a project I've been wanting to work on for a long time - since it became clear that AMD's brand new Navi architecture would feature in the next-gen consoles, in fact. From PS4 and Xbox One, through the enhanced consoles and up to the reveal of Google Stadia, graphics power has been measured by a somewhat arbitrary unit: the teraflop. And let's be clear: how many teraflops the new consoles have remains a preoccupation for many observers, eager to get some idea of what PlayStation 5 or Project Scarlett may deliver up against the hardware of today. But perhaps the focus needs to shift and maybe we need to take a closer look at the new AMD Navi architecture itself. Put simply, a teraflop of Navi compute should produce much faster game performance than an old-school GCN equivalent - but can we quantify that?

Testing Navi - and its teraflops - sounds like a relatively simple task. You'd start by tracking down graphics cards across the last seven years of AMD history, starting all the way back at GCN 1.0, the architectural foundation of the GPUs found in the current generation of consoles. From there, we'd equalise shader count, core clocks and memory bandwidth across the various GCN iterations and stack them up against a similarly specced Navi. After completing a thorough range of benchmarks, we'd have a progression of AMD performance improvements from the dawn of GCN right up to the brand new RDNA products - and at the end of it, maybe we'd get some idea of how a GCN 1.0 teraflop compares against an RDNA 1.0 equivalent.
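For reference, a rated teraflop figure is purely a function of shader count and clock speed, which is why equalising both puts the cards on the same rated compute. The sketch below assumes the usual convention of one fused multiply-add (two floating-point operations) per shader per clock. The function name is illustrative, and the shader counts and 1.0GHz clock are the ones used in this article's test setup.

```python
def rated_tflops(shaders: int, clock_ghz: float) -> float:
    """Peak FP32 rate, assuming one fused multiply-add (2 FLOPs) per shader per clock."""
    gflops = shaders * 2 * clock_ghz
    return gflops / 1000.0


# Shader counts for the 32 CU and 36 CU configurations tested in this article,
# all normalised to the same 1.0GHz core clock.
for label, shaders in [("32 CUs (2048 shaders)", 2048), ("36 CUs (2304 shaders)", 2304)]:
    print(f"{label}: {rated_tflops(shaders, 1.0):.3f} TFLOPs at 1.0GHz")
```

On these assumptions, every 32 CU card in the comparison is rated identically regardless of architecture, and so is every 36 CU card, so any difference in benchmark results comes from something other than the teraflop number.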
Unfortunately, carrying out this procedure is somewhat difficult because equalising frequencies, compute units and memory bandwidth is essentially impossible. The GCN era began with Tahiti - a 32 compute unit GPU, while Navi's lowest end offering has 36 CUs. Further complicating matters is that Navi's GDDR6 VRAM offers a vast 448GB/s of bandwidth - way beyond the limits of any kind of comparable GCN part, with no obvious means of underclocking it. However, a tip from the brilliant Steve Burke at Gamers Nexus pointed me towards MorePowerTool, which I found could underclock memory to 256GB/s - the upper-end of GDDR5's capabilities on prior GCN products. With that hurdle overcome, some mathematical shenanigans can get us to where we need to be, as this table demonstrates.

[Video: a breakdown of our GCN 1.0 vs RDNA 1.0 testing - which demonstrates that a last-gen teraflop of compute is easily beaten by a next-gen Navi equivalent.]

| GPU | Architecture | Shaders/CUs | Bandwidth | Launch | Product Used |
|---|---|---|---|---|---|
| Tahiti | GCN 1.0 | 2048/32 | 288GB/s | Dec 2011 | R9 280X |
| Polaris 10 | GCN 4.0 | 2048/32 | 211-224GB/s | Jun 2016 | RX 570 |
| Polaris 10 | GCN 4.0 | 2304/36 | 224-256GB/s | Jun 2016 | RX 580 |
| Navi 10 | RDNA 1.0 | 2304/36 | 448GB/s | Jul 2019 | RX 5700 |

We can't compare GCN 1.0 to RDNA 1.0 directly, but we can do the next best thing. The original Graphics Core Next silicon, codenamed Tahiti, is represented here by the Radeon R9 280X with 32 compute units. Its 384-bit memory interface tops out at 288GB/s of bandwidth and can be easily underclocked to 256GB/s. Moving onto the evergreen Polaris architecture, the Radeon RX 570 has the same CU count, and its RAM can be overclocked to 256GB/s. The plan is starting to come together - we can compare GCN 1.0 and GCN 4.0 directly. However, working back from the top-end, the RX 5700 has 36 compute units, which presents a problem. We can underclock the GDDR6 to reach 256GB/s of bandwidth, but we have four more CUs than our other cards.
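The bandwidth matching comes down to simple arithmetic: peak bandwidth is bus width in bytes multiplied by the per-pin data rate. The sketch below is illustrative only. The 384-bit bus for the R9 280X comes from the text, but the other bus widths and all stock data rates are assumptions taken from the cards' public spec sheets, not figures stated in this article, and the function names are made up for the example.

```python
def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


def data_rate_for(bus_width_bits: int, target_gbs: float) -> float:
    """Per-pin data rate (Gbps) needed to hit a target bandwidth on a given bus."""
    return target_gbs * 8 / bus_width_bits


# (bus width in bits, stock per-pin data rate in Gbps) - assumed spec-sheet values.
CARDS = {
    "R9 280X": (384, 6.0),
    "RX 570": (256, 7.0),
    "RX 580": (256, 8.0),
    "RX 5700": (256, 14.0),
}

TARGET_GBS = 256.0  # the common bandwidth used for every card in the test

for name, (bus, rate) in CARDS.items():
    stock = bandwidth_gbs(bus, rate)
    needed = data_rate_for(bus, TARGET_GBS)
    print(f"{name}: stock {stock:.0f}GB/s, needs {needed:.2f}Gbps per pin for {TARGET_GBS:.0f}GB/s")
```

Under those assumptions, the R9 280X and RX 5700 have to be slowed down, the RX 570 has to be sped up, and the RX 580 already sits at the target.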
This is where our final GPU proves crucial - the Polaris-powered RX 580 has the same CU count as Navi and comes with the required 256GB/s of memory bandwidth straight out of the box. In short: we can't compare GCN 1.0 to RDNA 1.0, but we can compare Tahiti to Polaris and Polaris to Navi, establishing a chain between architectures and, using those two figures, projecting the percentage gain a 32 CU version of Navi would likely bring.

This leaves just one more minor challenge - matching core frequency. Our MSI R9 280X tops out at 1050MHz, so I settled on a round 1.0GHz clock for all cards. The RX 5700 clock does wander a little, even when drastically underclocked to this level, but not to any degree that's likely to unduly compromise the result. Game on - but now the next challenge is to figure out what we're actually going to test on all four cards.

| Benchmark | Tahiti/32CUs | Polaris/32CUs | Polaris/36CUs | Navi/36CUs |
|---|---|---|---|---|
| 3DMark Firestrike DX11 | 9197 | 11299 | 11991 | 14153 |
| 3DMark TimeSpy DX12 | 2279 | 3149 | 3696 | 5035 |

There's a lot to cover here and we're starting with some synthetic benchmarks to set the stage. But before we go on, let's just stress that this is an architectural performance analysis and not representative of the actual products - remember that we've altered core clock, memory bandwidth or both (sometimes quite drastically) in order to get an idea of how AMD's graphics technology has evolved over the last seven years.

3DMark's graphics scores on its established Firestrike DX11 and TimeSpy DX12 benchmarks are our first port of call. Firestrike shows a 23 per cent boost in throughput between Tahiti and Polaris, and a lesser 18 per cent uplift from Polaris to Navi. End to end, by compounding those two percentage gains, the improvement is around 45 per cent overall. The usefulness of 3DMark is often called into question, but as we'll see on the next page, this figure is close to actual gaming performance under DirectX11.
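The chaining described above can be sketched in a few lines. This is a minimal illustration of the arithmetic rather than the tooling used for the article: the scores are copied from the 3DMark table, the function name is made up, and the end-to-end figure is taken to be the product of the two like-for-like ratios.

```python
# Graphics scores from the 3DMark table:
# (Tahiti 32 CU, Polaris 32 CU, Polaris 36 CU, Navi 36 CU)
SCORES = {
    "Firestrike DX11": (9197, 11299, 11991, 14153),
    "TimeSpy DX12": (2279, 3149, 3696, 5035),
}


def chained_gain(tahiti32, polaris32, polaris36, navi36):
    """Return (Tahiti->Polaris, Polaris->Navi, end-to-end) as multipliers.

    Each step compares cards with matching CU counts, clocks and bandwidth;
    the end-to-end figure is the product of the two steps.
    """
    gcn_step = polaris32 / tahiti32   # both 32 CUs
    rdna_step = navi36 / polaris36    # both 36 CUs
    return gcn_step, rdna_step, gcn_step * rdna_step


def as_percent(multiplier: float) -> int:
    """Convert a multiplier such as 1.45 into a whole-number percentage gain."""
    return round((multiplier - 1) * 100)


for name, scores in SCORES.items():
    gcn_step, rdna_step, total = chained_gain(*scores)
    print(f"{name}: +{as_percent(gcn_step)}% then +{as_percent(rdna_step)}% "
          f"= +{as_percent(total)}% end to end")
```

Multiplying the ratios, rather than adding the percentages, is what yields the projected figure for a notional 32 CU Navi part: roughly 45 per cent for Firestrike and 88 per cent for TimeSpy.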
TimeSpy stress-tests a card's DX12 credentials and produces an eye-opening result. Polaris beats Tahiti by 38 per cent, while Navi exceeds Polaris by 36 per cent. From one end of AMD's recent GPU history to the other, and factoring out Navi's extra CUs, Navi delivers a vast 88 per cent improvement over Tahiti - and again, this is reflected in many of our DX12 gaming results. The question is: are we looking at a genuine architectural improvement here, or is Tahiti simply lacking a decent DX12 implementation by today's standards?

| Benchmark | Tahiti/32CUs | Polaris/32CUs | Polaris/36CUs | Navi/36CUs |
|---|---|---|---|---|
| GFXBench Tessellation | 111fps | 711fps | 718fps | 947fps |
| GFXBench ALU2 (Compute) | 655fps | 825fps | 913fps | 1178fps |

Spoilers: we're going to see some crazy results running the same workloads across the AMD generations over the next few pages, where the GCN 1.0 technology is going to fall short of both Polaris - and especially Navi - owing to its far more limited feature set and lower levels of raw horsepower. And that's where the GFXBench OpenGL benchmarks above may prove illuminating. Polaris vs Tahiti sees a 26 per cent increase in compute performance, despite identical levels of rated compute and memory bandwidth. Navi vs Polaris sees a slightly larger leap of 29 per cent. From Tahiti to Navi, a notional 32 CU part based on the new architecture would deliver a 62 per cent improvement in compute performance.

I've also included the tessellation result as an example of how AMD has beefed up geometry processing over the years. We know about game benchmarks being limited by compute power, ROPs or memory bandwidth - but could we see games being held back simply by triangle set-up? Yes indeed. The numbers speak for themselves here, but essentially from a 32 CU Tahiti to a 36 CU Navi, this aspect of AMD's architecture has seen an 8.5x increase in seven years - while totting up architectural gains from generation to generation, there's a 745 per cent performance uplift.
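To see why the comparison is chained through Polaris rather than simply scaled by CU count, it helps to put the two approaches side by side. The sketch below uses the fps figures from the GFXBench table. The "linear_cu" figure is an alternative normalisation shown purely for contrast (it assumes performance scales linearly with CU count, which is an assumption, not something the article claims), and all names are illustrative.

```python
# fps figures from the GFXBench table:
# (Tahiti 32 CU, Polaris 32 CU, Polaris 36 CU, Navi 36 CU)
GFXBENCH = {
    "Tessellation": (111, 711, 718, 947),
    "ALU2 (Compute)": (655, 825, 913, 1178),
}


def compare(t32, p32, p36, n36):
    """Return several Tahiti-to-Navi multipliers for one benchmark."""
    return {
        # The article's method: multiply the two like-for-like steps.
        "chained": (p32 / t32) * (n36 / p36),
        # Raw 36 CU Navi against 32 CU Tahiti, extra CUs included.
        "direct": n36 / t32,
        # Hypothetical alternative: scale Navi's score down by 32/36.
        "linear_cu": (n36 * 32 / 36) / t32,
        # What four extra CUs actually bought Polaris in this test.
        "polaris_cu_scaling": p36 / p32,
    }


results = {name: compare(*fps) for name, fps in GFXBENCH.items()}

for name, figures in results.items():
    print(name)
    for key, value in figures.items():
        print(f"  {key}: {value:.2f}x")
```

In the tessellation test, Polaris gains about one per cent from its four extra CUs, so scaling Navi's score down by 32/36 would understate a 32 CU Navi considerably. The chained figure of roughly 8.45x (the 745 per cent uplift) sits close to the direct 8.5x result for that reason. In the compute test, CU scaling on Polaris is closer to linear, and the chained and linear estimates land nearer to each other.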
Now it's time to apply the same methodology to a range of games. The results are interesting, but not as consistent or as clear-cut as you might imagine.