Hacker News

ZenHammer: Rowhammer attacks on AMD Zen-based platforms (comsec.ethz.ch)

330 points, transpute posted a month ago

141 Comments:

tdullien said a month ago:

Coauthor of the original Rowhammer exploit here. ECC remains a highly effective method for turning this from a security issue to a reliability issue, mostly. As an individual owner of a server, if that server has ECC and you expect to notice machine halts due to uncorrectable ECC errors, the security implications for you are modest.

Now, if you are a cloud provider that provides VMs on multitenant hosts, your threat model may be different.

Either way, avoid machines without ECC. TRR was a lame duck even when Rowhammer was still fresh, and bits flipping in DRAM will not go away unless the economics in DRAM manufacturing change (i.e. they won't).

Arnavion said a month ago:

I would use ECC memory if I could. I used to use a TR 2920x with ECC but now I'm on a Ryzen 7950x with non-ECC. Unbuffered ECC memory is the only one supported by Ryzen, and it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory. The latest Threadripper lineup supports Registered ECC, but Threadripper is overkill (cost, threads, PCIe lanes) for home users like myself.

justinclift said a month ago:

> and it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory.

That's not anything new though.

My local supplier (https://www.scorptec.com.au) has a fair amount of both, with ECC currently being about double the price (ugh).

When the AM4 generation was current, the difference in price was a lot less though. :/

Still, it's worth it for peace of mind. Especially if you're undervolting the CPU. ;)

namibj said a month ago:

Keep in mind that for my 5950X I had to buy Micron Rev. E 16Gbit x8 die based DIMMs, rated 3200CL22, running 3600CL18. I.e., they just don't ship with XMP presets.

Overclock it yourself. It's not that hard.

Arnavion said a month ago:

The advantage of buying RAM with XMP presets is that the reseller who created the preset has tested the sticks with that overclock and binned the original chips accordingly. When you buy RAM that is only rated for the default speed (as ECC server RAM is), you have no guarantee that all sticks will overclock the same amount, so in the worst case one stick will bring all the other sticks down to its level.

namibj said 19 days ago:

Running at that speed with that little voltage implied good speed binning, though.

Also I tried to get factory-OC'd RAM, but couldn't find any 32G sticks with ECC.

account42 said a month ago:

> it's slower or more expensive or both compared to the equivalent-capacity non-ECC memory

How much of that is it being actually slower and how much is it just ECC memory not being sold at pre-overclocked speeds?

foresto said a month ago:

Have you considered a 7950X3D? The larger CPU cache might make up for the RAM speed difference.

Arnavion said a month ago:

I didn't want to have to deal with the non-uniform CCDs. Of course the two on a 7950x aren't uniform either due to silicon lottery (eg on mine the first CCD clocks 100MHz higher than the other on all-core load and 200MHz higher on single-core load), but that is a small difference. It would presumably be more pronounced on the 7950x3d since only one has access to the extra cache. So I would be using it "sub-optimally" if I didn't `taskset` / cgroup everything to run on one CCD or the other.
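For reference, pinning a process to one CCD from Python might look roughly like the sketch below. The helper names `ccd_cpus` and `pin_to_ccd` are made up for illustration, and the CPU numbering (cores 0-7 on CCD0, cores 8-15 on CCD1, SMT siblings numbered 16-31) is an assumption that should be checked with `lscpu -e` on the actual machine.

```python
import os


def ccd_cpus(ccd: int, cores_per_ccd: int = 8, total_cores: int = 16,
             smt: bool = True) -> set[int]:
    """Logical CPU ids assumed to belong to one CCD (numbering is an assumption)."""
    first = ccd * cores_per_ccd
    cpus = set(range(first, first + cores_per_ccd))
    if smt:
        # Assumption: the SMT sibling of core n is logical CPU n + total_cores.
        cpus |= {c + total_cores for c in cpus}
    return cpus


def pin_to_ccd(ccd: int, pid: int = 0) -> set[int]:
    """Restrict pid (0 = calling process) to the CPUs of one CCD. Linux only."""
    # Only keep CPUs the process is currently allowed to use.
    wanted = ccd_cpus(ccd) & os.sched_getaffinity(pid)
    if not wanted:
        raise ValueError(f"no usable CPUs found for CCD {ccd}")
    os.sched_setaffinity(pid, wanted)
    return wanted
```

Calling `pin_to_ccd(0)` early in a program would then have roughly the same effect as launching it under `taskset` with the matching CPU list.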

foresto said a month ago:

I wonder what workload needs more than eight dual-threaded cores, but has trouble if the additional cores have more cache or the RAM isn't factory-overclocked, and doesn't care about data integrity.

cbozeman said a month ago:

> bits flipping in DRAM will not go away unless the economics in DRAM manufacturing change

This seems like all the argument necessary to require all computers - and I do mean all computers - to require ECC memory. The security risk is simply too great, and everything is too integrated not to make this change. Even a "gamer" on a pure gaming computer will have some crucial information on that machine, so I simply do not see how we've gone this far without making this change.

maxcoder4 said a month ago:

>Even a "gamer" on a pure gaming computer will have some crucial information on that machine

Which belongs to the gamer user, so it can be extracted or encrypted by any malware running on that system. No need for esoteric attacks like rowhammer at that point. Unless you're thinking about, for example, exploiting a user's machine via rowhammer in JS running in a browser tab, but as far as I know that was never made practical.

tdullien said a month ago:

The reality is that Rowhammer remains one of the hardest ways to compromise a machine, largely because the software stack most people are running isn't great.

myself248 said a month ago:

I've heard that DDR2 is immune to rowhammer. Is that actually the case, or is it just because nobody's looked at it? Is SRAM the only thing that's truly immune?

treprinum said a month ago:

AMD now took Intel's market segmentation approach and is disabling ECC on most Ryzen CPUs. Only Pro and Threadrippers have it guaranteed, then some boards with some desktop Ryzens.

c2h5oh said a month ago:

Nothing changed since first Ryzen launch:

- Desktop Ryzen CPUs support ECC, but implementation by motherboard vendors is not mandatory

- Laptop and G-series Ryzen CPUs only support ECC in pro variant

- Threadripper has ECC support

edit: not confirmed but supposedly laptop and APUs starting with 6000 series all support ECC.

0xcde4c3db said a month ago:

There are also a few oddball desktop SKUs that are actually a G-series processor with the GPU disabled (primarily ones below the "600" tier, e.g. the Ryzen 3 4100 or Ryzen 5 5500), which also lack ECC support.

treprinum said a month ago:

7840HS laptop CPU definitely doesn't have ECC, only the PRO variant.

adrian_b said a month ago:

What is weird is that for almost 2 years AMD published specifications stating that all the Ryzen 6000 series and Ryzen 7000 series laptop CPUs (Rembrandt and Phoenix) support ECC, and then suddenly and silently removed the statement about ECC support from all those specifications.

For the current Ryzen 8000 laptop series (Hawk Point), the ECC support has been specified as missing from the beginning.

sunshowers said a month ago:

I thought the same as you, but the situation's less dire thankfully. Wrote a post about it a while back that was top of HN: https://sunshowers.io/posts/am5-ryzen-7000-ecc-ram/

justinclift said a month ago:

Possibly interestingly, the motherboard (ASRock X670E Taichi) you mention there now has ECC listed on the ASRock page:

https://www.asrock.com/mb/AMD/X670E%20Taichi/index.asp#Speci...

    Supports DDR5 ECC/non-ECC, un-buffered memory up to 7800+(OC)

sunshowers said a month ago:

Oh awesome. That's really fantastic.

The Riptide motherboard I use also has ECC listed now. https://pg.asrock.com/mb/AMD/B650E%20PG%20Riptide%20WiFi

treprinum said a month ago:

"Supports" might mean you can run unbuffered ECC UDIMMs but without ECC? Even Intel can run ECC UDIMMs in non-ECC mode. Also some manufacturers don't distinguish between "on-die ECC" (DDR5) and real ECC.

sunshowers said a month ago:

No, the memory controller believes ECC is activated. See my post :)

Dylan16807 said a month ago:

Less dire for sure, but having to pay double the cost for ECC modules is still pretty dire.

adrian_b said a month ago:

That is not true. The last time I looked, ECC DDR5 UDIMM modules were priced at most 50% higher.

Nevertheless that is still excessively high. While in the beginning for DDR5 there were only 80-bit modules, which could claim a +25% higher price, now there are 72-bit modules, like in the previous generations, which can justify at most a +12.5% price increase.
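As a rough check of those percentages, here is a small sketch. It assumes the justified premium simply scales with the number of extra DRAM bits relative to a 64-bit non-ECC module; the function name `ecc_overhead` is made up for illustration.

```python
def ecc_overhead(total_bits: int, data_bits: int = 64) -> float:
    """Extra DRAM bits on an ECC module, as a fraction of the non-ECC data width."""
    return (total_bits - data_bits) / data_bits


for width in (72, 80):
    # Prints the overhead of a 72-bit and an 80-bit module as a percentage.
    print(f"{width}-bit module: +{ecc_overhead(width):.1%} DRAM")
```

This gives +12.5% for 72-bit modules and +25% for 80-bit modules, the two figures quoted above.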

Dylan16807 said a month ago:

I looked right when I posted that and found 2x32GB non-ECC 5600MHz for $165, exactly half the $330 price of the sticks listed in the post. I spent several minutes looking for cheaper ECC at the same specs and couldn't find any.

Trying again, I can find some Kingston sticks that are $120 each, so that's about 50% higher. Amazon's search is really bad, by the way. But that's not the "at most" price. And a month ago they were $140 each.

Also the spec sheet says they are 72 bit modules made out of 20 2GB x8 chips? That is baffling. Is 10% of the memory going unused? https://www.kingston.com/datasheets/KSM56E46BD8KM-32HA.pdf

Edit: Micron data sheets suggest that UDIMMs have 2x13 command/address pins and RDIMMs have 2x6, so that's one piece of the puzzle. Apparently UDIMMs can do x64 and x72, and RDIMMs can do x72 and x80.

adrian_b said a month ago:

When DDR5 was first introduced, there were only x8 chips.

Because the DDR5 channels must have a width of 32, 36 or 40 bits, with x8 chips one had to use 40-bit channels, even if only 36-bit channels are needed, so indeed 10% of the memory capacity remained unused.

Meanwhile, about a year ago, at least Micron has also introduced x4 chips. There are such ECC UDIMMs, using both x8 and x4 chips, which waste no memory.

On the market there are both modules made only with x8 chips, which do not use a part of the memory, and modules with a combination of x8 and x4 chips, without unused memory.
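The unused fraction can be worked out with a short sketch. It assumes each subchannel needs 36 bits (32 data plus 4 ECC) and that the physical width is just the sum of the chip widths; `unused_fraction` is a made-up helper name.

```python
def unused_fraction(chip_widths: list[int], needed_bits: int = 36) -> float:
    """Share of a subchannel's physical width that carries neither data nor ECC."""
    physical_bits = sum(chip_widths)
    if physical_bits < needed_bits:
        raise ValueError("not enough chips for the requested channel width")
    return (physical_bits - needed_bits) / physical_bits


# Five x8 chips form a 40-bit subchannel, of which only 36 bits are used.
print(unused_fraction([8] * 5))
# Four x8 chips plus one x4 chip form exactly 36 bits.
print(unused_fraction([8] * 4 + [4]))
```

The all-x8 layout leaves 10% of the capacity unused, while the mixed x8/x4 layout leaves none.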

wtallis said a month ago:

Is that a change? I think what you described applies equally to every generation of Zen processors: Pro-branded chips have ECC capability officially, laptop chips don't have it, and consumer-branded chips have it unofficially with ECC capability optional for motherboards.

adrian_b said a month ago:

For a few generations now, at least since Zen 3, the desktop Ryzen CPUs have had official ECC support, not unofficial ECC support like the first Ryzen.

This can be verified easily on the AMD site by reading the CPU specifications.

For laptop CPUs, there was a period between the beginning of 2022 and the autumn of 2023 when ECC support was specified for all mobile Rembrandt and Phoenix CPUs, but then ECC support was suddenly removed from the specifications of all non-Pro laptop CPUs.

dralley said a month ago:

Is it disabled or is it simply not certified?

Last I checked ECC was not certified to work on most "consumer" oriented hardware, but AMD didn't make any attempt to actually disable it.

mjevans said a month ago:

Last I checked, DESKTOP AMD CPUs have working (not disabled, but not 'supported') ECC with DDR5 UDIMMs (5v source, not 12v server ram). Desktop BOARDS, depends on the HW + BIOS; initial firmware revisions didn't do ECC but for many boards on some brands it does work. I haven't checked recently.

adrian_b said a month ago:

No longer true.

At least since Zen 3 (2020), all AMD desktop Ryzen CPUs have ECC support clearly included in their specifications, so it is official support, not just a non-disabled ECC.

This change came around the same time that Intel began to support ECC in some Alder Lake desktop CPUs (and in their successors), so it might have been a response to Intel's decision.

So now the ECC support depends strictly on the motherboard manufacturer. The best chance to find motherboards with ECC support is at ASUS and at ASRock (including ASRock Rack, which offers server boards for Ryzen CPUs).

foresto said a month ago:

All the Ryzen 7000 series desktop processors support ECC, I think. Check each model's specs to be sure.

Asus motherboards for those CPUs also support it, as stated in the BIOS manuals I looked over. It requires changing the BIOS ECC setting from Auto to Enable.

I have done this on one such system, and the appropriate EDAC messages showed up in the Linux boot log.

crest said a month ago:

I have a few questions I haven't found satisfactory answers for in the existing papers:

* Are modern patrol read engines guided by the memory access patterns to respond to RowHammer style attacks?

* How aggressively would a patrol read engine have to scan the DRAM to safely stay ahead of RowHammer induced bit-flips?

* Would larger ECC words than the traditional 64+8 with multi-bit error correction change the game and allow us to build more reliable systems from DRAM with pattern vulnerabilities?

c2h5oh said a month ago:

I would expect that increased crash rate of multi-tenant hosts would be something that would be detected and investigated by the cloud provider. At the same time targeting a specific tenant would require a lot of luck.

crotchfire said a month ago:

> What about DIMMs with Error Correction Codes (ECC)? Previous work on DDR3 showed that ECC cannot provide protection against Rowhammer.

This is incredibly misleading. The paper they cite states:

> When the ECC detection is used correctly 0.65%-7.42% of all bit flips still cause silent corruptions... On setup AMD-1, uncorrectable errors crash the system.

The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.

Kudos for calling out JEDEC's terrible behavior on the rowhammer question, but we should not be downplaying ECC as a near-term solution.

wolpoli said a month ago:

> The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.

Is there a process for the operations team managing the system to figure out that it was an attack and not just flaky hardware?

adrian_b said a month ago:

Memory bit flips are very rare.

Normally a memory error does not happen more than a few times per year, unless you have a huge amount of memory.

Therefore, when two memory errors (correctable or uncorrectable) happen on the same day, that should be enough to trigger an immediate report to the user or administrator of the computer: either there is an ongoing RowHammer attack that must be stopped, or one of the memory modules is approaching end-of-life due to aging and must be replaced before it begins to produce very frequent memory errors.

At least on server computers it should be easy to configure their logging system so that a second memory error per day, even if it was correctable, should immediately send an e-mail message and/or an SMS to the administrator.
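A minimal sketch of such a check on Linux could look like this. It assumes the EDAC driver is loaded and exposes its usual per-controller `ce_count` and `ue_count` files in sysfs. The helper names and the idea of keeping a list of error timestamps are illustrative, and the actual e-mail or SMS delivery is left out.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location (assumes an EDAC driver is loaded).
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root: Path = EDAC_ROOT) -> tuple[int, int]:
    """Sum corrected (ce) and uncorrected (ue) error counters over all controllers."""
    ce = sum(int(p.read_text()) for p in root.glob("mc*/ce_count"))
    ue = sum(int(p.read_text()) for p in root.glob("mc*/ue_count"))
    return ce, ue


def should_alert(error_times: list[float], now: float,
                 window_s: float = 86_400.0, threshold: int = 2) -> bool:
    """True if at least `threshold` errors were seen within the last `window_s` seconds."""
    recent = [t for t in error_times if 0 <= now - t <= window_s]
    return len(recent) >= threshold
```

A monitoring job would poll `read_edac_counts()` periodically, append a timestamp whenever either counter increases, and notify the administrator once `should_alert(...)` returns true.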

wolpoli said a month ago:

If that's the case, then I guess they would take the physical server offline. And if other machines started showing similar signs of failure, then they would analyze the logs for a possible rowhammer attack?

crotchfire said a month ago:

Sure: you replace the hardware with brand new hardware and it keeps happening. Then you know it's not the hardware.

pixl97 said a month ago:

The same workload starts crashing after migrating to multiple machines?

justinclift said a month ago:

Sounds like a process thing that would need to be developed by each team. So probably a mix of results there.

p1necone said a month ago:

> The attacker will need to cause dozens of machine halts in order to achieve even a single exploitable bitflip. Dozens of machine halts is not something that goes undetected.

That holds if you're targeting a specific machine. If you're throwing the exploit at a few thousand machines shotgun style, then you're still going to get your botnet; it'll just be smaller.

crotchfire said a month ago:

Can you point to any botnets which were built using rowhammer attacks?

Rowhammer and speculative execution attacks are incredibly labor-intensive and target-specific. They are targeted attacks for high-value targets.

vlovich123 said a month ago:

I think the point is that people with thousands of machines are probably going to notice if a meaningful chunk of them start halting.

SAI_Peregrinus said a month ago:

Yep, and desktop users will certainly notice. Only AMD has desktop (not workstation) ECC support.

riedel said a month ago:

If you are running windows 10 random halts and the CPU getting hot won't seem suspicious.

p1necone said a month ago:

Why do you need to target one person who has thousands of machines? What if I just want to pwn whatever random machines visit my dodgy website? Dismissing an exploit just because it only works some fraction of the time seems overly optimistic to me.

jquery said a month ago:

Thanks for this. One reason I bought ECC for my home desktop was specifically for protection against Rowhammer (Zen2 TR platform), and that line made my heart race a bit. Very misleading.

transpute said a month ago:

Any recommendations for client devices with ECC memory?

wtallis said a month ago:

If it has ECC memory, it's going to be branded as a workstation or server or industrial device, not marketed as a consumer device.

Among consumer products, some AMD desktop CPUs and motherboards support ECC memory, and that's about it.

justinclift said a month ago:

For desktops, ASRock motherboards seem to be the common choice for people wanting ECC memory.

It's specifically mentioned on the ASRock motherboard pages under "Specifications". Some random examples:

https://www.asrock.com/mb/AMD/B650%20Pro%20RS/index.asp#Spec...

https://pg.asrock.com/mb/AMD/B650%20PG%20Lightning%20WiFi/in...

https://www.asrock.com/mb/AMD/X670E%20Taichi/index.asp#Speci...

These all have:

    Supports DDR5 ECC/non-ECC, un-buffered memory up to 7200+(OC)

jeffbee said a month ago:

I think it's worth investigating the level of "support" these boards offer for ECC. The ASRock Taichi for example does not have any ECC DIMMs in its "qualified" list.

justinclift said a month ago:

Interesting. Might be good for someone (not me!) to investigate then write in-depth info about. :)

As a data point, I'm using a previous generation ASRock AM4 motherboard with ECC and that definitely works.

I'm undervolting my cpu and ram, and very occasionally (every 6 months or so?) one of those seems to be generating a correctable ECC error that gets propagated to warning messages on my terminal. Haven't bothered investigating any further though. ;)

adrian_b said a month ago:

The laptops with ECC memory are expensive and they are available for now only with Intel CPUs (while it should be possible to use mobile AMD CPUs I have never seen any such product). They are sold as "mobile workstations" by Dell, Lenovo and HP. I have a Dell Precision mobile workstation laptop with ECC memory bought in 2016 and it still works fine. However I had to pay for it EUR 3000 in 2016 and now something similar would be even more expensive (it had an NVIDIA Quadro GPU and 32 GB of ECC memory).

For desktops it is much easier to choose ECC memory, because the additional cost (the cost of the memory modules is 50% higher for DDR5-4800) remains a small fraction of the cost of an entire computer.

What is needed is to buy a motherboard with ECC support.

An example of a good motherboard with ECC support is ASUS PRIME X670E-PRO WIFI (for AMD Ryzen). I have been using a similar ASUS motherboard with ECC memory from the previous X570 generation for the last 5 years and it still works fine.

There are several other such MBs, mainly at ASUS and ASRock.

For Intel Raptor Lake there are fewer and more expensive such motherboards, but they can be found at ASUS (Pro WS W680M-ACE SE) and at Supermicro, as "workstation motherboards".

reliabilityguy said a month ago:

--

crotchfire said a month ago:

It will detect (by crashing) enough to make exploitation impractical. That is the key point.

reliabilityguy said a month ago:

I would say that 60% success per trial is a good chance.

exmadscientist said a month ago:

In the process of generating one triple flip, many, many, many, many, many single and double flips will occur and will be caught. That is why ECC is still an effective defense. Attackers don't just get to go straight to their end game.

YetAnotherNick said a month ago:

You can cause any amount of single and double flip without worry. It's not a defence as the attacker can retry till ECC labels it as uncorrectable. AFAIK there is no cost in retrying.

exmadscientist said a month ago:

That's true, but none of it is silent. Corrected errors get reported and it will be obvious that something is going wrong to anyone who's paying attention.

YetAnotherNick said a month ago:

Reported where? There is no reporting in Ryzen CPUs.

theevilsharpie said a month ago:

Ryzen CPUs report ECC errors like any other modern CPU -- it raises a Machine Check Exception which the operating system is expected to handle. Linux and Windows will both handle and log any ECC errors that the CPU raises. Presumably the various BSDs do as well.

YetAnotherNick said a month ago:

[1] says AM4 doesn't support reporting:

> While the X570D4U-2L2T and its predecessors the X470D4U series supports ECC memory, there is a bit of a gotcha. As readers noted in the original X470D4U reviews, while ECC memory is supported and performing error correction, the reporting of that error correction was not functioning. In other words, even if you were experiencing continuous memory errors, no log of those errors was being recorded in the IPMI event log where one might expect them to show up. A user over on the Level One Techs forums had a conversation thread with someone from ASRock Rack, who reported that while the AM4 platform had ECC support, it did not have error reporting support.

[2] says:

> However we got AMD official respond today. AM4 does not support ECC error reporting function

[1]: https://www.servethehome.com/asrock-rack-x570d4u-2l2t-review...

[2]: https://forum.level1techs.com/t/asrock-rack-x470d4u2-2t/1475...

theevilsharpie said a month ago:

This is referring to ECC as reported from an out-of-band management controller on those specific motherboards. This needs either some platform-level integration into the memory controller (which is unlikely to be present on consumer Ryzen), or a driver/agent running on the host OS that captures ECC events and sends them to the out-of-band management controller.

Standard ECC events are handled by the OS, and don't depend on (or otherwise interact with) an out-of-band management controller or any other external device. This works fine on Ryzen.

namibj said a month ago:

Got a reference? Because my Zen3 desktop has the driver loaded and information shown, just not the bitflips but that may be due to excessively early refresh configuration.

adrian_b said a month ago:

Normally you should not see any bit flips, because they happen at intervals of several months or even less frequently, depending on location.

Only for some old modules, e.g. 5 years old or older, can the frequency of errors increase a lot, even up to many bit flips per day, which means that the offending module must be replaced.

This feature of identifying the aged modules is one of the main benefits of ECC.

I have not looked again at the AMD EDAC driver, which has been updated during last year, but previously, a couple of years ago, its feature of injecting errors for testing was broken on Ryzen (because it had not been updated since Bulldozer, at that time), so the only method to verify that error reporting is working was to overclock the memory in the BIOS settings, to ensure that errors will be generated. Obviously, for the test one should boot from read-only media, to avoid the corruption of the storage in the case of excessive errors.

namibj said 19 days ago:

Overall I've looked at about 100 GB*years of EDAC counters on this machine, and there was never a single error.

If I knew how, I'd dial down the voltage very slowly while running a rowhammer PoC to either catch it hammering or catch edac counts.

reliabilityguy said a month ago:

--

exmadscientist said a month ago:

The ECCploit paper has extensive discussion of all the ways their work is detected, and how they even use detection to probe the correction structure. This is not a silent attack. This is a proof that ECC is a penetrable defense. Which we all know! The question is how difficult it is and how stealthily it can be done.

But regardless, ECC still sounds the alarm when it's being attacked. If no one listens, there's not much ECC can do about that.

rightbyte said a month ago:

That's true for encryption too.

VHRanger said a month ago:

Serious question: as an average person, are those hardware security issues (rowhammer, spectre, meltdown) an actual risk?

My understanding with spectre and meltdown was that it was an issue for escaping VMs and similar attacks - something AWS engineers should care about, but not me

gary_0 said a month ago:

The solution is to disable JavaScript and not run any untrusted apps. And then move to a shack in the woods and live off the land, because you just cut yourself off from modern society.

bee_rider said a month ago:

Noscript is annoying for like a week until you get the sites that you use frequently and basically trust whitelisted.

Sure, it isn’t perfectly safe. If HN or my employer goes evil, they can rowhammer me I guess. I’d expect it to cause a big todo, though, so I’m not that worried about it.

I don’t really understand why people seem to think disabling JS is a big hassle. Is this motivated reasoning by web devs or something?

It is not a big problem, and the sort of “ambient shittiness” of the internet is greatly improved by doing it. Most sites work fine, they’ll default to some (better) less dynamic state, maybe some ads won’t load. For those sites that don’t work, you can make an exception or leave. Personally I’m now mostly visiting sites by people who don’t enjoy overcomplicating things, and who think about fallbacks. It is great!

spxneo said a month ago:

The year is 2024. The solar panels you installed from Alibaba begin to search for cell towers. The local LLM voice bot you built to keep you company is using a malicious npm package that suddenly communicates with the solar panels and starts sending packets to a Chinese server.

malfist said a month ago:

Your solar panels are talking to China, your lightswitch is part of a massive botnet promoting bitcoin on X, your car is selling your data to your insurance company to have an excuse to raise your rates, your browser is protecting your privacy by routing all your sensitive information through their servers for them to inspect.

Your phone is selling your location for antiabortion fanatics to harass you, or help your stalker find you. Your ISP is selling your browsing history to anyone with a dollar.

That data broker that everyone was selling to just went bankrupt and the banks are selling your data to anyone with a penny.

We desperately need wholesale privacy regulation.

spxneo said a month ago:

We have a "democracy",

Yet no privacy.

It's almost like the people who run the country,

do not believe in democracy.

tycho-newman said a month ago:

The fact that JavaScript is essential to modern society makes me want to cry.

berkes said a month ago:

I'm still not sure what's worse.

The fact that I must run JavaScript written by just about anyone, in order to live in a modern society, or the fact that I keep having to write code in JavaScript in order to run a (completely non-JS related) business.

spxneo said a month ago:

back in my day we had a thing called Java Applets. You had to download Java to get snowflakes on your geocities son!

rtehfm said a month ago:

Sounds like a recipe to become the next Ted Kaczynski.

bee_rider said a month ago:

Just install Firefox, then noscript, and skip the bit about the shack.

bitwize said a month ago:

Ted Kaczynski's views are pretty popular on Hackernews.

rustcleaner said a month ago:

Too bad we won't see Uncle Ted give a TED Talk. :^(

transpute said a month ago:

Some browsers (including Brave on iOS) can disable Javascript by default, to be enabled only on trusted sites where 3rd-party ads are blocked.

ngneer said a month ago:

No. As a sober hardware security researcher, most exploited vulnerabilities that would affect an average person are far more mundane and mostly software driven.

bee_rider said a month ago:

Everyone should install some kind of script whitelisting add-on and only run JavaScript programs from websites that they really trust. I like noscript. I’m not sure what the Chrome pick is.

Other than that… we don’t often run random programs from the internet, right?

They’ve only scratched the surface for these sorts of bugs. Modern hardware is too complex to actually believe they’ll ever get them all.

sundvor said a month ago:

Definitely not a security expert here, but this is one of the reasons why I at least run ublock origin on just about everything, and recommend everyone do the same. The ad delivery networks are just such a huge vulnerability surface.

Noscript would be much better of course, I guess I'm just too lazy to go that extra step.

bee_rider said a month ago:

I’m not an expert either, but I think the experts are not really very useful in this context.

At least, I typically see things about the trade off between usability and security and the need to enable certain use-cases. I think most security experts work in industry where their job is to figure out what can be done to patch things up within the constraint that their job doesn’t exist unless the company can do the stuff it needs to do to stay in business.

I don't really care about any of that, I just want to be able to read text from the internet without my system getting messed up. It is a much easier use-case, because static content is usually pretty safe (although I do think there have been vulnerabilities in font and image rendering libraries). We don’t need an expert to intelligently analyze things and balance against the interests of competing parties because there’s no need to push in the “open things up” direction for the most part.

cortesoft said a month ago:

Rowhammer has a javascript implementation that can run in the browser: https://github.com/IAIK/rowhammerjs

hathawsh said a month ago:

From a security perspective, a web browser is a kind of VM hypervisor, where each web site may have its own VMs. So yes, everyone can be affected.

rgbrenner said a month ago:

You run untrusted code everyday inside a VM: your browser.

Dalewyn said a month ago:

If you really are an average person, then no: Like most other supposed threats, you lose more to the fixes/mitigations than to the threat itself. They just make for great headlines and sensationalism, which is why you as an average person would hear about them at all.

Note that the average person wouldn't know WTF "DRAM" means, let alone "Rowhammer" or "Zen" or other esoteric industry terms.

ls612 said a month ago:

No. I've run the Rowhammer test in memtest86 on my PC after building it (as part of the whole memtest package to verify my XMP was stable) and got zero errors on 64GB of DDR5 memory over all the passes. If Memtest couldn't do it when trying its hardest to brute force it nobody doing drive-by javascript has any chance to exploit it.

userbinator said a month ago:

Could you tell us what DIMMs you're using? I thought Rowhammer-free RAM was a thing of the past, but if some manufacturer has fixed theirs to be immune, they deserve the extra sales and publicity.

ls612 said a month ago:

Corsair Vengeance 2x32GB 5200Mhz. My understanding is that DDR5 in general is mostly immune to known rowhammer attacks because the on-chip ECC is good enough to fix any issues. This attack seems to work only with AMD Zen processors and not with Intel 12th-14th gen so I suspect DDR5 on intel is still good.

transpute said a month ago:

First paragraph:

  This poses a significant risk as DRAM devices in the wild cannot easily be fixed, and previous work showed that Rowhammer attacks are practical, for example, in the browser, on smartphones, across VMs, and even over the network.

ngneer said a month ago:

That is just one view, namely the authors' view. You may wish to consider recent perfect 10 vulnerabilities for comparison, as these are far more likely to cause problems.

Tuna-Fish said a month ago:

Before browsers got patched, Meltdown could be used to steal browser encryption keys using JS. This absolutely would have affected normal people.

mik1998 said a month ago:

"could", theoretically. In practice, there has never been an observed exploitation of the supposed vulnerability.

mitigations=off

oynqr said a month ago:

Just don't do that on modern AMD processors, you'll lose performance.

ncann said a month ago:

The practical answer is that, if 99.9% of people out there have systems that mitigate these issues, no one will bother using these exploits in the wild, and you can turn off these mitigations to get the perf benefit and be reasonably sure that you won't get exploited. Unless you're targeted, of course.

berkes said a month ago:

But "we", being the average tech experts, also have no way to know when that 1% will hit.

It takes only one creative genius to turn the next security issue into a thing that does affect us all. Some worm that eats all Linuxes, a virus that spreads through all BSDs, or something that installs crypto miners on every second Android or so. We cannot know.

And so we cannot defend ourselves against that. And so it's useless to worry about it. But it will happen. Our systems are way too monoculture, both soft- and hardware, to be protected against a digital potato famine.

magnoliakobus said a month ago:

If 99.9% of people can be exposed to the same malicious code and not even be aware that it was running in the background, it's all the more reason for a malicious actor to expose the largest amount of people to it with relatively minimal risk.

AtNightWeCode said a month ago:

Some of these exploits can be used in a browser. They leave no trace. So it is hard to tell how much these exploits have been used in the past and how likely a wider attack will happen in the future.

Some of these exploits have been used in targeted attacks towards end users so the risk is not 0.

swozey said a month ago:

I have a very vague understanding of all of these DDR bitflip attacks, but I found the original Hammertime paper and it's actually very easy to read. I haven't gone through all of it, but it breaks things down very well so they're easier to understand.

I've heard bitflipping a million times and never really got it (not that I made serious effort) until this.

https://comsec.ethz.ch/wp-content/files/hammertime_raid18.pd...

I feel like I just went through a 101 EE course. I had NO idea any of this was related to the actual hardware manufacturing imperfections, etc.

That explains the name Rowhammer. I've probably been under a rock and everyone knows this stuff.

> Due to the extreme density of modern DRAM arrays, small manufacturing imperfections can cause weak electrical coupling between neighboring cells. This, combined with the minuscule capacitance of such cells, means that every time a DRAM row is read from a bank, the memory cells in adjacent rows leak a small amount of charge. If this happens frequently enough between two refresh cycles, the affected cells can leak enough charge that their stored bit value will “flip”, a phenomenon known as “disturbance error” or more recently as Rowhammer.

KennyBlanken said a month ago:

> Due to the extreme density of modern DRAM arrays, small manufacturing imperfections can cause weak electrical coupling between neighboring cells.

This makes it sound like it's unavoidable and inherent to making DRAM. It isn't.

DRAM manufacturers have been pushing the limits to an extreme. That's why. Pursuit of profit. This is no different from Ford deciding the cost of settling Pinto lawsuits (from injuries and deaths) was less than the cost of fixing the car's design.

ciupicri said a month ago:

I wonder: does Secure Memory Encryption [1] help against this?

[1]: https://www.amd.com/en/developer/sev.html

anticensor said a month ago:

Yes, but you might end up with a huge loss of stability. Even a single bitflip might become a fatal error.

formerly_proven said a month ago:

If you get that many bitflips, the system wasn't stable to begin with.

wtallis said a month ago:

I think the implication was that memory encryption could mean that a rowhammer-induced bitflip would be amplified into scrambling the entire word of memory, which is more likely to have catastrophic effects than a single bit flip. That would be true for any reasonable definition of "stable" that admits any susceptibility to rowhammer.
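
To illustrate the amplification idea, here is a rough sketch. It uses a toy Feistel cipher built on SHA-256 as a stand-in (an assumption made so the example runs with only the standard library; real memory encryption uses a hardware cipher, not this), and the key, block size, and function names are all hypothetical. It flips one bit of the stored ciphertext and counts how many plaintext bits come back wrong.

```python
import hashlib

BLOCK = 16          # bytes per encrypted block (assumption for this sketch)
HALF = BLOCK // 2
ROUNDS = 4          # enough rounds for the toy cipher to diffuse a change


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Keyed round function: hash of key, round number, and one half."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def differing_bits(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


key = b"hypothetical key"
plaintext = b"sixteen byte msg"           # exactly BLOCK bytes
ciphertext = encrypt(key, plaintext)

# Simulate a single Rowhammer-style flip in the stored (encrypted) data.
damaged = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
recovered = decrypt(key, damaged)

encrypted_damage = differing_bits(plaintext, recovered)
print("bits wrong without encryption: 1")
print("bits wrong after decrypting the damaged block:", encrypted_damage)
```

With any reasonable cipher you would expect roughly half of the block's bits to come back wrong, which is why the attacker loses control over the result and the likely outcome becomes a crash rather than a chosen value.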

indeyets said a month ago:

But that’s a good thing. Sane state would be synced to disk and any successful bitflip will halt a system telling you that something bad is going on. It would be “catastrophic” for runtime but not for the data

DarkNova6 said a month ago:

I know far too little about hardware security. Is this one of the many inevitable vulnerabilities that arise from CPU optimization and are of little feasibility in the real world?

rocqua said a month ago:

Arguably worse. This arises from the physics of DRAM. This occurs at a much lower level than an edge case of a feature that lets you leak info over a side channel. Instead, it is just this: the data is stored as small charges in a grid, and by flipping nearby points on the grid a lot you can leak some charge into your target cell.

The smaller the charge, and the closer together the charges, the easier Rowhammer attacks are. Also, the smaller and closer together the charges, the faster, cheaper, denser, and more efficient your RAM gets.

There are mitigations, but they are pushed to the limit.
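
The charge-leak picture above can be sketched as a toy simulation. Every constant below is invented for illustration (real cells, leak rates, and refresh intervals are nowhere near this simple), and `victim_flips` is a hypothetical name. The only point is that what matters is how many neighbor activations fit between two refreshes.

```python
# Toy model only: all numbers are made up, not measured from any DRAM.
FULL_CHARGE = 100_000       # arbitrary charge units in a freshly refreshed cell
FLIP_THRESHOLD = 50_000     # below this, the cell reads back as the wrong bit
LEAK_PER_ACTIVATION = 1     # charge the victim loses per neighbor-row activation


def victim_flips(aggressor_activations: int, refresh_every: int) -> bool:
    """Return True if the victim cell flips before a refresh rescues it.

    refresh_every is the number of aggressor activations that fit between
    two refreshes of the victim row.
    """
    charge = FULL_CHARGE
    for i in range(1, aggressor_activations + 1):
        charge -= LEAK_PER_ACTIVATION
        if charge < FLIP_THRESHOLD:
            return True              # leaked too much before the next refresh
        if i % refresh_every == 0:
            charge = FULL_CHARGE     # refresh restores the full charge
    return False


# Refresh arrives often enough: the cell never gets below the threshold.
print(victim_flips(200_000, refresh_every=40_000))
# The attacker fits more activations into one refresh window: the cell flips.
print(victim_flips(200_000, refresh_every=60_000))
```

In this model, shrinking the cells corresponds to lowering `FULL_CHARGE` or raising `LEAK_PER_ACTIVATION`, which is the trade-off described above: denser RAM leaves less margin before a flip.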

KennyBlanken said a month ago:

From what I understand, it arises because DRAM manufacturers, interested in maximizing profits as much as possible, have been pushing the limits of how small they can make the RAM chip's features, and then backing off slightly until they felt RAM was reliable "enough", but Rowhammer et al. demonstrate it's very easy to cause bit flipping?

markhahn said a month ago:

"maximize profits" and "best product for customer" are dual. you specifically want small chip features - or don't you like speed, power efficiency, and low cost?

throwaway48476 said a month ago:

The point of engineering is trade offs. No one is trying to make a worse DRAM.

rocqua said a month ago:

They push the size to the limit, and stop when random writing is unlikely to cause any bitflips. Stopping at the point rowhammer would be unlikely would be stopping earlier.

As others said, this isn't just about profits. It's about being able to compete on costs (i.e. being able to survive at all) and to compete on the best performance. This places the problem less at singular manufacturers and more at the whole industry.

dist-epoch said a month ago:

This is a RAM problem, not a CPU one.

oldge said a month ago:

Does this work when full memory encryption, poisoning, and address xor is turned on?

reliabilityguy said a month ago:

With memory encryption it won't lead to system exploitation, just to a system crash.

So, with memory encryption you are safer.

binkHN said a month ago:

> ...ZenHammer could not trigger flips on nine out of ten devices ... more research is necessary to find more effective patterns for DDR5 devices.

So I guess DDR5 still has a little bit of time here. Anyone know if this also affects LPDDR5x?

wtallis said a month ago:

The DRAM interface is pretty well decoupled from the memory array itself. So whether you're looking at DDR5 or LPDDR5(x) or GDDR6(x) or HBM3(e) isn't the right question. What matters are the implementation details up to the manufacturer's discretion, such as on-die ECC.

axytol said a month ago:

They mention Zen 2 and 3, any info on Zen 1? Would it simply apply as well?

wmf said a month ago:

The real news appears to be that rowhammer is mostly fixed on DDR5.

samtheprogram said a month ago:

Not news, and per the article:

> Furthermore, we show that ZenHammer can trigger Rowhammer bit flips on a DDR5 device for the first time.

dist-epoch said a month ago:

DDR5 is so fragile they had to include on-die ECC to make it work, even when ECC is not exposed externally.
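
For anyone wondering what single-error-correcting ECC actually does, here is a sketch using the textbook Hamming(7,4) code. Treat it as a stand-in: the on-die ECC in real DRAM uses a much wider code word whose details are up to the manufacturer, and `encode`, `decode`, and `flip` are illustrative names.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming code word (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(code):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


def flip(code, *indexes):
    """Return a copy of the code word with the given 0-based bits flipped."""
    c = list(code)
    for i in indexes:
        c[i] ^= 1
    return c


data = [1, 0, 1, 1]
word = encode(data)

print(decode(flip(word, 4)) == data)      # one flip: corrected
print(decode(flip(word, 4, 6)) == data)   # two flips: silently wrong data
```

In this toy code a single flip is always repaired, while two flips in the same word are "corrected" into the wrong value without any warning. That is a property of this small code, shown only to illustrate why correction strength per word matters.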

jeffbee said a month ago:

That only brings DRAM into alignment with flash and magnetic storage, so it's not really a negative. Everything in your computer is converging on semiconductor with bounded probabilistic state + math.

titzer said a month ago:

It's always been that way; it's just a question of how many nines of reliability we're talking about. E.g. at Google scale, bitflips in memory from cosmic rays and general noise happen every day. Everything has checksums on it.

samstave said a month ago:

May you please ELI5 why DDR5 is 'fragile' as you put it?

Was its design pushing material sciences such that the theory worked, but practical implementation required the 'crutch' of ECC?

adgjlsfhk1 said a month ago:

basically. pushing the timing and sizes makes it likely that some of your bits will fail to be built correctly. rather than dropping the speed and sizes to get reliability, you just throw an extra chip on to give you redundancy.

RachelF said a month ago:

Take a look at the spec. The speeds are so high that they use some modem channel characterization features on the memory bus.

Linus was right about ECC being needed, with higher capacities and speeds and reduced feature size it's becoming a must.

merb said a month ago:

Well, they said that it needs further testing. If it were mostly fixed, it would mean that ECC could help even more. I mean, the on-die ECC probably already helps.