48

I've just read this article about space-graded CPUs [1].

I am not a space expert at all, but a question arose naturally:
why don't we shield Earth-designed CPUs (far less expensive) rather than design brand-new radiation-proof CPUs?

PS: I have read some questions here, but none of them addresses this side of the issue.


[1] https://arstechnica.com/science/2019/11/space-grade-cpus-how-do-you-send-more-computing-power-into-space/

JBentley
mattia.b89
  • 1
    That's a cool article btw, there's a lot of history covered there. – uhoh Nov 12 '19 at 16:23
  • 14
    The vast majority of cosmic rays that impinge upon a computer simply pass right through. A tiny fraction impart damage. There's a similar issue with shielding. The majority of cosmic rays that impinge upon a layer of lead a couple of meters thick simply pass right through. (That's cheating on my part; lead makes for a lousy shield against cosmic rays.) Aluminum, water, and liquid hydrogen are far better. All that is needed is a megaton of those materials to provide adequate shielding for a tiny little computer. That amount of shielding means launch is impossible. – David Hammen Nov 13 '19 at 02:45
  • 1
    No evidence of research. Various qns under the cosmic radiation tag throw light on this area, eg https://space.stackexchange.com/questions/388/what-materials-provide-the-best-protection-from-cosmic-radiation –  Nov 13 '19 at 06:05
  • @andy256: does that actually shed any light on how radiation-sensitive a modern CPU made on a 10nm silicon process actually is? With features that small, it doesn't take much stray charge to flip a bit somewhere (e.g. in cache, where all the data is "precious", or even in the register allocation table of an out-of-order execution CPU, leading to instructions getting the wrong inputs forwarded to them). Other bit errors could probably stall the CPU entirely (the ROB thinks an instruction has not executed and can't retire it, but it has already left the scheduler so it will never execute, and retirement stalls forever). – Peter Cordes Nov 13 '19 at 06:17
  • (So obviously you need some kind of robust watchdog system that resets the computer if it doesn't poke some hardware periodically. Lockup errors might be less bad than undetected corrupt-data errors.) – Peter Cordes Nov 13 '19 at 06:19
  • 1
    @PeterCordes That kind of passive-aggressive response is why I don't spend time on Stack anymore. –  Nov 13 '19 at 07:23
  • @andy256: I wasn't intending it to be passive aggressive. After a quick skim of that link (and searching for the word "CPU"), I didn't see anything that shed any light on how much shielding a computer would need. I was asking in case I'd missed something that was relevant to shielding CPUs; it's not at all obvious even after reading the (very good) Ars Technica article what level of shielding an off-the-shelf ARM Cortex A53 (the basis for the HPSC) would need. Then I got carried away talking about CPU architecture, which is the main reason I spend time on SO. This Q looks good to me. – Peter Cordes Nov 13 '19 at 07:47
  • 3
    @RonJohn: Of course they've thought of it. Since they don't, there must be a reason. Asking for an explanation of this reason (e.g. that a high-energy cosmic ray can penetrate significant shielding and create a shower of lower-energy particles in the process) is the point of this question, I assume, in more detail than "it wouldn't work". – Peter Cordes Nov 13 '19 at 07:55
  • @PeterCordes if OP wanted to know why we need to send rad-hardened chips into space, he'd have asked that. But by saying "Why don't we shield existing CPUs?", he's pretty blatantly implying that shielding is the obvious answer. – RonJohn Nov 13 '19 at 09:05
  • 5
    @RonJohn: I still think "why exactly can't shielding work?" is a fair question. The answer is "because it would have to be way too thick/heavy to be sufficient", in whatever level of detail you want to go into. And/or that saving that shielding mass with a rad-hardened CPU is worth the cost because mass = $. – Peter Cordes Nov 13 '19 at 09:11
  • @PeterCordes the Q should at least acknowledge that Experts In The Field have thought about the problem and chosen the seemingly illogical solution. – RonJohn Nov 13 '19 at 15:59
  • Vaguely similar on [physics.se]: https://physics.stackexchange.com/questions/386297/cosmic-ray-shielding-for-electronics-on-earth The lay populace seems to have a very optimistic notion of what radiation shielding can achieve. – dmckee --- ex-moderator kitten Nov 13 '19 at 19:26
  • @DavidHammen "The majority of cosmic rays that impinge upon a layer of lead a couple of meters thick simply pass right through." from this answer, "Dropping the radiation by a factor of 1000 to a 10cm cube would take about half a mm of lead, adding up to something like 250g." EDIT: The comments seem to contradict this. Does anyone have actual numbers? – JollyJoker Nov 14 '19 at 08:38
  • 3
    @andy256 The comments you replied to don't seem to contain even a hint of being passive aggressive to me. On the contrary they seem a well considered counter to your earlier comment. Merely questioning someone's assertion on something doesn't mean it is passive aggressive. – JBentley Nov 14 '19 at 17:11
  • @JollyJoker Those comments would appear to be based on a serious misunderstanding of what you are trying to shield against. There are situations where quite modest shielding is highly effective (your epidermis is sufficient shielding against external sources of decay alphas, for instance), but the threats here are much more penetrating than that. – dmckee --- ex-moderator kitten Nov 14 '19 at 18:48
  • @RonJohn wrote "Q should at least acknowledge that experts in the field..." - but OP wrote "I'm not an expert at all." OP has implied that people more knowledgeable likely made the decision. Even so, that acknowledgement is mostly irrelevant to the question asked. – Aaron Nov 14 '19 at 19:07
  • 3
    @andy256 I find your commenting style to be way more combative and aggressive than the one you responded to. Indeed two of your comments contain rants about how you are wasting your time, and this site isn't worth visiting, etc. I found the other comment to be much more considered and factual. You may disagree entirely with the points raised there, but if we silence everyone who we disagree with, this site would be useless. I don't find your "beating your wife" analogy to be a useful comparison at all. – JBentley Nov 15 '19 at 13:56
  • 2
    This question specifically asked about shielding on the Apollo missions; however, the answers provide a lot of useful background which is helpful on this one, including some counterintuitive arguments that it's sometimes better not to shield than to shield too little (at least for humans). – Cort Ammon Nov 15 '19 at 16:26

3 Answers

73

Because shielding against radiation is heavy, and weight is the enemy of getting things into space.

CPUs are quite sensitive to radiation, and some types of radiation (cosmic rays) are not only quite good at penetrating most things; as they do so, they also cause a cascade of secondary radiation. Protecting a device from any of this radiation getting through is not an easy (light) task. At some point, redesigning the CPU to make it tolerant of a few impacts is more economical, as you don't have to rule out every bit-flipping event if you can tolerate one per cycle.

A few additional thoughts:

  • Part of the expense is to do with the one-off nature of space hardware and the need for testing etc. Even if the CPU were free to manufacture, by the time it had been flight-tested and bought from BAE, it would be big bucks.

  • New CPU architecture is not the only way to make chips less sensitive. For example: "One way to use faster, consumer CPUs in space is simply to have three times as many CPUs as you need: The three CPUs perform the same calculation and vote on the result. If one of the CPUs makes a radiation-induced error, the other two will still agree, thus winning the vote and giving the correct result.". This is the approach used by a NASA program Environmentally Adaptive Fault-Tolerant Computing or "EAFTC". The EAFTC computers sort of serve the same purpose. However, they are still not considered as reliable as dedicated radiation-hardened CPUs. There is an expectation that there will be some use of these or similar systems to offload some of the work from radiation-hardened CPUs. I don't know what the status of this is though.
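The voting scheme described in the quote can be illustrated with a minimal sketch. This is not taken from EAFTC or any flight software; the function names (`majority_vote`, `run_redundant`, `healthy`, `upset`) and the toy calculation are assumptions made purely for illustration, and the "radiation-induced error" is simulated by flipping one bit of a result.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(results: Sequence[T]) -> T:
    """Return the value that a strict majority of replicas agree on."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority: replicas disagree")
    return value


def run_redundant(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Run the same calculation on every replica and vote on the outputs."""
    return majority_vote([replica(x) for replica in replicas])


def healthy(x: int) -> int:
    # Hypothetical calculation that all three CPUs are meant to perform.
    return x * x + 1


def upset(x: int) -> int:
    # Simulated radiation-induced error: one bit of the result is flipped.
    return healthy(x) ^ (1 << 3)


# Two healthy replicas outvote the single faulty one.
result = run_redundant([healthy, upset, healthy], 12)
print(result)
```

The vote only masks a fault in one replica at a time; if two replicas are upset in the same way, the wrong value wins, which is one reason such systems may still be considered less reliable than dedicated hardened parts.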

ANone
  • 2
    Can you give me some numbers? e.g. the "weight" to shield appliance; second, For the same reason (weight) are astronauts less/more protected than appliances? – mattia.b89 Nov 12 '19 at 18:00
  • 9
    @mattia.b89 astronauts are, to some degree, radiation-hardened. Our DNA has some copy correction mechanisms, and if a cell is damaged beyond repair, most of the time it will just self-destruct. This is good enough for low-level radiation, with the risk of occasional cancer, of course. An Earth CPU does not have any of those mechanisms, so any cosmic ray is liable to mess up the computation in unpredictable ways. So, a dosage that is fine for humans will not be for typical electronics. The shielding is typically the same: the polyethylene layers. – Davidmh Nov 13 '19 at 06:28
  • @Davidmh: Most mainstream CPUs use ECC (error correction codes) in their caches, typically with 32 or 64-bit granularity. (Fun fact: Intel L1-data cache (in at least some uarches) only uses parity not ECC like L1i/L2/L3 so they can support efficient single-byte and unaligned stores without a RMW cycle). ECC for main memory is also possible (and commonly used on servers). That's not of course a full solution, and it's only SECDED (single-error correction, double detection). SRAM cells also use more than the minimum transistors so they can run more stably at very minimal voltage. – Peter Cordes Nov 13 '19 at 07:54
  • 1
    @mattia.b89 (Re the weight): It's hard to say with certainty because there's no endpoint to being "safe". Even on Earth, bit-flip events happen; they are just very rare. Reducing radiation to Earth-surface levels would be prohibitive: tonnes, or maybe tens of tonnes. However, with ECC memory, let alone a voting system, far less than that could be needed to reach radiation-hardened reliability. – ANone Nov 13 '19 at 09:45
  • 6
    @mattia.b89 (Re the humans): There are those in these parts who can tell you a lot more than me, but at the moment they are very similarly protected (for example, on the order of 2.5 mm of aluminium for the ISS). The difference is in how susceptible they are, and to what types of radiation. For people, long-term exposure is the real risk; for CPUs, it's the chance of a bad calculation being used for something critical. – ANone Nov 13 '19 at 09:52
  • 6
    I have recently attended a seminar where an engineer who is an expert in this area was asked this exact question, and his answer was exactly this. We simply don't know a way to shield a CPU that's even remotely plausible weight-wise. – xLeitix Nov 13 '19 at 14:03
  • 7
    @mattia.b89 : in the case of an off the shelf computer, if one single transistor is destroyed, or if one single bit is flipped, the software will typically crash or deliver incorrect results. In case of a human, the death of a single cell doesn't have any significant effect, cells die and reproduce all the time. Humans have much more redundancy than computers. – vsz Nov 14 '19 at 05:11
  • 1
    @mattia.b89 The answer to how astronauts tolerate the radiation they get on a deep space mission is "we've never tried" https://www.nasa.gov/feature/goddard/real-martians-how-to-protect-astronauts-from-space-radiation-on-mars – richardb Nov 14 '19 at 08:51
  • @mattia.b89, I can't give you the weight of the shielding, but I can give the price: $3000/kg aboard the Falcon 9, and you'll need multiple kilograms of shielding per chip. – Mark Nov 14 '19 at 21:30
  • @Mark That actually seems reaallly cheap. Are you sure there aren't a few 0s missing somewhere? – Voo Nov 15 '19 at 13:10
  • @Voo It looks like he may be right. This article says a Falcon 9 can lift payload for $2,700/kg, contrasting with the shuttle at $54,500/kg – Cort Ammon Nov 15 '19 at 16:20
  • @Cort That's quite impressive how cheaply we can send things to space these days. The 50k price point was more in line with my expectations. Amazing. – Voo Nov 15 '19 at 17:47
14

You actually ask a really good question. And the answer is, we do both, depending on the needs.

NASA tends to go for the ultra-reliable, and radiation-tolerant components are more reliable, so that is their preferred way. Many commercial satellites, however, use non-space-grade components that are lightly shielded, with software and hardware built so that 2 CPUs compute the same calculation; if they get different results, they recalculate it. For memory, a common approach is triple redundancy, where the data is stored 3 times in different chips and the copies are compared. The most sensitive and important components are still usually radiation tolerant, but these are a relatively small subset of the components in a satellite; the heavy lifting can rely on more radiation-sensitive and much less expensive components.
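A rough sketch of the two techniques just described, under stated assumptions: two units repeat a calculation until they agree, and a memory word stored three times is read back through a bitwise majority. Every name here (`compute_with_recheck`, `read_triplicated`, `FlakyUnit`, `good_unit`) is hypothetical, and the transient fault is simulated rather than modelled on any real satellite.

```python
from typing import Callable, Sequence


def compute_with_recheck(
    units: Sequence[Callable[[int], int]], x: int, max_attempts: int = 3
) -> int:
    """Run two units on the same input; recalculate until they agree."""
    first, second = units
    for _ in range(max_attempts):
        a, b = first(x), second(x)
        if a == b:
            return a
    raise RuntimeError("units still disagree after retries")


def read_triplicated(copies: Sequence[int]) -> int:
    """Bitwise majority of three stored copies of the same word."""
    a, b, c = copies
    return (a & b) | (a & c) | (b & c)


def good_unit(x: int) -> int:
    # Hypothetical calculation both CPUs are supposed to perform.
    return 3 * x + 7


class FlakyUnit:
    """Hypothetical unit that suffers one transient upset on its first call."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: int) -> int:
        self.calls += 1
        value = good_unit(x)
        # Flip one bit on the first call only, to simulate a transient error.
        return value ^ 0b100 if self.calls == 1 else value


flaky = FlakyUnit()
answer = compute_with_recheck([good_unit, flaky], 5)

# One of the three stored copies has a single flipped bit.
stored = [0b10110010, 0b10110010 ^ 0b00001000, 0b10110010]
word = read_triplicated(stored)
print(answer, bin(word))
```

With only two units a disagreement can be detected but not attributed, which is why the sketch recalculates instead of voting; the triplicated memory read can correct a single bad copy directly.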

PearsonArtPhoto
2

Throwing some rough and ready math at the question, happy to be corrected by anybody with actual numbers.

Hardening increases the radiation level needed to trigger errors by several orders of magnitude; call it 1000 for this.

Dropping the radiation by a factor of 1000 to a 10cm cube would take about half a mm of lead, adding up to something like 250g. Most computer modules are larger and more awkwardly shaped than that, so call it a couple of kg of shielding.
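As a sanity check on the mass figure only (not on the attenuation claim, which the comments below dispute), here is a small sketch of the arithmetic. It assumes a thin lead shell over all six faces of the cube, ignores edges and corners, and uses the standard density of lead; the function name `shell_mass_g` is made up, and the launch price is the Falcon 9 figure of $2,700/kg quoted in a comment elsewhere in this thread.

```python
LEAD_DENSITY_G_PER_CM3 = 11.34  # standard density of lead
LAUNCH_COST_USD_PER_KG = 2700.0  # assumption: figure quoted in a comment in this thread


def shell_mass_g(side_cm: float, thickness_cm: float, density_g_per_cm3: float) -> float:
    """Mass of a thin shell over all six faces of a cube (edges and corners ignored)."""
    surface_area_cm2 = 6 * side_cm ** 2
    return surface_area_cm2 * thickness_cm * density_g_per_cm3


# 10 cm cube, 0.5 mm (0.05 cm) of lead.
mass_g = shell_mass_g(10.0, 0.05, LEAD_DENSITY_G_PER_CM3)
launch_cost_usd = mass_g / 1000.0 * LAUNCH_COST_USD_PER_KG

print(f"shell mass: {mass_g:.0f} g, launch cost: ${launch_cost_usd:.0f}")
```

This comes out at roughly 340 g, the same order of magnitude as the "something like 250g" above, so the mass estimate is plausible for that thickness; whether that thickness achieves the stated reduction is a separate question.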

So shielding would be achievable but would cost an instrument or backup element out of the final design.

What is possibly missing from this is who actually paid for the rad-hardened CPU, and what a shielded but conventional CPU would have cost to test. The hardened CPUs are mostly born out of military spending rather than space exploration (so NASA would not get the saved money), and arrive with lots of paperwork specifying not just the radiation hardening but 'free' hardening against extreme temperatures and vibration.

An off-the-shelf CPU would need to be boxed up in the radiation-hardened case and then subjected to the relevant tests before launch. It would probably fail and need to be re-engineered and retested several times to get it right. So the final cost would probably be lower, but might turn out to be much higher, or the work might even delay things enough to miss the launch window, and that would be unknown at planning time, whereas the rad-hardened unit would be a known quantity in terms of price, weight and power from early in the design process.

So quite possibly, if you are making a family of LEO sats where you can afford to have the first couple fail and can spread the testing cost across the rest of the family, this can work, and in fact many current-generation satellites (particularly cube/smallsats) do go this route.

If you are designing a space probe with a half-billion-dollar budget that will fly for decades, then trading a couple of million for enough weight to add another sensor starts to look better, especially if you can get the computers at less than ticket price and call it a research/outreach project for the agency that designed them.

So this is math that can go either way, depending greatly on the details.

GremlinWranger
  • 8
    I don't think the math(s) for this is quite correct. 0.5mm of lead is not close to sufficient to prevent bit errors in a typical CPU in deep space. Out of curiosity, where did you get the factor of 1000 from? I have only seen this in the context of total radiation dose to cause long term damage, not single event effects. – ANone Nov 13 '19 at 13:49
  • Agree the maths is wrong in several different directions and happy to either delete or update the post if someone has better numbers than the first page of search results produced. There are powers of ten differences coming from what your radiation consists of (particles or pure EM), distance (inverse square law) and the exact nature of your electronics but the observable results (cubesats and mars probes) suggests that both options can be made to work as long as you accept the costs/constraints. – GremlinWranger Nov 13 '19 at 14:19
  • 6
    There absolutely are uses of not-quite-as-severely rad-hardened devices in use in space. As you rightly point out, not all space is the same (the major difference is inside or outside of earth's magnetosphere). However this isn't the major factor in the need for rad-hardening. A simple stateless ASIC producing data only needs to survive the radiation dose. Anomalies can be filtered out on earth when the data is received. However the CPU that controls something critical has to not make mistakes. It doesn't mean you can never use non-rad hardened devices, just that you can't always either. – ANone Nov 13 '19 at 15:48
  • 4
    Re "Dropping the radiation by a factor of 1000 to a 10cm cube would take about half a mm of lead": No. Half a millimeter of lead has almost no effect against either the solar wind (mostly protons) or galactic cosmic rays (a mix comprising protons, deuterons, and alpha particles, but also a trace amount of very damaging heavier ions). The effect of half a millimeter of lead is in fact worse than no shielding at all due to secondary radiation. But at least half a millimeter of lead is not nearly as bad as five millimeters of lead, which is about twice as bad as no shielding at all. – David Hammen Nov 14 '19 at 13:21
  • 3
    Lead is good at blocking x-rays and gammas, but those are not a significant threat from a dwarf yellow star. The key threats are the solar wind, which peaks at solar maximum, and galactic cosmic rays, which peak at solar minimum. Lead is such an incredibly bad choice against protons and heavier nuclei that it is included in studies for comparison purposes only. – David Hammen Nov 14 '19 at 13:26