Despite October Mishap, SpaceX’s Computers Are a Go

SpaceX’s Dragon capsule at the end of the International Space Station’s robotic arm. Photo Credit: NASA

Two months ago, one of SpaceX’s Dragon capsules lost one of its three flight computers while docked with the International Space Station. The likely cause was a radiation hit, but according to SpaceX’s director of vehicle certification, John Muratore, the mishap stemmed from the design of the radiation-tolerant system as a whole, not from its use of parts that are not radiation-hardened (rad-hardened). The computer problem didn’t dog the mission, and it isn’t posing a longer-term problem for the company either.

Every Dragon has six computers, which, under the company’s current contract with NASA, aren’t reused from mission to mission. Each of the three flight computer units is actually a pair, two computers that keep one another in check, and each unit has 18 distinct processing units. In all, that puts 54 processors on board each Dragon. This architecture means Dragon can tolerate a failure: even with one unit offline, two more pairs would still be voting. What would happen if the two remaining units disagreed, however, is unclear.
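To make the pair-and-vote idea concrete, here is a minimal Python sketch. It is an illustration only, not SpaceX's flight software: the function names `pair_output` and `vote` are hypothetical, and the choice to return `None` when the surviving units disagree is an assumption, since the article says that case is unclear.

```python
from collections import Counter
from typing import Optional, Sequence

UNITS = 3                 # flight computer units, per the article
PROCESSORS_PER_UNIT = 18  # processing units per flight computer unit, per the article
TOTAL_PROCESSORS = UNITS * PROCESSORS_PER_UNIT  # 3 * 18 = 54


def pair_output(a: int, b: int) -> Optional[int]:
    """Self-checking pair: report a value only if both computers agree."""
    return a if a == b else None


def vote(unit_outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Return the value backed by a strict majority of the units still reporting.

    Units that report None (offline or internally inconsistent) are ignored.
    Returning None on a tie is an assumption made for this sketch.
    """
    healthy = [v for v in unit_outputs if v is not None]
    if not healthy:
        return None
    value, count = Counter(healthy).most_common(1)[0]
    return value if count > len(healthy) / 2 else None


# One unit offline, the other two agree: the vote still succeeds.
command = vote([pair_output(7, 7), None, pair_output(7, 7)])
```

With one unit reporting `None`, two agreeing units still produce a result, which matches the article's point that Dragon can tolerate a single failure. Two surviving units that disagree produce no result in this sketch.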

Overall, this computer structure is a fairly robust, fault-tolerant setup, a necessity when flying so close to the ISS. With this level of redundancy, even components that are not radiation-hardened are flightworthy.

The International Space Station, as seen by the STS-134 crew after undocking in 2011. Photo Credit: NASA

So why didn’t SpaceX use rad-hardened hardware? It wasn’t a requirement from NASA. The space agency only requires that SpaceX undertake a thorough analysis of the radiation environment so it knows exactly what its hardware is flying into. With this as the guideline, SpaceX knowingly flew parts that are not rad-hardened, though the company says the overall system is safe to fly in the fairly familiar radiation environment 200 miles above the Earth. The robustness of Dragon’s systems was also a factor in that decision. The three units meet NASA’s safety requirements, leaving SpaceX free to focus on what it considers the really important characteristics of the flight hardware: how much power the computers use, how much memory they hold, how much they can process, how physically big they are, and the type of information and language they use.

But even without all rad-hardened parts, SpaceX says it’s highly unlikely that all three of Dragon’s flight computers could be knocked out by radiation. If this did happen, SpaceX could just power up the vehicle with its computers down.

The effects of radiation on the computers depend on which part bears the brunt of the hit. If radiation hits a dense part of the system, like the memory bank, the computer repairs itself. But if radiation hits a more delicate part, like the circuits that affect how and where information is imported and processed, the signals flip: zeroes become ones and vice versa. This leads to an error, which causes the computer to reboot itself. This is basically what happened in October.
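A flipped bit is easy to picture in code. The Python sketch below inverts one bit in a small block of data and then catches the change with a checksum. The article does not say how Dragon's computers detect such errors, so the CRC-32 check and the names `flip_bit` and `needs_reboot` are assumptions chosen purely for illustration.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (0 becomes 1, 1 becomes 0)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def needs_reboot(data: bytes, expected_crc: int) -> bool:
    """Hypothetical check: flag an error when the data no longer matches its checksum."""
    return zlib.crc32(data) != expected_crc


original = b"\x00\x0f"
expected = zlib.crc32(original)    # checksum recorded before the radiation hit
corrupted = flip_bit(original, 0)  # lowest bit of the first byte flips from 0 to 1
```

Here `needs_reboot(original, expected)` is false and `needs_reboot(corrupted, expected)` is true, because a CRC-32 always changes when exactly one bit changes.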

Rebooting doesn’t mean the failed computer will be back in sync with the other two. It needs to be re-synced, something SpaceX didn’t do in October. Again, the company says this was NASA’s call. Re-syncing involves uploading the data stored in the other two computers, which we’re assuming are still in agreement after a radiation event, into the rebooted computer. Coordinating this is complicated. SpaceX would need to explain the re-sync to the other parties involved (in this case NASA and the ISS), and that’s a time-consuming exercise. With the one computer offline, SpaceX determined Dragon could still safely fly away from the ISS, so NASA opted to let it do just that.
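In outline, the data-transfer part of a re-sync could look like the Python sketch below. It assumes each computer's state can be represented as a simple dictionary and that the two healthy computers must match exactly before their data is copied. Both are assumptions for illustration, and the `resync` function is hypothetical rather than anything SpaceX has described.

```python
from copy import deepcopy
from typing import Dict, Optional

State = Dict[str, float]  # hypothetical stand-in for a flight computer's stored data


def resync(healthy_a: State, healthy_b: State) -> Optional[State]:
    """Return a fresh copy of the shared state for the rebooted computer.

    Returns None if the two healthy computers disagree, because there is
    then no trusted state to upload.
    """
    if healthy_a != healthy_b:
        return None
    return deepcopy(healthy_a)


computer_1 = {"altitude_km": 400.0, "range_to_iss_m": 10.0}  # made-up values
computer_2 = {"altitude_km": 400.0, "range_to_iss_m": 10.0}
computer_3 = resync(computer_1, computer_2)  # the rebooted computer's new state
```

The copy itself is the easy part. As the article notes, the time goes into coordinating the procedure with NASA and the ISS.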

Musk’s vision for Mars: Dragon capsules everywhere. Image Credit: SpaceX

For the time being, SpaceX is only talking about making some changes to Dragon’s flight computer system. Adjusting and speeding up the re-synching procedure might be worthwhile down the line. But any change will come slowly. SpaceX says its performance on this last mission met every one of NASA’s safety requirements and that every piece of hardware that had any kind of hit completely recovered.

That doesn’t mean the company isn’t continuing to evolve its hardware. In the last two years, SpaceX has moved through three generations of flight computers and a fourth is in the works. The company pays special attention to marketplace trends to take advantage of the best software, the best people, and the best techniques to achieve optimal flight designs. Elon Musk is, after all, planning to get to Mars with all SpaceX systems. Every piece of Dragon hardware the company builds, tests, and flies now will support that audacious goal in the long term. The company’s designs keep evolving, and it’s learning every step of the way.
