
Marijuana and Nuclear Power Plants

The Nuclear Regulatory Commission (NRC) adopted regulations in the mid-1980s seeking to ensure that nuclear power plant workers are fit for duty. The NRC’s regulations contained provisions seeking to verify that workers were trustworthy and reliable as well as measures intended to prevent workers from being impaired on duty. The former included background checks before workers could gain access to the plant, while the latter included drug and alcohol testing.

The regulations require that nuclear plant owners test workers for marijuana and alcohol use at the time of hiring, randomly thereafter, and for cause when circumstances warrant it. In 2014, marijuana use was the #1 reason for positive drug and alcohol tests among contractors and vendors and the #2 reason among nuclear plant employees. Alcohol was the #1 reason for positive tests among employees and the #2 reason among contractors and vendors. A positive test may not be a career killer, but it is often a career crimper.

Fig. 1 (Source: Nuclear Regulatory Commission)

Alcohol can be legally purchased and consumed in all 50 states. So, mere detection of having used alcohol will not result in a positive test. But detection of a blood alcohol concentration of 0.04 percent or higher yields a positive test. People have different metabolisms and alcoholic beverages come in different sizes, but that threshold is often equated to having consumed one alcoholic beverage within five hours of the test. Similar to the reason that states require motorists to not drive under the influence of alcohol (i.e., don’t drink and drive), the NRC’s regulations seek to control alcohol consumption by workers (i.e., don’t drink and operate nuclear plants.)

Unlike the reason for the alcohol controls, the NRC’s ban on marijuana use is not because it might make workers more likely to make mistakes or otherwise impair their performance, thus reducing nuclear safety levels. The NRC banned marijuana use because at the time marijuana was an illegal substance in all 50 states and its criminal use meant that workers fell short of the trustworthiness and reliability standards in the fitness for duty regulation. Since the NRC adopted its regulation, 8 states have legalized recreational use of marijuana and another 12 states have decriminalized its use.

Fig. 2 (Source: NORML)

The NRC recognized that marijuana’s legalization creates potential problems with its fitness for duty regulation. If an individual uses marijuana in a state that has legalized or decriminalized its use but tests positive at a nuclear plant in a state where its use is not legal, is the individual sufficiently trustworthy and reliable? In the eyes of the NRC, the answer remains no: a positive test still disqualifies the worker.

Fig. 3 (Source: Nuclear Regulatory Commission)

The NRC conceded that no scientific basis comparable to the one underlying the alcohol limits links marijuana use to performance impairment. But the NRC continues to consider marijuana use as indicating that one lacks the trustworthiness needed to work in a nuclear power plant.

The NRC is in a hard spot on this one. Revising its regulations to eliminate marijuana as a disqualifier for working in a nuclear power plant would likely spawn news reports about the agency permitting Reefer Madness at nuclear plants. But the country’s evolving mores are undermining the basis for the NRC’s regulation.

Nuclear Plant Cyber Security

There has been considerable media coverage recently about alleged hacking into computer systems at or for U.S. nuclear power plants. The good news is that the Nuclear Regulatory Commission (NRC) and the nuclear industry are not merely reacting to this news and playing catch-up to the cyber threat. The NRC included cyber security protective measures among the regulatory requirements it imposed on the nuclear industry in the wake of 9/11. The hacking reported to date seems to have involved non-critical systems at nuclear plants as explained below.

The bad news is that there are bad people out there trying to do bad things to good people. We are better protected against cyber attacks than we were 15 years ago, but are not invulnerable to them.

Nuclear Plant Cyber Security History

The NRC has long had regulations in place requiring that nuclear plant owners take steps to protect their facilities from sabotage by a small group of intruders and/or an insider. After 9/11, the NRC issued a series of orders mandating upgrades to the security requirements. An order issued in February 2002 included measures intended to address cyber security vulnerabilities. An order issued in April 2003 established cyber attack characteristics that the NRC required owners to protect against.

The orders imposed regulatory requirements for cyber security on nuclear plant owners. To help the owners better understand the agency’s expectations for what it took to comply with the requirements, the NRC issued NUREG/CR-6847, “Cyber Security Self-Assessment Method for U.S. Nuclear Power Plants,” in October 2004; Regulatory Guide 5.71, “Cyber Security Programs for Nuclear Facilities,” in January 2010; NUREG/CR-7117, “Secure Network Design,” in June 2012; and NUREG/CR-7141, “The U.S. Nuclear Regulatory Commission’s Cyber Security Regulatory Framework for Nuclear Power Reactors,” in November 2014. In parallel, the Nuclear Energy Institute developed NEI-08-09, “Cyber Security Plan for Nuclear Power Reactors,” in April 2010 that the NRC formally endorsed as an acceptable means for conforming to the cyber security regulatory requirements.

First Step: NANA

Anyone who has read more than one report about the U.S. nuclear power industry will appreciate that NANA was a key step on the road to cyber security regulations—Need A New Acronym. The nuclear industry and its regulator need to be able to talk in public without any chance of the public following the conversation, so acronyms are essential elements of nukespeak. Many FTEs (full-time equivalents, or NRC person-hours) went into the search for the new acronym, but the effort yielded CDA—Critical Digital Assets. It was a perfect choice. Even if one decodes the acronym, the words don’t give away much about what the heck it means.

Finding CDA Among the NCDA, CAA, and NCAA

Armed with the perfect acronym, the next step involved distinguishing CDA from non-critical digital assets (NCDA), critical analog assets (CAA), and non-critical analog assets (NCAA, sorry college sports enthusiasts). Doing so is an easy three-step process.

Step 1: Inventory the Plant’s Digital Assets

The NRC bins the digital assets at a nuclear power plant into the six categories shown in Figure 1. Security systems include the computers that control access to vital areas within the plant, sensors that detect unauthorized entries, and cameras that monitor restricted areas. Business systems include the computers that enable workers to access PDFs of procedures, manuals, and engineering reports. Emergency preparedness systems include the digital equipment used to notify offsite officials of conditions at the plant. Data acquisition systems include sensors monitoring plant parameters and the equipment relaying that information to gauges and indicators in the control room as well as to the plant process computer. Safety systems include, for example, equipment that detects high temperatures or smoke and automatically initiates fire suppression systems. Control systems include process controllers that govern the operation of the main turbine or regulate the rate of feedwater flow to the steam generators (pressurized water reactors) or reactor pressure vessels (boiling water reactors). In this first step, owners inventory the digital assets at their nuclear power plants.

Fig. 1 (Source: Nuclear Regulatory Commission)

Step 2: Screen Out the Non-Critical Systems, Screen in the Critical Systems

Figure 2 illustrates the evaluations performed for the inventory of digital assets assembled in Step 1 to determine which systems are critical. The first decision involves whether the digital asset performs a safety, security, or emergency preparedness (SSEP) function. If not, the evaluation then determines whether the digital asset affects, supports, or protects a critical system. If the answer to any question is yes, the digital asset is a critical system. If all the answers are no, the digital asset is a non-critical system.

Fig. 2 (Source: Nuclear Regulatory Commission)

Step 3: Screen Out the NCDA, Screen in the CDA

Figure 3 illustrates the evaluations performed for the inventory of critical systems identified in Step 2 to determine which are critical digital assets. The first decision involves whether the critical system performs a safety, security, or emergency preparedness (SSEP) function. If not, the evaluation determines whether the critical system affects, supports, or protects a critical asset. If the answer to any question is yes, the critical system is a critical digital asset. If all the answers are no, the critical system is a non-critical digital asset.

Fig. 3 (Source: Nuclear Regulatory Commission)
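
Taken together, Steps 2 and 3 boil down to a short chain of yes/no checks. Here is a minimal sketch of that screening logic in Python; the data model and function names are my own shorthand for the NRC’s flowcharts in Figures 2 and 3, not any actual industry tool.

  # Illustrative sketch of the Step 2/Step 3 screening logic (Figures 2 and 3).
  # The data model and names are hypothetical shorthand, not an NRC or industry tool.
  from dataclasses import dataclass

  @dataclass
  class Asset:
      name: str
      digital: bool                    # digital vs. analog
      performs_ssep_function: bool     # safety, security, or emergency preparedness function
      supports_critical_system: bool   # affects, supports, or protects a critical system
      supports_critical_asset: bool    # affects, supports, or protects a critical asset

  def is_critical_system(asset: Asset) -> bool:
      """Step 2: screen each digital asset in as a critical system or out as non-critical."""
      return asset.performs_ssep_function or asset.supports_critical_system

  def is_cda(asset: Asset) -> bool:
      """Step 3: screen each critical system in as a CDA or out as an NCDA."""
      if not (asset.digital and is_critical_system(asset)):
          return False
      return asset.performs_ssep_function or asset.supports_critical_asset

  # Example: a digital feedwater flow controller screens in as a CDA.
  controller = Asset("feedwater flow controller", True, True, True, True)
  print(is_cda(controller))  # True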

Remaining Steps

Once the CDAs are identified, the NRC requires that owners use defense-in-depth strategies to protect workers and the public from harm caused by a cyber-based attack. The defense-in-depth protective layers are:

  • Promptly detecting and responding to a cyber-based attack
  • Mitigating the adverse consequences of a cyber-based attack
  • Restoring CDAs affected by a cyber-based attack
  • Correcting vulnerabilities exploited by a cyber-based attack

The Power of One (Bad Person)

The NRC instituted cyber security regulatory requirements many years ago. The NRC’s inspectors have assessed how effectively measures undertaken by plant owners conform to these requirements. Thus, the U.S. nuclear industry does not have to quickly develop protections against cyber attacks in response to recent reports of hacking and attacking. The job instead is to ensure required protections remain in place as effectively as possible.

Unfortunately, digital technology can also broaden the potential harm caused by an insider. The NRC’s security regulations have long recognized that an insider might attempt sabotage alone or in conjunction with unauthorized intruders. In what the military terms a “force multiplier,” digital technology could enable the insider to attack multiple CDAs. The insider could also supply passwords to the outside bad guys, saving them the trouble of hacking and the risk of detection.

The hacking of computer systems by outsiders made news. The misuse of CDAs by an insider can make for grim headlines.

Cooper: Nuclear Plant Operated 89 Days with Key Safety System Impaired

The Nebraska Public Power District’s Cooper Nuclear Station, about 23 miles south of Nebraska City, has one boiling water reactor that began operating in the mid-1970s, adding about 800 megawatts of electricity to the power grid. Workers shut down the reactor on September 24, 2016, to enter a scheduled refueling outage. Events during that outage eventually led to an NRC special inspection.

Following the outage, workers reconnected the plant to the electrical grid on November 8, 2016, to begin its 30th operating cycle. During the outage, workers closed two valves that are normally open while the reactor operates. Later during the outage, workers were directed to re-open the valves and they completed paperwork indicating the valves had been opened. But a quarterly check on February 5, 2017, revealed that both valves remained closed. The closed valves impaired a key safety system for 89 days until the mis-positioned valves were discovered and opened. The NRC dispatched a special inspection team to the site on March 1, 2017, to look into the causes and consequences of the improperly closed valves.

The Event

Workers shut down the reactor on September 24, 2016. The drywell head and reactor vessel head were removed to allow access to the fuel in the reactor core. By September 28, the water level had been increased to more than 21 feet above the flange where the reactor vessel head is bolted to the lower portion of the vessel. Flooding this volume—called the reactor cavity or refueling well—permits spent fuel bundles to be removed while still underwater, protecting workers from the radiation.

With the reactor shut down and so much water inventory available, the full array of emergency core cooling systems required when the reactor operates was reduced to a minimal set. Reducing the number of systems required to remain in service facilitates maintenance and testing of the out-of-service components.

In the late afternoon of September 29, workers removed Loop A of the Residual Heat Removal (RHR) system from service for maintenance. The RHR system is like a nuclear Swiss Army knife—it can supply cooling water for the reactor core, containment building, and suppression pool and it can provide makeup water to the reactor vessel and suppression pool. Cross-connections enable the RHR system to perform so many diverse functions. Workers open and close valves to transition from one RHR mode of operation to another.

As indicated in Figure 1, the RHR system at Cooper consisted of two subsystems called Loop A and Loop B. The two subsystems provide redundancy—only one loop need function for the necessary cooling or makeup job to be accomplished successfully.

Fig. 1 (Source: Nebraska Public Power District, Individual Plant Examination (1993))

RHR Loop A features two motor-driven pumps (labeled P-A and P-C in the figure) that can draw water from the Condensate Storage Tank (CST), suppression chamber, or reactor vessel. The pump(s) send the water through, or around, a heat exchanger (labeled HX-A). When passing through the heat exchanger, heat is conducted through the metal tube walls to be carried away by the Service Water (SW) system. The water can be sent to the reactor vessel, sprayed inside the containment building, or sent to the suppression chamber. RHR Loop B is essentially identical.

Work packages for maintenance activities include steps, when applicable, to open electrical breakers (de-energizing components to protect workers from electrical shock) and to close valves (so isolated sections of piping can be drained of water and valves or pumps removed or replaced). The instructions for the RHR Loop A maintenance begun on September 29 included closing valves V-58 and V-60. These valves can only be opened and closed manually using handwheels. Valve V-58 is in the minimum flow line for RHR Pump A while V-60 is in the minimum flow line for RHR Pump C. These two minimum flow lines connect downstream of the manual valves, and the common line then connects to a larger pipe going to the suppression chamber.

Motor-operated valve MOV-M016A in the common line automatically opens when either RHR Pump A or C is running and the pump’s flow rate is less than 2,731 gallons per minute. The large RHR pumps generate considerable heat when they are running. The minimum flow line arrangement ensures that there’s sufficient water flow through the pumps to prevent them from being damaged by overheating. MOV-M016A automatically closes when pump flow rises above 2,731 gallons per minute to prevent cooling flow or makeup flow from being diverted.
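
Stated as logic, the interlock is simple: the valve opens whenever a pump is running with flow below the setpoint and closes once flow rises above it. Here is a minimal sketch of that behavior; the names are hypothetical shorthand and real plant logic includes additional permissives.

  # Restatement of the MOV-M016A minimum-flow interlock described above.
  # Names are hypothetical shorthand; actual plant logic includes more permissives.

  MIN_FLOW_SETPOINT_GPM = 2731  # setpoint cited for RHR Pumps A and C

  def min_flow_valve_should_open(pump_a_running: bool, pump_c_running: bool,
                                 loop_flow_gpm: float) -> bool:
      """Open the minimum-flow valve when either pump runs with flow below the setpoint."""
      any_pump_running = pump_a_running or pump_c_running
      return any_pump_running and loop_flow_gpm < MIN_FLOW_SETPOINT_GPM

  print(min_flow_valve_should_open(True, False, 500.0))   # True: valve opens to protect the pump
  print(min_flow_valve_should_open(True, False, 4000.0))  # False: valve closes to avoid diverting flow

  # With V-58 and V-60 sealed closed, the path upstream of MOV-M016A was blocked
  # regardless of how this valve behaved.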

The maintenance on RHR Loop A was completed by October 7. The work instructions directed operators to reopen valves V-58 and V-60 and then seal the valves in the opened position. For these valves, sealing involved installing a chain and padlock around the handwheel so the valve could not be repositioned. The valves were sealed, but mistakenly in the closed rather than opened position. Another operator independently verified that this step in the work instruction had been completed, but failed to notice that the valves were sealed in the wrong position.

At that time during the refueling outage, RHR Loop A was not required to be operable. All of the fuel had been offloaded from the reactor core into the spent fuel pool. On October 19, workers began transferring fuel bundles back into the reactor core.

On October 20, operators declared RHR Loop A operable. Due to the closed valves in the minimum flow lines, RHR Loop A was actually inoperable, but that misalignment was not known at the time.

The plant was connected to the electrical grid on November 8 to end the refueling outage and begin the next operating cycle.

Between November 23 and 29, workers audited all sealed valves in the plant per a procedure required to be performed every quarter. Workers confirmed that valves V-58 and V-60 were sealed, but failed to notice that the valves were sealed closed instead of opened.

On February 5, 2017, workers were once again performing the quarterly audit of all sealed valves. This time, they noticed that valves V-58 and V-60 were not opened as required. They corrected the error and notified the NRC of the discovery.

The Consequences

Valves V-58 and V-60 had been improperly closed for 89 days, 12 hours, and 49 minutes. During that period, the pumps in RHR Loop A had been operated 15 times for various tests. The longest time that any pump was operated without its minimum flow line available was determined to be 2 minutes and 18 seconds. Collectively, the pumps in RHR Loop A operated for a total of 21 minutes and 28 seconds with flow less than 2,731 gallons per minute.

Running the pumps at less than “minimum” flow introduced the potential for their having been damaged by overheating. Workers undertook several steps to determine whether damage had occurred. Considerable data is collected during periodic testing of the RHR pumps (as suggested by the fact it was known that the longest a pump ran without its minimum flow line was 2 minutes and 18 seconds). Workers reviewed data such as differential pressures and vibration levels from tests over the prior two years and found that current pump performance was unchanged from performance prior to the fall 2016 refueling outage.

Workers also calculated how long it would take a RHR pump to operate before becoming damaged. They estimated that time to be 32 minutes. To double-check their work, a consulting firm was hired to independently answer the same question. The consultant concluded that it would take an hour for an RHR pump to become damaged. (The 28 minute difference between the two calculations was likely due to the workers onsite making conservative assumptions that the more detailed analysis was able to reduce. But it’s a difference without distinction—both calculations yield ample margin to the total time the RHR pumps ran.)
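
The margin argument is straightforward arithmetic. Treating the published numbers as givens, a back-of-the-envelope comparison looks like the sketch below (an illustration only, not the licensee’s or consultant’s actual calculation):

  # Back-of-the-envelope margin check using the run times and damage estimates cited above.
  # This repeats the published numbers; it is not the actual engineering analysis.

  longest_single_run_s = 2 * 60 + 18          # 2 min 18 s: longest single run below minimum flow
  total_run_below_min_flow_s = 21 * 60 + 28   # 21 min 28 s: cumulative across the 15 runs

  onsite_estimate_s = 32 * 60       # workers' estimate of time to damage: 32 minutes
  consultant_estimate_s = 60 * 60   # consultant's estimate: one hour

  for label, limit_s in [("onsite estimate", onsite_estimate_s),
                         ("consultant estimate", consultant_estimate_s)]:
      margin_min = (limit_s - total_run_below_min_flow_s) / 60
      print(f"{label}: {margin_min:.1f} minutes of margin even against the cumulative run time")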

The testing and analysis clearly indicate that the RHR pumps were not damaged by operating during the 89-plus days their minimum flow lines were unavailable.

The Potential Consequences  

The RHR system can perform a variety of safety functions. If the largest pipe connected to the reactor vessel were to rupture, the two pumps in either RHR Loop are designed to provide more than sufficient makeup flow to refill the reactor vessel before the reactor core overheats.

The RHR system has high-capacity, low-head pumps. This means the pumps supply a lot of water (many thousands of gallons each minute) but at a low pressure. The RHR pumps deliver water at roughly one-third of the normal operating pressure inside the reactor vessel. When a small or medium-sized pipe ruptures, cooling water drains out but the reactor vessel pressure takes longer to drop below the point where the RHR pumps can supply makeup flow. During such an accident, the RHR pumps will automatically start but will send water through the minimum flow lines until the reactor vessel pressure drops low enough. The closure of valves V-58 and V-60 could have resulted in RHR Pumps A and C being disabled by overheating about an hour into such an accident.

Had RHR Pumps B and D remained available, their loss would have been inconsequential. Had RHR Pumps B and D been unavailable (such as due to failure of the emergency diesel generator that supplies them electricity), the headline could have been far worse.

NRC Sanctions

The NRC’s special inspection team identified the following two apparent violations of regulatory requirements, both classified as Green in the agency’s Green, White, Yellow and Red classification system:

  • Exceeding the allowed outage time in the operating license for RHR Loop A being inoperable. The operating license permitted Cooper to run for up to 7 days with one RHR loop unavailable, but the reactor operated far longer than that period with the mis-positioned valves.
  • Failure to implement an adequate procedure to control equipment. Workers used a procedure every quarter to check sealed valves. But the guidance in that procedure was not clear enough to ensure workers verified both that a valve was sealed and that it was in the correct position.

UCS Perspective

This near-miss illustrates the virtues, and limitations, of the defense-in-depth approach to nuclear safety.

The maintenance procedure directed operators to re-open valves V-58 and V-60 when the work on RHR Loop A was completed.

While quite explicit, that procedure step alone was not deemed reliable enough. So, the maintenance procedure required a second operator to independently verify that the valves had been re-opened.

While the backup measure was also explicit, it was not considered an absolute check. So, another procedure required each sealed valve to be verified every quarter.

It would have been good had the first quarterly check identified the mis-positioned valves.

It would have been better had the independent verifier found the mis-positioned valves.

It would have been best had the operator re-opened the valves as instructed.

But because no single barrier is 100% reliable, multiple barriers are employed. In this case, the third barrier detected and corrected a problem before it could contribute to a really bad day at the nuclear plant.

Defense-in-depth also accounts for the NRC’s levying two Green findings instead of imposing harsher sanctions. The RHR system performs many safety roles in mitigating accidents. The mis-positioned valves impaired, but did not incapacitate, one of two RHR loops. That impairment could have prevented one RHR loop from successfully performing its necessary safety function during some, but not all, credible accident scenarios. Even had the impairment taken RHR Loop A out of the game, other players on the Emergency Core Cooling System team at Cooper could have stepped in.

Had the mis-positioned valves left Cooper with a shorter list of “what ifs” that needed to line up to cause disaster or with significantly fewer options available to mitigate an accident, the NRC’s sanctions would have been more severe. The Green findings are sufficient in this case to remind Cooper’s owner, and other nuclear plant owners, of the importance of complying with safety regulations.

Accidents certainly reveal lessons that can be learned to lessen the chances of another accident. Near-misses like this one also reveal lessons of equal value, but at a cheaper price.

Turkey Point: Fire and Explosion at the Nuclear Plant

The Florida Power & Light Company’s Turkey Point Nuclear Generating Station about 20 miles south of Miami has two Westinghouse pressurized water reactors that began operating in the early 1970s. Built next to two fossil-fired generating units, Units 3 and 4 each add about 875 megawatts of nuclear-generated electricity to the power grid.

Both reactors hummed along at full power on the morning of Saturday, March 18, 2017, when problems arose.

The Event

At 11:07 am, a high energy arc flash (HEAF) in Cubicle 3AA06 of safety-related Bus 3A ignited a fire and caused an explosion. The explosion inside the small concrete-wall room (called Switchgear Room 3A) injured a worker and blew open Fire Door D070-3 into the adjacent room housing the safety-related Bus 3B (called Switchgear Room 3B.)

A second later, the Unit 3 reactor automatically tripped when Reactor Coolant Pump 3A stopped running. This motor-driven pump received its electrical power from Bus 3A. The HEAF event damaged Bus 3A, causing the reactor coolant pump to trip on under-voltage (i.e., less than the desired voltage of 4,160 volts.) The pump’s trip triggered the insertion of all control rods into the reactor core, terminating the nuclear chain reaction.

Another second later, Reactor Coolant Pumps 3B and 3C also stopped running. These motor-driven pumps received electricity from Bus 3B. The HEAF event should have been isolated to Switchgear Room 3A, but the force of the explosion blew open the connecting fire door, allowing Bus 3B to also be affected. Reactor Coolant Pumps 3B and 3C tripped on under-frequency (i.e., alternating current electricity at a frequency too far below the desired 60 cycles per second). Each Turkey Point unit has three Reactor Coolant Pumps that force the flow of water through the reactor core, out the reactor vessel to the steam generators where heat gets transferred to a secondary loop of water, and then back to the reactor vessel. With all three pumps turned off, the reactor core would be cooled by natural circulation. Natural circulation can remove small amounts of heat, but not larger amounts; hence, the reactor automatically shuts down when even one of its three Reactor Coolant Pumps is not running.
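
The trip logic described above can be summarized in a few lines. The sketch below is a simplified illustration; the under-voltage and under-frequency fractions shown are placeholders, not Turkey Point’s actual setpoints.

  # Simplified sketch of the pump and reactor trip logic described above. The
  # under-voltage and under-frequency fractions are placeholders, not plant setpoints.

  NOMINAL_VOLTS = 4160
  NOMINAL_HZ = 60.0

  def rcp_trips(bus_volts: float, bus_hz: float,
                uv_fraction: float = 0.75, uf_fraction: float = 0.95) -> bool:
      """A reactor coolant pump trips on under-voltage or under-frequency on its bus."""
      return bus_volts < uv_fraction * NOMINAL_VOLTS or bus_hz < uf_fraction * NOMINAL_HZ

  def reactor_trips(rcps_running) -> bool:
      """The reactor trips automatically if any of its three RCPs stops running."""
      return not all(rcps_running)

  print(rcp_trips(bus_volts=0.0, bus_hz=60.0))   # True: loss of Bus 3A voltage trips RCP 3A
  print(reactor_trips([False, True, True]))      # True: one stopped pump trips the reactor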

Shortly before 11:09 am, the operators in the control room received word about a fire in Switchgear Room 3A and the injured worker. The operators dispatched the plant’s fire brigade to the area. At 11:19 am, the operators declared an emergency due to a “Fire or Explosion Affecting the Operability of Plant Systems Required to Establish or Maintain Safe Shutdown.”

At 11:30 am, the fire brigade reported to the control room operators that there was no fire in either Switchgear Room 3A or 3B.

Complication #1

Figure 1 shows the Switchgear Building at the right end of the Unit 3 turbine building. Switchgear Rooms 3A and 3B are located adjacent to each other within the Switchgear Building. The safety-related buses inside these rooms take 4,160-volt electricity from the main generator, the offsite power grid, or an emergency diesel generator (EDG) and supply it to safety equipment needed to protect workers and the public from transients and accidents. Buses 3A and 3B are fully redundant; either can power enough safety equipment to mitigate accidents.

Fig. 1 (Source: Nuclear Regulatory Commission)

To guard against a single fire disabling both Bus 3A and Bus 3B despite their proximity, each switchgear room is designed as a 3-hour fire barrier. The floor, walls, and ceiling of the room are made from reinforced concrete. The opening between the rooms has a normally closed door with a 3-hour fire resistance rating.

Current regulatory requirements do not require the room to have blast-resistant fire doors, unless the doors are within 3 feet of a potential explosive hazard. (I could give you three guesses why all the values are 3’s, but a correct guess would divulge one-third of nuclear power’s secrets.) Cubicle 3AA06, which experienced the HEAF event, was 14.5 feet from the door.

Fire Door D070-3, presumably unaware that it was well outside the 3-foot danger zone, was blown open by the HEAF event. The opened door created the potential for one fire to disable Buses 3A and 3B, plunging the site into a station blackout. Fukushima reminded the world why it is best to stay out of the station blackout pool.

Complication #2

The HEAF event activated all eleven fire detectors in Switchgear Room 3A and activated both of the very early warning fire detectors in Switchgear Room 3B. Activation of these detectors sounded alarms at Fire Alarm Control Panel 3C286, which the operators acknowledged. These detectors comprise part of the plant’s fire detection and suppression systems intended to extinguish fires before they cause enough damage to undermine nuclear safety margins.

But workers failed to reset the detectors and restore them to service until 62 hours later. Bus 3B provided the only source of electricity to safety equipment after Bus 3A was damaged by the HEAF event. The plant’s fire protection program required that Switchgear Room 3B be protected by the full array of fire detectors or by a continuous fire watch (i.e., workers assigned to the area to immediately report signs of smoke or fire to the control room.) The fire detectors were out of service for 62 hours after the HEAF event and the continuous fire watches were put in place late.

Workers were in Switchgear Room 3B for nearly four hours after the HEAF event performing tasks like smoke removal. But after they left the area, a continuous fire watch was not posted until 1:15 pm on March 19, the day following the HEAF event. And even then, the fire watch workers were placed in Switchgear Room 3A, not in Switchgear Room 3B, which housed the bus that needed to be protected.

Had a fire started in Switchgear Room 3B, neither the installed fire detectors nor the human fire detectors would have alerted control room operators. The lights going out on Broadway, or whatever they call the main avenue at Turkey Point, might have been their first indication.

Complication #3

At 12:30 pm on March 18, workers informed the control room operators that the HEAF event damaged Bus 3A such that it could not be re-energized until repairs were completed. Bus 3A provided power to Reactor Coolant Pump 3A and to other safety equipment like the ventilation fan for the room containing Emergency Diesel Generator (EDG) 3A. Due to the loss of power to the room’s ventilation fan, the operators immediately declared EDG 3A inoperable.

EDGs 3A and 3B are the onsite backup sources of electrical power for safety equipment. When the reactor is operating, the equipment is powered by electricity produced by the main generator as shown by the green line in Figure 2. When the reactor is not operating, electricity from the offsite power grid flows in through transformers and Bus 3A to the equipment as indicated by the blue line in Figure 2. When under-voltage or under-frequency is detected on their respective bus, EDG 3A and 3B will automatically start and connect to the bus to supply electricity for the equipment as shown by the red line in Figure 2.

Fig. 2 (Source: Nuclear Regulatory Commission with colors added by UCS)

Very shortly after the HEAF event, EDG 3A automatically started due to under-voltage on Bus 3A. But protective relays detected a fault on Bus 3A and prevented electrical breakers from closing to connect EDG 3A to Bus 3A. EDG 3A was operating, but disconnected from Bus 3A, when the operators declared it inoperable at 12:30 pm due to loss of the ventilation fan for its room.
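
In effect, two independent interlocks were at work: bus under-voltage starts the diesel, while a detected bus fault blocks its output breaker from closing. Here is a minimal sketch of that behavior, with hypothetical names.

  # Restatement of the EDG 3A behavior described above: bus under-voltage starts the
  # diesel, but protective relays block the output breaker if the bus is faulted.
  # Signal and function names are hypothetical shorthand.

  def edg_should_start(bus_undervoltage: bool, bus_underfrequency: bool) -> bool:
      return bus_undervoltage or bus_underfrequency

  def edg_breaker_may_close(edg_running: bool, bus_fault_detected: bool) -> bool:
      return edg_running and not bus_fault_detected

  running = edg_should_start(bus_undervoltage=True, bus_underfrequency=False)
  may_load = edg_breaker_may_close(running, bus_fault_detected=True)
  print(running, may_load)  # True False: the diesel ran but never connected to Bus 3A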

But the operators allowed “inoperable” EDG 3A to continue operating until 1:32 pm. Given that (a) its ventilation fan was not functioning, and (b) it was not even connected to Bus 3A, they should not have permitted this inoperable EDG to continue operating for over an hour.

Complication #4

A few hours before the HEAF event on Unit 3, workers removed High Head Safety Injection (HHSI) pumps 4A and 4B from service for maintenance. The HHSI pumps are designed to transfer makeup water from the Refueling Water Storage Tank (RWST) to the reactor vessel during accidents that drain cooling water from the vessel. Each unit has two HHSI pumps; only one HHSI pump needs to function in order to provide adequate reactor cooling until the pressure inside the reactor vessel drops low enough to permit the Low Head Safety Injection pumps to take over.

The day before, workers had found a small leak from a test line downstream of the common pipe for the recirculation lines of HHSI Pumps 4A and 4B (circled in orange in Figure 3). The repair work was estimated to take 18 hours. Both pumps had to be isolated in order for workers to repair the leaking section.

Pipes cross-connect the HHSI systems for Units 3 and 4 such that HHSI Pumps 3A and 3B (circled in purple in Figure 3) could supply makeup cooling water to the Unit 4 reactor vessel when HHSI Pumps 4A and 4B were removed from service. The operating license allowed Unit 4 to continue running for up to 72 hours in this configuration.

Fig. 3 (Source: Nuclear Regulatory Commission with colors added by UCS)

Before removing HHSI Pumps 4A and 4B from service, operators took steps to protect HHSI Pumps 3A and 3B by further restricting access to the rooms housing them and posting caution signs at the electrical breakers supplying electricity to these motor-driven pumps.

But operators did not protect Buses 3A and 3B that provide power to HHSI Pumps 3A and 3B respectively. Instead, they authorized work to be performed in Switchgear Room 3A that caused the HEAF event.

The owner uses a computer program to characterize risk of actual and proposed plant operating configurations. Workers can enter components that are broken and/or out of service for maintenance and the program bins the associated risk into one of three color bands: green, yellow, and red in order of increasing risk. With only HHSI Pumps 4A and 4B out of service, the program determined the risk for Units 3 and 4 to be in the green range. After the HEAF event disabled HHSI Pump 3A, the program determined that the risk for Unit 4 increased to nearly the green/yellow threshold while the risk for Unit 3 moved solidly into the red band.
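
Conceptually, such a configuration risk monitor sums a risk contribution for each out-of-service component and bins the total against thresholds. The sketch below is a generic illustration with invented weights and thresholds; it is not the tool Turkey Point actually uses.

  # Generic illustration of binning a plant configuration into green/yellow/red bands.
  # The weights and thresholds are invented for illustration; they are not the values
  # in the plant's actual risk monitor.

  RISK_WEIGHT = {          # hypothetical risk contribution of each out-of-service component
      "HHSI Pump 4A": 1.0,
      "HHSI Pump 4B": 1.0,
      "HHSI Pump 3A": 3.0,
      "Bus 3A": 6.0,
  }

  def bin_configuration(out_of_service, yellow_threshold=5.0, red_threshold=9.0):
      total = sum(RISK_WEIGHT.get(component, 0.0) for component in out_of_service)
      if total >= red_threshold:
          return "red"
      if total >= yellow_threshold:
          return "yellow"
      return "green"

  print(bin_configuration(["HHSI Pump 4A", "HHSI Pump 4B"]))                            # green
  print(bin_configuration(["HHSI Pump 4A", "HHSI Pump 4B", "Bus 3A", "HHSI Pump 3A"]))  # red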

The Cause(s)

On the morning of Saturday, March 18, 2017, workers were wrapping a fire-retardant material called Thermo-Lag around electrical cabling in the room housing Bus 3A. Meshing made from carbon fibers was installed to connect sections of Thermo-Lag around the cabling for a tight fit. To minimize the amount of debris created in the room, workers cut the Thermo-Lag material to the desired lengths at a location outside the room about 15 feet away. But they cut and trimmed the carbon fiber mesh to size inside the room.

Bus 3A is essentially the nuclear-sized equivalent of a home’s breaker panel. Open the panel and one can open a breaker to stop the flow of electricity through that electrical circuit within the house. Bus 3A is a large metal cabinet. The cabinet is made up of many cubicles housing the electrical breakers controlling the supply of electricity to the bus and the flow of electricity to components powered by the bus. Because energized electrical cables and components emit heat, the metal doors of the cubicles often have louvers to let hot air escape.

The louvers also allow dust and small airborne debris (like pieces of carbon fiber) to enter the cubicles. The violence of the HEAF event (a.k.a. the explosion) destroyed some of the evidence at the scene, but carbon fiber pieces were found inside the cubicle where the HEAF occurred.  The carbon fiber was conductive, meaning that it could transport electrical current. Carbon fiber pieces inside the cubicle, according to the NRC, “may have played a significant factor in the resulting bus failure.”

Further evidence inside the cubicle revealed that the bolts for the connection of the “C” phase to the bottom of the panel had been installed backwards. These backwards bolts were the spot where high-energy electrical current flashed over, or arced, to the metal cabinet.

As odd as it seems, installing fire retardant materials intended to lessen the chances that a single fire compromises both electrical safety systems started a fire that compromised both electrical safety systems.

The Precursor Events (and LEAF)

On February 2, 2017, three electrical breakers unexpectedly tripped open while workers were cleaning up after removing and replacing thermal insulation in the new electrical equipment room.

On February 8, 2017, “A loud bang and possible flash were reported to have occurred” in the new electrical equipment room as workers were cutting and installing Thermo-Lag. Two electrical breakers unexpectedly tripped open. The equipment involved used 480 volts or less, making this a low energy arc fault (LEAF) event.

NRC Sanctions

The NRC dispatched a special inspection team to investigate the causes and corrective actions of this HEAF event. The NRC team identified the following apparent violations of regulatory requirements that the agency is processing to determine the associated severity levels of any applicable sanctions:

  • Failure to establish proper fire detection capability in the area following the HEAF event.
  • Failure to properly manage risk by allowing HHSI Pumps 4A and 4B to be removed from service and then allowing work inside the room housing Bus 3A.
  • Failure to implement effective Foreign Material Exclusion measures inside the room housing Bus 3A that enabled conductive particles to enter energized cubicles.
  • Failure to provide adequate design control in that equipment installed inside Cubicle 3AA06 did not conform to vendor drawings or engineering calculations.

UCS Perspective

This event illustrates both the lessons learned and the lessons unlearned from the fire at the Browns Ferry Nuclear Plant in Alabama that happened almost exactly 42 years earlier. The lesson learned was that a single fire could disable primary safety systems and their backups.

The NRC adopted regulations in 1980 intended to lessen the chances that one fire could wreak so much damage. The NRC found in the late 1990s that most of the nation’s nuclear power reactors, including those at Browns Ferry, did not comply with these fire protection regulations. The NRC amended its regulations in 2004 giving plant owners an alternative means for managing the fire hazard risk. Workers were installing fire protection devices at Turkey Point in March 2017 seeking to achieve compliance with the 2004 regulations because the plant never complied with the 1980 regulations.

The unlearned lesson involved the sheer and utter failure to take steps after small miscues to prevent a bigger miscue from happening. The fire at Browns Ferry was started by a worker using a lit candle to check for air leaking around sealed wall penetrations. The candle’s flame ignited the highly flammable sealant material. The fire ultimately damaged cables for all the emergency core cooling systems on Unit 1 and most of those systems on Unit 2. Candles had routinely been used at Browns Ferry and other nuclear power plants to check for air leaks. Small fires had been started, but had always been extinguished before causing much damage. So, the unsafe and unsound practice was continued until it very nearly caused two reactors to melt down. Then and only then did the nuclear industry change to a method that did not stick open flames next to highly flammable materials to see if air flow caused the flames to flicker.

Workers at Turkey Point were installing fire retardant materials around cabling. They cut some material in the vicinity of its application. On two occasions in February 2017, small debris caused electrical breakers to trip open unexpectedly. But they continued the unsafe and unsound practice until it caused a fire and explosion the following month that injured a worker and risked putting the reactor into a station blackout event. Then and only then did the plant owner find a better way to cut and install the material. That must have been one of the easiest searches in nuclear history.

The NRC – Ahead of this HEAF Curveball

The NRC and its international regulatory counterparts have been concerned about HEAF events in recent years. During the past two annual Regulatory Information Conferences (RICs), the NRC conducted sessions about fire protection research that covered HEAF. For example, the 2016 RIC included presentations from the Japanese and American regulators about HEAF. These presentations included videos of HEAF events conducted under lab conditions. The 2017 RIC included presentations about HEAF by the German and American regulators. Ironically, the HEAF event at Turkey Point occurred just a few days after the 2017 RIC session.

HEAF events were not fully appreciated when regulations were developed and plants were designed and built. The cooperative international research efforts are defining HEAF events faster than could be accomplished by any country alone. The research is defining factors that affect the chances and consequences of HEAF events. For example, the research indicates that aluminum, such as in cable trays holding energized electrical cables, can be ignited during a HEAF event, significantly adding to the magnitude and duration of the event.

As HEAF research has defined risk factors, the NRC has been working with nuclear industry representatives to better understand the role these factors may play across the US fleet of reactors. For example, the NRC recently obtained a list of aluminum usage around high voltage electrical equipment.

The NRC needs to understand HEAF factors as fully as practical before it can determine if additional measures are needed to manage the risk. The NRC is also collecting information about potential HEAF vulnerabilities. Collectively, these efforts should enable the NRC to identify any nuclear safety problems posed by HEAF events and to implement a triaged plan that resolves the biggest vulnerabilities sooner rather than later.

Nuclear Regulatory Commission: Contradictory Decisions Undermine Nuclear Safety

As described in a recent All Things Nuclear commentary, one of the two emergency diesel generators (EDGs) for the Unit 3 reactor at the Palo Verde Nuclear Generating Station in Arizona was severely damaged during a test run on December 15, 2016. The operating license issued by the Nuclear Regulatory Commission (NRC) allowed the reactor to continue running for up to 10 days with one EDG out of service. Because repairing the extensive damage would take far longer than the 10 days provided in the operating license, the owner asked the NRC for permission to continue operating Unit 3 for up to 62 days with only one EDG available. The NRC approved that request on January 4, 2017.

The NRC’s approval contradicted four other agency decisions on virtually the same issue.

Two of the four decisions also involved the Palo Verde reactors, so it’s not a case of the underlying requirements varying. And one of the four decisions was made afterwards, so it’s not a case of the underlying requirements changing over time. UCS requested that Hubert Bell, the NRC’s Inspector General, have his office investigate these five NRC decisions to determine whether they are consistent with regulations, policies, and practices and, if not, identify gaps that the NRC staff needs to close in order to make better decisions more often in the future.

Emergency Diesel Generator Safety Role

NRC’s safety regulations, specifically General Design Criteria 34 and 35 in Appendix A to 10 CFR Part 50, require that nuclear power reactors be designed to protect the public from postulated accidents such as the rupture of the largest diameter pipe connected to the reactor vessel, which causes cooling water to rapidly drain away and impedes the flow of makeup cooling water. For reliability, an array of redundant emergency pumps—most powered by electricity but a few steam-driven—are installed. Reliability also requires redundant sources of electricity for these emergency pumps. At least two transmission lines must connect the reactor to its offsite electrical power grid and at least two onsite sources of backup electrical power must be provided. Emergency diesel generators are the onsite backup power sources at every U.S. nuclear power plant except one (Oconee in South Carolina, which relies on backup power from generators at a nearby hydroelectric dam).

Because, as the March 2011 earthquake in Japan demonstrated at Fukushima, all of the multiple connections to the offsite power grid could be disabled for the same reason, the NRC’s safety regulations require that postulated accidents be mitigated relying solely on emergency equipment powered from the onsite backup power sources. If electricity from the offsite power grid is available, workers are encouraged to use it. But the reactor must be designed to cope with accidents assuming that offsite power is not available.

The NRC’s safety regulations further require that reactors cope with postulated accidents assuming offsite power is not available and that one additional safety system malfunction or single operator mistake impairs the response. This single failure provision is the reason that Palo Verde and other U.S. nuclear power reactors have two or more EDGs per reactor.

Should a pipe connected to the reactor vessel break when offsite power is unavailable and a single failure disables one EDG, the remaining EDG(s) are designed to automatically start up and connect to the in-plant electrical circuits within seconds. The array of motor-driven emergency pumps is then designed to automatically start and begin supplying makeup cooling water to the reactor vessel within a few more seconds. Computer studies are run to confirm that sufficient makeup flow is provided in time to prevent the reactor core from overheating and being damaged.

Palo Verde: 62-Day EDG Outage Time Basis

In the safety evaluation issued with the January 4, 2017, amendment, the NRC staff wrote “Offsite power sources, and one train of onsite power source would continue to be available for the scenario of a loss-of-coolant-accident.” That statement contradicted statements the NRC had previously made about Palo Verde and DC Cook and subsequently made about the regulations themselves. Furthermore, this statement pretended that the regulations in General Design Criteria 34 and 35 simply do not exist.

Palo Verde: 2006 Precedent

On December 5, 2006, the NRC issued an amendment to the operating licenses for Palo Verde Units 1, 2, and 3 extending the EDG allowed outage time to 10 days from its original 72 hour limit. In the safety evaluation issued for this 2006 amendment, the NRC staff explicitly linked the reactor’s response to a loss of coolant accident with concurrent loss of offsite power:

During plant operation with both EDGs operable, if a LOOP [loss of offsite power] occurs, the ESF [engineered safeguards or emergency system] electrical loads are automatically and sequentially loaded to the EDGs in sufficient time to provide for safe reactor shutdown or to mitigate the consequences of a design-basis accident (DBA) such as a loss-of-coolant accident (LOCA).

Palo Verde: 2007 Precedent

On February 21, 2007, the NRC issued a White inspection finding for one of the EDGs on Palo Verde Unit 3 being non-functional for 18 days while the reactor operated (exceeding the 10-day allowed outage time provided by the December 2006 amendment.) The NRC determined the EDG impairment actually existed for a total of 58 days. The affected EDG was successfully tested 40 days into that period. Workers discovered a faulty part in the EDG 18 days later. The NRC assumed the EDG was non-functional between its last successful test run and replacement of the faulty part. Originally, the NRC staff estimated that the affected EDG had a 75 percent chance of successfully starting during the initial 40 days and a 0 percent chance of successfully starting during the final 18 days. Based on those assumptions, the NRC determined the risk to approach the White/Yellow inspection finding threshold. The owner contested the NRC’s preliminary assessment. The NRC’s final assessment and associated White inspection finding only considered the EDG’s unavailability during the final 18 days.

Fig. 1 (Source: NRC)

Somehow, the same NRC that estimated the risk rose to the White level when an EDG was unavailable for 18 days, and approached the White/Yellow threshold when an additional 40 days of the EDG being 25 percent impaired were included, concluded that an EDG being unavailable for 62 days now carried a risk of Green or less. The inconsistency makes no sense. And it makes little safety.

DC Cook: 2015 Precedent

One of the two EDGs for the Unit 1 reactor at the DC Cook nuclear plant in Michigan was severely damaged during a test run on May 21, 2015. The owner applied to the NRC for a one-time amendment to the operating license to allow the reactor to continue running for up to 65 days while the EDG was repaired and restored to service.

The NRC asked the owner how the reactor would respond to a loss of coolant accident with a concurrent loss of offsite power and the single failure of the remaining EDG. In other words, the NRC asked how the reactor would comply with federal safety regulations.

The owner shut down the Unit 1 reactor and restarted it on July 29, 2015, after repairing its broken EDG.

Rulemaking: 2017 Subsequent

On January 26, 2017, the NRC staff asked their Chairman and Commissioners for permission to terminate a rulemaking effort initiated in 2008 seeking to revise federal regulations to decouple LOOP from LOCA. The NRC staff explained that their work to date had identified numerous safety issues about decoupling LOOP from LOCA. Rather than put words in the NRC’s mouth, I’ll quote from the NRC staff’s paper: “The NRC staff determined that these issues would need to be adequately addressed in order to complete a regulatory basis that could support a proposed LOOP/LOCA rulemaking. To complete a fully developed regulatory basis for the LOOP/LOCA rulemaking, the NRC staff would need to ensure that these areas of uncertainty are adequately addressed as part of the rulemaking activity.”

It’s baffling how the numerous issues that had to be resolved before the NRC staff could complete a regulatory basis for the LOOP/LOCA rulemaking would not also have to be resolved before the NRC would approve running a reactor for months assuming that a LOOP/LOCA could not occur.

4 out of 5 Ain’t Safe Enough

In deciding whether a loss of offsite power event could be unlinked from a postulated loss of coolant accident, the NRC answered “no” four out of five times.

Fig. 2 (Source: UCS)

Four out of five may be enough when it comes to dentists who recommend sugarless gum, but it’s not nearly safe enough when the lives of millions of Americans are at stake.

We are hopeful that the Inspector General will help the NRC do better in the future.

Trump Administration Blocks Government Scientists from Attending International Meeting on Nuclear Power

The Trump administration has barred US government technical experts on nuclear energy from attending a major international conference in Russia. The conference, co-sponsored by the International Atomic Energy Agency (IAEA) and ROSATOM, the Russian state atomic energy corporation, began today in the city of Ekaterinburg.

Preventing US government scientists from delivering scheduled talks at an IAEA conference is highly unusual. This decision is apparently a consequence of the deteriorating relationship between the US and Russia. I learned about this when I arrived at the conference today to find that I was one of only a handful of US participants, out of several hundred attendees.

With so many communication channels between the U.S. and Russia now cut off, it is essential to preserve scientific cooperation in areas where there is common ground between the two countries. The Trump administration’s action is inconsistent with this goal.

Nuclear Leaks: The Back Story the NRC Doesn’t Want You to Know about Palo Verde

As described in a recent All Things Nuclear commentary, one of two emergency diesel generators (EDGs) for the Unit 3 reactor at the Palo Verde Nuclear Generating Station in Arizona was severely damaged during a test run on December 15, 2016. The operating license issued by the Nuclear Regulatory Commission (NRC) allowed the reactor to continue running for up to 10 days with one EDG out of service. Because the extensive damage required far longer than 10 days to repair, the owner asked the NRC for permission to continue operating Unit 3 for up to 62 days with only one EDG available. The NRC approved that request.

Around May 18, 2017, I received an envelope in the mail containing internal NRC documents with the back story for this EDG saga. I submitted a request under the Freedom of Information Act (FOIA) for these materials, but the NRC informed me that they could not release the documents because the matter was still under review by the agency. I asked the NRC’s Office of Public Affairs for a rough estimate of when the agency would conclude its review and release the documents. I was told that their review of the safety issues raised in the documents wasn’t a priority for the NRC and they’d get to it when they got to it.

Well, nuclear safety is a priority for me at UCS. And since I already have the documents, I don’t need to wait for the NRC to get around to concluding its stonewalling—I mean “review”—of the issues. Here is the back story the NRC does not want you to know about the busted EDG at Palo Verde.

Emergency Diesel Generator Safety Role

The NRC issued the operating license for Palo Verde Unit 3 on November 25, 1987. That initial operating license allowed Unit 3 to continue running for up to 72 hours with one of its two EDGs out of service. Called the “allowable outage time,” the 72 hours balanced the safety need to have a reliable backup power supply with the need to periodically test the EDGs and perform routine maintenance.

The EDGs are among the most important safety equipment at nuclear power plants like Palo Verde. The March 2011 accident at Fukushima Daiichi tragically demonstrated this vital role. A large earthquake knocked out the electrical power grid to which Fukushima Daiichi’s operating reactors were connected. Power was lost to the pumps providing cooling water to the reactor vessels, but the EDGs automatically started and took over this role. About 45 minutes later, a tsunami wave spawned by the earthquake inundated the site and flooded the rooms housing the EDGs. With both the normal and backup power supplies unavailable, workers could only supply makeup cooling water using battery-powered systems and portable generators. They fought a heroic but futile battle and all three reactors operating at the time suffered meltdowns.

More EDG Allowable Outage Time

On December 23, 2005, the owner of Palo Verde submitted a request to the NRC seeking to extend the allowable outage time for an EDG to be out of service from 72 hours to 10 days. Longer EDG allowable outage times were being sought by nuclear plant owners. Originally, nuclear power reactors shut down every year for refueling. The refueling outages provided ample time to conduct the routine testing and inspection tasks required for the EDGs. To boost electrical output (and hence revenue), owners transitioned to refueling reactors only every 18 or 24 months and to shortening the duration of the refueling outages. To facilitate the transitions, more and more testing and inspections previously performed during refueling outages were conducted with the reactors operating. The argument supporting online maintenance was that while it adversely affected availability (i.e., an EDG was deliberately removed from service for testing and inspection), it increased reliability (i.e., tests confirming the EDGs were operable were conducted every few weeks instead of every 18 to 24 months). The NRC approved the amendment to the operating licenses extending the EDG allowable outage times to 10 days on December 5, 2006.
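
The availability-versus-reliability tradeoff behind online maintenance can be made concrete with rough numbers. The calculation below uses assumed values (not Palo Verde data) to show why more frequent testing can be worth a small amount of planned unavailability.

  # Rough illustration of the availability/reliability tradeoff behind online maintenance.
  # All numbers are assumed for illustration; they are not Palo Verde values.

  HOURS_PER_MONTH = 730

  def planned_unavailability(test_hours_per_test, tests_per_cycle, cycle_months):
      """Fraction of an operating cycle an EDG is deliberately out of service for testing."""
      return (test_hours_per_test * tests_per_cycle) / (cycle_months * HOURS_PER_MONTH)

  def mean_undetected_months(test_interval_months):
      """A hidden failure occurring at a random time goes undetected, on average,
      for about half the interval between tests."""
      return test_interval_months / 2

  # Assumed: 4-hour online tests every month over an 18-month cycle,
  # versus testing only during each refueling outage.
  print(f"{planned_unavailability(4, 18, 18):.3%} planned unavailability from monthly testing")
  print(f"{mean_undetected_months(1):.1f} months of average undetected failure with monthly tests")
  print(f"{mean_undetected_months(18):.1f} months if tested only at each refueling")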

More NRC/Industry Efforts on Allowable Outage Times

While the EDGs have important safety roles to play, they are not the only safety role players. The operating license for a nuclear power reactor covers dozens of components, each with its own allowable outage time. Around the time that longer EDG allowable outage times were sought and obtained at Palo Verde, the nuclear industry and the NRC were working on protocols to make proper decisions about allowable outage times for various safety components. On behalf of the nuclear industry, the Nuclear Energy Institute submitted guidance document NEI 06-09 to the NRC. On May 17, 2007, the NRC issued its safety evaluation report documenting its endorsement of NEI 06-09 along with its qualifications for that endorsement.

To create yet another acronym for no apparent reason, the nuclear industry and NRC conjured up Risk Informed Completion Time (RICT) to use in place of allowable outage time (AOT). The NRC explicitly endorsed a 30-day limit on RICTs (AOTs):

“The RICT is further limited to a deterministic maximum of 30 days (referred to as the backstop CT [completion time] from the time the TS [technical specification or operating license requirement] was first entered.”

The NRC explained why the 30-day maximum limit was necessary:

“The 30-day backstop CT assures that the TS equipment is not out of service for extended periods, and is a reasonable upper limit to permit repairs and restoration of equipment to an operable status.”

NEI 06-09 and the NRC’s safety evaluation applied to all components within a nuclear power reactor’s operating license. The 30-day backstop limit was the longest AOT (RICT) permitted. Shorter RICTs (AOTs) might apply for components with especially vital safety roles.

For example, the NRC established more limiting AOTs (RICTs) for the EDGs. In February 2002, the NRC issued Branch Technical Position 8-8, “Onsite (Emergency Diesel Generators) and Offsite Power Sources Allowed Outage Time Extensions.” This Branch Technical Position is part of the NRC’s Standard Review Plan for operating reactors. The Standard Review Plan helps plant owners meet NRC’s expectations and NRC reviewers and inspectors verify that expectations have been met. The Branch Technical Position is quite clear about the EDG allowable outage time limit:

“An EDG or offsite power AOT license amendment of more than 14 days should not be considered by the staff for review.” [underlining in original]

Exceptions and Precedent

Consistent with the “every rule has its exception” cliché, neither the 14-day EDG AOT in NRC Branch Technical Position 8-8 nor the 30-day backstop limit in the NRC’s safety evaluation for NEI 06-09 is considered a hard and fast limit. Owners can, and do, request NRC’s permission for longer times under special circumstances.

The owner of the DC Cook nuclear plant in Michigan asked the NRC on May 28, 2015, for permission to operate the Unit 1 reactor for up to 65 days with one of its two EDGs out of service. The operating license for Unit 1 already allowed one EDG to be out of service for up to 14 days. During testing of an EDG on May 21, 2015, inadequate lubrication caused one of the bearings to be severely damaged. Repairs were estimated to require 56 days.

The NRC emailed the owner questions about the 65-day EDG AOT on May 28 and May 29. Among the questions asked by the NRC was how Unit 1 would respond to a design basis loss of coolant accident (LOCA) concurrent with a loss of offsite power (LOOP) and a single failure of the only EDG in service. The EDGs are designed to automatically start from the standby mode and deliver electricity to safety components within seconds. This rapid response is needed to ensure the reactor core remains cooled if a broken pipe (i.e., LOCA) drains away cooling water while electrical power to the makeup pumps is unavailable (i.e., LOOP). The single failure provision is an inherent element of the redundancy and defense-in-depth approach to nuclear safety.

The NRC did not approve the request for a 65-day EDG AOT for Cook Unit 1.

The NRC did not deny the request either.

On June 1, 2015, the owner formally withdrew its request for the 65-day EDG AOT and shut down the Unit 1 reactor. The Unit 1 reactor was restarted on July 29, 2015.

More on the Back Story

About 18 months after one of two EDGs for the Unit 1 reactor at DC Cook was severely damaged during a test run, one of two EDGs for the Unit 3 reactor at Palo Verde was severely damaged during a test run.

About 18 months after DC Cook’s owner requested permission from the NRC to continue running Unit 1 for up to 65 days with only one EDG in service, Palo Verde’s owner requested permission to continue running Unit 3 for up to 62 days.

About 18 months after the NRC staff asked DC Cook’s owner how Unit 1 would respond to a loss of coolant accident concurrent with a loss of offsite power and failure of the remaining EDG, the NRC staff merely assumed that a loss of coolant accident would not happen during the 62 days that Palo Verde Unit 3 ran with only one EDG in service. Enter the back story as reported by the Arizona Republic.

On December 23, 2016, and January 9, 2017, Differing Professional Opinions (DPOs) were initiated by member(s) of the NRC staff registering formal disagreement with NRC senior management’s plan to allow the 62-day EDG AOT for Palo Verde Unit 3. The initiator(s) checked a box on the DPO form to have the DPO case file be made publicly available (Fig. 1).

Fig. 1 (Source: United States Postal Service)

The DPO initiator(s) allege that the 62-day EDG AOT was approved by the NRC because the agency assumed that a loss of coolant accident simply would not happen. The DPO stated:

“The NRC and licensee ignored the loss of coolant accident (LOCA) consequence element. Longer outage times increase the vulnerability to a design basis accident involving a LOCA with the loss of offsite power (LOOP) event with a failure of Train A equipment.”

Palo Verde has two fully redundant sets of safety equipment, Trains A and B. The broken EDG provided electrical power (when unbroken) to Train B equipment. The 62-day EDG AOT was approved based on workers scurrying about to manually start gas turbine generators and portable generators to provide electrical power that would otherwise be supplied by EDG 3B. The DPO stated:

“The Train B EDG auto starts and loads all safety equipment in 40 seconds. The manual actions take at least 20 minutes, if not significantly longer.”

Again, the rapid response is required to mitigate a loss of coolant accident that drains water from the reactor vessel. When water does not drain away, it takes time for the reactor core’s decay heat to warm up and boil away the reactor vessel’s water, justifying a slower response time.

The NRC staff considered a loss of coolant accident for the broken EDG at Cook but allegedly dismissed it at Palo Verde. Curious.

The DPO also disparaged the non-routine measures undertaken by the NRC to hide their deliberations from the public:

“The pre-submittal call occurred on a “non-recorded” [telephone] line. The NRC staff debated the merits of the call in a headquarters staff only discussion. Note that the Notice of Enforcement Discretion calls are done on recorded [telephone] lines.”

President Richard Nixon’s downfall occurred when it became known that tape recordings of his impeachable offenses existed. The NRC avoided this trap by deliberately not following its routine practice of recording the telephone discussions. Peachy!

Cognitive Dissonance or Unnatural Selection?

The NRC’s approval of the 62-day EDG AOT for Palo Verde Unit 3 is perplexing, at best.

In the amendment it issued January 4, 2017, approving the extension, the NRC wrote:

“Offsite power sources and one train of onsite power source would continue to be available for the scenario of a loss-of-coolant accident” while EDG 3B was out of service.

In other words, the NRC assumed that loss of offsite power (LOOP) and loss of coolant accident (LOCA) are separate events. The NRC assumed that if a LOCA occurred, electrical power from the offsite grid would enable safety equipment to refill the reactor vessel and prevent meltdown. And the NRC assumed that if a LOOP occurred, a LOCA would not drain water from the reactor vessel, giving workers time to find, deploy, and start up the portable equipment and prevent core overheating.

But in the amendment it issued December 5, 2006, establishing the 10-day EDG AOT, the NRC wrote:

“During plant operation with both EDGs operable, if a LOOP occurs, the ESF [engineered safeguards] electrical loads are automatically and sequentially loaded to the EDGs in sufficient time to provide for safe reactor shutdown or to mitigate the consequences of a design-basis accident (DBA) such as a loss-of-coolant accident (LOCA).”

In those words, the NRC assumed that LOOP and LOCA could occur concurrently in design basis space.

More importantly, page B 3.8.1-2 of the bases document dated May 12, 2016, for the Palo Verde operating licenses is quite explicit about the LOOP/LOCA relationship:

“In the event of a loss of preferred power, the ESF electrical loads are automatically connected to the DGs in sufficient time to provide for safe reactor shutdown and to mitigate the consequences of a Design Basis Accident (DBA) such as a loss of coolant accident (LOCA).”

In those words, the operating licenses issued by the NRC assumed that LOOP and LOCA could occur concurrently in design basis space.

So, the NRC either experienced cognitive dissonance in having two opposing viewpoints on the same issue or made the unnatural selection of LOCA without LOOP.

Actions May Speak Louder Than Words, But Inaction Shouts Loudest

Check out this chronology:

  • December 15, 2016: EDG 3B for Palo Verde Unit 3 failed catastrophically during a test run
  • December 21, 2016: Owner requested 21-day EDG AOT
  • December 23, 2016: NRC approved 21-day EDG AOT
  • December 23, 2016: DPO submitted opposing 21-day EDG AOT
  • December 30, 2016: Owner requested 62-day EDG AOT
  • January 4, 2017: NRC approved 62-day EDG AOT
  • January 9, 2017: DPO submitted opposing 62-day EDG AOT
  • February 6, 2017: NRC special inspection team arrived at Palo Verde to examine EDG’s failure cause
  • February 10, 2017: NRC special inspection team concluded its onsite examinations
  • April 10, 2017: NRC issued special inspection team report

The NRC jumped through hoops during the Christmas and New Year’s holidays to expeditiously approve a request to allow Unit 3 to continue generating revenue.

The NRC has not yet responded to two DPOs questioning the safety rationale behind the NRC’s approval.

If the NRC really and truly had a solid basis for letting Palo Verde Unit 3 run for so long with only one EDG, they have had plenty of time to address the issues raised in the DPOs. Way more than 62 days, in fact.

William Shakespeare wrote about something rotten in Denmark.

The bard never traveled to Rockville to visit the NRC’s headquarters. Had he done so, he might have discovered that rottenness is not confined to Denmark.

Oyster Creek Reactor: Bad Nuclear Vibrations

The Oyster Creek Nuclear Generating Station near Forked River, New Jersey is the oldest nuclear power plant operating in the United States. It began operating in 1969 around the time Neil Armstrong and Buzz Aldrin were hiking the lunar landscape.

Oyster Creek has a boiling water reactor (BWR) with a Mark I containment design, similar to the Unit 1 reactor at Fukushima Daiichi. Water entering the reactor vessel is heated to the boiling point by the energy released by the nuclear chain reaction within the core (see Figure 1). The steam flows through pipes from the reactor vessel to the turbines. The steam spins the turbines connected to the generator that produces electricity distributed by the offsite power grid. Steam discharged from the turbines flows into the condenser where it is cooled by water drawn from the Atlantic Ocean, or Barnegat Bay. The steam vapor is converted back into liquid form. Condensate and feedwater pumps supply the water collected in the condenser to the reactor vessel to repeat the cycle.

Fig. 1 (Source: Tennessee Valley Authority)

The turbine is actually a set of four turbines—one high pressure turbine (HPT) and three low pressure turbines (LPTs). The steam passes through the high pressure turbine and then enters the moisture separators. The moisture separators remove any water droplets that may have formed during the steam’s passage through the high pressure turbine. The steam leaving the moisture separators then flows in parallel through the three low pressure turbines.

The control system for the turbine uses the speed of the turbine shaft (normally 1,800 revolutions per minute) and the pressure of the steam entering the turbine (typically around 940 pounds per square inch) to regulate the position of control valves (CVs) in the steam pipes to the high pressure turbine. If the turbine speed drops or the inlet pressure rises, the control system opens the control valves a bit to bring these parameters back to their desired values. Conversely, if the turbine speed increases or the inlet pressure drops, the control system signals the control valves to close a tad to restore the proper conditions. It has been said that the turbine is slave to the reactor—if the reactor power level increases or decreases, the turbine control system automatically repositions the control valves to correspond to the changed steam flow rate.

The inlet pressure is monitored by Pressure Transmitters (PT) that send signals to the Electro-Hydraulic Control (EHC) system. The EHC system derives its name from the fact that it uses electrical inputs (e.g., inlet pressure, turbine speed, desired speed, desired inlet pressure, etc.) to regulate the oil pressure in the hydraulic system that positions the valves.
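
To make the control scheme concrete, here is a simplified sketch of that feedback idea in Python. It is not the actual EHC logic; the gains are assumed values, and only the setpoints come from the description above:

```python
# Simplified illustration of the turbine control idea described above: compare
# measured speed and inlet pressure to their setpoints and nudge the control
# valve position accordingly. Gains are assumed values for illustration.

SPEED_SETPOINT = 1800.0       # rpm (from the text)
PRESSURE_SETPOINT = 940.0     # psi at the turbine inlet (from the text)
SPEED_GAIN = 0.0005           # assumed: valve-position change per rpm of error
PRESSURE_GAIN = 0.001         # assumed: valve-position change per psi of error

def new_valve_position(position, speed_rpm, inlet_pressure_psi):
    """Return an updated control valve position (0 = shut, 1 = full open)."""
    # Speed below setpoint or pressure above setpoint -> open the valves a bit.
    speed_error = SPEED_SETPOINT - speed_rpm
    pressure_error = inlet_pressure_psi - PRESSURE_SETPOINT
    position += SPEED_GAIN * speed_error + PRESSURE_GAIN * pressure_error
    return min(max(position, 0.0), 1.0)   # keep the valve within its travel

# Example: speed sagging and inlet pressure rising both call for opening the valves.
print(new_valve_position(0.80, speed_rpm=1795.0, inlet_pressure_psi=945.0))
```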

Fig. 2 (Source: Nuclear Regulatory Commission)

Bad Vibrations

In the early morning hours of November 20, 2016, the operators at Oyster Creek were conducting the quarterly test of the turbine control system. With the reactor at 95 percent power, the operator depressed a test pushbutton at 3:26 am per the procedure. The plant’s response was unexpected. The control valves and bypass valves began opening and closing by small amounts, causing the reactor pressure to fluctuate. Workers in the turbine building notified the control room operators that the linkages to the valves were vibrating. The operators began reducing the reactor power level in an attempt to stop the vibrations and pressure fluctuations.

The reactor automatically shut down at 3:42 am from 92 percent power on high neutron flux in the reactor. Workers later found the linkage for control valve #2 had broken due to the vibrations and the linkage for control valve #4 had vibrated loose. The linkages are “mechanical arms” that enable the turbine control system to reposition the valves. The broken and loosened linkages impaired the ability of the control system to properly reposition the valves.

These mechanical malfunctions prevented the EHC system from properly controlling reactor pressure during the test and subsequent power reduction. The pressure inside the reactor vessel increased. In a BWR, reactor pressure increases collapse and shrink steam bubbles. Displacing steam void spaces with water increases the reactor power level. When atoms split to release energy, they also release neutrons. The neutrons can interact with other atoms, causing them to split. Water is much better than steam bubbles at slowing down the neutrons to the range where the neutrons best interact with atoms. Put another way, the steam bubbles permit high energy neutrons to speed away from the fuel and get captured by non-fuel parts within the reactor vessel while the water better confines the neutrons to the fuel region.

The EHC system’s problem allowed the pressure inside the reactor vessel to increase. The higher pressure collapsed steam bubbles, increasing the reactor power level. As the reactor power level increased, more neutrons scurried about as more and more atoms split. The neutron monitoring system detected the increasing inventory of neutrons and initiated the automatic shut down of the reactor to avoid excessive power and fuel damage.

Workers attributed the vibrations to a design flaw. A component in the EHC system is specifically designed to dampen vibrations in the tubing providing hydraulic fluid to the linkages governing valve positions. But under certain conditions, depressing the test pushbutton creates a pressure pulse on that component. Instead of dampening the pressure pulses, the component reacts in a way that causes the hydraulic system pressure to oscillate, creating the vibrations that damaged the linkages.

The component and damaged linkages were replaced. In addition, the test procedure was revised to avoid performing that specific portion of the test when the reactor is operating. In the future, that part of the turbine valve test will be performed during an outage.

Vibrations Re-Visited

It was not the first time that Oyster Creek was shut down due to problems performing this test. It wasn’t even the first time this decade.

On December 14, 2013, operators conducted the quarterly test of the turbine control system at 95 percent power. They encountered unanticipated valve responses and reactor pressure changes during the test. The operators manually shut down the reactor as reactor pressure rose towards the automatic shut down setpoint.

Improper assembly of components in the EHC system, and the vibrations that caused them to come apart, resulted in control valves #2 and #3 closing. Their closure increased the pressure within the reactor vessel, leading the operators to manually shut down the reactor before it automatically scrammed.

The faulty parts were replaced.

Bad Vibrations at a Good Time

If every test was always successful, there would be little value derived from the testing program.

Similarly, if tests were seldom successful, there would be little value from the testing program.

Tests that occasionally are unsuccessful have value.

First, they reveal things that need to be fixed.

Second, they provide insights on the reliability of the items being tested. (I suppose tests that always fail also yield insights about reliability, so I should qualify this statement to say they provide useful and meaningful insights about reliability.)

Third, they occur during a test rather than when needed to prevent or mitigate an accident. Accidents may reveal more insights than those revealed by test failures. But the cost per insight is a better deal with test failures.

Increase in Cancer Risk for Japanese Workers Accidentally Exposed to Plutonium

According to news reports, five workers were accidentally exposed to high levels of radiation at the Oarai nuclear research and development center in Tokai-mura, Japan on June 6th. The Japan Atomic Energy Agency, the operator of the facility, reported that five workers inhaled plutonium and americium that was released from a storage container that the workers had opened. The radioactive materials were contained in two plastic bags, but they had apparently ripped.

We wish to express our sympathy for the victims of this accident.

This incident is a reminder of the extremely hazardous nature of these materials, especially when they are inhaled, and illustrates why they require such stringent procedures when they are stored and processed.

According to the earliest reports, it was estimated that one worker had inhaled 22,000 becquerels (Bq) of plutonium-239, and 220 Bq of americium-241. (One becquerel corresponds to one radioactive decay per second.) The others inhaled between 2,200 and 14,000 Bq of plutonium-239 and quantities of americium-241 similar to that of the first worker.

More recent reports have stated that the amount of plutonium inhaled by the most highly exposed worker is now estimated to be 360,000 Bq, and that the 22,000 Bq measurement in the lungs was made 10 hours after the event occurred. Apparently, the plutonium that remains in the body decreases rapidly during the first hours after exposure, as a fraction of the quantity initially inhaled is expelled through respiration. But there are large uncertainties.

The mass equivalent of 360,000 Bq of Pu-239 is about 150 micrograms. It is commonly heard that plutonium is so radiotoxic that inhaling only one microgram will cause cancer with essentially one hundred percent certainty. This is not far off the mark for certain isotopes of plutonium, like Pu-238, but Pu-239 decays more slowly, so it is less toxic per gram.  The actual level of harm also depends on a number of other factors. Estimating the health impacts of these exposures in the absence of more information is tricky, because those impacts depend on the exact composition of the radioactive materials, their chemical forms, and the sizes of the particles that were inhaled. Smaller particles become more deeply lodged in the lungs and are harder to clear by coughing. And more soluble compounds will dissolve more readily in the bloodstream and be transported from the lungs to other organs, resulting in exposure of more of the body to radiation. However, it is possible to make a rough estimate.
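
The 150 microgram figure can be checked with a quick calculation from the Pu-239 half-life of about 24,100 years (the other constants are standard values):

```python
# Back-of-the-envelope check of the ~150 microgram figure: convert an activity
# of 360,000 Bq of Pu-239 into mass using the isotope's half-life.
import math

HALF_LIFE_YEARS = 24_110          # Pu-239 half-life
SECONDS_PER_YEAR = 3.156e7
AVOGADRO = 6.022e23
ATOMIC_MASS = 239                 # grams per mole of Pu-239

decay_constant = math.log(2) / (HALF_LIFE_YEARS * SECONDS_PER_YEAR)   # per second
specific_activity = decay_constant * AVOGADRO / ATOMIC_MASS           # Bq per gram

activity = 360_000                # Bq inhaled (from the revised estimate)
mass_grams = activity / specific_activity
print(f"Specific activity: {specific_activity:.2e} Bq/g")
print(f"Mass equivalent:  {mass_grams * 1e6:.0f} micrograms")   # roughly 150-160 micrograms
```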

Using Department of Energy data, the inhalation of 360,000 Bq of Pu-239 would result in a whole-body radiation dose to an average adult over a 50-year period between 580 rem and nearly 4300 rem, depending on the solubility of the compounds inhaled. The material was most likely an oxide, which is relatively insoluble, corresponding to the lower bound of the estimate. But without further information on the material form, the best estimate would be around 1800 rem.
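
The dose arithmetic is simply the inhaled activity multiplied by a dose-per-becquerel coefficient. The sketch below uses approximate published inhalation dose coefficients for Pu-239 that are consistent with the range quoted above; treat the exact values as illustrative rather than as the DOE data used for the estimate:

```python
# Sketch of the dose arithmetic: committed dose = inhaled activity x dose coefficient.
# The coefficients below (Sv per Bq inhaled) are approximate published values for
# worker inhalation of Pu-239 and are consistent with the range quoted above;
# treat them as illustrative rather than authoritative.

SV_TO_REM = 100
inhaled_bq = 360_000

dose_coefficients = {
    "insoluble (oxide, lung-retained)": 1.6e-5,   # Sv/Bq, slow-clearing forms
    "moderately soluble":               5.0e-5,   # Sv/Bq, mid-range forms
    "soluble":                          1.2e-4,   # Sv/Bq, fast-clearing forms
}

for form, coeff in dose_coefficients.items():
    dose_rem = inhaled_bq * coeff * SV_TO_REM
    print(f"{form:35s} ~{dose_rem:5.0f} rem committed over 50 years")
# Output spans roughly 580 to 4,300 rem, with ~1,800 rem in the middle,
# matching the estimate in the text.
```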

What is the health impact of such a dose? For isotopes such as plutonium-239 or americium-241, which emit relatively large, heavy charged particles known as alpha particles, there is a high likelihood that a dose of around 1000 rem will cause a fatal cancer. This is well below the radiation dose that the most highly exposed worker will receive over a 50-year period. This shows how costly a mistake can be when working with plutonium.

The workers are receiving chelation therapy to try to remove some plutonium from their bloodstream. However, the effectiveness of this therapy is limited at best, especially for insoluble forms, like oxides, that tend to be retained in the lungs.

The workers were exposed when they opened up an old storage can that held materials related to production of fuel for fast reactors. The plutonium facilities at Tokai-mura have been used to produce plutonium-uranium mixed-oxide (MOX) fuel for experimental test reactors, including the Joyo fast reactor, as well as the now-shutdown Monju fast reactor. Americium-241 was present as the result of the decay of the isotope plutonium-241.

I had the opportunity to tour some of these facilities about twenty years ago. MOX fuel fabrication at these facilities was primarily done in gloveboxes through manual means, and we were able to stand next to gloveboxes containing MOX pellets. The gloveboxes represented the only barrier between us and the plutonium they contained. In light of the incident this week, that is a sobering memory.

Palo Verde: Running Without a Backup Power Supply

The Arizona Public Service Company’s Palo Verde Generating Station about 60 miles west of Phoenix has three Combustion Engineering pressurized water reactors that began operating in the mid 1980s. In the early morning hours of Thursday, December 15, 2016, workers started one of two emergency diesel generators (EDGs) on the Unit 3 reactor for a routine test. The EDGs are the third tier of electrical power to emergency equipment for Unit 3.

When the unit is operating, the source of power is the electricity produced by the main generator (labeled A in Figure 1.) The electricity flows through the Main Transformer to the switchyard and offsite power grid and also flows through the Unit Auxiliary Transformer to in-plant equipment. If the unit is not operating, electrical power flows from the offsite power grid through the Startup Transformer (B) to in-plant equipment. When the main generator is offline and power from the offsite power grid is unavailable, the EDGs (C) step in to provide electrical power to a subset of in-plant equipment—the emergency equipment needed to protect the reactor core and minimize release of radioactivity to the environment. An additional backup power source exists at Palo Verde in the form of gas turbine generators (D) that can supply power to any of the three units.
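
The hierarchy of supplies (A through D) can be summarized in a few lines of Python. This is a conceptual sketch of the fallback order described above, not the plant’s actual switching logic:

```python
# Simple illustration of the electrical supply hierarchy described above (A through D).
# This is a conceptual sketch, not the plant's actual switching logic.

def power_source(unit_online, offsite_grid_available, edg_available, gas_turbine_available):
    """Return which source would be feeding in-plant equipment."""
    if unit_online:
        return "A: main generator via the Unit Auxiliary Transformer"
    if offsite_grid_available:
        return "B: offsite grid via the Startup Transformer"
    if edg_available:
        return "C: emergency diesel generator (emergency loads only)"
    if gas_turbine_available:
        return "D: onsite gas turbine generators"
    return "station blackout - no AC power source available"

# Example: reactor tripped, offsite grid lost, one EDG still available.
print(power_source(unit_online=False, offsite_grid_available=False,
                   edg_available=True, gas_turbine_available=True))
```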

Fig. 1 (Source: Arizona Public Service Company)

I toured the Palo Verde site on May 11, 2016. The tour included one of the EDG rooms on Unit 2 as shown in Figure 2. Each unit at Palo Verde has two EDGs. The EDG being tested on December 15, 2016, was manufactured in 1981 and was a Cooper Bessemer 20-cylinder V-type turbocharged engine. The engine operated at 600 revolutions per minute with a rated output of 5,500,000 watts.

Fig. 2 (Source: Arizona Public Service Company)

Assuming one of the two EDGs for a unit fails and there are no additional equipment failures, the remaining EDG and the equipment powered by it are sufficient to mitigate any design basis accident (including a loss of coolant accident caused by a broken pipe connected to the reactor vessel) and protect workers and the public from excessive exposure to radiation. Figure 3 shows the major components powered by the Unit 3 EDGs—a High Pressure Safety Injection (HPSI) train, a Low Pressure Safety Injection (LPSI) train, a Containment Spray train, an Essential Cooling Water Pump, an Auxiliary Feedwater Pump, and so on.
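
The redundancy argument boils down to a simple success criterion: at least one complete train must have power. A minimal sketch, using illustrative logic only:

```python
# Minimal sketch of the redundancy argument above: a design basis accident is
# mitigated as long as at least one complete safety train (A or B) has a working
# power supply. Illustrative logic only, not a plant model.

def accident_mitigated(offsite_power_available, edg_a_works, edg_b_works):
    """At least one train must be powered, either by offsite power or by its EDG."""
    train_a_powered = offsite_power_available or edg_a_works
    train_b_powered = offsite_power_available or edg_b_works
    return train_a_powered or train_b_powered

# Design basis case: loss of offsite power plus a single failure of one EDG.
print(accident_mitigated(offsite_power_available=False,
                         edg_a_works=True, edg_b_works=False))    # True - the other train carries the day

# The worry during an extended EDG outage: the already-broken EDG plus a failure
# of the remaining one during a loss of offsite power.
print(accident_mitigated(offsite_power_available=False,
                         edg_a_works=False, edg_b_works=False))   # False
```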

Fig. 3 (Source: Arizona Public Service Company Individual Plant Examination)

Because the EDGs are normally in standby mode, the operating license for each unit requires that they be periodically tested to verify they remain ready to save the day should that need arise. At 3:02 am on December 15, 2016, workers started EDG 3B. Workers increased the loading on EDG 3B to about 2,700,000 watts, roughly half load, at 3:46 am per the test procedure.

Ten minutes later, alarms sounded and flashed in the Unit 3 Control Room alerting operators that EDG 3B had automatically stopped running due to low lube oil pressure. A worker in the area notified the control room operators about a large amount of smoke as well as oil on the floor of the EDG room. The operators contacted the onsite fire department which arrived in the EDG room at 4:06 am. There was no fire ongoing when they arrived, but they remained on scene for about 90 minutes to assist in the response to the event.

Operators declared an Alert, the third most serious in the NRC’s four emergency classifications, at 4:10 am due to a fire or explosion resulting in control room indication of degraded safety system performance. The emergency declaration was terminated at 6:36 am.

Seven weeks later, after the fire had long been out, the oil on the floor long since wiped up, the sharp-edged metal fragments long gone, and any toxic smoke long dissipated, the Nuclear Regulatory Commission (NRC) dispatched a special inspection team to investigate the event and its cause. The NRC dispatched its special inspection team more than a month after it authorized Unit 3 to continue operating for up to 62 days while its blown-up backup power source was repaired. The Unit 3 operating license originally allowed the reactor to operate for only 10 days with one of two EDGs out of service.

Workers at Palo Verde determined that EDG 3B failed because the connecting rod on cylinder 9R failed. It was the fifth time that an EDG of that type at a US nuclear power plant experienced a connecting rod failure, and the second failure involving cylinder 9R on EDG 3B at Palo Verde, which had also failed during a test in 1986.

Examinations in 2017 following the most recent failure traced its root cause back to the first failure. The forces resulting from that failure caused misalignment of the main engine crankshaft. (In this engine, the crankshaft rotates. The crankshaft causes the connecting rods to rise and fall with each rotation, in turn driving the pistons in and out of the cylinders.) The misalignment was very minor—the tolerances are on the order of thousandths of an inch. But this minor misalignment over hundreds of hours of EDG operation over the ensuing three decades resulted in high-cycle fatigue failure of the connecting rod.

Workers installed a new crankshaft aligned within the tight tolerances established by the vendor. Workers also installed new connecting rods and repaired the crankcase. After testing the repairs, EDG 3B was returned to service.

NRC Sanctions

The NRC’s special inspection team did not identify any violations in the cause of the EDG failure, in the response to the failure, or in the corrective actions undertaken to remedy the failure.

UCS Perspective

The NRC’s timeline for this event isn’t comforting.

The operating licenses issued by the NRC for the three reactors at Palo Verde allow each unit to continue running for up to 10 days when one of two EDGs is out of service. The Unit 3 EDG that was blown apart on December 15 could not be repaired within 10 days. So, the owner applied to the NRC for permission to operate Unit 3 for up to 21 days with only one EDG. But the EDG could not be repaired within 21 days. So, the owner applied to the NRC for permission to operate Unit 3 for up to 62 days with only one EDG.

The NRC approved both requests, the second on January 4, 2017. More than a month later, on February 6, 2017, the NRC special inspection team arrived onsite to examine what happened and why it happened.

Wouldn’t a prudent safety regulator have asked and answered those questions before allowing a reactor to continue operating for six times as long as permitted by its operating license?

Wouldn’t a prudent safety regulator have verified that whatever caused EDG 3B to blow itself apart would not also cause EDG 3A to blow itself apart before allowing a reactor to continue operating for two months with a potential explosion in waiting?

Whether the answers are yes or no, could that prudent regulator please call the NRC and share some of that prudency? The NRC may be many things, but it’ll seldom be accused and never be convicted of excessive prudency.

Where’s a prudent regulator when America needs one?

TVA’s Nuclear Allegators

The Nuclear Regulatory Commission (NRC) receives reports about potential safety problems from plant workers, the public, members of the news media, and elected officials. The NRC calls these potential safety problems allegations, making the sources allegators. In the five years between 2012 and 2016, the NRC received 450 to 600 allegations each year. The majority of the allegations involve the nuclear power reactors licensed by the NRC.

Fig. 1 (Source: Nuclear Regulatory Commission)

While the allegations received by the NRC about nuclear power reactors cover a wide range of issues, nearly half involve chilled work environments, where workers don’t feel free to raise concerns, or discrimination by management against workers who have raised concerns.

Fig. 2 (Source: Nuclear Regulatory Commission)

In 2016, the NRC received more allegations about conditions at the Watts Bar nuclear plant in Tennessee than about any other facility in America. Watts Bar’s 31 allegations exceeded the allegations from the second highest site (the Sequoyah nuclear plant, also in Tennessee, at 17) and third highest site (the Palo Verde nuclear plant in Arizona, at 12) combined. The Browns Ferry nuclear plant in Alabama and the Pilgrim nuclear plant in Massachusetts tied for fourth place with 10 allegations each. In other words, Watts Bar tops the list by a very comfortable margin.

Fig. 3 (Source: Nuclear Regulatory Commission)

In 2016, the NRC received double-digit numbers of allegations about five nuclear plants. Watts Bar, Sequoyah and Browns Ferry are owned and operated by the Tennessee Valley Authority (TVA). Why did three TVA nuclear plants place among the top five sources of allegations to the NRC?

Because TVA only operates three nuclear plants.

The NRC received zero allegations about ten nuclear plants during 2016. In the five year period between 2012 and 2016, the NRC only received a total of three allegations each about the Clinton nuclear plant in Illinois and the Three Mile Island Unit 1 reactor in Pennsylvania (the unit that didn’t melt down). By comparison, the NRC received 110 allegations about Watts Bar, 55 allegations about Sequoyah, and 58 allegations about Browns Ferry.

TVA President Bill Johnson told Chattanooga Times Free Press Business Editor Dave Flessner that TVA is working on its safety culture problems and “there should be no public concern about the safety of our nuclear plants.” The NRC received 30 of the 31 allegations last year from workers at Watts Bar, all 17 allegations last year from workers at Sequoyah, and all 10 allegations last year from workers at Browns Ferry.

So President Johnson is somewhat right— the public has no concerns about the safety of TVA’s nuclear plants. But when so many TVA nuclear plant workers have so many nuclear safety concerns, the public has every reason to be very, very concerned.

Nuclear plant workers are somewhat like canaries in coal mines. Each is likely to be the first to sense danger. And when nuclear canaries morph into nuclear allegators in such large numbers, that sense of ominous danger cannot be downplayed.

Ad Hoc Fire Protection at Nuclear Plants Not Good Enough

A fire at a nuclear reactor is serious business. There are many ways to trigger a nuclear accident leading to damage of the reactor core, which can result in the release of radiation. But according to a senior manager at the US Nuclear Regulatory Commission (NRC), for a typical nuclear reactor, roughly half the risk that the reactor core will be damaged is due to the risk of fire. In other words, the odds that a fire will cause an accident leading to core damage equal the odds from all other causes combined. And that risk estimate assumes the fire protection regulations are being met.

However, a dozen reactors are not in compliance with NRC fire regulations:

  • Prairie Island Units 1 and 2 in Minnesota
  • HB Robinson in South Carolina
  • Catawba Units 1 and 2 in South Carolina
  • McGuire Units 1 and 2 in North Carolina
  • Beaver Valley Units 1 and 2 in Pennsylvania
  • Davis-Besse in Ohio
  • Hatch Units 1 and 2 in Georgia

Instead, they are using “compensatory measures,” which are not defined or regulated by the NRC. While originally intended as interim measures while the reactor came into compliance with the regulations, some reactors have used these measures for decades rather than comply with the fire regulations.

The Union of Concerned Scientists and Beyond Nuclear petitioned the NRC on May 1, 2017, to amend its regulations to include requirements for compensatory measures used when fire protection regulations are violated.

Fire Risks

The dangers of fire at nuclear reactors were made obvious in March 1975 when a fire at the Browns Ferry nuclear plant disabled all the emergency core cooling systems on Unit 1 and most of those systems on Unit 2. Only heroic worker responses prevented damage to one or both reactor cores.

The NRC issued regulations in 1980 requiring electrical cables for a primary safety system to be separated from the cables for its backup, making it less likely that a single fire could disable multiple emergency systems.

Fig. 1 Fire burning insulation off cables installed in metal trays passing through a wall. (Source: Tennessee Valley Authority)

After discovering in the late 1990s that most operating reactors did not meet the 1980 regulations, the NRC issued alternative regulations in 2004. These regulations would permit electrical cables to be in close proximity as long as analysis showed the fire could be put out before it damaged both sets of cables. Owners had the option of complying with either the 1980 or 2004 regulations. But the dozen reactors listed above are still not in compliance with either set of regulations.

The NRC issued the 1980 and 2004 fire protection regulations following formal rulemaking processes that allowed plant owners to contest proposed measures they felt were too onerous and the public to contest measures considered too lax. These final rules defined the appropriate level of protection against fire hazards.

Rules Needed for “Compensatory Measures”

UCS and Beyond Nuclear petitioned the NRC to initiate a rulemaking process that will define the compensatory measures that can be substituted for compliance with the fire protection regulations.

The rule we seek will reduce confusion about proper compensatory measures. The most common compensatory measure is “fire watches”—human fire detectors who monitor for fires and report any sightings to the control room operators who then call out the onsite fire brigades.

For example, the owner of the Waterford nuclear plant in Louisiana deployed “continuous fire watches.” The NRC later found that they had secretly and creatively redefined “continuous fire watch” to be someone wandering by every 15 to 20 minutes. The NRC was not pleased by this move, but could not sanction the owner because there are no requirements for fire protection compensatory measures. Our petition seeks to fill that void.

The rule we seek will also restore public participation in nuclear safety decisions. The public had opportunities to legally challenge elements of the 1980 and 2004 fire protection regulations it felt to be insufficient. But because fire protection compensatory measures are governed only by an informal, cozy relationship between the NRC and plant owners, the public has been locked out of the process. Our petition seeks to rectify that situation.

The NRC is currently reviewing our submittal to determine whether it satisfies the criteria to be accepted as a petition for rulemaking. If the petition is accepted, the NRC will publish the proposed rule in the Federal Register for public comment. Stay tuned—we’ll post another commentary when the NRC opens the public comment period so you can register your vote (hopefully in favor of formal requirements for fire protection compensatory measures.)

Exelon Generation Company (a.k.a. Nuclear Whiners)

The Unit 3 reactor at the Dresden Nuclear Power Station near Morris, Illinois is a boiling water reactor with a Mark I containment design that began operating in 1971. On June 27, 2016, operators manually started the high pressure coolant injection (HPCI) system for a test run required every quarter by the reactor’s operating license. Soon after starting HPCI, alarms sounded in the main control room. The operators shut down the HPCI system and dispatched equipment operators to the HPCI room in the reactor building to investigate the problem.

The equipment operators opened the HPCI room door and saw flames around the HPCI system’s auxiliary oil pump motor and the room filling with smoke. They reported the fire to the control room operators and used a portable extinguisher to put out the fire within three minutes.

Fig. 1 (Source: NRC)

What Broke?

The HPCI system is part of the emergency core cooling systems (ECCS) on boiling water reactors like Dresden Unit 3. The HPCI system is normally in standby mode when the reactor is operating. The HPCI system’s primary purpose is to provide makeup water to the reactor vessel in the event that a small-diameter pipe connected to the vessel breaks. The rupture of a small-diameter pipe allows cooling water to escape, but maintains the pressure within the reactor vessel too high for the many low pressure ECCS pumps to deliver makeup flow. The HPCI system takes steam produced by the reactor core’s decay heat to spin a turbine connected to a pump. The steam-driven pump transfers water from a large storage tank outside the reactor building into the reactor vessel. The HPCI system can also be used during transients without broken pipes. The HPCI system’s operation can be used by operators to help control the pressure inside the reactor vessel by drawing off the steam being produced by decay heat.

The HPCI auxiliary oil pump is powered by an electric motor. The auxiliary oil pump runs to provide lubricating oil to the HPCI system as the system starts and begins operating. Once the HPCI system is up and running, the auxiliary oil pump is no longer needed. At other boiling water reactors, the auxiliary oil pump is automatically turned off once the HPCI system is up and running—at Dresden, the auxiliary oil pump continues running.
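
The control difference described above can be boiled down to a few lines. This is a conceptual sketch, not the actual Dresden or generic BWR control circuit:

```python
# Conceptual sketch of the auxiliary oil pump control difference described above.
# Illustrative logic only, not the actual Dresden (or generic BWR) control circuit.

def aux_oil_pump_runs(hpci_in_service, hpci_up_and_running, dresden_style):
    """Should the electric auxiliary oil pump be running?"""
    if not hpci_in_service:
        return False
    if dresden_style:
        return True                      # Dresden: the pump runs the whole time HPCI operates
    return not hpci_up_and_running       # elsewhere: secured once HPCI is up and running

# Once HPCI is up and running:
print(aux_oil_pump_runs(True, True, dresden_style=False))   # False - pump automatically stops
print(aux_oil_pump_runs(True, True, dresden_style=True))    # True  - pump keeps running (and can still fail)
```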

Why the Failure was Reported

On August 25, 2016, Exelon Generation Company (hereafter Exelon) reported the HPCI system problem to the Nuclear Regulatory Commission (NRC). Exelon reported the problem “under 10 CFR 50.73(a)(2)(v)(D), ‘Any event or condition that could have prevented the fulfillment of the safety function of structures or systems that are needed to mitigate the consequences of an accident.’”

Why It Broke

Exelon additionally informed the NRC that the HPCI system auxiliary oil pump motor caught fire due to “inadequate control of critical parameters when installing a DC shunt wound motor.” The HPCI system auxiliary oil pump motor had failed in March 2015 during planned maintenance. The failure in 2015 was attributed by Exelon to “inadequate cleaning and inspection of the motor” which allowed carbon dust to accumulate inside the motor.

How the NRC Assessed the Failure

The NRC issued an inspection report on December 5, 2016, with a preliminary white finding for the HPCI system problem. The NRC determined that the repair of the HPCI system auxiliary oil pump motor following its failure in March 2015 resulted in the motor receiving higher electrical current than needed for the motor to run. Consequently, when the HPCI system was tested in June 2016, the high electrical current flowing to the auxiliary oil pump motor caused its windings to overheat and catch fire. The NRC determined that the inadequate repair in March 2015 caused the failure in June 2016. The NRC proposed a white finding in its green, white, yellow, and red string of increasingly significant findings and gave Exelon ten days to contest that classification.

During a telephone call between the NRC staff and Exelon representatives on December 15, 2016, Exelon “did not contest the characterization of the risk significance of this finding” and declined the option “to discuss this issue in a Regulatory Conference or to provide a written response.” With the proposed white finding seemingly uncontested, the NRC issued the final white finding on February 27, 2017.

Why the NRC Reassessed the Failure

It took the NRC over two months to finalize an uncontested preliminary finding because Exelon essentially contested the preliminary finding, but not in the manner used by the rest of the industry and established by the NRC’s longstanding procedures over the 17 years that the agency’s Reactor Oversight Process has been in place.

Instead, Exelon mailed a letter dated January 12, 2017, to the NRC requesting that the agency improve the computer models it uses to determine the significance of events.  Exelon whined that NRC’s computer model over-estimated the real risk because it considered only the failure of a standby component to start and the failure causing a running component to stop. Exelon pointed out that the auxiliary oil pump did permit the HPCI system to successfully start during the June 2016 test run and it later catching on fire did not disable the HPCI system. Exelon whined that the NRC’s modeling was “analogous to the situation where the starter motor of a car breaks down after the car is running and then concluding that ‘the car won’t run’ even though it is already running.”

The NRC carefully considered each of Exelon’s whines in its January 12 letter and still concluded that the failure warranted a white finding. So, the agency issued a white finding. With respect to Exelon’s whine that the auxiliary oil pump burned up after the HPCI system was up and running, the NRC reminded the company that the operators shut down the HPCI system in response to the alarms—had it been necessary to restart the HPCI system, the toasted auxiliary oil pump would have prevented it. It is not uncommon for the HPCI system to be automatically shut down (e.g., due to high water level in the reactor vessel) or to be manually shut down (e.g., due to operators restoring the vessel water level to within the prescribed band or responding to a fire alarm in the HPCI room) only to be restarted later during the transient. The NRC’s review determined that their computer model’s treatment of a “failure to restart” would yield results very similar to its treatment of a “failure to start.”
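
A toy calculation shows why the distinction between “failure to start” and “failure to restart” matters so little here. Every probability below is invented purely for illustration:

```python
# Toy illustration (all probabilities invented for illustration only) of the NRC's
# point: if HPCI is often shut down and restarted during an event, then a component
# that guarantees a restart failure looks, in the model, much like a failure to start.

P_HPCI_FAILS_TO_START = 0.03        # assumed baseline probability
P_RESTART_NEEDED = 0.8              # assumed chance an event requires restarting HPCI

def p_hpci_unavailable(aux_oil_pump_broken):
    if not aux_oil_pump_broken:
        return P_HPCI_FAILS_TO_START
    # Broken pump: the initial start may still succeed, but any needed restart fails.
    return P_HPCI_FAILS_TO_START + (1 - P_HPCI_FAILS_TO_START) * P_RESTART_NEEDED

print(f"Healthy pump: HPCI unavailable with probability {p_hpci_unavailable(False):.2f}")
print(f"Toasted pump: HPCI unavailable with probability {p_hpci_unavailable(True):.2f}")
```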

The auxiliary oil pump’s impairment reduced the HPCI system to one-and-done use. In an actual emergency, one and done might not have cut it—thus, the NRC issued the white finding for Exelon’s poor performance that let the auxiliary oil pump literally go up in smoke.

The NRC conducted a public meeting on May 2, 2017, in response to Exelon’s letter. I called into the meeting to see if Exelon’s whines are as shallow and ill-conceived as they appear in print. I admit to being surprised—their whining came across even shallower live than in writing. And I would have bet it impossible after reading, and re-reading, their whiny letter.

What’s With the Whining?

Does Exelon hire whiners, or does the company train people to become whiners?

It’s a moot point because Exelon should stop whining and start performing.

Exelon whined that the NRC failed to recognize or appreciate that the auxiliary oil pump is only needed during startup of the HPCI system. During the June 2016 test run, the HPCI system successfully started and achieved steady-state running before the auxiliary oil pump caught fire. Workers put out the fire before it disabled the HPCI pump. But the NRC’s justification for the final white characterization of the “uncontested” finding explained why those considerations did not change their conclusion. While the auxiliary oil pump did not catch fire until after the HPCI system was successfully started during the June 2016 test run, its becoming toast would have prevented a second start.

Exelon expended considerable effort contesting and re-contesting the “uncontested” white finding. Had Exelon expended a fraction of that effort properly cleaning and inspecting the auxiliary oil pump motor, the motor would not have failed in March 2015. Had Exelon expended a fraction of that effort properly setting control parameters when the failed motor was replaced in March 2015, it would not have caught on fire in June 2016. If the motor had not caught on fire in June 2016, the NRC would not even have reached for its box of crayons in December 2016. If the NRC had not reached for its box of crayons, Exelon would not have been whining in January and May 2017 that the green crayon instead of the white one should have been picked.

So, Exelon would be better off if it stopped whining and started performing. And the people living around Exelon’s nuclear plants would be better off, too.

US Needs More Options than Yucca Mountain for Nuclear Waste

On Wednesday, I testified at a hearing of the Environment Subcommittee of the House Energy and Commerce Committee. The hearing focused on the discussion draft of a bill entitled “The Nuclear Waste Policy Amendments Act of 2017.”

Yucca Mountain (Source: White House)

The draft bill’s primary objective is to revive the program to build a geologic repository at the Yucca Mountain site in Nevada for spent nuclear fuel and other high-level radioactive wastes. The Obama administration cancelled the program in 2009, calling it “unworkable,” and the state of Nevada is bitterly opposed to it, but Yucca Mountain still has devoted advocates in Congress, including the chairman of the subcommittee, John Shimkus (R-IL).

UCS supports the need for a geologic repository for nuclear waste in the United States but doesn’t have a position on the suitability of the Yucca Mountain site. We don’t have the scientific expertise needed to make that judgement.

However, in my testimony, I expressed several concerns about the draft bill, including its focus on locating a repository only at Yucca Mountain and its proposal to weaken the NRC’s review standards for changes to repository design.

UCS believes that rigorous science must underlie the choice of any geologic repository, and that the US needs options in addition to Yucca Mountain, which has many unresolved safety issues. In addition, we believe that any legislation that revises the Nuclear Waste Policy Act must be comprehensive and include measures to enhance the safety and security of spent fuel at reactor sites—where it will be for at least several more decades. For example, we think it is essential to speed up the transfer of spent fuel from pools to dry storage casks.

Watts Bar Lacks a Proper Safety Culture

The Nuclear Regulatory Commission (NRC) issued a Chilled Work Environment Letter to the Tennessee Valley Authority (TVA) on March 23, 2016, about safety culture problems at the Watts Bar nuclear plant. TVA promised to take steps to restore a proper safety culture at the plant.

Nearly 13 months later, has a proper safety culture been restored at Watts Bar?

No, according to a report issued April 19, 2017, by the TVA Office of the Inspector General (TVA OIG).

Fig. 1. (Source: D. Lochbaum)

The TVA OIG report paints a very disturbing picture of conditions at Watts Bar. I monitored safety culture problems at Millstone (1996-2000), Davis-Besse (2002-2004), and Salem/Hope Creek (2004-2005). The problems described in the TVA OIG report are comparable to the unacceptable conditions that existed at Millstone and Davis-Besse. A difference is that the NRC did not allow Millstone or Davis-Besse to operate until those safety culture problems were corrected to an acceptable level.

The TVA OIG report explains why TVA keeps reporting that the chilled work environment at Watts Bar was confined to the Operations Department and did not contaminate other work organizations at the site: The TVA Office of the General Counsel instructed the Employee Concerns Program and others within TVA not to use “chilled work environment” and to use “degraded work environment” instead. So, while TVA cannot find chilled work environments outside Operations, they find “degraded work environments” almost every place they look. But through an artifice of semantics conjured up by TVA’s attorneys, no chilled work environments are being found.

The TVA OIG didn’t buy the semantics: “Additionally, when 75 percent of a work group at a nuclear utility perceives that they are working in a chilled environment as is the case with ECP at TVA, it would seem reasonable to conclude that there is a chilled work environment in that group and unreasonable to pass it off as a ‘degraded work environment’.”

How bad is the chilled work environment at Watts Bar? The TVA OIG report indicates that 75% of the Employee Concerns Program (ECP) staff did not feel they could raise concerns without fear of retaliation. ECP is supposed to be the organization that workers with safety concerns can go to for help resolving them. When the helpers feel chilled, how can they truly help workers?

The ECP hired two individuals from outside TVA in February 2016 to conduct an independent investigation of the work environment at Watts Bar. According to the TVA OIG, this investigation was independent and forthright, but the ensuing report was anything but independent. The TVA OIG reviewed emails and interviewed the independent investigators and found that “the term ‘chilled work environment’ was edited out of the text of the report by ECP personnel.” In fact, the independent investigators did not write the six-page Executive Summary for “their” report—ECP wrote it. ECP wrote that a “degraded work environment” rather than a “chilled work environment” existed at Watts Bar. TVA OIG reported being unable to find “degraded work environment” being used within TVA or elsewhere prior to this “independent” report.

One of the two independent investigators told the TVA OIG that TVA management “did not like the fact that he stated that TVA management contributed to the poor SCWE [safety conscious work environment]” at Watts Bar. He was not invited back to participate in subsequent debriefing activities which “he attributed to management’s reaction to his report-out to them of the results from Phase I.” In other words, TVA shot the messenger.

The TVA OIG report states that “both the independent investigation commissioned by TVA and the SRTR [Special Review Team Report] were inappropriately influenced by TVA management” and that “the independent investigators were told by TVA ECP what they could and could not put in their report and the Executive Summary of that report was written by ECP, not the independent investigators.”

As to whether the chilled work environment issues were confined to the Operations Department, “Through personnel interviews conducted by OIG investigators, it was learned that many instances of HIRD [harassment, intimidation, retaliation, and/or discrimination] have occurred or have been alleged to have occurred in Operations and in other departments at WBN [Watts Bar Nuclear].” More specifically, surveys conducted during 2016 after workers raised concerns that led to the NRC’s Chilled Work Environment Letter being issued reveal safety culture issues outside of the Operations Department at Watts Bar.

  • Maintenance Department: 36% of workers feel free to report problems and concerns. 55% of workers believe they could report problems and concerns without fear of retaliation. 91% of the workers witnessed behavior contrary to a healthy nuclear safety culture.
  • Chemistry Department: 50% of workers feel free to report problems and concerns. 50% of workers believe they could report problems and concerns without fear of retaliation. 50% of the workers witnessed behavior contrary to a healthy nuclear safety culture.
  • Security Department: 34% of workers believe they could report problems and concerns without fear of retaliation. 67% of the workers witnessed behavior contrary to a healthy nuclear safety culture.
  • Engineering Department: 67% of workers believe they could report problems and concerns without fear of retaliation. 66% of the workers witnessed behavior contrary to a healthy nuclear safety culture.
  • Radiation Protection Department: 78% of the workers witnessed behavior contrary to a healthy nuclear safety culture.

The TVA OIG explicitly states “TVA’s continuing denials have been found to be incorrect by the NRC and independent assessors: a chilled work environment exists in at least several departments at WBN and within the ECP program itself.”

The TVA OIG makes an interesting observation regarding the 51 actions that TVA identified as necessary to correct the problems expressed in the NRC’s Chilled Work Environment Letter—none of them pertain to TVA’s upper management. The TVA OIG states “It is certainly worth considering whether this might be at least a contributor, if not a root cause, of the failure of any of the CAPRs [corrective actions to prevent recurrence], remediation plans, and the like to correct the continuing recurrence of chilled work environments at TVA over the past decade.” Indeed!

Watts Bar Needs a Proper Safety Culture

The TVA OIG report makes it extremely clear that Watts Bar lacks a proper safety culture and that lack is broader than just within the Operations Department.

Watts Bar needs a proper safety culture because it is the fundamental foundation for nuclear safety overall. If workers do not raise safety concerns—either out of fear of retaliation or out of distrust that management will correct them—the inventory of unresolved safety concerns increases over time. Nuclear power plants are robust and require a large number of failures and malfunctions before an incident morphs into a disaster. The rising number of unresolved safety concerns reduces the number of failures needed to facilitate such transformations.

Proper safety cultures cannot be acquired from eBay or Amazon. Senior managers must make it happen. If TVA’s senior managers can’t or won’t make it happen, either TVA needs new senior managers or NRC needs to write TVA another letter—a stronger letter perhaps along the lines of a Show Cause Order compelling TVA’s lawyers to explain why Watts Bar can continue to operate safely with “degraded work environments” all over the site.

In the meantime, if Watts Bar experiences a disaster, it won’t be an accident. It’ll be an outcome of operating a nuclear power reactor with a safety culture documented to be woefully inadequate.

Columbia Generating Station: NRC’s Special Inspection of Self-Inflicted Safety Woes

Energy Northwest’s Columbia Generating Station near Richland, Washington has one General Electric boiling water reactor (BWR/5) with a Mark II containment design that began operating in 1984. In the late morning hours of Sunday, December 18, 2016, the station stopped generating electricity and began generating problems.

The Nuclear Regulatory Commission (NRC) dispatched a special inspection team to investigate the event after determining it could have increased the risk of reactor core damage by a factor of ten. The NRC team sought to understand the problems occurring during this near-miss as well as assess the breadth and effectiveness of the solutions proposed by the company for them.

Trouble Begins Offsite

The plant was operating at full power when the main generator output breakers opened at 11:24 am due to an electrical transient within the Ashe substation. The Ashe substation is owned and maintained by the Bonneville Power Administration and serves as the connection between electricity produced at the plant and the offsite power grid. At least three electrical breakers at the Ashe substation were supposed to have opened to de-energize the faulted transmission line(s). Had they done so, the loss of the transmission lines could have triggered protective devices at the Columbia Generating Station to automatically trip the main generator. But cold weather kept the breakers from functioning properly. Instead of the protective systems at the Columbia Generating Station responding on a system level (i.e., the de-energized transmission line(s) triggering a main generator trip), they responded at the component level (i.e., the main generator output breaker sensed the electrical transient and opened).

The turbine control valves automatically closed because the main generator was no longer fully loaded with its output breakers opened. The closure of the turbine control valves automatically tripped the reactor. The control rods fully inserted within seconds to stop the nuclear chain reaction. The output breakers, turbine control valves, and control rods all functioned per the plant’s design (see Figure 1).

Fig. 1 (Source: Nuclear Regulatory Commission annotated by UCS)

Before the trip, the main generator was producing electricity at 25,000 volts. The main transformer increased the voltage up to 500,000 volts for transmission out to the offsite power grid. The auxiliary transformers reduced the voltage to 4,160 volts and 6,900 volts for supply to equipment in the plant. The output breakers that opened to start this event are represented by the square box in the upper left corner of Figure 2.

Fig. 2 (Source: Nuclear Regulatory Commission annotated by UCS)
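
For a rough sense of the voltage steps described above, here is a minimal sketch (in Python) using only the voltages quoted in the text; the constant names are illustrative labels, not plant designations.

```python
# Nominal voltage steps from the description above; names are illustrative only.
GENERATOR_VOLTS = 25_000        # main generator output
GRID_VOLTS = 500_000            # transmission voltage after the main transformer
IN_PLANT_VOLTS = (4_160, 6_900) # bus voltages fed by the auxiliary transformers

print(f"Main transformer step-up ratio: about {GRID_VOLTS / GENERATOR_VOLTS:.0f}:1")
for bus_volts in IN_PLANT_VOLTS:
    print(f"Auxiliary transformer step-down to {bus_volts:,} V: "
          f"about {GENERATOR_VOLTS / bus_volts:.1f}:1")
```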

Trouble Begins Onsite – Loss of Heat Sink and Normal Makeup

The main generator was disconnected from the offsite power grid but continued to supply electricity through the auxiliary transformers to plant equipment. Because steam was no longer flowing to the turbine, the voltage and frequency of that electricity dropped. The voltages reaching in-plant equipment fell low enough that electrical breakers automatically opened at 11:25 am to protect motors and other electrical equipment from under-voltage damage. An electric motor, for example, requires current at a certain voltage in order to run. Current at a lower voltage may not be enough to turn the motor, but it can still flow through the windings and heat them enough to cause damage. De-energizing one of these loads caused the Main Steam Isolation Valves (MSIVs) to close. Their closure meant that steam produced by the reactor’s decay heat no longer flowed to the condenser, where it was cooled by water from the plant’s cooling towers. Instead, the steam bottled up in the reactor vessel and piping until the pressure rose to the point where the safety/relief valves opened to discharge steam to the suppression pool (see Figure 3).
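
The under-voltage concern can be made concrete with a back-of-the-envelope sketch. The numbers below are hypothetical, not Columbia Generating Station data, and the model is deliberately crude: for a motor trying to deliver roughly the same mechanical power, current rises as voltage falls, and winding heat rises with the square of the current.

```python
# Crude illustration of why under-voltage protection exists.
# All numbers are hypothetical; only the 4,160 V bus voltage comes from the text.
RATED_VOLTS = 4_160     # nominal bus voltage
RATED_AMPS = 100.0      # hypothetical full-load motor current
WINDING_OHMS = 0.05     # hypothetical winding resistance

def winding_heat_watts(bus_volts):
    """Approximate I^2*R heating, assuming current scales inversely with voltage."""
    amps = RATED_AMPS * (RATED_VOLTS / bus_volts)
    return amps ** 2 * WINDING_OHMS

for fraction in (1.0, 0.9, 0.7, 0.5):
    volts = RATED_VOLTS * fraction
    print(f"{volts:7.0f} V -> roughly {winding_heat_watts(volts):5.0f} W of winding heat")
```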

The closure of the MSIVs also stopped the normal flow of makeup cooling water to the reactor vessel. The feedwater system uses steam-driven turbines connected to pumps to supply makeup cooling water to the reactor vessel, but the steam supply for those turbines is downstream of the now-closed MSIVs. The condensate and condensate booster pumps upstream of the feedwater pumps have electric motors and remained available. Collectively, however, they can only pump water at about two-thirds of the pressure inside the reactor vessel, meaning they could not supply makeup water unless reactor vessel pressure dropped by roughly one-third from its normal value.
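
A quick calculation shows why the two-thirds figure matters. The sketch below assumes a typical boiling water reactor operating pressure of about 1,000 psi (an assumption for illustration; the inspection report’s exact value is not cited here) and uses the “about two-thirds” capability from the text.

```python
# Back-of-the-envelope look at the makeup-water problem described above.
REACTOR_PRESSURE_PSI = 1_000   # assumed typical BWR operating pressure (illustrative)
CONDENSATE_FRACTION = 2 / 3    # condensate/booster pumps reach ~2/3 of vessel pressure

discharge_psi = REACTOR_PRESSURE_PSI * CONDENSATE_FRACTION
print(f"Condensate pumps can push water in at roughly {discharge_psi:.0f} psi,")
print(f"so reactor pressure must fall by about "
      f"{(1 - CONDENSATE_FRACTION) * 100:.0f}% before they can inject.")
```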

Fig. 3 (Source: Nuclear Regulatory Commission annotated by UCS)

Troubles Onsite Grow – Loss of Normal Power for Safety Buses

At 11:28 am, the safety buses SM7 and SM8 tripped on low voltage, causing their respective emergency diesel generators to start and power these vital buses. This was not supposed to happen during this event. By procedure, the operators were directed to manually trip the turbine and the main generator following the automatic trip of the reactor. They tripped the turbine at 11:27 am, but never tripped the main generator. Tripping the main generator as specified in the procedures would have immediately caused some electrical breakers to close and others to open, swapping the supply of electricity to plant equipment from the auxiliary transformers to the startup transformers as shown in Figure 4. The startup transformers reduce 230,000-volt electricity from the offsite power grid to 4,160 volts and 6,900 volts for use by plant equipment when the main generator is unavailable. With plant equipment powered from the startup transformers, the MSIVs would have remained open and the feedwater pumps would have continued supplying makeup cooling water as they normally do.

Fig. 4 (Source: Nuclear Regulatory Commission annotated by UCS)
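
To illustrate the intended response described above, here is a minimal sketch of the “swap on generator trip” logic. It is heavily simplified and the breaker labels are made up; the real plant scheme involves many more permissives and timing checks.

```python
# Simplified illustration of the intended transfer of house loads on a generator trip.
# Breaker labels are invented for this sketch; this is not the actual plant logic.
def house_load_lineup(main_generator_tripped: bool) -> dict:
    """Return which transformers should feed the in-plant buses."""
    if main_generator_tripped:
        # Tripping the generator swaps the buses to the startup transformers,
        # so voltage never sags low enough to strip loads such as the MSIVs.
        return {"auxiliary_transformer_feeds": "open",
                "startup_transformer_feeds": "closed"}
    # Generator still tied to the buses (what actually happened on December 18,
    # because the operators never tripped it): loads stay on the auxiliary
    # transformers and ride the decaying voltage and frequency down.
    return {"auxiliary_transformer_feeds": "closed",
            "startup_transformer_feeds": "open"}

print(house_load_lineup(main_generator_tripped=True))
print(house_load_lineup(main_generator_tripped=False))
```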

Even More Trouble Onsite – Loss of Backup Makeup

The operators manually started the Reactor Core Isolation Cooling (RCIC) system (not shown in Figure 3, but essentially a smaller version of the High Pressure Core Spray system described below) at 11:32 am to provide makeup cooling water because the feedwater system was unavailable. The RCIC system’s primary function is to supply makeup cooling water when the feedwater system cannot do so. Like the feedwater pumps, the RCIC pump is connected to a steam-driven turbine. Unlike the feedwater pumps, the RCIC pump’s turbine is supplied with steam from the reactor vessel through a connection upstream of the closed MSIVs. The RCIC pump transfers water from a large storage tank to the reactor vessel.

The operators failed to follow the procedure when starting the RCIC system. The procedure called for them to close the steam admission valve (V-45) and then open the trip valve (V-1) as soon as V-45 was fully closed (see Figure 5). But they did not open V-1. That failure disabled the control system designed to bring the RCIC turbine up to its desired speed over 12 seconds. Instead, the RCIC turbine tried to reach that speed almost instantly. Too much steam too soon caused the RCIC turbine to automatically trip on high speed, a trip that guards against the spinning turbine blades coming apart from excessive forces.
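
The effect of skipping the controlled ramp can be illustrated with a toy calculation. The 12-second ramp time comes from the text; the turbine speeds and overspeed trip setpoint below are made-up numbers chosen only to show the mechanism.

```python
# Toy illustration of why admitting full steam at once trips the RCIC turbine.
# Only the 12-second ramp time comes from the text; other numbers are invented.
RATED_SPEED_RPM = 4_000       # hypothetical rated turbine speed
OVERSPEED_TRIP_RPM = 5_000    # hypothetical overspeed trip setpoint
RAMP_SECONDS = 12             # controlled ramp time cited in the text

def controlled_start():
    """Governor ramps the turbine gradually up to rated speed."""
    return [RATED_SPEED_RPM * t / RAMP_SECONDS for t in range(RAMP_SECONDS + 1)]

def uncontrolled_start():
    """Full steam at once: the turbine overshoots well past rated speed."""
    return [0, RATED_SPEED_RPM * 1.4]   # assumed 40% overshoot for illustration

for label, profile in (("controlled", controlled_start()),
                       ("uncontrolled", uncontrolled_start())):
    peak = max(profile)
    print(f"{label:>12} start: peak {peak:.0f} rpm, "
          f"overspeed trip: {peak >= OVERSPEED_TRIP_RPM}")
```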

It took about 13 minutes for workers to go down into the RCIC room in the reactor building’s basement and reposition the valves so the system could be properly started. In that time, the water level inside the reactor vessel dropped about a foot as water boiled away. That still left 162 inches (13.5 feet) of water above the top of the fuel in the reactor core. The operators had several hours to restore makeup cooling water flow before the reactor core would begin to uncover and overheat.
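
The “several hours” of margin can be checked with the numbers already given above: roughly a foot of level lost in about 13 minutes, with 162 inches of water still above the fuel. Because decay heat falls off with time, the boil-off rate only slows, so this simple estimate is a lower bound.

```python
# Rough lower-bound estimate of time to core uncovery, using only figures from the text.
INCHES_ABOVE_FUEL = 162   # water remaining above the top of the fuel
BOILOFF_INCHES = 12       # level lost...
BOILOFF_MINUTES = 13      # ...over about 13 minutes

minutes = INCHES_ABOVE_FUEL / (BOILOFF_INCHES / BOILOFF_MINUTES)
print(f"At that initial rate, the fuel would start to uncover in about "
      f"{minutes / 60:.1f} hours (longer in practice as decay heat drops).")
```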

Fig. 5 (Source: Nuclear Regulatory Commission annotated by UCS)

The operators manually started the High Pressure Core Spray (HPCS) system at 12:09 pm to provide makeup cooling water with the feedwater and RCIC systems both unavailable. The main HPCS pump (HPCS-P-1) has an electric motor. The pump transfers water from the large storage tank to the reactor vessel. While RCIC is designed to supply makeup water to compensate for inventory boiled off after the reactor shuts down, the HPCS system is designed also to compensate for water being lost through a small-diameter (about 2 inches) pipe that drains cooling water from the reactor vessel. Consequently, the HPCS system flow rate is about ten times greater than the RCIC system flow rate. And whereas the RCIC system flow rate can be throttled to match the makeup need, the HPCS system makeup flow is either full or zero.

The HPCS system refilled the reactor vessel soon after it was started. The operators closed the HPCS system injection valve (V-4) after about a minute. The minimum flow valve (V-12) automatically opened to direct the pump flow to the suppression pool instead of to the reactor vessel (see Figure 6). The HPCS system then ran in “idle” mode for the next 3 hours and 42 minutes.

Fig. 6 (Source: Nuclear Regulatory Commission annotated by UCS)

Yet More Trouble Onsite – Water Leaking into Reactor Building

On December 18, workers discovered that the restricting orifice (RO) downstream of V-12 had leaked an estimated 4.7 gallons per minute into the reactor building while the HPCS system operated. The NRC team learned that the gasket material used in this restricting orifice had been the subject of an industry operating experience report in 2007. A condition report was written at Columbia Generating Station in 2008 directing engineering to assess the operating experience report and the gasket materials used at the plant. In early 2010, the condition report was closed out based on engineering’s decision to use the gasket material recommended in the industry report. But the “bad” gaskets already installed were never replaced.
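
For scale, the total amount of water that reached the reactor building can be roughly estimated from the figures in the text, assuming the leak rate held for the entire run of the HPCS system (about a minute of injection plus 3 hours and 42 minutes in “idle”).

```python
# Rough estimate of total leakage into the reactor building, from figures in the text.
LEAK_GPM = 4.7                   # estimated leak rate while HPCS operated
RUN_MINUTES = 1 + 3 * 60 + 42    # ~1 minute injecting plus 3 hours 42 minutes idling

print(f"Approximate leakage: {LEAK_GPM * RUN_MINUTES:.0f} gallons")
```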

Operating experience cited in the 2007 industry report revealed that the original gasket material was vulnerable to erosion. The report described two adverse consequences of that erosion. First, pieces of the gasket could be carried by the water into the reactor vessel, where the material could strike the fuel rods and damage their cladding. Second, gasket erosion could allow leakage. The 2007 industry report thus forecast the very problem experienced at Columbia Generating Station in December 2016. The solution recommended by the 2007 report was not implemented until after the forecast problem had occurred.

NRC Sanctions

The NRC’s special inspection team identified three safety violations at the Columbia Generating Station. Two violations involved the operators failing to follow written procedures: (1) the failure to trip the main generator, which resulted in the unnecessary closure of the MSIVs, and (2) the failure to properly start the RCIC system, which resulted in the unnecessary trip of its turbine. The third violation involved the continued use of gasket material determined nearly a decade earlier to be improper for this application.

UCS Perspective

Self-inflicted problems turned a fairly routine incident into a near-miss. Luck stopped it from progressing further.

The problem started offsite due to causes outside the control of the plant’s owner. Those uncontrollable causes resulted in the main generator output breakers opening as designed.

By procedure, the operators were supposed to trip the main generator. Failing to do so resulted in the unnecessary closure of the MSIVs and the loss of the normal makeup cooling flow to the reactor vessel.

By procedure, the operators were supposed to manually start the RCIC system to provide backup cooling water flow to the reactor vessel. But they failed to properly start the system and it immediately tripped.

Procedures are like recipes—positive outcomes are achieved only when they are followed.

The operators resorted to the HPCS system. It took about a minute for the HPCS system to recover the reactor vessel water level. The operators then left it running in “idle” for the next 3 hours and 42 minutes, during which time about 4.7 gallons per minute leaked into the reactor building. The leak was through eroded gasket material that had been identified as improper for this application nearly a decade earlier but never replaced.

Defense-in-depth is a nuclear safety hallmark. That hallmark works best when operators don’t bypass barriers and when workers patch known holes in barriers. Luckily, other barriers remained effective to thwart this near-miss from becoming a disaster. But luck is a fickle factor that needs to be minimized whenever possible.

Managing Nuclear Worker Fatigue

The Nuclear Regulatory Commission (NRC) issued a policy statement on February 18, 1982, seeking to protect nuclear plant personnel against impairment by fatigue from working too many hours. The NRC backed up this policy statement by issuing Generic Letter 82-12, “Nuclear Power Plant Staff Working Hours,” on June 15, 1982. The Generic Letter outlined guidelines such as limiting individuals to 16-hour shifts and providing for a break of at least 8 hours between shifts. But policy statements and guidelines are not enforceable regulatory requirements.

Fig. 1 (Source: GDJ’s Clipart)

UCS issued a report titled “Overtime and Staffing Problems in the Commercial Nuclear Power Industry” in March 1999 describing how the NRC’s regulations failed to adequately protect against human impairment caused by fatigue. Our report revealed that workers at one nuclear plant in the Midwest logged more than 50,000 overtime hours in one year.

Barry Quigley, then a worker at a nuclear plant in the Midwest, submitted a petition for rulemaking to the NRC on September 28, 1999. The NRC had issued regulations in the 1980s intended to protect against human impairment caused by drugs and alcohol, under which nuclear plant workers were subject to initial, random, follow-up, and for-cause drug and alcohol testing. Quigley’s petition sought to extend those fitness-for-duty requirements to include limits on working hours. The NRC revised its regulations on March 31, 2008, to require that owners implement fatigue management measures. The revised regulations permit individuals to exceed the working hour limits, but only under certain conditions. Owners are required to submit annual reports to the NRC on the number of working hour limit waivers granted.

The NRC’s Office of Nuclear Regulatory Research recently analyzed the first five years under the working hour limits regulation. The analysis reported that in 2000, the year the NRC initiated the rulemaking process, some plants were issuing more than 7,500 waivers per year of the working hour guidelines in Generic Letter 82-12, and about one-third of the plants granted over 1,000 waivers annually. In 2010, the first year the revised regulations were in effect, a total of 3,800 waivers were granted across the entire fleet of operating reactors. By 2015, the number of waivers for all nuclear plants had dropped to 338. The Grand Gulf nuclear plant near Port Gibson, Mississippi, topped the 2015 list with 69 waivers, but 54 of them (78%) were associated with the plant’s force-on-force security exercise.

The analysis indicates that owners have learned how to manage worker shifts within the NRC’s revised regulations. Zero waivers are unattainable due to unforeseen events like workers calling in sick and tasks unexpectedly taking longer to complete. The analysis suggests that the revised regulations enable owners to handle such unforeseen needs without the associated controls and reporting being an undue burden.

The regulatory requirements adopted by the NRC to protect against sleepy nuclear plant workers should let people living near nuclear plants sleep a little better.