((Short description|Type of computing error))
((Distinguish|Software error))
((More citations needed|date=November 2011))
((Use dmy dates|date=March 2020|cs1-dates=y))
In [[electronics]] and [[computing]], a '''soft error''' is a type of [[error]] where a signal or datum is wrong. Errors may be caused by a [[wikt:defect|defect]], usually understood either to be a mistake in design or construction, or a broken component. A soft error is also a signal or datum which is wrong, but is not assumed to imply such a mistake or breakage. After observing a soft error, there is no implication that the system is any less reliable than before.  One cause of soft errors is [[single event upset]]s from cosmic rays.

In a computer's memory system, a soft error changes an instruction in a program or a data value. Soft errors typically can be remedied by [[cold booting]] the computer. A soft error will not damage a system's hardware; the only damage is to the data that is being processed.

There are two types of soft errors, ''chip-level soft error'' and ''system-level soft error''.  Chip-level soft errors occur when particles hit the chip, e.g., when [[Air shower (physics)|secondary particles]] from [[cosmic ray]]s land on the [[Die (integrated circuit)|silicon die]]. If a particle with [[Soft error#Critical charge|certain properties]] hits a [[Memory cell (computing)|memory cell]] it can cause the cell to change state to a different value. The atomic reaction in this example is so tiny that it does not damage the physical structure of the chip.  System-level soft errors occur when the data being processed is hit with a noise phenomenon, typically when the data is on a data bus. The computer tries to interpret the noise as a data bit, which can cause errors in addressing or processing program code. The bad data bit can even be saved in memory and cause problems at a later time.

If detected, a soft error may be corrected by rewriting correct data in place of erroneous data. Highly reliable systems use [[error correction]] to correct soft errors on the fly. However, in many systems, it may be impossible to determine the correct data, or even to discover that an error is present at all. In addition, before the correction can occur, the system may have [[crash (computing)|crashed]], in which case the [[recovery procedure]] must include a [[Reboot (computer)|reboot]].  Soft errors involve changes to data((mdashb))the [[electrons]] in a storage circuit, for example((mdashb))but not changes to the physical circuit itself, the [[atoms]]. If the data is rewritten, the circuit will work perfectly again.  Soft errors can occur on transmission lines, in digital logic, analog circuits, magnetic storage, and elsewhere, but are most commonly known in semiconductor storage.

== ((Anchor|CRITICAL-CHARGE))Critical charge ==
Whether or not a circuit experiences a soft error depends on the energy of the incoming particle, the geometry of the impact, the location of the strike, and the design of the logic circuit. Logic circuits with higher [[capacitance]] and higher logic voltages are less likely to suffer an error. This combination of capacitance and voltage is described by the ''critical [[electric charge|charge]]'' parameter, Q<sub>crit</sub>, the minimum electron charge disturbance needed to change the logic level. A higher Q<sub>crit</sub> means fewer soft errors. Unfortunately, a higher Q<sub>crit</sub> also means a slower logic gate and a higher power dissipation. Reduction in chip feature size and supply voltage, desirable for many reasons, decreases Q<sub>crit</sub>. Thus, the importance of soft errors increases as chip technology advances.

In a logic circuit, Q<sub>crit</sub> is defined as the minimum amount of induced charge required at a circuit node to cause a voltage pulse to propagate from that node to the output and be of sufficient duration and magnitude to be reliably latched.  Since a logic circuit contains many nodes that may be struck, and each node may be of unique capacitance and distance from output, Q<sub>crit</sub> is typically characterized on a per-node basis.
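
The dependence of Q<sub>crit</sub> on capacitance and voltage can be illustrated with a rough calculation. The sketch below assumes the common first-order approximation Q<sub>crit</sub> ≈ C<sub>node</sub> × V<sub>dd</sub>, which is not stated in the text above and ignores transistor drive strength and pulse shape. The function names and node values are hypothetical and chosen only to show the trend.

```python
# First-order estimate only: Q_crit ~ C_node * V_dd (an assumed simplification).
ELECTRON_CHARGE_C = 1.602176634e-19  # coulombs per electron


def critical_charge_fc(node_capacitance_ff: float, supply_voltage_v: float) -> float:
    """Approximate critical charge in femtocoulombs (fF * V = fC)."""
    return node_capacitance_ff * supply_voltage_v


def electrons_for_upset(q_crit_fc: float) -> float:
    """Number of electrons that carry the given charge."""
    return q_crit_fc * 1e-15 / ELECTRON_CHARGE_C


# Hypothetical nodes: a large, high-voltage node and a small, low-voltage one.
older_node = critical_charge_fc(50.0, 5.0)   # 250 fC
scaled_node = critical_charge_fc(1.0, 1.0)   # 1 fC

print(f"older node:  {older_node:.0f} fC, {electrons_for_upset(older_node):.0f} electrons")
print(f"scaled node: {scaled_node:.0f} fC, {electrons_for_upset(scaled_node):.0f} electrons")
```

Under these assumptions, shrinking both capacitance and voltage reduces the charge needed for an upset by orders of magnitude, which is the trend the section describes.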

== Causes of soft errors ==

=== Alpha particles from package decay ===
Soft errors became widely known with the introduction of [[dynamic RAM]] in the 1970s. In these early devices, ceramic chip packaging materials contained small amounts of [[radioactive]] contaminants. Very low decay rates are needed to avoid excess soft errors, and chip companies have occasionally suffered problems with contamination ever since. It is extremely hard to maintain the material purity needed. Controlling alpha particle emission rates for critical packaging materials to less than a level of 0.001 counts per hour per cm<sup>2</sup> (cph/cm<sup>2</sup>) is required for reliable performance of most circuits. For comparison, the count rate of a typical shoe's sole is between 0.1 and 10 cph/cm<sup>2</sup>.

Package radioactive decay usually causes a soft error by [[alpha particle]] emission. The positively charged alpha particle travels through the semiconductor and disturbs the distribution of electrons there. If the disturbance is large enough, a [[Digital data|digital]] [[signal (information theory)|signal]] can change from a 0 to a 1 or vice versa. In [[combinational logic]], this effect is transient, perhaps lasting a fraction of a nanosecond, which is why soft errors in combinational logic have gone mostly unnoticed. In sequential logic such as [[Latch (electronic)|latches]] and [[Random Access Memory|RAM]], even this transient upset can become stored for an indefinite time, to be read out later. Thus, designers are usually much more aware of the problem in storage circuits.

A 2011 [[Black Hat Briefings|Black Hat]] paper discusses the real-life security implications of such bit-flips in the Internet's [[Domain Name System]]. The paper found up to 3,434 incorrect requests per day due to bit-flip changes for various common domains. Many of these bit-flips would probably be attributable to hardware problems, but some could be attributed to alpha particles.<ref>((cite web |url=https://media.blackhat.com/bh-us-11/Dinaburg/BH_US_11_Dinaburg_Bitsquatting_WP.pdf |title=Bitsquatting - DNS Hijacking without Exploitation |author=Artem Dinaburg |date=July 2011 |access-date=2011-12-26  |archive-date=2018-06-11  |archive-url=https://web.archive.org/web/20180611050923/https://media.blackhat.com/bh-us-11/Dinaburg/BH_US_11_Dinaburg_Bitsquatting_WP.pdf |url-status=dead ))</ref> These bit-flip errors may be taken advantage of by malicious actors in the form of [[bitsquatting]].
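
To see how a single bit-flip turns one domain name into another, the following sketch enumerates every name that differs from a given label by exactly one flipped bit and is still made of valid hostname characters. It is an illustration of the idea only, not the method used in the cited paper. The function name is hypothetical, and the input is assumed to be a lowercase ASCII label.

```python
import string

# Characters allowed in a DNS hostname label (case-insensitive).
VALID_CHARS = set(string.ascii_lowercase + string.digits + "-")


def single_bit_flips(label: str) -> set:
    """Return all valid labels that differ from `label` by one flipped bit."""
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            # Flip one bit of the character; fold case because DNS ignores it.
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped == ch or flipped not in VALID_CHARS:
                continue
            candidate = label[:i] + flipped + label[i + 1:]
            # A hostname label may not start or end with a hyphen.
            if candidate.startswith("-") or candidate.endswith("-"):
                continue
            variants.add(candidate)
    return variants


print(sorted(single_bit_flips("cnn")))
```

Each resulting name is one that a memory error in a client, resolver, or router could substitute for the intended one.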

[[Isaac Asimov]] received a letter congratulating him on an accidental prediction of alpha-particle RAM errors in a 1950s novel.<ref>[[Gold (Asimov)|Gold]] (1995): "This letter is to inform you and congratulate you on another remarkable scientific prediction of the future; namely your foreseeing of the dynamic random-access memory (DRAM) logic upset problem caused by alpha particle emission, first observed in 1977, but written about by you in Caves of Steel in 1957." [Note: Actually, 1952.] ... "These failures are caused by trace amounts of radioactive elements present in the packaging material used to encapsulate the silicon devices ... in your book, Caves of Steel, published in the 1950s, you use an alpha particle emitter to 'murder' one of the robots in the story, by destroying ('randomizing') its positronic brain. This is, of course, as good a way of describing a logic upset as any I've heard ... our millions of dollars of research, culminating in several international awards for the most important scientific contribution in the field of reliability of semiconductor devices in 1978 and 1979, was predicted in substantially accurate form twenty years [Note: twenty-five years, actually] before the events took place"</ref>

=== Cosmic rays creating energetic neutrons and protons ===
Once the electronics industry had determined how to control package contaminants, it became clear that other causes were also at work. [[James F. Ziegler]] led a program of work at [[IBM]] which culminated in the publication of a number of papers (Ziegler and Lanford, 1979) demonstrating that [[cosmic ray]]s also could cause soft errors. Indeed, in modern devices, cosmic rays may be the predominant cause. Although the primary particle of the cosmic ray does not generally reach the Earth's surface, it creates a [[Air shower (physics)|shower]] of energetic secondary particles. At the Earth's surface approximately 95% of the particles capable of causing soft errors are energetic neutrons with the remainder composed of protons and pions.<ref name="Ziegler1996">
((cite journal |last1=Ziegler |first1=J. F. |title=Terrestrial cosmic rays |journal = [[IBM Journal of Research and Development]] |volume=40 |issue=1 |pages=19–39 |date=January 1996 |doi=10.1147/rd.401.0019 | issn = 0018-8646 ))</ref>
IBM estimated in 1996 that one error per month per 256&nbsp;[[MiB]] of RAM was expected for a desktop computer.<ref name="cosmicRayAlert" />
This flux of energetic neutrons is typically referred to as "cosmic rays" in the soft error literature. Neutrons are uncharged and cannot disturb a circuit on their own, but undergo [[neutron capture]] by the nucleus of an atom in a chip. This process may result in the production of charged secondaries, such as alpha particles and oxygen nuclei, which can then cause soft errors.

Cosmic ray flux depends on altitude. For the common reference location of 40.7°&nbsp;N, 74°&nbsp;W at sea level ([[New York City]], NY, USA), the flux is approximately 14 neutrons/cm<sup>2</sup>/hour. Burying a system in a cave reduces the rate of cosmic-ray-induced soft errors to a negligible level. In the lower levels of the atmosphere, the flux increases by a factor of about 2.2 for every 1000&nbsp;m (1.3 for every 1000&nbsp;ft) increase in altitude above sea level. Computers operated on top of mountains experience an order of magnitude higher rate of soft errors compared to sea level. The rate of upsets in [[aircraft]] may be more than 300 times the sea level upset rate. This is in contrast to package decay-induced soft errors, which do not change with location.<ref name="GordonGoldhagen2004">((cite journal |last1=Gordon |first1=M. S. |last2=Goldhagen |first2=P. |last3=Rodbell |first3=K. P. |last4=Zabel |first4=T. H. |last5=Tang |first5=H. H. K. |last6=Clem |first6=J. M. |last7=Bailey |first7=P. |title=Measurement of the flux and energy spectrum of cosmic-ray induced neutrons on the ground |journal=IEEE Transactions on Nuclear Science |volume=51 |issue=6 |date=2004 |pages=3427–3434 |issn=0018-9499 |doi=10.1109/TNS.2004.839134 |bibcode=2004ITNS...51.3427G|s2cid=9573484 ))</ref>
As [[Moore's law|chip density increases]], [[Intel]] expects the errors caused by cosmic rays to increase and become a limiting factor in design.<ref name="cosmicRayAlert">((cite magazine |last=Simonite |first=Tom |date=March 2008 |title=Should every computer chip have a cosmic ray detector? |url=https://www.newscientist.com/blog/technology/2008/03/do-we-need-cosmic-ray-alerts-for.html |magazine=[[New Scientist]] |archive-url=https://web.archive.org/web/20111202020146/https://www.newscientist.com/blog/technology/2008/03/do-we-need-cosmic-ray-alerts-for.html |archive-date=2 December 2011 |access-date=26 November 2019))</ref>
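
The altitude figures above can be turned into a simple scaling rule. The sketch below assumes the factor of 2.2 per 1000&nbsp;m applies as a pure exponential from the 14 neutrons/cm<sup>2</sup>/hour sea-level reference; the text states this only for the lower levels of the atmosphere, so the model should not be extrapolated to aircraft altitudes. Function names are hypothetical.

```python
SEA_LEVEL_FLUX = 14.0   # neutrons / cm^2 / hour at the New York City reference
FACTOR_PER_KM = 2.2     # flux multiplier per 1000 m of altitude (lower atmosphere)


def relative_rate(altitude_m: float) -> float:
    """Flux (and cosmic-ray soft error rate) relative to sea level."""
    return FACTOR_PER_KM ** (altitude_m / 1000.0)


def neutron_flux(altitude_m: float) -> float:
    """Approximate flux in neutrons / cm^2 / hour at the given altitude."""
    return SEA_LEVEL_FLUX * relative_rate(altitude_m)


for altitude in (0, 304.8, 1000, 3000):
    print(f"{altitude:7.1f} m: x{relative_rate(altitude):.2f}, "
          f"{neutron_flux(altitude):.1f} n/cm^2/h")
```

With these assumptions, 304.8&nbsp;m (1000&nbsp;ft) gives a factor of about 1.3, and 3000&nbsp;m gives a factor of about 10.6, consistent with the order-of-magnitude increase quoted for mountain-top sites.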

The average rate of cosmic-ray soft errors is ''inversely'' related to sunspot activity. That is, the average number of cosmic-ray soft errors decreases during the active portion of the [[sunspot cycle]] and increases during the quiet portion. This counter-intuitive result occurs for two reasons. The Sun does not generally produce cosmic ray particles with energy above 1&nbsp;GeV that are capable of penetrating to the Earth's upper atmosphere and creating particle showers, so changes in the solar flux do not directly influence the number of errors. Further, the increased solar flux during an active period reshapes the Earth's magnetic field, providing some additional shielding against higher-energy cosmic rays and decreasing the number of particles that create showers. The effect is fairly small in any case, resulting in a ±7% modulation of the energetic neutron flux in New York City. Other locations are similarly affected.((citation needed|date=December 2015))

One experiment measured the soft error rate at sea level to be 5,950&nbsp;[[failures in time]] (FIT = failures per billion hours) per DRAM chip. When the same test setup was moved to an underground vault, shielded by over ((Convert|50|feet|m)) of rock that effectively eliminated all cosmic rays, zero soft errors were recorded.<ref>((cite web|author-last=Dell|author-first=Timothy J.|date=1997|title=A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory|url=https://asset-pdf.scinapse.io/prod/48011110/48011110.pdf|access-date=2021-11-03|website=ece.umd.edu|page=13))</ref> In this test, all other causes of soft errors were too small to be measured compared to the error rate caused by cosmic rays.
 
Energetic neutrons produced by cosmic rays may lose most of their kinetic energy and reach thermal equilibrium with their surroundings as they are scattered by materials. The resulting neutrons are simply referred to as [[thermal neutrons]] and have an average kinetic energy of about 25 millielectron-volts at 25&nbsp;°C. Thermal neutrons are also produced by environmental radiation sources such as the decay of naturally occurring uranium or thorium. The thermal neutron flux from sources other than cosmic-ray showers may still be noticeable in an underground location and an important contributor to soft errors for some circuits.

=== Thermal neutrons ===
Neutrons that have lost kinetic energy until they are in thermal equilibrium with their surroundings are an important cause of soft errors for some circuits. At low energies many [[neutron capture]] reactions become much more probable and result in fission of certain materials creating charged secondaries as fission byproducts. For some circuits the capture of a [[thermal neutron]] by the nucleus of the <sup>10</sup>B [[isotopes of boron|isotope of boron]] is particularly important. This nuclear reaction is an efficient producer of an [[alpha particle]], [[lithium|<sup>7</sup>Li]] nucleus and [[gamma ray]]. Either of the charged particles (alpha or <sup>7</sup>Li) may cause a soft error if produced in very close proximity, approximately 5&nbsp;[[μm]], to a critical circuit node. The capture cross section for <sup>11</sup>B is 6 [[orders of magnitude]] smaller and does not contribute to soft errors.<ref name="BaumannHossain1995">((cite book |last1=Baumann |first1=R. |title=33rd IEEE International Reliability Physics Symposium |last2=Hossain |first2=T. |last3=Murata |first3=S. |last4=Kitagawa |first4=H. |chapter=Boron compounds as a dominant source of alpha particles in semiconductor devices |date=1995 |pages=297–302 |doi=10.1109/RELPHY.1995.513695 |isbn=978-0-7803-2031-4|s2cid=110078856 ))</ref>

[[Boron]] has been used in [[Borophosphosilicate glass|BPSG]], the insulator in the interconnection layers of integrated circuits, particularly in the lowest one. The inclusion of boron lowers the melt temperature of the glass providing better [[reflow soldering|reflow]] and planarization characteristics. In this application the glass is formulated with a boron content of 4% to 5% by weight. Naturally occurring boron is 20% <sup>10</sup>B with the remainder the <sup>11</sup>B isotope. Soft errors are caused by the high level of <sup>10</sup>B in this critical lower layer of some older integrated circuit processes. Boron-11, used at low concentrations as a p-type dopant, does not contribute to soft errors. Integrated circuit manufacturers eliminated borated dielectrics by the time individual circuit components decreased in size to 150&nbsp;nm, largely due to this problem.

In critical designs, depleted boron((mdashb))consisting almost entirely of boron-11((mdashb))is used, to avoid this effect and therefore to reduce the soft error rate. Boron-11 is a by-product of the [[nuclear power|nuclear industry]].

For applications in medical electronic devices this soft error mechanism may be extremely important. Neutrons are produced during high-energy cancer radiation therapy using photon beam energies above 10&nbsp;MeV. These neutrons are moderated as they are scattered from the equipment and walls in the treatment room, resulting in a thermal neutron flux that is about 40&nbsp;×&nbsp;10<sup>6</sup> times higher than the normal environmental neutron flux. This high thermal neutron flux will generally result in a very high rate of soft errors and consequent circuit upset.<ref name="WilkinsonBounds2005">((cite journal |last1=Wilkinson |first1=J. D. |last2=Bounds |first2=C. |last3=Brown |first3=T. |last4=Gerbi |first4=B. J. |last5=Peltier |first5=J. |title=Cancer-radiotherapy equipment as a cause of soft errors in electronic equipment |journal=IEEE Transactions on Device and Materials Reliability |volume=5 |issue=3 |date=2005 |pages=449–451 |issn=1530-4388 |doi=10.1109/TDMR.2005.858342|s2cid=20789261 ))</ref><ref name="Franco">Franco, L., Gómez, F., Iglesias, A., Pardo, J., Pazos, A., Pena, J., Zapata, M., SEUs on commercial SRAM induced by low energy neutrons produced at a clinical linac facility, RADECS Proceedings, September 2005</ref>

=== Other causes ===
Soft errors can also be caused by [[random noise]] or [[signal integrity]] problems, such as inductive or capacitive [[crosstalk]]. However, in general, these sources represent a small contribution to the overall soft error rate when compared to radiation effects.

Some tests conclude that the isolation of [[DRAM]] memory cells can be circumvented by unintended side effects of specially crafted accesses to adjacent cells. Because of the high cell density in modern memory, accessing data stored in DRAM causes memory cells to leak their charges and interact electrically, altering the content of nearby memory rows that were not addressed in the original memory access.<ref name="kyungbae">((cite book |author-first1=Kyungbae |author-last1=Park |author-first2=Sanghyeon |author-last2=Baeg |author-first3=ShiJie |author-last3=Wen |author-first4=Richard |author-last4=Wong |title=2014 IEEE International Integrated Reliability Workshop Final Report (IIRW) |chapter=Active-precharge hammering on a row induced failure in DDR3 SDRAMs under 3× nm technology |pages=82–85 |publisher=[[IEEE]] |date=October 2014 |doi=10.1109/IIRW.2014.7049516 |isbn=978-1-4799-7308-8|s2cid=14464953 ))</ref> This effect is known as [[row hammer]], and it has also been used in some [[privilege escalation]] computer security [[Exploit (computer security)|exploits]].<ref>((cite web |url=http://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf |title=Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors |date=2014-06-24 |access-date=2015-03-10 |author-first1=Yoongu |author-last1=Kim |author-first2=Ross |author-last2=Daly |author-first3=Jeremie |author-last3=Kim |author-first4=Chris |author-last4=Fallin |author-first5=Ji Hye |author-last5=Lee |author-first6=Donghyuk |author-last6=Lee |author-first7=Chris |author-last7=Wilkerson |author-first8=Konrad |author-last8=Lai |author-first9=Onur |author-last9=Mutlu |publisher=[[IEEE]] |website=ece.cmu.edu))</ref><ref>((cite web |url=https://arstechnica.com/security/2015/03/cutting-edge-hack-gives-super-user-status-by-exploiting-dram-weakness/ |title=Cutting-edge hack gives super user status by exploiting DRAM weakness |date=2015-03-10 |access-date=2015-03-10 |author-first=Dan |author-last=Goodin |publisher=[[Ars Technica]]))</ref>

== Designing around soft errors ==

=== Soft error mitigation ===
A designer can attempt to minimize the rate of soft errors by judicious device design, choosing the right semiconductor, package and substrate materials, and the right device geometry. Often, however, this is limited by the need to reduce device size and voltage, to increase operating speed and to reduce power dissipation. The susceptibility of devices to upsets is described in the industry using the [[JEDEC]] [[JESD-89]] standard.

One technique that can be used to reduce the soft error rate in digital circuits is called [[radiation hardening]]. This involves increasing the capacitance at selected circuit nodes in order to increase their effective Q<sub>crit</sub> value. This reduces the range of particle energies that can upset the logic value of the node. Radiation hardening is often accomplished by increasing the size of transistors that share a drain/source region at the node. Since the area and power overhead of radiation hardening can be restrictive to design, the technique is often applied selectively to nodes which are predicted to have the highest probability of resulting in soft errors if struck. Tools and models that can predict which nodes are most vulnerable are the subject of past and current research in the area of soft errors.

=== Detecting soft errors ===
There has been  work addressing soft errors in processor and memory resources using both hardware and software techniques. Several research efforts addressed soft errors by proposing error detection and recovery via hardware-based redundant multi-threading.<ref name="ReinhardtMukherjee2000">((cite journal |last1=Reinhardt |first1=Steven K. |last2=Mukherjee |first2=Shubhendu S. |title=Transient fault detection via simultaneous multithreading |journal=ACM SIGARCH Computer Architecture News |volume=28 |issue=2 |date=2000 |pages=25–36 |issn=0163-5964 |doi=10.1145/342001.339652|citeseerx=10.1.1.112.37))</ref><ref name="MukherjeeKontz2002">((cite journal |last1=Mukherjee |first1=Shubhendu S. |last2=Kontz |first2=Michael |last3=Reinhardt |first3=Steven K. |title=Detailed design and evaluation of redundant multithreading alternatives |journal=ACM SIGARCH Computer Architecture News |volume=30 |issue=2 |date=2002 |pages=99 |issn=0163-5964 |doi=10.1145/545214.545227 |citeseerx=10.1.1.13.2922|s2cid=1909214 ))</ref><ref name="VijaykumarPomeranz2002">((cite journal |last1=Vijaykumar |first1=T. N. |last2=Pomeranz |first2=Irith|author2-link= Irith Pomeranz |last3=Cheng |first3=Karl |title=Transient-fault recovery using simultaneous multithreading |journal=ACM SIGARCH Computer Architecture News |volume=30 |issue=2 |date=2002 |pages=87 |issn=0163-5964 |doi=10.1145/545214.545226|s2cid=2270600 ))</ref>
These approaches used special hardware to replicate an application's execution and identify errors in the output, which increased hardware design complexity and cost and incurred high performance overhead. Software-based soft error tolerant schemes, on the other hand, are flexible and can be applied on commercial off-the-shelf microprocessors. Many works propose compiler-level instruction replication and result checking for soft error detection.<ref name="oh2002error">((cite journal |last1=Oh |first1=Nahmsuk |last2=Shirvani |first2=Philip P. |last3=McCluskey |first3=Edward J. |title= Error detection by duplicated instructions in super-scalar processors |journal=IEEE Transactions on Reliability |volume=51 |date=2002 |pages=63–75 |doi=10.1109/24.994913))</ref><ref name="reis2005swift">((cite book |last1=Reis |first1=George A. |title=International Symposium on Code Generation and Optimization |last2=Chang |first2=Jonathan |last3=Vachharajani |first3=Neil |last4=Rangan |first4=Ram |last5=August |first5=David I. |chapter=SWIFT: Software implemented fault tolerance |location=Proceedings of the international symposium on Code generation and optimization |date=2005 |pages=243–254 |doi=10.1109/CGO.2005.34 |isbn=978-0-7695-2298-2 |citeseerx=10.1.1.472.4177|s2cid=5746979 ))</ref><ref name="Didehban2016nZDC">((citation |last1=Didehban |first1=Moslem |last2=Shrivastava |first2=Aviral |title=Proceedings of the 53rd Annual Design Automation Conference |chapter=NZDC: A compiler technique for near zero silent data corruption |date=2016 |publisher=ACM |location=Proceedings of the 53rd Annual Design Automation Conference (DAC) |page=48 |doi=10.1145/2897937.2898054 |isbn=9781450342360|s2cid=5618907 ))</ref>
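
The idea behind replication and result checking can be sketched at a much coarser level than the compiler techniques cited above. The example below runs an operation twice and compares the results; it is not the algorithm of any cited work, which duplicate individual instructions and registers. The names <code>run_duplicated</code>, <code>SoftErrorDetected</code> and the <code>inject_fault_mask</code> parameter are hypothetical, and the operation is assumed to return an integer.

```python
class SoftErrorDetected(Exception):
    """Raised when the two redundant results disagree."""


def run_duplicated(op, *args, inject_fault_mask=0):
    """Compute op(*args) twice and return the result only if both copies agree."""
    primary = op(*args)
    shadow = op(*args)
    # Demonstration only: simulate a soft error by flipping bits in one copy.
    primary ^= inject_fault_mask
    if primary != shadow:
        raise SoftErrorDetected(f"mismatch: {primary} != {shadow}")
    return primary


def add(a, b):
    return a + b


print(run_duplicated(add, 2, 3))            # both copies agree
try:
    run_duplicated(add, 2, 3, inject_fault_mask=0b100)
except SoftErrorDetected as err:
    print("detected:", err)
```

Duplication of this kind detects an error but cannot tell which copy is correct; recovery requires re-execution or a third copy.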

=== Correcting soft errors ===
((see also|ECC memory))

Designers can choose to accept that soft errors will occur, and design systems with appropriate error detection and correction to recover gracefully. Typically, a semiconductor memory design might use [[forward error correction]], incorporating redundant data into each [[Word (computer architecture)|word]] to create an [[error correcting code]]. Alternatively, [[roll-back error correction]] can be used, detecting the soft error with an [[Error detection and correction|error-detecting code]] such as [[parity bit|parity]], and rewriting correct data from another source. This technique is often used for [[write-through]] [[cache memory|cache memories]].
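
As a small illustration of forward error correction, the sketch below implements the classic Hamming(7,4) code, which adds three redundant bits to four data bits and corrects any single-bit error. It is chosen here for brevity and is an assumption of this example; practical ECC memory uses wider codes over whole memory words. Function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1          # rewrite the erroneous bit
    return [c[2], c[4], c[5], c[6]], syndrome


codeword = hamming74_encode([1, 0, 1, 1])
codeword[4] ^= 1                      # simulate a soft error at position 5
data, position = hamming74_decode(codeword)
print(data, position)
```

The syndrome directly names the position of the flipped bit, so correction needs no second copy of the data.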

Soft errors in [[logic circuits]] are sometimes detected and corrected using the techniques of [[fault tolerance|fault tolerant design]]. These often include the use of redundant circuitry or computation of data, and typically come at the cost of circuit area, decreased performance, and/or higher power consumption. The concept of [[triple modular redundancy]] (TMR) can be employed to ensure very high soft-error reliability in logic circuits. In this technique, three identical copies of a circuit compute on the same data in parallel and outputs are fed into [[majority voting logic]], returning the value that occurred in at least two of three cases. In this way, the failure of one circuit due to a soft error is discarded, assuming the other two circuits operated correctly. In practice, however, few designers can afford the greater than 200% circuit area and power overhead required, so it is usually only selectively applied. Another common concept to correct soft errors in logic circuits is temporal (or time) redundancy, in which one circuit operates on the same data multiple times and compares subsequent evaluations for consistency. This approach often incurs performance overhead, area overhead (if copies of latches are used to store data), and power overhead, though it is considerably more area-efficient than modular redundancy.
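
The majority voting step of TMR can be written as a single bitwise expression. The following is a minimal software sketch of the voter only, with a hypothetical function name; in hardware, the three inputs would come from three physical copies of the circuit.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


correct = 0b1010
upset = correct ^ 0b0100          # one copy suffers a single bit-flip

print(bin(majority_vote(correct, upset, correct)))
```

Because the vote is taken per bit, the output is correct as long as no bit position is wrong in more than one copy.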

Traditionally, [[Dynamic random access memory|DRAM]] has had the most attention in the quest to reduce or work around soft errors, because DRAM has comprised the majority share of susceptible device surface area in desktop and server computer systems (compare the prevalence of ECC RAM in server computers). Hard figures for DRAM susceptibility are hard to come by, and vary considerably across designs, fabrication processes, and manufacturers. With 1980s technology, 256-kilobit DRAMs could have clusters of five or six bits flip from a single [[alpha particle]]. Modern DRAMs have much smaller feature sizes, so the deposition of a similar amount of charge could easily cause many more bits to flip.

The design of error detection and correction circuits is helped by the fact that soft errors usually are localised to a very small area of a chip. Usually, only one cell of a memory is affected, although high energy events can cause a multi-cell upset. Conventional memory layout usually places one bit of many different correction words adjacent on a chip. So, even a ''multi-cell upset'' leads to only a number of separate ''[[Single event upset|single-bit upsets]]'' in multiple correction words, rather than a ''multi-bit upset'' in a single correction word. So, an error correcting code needs only to cope with a single bit in error in each correction word in order to cope with all likely soft errors. The term 'multi-cell' is used for upsets affecting multiple cells of a memory, whatever correction words those cells happen to fall in. 'Multi-bit' is used when multiple bits in a single correction word are in error.
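
The effect of interleaving correction words can be shown with a toy layout. The sketch below assumes a row of cells in which physical cell ''i'' belongs to correction word ''i'' mod ''N'', where ''N'' is a hypothetical interleave factor; real memory layouts differ in detail. It counts how many bits of each word are hit when a run of adjacent cells is upset.

```python
from collections import Counter


def word_of_cell(cell_index: int, interleave: int) -> int:
    """Correction word that owns a physical cell in the assumed layout."""
    return cell_index % interleave


def bits_hit_per_word(first_cell: int, cells_hit: int, interleave: int) -> Counter:
    """Count upset bits per correction word for a run of adjacent upset cells."""
    struck = range(first_cell, first_cell + cells_hit)
    return Counter(word_of_cell(cell, interleave) for cell in struck)


# A 4-cell upset with 4-way interleaving: one bit in each of four words.
print(bits_hit_per_word(first_cell=5, cells_hit=4, interleave=4))
# The same upset without interleaving: four bits in a single word.
print(bits_hit_per_word(first_cell=5, cells_hit=4, interleave=1))
```

In this model a single-error-correcting code is sufficient as long as the upset spans no more cells than the interleave factor.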

== Soft errors in combinational logic ==
The three natural masking effects in [[combinational logic]] that determine whether a [[single event upset]] (SEU) will propagate to become a soft error are [[electrical masking]], [[logical masking]], and [[temporal (or timing-window) masking]]. An SEU is ''logically masked'' if its propagation is blocked from reaching an output latch because off-path gate inputs prevent a logical transition of that gate's output. An SEU is ''electrically masked'' if the signal is attenuated by the electrical properties of gates on its propagation path such that the resulting pulse is of insufficient magnitude to be reliably latched. An SEU is ''temporally masked'' if the erroneous pulse reaches an output latch, but does not arrive close enough to the moment the latch is triggered to be captured.

If all three masking effects fail to occur, the propagated pulse becomes latched and the output of the logic circuit will be an erroneous value. In the context of circuit operation, this erroneous output value may be considered a soft error event. However, from a microarchitectural-level standpoint, the affected result may not change the output of the currently executing program. For instance, the erroneous data could be overwritten before use, masked in subsequent logic operations, or simply never be used. If erroneous data does not affect the output of a program, it is considered to be an example of ''microarchitectural masking''.
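
Of the masking effects described in this section, logical masking is the easiest to illustrate in a few lines. The sketch below uses a hypothetical two-gate circuit, <code>(a AND b) OR c</code>, and models an SEU as an inversion of the internal AND output; electrical and temporal masking are not modelled.

```python
def circuit(a: int, b: int, c: int, flip_internal: bool = False) -> int:
    """Evaluate (a AND b) OR c on single-bit inputs.

    `flip_internal` is a hypothetical fault-injection flag that inverts the
    AND gate's output, standing in for a transient on that node.
    """
    node = a & b
    if flip_internal:
        node ^= 1  # the modelled upset inverts the internal node
    return node | c


def logically_masked(a: int, b: int, c: int) -> bool:
    """True if the injected upset cannot change the circuit output."""
    return circuit(a, b, c) == circuit(a, b, c, flip_internal=True)
```

When the off-path input <code>c</code> is 1, the OR gate's output is forced to 1 regardless of the upset, so <code>logically_masked(1, 1, 1)</code> is true. When <code>c</code> is 0, the upset propagates and <code>logically_masked(1, 1, 0)</code> is false.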

== Soft error rate ==
Soft error rate (SER) is the rate at which a device or system encounters or is predicted to encounter soft errors. It is typically expressed as either the number of failures in time (FIT) or [[mean time between failures]] (MTBF). One FIT is equivalent to one error per billion hours of device operation. MTBF is usually given in years of device operation; to put it into perspective, a rate of one FIT corresponds to an MTBF of approximately 1,000,000,000&nbsp;/ (24&nbsp;× 365.25)&nbsp;≈ 114,077 years, that is, 114,077 times longer between errors than a one-year MTBF.
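
The conversion between the two units follows directly from the definition of FIT. The helper names below are illustrative, and the sketch assumes a constant error rate and the 365.25-day year used above.

```python
HOURS_PER_YEAR = 24 * 365.25   # 8,766 hours, as in the figure above
HOURS_PER_FIT_UNIT = 1e9       # one FIT = one error per billion device-hours


def fit_to_mtbf_years(fit: float) -> float:
    """Convert a soft error rate in FIT to MTBF in years."""
    return HOURS_PER_FIT_UNIT / (fit * HOURS_PER_YEAR)


def mtbf_years_to_fit(mtbf_years: float) -> float:
    """Convert an MTBF in years to a soft error rate in FIT."""
    return HOURS_PER_FIT_UNIT / (mtbf_years * HOURS_PER_YEAR)
```

For example, <code>fit_to_mtbf_years(1)</code> evaluates to roughly 114,077 years, and <code>mtbf_years_to_fit(1)</code> gives roughly 114,077 FIT for a device with a one-year MTBF.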

While many electronic systems have an MTBF that exceeds the expected lifetime of the circuit, the SER may still be unacceptable to the manufacturer or customer. For instance, many failures per million circuits due to soft errors can be expected in the field if the system does not have adequate soft error protection. The failure of even a few products in the field, particularly if catastrophic, can tarnish the reputation of the product and the company that designed it. Also, in safety- or cost-critical applications where the cost of system failure far outweighs the cost of the system itself, a 1% risk of soft error failure per lifetime may be too high to be acceptable to the customer. Therefore, it is advantageous to design for low SER when a system is manufactured in high volume or requires extremely high reliability.

== See also ==
((Portal|Electronics))
* [[Single event upset]] (SEU)
* [[Glitch]]
* [[Don't care]]
* [[Logic hazard]]

== References ==
((reflist|40em))

== Further reading ==
* ((cite journal |last1=Ziegler |first1=J. F. |last2=Lanford |first2=W. A. |title=Effect of Cosmic Rays on Computer Memories |journal=Science |volume=206 |issue=4420 |date=1979 |pages=776–788 |issn=0036-8075 |doi=10.1126/science.206.4420.776 |pmid=17820742 |bibcode=1979Sci...206..776Z|s2cid=2000982 ))
* Mukherjee, S., "Architecture Design for Soft Errors," Elsevier, Inc., February 2008.
* Mukherjee, S., "Computer Glitches from Soft Errors: A Problem with Multiple Solutions," Microprocessor Report, 19 May 2008.

== External links ==
((Div col|colwidth=40em))
* [http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf Soft Errors in Electronic Memory - A White Paper] - A good summary paper with many references - Tezzaron January 2004.  Concludes that 1000–5000 FIT per Mbit (0.2–1 error per day per Gbyte) is a typical DRAM soft error rate.
* [http://www-1.ibm.com/servers/eserver/pseries/campaigns/chipkill.pdf Benefits of Chipkill-Correct ECC for PC Server Main Memory] - A 1997 discussion of SDRAM reliability - some interesting information on "soft errors" from [[cosmic ray]]s, especially with respect to [[Error-correcting code]] schemes
* [https://web.archive.org/web/20070416115228/http://www.edn.com/article/CA454636.html Soft errors' impact on system reliability] - Ritesh Mastipuram and Edwin C. Wee, Cypress Semiconductor, 2004
* [https://web.archive.org/web/20041018081918/http://nepp.nasa.gov/DocUploads/40D7D6C9-D5AA-40FC-829DC2F6A71B02E9/Scal-00.pdf Scaling and Technology Issues for Soft Error Rates] - A Johnston - 4th Annual Research Conference on Reliability Stanford University, October 2000
* [http://www.rcnp.osaka-u.ac.jp/~annurep/2001/genkou/sec3/kobayashi.pdf Evaluation of LSI Soft Errors Induced by Terrestrial Cosmic rays and Alpha Particles] - H. Kobayashi, K. Shiraishi, H. Tsuchiya, H. Usuki (all of Sony), and Y. Nagai, K. Takahisa (Osaka University), 2001.
* [http://www.selse.org/ SELSE Workshop Website] - Website for the workshop on the System Effects of Logic Soft Errors
((div col end))

((Authority control))

((DEFAULTSORT:Soft Error))
[[Category:Computer memory]]
[[Category:Data quality]]
[[Category:Digital electronics]]
