Design and Optimization of Resistive RAM-based Storage and Computing Systems

Description
Resistive Random Access Memory (ReRAM) is an emerging non-volatile memory technology with attractive attributes, including excellent scalability (< 10 nm), low programming voltage (< 3 V), fast switching speed (< 10 ns), high OFF/ON ratio (> 10), good endurance (up to 10^{12} cycles) and great compatibility with silicon CMOS technology [1]. However, ReRAM suffers from higher write latency and energy and more reliability issues than Dynamic Random Access Memory (DRAM). To improve the energy efficiency, latency and reliability of ReRAM storage systems, a low-cost cross-layer approach that spans the device, circuit, architecture and system levels is proposed.

For the 1T1R 2D ReRAM system, the effect of both retention and endurance errors on ReRAM reliability is considered. The proposed approach is to design circuit-level and architecture-level techniques that significantly reduce the raw Bit Error Rate and then to employ low-cost Error Control Coding to achieve the desired lifetime.
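The interplay between raw Bit Error Rate and Error Control Coding can be illustrated with a short calculation. This is only a sketch under stated assumptions, not the dissertation's actual analysis: bit errors are taken to be independent, and the function name, codeword length and correction capability are hypothetical choices made for illustration.

```python
from math import comb


def block_failure_probability(n_bits, t_correctable, raw_ber):
    """Probability that an n-bit codeword holds more than t bit errors.

    Assumes independent bit errors (an illustrative simplification).
    The tail of the binomial distribution is summed directly to avoid
    the precision loss of computing 1 - P(at most t errors).
    """
    return float(sum(
        comb(n_bits, k) * raw_ber**k * (1.0 - raw_ber)**(n_bits - k)
        for k in range(t_correctable + 1, n_bits + 1)
    ))


if __name__ == "__main__":
    # Hypothetical 72-bit codeword with single-error correction (t = 1).
    for ber in (1e-3, 1e-4, 1e-5):
        print(ber, block_failure_probability(72, 1, ber))
```

Under these assumptions, lowering the raw Bit Error Rate before decoding reduces the uncorrectable-block probability much faster than linearly, which is why a weaker, cheaper code can suffice once the raw error rate has been reduced.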

For the 1S1R 2D ReRAM system, a cross-point array with “multi-bit per access” per subarray is designed for high energy efficiency and good reliability. The errors due to cell-level as well as array-level variations are analyzed, and a low-cost scheme that maintains reliability and latency with low energy consumption is proposed.

For the 1S1R 3D ReRAM system, access schemes that activate multiple subarrays with multiple layers in a subarray are used to achieve high energy efficiency by activating fewer subarrays, and good reliability is achieved through innovative data organization.

Finally, a novel ReRAM-based accelerator design is proposed to support multiple Convolutional Neural Network (CNN) topologies, including VGGNet, AlexNet and ResNet. The multi-tiled architecture consists of 9 processing elements per tile, where each tile implements the dot product operation using ReRAM as the computation unit. The processing elements operate in a systolic fashion, thereby maximizing input feature map reuse and minimizing interconnection cost. The system-level evaluation on several network benchmarks shows that the proposed architecture can improve computation efficiency and energy efficiency compared to a state-of-the-art ReRAM-based accelerator.
Date Created
2019
Agent

Space Radiation Effects in Conductive Bridging Random Access Memory

Description
This work investigates the effects of ionizing radiation and displacement damage on the retention of state, DC programming, and neuromorphic pulsed programming of Ag-Ge30Se70 conductive bridging random access memory (CBRAM) devices. The results show that CBRAM devices are susceptible to both environments. An observable degradation in electrical response due to total ionizing dose (TID) is shown during neuromorphic pulsed programming at TID below 1 Mrad using Cobalt-60. DC cycling in a 14 MeV neutron environment showed a collapse of the high resistance state (HRS) and low resistance state (LRS) programming window after a fluence of 4.9x10^{12} n/cm^2, demonstrating that CBRAM can fail in a displacement damage environment. Heavy ion exposure during retention testing and DC cycling showed that programming failures occurred at approximately the same threshold, indicating that the failure mechanism for the two types of tests may be the same. The dose received due to ionizing electronic interactions and non-ionizing kinetic interactions was calculated for each ion species at the fluence of failure. TID values appear to be the most correlated, indicating that TID effects may be the dominant failure mechanism in a combined environment, though it is currently unclear how displacement damage also contributes to the response. An analysis of material effects due to TID has indicated that radiation damage can limit the migration of Ag+ ions. The reduction in ion current density can explain several of the effects observed in CBRAM while in the LRS.
Date Created
2018
Agent

Semiconductor Memory Applications in Radiation Environment, Hardware Security and Machine Learning System

Description
Semiconductor memory is a key component of computing systems. Beyond conventional memory and data storage applications, this dissertation explores both mainstream and emerging non-volatile memory (eNVM) technologies for radiation environments, hardware security systems and machine learning applications.

In a radiation environment, e.g. aerospace, memory devices face different energetic particles. As these particles pass through the semiconductor device, their strikes can generate electron-hole pairs (directly or indirectly), resulting in photo-induced current that may change the memory state. First, the trend of radiation effects in mainstream memory technologies with technology node scaling is reviewed. Then, single event effects in oxide-based resistive switching random access memory (RRAM), one of the eNVM technologies, are investigated from the circuit level to the system level.

The Physical Unclonable Function (PUF) has been widely investigated as a promising hardware security primitive that exploits the inherent randomness in a physical system (e.g. the intrinsic semiconductor manufacturing variability). In this dissertation, two RRAM-based PUF implementations are proposed for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs is evaluated with experiment and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated, and optimization strategies are proposed to mitigate these effects. In addition, the resistance against modeling and machine learning attacks is analyzed.

Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification, and object localization and detection. Increasing efforts have been devoted to developing hardware accelerators. In this dissertation, two types of compute-in-memory (CIM) based hardware accelerator designs with SRAM and eNVM technologies are proposed for two binary neural networks, i.e. hybrid BNN (HBNN) and XNOR-BNN, respectively, which are explored for hardware resource-limited platforms, e.g. edge devices. These designs feature high throughput, scalability, low latency and high energy efficiency. Finally, the proposed designs with SRAM technology were successfully taped out and validated in TSMC 65 nm.

Overall, this dissertation paves the way for new applications of memory technologies in secure and energy-efficient artificial intelligence systems.
Date Created
2018
Agent

Design of Resistive Synaptic Devices and Array Architectures for Neuromorphic Computing

Description
Over the past few decades, silicon complementary-metal-oxide-semiconductor (CMOS) technology has been greatly scaled down to achieve higher performance, higher density and lower power consumption. As the device dimension approaches its fundamental physical limit, there is an increasing demand for exploration of emerging devices with operating principles distinct from conventional CMOS. In recent years, many efforts have been devoted to research on next-generation emerging non-volatile memory (eNVM) technologies, such as resistive random access memory (RRAM) and phase change memory (PCM), to replace conventional digital memories (e.g. SRAM) for implementation of synapses in large-scale neuromorphic computing systems.

Being essentially compact and “analog”, these eNVM devices in a crossbar array can compute vector-matrix multiplication in parallel, significantly speeding up machine/deep learning algorithms. However, non-ideal eNVM device and array properties may hamper the learning accuracy. To quantify their impact, the sparse coding algorithm was used as a starting point; strategies to remedy the accuracy loss were proposed, and the circuit-level design trade-offs were analyzed. At the architecture level, a parallel “pseudo-crossbar” array that prevents the write disturbance issue was presented. The peripheral circuits to support various parallel array architectures were also designed. One key component is the read circuit, which employs the principle of the integrate-and-fire neuron model to convert the analog column current to a digital output. However, this read circuit is not area-efficient, so it was proposed to replace it with a compact two-terminal oscillation neuron device that exhibits the metal-insulator-transition phenomenon.
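The crossbar computation and the integrate-and-fire readout described above can be sketched in a few lines. This is an idealized illustration, not the dissertation's circuit or the NeuroSim model: wire resistance, sneak paths and device non-idealities are ignored, and all function names and parameter values are hypothetical.

```python
def crossbar_column_currents(conductances, input_voltages):
    """Ideal crossbar read: I_j = sum_i V_i * G[i][j].

    Each cell contributes V * G (Ohm's law) and each column wire sums
    the cell currents (Kirchhoff's current law), so all columns of the
    vector-matrix product are obtained in one parallel read.
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(input_voltages, conductances))
        for j in range(n_cols)
    ]


def integrate_and_fire_count(current, t_read, capacitance, v_threshold):
    """Spike count of an idealized integrate-and-fire readout.

    The column current charges a capacitor; each time the voltage
    reaches v_threshold the neuron fires and resets, so the count is
    the integrated charge divided by the charge per spike.
    """
    charge_per_spike = capacitance * v_threshold
    return int(current * t_read / charge_per_spike)


if __name__ == "__main__":
    # Hypothetical 2x2 array: conductances in siemens, inputs in volts.
    g = [[1e-6, 2e-6], [3e-6, 4e-6]]
    currents = crossbar_column_currents(g, [0.5, 1.0])
    print([integrate_and_fire_count(i, 1e-6, 1e-13, 0.5) for i in currents])
```

In this simplified picture the spike count is a quantized version of the column current, which is how the readout turns an analog sum into a digital number.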

To facilitate the design exploration, a circuit-level macro simulator “NeuroSim” was developed in C++ to estimate the area, latency, energy and leakage power of various neuromorphic architectures. NeuroSim provides a wide variety of design options at the circuit/device level. NeuroSim can be used alone or as a supporting module to provide circuit-level performance estimation in neural network algorithms. A 2-layer multilayer perceptron (MLP) simulator with integration of NeuroSim was demonstrated to evaluate both the learning accuracy and circuit-level performance metrics for the online learning and offline classification, as well as to study the impact of eNVM reliability issues such as data retention and write endurance on the learning performance.
Date Created
2018
Agent

Algorithm and Hardware Co-design for Learning On-a-chip

Description
Machine learning technology has made remarkable achievements in recent years. It has rivalled or exceeded human performance in many intellectual tasks, including image recognition, face detection and the game of Go. Many machine learning algorithms require a huge amount of computation, such as the multiplication of large matrices. As silicon technology has scaled to the sub-14nm regime, simply scaling down the device can no longer provide enough speed-up. New device technologies and system architectures are needed to improve the computing capacity. Hardware designed specifically for machine learning is in high demand, and effort is needed on the joint design and optimization of both hardware and algorithm.

For machine learning acceleration, traditional SRAM and DRAM based systems suffer from low capacity, high latency, and high standby power. Instead, emerging memories, such as Phase Change Random Access Memory (PRAM), Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM), and Resistive Random Access Memory (RRAM), are promising candidates that provide low standby power, high data density, fast access and excellent scalability. This dissertation proposes a hierarchical memory modeling framework and models PRAM and STT-MRAM at four different levels of abstraction. With the proposed models, various simulations are conducted to investigate performance, optimization, variability, reliability, and scalability.

Emerging memory devices such as RRAM can work as a 2-D crosspoint array to speed up the multiplication and accumulation in machine learning algorithms. This dissertation proposes a new parallel programming scheme to achieve in-memory learning with RRAM crosspoint array. The programming circuitry is designed and simulated in TSMC 65nm technology showing 900X speedup for the dictionary learning task compared to the CPU performance.

From the algorithm perspective, inspired by the high accuracy and low power of the brain, this dissertation proposes a bio-plausible feedforward inhibition spiking neural network with a Spike-Rate-Dependent-Plasticity (SRDP) learning rule. It achieves more than 95% accuracy on the MNIST dataset, which is comparable to the sparse coding algorithm, but requires far fewer computations. The role of inhibition in this network is systematically studied and shown to improve the hardware efficiency in learning.
Date Created
2017
Agent

Current Sensing Amplifier Design for RRAM Crossbar Arrays

Description
Resistive Random Access Memory (RRAM) is an emerging type of non-volatile memory technology that seeks to replace FLASH memory. The RRAM crossbar array is advantageous in its relatively small cell area and faster read latency in comparison to NAND and NOR FLASH memory; however, the crossbar array faces design challenges of its own in sneak-path currents that prevent proper reading of the data stored in the RRAM cell. The Current Sensing Amplifier is one method of reading RRAM crossbar arrays. HSpice simulations are used to find the read delays of the Current Sensing Amplifier for various sizes of RRAM crossbar arrays, as well as the largest array size that can be read accurately. It is found that arrays up to 1024x1024 are achievable with a worst-case read delay of 815ps, and it is likely that 2048x2048 arrays can also be read using the Current Sensing Amplifier. Comparing the Current Sensing Amplifier latency results with previously obtained latency results for the Voltage Sensing Amplifier shows that the Voltage Sensing Amplifier reads arrays of sizes up to 256x256 faster, while the Current Sensing Amplifier reads larger arrays faster.
Date Created
2016-12
Agent

Digital Modeling of Analog Effect Circuits

Description
While SPICE circuit simulation software gives researchers and industry accurate information regarding the behavior and characteristics of circuits, the auditory effect of SPICE circuit simulation on audio circuits is not well documented. This project takes a thoroughly analyzed and popular audio effect circuit called the Ibanez Tubescreamer and simulates its distortion effect on a .wav file in order to hear the effect of SPICE simulation. Specifically, the TS-808 schematic is drawn in the SPICE program LTSPICE and simulated using generated sinusoids and recorded .wav files. Specific components are imported using .MODEL and .SUBCKT to accurately represent the diodes, bipolar transistors, op amps, and other components in order to hear how each component affects the response. Various transient responses are extracted as .wav files and assembled as figures in order to characterize the effect of the circuit on the input. Once the actual circuit is built and debugged, the same transient analysis is applied and compared to the figures gathered from the SPICE simulation. These results are then compared, along with a subjective hearing test of the digital simulation and analog circuit, in order to test the validity of the SPICE simulations. The digital simulations reveal that the distortion follows the signature characteristics of the Ibanez Tubescreamer, which shows that SPICE simulation gives insight into the real effects of audio circuits modeled in SPICE programs. Diodes, such as Silicon, Germanium, Zener, Red LEDs and Blue LEDs, can dramatically change the waveforms and sound of the inputs within the circuit, whereas the op-amps, such as the JRC4558, TL072, and NE5532, have little to no effect on the waveforms and subjective effects on the output .wav files.

After building the circuit and listening to the analog circuit and the digital simulation, the two are audibly different but very similar in nature, indicating that the SPICE simulation can give meaningful insight into the sound of the actual analog circuit. Some of the differences can be explained by the variance of equipment and environment used in recording and playback. Since this project did not use high fidelity audio recording equipment or consistent equipment for playback, it is uncertain whether the simulation and the actual circuit could be classified as completely accurate. Further work on the project would involve recording and playing back in a constant environment and examining a wider range of specific components instead of one permutation.
Date Created
2015-12
Agent

Voltage Sense Amplifier (VSA) Design For RRAM Cross-Point Memory Array Structures

Description
RRAM is an emerging technology that looks to replace FLASH NOR and possibly NAND memory. It is attractive because it uses an adjustable resistance and does not rely on charge; at sub-10nm feature sizes this is critical. However, RRAM cross-point arrays suffer tremendously from leakage currents that prevent proper readings in larger array sizes. In this research, a selector with an exponential I-V characteristic was added to each cell to minimize this current. Using this technique, HSPICE simulations determined the largest supportable array size to be 512x512 cells with the conventional voltage sense amplifier. However, as the array size increases, the sensing latency also increases markedly due to more sneak path currents, approaching 873 ns for the 512x512 array.
Date Created
2016-05
Agent

Cu-Silica Based Programmable Metallization Cell: Fabrication, Characterization and Applications

Description
The Programmable Metallization Cell (PMC) is a novel solid-state resistive switching technology. It has a simple metal-insulator-metal “MIM” structure in which one metal is electrochemically active (Cu) and the other is inert (Pt or W); an insulating film (silica) that acts as a solid electrolyte for ion transport is sandwiched between these two electrodes. The PMC’s resistance can be altered by an external electrical stimulus. The change of resistance is attributed to the formation or dissolution of Cu metal filament(s) within the silica layer, which is associated with electrochemical redox reactions and ion transport. In this dissertation, a comprehensive study of the microfabrication method and its impact on PMC device performance is presented, the impact of gamma-ray total ionizing dose (TID) on device reliability is investigated, and the material properties of doped/undoped silica switching layers are examined by impedance spectroscopy (IS). Due to their inherent CMOS compatibility, Cu-silica PMCs have great potential to be adopted in many emerging technologies, such as non-volatile storage cells and selector cells in ultra-dense 3D crosspoint memories, as well as electronic synapses in brain-inspired neuromorphic computing. Cu-silica PMC device performance for these applications is also assessed in this dissertation.
Date Created
2017
Agent

Stochastic Learning in Oxide Binary Synaptic Device for Neuromorphic Computing

Description
Hardware implementation of neuromorphic computing is attractive as a computing paradigm beyond conventional digital computing. In this work, we show that the SET (off-to-on) transition of metal oxide resistive switching memory becomes probabilistic under a weak programming condition. The switching variability of the binary synaptic device implements a stochastic learning rule. This stochastic SET transition was statistically measured and modeled for a simulation of a winner-take-all network for competitive learning. The simulation illustrates that with such stochastic learning, the orientation classification function of input patterns can be effectively realized. The system performance metrics were compared between the conventional approach using the analog synapse and the approach in this work, which employs the binary synapse with stochastic learning. The feasibility of using binary synapses in neuromorphic computing may relax the constraints on engineering continuous multilevel intermediate states and widen the material choice for synaptic device design.
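The idea of a probabilistic SET transition acting as a learning rule can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the measured device model or the winner-take-all network from the paper: each binary synapse is a 0/1 value, a weak programming pulse is reduced to a single switching probability `p_set`, and the function name is hypothetical.

```python
import random


def stochastic_set(weights, active_inputs, p_set, rng):
    """Apply one weak SET pulse to a row of binary synapses.

    Each synapse that is OFF (0) and sits on an active input turns ON (1)
    with probability p_set; all other synapses keep their state. The
    per-device probability is an assumed stand-in for the measured
    switching statistics.
    """
    return [
        1 if (w == 0 and active and rng.random() < p_set) else w
        for w, active in zip(weights, active_inputs)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatability
    weights = [0] * 8
    pattern = [True, True, False, False, True, False, True, False]
    # Repeated weak pulses gradually imprint the input pattern.
    for _ in range(10):
        weights = stochastic_set(weights, pattern, 0.2, rng)
    print(weights)
```

Averaged over many devices or many pulses, the fraction of ON synapses on active inputs grows gradually, so a population of binary devices behaves like a slowly varying analog weight in this simplified picture.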

Date Created
2013-10-31
Agent