Cao, Yu | KEEP

Scalable surface-potential-based compact model of high-voltage LDMOS transistors

Description

Lateral Double-diffused (LDMOS) transistors are commonly used in power management, high voltage/current, and RF circuits. Their characteristics include high breakdown voltage, low on-resistance, and compatibility with standard CMOS and BiCMOS manufacturing processes. As with other semiconductor devices, an accurate and physical compact model is critical for LDMOS-based circuit design. The goal of this research work is to advance the state-of-the-art by developing a physics-based scalable compact model of LDMOS transistors. The new model, SP-HV, is constructed from a surface-potential-based bulk MOSFET model, PSP, and a nonlinear resistor model, R3. The use of independently verified and mature sub-models leads to increased accuracy and robustness of an overall LDMOS model. Improved geometry scaling and simplified statistical modeling are other useful and practical consequences of the approach. Extensions are made to both PSP and R3 for improved modeling of LDMOS devices, and one internal node is introduced to connect the two component models. The presence of the lightly-doped drift region in LDMOS transistors causes some characteristic device effects which are usually not observed in conventional MOSFETs. These include quasi-saturation, a sharp peak in transconductance at low VD, gate capacitance exceeding oxide capacitance at positive VD, negative transcapacitances CBG and CGB at positive VD, a "double-hump" IB(VG) current and expansion effects. SP-HV models these effects accurately. It also includes a scalable self-heating model which is important to model the geometry dependence of the expansion effect. SP-HV, including its scalability, is verified extensively by comparison both to TCAD simulations and experimental data. The close agreement confirms the validity of the model structure. Circuit simulation examples are presented to demonstrate its convergence.

Date Created

2012

Agent

Author (aut): Yao, Wei
Thesis advisor (ths): Gildenblat, Gennady
Committee member: Barnaby, Hugh
Committee member: Cao, Yu
Committee member: McAndrew, Colin
Publisher (pbl): Arizona State University

System level power and thermal management on embedded processors

Description

Semiconductor scaling technology has led to a sharp growth in transistor counts. This has resulted in an exponential increase on both power dissipation and heat flux (or power density) in modern microprocessors. These microprocessors are integrated as the major components in many modern embedded devices, which offer richer features and attain higher performance than ever before. Therefore, power and thermal management have become the significant design considerations for modern embedded devices. Dynamic voltage/frequency scaling (DVFS) and dynamic power management (DPM) are two well-known hardware capabilities offered by modern embedded processors. However, the power or thermal aware performance optimization is not fully explored for the mainstream embedded processors with discrete DVFS and DPM capabilities. Many key problems have not been answered yet. What is the maximum performance that an embedded processor can achieve under power or thermal constraint for a periodic application? Does there exist an efficient algorithm for the power or thermal management problems with guaranteed quality bound? These questions are hard to be answered because the discrete settings of DVFS and DPM enhance the complexity of many power and thermal management problems, which are generally NP-hard. The dissertation presents a comprehensive study on these NP-hard power and thermal management problems for embedded processors with discrete DVFS and DPM capabilities. In the domain of power management, the dissertation addresses the power minimization problem for real-time schedules, the energy-constrained make-span minimization problem on homogeneous and heterogeneous chip multiprocessors (CMP) architectures, and the battery aware energy management problem with nonlinear battery discharging model. In the domain of thermal management, the work addresses several thermal-constrained performance maximization problems for periodic embedded applications. All the addressed problems are proved to be NP-hard or strongly NP-hard in the study. Then the work focuses on the design of the off-line optimal or polynomial time approximation algorithms as solutions in the problem design space. Several addressed NP-hard problems are tackled by dynamic programming with optimal solutions and pseudo-polynomial run time complexity. Because the optimal algorithms are not efficient in worst case, the fully polynomial time approximation algorithms are provided as more efficient solutions. Some efficient heuristic algorithms are also presented as solutions to several addressed problems. The comprehensive study answers the key questions in order to fully explore the power and thermal management potentials on embedded processors with discrete DVFS and DPM capabilities. The provided solutions enable the theoretical analysis of the maximum performance for periodic embedded applications under power or thermal constraints.

Date Created

2012

Agent

Author (aut): Zhang, Sushu
Thesis advisor (ths): Chatha, Karam S
Committee member: Cao, Yu
Committee member: Konjevod, Goran
Committee member: Vrudhula, Sarma
Committee member: Xue, Guoliang
Publisher (pbl): Arizona State University

Multidimensional DFT IP generators for FPGA platforms

Description

Multidimensional (MD) discrete Fourier transform (DFT) is a key kernel algorithm in many signal processing applications, such as radar imaging and medical imaging. Traditionally, a two-dimensional (2-D) DFT is computed using Row-Column (RC) decomposition, where one-dimensional (1-D) DFTs are computed along the rows followed by 1-D DFTs along the columns. However, architectures based on RC decomposition are not efficient for large input size data which have to be stored in external memories based Synchronous Dynamic RAM (SDRAM). In this dissertation, first an efficient architecture to implement 2-D DFT for large-sized input data is proposed. This architecture achieves very high throughput by exploiting the inherent parallelism due to a novel 2-D decomposition and by utilizing the row-wise burst access pattern of the SDRAM external memory. In addition, an automatic IP generator is provided for mapping this architecture onto a reconfigurable platform of Xilinx Virtex-5 devices. For a 2048x2048 input size, the proposed architecture is 1.96 times faster than RC decomposition based implementation under the same memory constraints, and also outperforms other existing implementations. While the proposed 2-D DFT IP can achieve high performance, its output is bit-reversed. For systems where the output is required to be in natural order, use of this DFT IP would result in timing overhead. To solve this problem, a new bandwidth-efficient MD DFT IP that is transpose-free and produces outputs in natural order is proposed. It is based on a novel decomposition algorithm that takes into account the output order, FPGA resources, and the characteristics of off-chip memory access. An IP generator is designed and integrated into an in-house FPGA development platform, AlgoFLEX, for easy verification and fast integration. The corresponding 2-D and 3-D DFT architectures are ported onto the BEE3 board and their performance measured and analyzed. The results shows that the architecture can maintain the maximum memory bandwidth throughout the whole procedure while avoiding matrix transpose operations used in most other MD DFT implementations. The proposed architecture has also been ported onto the Xilinx ML605 board. When clocked at 100 MHz, 2048x2048 images with complex single-precision can be processed in less than 27 ms. Finally, transpose-free imaging flows for range-Doppler algorithm (RDA) and chirp-scaling algorithm (CSA) in SAR imaging are proposed. The corresponding implementations take advantage of the memory access patterns designed for the MD DFT IP and have superior timing performance. The RDA and CSA flows are mapped onto a unified architecture which is implemented on an FPGA platform. When clocked at 100MHz, the RDA and CSA computations with data size 4096x4096 can be completed in 323ms and 162ms, respectively. This implementation outperforms existing SAR image accelerators based on FPGA and GPU.

Date Created

2012

Agent

Author (aut): Yu, Chi-Li
Thesis advisor (ths): Chakrabarti, Chaitali
Committee member: Papandreou-Suppappola, Antonia
Committee member: Karam, Lina
Committee member: Cao, Yu
Publisher (pbl): Arizona State University

Digitally controlled average current mode buck converter

Description

During the past decade, different kinds of fancy functions are developed in portable electronic devices. This trend triggers the research of how to enhance battery lifetime to meet the requirement of fast growing demand of power in portable devices. DC-DC converter is the connection configuration between the battery and the functional circuitry. A good design of DC-DC converter will maximize the power efficiency and stabilize the power supply of following stages. As the representative of the DC-DC converter, Buck converter, which is a step down DC-DC converter that the output voltage level is smaller than the input voltage level, is the best-fit sample to start with. Digital control for DC-DC converters reduces noise sensitivity and enhances process, voltage and temperature (PVT) tolerance compared with analog control method. Also it will reduce the chip area and cost correspondingly. In battery-friendly perspective, current mode control has its advantage in over-current protection and parallel current sharing, which can form different structures to extend battery lifetime. In the thesis, the method to implement digitally average current mode control is introduced; including the FPGA based digital controller design flow. Based on the behavioral model of the close loop Buck converter with digital current control, the first FPGA based average current mode controller is burned into board and tested. With the analysis, the design metric of average current mode control is provided in the study. This will be the guideline of the parallel structure of future research.

Date Created

2011

Agent

Author (aut): Fu, Chao
Thesis advisor (ths): Bakkaloglu, Bertan
Committee member: Cao, Yu
Committee member: Vermeire, Bert
Publisher (pbl): Arizona State University

Digitally controlled DC-DC buck converters with lossless current sensing

Description

Current sensing ability is one of the most desirable features of contemporary current or voltage mode controlled DC-DC converters. Current sensing can be used for over load protection, multi-stage converter load balancing, current-mode control, multi-phase converter current-sharing, load independent control, power efficiency improvement etc. There are handful existing approaches for current sensing such as external resistor sensing, triode mode current mirroring, observer sensing, Hall-Effect sensors, transformers, DC Resistance (DCR) sensing, Gm-C filter sensing etc. However, each method has one or more issues that prevent them from being successfully applied in DC-DC converter, e.g. low accuracy, discontinuous sensing nature, high sensitivity to switching noise, high cost, requirement of known external power filter components, bulky size, etc. In this dissertation, an offset-independent inductor Built-In Self Test (BIST) architecture is proposed which is able to measure the inductor inductance and DCR. The measured DCR enables the proposed continuous, lossless, average current sensing scheme. A digital Voltage Mode Control (VMC) DC-DC buck converter with the inductor BIST and current sensing architecture is designed, fabricated, and experimentally tested. The average measurement errors for inductance, DCR and current sensing are 2.1%, 3.6%, and 1.5% respectively. For the 3.5mm by 3.5mm die area, inductor BIST and current sensing circuits including related pins only consume 5.2% of the die area. BIST mode draws 40mA current for a maximum time period of 200us upon start-up and the continuous current sensing consumes about 400uA quiescent current. This buck converter utilizes an adaptive compensator. It could update compensator internally so that the overall system has a proper loop response for large range inductance and load current. Next, a digital Average Current Mode Control (ACMC) DC-DC buck converter with the proposed average current sensing circuits is designed and tested. To reduce chip area and power consumption, a 9 bits hybrid Digital Pulse Width Modulator (DPWM) which uses a Mixed-mode DLL (MDLL) is also proposed. The DC-DC converter has a maximum of 12V input, 1-11 V output range, and a maximum of 3W output power. The maximum error of one least significant bit (LSB) delay of the proposed DPWM is less than 1%.

Date Created

2011

Agent

Author (aut): Liu, Tao
Thesis advisor (ths): Bakkaloglu, Bertan
Committee member: Ozev, Sule
Committee member: Vermeire, Bert
Committee member: Cao, Yu
Publisher (pbl): Arizona State University

Neuromorphic controller for low power systems from devices to circuits

Description

A workload-aware low-power neuromorphic controller for dynamic power and thermal management in VLSI systems is presented. The neuromorphic controller predicts future workload and temperature values based on the past values and CPU performance counters and preemptively regulates supply voltage and frequency. System-level measurements from stateof-the-art commercial microprocessors are used to get workload, temperature and CPU performance counter values. The controller is designed and simulated using circuit-design and synthesis tools. At device-level, on-chip planar inductors suffer from low inductance occupying large chip area. On-chip inductors with integrated magnetic materials are designed, simulated and fabricated to explore performance-efficiency trade offs and explore potential applications such as resonant clocking and on-chip voltage regulation. A system level study is conducted to evaluate the effect of on-chip voltage regulator employing magnetic inductors as the output filter. It is concluded that neuromorphic power controller is beneficial for fine-grained per-core power management in conjunction with on-chip voltage regulators utilizing scaled magnetic inductors.

Date Created

2011

Agent

Author (aut): Sinha, Saurabh
Thesis advisor (ths): Cao, Yu
Committee member: Bakkaloglu, Bertan
Committee member: Yu, Hongbin
Committee member: Christen, Jennifer B.
Publisher (pbl): Arizona State University

Film bulk acoustic resonators of high quality factors in liquid environments for biosensing applications

Description

Micro-electro-mechanical systems (MEMS) film bulk acoustic resonator (FBAR) demonstrates label-free biosensing capabilities and is considered to be a promising alternative of quartz crystal microbalance (QCM). FBARs achieve great success in vacuum, or in the air, but find limited applications in liquid media because squeeze damping significantly degrades quality factor (Q) and results in poor frequency resolution. A transmission-line model shows that by confining the liquid in a thickness comparable to the acoustic wavelength of the resonator, Q can be considerably improved. The devices exhibit damped oscillatory patterns of Q as the liquid thickness varies. Q assumes its maxima and minima when the channel thickness is an odd and even multiple of the quarter-wavelength of the resonance, respectively. Microfluidic channels are integrated with longitudinal-mode FBARs (L-FBARs) to realize this design; a tenfold improvement of Q over fully-immersed devices is experimentally verified. Microfluidic integrated FBAR sensors have been demonstrated for detecting protein binding in liquid and monitoring the Vroman effect (the competitive protein adsorption behavior), showing their potential as a promising bio-analytical tool. A contour-mode FBAR (C-FBAR) is developed to further improve Q and to alleviate the need for complex integration of microfluidic channels. The C-FBAR consists of a suspended piezoelectric ring made of aluminum nitride and is excited in the fundamental radial-extensional mode. By replacing the squeeze damping with shear damping, high Qs (189 in water and 77 in human whole blood) are obtained in semi-infinite depth liquids. The C-FBAR sensors are characterized by aptamer - thrombin binding pairs and aqueous glycerine solutions for mass and viscosity sensing schemes, respectively. The C-FBAR sensor demonstrates accurate viscosity measurement from 1 to 10 centipoise, and can be deployed to monitor in-vitro blood coagulation processes in real time. Results show that its resonant frequency decreases as the viscosity of the blood increases during the fibrin generation process after the coagulation cascade. The coagulation time and the start/end of the fibrin generation are quantitatively determined, showing the C-FBAR can be a low-cost, portable yet reliable tool for hemostasis diagnostics.

Date Created

2011

Agent

Author (aut): Xu, Wencheng
Thesis advisor (ths): Chae, Junseok
Committee member: Phillips, Stephen
Committee member: Cao, Yu
Committee member: Kozicki, Michael
Publisher (pbl): Arizona State University

Low-power design of a neuromorphic IC and MICS transceiver

Description

The first part describes Metal Semiconductor Field Effect Transistor (MESFET) based fundamental analog building blocks designed and fabricated in a single poly, 3-layer metal digital CMOS technology utilizing fully depletion mode MESFET devices. DC characteristics were measured by varying the power supply from 2.5V to 5.5V. The measured DC transfer curves of amplifiers show good agreement with the simulated ones with extracted models from the same process. The accuracy of the current mirror showing inverse operation is within ±15% for the current from 0 to 1.5mA with the power supply from 2.5 to 5.5V. The second part presents a low-power image recognition system with a novel MESFET device fabricated on a CMOS substrate. An analog image recognition system with power consumption of 2.4mW/cell and a response time of 6µs is designed, fabricated and characterized. The experimental results verified the accuracy of the extracted SPICE model of SOS MESFETs. The response times of 4µs and 6µs for one by four and one by eight arrays, respectively, are achieved with the line recognition. Each core cell for both arrays consumes only 2.4mW. The last part presents a CMOS low-power transceiver in MICS band is presented. The LNA core has an integrated mixer in a folded configuration. The baseband strip consists of a pseudo differential MOS-C band-pass filter achieving demodulation of 150kHz-offset BFSK signals. The SRO is used in a wakeup RX for the wake-up signal reception. The all digital frequency-locked loop drives a class AB power amplifier in a transmitter. The sensitivity of -85dBm in the wakeup RX is achieved with the power consumption of 320µW and 400µW at the data rates of 100kb/s and 200kb/s from 1.8V, respectively. The sensitivities of -70dBm and -98dBm in the data-link RX are achieved with NF of 40dB and 11dB at the data rate of 100kb/s while consuming only 600µW and 1.5mW at 1.2V and 1.8V, respectively.

Date Created

2011

Agent

Author (aut): Kim, Sung
Thesis advisor (ths): Bakkaloglu, Bertan
Committee member: Christen, Jennifer Blain
Committee member: Cao, Yu
Committee member: Thornton, Trevor
Publisher (pbl): Arizona State University

Energy-efficient DSP system design based on the Redundant Binary number system

Description

Redundant Binary (RBR) number representations have been extensively used in the past for high-throughput Digital Signal Processing (DSP) systems. Data-path components based on this number system have smaller critical path delay but larger area compared to conventional two's complement systems. This work explores the use of RBR number representation for implementing high-throughput DSP systems that are also energy-efficient. Data-path components such as adders and multipliers are evaluated with respect to critical path delay, energy and Energy-Delay Product (EDP). A new design for a RBR adder with very good EDP performance has been proposed. The corresponding RBR parallel adder has a much lower critical path delay and EDP compared to two's complement carry select and carry look-ahead adder implementations. Next, several RBR multiplier architectures are investigated and their performance compared to two's complement systems. These include two new multiplier architectures: a purely RBR multiplier where both the operands are in RBR form, and a hybrid multiplier where the multiplicand is in RBR form and the other operand is represented in conventional two's complement form. Both the RBR and hybrid designs are demonstrated to have better EDP performance compared to conventional two's complement multipliers. The hybrid multiplier is also shown to have a superior EDP performance compared to the RBR multiplier, with much lower implementation area. Analysis on the effect of bit-precision is also performed, and it is shown that the performance gain of RBR systems improves for higher bit precision. Next, in order to demonstrate the efficacy of the RBR representation at the system-level, the performance of RBR and hybrid implementations of some common DSP kernels such as Discrete Cosine Transform, edge detection using Sobel operator, complex multiplication, Lifting-based Discrete Wavelet Transform (9, 7) filter, and FIR filter, is compared with two's complement systems. It is shown that for relatively large computation modules, the RBR to two's complement conversion overhead gets amortized. In case of systems with high complexity, for iso-throughput, both the hybrid and RBR implementations are demonstrated to be superior with lower average energy consumption. For low complexity systems, the conversion overhead is significant, and overpowers the EDP performance gain obtained from the RBR computation operation.

Date Created

2011

Agent

Author (aut): Mahadevan, Rupa
Thesis advisor (ths): Chakrabarti, Chaitali
Committee member: Kiaei, Sayfe
Committee member: Cao, Yu
Publisher (pbl): Arizona State University

An analytical approach to efficient circuit variability analysis in scaled CMOS design

Description

Process variations have become increasingly important for scaled technologies starting at 45nm. The increased variations are primarily due to random dopant fluctuations, line-edge roughness and oxide thickness fluctuation. These variations greatly impact all aspects of circuit performance and pose a grand challenge to future robust IC design. To improve robustness, efficient methodology is required that considers effect of variations in the design flow. Analyzing timing variability of complex circuits with HSPICE simulations is very time consuming. This thesis proposes an analytical model to predict variability in CMOS circuits that is quick and accurate. There are several analytical models to estimate nominal delay performance but very little work has been done to accurately model delay variability. The proposed model is comprehensive and estimates nominal delay and variability as a function of transistor width, load capacitance and transition time. First, models are developed for library gates and the accuracy of the models is verified with HSPICE simulations for 45nm and 32nm technology nodes. The difference between predicted and simulated σ/μ for the library gates is less than 1%. Next, the accuracy of the model for nominal delay is verified for larger circuits including ISCAS'85 benchmark circuits. The model predicted results are within 4% error of HSPICE simulated results and take a small fraction of the time, for 45nm technology. Delay variability is analyzed for various paths and it is observed that non-critical paths can become critical because of Vth variation. Variability on shortest paths show that rate of hold violations increase enormously with increasing Vth variation.

Date Created

2011

Agent

Author (aut): Gummalla, Samatha
Thesis advisor (ths): Chakrabarti, Chaitali
Thesis advisor (ths): Cao, Yu
Committee member: Bakkaloglu, Bertan
Publisher (pbl): Arizona State University

Subscribe to Cao, Yu