Research Description

Our mission is to invent and evaluate new ideas in computer architecture. As academics, we are more interested in stimulating creative thinking than in the immediate practicality of the ideas. Of course, it is always nice to see one's ideas implemented in a product, but (a) in academia, the quality of one's PhD research should not hinge on the immediate practicality of the ideas, (b) many far-reaching ideas that were initially branded as impractical eventually had significant impact on the industry, and (c) with much of the computer industry's research having shifted to a short-term product focus, computer architects in academia need to pursue far-reaching ideas with the potential for broad, longer-term impact.

The reality is that we pursue a mix of research with shorter-term potential and efforts that are far out on the horizon. Some of this current and past work is summarized below. More detail is found in the accompanying principal publications (a more complete list of publications can be found here).

Power- and Reliability-Aware Computing

Our efforts in power-aware computing include the development of microarchitectural mechanisms to mitigate inductive noise in multi-threaded microprocessors, power-aware issue queue techniques, reducing leakage power in arithmetic units, two-level and banked register files, and multi-threaded processor fetch mechanisms to improve issue queue efficiency. (Related power-aware research in our group in Adaptive Processing, GALS microarchitectures, and Clustered Multi-Threaded processors is discussed further down on this page.)

Currently, we are investigating how to mitigate the deleterious effects of technology scaling while meeting throughput and power requirements. These effects include vulnerability to particle strikes, on-chip noise sources, aging defects, and inter- and intra-die process variations. We are devising a microarchitecture that brings together some of our prior efforts in dynamically optimized microarchitectures, yet forges new directions, to comprehensively address these issues.

Scheduling Algorithms for Unpredictably Heterogeneous CMP Architectures, J.A. Winter and D.H. Albonesi, 38th International Conference on Dependable Systems and Networks, June 2008.

Addressing Thermal Non-Uniformity in SMT Workloads, J.A. Winter and D.H. Albonesi, ACM Transactions on Architecture and Code Optimization, 2008.

Synergistic Temperature and Energy Management in GALS Processor Architectures, Y. Zhu and D.H. Albonesi, International Symposium on Low Power Electronics and Design, pp. 55-60, October 2006.

Localized Microarchitecture-Level Voltage Management, Y. Zhu and D.H. Albonesi, International Symposium on Circuits and Systems, pp. 37-40, May 2006.

Power Efficient Error Tolerance in Chip Multi-Processors, M.W. Rashid, E.J. Tan, M.C. Huang, and D.H. Albonesi, IEEE Micro, Special Issue on Reliability-Aware Microarchitectures, Vol. 25, No. 6, pp. 60-70, November/December 2005.

Exploiting Coarse-Grain Verification Parallelism for Power-Efficient Fault Tolerance, M.W. Rashid, E.J. Tan, M.C. Huang, and D.H. Albonesi, 14th International Conference on Parallel Architectures and Compilation Techniques, pp. 315-325, September 2005.

An Evaluation of a Configurable VLIW Microarchitecture for Embedded DSP Applications, W. Liu, D.H. Albonesi, J. Gostomski, L. Palum, D. Hinterberger, R. Wanzenried, and M. Indovina, Journal of Circuits, Systems, and Computers, Special Issue on VLSI Architectures for Multimedia Applications, Vol. 13, No. 6, pp. 1321-1345, December 2004.

Mitigating Inductive Noise in SMT Processors, W. El-Essawy and D.H. Albonesi, International Symposium on Low Power Electronics and Design, pp. 332-337, August 2004.

Front-End Policies for Improved Issue Efficiency in SMT Processors, A. El-Moursy and D.H. Albonesi, 9th International Symposium on High-Performance Computer Architecture, pp. 31-40, February 2003.

Managing Static Leakage Energy in Microprocessor Functional Units, S. Dropsho, V. Kursun, D.H. Albonesi, S. Dwarkadas, and E.G. Friedman, 35th International Symposium on Microarchitecture, pp. 321-332, November 2002.

An Oldest-First Selection Logic Implementation for Non-Compacting Issue Queues, A. Buyuktosunoglu, A. El-Moursy, and D.H. Albonesi, 15th International ASIC/SOC Conference, pp. 31-35, September 2002.

Tradeoffs in Power-Efficient Issue Queue Design, A. Buyuktosunoglu, D.H. Albonesi, P. Bose, P. Cook, S. Schuster, International Symposium on Low Power Electronics and Design, pp. 184-189, August 2002.

A Microarchitectural-Level Step-Power Analysis Tool, W. El-Essawy, D.H. Albonesi, and B. Sinharoy, International Symposium on Low Power Electronics and Design, pp. 263-266, August 2002.

Power-Efficient Issue Queue Design, A. Buyuktosunoglu, D.H. Albonesi, S. Schuster, D. Brooks, P. Bose, P. Cook, in Power Aware Computing, R. Graybill and R. Melhem (Eds), Kluwer Academic Publishers, Chapter 3, pp. 37-60, 2002.

Reducing the Complexity of the Register File in Dynamic Superscalar Processors, R. Balasubramonian, S. Dwarkadas, and D.H. Albonesi, 34th International Symposium on Microarchitecture, pp. 237-248, December 2001.

Application of Silicon Photonics to Microarchitecture

Although photonic devices have long been touted as potentially superior to traditional electrical interconnects in computer systems, the lack of CMOS-compatible devices has impeded progress in the area. However, considerable device-level innovation in the past several years is bringing a CMOS-compatible photonic interconnect system closer to reality. While inter-chip interconnects will be the near-term application of this technology, our focus is on silicon photonics for on-chip interconnects in microprocessors and memories. Our project takes an integrated approach that spans devices, integrated circuits, and microarchitectures.

On-Chip Optical Interconnects: Challenges and Critical Directions, G. Chen, H. Chen, M. Haurylau, N.A. Nelson, D.H. Albonesi, P.M. Fauchet, and E.G. Friedman, Proceedings of the European Optical Society Topical Meeting on Optical Microsystems, p. 97, October 2007.

On-Chip Optical Interconnect for Reduced Delay Uncertainty, G. Chen, H. Chen, M. Haurylau, N.A. Nelson, D.H. Albonesi, P.M. Fauchet, and E.G. Friedman, Proceedings of Nano-Net, September 2007.

On-chip Optical Technology in Future Bus-based Multicore Designs: Opportunities and Challenges, N. Kırman, M. Kırman, R.K. Dokania, J. Martínez, A.B. Apsel, M.A. Watkins, and D.H. Albonesi, IEEE Micro, Special Issue on the Top Picks from Microarchitecture Conferences, Vol. 27, No. 1, January/February 2007.

On-Chip Optical Interconnect Roadmap: Challenges and Critical Directions, M. Haurylau, G. Chen, H. Chen, J. Zhang, N.A. Nelson, D.H. Albonesi, E.G. Friedman, and P.M. Fauchet, IEEE Journal of Selected Topics in Quantum Electronics, Special Issue on Silicon Photonics, Vol. 12, No. 6, pp. 1699-1705, November/December 2006.

Leveraging Optical Technology in Future Bus-based Chip Multiprocessors, N. Kırman, M. Kırman, R.K. Dokania, J. Martínez, A.B. Apsel, M.A. Watkins, and D.H. Albonesi, 39th International Symposium on Microarchitecture, December 2006.

On-Chip Copper-Based vs. Optical Interconnects: Delay Uncertainty, Latency, Power, and Bandwidth Density Comparative Predictions, G. Chen, H. Chen, M. Haurylau, N.A. Nelson, D.H. Albonesi, P.M. Fauchet, and E.G. Friedman, IEEE International Interconnect Technology Conference, pp. 39-41, June 2006.

On-chip Optical Interconnect Roadmap: Challenges and Critical Directions, M. Haurylau, H. Chen, J. Zhang, G. Chen, N.A. Nelson, D.H. Albonesi, E.G. Friedman, and P.M. Fauchet, 2nd International Group IV Photonics Conference, pp. 17-19, September 2005.

Electrical and Optical On-Chip Interconnects in Scaled Microprocessors, G. Chen, H. Chen, M. Haurylau, N. Nelson, D.H. Albonesi, P.M. Fauchet, and E.G. Friedman, International Symposium on Circuits and Systems, pp. 2514-2517, May 2005.

Predictions of CMOS Compatible On-Chip Optical Interconnect, G. Chen, H. Chen, M. Haurylau, N. Nelson, P.M. Fauchet, E.G. Friedman, and D.H. Albonesi, 7th International Workshop on System Level Interconnect Prediction, pp. 13-20, April 2005.

Alleviating Thermal Constraints while Maintaining Performance Via Silicon-Based On-Chip Optical Interconnects, N. Nelson, G. Briggs, M. Haurylau, G. Chen, H. Chen, D.H. Albonesi, E.G. Friedman, and P.M. Fauchet, Workshop on Unique Chips and Systems, March 2005.

Adaptive Processing

Applications go through phases of execution in which their fundamental characteristics may vary widely. Conventional microprocessors are fixed at design time and therefore are inevitably a compromise: a particular design may be best overall for some given workload, but for any given application, or even a phase of an application, a different microarchitecture is often preferable in terms of performance and power dissipation.

In adaptive processing, major microprocessor resources are dynamically tuned during execution to better match varying phase behavior. This is in contrast to the fixed resources supplied at design time in a conventional microprocessor, and to common techniques that turn off idle sections of a processor. By presenting the application with the right amount of hardware at the right time, either performance can be improved (by dynamically trading off frequency for instructions per cycle), or power can be saved (by disabling portions of hardware structures that are not needed in each phase).
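The trade-off above can be made concrete with a small sketch. It treats throughput as instructions per cycle times clock frequency and picks, for each phase, either the fastest configuration or the smallest one that stays close to the fastest. The configuration names, frequencies, IPC values, and the `max_slowdown` threshold are all hypothetical, chosen only for illustration; this is not the selection logic from any of the publications listed below.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    freq_ghz: float         # clock frequency this configuration can sustain (assumed)
    active_fraction: float  # fraction of the hardware structure left enabled (assumed)


def throughput(cfg, ipc_by_config):
    """Billions of instructions per second: IPC times frequency in GHz."""
    return ipc_by_config[cfg.name] * cfg.freq_ghz


def best_for_performance(configs, ipc_by_config):
    """Trade frequency against IPC: keep the configuration with the highest product."""
    return max(configs, key=lambda c: throughput(c, ipc_by_config))


def best_for_power(configs, ipc_by_config, max_slowdown=0.02):
    """Enable as little hardware as possible while staying within
    max_slowdown of the best achievable throughput for this phase."""
    best = max(throughput(c, ipc_by_config) for c in configs)
    acceptable = [
        c for c in configs
        if throughput(c, ipc_by_config) >= (1.0 - max_slowdown) * best
    ]
    return min(acceptable, key=lambda c: c.active_fraction)


# Hypothetical configurations: smaller structures clock faster.
CONFIGS = [
    Config("small", 2.0, 0.25),
    Config("medium", 1.8, 0.50),
    Config("large", 1.6, 1.00),
]

# Hypothetical per-phase IPC measurements for each configuration.
PHASE_A = {"small": 1.50, "medium": 1.55, "large": 1.60}  # small working set
PHASE_B = {"small": 0.80, "medium": 1.20, "large": 1.50}  # large working set
```

With these made-up numbers, the phase with the small working set runs best on the small, fast configuration, while the phase with the large working set prefers the large, slower one. That is the sense in which a single fixed design is a compromise.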

Dynamic Capacity-Speed Tradeoffs in SMT Processor Caches, S. López, S. Dropsho, D.H. Albonesi, O. Garnica, and J. Lanchares, International Conference on High Performance Embedded Architectures and Compilers, January 2007.

Dynamically Trading Frequency for Complexity in a GALS Microprocessor, S. Dropsho, G. Semeraro, D.H. Albonesi, G. Magklis, and M.L. Scott, 37th International Symposium on Microarchitecture, pp. 157-168, December 2004.

Dynamically Tuning Processor Resources with Adaptive Processing, D.H. Albonesi, R. Balasubramonian, S.G. Dropsho, S. Dwarkadas, E.G. Friedman, M.C. Huang, V. Kursun, G. Magklis, M.L. Scott, G. Semeraro, P. Bose, A. Buyuktosunoglu, P.W. Cook, and S.E. Schuster, IEEE Computer, Special Issue on Power-Aware Computing, Vol. 36, No. 12, pp. 49-58, December 2003.

A Dynamically Tunable Memory Hierarchy, R. Balasubramonian, D.H. Albonesi, A. Buyuktosunoglu, and S. Dwarkadas, IEEE Transactions on Computers, pp. 1243-1258, October 2003.

Energy Efficient Co-Adaptive Instruction Fetch and Issue, A. Buyuktosunoglu, T. Karkhanis, D.H. Albonesi, and P. Bose, 30th International Symposium on Computer Architecture, pp. 147-156, June 2003.

Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power, S. Dropsho, A. Buyuktosunoglu, R. Balasubramonian, D.H. Albonesi, S. Dwarkadas, G. Semeraro, G. Magklis, and M.L. Scott, 11th International Conference on Parallel Architectures and Compilation Techniques, pp. 141-152, September 2002.

Dynamically Allocating Processor Resources Between Nearby and Distant ILP, R. Balasubramonian, S. Dwarkadas, and D.H. Albonesi, 28th International Symposium on Computer Architecture, pp. 26-37, June 2001.

A Circuit Level Implementation of an Adaptive Issue Queue for Power-Aware Microprocessors, A. Buyuktosunoglu, S. Schuster, D. Brooks, P. Bose, P. Cook, and D.H. Albonesi, 11th Great Lakes Symposium on VLSI, pp. 73-78, March 2001.

Memory Hierarchy Reconfiguration for Energy and Performance in General-Purpose Processor Architectures, R. Balasubramonian, D.H. Albonesi, A. Buyuktosunoglu, and S. Dwarkadas, 33rd International Symposium on Microarchitecture, pp. 245-257, December 2000.

An Adaptive Issue Queue for Reduced Power at High Performance, A. Buyuktosunoglu, S. Schuster, D. Brooks, P. Bose, P. Cook, and D.H. Albonesi, Workshop on Power-Aware Computer Systems, held at the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, November 2000. Also appears in Springer-Verlag Lecture Notes in Computer Science, Volume 2008.

Selective Cache Ways: On-Demand Cache Resource Allocation, D.H. Albonesi, 32nd International Symposium on Microarchitecture, pp. 248-259, November 1999.

A Methodology for the Analysis of Dynamic Application Parallelism and Its Application to Reconfigurable Computing, B. Xu and D.H. Albonesi, SPIE International Conference on Reconfigurable Technology: FPGAs for Computing and Applications, pp. 78-86, September 1999.

Dynamic IPC/Clock Rate Optimization, D.H. Albonesi, 25th International Symposium on Computer Architecture, pp. 282-292, June 1998.

The Inherent Energy Efficiency of Complexity-Adaptive Processors, D.H. Albonesi, 1998 Power-Driven Microarchitecture Workshop, held at the 25th International Symposium on Computer Architecture, pp. 107-112, June 1998.

GALS Microarchitectures

In a Globally Asynchronous, Locally Synchronous (GALS) system, the design is divided into several different domains, each with its own independent clock generation and distribution system. The potential benefits of GALS include reduced clock skew and overhead, and the potential to better tolerate process variations, a growing concern in the nanoscale regime.

Our research explores the application of a GALS design methodology to each core of a multi-core microprocessor, and within each processor core itself. In terms of the latter, we have devised algorithms for general purpose Dynamic Voltage Scaling (DVS) within our Multiple Clock Domain (MCD) processor design. In MCD, domains that are off the critical execution path can be slowed down (either under hardware or software control) to save energy without undue performance loss. This localized DVS approach applies to a wide range of applications. We have also explored the use of loop fusion within MCD for energy savings, and devised a more complexity-effective version of MCD that achieves better energy efficiency with simplified hardware. Current directions include multi-threaded MCD microarchitectures and localized DVS for temperature management using the MCD microarchitecture.
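As a minimal sketch of the localized DVS idea, the controller below uses the occupancy of a domain's input queue as a stand-in for whether that domain is on the critical path: a mostly empty queue suggests the domain can be slowed, and a mostly full one suggests it should be sped up. This is a generic threshold heuristic written for illustration, not the specific hardware or software algorithms from the papers listed below. The thresholds, step size, and frequency range are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    freq_mhz: float
    f_min: float = 250.0   # assumed lower frequency bound
    f_max: float = 1000.0  # assumed upper frequency bound


def adjust(domain, queue_occupancy, low=0.25, high=0.75, step=0.05):
    """Run one control interval for a single clock domain.

    queue_occupancy is the average fill level (0.0 to 1.0) of the
    domain's input queue over the interval. The frequency is scaled
    by +/- step and clamped to [f_min, f_max]; in between the two
    thresholds it is left unchanged.
    """
    if queue_occupancy > high:
        # Work is backing up: the domain is likely a bottleneck.
        domain.freq_mhz = min(domain.f_max, domain.freq_mhz * (1.0 + step))
    elif queue_occupancy < low:
        # The domain is mostly idle: slow it down to save energy.
        domain.freq_mhz = max(domain.f_min, domain.freq_mhz * (1.0 - step))
    return domain.freq_mhz


# Example: a floating-point domain that sees little work is stepped down.
fp_domain = Domain("floating-point", 1000.0)
for _ in range(3):
    adjust(fp_domain, queue_occupancy=0.10)
```

In a real design, voltage would be lowered along with frequency, which is where most of the energy saving comes from; the sketch tracks frequency only.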

Synergistic Temperature and Energy Management in GALS Processor Architectures, Y. Zhu and D.H. Albonesi, International Symposium on Low Power Electronics and Design, pp. 55-60, October 2006.

Localized Microarchitecture-Level Voltage Management, Y. Zhu and D.H. Albonesi, International Symposium on Circuits and Systems, pp. 37-40, May 2006.

A High Performance, Energy Efficient, GALS Processor Microarchitecture with Reduced Implementation Complexity, Y. Zhu, D.H. Albonesi, and A. Buyuktosunoglu, International Symposium on Performance Analysis of Systems and Software, pp. 42-53, March 2005.

Dynamically Trading Frequency for Complexity in a GALS Microprocessor, S. Dropsho, G. Semeraro, D.H. Albonesi, G. Magklis, and M.L. Scott, 37th International Symposium on Microarchitecture, pp. 157-168, December 2004.

The Energy Impact of Aggressive Loop Fusion, Y. Zhu, G. Magklis, M.L. Scott, C. Ding, and D.H. Albonesi, 13th International Conference on Parallel Architectures and Compilation Techniques, pp. 153-164, September 2004.

Hiding Synchronization Delays in a GALS Processor Microarchitecture, G. Semeraro, D.H. Albonesi, G. Magklis, M.L. Scott, S.G. Dropsho, and S. Dwarkadas, 10th International Symposium on Asynchronous Circuits and Systems, pp. 159-169, April 2004.

Dynamic Frequency and Voltage Scaling for a Multiple-Clock-Domain Microprocessor, G. Magklis, G. Semeraro, D.H. Albonesi, S.G. Dropsho, S. Dwarkadas, and M.L. Scott, IEEE Micro, Special Issue on the Top Picks from Microarchitecture Conferences, Vol. 23, No. 6, pp. 62-68, November/December 2003.

Profile-based Dynamic Voltage and Frequency Scaling for a Multiple Clock Domain Microprocessor, G. Magklis, M.L. Scott, G. Semeraro, D.H. Albonesi, and S. Dropsho, 30th International Symposium on Computer Architecture, pp. 14-25, June 2003.

Dynamic Frequency and Voltage Control for a Multiple Clock Domain Microarchitecture, G. Semeraro, D.H. Albonesi, S.G. Dropsho, G. Magklis, S. Dwarkadas, and M.L. Scott, 35th International Symposium on Microarchitecture, pp. 356-367, November 2002.

Energy Efficient Processor Design Using Multiple Clock Domains with Dynamic Voltage and Frequency Scaling, G. Semeraro, G. Magklis, R. Balasubramonian, D.H. Albonesi, S. Dwarkadas, and M.L. Scott, 8th International Symposium on High-Performance Computer Architecture, pp. 29-40, February 2002.

Clustered Multi-Threaded Microprocessors

In order to extract both instruction-level parallelism (ILP) and thread-level parallelism (TLP) in a multi-threaded processor core, complex hardware resources are required. The potential ramifications of this increased complexity are reduced clock frequency, reduced throughput (due to the need to overpipeline to maintain frequency), increased power dissipation, and difficulty in scaling the design to a new process technology. In a Clustered Multi-Threaded (CMT) microarchitecture, the core is divided into smaller, more scalable, clusters, with communication paths introduced between them. Instructions from different threads are assigned to clusters according to a steering algorithm implemented in the front-end of the machine.
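To illustrate what a steering decision involves, here is a small sketch of one generic policy: send an instruction to the cluster that produces most of its source operands, which avoids inter-cluster communication, unless that cluster is much more heavily loaded than the least loaded one. The policy, the function name `steer`, and the `imbalance_limit` parameter are assumptions made for this example and are not the steering mechanisms evaluated in the publications below.

```python
def steer(thread_id, sources, producer_cluster, cluster_load, imbalance_limit=4):
    """Choose a cluster index for one instruction.

    thread_id:        hardware thread the instruction belongs to
    sources:          source register names of the instruction
    producer_cluster: dict mapping (thread_id, register) to the cluster
                      holding the in-flight producer of that register
    cluster_load:     list of queued-instruction counts, one per cluster
    """
    least_loaded = min(range(len(cluster_load)), key=cluster_load.__getitem__)

    # Count how many source operands each cluster will produce.
    votes = {}
    for reg in sources:
        cluster = producer_cluster.get((thread_id, reg))
        if cluster is not None:
            votes[cluster] = votes.get(cluster, 0) + 1

    if votes:
        # Prefer the cluster with the most producers; break ties by lower load.
        preferred = max(votes, key=lambda c: (votes[c], -cluster_load[c]))
        if cluster_load[preferred] - cluster_load[least_loaded] <= imbalance_limit:
            return preferred

    # No in-flight producers, or the preferred cluster is overloaded.
    return least_loaded
```

Keying the producer table by thread as well as register keeps the threads' register namespaces separate, so an instruction is only drawn toward producers from its own thread.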

Our early work in clustered microarchitectures for single-threaded machines explored dynamically trading off communication and parallelism on an application phase basis. More recently, we have demonstrated that CMT processors with efficient multi-threaded steering mechanisms can achieve almost all of the cycle-level performance of very complex monolithic multi-threaded cores, with a significant reduction in power consumption. We've also shown how a multi-core design of CMT processor cores is an extremely attractive design option for the future. Our most recent work exploits the built-in steering mechanisms of CMTs for thermal management.

Addressing Thermal Non-Uniformity in SMT Workloads, J.A. Winter and D.H. Albonesi, ACM Transactions on Architecture and Code Optimization, 2008.

Compatible Phase Co-Scheduling on a CMP of Multi-Threaded Processors, A. El-Moursy, R. Garg, D.H. Albonesi, and S. Dwarkadas, 20th International Parallel and Distributed Processing Symposium, April 2006.

Partitioning Multi-Threaded Processors with a Large Number of Threads, A. El-Moursy, R. Garg, D.H. Albonesi, and S. Dwarkadas, International Symposium on Performance Analysis of Systems and Software, pp. 112-123, March 2005.

Dynamically Matching ILP Characteristics Via a Heterogeneous Clustered Microarchitecture, L. Chen, D.H. Albonesi, and S. Dropsho, IBM Watson Conference on the Interaction Between Architecture, Circuits, and Compilers, pp. 136-143, October 2004.

Dynamically Managing the Communication-Parallelism Trade-off in Future Clustered Processors, R. Balasubramonian, S. Dwarkadas, and D.H. Albonesi, 30th International Symposium on Computer Architecture, pp. 275-286, June 2003.

Dynamic Data Dependence Tracking

Many microprocessor optimizations rely on precise information regarding dependences among instructions in the pipeline, yet this information is not readily available. We have developed an efficient mechanism for dynamic data dependence tracking among all the in-flight instructions in the machine, and shown how this information can be used to significantly improve branch prediction accuracy, and guide the steering mechanism in a heterogeneous clustered microarchitecture.
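One way to picture such tracking is with a bit vector per register, where bit i is set if the register's pending value depends, directly or transitively, on the in-flight instruction in slot i. The sketch below is a simplified software model under that assumption; the class and method names are hypothetical, and it should not be read as the hardware mechanism described in the publications below.

```python
class DependenceTracker:
    """Toy model of transitive dependence tracking among in-flight instructions."""

    def __init__(self):
        # register name -> bitmask of in-flight slots its value depends on
        self.reg_deps = {}

    def dispatch(self, slot, dest, sources):
        """Record a new in-flight instruction occupying `slot`.

        Returns the bitmask of in-flight slots that the instruction
        depends on through its source registers.
        """
        mask = 0
        for reg in sources:
            mask |= self.reg_deps.get(reg, 0)
        if dest is not None:
            # The new value depends on its sources' chains plus this instruction.
            self.reg_deps[dest] = mask | (1 << slot)
        return mask

    def retire(self, slot):
        """Clear a slot's bit everywhere once its instruction leaves the window."""
        keep = ~(1 << slot)
        for reg in self.reg_deps:
            self.reg_deps[reg] &= keep


def depends_on(mask, slot):
    """True if the dependence mask includes the instruction in `slot`."""
    return bool(mask & (1 << slot))


# Example: a short chain r1 -> r2 -> r3 feeding a branch (no destination).
tracker = DependenceTracker()
tracker.dispatch(0, "r1", [])
tracker.dispatch(1, "r2", ["r1"])
tracker.dispatch(2, "r3", ["r2"])
branch_mask = tracker.dispatch(3, None, ["r3", "r4"])
```

Here `branch_mask` has bits 0, 1, and 2 set, so a consumer such as a branch predictor or a steering mechanism could ask which in-flight instructions feed the branch with a single mask test.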

Dynamically Matching ILP Characteristics Via a Heterogeneous Clustered Microarchitecture, L. Chen, D.H. Albonesi, and S. Dropsho, IBM Watson Conference on the Interaction Between Architecture, Circuits, and Compilers, pp. 136-143, October 2004.

Dynamic Data Dependence Tracking and its Application to Branch Prediction, L. Chen, S. Dropsho, and D.H. Albonesi, 9th International Symposium on High-Performance Computer Architecture, pp. 65-76, February 2003.

PhD Student Researchers

Current

Mark Cianchetti
Paula Petrica (co-advised with Rajit Manohar)
Basit Sheikh (co-advised with Rajit Manohar)
Matt Watkins
Jonathan Winter

Graduated

Yongkang Zhu, "Hardware and Software Optimizations for Multiple Clock Domain Microprocessors," Department of Electrical and Computer Engineering, University of Rochester, September 2005. First employment: Microsoft Corporation.

Ali El-Moursy, "Highly Efficient Multithreaded Architecture," Department of Electrical and Computer Engineering, University of Rochester, August 2005 (unofficial co-advisor: Sandhya Dwarkadas). First employment: Intel Corporation.

Wael El-Essawy, "Architectural Level Analysis and Mitigation of Inductive Noise in Simultaneous Multi-Threaded Processors," Department of Electrical and Computer Engineering, University of Rochester, October 2004. First employment: IBM Austin Research Laboratory.

Lei Chen, "Dynamic Data Dependence Tracking and its Applications," Department of Electrical and Computer Engineering, University of Rochester, October 2004. First employment: IBM Austin Research Laboratory.

Greg Semeraro, "Multiple Clock Domain Microarchitecture Design and Analysis," Department of Electrical and Computer Engineering, University of Rochester, October 2003. First employment: Rochester Institute of Technology.

Rajeev Balasubramonian, "Dynamic Management of Microprocessor Resources in Future Microprocessors," Department of Computer Science, University of Rochester, August 2003 (unofficial co-advisor; advisor: Sandhya Dwarkadas). First employment: University of Utah.

Alper Buyuktosunoglu, "Power-Efficient Issue Queue Design," Department of Electrical and Computer Engineering, University of Rochester, June 2003. First employment: IBM T.J. Watson Research Center.

Faculty and Industry Collaborators

(principal people with whom I have co-authored papers or joint grants, and corporate liaisons)

Cornell

Alyssa Apsel
Rajit Manohar
José Martínez
Sally McKee

University of Rochester

Steve Dropsho
Sandhya Dwarkadas
Philippe Fauchet
Eby Friedman
Michael Huang
Michael Scott

Universidad Complutense de Madrid

Oscar Garnica
Juan Lanchares

IBM

Pradip Bose
David Brooks (while at IBM before joining Harvard)
Alper Buyuktosunoglu (former PhD student but we still collaborate)
Peter Cook
Stan Schuster
Balaram Sinharoy

Intel

George Cai
Doug Carmean
Steve Gunther
Grant McFarland

Improv Systems

John Gostomski
Dave Hinterberger
Mark Indovina
Lloyd Palum
Rick Wanzenried

Funding

National Science Foundation
DARPA-IPTO
IBM Corporation
Improv Systems
Intel Corporation
Center for Circuit and System Solutions (C2S2)
Center for Electronic Imaging Systems