Dynamically Adapting Page Migration Policies Based on Applications' Memory Access Behaviors
Numerous studies have examined heterogeneous memory systems composed of faster DRAM (e.g., 3D-stacked HBM or HMC) and slower non-volatile memories (e.g., PCM, STT-RAM). However, most of these studies focused on static policies for managing data placement and migration among the different memory devices. These policies are based on the average behavior across a range of applications. Results show that these techniques do not always yield higher performance than systems that do not migrate data across the devices: some applications show performance gains, but others show performance losses. Offline analyses can identify which applications benefit from page migration (migration-friendly) so that page migration is used only with those applications. However, we observed that several applications exhibit both migration-friendly and migration-unfriendly behaviors during different phases of execution, which supports the need for adaptive page migration techniques. We introduce and evaluate techniques that dynamically adapt to the behavior of applications and reduce, increase, or even halt migrations. Our adaptive techniques show performance gains for both migration-friendly workloads (an average of 81% over no migrations) and migration-unfriendly workloads (an average of 3%). By contrast, previous migration techniques resulted in performance losses for unfriendly workloads.
ACM Journal on Emerging Technologies in Computing Systems (JETC) – Association for Computing Machinery
Published: Mar 24, 2021
Keywords: Heterogeneous memory systems