JHPCN special session
Program
- Session-1 (13:30∼15:00, Chair: Takeshi Nanri (Kyushu University))
- Research on Massively Parallel Space Plasma Particle Simulations
Development of Scalable Plasma Particle Simulator with OhHelp Dynamic Load Balancer
Yohei Miyake (三宅 洋平) (Kobe University)
abstract
- Seismic Hazard Assessment Using GPGPU
Application of GPGPU to Seismic Hazard Assessment
Shin Aoi (青井 真) (National Research Institute for Earth Science and Disaster Prevention)
abstract
- Practical Application Study of Multi-level Precision Compressed Numerical Records (JHPCN-DF) for Efficient VR Visualization and Analysis of Large-Scale Data
Study of Efficient Data Compression by JHPCN-DF
Katsumi Hagita (萩田 克美) (National Defense Academy of Japan)
abstract
- Session-2 (15:15∼16:15, Chair: Yoshiki Sato (Univ. of Tokyo))
- Construction of a Large-Scale Distributed Design Exploration Framework through Coordination of Supercomputers and Inter-Clouds
Building Large-Scale Distributed Design Exploration Framework by Collaborating Supercomputers and Inter-Cloud Systems
Masaharu Munetomo (棟朝 雅晴) (Hokkaido University)
abstract
- Research and Development Toward Lightweight Virtual Machine Environments for Next-Generation Supercomputers
Toward Lightweight Virtual Machine Environments for Next-Generation Supercomputers
Takahiro Shinagawa (品川高廣) (Univ. of Tokyo)
abstract
Abstracts
- (1-1)
Development of Scalable Plasma Particle Simulator with OhHelp Dynamic Load Balancer
Yohei Miyake, Kobe University
The particle-in-cell (PIC) simulation is widely used to study micro-
and meso-scale plasma processes. Because the simulation must process a
huge number of plasma particles, the computation should be
parallelized both for efficient execution and for feasible
implementation on distributed-memory systems. We have proposed a
scalable load-balancing method named OhHelp for this class of
simulation. Although the code parallelization is based on simple block
domain decomposition, OhHelp achieves load balance, and thus
scalability in the number of particles, by making each computation
node help another, heavily loaded node. We applied OhHelp to various
production-level PIC simulators and evaluated their performance on
modern distributed-memory supercomputers. The OhHelp-enabled
simulators showed good scalability at thousand-way parallelism for
both uniform and congested particle distributions. We review our
recent challenges in HPC programming of PIC simulations and their
applications to large-scale practical problems in space plasma physics
and engineering. We also discuss the prospects for our further HPC
work toward the post-peta- and exascale era, such as optimizations for
emerging many-core architectures.
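The abstract does not detail OhHelp's algorithm, but the core idea it
states, each node helping another heavily loaded node, can be sketched
as follows. This is a hypothetical Python illustration of helper-style
rebalancing, not the published OhHelp method:

    # Hypothetical sketch of helper-style particle load balancing (not the
    # published OhHelp algorithm): overloaded domain blocks delegate their
    # excess particles to underloaded "helper" blocks so that every node
    # ends up near the average particle count.
    def assign_helpers(particle_counts):
        """Return a list of (overloaded, helper, n_particles) transfers."""
        n_nodes = len(particle_counts)
        average = sum(particle_counts) / n_nodes
        surplus = [(i, c - average) for i, c in enumerate(particle_counts) if c > average]
        deficit = [(i, average - c) for i, c in enumerate(particle_counts) if c < average]
        transfers = []
        for over, extra in surplus:
            while extra > 1 and deficit:
                helper, room = deficit[-1]
                moved = min(extra, room)
                transfers.append((over, helper, int(moved)))
                extra -= moved
                if room - moved <= 1:
                    deficit.pop()
                else:
                    deficit[-1] = (helper, room - moved)
        return transfers

    # Example: node 2 is congested, so nodes 3 and 0 act as helpers.
    print(assign_helpers([100, 400, 900, 200]))

Running the example prints [(2, 3, 200), (2, 0, 300)]: the congested
node 2 delegates particles to nodes 3 and 0, while every node keeps
its own domain block, as in simple block decomposition.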
- (1-2)
Application of GPGPU to Seismic Hazard Assessment
Shin Aoi, National Research Institute for Earth Science and Disaster Prevention
High-accuracy, large-scale ground motion simulation is required for
seismic hazard assessment, and the three-dimensional finite-difference
method (3-D FDM) is one of its key techniques. We have developed a
GPGPU simulation code in CUDA based on the solver of GMS (Ground
Motion Simulator), a complete system for seismic wave propagation
simulation based on 3-D FDM with a discontinuous grid. The
computational model is decomposed in the two horizontal directions,
and each subdomain is assigned to a different GPU. To achieve high GPU
performance, we proposed a new, efficient concealing technique that
avoids discontinuous memory accesses. A weak-scaling test of the
multi-GPU code showed almost perfectly linear performance up to 1,024
GPUs, where the model contained about 22 billion grid points. Finally,
we performed a long-period ground motion simulation of a Nankai Trough
earthquake using the multi-GPU code.
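As a rough illustration of the decomposition described above (model
split in the two horizontal directions, one subdomain per GPU), here
is a minimal NumPy sketch; the grid sizes and the 4x4 GPU layout are
invented for illustration, and the GMS solver itself is not shown:

    import numpy as np

    # Hypothetical 3-D FDM grid (nz, ny, nx); sizes are illustrative only.
    nz, ny, nx = 32, 512, 512
    field = np.zeros((nz, ny, nx), dtype=np.float32)

    # Decompose in the two horizontal directions (y and x) over a 4x4 GPU
    # layout, as in the abstract; each subdomain would go to one GPU.
    py, px = 4, 4
    sub_ny, sub_nx = ny // py, nx // px

    subdomains = {}
    for j in range(py):
        for i in range(px):
            # ascontiguousarray packs the strided slice into contiguous
            # memory, the kind of packing a GPU transfer needs in order
            # to avoid discontinuous memory accesses.
            subdomains[(j, i)] = np.ascontiguousarray(
                field[:, j * sub_ny:(j + 1) * sub_ny,
                         i * sub_nx:(i + 1) * sub_nx])

    print(len(subdomains), subdomains[(0, 0)].shape)  # 16 (32, 128, 128)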
- (1-3)
Study of Efficient Data Compression by JHPCN-DF
Katsumi Hagita, National Defense Academy of Japan
In high-performance computing (HPC), data transfer and storage are
significant problems. We have therefore proposed JHPCN-DF (Jointed
Hierarchical Precision Compression Number - Data Format), an efficient
data compression method that uses bit segmentation and Huffman
coding. When common application programming interfaces (APIs) that
support data compression, such as HDF5, are used, no changes to those
APIs are required to adopt JHPCN-DF. For visualization and analysis,
the lower bits of IEEE 754 floating-point values are considered
unnecessary, so the compressed data size is reduced by setting these
lower bits to 0. The number of bits that must be retained can be
determined from direct numerical computations and the corresponding
visualizations and analyses. In this study, we evaluated the size of
HDF5 files in the JHPCN-DF framework for various types of simulation:
the electromagnetic particle-in-cell (PIC) plasma model,
electromagnetic field simulations based on the finite-difference
time-domain (FDTD) technique and the finite element method (FEM), and
simulations of phase-separated polymer materials. We confirmed that
visualization software works well with HDF5 files produced by the
JHPCN-DF framework. For visualization, the HDF5 file size is reduced
to about one fifth of the original. JHPCN-DF is useful for a wide
range of applications in data science and simulation.
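The lower-bit truncation idea is easy to demonstrate. The following
Python sketch zeroes the low mantissa bits of float32 data and then
applies DEFLATE (whose entropy stage is Huffman coding) as a stand-in
for JHPCN-DF's own hierarchical format; the retained bit count is an
arbitrary example value:

    import zlib
    import numpy as np

    # Minimal sketch of the lower-bit truncation idea behind JHPCN-DF
    # (illustrative only; the actual format segments the retained and
    # truncated bits hierarchically before Huffman coding).
    def truncate_lower_bits(values, keep_mantissa_bits):
        """Zero the low mantissa bits of float32 values (IEEE 754)."""
        drop = 23 - keep_mantissa_bits      # float32 has a 23-bit mantissa
        mask = np.uint32((0xFFFFFFFF >> drop) << drop)
        bits = values.view(np.uint32)
        return (bits & mask).view(np.float32)

    rng = np.random.default_rng(0)
    data = rng.standard_normal(1_000_000).astype(np.float32)

    raw = zlib.compress(data.tobytes())     # DEFLATE = LZ77 + Huffman
    cut = zlib.compress(truncate_lower_bits(data, 8).tobytes())
    print(f"original: {len(raw)} bytes, truncated: {len(cut)} bytes")

Because the zeroed bits compress to almost nothing, the truncated
stream is far smaller, which is the effect the abstract reports for
visualization-grade data.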
- (2-1)
Building Large-Scale Distributed Design Exploration Framework by Collaborating Supercomputers and Inter-Cloud Systems
Masaharu Munetomo, Hokkaido University
The Large-Scale Distributed Design Exploration Framework (LDDEF)
project aims to build a nation-wide inter-cloud infrastructure in
which supercomputers collaborate with academic cloud systems for
extreme-scale design exploration. Design exploration searches for an
optimal configuration of input parameters that maximizes or minimizes
objective functions, which requires a large number of large-scale
simulations.
In our framework, simulations are performed in parallel on multiple
supercomputers of the nation-wide High Performance Computing
Infrastructure (HPCI), while virtual and real machines in the academic
cloud systems carry out parameter surveys and optimizations. For
extreme-scale collaboration between supercomputers and cloud systems,
we employ scalable distributed databases and object storage to hold
the input parameters and the information produced by the simulations.
The proposed system consists of the following components: (1)
simulators running on supercomputers, (2) optimization engines
deployed in cloud systems, (3) distributed databases that store
solution candidates as input parameters together with the evaluations
computed by the simulations, (4) object storage for simulation results
other than the evaluations, and (5) controllers, including user
interfaces. By employing NoSQL distributed databases, the proposed
system not only scales out virtually without limit but can also manage
many target problems in a unified manner through multi-tenant
databases.
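As an illustration of how component (3) might hold solution
candidates, the following sketch shows a multi-tenant record keyed by
(tenant, problem, candidate id); all names and fields are invented,
since the abstract does not specify a schema:

    # Hypothetical multi-tenant record layout for component (3), the
    # distributed database of solution candidates; all names are invented.
    from dataclasses import dataclass, field

    @dataclass
    class Candidate:
        tenant: str                 # research group owning this problem
        problem: str                # target optimization problem
        candidate_id: int
        parameters: dict            # input parameters for the simulator
        evaluations: dict = field(default_factory=dict)  # objective values
        result_uri: str = ""        # pointer into object storage (component 4)

    # A NoSQL database would key records by (tenant, problem, candidate_id),
    # letting one deployment manage many target problems in a unified manner.
    store = {}
    c = Candidate("lab-a", "wing-design", 1,
                  parameters={"chord": 0.8, "sweep": 25.0})
    c.evaluations["drag"] = 0.0123                       # filled in after simulation
    c.result_uri = "s3://bucket/wing-design/1/flow.vtk"  # hypothetical URI
    store[(c.tenant, c.problem, c.candidate_id)] = c
    print(store[("lab-a", "wing-design", 1)].evaluations)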
- (2-2)
Toward Lightweight Virtual Machine Environments for Next-Generation Supercomputers
Takahiro Shinagawa, The University of Tokyo
We are developing an original hypervisor called BitVisor for research
and practical use, and we apply it to construct a virtual computing
environment for next-generation supercomputers. The goal of our
research and development is to offer the maximum performance of the
hardware by allowing users to run their own operating systems (OSs),
specifically configured and customized for their high-performance
computing workloads, while preserving the security required by the
organization that manages the supercomputers. We use BitVisor to
protect the hardware from users' OSs and to perform management
operations such as OS deployment, so that users can easily select
their own OSs, while allowing pass-through access to the hardware from
the OSs to avoid virtualization overhead. We studied the performance
of BitVisor on real supercomputer hardware, e.g., Fujitsu PRIMERGY
RX200 nodes with an InfiniBand network, and identified the remaining
overhead. We also developed a transparent OS deployment system and
measured guest-OS performance on that system.
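The policy described above, intercepting only management-relevant
operations while passing everything else through to hardware, can be
modeled abstractly. The following toy Python dispatcher is purely
illustrative and is not BitVisor code:

    # Toy model of a selective pass-through dispatch policy (illustrative
    # only, not BitVisor code): intercept the few operations needed for
    # security and management, pass everything else straight to hardware.
    MEDIATED_OPS = {"os_deploy", "disk_write_boot_sector"}

    def handle_guest_operation(op, passthrough, mediate):
        if op in MEDIATED_OPS:
            return mediate(op)      # hypervisor inspects the operation
        return passthrough(op)      # low-overhead path for everything else

    # Example handlers standing in for real hardware access.
    result = handle_guest_operation(
        "infiniband_send",
        passthrough=lambda op: f"{op}: direct hardware access",
        mediate=lambda op: f"{op}: mediated by hypervisor")
    print(result)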