Myung-Hwan Jang and Jeong-Min Park's paper has been accepted in
Title: Accelerating Storage-based Training for Graph Neural Networks
Authors: Myung-Hwan Jang, Jeong-Min Park, Yunyong Ko, and Sang-Wook Kim
Abstract
Graph neural networks (GNNs) have achieved breakthroughs in various real-world downstream tasks due to their powerful expressiveness. As the scale of real-world graphs continues to grow, storage-based approaches to GNN training have been studied, which leverage external storage (e.g., NVMe SSDs) to handle such web-scale graphs on a single machine. Although existing storage-based GNN training methods have shown promising potential for large-scale GNN training, we observe that they suffer from a severe bottleneck in data preparation because they overlook a critical challenge: how to handle a large number of small storage IOs. To address this challenge, in this paper we propose a novel storage-based GNN training framework, named AGNES, that employs block-wise storage IO processing to fully utilize the IO bandwidth provided by high-performance storage devices. Moreover, to further improve the efficiency of each storage IO, AGNES employs a simple yet effective strategy, hyperbatch-based processing, that exploits the characteristics of real-world graphs. Via comprehensive experiments on five real-world graphs, we verify the superiority of AGNES over four state-of-the-art storage-based GNN training methods.
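To illustrate the general idea behind block-wise storage IO processing, here is a minimal Python sketch (not the authors' actual implementation; the block size and function name are hypothetical). Rather than issuing one small read per requested node feature, feature requests are grouped by the storage block that holds them, so each block is fetched from storage at most once:

```python
# Hypothetical sketch of block-wise IO coalescing for GNN data preparation.
# Assumption: node features are laid out contiguously on storage, with
# BLOCK_SIZE consecutive node features per storage block.

from collections import defaultdict

BLOCK_SIZE = 4  # node features per storage block (assumed value)

def coalesce_reads(node_ids):
    """Group requested node IDs by storage block, deduplicating block reads."""
    blocks = defaultdict(list)
    for nid in sorted(set(node_ids)):
        blocks[nid // BLOCK_SIZE].append(nid)
    return dict(blocks)

# A mini-batch touching 6 nodes scattered across the feature file:
requests = [1, 2, 5, 6, 9, 13]
plan = coalesce_reads(requests)
print(plan)  # {0: [1, 2], 1: [5, 6], 2: [9], 3: [13]}
```

Here the six per-node requests collapse into four block reads, and each read moves a full block at once, which is the pattern under which high-performance SSDs deliver their rated bandwidth.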