Accelerating research with GPU computing

In recent years, data has become one of the most precious resources in both business and science. For projects such as iToBoS, which aims to utilize deep learning in the global fight against melanoma, the veracity, validity and volume of data are essential.

A large volume, however, brings its own set of challenges. As the amount and availability of data increased drastically, processing that data effectively became a considerable challenge for the technologies in use. One of the greatest tools at our disposal, which revolutionized data processing and consequently deep learning, is the graphics processing unit (GPU).

As their name suggests, GPUs were originally designed for rendering complex computer graphics. It became apparent, however, that the tremendous computational power of these devices can be utilized to greatly accelerate computing in research areas such as machine learning. A GPU’s ability to process vast amounts of data quickly comes from its structure. In contrast to central processing units (CPUs), GPUs contain hundreds or even thousands of cores. Although each of these cores is suited only to simple computational tasks, their sheer number enables GPUs to perform many calculations in parallel. As this execution profile fits machine learning tasks well, GPUs played a great part in the rise and widespread use of deep learning.
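To make the idea of many simple, independent calculations more concrete, the following is a minimal sketch in plain Python. It does not run on a GPU and is not taken from iToBoS or ELKH Cloud; the function names saxpy_element and saxpy are hypothetical and chosen for illustration only. Each call to saxpy_element stands for the small piece of work that a single GPU core could carry out on its own.

```python
def saxpy_element(i, a, x, y):
    """Work for one hypothetical GPU thread: a single multiply-add on element i."""
    return a * x[i] + y[i]


def saxpy(a, x, y):
    """Compute a * x + y element by element.

    No element depends on any other, so on a GPU every index could be
    handled by a separate core at the same time. This sketch simply
    visits the indices one after another to emulate that behaviour.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [saxpy_element(i, a, x, y) for i in range(len(x))]


if __name__ == "__main__":
    print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
    # prints [12.0, 24.0, 36.0]
```

Because the result for each index is independent of the others, the order in which the elements are computed does not matter. This property is what allows thousands of simple cores to share the work, and the same pattern appears in the large matrix operations that dominate deep learning workloads.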

While the usefulness of GPU cards is unquestionable, getting hold of these devices can be challenging for members of the scientific community. To support their work, numerous research infrastructures have been established, many of which provide access to GPU resources. One such infrastructure is ELKH Cloud [1], a prominent general-purpose cloud service that provides high-capacity hardware resources to iToBoS and many other Hungarian and international research projects.

Motivated by positive feedback and an increasing demand for machine learning applications, the resources of ELKH Cloud were significantly expanded in 2021. Thanks to this development, the cloud platform made 72 data center GPUs, including NVIDIA V100 [3] and NVIDIA A100 [4] cards, available to its users, providing considerable computational power to research projects in need. In addition to hardware, ELKH Cloud provides quickly and easily deployable blueprints of digital research environments, helping scientists make efficient use of their provisioned infrastructure. These reference architectures [2] cover a variety of common use cases, and many of them have built-in support for GPU resources. Together, these services help ensure that research projects can access and utilize the tremendous computational power of GPU cards in their experiments.

The computational capacity available in Hungary was increased significantly further with the launch of the supercomputer Komondor [5] on 13 January 2023 [6]. The architecture of the supercomputer, which provides roughly ten times the high-performance computing capacity previously available in Hungary, is mostly made up of GPU-accelerated partitions. Following a lengthy development and testing period, Komondor now serves the computational needs of the Hungarian scientific community, assisting research in a variety of fields.

[1] ELKH Cloud, Cloud services for national and international research projects
Available at https://science-cloud.hu/en

[2] ELKH Cloud, Reference architectures
Available at https://science-cloud.hu/en/reference-architectures

[3] NVIDIA, V100 Tensor Core GPU
Available at https://www.nvidia.com/en-us/data-center/v100

[4] NVIDIA, A100 Tensor Core GPU
Available at https://www.nvidia.com/en-us/data-center/a100

[5] HPC Competence Center, Komondor
Available at https://hpc.kifu.hu/en/komondor

[6] University of Debrecen, Komondor Has Arrived at UD
Available at https://hirek.unideb.hu/en/komondor-has-arrived-ud