NVIDIA DGX Station A100 installation and use (DGX_9130_v1_revE_08Sep2021)


Overview

For Research Use Only.

Document version: DGX_9130_v1_revE_08Sep2021

1. Introduction

NVIDIA DGX Station A100 brings AI supercomputing to data science teams, offering data center technology without a data center or additional IT infrastructure.

Designed for multiple, simultaneous users, DGX Station A100 leverages server-grade components in an office-friendly form factor. It has four fully interconnected, Multi-Instance GPU (MIG)-capable NVIDIA A100 Tensor Core GPUs with 320 GB of total GPU memory, and it plugs into a standard power outlet.

  • AI workgroup server delivering 2.5 petaFLOPS of performance that your team can use without limits — for training, inference, and data analytics
  • Server-grade, plug-and-go, without requiring data center power and cooling
  • World-class AI platform, with no complicated installation or IT help needed
  • The world’s only workstation-style system with four fully interconnected NVIDIA A100 Tensor Core GPUs and 320 gigabytes (GB) of GPU memory
  • Delivers a fast-track to AI transformation with NVIDIA know-how and experience

The DGX Station A100 provides users with a simple solution for expanding GPU resources to run GPU-enabled tools. With the installation instructions and optimal running parameters provided for production basecalling tools, users can expand their post-run basecalling capacity. Installation instructions and optimal running conditions for other GPU-enabled Oxford Nanopore tools will be made available. Users can run other tools at their discretion, with support from NVIDIA.

Technical specs

Component                 Specifications
GPUs                      4x NVIDIA A100 80 GB GPUs
GPU memory                320 GB total
Performance               2.5 petaFLOPS AI; 5 petaOPS INT8
Size and weight           System weight: 43.1 kg; packaged system weight: 57.93 kg
System dimensions         Height: 639 mm, width: 256 mm, length: 518 mm
Software                  Ubuntu Linux OS
Memory/RAM                512 GB RAM
CPU                       Single AMD 7742, 64 cores, 2.25 GHz (base)–3.4 GHz (max boost)
Storage/hard drive        OS: 1x 1.92 TB NVMe drive; internal storage: 7.68 TB U.2 NVMe drive
Environmental conditions  Operating temperature range: 5–35°C

For more information, please see the NVIDIA DGX Station A100 datasheet.

To install the DGX Station A100, please consult the Quick Start Guide that is supplied with the instrument.

Alternatively, you can find installation instructions online at https://docs.nvidia.com/dgx/dgx-station-a100-qsg/index.html#abstract

Support

After purchasing the device and the 3-year enterprise support package, you should receive an email from NVIDIA to the account that made the purchase, with details on how to set up your support account with NVIDIA.

If you have any problems with the DGX Station A100, please contact NVIDIA Enterprise Support for assistance: https://www.nvidia.com/en-us/support/enterprise/. We suggest copying your local Field Application Specialist into your communications with NVIDIA Enterprise Support.

2. Overview

Basecalling of sequencing data can be carried out on the DGX Station A100 using the Guppy software.

To use Guppy on the DGX Station A100, install the GPU version of the software on the station from the Debian package as described in the Guppy protocol.

To maximise Guppy performance on the DGX Station A100, there are two main requirements:

  1. Use the Guppy basecall server, so that there is a single program managing Guppy's GPU use.
  2. Use multiple processes to overcome issues which arise when a single process is unable to read .fast5 files fast enough.

This document will cover two main use cases:

  1. Basecalling a single folder with a lot of data in it.
  2. Basecalling many data folders in separate tasks, as if a job scheduler is being used.

At its current performance, the Guppy basecall server will occupy most of the GPU memory available, preventing its use by other programs. To enable the use of other programs such as Medaka, Megalodon and Bonito, this guide will assume the following workflow:

  1. Starting a new basecall server.
  2. Using the basecall server for basecalling, in one of the two use cases outlined above.
  3. Shutting down the server when basecalling is complete.

3. Start a new basecall server

Open a terminal window and type the following commands to launch the basecall server:

guppy_basecall_server --log_path <log_dir_location> --config dna_r9.4.1_450bps_fast_prom.cfg --num_callers 12 --ipc_threads 20 --device "cuda:0 cuda:1 cuda:2 cuda:3" --port ipc:///tmp/.guppy/5556 --num_alignment_threads 24 --max_queued_reads 20000

Choose a suitable location for the server to store its log files using the --log_path or -l argument. Log files can be used for debugging.

The server needs to be launched in a location where it can continue to run until it is shut down. For example, start a separate terminal session to manage the server, or launch the server in the background before starting basecalling.

OPTIONAL ACTION

To launch the server in the background and retain its process ID, enter the following command:

$ guppy_basecall_server <extra_args> &
$ export GUPPY_SERVER_PID=$!
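With the server in the background, a quick liveness check can be useful before submitting work. The helper below is a sketch and is not part of the Guppy distribution; it only tests whether the exported PID still refers to a live process.

```shell
# Sketch of a liveness check for the background server; not part of Guppy.
# kill -0 delivers no signal: it only tests whether the PID still exists.
server_is_running() {
    kill -0 "${GUPPY_SERVER_PID:-}" 2>/dev/null
}
```

For example, `server_is_running && echo "basecall server is up"`.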

4. Basecalling

IMPORTANT

The DGX Station A100 has 7.68 TB of internal data storage. Data flow should be managed to best suit the needs of the user. The location and volume of the data, the network connection speed, and the basecalling model selected will all influence the choice of data location and flow. Please refer to the "Data management" section of this guide.

Basecalling a single folder

Guppy ships with a "basecaller supervisor" that launches many basecalling clients in parallel, to mitigate issues with reading .fast5 files. Each client has an ID, numbered upwards from zero, which keeps the clients' output files separate so they can be written in parallel.

To launch the supervisor, enter the following command:

guppy_basecaller_supervisor --input_path <input_folder> --save_path <output_folder> --config <config> --port ipc:///tmp/.guppy/5556 --num_clients <num_clients>

OPTIONAL ACTION

Depending on your analysis pipeline, the following additional options may be useful:

--bam_out          Output BAM files in addition to FASTQ.
--compress_fastq   Compress output FASTQ files so that they become fastq.gz.

Choosing num_clients

If using the Fast basecall model, we recommend setting num_clients to 50. For all other models, set num_clients to 20.

The output folder structure will look similar to this when basecalling is complete:

--- /save_folder/
    | fastq_runid_6dce0a5_client0_0_0.fastq
    | fastq_runid_6dce0a5_client1_0_0.fastq
    | fastq_runid_6dce0a5_client2_0_0.fastq
    | guppy_basecaller_0_log-2019-11-25_15-11-53.log
    | guppy_basecaller_1_log-2019-11-25_15-11-53.log
    | guppy_basecaller_2_log-2019-11-25_15-11-53.log
    | guppy_basecaller_supervisor_log-2019-11-25_15-11-53.log
    | sequencing_summary_0.txt
    | sequencing_summary_1.txt
    | sequencing_summary_2.txt
    | sequencing_telemetry_0.js
    | sequencing_telemetry_1.js
    | sequencing_telemetry_2.js

OPTIONAL ACTION

In some cases, downstream analysis tools require merging files together.

For example, to merge FASTQ files (change *.fastq to *.fastq.gz in the command below to merge gzipped FASTQ files instead), enter the following command:

cat save_folder/*.fastq > merged.fastq

To merge sequencing_summary files together, enter the following command:

awk 'NR == 1 { print }; FNR > 1 { print }' save_folder/sequencing_summary* > merged_sequencing_summary.txt

Basecalling many data folders

The most scalable way to basecall many data folders is to use a single basecall client for each data folder.

To launch a basecall client for a data folder, enter the following command (repeat once per folder):

guppy_basecaller_supervisor --input_path <input_folder> --save_path <output_folder> --config <config> --port ipc:///tmp/.guppy/5556

Note: each output folder should be unique, unless using the --client_id argument.

OPTIONAL ACTION

In addition to the options outlined above, you can use the following arguments:

--client_id <id> Appends <id> to output filenames. For example, sequencing_summary.txt becomes sequencing_summary_<id>.txt. Use this when multiple clients write to the same output folder.
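The per-folder approach above can be scripted. The function below is a sketch, assuming one run folder per dataset under a common parent directory; the folder layout, function name, and argument values are illustrative and not from this guide.

```shell
# Sketch: launch one basecall client per run folder, in the background,
# then wait for all of them to finish. Folder layout is an assumption.
launch_supervisors() {
    local runs_dir=$1 out_dir=$2 config=$3
    local run_dir name
    for run_dir in "$runs_dir"/*/; do
        name=$(basename "$run_dir")
        guppy_basecaller_supervisor \
            --input_path "$run_dir" \
            --save_path "$out_dir/$name" \
            --config "$config" \
            --port ipc:///tmp/.guppy/5556 &
    done
    wait   # block until every client has finished
}
```

For example, `launch_supervisors /data/runs /data/basecalled dna_r9.4.1_450bps_hac_prom.cfg` gives each run its own unique save path, consistent with the note above.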

Approximate basecall speeds

Below are approximate basecall speeds, in Gbases per hour, that you should be able to attain using the Guppy setup outlined above. Actual speeds will vary depending on the type of data you have: for example, shorter reads will basecall more slowly as they are less efficient to move through the basecall server and process on the GPU. Basecall speeds will also decrease if operations are requested that require additional processing time, such as barcoding or alignment.

Guppy configuration              Approximate speed (Gbases/hour), basecall supervisor with 50 clients
dna_r9.4.1_450bps_fast_prom.cfg  150
dna_r9.4.1_450bps_hac_prom.cfg   55
dna_r9.4.1_450bps_sup_prom.cfg   15
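These figures can be turned into a rough wall-clock estimate by dividing the run yield by the model speed. The 200-Gbase yield below is a hypothetical example, paired with the hac model's approximate 55 Gbases/hour from the table:

```shell
# Rough wall-clock estimate: yield (Gbases) / approximate speed (Gbases/hour).
# The 200-Gbase yield is a hypothetical example value.
awk -v gbases=200 -v speed=55 'BEGIN { printf "%.1f hours\n", gbases / speed }'
# prints "3.6 hours"
```

Actual times will vary with read length and any extra processing (barcoding, alignment), as noted above.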

5. Shut down the basecall server

The basecall server will shut down cleanly if it receives a SIGINT signal (which is what Ctrl+C sends) or a SIGTERM signal. How the signal is sent depends on how the server was launched:

If the server is running in the foreground in its own Terminal window, hitting Ctrl+C will cause the server to shut down.

If the server is running in the background and you exported GUPPY_SERVER_PID as described earlier, shut it down with:

kill "$GUPPY_SERVER_PID"

Otherwise, find and signal the process manually:

  1. Obtain the PID of guppy_basecall_server: ps -ef | grep guppy
  2. Send a termination signal: kill <PID_number>

Avoid kill -9 (SIGKILL) unless the server is unresponsive, as it does not allow a clean shutdown.

6. Data management

Overview

The NVIDIA DGX Station A100 has 7.68 TB of internal data storage. Data flow should be managed to best suit the needs of the end users. The location and volume of the data, the network connection speed, and the basecalling model selected will all influence optimal data flow.

Data volumes

100 Gbases of sequencing data in .fast5 format typically occupies approximately 1 TB of storage. Variables such as read length will alter this ratio.

Gbases  .fast5 storage (GB)  FASTQ storage (GB)
50      500                  50
100     1000                 100
200     2000                 200
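The ratios in the table (roughly 10 GB of .fast5 and 1 GB of FASTQ per Gbase) give a quick rule of thumb for sizing transfers. The helper below is an illustrative sketch only, since read length and compression shift the ratio.

```shell
# Rule-of-thumb storage estimate from the table's ratios:
# ~10 GB of .fast5 and ~1 GB of FASTQ per Gbase of sequence.
estimate_storage() {
    local gbases=$1
    echo "fast5: $((gbases * 10)) GB, fastq: $((gbases * 1)) GB"
}

estimate_storage 150   # prints "fast5: 1500 GB, fastq: 150 GB"
```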

Basecalling speeds based on data location

For the fastest basecalling speed, .fast5 data should be stored locally on the DGX Station A100. Depending on the basecalling model used and the networked storage available, users could consider basecalling data directly from networked storage. The table below shows basecalling speed from each storage location as a percentage of the speed achieved from local SSD storage, with Guppy run using the suggested parameters.

Note that these benchmarks are for the previous model of the DGX Station A100 (160 GB). Benchmarks for the 320 GB model will be released soon.


Storage location                               dna_r9.4.1_450bps_fast_prom  dna_r9.4.1_450bps_hac_prom  dna_r9.4.1_450bps_sup_prom
DGX Station A100 local SSD storage             100%                         100%                        100%
High performance enterprise storage            16.2%                        98.8%                       99.3%
Basic fibre networked storage (e.g. Synology)  23.7%                        99.6%                       97.3%

Data flow for basecalling post-run from the PromethION

If you choose to basecall .fast5 files locally on the DGX Station A100 to reduce the time spent copying data, we recommend keeping only a temporary copy of the .fast5 data on the station and deleting it after basecalling. Move only the analysed (FASTQ) data off the station.

Data flow

Last updated: 3/4/2024
