Aliyun
OSSutil
download ossutil
First, download ossutil:
```shell
# Linux / macOS
curl https://gosspublic.alicdn.com/ossutil/install.sh | sudo bash
```

```shell
# Windows
curl -o ossutil-v1.7.19-windows-386.zip https://gosspublic.alicdn.com/ossutil/1.7.19/ossutil-v1.7.19-windows-386.zip
```

config ossutil
```shell
./ossutil config
```

| Params | Description | Instruction |
|---|---|---|
| endpoint | the Endpoint of the region where the Bucket is located | |
| accessKeyID | OSS AccessKey | get from user info panel |
| accessKeySecret | OSS AccessKeySecret | get from user info panel |
| stsToken | token for sts service | could be empty |
Info
You can also modify the /home/<$user>/.ossutilconfig file directly to change the configuration.
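For reference, the file ossutil writes is a small INI file; a sketch with placeholder values (the endpoint and keys below are illustrative, not real):

```
[Credentials]
language=EN
endpoint=oss-cn-hangzhou.aliyuncs.com
accessKeyID=<your-access-key-id>
accessKeySecret=<your-access-key-secret>
```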
list files
```shell
ossutil ls oss://<$PATH>
```

download file/dir
You can use cp to download or upload files.
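For instance, with a hypothetical bucket named my-bucket:

```shell
# upload a local folder to the bucket (bucket name and paths are hypothetical)
ossutil cp -r ./logs oss://my-bucket/backup/logs
# download it back to another local path
ossutil cp -r oss://my-bucket/backup/logs ./restored-logs
```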
```shell
ossutil cp -r oss://<$PATH> <$OTHER_PATH>
```

upload file/dir
```shell
ossutil cp -r <$SOURCE_PATH> oss://<$PATH>
```

ECS DNS
ZJADC (Aliyun Dedicated Cloud)
Append the following to /etc/resolv.conf:

```
options timeout:2 attempts:3 rotate
nameserver 10.255.9.2
nameserver 10.200.12.5
```

And then you probably need to modify yum.repo.d as well; check the link.
YQGCY (Aliyun Dedicated Cloud)
Append the following to /etc/resolv.conf:

```
nameserver 172.27.205.79
```

And then restart the coredns pods in kube-system (kube-system.coredns-xxxx).
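Assuming coredns runs as a standard Deployment in kube-system, one way to restart it is:

```shell
kubectl -n kube-system rollout restart deployment coredns
```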
Google DNS
```
nameserver 8.8.8.8
nameserver 4.4.4.4
nameserver 223.5.5.5
nameserver 223.6.6.6
```

Restart DNS
```shell
vim /etc/NetworkManager/NetworkManager.conf
```

```shell
sudo systemctl is-active systemd-resolved
sudo resolvectl flush-caches
# or: sudo systemd-resolve --flush-caches
```

Add "dns=none" under the '[main]' section, then restart NetworkManager:

```shell
systemctl restart NetworkManager
```

Modify ifcfg-ethX [Optional]
If you cannot get an IPv4 address, try modifying ifcfg-ethX:
```shell
vim /etc/sysconfig/network-scripts/ifcfg-ens33
```

Set ONBOOT=yes.
Tencent
Zhejianglab
👨💻Schedmd Slurm
The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world’s supercomputers and computer clusters.
It provides three key functions:
- allocating exclusive and/or non-exclusive access to resources (computer nodes) to users for some duration of time so they can perform work,
- providing a framework for starting, executing, and monitoring work, typically a parallel job such as Message Passing Interface (MPI) on a set of allocated nodes, and
- arbitrating contention for resources by managing a queue of pending jobs.
Content
Build & Install
Install Slurm from Debian
Install Slurm from Ubuntu
Install Slurm from binary
Install Slurm from helm chart
Install Slurm from K8s Operator
Try OpenSCOW
Install On Debian
Cluster Setting
- 1 Manager
- 1 Login Node
- 2 Compute nodes
| hostname | IP | role | quota |
|---|---|---|---|
| manage01 (slurmctld, slurmdbd) | 192.168.56.115 | manager | 2C4G |
| login01 (login) | 192.168.56.116 | login | 2C4G |
| compute01 (slurmd) | 192.168.56.117 | compute | 2C4G |
| compute02 (slurmd) | 192.168.56.118 | compute | 2C4G |
Software Version:
| software | version |
|---|---|
| os | Debian 12 bookworm |
| slurm | 24.05.2 |
Important
when you see (All Nodes), you need to run the following command on all nodes
when you see (Manager Node), you only need to run the following command on manager node
when you see (Login Node), you only need to run the following command on login node
Prepare Steps (All Nodes)
- Modify the /etc/apt/sources.list file to use the tuna mirror
```shell
cat > /etc/apt/sources.list << EOF
deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware
deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware
deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware
deb https://mirrors.tuna.tsinghua.edu.cn/debian-security/ bookworm-security main contrib non-free non-free-firmware
deb-src https://mirrors.tuna.tsinghua.edu.cn/debian-security/ bookworm-security main contrib non-free non-free-firmware
EOF
```

- Update apt cache

```shell
apt clean all && apt update
```

- Set hostname on each node

```shell
hostnamectl set-hostname manage01   # on manage01
hostnamectl set-hostname login01    # on login01
hostnamectl set-hostname compute01  # on compute01
hostnamectl set-hostname compute02  # on compute02
```

- Set hosts file
```shell
cat >> /etc/hosts << EOF
192.168.56.115 manage01
192.168.56.116 login01
192.168.56.117 compute01
192.168.56.118 compute02
EOF
```

- Disable firewall

```shell
systemctl stop nftables && systemctl disable nftables
```

- Install ntpdate

```shell
apt-get -y install ntpdate
```

- Sync server time

```shell
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time.windows.com
```

- Add cron job to sync time

```shell
crontab -e
*/5 * * * * /usr/sbin/ntpdate time.windows.com
```

- Create ssh key pair on each node

```shell
ssh-keygen -t rsa -b 4096 -C $HOSTNAME
```

- Test ssh login to the other nodes without a password

On manage01:

```shell
ssh-copy-id -i ~/.ssh/id_rsa.pub root@login01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute02
```

On login01 (and similarly from each compute node):

```shell
ssh-copy-id -i ~/.ssh/id_rsa.pub root@manage01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute02
```

Install Components
- Install NFS server
(Manager Node)
there are many ways to install an NFS server:

- using yum install -y nfs-utils, check https://pkuhpc.github.io/SCOW/docs/hpccluster/nfs
- using apt install -y nfs-kernel-server, check https://www.linuxtechi.com/how-to-install-nfs-server-on-debian/
- or you can directly mount other shared storage.

create shared folder

```shell
mkdir /data
chmod 755 /data
```

modify /etc/exports

```shell
/data *(rw,sync,insecure,no_subtree_check,no_root_squash)
```

start nfs server

```shell
systemctl start rpcbind
systemctl start nfs-server
systemctl enable rpcbind
systemctl enable nfs-server
```

check nfs server

```shell
showmount -e localhost
# Output
Export list for localhost:
/data *
```
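The login and compute nodes then need to mount the share. A minimal sketch, assuming the manager at 192.168.56.115 exports /data as configured above:

```shell
apt-get install -y nfs-common
mkdir -p /data
mount -t nfs 192.168.56.115:/data /data
# make the mount persistent across reboots
echo "192.168.56.115:/data /data nfs defaults 0 0" >> /etc/fstab
```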
- Install munge service
- add user munge
(All Nodes)
```shell
groupadd -g 1108 munge
useradd -m -c "Munge Uid 'N' Gid Emporium" -d /var/lib/munge -u 1108 -g munge -s /sbin/nologin munge
```

- Install rng-tools-debian (Manager Nodes)

```shell
apt-get install -y rng-tools-debian
```

```shell
# modify service script
vim /usr/lib/systemd/system/rngd.service
```

```
[Service]
ExecStart=/usr/sbin/rngd -f -r /dev/urandom
```

```shell
systemctl daemon-reload
systemctl start rngd
systemctl enable rngd
```

- install munge packages

```shell
apt-get install -y libmunge-dev libmunge2 munge
```

- generate secret key (Manager Nodes)

```shell
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
```

- copy munge.key from the manager node to the rest of the nodes (All Nodes)

```shell
scp -p /etc/munge/munge.key root@login01:/etc/munge/
scp -p /etc/munge/munge.key root@compute01:/etc/munge/
scp -p /etc/munge/munge.key root@compute02:/etc/munge/
```

- grant privilege on munge.key (All Nodes)

```shell
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
systemctl start munge
systemctl enable munge
```

Use systemctl status munge to check that the service is running.
- test munge
```shell
munge -n | ssh compute01 unmunge
```
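If the key matches on both nodes, unmunge decodes the credential and reports success, roughly like:

```
STATUS:           Success (0)
```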
- Install Mariadb
(Manager Nodes)
```shell
apt-get install -y mariadb-server
```

- create database and user

```shell
systemctl start mariadb
systemctl enable mariadb
ROOT_PASS=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16)
mysql -e "CREATE USER root IDENTIFIED BY '${ROOT_PASS}'"
mysql -uroot -p$ROOT_PASS -e 'create database slurm_acct_db'
```

- create user slurm, and grant all privileges on database slurm_acct_db

```shell
mysql -uroot -p$ROOT_PASS
```

```sql
create user slurm;
grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by '123456' with grant option;
flush privileges;
```

- create Slurm user

```shell
groupadd -g 1109 slurm
useradd -m -c "Slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm
```

Install Slurm (All Nodes)
- Install basic Debian package build requirements:
```shell
apt-get install -y build-essential fakeroot devscripts equivs
```

- Unpack the distributed tarball:

```shell
wget https://download.schedmd.com/slurm/slurm-24.05.2.tar.bz2 -O slurm-24.05.2.tar.bz2 &&
tar -xaf slurm*tar.bz2
```

- cd to the directory containing the Slurm source:

```shell
cd slurm-24.05.2 && mkdir -p /etc/slurm && ./configure
```

- compile slurm

```shell
make install
```

modify configuration files (Manager Nodes)

- modify /etc/slurm/slurm.conf, refer to slurm.conf

```shell
cp /root/slurm-24.05.2/etc/slurm.conf.example /etc/slurm/slurm.conf
vim /etc/slurm/slurm.conf
```

focus on these options:

```
SlurmctldHost=manage
AccountingStorageEnforce=associations,limits,qos
AccountingStorageHost=manage
AccountingStoragePass=/var/run/munge/munge.socket.2
AccountingStoragePort=6819
AccountingStorageType=accounting_storage/slurmdbd
JobCompHost=localhost
JobCompLoc=slurm_acct_db
JobCompPass=123456
JobCompPort=3306
JobCompType=jobcomp/mysql
JobCompUser=slurm
JobContainerType=job_container/none
JobAcctGatherType=jobacct_gather/linux
```
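Besides the accounting options, slurm.conf must also describe your nodes and partitions. A minimal sketch for this 4-node layout (RealMemory is an assumed value for 4G machines; adjust to your hardware):

```
NodeName=compute[01-02] CPUs=2 RealMemory=3500 State=UNKNOWN
PartitionName=compute Nodes=compute[01-02] Default=YES MaxTime=INFINITE State=UP
```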
- modify /etc/slurm/slurmdbd.conf, refer to slurmdbd.conf

```shell
cp /root/slurm-24.05.2/etc/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
vim /etc/slurm/slurmdbd.conf
```
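A minimal slurmdbd.conf consistent with the MariaDB setup above might look like this (a sketch; the 123456 password and paths mirror the earlier steps):

```
AuthType=auth/munge
DbdHost=localhost
SlurmUser=slurm
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=123456
StorageLoc=slurm_acct_db
```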
- modify /etc/slurm/cgroup.conf

```shell
cp /root/slurm-24.05.2/etc/cgroup.conf.example /etc/slurm/cgroup.conf
```

- send configuration files to other nodes

```shell
scp -r /etc/slurm/*.conf root@login01:/etc/slurm/
scp -r /etc/slurm/*.conf root@compute01:/etc/slurm/
scp -r /etc/slurm/*.conf root@compute02:/etc/slurm/
```
- grant privilege on some directories
(All Nodes)
```shell
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
mkdir /var/log/slurm
chown slurm: /var/log/slurm
mkdir /var/spool/slurmctld
chown slurm: /var/spool/slurmctld
chown slurm: /etc/slurm/slurmdbd.conf
chmod 600 /etc/slurm/slurmdbd.conf
```

- start slurm services on each node
On manage01:

```shell
systemctl start slurmdbd
systemctl enable slurmdbd
systemctl start slurmctld
systemctl enable slurmctld
systemctl start slurmd
systemctl enable slurmd
```

On login01, compute01, and compute02:

```shell
systemctl start slurmd
systemctl enable slurmd
```

Test Your Slurm Cluster (Login Node)
- check cluster configuration

```shell
scontrol show config
```

- check cluster status

```shell
sinfo
scontrol show partition
scontrol show node
```

- submit job

```shell
srun -N2 hostname
scontrol show jobs
```

- check job status

```shell
squeue -a
```

Install On Ubuntu
Cluster Setting
- 1 Manager
- 1 Login Node
- 2 Compute nodes
| hostname | IP | role | quota |
|---|---|---|---|
| manage01 (slurmctld, slurmdbd) | 192.168.56.115 | manager | 2C4G |
| login01 (login) | 192.168.56.116 | login | 2C4G |
| compute01 (slurmd) | 192.168.56.117 | compute | 2C4G |
| compute02 (slurmd) | 192.168.56.118 | compute | 2C4G |
Software Version:
| software | version |
|---|---|
| os | Ubuntu 22.04 |
| slurm | 25.05.2 |
Important
when you see (All Nodes), you need to run the following command on all nodes
when you see (Manager Node), you only need to run the following command on manager node
when you see (Login Node), you only need to run the following command on login node
Prepare Steps (All Nodes)
- Modify the /etc/apt/sources.list file to use the tuna mirror

```shell
# fill in the tuna mirror entries for your Ubuntu release between the EOF markers
cat > /etc/apt/sources.list << EOF
EOF
```

- Update apt cache

```shell
apt clean all && apt update
```

- Set hosts file
```shell
cat >> /etc/hosts << EOF
10.119.2.36 juice-036
10.119.2.37 juice-037
10.119.2.38 juice-038
EOF
```

- Install ntpdate

```shell
apt-get -y install ntpdate
```

- Sync server time

```shell
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate ntp.aliyun.com
```

- Add cron job to sync time

```shell
crontab -e
*/5 * * * * /usr/sbin/ntpdate ntp.aliyun.com
```

- Create ssh key pair on each node

```shell
ssh-keygen -t rsa -b 4096 -C $HOSTNAME
```

- Test ssh login to the other nodes without a password

```shell
ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-036
ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-037
ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-038
```

Install Components
- Install NFS server
(Manager Node)
there are many ways to install an NFS server:

- using apt install -y nfs-kernel-server, check https://www.linuxtechi.com/how-to-install-nfs-server-on-debian/
create shared folder
```shell
mkdir /data
chmod 755 /data
```

modify /etc/exports

```shell
/data *(rw,sync,insecure,no_subtree_check,no_root_squash)
```

start nfs server

```shell
systemctl start rpcbind
systemctl start nfs-server
systemctl enable rpcbind
systemctl enable nfs-server
```

check nfs server

```shell
showmount -e localhost
# Output
Export list for localhost:
/data *
```
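As in the Debian walkthrough, the other nodes need to mount the share; a sketch assuming juice-036 is the NFS server:

```shell
apt install -y nfs-common
mkdir -p /data
mount -t nfs juice-036:/data /data
echo "juice-036:/data /data nfs defaults 0 0" >> /etc/fstab
```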
- Install munge service
- install munge and its dependencies
(All Nodes)
```shell
sudo apt install -y build-essential git wget munge libmunge-dev libmunge2 \
  mariadb-server libmariadb-dev libssl-dev libpam0g-dev \
  libhwloc-dev liblua5.3-dev libreadline-dev libncurses-dev \
  libjson-c-dev libyaml-dev libhttp-parser-dev libjwt-dev libdbus-glib-1-dev libbpf-dev libdbus-1-dev
```

- generate the munge key on the manager node

```shell
which mungekey
# if mungekey exists, use it to generate the key
sudo systemctl stop munge
sudo mungekey -c
sudo chown munge:munge /etc/munge/munge.key
sudo chmod 400 /etc/munge/munge.key
sudo systemctl start munge
```

- copy munge.key from the manager node to the rest of the nodes (All Nodes)
```shell
sudo scp /etc/munge/munge.key juice-036:/tmp/munge.key
sudo scp /etc/munge/munge.key juice-037:/tmp/munge.key
sudo scp /etc/munge/munge.key juice-038:/tmp/munge.key
```

- grant privilege on munge.key (All Nodes)

```shell
systemctl stop munge
sudo mv /tmp/munge.key /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
systemctl start munge
systemctl status munge
systemctl enable munge
```

Use systemctl status munge to check that the service is running.
- test munge
```shell
munge -n | ssh juice-036 unmunge
munge -n | ssh juice-037 unmunge
munge -n | ssh juice-038 unmunge
```
- Install Mariadb
(Manager Nodes)
```shell
apt-get install -y mariadb-server
```

- create database and user

```shell
systemctl start mariadb
systemctl enable mariadb
ROOT_PASS=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16)
mysql -e "CREATE USER root IDENTIFIED BY '${ROOT_PASS}'"
mysql -uroot -p$ROOT_PASS -e 'create database slurm_acct_db'
```

- create user slurm, and grant all privileges on database slurm_acct_db

```shell
mysql -uroot -p$ROOT_PASS
```

```sql
create user slurm;
grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by '123456' with grant option;
flush privileges;
```

- create Slurm user

```shell
groupadd -g 1109 slurm
useradd -m -c "Slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm
```

Install Slurm (All Nodes)
- Install basic Debian package build requirements:
```shell
apt-get install -y build-essential fakeroot devscripts equivs
```

- Unpack the distributed tarball:

```shell
wget https://download.schedmd.com/slurm/slurm-25.05.2.tar.bz2 -O slurm-25.05.2.tar.bz2 &&
tar -xaf slurm*tar.bz2
```

- cd to the directory containing the Slurm source:

```shell
cd slurm-25.05.2 && mkdir -p /etc/slurm && ./configure --prefix=/usr --sysconfdir=/etc/slurm --enable-cgroupv2
```

- compile slurm

```shell
make install
```

modify configuration files (Manager Nodes)

- modify /etc/slurm/slurm.conf, refer to slurm.conf

```shell
cp /root/slurm-25.05.2/etc/slurm.conf.example /etc/slurm/slurm.conf
vim /etc/slurm/slurm.conf
```

focus on these options:

```
SlurmctldHost=manage
AccountingStorageEnforce=associations,limits,qos
AccountingStorageHost=manage
AccountingStoragePass=/var/run/munge/munge.socket.2
AccountingStoragePort=6819
AccountingStorageType=accounting_storage/slurmdbd
JobCompHost=localhost
JobCompLoc=slurm_acct_db
JobCompPass=123456
JobCompPort=3306
JobCompType=jobcomp/mysql
JobCompUser=slurm
JobContainerType=job_container/none
JobAcctGatherType=jobacct_gather/linux
```

- modify /etc/slurm/slurmdbd.conf, refer to slurmdbd.conf

```shell
cp /root/slurm-25.05.2/etc/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
vim /etc/slurm/slurmdbd.conf
```

- modify /etc/slurm/cgroup.conf

```shell
cp /root/slurm-25.05.2/etc/cgroup.conf.example /etc/slurm/cgroup.conf
```

- send configuration files to other nodes

```shell
scp -r /etc/slurm/*.conf root@juice-037:/etc/slurm/
scp -r /etc/slurm/*.conf root@juice-038:/etc/slurm/
```
- grant privilege on some directories
(All Nodes)
```shell
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
mkdir /var/log/slurm
chown slurm: /var/log/slurm
mkdir /var/spool/slurmctld
chown slurm: /var/spool/slurmctld
chown slurm: /etc/slurm/slurmdbd.conf
chmod 600 /etc/slurm/slurmdbd.conf
```

- start slurm services on each node
On the manager node:

```shell
systemctl start slurmdbd
systemctl enable slurmdbd
systemctl start slurmctld
systemctl enable slurmctld
systemctl start slurmd
systemctl enable slurmd
```

On the remaining nodes:

```shell
systemctl start slurmd
systemctl enable slurmd
```

Test Your Slurm Cluster (Login Node)
- check cluster configuration

```shell
scontrol show config
```

- check cluster status

```shell
sinfo
scontrol show partition
scontrol show node
```

- submit job

```shell
srun -N2 hostname
scontrol show jobs
```

- check job status

```shell
squeue -a
```

Install From Binary
Important
(All Nodes) means every type of node should install this component.
(Manager Node) means only the manager node should install this component.
(Login Node) means only the login node should install this component.
(Cmp) means only the compute nodes should install this component.
Typically, three kinds of nodes are required to run Slurm:
1 Manage (Manager Node), 1 Login Node, and N Compute (Cmp) nodes. But you can choose to install all services on a single node.
Prerequisites
- change hostname (All Nodes): hostnamectl set-hostname (manager|auth|computeXX)
- modify /etc/hosts (All Nodes): echo "192.aa.bb.cc (manager|auth|computeXX)" >> /etc/hosts
- disable firewall, selinux, dnsmasq, and swap (All Nodes). more detail here
- NFS Server (Manager Node). NFS is used as the default file system for the Slurm accounting database.
- NFS Client (All Nodes). all nodes should mount the NFS share
- Munge (All Nodes). The auth/munge plugin will be built if the MUNGE authentication development library is installed. MUNGE is used as the default authentication mechanism.
- Database (Manager Node). MySQL support for accounting will be built if the MySQL or MariaDB development library is present. A currently supported version of MySQL or MariaDB should be used.
Install Slurm
- create slurm user (All Nodes)

```shell
groupadd -g 1109 slurm
useradd -m -c "slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm
```
Build RPM package
install dependencies (Manager Node)

```shell
yum -y install gcc gcc-c++ readline-devel perl-ExtUtils-MakeMaker pam-devel rpm-build mysql-devel python3
```

build rpm package (Manager Node)

```shell
wget https://download.schedmd.com/slurm/slurm-24.05.2.tar.bz2 -O slurm-24.05.2.tar.bz2
rpmbuild -ta --nodeps slurm-24.05.2.tar.bz2
```

The rpm files will be placed under the $(HOME)/rpmbuild directory of the user building them.

send rpm to the rest of the nodes (Manager Node)

```shell
ssh root@<$rest_node> "mkdir -p /root/rpmbuild/RPMS/"
scp -rp $(HOME)/rpmbuild/RPMS/x86_64 root@<$rest_node>:/root/rpmbuild/RPMS/x86_64
```

install rpm (Manager Node)

```shell
ssh root@<$rest_node> "yum localinstall /root/rpmbuild/RPMS/x86_64/slurm-*"
```
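With concrete hostnames, the distribution and installation can be scripted in one loop (a sketch; the node names are placeholders for your <$rest_node> list):

```shell
for node in login01 compute01 compute02; do
  ssh root@$node "mkdir -p /root/rpmbuild/RPMS/"
  scp -rp ~/rpmbuild/RPMS/x86_64 root@$node:/root/rpmbuild/RPMS/
  ssh root@$node "yum -y localinstall /root/rpmbuild/RPMS/x86_64/slurm-*"
done
```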
modify configuration files (Manager Node)

```shell
cp /etc/slurm/cgroup.conf.example /etc/slurm/cgroup.conf
cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
cp /etc/slurm/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
chmod 600 /etc/slurm/slurmdbd.conf
chown slurm: /etc/slurm/slurmdbd.conf
```

cgroup.conf doesn't need to change.

edit /etc/slurm/slurm.conf; you can use this link as a reference

edit /etc/slurm/slurmdbd.conf; you can use this link as a reference
Install from the yum repo directly
install slurm (All Nodes)

```shell
yum -y install slurm-wlm slurmdbd
```

modify configuration files (All Nodes)

```shell
vim /etc/slurm-llnl/slurm.conf
vim /etc/slurm-llnl/slurmdbd.conf
```

cgroup.conf doesn't need to change.

edit /etc/slurm/slurm.conf; you can use this link as a reference

edit /etc/slurm/slurmdbd.conf; you can use this link as a reference
- send configuration (Manager Node)

```shell
scp -r /etc/slurm/*.conf root@<$rest_node>:/etc/slurm/
ssh root@<$rest_node> "mkdir /var/spool/slurmd && chown slurm: /var/spool/slurmd"
ssh root@<$rest_node> "mkdir /var/log/slurm && chown slurm: /var/log/slurm"
ssh root@<$rest_node> "mkdir /var/spool/slurmctld && chown slurm: /var/spool/slurmctld"
```

- start services (Manager Node)

```shell
ssh root@<$rest_node> "systemctl start slurmdbd && systemctl enable slurmdbd"
ssh root@<$rest_node> "systemctl start slurmctld && systemctl enable slurmctld"
```

- start services (All Nodes)

```shell
ssh root@<$rest_node> "systemctl start slurmd && systemctl enable slurmd"
```
Test
- show cluster status
```shell
scontrol show config
sinfo
scontrol show partition
scontrol show node
```

- submit job

```shell
srun -N2 hostname
scontrol show jobs
```

- check job status

```shell
squeue -a
```
Install From Helm Chart
Compared with the complex binary installation, a Helm chart is an easier way to install Slurm.
Source code can be found at https://github.com/AaronYang0628/slurm-on-k8s
Prerequisites
Installation
get helm repo and update
```shell
helm repo add ay-helm-mirror https://aaronyang0628.github.io/helm-chart-mirror/charts
helm repo update
```

install slurm chart

```shell
# wget -O slurm.values.yaml https://raw.githubusercontent.com/AaronYang0628/slurm-on-k8s/refs/heads/main/chart/values.yaml
helm install slurm ay-helm-mirror/chart -f slurm.values.yaml --version 1.0.10
```

Or you can get a template values.yaml from https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/slurm.values.yaml
check chart status
```shell
helm -n slurm list
```
Install From K8s Operator
Compared with the complex binary installation, a Kubernetes operator is an easier way to install Slurm.
Source code can be found at https://github.com/AaronYang0628/slurm-on-k8s
Prerequisites
Installation
deploy slurm operator
```shell
kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/operator_install.yaml
```

check operator status

```shell
kubectl -n slurm get pod
```

apply CRD slurmdeployment

```shell
kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/slurmdeployment.zj.values.yaml
```

check CRD status

```shell
kubectl get slurmdeployment
kubectl -n slurm logs -f deploy/slurm-operator-controller-manager
# kubectl get slurmdep
# kubectl -n test get pods
```

upgrade slurmdep

```shell
kubectl edit slurmdep lensing
# set SlurmCPU.replicas = 3
```
Try OpenSCOW
What is SCOW?
SCOW is an HPC cluster management system built by PKU.
SCOW uses four virtual machines to run a Slurm cluster, which makes it a good environment for learning how to use Slurm.
You should check https://pkuhpc.github.io/OpenSCOW/docs/hpccluster, it works well.
CheatSheet
Common Environment Variables
| Variable | Description |
|---|---|
| $SLURM_JOB_ID | The Job ID. |
| $SLURM_JOBID | Deprecated. Same as $SLURM_JOB_ID |
| $SLURM_SUBMIT_HOST | The hostname of the node used for job submission. |
| $SLURM_JOB_NODELIST | Contains the definition (list) of the nodes that are assigned to the job. |
| $SLURM_NODELIST | Deprecated. Same as SLURM_JOB_NODELIST. |
| $SLURM_CPUS_PER_TASK | Number of CPUs per task. |
| $SLURM_CPUS_ON_NODE | Number of CPUs on the allocated node. |
| $SLURM_JOB_CPUS_PER_NODE | Count of processors available to the job on this node. |
| $SLURM_CPUS_PER_GPU | Number of CPUs requested per allocated GPU. |
| $SLURM_MEM_PER_CPU | Memory per CPU. Same as --mem-per-cpu. |
| $SLURM_MEM_PER_GPU | Memory per GPU. |
| $SLURM_MEM_PER_NODE | Memory per node. Same as --mem. |
| $SLURM_GPUS | Number of GPUs requested. |
| $SLURM_NTASKS | Same as -n, --ntasks. The number of tasks. |
| $SLURM_NTASKS_PER_NODE | Number of tasks requested per node. |
| $SLURM_NTASKS_PER_SOCKET | Number of tasks requested per socket. |
| $SLURM_NTASKS_PER_CORE | Number of tasks requested per core. |
| $SLURM_NTASKS_PER_GPU | Number of tasks requested per GPU. |
| $SLURM_NPROCS | Same as -n, --ntasks. See $SLURM_NTASKS. |
| $SLURM_TASKS_PER_NODE | Number of tasks to be initiated on each node. |
| $SLURM_ARRAY_JOB_ID | Job array’s master job ID number. |
| $SLURM_ARRAY_TASK_ID | Job array ID (index) number. |
| $SLURM_ARRAY_TASK_COUNT | Total number of tasks in a job array. |
| $SLURM_ARRAY_TASK_MAX | Job array’s maximum ID (index) number. |
| $SLURM_ARRAY_TASK_MIN | Job array’s minimum ID (index) number. |
A full list of environment variables for SLURM can be found by visiting the SLURM page on environment variables.
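As a quick sanity check, a small sbatch script (a sketch) can print a few of these variables:

```shell
#!/bin/bash
#SBATCH --job-name=env-demo
#SBATCH --nodes=2
#SBATCH --ntasks=2
#SBATCH --output=env_demo_%j.out

echo "Job ${SLURM_JOB_ID} was submitted from ${SLURM_SUBMIT_HOST}"
echo "Allocated nodes: ${SLURM_JOB_NODELIST}"
echo "Number of tasks: ${SLURM_NTASKS}"
```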
File Operations
File Distribution
- sbcast is used to transfer a file from local disk to the local disks of the nodes allocated to a job. This can be used to effectively use diskless compute nodes or to provide improved performance relative to a shared file system.
- Features
  - distribute files: quickly copy files to all compute nodes assigned to the job, avoiding manual distribution; faster than traditional scp or rsync, especially when distributing to many nodes.
  - simplify scripts: one command distributes files to all nodes assigned to the job.
  - improve performance: parallelized transfers speed up distribution, especially for large or numerous files.
- Usage
  - Alone

```shell
sbcast <source_file> <destination_path>
```

  - Embedded in a job script

```shell
#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --output=example_job.out
#SBATCH --error=example_job.err
#SBATCH --partition=compute
#SBATCH --nodes=4

# Use sbcast to distribute the file to the /tmp directory of each node
sbcast data.txt /tmp/data.txt

# Run your program using the distributed file
srun my_program /tmp/data.txt
```
File Collection
File Redirection: when submitting a job, you can use the #SBATCH --output and #SBATCH --error directives to redirect standard output and standard error to specified files.

```shell
#SBATCH --output=output.txt
#SBATCH --error=error.txt
```

Or

```shell
sbatch -N2 -w "compute[01-02]" -o result/file/path xxx.slurm
```

Send to the destination manually: use scp or rsync in the job to copy the files from the compute nodes to the submit node.

Using NFS: if a shared file system (such as NFS, Lustre, or GPFS) is configured in the computing cluster, the result files can be written directly to the shared directory. In this way, the result files generated by all nodes are automatically stored in the same location.

Using sbcast
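For the manual scp approach, the copy step can sit at the end of the job script itself; a sketch with assumed paths:

```shell
# end of a job script: copy node-local results back to the submit host
scp -r /tmp/results ${SLURM_SUBMIT_HOST}:/data/results/${SLURM_JOB_ID}/
```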
Submit Jobs
3 Types of Jobs

- srun is used to submit a job for execution or initiate job steps in real time.
  - Example
    - run shell

```shell
srun -N2 /bin/hostname
```

    - run script

```shell
srun -N1 test.sh
```

    - exec into slurmd node

```shell
srun -w slurm-lensing-slurm-slurmd-cpu-2 --pty /bin/bash
```
- sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
  - submit a batch job

```shell
sbatch -N2 -w "compute[01-02]" -o job.stdout /data/jobs/batch-job.slurm
```

  - submit a parallel task to process different data partitions

```shell
sbatch /data/jobs/parallel.slurm
```
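The referenced /data/jobs/batch-job.slurm is not shown in this document; a minimal script in that spirit might be:

```shell
#!/bin/bash
#SBATCH --job-name=batch-job
#SBATCH --nodes=2

# launch one task on each allocated node
srun hostname
```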
- salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell; the shell is then used to execute srun commands to launch parallel tasks.
  - Example
    - allocate resources (more like creating a virtual machine)

This command creates a job which allocates 2 nodes and spawns a bash shell, in which you can execute srun commands:

```shell
salloc -N2 bash
```

After your computing task finishes, remember to shut down your job; when you exit the job, the resources are released.

```shell
scancel <$job_id>
```
Configuration Files
MPI Libs
Test Intel MPI Jobs
Using MPI (Message Passing Interface) for parallel computing on a SLURM cluster usually involves the following steps:
1. Install an MPI library
Make sure an MPI library is installed on your cluster nodes. Common MPI implementations include:
- OpenMPI
- Intel MPI
- MPICH

You can check whether MPI is installed with:

```shell
mpicc --version   # check the MPI compiler
mpirun --version  # check the MPI runtime
```

2. Test MPI performance

```shell
mpirun -n 2 IMB-MPI1 pingpong
```

3. Compile the MPI program
You can use mpicc (for C) or mpic++ (for C++) to compile MPI programs. For example:
Below is a simple MPI "Hello, World!" example program (hello_mpi.c) and a dot-product example program (dot_product.c); pick either one:
```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;

    // Initialize the MPI environment
    MPI_Init(&argc, &argv);

    // Get the rank of the current process and the total number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Print process information
    printf("Hello, World! I am process %d out of %d processes.\n", rank, size);

    // Finalize the MPI environment
    MPI_Finalize();
    return 0;
}
```

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 8 // vector size

// Compute the local part of the dot product
double compute_local_dot_product(double *A, double *B, int start, int end) {
    double local_dot = 0.0;
    for (int i = start; i < end; i++) {
        local_dot += A[i] * B[i];
    }
    return local_dot;
}

void print_vector(double *Vector) {
    for (int i = 0; i < N; i++) {
        printf("%f ", Vector[i]);
    }
    printf("\n");
}

int main(int argc, char *argv[]) {
    int rank, size;

    // Initialize the MPI environment
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Vectors A and B
    double A[N], B[N];

    // Process 0 initializes vectors A and B
    if (rank == 0) {
        for (int i = 0; i < N; i++) {
            A[i] = i + 1;       // sample data
            B[i] = (i + 1) * 2; // sample data
        }
    }

    // Broadcast vectors A and B to all processes
    MPI_Bcast(A, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(B, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // Each process computes its own chunk
    int local_n = N / size; // elements per process
    int start = rank * local_n;
    int end = (rank + 1) * local_n;

    // The last process takes any remaining elements (handles N % size)
    if (rank == size - 1) {
        end = N;
    }

    double local_dot_product = compute_local_dot_product(A, B, start, end);

    // Reduce the local dot products into the global result on process 0
    double global_dot_product = 0.0;
    MPI_Reduce(&local_dot_product, &global_dot_product, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    // Process 0 prints the final result
    if (rank == 0) {
        printf("Vector A is\n");
        print_vector(A);
        printf("Vector B is\n");
        print_vector(B);
        printf("Dot Product of A and B: %f\n", global_dot_product);
    }

    // Finalize the MPI environment
    MPI_Finalize();
    return 0;
}
```

4. Create the Slurm job script
Create a SLURM job script to run the MPI program. Below is a basic job script, assuming the file name is mpi_test.slurm:
```shell
#!/bin/bash
#SBATCH --job-name=mpi_job               # Job name
#SBATCH --nodes=2                        # Number of nodes to use
#SBATCH --ntasks-per-node=1              # Number of tasks per node
#SBATCH --time=00:10:00                  # Time limit
#SBATCH --output=mpi_test_output_%j.log  # Standard output file
#SBATCH --error=mpi_test_output_%j.err   # Standard error file

# Manually set Intel OneAPI MPI and Compiler environment
export I_MPI_PMI=pmi2
export I_MPI_PMI_LIBRARY=/usr/lib/x86_64-linux-gnu/slurm/mpi_pmi2.so
export I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.14
export INTEL_COMPILER_ROOT=/opt/intel/oneapi/compiler/2025.0
export PATH=$I_MPI_ROOT/bin:$INTEL_COMPILER_ROOT/bin:$PATH
export LD_LIBRARY_PATH=$I_MPI_ROOT/lib:$INTEL_COMPILER_ROOT/lib:$LD_LIBRARY_PATH
export MANPATH=$I_MPI_ROOT/man:$INTEL_COMPILER_ROOT/man:$MANPATH

# Compile the MPI program
icx-cc -I$I_MPI_ROOT/include hello_mpi.c -o hello_mpi -L$I_MPI_ROOT/lib -lmpi

# Run the MPI job
mpirun -np 2 ./hello_mpi
```

```shell
#!/bin/bash
#SBATCH --job-name=mpi_job               # Job name
#SBATCH --nodes=2                        # Number of nodes to use
#SBATCH --ntasks-per-node=1              # Number of tasks per node
#SBATCH --time=00:10:00                  # Time limit
#SBATCH --output=mpi_test_output_%j.log  # Standard output file
#SBATCH --error=mpi_test_output_%j.err   # Standard error file

# Manually set Intel OneAPI MPI and Compiler environment
export I_MPI_PMI=pmi2
export I_MPI_PMI_LIBRARY=/usr/lib/x86_64-linux-gnu/slurm/mpi_pmi2.so
export I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.14
export INTEL_COMPILER_ROOT=/opt/intel/oneapi/compiler/2025.0
export PATH=$I_MPI_ROOT/bin:$INTEL_COMPILER_ROOT/bin:$PATH
export LD_LIBRARY_PATH=$I_MPI_ROOT/lib:$INTEL_COMPILER_ROOT/lib:$LD_LIBRARY_PATH
export MANPATH=$I_MPI_ROOT/man:$INTEL_COMPILER_ROOT/man:$MANPATH

# Compile the MPI program
icx-cc -I$I_MPI_ROOT/include dot_product.c -o dot_product -L$I_MPI_ROOT/lib -lmpi

# Run the MPI job
mpirun -np 2 ./dot_product
```

5. Compile the MPI program
Before running the job, you need to compile the MPI program. Use mpicc on the cluster. Assuming the program is saved as hello_mpi.c (or dot_product.c), compile with:

```shell
mpicc -o hello_mpi hello_mpi.c
```

```shell
mpicc -o dot_product dot_product.c
```

6. Submit the Slurm job
Save the job script above (mpi_test.slurm) and submit it with:

```shell
sbatch mpi_test.slurm
```

7. Check job status
You can check the status of the job with:

```shell
squeue -u <your_username>
```

8. Check the output
After the job completes, the output is saved to the files specified in the job script (e.g. mpi_test_output_<job_id>.log). You can view it with cat or any text editor:

```shell
cat mpi_test_output_*.log
```

Example output: if everything works, the output will look similar to:
```
Hello, World! I am process 0 out of 2 processes.
Hello, World! I am process 1 out of 2 processes.
```

For the dot_product example, A = 1..8 and B = 2,4,...,16, so the dot product is 408 and the output looks like:

```
Vector A is
1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000
Vector B is
2.000000 4.000000 6.000000 8.000000 10.000000 12.000000 14.000000 16.000000
Dot Product of A and B: 408.000000
```

Test Open MPI Jobs
Using MPI (Message Passing Interface) for parallel computing on a SLURM cluster usually involves the following steps:
1. Install an MPI library
Make sure an MPI library is installed on your cluster nodes. Common MPI implementations include:
- OpenMPI
- Intel MPI
- MPICH

You can check whether MPI is installed with:

```shell
mpicc --version   # check the MPI compiler
mpirun --version  # check the MPI runtime
```

2. Compile the MPI program
You can use mpicc (for C) or mpic++ (for C++) to compile MPI programs. For example:
Below is a simple MPI "Hello, World!" example program (hello_mpi.c) and a dot-product example program (dot_product.c); pick either one:
```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;

    // Initialize the MPI environment
    MPI_Init(&argc, &argv);

    // Get the rank of the current process and the total number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Print process information
    printf("Hello, World! I am process %d out of %d processes.\n", rank, size);

    // Finalize the MPI environment
    MPI_Finalize();
    return 0;
}
```

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 8 // vector size

// Compute the local part of the dot product
double compute_local_dot_product(double *A, double *B, int start, int end) {
    double local_dot = 0.0;
    for (int i = start; i < end; i++) {
        local_dot += A[i] * B[i];
    }
    return local_dot;
}

void print_vector(double *Vector) {
    for (int i = 0; i < N; i++) {
        printf("%f ", Vector[i]);
    }
    printf("\n");
}

int main(int argc, char *argv[]) {
    int rank, size;

    // Initialize the MPI environment
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Vectors A and B
    double A[N], B[N];

    // Process 0 initializes vectors A and B
    if (rank == 0) {
        for (int i = 0; i < N; i++) {
            A[i] = i + 1;       // sample data
            B[i] = (i + 1) * 2; // sample data
        }
    }

    // Broadcast vectors A and B to all processes
    MPI_Bcast(A, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(B, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // Each process computes its own chunk
    int local_n = N / size; // elements per process
    int start = rank * local_n;
    int end = (rank + 1) * local_n;

    // The last process takes any remaining elements (handles N % size)
    if (rank == size - 1) {
        end = N;
    }

    double local_dot_product = compute_local_dot_product(A, B, start, end);

    // Reduce the local dot products into the global result on process 0
    double global_dot_product = 0.0;
    MPI_Reduce(&local_dot_product, &global_dot_product, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    // Process 0 prints the final result
    if (rank == 0) {
        printf("Vector A is\n");
        print_vector(A);
        printf("Vector B is\n");
        print_vector(B);
        printf("Dot Product of A and B: %f\n", global_dot_product);
    }

    // Finalize the MPI environment
    MPI_Finalize();
    return 0;
}
```

3. Create the Slurm job script
Create a SLURM job script to run the MPI program. Below is a basic job script, assuming the file name is mpi_test.slurm:
```shell
#!/bin/bash
#SBATCH --job-name=mpi_test              # Job name
#SBATCH --nodes=2                        # Number of nodes
#SBATCH --ntasks-per-node=1              # Tasks per node
#SBATCH --time=00:10:00                  # Time limit
#SBATCH --output=mpi_test_output_%j.log  # Output log file

# Load the MPI module (if using a modular environment)
module load openmpi

# Run the MPI program
mpirun --allow-run-as-root -np 2 ./hello_mpi
```

```shell
#!/bin/bash
#SBATCH --job-name=mpi_test              # Job name
#SBATCH --nodes=2                        # Number of nodes
#SBATCH --ntasks-per-node=1              # Tasks per node
#SBATCH --time=00:10:00                  # Time limit
#SBATCH --output=mpi_test_output_%j.log  # Output log file

# Load the MPI module (if using a modular environment)
module load openmpi

# Run the MPI program
mpirun --allow-run-as-root -np 2 ./dot_product
```

4. Compile the MPI program
Before running the job, you need to compile the MPI program. Use mpicc on the cluster. Assuming the program is saved as hello_mpi.c (or dot_product.c), compile with:

```shell
mpicc -o hello_mpi hello_mpi.c
```

```shell
mpicc -o dot_product dot_product.c
```

5. Submit the Slurm job
Save the job script above (mpi_test.slurm) and submit it with:

```shell
sbatch mpi_test.slurm
```

6. Check job status
You can check the status of the job with:

```shell
squeue -u <your_username>
```

7. Check the output
After the job completes, the output is saved to the files specified in the job script (e.g. mpi_test_output_<job_id>.log). You can view it with cat or any text editor:

```shell
cat mpi_test_output_*.log
```

Example output: if everything works, the output will look similar to:
```
Hello, World! I am process 0 out of 2 processes.
Hello, World! I am process 1 out of 2 processes.
```

For the dot_product example, A = 1..8 and B = 2,4,...,16, so the dot product is 408 and the output looks like:

```
Vector A is
1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000
Vector B is
2.000000 4.000000 6.000000 8.000000 10.000000 12.000000 14.000000 16.000000
Dot Product of A and B: 408.000000
```