Subsections of ☁️CSP Related

Subsections of Aliyun

OSSutil

download ossutil

First, you need to download ossutil.

Linux:
curl https://gosspublic.alicdn.com/ossutil/install.sh | sudo bash
Windows:
curl -o ossutil-v1.7.19-windows-386.zip https://gosspublic.alicdn.com/ossutil/1.7.19/ossutil-v1.7.19-windows-386.zip

config ossutil

./ossutil config
| Params | Description | Instruction |
| --- | --- | --- |
| endpoint | the Endpoint of the region where the Bucket is located | |
| accessKeyID | OSS AccessKey | get from the user info panel |
| accessKeySecret | OSS AccessKeySecret | get from the user info panel |
| stsToken | token for the STS service | could be empty |
Info

You can also modify the /home/<$user>/.ossutilconfig file directly to change the configuration.
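For reference, the generated config file is an INI-style file; a minimal sketch (the endpoint and keys below are placeholders, adjust them to your account):

[Credentials]
language=EN
endpoint=oss-cn-hangzhou.aliyuncs.com
accessKeyID=<$YOUR_ACCESS_KEY_ID>
accessKeySecret=<$YOUR_ACCESS_KEY_SECRET>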

list files

ossutil ls oss://<$PATH>
For example
ossutil ls oss://csst-data/CSST-20240312/dfs/

download file/dir

You can use cp to download or upload files.

ossutil cp -r oss://<$PATH> <$OTHER_PATH>
For example
ossutil cp -r oss://csst-data/CSST-20240312/dfs/ /data/nfs/data/pvc...

upload file/dir

ossutil cp -r <$SOURCE_PATH> oss://<$PATH>
For example
ossutil cp -r /data/nfs/data/pvc/a.txt  oss://csst-data/CSST-20240312/dfs/b.txt
Mar 24, 2024

ECS DNS

ZJADC (Aliyun Dedicated Cloud)

Append the following content to /etc/resolv.conf

options timeout:2 attempts:3 rotate
nameserver 10.255.9.2
nameserver 10.200.12.5

And then you probably need to modify the repo files under /etc/yum.repos.d as well (check the link).


YQGCY (Aliyun Dedicated Cloud)

Append the following content to /etc/resolv.conf

nameserver 172.27.205.79

And then restart the CoreDNS pods (kube-system/coredns-xxxx).
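Assuming CoreDNS runs as the standard coredns Deployment in the kube-system namespace, one way to restart it is:

kubectl -n kube-system rollout restart deployment coredns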


Google DNS

nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 223.5.5.5
nameserver 223.6.6.6

Restart DNS

If the system uses systemd-resolved, flush its DNS cache:

sudo systemctl is-active systemd-resolved
sudo resolvectl flush-caches
# or sudo systemd-resolve --flush-caches

If the system uses NetworkManager, edit its configuration:

vim /etc/NetworkManager/NetworkManager.conf

add "dns=none" under '[main]' part

systemctl restart NetworkManager

Modify ifcfg-ethX [Optional]

If you cannot get an IPv4 address, you can try to modify ifcfg-ethX.

vim /etc/sysconfig/network-scripts/ifcfg-ens33

set ONBOOT=yes
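A minimal ifcfg file would then look something like the following (assuming DHCP on ens33; adjust the device name to your system):

TYPE=Ethernet
BOOTPROTO=dhcp
NAME=ens33
DEVICE=ens33
ONBOOT=yes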

Mar 14, 2024

Tencent

    Mar 7, 2024

    Subsections of Zhejianglab

    👨‍💻Schedmd Slurm

    The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world’s supercomputers and computer clusters.

    It provides three key functions:

    • allocating exclusive and/or non-exclusive access to resources (computer nodes) to users for some duration of time so they can perform work,
    • providing a framework for starting, executing, and monitoring work, typically a parallel job such as Message Passing Interface (MPI) on a set of allocated nodes, and
    • arbitrating contention for resources by managing a queue of pending jobs.


    Aug 7, 2024

    Subsections of 👨‍💻Schedmd Slurm

    Build & Install

    Aug 7, 2024

    Subsections of Build & Install

    Install On Debian

    Cluster Setting

    • 1 Manager
    • 1 Login Node
    • 2 Compute nodes
    | hostname | IP | role | quota |
    | --- | --- | --- | --- |
    | manage01 (slurmctld, slurmdbd) | 192.168.56.115 | manager | 2C4G |
    | login01 (login) | 192.168.56.116 | login | 2C4G |
    | compute01 (slurmd) | 192.168.56.117 | compute | 2C4G |
    | compute02 (slurmd) | 192.168.56.118 | compute | 2C4G |

    Software Version:

    | software | version |
    | --- | --- |
    | os | Debian 12 bookworm |
    | slurm | 24.05.2 |

    Important

    When you see (All Nodes), you need to run the following command on all nodes.

    When you see (Manager Node), you only need to run the following command on the manager node.

    When you see (Login Node), you only need to run the following command on the login node.

    Prepare Steps (All Nodes)

    1. Modify the /etc/apt/sources.list file using the TUNA mirror
    cat > /etc/apt/sources.list << EOF
    deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware
    deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm main contrib non-free non-free-firmware
    
    deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware
    deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-updates main contrib non-free non-free-firmware
    
    deb https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware
    deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ bookworm-backports main contrib non-free non-free-firmware
    
    deb https://mirrors.tuna.tsinghua.edu.cn/debian-security/ bookworm-security main contrib non-free non-free-firmware
    deb-src https://mirrors.tuna.tsinghua.edu.cn/debian-security/ bookworm-security main contrib non-free non-free-firmware
    EOF
    If you cannot get an IPv4 address:

    Modify the /etc/network/interfaces

    allow-hotplug enp0s8
    iface enp0s8 inet dhcp

    restart the network

    systemctl restart networking
    2. Update apt cache
    apt clean all && apt update
    3. Set hostname on each node
    Run the matching command on each node:
    hostnamectl set-hostname manage01
    hostnamectl set-hostname login01
    hostnamectl set-hostname compute01
    hostnamectl set-hostname compute02
    4. Set hosts file
    cat >> /etc/hosts << EOF
    192.168.56.115 manage01
    192.168.56.116 login01
    192.168.56.117 compute01
    192.168.56.118 compute02
    EOF
    5. Disable firewall
    systemctl stop nftables && systemctl disable nftables
    6. Install the ntpdate package
    apt-get -y install ntpdate
    7. Sync server time
    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    echo 'Asia/Shanghai' >/etc/timezone
    ntpdate time.windows.com
    8. Add a cron job to sync time
    crontab -e
    */5 * * * * /usr/sbin/ntpdate time.windows.com
    9. Create an ssh key pair on each node
    ssh-keygen -t rsa -b 4096 -C $HOSTNAME
    10. Test ssh login to other nodes without password
    On each node, copy the public key to every other node, for example:
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@manage01
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@login01
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute01
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@compute02

    Install Components

    1. Install NFS server (Manager Node)

    There are many ways to install an NFS server.
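    On Debian the server side can be installed with apt first (assuming the stock nfs-kernel-server package):

    apt-get install -y nfs-kernel-server rpcbind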

    create shared folder

    mkdir /data
    chmod 755 /data

    modify /etc/exports

    /data *(rw,sync,insecure,no_subtree_check,no_root_squash)

    start nfs server

    systemctl start rpcbind 
    systemctl start nfs-server 
    
    systemctl enable rpcbind 
    systemctl enable nfs-server

    check nfs server

    showmount -e localhost
    
    # Output
    Export list for localhost:
    /data *
    2. Install the munge service
    • add user munge (All Nodes)
    groupadd -g 1108 munge
    useradd -m -c "Munge Uid 'N' Gid Emporium" -d /var/lib/munge -u 1108 -g munge -s /sbin/nologin munge
    • Install rng-tools-debian (Manager Nodes)
    apt-get install -y rng-tools-debian
    # modify service script
    vim /usr/lib/systemd/system/rngd.service
    [Service]
    ExecStart=/usr/sbin/rngd -f -r /dev/urandom
    systemctl daemon-reload
    systemctl start rngd
    systemctl enable rngd
    • install munge packages (All Nodes)
    apt-get install -y libmunge-dev libmunge2 munge
    • generate secret key (Manager Nodes)
    dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
    • copy munge.key from the manager node to the rest of the nodes (Manager Node)
    scp -p /etc/munge/munge.key root@login01:/etc/munge/
    scp -p /etc/munge/munge.key root@compute01:/etc/munge/
    scp -p /etc/munge/munge.key root@compute02:/etc/munge/
    • grant privilege on munge.key (All Nodes)
    chown munge: /etc/munge/munge.key
    chmod 400 /etc/munge/munge.key
    
    systemctl start munge
    systemctl enable munge

    Using systemctl status munge to check if the service is running

    • test munge
    munge -n | ssh compute01 unmunge
    3. Install Mariadb (Manager Node)
    apt-get install -y mariadb-server
    • create database and user
    systemctl start mariadb
    systemctl enable mariadb
    
    ROOT_PASS=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16) 
    mysql -e "CREATE USER root IDENTIFIED BY '${ROOT_PASS}'"
    mysql -uroot -p$ROOT_PASS -e 'create database slurm_acct_db'
    • create user slurm, and grant all privileges on the database slurm_acct_db
    mysql -uroot -p$ROOT_PASS
    create user slurm;
    
    grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by '123456' with grant option;
    
    flush privileges;
    • create Slurm user
    groupadd -g 1109 slurm
    useradd -m -c "Slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm

    Install Slurm (All Nodes)

    • Install basic Debian package build requirements:
    apt-get install -y build-essential fakeroot devscripts equivs
    • Unpack the distributed tarball:
    wget https://download.schedmd.com/slurm/slurm-24.05.2.tar.bz2 -O slurm-24.05.2.tar.bz2 &&
    tar -xaf slurm*tar.bz2
    • cd to the directory containing the Slurm source:
    cd slurm-24.05.2 && mkdir -p /etc/slurm && ./configure --sysconfdir=/etc/slurm
    • compile slurm
    make install
    • modify configuration files (Manager Nodes)

      cp /root/slurm-24.05.2/etc/slurm.conf.example /etc/slurm/slurm.conf
      vim /etc/slurm/slurm.conf

      focus on these options:

      SlurmctldHost=manage01
      
      AccountingStorageEnforce=associations,limits,qos
      AccountingStorageHost=manage01
      AccountingStoragePass=/var/run/munge/munge.socket.2
      AccountingStoragePort=6819  
      AccountingStorageType=accounting_storage/slurmdbd  
      
      JobCompHost=localhost
      JobCompLoc=slurm_acct_db
      JobCompPass=123456
      JobCompPort=3306
      JobCompType=jobcomp/mysql
      JobCompUser=slurm
      JobContainerType=job_container/none
      JobAcctGatherType=jobacct_gather/linux
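      The node and partition definitions also have to match the cluster; a minimal sketch assuming the two compute nodes and 2C4G sizing above (RealMemory is an estimate, check the value reported by `slurmd -C` on each node):

      NodeName=compute[01-02] CPUs=2 RealMemory=3000 State=UNKNOWN
      PartitionName=compute Nodes=compute[01-02] Default=YES MaxTime=INFINITE State=UP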
      • modify /etc/slurm/slurmdbd.conf (Manager Nodes)
      cp /root/slurm-24.05.2/etc/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
      vim /etc/slurm/slurmdbd.conf
      • modify /etc/slurm/cgroup.conf
      cp /root/slurm-24.05.2/etc/cgroup.conf.example /etc/slurm/cgroup.conf
      • send configuration files to other nodes
      scp -r /etc/slurm/*.conf  root@login01:/etc/slurm/
      scp -r /etc/slurm/*.conf  root@compute01:/etc/slurm/
      scp -r /etc/slurm/*.conf  root@compute02:/etc/slurm/
    • grant privilege on some directories (All Nodes)

    mkdir /var/spool/slurmd
    chown slurm: /var/spool/slurmd
    mkdir /var/log/slurm
    chown slurm: /var/log/slurm
    
    mkdir /var/spool/slurmctld
    chown slurm: /var/spool/slurmctld
    
    chown slurm: /etc/slurm/slurmdbd.conf
    chmod 600 /etc/slurm/slurmdbd.conf
    • start slurm services on each node
    Start the services that match each node's role (for example, the manager runs slurmdbd and slurmctld, and compute nodes run slurmd):
    systemctl start slurmdbd
    systemctl enable slurmdbd
    
    systemctl start slurmctld
    systemctl enable slurmctld
    
    systemctl start slurmd
    systemctl enable slurmd
    Using `systemctl status xxxx` to check if the `xxxx` service is running
    Example slurmdbd.service
    ```text
    # vim /usr/lib/systemd/system/slurmdbd.service
    
    
    [Unit]
    Description=Slurm DBD accounting daemon
    After=network-online.target remote-fs.target munge.service mysql.service mysqld.service mariadb.service sssd.service
    Wants=network-online.target
    ConditionPathExists=/etc/slurm/slurmdbd.conf
    
    [Service]
    Type=simple
    EnvironmentFile=-/etc/sysconfig/slurmdbd
    EnvironmentFile=-/etc/default/slurmdbd
    User=slurm
    Group=slurm
    RuntimeDirectory=slurmdbd
    RuntimeDirectoryMode=0755
    ExecStart=/usr/local/sbin/slurmdbd -D -s $SLURMDBD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    LimitNOFILE=65536
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    Example slurmctld.service
    ```text
    # vim /usr/lib/systemd/system/slurmctld.service
    
    
    [Unit]
    Description=Slurm controller daemon
    After=network-online.target remote-fs.target munge.service sssd.service
    Wants=network-online.target
    ConditionPathExists=/etc/slurm/slurm.conf
    
    [Service]
    Type=notify
    EnvironmentFile=-/etc/sysconfig/slurmctld
    EnvironmentFile=-/etc/default/slurmctld
    User=slurm
    Group=slurm
    RuntimeDirectory=slurmctld
    RuntimeDirectoryMode=0755
    ExecStart=/usr/local/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    LimitNOFILE=65536
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    Example slurmd.service
    ```text
    # vim /usr/lib/systemd/system/slurmd.service
    
    
    [Unit]
    Description=Slurm node daemon
    After=munge.service network-online.target remote-fs.target sssd.service
    Wants=network-online.target
    #ConditionPathExists=/etc/slurm/slurm.conf
    
    [Service]
    Type=notify
    EnvironmentFile=-/etc/sysconfig/slurmd
    EnvironmentFile=-/etc/default/slurmd
    RuntimeDirectory=slurm
    RuntimeDirectoryMode=0755
    ExecStart=/usr/local/sbin/slurmd --systemd $SLURMD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    KillMode=process
    LimitNOFILE=131072
    LimitMEMLOCK=infinity
    LimitSTACK=infinity
    Delegate=yes
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    systemctl start slurmd
    systemctl enable slurmd
    Using `systemctl status slurmd` to check if the `slurmd` service is running

    Test Your Slurm Cluster (Login Node)

    • check cluster configuration
    scontrol show config
    • check cluster status
    sinfo
    scontrol show partition
    scontrol show node
    • submit job (an illustrative result is shown after this list)
    srun -N2 hostname
    scontrol show jobs
    • check job status
    squeue -a
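    With the two compute nodes above, `srun -N2 hostname` should print one line per allocated node, for example (illustrative; the order may vary):

    compute01
    compute02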
    Aug 7, 2024

    Install On Ubuntu

    Cluster Setting

    • 1 Manager
    • 1 Login Node
    • 2 Compute nodes
    | hostname | IP | role | quota |
    | --- | --- | --- | --- |
    | manage01 (slurmctld, slurmdbd) | 192.168.56.115 | manager | 2C4G |
    | login01 (login) | 192.168.56.116 | login | 2C4G |
    | compute01 (slurmd) | 192.168.56.117 | compute | 2C4G |
    | compute02 (slurmd) | 192.168.56.118 | compute | 2C4G |

    Software Version:

    | software | version |
    | --- | --- |
    | os | Ubuntu 22.04 |
    | slurm | 24.05.2 |

    Important

    When you see (All Nodes), you need to run the following command on all nodes.

    When you see (Manager Node), you only need to run the following command on the manager node.

    When you see (Login Node), you only need to run the following command on the login node.

    Prepare Steps (All Nodes)

    1. Modify the /etc/apt/sources.list file using the TUNA mirror
    cat > /etc/apt/sources.list << EOF
    
    EOF
    If you cannot get an IPv4 address:

    Modify the /etc/network/interfaces

    allow-hotplug enp0s8
    iface enp0s8 inet dhcp

    restart the network

    systemctl restart networking
    2. Update apt cache
    apt clean all && apt update
    3. Set hosts file
    cat >> /etc/hosts << EOF
    10.119.2.36 juice-036
    10.119.2.37 juice-037
    10.119.2.38 juice-038
    EOF
    4. Install the ntpdate package
    apt-get -y install ntpdate
    5. Sync server time
    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    echo 'Asia/Shanghai' >/etc/timezone
    ntpdate ntp.aliyun.com
    6. Add a cron job to sync time
    crontab -e
    */5 * * * * /usr/sbin/ntpdate ntp.aliyun.com
    7. Create an ssh key pair on each node
    ssh-keygen -t rsa -b 4096 -C $HOSTNAME
    8. Test ssh login to other nodes without password
    On each node:
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-036
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-037
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@juice-038

    Install Components

    1. Install NFS server (Manager Node)

    There are many ways to install an NFS server.

    create shared folder

    mkdir /data
    chmod 755 /data

    modify /etc/exports

    /data *(rw,sync,insecure,no_subtree_check,no_root_squash)

    start nfs server

    systemctl start rpcbind 
    systemctl start nfs-server 
    
    systemctl enable rpcbind 
    systemctl enable nfs-server

    check nfs server

    showmount -e localhost
    
    # Output
    Export list for localhost:
    /data *
    2. Install the munge service
    • install munge and build dependencies (All Nodes)
    sudo apt install -y build-essential git wget munge libmunge-dev libmunge2 \
        mariadb-server libmariadb-dev libssl-dev libpam0g-dev \
        libhwloc-dev liblua5.3-dev libreadline-dev libncurses-dev \
        libjson-c-dev libyaml-dev libhttp-parser-dev libjwt-dev libdbus-glib-1-dev libbpf-dev libdbus-1-dev
    
    
    which mungekey
    
    # if mungekey is available, use it to generate the key
    sudo systemctl stop munge
    sudo mungekey -c
    sudo chown munge:munge /etc/munge/munge.key
    sudo chmod 400 /etc/munge/munge.key
    sudo systemctl start munge
    • copy munge.key from the manager node to the rest of the nodes (Manager Node)
    sudo scp /etc/munge/munge.key juice-036:/tmp/munge.key
    sudo scp /etc/munge/munge.key juice-037:/tmp/munge.key
    sudo scp /etc/munge/munge.key juice-038:/tmp/munge.key
    • grant privilege on munge.key (All Nodes)
    systemctl stop munge
    
    sudo mv /tmp/munge.key /etc/munge/munge.key
    chown munge: /etc/munge/munge.key
    chmod 400 /etc/munge/munge.key
    
    systemctl start munge
    systemctl status munge
    systemctl enable munge

    Using systemctl status munge to check if the service is running

    • test munge
    munge -n | ssh juice-036 unmunge
    munge -n | ssh juice-037 unmunge
    munge -n | ssh juice-038 unmunge
    3. Install Mariadb (Manager Node)
    apt-get install -y mariadb-server
    • create database and user
    systemctl start mariadb
    systemctl enable mariadb
    
    ROOT_PASS=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16) 
    mysql -e "CREATE USER root IDENTIFIED BY '${ROOT_PASS}'"
    mysql -uroot -p$ROOT_PASS -e 'create database slurm_acct_db'
    • create user slurm, and grant all privileges on the database slurm_acct_db
    mysql -uroot -p$ROOT_PASS
    create user slurm;
    
    grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by '123456' with grant option;
    
    flush privileges;
    • create Slurm user
    groupadd -g 1109 slurm
    useradd -m -c "Slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm

    Install Slurm (All Nodes)

    • Install basic Debian package build requirements:
    apt-get install -y build-essential fakeroot devscripts equivs
    • Unpack the distributed tarball:
    wget https://download.schedmd.com/slurm/slurm-25.05.2.tar.bz2 -O slurm-25.05.2.tar.bz2 &&
    tar -xaf slurm*tar.bz2
    • cd to the directory containing the Slurm source:
    cd slurm-25.05.2 &&   mkdir -p /etc/slurm && ./configure --prefix=/usr --sysconfdir=/etc/slurm  --enable-cgroupv2
    • compile slurm
    make install
    • modify configuration files (Manager Nodes)

      cp /root/slurm-25.05.2/etc/slurm.conf.example /etc/slurm/slurm.conf
      vim /etc/slurm/slurm.conf

      focus on these options:

      SlurmctldHost=manage
      
      AccountingStorageEnforce=associations,limits,qos
      AccountingStorageHost=manage
      AccountingStoragePass=/var/run/munge/munge.socket.2
      AccountingStoragePort=6819  
      AccountingStorageType=accounting_storage/slurmdbd  
      
      JobCompHost=localhost
      JobCompLoc=slurm_acct_db
      JobCompPass=123456
      JobCompPort=3306
      JobCompType=jobcomp/mysql
      JobCompUser=slurm
      JobContainerType=job_container/none
      JobAcctGatherType=jobacct_gather/linux
      • modify /etc/slurm/slurmdbd.conf (Manager Nodes)
      cp /root/slurm-25.05.2/etc/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
      vim /etc/slurm/slurmdbd.conf
      • modify /etc/slurm/cgroup.conf
      cp /root/slurm-25.05.2/etc/cgroup.conf.example /etc/slurm/cgroup.conf
      • send configuration files to other nodes
      scp -r /etc/slurm/*.conf  root@juice-037:/etc/slurm/
      scp -r /etc/slurm/*.conf  root@juice-038:/etc/slurm/
    • grant privilege on some directories (All Nodes)

    mkdir /var/spool/slurmd
    chown slurm: /var/spool/slurmd
    mkdir /var/log/slurm
    chown slurm: /var/log/slurm
    
    mkdir /var/spool/slurmctld
    chown slurm: /var/spool/slurmctld
    
    chown slurm: /etc/slurm/slurmdbd.conf
    chmod 600 /etc/slurm/slurmdbd.conf
    • start slurm services on each node
    Start the services that match each node's role (for example, the manager runs slurmdbd and slurmctld, and compute nodes run slurmd):
    systemctl start slurmdbd
    systemctl enable slurmdbd
    
    systemctl start slurmctld
    systemctl enable slurmctld
    
    systemctl start slurmd
    systemctl enable slurmd
    Using `systemctl status xxxx` to check if the `xxxx` service is running
    Example slurmdbd.service
    ```text
    # vim /usr/lib/systemd/system/slurmdbd.service
    
    
    [Unit]
    Description=Slurm DBD accounting daemon
    After=network-online.target remote-fs.target munge.service mysql.service mysqld.service mariadb.service sssd.service
    Wants=network-online.target
    ConditionPathExists=/etc/slurm/slurmdbd.conf
    
    [Service]
    Type=simple
    EnvironmentFile=-/etc/sysconfig/slurmdbd
    EnvironmentFile=-/etc/default/slurmdbd
    User=slurm
    Group=slurm
    RuntimeDirectory=slurmdbd
    RuntimeDirectoryMode=0755
    ExecStart=/usr/sbin/slurmdbd -D -s $SLURMDBD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    LimitNOFILE=65536
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    Example slurmctld.service
    ```text
    # vim /usr/lib/systemd/system/slurmctld.service
    
    
    [Unit]
    Description=Slurm controller daemon
    After=network-online.target remote-fs.target munge.service sssd.service
    Wants=network-online.target
    ConditionPathExists=/etc/slurm/slurm.conf
    
    [Service]
    Type=notify
    EnvironmentFile=-/etc/sysconfig/slurmctld
    EnvironmentFile=-/etc/default/slurmctld
    User=slurm
    Group=slurm
    RuntimeDirectory=slurmctld
    RuntimeDirectoryMode=0755
    ExecStart=/usr/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    LimitNOFILE=65536
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    Example slurmd.service
    ```text
    # vim /usr/lib/systemd/system/slurmd.service
    
    
    [Unit]
    Description=Slurm node daemon
    After=munge.service network-online.target remote-fs.target sssd.service
    Wants=network-online.target
    #ConditionPathExists=/etc/slurm/slurm.conf
    
    [Service]
    Type=notify
    EnvironmentFile=-/etc/sysconfig/slurmd
    EnvironmentFile=-/etc/default/slurmd
    RuntimeDirectory=slurm
    RuntimeDirectoryMode=0755
    ExecStart=/usr/sbin/slurmd --systemd $SLURMD_OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    KillMode=process
    LimitNOFILE=131072
    LimitMEMLOCK=infinity
    LimitSTACK=infinity
    Delegate=yes
    
    
    # Uncomment the following lines to disable logging through journald.
    # NOTE: It may be preferable to set these through an override file instead.
    #StandardOutput=null
    #StandardError=null
    
    [Install]
    WantedBy=multi-user.target
    ```
    
    systemctl start slurmd
    systemctl enable slurmd
    Using `systemctl status slurmd` to check if the `slurmd` service is running

    Test Your Slurm Cluster (Login Node)

    • check cluster configuration
    scontrol show config
    • check cluster status
    sinfo
    scontrol show partition
    scontrol show node
    • submit job
    srun -N2 hostname
    scontrol show jobs
    • check job status
    squeue -a
    Aug 7, 2024

    Install From Binary

    Important

    (All Nodes) means all type nodes should install this component.

    (Manager Node) means only the manager node should install this component.

    (Login Node) means only the Auth node should install this component.

    (Cmp) means only the Compute node should install this component.

    Typically, three types of nodes are required to run Slurm:

    1 Manage (Manager Node), 1 Login (Login Node) and N Compute (Cmp) nodes.

    But you can choose to install all services on a single node (check the link).

    Prerequisites

    1. change hostname (All Nodes)
      hostnamectl set-hostname (manager|auth|computeXX)
    2. modify /etc/hosts (All Nodes)
      echo "192.aa.bb.cc (manager|auth|computeXX)" >> /etc/hosts
    3. disable firewall, selinux, dnsmasq, swap (All Nodes). more detail here
    4. NFS Server (Manager Node). NFS is used as the default file system for the Slurm accounting database.
    5. [NFS Client] (All Nodes). All nodes should mount the NFS share (a persistent fstab entry is sketched after this list).
      Install NFS Client
      mount <$nfs_server>:/data /data -o proto=tcp -o nolock
    6. Munge (All Nodes). The auth/munge plugin will be built if the MUNGE authentication development library is installed. MUNGE is used as the default authentication mechanism.
      Install Munge

      All nodes need to have the munge user and group.

      groupadd -g 1108 munge
      useradd -m -c "Munge Uid 'N' Gid Emporium" -d /var/lib/munge -u 1108 -g munge -s /sbin/nologin munge
      yum install epel-release -y
      yum install munge munge-libs munge-devel -y

      Create global secret key

      /usr/sbin/create-munge-key -r
      dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key

      sync secret to the rest of nodes

      scp -p /etc/munge/munge.key root@<$rest_node>:/etc/munge/
      ssh root@<$rest_node> "chown munge: /etc/munge/munge.key && chmod 400 /etc/munge/munge.key"
      ssh root@<$rest_node> "systemctl start munge && systemctl enable munge"

      test munge if it works

      munge -n | unmunge
    7. Database (Manager Node). MySQL support for accounting will be built if the MySQL or MariaDB development library is present. A currently supported version of MySQL or MariaDB should be used.
      Install MariaDB

      install mariadb

      yum -y install mariadb-server
      systemctl start mariadb && systemctl enable mariadb
      ROOT_PASS=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16) 
      mysql -e "CREATE USER root IDENTIFIED BY '${ROOT_PASS}'"

      login mysql

      mysql -u root -p${ROOT_PASS}
      create database slurm_acct_db;
      create user slurm;
      grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by '123456' with grant option;
      flush privileges;
      quit
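    For step 5 above, the NFS mount can also be made persistent across reboots via /etc/fstab; a minimal sketch using the same export (adjust the server address):

      <$nfs_server>:/data  /data  nfs  defaults,_netdev  0 0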

    Install Slurm

    1. create slurm user (All Nodes)
      groupadd -g 1109 slurm
      useradd -m -c "slurm manager" -d /var/lib/slurm -u 1109 -g slurm -s /bin/bash slurm
    Install Slurm in one of the following ways:

    Build RPM package

    1. install dependencies (Manager Node)

      yum -y install gcc gcc-c++ readline-devel perl-ExtUtils-MakeMaker pam-devel rpm-build mysql-devel python3
    2. build rpm package (Manager Node)

      wget https://download.schedmd.com/slurm/slurm-24.05.2.tar.bz2 -O slurm-24.05.2.tar.bz2
      rpmbuild -ta --nodeps slurm-24.05.2.tar.bz2

      The rpm files will be placed under the $HOME/rpmbuild directory of the user building them.

    3. send rpm to rest nodes (Manager Node)

      ssh root@<$rest_node> "mkdir -p /root/rpmbuild/RPMS/"
      scp -pr $HOME/rpmbuild/RPMS/x86_64 root@<$rest_node>:/root/rpmbuild/RPMS/x86_64
    4. install rpm (Manager Node)

      ssh root@<$rest_node> "yum localinstall /root/rpmbuild/RPMS/x86_64/slurm-*"
    5. modify configuration file (Manager Node)

      cp /etc/slurm/cgroup.conf.example /etc/slurm/cgroup.conf
      cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
      cp /etc/slurm/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
      chmod 600 /etc/slurm/slurmdbd.conf
      chown slurm: /etc/slurm/slurmdbd.conf

      cgroup.conf doesn't need to be changed.

      edit /etc/slurm/slurm.conf, you can use this link as a reference

      edit /etc/slurm/slurmdbd.conf, you can use this link as a reference

    Install yum repo directly

    1. install slurm (All Nodes)

      yum -y install slurm-wlm slurmdbd
    2. modify configuration file (All Nodes)

      vim /etc/slurm-llnl/slurm.conf
      vim /etc/slurm-llnl/slurmdbd.conf

      cgroup.conf doesn't need to be changed.

      edit /etc/slurm/slurm.conf, you can use this link as a reference

      edit /etc/slurm/slurmdbd.conf, you can use this link as a reference

    1. send configuration (Manager Node)
       scp -r /etc/slurm/*.conf  root@<$rest_node>:/etc/slurm/
       ssh root@<$rest_node> "mkdir /var/spool/slurmd && chown slurm: /var/spool/slurmd"
       ssh root@<$rest_node> "mkdir /var/log/slurm && chown slurm: /var/log/slurm"
       ssh root@<$rest_node> "mkdir /var/spool/slurmctld && chown slurm: /var/spool/slurmctld"
    2. start service (Manager Node)
      ssh root@<$rest_node> "systemctl start slurmdbd && systemctl enable slurmdbd"
      ssh root@<$rest_node> "systemctl start slurmctld && systemctl enable slurmctld"
    3. start service (All Nodes)
      ssh root@<$rest_node> "systemctl start slurmd && systemctl enable slurmd"

    Test

    1. show cluster status
    scontrol show config
    sinfo
    scontrol show partition
    scontrol show node
    2. submit job
    srun -N2 hostname
    scontrol show jobs
    3. check job status
    squeue -a

    Reference:

    1. https://slurm.schedmd.com/documentation.html
    2. https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/
    3. https://github.com/Artlands/Install-Slurm
    Aug 7, 2024

    Install From Helm Chart

    Compared with the complex binary installation, a Helm chart is an easier way to install Slurm.

    Source code can be found at https://github.com/AaronYang0628/slurm-on-k8s

    Prerequisites

    1. Kubernetes has been installed; if not, check the 🔗link
    2. The Helm binary has been installed; if not, check the 🔗link

    Installation

    1. get helm repo and update

      helm repo add ay-helm-mirror https://aaronyang0628.github.io/helm-chart-mirror/charts
      helm repo update
    2. install slurm chart

      # wget -O slurm.values.yaml https://raw.githubusercontent.com/AaronYang0628/slurm-on-k8s/refs/heads/main/chart/values.yaml
      helm install slurm ay-helm-mirror/chart -f slurm.values.yaml --version 1.0.10

      Or you can get template values.yaml from https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/slurm.values.yaml

    3. check chart status

      helm -n slurm list
    Aug 7, 2024

    Install From K8s Operator

    Compared with the complex binary installation, using a k8s operator is an easier way to install Slurm.

    Source code can be found at https://github.com/AaronYang0628/slurm-on-k8s

    Prerequisites

    1. Kubernetes has been installed; if not, check the 🔗link
    2. The Helm binary has been installed; if not, check the 🔗link

    Installation

    1. deploy slurm operator

      kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/operator_install.yaml
      Expected Output
      [root@ay-zj-ecs operator]# kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/operator_install.yaml
      namespace/slurm created
      customresourcedefinition.apiextensions.k8s.io/slurmdeployments.slurm.ay.dev created
      serviceaccount/slurm-operator-controller-manager created
      role.rbac.authorization.k8s.io/slurm-operator-leader-election-role created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-manager-role created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-metrics-auth-role created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-metrics-reader created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-slurmdeployment-admin-role created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-slurmdeployment-editor-role created
      clusterrole.rbac.authorization.k8s.io/slurm-operator-slurmdeployment-viewer-role created
      rolebinding.rbac.authorization.k8s.io/slurm-operator-leader-election-rolebinding created
      clusterrolebinding.rbac.authorization.k8s.io/slurm-operator-manager-rolebinding created
      clusterrolebinding.rbac.authorization.k8s.io/slurm-operator-metrics-auth-rolebinding created
      service/slurm-operator-controller-manager-metrics-service created
      deployment.apps/slurm-operator-controller-manager created
    2. check operator status

      kubectl -n slurm get pod
      Expected Output
      [root@ay-zj-ecs operator]# kubectl -n slurm get pod
      NAME                                READY   STATUS    RESTARTS   AGE
      slurm-operator-controller-manager   1/1     Running   0          27s
    3. apply CRD slurmdeployment

      kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/slurmdeployment.zj.values.yaml
      Expected Output
      [root@ay-zj-ecs operator]# kubectl apply -f https://raw.githubusercontent.com/AaronYang0628/helm-chart-mirror/refs/heads/main/templates/slurm/slurmdeployment.zj.values.yaml
      slurmdeployment.slurm.ay.dev/lensing created
    4. check slurmdeployment status

      kubectl get slurmdeployment
      kubectl -n slurm logs -f deploy/slurm-operator-controller-manager
      # kubectl get slurmdep
      # kubectl -n test get pods
      Expected Output
      [root@ay-zj-ecs ~]# kubectl get slurmdep -w
      NAME      CPU   GPU   LOGIN   CTLD   DBD   DBSVC   JOB COMMAND                     STATUS
      lensing   0/1   0/0   0/1     0/1    0/1   0/1     sh -c srun -N 2 /bin/hostname   
      lensing   1/2   0/0   1/1     1/1    1/1   1/1     sh -c srun -N 2 /bin/hostname   
      lensing   2/2   0/0   1/1     1/1    1/1   1/1     sh -c srun -N 2 /bin/hostname   
    5. upgrade slurmdep

      kubectl edit slurmdep lensing
      # set SlurmCPU.replicas = 3
      Expected Output
      [root@ay-zj-ecs ~]# kubectl edit slurmdep lensing
      slurmdeployment.slurm.ay.dev/lensing edited
      
      [root@ay-zj-ecs ~]# kubectl get slurmdep -w
      NAME      CPU   GPU   LOGIN   CTLD   DBD   DBSVC   JOB COMMAND                     STATUS
      lensing   2/2   0/0   1/1     1/1    1/1   1/1     sh -c srun -N 2 /bin/hostname   
      lensing   2/3   0/0   1/1     1/1    1/1   1/1     sh -c srun -N 2 /bin/hostname   
      lensing   3/3   0/0   1/1     1/1    1/1   1/1     sh -c srun -N 2 /bin/hostname   
    Aug 7, 2024

    Try OpenSCOW

    What is SCOW?

    SCOW is an HPC cluster management system built by PKU.

    SCOW uses four virtual machines to run a Slurm cluster. It is a good way to learn how to use Slurm.

    You should check https://pkuhpc.github.io/OpenSCOW/docs/hpccluster, it works well.

    Aug 7, 2024

    Subsections of CheatSheet

    Common Environment Variables

    | Variable | Description |
    | --- | --- |
    | $SLURM_JOB_ID | The Job ID. |
    | $SLURM_JOBID | Deprecated. Same as $SLURM_JOB_ID. |
    | $SLURM_SUBMIT_HOST | The hostname of the node used for job submission. |
    | $SLURM_JOB_NODELIST | Contains the definition (list) of the nodes that are assigned to the job. |
    | $SLURM_NODELIST | Deprecated. Same as $SLURM_JOB_NODELIST. |
    | $SLURM_CPUS_PER_TASK | Number of CPUs per task. |
    | $SLURM_CPUS_ON_NODE | Number of CPUs on the allocated node. |
    | $SLURM_JOB_CPUS_PER_NODE | Count of processors available to the job on this node. |
    | $SLURM_CPUS_PER_GPU | Number of CPUs requested per allocated GPU. |
    | $SLURM_MEM_PER_CPU | Memory per CPU. Same as --mem-per-cpu. |
    | $SLURM_MEM_PER_GPU | Memory per GPU. |
    | $SLURM_MEM_PER_NODE | Memory per node. Same as --mem. |
    | $SLURM_GPUS | Number of GPUs requested. |
    | $SLURM_NTASKS | Same as -n, --ntasks. The number of tasks. |
    | $SLURM_NTASKS_PER_NODE | Number of tasks requested per node. |
    | $SLURM_NTASKS_PER_SOCKET | Number of tasks requested per socket. |
    | $SLURM_NTASKS_PER_CORE | Number of tasks requested per core. |
    | $SLURM_NTASKS_PER_GPU | Number of tasks requested per GPU. |
    | $SLURM_NPROCS | Same as -n, --ntasks. See $SLURM_NTASKS. |
    | $SLURM_TASKS_PER_NODE | Number of tasks to be initiated on each node. |
    | $SLURM_ARRAY_JOB_ID | Job array's master job ID number. |
    | $SLURM_ARRAY_TASK_ID | Job array ID (index) number. |
    | $SLURM_ARRAY_TASK_COUNT | Total number of tasks in a job array. |
    | $SLURM_ARRAY_TASK_MAX | Job array's maximum ID (index) number. |
    | $SLURM_ARRAY_TASK_MIN | Job array's minimum ID (index) number. |

    A full list of environment variables for SLURM can be found by visiting the SLURM page on environment variables.
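    As a quick way to see some of these variables in action, a minimal batch script that simply echoes a few of them (the job name and output file below are arbitrary):

    #!/bin/bash
    #SBATCH --job-name=env-demo
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=1
    #SBATCH --output=env_demo_%j.out

    echo "Job ID:      ${SLURM_JOB_ID}"
    echo "Submit host: ${SLURM_SUBMIT_HOST}"
    echo "Node list:   ${SLURM_JOB_NODELIST}"
    echo "Tasks:       ${SLURM_NTASKS}"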

    Aug 7, 2024

    File Operations

    File Distribution

    • sbcast is used to transfer a file from local disk to local disk on the nodes allocated to a job. This can be used to effectively use diskless compute nodes or provide improved performance relative to a shared file system.
      • Feature
        1. distribute files: quickly copy files to all compute nodes assigned to the job, avoiding the hassle of manually distributing them. Faster than traditional scp or rsync, especially when distributing to multiple nodes.
        2. simplify scripts: a single command distributes files to all nodes assigned to the job.
        3. improve performance: speeds up file distribution by parallelizing transfers, especially for large or multiple files.
      • Usage
        1. Alone
        sbcast <source_file> <destination_path>
        2. Embedded in a job script
        #!/bin/bash
        #SBATCH --job-name=example_job
        #SBATCH --output=example_job.out
        #SBATCH --error=example_job.err
        #SBATCH --partition=compute
        #SBATCH --nodes=4
        
        # Use sbcast to distribute the file to the /tmp directory of each node
        sbcast data.txt /tmp/data.txt
        
        # Run your program using the distributed files
        srun my_program /tmp/data.txt

    File Collection

    1. File Redirection: when submitting a job, you can use the #SBATCH --output and #SBATCH --error directives to redirect standard output and standard error to specified files.

       #SBATCH --output=output.txt
       #SBATCH --error=error.txt

      Or

      sbatch -N2 -w "compute[01-02]" -o result/file/path xxx.slurm
    2. Send files back manually: use scp or rsync in the job to copy the files from the compute nodes to the submit node (a sketch follows this list).

    3. Using NFS If a shared file system (such as NFS, Lustre, or GPFS) is configured in the computing cluster, the result files can be written directly to the shared directory. In this way, the result files generated by all nodes are automatically stored in the same location.

    4. Using sbcast
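    As referenced in item 2 above, a minimal sketch of copying a result file back to the submit host from inside a job script (my_program, the paths, and passwordless ssh back to the submit host are assumptions, not part of the original guide):

    #!/bin/bash
    #SBATCH --job-name=collect-example
    #SBATCH --nodes=1
    #SBATCH --output=collect_%j.out

    # run the real workload, writing to node-local storage
    ./my_program --output /tmp/result_${SLURM_JOB_ID}.dat

    # copy the result back to the submit host
    scp /tmp/result_${SLURM_JOB_ID}.dat ${SLURM_SUBMIT_HOST}:~/results/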

    Aug 7, 2024

    Submit Jobs

    3 Types of Jobs

    • srun is used to submit a job for execution or initiate job steps in real time.

      • Example
        1. run shell
        srun -N2 /bin/hostname
        2. run script
        srun -N1 test.sh
        3. exec into a slurmd node
        srun -w slurm-lensing-slurm-slurmd-cpu-2 --pty /bin/bash
    • sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.

      • Example

        1. submit a batch job
        sbatch -N2 -w "compute[01-02]" -o job.stdout /data/jobs/batch-job.slurm
        batch-job.slurm
        #!/bin/bash
        
        #SBATCH -N 1
        #SBATCH --job-name=cpu-N1-batch
        #SBATCH --partition=compute
        #SBATCH --mail-type=end
        #SBATCH --mail-user=xxx@email.com
        #SBATCH --output=%j.out
        #SBATCH --error=%j.err
        
        srun -l /bin/hostname #you can still write srun <command> in here
        srun -l pwd
        
        2. submit a parallel task to process different data partitions
        sbatch /data/jobs/parallel.slurm
        parallel.slurm
        #!/bin/bash
        #SBATCH -N 2 
        #SBATCH --job-name=cpu-N2-parallel
        #SBATCH --partition=compute
        #SBATCH --time=01:00:00
        #SBATCH --array=1-4  # define a job array, assuming 4 data shards
        #SBATCH --ntasks-per-node=1 # run only one task per node
        #SBATCH --output=process_data_%A_%a.out
        #SBATCH --error=process_data_%A_%a.err
        
        TASK_ID=${SLURM_ARRAY_TASK_ID}
        
        DATA_PART="data_part_${TASK_ID}.txt" #make sure you have that file
        
        if [ -f ${DATA_PART} ]; then
            echo "Processing ${DATA_PART} on node $(hostname)"
            # python process_data.py --input ${DATA_PART}
        else
            echo "File ${DATA_PART} does not exist!"
        fi
        
        how to split file
        split -l 1000 data.txt data_part_ 
        && mv data_part_aa data_part_1 
        && mv data_part_ab data_part_2
        
    • salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.

      • Example
        1. allocate resources (more like create an virtual machine)
        salloc -N2 bash
        This command creates a job that allocates 2 nodes and spawns a bash shell. You can then execute srun commands in that environment. After your computing task finishes, remember to shut down your job.
        scancel <$job_id>
        when you exit the job, the resources will be released.
    Aug 7, 2024

    Configuration Files

    Aug 7, 2024

    Subsections of MPI Libs

    Test Intel MPI Jobs

    Using MPI (Message Passing Interface) for parallel computing on a Slurm cluster usually involves the following steps:

    1. Install an MPI library

    Make sure the cluster nodes already have an MPI library installed. Common MPI implementations include:

    • OpenMPI
    • Intel MPI
    • MPICH
    You can check whether MPI is installed with the following commands:
    mpicc --version  # check the MPI compiler
    mpirun --version # check the MPI runtime environment

    2. Test MPI performance

    mpirun -n 2 IMB-MPI1 pingpong

    3. Write the MPI program

    You can compile MPI programs with mpicc (for C) or mpic++ (for C++). For example:

    Below is a simple MPI "Hello, World!" example program, assumed to be saved as hello_mpi.c, together with a second example that computes a vector dot product, saved as dot_product.c. Pick either one:

    #include <stdio.h>
    #include <mpi.h>
    
    int main(int argc, char *argv[]) {
        int rank, size;
        
        // Initialize the MPI environment
        MPI_Init(&argc, &argv);
    
        // Get the rank of the current process and the total number of processes
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
    
        // Print this process's information
        printf("Hello, World! I am process %d out of %d processes.\n", rank, size);
    
        // Finalize the MPI environment
        MPI_Finalize();
    
        return 0;
    }
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>
    
    #define N 8  // vector size
    
    // compute the local part of the dot product
    double compute_local_dot_product(double *A, double *B, int start, int end) {
        double local_dot = 0.0;
        for (int i = start; i < end; i++) {
            local_dot += A[i] * B[i];
        }
        return local_dot;
    }
    
    void print_vector(double *Vector) {
        for (int i = 0; i < N; i++) {
            printf("%f ", Vector[i]);   
        }
        printf("\n");
    }
    
    int main(int argc, char *argv[]) {
        int rank, size;
    
        // Initialize the MPI environment
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
    
        // vectors A and B
        double A[N], B[N];
    
        // rank 0 initializes vectors A and B
        if (rank == 0) {
            for (int i = 0; i < N; i++) {
                A[i] = i + 1;  // sample data
                B[i] = (i + 1) * 2;  // sample data
            }
        }
    
        // broadcast vectors A and B to all processes
        MPI_Bcast(A, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(B, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    
        // each process computes its own portion
        int local_n = N / size;  // number of elements handled by each process
        int start = rank * local_n;
        int end = (rank + 1) * local_n;
        
        // if this is the last process, handle all remaining elements (N % size)
        if (rank == size - 1) {
            end = N;
        }
    
        double local_dot_product = compute_local_dot_product(A, B, start, end);
    
        // use MPI_Reduce to sum the local dot products into rank 0
        double global_dot_product = 0.0;
        MPI_Reduce(&local_dot_product, &global_dot_product, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    
        // rank 0 prints the final result
        if (rank == 0) {
            printf("Vector A is\n");
            print_vector(A);
            printf("Vector B is\n");
            print_vector(B);
            printf("Dot Product of A and B: %f\n", global_dot_product);
        }
    
        // finalize the MPI environment
        MPI_Finalize();
        return 0;
    }

    4. Create the Slurm job script

    Create a Slurm job script to run the MPI program. Below is a basic Slurm job script, assumed to be saved as mpi_test.slurm:

    #!/bin/bash
    #SBATCH --job-name=mpi_job       # Job name
    #SBATCH --nodes=2                # Number of nodes to use
    #SBATCH --ntasks-per-node=1      # Number of tasks per node
    #SBATCH --time=00:10:00          # Time limit
    #SBATCH --output=mpi_test_output_%j.log     # Standard output file
    #SBATCH --error=mpi_test_output_%j.err     # Standard error file
    
    # Manually set Intel OneAPI MPI and Compiler environment
    export I_MPI_PMI=pmi2
    export I_MPI_PMI_LIBRARY=/usr/lib/x86_64-linux-gnu/slurm/mpi_pmi2.so
    export I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.14
    export INTEL_COMPILER_ROOT=/opt/intel/oneapi/compiler/2025.0
    export PATH=$I_MPI_ROOT/bin:$INTEL_COMPILER_ROOT/bin:$PATH
    export LD_LIBRARY_PATH=$I_MPI_ROOT/lib:$INTEL_COMPILER_ROOT/lib:$LD_LIBRARY_PATH
    export MANPATH=$I_MPI_ROOT/man:$INTEL_COMPILER_ROOT/man:$MANPATH
    
    # Compile the MPI program
    icx-cc -I$I_MPI_ROOT/include  hello_mpi.c -o hello_mpi -L$I_MPI_ROOT/lib -lmpi
    
    # Run the MPI job
    
    mpirun -np 2 ./hello_mpi
    #!/bin/bash
    #SBATCH --job-name=mpi_job       # Job name
    #SBATCH --nodes=2                # Number of nodes to use
    #SBATCH --ntasks-per-node=1      # Number of tasks per node
    #SBATCH --time=00:10:00          # Time limit
    #SBATCH --output=mpi_test_output_%j.log     # Standard output file
    #SBATCH --error=mpi_test_output_%j.err     # Standard error file
    
    # Manually set Intel OneAPI MPI and Compiler environment
    export I_MPI_PMI=pmi2
    export I_MPI_PMI_LIBRARY=/usr/lib/x86_64-linux-gnu/slurm/mpi_pmi2.so
    export I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.14
    export INTEL_COMPILER_ROOT=/opt/intel/oneapi/compiler/2025.0
    export PATH=$I_MPI_ROOT/bin:$INTEL_COMPILER_ROOT/bin:$PATH
    export LD_LIBRARY_PATH=$I_MPI_ROOT/lib:$INTEL_COMPILER_ROOT/lib:$LD_LIBRARY_PATH
    export MANPATH=$I_MPI_ROOT/man:$INTEL_COMPILER_ROOT/man:$MANPATH
    
    # Compile the MPI program
    icx-cc -I$I_MPI_ROOT/include  dot_product.c -o dot_product -L$I_MPI_ROOT/lib -lmpi
    
    # Run the MPI job
    
    mpirun -np 2 ./dot_product

    5. Compile the MPI program

    Before running the job, you need to compile the MPI program. Use mpicc on the cluster to compile it. Assuming you saved the program as hello_mpi.c, compile it with:

    mpicc -o hello_mpi hello_mpi.c
    mpicc -o dot_product dot_product.c

    6. Submit the Slurm job

    Save the job script above (mpi_test.slurm) and submit it with:

    sbatch mpi_test.slurm

    7. Check the job status

    You can check the status of the job with:

    squeue -u <your_username>

    8. Check the output

    After the job finishes, the output is saved in the files specified in your job script (for example mpi_test_output_<job_id>.log). You can view it with cat or any text editor:

    cat mpi_test_output_*.log

    Example output: if everything works, the output will look similar to:

    Hello, World! I am process 0 out of 2 processes.
    Hello, World! I am process 1 out of 2 processes.
    Vector A is
    1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000
    Vector B is
    2.000000 4.000000 6.000000 8.000000 10.000000 12.000000 14.000000 16.000000
    Dot Product of A and B: 408.000000
    Aug 7, 2024

    Test Open MPI Jobs

    Using MPI (Message Passing Interface) for parallel computing on a Slurm cluster usually involves the following steps:

    1. Install an MPI library

    Make sure the cluster nodes already have an MPI library installed. Common MPI implementations include:

    • OpenMPI
    • Intel MPI
    • MPICH
    You can check whether MPI is installed with the following commands:
    mpicc --version  # check the MPI compiler
    mpirun --version # check the MPI runtime environment

    2. Write the MPI program

    You can compile MPI programs with mpicc (for C) or mpic++ (for C++). For example:

    Below is a simple MPI "Hello, World!" example program, assumed to be saved as hello_mpi.c, together with a second example that computes a vector dot product, saved as dot_product.c. Pick either one:

    #include <stdio.h>
    #include <mpi.h>
    
    int main(int argc, char *argv[]) {
        int rank, size;
        
        // Initialize the MPI environment
        MPI_Init(&argc, &argv);
    
        // Get the rank of the current process and the total number of processes
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
    
        // Print this process's information
        printf("Hello, World! I am process %d out of %d processes.\n", rank, size);
    
        // Finalize the MPI environment
        MPI_Finalize();
    
        return 0;
    }
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>
    
    #define N 8  // vector size
    
    // compute the local part of the dot product
    double compute_local_dot_product(double *A, double *B, int start, int end) {
        double local_dot = 0.0;
        for (int i = start; i < end; i++) {
            local_dot += A[i] * B[i];
        }
        return local_dot;
    }
    
    void print_vector(double *Vector) {
        for (int i = 0; i < N; i++) {
            printf("%f ", Vector[i]);   
        }
        printf("\n");
    }
    
    int main(int argc, char *argv[]) {
        int rank, size;
    
        // Initialize the MPI environment
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
    
        // vectors A and B
        double A[N], B[N];
    
        // rank 0 initializes vectors A and B
        if (rank == 0) {
            for (int i = 0; i < N; i++) {
                A[i] = i + 1;  // sample data
                B[i] = (i + 1) * 2;  // sample data
            }
        }
    
        // broadcast vectors A and B to all processes
        MPI_Bcast(A, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(B, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    
        // each process computes its own portion
        int local_n = N / size;  // number of elements handled by each process
        int start = rank * local_n;
        int end = (rank + 1) * local_n;
        
        // if this is the last process, handle all remaining elements (N % size)
        if (rank == size - 1) {
            end = N;
        }
    
        double local_dot_product = compute_local_dot_product(A, B, start, end);
    
        // use MPI_Reduce to sum the local dot products into rank 0
        double global_dot_product = 0.0;
        MPI_Reduce(&local_dot_product, &global_dot_product, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    
        // rank 0 prints the final result
        if (rank == 0) {
            printf("Vector A is\n");
            print_vector(A);
            printf("Vector B is\n");
            print_vector(B);
            printf("Dot Product of A and B: %f\n", global_dot_product);
        }
    
        // finalize the MPI environment
        MPI_Finalize();
        return 0;
    }

    3. Create the Slurm job script

    Create a Slurm job script to run the MPI program. Below is a basic Slurm job script, assumed to be saved as mpi_test.slurm:

    #!/bin/bash
    #SBATCH --job-name=mpi_test                 # job name
    #SBATCH --nodes=2                           # number of nodes requested
    #SBATCH --ntasks-per-node=1                 # number of tasks per node
    #SBATCH --time=00:10:00                     # maximum run time
    #SBATCH --output=mpi_test_output_%j.log     # output log file

    # Load the MPI module (if a module environment is used)
    module load openmpi

    # Run the MPI program
    mpirun --allow-run-as-root -np 2 ./hello_mpi
    #!/bin/bash
    #SBATCH --job-name=mpi_test                 # job name
    #SBATCH --nodes=2                           # number of nodes requested
    #SBATCH --ntasks-per-node=1                 # number of tasks per node
    #SBATCH --time=00:10:00                     # maximum run time
    #SBATCH --output=mpi_test_output_%j.log     # output log file

    # Load the MPI module (if a module environment is used)
    module load openmpi

    # Run the MPI program
    mpirun --allow-run-as-root -np 2 ./dot_product

    4. Compile the MPI program

    Before running the job, you need to compile the MPI program. Use mpicc on the cluster to compile it. Assuming you saved the program as hello_mpi.c, compile it with:

    mpicc -o hello_mpi hello_mpi.c
    mpicc -o dot_product dot_product.c

    5. Submit the Slurm job

    Save the job script above (mpi_test.slurm) and submit it with:

    sbatch mpi_test.slurm

    6. Check the job status

    You can check the status of the job with:

    squeue -u <your_username>

    7. Check the output

    After the job finishes, the output is saved in the files specified in your job script (for example mpi_test_output_<job_id>.log). You can view it with cat or any text editor:

    cat mpi_test_output_*.log

    Example output: if everything works, the output will look similar to:

    Hello, World! I am process 0 out of 2 processes.
    Hello, World! I am process 1 out of 2 processes.
    Vector A is
    1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000
    Vector B is
    2.000000 4.000000 6.000000 8.000000 10.000000 12.000000 14.000000 16.000000
    Dot Product of A and B: 408.000000
    Aug 7, 2024

    Data Warehouse