File Operations

File Distribution

  • sbcast is used to transfer a file from local disk to local disk on the nodes allocated to a job. This can be used to effectively use diskless compute nodes or provide improved performance relative to a shared file system.
    • Features
      1. Distribute files: Quickly copies a file to all compute nodes assigned to the job, so files do not have to be distributed by hand. Typically faster than running scp or rsync against each node in turn, especially when there are many nodes.
      2. Simplify scripts: A single command distributes the file to every node assigned to the job.
      3. Improve performance: Transfers run in parallel, which speeds up distribution, especially for large or multiple files.
    • Usage
      1. Standalone, from within an existing job allocation
      sbcast <source_file> <destination_path>
      2. Embedded in a job script
      #!/bin/bash
      #SBATCH --job-name=example_job
      #SBATCH --output=example_job.out
      #SBATCH --error=example_job.err
      #SBATCH --partition=compute
      #SBATCH --nodes=4
      
      # Use sbcast to distribute the file to the /tmp directory of each node
      sbcast data.txt /tmp/data.txt
      
      # Run your program using the distributed files
      srun my_program /tmp/data.txt
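
The job script above can be made a little more defensive. The following is a minimal sketch, not a definitive implementation: stage_file and the SBCAST_CMD override are hypothetical names introduced here for illustration, and data.txt, /tmp and my_program are the placeholders from the example above. It uses the sbcast options --force (replace an existing destination file) and --preserve (keep the source file's times and mode), and only broadcasts when the script is running inside a Slurm job.

```shell
#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --nodes=4

# SBCAST_CMD is a hypothetical override so the helper can be dry-run
# (for example with SBCAST_CMD=echo) outside of a job allocation.
SBCAST_CMD="${SBCAST_CMD:-sbcast}"

# stage_file SRC DEST_DIR
# Hypothetical helper: broadcast SRC to DEST_DIR on every allocated node.
stage_file() {
    local src="$1" dest_dir="$2"
    if [ ! -f "$src" ]; then
        echo "stage_file: missing source file: $src" >&2
        return 1
    fi
    # --force replaces an existing destination file;
    # --preserve keeps the modification time and mode of the source.
    $SBCAST_CMD --force --preserve "$src" "$dest_dir/$(basename "$src")"
}

# SLURM_JOB_ID is set by Slurm inside a job; skip the work otherwise.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    stage_file data.txt /tmp || exit 1
    srun my_program /tmp/data.txt
fi
```

Stopping the job when staging fails avoids launching my_program on nodes that never received the input file.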

File Collection

  1. File redirection: When submitting a job, use the #SBATCH --output and #SBATCH --error directives to redirect standard output and standard error to the specified files.

     #SBATCH --output=output.txt
     #SBATCH --error=error.txt

    Or pass the output path on the sbatch command line with -o:

    sbatch -N2 -w "compute[01-02]" -o result/file/path xxx.slurm
  2. Copy to the destination manually: Use scp or rsync inside the job to copy the result files from the compute nodes to the submit node.

  3. Using NFS: If a shared file system (such as NFS, Lustre, or GPFS) is configured on the cluster, result files can be written directly to a shared directory, so the files generated by all nodes end up in the same location automatically.

  4. Using sbcast
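
Option 2 above (copying results back manually) could look roughly like the following sketch. It relies on several assumptions: the compute nodes can reach the submit host over passwordless SSH, each node writes distinctly named files into the node-local directory /tmp/results, and my_program is a placeholder that takes that directory as its argument. collect_dest is a hypothetical helper name. SLURM_SUBMIT_HOST, SLURM_SUBMIT_DIR, SLURM_JOB_ID and SLURM_JOB_NUM_NODES are environment variables that Slurm sets inside a batch job.

```shell
#!/bin/bash
#SBATCH --job-name=collect_example
#SBATCH --nodes=2
#SBATCH --output=collect_%j.out

# Hypothetical helper: build the rsync destination on the submit host
# from variables that Slurm sets inside a job. Fails if they are unset.
collect_dest() {
    printf '%s:%s/results_%s/' \
        "${SLURM_SUBMIT_HOST:?not inside a Slurm job}" \
        "${SLURM_SUBMIT_DIR:?not inside a Slurm job}" \
        "${SLURM_JOB_ID:?not inside a Slurm job}"
}

# SLURM_JOB_ID is set by Slurm inside a job; skip the work otherwise.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Placeholder program, assumed to write into node-local /tmp/results.
    srun my_program /tmp/results

    dest="$(collect_dest)" || exit 1
    # One rsync per node, so each node's local files are copied back.
    # -a preserves attributes; the trailing slash copies the directory contents.
    srun --ntasks="$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
        rsync -a /tmp/results/ "$dest"
fi
```

With a shared file system (option 3) the copy step is unnecessary: the program can write straight into a directory under the shared path.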