Running accel-sim on Ubuntu 20.04


accel-sim website: https://accel-sim.github.io/#manual
Repository: https://github.com/accel-sim/accel-sim-framework
Benchmarks: https://github.com/accel-sim/gpu-app-collection

I recommend going straight to https://github.com/accel-sim/ and picking the repository you need.


./util/job_launching/run_simulations.py -B rodinia_2.0-ft -C QV100-SASS -T ./hw_run/traces/device-<device-num>/<cuda-version>/ -N myTest
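The `<device-num>` and `<cuda-version>` placeholders in the command above have to be filled in by hand. As a minimal sketch, assuming the traces sit under `./hw_run/traces/` exactly as in that command, a small helper can build the path. `trace_dir` is a hypothetical name, not part of Accel-Sim, and the values 0 and 11.0 are only examples:

```shell
#!/bin/sh
# Hypothetical helper (not part of Accel-Sim): build the -T argument
# from a GPU device number and a CUDA version string.
trace_dir() {
    device_num=$1
    cuda_version=$2
    printf './hw_run/traces/device-%s/%s/\n' "$device_num" "$cuda_version"
}

# Example values (assumptions): device 0, CUDA 11.0.
trace_dir 0 11.0
```

The result can then be passed as `-T "$(trace_dir 0 11.0)"` to `run_simulations.py`.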

The above command will run the workloads in Accel-Sim’s SASS traces-driven mode. You can also run the workloads in PTX mode using:

PTX mode usage: ./util/job_launching/run_simulations.py -B <benchmark> -C <gpu_config> -N <run_identifier>
Optional:
[-B benchmark]              (From the gpu-app-collection compiled in Step 1)
[-C gpu_config]             (List of supported configs: accel-sim-framework/util/job_launching/configs/define-standard-cfgs.yml)

Eg:

./util/job_launching/run_simulations.py -B rodinia_2.0-ft -C QV100-PTX -N myTest-PTX

You can monitor the tests using:

./util/job_launching/monitor_func_test.py -v -N myTest

After the jobs finish - you can collect all the stats using:

./util/job_launching/get_stats.py -N myTest | tee stats.csv

If you want to run the accel-sim.out executable command itself for specific workload, you can use:

./gpu-simulator/bin/release/accel-sim.out -trace ./hw_run/rodinia_2.0-ft/9.1/backprop-rodinia-2.0-ft/4096___data_result_4096_txt/traces/kernelslist.g -config ./gpu-simulator/gpgpu-sim/configs/tested-cfgs/SM7_QV100/gpgpusim.config -config ./gpu-simulator/configs/tested-cfgs/SM7_QV100/trace.config

However, we encourage you to use our workload launch manager ‘run_simulations’ script as shown above, which will greatly simplify the simulation process and increase productivity.

To understand what is going on and how to just run the simulator in isolation without the framework, read this.

To better understand the Accel-Sim front-end and the interface with GPGPU-Sim, read this.

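In short, SASS mode requires collecting traces first, so I will just use PTX mode, even though it is slower.

First, let's work out how to run cuDNN in PTX mode.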

git clone https://github.com/accel-sim/gpu-app-collection  
source ./gpu-app-collection/src/setup_environment  
make -j -C ./gpu-app-collection/src deeplearning  
make -C ./gpu-app-collection/src data 


./util/job_launching/run_simulations.py -B deeplearning -C QV100-PTX -N myTest-PTX

Create a symlink for the current CUDA version:

sudo ln -s /usr/local/cuda-11.0 /usr/local/cuda
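If `/usr/local/cuda` already exists, a plain `ln -s` either fails with "File exists" or, when the existing path is a directory or a link to one, creates the new link inside it. A slightly more defensive sketch, assuming an `ln` that supports `-n` (GNU coreutils does); `link_cuda` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: point a symlink at a CUDA install,
# replacing an older symlink but never a real file or directory.
link_cuda() {
    target=$1   # e.g. /usr/local/cuda-11.0
    link=$2     # e.g. /usr/local/cuda
    if [ ! -d "$target" ]; then
        echo "CUDA directory not found: $target" >&2
        return 1
    fi
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing to replace non-symlink: $link" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a plain file
    ln -sfn "$target" "$link"
}
```

Since `sudo` cannot call a shell function, either run this from a root shell or use the one-line equivalent `sudo ln -sfn /usr/local/cuda-11.0 /usr/local/cuda`.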

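Unfortunately, this does not run. In other words, PTX mode does not work for the deeplearning benchmark. I found relevant information from the authors in the repository issues.

In other words, the authors only care about running in trace mode, so I reran everything in trace mode.

The later steps were run on a server with Ubuntu 18.04.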

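There is one odd problem: when running `make -j -C ./gpu-simulator`, if a previous build failed, it is best to run `make clean` first and then build again; otherwise the result can be a segmentation fault. I was stuck on this for a long time and only understood it after reading an issue in the accel-sim repository.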
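To make the clean step hard to forget, the two commands can be wrapped together. This is only a sketch: `rebuild_simulator` is a hypothetical name, and it assumes that `clean` is run against the same `./gpu-simulator` directory as the build. The `MAKE` variable is an addition of this sketch so that the commands can be previewed without building anything:

```shell
#!/bin/sh
# Hypothetical helper: always clean before rebuilding the simulator.
# MAKE may be overridden (for example for a dry run); it defaults to make.
rebuild_simulator() {
    sim_dir=${1:-./gpu-simulator}
    # Remove leftovers from a previous (possibly failed) build.
    "${MAKE:-make}" -C "$sim_dir" clean || return 1
    # Rebuild from scratch with parallel jobs.
    "${MAKE:-make}" -j -C "$sim_dir"
}
```

Calling `rebuild_simulator` with no argument targets `./gpu-simulator`; a different directory can be passed as the first argument.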

To run in trace mode, taking deepbench as an example, once accel-sim-framework and gpu-app-collection are installed, run:


source ./gpu-app-collection/src/setup_environment  
make -j -C ./gpu-app-collection/src deeplearning 
make -C ./gpu-app-collection/src data

./util/tracer_nvbit/run_hw_trace.py -B deeplearning -D <gpu-device-num-to-run-on>  

# Alternatively, use the traces prepared by the authors directly:
./get-accel-sim-traces.py # choose deepbench when prompted

./util/job_launching/run_simulations.py -B deeplearning -C QV100-SASS -T ./hw_run/deepbench/11.0 -N myTest
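Before launching, it can be worth checking that the directory given to `-T` really contains traces. The sketch below assumes that every traced workload has a `kernelslist.g` file somewhere under the trace root, as in the `accel-sim.out` example path earlier; `count_trace_lists` is a hypothetical helper, not an Accel-Sim tool:

```shell
#!/bin/sh
# Hypothetical helper: count kernelslist.g files under a trace root.
# Prints 0 and returns non-zero if the directory does not exist.
count_trace_lists() {
    trace_root=$1
    if [ ! -d "$trace_root" ]; then
        echo 0
        return 1
    fi
    # tr strips the padding that some wc implementations add.
    find "$trace_root" -type f -name 'kernelslist.g' | wc -l | tr -d ' '
}
```

For example, `count_trace_lists ./hw_run/deepbench/11.0` printing 0 suggests that the traces were not generated or downloaded to that path.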

Author: Yixiang Zhang
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: Yixiang Zhang.