After following the NVIDIA README, adding the system configuration, and building the workloads, I tried to run the resnet50 benchmark, but at the beginning of execution I get the following error:
(mlperf) mahmood@mlperf-inference-mahmood-x86-64-26486:/work$ make run RUN_ARGS="--benchmarks=resnet50 --scenarios=offline"
make[1]: Entering directory '/work'
[2024-06-28 07:33:14,784 main.py:229 INFO] Detected system ID: KnownSystem.rtx3080_ryzen3700x
[2024-06-28 07:33:15,654 generate_engines.py:173 INFO] Building engines for resnet50 benchmark in Offline scenario...
[06/28/2024-07:33:15] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 35, GPU 823 (MiB)
[06/28/2024-07:33:18] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1799, GPU +306, now: CPU 1969, GPU 1135 (MiB)
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/work/code/actionhandler/base.py", line 189, in subprocess_target
return self.action_handler.handle()
File "/work/code/actionhandler/generate_engines.py", line 176, in handle
total_engine_build_time += self.build_engine(job)
File "/work/code/actionhandler/generate_engines.py", line 159, in build_engine
builder = get_benchmark(job.config)
File "/work/code/__init__.py", line 87, in get_benchmark
return cls(conf)
File "/work/code/resnet50/tensorrt/ResNet50.py", line 332, in __init__
super().__init__(ResNet50EngineBuilderOp(**args))
File "/work/code/resnet50/tensorrt/ResNet50.py", line 148, in __init__
if self.batch_size % self.gpu_res2res3_loop_count != 0:
ZeroDivisionError: integer division or modulo by zero
[2024-06-28 07:33:19,719 generate_engines.py:173 INFO] Building engines for resnet50 benchmark in Offline scenario...
[06/28/2024-07:33:19] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 35, GPU 823 (MiB)
[06/28/2024-07:33:21] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1799, GPU +310, now: CPU 1969, GPU 1139 (MiB)
Process Process-2:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/work/code/actionhandler/base.py", line 189, in subprocess_target
return self.action_handler.handle()
File "/work/code/actionhandler/generate_engines.py", line 176, in handle
total_engine_build_time += self.build_engine(job)
File "/work/code/actionhandler/generate_engines.py", line 159, in build_engine
builder = get_benchmark(job.config)
File "/work/code/__init__.py", line 87, in get_benchmark
return cls(conf)
File "/work/code/resnet50/tensorrt/ResNet50.py", line 332, in __init__
super().__init__(ResNet50EngineBuilderOp(**args))
File "/work/code/resnet50/tensorrt/ResNet50.py", line 148, in __init__
if self.batch_size % self.gpu_res2res3_loop_count != 0:
ZeroDivisionError: integer division or modulo by zero
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/work/code/main.py", line 231, in <module>
main(main_args, DETECTED_SYSTEM)
File "/work/code/main.py", line 144, in main
dispatch_action(main_args, config_dict, workload_setting)
File "/work/code/main.py", line 202, in dispatch_action
handler.run()
File "/work/code/actionhandler/base.py", line 82, in run
self.handle_failure()
File "/work/code/actionhandler/base.py", line 186, in handle_failure
self.action_handler.handle_failure()
File "/work/code/actionhandler/generate_engines.py", line 184, in handle_failure
raise RuntimeError("Building engines failed!")
RuntimeError: Building engines failed!
make[1]: *** [Makefile:37: generate_engines] Error 1
make[1]: Leaving directory '/work'
make: *** [Makefile:31: run] Error 2
The default generated configuration in configs/resnet50/Offline is shown below:
# Generated file by scripts/custom_systems/add_custom_system.py
# Contains configs for all custom systems in code/common/systems/custom_list.py
from . import *
@ConfigRegistry.register(HarnessType.LWIS, AccuracyTarget.k_99, PowerSetting.MaxP)
class RTX3080_RYZEN3700X(OfflineGPUBaseConfig):
    system = KnownSystem.rtx3080_ryzen3700x

    # Applicable fields for this benchmark are listed below. Not all of these are necessary, and some may be defined in the BaseConfig already and inherited.
    # Please see NVIDIA's submission config files for example values and which fields to keep.
    # Required fields (Must be set or inherited to run):
    gpu_batch_size: int = 0
    input_dtype: str = ''
    input_format: str = ''
    map_path: str = ''
    precision: str = ''
    tensor_path: str = ''

    # Optional fields:
    active_sms: int = 0
    assume_contiguous: bool = False
    buffer_manager_thread_count: int = 0
    cache_file: str = ''
    complete_threads: int = 0
    deque_timeout_usec: int = 0
    disable_beta1_smallk: bool = False
    energy_aware_kernels: bool = False
    gpu_copy_streams: int = 0
    gpu_inference_streams: int = 0
    gpu_res2res3_loop_count: int = 0
    instance_group_count: int = 0
    model_path: str = ''
    offline_expected_qps: float = 0.0
    performance_sample_count_override: int = 0
    preferred_batch_size: str = ''
    request_timeout_usec: int = 0
    run_infer_on_copy_streams: bool = False
    use_batcher_thread_per_device: bool = False
    use_cuda_thread_per_device: bool = False
    use_deque_limit: bool = False
    use_graphs: bool = False
    use_jemalloc: bool = False
    use_same_context: bool = False
    use_spin_wait: bool = False
    verbose_glog: int = 0
    warmup_duration: float = 0.0
    workspace_size: int = 0

@ConfigRegistry.register(HarnessType.Triton, AccuracyTarget.k_99, PowerSetting.MaxP)
class RTX3080_RYZEN3700X_Triton(RTX3080_RYZEN3700X):
    use_triton = True

    # Applicable fields for this benchmark are listed below. Not all of these are necessary, and some may be defined in the BaseConfig already and inherited.
    # Please see NVIDIA's submission config files for example values and which fields to keep.
    # Required fields (Must be set or inherited to run):
    gpu_batch_size: int = 0
    input_dtype: str = ''
    input_format: str = ''
    map_path: str = ''
    precision: str = ''
    tensor_path: str = ''

    # Optional fields:
    active_sms: int = 0
    assume_contiguous: bool = False
    batch_triton_requests: bool = False
    buffer_manager_thread_count: int = 0
    cache_file: str = ''
    complete_threads: int = 0
    deque_timeout_usec: int = 0
    disable_beta1_smallk: bool = False
    energy_aware_kernels: bool = False
    gather_kernel_buffer_threshold: int = 0
    gpu_copy_streams: int = 0
    gpu_inference_streams: int = 0
    gpu_res2res3_loop_count: int = 0
    instance_group_count: int = 0
    max_queue_delay_usec: int = 0
    model_path: str = ''
    num_concurrent_batchers: int = 0
    num_concurrent_issuers: int = 0
    offline_expected_qps: float = 0.0
    output_pinned_memory: bool = False
    performance_sample_count_override: int = 0
    preferred_batch_size: str = ''
    request_timeout_usec: int = 0
    run_infer_on_copy_streams: bool = False
    use_batcher_thread_per_device: bool = False
    use_concurrent_harness: bool = False
    use_cuda_thread_per_device: bool = False
    use_deque_limit: bool = False
    use_graphs: bool = False
    use_jemalloc: bool = False
    use_same_context: bool = False
    use_spin_wait: bool = False
    verbose_glog: int = 0
    warmup_duration: float = 0.0
    workspace_size: int = 0

I suspected that gpu_batch_size: int = 0 was causing the problem, but changing it to 1 produced the same error. I also confirmed that nvidia-smi works, as shown below:
(mlperf) mahmood@mlperf-inference-mahmood-x86-64-26486:/work$ nvidia-smi
Fri Jun 28 07:42:30 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3080 Off | 00000000:2D:00.0 On | N/A |
| 0% 54C P8 33W / 370W | 239MiB / 10240MiB | 7% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
Any idea about that?
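
One observation that may narrow this down: the traceback ends at the expression self.batch_size % self.gpu_res2res3_loop_count, and the generated config leaves the optional field gpu_res2res3_loop_count at 0. In Python the modulo operator raises ZeroDivisionError whenever the right-hand operand is zero, whatever the left-hand operand is, which would be consistent with the error persisting after gpu_batch_size was changed to 1. The sketch below only isolates that expression; check_loop_count and try_check are hypothetical names for illustration and are not part of NVIDIA's code.

```python
def check_loop_count(batch_size: int, loop_count: int) -> bool:
    """Hypothetical stand-in for the check at ResNet50.py line 148."""
    # Same expression as in the traceback; raises if loop_count is 0.
    return batch_size % loop_count != 0


def try_check(batch_size: int, loop_count: int) -> str:
    """Run the check and report the outcome as a short string."""
    try:
        return "mismatch" if check_loop_count(batch_size, loop_count) else "ok"
    except ZeroDivisionError as exc:
        return f"ZeroDivisionError: {exc}"


# The divisor is what matters: both batch sizes fail while loop_count is 0.
for batch_size in (0, 1):
    print(batch_size, try_check(batch_size, 0))

# With a non-zero loop count (1 is an arbitrary value here) the check runs.
print(try_check(1, 1))
```

If this reading is right, the zero placeholders in the generated file are the thing to look at rather than the GPU or driver. The log does not say which value is appropriate for gpu_res2res3_loop_count; the comments in the generated config point to NVIDIA's submission config files for example values, and note that some fields may instead be inherited from the BaseConfig.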