Commit 6776245

Authored by: sujik18, github-actions, github-actions[bot], arjunsuresh, anandhu-eng
Replaced shell commands with Python for Windows compliance script compatibility (#2344)
* [Automated Commit] Format Codebase
* Updated tags for submission checker command in docs
* Update mobilenets docs
* Update main.py
* Update main.py
* update dataset download commands - waymo calib (#2130)
* Merge from Master (#2155)
* Update submission_checker.py | Fix open model unit in Results (#2144)
* Add Llama 3.1 to special unit dict (#2150)
---------
Co-authored-by: Pablo Gonzalez <[email protected]>
* [Automated Commit] Format Codebase
* Inference docs - Update model and dataset download commands (#2153)
* Update llama2 70b model download docs
* changes in model and dataset download commands
* add powershell command to get result folder structure (#2156)
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* Fix Typo in Interactive Latencies (#2147) (#2225)
* Fix Typo in Interactive Latencies
* Update submission_checker.py
* Fix Typo in Interactive Latencies (#2147) (#2226)
* Fix Typo in Interactive Latencies
* Update submission_checker.py
---------
Co-authored-by: Miro <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Update MLCFlow commands for v5.1 (#2237)
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* Update main.py
* [Automated Commit] Format Codebase
* updating for 5.1-dev (inference doc)
* [Automated Commit] Format Codebase
* fix typo
* [Automated Commit] Format Codebase
* Update main.py
* [Automated Commit] Format Codebase
* [Automated Commit] Format Codebase
* Doc updates (#2292)
* improve submission doc
* Update index.md
* Fix for model and dataset download commands
* update submission doc
* [Automated Commit] Format Codebase
* Update index.md
* r2_downloader -> r2-downloader
* Update multithreading information about SDXL
* [Automated Commit] Format Codebase
* .lower() for consistency
* [Automated Commit] Format Codebase
* updation for llama3_1-8b edge
* [Automated Commit] Format Codebase
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Arjun Suresh <[email protected]>
Co-authored-by: ANANDHU S <[email protected]>
Co-authored-by: Nathan Wasson <[email protected]>
Co-authored-by: Pablo Gonzalez <[email protected]>
Co-authored-by: Miro <[email protected]>
1 parent d4bf062 commit 6776245
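
The same replacement pattern recurs in each compliance script touched below: the checker is launched with `sys.executable` instead of a hard-coded `python3`, its output is streamed to both the console and a log file in place of the old `| tee`, and the final `grep PASS` check becomes a plain file read. A minimal standalone sketch of that pattern, assuming a `verify_performance.py` sitting next to the caller; the helper names `run_and_tee` and `log_reports_pass` are illustrative, not functions from this repository:

```python
import subprocess
import sys


def run_and_tee(command, log_path):
    # Stream a command's combined stdout/stderr to the console and to a log
    # file, which is what "cmd | tee log" did in the old shell-based flow.
    with open(log_path, "w") as log_file:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            shell=True,
            text=True,
        )
        for line in process.stdout:
            print(line, end="")
            log_file.write(line)
        process.wait()
    return process.returncode


def log_reports_pass(log_path):
    # Replace "grep PASS <log>" with a plain file read.
    with open(log_path, "r") as log_file:
        return "TEST PASS" in log_file.read()


if __name__ == "__main__":
    # Hypothetical usage: run a checker with the same interpreter that is
    # running this script, then inspect its log for the PASS marker.
    run_and_tee(sys.executable + " verify_performance.py",
                "verify_performance.txt")
    print("pass:", log_reports_pass("verify_performance.txt"))
```

Because nothing in the flow shells out to `wc`, `grep`, `sed`, or `md5sum` anymore, the same logic runs unchanged on Windows.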

File tree

12 files changed (+170, -115 lines)

compliance/TEST01/run_verification.py

Lines changed: 60 additions & 33 deletions
@@ -76,51 +76,68 @@ def main():
     output_dir = os.path.join(args.output_dir, "TEST01")
     unixmode = ""
     if args.unixmode:
-        unixmode = " --unixmode"
-        for binary in ["wc", "md5sum", "grep", "awk", "sed", "head", "tail"]:
+        if os.name != "posix":
+            print(
+                "Warning: --unixmode not supported on this OS. Using Python fallback...")
+            unixmode = ""
+        else:
+            unixmode = " --unixmode"
             missing_binary = False
-            if shutil.which(binary) is None:
-                print(
-                    "Error: This script requires the {:} commandline utility".format(
-                        binary
+            for binary in ["wc", "md5sum", "grep",
+                           "awk", "sed", "head", "tail"]:
+                if shutil.which(binary) is None:
+                    print(
+                        "Error: This script requires the {:} commandline utility".format(
+                            binary
+                        )
                     )
-                )
-                missing_binary = True
-        if missing_binary:
-            exit()
+                    missing_binary = True
+            if missing_binary:
+                exit()

     dtype = args.dtype

     verify_accuracy_binary = os.path.join(
         os.path.dirname(__file__), "verify_accuracy.py"
     )
+
+    unixmode_str = unixmode if unixmode == "" else unixmode + " "
+
     # run verify accuracy
     verify_accuracy_command = (
-        "python3 "
+        sys.executable + " "
         + verify_accuracy_binary
         + " --dtype "
         + args.dtype
-        + unixmode
+        + unixmode_str
         + " -r "
-        + results_dir
-        + "/accuracy/mlperf_log_accuracy.json"
+        + os.path.join(results_dir, "accuracy", "mlperf_log_accuracy.json")
         + " -t "
-        + compliance_dir
-        + "/mlperf_log_accuracy.json | tee verify_accuracy.txt"
+        + os.path.join(compliance_dir, "mlperf_log_accuracy.json")
     )
     try:
-        os.system(verify_accuracy_command)
+        with open("verify_accuracy.txt", "w") as f:
+            process = subprocess.Popen(
+                verify_accuracy_command,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.STDOUT,
+                shell=True,
+                text=True
+            )
+            # Write output to both console and file
+            for line in process.stdout:
+                print(line, end="")
+                f.write(line)
+            process.wait()
     except Exception:
         print(
             "Exception occurred trying to execute:\n " +
             verify_accuracy_command)
     # check if verify accuracy script passes

-    accuracy_pass_command = "grep PASS verify_accuracy.txt"
     try:
-        accuracy_pass = "TEST PASS" in subprocess.check_output(
-            accuracy_pass_command, shell=True
-        ).decode("utf-8")
+        with open("verify_accuracy.txt", "r") as file:
+            accuracy_pass = "TEST PASS" in file.read()
     except Exception:
         accuracy_pass = False

@@ -129,28 +146,38 @@ def main():
         os.path.dirname(__file__), "verify_performance.py"
     )
     verify_performance_command = (
-        "python3 "
+        sys.executable + " "
         + verify_performance_binary
-        + " -r "
-        + results_dir
-        + "/performance/run_1/mlperf_log_detail.txt"
-        + " -t "
-        + compliance_dir
-        + "/mlperf_log_detail.txt | tee verify_performance.txt"
+        + " -r"
+        + os.path.join(results_dir, "performance",
+                       "run_1", "mlperf_log_detail.txt")
+        + " -t"
+        + os.path.join(compliance_dir, "mlperf_log_detail.txt")
     )
+
     try:
-        os.system(verify_performance_command)
+        with open("verify_performance.txt", "w") as f:
+            process = subprocess.Popen(
+                verify_performance_command,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.STDOUT,
+                text=True,
+                shell=True,
+            )
+            # Write output to both console and file
+            for line in process.stdout:
+                print(line, end="")
+                f.write(line)
+            process.wait()
     except Exception:
         print(
             "Exception occurred trying to execute:\n " +
             verify_performance_command)

     # check if verify performance script passes
-    performance_pass_command = "grep PASS verify_performance.txt"
     try:
-        performance_pass = "TEST PASS" in subprocess.check_output(
-            performance_pass_command, shell=True
-        ).decode("utf-8")
+        with open("verify_performance.txt", "r") as file:
+            performance_pass = "TEST PASS" in file.read()
     except Exception:
         performance_pass = False

compliance/TEST01/verify_accuracy.py

Lines changed: 36 additions & 42 deletions
@@ -20,6 +20,8 @@
 import subprocess
 import sys
 import shutil
+import hashlib
+import re

 sys.path.append(os.getcwd())

@@ -161,15 +163,11 @@ def main():
         print("Error: This script requires Python v3.3 or later")
         exit()

-    get_perf_lines_cmd = "wc -l " + perf_log + "| awk '{print $1}'"
-    num_perf_lines = int(
-        subprocess.check_output(get_perf_lines_cmd, shell=True).decode("utf-8")
-    )
+    with open(perf_log, "r") as file:
+        num_perf_lines = sum(1 for _ in file)

-    get_acc_lines_cmd = "wc -l " + acc_log + "| awk '{print $1}'"
-    num_acc_lines = int(
-        subprocess.check_output(get_acc_lines_cmd, shell=True).decode("utf-8")
-    )
+    with open(acc_log, "r") as file:
+        num_acc_lines = sum(1 for _ in file)

     num_acc_log_entries = num_acc_lines - 2
     num_perf_log_entries = num_perf_lines - 2
@@ -189,42 +187,38 @@ def main():
             continue

         # calculate md5sum of line in perf mode accuracy_log
-        perf_md5sum_cmd = (
-            "head -n "
-            + str(perf_line + 1)
-            + " "
-            + perf_log
-            + "| tail -n 1| sed -r 's/,//g' | sed -r 's/\"seq_id\" : \\S+//g' | md5sum"
-        )
-        # print(perf_md5sum_cmd)
-        perf_md5sum = subprocess.check_output(perf_md5sum_cmd, shell=True).decode(
-            "utf-8"
-        )
-
-        # get qsl idx
-        get_qsl_idx_cmd = (
-            "head -n "
-            + str(perf_line + 1)
-            + " "
-            + perf_log
-            + "| tail -n 1| awk -F\": |,\" '{print $4}'"
-        )
-        qsl_idx = (
-            subprocess.check_output(get_qsl_idx_cmd, shell=True)
-            .decode("utf-8")
-            .rstrip()
-        )
+        # read the specific line
+        with open(perf_log, "r") as f:
+            for i, line in enumerate(f):
+                if i == perf_line:
+                    line_content = line.strip()
+                    break
+
+        # remove commas and remove 'seq_id' key-value
+        clean_line = line_content.replace(",", "")
+        clean_line = re.sub(r'"seq_id"\s*:\s*\S+', '', clean_line)
+
+        # calculate md5sum
+        perf_md5sum = hashlib.md5(clean_line.encode("utf-8")).hexdigest()
+
+        # extract qsl idx
+        fields = re.split(r": |,", line_content)
+        qsl_idx = fields[3].strip()

         # calculate md5sum of line in acc mode accuracy_log
-        acc_md5sum_cmd = (
-            'grep "qsl_idx\\" : '
-            + qsl_idx
-            + '," '
-            + acc_log
-            + "| sed -r 's/,//g' | sed -r 's/\"seq_id\" : \\S+//g' | md5sum"
-        )
-        acc_md5sum = subprocess.check_output(
-            acc_md5sum_cmd, shell=True).decode("utf-8")
+        acc_matches = []
+        with open(acc_log, "r") as f:
+            for line in f:
+                if f'"qsl_idx" : {qsl_idx},' in line:
+                    acc_matches.append(line.strip())
+
+        # join all matching lines together
+        acc_line = "\n".join(acc_matches)
+
+        acc_line = acc_line.replace(",", "")
+        acc_line = re.sub(r'"seq_id"\s*:\s*\S+', '', acc_line)
+
+        acc_md5sum = hashlib.md5(acc_line.encode("utf-8")).hexdigest()

         if perf_md5sum != acc_md5sum:
             num_perf_log_data_mismatch += 1

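For the accuracy-log comparison, the old `sed -r 's/,//g' | sed -r 's/"seq_id" : \S+//g' | md5sum` pipeline is reproduced with `re` and `hashlib`, as the diff above shows. A rough standalone sketch of that normalization under the same assumptions (drop commas, blank out the `seq_id` field, hash the rest); `line_digest` is an illustrative name, not a helper in `verify_accuracy.py`:

```python
import hashlib
import re


def line_digest(line):
    # Mirror the old shell pipeline: drop commas, blank out the "seq_id"
    # key/value pair, then hash what is left.
    cleaned = line.strip().replace(",", "")
    cleaned = re.sub(r'"seq_id"\s*:\s*\S+', '', cleaned)
    return hashlib.md5(cleaned.encode("utf-8")).hexdigest()


# Hypothetical usage: two log entries differing only in seq_id hash equal.
a = '{ "seq_id" : 7, "qsl_idx" : 12, "data" : "0A0B" }'
b = '{ "seq_id" : 9, "qsl_idx" : 12, "data" : "0A0B" }'
print(line_digest(a) == line_digest(b))  # True
```

Both the performance-mode and accuracy-mode lines go through the same cleanup, so entries that differ only in `seq_id` still produce matching digests, which is what the original shell pipeline relied on.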
compliance/TEST04/run_verification.py

Lines changed: 22 additions & 12 deletions
@@ -58,28 +58,38 @@ def main():
         os.path.dirname(__file__), "verify_performance.py"
     )
     verify_performance_command = (
-        "python3 "
+        sys.executable + " "
         + verify_performance_binary
-        + " -r "
-        + results_dir
-        + "/performance/run_1/mlperf_log_summary.txt"
-        + " -t "
-        + compliance_dir
-        + "/mlperf_log_summary.txt | tee verify_performance.txt"
+        + " -r"
+        + os.path.join(results_dir, "performance",
+                       "run_1", "mlperf_log_summary.txt")
+        + " -t"
+        + os.path.join(compliance_dir, "mlperf_log_summary.txt")
     )
+
     try:
-        os.system(verify_performance_command)
+        with open("verify_performance.txt", "w") as f:
+            process = subprocess.Popen(
+                verify_performance_command,
+                stdout=subprocess.PIPE,  # capture output
+                stderr=subprocess.STDOUT,
+                text=True,  # decode output as text
+                shell=True,
+            )
+            # Write output to both console and file
+            for line in process.stdout:
+                print(line, end="")  # console
+                f.write(line)  # file
+            process.wait()
     except Exception:
         print(
             "Exception occurred trying to execute:\n " +
             verify_performance_command)

     # check if verify performance script passes
-    performance_pass_command = "grep PASS verify_performance.txt"
     try:
-        performance_pass = "TEST PASS" in subprocess.check_output(
-            performance_pass_command, shell=True
-        ).decode("utf-8")
+        with open("verify_performance.txt", "r") as file:
+            performance_pass = "TEST PASS" in file.read()
     except Exception:
         performance_pass = False

docs/benchmarks/image_classification/get-resnet50-data.md

Lines changed: 3 additions & 3 deletions
@@ -15,7 +15,7 @@ The benchmark implementation run command will automatically download the validat

     ### Get Validation Dataset
     ```
-    mlcr get,dataset,imagenet,validation -j
+    mlcr get,dataset,imagenet,validation,_full -j
     ```
 === "Calibration"
     ResNet50 calibration dataset consist of 500 images selected from the Imagenet 2012 validation dataset. There are 2 alternative options for the calibration dataset.
@@ -32,7 +32,7 @@ The benchmark implementation run command will automatically download the validat
     ### Get ResNet50 preprocessed dataset

     ```
-    mlcr get,dataset,image-classification,imagenet,preprocessed,_pytorch -j
+    mlcr get,dataset,image-classification,imagenet,preprocessed,_pytorch,_full -j
     ```

     - `--outdirname=<PATH_TO_DOWNLOAD_IMAGENET_DATASET>` could be provided to download the dataset to a specific location.
@@ -52,7 +52,7 @@ Get the Official MLPerf ResNet50 Model

     ### Onnx
     ```
-    mlcr get,ml-model,resnet50,_onnx -j
+    mlcr get,ml-model,resnet50,image-classification,_onnx -j
     ```

     - `--outdirname=<PATH_TO_DOWNLOAD_RESNET50_MODEL>` could be provided to download the model to a specific location.

docs/benchmarks/language/get-deepseek-r1-data.md

Lines changed: 5 additions & 6 deletions
@@ -11,12 +11,11 @@ The benchmark implementation run command will automatically download the validat

 === "Validation"

-    ### Get Validation Dataset
-    ```
-    mlcr get,preprocessed,dataset,deepseek-r1,_validation,_mlc,_rclone --outdirname=<path to download> -j
-    ```

-=== "Calibration"
+    ### Get Validation Dataset
+    ```
+    mlcr get,preprocessed,dataset,deepseek-r1,_validation,_mlc,_r2-downloader --outdirname=<path to download> -j
+    ```

     ### Get Calibration Dataset
     ```
@@ -33,4 +32,4 @@ The benchmark implementation run command will automatically download the require
 ### Get the Official MLPerf DeekSeek-R1 model from MLCOMMONS Storage
 ```
 mlcr get,ml-model,deepseek-r1,_r2-downloader,_mlc,_dry-run -j
-```
+```

docs/benchmarks/language/get-llama3_1-405b-data.md

Lines changed: 8 additions & 0 deletions
@@ -38,6 +38,14 @@ The benchmark implementation run command will automatically download the require
     ```
     mlcr get,ml-model,llama3,_mlc,_r2-downloader,_405b --outdirname=<path to download> -j
     ```
+
+=== "From Cloudfare R2"
+
+    > **Note:** One has to accept the [MLCommons Llama 3.1 License Confidentiality Notice](http://llama3-1.mlcommons.org/) to access the model files in MLCOMMONS Google Drive.
+
+    ### Get the Official MLPerf LLAMA3.1-405B model from MLCOMMONS Cloudfare R2
+    ```
+    mlcr get,ml-model,llama3,_mlc,_405b,_r2-downloader --outdirname=<path to download> -j

 === "From Hugging Face repo"

docs/benchmarks/language/get-llama3_1-8b-data.md

Lines changed: 4 additions & 3 deletions
@@ -10,14 +10,14 @@ hide:
 The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. In case you want to download only the datasets, you can use the below commands.

 === "Validation"
-
+
     === "Full dataset (Datacenter)"

         ### Get Validation Dataset
         ```
         mlcr get,dataset,cnndm,_validation,_datacenter,_llama3,_mlc,_r2-downloader --outdirname=<path to download> -j
         ```
-
+
     === "5000 samples (Edge)"

         ### Get Validation Dataset
@@ -26,7 +26,8 @@ The benchmark implementation run command will automatically download the validat
         ```

 === "Calibration"
-
+    ```
+
     ### Get Calibration Dataset
     ```
     mlcr get,dataset,cnndm,_calibration,_llama3,_mlc,_r2-downloader --outdirname=<path to download> -j