Parallel conv partial fix #1380
base: main
Conversation
@@ -73,7 +73,7 @@ def match(self, node):

     def transform(self, model, node):
         dim = node.__class__.__name__[-2:]  # '1D' or '2D'
-        new_attrs = {k: v for k, v in node.attributes.items() if k not in ('trace', 'precision', 'reuse_factor')}
+        new_attrs = node.attributes.attributes.copy()
I mentioned this in a comment in the conversation. I think trace, precision, and reuse_factor are regenerated no matter what, so the values you copy here get overridden (unless something has changed from before). It may be an hls4ml behavior worth revisiting and potentially revising, but I don't think this change fixes anything.
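To illustrate the difference under discussion, here is a minimal standalone sketch using a plain dict in place of hls4ml's attribute object (the attribute names and values are hypothetical, not the real layer attributes):

```python
# Plain dict standing in for a node's attribute mapping (hypothetical
# values, not the real hls4ml layer attributes).
attrs = {'n_filt': 4, 'trace': False, 'precision': 'ap_fixed<16,6>', 'reuse_factor': 2}

# Old behavior: skip the keys that later passes regenerate anyway.
filtered = {k: v for k, v in attrs.items() if k not in ('trace', 'precision', 'reuse_factor')}

# New behavior: copy everything; regenerated keys may simply be
# overwritten by later optimizer passes.
copied = attrs.copy()

print(filtered)             # only the non-regenerated keys survive
print(copied == attrs)      # full copy preserves every key
```

Whether the extra copied keys matter thus depends entirely on whether the later passes really do regenerate them unconditionally, which is the point being debated here.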
`reuse_factor` defined under the Model key is not propagated as expected otherwise. Removed the warning when the update operation is trivial.
Did you see why it's not propagated properly? Shouldn't it come from a configuration in the first place?
 )
-    mult_params['n_out'] = int(node.get_attr('in_width') * node.get_attr('n_filt') / mult_params['reuse'])
+    mult_params['n_out'] = node.get_attr('in_width') * node.get_attr('n_filt') // mult_params['n_partitions']
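As an aside on the arithmetic in this hunk: `//` is exact integer floor division, whereas `int(a / b)` goes through float division first and can lose precision once the product exceeds what a double represents exactly. A small self-contained check (the numbers are arbitrary, not taken from any model):

```python
in_width, n_filt, n_partitions = 32, 16, 4

# New style: exact integer floor division.
n_out = in_width * n_filt // n_partitions
print(n_out)  # 128

# Old style went through float division. For products beyond 2**53,
# IEEE-754 rounding can make int(a / b) differ from a // b.
big = 10**17
print(int(big / 3) == big // 3)  # False on IEEE-754 doubles
```

So beyond the switch from `reuse` to `n_partitions`, the `//` form is also the safer way to compute an integer quotient.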
Are `n_partitions` and `mult_params['n_partions']` different?
@@ -50,7 +56,13 @@ void pointwise_conv_1d_cl(data_T data[CONFIG_T::in_width * CONFIG_T::n_chan],
     assert(CONFIG_T::filt_width == 1);

+    // Inlining helps reduce latency, but may also cause timing issues in some cases, use carefully.
+    //#pragma HLS INLINE recursive
+    // But without inlining Vitis HLS doesn't respect the parallelization factor config ¯\_(ツ)_/¯
+    // #pragma HLS PIPELINE II = CONFIG_T::reuse_factor * CONFIG_T::n_partitions
Maybe erase the commented-out pragmas throughout?
         conv_1d_resource_cl<data_T, res_T, CONFIG_T>(data, res, weights, biases);
     }
 };

 template <class data_T, class res_T, typename CONFIG_T>
 class BatchedDenseForConv1D : public nnet::Conv1DKernel<data_T, res_T, CONFIG_T> {
Can we add a comment to say the purpose of this code (and also for the 1D version)?
The PR allows modification from contributors. Feel free to add some if you find it necessary.
Description
Provide a partial fix for the parallel conv issue on Vitis:
- Fixed the `n_partitions==1` case; `rf` and `pf` are now isolated
- `rf`, `precision`, `trace` flags are inherited now
- `pf` gives the proper II now

Each layer still fully blocks, and the dataflow pragma does not have the ideal behavior (i.e., kernel-level pipelining). At least we have a correct global II now, and resource usage indeed decreases with lower `pf`. Vitis 2025.1 hangs at cosim but 2023.2 was fine; I guess it is a bug on the Vitis side. Didn't test with more versions.

Since the synthesis test CI is still not in place, I can't fire up a full regression test now. Please check whether this PR breaks your use case.
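The commented-out pragma in the diff ties the initiation interval to `reuse_factor * n_partitions`. As a back-of-the-envelope sketch of the throughput/resource tradeoff described above (this assumes partitions are processed serially at the given reuse factor, which is my own simplification, not hls4ml's actual scheduler model):

```python
def kernel_ii(reuse_factor: int, n_partitions: int) -> int:
    # Assumed model: each of the n_partitions output slices is processed
    # serially, and each takes reuse_factor cycles per new input.
    return reuse_factor * n_partitions

# A lower parallelization factor means fewer outputs computed in parallel,
# i.e. more partitions: II grows while the multiplier count shrinks.
for n_partitions in (1, 2, 4):
    print(n_partitions, kernel_ii(reuse_factor=2, n_partitions=n_partitions))
```

Under that assumption, halving `pf` doubles `n_partitions` and hence the kernel II, which is consistent with resources decreasing as `pf` is lowered.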
Type of change
Tests
When will synthesis tests be ready?
Checklist