
Conversation

matt-winkler
Collaborator

No description provided.


codecov bot commented Sep 27, 2022

Codecov Report

Base: 86.15% // Head: 84.61% // Decreases project coverage by -1.54% ⚠️

Coverage data is based on head (db38153) compared to base (8b56460).
Patch coverage: 71.00% of modified lines in pull request are covered.

❗ Current head db38153 differs from pull request most recent head cfea827. Consider uploading reports for the commit cfea827 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #38      +/-   ##
==========================================
- Coverage   86.15%   84.61%   -1.55%     
==========================================
  Files          13       15       +2     
  Lines        1127     1183      +56     
==========================================
+ Hits          971     1001      +30     
- Misses        156      182      +26     
Impacted Files Coverage Δ
dbtc/client/cloud/configs/dbt_cloud_api.py 42.85% <42.85%> (ø)
dbtc/client/cloud/base.py 77.77% <74.39%> (-2.82%) ⬇️
dbtc/cli.py 74.52% <100.00%> (+0.09%) ⬆️
dbtc/client/cloud/configs/dbt_core_cli.py 100.00% <100.00%> (ø)



@matt-winkler
Collaborator Author

matt-winkler commented Sep 27, 2022

@dpguthrie There's a lot in this PR, so to help organize thoughts I wanted to lay out a few areas where feedback would be valuable:

  • The notion of templating our API requests in dbt_cloud_api
  • Adding the concept of mode to trigger_job. My thinking here is to make it clearer to the end user that they can only run a job in a single mode at a time. Otherwise I could see someone passing, e.g., restart_from_failure=True AND autoscale=True and being confused about what behavior to expect.
  • As the number of use cases grows, the trigger_job method is going to have lots of possible parameters. Are we OK with that or would it be valuable to expose the concept of run modes in separate methods callable from the CLI, e.g. dbtc trigger-job-with-restart-from-failure, dbtc trigger-job-with-autoscale, etc.
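To make the mode idea above concrete, here is a minimal sketch of the kind of exclusivity check it implies. The flag names come from the discussion (restart_from_failure, autoscale); the helper itself is hypothetical, not dbtc's actual API.

```python
def resolve_run_mode(restart_from_failure: bool = False, autoscale: bool = False) -> str:
    """Return the single active run mode, or 'standard' if no flag is set.

    Raises if the caller sets more than one mode, so the user is never
    left guessing which behavior wins.
    """
    active = [
        name
        for name, flag in [
            ("restart_from_failure", restart_from_failure),
            ("autoscale", autoscale),
        ]
        if flag
    ]
    if len(active) > 1:
        raise ValueError(f"Only one run mode may be set at a time, got: {active}")
    return active[0] if active else "standard"
```

With this shape, adding a new mode later only means adding one entry to the flag list.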

@dpguthrie
Owner

dpguthrie commented Sep 27, 2022

The notion of templating our API requests in dbt_cloud_api

I came up with a different approach for validation using the pydantic library. This would both do validation of the payload that a user specifies as well as give it the appropriate defaults (id for instance).

#39

We can iterate on what's best, I just didn't want to overwrite all your work.
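The pydantic models themselves live in #39 and aren't shown here; as a rough stdlib stand-in for the idea (validate the user-specified payload and apply appropriate defaults such as id), the shape might look like this. Field names are illustrative only.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# Stdlib stand-in for the pydantic approach: the model validates the
# payload a user specifies and fills in defaults (id, for instance).
@dataclass
class JobPayload:
    account_id: int
    project_id: int
    name: str
    id: Optional[int] = None                      # default applied automatically
    execute_steps: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Minimal validation; pydantic would also coerce and type-check.
        if not self.name:
            raise ValueError("name must be non-empty")

payload = asdict(JobPayload(account_id=1, project_id=2, name="nightly"))
```

Pydantic adds type coercion and much richer error reporting on top of this, which is the main argument for it over hand-rolled templating.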

@dpguthrie
Owner

As the number of use cases grows, the trigger_job method is going to have lots of possible parameters. Are we OK with that or would it be valuable to expose the concept of run modes in separate methods callable from the CLI, e.g. dbtc trigger-job-with-restart-from-failure, dbtc trigger-job-with-autoscale, etc.

Putting this here as well - my take is that we should be building out different methods that accomplish these different tasks. The trigger_job method is already probably a bit too big and doing too much. It should be the entry point to trigger a job from all those other methods, but each of those methods will have logic that doesn't make sense to wrap all together. Also, I'm guessing you're just gonna continue thinking of awesome use cases for this library, so we should plan for that.
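As a sketch of the structure proposed above: thin, use-case-specific public methods, each funneling into a single trigger_job entry point. The method names and payloads here are hypothetical, not dbtc's actual signatures.

```python
class CloudClient:
    """Illustrative skeleton; the real client would make HTTP calls."""

    def trigger_job(self, account_id: int, job_id: int, payload: dict) -> dict:
        # Single entry point for all job triggers; in reality this would
        # POST to the dbt Cloud trigger endpoint.
        return {"account_id": account_id, "job_id": job_id, "payload": payload}

    def trigger_job_from_failure(self, account_id: int, job_id: int) -> dict:
        # Use-case-specific logic (e.g. computing the retry step list)
        # lives here rather than inside trigger_job.
        payload = {"cause": "Restarting from failure"}
        return self.trigger_job(account_id, job_id, payload)

    def trigger_autoscaling_job(self, account_id: int, job_id: int) -> dict:
        payload = {"cause": "Autoscaling run"}
        return self.trigger_job(account_id, job_id, payload)
```

Each new use case then becomes a new small method (and CLI command) instead of another parameter on trigger_job.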

@matt-winkler
Collaborator Author

@dpguthrie 100 on breaking out separate methods that all get to the base trigger_job. I was mid-way on that before thinking to stop and ask you :))


return 'api_key'

def _clone_resource(self, resource: str, **kwargs):
Collaborator Author


@dpguthrie this is very cool! Is there a means I could use to subset which fields from the base object are used for replication? It seems like this is the intent of create_args, but I think a field subset being passed there would need to affect the payload in order for that modification to occur.

Context: When trying to replicate a job, I'm not able to use fields such as deferring_job_definition_id on the CREATE JOB endpoint. The process needs to first create the job with fewer fields, and then add the deferring_job_definition_id in an update.
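The two-step flow described here can be sketched as: CREATE with a restricted field subset, then UPDATE with the fields the create endpoint rejects. The field sets and the client stub below are illustrative, not dbtc's actual API.

```python
# Hypothetical subset of fields the CREATE JOB endpoint accepts.
CREATE_SAFE_FIELDS = {"account_id", "project_id", "name", "execute_steps"}

def replicate_job(client, source_job: dict) -> dict:
    # Step 1: create the job using only fields the CREATE endpoint accepts.
    create_payload = {k: v for k, v in source_job.items() if k in CREATE_SAFE_FIELDS}
    new_job = client.create_job(create_payload)
    # Step 2: apply the remaining fields (deferral settings etc.) via UPDATE.
    update_payload = {**source_job, "id": new_job["id"]}
    return client.update_job(new_job["id"], update_payload)

class FakeClient:
    """In-memory stand-in for the real dbt Cloud client, for illustration."""

    def create_job(self, payload: dict) -> dict:
        return {**payload, "id": 99}

    def update_job(self, job_id: int, payload: dict) -> dict:
        return payload
```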

Owner


Ya, we need another argument here for that. Good call! Who knows, this private method may be a bit overkill for what we need to do; it's also worth thinking about which resources actually need to be cloned.

Collaborator Author


Per conversation offline, noting that we agreed this isn't something to pursue further for now. There will be too much situational variation in what's needed to replicate resources, so we should stick to the CRUD operations and adjust as-needed for other use cases.

Comment on lines 1215 to 1240
new_job_definition = {
    k: v for k, v in existing_job_definition.items() if k in required_fields
}

if not autoscale_job_identifier:
    creation_time = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
    new_job_definition['name'] = '-'.join([new_job_definition['name'], creation_time])
else:
    new_job_definition['name'] = '-'.join(
        [new_job_definition['name'], autoscale_job_identifier]
    )

# make sure the id is None on the new job
new_job_definition['id'] = None
job_id = self.create_job(
    account_id=account_id, payload=new_job_definition
)['data']['id']

# fill in optional fields on the job if present in the source
for field in optional_fields:
    new_job_definition[field] = existing_job_definition.get(field)

# set the ID in the update request
new_job_definition['id'] = job_id
self.console.log(new_job_definition)
job_id = self.update_job(
    account_id=account_id, job_id=job_id, payload=new_job_definition
)['data']['id']
Owner


@matt-winkler I'm not sure I follow here. Why wouldn't we just create the new job from the existing job definition (outside of setting id to None and changing the name)?

Collaborator Author

matt-winkler commented Sep 29, 2022


@dpguthrie we can't copy the existing data in the job 1:1 in a new CREATE JOB request. For example, you can't create a new job including the is_deferrable parameter. That parameter needs to be applied in a subsequent UPDATE step.

Collaborator Author


@dpguthrie updated my comment with a better example

Collaborator Author

matt-winkler left a comment


🔥

@matt-winkler
Collaborator Author

@dpguthrie Think I found a happier medium with the cloning idea. This updates _clone_resource so that it returns a payload that includes the object's required fields plus any optional fields that are not None in the source data. Then we can further modify depending on the use case, e.g. (1) in autoscaling we want to add an identifier to the job's name, or (2) if this were for migration, just push the same payload to another account / project, etc.
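A minimal sketch of the payload-building behavior described here, assuming illustrative field lists (the actual required/optional sets live in the library's config):

```python
# Hypothetical field sets; the real ones come from dbtc's resource configs.
REQUIRED_FIELDS = {"account_id", "project_id", "name"}
OPTIONAL_FIELDS = {"deferring_job_definition_id", "dbt_version"}

def build_clone_payload(source: dict) -> dict:
    """Build a clone payload: required fields plus non-None optional fields."""
    payload = {k: source[k] for k in REQUIRED_FIELDS if k in source}
    payload.update(
        {k: source[k] for k in OPTIONAL_FIELDS if source.get(k) is not None}
    )
    payload["id"] = None  # the new resource gets its id from the API
    return payload
```

Callers can then tweak the result per use case, e.g. appending an identifier to payload["name"] for autoscaling, or swapping account/project ids for migration.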

@matt-winkler
Collaborator Author

Also noting that tests on autoscaling jobs are failing because we need to standardize on an account / jobs to work from.
