@cthoyt cthoyt commented Dec 21, 2021

As a follow-up to this Twitter discussion:

Really cool post, but I wish it were possible to directly re-run it (there are local file paths to ChEMBL data and no automation for downloading)

Solution: I added the generation of the substructure library as an extra function in `chembl_downloader`: https://t.co/MgiChnrxaj

— Charles Tapley Hoyt (@cthoyt) December 20, 2021

this PR makes a small change to automate the download of the ChEMBL SDF file using the lightweight `chembl_downloader` package. The package stores the file at a path that is deterministic on all systems, so the notebook no longer needs a hard-coded local path to the ChEMBL SDF file.
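
The idea of a deterministic, machine-independent cache location can be sketched with the standard library alone. This is an illustration, not how `chembl_downloader` is implemented: the `~/.data/chembl/<version>/` layout, the file name pattern, and the helper names `cached_sdf_path` and `ensure_sdf` are assumptions, and the caller supplies the download URL.

```python
from pathlib import Path
from urllib.request import urlretrieve


def cached_sdf_path(version, base=None):
    """Return the same cache path for a given version on every machine.

    The directory layout and file name here are assumptions for illustration.
    """
    root = Path(base) if base is not None else Path.home() / ".data" / "chembl"
    return root / version / f"chembl_{version}.sdf.gz"


def ensure_sdf(version, url, base=None, fetch=urlretrieve):
    """Download the file only if it is not already cached, then return its path."""
    path = cached_sdf_path(version, base)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, path)  # fetch(url, destination), same shape as urlretrieve
    return path
```

Because the path depends only on the version string, a second call with the same version finds the cached file and skips the download.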

It would also be possible to replace the whole line `with gzip.open(sdf_path) as gz, Chem.ForwardSDMolSupplier(gz) as suppl:` with `with chembl_downloader.supplier(version="29") as suppl:`, but I think that would be a bit too esoteric.
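
For comparison, the two variants can be put side by side as a rough sketch. The wrapper names `supplier_from_path`, `supplier_from_chembl`, and `count_parsed` are hypothetical and not part of either library. RDKit and `chembl_downloader` are imported lazily inside the wrappers, so only the variant that is actually called needs its dependency installed.

```python
import gzip
from contextlib import contextmanager


@contextmanager
def supplier_from_path(sdf_path):
    """Explicit variant: open a local gzipped SDF file with RDKit."""
    from rdkit import Chem  # lazy import; needs an RDKit with context-manager suppliers

    with gzip.open(sdf_path) as gz, Chem.ForwardSDMolSupplier(gz) as suppl:
        yield suppl


def supplier_from_chembl(version="29"):
    """Compact variant: chembl_downloader downloads, caches, and opens the file."""
    import chembl_downloader  # lazy import

    return chembl_downloader.supplier(version=version)


def count_parsed(open_supplier):
    """Count molecules that parsed successfully (failed records come back as None)."""
    with open_supplier() as suppl:
        return sum(1 for mol in suppl if mol is not None)
```

Either `count_parsed(lambda: supplier_from_path(sdf_path))` or `count_parsed(supplier_from_chembl)` would then drive the same loop. The first keeps the file handling visible to the reader, while the second hides it behind the package.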

@greglandrum
Owner

Hi @cthoyt, sorry I'm so slow to reply to this one; I missed the notification and am just now seeing it.

I'd be happy to mention using `chembl_downloader` here (and agree that it could be useful to people who don't already have a local copy of the file), but would prefer to have that pulled out into a separate code block/section.
