-
Notifications
You must be signed in to change notification settings - Fork 1
feat: adding conda specs #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds multiple new Conda environment YAMLs and updates Snakemake rules to reference those envs; several rules replace shell steps with Python Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant SN as Snakemake
participant EF as Env YAMLs
participant CR as Conda Resolver
participant RL as Rule Runtime
participant FS as Filesystem
rect #F8FBFF
SN->>EF: read referenced envs (`../envs/*.yml`)
EF-->>SN: return env contents (new env files)
end
rect #F6FFF0
SN->>CR: request env resolution for rule (conda: ...)
CR-->>SN: resolve env and provide runtime packages
end
rect #FFF9F0
SN->>RL: execute rule
alt rule uses `run:` (Python)
RL->>FS: perform Python ops (shutil.copy, parse/write grid params)
FS-->>RL: I/O results
else rule uses `shell:`
RL->>FS: run shell command
FS-->>RL: I/O results
end
RL-->>SN: rule finished
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
workflow/envs/biopython.yml (1)
5-5
: Remove trailing whitespace.Line 5 contains trailing whitespace after
python>=3.12.4
. This should be removed for cleaner YAML formatting.workflow/envs/simple_pandas.yml (1)
5-5
: Remove trailing whitespace.Line 5 contains trailing whitespace after
python>=3.12.4
. This should be removed for cleaner YAML formatting.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
workflow/envs/biopython.yml
(1 hunks)workflow/envs/openbabel.yml
(1 hunks)workflow/envs/simple_pandas.yml
(1 hunks)workflow/rules/analyse.smk
(3 hunks)workflow/rules/preparation.smk
(5 hunks)
🔇 Additional comments (9)
workflow/rules/analyse.smk (3)
175-176
: LGTM! Ensure simple_pandas.yml includes matplotlib.The conda environment specification for makeHistogram is correctly added. However, ensure that the simple_pandas.yml environment includes matplotlib, as this rule generates PNG histogram output (line 170).
Note: This issue was flagged in the review of workflow/envs/simple_pandas.yml.
219-220
: LGTM!The conda environment specification for dockingResultsTxt is correctly added and appropriate for the CSV output generation.
427-428
: LGTM! Verify matplotlib availability.The conda environment specification for makeVenn is correctly added. Ensure that the simple_pandas.yml environment includes matplotlib for the venn.png output generation (line 423).
Note: This issue was flagged in the review of workflow/envs/simple_pandas.yml.
workflow/rules/preparation.smk (5)
35-36
: LGTM!The conda environment specification for convertMol2 correctly references openbabel.yml, which provides the required openbabel dependency.
69-70
: LGTM!The conda environment specification for SDFToPDBQT correctly references openbabel.yml, which provides the required openbabel dependency.
82-83
: LGTM! Verify pandas is needed.The conda environment specification for prepareReceptor correctly references biopython.yml. However, verify whether the pandas dependency in biopython.yml is actually required by the prepareReceptor.py script.
Note: This verification was requested in the review of workflow/envs/biopython.yml.
95-96
: LGTM!The conda environment specification for makeReceptorPDBQT correctly references openbabel.yml, which provides the required openbabel dependency.
149-150
: LGTM!The conda environment specification for energyMin correctly references openbabel.yml, which provides the required openbabel dependency for energy minimization operations.
workflow/envs/biopython.yml (1)
5-7
: Evaluate Python version requirement and pandas dependency
- Confirm that requiring Python ≥3.12.4 aligns with your target environments (this may exclude users on 3.11.x).
prepareReceptor.py
does not import pandas; if this YAML is only for that script, remove pandas. Otherwise note that pandas is used byunion_venn.py
,makeHistogram.py
,generateIRODS.py
, andResultTxt.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workflow/rules/preparation.smk (1)
35-175
: Remove merge conflict markers from preparation.smkSeveral rules (convertMol2, SDFToPDBQT, prepareReceptor, makeReceptorPDBQT, energyMin) still contain
<<<<<<<
,=======
,>>>>>>>
markers inside theirconda
blocks. Snakemake will refuse to parse this file until the conflicts are resolved. Decide on the intended indentation/commas and remove all markers.workflow/rules/analyse.smk (1)
175-438
: Resolve merge conflicts in analyse.smkThe
makeHistogram
rule still contains<<<<<<<
/=======
/>>>>>>>
markers, which makes the Snakefile invalid. Clean up the conflict and choose one conda spec before merging. While you’re in there, verify themakeVenn
conda directives—only the last one will take effect.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
workflow/envs/biopython.yml
(1 hunks)workflow/envs/openbabel.yml
(1 hunks)workflow/envs/plotting.yml
(1 hunks)workflow/envs/vinalc.yml
(1 hunks)workflow/rules/analyse.smk
(5 hunks)workflow/rules/docking.smk
(1 hunks)workflow/rules/preparation.smk
(8 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
workflow/envs/openbabel.yml
[error] 4-4: syntax error: could not find expected ':'
(syntax)
workflow/envs/biopython.yml
[error] 5-5: syntax error: could not find expected ':'
(syntax)
workflow/rules/docking.smk
Outdated
conda: | ||
"../envs/vinalc.yml", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broken conda environment reference
This rule now depends on ../envs/vinalc.yml
, but that environment cannot resolve because vinalc is not available on (bio)conda (see comment on workflow/envs/vinalc.yml). Until the package exists in a channel you publish, this rule will fail at runtime.
🤖 Prompt for AI Agents
In workflow/rules/docking.smk around lines 30-31, the conda directive points to
"../envs/vinalc.yml" which references a vinalc package not available on conda
channels and will cause runtime failures; update the rule to use a resolvable
environment: either point to an alternative env yaml that only lists packages
available on conda/bioconda (or a local vendored env file), remove the conda
stanza and switch to a supported container (Docker/Singularity) for that rule,
or modify workflow/envs/vinalc.yml to replace the unavailable vinalc dependency
with an installable alternative or build step; ensure the chosen env is
committed to the repo or uses only published channels so the rule can be
resolved at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if vinalc package exists on bioconda by fetching the recipe
curl -s "https://api.github.com/repos/bioconda/bioconda-recipes/contents/recipes/vinalc" | jq -r '.[].name' 2>/dev/null || echo "Failed to fetch recipe contents"
# Also check the meta.yaml to see version info
curl -s "https://raw.githubusercontent.com/bioconda/bioconda-recipes/master/recipes/vinalc/meta.yaml" | head -20
Length of output: 507
🌐 Web query:
Is vinalc package available on bioconda channel? What versions are available?
💡 Result:
Yes — vinalc is available on the bioconda channel (listed on Anaconda.org). The package entry shows a recent upload dated 2025-10-01. [1]
I couldn't reliably scrape the package-files page here, so to see all published builds/versions run locally (shows exact versions and platforms):
conda search -c bioconda vinalc
or view the package files on Anaconda.org (bioconda/vinalc). [1][2]
Sources:
- Anaconda.org bioconda package listing (vinalc entry). [1]
- Bioconda package index / repo listing. [2]
@cmeesters, you're absolutely right, and I apologize for the incorrect assessment! 🙏
The evidence confirms that vinalc is indeed available on bioconda:
- ✅ Recipe exists in bioconda-recipes with version 1.4.2
- ✅ Package is published on Anaconda.org under the bioconda channel (uploaded 2025-10-01)
- ✅ The version specified in your
workflow/envs/vinalc.yml
(>=1.4.2
) matches the available version
The conda environment reference in this rule is valid and will resolve correctly. The note in the PR description about vinalc not being available was likely outdated or the package became available since the PR was created.
Would you like me to resolve my original comment since the concern I raised is no longer valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
workflow/rules/docking.smk (1)
30-31
: Broken conda environment (already flagged)This conda directive was already flagged in a previous review—vinalc is not available on (bio)conda, so this rule will fail at runtime until the package is published or an alternative approach is used.
workflow/rules/analyse.smk (1)
321-322
: Broken conda environment (already flagged)This conda directive was already flagged in a previous review—vinalc is not available on (bio)conda, so this rule will fail at runtime until the package is published or an alternative approach is used.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
workflow/rules/analyse.smk
(5 hunks)workflow/rules/docking.smk
(1 hunks)workflow/rules/preparation.smk
(9 hunks)
🔇 Additional comments (7)
workflow/rules/analyse.smk (4)
175-176
: LGTM!The conda environment for plotting is correctly referenced.
219-220
: LGTM!The conda environment for simple pandas operations is correctly referenced.
284-288
: LGTM!Replacing the shell-based file copy with a Python
run
block usingshutil.copy
improves portability and maintainability.
429-430
: LGTM!The conda environment for plotting is correctly referenced, consistent with the
makeHistogram
rule.workflow/rules/preparation.smk (3)
110-111
: LGTM!The conda environment for basic utilities is correctly referenced.
166-183
: Verify the trailing space in outputThe Python run block correctly parses grid parameters and reverses their order. However, Line 183 writes
" ".join(grid_params) + " \n"
which includes a trailing space before the newline. Confirm whether this trailing space is required by downstream tools or if it should be just"\n"
.
200-202
: LGTM!Replacing the shell-based approach with a Python
run
block usingshutil.copy
improves portability and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
workflow/rules/preparation.smk (1)
166-183
: Remove unnecessary trailing space and handle empty grid parameters
- Change
f.write(" ".join(grid_params) + " \n")
tof.write(" ".join(grid_params) + "\n")
to avoid the extra space.- Optionally, add a check to raise an error if
grid_params
is empty, preventing a whitespace‐only output file.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
workflow/rules/preparation.smk
(9 hunks)
🔇 Additional comments (1)
workflow/rules/preparation.smk (1)
110-111
: LGTM!The conda directive correctly references the basic environment for the gunzip operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
workflow/rules/preparation.smk (1)
200-203
: Critical behavior change: file copy vs. path reference.The original shell command
echo {input} > {output}
wrote the file path to the output, creating a reference file. The new implementation usesshutil.copy()
to copy the file contents, fundamentally changing the rule's behavior.Comparing with the similar
prepareLibrary
rule (Lines 186-192), which still usesecho {input} > {output}
to write paths, suggests the pattern is to create reference files with paths, not copies.If the intent is to create a path reference (likely correct), revert to the original behavior. If file copying is intentional, verify all downstream consumers of this output expect file contents rather than a path reference.
Apply this diff to restore the original path-writing behavior:
- run: - import shutil - - shutil.copy(input[0], output[0]) + shell: + "echo {input} > {output}"
🧹 Nitpick comments (1)
workflow/rules/preparation.smk (1)
166-183
: Minor issues in Python conversion.The Python implementation has two minor issues:
Misplaced comment (Line 178): The comment
# Reverse the order (equivalent to 'tac')
is indented inside thewith
block but applies to thegrid_params.reverse()
call at Line 179, which is outside the block. Move the comment to Line 179 for clarity.Unusual trailing space (Line 183): The output format
" \n"
includes a trailing space before the newline, which is unusual. Verify that this matches the expected output format from the original shell implementation. If the trailing space is unintentional, use"\n"
instead.Apply this diff to address these issues:
- # Reverse the order (equivalent to 'tac') grid_params.reverse() +# Reverse the order (equivalent to 'tac') # Write to output file with space separation and trailing newline with open(output[0], "w") as f: - f.write(" ".join(grid_params) + " \n") + f.write(" ".join(grid_params) + "\n")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
workflow/rules/preparation.smk
(9 hunks)
🔇 Additional comments (1)
workflow/rules/preparation.smk (1)
110-111
: No issues with basic.yml Verified thatworkflow/envs/basic.yml
exists and contains no merge conflict markers.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
workflow/rules/preparation.smk
(8 hunks)
🔇 Additional comments (1)
workflow/rules/preparation.smk (1)
166-183
: Verify the trailing space in the output format.The implementation correctly extracts grid parameters and reverses their order. However, line 183 writes the output with a trailing space before the newline:
f.write(" ".join(grid_params) + " \n")This trailing space might be intentional for downstream parsing, but it's unusual and could be a typo.
Verify that downstream consumers of
{receptorID}_grid.txt
expect this format, or remove the trailing space if unintended.Apply this diff if the trailing space is unintended:
- f.write(" ".join(grid_params) + " \n") + f.write(" ".join(grid_params) + "\n")Otherwise, consider adding a comment explaining why the trailing space is required:
- # Write to output file with space separation and trailing newline + # Write to output file with space separation, trailing space (required by downstream parser), and newline
…ture-based-screening into feat/add_conda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
workflow/envs/biopython.yml (1)
1-4
: Pin Python for reproducibilityThe PR aims at reproducible deployments, but this env leaves Python implicit, so solver drift over time can alter the environment. Add an explicit Python requirement to stabilize builds.
dependencies: + - python>=3.12 - biopython>=1.85
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
workflow/envs/basic.yml
(1 hunks)workflow/rules/preparation.smk
(8 hunks)
🔇 Additional comments (2)
workflow/rules/preparation.smk (2)
35-36
: Conda environment directives look good.The per-rule conda environment references are correctly specified and align with the new environment files added in this PR. Based on previous review confirmations, the referenced environment files (
openbabel.yml
,biopython.yml
) have had their merge conflicts resolved and are now valid.Also applies to: 48-49, 71-72, 84-85, 97-98, 153-154
166-183
: Python migration looks correct.The migration from shell to Python run block correctly implements the grid parameter extraction logic:
- Filters lines starting with
npts
orgridcenter
- Extracts the 2nd-4th space-separated fields
- Reverses the order (equivalent to
tac
)- Writes space-separated output
The trailing space before the newline in line 183 appears intentional to match the original behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
workflow/rules/preparation.smk (1)
166-183
: LGTM! Clean migration to Python.The Python implementation correctly replicates the shell-based logic:
- Filters lines starting with "npts" or "gridcenter"
- Safely extracts fields 2-4 with boundary checks
- Reverses the parameter order (equivalent to
tac
)- Writes space-separated output
If you'd like to enhance maintainability, consider adding a brief docstring or inline comments explaining the grid parameter extraction and reversal logic:
run: # Extract grid parameters (npts and gridcenter) from AutoDock GPF file # Fields 2-4 of matching lines are collected, reversed, and output space-separated grid_params = [] with open(input[0], "r") as f: for line in f: if line.startswith(("npts", "gridcenter")): parts = line.strip().split() if len(parts) >= 4: grid_params.append(" ".join(parts[1:4])) # Reverse parameter order grid_params.reverse() with open(output[0], "w") as f: f.write(" ".join(grid_params) + " \n")Note: The trailing space in the output appears intentional for downstream compatibility.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
workflow/envs/basic.yml
(1 hunks)workflow/rules/preparation.smk
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- workflow/envs/basic.yml
🔇 Additional comments (1)
workflow/rules/preparation.smk (1)
110-111
: basic.yml is properly populated and valid with gzip>=1.14.
adding conda specs for reproducible deployments - this is a draft PR: the vinalc is not yet (bio)conda
Summary by CodeRabbit