-
-
Notifications
You must be signed in to change notification settings - Fork 207
Added management command for linking issue #2000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/mentorship-portal
Are you sure you want to change the base?
Added management command for linking issue #2000
Conversation
Summary by CodeRabbit
WalkthroughAdds Issue.level (FK to mentorship.TaskLevel) and Module.issues (M2M to github.Issue); removes Task.level and makes Task.assigned_at optional. Adds two management commands to sync module–issue links and infer issue levels from labels. Updates admin displays and Makefiles; adds normalize_name utility and supporting migrations. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (1)
43-61
: Consider optimizing the linked issues query.The command logic is sound, but there's a potential N+1 query issue where
module.linked_issues.values_list("id", flat=True)
executes a database query for each module.Consider prefetching the linked issues to avoid repeated queries:
modules = ( Module.objects.exclude(linked_issue_labels__isnull=True) .exclude(linked_issue_labels=[]) + .prefetch_related("linked_issues") .iterator() )
Alternatively, you could batch the processing by collecting all module updates and applying them together, though the current approach with individual transactions provides better error isolation.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backend/Makefile
(1 hunks)backend/apps/mentorship/admin/module.py
(1 hunks)backend/apps/mentorship/management/commands/link_issues_by_label.py
(1 hunks)backend/apps/mentorship/migrations/0004_module_linked_issue_labels_module_linked_issues.py
(1 hunks)backend/apps/mentorship/models/module.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run backend tests
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (9)
backend/apps/mentorship/models/module.py (2)
55-60
: LGTM! Well-designed JSONField for label storage.The
linked_issue_labels
field is properly configured with appropriate defaults, validation, and descriptive help text. Using a JSONField withdefault=list
is the right approach for storing GitHub issue labels.
71-77
: LGTM! Properly configured ManyToManyField relationship.The
linked_issues
field correctly establishes the many-to-many relationship with GitHub issues. Therelated_name="mentorship_modules"
provides a clear reverse relationship path, and the help text accurately describes the purpose.backend/Makefile (1)
62-63
: LGTM! Makefile target follows established conventions.The
filter-issues
target correctly follows the same pattern as other Django management command targets in the file and is positioned appropriately.backend/apps/mentorship/admin/module.py (2)
15-15
: LGTM! Good addition to admin list display.Adding
linked_issue_labels
to the list display will provide valuable visibility into which labels are associated with each module directly from the admin list view.
17-17
: LGTM! Horizontal filter improves UX for M2M field.Using
filter_horizontal
for thelinked_issues
many-to-many field provides a much better user experience than the default widget, especially when dealing with large numbers of issues.backend/apps/mentorship/migrations/0004_module_linked_issue_labels_module_linked_issues.py (1)
13-34
: LGTM! Migration correctly reflects model changes.The migration properly adds both new fields with configurations that exactly match the model definitions. Dependencies are correctly specified for both the github app (for the Issue model) and the previous mentorship migration.
backend/apps/mentorship/management/commands/link_issues_by_label.py (3)
23-30
: LGTM! Efficient approach to building label mapping.The strategy of building a complete
label_to_issue_ids
mapping upfront is efficient and avoids repeated database queries during the main processing loop. Usingprefetch_related("labels")
andonly("id")
optimizes the query.
37-41
: LGTM! Good optimization to filter modules upfront.Filtering out modules without
linked_issue_labels
before processing avoids unnecessary work. The use of.iterator()
is also good for memory efficiency when dealing with large numbers of modules.
53-54
: LGTM! Proper use of atomic transactions.Using
transaction.atomic()
for each module update ensures data consistency and provides good error isolation. The.set()
method is the correct way to update many-to-many relationships.
backend/Makefile
Outdated
@@ -59,6 +59,9 @@ enrich-data: \ | |||
owasp-enrich-events \ | |||
owasp-enrich-projects | |||
|
|||
filter-issues: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filter/link?
move this to mentorship app Makefile
modules = ( | ||
Module.objects.exclude(linked_issue_labels__isnull=True) | ||
.exclude(linked_issue_labels=[]) | ||
.iterator() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why .iterator()
?
@@ -52,6 +52,13 @@ class Meta: | |||
verbose_name="Labels", | |||
) | |||
|
|||
linked_issue_labels = models.JSONField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You already have labels
field for that
@@ -61,6 +68,14 @@ class Meta: | |||
blank=True, | |||
) | |||
|
|||
linked_issues = models.ManyToManyField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just issues
is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/apps/mentorship/Makefile (1)
1-2
: Optional: add minimal conventional targets to satisfycheckmake
Static analysis flagged missing
all
,clean
, andtest
phony targets.
If this Makefile is intentionally scoped to a single command you can ignore the warning, otherwise consider stubs to silence CI:.PHONY: all clean test all: ## default noop clean: ## noop test: ## noop
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/Makefile
(1 hunks)backend/apps/mentorship/Makefile
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/Makefile
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: backend/apps/mentorship/graphql/mutations/module.py:28-29
Timestamp: 2025-07-16T13:49:58.648Z
Learning: In the OWASP Nest mentorship system, mentors can be created with only github_user initially (without nest_user) when assigned to modules. This allows mentors to be assigned before they've signed into the system. When these users eventually sign in, the nest_user is automatically linked to the existing github_user, creating a complete mentor profile. This design provides flexibility in mentor assignment workflows.
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: frontend/src/components/ModuleForm.tsx:112-134
Timestamp: 2025-07-14T16:18:07.287Z
Learning: In the OWASP Nest mentorship system, date validation for modules (ensuring start date precedes end date) is handled on the backend in the module GraphQL mutations via the `_validate_module_dates` helper function in backend/apps/mentorship/graphql/mutations/module.py, which prevents invalid date ranges from being stored in the database.
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile
[warning] 1-1: Missing required phony target "all"
(minphony)
[warning] 1-1: Missing required phony target "clean"
(minphony)
[warning] 1-1: Missing required phony target "test"
(minphony)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (python)
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/apps/mentorship/Makefile (1)
1-2
: Align the target name with the PR documentation to prevent confusionThe PR description and Makefile snippet in the main README refer to the command as
make filter-issues
, but the actual target isfilter/link
.
Either rename the target or update the docs to keep tooling and documentation in sync.-filter/link: - @CMD="python manage.py link_issues_by_label" $(MAKE) exec-backend-command +filter-issues: + @CMD="python manage.py link_issues_by_label" $(MAKE) exec-backend-command +.PHONY: filter-issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't break existing ordering convention with the changes you add.
filter_horizontal
is a slow widget, use autocomplete_fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest focusing on simple and straight-forward solution instead of efficiency fist. We can optimize it later if needed.
label_to_issue_ids: dict[str, set[int]] = {} | ||
for issue in issues: | ||
for label in issue.labels.all(): | ||
label_to_issue_ids.setdefault(label.name, set()).add(issue.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Label name is not unique. The same name label can exist in multiple repositories.
current_ids = set(module.issues.values_list("id", flat=True)) | ||
if current_ids != matched_issue_ids: | ||
with transaction.atomic(): | ||
module.issues.set(matched_issue_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You link any issue that has a label w/ this name -- we need project related issues only. The database contains issues from multiple repositories of different organizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created map-based matching for this, so each issue and repository pair has a composite key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (3)
backend/apps/mentorship/management/commands/link_issues_by_label.py (3)
30-30
: Drop select_related('repository'); limit fields to reduce memoryYou only use issue.id and issue.repository_id, so select_related isn't needed. Limit columns to reduce memory.
- issues_query = Issue.objects.select_related("repository").prefetch_related("labels") + issues_query = Issue.objects.only("id", "repository_id").prefetch_related("labels")
68-73
: Don’t overwrite manual links by default; add --append and --dry-runset(...) replaces all existing links. Provide safer defaults and options:
- --append: add only the newly matched issues.
- --dry-run: compute and report without writing.
Apply within handle():
- with transaction.atomic(): - module.issues.set(matched_issue_ids) - - num_linked = len(matched_issue_ids) + dry_run = options.get("dry_run", False) + with transaction.atomic(): + if options.get("append"): + existing_ids = set(module.issues.values_list("id", flat=True)) + to_add = matched_issue_ids - existing_ids + if not dry_run and to_add: + module.issues.add(*to_add) + num_linked = len(to_add) + else: + if not dry_run: + module.issues.set(matched_issue_ids) + num_linked = len(matched_issue_ids)And define arguments:
def add_arguments(self, parser): parser.add_argument( "--append", action="store_true", help="Add new matches without removing existing links." ) parser.add_argument( "--dry-run", action="store_true", help="Compute matches and print changes without writing." )Would you like me to open a follow-up PR with these changes?
71-73
: Clarify and correct “total links created” accountingCurrently this sums the size of each matched set, not the number of newly-created M2M links. If you add --append, increment by len(to_add). If replacing, you can report “set N links” instead of “created”.
Example:
total_links_created += num_linked # where num_linked reflects added links in append modeAnd tweak the summary message to “set” vs “added” depending on mode.
Also applies to: 83-88
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/Makefile
(1 hunks)backend/apps/mentorship/admin/module.py
(1 hunks)backend/apps/mentorship/management/commands/link_issues_by_label.py
(1 hunks)backend/apps/mentorship/models/module.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- backend/apps/mentorship/models/module.py
- backend/apps/mentorship/admin/module.py
- backend/Makefile
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: backend/apps/mentorship/graphql/mutations/module.py:28-29
Timestamp: 2025-07-16T13:49:58.648Z
Learning: In the OWASP Nest mentorship system, mentors can be created with only github_user initially (without nest_user) when assigned to modules. This allows mentors to be assigned before they've signed into the system. When these users eventually sign in, the nest_user is automatically linked to the existing github_user, creating a complete mentor profile. This design provides flexibility in mentor assignment workflows.
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: frontend/src/components/ModuleForm.tsx:112-134
Timestamp: 2025-07-14T16:18:07.287Z
Learning: In the OWASP Nest mentorship system, date validation for modules (ensuring start date precedes end date) is handled on the backend in the module GraphQL mutations via the `_validate_module_dates` helper function in backend/apps/mentorship/graphql/mutations/module.py, which prevents invalid date ranges from being stored in the database.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run backend tests
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: CodeQL (python)
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
backend/apps/mentorship/management/commands/link_issues_by_label.py (2)
62-67
: Good: repository-scoped label matching prevents cross-repo leakageUsing (repo.id, label_name) keyed lookups scoped to module.project.repositories addresses prior concerns about label name collisions across repos and linking issues from unrelated projects.
59-60
: Ignore incorrectlinked_issue_labels
suggestionThe
Module
model only defines alabels = models.JSONField(...)
field—there is nolinked_issue_labels
attribute anywhere in the codebase. Usingmodule.labels
is correct, so you can disregard the original request to switch to a non-existent field.Likely an incorrect or invalid review comment.
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
backend/apps/mentorship/management/commands/link_issues_by_label.py (2)
52-57
: Add distinct() to prevent duplicate modules from M2M joins.This addresses previous feedback about duplicate module processing due to the M2M relationship with repositories.
modules_to_process = ( Module.objects.prefetch_related("project__repositories") .exclude(project__repositories__isnull=True) .exclude(labels__isnull=True) .exclude(labels=[]) + .distinct() )
31-41
: Performance issue: Query optimization needed to avoid loading entire database.The current implementation fetches all issues and labels from the database without filtering, which can cause significant performance issues as the database grows. This addresses the previous review feedback about scaling.
Based on the previous review suggestions, implement the optimized approach using the through table:
- issues_query = Issue.objects.select_related("repository").prefetch_related("labels") - - for issue in issues_query: - if not issue.repository_id: - continue - - for label in issue.labels.all(): - normalized_label = (label.name or "").strip().casefold() - key = (issue.repository_id, normalized_label) - repo_label_to_issue_ids.setdefault(key, set()).add(issue.id) + # First, get modules to determine which repos/labels we care about + modules_queryset = ( + Module.objects.prefetch_related("project__repositories") + .exclude(project__repositories__isnull=True) + .exclude(labels__isnull=True) + .exclude(labels=[]) + .distinct() + ) + + # Gather repos and labels of interest + repo_ids = set() + wanted_labels = set() + for module in modules_queryset: + for repo in module.project.repositories.all(): + repo_ids.add(repo.id) + for label_name in (module.labels or []): + wanted_labels.add((label_name or "").strip().casefold()) + + if not repo_ids or not wanted_labels: + self.stdout.write("No repositories or labels found to process.") + return + + # Query via through table for efficiency + Through = Issue.labels.through + pairs = ( + Through.objects + .select_related("issue__repository", "label") + .filter( + issue__repository_id__in=repo_ids, + label__name__in=wanted_labels + ) + .values_list("issue_id", "issue__repository_id", "label__name") + ) + + # Build the map efficiently + for issue_id, repo_id, label_name in pairs: + normalized_label = (label_name or "").strip().casefold() + if normalized_label in wanted_labels: # Double-check normalization + key = (repo_id, normalized_label) + repo_label_to_issue_ids.setdefault(key, set()).add(issue_id)
🧹 Nitpick comments (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (1)
76-79
: Potential performance issue: Consider using prefetch_related for assignees.The current query fetches issues and then accesses
assignees.first()
for each issue individually, which could result in N+1 query problems.if matched_issue_ids: issues = Issue.objects.filter( id__in=matched_issue_ids, assignees__isnull=False, - ).distinct() + ).prefetch_related("assignees").distinct()
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/apps/mentorship/management/commands/link_issues_by_label.py
(1 hunks)backend/apps/mentorship/migrations/0005_module_issues.py
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (3)
backend/apps/mentorship/models/module.py (2)
Module
(16-99)save
(92-99)backend/apps/mentorship/models/task.py (1)
Task
(10-88)backend/apps/github/models/generic_issue_model.py (1)
repository_id
(70-77)
🔇 Additional comments (2)
backend/apps/mentorship/migrations/0005_module_issues.py (1)
1-24
: LGTM! Clean migration for adding ManyToManyField.The migration properly adds the
issues
field to the Module model with appropriate metadata and constraints. The field configuration is well-structured with helpful text and proper related naming.backend/apps/mentorship/management/commands/link_issues_by_label.py (1)
112-112
: Repository name attribute usage confirmed.Good fix from the previous review - using
r.name
instead of the non-existentr.path
attribute.
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/link_issues_by_label.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/apps/mentorship/Makefile (1)
1-5
: Declare these targets as .PHONY and standardize verbosity.Make can skip targets if a file/dir matches the name. Mark them phony. Also consider consistent verbosity across the two commands.
+.PHONY: sync/module-issues sync/issue-levels sync/module-issues: @CMD="python manage.py sync_module_issues -v 2" $(MAKE) exec-backend-command sync/issue-levels: - @CMD="python manage.py sync_issue_levels" $(MAKE) exec-backend-command + @CMD="python manage.py sync_issue_levels -v 2" $(MAKE) exec-backend-command
🧹 Nitpick comments (10)
backend/apps/mentorship/utils.py (1)
4-6
: Type annotations and docstring alignment for normalize_nameMinor polish: reflect that casefold is used (more robust than lower) and add type hints for clarity.
Apply this diff to the function:
-def normalize_name(name): - """Normalize a string by stripping whitespace and converting to lowercase.""" - return (name or "").strip().casefold() +def normalize_name(name: "Optional[str]") -> str: + """Normalize a string by stripping whitespace and applying casefold for case-insensitive matching.""" + return (name or "").strip().casefold()Add the missing import at the top of the file:
from typing import Optionalbackend/apps/github/models/issue.py (1)
57-64
: Rename related_name and fix help_text to reflect Issue semanticsThe FK lives on Issue, so using related_name="tasks" and mentioning “task” in help_text is misleading. Recommend aligning both to “issue”.
Apply this diff:
- level = models.ForeignKey( - "mentorship.TaskLevel", - null=True, - blank=True, - on_delete=models.SET_NULL, - related_name="tasks", - help_text="The difficulty level of this task.", - ) + level = models.ForeignKey( + "mentorship.TaskLevel", + null=True, + blank=True, + on_delete=models.SET_NULL, + related_name="issues", + help_text="The difficulty level of this issue.", + )Note: This will require a migration and updating any usages of TaskLevel.tasks (if any) to TaskLevel.issues.
backend/apps/github/admin/issue.py (1)
22-22
: Add filter and select_related for level to improve admin UX and performance
- Add level to list_filter to quickly narrow issues by difficulty.
- Use list_select_related to avoid N+1 queries for repository and level in the changelist.
Proposed additions:
# Improve filters list_filter = ( "state", "is_locked", "level", ) # Avoid N+1 on FK columns shown in list_display list_select_related = ("repository", "level")backend/apps/mentorship/models/task.py (1)
28-30
: Consider indexing assigned_at since it’s used for ordering/searchWith admin ordering by assigned_at and likely frequent sorting/filtering, adding a DB index will help at scale.
Apply this diff:
- assigned_at = models.DateTimeField( - null=True, - blank=True, - help_text="Timestamp when the task was assigned to the mentee.", - ) + assigned_at = models.DateTimeField( + null=True, + blank=True, + db_index=True, + help_text="Timestamp when the task was assigned to the mentee.", + )backend/apps/mentorship/admin/task.py (1)
22-23
: Ordering by -assigned_at: consider nulls-last and select_related/search for better admin usability
- Null assigned_at will sort first on some DBs; if you want assigned tasks first, consider nulls_last (requires Django 4.1+ via get_ordering with F expression).
- Add list_select_related to avoid N+1 on issue/assignee/module columns in list_display.
- Restoring simple search_fields will improve discoverability.
Option (if Django 4.1+):
from django.db.models import F def get_ordering(self, request): return [F("assigned_at").desc(nulls_last=True)]Additional enhancements:
list_select_related = ("issue", "assignee", "module") search_fields = ( "issue__title", "assignee__login", "module__name", )backend/apps/mentorship/management/commands/sync_issue_levels.py (2)
1-1
: Fix docstring typo: “Tasklevel” → “TaskLevel”.Minor polish.
-"""A command to sync issue level with Tasklevel.""" +"""A command to sync issue level with TaskLevel."""
70-77
: Iterate with a cursor to reduce memory footprint on large datasets.Using QuerySet.iterator() prevents loading all Issues into memory at once.
- for issue in issues_query: + for issue in issues_query.iterator(chunk_size=2000): issue_labels_normalized = {normalize_name(label.name) for label in issue.labels.all()}backend/apps/mentorship/management/commands/sync_module_issues.py (3)
60-67
: Use explicit None-check on repository_id and rely on the FK’s numeric id for dict keys.Guard against pathological cases and keep keys as ints.
- issues_query = Issue.objects.select_related("repository").prefetch_related("labels") + issues_query = Issue.objects.select_related("repository").prefetch_related("labels") for issue in issues_query: - if not issue.repository_id: + if issue.repository_id is None: continue for label in issue.labels.all(): normalized_label = normalize_name(label.name) - key = (issue.repository_id, normalized_label) + key = (int(issue.repository_id), normalized_label) repo_label_to_issue_ids.setdefault(key, set()).add(issue.id)
24-31
: Simplify by removing URL parsing at callsites.Now that _extract_repo_full_name accepts the repo object, pass the Repository instance directly.
- repo_full_name = self._extract_repo_full_name(str(issue.repository.url)) + repo_full_name = self._extract_repo_full_name(issue.repository)Also applies to: 113-129
1-2
: Polish the command’s help text.Minor grammar fix.
-"""A command to sync update relation between module and issue and create task.""" +"""Sync module–issue relations and create tasks."""
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (10)
backend/apps/github/admin/issue.py
(1 hunks)backend/apps/github/migrations/0035_issue_level.py
(1 hunks)backend/apps/github/models/issue.py
(1 hunks)backend/apps/mentorship/Makefile
(1 hunks)backend/apps/mentorship/admin/task.py
(1 hunks)backend/apps/mentorship/management/commands/sync_issue_levels.py
(1 hunks)backend/apps/mentorship/management/commands/sync_module_issues.py
(1 hunks)backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py
(1 hunks)backend/apps/mentorship/models/task.py
(1 hunks)backend/apps/mentorship/utils.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
backend/apps/mentorship/utils.py (1)
backend/apps/mentorship/api/internal/nodes/mentor.py (1)
name
(18-20)
backend/apps/github/migrations/0035_issue_level.py (1)
backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py (1)
Migration
(6-25)
backend/apps/mentorship/management/commands/sync_issue_levels.py (5)
backend/apps/github/models/issue.py (1)
Issue
(18-220)backend/apps/github/models/label.py (1)
Label
(9-77)backend/apps/mentorship/models/task_level.py (1)
TaskLevel
(8-61)backend/apps/mentorship/utils.py (1)
normalize_name
(4-6)backend/apps/mentorship/management/commands/sync_module_issues.py (2)
Command
(16-212)handle
(176-212)
backend/apps/mentorship/management/commands/sync_module_issues.py (5)
backend/apps/github/models/issue.py (2)
Issue
(18-220)save
(173-182)backend/apps/mentorship/models/task.py (1)
Task
(10-80)backend/apps/mentorship/utils.py (1)
normalize_name
(4-6)backend/apps/mentorship/management/commands/sync_issue_levels.py (1)
Command
(12-97)backend/apps/github/models/generic_issue_model.py (1)
repository_id
(70-77)
🪛 GitHub Check: CodeQL
backend/apps/mentorship/management/commands/sync_module_issues.py
[failure] 26-26: Incomplete URL substring sanitization
The string github.com may be at an arbitrary position in the sanitized URL.
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile
[warning] 1-1: Missing required phony target "all"
(minphony)
[warning] 1-1: Missing required phony target "clean"
(minphony)
[warning] 1-1: Missing required phony target "test"
(minphony)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (6)
backend/apps/github/admin/issue.py (1)
22-22
: Expose level in list — good additionSurfacing level in the changelist is helpful for triage.
backend/apps/github/migrations/0035_issue_level.py (1)
8-11
: Migration wiring looks correct.Dependencies are sensible and align with the removal of Task.level and the addition of Issue.level.
backend/apps/mentorship/management/commands/sync_issue_levels.py (1)
20-38
: Mapping logic is solid and efficient.Building a per-module normalized label/level map is a good trade-off. Prefetching only Label.name is also a nice touch.
backend/apps/mentorship/management/commands/sync_module_issues.py (3)
136-159
: Reassigning task.module is correct.Good to realign tasks when module linkage changes.
24-31
: Use Repository.path and urlparse for safe owner/repo extractionThe Django Repository model already exposes a
.path
property ("owner/name"
), so there’s no need for a fragile regex on the URL. If you must parse a URL string (e.g., for nonstandard URLs), useurllib.parse.urlparse
and split the path component:--- a/backend/apps/mentorship/management/commands/sync_module_issues.py +++ b/backend/apps/mentorship/management/commands/sync_module_issues.py @@ def _extract_repo_full_name(self, repo_url): - if repo_url and "github.com" in repo_url: - match = re.search(r"github\.com/([^/]+/[^/]+)", repo_url) - if match: - return match.group(1) - return None + if not repo_url: + return None + # Prefer the model’s path property when you have an Issue.repository instance: + # return issue.repository.path + # For URL strings, parse the path component robustly: + from urllib.parse import urlparse + parsed = urlparse(repo_url) + path = parsed.path.lstrip("/") + parts = path.split("/") + if len(parts) >= 2: + return f"{parts[0]}/{parts[1]}" + return None• Remove any references to a non-existent
Repository.full_name
—useissue.repository.path
(orstr(issue.repository)
) in your logging
• This change fully addresses the CodeQL finding by avoiding substring-based regex on URLs and leveraging built-in URL parsing and model propertiesLikely an incorrect or invalid review comment.
176-213
: End-to-end flow is cohesive and logs are helpful.Single-pass map build + per-module processing + final summary is clear and maintainable. Error handling around GitHub API looks reasonable.
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)
23-34
: Harden repo full_name extraction; prefer model attribute; fix host-case; support api.github.com.Current parsing misses api.github.com, is case-sensitive, and depends on stringified URLs. Prefer Repository.full_name when available to avoid URL parsing entirely. This also addresses the CodeQL “incomplete URL sanitization” family of findings.
- ALLOWED_GITHUB_HOSTS = {"github.com", "www.github.com"} + ALLOWED_GITHUB_HOSTS = {"github.com", "www.github.com", "api.github.com"} @@ - def _extract_repo_full_name(self, repo_url): - parsed = urlparse(repo_url or "") - if parsed.netloc in self.ALLOWED_GITHUB_HOSTS: - parts = parsed.path.strip("/").split("/") - if len(parts) >= self.REPO_PATH_PARTS: - return "/".join(parts[: self.REPO_PATH_PARTS]) - return None - return None + def _extract_repo_full_name(self, repo_or_url): + """Return 'owner/repo' using Repository.full_name if present, else parse URL safely.""" + full_name = getattr(repo_or_url, "full_name", None) + if full_name: + return full_name + + parsed = urlparse(str(repo_or_url or "")) + host = (parsed.netloc or "").lower() + parts = [p for p in (parsed.path or "").strip("/").split("/") if p] + if host in self.ALLOWED_GITHUB_HOSTS and len(parts) >= 2: + # Handle api.github.com/repos/{owner}/{repo} + if host == "api.github.com" and parts[0] == "repos" and len(parts) >= 3: + return f"{parts[1]}/{parts[2]}" + return f"{parts[0]}/{parts[1]}" + return None
🧹 Nitpick comments (6)
backend/apps/mentorship/management/commands/sync_module_issues.py (6)
1-2
: Polish the command docstring for clarity.Reword to read more naturally.
-"""A command to sync update relation between module and issue and create task.""" +"""Sync module–issue links and create/update mentorship tasks."""
97-106
: Pre-normalize labels once per module.Minor perf/readability improvement.
- linked_label_names = module.labels + linked_label_names = module.labels + normalized_labels = {normalize_name(n) for n in (linked_label_names or [])} @@ - for repo in project_repos: - for label_name in linked_label_names: - normalized_label = normalize_name(label_name) + for repo in project_repos: + for normalized_label in normalized_labels: key = (repo.id, normalized_label) issues_for_label = repo_label_to_issue_ids.get(key, set()) matched_issue_ids.update(issues_for_label)
165-169
: Log owner/repo for clarity.Show repository owner/name, not just name.
- f"{issue.repository.name}#{issue.number} " + f"{getattr(issue.repository, 'full_name', issue.repository.name)}#{issue.number} "
200-201
: Tidy startup log.Minor cosmetic tweak.
- self.stdout.write("starting...") + self.stdout.write("Starting...")
23-34
: Add unit tests for repo name extraction edge cases.Cover: api.github.com/repos/{owner}/{repo}, mixed-case hosts, trailing slashes, extra path segments, None/empty input, Repository instances with/without full_name.
I can generate pytest tests under backend/apps/mentorship/tests/ for _extract_repo_full_name. Want me to open a follow-up PR?
122-169
: GitHub API rate limits: minimize calls and cache aggressively.The proposed refactor reduces calls, but consider short-circuiting when Task is COMPLETED and issue is still closed (no event fetch needed).
I can add guards to skip event lookups for closed issues and memoize assigned_at per (repo_full_name, issue.number, assignee.login) during a single run.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
backend/apps/mentorship/management/commands/sync_module_issues.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
📚 Learning: 2025-08-17T11:55:55.990Z
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
Applied to files:
backend/apps/mentorship/management/commands/sync_module_issues.py
🧬 Code graph analysis (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (4)
backend/apps/github/models/issue.py (2)
Issue
(18-220)save
(173-182)backend/apps/mentorship/models/task.py (2)
Task
(10-80)Status
(19-25)backend/apps/mentorship/utils.py (1)
normalize_name
(4-6)backend/apps/github/models/generic_issue_model.py (1)
repository_id
(70-77)
🔇 Additional comments (5)
backend/apps/mentorship/management/commands/sync_module_issues.py (5)
108-121
: Atomic link set is correct.Using a transaction and module.issues.set(...) is appropriate.
148-159
: Status mapping looks good and respects business rules (single assignee).No change requested.
183-197
: Clear per-module summary.Good UX; conditional detail by verbosity is appropriate.
151-181
: Consider updating assigned_at for existing tasks when it’s missing.If a Task exists without assigned_at, we should backfill it when available (the refactor above includes this).
Would you like me to add a one-off data migration to backfill assigned_at for existing tasks using the same logic?
165-169
: Ignore full_name fallback suggestion;Repository
model has nofull_name
field and usingissue.repository.name
is correct.Likely an incorrect or invalid review comment.
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
backend/apps/mentorship/management/commands/sync_module_issues.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Rajgupta36 this PR has been sitting here for a while with unresolved coderabbit's comments
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)
128-133
: Make assignee selection deterministic.first() on an unordered M2M can vary by backend; order by PK (or desired field) to keep stable behavior. Note: honoring the “one assignee per issue” rule matches the documented product decision.
- assignee = issue.assignees.first() + assignee = issue.assignees.order_by("id").first()I’m applying your stored learning that one task is created per issue using only the first assignee.
🧹 Nitpick comments (6)
backend/apps/mentorship/management/commands/sync_module_issues.py (6)
1-1
: Polish the command docstring for clarity.Suggest: “Sync the relation between modules and issues and create tasks.”
51-67
: Select the most recent matching assignment event explicitly.PyGithub event ordering isn’t guaranteed; use max(created_at) instead of the last iterated value.
- last_dt = None - for event in gh_issue.get_events(): + last_dt = None + for event in gh_issue.get_events(): if ( event.event == "assigned" and event.assignee and event.assignee.login == assignee_login ): - last_dt = event.created_at + if last_dt is None or event.created_at > last_dt: + last_dt = event.created_at
75-91
: Optional: build the map only for labels used by modules to reduce DB work.Collect normalized labels from target modules first and filter the Issue query by those labels; this can cut rows scanned on large datasets.
114-117
: Reduce transaction scope; avoid holding DB locks during external API calls.Wrap only the minimal DB mutations (e.g., module.issues.set and per-task saves) in atomic blocks; perform GitHub lookups outside to lower contention and latency.
Would you like a follow-up patch that narrows the atomic blocks while preserving consistency?
118-125
: Drop unused prefetch('labels') on issues queryset.Labels aren’t used in the loop; removing this reduces memory and query work.
- .prefetch_related("assignees", "labels") + .prefetch_related("assignees")
170-178
: Minor: include owner/repo in creation log for clarity.Using the repo’s full name (owner/repo) improves traceability across organizations.
- f"{issue.repository.name}#{issue.number} " + f"{getattr(issue.repository, 'path', issue.repository.name)}#{issue.number} "
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/Makefile
(1 hunks)backend/apps/mentorship/management/commands/sync_module_issues.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/Makefile
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
📚 Learning: 2025-08-17T11:55:55.990Z
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
Applied to files:
backend/apps/mentorship/management/commands/sync_module_issues.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run backend tests
- GitHub Check: Run frontend unit tests
🔇 Additional comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)
201-238
: No changes needed—Task.Meta.unique_together enforces (issue, assignee) uniqueness.
Proposed change
Checklist
sync/module-issues
sync/issue-levels
preview