Skip to content

Conversation

Rajgupta36
Copy link
Collaborator

@Rajgupta36 Rajgupta36 commented Aug 6, 2025

Proposed change

Checklist

  • Added management command sync/module-issues sync/issue-levels
  • Update makefile added make filter-issues cmd
  • Updated models

preview

image image

Copy link
Contributor

coderabbitai bot commented Aug 6, 2025

Summary by CodeRabbit

  • New Features
    • Modules can link to related GitHub issues; admin now offers autocomplete when selecting them.
    • GitHub Issues now support a difficulty level and display it in the admin list.
    • Two sync commands added to populate module–issue links and set issue difficulty from labels.
    • Tasks show most recently assigned first; assignment time is optional.
  • Chores
    • Makefile updates to include mentorship app and add targets to run the new sync commands.

Walkthrough

Adds Issue.level (FK to mentorship.TaskLevel) and Module.issues (M2M to github.Issue); removes Task.level and makes Task.assigned_at optional. Adds two management commands to sync module–issue links and infer issue levels from labels. Updates admin displays and Makefiles; adds normalize_name utility and supporting migrations.

Changes

Cohort / File(s) Summary
Build integration
backend/Makefile, backend/apps/mentorship/Makefile
Include mentorship Makefile in top-level Makefile; add sync/module-issues and sync/issue-levels targets that invoke Django management commands.
Mentorship models & migrations
backend/apps/mentorship/models/module.py, backend/apps/mentorship/models/task.py, backend/apps/mentorship/migrations/0005_module_issues.py, backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py
Add Module.issues M2M to github.Issue; remove Task.level FK; change Task.assigned_at to null=True, blank=True; include migrations.
Mentorship admin
backend/apps/mentorship/admin/module.py, backend/apps/mentorship/admin/task.py
ModuleAdmin adds autocomplete_fields = ("issues",); TaskAdmin removes search_fields and adds ordering = ["-assigned_at"].
Mentorship management commands & utils
backend/apps/mentorship/management/commands/sync_issue_levels.py, backend/apps/mentorship/management/commands/sync_module_issues.py, backend/apps/mentorship/utils.py
Add sync_issue_levels command to infer and set Issue.level from labels per-module; add sync_module_issues to link issues to modules and create/update Tasks for assignees; add normalize_name helper.
GitHub models, migrations & admin
backend/apps/github/models/issue.py, backend/apps/github/migrations/0035_issue_level.py, backend/apps/github/admin/issue.py
Add optional Issue.level FK to mentorship.TaskLevel (SET_NULL) with migration; include level in IssueAdmin.list_display.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • kasya
  • arkid15r
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Rajgupta36 Rajgupta36 changed the title added adition fields and job for issue filter by labels Added management command for linking issue Aug 6, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (1)

43-61: Consider optimizing the linked issues query.

The command logic is sound, but there's a potential N+1 query issue where module.linked_issues.values_list("id", flat=True) executes a database query for each module.

Consider prefetching the linked issues to avoid repeated queries:

 modules = (
     Module.objects.exclude(linked_issue_labels__isnull=True)
     .exclude(linked_issue_labels=[])
+    .prefetch_related("linked_issues")
     .iterator()
 )

Alternatively, you could batch the processing by collecting all module updates and applying them together, though the current approach with individual transactions provides better error isolation.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4424d3 and 5cf863d.

📒 Files selected for processing (5)
  • backend/Makefile (1 hunks)
  • backend/apps/mentorship/admin/module.py (1 hunks)
  • backend/apps/mentorship/management/commands/link_issues_by_label.py (1 hunks)
  • backend/apps/mentorship/migrations/0004_module_linked_issue_labels_module_linked_issues.py (1 hunks)
  • backend/apps/mentorship/models/module.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (9)
backend/apps/mentorship/models/module.py (2)

55-60: LGTM! Well-designed JSONField for label storage.

The linked_issue_labels field is properly configured with appropriate defaults, validation, and descriptive help text. Using a JSONField with default=list is the right approach for storing GitHub issue labels.


71-77: LGTM! Properly configured ManyToManyField relationship.

The linked_issues field correctly establishes the many-to-many relationship with GitHub issues. The related_name="mentorship_modules" provides a clear reverse relationship path, and the help text accurately describes the purpose.

backend/Makefile (1)

62-63: LGTM! Makefile target follows established conventions.

The filter-issues target correctly follows the same pattern as other Django management command targets in the file and is positioned appropriately.

backend/apps/mentorship/admin/module.py (2)

15-15: LGTM! Good addition to admin list display.

Adding linked_issue_labels to the list display will provide valuable visibility into which labels are associated with each module directly from the admin list view.


17-17: LGTM! Horizontal filter improves UX for M2M field.

Using filter_horizontal for the linked_issues many-to-many field provides a much better user experience than the default widget, especially when dealing with large numbers of issues.

backend/apps/mentorship/migrations/0004_module_linked_issue_labels_module_linked_issues.py (1)

13-34: LGTM! Migration correctly reflects model changes.

The migration properly adds both new fields with configurations that exactly match the model definitions. Dependencies are correctly specified for both the github app (for the Issue model) and the previous mentorship migration.

backend/apps/mentorship/management/commands/link_issues_by_label.py (3)

23-30: LGTM! Efficient approach to building label mapping.

The strategy of building a complete label_to_issue_ids mapping upfront is efficient and avoids repeated database queries during the main processing loop. Using prefetch_related("labels") and only("id") optimizes the query.


37-41: LGTM! Good optimization to filter modules upfront.

Filtering out modules without linked_issue_labels before processing avoids unnecessary work. The use of .iterator() is also good for memory efficiency when dealing with large numbers of modules.


53-54: LGTM! Proper use of atomic transactions.

Using transaction.atomic() for each module update ensures data consistency and provides good error isolation. The .set() method is the correct way to update many-to-many relationships.

backend/Makefile Outdated
@@ -59,6 +59,9 @@ enrich-data: \
owasp-enrich-events \
owasp-enrich-projects

filter-issues:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter/link?

move this to mentorship app Makefile

modules = (
Module.objects.exclude(linked_issue_labels__isnull=True)
.exclude(linked_issue_labels=[])
.iterator()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why .iterator()?

@@ -52,6 +52,13 @@ class Meta:
verbose_name="Labels",
)

linked_issue_labels = models.JSONField(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already have labels field for that

@@ -61,6 +68,14 @@ class Meta:
blank=True,
)

linked_issues = models.ManyToManyField(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just issues is fine.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/apps/mentorship/Makefile (1)

1-2: Optional: add minimal conventional targets to satisfy checkmake

Static analysis flagged missing all, clean, and test phony targets.
If this Makefile is intentionally scoped to a single command you can ignore the warning, otherwise consider stubs to silence CI:

.PHONY: all clean test
all: ## default noop
clean: ## noop
test: ## noop
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de17ed7 and ccd183c.

📒 Files selected for processing (2)
  • backend/Makefile (1 hunks)
  • backend/apps/mentorship/Makefile (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/Makefile
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: backend/apps/mentorship/graphql/mutations/module.py:28-29
Timestamp: 2025-07-16T13:49:58.648Z
Learning: In the OWASP Nest mentorship system, mentors can be created with only github_user initially (without nest_user) when assigned to modules. This allows mentors to be assigned before they've signed into the system. When these users eventually sign in, the nest_user is automatically linked to the existing github_user, creating a complete mentor profile. This design provides flexibility in mentor assignment workflows.
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: frontend/src/components/ModuleForm.tsx:112-134
Timestamp: 2025-07-14T16:18:07.287Z
Learning: In the OWASP Nest mentorship system, date validation for modules (ensuring start date precedes end date) is handled on the backend in the module GraphQL mutations via the `_validate_module_dates` helper function in backend/apps/mentorship/graphql/mutations/module.py, which prevents invalid date ranges from being stored in the database.
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/apps/mentorship/Makefile (1)

1-2: Align the target name with the PR documentation to prevent confusion

The PR description and Makefile snippet in the main README refer to the command as make filter-issues, but the actual target is filter/link.
Either rename the target or update the docs to keep tooling and documentation in sync.

-filter/link:
-	@CMD="python manage.py link_issues_by_label" $(MAKE) exec-backend-command
+filter-issues:
+	@CMD="python manage.py link_issues_by_label" $(MAKE) exec-backend-command
+.PHONY: filter-issues

@Rajgupta36 Rajgupta36 requested a review from arkid15r August 7, 2025 07:41
Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't break existing ordering convention with the changes you add.

filter_horizontal is a slow widget, use autocomplete_fields

Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest focusing on simple and straight-forward solution instead of efficiency fist. We can optimize it later if needed.

label_to_issue_ids: dict[str, set[int]] = {}
for issue in issues:
for label in issue.labels.all():
label_to_issue_ids.setdefault(label.name, set()).add(issue.id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Label name is not unique. The same name label can exist in multiple repositories.

current_ids = set(module.issues.values_list("id", flat=True))
if current_ids != matched_issue_ids:
with transaction.atomic():
module.issues.set(matched_issue_ids)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You link any issue that has a label w/ this name -- we need project related issues only. The database contains issues from multiple repositories of different organizations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created map-based matching for this, so each issue and repository pair has a composite key.

@Rajgupta36 Rajgupta36 requested a review from arkid15r August 7, 2025 20:21
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
backend/apps/mentorship/management/commands/link_issues_by_label.py (3)

30-30: Drop select_related('repository'); limit fields to reduce memory

You only use issue.id and issue.repository_id, so select_related isn't needed. Limit columns to reduce memory.

-        issues_query = Issue.objects.select_related("repository").prefetch_related("labels")
+        issues_query = Issue.objects.only("id", "repository_id").prefetch_related("labels")

68-73: Don’t overwrite manual links by default; add --append and --dry-run

set(...) replaces all existing links. Provide safer defaults and options:

  • --append: add only the newly matched issues.
  • --dry-run: compute and report without writing.

Apply within handle():

-            with transaction.atomic():
-                module.issues.set(matched_issue_ids)
-
-            num_linked = len(matched_issue_ids)
+            dry_run = options.get("dry_run", False)
+            with transaction.atomic():
+                if options.get("append"):
+                    existing_ids = set(module.issues.values_list("id", flat=True))
+                    to_add = matched_issue_ids - existing_ids
+                    if not dry_run and to_add:
+                        module.issues.add(*to_add)
+                    num_linked = len(to_add)
+                else:
+                    if not dry_run:
+                        module.issues.set(matched_issue_ids)
+                    num_linked = len(matched_issue_ids)

And define arguments:

def add_arguments(self, parser):
    parser.add_argument(
        "--append", action="store_true",
        help="Add new matches without removing existing links."
    )
    parser.add_argument(
        "--dry-run", action="store_true",
        help="Compute matches and print changes without writing."
    )

Would you like me to open a follow-up PR with these changes?


71-73: Clarify and correct “total links created” accounting

Currently this sums the size of each matched set, not the number of newly-created M2M links. If you add --append, increment by len(to_add). If replacing, you can report “set N links” instead of “created”.

Example:

total_links_created += num_linked  # where num_linked reflects added links in append mode

And tweak the summary message to “set” vs “added” depending on mode.

Also applies to: 83-88

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ccd183c and 39b1236.

📒 Files selected for processing (4)
  • backend/Makefile (1 hunks)
  • backend/apps/mentorship/admin/module.py (1 hunks)
  • backend/apps/mentorship/management/commands/link_issues_by_label.py (1 hunks)
  • backend/apps/mentorship/models/module.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/apps/mentorship/models/module.py
  • backend/apps/mentorship/admin/module.py
  • backend/Makefile
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: backend/apps/mentorship/graphql/mutations/module.py:28-29
Timestamp: 2025-07-16T13:49:58.648Z
Learning: In the OWASP Nest mentorship system, mentors can be created with only github_user initially (without nest_user) when assigned to modules. This allows mentors to be assigned before they've signed into the system. When these users eventually sign in, the nest_user is automatically linked to the existing github_user, creating a complete mentor profile. This design provides flexibility in mentor assignment workflows.
Learnt from: Rajgupta36
PR: OWASP/Nest#1717
File: frontend/src/components/ModuleForm.tsx:112-134
Timestamp: 2025-07-14T16:18:07.287Z
Learning: In the OWASP Nest mentorship system, date validation for modules (ensuring start date precedes end date) is handled on the backend in the module GraphQL mutations via the `_validate_module_dates` helper function in backend/apps/mentorship/graphql/mutations/module.py, which prevents invalid date ranges from being stored in the database.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
backend/apps/mentorship/management/commands/link_issues_by_label.py (2)

62-67: Good: repository-scoped label matching prevents cross-repo leakage

Using (repo.id, label_name) keyed lookups scoped to module.project.repositories addresses prior concerns about label name collisions across repos and linking issues from unrelated projects.


59-60: Ignore incorrect linked_issue_labels suggestion

The Module model only defines a labels = models.JSONField(...) field—there is no linked_issue_labels attribute anywhere in the codebase. Using module.labels is correct, so you can disregard the original request to switch to a non-existent field.

Likely an incorrect or invalid review comment.

@Rajgupta36 Rajgupta36 marked this pull request as draft August 13, 2025 09:48
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
backend/apps/mentorship/management/commands/link_issues_by_label.py (2)

52-57: Add distinct() to prevent duplicate modules from M2M joins.

This addresses previous feedback about duplicate module processing due to the M2M relationship with repositories.

         modules_to_process = (
             Module.objects.prefetch_related("project__repositories")
             .exclude(project__repositories__isnull=True)
             .exclude(labels__isnull=True)
             .exclude(labels=[])
+            .distinct()
         )

31-41: Performance issue: Query optimization needed to avoid loading entire database.

The current implementation fetches all issues and labels from the database without filtering, which can cause significant performance issues as the database grows. This addresses the previous review feedback about scaling.

Based on the previous review suggestions, implement the optimized approach using the through table:

-        issues_query = Issue.objects.select_related("repository").prefetch_related("labels")
-
-        for issue in issues_query:
-            if not issue.repository_id:
-                continue
-
-            for label in issue.labels.all():
-                normalized_label = (label.name or "").strip().casefold()
-                key = (issue.repository_id, normalized_label)
-                repo_label_to_issue_ids.setdefault(key, set()).add(issue.id)
+        # First, get modules to determine which repos/labels we care about
+        modules_queryset = (
+            Module.objects.prefetch_related("project__repositories")
+            .exclude(project__repositories__isnull=True)
+            .exclude(labels__isnull=True)
+            .exclude(labels=[])
+            .distinct()
+        )
+        
+        # Gather repos and labels of interest
+        repo_ids = set()
+        wanted_labels = set()
+        for module in modules_queryset:
+            for repo in module.project.repositories.all():
+                repo_ids.add(repo.id)
+            for label_name in (module.labels or []):
+                wanted_labels.add((label_name or "").strip().casefold())
+        
+        if not repo_ids or not wanted_labels:
+            self.stdout.write("No repositories or labels found to process.")
+            return
+            
+        # Query via through table for efficiency
+        Through = Issue.labels.through
+        pairs = (
+            Through.objects
+            .select_related("issue__repository", "label")
+            .filter(
+                issue__repository_id__in=repo_ids,
+                label__name__in=wanted_labels
+            )
+            .values_list("issue_id", "issue__repository_id", "label__name")
+        )
+        
+        # Build the map efficiently
+        for issue_id, repo_id, label_name in pairs:
+            normalized_label = (label_name or "").strip().casefold()
+            if normalized_label in wanted_labels:  # Double-check normalization
+                key = (repo_id, normalized_label)
+                repo_label_to_issue_ids.setdefault(key, set()).add(issue_id)
🧹 Nitpick comments (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (1)

76-79: Potential performance issue: Consider using prefetch_related for assignees.

The current query fetches issues and then accesses assignees.first() for each issue individually, which could result in N+1 query problems.

                 if matched_issue_ids:
                     issues = Issue.objects.filter(
                         id__in=matched_issue_ids,
                         assignees__isnull=False,
-                    ).distinct()
+                    ).prefetch_related("assignees").distinct()
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac10863 and ad1f1d8.

📒 Files selected for processing (2)
  • backend/apps/mentorship/management/commands/link_issues_by_label.py (1 hunks)
  • backend/apps/mentorship/migrations/0005_module_issues.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/apps/mentorship/management/commands/link_issues_by_label.py (3)
backend/apps/mentorship/models/module.py (2)
  • Module (16-99)
  • save (92-99)
backend/apps/mentorship/models/task.py (1)
  • Task (10-88)
backend/apps/github/models/generic_issue_model.py (1)
  • repository_id (70-77)
🔇 Additional comments (2)
backend/apps/mentorship/migrations/0005_module_issues.py (1)

1-24: LGTM! Clean migration for adding ManyToManyField.

The migration properly adds the issues field to the Module model with appropriate metadata and constraints. The field configuration is well-structured with helpful text and proper related naming.

backend/apps/mentorship/management/commands/link_issues_by_label.py (1)

112-112: Repository name attribute usage confirmed.

Good fix from the previous review - using r.name instead of the non-existent r.path attribute.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
backend/apps/mentorship/Makefile (1)

1-5: Declare these targets as .PHONY and standardize verbosity.

Make can skip targets if a file/dir matches the name. Mark them phony. Also consider consistent verbosity across the two commands.

+.PHONY: sync/module-issues sync/issue-levels
 sync/module-issues:
 	@CMD="python manage.py sync_module_issues -v 2" $(MAKE) exec-backend-command

 sync/issue-levels:
-	@CMD="python manage.py sync_issue_levels" $(MAKE) exec-backend-command
+	@CMD="python manage.py sync_issue_levels -v 2" $(MAKE) exec-backend-command
🧹 Nitpick comments (10)
backend/apps/mentorship/utils.py (1)

4-6: Type annotations and docstring alignment for normalize_name

Minor polish: reflect that casefold is used (more robust than lower) and add type hints for clarity.

Apply this diff to the function:

-def normalize_name(name):
-    """Normalize a string by stripping whitespace and converting to lowercase."""
-    return (name or "").strip().casefold()
+def normalize_name(name: "Optional[str]") -> str:
+    """Normalize a string by stripping whitespace and applying casefold for case-insensitive matching."""
+    return (name or "").strip().casefold()

Add the missing import at the top of the file:

from typing import Optional
backend/apps/github/models/issue.py (1)

57-64: Rename related_name and fix help_text to reflect Issue semantics

The FK lives on Issue, so using related_name="tasks" and mentioning “task” in help_text is misleading. Recommend aligning both to “issue”.

Apply this diff:

-    level = models.ForeignKey(
-        "mentorship.TaskLevel",
-        null=True,
-        blank=True,
-        on_delete=models.SET_NULL,
-        related_name="tasks",
-        help_text="The difficulty level of this task.",
-    )
+    level = models.ForeignKey(
+        "mentorship.TaskLevel",
+        null=True,
+        blank=True,
+        on_delete=models.SET_NULL,
+        related_name="issues",
+        help_text="The difficulty level of this issue.",
+    )

Note: This will require a migration and updating any usages of TaskLevel.tasks (if any) to TaskLevel.issues.

backend/apps/github/admin/issue.py (1)

22-22: Add filter and select_related for level to improve admin UX and performance

  • Add level to list_filter to quickly narrow issues by difficulty.
  • Use list_select_related to avoid N+1 queries for repository and level in the changelist.

Proposed additions:

# Improve filters
list_filter = (
    "state",
    "is_locked",
    "level",
)

# Avoid N+1 on FK columns shown in list_display
list_select_related = ("repository", "level")
backend/apps/mentorship/models/task.py (1)

28-30: Consider indexing assigned_at since it’s used for ordering/search

With admin ordering by assigned_at and likely frequent sorting/filtering, adding a DB index will help at scale.

Apply this diff:

-    assigned_at = models.DateTimeField(
-        null=True,
-        blank=True,
-        help_text="Timestamp when the task was assigned to the mentee.",
-    )
+    assigned_at = models.DateTimeField(
+        null=True,
+        blank=True,
+        db_index=True,
+        help_text="Timestamp when the task was assigned to the mentee.",
+    )
backend/apps/mentorship/admin/task.py (1)

22-23: Ordering by -assigned_at: consider nulls-last and select_related/search for better admin usability

  • Null assigned_at will sort first on some DBs; if you want assigned tasks first, consider nulls_last (requires Django 4.1+ via get_ordering with F expression).
  • Add list_select_related to avoid N+1 on issue/assignee/module columns in list_display.
  • Restoring simple search_fields will improve discoverability.

Option (if Django 4.1+):

from django.db.models import F

def get_ordering(self, request):
    return [F("assigned_at").desc(nulls_last=True)]

Additional enhancements:

list_select_related = ("issue", "assignee", "module")

search_fields = (
    "issue__title",
    "assignee__login",
    "module__name",
)
backend/apps/mentorship/management/commands/sync_issue_levels.py (2)

1-1: Fix docstring typo: “Tasklevel” → “TaskLevel”.

Minor polish.

-"""A command to sync issue level with Tasklevel."""
+"""A command to sync issue level with TaskLevel."""

70-77: Iterate with a cursor to reduce memory footprint on large datasets.

Using QuerySet.iterator() prevents loading all Issues into memory at once.

-        for issue in issues_query:
+        for issue in issues_query.iterator(chunk_size=2000):
             issue_labels_normalized = {normalize_name(label.name) for label in issue.labels.all()}
backend/apps/mentorship/management/commands/sync_module_issues.py (3)

60-67: Use explicit None-check on repository_id and rely on the FK’s numeric id for dict keys.

Guard against pathological cases and keep keys as ints.

-        issues_query = Issue.objects.select_related("repository").prefetch_related("labels")
+        issues_query = Issue.objects.select_related("repository").prefetch_related("labels")
         for issue in issues_query:
-            if not issue.repository_id:
+            if issue.repository_id is None:
                 continue
             for label in issue.labels.all():
                 normalized_label = normalize_name(label.name)
-                key = (issue.repository_id, normalized_label)
+                key = (int(issue.repository_id), normalized_label)
                 repo_label_to_issue_ids.setdefault(key, set()).add(issue.id)

24-31: Simplify by removing URL parsing at callsites.

Now that _extract_repo_full_name accepts the repo object, pass the Repository instance directly.

-                        repo_full_name = self._extract_repo_full_name(str(issue.repository.url))
+                        repo_full_name = self._extract_repo_full_name(issue.repository)

Also applies to: 113-129


1-2: Polish the command’s help text.

Minor grammar fix.

-"""A command to sync update relation between module and issue and create task."""
+"""Sync module–issue relations and create tasks."""
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8c3b3b6 and ee9eb3d.

📒 Files selected for processing (10)
  • backend/apps/github/admin/issue.py (1 hunks)
  • backend/apps/github/migrations/0035_issue_level.py (1 hunks)
  • backend/apps/github/models/issue.py (1 hunks)
  • backend/apps/mentorship/Makefile (1 hunks)
  • backend/apps/mentorship/admin/task.py (1 hunks)
  • backend/apps/mentorship/management/commands/sync_issue_levels.py (1 hunks)
  • backend/apps/mentorship/management/commands/sync_module_issues.py (1 hunks)
  • backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py (1 hunks)
  • backend/apps/mentorship/models/task.py (1 hunks)
  • backend/apps/mentorship/utils.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
backend/apps/mentorship/utils.py (1)
backend/apps/mentorship/api/internal/nodes/mentor.py (1)
  • name (18-20)
backend/apps/github/migrations/0035_issue_level.py (1)
backend/apps/mentorship/migrations/0006_remove_task_level_alter_task_assigned_at.py (1)
  • Migration (6-25)
backend/apps/mentorship/management/commands/sync_issue_levels.py (5)
backend/apps/github/models/issue.py (1)
  • Issue (18-220)
backend/apps/github/models/label.py (1)
  • Label (9-77)
backend/apps/mentorship/models/task_level.py (1)
  • TaskLevel (8-61)
backend/apps/mentorship/utils.py (1)
  • normalize_name (4-6)
backend/apps/mentorship/management/commands/sync_module_issues.py (2)
  • Command (16-212)
  • handle (176-212)
backend/apps/mentorship/management/commands/sync_module_issues.py (5)
backend/apps/github/models/issue.py (2)
  • Issue (18-220)
  • save (173-182)
backend/apps/mentorship/models/task.py (1)
  • Task (10-80)
backend/apps/mentorship/utils.py (1)
  • normalize_name (4-6)
backend/apps/mentorship/management/commands/sync_issue_levels.py (1)
  • Command (12-97)
backend/apps/github/models/generic_issue_model.py (1)
  • repository_id (70-77)
🪛 GitHub Check: CodeQL
backend/apps/mentorship/management/commands/sync_module_issues.py

[failure] 26-26: Incomplete URL substring sanitization
The string github.com may be at an arbitrary position in the sanitized URL.

🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (6)
backend/apps/github/admin/issue.py (1)

22-22: Expose level in list — good addition

Surfacing level in the changelist is helpful for triage.

backend/apps/github/migrations/0035_issue_level.py (1)

8-11: Migration wiring looks correct.

Dependencies are sensible and align with the removal of Task.level and the addition of Issue.level.

backend/apps/mentorship/management/commands/sync_issue_levels.py (1)

20-38: Mapping logic is solid and efficient.

Building a per-module normalized label/level map is a good trade-off. Prefetching only Label.name is also a nice touch.

backend/apps/mentorship/management/commands/sync_module_issues.py (3)

136-159: Reassigning task.module is correct.

Good to realign tasks when module linkage changes.


24-31: Use Repository.path and urlparse for safe owner/repo extraction

The Django Repository model already exposes a .path property ("owner/name"), so there’s no need for a fragile regex on the URL. If you must parse a URL string (e.g., for nonstandard URLs), use urllib.parse.urlparse and split the path component:

--- a/backend/apps/mentorship/management/commands/sync_module_issues.py
+++ b/backend/apps/mentorship/management/commands/sync_module_issues.py
@@ def _extract_repo_full_name(self, repo_url):
-        if repo_url and "github.com" in repo_url:
-            match = re.search(r"github\.com/([^/]+/[^/]+)", repo_url)
-            if match:
-                return match.group(1)
-        return None
+        if not repo_url:
+            return None
+        # Prefer the model’s path property when you have an Issue.repository instance:
+        #     return issue.repository.path
+        # For URL strings, parse the path component robustly:
+        from urllib.parse import urlparse
+        parsed = urlparse(repo_url)
+        path = parsed.path.lstrip("/")
+        parts = path.split("/")
+        if len(parts) >= 2:
+            return f"{parts[0]}/{parts[1]}"
+        return None

• Remove any references to a non-existent Repository.full_name—use issue.repository.path (or str(issue.repository)) in your logging
• This change fully addresses the CodeQL finding by avoiding substring-based regex on URLs and leveraging built-in URL parsing and model properties

Likely an incorrect or invalid review comment.


176-213: End-to-end flow is cohesive and logs are helpful.

Single-pass map build + per-module processing + final summary is clear and maintainable. Error handling around GitHub API looks reasonable.

@Rajgupta36 Rajgupta36 marked this pull request as ready for review August 17, 2025 13:08
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)

23-34: Harden repo full_name extraction; prefer model attribute; fix host-case; support api.github.com.

Current parsing misses api.github.com, is case-sensitive, and depends on stringified URLs. Prefer Repository.full_name when available to avoid URL parsing entirely. This also addresses the CodeQL “incomplete URL sanitization” family of findings.

-    ALLOWED_GITHUB_HOSTS = {"github.com", "www.github.com"}
+    ALLOWED_GITHUB_HOSTS = {"github.com", "www.github.com", "api.github.com"}
@@
-    def _extract_repo_full_name(self, repo_url):
-        parsed = urlparse(repo_url or "")
-        if parsed.netloc in self.ALLOWED_GITHUB_HOSTS:
-            parts = parsed.path.strip("/").split("/")
-            if len(parts) >= self.REPO_PATH_PARTS:
-                return "/".join(parts[: self.REPO_PATH_PARTS])
-            return None
-        return None
+    def _extract_repo_full_name(self, repo_or_url):
+        """Return 'owner/repo' using Repository.full_name if present, else parse URL safely."""
+        full_name = getattr(repo_or_url, "full_name", None)
+        if full_name:
+            return full_name
+
+        parsed = urlparse(str(repo_or_url or ""))
+        host = (parsed.netloc or "").lower()
+        parts = [p for p in (parsed.path or "").strip("/").split("/") if p]
+        if host in self.ALLOWED_GITHUB_HOSTS and len(parts) >= 2:
+            # Handle api.github.com/repos/{owner}/{repo}
+            if host == "api.github.com" and parts[0] == "repos" and len(parts) >= 3:
+                return f"{parts[1]}/{parts[2]}"
+            return f"{parts[0]}/{parts[1]}"
+        return None
🧹 Nitpick comments (6)
backend/apps/mentorship/management/commands/sync_module_issues.py (6)

1-2: Polish the command docstring for clarity.

Reword to read more naturally.

-"""A command to sync update relation between module and issue and create task."""
+"""Sync module–issue links and create/update mentorship tasks."""

97-106: Pre-normalize labels once per module.

Minor perf/readability improvement.

-        linked_label_names = module.labels
+        linked_label_names = module.labels
+        normalized_labels = {normalize_name(n) for n in (linked_label_names or [])}
@@
-        for repo in project_repos:
-            for label_name in linked_label_names:
-                normalized_label = normalize_name(label_name)
+        for repo in project_repos:
+            for normalized_label in normalized_labels:
                 key = (repo.id, normalized_label)
                 issues_for_label = repo_label_to_issue_ids.get(key, set())
                 matched_issue_ids.update(issues_for_label)

165-169: Log owner/repo for clarity.

Show repository owner/name, not just name.

-                                    f"{issue.repository.name}#{issue.number} "
+                                    f"{getattr(issue.repository, 'full_name', issue.repository.name)}#{issue.number} "

200-201: Tidy startup log.

Minor cosmetic tweak.

-        self.stdout.write("starting...")
+        self.stdout.write("Starting...")

23-34: Add unit tests for repo name extraction edge cases.

Cover: api.github.com/repos/{owner}/{repo}, mixed-case hosts, trailing slashes, extra path segments, None/empty input, Repository instances with/without full_name.

I can generate pytest tests under backend/apps/mentorship/tests/ for _extract_repo_full_name. Want me to open a follow-up PR?


122-169: GitHub API rate limits: minimize calls and cache aggressively.

The proposed refactor reduces calls, but consider short-circuiting when Task is COMPLETED and issue is still closed (no event fetch needed).

I can add guards to skip event lookups for closed issues and memoize assigned_at per (repo_full_name, issue.number, assignee.login) during a single run.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5291650 and 1e3d166.

📒 Files selected for processing (1)
  • backend/apps/mentorship/management/commands/sync_module_issues.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
📚 Learning: 2025-08-17T11:55:55.990Z
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.

Applied to files:

  • backend/apps/mentorship/management/commands/sync_module_issues.py
🧬 Code graph analysis (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (4)
backend/apps/github/models/issue.py (2)
  • Issue (18-220)
  • save (173-182)
backend/apps/mentorship/models/task.py (2)
  • Task (10-80)
  • Status (19-25)
backend/apps/mentorship/utils.py (1)
  • normalize_name (4-6)
backend/apps/github/models/generic_issue_model.py (1)
  • repository_id (70-77)
🔇 Additional comments (5)
backend/apps/mentorship/management/commands/sync_module_issues.py (5)

108-121: Atomic link set is correct.

Using a transaction and module.issues.set(...) is appropriate.


148-159: Status mapping looks good and respects business rules (single assignee).

No change requested.


183-197: Clear per-module summary.

Good UX; conditional detail by verbosity is appropriate.


151-181: Consider updating assigned_at for existing tasks when it’s missing.

If a Task exists without assigned_at, we should backfill it when available (the refactor above includes this).

Would you like me to add a one-off data migration to backfill assigned_at for existing tasks using the same logic?


165-169: Ignore full_name fallback suggestion; Repository model has no full_name field and using issue.repository.name is correct.

Likely an incorrect or invalid review comment.

@arkid15r arkid15r changed the base branch from main to feature/mentorship-portal September 5, 2025 01:39
Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rajgupta36 this PR has been sitting here for a while with unresolved coderabbit's comments

Copy link

sonarqubecloud bot commented Sep 7, 2025

@Rajgupta36 Rajgupta36 requested a review from arkid15r September 7, 2025 19:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)

128-133: Make assignee selection deterministic.

first() on an unordered M2M can vary by backend; order by PK (or desired field) to keep stable behavior. Note: honoring the “one assignee per issue” rule matches the documented product decision.

-                    assignee = issue.assignees.first()
+                    assignee = issue.assignees.order_by("id").first()

I’m applying your stored learning that one task is created per issue using only the first assignee.

🧹 Nitpick comments (6)
backend/apps/mentorship/management/commands/sync_module_issues.py (6)

1-1: Polish the command docstring for clarity.

Suggest: “Sync the relation between modules and issues and create tasks.”


51-67: Select the most recent matching assignment event explicitly.

PyGithub event ordering isn’t guaranteed; use max(created_at) instead of the last iterated value.

-            last_dt = None
-            for event in gh_issue.get_events():
+            last_dt = None
+            for event in gh_issue.get_events():
                 if (
                     event.event == "assigned"
                     and event.assignee
                     and event.assignee.login == assignee_login
                 ):
-                    last_dt = event.created_at
+                    if last_dt is None or event.created_at > last_dt:
+                        last_dt = event.created_at

75-91: Optional: build the map only for labels used by modules to reduce DB work.

Collect normalized labels from target modules first and filter the Issue query by those labels; this can cut rows scanned on large datasets.


114-117: Reduce transaction scope; avoid holding DB locks during external API calls.

Wrap only the minimal DB mutations (e.g., module.issues.set and per-task saves) in atomic blocks; perform GitHub lookups outside to lower contention and latency.

Would you like a follow-up patch that narrows the atomic blocks while preserving consistency?


118-125: Drop unused prefetch('labels') on issues queryset.

Labels aren’t used in the loop; removing this reduces memory and query work.

-                    .prefetch_related("assignees", "labels")
+                    .prefetch_related("assignees")

170-178: Minor: include owner/repo in creation log for clarity.

Using the repo’s full name (owner/repo) improves traceability across organizations.

-                                f"{issue.repository.name}#{issue.number} "
+                                f"{getattr(issue.repository, 'path', issue.repository.name)}#{issue.number} "
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e3d166 and 225106f.

📒 Files selected for processing (2)
  • backend/Makefile (1 hunks)
  • backend/apps/mentorship/management/commands/sync_module_issues.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • backend/Makefile
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.
📚 Learning: 2025-08-17T11:55:55.990Z
Learnt from: Rajgupta36
PR: OWASP/Nest#2000
File: backend/apps/mentorship/management/commands/sync_module_issues.py:109-145
Timestamp: 2025-08-17T11:55:55.990Z
Learning: In the OWASP/Nest mentorship system, tasks are designed to be assigned to only one assignee per issue, even if GitHub issues can have multiple assignees. The sync_module_issues command correctly uses issue.assignees.first() to create one task per issue for the first assignee only.

Applied to files:

  • backend/apps/mentorship/management/commands/sync_module_issues.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (1)
backend/apps/mentorship/management/commands/sync_module_issues.py (1)

201-238: No changes needed—Task.Meta.unique_together enforces (issue, assignee) uniqueness.

@Rajgupta36 Rajgupta36 marked this pull request as draft September 10, 2025 05:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants