Pull requests: QiJune/TensorRT-LLM

Pull requests list

add kv cache offloading example
#14 opened Sep 9, 2025 by QiJune
1 task
move weights loading related logic to ModelLoader
#13 opened Sep 6, 2025 by QiJune
1 task
Clean cuda graph
#12 opened Sep 5, 2025 by QiJune
1 task
refactor cuda graph runner
#11 opened Aug 11, 2025 by QiJune
add _prepare_and_schedule_batch function in PyExecutor
#10 opened Jul 25, 2025 by QiJune
Try to fix allgather
#9 opened Jul 7, 2025 by QiJune
fix allgather
#8 opened Jul 4, 2025 by QiJune
Fix pad 2
#7 opened Jul 3, 2025 by QiJune
Fix pad 1
#6 opened Jun 30, 2025 by QiJune
Clean llm
#5 opened Jun 25, 2025 by QiJune
create worker from llm args
#4 opened Jun 24, 2025 by QiJune
Draft
#3 opened Jun 13, 2025 by QiJune
Pure Python LlmResponse
#2 opened Jun 12, 2025 by QiJune
Nano bind
#1 opened Jun 12, 2025 by QiJune