Conversation

AIGCZero

Mask channel input error: RuntimeError: shape '[-1, 3, 2, 14, 14]' is invalid for input of size 1172864
The vision model's patch reshape assumes 3 channels, so a 4-channel (RGBA) input does not divide evenly and the reshape fails. Fix: modify the node so it takes only the first 3 channels.
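The arithmetic in the error message is consistent with a leftover alpha channel. A minimal sketch (the 748 patch positions are inferred from the reported size, not stated anywhere in the PR):

```python
import torch

# 1172864 = 4 channels * 748 patch positions * 2 * 14 * 14, i.e. the
# patchified input still carries an alpha channel.
rgba_patches = torch.rand(748 * 4 * 2 * 14 * 14)
print(rgba_patches.numel())  # 1172864
# rgba_patches.view(-1, 3, 2, 14, 14)  # -> RuntimeError: shape '[-1, 3, 2, 14, 14]' is invalid ...

# With alpha dropped first, the element count divides by 3 * 2 * 14 * 14:
rgb_patches = torch.rand(748 * 3 * 2 * 14 * 14)
print(rgb_patches.view(-1, 3, 2, 14, 14).shape)  # torch.Size([748, 3, 2, 14, 14])
```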

- Implements Qwen image editing functionality with CLIP text encoding
- Features automatic scaling algorithm selection (a sketch of this logic follows the list below):
  - Uses 'area' method for downscaling to preserve details
  - Uses 'lanczos' method for upscaling for better quality
- Supports optional VAE encoding for reference latents
- Maintains aspect ratio with 'disabled' crop method
- Scales images to the target resolution (1024x1024 pixels) with 8-pixel alignment for better compatibility with diffusion models
- Adds Chinese comments for better code documentation
- Fixes TextEncodeQwenImageEditPlus to ensure only RGB channels are used
- Prevents RuntimeError when input images have alpha channels
- Ensures proper tensor shape for vision language models
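A minimal sketch of the scaling selection described above, written as a standalone helper (the function name and exact rounding are illustrative, not taken from the PR):

```python
import math

def pick_scale_params(width, height, target_pixels=1024 * 1024, align=8):
    """Illustrative helper (not the node's actual code): keep the aspect ratio,
    target roughly 1024x1024 total pixels, align both sides to multiples of 8,
    and choose the resampling method from the scale direction."""
    scale = math.sqrt(target_pixels / (width * height))
    # Round each side to the nearest multiple of `align` (8-pixel alignment).
    new_w = max(align, round(width * scale / align) * align)
    new_h = max(align, round(height * scale / align) * align)
    # 'area' preserves detail when shrinking; 'lanczos' is sharper when enlarging.
    method = "area" if scale < 1.0 else "lanczos"
    return new_w, new_h, method

# A 2048x1536 input is being shrunk, so 'area' is selected.
print(pick_scale_params(2048, 1536))  # (1184, 888, 'area')
```

The resulting width, height, and method string would then go to the actual resize step, in ComfyUI typically something like `comfy.utils.common_upscale(samples, width, height, method, "disabled")`; the aspect ratio is preserved because both sides are scaled by the same factor.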
Resolved conflict in comfy_extras/nodes_qwen.py by keeping the alpha channel fix:
- Retained the fix: images_vl.append(s.movedim(1, -1)[:, :, :, :3])
- This ensures only RGB channels are used for vision processing
- Fixes RuntimeError when processing images with alpha channels
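For reference, a standalone illustration of what that line does (the shapes below are hypothetical):

```python
import torch

# Hypothetical RGBA image batch in channels-first layout [batch, channels, height, width];
# the 4th (alpha) channel is what breaks the vision encoder's patch reshape.
s = torch.rand(1, 4, 1024, 1024)

# movedim(1, -1) converts to channels-last [batch, height, width, channels];
# [:, :, :, :3] keeps only the RGB channels and drops alpha.
rgb = s.movedim(1, -1)[:, :, :, :3]
print(rgb.shape)  # torch.Size([1, 1024, 1024, 3])
```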