Reimplement video utils #1929

@SkalskiP

Description

The functionalities currently available in supervision.utils.video should be reimplemented and consolidated within a new Video class. Importantly, all features supported by the old video API must remain available in the new implementation.

  • get video info (works for files, RTSP, webcams)

    import supervision as sv
     
    # static video
    sv.Video("source.mp4").info
    
    # video stream
    sv.Video("rtsp://...").info
    
    # webcam
    sv.Video(0).info
  • simple frame iteration (object is iterable)

    import supervision as sv
    
    video = sv.Video("source.mp4")
    for frame in video:
        ...
  • advanced frame iteration (stride, sub-clip, on-the-fly resize)

    import supervision as sv
    
    for frame in sv.Video("source.mp4").frames(stride=5, start=100, end=500, resolution_wh=(1280, 720)):
        ...
  • process the video

    import cv2
    import supervision as sv
    
    def blur(frame, i):
        return cv2.GaussianBlur(frame, (11, 11), 0)
    
    sv.Video("source.mp4").save(
        "blurred.mp4",
        callback=blur,
        show_progress=True
    )
  • override target video parameters

    import supervision as sv
    
    sv.Video("source.mp4").save(
        "timelapse.mp4",
        fps=60,
        callback=lambda f, i: f,
        show_progress=True
    )
  • complete manual control with explicit VideoInfo

    import cv2
    from supervision import Video, VideoInfo
    
    source = Video("source.mp4")
    target_info = VideoInfo(width=800, height=800, fps=24)
    
    # resize every frame to the target resolution and write it through an explicit sink
    with source.sink("square.mp4", info=target_info) as sink:
        for frame in source.frames():
            frame = cv2.resize(frame, target_info.resolution_wh)
            sink.write(frame)
  • multi-backend support for decoding and encoding; implement PyAV and OpenCV backends

    import supervision as sv
    
    video = sv.Video("source.mkv", backend="pyav")
    
    video = sv.Video("source.mkv", backend="opencv")

    suggested minimal protocol

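    from typing import Any, Protocol
    
    import numpy as np
    from supervision import VideoInfo
    
    # Writer is defined first so Backend.writer can reference it in its annotation
    class Writer(Protocol):
        def write(self, frame: np.ndarray) -> None: ...
        def close(self) -> None: ...
    
    class Backend(Protocol):
        def open(self, path: str) -> Any: ...
        def info(self, handle: Any) -> VideoInfo: ...
    
        # read decodes and returns a frame; grab advances without decoding
        def read(self, handle: Any) -> tuple[bool, np.ndarray]: ...
        def grab(self, handle: Any) -> bool: ...
        def seek(self, handle: Any, frame_idx: int) -> None: ...
    
        def writer(self, path: str, info: VideoInfo, codec: str) -> Writer: ...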
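
To illustrate how the `stride`, `start`, and `end` options from the advanced iteration item could be built on the protocol's `seek`, `read`, and `grab` methods, here is a minimal sketch. Everything in it is an assumption made for illustration: `InMemoryBackend`, `FakeHandle`, and `iter_frames` are hypothetical names, frames are plain integers instead of `np.ndarray` so the sketch runs without NumPy or OpenCV, `open` takes a list of frames rather than a path, and `end` is treated as exclusive, which the issue does not specify.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class FakeHandle:
    """Hypothetical handle: a list of stand-in frames plus a read cursor."""
    frames: List[int]
    position: int = 0


class InMemoryBackend:
    """Toy backend covering only the open/read/grab/seek part of the protocol."""

    def open(self, frames: List[int]) -> FakeHandle:
        # Assumption: takes frames directly instead of a file path.
        return FakeHandle(frames=list(frames))

    def read(self, handle: FakeHandle) -> Tuple[bool, Optional[int]]:
        # Return the frame under the cursor and advance by one.
        if handle.position >= len(handle.frames):
            return False, None
        frame = handle.frames[handle.position]
        handle.position += 1
        return True, frame

    def grab(self, handle: FakeHandle) -> bool:
        # Advance by one frame without returning (decoding) it.
        if handle.position >= len(handle.frames):
            return False
        handle.position += 1
        return True

    def seek(self, handle: FakeHandle, frame_idx: int) -> None:
        handle.position = frame_idx


def iter_frames(
    backend: InMemoryBackend,
    handle: FakeHandle,
    stride: int = 1,
    start: int = 0,
    end: Optional[int] = None,
) -> Iterator[int]:
    """Yield every `stride`-th frame in [start, end); `end` exclusive by assumption."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    backend.seek(handle, start)
    index = start
    while end is None or index < end:
        ok, frame = backend.read(handle)
        if not ok:
            break
        yield frame
        # Skip the next stride - 1 frames cheaply with grab().
        for _ in range(stride - 1):
            if not backend.grab(handle):
                return
        index += stride


backend = InMemoryBackend()
handle = backend.open(list(range(20)))
selected = list(iter_frames(backend, handle, stride=5, start=2, end=14))
```

Separating `grab` from `read` is what makes a large stride cheap in a real backend: skipped frames are advanced past without paying for a full decode. Whether `end` should be inclusive or exclusive is a design decision the implementation would need to document.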

Additional

  • Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻
  • Mark all methods from the old video API as deprecated. Find examples of already deprecated methods or classes in the current codebase and use the same approach to mark the old video API methods. A deprecation period of at least 5 releases is required.
  • Reimplement the internals of the old video API using the new video API.
  • Provide full unit-test coverage matching existing test style.
  • Update the library docs (docstrings, mkdocs) to reflect the new API.
  • Take into account the comments in "Issue with sv.VideoInfo FPS Handling for Precise Video Metadata Retrieval" #1687.
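
The deprecation and "reimplement the internals" items above can be combined in one shim: the old entry point keeps working, emits a warning, and delegates to the new class. The sketch below uses only the standard `warnings` module. The `deprecated` decorator, the `Video` stand-in, and `legacy_get_info` are hypothetical names for illustration only; the real change should reuse whatever deprecation helper the codebase already has, as the issue requests.

```python
import functools
import warnings
from typing import Any, Callable, Dict


def deprecated(message: str) -> Callable:
    """Hypothetical decorator: warn on every call, then run the wrapped function."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, category=DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Video:
    """Stand-in for the proposed class; it only records the source."""

    def __init__(self, source: Any) -> None:
        self.source = source

    @property
    def info(self) -> Dict[str, Any]:
        # Placeholder metadata; the real property would return a VideoInfo.
        return {"source": self.source}


@deprecated("legacy_get_info is deprecated; use Video(source).info instead.")
def legacy_get_info(source: Any) -> Dict[str, Any]:
    """Old-style function whose internals now delegate to the new API."""
    return Video(source).info


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_get_info("source.mp4")
```

Because the old function delegates instead of duplicating logic, both APIs stay behaviorally identical during the deprecation period, and the existing tests for the old API double as regression tests for the new one.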
