jitanshuraut

Adds comprehensive C4 architecture documentation for MarkItDown CLI

What's Changed

  • Added C4 architecture diagrams (Levels 1–4) for MarkItDown CLI
  • Included editable source files (.drawio, .puml) and exported PNGs
  • Organized files into src/ and exports/ directories
  • Added detailed README.md explaining each diagram level and system components

Why

To improve documentation and onboarding for contributors by providing clear, visual architecture overviews. The C4 model helps new developers understand:

  • How the system works end-to-end
  • How components interact
  • Where to extend or debug

How

  • Designed diagrams using draw.io and PlantUML
  • Followed C4 model standards (Context → Containers → Components → Code)
  • Structured folder with clear separation between source and generated assets

- Level 1 (System Context): Overview of MarkItDown in the ecosystem, showing interaction with users and external tools.
- Level 2 (Containers): Breakdown into the main containers: CLI, Conversion Orchestrator, Converters, and Stream Info.
- Level 3 (Components): Deep dive into Conversion Orchestrator (_markitdown.py), including phases, dependencies, and control flow.
- Level 4 (Code-level): Detailed design of DocumentConverter hierarchy, exception handling, and converter registry.
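
For readers who want a feel for what the Level 4 diagram covers, here is a minimal, self-contained sketch of a converter hierarchy, a priority-ordered registry, and the exception handling around them. It is not MarkItDown's actual code: only the names `DocumentConverter` and `StreamInfo` come from the description above, and every method, field, exception class, and the `ConverterRegistry` class itself are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class ConversionError(Exception):
    """Hypothetical base class for conversion failures."""


class UnsupportedFormatError(ConversionError):
    """Hypothetical error: no registered converter accepts the input."""


@dataclass
class StreamInfo:
    """Hypothetical stand-in for the metadata describing an input stream."""
    extension: Optional[str] = None
    mimetype: Optional[str] = None


class DocumentConverter:
    """Base class: subclasses declare what they accept and how they convert."""

    def accepts(self, data: bytes, info: StreamInfo) -> bool:
        raise NotImplementedError

    def convert(self, data: bytes, info: StreamInfo) -> str:
        raise NotImplementedError


class PlainTextConverter(DocumentConverter):
    def accepts(self, data: bytes, info: StreamInfo) -> bool:
        return info.extension == ".txt"

    def convert(self, data: bytes, info: StreamInfo) -> str:
        return data.decode("utf-8")


class CsvConverter(DocumentConverter):
    def accepts(self, data: bytes, info: StreamInfo) -> bool:
        return info.extension == ".csv"

    def convert(self, data: bytes, info: StreamInfo) -> str:
        # Naive comma split, good enough for a sketch (no quoting support).
        rows = [line.split(",") for line in data.decode("utf-8").splitlines() if line]
        if not rows:
            raise ConversionError("empty CSV input")
        header, body = rows[0], rows[1:]
        lines = ["| " + " | ".join(header) + " |",
                 "| " + " | ".join("---" for _ in header) + " |"]
        lines += ["| " + " | ".join(row) + " |" for row in body]
        return "\n".join(lines)


class ConverterRegistry:
    """Hypothetical registry: tries converters in ascending priority order."""

    def __init__(self) -> None:
        self._entries: List[Tuple[float, DocumentConverter]] = []

    def register(self, converter: DocumentConverter, priority: float = 0.0) -> None:
        self._entries.append((priority, converter))
        # Stable sort keeps registration order among equal priorities.
        self._entries.sort(key=lambda entry: entry[0])

    def convert(self, data: bytes, info: StreamInfo) -> str:
        failures: List[ConversionError] = []
        for _, converter in self._entries:
            if not converter.accepts(data, info):
                continue
            try:
                return converter.convert(data, info)
            except ConversionError as exc:
                # Remember the failure and fall through to the next candidate.
                failures.append(exc)
        if failures:
            raise ConversionError(f"all matching converters failed: {failures}")
        raise UnsupportedFormatError(f"no converter accepts {info.extension!r}")
```

In this sketch, `registry.convert(b"a,b\n1,2", StreamInfo(extension=".csv"))` returns a three-line Markdown table once `CsvConverter()` has been registered, and an unregistered extension raises `UnsupportedFormatError`. Keeping "no converter matched" separate from "a converter matched but failed" is one plausible way to structure the exception handling the diagram describes.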

Diagrams included as editable sources (.drawio, .puml) and exported PNGs:
- C4-Diagrams/src/   – editable files
- C4-Diagrams/exports/ – rendered images
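
Because sources and rendered images live in separate folders, a small check can flag diagrams that were edited but never exported. The sketch below assumes each PNG in `exports/` shares its base name with a `.drawio` or `.puml` file in `src/`. The PR does not state that naming convention, and the `missing_exports` helper is hypothetical.

```python
from pathlib import Path
from typing import List

# Editable source formats named in the PR description.
SOURCE_SUFFIXES = {".drawio", ".puml"}


def missing_exports(root: Path) -> List[str]:
    """Return source file names that have no same-named PNG in exports/.

    Assumes root contains src/ and exports/ and that an export keeps
    the base name of its source (an assumption, not a documented rule).
    """
    exported = {png.stem for png in (root / "exports").glob("*.png")}
    return sorted(
        source.name
        for source in (root / "src").iterdir()
        if source.suffix in SOURCE_SUFFIXES and source.stem not in exported
    )
```

Calling `missing_exports(Path("C4-Diagrams"))` from the repository root would list any sources that still need rendering; an empty list means every source has a matching image.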

Improves onboarding, documentation, and long-term maintainability.
@jitanshuraut
Author

@microsoft-github-policy-service agree
