Study Utils

Study Utils is a collection of scripts that support the full study loop for online courses and self-directed learning.

It captures some of the key ideas of a Personal Knowledge & Learning System (PKLS) that I've written about on my essays site: https://essays.mattlebrun.com/2025/09/personal-knowledge-learning-system.html

Features

CLI first tooling for:

Converting various document formats to Markdown
Reshaping documents into study-friendly formats
Create Retrieval-Augmented Generation (RAG) databases for study and topic exploration
Create and take quizzes to reinforce learning

Requirements

Python 3.12+
OPENAI_API_KEY environment variable for AI-powered features (supports .env)

Optional but recommended:

System libraries for WeasyPrint (for Markdown → PDF conversion)
ffmpeg for video transcription (pydub dependency)
pandoc for epub → markdown conversion (if you want epub support)
uv for dependency and virtual environment management
A markdown document viewer/editor of your choice

This currently tested on Ubuntu 22.04/24.04 (WSL) and Termux on Android.

Technically it should work on macOS, but I haven't tested it there.

Quick start

Assuming you have the system requirements in place, you can get started in a few steps.

Create a study workspace:

mkdir -p ~/Study && cd ~/Study

Create configuration files:

printf 'OPENAI_API_KEY=your_api_key_here\n' > .env

Install with pip:

uv pip install git+https://github.com/cr8ivecodesmith/study-utils.git

Initialize the workspace:

uv run study init

Convert documents to Markdown:

Create a materials/ directory and drop your source documents there:

mkdir -p materials

uv run study convert-markdown materials

The converted Markdown files will land in ~/.study-utils-data/converted by default.

Reshape documents to study formats:

Scaffold the document prompts configuration (skip if study init already created it):

uv run study generate-document config init

Create a reading assignment and a keywords list from a converted Markdown file:

uv run study generate-document reading_assignment assignments/lesson-01.md ~/.study-utils-data/converted

uv run study generate-document keywords assignments/lesson-01-keywords.md ~/.study-utils-data/converted

Create a RAG database:

Initialize and create a RAG database from the converted materials:

uv run study rag config init

uv run study rag ingest --name lesson-01 ~/.study-utils-data/converted

Explore the study materials:

uv run study rag chat --db lesson-01

Take quizzes:

Initialize and edit quizzer.toml and set:

uv run study quizzer init lesson-01

[quiz.lesson-01]
sources = ["/path/to/home/.study-utils-data/converted"]

Generate topics and questions:

uv run study quizzer topics generate lesson-01 --use-ai

uv run study quizzer questions generate lesson-01 --per-topic 5 --ensure-coverage

Start a quiz session:

uv run study quizzer start lesson-01 --num 5

Development setup

uv is the preferred workflow for managing dependencies and virtual environments. Run uv sync --dev to install everything needed for local development.
When working inside Termux, pyenv remains the more reliable option for managing Python versions and virtual environments. Install Python 3.12 with pyenv and create a virtual environment before running the tooling.

Testing

Run uv run pytest (or just test) to execute the full suite. Coverage is enforced at 100% via pytest-cov; any regression will fail the run.
The tests run entirely offline by default. Shared fixtures under tests/fixtures/ provide stubs for OpenAI, WeasyPrint, dotenv, and pydub to keep runs deterministic.
For local debugging you can generate a detailed report with uv run pytest --cov-report=term-missing. To temporarily relax the coverage gate while debugging, drop --cov-fail-under=100 from pyproject.toml and restore it before committing.

Workspace and configuration

Run study init to bootstrap the shared workspace (defaults to ~/.study-utils-data). The command creates converted/, logs/, and config/ directories and accepts --path for alternate locations.
Configuration files live under <workspace>/config. Use study convert-markdown config init to scaffold convert_markdown.toml with documented defaults. Pass --workspace to target a specific workspace or --path/--force for fully custom destinations.
All CLI entry points respect the STUDY_UTILS_DATA_HOME environment variable when resolving the workspace; if unset they fall back to the default directory created by study init.

System requirements

OS Requirements (Ubuntu)

To enable Markdown → PDF generation with WeasyPrint, install its system libraries and fonts.

Base setup:

sudo apt-get update
sudo apt-get install -y libcairo2 libpango-1.0-0 libgdk-pixbuf2.0-0 libffi-dev libxml2 libxslt1.1 shared-mime-info

Recommended fonts:

sudo apt-get install -y fonts-dejavu fonts-liberation

To enable video transcription, ensure ffmpeg is installed (pydub uses it under the hood):
```
sudo apt-get install -y ffmpeg
```

Gather materials for a module

Download all materials for a module.
At the very least the transcript file for the video.
If there's no transcript file available, use study transcribe-video.

CLI commands

All tooling is routed through the study console script. Run study list to see available commands or study help <command> for details and supported flags.

study transcribe-video TARGET [options]

Transcribe one .mp4 file or a directory of .mp4 files using Whisper-1.
Supports optional recursion, list/preview mode, smart names with optional AI refinement, and composable filename prefixes.

Examples:

Preview names and save editable cache:

uv run study transcribe-video ./videos --list --smart-names --use-ai

Transcribe a directory recursively with smart names and counter prefix:

study transcribe-video ./videos -r --smart-names -p 'text:Lecture ' -p 'counter:NN'

Transcribe a single file to a custom folder:

study transcribe-video ./videos/intro.mp4 -o ./transcripts

study markdown-to-pdf OUTPUT.pdf INPUTS... [options]

Convert Markdown to a single PDF using WeasyPrint, with configurable paper size, margins, optional title page, and optional table of contents.

Examples:

study markdown-to-pdf out.pdf notes.md --toc --paper-size a4

study markdown-to-pdf out.pdf docs/ --level-limit 2 --margin "1in 0.75in"

study markdown-to-pdf out.pdf README.md --title-page --title "My Guide" --author "Me"

study init [options]

Provision the shared workspace directory and any missing subdirectories.

Examples:

study init

study init --path /tmp/study-utils

study convert-markdown PATHS... [options]

Convert PDFs, DOCX, HTML, TXT, and EPUB files into Markdown outputs with YAML front matter while preserving basenames.
Default outputs land in <workspace>/converted; use --output-dir or the TOML config to redirect elsewhere.
Respects layered config precedence (CLI flags > environment variables > convert_markdown.toml).
Use study convert-markdown config init to scaffold the default template in the workspace config directory.

Examples:

study convert-markdown ./docs --extensions pdf docx

study convert-markdown config init --workspace ~/.study-utils-data

study text-combiner OUTPUT INPUTS... [options]

Combine text files with optional section titles and ordering. See --help.

study generate-document DOC_TYPE OUTPUT INPUTS... [options]

Generate a Markdown document from reference files using prompts defined in a TOML config.
Resolves documents.toml from the workspace config directory by default and falls back to the current directory. If no config is found, the CLI exits with guidance to run study generate-document config init.
Use study generate-document config init to scaffold the default template in the workspace or a custom destination (--path, --workspace, --force).

Example:

study generate-document reading_assignment notes.md ./materials --extensions txt md --level-limit 0

study quizzer [options]

Launch the interactive Rich-based quiz session for drilling on generated questions.

study rag <subcommand> [options]

Manage retrieval-augmented study databases and chat sessions. Includes config, ingest, list, inspect, export, import, chat, and doctor helpers.

Examples:

study rag config init

study rag ingest --name physics-notes ./notes

study rag doctor

Name		Name	Last commit message	Last commit date
Latest commit History 102 Commits
agents		agents
src/study_utils		src/study_utils
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
README.md		README.md
justfile		justfile
pyproject.toml		pyproject.toml
sample.env		sample.env
sample.envrc		sample.envrc
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Repository files navigation

Study Utils

Features

Requirements

Quick start

Development setup

Testing

Workspace and configuration

System requirements

OS Requirements (Ubuntu)

Gather materials for a module

CLI commands

About

Uh oh!

Releases

Packages

Languages

Uh oh!

Uh oh!

cr8ivecodesmith/study-utils

Folders and files

Latest commit

History

Repository files navigation

Study Utils

Features

Requirements

Quick start

Development setup

Testing

Workspace and configuration

System requirements

OS Requirements (Ubuntu)

Gather materials for a module

CLI commands

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages