feat(evaluation): MCP Tool Evaluation for Financial Document Analysis #51
+681
−21
Overview
This pull request introduces a dedicated evaluation framework for Model Context Protocol (MCP) tools, focusing on real-world financial documents. Using Patronus AI's FinanceBench dataset, it provides a streamlined way to parse and analyze filings such as 10-Ks and earnings reports, which makes it well suited to testing MCP's text extraction and summarization capabilities in practical finance scenarios.
Motivation
Although MCP excels in text-based orchestration, financial documents can be complex:
With this evaluation framework, we bridge the gap between general MCP functionalities and the specialized needs of financial data analysis, making it simpler to test and iterate on improvement strategies.
Files Changed
New Files
minions/examples/finance/evaluate_mcp_tools.py
minions/examples/finance/finance_queries.py
minions/examples/finance/pdfs/ (Directory)
mcp.json.example: copy it to mcp.json and customize paths or permissions without committing sensitive data.
Modified File
setup.py
Removed/Unchanged Files
mcp.json is now listed in .gitignore and excluded from version control.
minions/prompts/minion_wtools.py has been restored to its original state with no changes.
How to Use
Install Dependencies
Configure MCP (Optional)
Copy mcp.json.example to mcp.json (ignored by Git) and tailor it for local development.
Run the Evaluation
Place the PDFs to evaluate in minions/examples/finance/pdfs/.
The evaluation uses the queries in finance_queries.py to test MCP's handling of financial data.
Explore & Modify Queries
Edit finance_queries.py to test scenarios like revenue variance, forward guidance analysis, or KPI detection.
Technical Highlights
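As a rough picture of how the pieces described under How to Use fit together (PDFs in one directory, queries in finance_queries.py), the sketch below pairs every PDF found in a directory with every query. It is not the code in evaluate_mcp_tools.py: the function names, the SAMPLE_QUERIES list, and the second query string are assumptions made for illustration.

```python
from itertools import product
from pathlib import Path

# Hypothetical stand-in for the contents of finance_queries.py.
SAMPLE_QUERIES = [
    "Summarize quarterly earnings",
    "Explain the year-over-year revenue variance",
]


def find_pdfs(pdf_dir):
    """Return the PDF files directly inside pdf_dir, sorted by path.

    A missing directory yields an empty list instead of raising.
    """
    directory = Path(pdf_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.suffix.lower() == ".pdf")


def build_eval_cases(pdf_dir, queries):
    """Pair every PDF with every query, giving one evaluation case per pair."""
    return [
        {"pdf": pdf, "query": query}
        for pdf, query in product(find_pdfs(pdf_dir), queries)
    ]
```

Calling build_eval_cases("minions/examples/finance/pdfs", SAMPLE_QUERIES) would then give one case per document and query, which keeps the choice of documents separate from the choice of questions.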
Text Extraction via PyPDF2
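The extraction step can be sketched with PyPDF2's PdfReader. This is an illustration rather than the code in evaluate_mcp_tools.py: the helper names extract_pdf_text and join_page_texts are hypothetical, and the sketch assumes a PyPDF2 release that provides PdfReader and page.extract_text().

```python
def join_page_texts(page_texts):
    """Join per-page strings, skipping pages that produced no text."""
    return "\n\n".join(t.strip() for t in page_texts if t and t.strip())


def extract_pdf_text(pdf_path):
    """Extract the text of every page in a PDF and return it as one string."""
    # Imported here so the pure helper above works without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    return join_page_texts(page.extract_text() for page in reader.pages)
```

Text-only extraction of this kind flattens tables into plain lines, which is why table parsing is listed separately under Future Work.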
Evaluation Metrics
Scenario-Based Testing
The queries in finance_queries.py reflect real analyst questions (e.g., "Summarize quarterly earnings").
Separation of Concerns
mcp.json.example ensures local config changes remain private.
Future Work
Table Extraction
Consider pdfplumber or camelot-py in future iterations for more robust table parsing.
Enhanced Error Handling
Extended Prompts & Context
Scalability & Parallel Processing
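One possible direction, sketched under assumptions: if each document can be evaluated independently, a thread pool from the standard library can process several at once. Here evaluate_one is a hypothetical callable that stands in for whatever per-document evaluation the script performs, and max_workers=4 is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_in_parallel(pdf_paths, evaluate_one, max_workers=4):
    """Apply evaluate_one to each path concurrently.

    Results are returned in the same order as pdf_paths. An exception
    raised by evaluate_one propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_one, pdf_paths))
```

Threads suit this case when the per-document work is mostly waiting on I/O such as model or tool calls; CPU-bound parsing would more likely call for a process pool.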
Summary
This PR establishes a targeted environment for evaluating MCP tools on financial documents, focusing on text extraction, test queries, and a minimal yet functional setup that can be expanded. Users can quickly install dependencies, run the evaluation script, and adapt the local MCP configuration. While the current scope is centered on text parsing with PyPDF2, future enhancements (like advanced table extraction) will broaden the system's utility.
Discussion
By narrowing the scope to text-only parsing and an example config file, this PR keeps things simple and paves the way for incremental improvements.
Opinion
This framework is a small first step in adapting MCP to handle real-world financial documents. Starting with a minimal set of dependencies and a clearly defined testing workflow helps maintain clarity and stability.
Thank you for reviewing this Pull Request!