VibecoderMcSwaggins committed on
Commit f632ba8 · unverified · 1 Parent(s): 6dccda9

chore: improve DevEx and clean up obsolete docs (#25)

- Add `make all` as default target (alias for check)
- Add `make cov` as shorter alias for test-cov
- Add `make cov-html` for HTML coverage reports
- Update `make clean` to remove htmlcov/
- Remove obsolete docs/pending/ planning docs (all phases complete)
- Update docs/index.md with current status (Phases 1-14 complete)
- Fix broken doc links and update team section
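The new `all` and `cov` targets rely on plain make conventions: the first target in a Makefile is the default goal, and a phony target whose only body is a prerequisite acts as an alias. A minimal throwaway sketch of that pattern (hypothetical `/tmp/demo.mk`, not the project's real Makefile; it uses `.RECIPEPREFIX`, a GNU make 3.82+ feature, only so the example avoids literal tab characters):

```shell
# Hypothetical mini-Makefile showing the default-target and alias pattern.
cat > /tmp/demo.mk <<'EOF'
.RECIPEPREFIX := >
.PHONY: all check cov test-cov
all: check
check:
>@echo "All checks passed!"
cov: test-cov
test-cov:
>@echo "running coverage"
EOF
make -f /tmp/demo.mk        # default goal "all" chains to "check"
make -f /tmp/demo.mk cov    # "cov" is a pure alias for "test-cov"
```

Running it without arguments prints `All checks passed!`, and `make -f /tmp/demo.mk cov` prints `running coverage`, which is exactly how `make` and `make cov` behave against the real Makefile below.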

Makefile CHANGED

```diff
@@ -1,4 +1,7 @@
-.PHONY: install test lint format typecheck check clean
+.PHONY: install test lint format typecheck check clean all cov cov-html
+
+# Default target
+all: check
 
 install:
 	uv sync --all-extras
@@ -7,9 +10,15 @@ install:
 test:
 	uv run pytest tests/unit/ -v
 
+# Coverage aliases
+cov: test-cov
 test-cov:
 	uv run pytest --cov=src --cov-report=term-missing
 
+cov-html:
+	uv run pytest --cov=src --cov-report=html
+	@echo "Coverage report: open htmlcov/index.html"
+
 lint:
 	uv run ruff check src tests
 
@@ -23,5 +32,5 @@ check: lint typecheck test
 	@echo "All checks passed!"
 
 clean:
-	rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage
+	rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage htmlcov
 	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
```
docs/index.md CHANGED

```diff
@@ -9,10 +9,10 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 ## Quick Links
 
 ### Architecture
-- **[Overview](architecture/overview.md)** - Project overview, use case, architecture, timeline
-- **[Design Patterns](architecture/design-patterns.md)** - 17 technical patterns, reference repos, judge prompts, data models
+- **[Overview](architecture/overview.md)** - Project overview, use case, architecture
+- **[Design Patterns](architecture/design-patterns.md)** - Technical patterns, data models
 
-### Implementation (Start Here!)
+### Implementation
 - **[Roadmap](implementation/roadmap.md)** - Phased execution plan with TDD
 - **[Phase 1: Foundation](implementation/01_phase_foundation.md)** ✅ - Tooling, config, first tests
 - **[Phase 2: Search](implementation/02_phase_search.md)** ✅ - PubMed search
@@ -22,18 +22,18 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 - **[Phase 6: Embeddings](implementation/06_phase_embeddings.md)** ✅ - Semantic search + dedup
 - **[Phase 7: Hypothesis](implementation/07_phase_hypothesis.md)** ✅ - Mechanistic reasoning
 - **[Phase 8: Report](implementation/08_phase_report.md)** ✅ - Structured scientific reports
-- **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** 📝 - Remove DuckDuckGo
-- **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** 📝 - Clinical trials API
-- **[Phase 11: bioRxiv](implementation/11_phase_biorxiv.md)** 📝 - Preprint search
+- **[Phase 9: Source Cleanup](implementation/09_phase_source_cleanup.md)** - Remove DuckDuckGo
+- **[Phase 10: ClinicalTrials](implementation/10_phase_clinicaltrials.md)** - Clinical trials API
+- **[Phase 11: bioRxiv](implementation/11_phase_biorxiv.md)** - Preprint search
+- **[Phase 12: MCP Server](implementation/12_phase_mcp_server.md)** ✅ - Claude Desktop integration
+- **[Phase 13: Modal Integration](implementation/13_phase_modal_integration.md)** ✅ - Secure code execution
+- **[Phase 14: Demo Submission](implementation/14_phase_demo_submission.md)** ✅ - Hackathon submission
 
 ### Guides
-- [Setup Guide](guides/setup.md) (coming soon)
 - **[Deployment Guide](guides/deployment.md)** - Gradio, MCP, and Modal launch steps
 
 ### Development
 - **[Testing Strategy](development/testing.md)** - Unit, Integration, and E2E testing patterns
-- [Contributing](development/contributing.md) (coming soon)
-
 
 ---
 
@@ -54,7 +54,7 @@ AI-powered deep research system for accelerating drug repurposing discovery.
 User Question → Research Agent (Orchestrator)
 
 Search Loop:
-  → Tools (PubMed, Web Search)
+  → Tools (PubMed, ClinicalTrials, bioRxiv)
   → Judge (Quality + Budget)
   → Repeat or Synthesize
 
@@ -63,21 +63,22 @@ User Question → Research Agent (Orchestrator)
 
 ---
 
-## Hackathon Tracks
+## Features
 
-| Track | Status | Key Feature |
-|-------|--------|-------------|
-| **Gradio** | ✅ Planned | Streaming UI with progress |
-| **MCP** | ✅ Planned | PubMed as MCP server |
-| **Modal** | 🔄 Stretch | GPU inference option |
+| Feature | Status | Description |
+|---------|--------|-------------|
+| **Gradio UI** | ✅ Complete | Streaming chat interface |
+| **MCP Server** | ✅ Complete | Tools accessible from Claude Desktop |
+| **Modal Sandbox** | ✅ Complete | Secure statistical analysis |
+| **Multi-Source Search** | ✅ Complete | PubMed, ClinicalTrials, bioRxiv |
 
 ---
 
 ## Team
 
-- Physician (medical domain expert) ✅
-- Software engineers ✅
-- AI architecture validated by multiple agents ✅
+- The-Obstacle-Is-The-Way
+- MarioAderman
+- Josephrp
 
 ---
 
@@ -85,11 +86,7 @@ User Question → Research Agent (Orchestrator)
 
 | Phase | Status |
 |-------|--------|
-| Phases 1-8 | ✅ COMPLETE |
-| Phase 9: Remove DuckDuckGo | 📝 SPEC READY |
-| Phase 10: ClinicalTrials.gov | 📝 SPEC READY |
-| Phase 11: bioRxiv | 📝 SPEC READY |
+| Phases 1-14 | ✅ COMPLETE |
 
+**Test Coverage**: 65% (96 tests passing)
 **Architecture Review**: PASSED (98-99/100)
-**Phases 1-8**: COMPLETE
-**Next**: Phases 9-11 (Multi-Source Enhancement)
```
docs/pending/00_priority_summary.md DELETED

```diff
@@ -1,111 +0,0 @@
-# DeepCritical Hackathon Priority Summary
-
-## 4 Days Left (Deadline: Nov 30, 2025 11:59 PM UTC)
-
----
-
-## Git Contribution Analysis
-
-```text
-The-Obstacle-Is-The-Way: 20+ commits (Phases 1-11, all demos, all fixes)
-MarioAderman: 3 commits (Modal, LlamaIndex, PubMed fix)
-JJ (Maintainer): 0 code commits (merge button only)
-```
-
-**Conclusion:** You built 90%+ of this codebase.
-
----
-
-## Current Stack (What We Have)
-
-| Component | Status | Files |
-|-----------|--------|-------|
-| PubMed Search | ✅ Working | `src/tools/pubmed.py` |
-| ClinicalTrials Search | ✅ Working | `src/tools/clinicaltrials.py` |
-| bioRxiv Search | ✅ Working | `src/tools/biorxiv.py` |
-| Search Handler | ✅ Working | `src/tools/search_handler.py` |
-| Embeddings/ChromaDB | ✅ Working | `src/services/embeddings.py` |
-| LlamaIndex RAG | ✅ Working | `src/services/llamaindex_rag.py` |
-| Hypothesis Agent | ✅ Working | `src/agents/hypothesis_agent.py` |
-| Report Agent | ✅ Working | `src/agents/report_agent.py` |
-| Judge Agent | ✅ Working | `src/agents/judge_agent.py` |
-| Orchestrator | ✅ Working | `src/orchestrator.py` |
-| Gradio UI | ✅ Working | `src/app.py` |
-| Modal Code Execution | ⚠️ Built, not wired | `src/tools/code_execution.py` |
-| **MCP Server** | ✅ **Working** | `src/mcp_tools.py`, `src/app.py` |
-
----
-
-## What's Required for Track 2 (MCP in Action)
-
-| Requirement | Have It? | Priority |
-|-------------|----------|----------|
-| Autonomous agent behavior | ✅ Yes | - |
-| Must use MCP servers as tools | ✅ **YES** | Done (Phase 12) |
-| Must be Gradio app | ✅ Yes | - |
-| Planning/reasoning/execution | ✅ Yes | - |
-
-**Bottom Line:** ✅ MCP server implemented in Phase 12. Track 2 compliant.
-
----
-
-## 3 Things To Do (In Order)
-
-### 1. MCP Server (P0 - Required) ✅ DONE
-
-- **Files:** `src/mcp_tools.py`, `src/app.py`
-- **Status:** Implemented in Phase 12
-- **Doc:** `02_mcp_server_integration.md`
-- **Endpoint:** `/gradio_api/mcp/`
-
-### 2. Modal Wiring (P1 - $2,500 Prize)
-
-- **File:** Update `src/agents/analysis_agent.py`
-- **Time:** 2-3 hours
-- **Doc:** `03_modal_integration.md`
-- **Why:** Modal Innovation Award is $2,500
-
-### 3. Demo Video + Submission (P0 - Required)
-
-- **Time:** 1-2 hours
-- **Why:** Required for all submissions
-
----
-
-## Submission Checklist
-
-- [ ] Space in MCP-1st-Birthday org
-- [ ] Tag: `mcp-in-action-track-enterprise`
-- [ ] Social media post link
-- [ ] Demo video (1-5 min)
-- [ ] MCP server working
-- [ ] All tests passing
-
----
-
-## Prize Math
-
-| Award | Amount | Eligible? |
-|-------|--------|-----------|
-| Track 2 1st Place | $2,500 | If MCP works |
-| Modal Innovation | $2,500 | If Modal wired |
-| LlamaIndex | $1,000 | Yes (have it) |
-| Community Choice | $1,000 | Maybe |
-| **Total Potential** | **$7,000** | With MCP + Modal |
-
----
-
-## Next Actions
-
-```bash
-# 1. MCP Server - DONE ✅
-uv run python src/app.py  # Starts Gradio with MCP at /gradio_api/mcp/
-
-# 2. Test MCP works
-curl http://localhost:7860/gradio_api/mcp/schema | jq
-
-# 3. Wire Modal into pipeline
-# (see 03_modal_integration.md)
-
-# 4. Record demo video
-
-# 5. Submit to MCP-1st-Birthday org
-```
```
docs/pending/01_hackathon_requirements.md DELETED

```diff
@@ -1,99 +0,0 @@
-# MCP's 1st Birthday Hackathon - Requirements Analysis
-
-> **✅ MCP Server implemented in Phase 12** - Track 2 compliant
-
-## Deadline: November 30, 2025 11:59 PM UTC
-
----
-
-## Track Selection: MCP in Action (Track 2)
-
-DeepCritical fits **Track 2: MCP in Action** - AI agent applications.
-
-### Required Tags (pick one)
-
-```yaml
-tags:
-  - mcp-in-action-track-enterprise  # Drug repurposing = enterprise/healthcare
-  # OR
-  - mcp-in-action-track-consumer    # If targeting patients/consumers
-```
-
-### Track 2 Requirements
-
-| Requirement | DeepCritical Status | Action Needed |
-|-------------|---------------------|---------------|
-| Autonomous Agent behavior | ✅ Have it | Search-Judge-Synthesize loop |
-| Must use MCP servers as tools | ✅ **DONE** | `src/mcp_tools.py` |
-| Must be a Gradio app | ✅ Have it | `src/app.py` |
-| Planning, reasoning, execution | ✅ Have it | Orchestrator + Judge |
-| Context Engineering / RAG | ✅ Have it | LlamaIndex + ChromaDB |
-
----
-
-## Prize Opportunities
-
-### Current Eligibility vs With MCP Integration
-
-| Award | Prize | Current | With MCP |
-|-------|-------|---------|----------|
-| MCP in Action (1st) | $2,500 | ✅ Eligible | ✅ STRONGER |
-| Modal Innovation | $2,500 | ❌ Not using | ✅ ELIGIBLE (code execution) |
-| Blaxel Choice | $2,500 | ❌ Not using | ⚠️ Could integrate |
-| LlamaIndex | $1,000 | ✅ Using (Mario's code) | ✅ ELIGIBLE |
-| Google Gemini | $10K credits | ❌ Not using | ⚠️ Could add |
-| Community Choice | $1,000 | ⚠️ Possible | ✅ Better demo helps |
-| **TOTAL POTENTIAL** | | ~$2,500 | **$8,500+** |
-
----
-
-## Submission Checklist
-
-- [ ] HuggingFace Space in `MCP-1st-Birthday` organization
-- [ ] Track tags in Space README.md
-- [ ] Social media post link (X, LinkedIn)
-- [ ] Demo video (1-5 minutes)
-- [ ] All team members registered
-- [ ] Original work (Nov 14-30)
-
----
-
-## Priority Integration Order
-
-### P0 - MUST HAVE (Required for Track 2)
-
-1. **MCP Server Wrapper** - Expose search tools as MCP servers
-   - See: `02_mcp_server_integration.md`
-
-### P1 - HIGH VALUE ($2,500 each)
-
-2. **Modal Integration** - Already have code, need to wire up
-   - See: `03_modal_integration.md`
-
-### P2 - NICE TO HAVE
-
-3. **Blaxel** - MCP hosting platform (if time permits)
-4. **Gemini API** - Add as LLM option for Google prize
-
----
-
-## What MCP Actually Means for Us
-
-MCP (Model Context Protocol) is Anthropic's standard for connecting AI to tools.
-
-**Current state:**
-- We have `PubMedTool`, `ClinicalTrialsTool`, `BioRxivTool`
-- They're Python classes with `search()` methods
-
-**What we need:**
-- Wrap these as MCP servers
-- So Claude Desktop, Cursor, or any MCP client can use them
-
-**Why this matters:**
-- Judges will test if our tools work with Claude Desktop
-- No MCP = disqualified from Track 2
-
----
-
-## Reference Links
-
-- [Hackathon Page](https://huggingface.co/MCP-1st-Birthday)
-- [MCP Documentation](https://modelcontextprotocol.io/)
-- [Gradio MCP Guide](https://www.gradio.app/guides/building-mcp-server-with-gradio)
-- [Discord: #agents-mcp-hackathon-winter25](https://discord.gg/huggingface)
```
docs/pending/02_mcp_server_integration.md DELETED

```diff
@@ -1,177 +0,0 @@
-# MCP Server Integration
-
-## Priority: P0 - REQUIRED FOR TRACK 2
-
-> **✅ STATUS: IMPLEMENTED** - See `src/mcp_tools.py` and `src/app.py`
-> MCP endpoint: `/gradio_api/mcp/`
-
----
-
-## What We Need
-
-Expose our search tools as MCP servers so Claude Desktop/Cursor can use them.
-
-### Current Tools to Expose
-
-| Tool | File | MCP Tool Name |
-|------|------|---------------|
-| PubMed Search | `src/tools/pubmed.py` | `search_pubmed` |
-| ClinicalTrials Search | `src/tools/clinicaltrials.py` | `search_clinical_trials` |
-| bioRxiv Search | `src/tools/biorxiv.py` | `search_biorxiv` |
-| Combined Search | `src/tools/search_handler.py` | `search_all_sources` |
-
----
-
-## Implementation Options
-
-### Option 1: Gradio MCP (Recommended)
-
-Gradio 5.0+ can expose any Gradio app as an MCP server automatically.
-
-```python
-# src/mcp_server.py
-import gradio as gr
-from src.tools.pubmed import PubMedTool
-from src.tools.clinicaltrials import ClinicalTrialsTool
-from src.tools.biorxiv import BioRxivTool
-
-pubmed = PubMedTool()
-trials = ClinicalTrialsTool()
-biorxiv = BioRxivTool()
-
-async def search_pubmed(query: str, max_results: int = 10) -> str:
-    """Search PubMed for biomedical literature."""
-    results = await pubmed.search(query, max_results)
-    return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
-
-async def search_clinical_trials(query: str, max_results: int = 10) -> str:
-    """Search ClinicalTrials.gov for clinical trial data."""
-    results = await trials.search(query, max_results)
-    return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
-
-async def search_biorxiv(query: str, max_results: int = 10) -> str:
-    """Search bioRxiv/medRxiv for preprints."""
-    results = await biorxiv.search(query, max_results)
-    return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
-
-# Create Gradio interface
-demo = gr.Interface(
-    fn=[search_pubmed, search_clinical_trials, search_biorxiv],
-    inputs=[gr.Textbox(label="Query"), gr.Number(label="Max Results", value=10)],
-    outputs=gr.Textbox(label="Results"),
-)
-
-# Launch as MCP server
-if __name__ == "__main__":
-    demo.launch(mcp_server=True)  # Gradio 5.0+ feature
-```
-
-### Option 2: Native MCP SDK
-
-Use the official MCP Python SDK:
-
-```bash
-uv add mcp
-```
-
-```python
-# src/mcp_server.py
-from mcp.server import Server
-from mcp.types import Tool, TextContent
-
-from src.tools.pubmed import PubMedTool
-from src.tools.clinicaltrials import ClinicalTrialsTool
-from src.tools.biorxiv import BioRxivTool
-
-server = Server("deepcritical-research")
-
-@server.tool()
-async def search_pubmed(query: str, max_results: int = 10) -> list[TextContent]:
-    """Search PubMed for biomedical literature on drug repurposing."""
-    tool = PubMedTool()
-    results = await tool.search(query, max_results)
-    return [TextContent(type="text", text=e.content) for e in results]
-
-@server.tool()
-async def search_clinical_trials(query: str, max_results: int = 10) -> list[TextContent]:
-    """Search ClinicalTrials.gov for clinical trials."""
-    tool = ClinicalTrialsTool()
-    results = await tool.search(query, max_results)
-    return [TextContent(type="text", text=e.content) for e in results]
-
-@server.tool()
-async def search_biorxiv(query: str, max_results: int = 10) -> list[TextContent]:
-    """Search bioRxiv/medRxiv for preprints (not peer-reviewed)."""
-    tool = BioRxivTool()
-    results = await tool.search(query, max_results)
-    return [TextContent(type="text", text=e.content) for e in results]
-
-if __name__ == "__main__":
-    server.run()
-```
-
----
-
-## Claude Desktop Configuration
-
-After implementing, users add to `claude_desktop_config.json`:
-
-```json
-{
-  "mcpServers": {
-    "deepcritical": {
-      "command": "uv",
-      "args": ["run", "python", "src/mcp_server.py"],
-      "cwd": "/path/to/DeepCritical-1"
-    }
-  }
-}
-```
-
----
-
-## Testing MCP Server
-
-1. Start the MCP server (via Gradio app):
-
-```bash
-uv run python src/app.py
-```
-
-2. Check MCP schema:
-
-```bash
-curl http://localhost:7860/gradio_api/mcp/schema | jq
-```
-
-3. Test with MCP Inspector:
-
-```bash
-npx @anthropic/mcp-inspector http://localhost:7860/gradio_api/mcp/sse
-```
-
-4. Verify tools appear and work
-
----
-
-## Demo Video Script
-
-For the hackathon submission video:
-
-1. Show Claude Desktop with DeepCritical MCP tools
-2. Ask: "Search PubMed for metformin Alzheimer's"
-3. Show real results appearing
-4. Ask: "Now search clinical trials for the same"
-5. Show combined analysis
-
-This proves MCP integration works.
-
----
-
-## Files Created
-
-- [x] `src/mcp_tools.py` - MCP tool wrapper functions
-- [x] `src/app.py` - Gradio app with `mcp_server=True`
-- [x] `tests/unit/test_mcp_tools.py` - Unit tests
-- [x] `tests/integration/test_mcp_tools_live.py` - Integration tests
-- [x] `README.md` - Updated with MCP usage instructions
```
docs/pending/03_modal_integration.md DELETED

```diff
@@ -1,158 +0,0 @@
-# Modal Integration
-
-## Priority: P1 - HIGH VALUE ($2,500 Modal Innovation Award)
-
----
-
-## What Modal Is For
-
-Modal provides serverless GPU/CPU compute. For DeepCritical:
-
-### Current Use Case (Mario's Code)
-
-- `src/tools/code_execution.py` - Run LLM-generated analysis code in sandboxes
-- Scientific computing (pandas, scipy, numpy) in isolated containers
-
-### Potential Additional Use Cases
-
-| Use Case | Benefit | Complexity |
-|----------|---------|------------|
-| Code Execution Sandbox | Run statistical analysis safely | ✅ Already built |
-| LLM Inference | Run local models (no API costs) | Medium |
-| Batch Processing | Process many papers in parallel | Medium |
-| Embedding Generation | GPU-accelerated embeddings | Low |
-
----
-
-## Current State
-
-Mario implemented `src/tools/code_execution.py`:
-
-```python
-# Already exists - ModalCodeExecutor
-executor = get_code_executor()
-result = executor.execute("""
-import pandas as pd
-import numpy as np
-# LLM-generated statistical analysis
-""")
-```
-
-### What's Missing
-
-1. **Not wired into the main pipeline** - The executor exists but isn't used
-2. **No Modal tokens configured** - Needs MODAL_TOKEN_ID/MODAL_TOKEN_SECRET
-3. **No demo showing it works** - Judges need to see it
-
----
-
-## Integration Plan
-
-### Step 1: Wire Into Agent Pipeline
-
-Add a `StatisticalAnalyzer` service that uses Modal:
-
-```python
-# src/services/statistical_analyzer.py
-import asyncio
-from src.tools.code_execution import get_code_executor
-
-class StatisticalAnalyzer:
-    """Run statistical analysis on evidence using Modal sandbox."""
-
-    async def analyze(self, evidence: list[Evidence], query: str) -> str:
-        # 1. LLM generates analysis code
-        code = await self._generate_analysis_code(evidence, query)
-
-        # 2. Execute in Modal sandbox (run sync executor in thread pool)
-        executor = get_code_executor()
-        loop = asyncio.get_event_loop()
-        result = await loop.run_in_executor(None, executor.execute, code)
-
-        # 3. Return results
-        return result["stdout"]
-```
-
-### Step 2: Add to Orchestrator
-
-```python
-# In orchestrator, after gathering evidence:
-if settings.enable_modal_analysis:
-    analysis_agent = AnalysisAgent()
-    stats_results = await analysis_agent.analyze(evidence, query)
-```
-
-### Step 3: Create Demo
-
-```python
-# examples/modal_demo/run_analysis.py
-"""Demo: Modal-powered statistical analysis of drug evidence."""
-
-# Show:
-# 1. Gather evidence from PubMed
-# 2. Generate analysis code with LLM
-# 3. Execute in Modal sandbox
-# 4. Return statistical insights
-```
-
----
-
-## Modal Setup
-
-### 1. Install Modal CLI
-
-```bash
-pip install modal
-modal setup  # Authenticates with Modal
-```
-
-### 2. Set Environment Variables
-
-```bash
-# In .env
-MODAL_TOKEN_ID=your-token-id
-MODAL_TOKEN_SECRET=your-token-secret
-```
-
-### 3. Deploy (Optional)
-
-```bash
-modal deploy src/tools/code_execution.py
-```
-
----
-
-## What to Show Judges
-
-For the Modal Innovation Award ($2,500):
-
-1. **Sandbox Isolation** - Code runs in container, not local
-2. **Scientific Computing** - Real pandas/scipy analysis
-3. **Safety** - Can't access local filesystem
-4. **Speed** - Modal's fast cold starts
-
-### Demo Script
-
-```bash
-# Run the Modal verification script
-uv run python examples/modal_demo/verify_sandbox.py
-```
-
-This proves code runs in Modal, not locally.
-
----
-
-## Files to Update
-
-- [ ] Wire `code_execution.py` into pipeline
-- [ ] Create `src/agents/analysis_agent.py`
-- [ ] Update `examples/modal_demo/` with working demo
-- [ ] Add Modal setup to README
-- [ ] Test with real Modal account
-
----
-
-## Cost Estimate
-
-Modal pricing for our use case:
-
-- CPU sandbox: ~$0.0001 per execution
-- For demo/judging: < $1 total
-- Free tier: 30 hours/month
-
-Not a cost concern.
```