VibecoderMcSwaggins committed
Commit 8a98024 · 1 Parent(s): 1515e72

docs: replace completed bug docs with new Magentic bug report


Deleted (all implemented):
- P0_ACTIONABLE_FIXES.md
- P0_CRITICAL_BUGS.md
- P0_MAGENTIC_AND_SEARCH_AUDIT.md
- PHASE_00-03 implementation docs

Added new bug report for actual issue found:
- Magentic mode returns ChatMessage object instead of text
- Root cause: event.message.text extraction fails
- Max rounds reached before ReportAgent can synthesize

OpenAI key works. Simple mode works. The bug is in how
the final result event is processed.

docs/bugs/P0_ACTIONABLE_FIXES.md DELETED
@@ -1,281 +0,0 @@
# P0 Actionable Fixes - What to Do

**Date:** November 27, 2025
**Status:** ACTIONABLE

---

## Summary: What's Broken and What's Fixable

| Tool | Problem | Fixable? | How |
|------|---------|----------|-----|
| BioRxiv | API has NO search endpoint | **NO** | Replace with Europe PMC |
| PubMed | No query preprocessing | **YES** | Add query cleaner |
| ClinicalTrials | No filters applied | **YES** | Add filter params |
| Magentic Framework | Nothing wrong | N/A | Already working |

---

## FIX 1: Replace BioRxiv with Europe PMC (30 min)

### Why BioRxiv Can't Be Fixed

The bioRxiv API only has this endpoint:

```
https://api.biorxiv.org/details/{server}/{date-range}/{cursor}/json
```

This returns papers **by date**, not by keyword. There is NO search endpoint.

**Proof:** I queried `medrxiv/2024-01-01/2024-01-02` and got:
- "Global risk of Plasmodium falciparum" (malaria)
- "Multiple Endocrine Neoplasia in India"
- "Acupuncture for Acute Musculoskeletal Pain"

**None of these are about Long COVID** because the API doesn't search.
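The proof can be reproduced directly against the date-window endpoint; a minimal sketch (the `collection` and `title` response fields are assumptions about the bioRxiv details API, not code from this repo):

```python
# Fetch the date-window endpoint used in the proof above.
# Note: there is no keyword parameter anywhere in this URL.
import httpx

url = "https://api.biorxiv.org/details/medrxiv/2024-01-01/2024-01-02/0/json"
papers = httpx.get(url, timeout=30.0).json().get("collection", [])
for paper in papers[:3]:
    print(paper.get("title"))  # recent papers by date, regardless of topic
```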
### Europe PMC Has Search + Preprints

```bash
curl "https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=long+covid+treatment&resultType=core&pageSize=3&format=json"
```

Returns 283,058 results including:
- "Long COVID Treatment No Silver Bullets, Only a Few Bronze BBs" ✅

### The Fix

Replace `src/tools/biorxiv.py` with `src/tools/europepmc.py`:

```python
"""Europe PMC preprint and paper search tool."""

import httpx

from src.utils.models import Citation, Evidence


class EuropePMCTool:
    """Search Europe PMC for papers and preprints."""

    BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

    @property
    def name(self) -> str:
        return "europepmc"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        """Search Europe PMC (includes preprints from bioRxiv/medRxiv)."""
        params = {
            "query": query,
            "resultType": "core",
            "pageSize": max_results,
            "format": "json",
        }

        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.get(self.BASE_URL, params=params)
            response.raise_for_status()
            data = response.json()

        results = data.get("resultList", {}).get("result", [])
        return [self._to_evidence(r) for r in results]

    def _to_evidence(self, result: dict) -> Evidence:
        """Convert a Europe PMC result to Evidence."""
        title = result.get("title", "Untitled")
        abstract = result.get("abstractText", "No abstract")
        doi = result.get("doi", "")
        pub_year = result.get("pubYear", "Unknown")

        # Mark preprints
        pub_type = result.get("pubTypeList", {}).get("pubType", [])
        is_preprint = "Preprint" in pub_type

        content = f"{'[PREPRINT] ' if is_preprint else ''}{abstract[:1800]}"

        return Evidence(
            content=content,
            citation=Citation(
                source="preprint" if is_preprint else "europepmc",
                title=title[:500],
                url=f"https://doi.org/{doi}" if doi else "",
                date=str(pub_year),
            ),
            relevance=0.75 if is_preprint else 0.9,
        )
```

---

## FIX 2: Add PubMed Query Preprocessing (1 hour)

### Current Problem

User enters: `What medications show promise for Long COVID?`
PubMed receives: `What medications show promise for Long COVID?`

The question words pollute the search.

### The Fix

Add `src/tools/query_utils.py`:

```python
"""Query preprocessing utilities."""

# Question words to remove
QUESTION_WORDS = {
    "what", "which", "how", "why", "when", "where", "who",
    "is", "are", "can", "could", "would", "should", "do", "does",
    "show", "promise", "help", "treat", "cure",
}

# Medical synonyms to expand
SYNONYMS = {
    "long covid": ["long COVID", "PASC", "post-COVID syndrome", "post-acute sequelae"],
    "alzheimer": ["Alzheimer's disease", "AD", "Alzheimer dementia"],
    "cancer": ["neoplasm", "tumor", "malignancy", "carcinoma"],
}


def preprocess_pubmed_query(raw_query: str) -> str:
    """Convert natural language to a cleaner PubMed query."""
    # Lowercase
    query = raw_query.lower()

    # Remove question marks
    query = query.replace("?", "")

    # Remove question words
    words = query.split()
    words = [w for w in words if w not in QUESTION_WORDS]
    query = " ".join(words)

    # Expand synonyms
    for term, expansions in SYNONYMS.items():
        if term in query:
            # Add OR clause
            expansion = " OR ".join([f'"{e}"' for e in expansions])
            query = query.replace(term, f"({expansion})")

    return query.strip()
```

Then update `src/tools/pubmed.py`:

```python
from src.tools.query_utils import preprocess_pubmed_query


async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    # Preprocess query
    clean_query = preprocess_pubmed_query(query)

    search_params = self._build_params(
        db="pubmed",
        term=clean_query,  # Use cleaned query
        retmax=max_results,
        sort="relevance",
    )
    # ... rest unchanged
```

---

## FIX 3: Add ClinicalTrials.gov Filters (30 min)

### Current Problem

Returns ALL trials, including withdrawn and terminated trials and observational studies.

### The Fix

The API supports `filter.overallStatus` and other filters. Update `src/tools/clinicaltrials.py`:

```python
async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    params: dict[str, str | int] = {
        "query.term": query,
        "pageSize": min(max_results, 100),
        "fields": "|".join(self.FIELDS),
        # ADD THESE FILTERS:
        "filter.overallStatus": "COMPLETED|RECRUITING|ACTIVE_NOT_RECRUITING",
        # Only interventional studies (not observational)
        "aggFilters": "studyType:int",
    }
    # ... rest unchanged
```

**Note:** I tested the API - it supports filtering, but with slightly different syntax. Check the [API docs](https://clinicaltrials.gov/data-api/api).

---

## What NOT to Change

### Microsoft Agent Framework - WORKING

I verified:

```python
from agent_framework import MagenticBuilder, ChatAgent
from agent_framework.openai import OpenAIChatClient
# All imports OK

orchestrator = MagenticOrchestrator(max_rounds=2)
workflow = orchestrator._build_workflow()
# Workflow built successfully
```

The Magentic agents are correctly wired:
- SearchAgent → GPT-5.1 ✅
- JudgeAgent → GPT-5.1 ✅
- HypothesisAgent → GPT-5.1 ✅
- ReportAgent → GPT-5.1 ✅

**The framework is fine. The tools it calls are broken.**

---

## Priority Order

1. **Replace BioRxiv** → Immediate, fundamental
2. **Add PubMed preprocessing** → High impact, easy
3. **Add ClinicalTrials filters** → Medium impact, easy

---

## Test After Fixes

```bash
# Test Europe PMC
uv run python -c "
import asyncio
from src.tools.europepmc import EuropePMCTool
tool = EuropePMCTool()
results = asyncio.run(tool.search('long covid treatment', 3))
for r in results:
    print(r.citation.title)
"

# Test PubMed with preprocessing
uv run python -c "
from src.tools.query_utils import preprocess_pubmed_query
q = 'What medications show promise for Long COVID?'
print(preprocess_pubmed_query(q))
# Should output: medications for (\"long COVID\" OR \"PASC\" OR \"post-COVID syndrome\" OR \"post-acute sequelae\")
"
```

---

## After These Fixes

The Magentic workflow will:
1. SearchAgent calls `search_pubmed("long COVID treatment")` → Gets RELEVANT papers
2. SearchAgent calls `search_preprints("long COVID treatment")` → Gets RELEVANT preprints via Europe PMC
3. SearchAgent calls `search_clinical_trials("long COVID")` → Gets INTERVENTIONAL trials only
4. JudgeAgent evaluates GOOD evidence
5. HypothesisAgent generates hypotheses from GOOD evidence
6. ReportAgent synthesizes a GOOD report

**The framework will work once we feed it good data.**
docs/bugs/P0_CRITICAL_BUGS.md DELETED
@@ -1,298 +0,0 @@
# P0 CRITICAL BUGS - Why DeepCritical Produces Garbage Results

**Date:** November 27, 2025
**Status:** CRITICAL - App is functionally useless
**Severity:** P0 (Blocker)

## TL;DR

The app produces garbage because:
1. **BioRxiv search doesn't work** - returns random papers
2. **Free tier LLM is too dumb** - can't identify drugs
3. **Query construction is naive** - no optimization for PubMed/CT.gov syntax
4. **Loop terminates too early** - 5 iterations isn't enough

---

## P0-001: BioRxiv Search is Fundamentally Broken

**File:** `src/tools/biorxiv.py:248-286`

**The Problem:**
The bioRxiv API **DOES NOT SUPPORT KEYWORD SEARCH**.

The code does this:

```python
# Fetch recent papers (last 90 days, first 100 papers)
url = f"{self.BASE_URL}/{self.server}/{interval}/0/json"
# Then filter client-side for keywords
```

**What Actually Happens:**
1. Fetches the first 100 papers from medRxiv in the last 90 days (chronological order)
2. Filters those 100 random papers for query keywords
3. Returns whatever garbage matches

**Result:** For "Long COVID medications", you get random papers like:
- "Calf muscle structure-function adaptations"
- "Work-Life Balance of Ophthalmologists During COVID"

These papers contain "COVID" somewhere but have NOTHING to do with Long COVID treatments.

**Root Cause:** The `/0/json` pagination only returns 100 papers. You'd need to paginate through ALL papers (thousands) to do proper keyword filtering.

**Fix Options:**
1. **Remove BioRxiv entirely** - It's unusable without a proper search API
2. **Use a different preprint aggregator** - Europe PMC has preprints WITH search
3. **Add pagination** - Fetch all papers (slow, expensive)
4. **Use Semantic Scholar API** - Has preprints and proper search

---

## P0-002: Free Tier LLM Cannot Perform Drug Identification

**File:** `src/agent_factory/judges.py:153-211`

**The Problem:**
Without an API key, the app uses `HFInferenceJudgeHandler` with:
- Llama 3.1 8B Instruct
- Mistral 7B Instruct

These are **7-8 billion parameter models**. They cannot:
- Reliably parse complex biomedical abstracts
- Identify drug candidates from scientific text
- Generate structured JSON output consistently
- Reason about mechanism of action

**Evidence of Failure:**

```python
# From MockJudgeHandler - the honest fallback when the LLM fails
drug_candidates=[
    "Drug identification requires AI analysis",
    "Enter API key above for full results",
]
```

The team KNEW the free tier can't identify drugs and added this message.

**Root Cause:** Drug repurposing requires understanding:
- Drug mechanisms
- Disease pathophysiology
- Clinical trial phases
- Statistical significance

This requires GPT-4 / Claude Sonnet class models (100B+ parameters).

**Fix Options:**
1. **Require an API key** - No free tier; be honest
2. **Use larger HF models** - Llama 70B or Mixtral 8x7B (expensive on the free tier)
3. **Hybrid approach** - Use the free tier for search, require paid models for synthesis

---

## P0-003: PubMed Query Not Optimized

**File:** `src/tools/pubmed.py:54-71`

**The Problem:**
The query is passed directly to PubMed without optimization:

```python
search_params = self._build_params(
    db="pubmed",
    term=query,  # Raw user query!
    retmax=max_results,
    sort="relevance",
)
```

**What User Enters:** "What medications show promise for Long COVID?"

**What PubMed Receives:** `What medications show promise for Long COVID?`

**What PubMed Should Receive:**

```
("long covid"[Title/Abstract] OR "post-COVID"[Title/Abstract] OR "PASC"[Title/Abstract])
AND (drug[Title/Abstract] OR treatment[Title/Abstract] OR medication[Title/Abstract] OR therapy[Title/Abstract])
AND (clinical trial[Publication Type] OR randomized[Title/Abstract])
```

**Root Cause:** No query preprocessing or medical term expansion.

**Fix Options:**
1. **Add a query preprocessor** - Extract medical entities, expand synonyms
2. **Use MeSH terms** - PubMed's controlled vocabulary for better recall
3. **LLM query generation** - Use an LLM to generate an optimized PubMed query

---

## P0-004: Loop Terminates Too Early

**File:** `src/app.py:42-45` and `src/utils/models.py`

**The Problem:**

```python
config = OrchestratorConfig(
    max_iterations=5,
    max_results_per_tool=10,
)
```

5 iterations is not enough to:
1. Search multiple variations of the query
2. Gather enough evidence for the Judge to synthesize
3. Refine queries based on initial results

**Evidence:** The user's output shows "Max Iterations Reached" with only 6 sources.

**Root Cause:** Conservative defaults to avoid API costs, but they make the app useless.

**Fix Options:**
1. **Increase the default to 10-15** - More iterations = better results
2. **Dynamic termination** - Stop when confidence > threshold, not at an iteration count (see the sketch below)
3. **Parallel query expansion** - Run more queries per iteration
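A minimal sketch of option 2, assuming a judge that exposes a numeric confidence score; `search_round`, `judge.evaluate`, and the `confidence` field are hypothetical names, not this repo's actual API:

```python
# Sketch: stop on judge confidence instead of a fixed iteration count.
CONFIDENCE_THRESHOLD = 0.8
MAX_ITERATIONS = 15  # hard safety cap so the loop always terminates

async def run_until_confident(orchestrator, judge, query: str):
    evidence = []
    for _ in range(MAX_ITERATIONS):
        evidence += await orchestrator.search_round(query)  # hypothetical helper
        verdict = await judge.evaluate(query, evidence)     # hypothetical helper
        if verdict.confidence >= CONFIDENCE_THRESHOLD:
            return verdict  # enough good evidence; synthesize now
    return verdict  # fall back to whatever we have at the cap
```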
---

## P0-005: No Query Understanding Layer

**Files:** `src/orchestrator.py`, `src/tools/search_handler.py`

**The Problem:**
There is no NLU (Natural Language Understanding) layer. The system:
1. Takes the raw user query
2. Passes it directly to the search tools
3. No entity extraction
4. No intent classification
5. No query expansion

For drug repurposing, you need to extract:
- **Disease:** "Long COVID" → [Long COVID, PASC, Post-COVID syndrome, chronic COVID]
- **Drug intent:** "medications" → [drugs, treatments, therapeutics, interventions]
- **Evidence type:** "show promise" → [clinical trials, efficacy, RCT]

**Root Cause:** No preprocessing pipeline between user input and search execution.

**Fix Options:**
1. **Add entity extraction** - Use BioBERT or PubMedBERT for medical NER
2. **Add query expansion** - Use medical ontologies (UMLS, MeSH)
3. **LLM preprocessing** - Use an LLM to generate a search strategy before searching

---

## P0-006: ClinicalTrials.gov Results Not Filtered

**File:** `src/tools/clinicaltrials.py`

**The Problem:**
ClinicalTrials.gov returns ALL matching trials, including:
- Withdrawn trials
- Terminated trials
- Not yet recruiting
- Observational studies (not interventional)

For drug repurposing, you want:
- Interventional studies
- Phase 2+ (has safety/efficacy data)
- Completed or with results

**Root Cause:** No filtering of trial metadata.

---

## Summary: Why This App Produces Garbage

```
User Query: "What medications show promise for Long COVID?"
        ↓
┌─────────────────────────────────────────────────────────────┐
│ NO QUERY PREPROCESSING                                      │
│ - No entity extraction                                      │
│ - No synonym expansion                                      │
│ - No medical term normalization                             │
└─────────────────────────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────────────────────────┐
│ BROKEN SEARCH LAYER                                         │
│ - PubMed: Raw query, no MeSH, gets 1 result                 │
│ - BioRxiv: Returns random papers (API doesn't support search)│
│ - ClinicalTrials: Returns all trials, no filtering          │
└─────────────────────────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────────────────────────┐
│ GARBAGE EVIDENCE                                            │
│ - 6 papers, most irrelevant                                 │
│ - "Calf muscle adaptations" (mentions COVID once)           │
│ - "Ophthalmologist work-life balance"                       │
└─────────────────────────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────────────────────────┐
│ DUMB JUDGE (Free Tier)                                      │
│ - Llama 8B can't identify drugs from garbage                │
│ - JSON parsing fails                                        │
│ - Falls back to "Drug identification requires AI analysis"  │
└─────────────────────────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────────────────────────┐
│ LOOP HITS MAX (5 iterations)                                │
│ - Never finds enough good evidence                          │
│ - Never synthesizes anything useful                         │
└─────────────────────────────────────────────────────────────┘
        ↓
GARBAGE OUTPUT
```

---

## What Would Make This Actually Work

### Minimum Viable Fix (1-2 days)

1. **Remove BioRxiv** - It doesn't work
2. **Require an API key** - Be honest that the free tier is useless
3. **Add basic query preprocessing** - Strip question words, expand COVID synonyms
4. **Increase iterations to 10**

### Proper Fix (1-2 weeks)

1. **Query Understanding Layer**
   - Medical NER (BioBERT/SciBERT)
   - Query expansion with MeSH/UMLS
   - Intent classification (drug discovery vs mechanism vs safety)

2. **Optimized Search**
   - PubMed: Proper query syntax with MeSH terms
   - ClinicalTrials: Filter by phase, status, intervention type
   - Replace BioRxiv with Europe PMC (has preprints + search)

3. **Evidence Ranking**
   - Score by publication type (RCT > cohort > case report)
   - Score by journal impact factor
   - Score by recency
   - Score by citation count

4. **Proper LLM Pipeline**
   - Use GPT-4 / Claude for synthesis
   - Structured extraction of: drug, mechanism, evidence level, effect size
   - Multi-step reasoning: identify → validate → rank → synthesize

---

## The Hard Truth

Building a drug repurposing agent that works is HARD. The state of the art is:

- **Drug2Disease (IBM)** - Uses knowledge graphs + ML
- **COVID-KG (Stanford)** - A dedicated COVID knowledge graph
- **Literature mining at scale (PubMed)** - Millions of papers, not 10

This hackathon project is fundamentally a **search wrapper with an LLM prompt**. That's not enough.

To make it useful:
1. Either scope it down (e.g., "find clinical trials for X disease")
2. Or invest serious engineering in the NLU + search + ranking pipeline
docs/bugs/P0_MAGENTIC_AND_SEARCH_AUDIT.md DELETED
@@ -1,249 +0,0 @@
# P0 Audit: Microsoft Agent Framework (Magentic) & Search Tools

**Date:** November 27, 2025
**Auditor:** Claude Code
**Status:** VERIFIED

---

## TL;DR

| Component | Status | Verdict |
|-----------|--------|---------|
| Microsoft Agent Framework | ✅ WORKING | Correctly wired, no bugs |
| GPT-5.1 Model Config | ✅ CORRECT | Using `gpt-5.1` as configured |
| Search Tools | ❌ BROKEN | Root cause of garbage results |

**The orchestration framework is fine. The search layer is garbage.**

---

## Microsoft Agent Framework Verification

### Import Test: PASSED

```python
from agent_framework import MagenticBuilder, ChatAgent
from agent_framework.openai import OpenAIChatClient
# All imports successful
```

### Agent Creation Test: PASSED

```python
from src.agents.magentic_agents import create_search_agent
search_agent = create_search_agent()
# SearchAgent created: SearchAgent
# Description: Searches biomedical databases (PubMed, ClinicalTrials.gov, bioRxiv)
```

### Workflow Build Test: PASSED

```python
from src.orchestrator_magentic import MagenticOrchestrator
orchestrator = MagenticOrchestrator(max_rounds=2)
workflow = orchestrator._build_workflow()
# Workflow built successfully: <class 'agent_framework._workflows._workflow.Workflow'>
```

### Model Configuration: CORRECT

```python
settings.openai_model = "gpt-5.1"  # ✅ Using GPT-5.1, not GPT-4o
settings.openai_api_key = True     # ✅ API key is set (truthy check)
```

---

## What Magentic Provides (Working)

1. **Multi-Agent Coordination**
   - Manager agent orchestrates SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent
   - Uses `MagenticBuilder().with_standard_manager()` for coordination

2. **ChatAgent Pattern**
   - Each agent has an internal LLM (GPT-5.1)
   - Can call tools via the `@ai_function` decorator
   - Has proper instructions for domain-specific tasks

3. **Workflow Streaming** (see the sketch after this list)
   - Events: `MagenticAgentMessageEvent`, `MagenticFinalResultEvent`, etc.
   - Real-time UI updates via `workflow.run_stream(task)`

4. **State Management**
   - `MagenticState` persists evidence across agents
   - `get_bibliography()` tool for ReportAgent
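A minimal sketch of consuming these streaming events; the import locations and the shape of `event.message` are assumptions pieced together from the snippets in this audit, not verified library API:

```python
# Sketch: drive the workflow and react to the two event types named above.
import asyncio

from agent_framework import MagenticAgentMessageEvent, MagenticFinalResultEvent
from src.orchestrator_magentic import MagenticOrchestrator

async def stream_demo(task: str) -> None:
    workflow = MagenticOrchestrator(max_rounds=2)._build_workflow()
    async for event in workflow.run_stream(task):
        if isinstance(event, MagenticAgentMessageEvent):
            print(f"[agent] {event.message}")   # per-agent progress update
        elif isinstance(event, MagenticFinalResultEvent):
            print(f"[final] {event.message}")   # synthesized final report

asyncio.run(stream_demo("What medications show promise for Long COVID?"))
```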
---

## What's Actually Broken: The Search Tools

### File: `src/agents/tools.py`

The Magentic agents call these tools:
- `search_pubmed` → Uses `PubMedTool`
- `search_clinical_trials` → Uses `ClinicalTrialsTool`
- `search_preprints` → Uses `BioRxivTool`

**These tools are the problem, not the framework.**

---

## Search Tool Bugs (Detailed)

### BUG 1: BioRxiv API Does Not Support Search

**File:** `src/tools/biorxiv.py:248-286`

```python
# This fetches the FIRST 100 papers from the last 90 days.
# It does NOT search by keyword - the API doesn't support that.
url = f"{self.BASE_URL}/{self.server}/{interval}/0/json"

# Then filters client-side for keywords
matching = self._filter_by_keywords(papers, query_terms, max_results)
```

**Problem:**
- Fetches 100 random chronological papers
- Filters for ANY keyword match in title/abstract
- "Long COVID medications" returns papers about "calf muscles" because they mention "COVID" once

**Fix:** Remove BioRxiv or use Europe PMC (which has actual search)

---

### BUG 2: PubMed Query Not Optimized

**File:** `src/tools/pubmed.py:54-71`

```python
search_params = self._build_params(
    db="pubmed",
    term=query,  # RAW USER QUERY - no preprocessing!
    retmax=max_results,
    sort="relevance",
)
```

**Problem:**
- User enters: "What medications show promise for Long COVID?"
- PubMed receives: `What medications show promise for Long COVID?`
- Should receive: `("long covid"[Title/Abstract] OR "PASC"[Title/Abstract]) AND (treatment[Title/Abstract] OR drug[Title/Abstract])`

**Fix:** Add query preprocessing:
1. Strip question words (what, which, how, etc.)
2. Expand medical synonyms (Long COVID → PASC, Post-COVID)
3. Use MeSH terms for better recall

---

### BUG 3: ClinicalTrials.gov No Filtering

**File:** `src/tools/clinicaltrials.py`

Returns ALL trials, including:
- Withdrawn trials
- Terminated trials
- Observational studies (not drug interventions)
- Phase 1 (no efficacy data)

**Fix:** Filter by:
- `studyType=INTERVENTIONAL`
- `phase=PHASE2,PHASE3,PHASE4`
- `status=COMPLETED,ACTIVE_NOT_RECRUITING,RECRUITING`

---

## Evidence: Garbage In → Garbage Out

When the Magentic SearchAgent calls these tools:

```
SearchAgent: "Find evidence for Long COVID medications"
        ↓
search_pubmed("Long COVID medications")
    → Returns 1 semi-relevant paper (raw query hits)

search_preprints("Long COVID medications")
    → Returns garbage (BioRxiv API doesn't search)
    → "Calf muscle adaptations" (has "COVID" somewhere)
    → "Ophthalmologist work-life balance" (mentions COVID)

search_clinical_trials("Long COVID medications")
    → Returns all trials, no filtering
        ↓
JudgeAgent receives garbage evidence
        ↓
HypothesisAgent can't generate good hypotheses from garbage
        ↓
ReportAgent produces a garbage report
```

**The framework is doing its job. It's orchestrating agents correctly. But the agents are being fed garbage data.**

---

## Recommended Fixes

### Priority 1: Delete or Fix BioRxiv (30 min)

**Option A: Delete it**

```python
# In src/agents/tools.py, remove:
# from src.tools.biorxiv import BioRxivTool
# _biorxiv = BioRxivTool()
# @ai_function search_preprints(...)
```

**Option B: Replace with Europe PMC**
Europe PMC has preprints AND a proper search API:

```
https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=long+covid+treatment&format=json
```

### Priority 2: Fix PubMed Query (1 hour)

Add a query preprocessor (a minimal runnable version of the steps originally sketched as comments):

```python
def preprocess_query(raw_query: str) -> str:
    """Convert natural language to PubMed query syntax."""
    # Strip question words and punctuation
    words = raw_query.replace("?", "").lower().split()
    words = [w for w in words if w not in {"what", "which", "how", "is", "are"}]
    query = " ".join(words)
    # Expand medical synonyms, e.g. long covid -> PASC
    query = query.replace("long covid", '("long covid" OR "PASC" OR "post-COVID")')
    # Field tags like [Title/Abstract] can be appended per term here
    return query.strip()
```

### Priority 3: Filter ClinicalTrials (30 min)

Add parameters to the API call:

```python
params = {
    "query.term": query,
    "filter.overallStatus": "COMPLETED,RECRUITING",
    "filter.studyType": "INTERVENTIONAL",
    "pageSize": max_results,
}
```

---

## Conclusion

**Microsoft Agent Framework: NO BUGS FOUND**
- Imports work ✅
- Agent creation works ✅
- Workflow building works ✅
- Model config correct (GPT-5.1) ✅
- Streaming events work ✅

**Search Tools: CRITICALLY BROKEN**
- BioRxiv: API doesn't support search (fundamental)
- PubMed: No query optimization (fixable)
- ClinicalTrials: No filtering (fixable)

**Recommendation:**
1. Delete BioRxiv immediately (unusable)
2. Add PubMed query preprocessing
3. Add ClinicalTrials filtering
4. Then the Magentic multi-agent system will work as designed
docs/bugs/P0_MAGENTIC_MODE_BROKEN.md ADDED
@@ -0,0 +1,116 @@
# P0 Bug: Magentic Mode Returns ChatMessage Object Instead of Report Text

**Status**: OPEN
**Priority**: P0 (Critical)
**Date**: 2025-11-27

---

## Actual Bug Found (Not What We Thought)

**The OpenAI key works fine.** The real bug is different:

### The Problem

When Magentic mode completes, the final report returns a `ChatMessage` object instead of the actual text:

```
FINAL REPORT:
<agent_framework._types.ChatMessage object at 0x11db70310>
```

### Evidence

Full test output shows:
1. Magentic orchestrator starts correctly
2. SearchAgent finds evidence
3. HypothesisAgent generates hypotheses
4. JudgeAgent evaluates
5. **BUT**: Final output is a `ChatMessage` object, not text

### Root Cause

In `src/orchestrator_magentic.py` line 193:

```python
elif isinstance(event, MagenticFinalResultEvent):
    text = event.message.text if event.message else "No result"
```

The `event.message` is a `ChatMessage` object, and `.text` may not extract the content correctly, or the message structure changed in the agent-framework library.

---

## Secondary Issue: Max Rounds Reached

The orchestrator hits max rounds before producing a report:

```
[ERROR] Magentic Orchestrator: Max round count reached
```

This means the workflow times out before the ReportAgent synthesizes the final output.

---

## What Works

- OpenAI API key: **Works** (loaded from .env)
- SearchAgent: **Works** (finds evidence from PubMed, ClinicalTrials, Europe PMC)
- HypothesisAgent: **Works** (generates Drug -> Target -> Pathway chains)
- JudgeAgent: **Partial** (evaluates but sometimes loses context)

---

## Files to Fix

| File | Line | Issue |
|------|------|-------|
| `src/orchestrator_magentic.py` | 193 | `event.message.text` returns an object, not a string |
| `src/orchestrator_magentic.py` | 97-99 | `max_round_count=3` too low for the full pipeline |

---

## Suggested Fix

```python
# In _process_event, line 192-199
elif isinstance(event, MagenticFinalResultEvent):
    # Handle the ChatMessage object properly
    if event.message:
        if hasattr(event.message, 'content'):
            text = event.message.content
        elif hasattr(event.message, 'text'):
            text = event.message.text
        else:
            text = str(event.message)
    else:
        text = "No result"
```

And increase rounds:

```python
# In _build_workflow, line 97
max_round_count=self._max_rounds,  # Use the configured value, default 10
```
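The same extraction logic could be pulled into a small helper so it can be unit-tested without running the whole workflow; a sketch (the helper name and attribute order are suggestions, not existing repo code):

```python
def extract_text(message: object) -> str:
    """Best-effort text extraction from a ChatMessage-like object."""
    if message is None:
        return "No result"
    for attr in ("content", "text"):
        value = getattr(message, attr, None)
        if isinstance(value, str) and value:
            return value
    return str(message)  # last resort: the repr seen in the bug output
```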
---

## Test Command

```bash
set -a && source .env && set +a && uv run python examples/orchestrator_demo/run_magentic.py "metformin alzheimer"
```

---

## Simple Mode Works

For reference, simple mode produces full reports:

```bash
uv run python examples/orchestrator_demo/run_agent.py "metformin alzheimer"
```

Output includes a structured report with Drug Candidates, Key Findings, etc.
docs/bugs/PHASE_00_IMPLEMENTATION_ORDER.md DELETED
@@ -1,156 +0,0 @@
# Phase 00: Implementation Order & Summary

**Total Effort:** 5-8 hours
**Parallelizable:** Yes (all 3 phases are independent)

---

## Executive Summary

The DeepCritical drug repurposing agent produces garbage results because the search tools are broken:

| Tool | Problem | Fix |
|------|---------|-----|
| BioRxiv | API doesn't support search | Replace with Europe PMC |
| PubMed | Raw queries, no preprocessing | Add query cleaner |
| ClinicalTrials | No filtering | Add status/type filters |

**The Microsoft Agent Framework (Magentic) is working correctly.** The orchestration layer is fine. The data layer is broken.

---

## Phase Specs

| Phase | Title | Effort | Priority | Dependencies |
|-------|-------|--------|----------|--------------|
| **01** | [Replace BioRxiv with Europe PMC](./PHASE_01_REPLACE_BIORXIV.md) | 2-3 hrs | P0 | None |
| **02** | [PubMed Query Preprocessing](./PHASE_02_PUBMED_QUERY_PREPROCESSING.md) | 2-3 hrs | P0 | None |
| **03** | [ClinicalTrials Filtering](./PHASE_03_CLINICALTRIALS_FILTERING.md) | 1-2 hrs | P1 | None |

---

## Recommended Execution Order

Since all phases are independent, they can be done in parallel by different developers.

**If doing them sequentially, order by impact:**

1. **Phase 01** - BioRxiv is completely broken (returns random papers)
2. **Phase 02** - PubMed is partially broken (returns suboptimal results)
3. **Phase 03** - ClinicalTrials returns too much noise

---

## TDD Workflow (Per Phase)

```
1. Write failing tests
2. Run tests (confirm they fail)
3. Implement fix
4. Run tests (confirm they pass)
5. Run ALL tests (confirm no regressions)
6. Manual verification
7. Commit
```

---

## Verification After All Phases

After completing all 3 phases, run this integration test:

```bash
# Full system test
uv run python -c "
import asyncio
from src.tools.europepmc import EuropePMCTool
from src.tools.pubmed import PubMedTool
from src.tools.clinicaltrials import ClinicalTrialsTool

async def test_all():
    query = 'long covid treatment'

    print('=== Europe PMC (Preprints) ===')
    epmc = EuropePMCTool()
    results = await epmc.search(query, 2)
    for r in results:
        print(f'  - {r.citation.title[:60]}...')

    print()
    print('=== PubMed ===')
    pm = PubMedTool()
    results = await pm.search(query, 2)
    for r in results:
        print(f'  - {r.citation.title[:60]}...')

    print()
    print('=== ClinicalTrials.gov ===')
    ct = ClinicalTrialsTool()
    results = await ct.search(query, 2)
    for r in results:
        print(f'  - {r.citation.title[:60]}...')

asyncio.run(test_all())
"
```

**Expected:** All results should be relevant to "long covid treatment".

---

## Test Magentic Integration

After all phases are complete, test the full Magentic workflow:

```bash
# Test Magentic mode (requires OPENAI_API_KEY)
uv run python -c "
import asyncio
from src.orchestrator_magentic import MagenticOrchestrator

async def test_magentic():
    orchestrator = MagenticOrchestrator(max_rounds=3)

    print('Running Magentic workflow...')
    async for event in orchestrator.run('What drugs show promise for Long COVID?'):
        print(f'[{event.type}] {event.message[:100]}...')

asyncio.run(test_magentic())
"
```

---

## Files Changed (All Phases)

| File | Phase | Action |
|------|-------|--------|
| `src/tools/europepmc.py` | 01 | CREATE |
| `tests/unit/tools/test_europepmc.py` | 01 | CREATE |
| `src/agents/tools.py` | 01 | MODIFY |
| `src/tools/search_handler.py` | 01 | MODIFY |
| `src/tools/biorxiv.py` | 01 | DELETE |
| `tests/unit/tools/test_biorxiv.py` | 01 | DELETE |
| `src/tools/query_utils.py` | 02 | CREATE |
| `tests/unit/tools/test_query_utils.py` | 02 | CREATE |
| `src/tools/pubmed.py` | 02 | MODIFY |
| `src/tools/clinicaltrials.py` | 03 | MODIFY |
| `tests/unit/tools/test_clinicaltrials.py` | 03 | MODIFY |

---

## Success Criteria (Overall)

- [ ] All unit tests pass
- [ ] All integration tests pass (real APIs)
- [ ] Query "What drugs show promise for Long COVID?" returns relevant results from all 3 sources
- [ ] Magentic workflow produces a coherent research report
- [ ] No regressions in existing functionality

---

## Related Documentation

- [P0 Critical Bugs](./P0_CRITICAL_BUGS.md) - Root cause analysis
- [P0 Magentic Audit](./P0_MAGENTIC_AND_SEARCH_AUDIT.md) - Framework verification
- [P0 Actionable Fixes](./P0_ACTIONABLE_FIXES.md) - Fix summaries
docs/bugs/PHASE_01_REPLACE_BIORXIV.md DELETED
@@ -1,371 +0,0 @@
# Phase 01: Replace BioRxiv with Europe PMC

**Priority:** P0 - Critical
**Effort:** 2-3 hours
**Dependencies:** None

---

## Problem Statement

The BioRxiv API does not support keyword search. It only returns papers by date range, resulting in completely irrelevant results for any query.

## Success Criteria

- [ ] `search_preprints("long covid treatment")` returns papers actually about Long COVID
- [ ] All existing tests pass
- [ ] New tests cover the Europe PMC integration

---

## TDD Implementation Order

### Step 1: Write Failing Tests

**File:** `tests/unit/tools/test_europepmc.py`

```python
"""Unit tests for the Europe PMC tool."""

from unittest.mock import AsyncMock, patch

import pytest

from src.tools.europepmc import EuropePMCTool
from src.utils.models import Evidence


@pytest.mark.unit
class TestEuropePMCTool:
    """Tests for EuropePMCTool."""

    @pytest.fixture
    def tool(self):
        return EuropePMCTool()

    def test_tool_name(self, tool):
        assert tool.name == "europepmc"

    @pytest.mark.asyncio
    async def test_search_returns_evidence(self, tool):
        """Test that search returns Evidence objects."""
        mock_response = {
            "resultList": {
                "result": [
                    {
                        "id": "12345",
                        "title": "Long COVID Treatment Study",
                        "abstractText": "This study examines treatments for Long COVID.",
                        "doi": "10.1234/test",
                        "pubYear": "2024",
                        "source": "MED",
                        "pubTypeList": {"pubType": ["research-article"]},
                    }
                ]
            }
        }

        with patch("httpx.AsyncClient") as mock_client:
            mock_instance = AsyncMock()
            mock_client.return_value.__aenter__.return_value = mock_instance
            mock_instance.get.return_value.json.return_value = mock_response
            mock_instance.get.return_value.raise_for_status = lambda: None

            results = await tool.search("long covid treatment", max_results=5)

        assert len(results) == 1
        assert isinstance(results[0], Evidence)
        assert "Long COVID Treatment Study" in results[0].citation.title

    @pytest.mark.asyncio
    async def test_search_marks_preprints(self, tool):
        """Test that preprints are marked correctly."""
        mock_response = {
            "resultList": {
                "result": [
                    {
                        "id": "PPR12345",
                        "title": "Preprint Study",
                        "abstractText": "Abstract text",
                        "doi": "10.1234/preprint",
                        "pubYear": "2024",
                        "source": "PPR",
                        "pubTypeList": {"pubType": ["Preprint"]},
                    }
                ]
            }
        }

        with patch("httpx.AsyncClient") as mock_client:
            mock_instance = AsyncMock()
            mock_client.return_value.__aenter__.return_value = mock_instance
            mock_instance.get.return_value.json.return_value = mock_response
            mock_instance.get.return_value.raise_for_status = lambda: None

            results = await tool.search("test", max_results=5)

        # The implementation's marker is "[PREPRINT - Not peer-reviewed]",
        # so match on the opening bracket and keyword only.
        assert "[PREPRINT" in results[0].content
        assert results[0].citation.source == "preprint"

    @pytest.mark.asyncio
    async def test_search_empty_results(self, tool):
        """Test handling of empty results."""
        mock_response = {"resultList": {"result": []}}

        with patch("httpx.AsyncClient") as mock_client:
            mock_instance = AsyncMock()
            mock_client.return_value.__aenter__.return_value = mock_instance
            mock_instance.get.return_value.json.return_value = mock_response
            mock_instance.get.return_value.raise_for_status = lambda: None

            results = await tool.search("nonexistent query xyz", max_results=5)

        assert results == []


@pytest.mark.integration
class TestEuropePMCIntegration:
    """Integration tests with the real API."""

    @pytest.mark.asyncio
    async def test_real_api_call(self):
        """Test that the actual API returns relevant results."""
        tool = EuropePMCTool()
        results = await tool.search("long covid treatment", max_results=3)

        assert len(results) > 0
        # At least one result should mention COVID
        titles = " ".join([r.citation.title.lower() for r in results])
        assert "covid" in titles or "sars" in titles
```

### Step 2: Implement Europe PMC Tool

**File:** `src/tools/europepmc.py`

```python
"""Europe PMC search tool - replaces BioRxiv."""

from typing import Any

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

from src.utils.exceptions import SearchError
from src.utils.models import Citation, Evidence


class EuropePMCTool:
    """
    Search Europe PMC for papers and preprints.

    Europe PMC indexes:
    - PubMed/MEDLINE articles
    - PMC full-text articles
    - Preprints from bioRxiv, medRxiv, ChemRxiv, etc.
    - Patents and clinical guidelines

    API Docs: https://europepmc.org/RestfulWebService
    """

    BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

    @property
    def name(self) -> str:
        return "europepmc"

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=1, max=10),
        reraise=True,
    )
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        """
        Search Europe PMC for papers matching the query.

        Args:
            query: Search keywords
            max_results: Maximum results to return

        Returns:
            List of Evidence objects
        """
        params = {
            "query": query,
            "resultType": "core",
            "pageSize": min(max_results, 100),
            "format": "json",
        }

        async with httpx.AsyncClient(timeout=30.0) as client:
            try:
                response = await client.get(self.BASE_URL, params=params)
                response.raise_for_status()

                data = response.json()
                results = data.get("resultList", {}).get("result", [])

                return [self._to_evidence(r) for r in results[:max_results]]

            except httpx.HTTPStatusError as e:
                raise SearchError(f"Europe PMC API error: {e}") from e
            except httpx.RequestError as e:
                raise SearchError(f"Europe PMC connection failed: {e}") from e

    def _to_evidence(self, result: dict[str, Any]) -> Evidence:
        """Convert a Europe PMC result to Evidence."""
        title = result.get("title", "Untitled")
        abstract = result.get("abstractText", "No abstract available.")
        doi = result.get("doi", "")
        pub_year = result.get("pubYear", "Unknown")

        # Get authors
        author_list = result.get("authorList", {}).get("author", [])
        authors = [a.get("fullName", "") for a in author_list[:5] if a.get("fullName")]

        # Check if preprint
        pub_types = result.get("pubTypeList", {}).get("pubType", [])
        is_preprint = "Preprint" in pub_types
        source_db = result.get("source", "europepmc")

        # Build content
        preprint_marker = "[PREPRINT - Not peer-reviewed] " if is_preprint else ""
        content = f"{preprint_marker}{abstract[:1800]}"

        # Build URL
        if doi:
            url = f"https://doi.org/{doi}"
        elif result.get("pmid"):
            url = f"https://pubmed.ncbi.nlm.nih.gov/{result['pmid']}/"
        else:
            url = f"https://europepmc.org/article/{source_db}/{result.get('id', '')}"

        return Evidence(
            content=content[:2000],
            citation=Citation(
                source="preprint" if is_preprint else "europepmc",
                title=title[:500],
                url=url,
                date=str(pub_year),
                authors=authors,
            ),
            relevance=0.75 if is_preprint else 0.9,
        )
```

### Step 3: Update Magentic Tools

**File:** `src/agents/tools.py` - Replace the biorxiv import:

```python
# REMOVE:
# from src.tools.biorxiv import BioRxivTool
# _biorxiv = BioRxivTool()

# ADD:
from src.tools.europepmc import EuropePMCTool

_europepmc = EuropePMCTool()


# UPDATE the search_preprints function:
@ai_function
async def search_preprints(query: str, max_results: int = 10) -> str:
    """Search Europe PMC for preprints and papers.

    Use this tool to find the latest research, including preprints
    from bioRxiv, medRxiv, and peer-reviewed papers.

    Args:
        query: Search terms (e.g., "long covid treatment")
        max_results: Maximum results to return (default 10)

    Returns:
        Formatted list of papers with abstracts and links
    """
    state = get_magentic_state()

    results = await _europepmc.search(query, max_results)
    if not results:
        return f"No papers found for: {query}"

    new_count = state.add_evidence(results)

    output = [f"Found {len(results)} papers ({new_count} new stored):\n"]
    for i, r in enumerate(results[:max_results], 1):
        title = r.citation.title
        date = r.citation.date
        source = r.citation.source
        content_clean = r.content[:300].replace("\n", " ")
        url = r.citation.url

        output.append(f"{i}. **{title}**")
        output.append(f"   Source: {source} | Date: {date}")
        output.append(f"   {content_clean}...")
        output.append(f"   URL: {url}\n")

    return "\n".join(output)
```

### Step 4: Update Search Handler (Simple Mode)

**File:** `src/tools/search_handler.py` - Update imports:

```python
# REMOVE:
# from src.tools.biorxiv import BioRxivTool

# ADD:
from src.tools.europepmc import EuropePMCTool
```

### Step 5: Delete Old BioRxiv Tests

```bash
# After all new tests pass:
rm tests/unit/tools/test_biorxiv.py
```

---

## Verification

```bash
# Run the new tests
uv run pytest tests/unit/tools/test_europepmc.py -v

# Run the integration test (real API)
uv run pytest tests/unit/tools/test_europepmc.py::TestEuropePMCIntegration -v

# Run all tests to ensure no regressions
uv run pytest tests/unit/ -v

# Manual verification
uv run python -c "
import asyncio
from src.tools.europepmc import EuropePMCTool
tool = EuropePMCTool()
results = asyncio.run(tool.search('long covid treatment', 3))
for r in results:
    print(f'- {r.citation.title}')
"
```

---

## Files Changed

| File | Action |
|------|--------|
| `src/tools/europepmc.py` | CREATE |
| `tests/unit/tools/test_europepmc.py` | CREATE |
| `src/agents/tools.py` | MODIFY (replace biorxiv import) |
| `src/tools/search_handler.py` | MODIFY (replace biorxiv import) |
| `src/tools/biorxiv.py` | DELETE (after verification) |
| `tests/unit/tools/test_biorxiv.py` | DELETE (after verification) |

---

## Rollback Plan

If issues arise:
1. Revert `src/agents/tools.py` to use BioRxivTool
2. Revert `src/tools/search_handler.py`
3. Keep `europepmc.py` for future use
docs/bugs/PHASE_02_PUBMED_QUERY_PREPROCESSING.md DELETED
@@ -1,355 +0,0 @@
1
- # Phase 02: PubMed Query Preprocessing
2
-
3
- **Priority:** P0 - Critical
4
- **Effort:** 2-3 hours
5
- **Dependencies:** None (can run parallel with Phase 01)
6
-
7
- ---
8
-
9
- ## Problem Statement
10
-
11
- PubMed receives raw natural language queries like "What medications show promise for Long COVID?" which include question words that pollute search results.
12
-
13
- ## Success Criteria
14
-
15
- - [ ] Question words stripped from queries
16
- - [ ] Medical synonyms expanded (Long COVID → PASC, etc.)
17
- - [ ] Relevant results returned for natural language questions
18
- - [ ] All existing tests pass
19
- - [ ] New tests cover query preprocessing
20
-
21
- ---
22
-
23
- ## TDD Implementation Order
24
-
25
- ### Step 1: Write Failing Tests
26
-
27
- **File:** `tests/unit/tools/test_query_utils.py`
28
-
29
- ```python
30
- """Unit tests for query preprocessing utilities."""
31
-
32
- import pytest
33
-
34
- from src.tools.query_utils import preprocess_query, expand_synonyms, strip_question_words
35
-
36
-
37
- @pytest.mark.unit
38
- class TestQueryPreprocessing:
39
- """Tests for query preprocessing."""
40
-
41
- def test_strip_question_words(self):
42
- """Test removal of question words."""
43
- assert strip_question_words("What drugs treat cancer") == "drugs treat cancer"
44
- assert strip_question_words("Which medications help diabetes") == "medications diabetes"
45
- assert strip_question_words("How can we cure alzheimer") == "cure alzheimer"
46
- assert strip_question_words("Is metformin effective") == "metformin effective"
47
-
48
- def test_strip_preserves_medical_terms(self):
49
- """Test that medical terms are preserved."""
50
- result = strip_question_words("What is the mechanism of metformin")
51
- assert "metformin" in result
52
- assert "mechanism" in result
53
-
54
- def test_expand_synonyms_long_covid(self):
55
- """Test Long COVID synonym expansion."""
56
- result = expand_synonyms("long covid treatment")
57
- assert "PASC" in result or "post-COVID" in result
58
-
59
- def test_expand_synonyms_alzheimer(self):
60
- """Test Alzheimer's synonym expansion."""
61
- result = expand_synonyms("alzheimer drug")
62
- assert "Alzheimer" in result
63
-
64
- def test_expand_synonyms_preserves_unknown(self):
65
- """Test that unknown terms are preserved."""
66
- result = expand_synonyms("metformin diabetes")
67
- assert "metformin" in result
68
- assert "diabetes" in result
69
-
70
- def test_preprocess_query_full_pipeline(self):
71
- """Test complete preprocessing pipeline."""
72
- raw = "What medications show promise for Long COVID?"
73
- result = preprocess_query(raw)
74
-
75
- # Should not contain question words
76
- assert "what" not in result.lower()
77
- assert "show" not in result.lower()
78
- assert "promise" not in result.lower()
79
-
80
- # Should contain expanded terms
81
- assert "PASC" in result or "post-COVID" in result or "long covid" in result.lower()
82
- assert "medications" in result.lower() or "drug" in result.lower()
83
-
84
- def test_preprocess_query_removes_punctuation(self):
85
- """Test that question marks are removed."""
86
- result = preprocess_query("Is metformin safe?")
87
- assert "?" not in result
88
-
89
- def test_preprocess_query_handles_empty(self):
90
- """Test handling of empty/whitespace queries."""
91
- assert preprocess_query("") == ""
92
- assert preprocess_query(" ") == ""
93
-
94
- def test_preprocess_query_already_clean(self):
95
- """Test that clean queries pass through."""
96
- clean = "metformin diabetes mechanism"
97
- result = preprocess_query(clean)
98
- assert "metformin" in result
99
- assert "diabetes" in result
100
- assert "mechanism" in result
101
- ```
102
-
103
- ### Step 2: Implement Query Utils
104
-
105
- **File:** `src/tools/query_utils.py`
106
-
107
- ```python
108
- """Query preprocessing utilities for biomedical search."""
109
-
110
- import re
111
- from typing import ClassVar
112
-
113
- # Question words and filler words to remove
114
- QUESTION_WORDS: set[str] = {
115
- # Question starters
116
- "what", "which", "how", "why", "when", "where", "who", "whom",
117
- # Auxiliary verbs in questions
118
- "is", "are", "was", "were", "do", "does", "did", "can", "could",
119
- "would", "should", "will", "shall", "may", "might",
120
- # Filler words in natural questions
121
- "show", "promise", "help", "believe", "think", "suggest",
122
- "possible", "potential", "effective", "useful", "good",
123
- # Articles (remove but less aggressively)
124
- "the", "a", "an",
125
- }
126
-
127
- # Medical synonym expansions
128
- SYNONYMS: dict[str, list[str]] = {
129
- "long covid": [
130
- "long COVID",
131
- "PASC",
132
- "post-acute sequelae of SARS-CoV-2",
133
- "post-COVID syndrome",
134
- "post-COVID-19 condition",
135
- ],
136
- "alzheimer": [
137
- "Alzheimer's disease",
138
- "Alzheimer disease",
139
- "AD",
140
- "Alzheimer dementia",
141
- ],
142
- "parkinson": [
143
- "Parkinson's disease",
144
- "Parkinson disease",
145
- "PD",
146
- ],
147
- "diabetes": [
148
- "diabetes mellitus",
149
- "type 2 diabetes",
150
- "T2DM",
151
- "diabetic",
152
- ],
153
- "cancer": [
154
- "cancer",
155
- "neoplasm",
156
- "tumor",
157
- "malignancy",
158
- "carcinoma",
159
- ],
160
- "heart disease": [
161
- "cardiovascular disease",
162
- "CVD",
163
- "coronary artery disease",
164
- "heart failure",
165
- ],
166
- }
167
-
168
-
169
- def strip_question_words(query: str) -> str:
170
- """
171
- Remove question words and filler terms from query.
172
-
173
- Args:
174
- query: Raw query string
175
-
176
- Returns:
177
- Query with question words removed
178
- """
179
- words = query.lower().split()
180
- filtered = [w for w in words if w not in QUESTION_WORDS]
181
- return " ".join(filtered)
182
-
183
-
184
- def expand_synonyms(query: str) -> str:
185
- """
186
- Expand medical terms to include synonyms.
187
-
188
- Args:
189
- query: Query string
190
-
191
- Returns:
192
- Query with synonym expansions in OR groups
193
- """
194
- result = query.lower()
195
-
196
- for term, expansions in SYNONYMS.items():
197
- if term in result:
198
- # Create OR group: ("term1" OR "term2" OR "term3")
199
- or_group = " OR ".join([f'"{exp}"' for exp in expansions])
200
- result = result.replace(term, f"({or_group})")
201
-
202
- return result
203
-
204
-
205
- def preprocess_query(raw_query: str) -> str:
206
- """
207
- Full preprocessing pipeline for PubMed queries.
208
-
209
- Pipeline:
210
- 1. Strip whitespace and punctuation
211
- 2. Remove question words
212
- 3. Expand medical synonyms
213
-
214
- Args:
215
- raw_query: Natural language query from user
216
-
217
- Returns:
218
- Optimized query for PubMed
219
- """
220
- if not raw_query or not raw_query.strip():
221
- return ""
222
-
223
- # Remove question marks and extra whitespace
224
- query = raw_query.replace("?", "").strip()
225
- query = re.sub(r"\s+", " ", query)
226
-
227
- # Strip question words
228
- query = strip_question_words(query)
229
-
230
- # Expand synonyms
231
- query = expand_synonyms(query)
232
-
233
- return query.strip()
234
- ```
235
-
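- Before wiring this into PubMed, the two stages can be sanity-checked in isolation; the outputs below follow directly from the word list and synonym table above:
- 
- ```python
- >>> from src.tools.query_utils import strip_question_words, expand_synonyms
- >>> strip_question_words("is metformin effective for cancer treatment")
- 'metformin for cancer treatment'
- >>> expand_synonyms("metformin for cancer treatment")
- 'metformin for ("cancer" OR "neoplasm" OR "tumor" OR "malignancy" OR "carcinoma") treatment'
- ```
- 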
236
- ### Step 3: Update PubMed Tool
237
-
238
- **File:** `src/tools/pubmed.py` - Add preprocessing:
239
-
240
- ```python
241
- # Add import at top:
242
- from src.tools.query_utils import preprocess_query
243
-
244
- # Update search method:
245
- @retry(
246
- stop=stop_after_attempt(3),
247
- wait=wait_exponential(multiplier=1, min=1, max=10),
248
- reraise=True,
249
- )
250
- async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
251
- """
252
- Search PubMed and return evidence.
253
- """
254
- await self._rate_limit()
255
-
256
- # PREPROCESS QUERY
257
- clean_query = preprocess_query(query)
258
- if not clean_query:
259
- clean_query = query # Fallback to original if preprocessing empties it
260
-
261
- async with httpx.AsyncClient(timeout=30.0) as client:
262
- search_params = self._build_params(
263
- db="pubmed",
264
- term=clean_query, # Use preprocessed query
265
- retmax=max_results,
266
- sort="relevance",
267
- )
268
- # ... rest unchanged
269
- ```
270
-
271
- ### Step 4: Update PubMed Tests
272
-
273
- **File:** `tests/unit/tools/test_pubmed.py` - Add preprocessing test:
274
-
275
- ```python
276
- @pytest.mark.asyncio
277
- async def test_search_preprocesses_query(self, pubmed_tool, mock_httpx_client):
278
- """Test that queries are preprocessed before search."""
279
- # This test verifies the integration - the actual preprocessing
280
- # is tested in test_query_utils.py
281
-
282
- mock_httpx_client.get.return_value = httpx.Response(
283
- 200,
284
- json={"esearchresult": {"idlist": []}},
285
- )
286
-
287
- # Natural language query
288
- await pubmed_tool.search("What drugs help with Long COVID?")
289
-
290
- # Verify the call was made (preprocessing happens internally)
291
- assert mock_httpx_client.get.called
292
- ```
293
-
294
- ---
295
-
296
- ## Verification
297
-
298
- ```bash
299
- # Run query utils tests
300
- uv run pytest tests/unit/tools/test_query_utils.py -v
301
-
302
- # Run pubmed tests
303
- uv run pytest tests/unit/tools/test_pubmed.py -v
304
-
305
- # Run all tests
306
- uv run pytest tests/unit/ -v
307
-
308
- # Manual verification
309
- uv run python -c "
310
- from src.tools.query_utils import preprocess_query
311
-
312
- queries = [
313
- 'What medications show promise for Long COVID?',
314
- 'Is metformin effective for cancer treatment?',
315
- 'How can we treat Alzheimer with existing drugs?',
316
- ]
317
-
318
- for q in queries:
319
- print(f'Input: {q}')
320
- print(f'Output: {preprocess_query(q)}')
321
- print()
322
- "
323
- ```
324
-
325
- Expected output:
326
- ```
327
- Input: What medications show promise for Long COVID?
328
- Output: medications for ("long COVID" OR "PASC" OR "post-acute sequelae of SARS-CoV-2" OR "post-COVID syndrome" OR "post-COVID-19 condition")
329
-
330
- Input: Is metformin effective for cancer treatment?
331
- Output: metformin for ("cancer" OR "neoplasm" OR "tumor" OR "malignancy" OR "carcinoma") treatment
332
-
333
- Input: How can we treat Alzheimer with existing drugs?
334
- Output: we treat ("Alzheimer's disease" OR "Alzheimer disease" OR "AD" OR "Alzheimer dementia") with existing drugs
335
- ```
336
-
337
- ---
338
-
339
- ## Files Changed
340
-
341
- | File | Action |
342
- |------|--------|
343
- | `src/tools/query_utils.py` | CREATE |
344
- | `tests/unit/tools/test_query_utils.py` | CREATE |
345
- | `src/tools/pubmed.py` | MODIFY (add preprocessing) |
346
- | `tests/unit/tools/test_pubmed.py` | MODIFY (add integration test) |
347
-
348
- ---
349
-
350
- ## Future Enhancements (Out of Scope)
351
-
352
- - MeSH term lookup via NCBI API (rough sketch below)
353
- - Drug name normalization (brand → generic)
354
- - Disease ontology integration (UMLS)
355
- - Query intent classification
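- 
- Out of scope for this fix, but for orientation: a minimal sketch of the MeSH lookup against NCBI E-utilities. The esearch endpoint and response shape are real; the function name and where it would live are hypothetical:
- 
- ```python
- """Hypothetical sketch only - not part of this fix."""
- 
- import httpx
- 
- ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
- 
- 
- async def lookup_mesh_uids(term: str) -> list[str]:
-     """Return MeSH descriptor UIDs matching a free-text term (hypothetical helper)."""
-     params = {"db": "mesh", "term": term, "retmode": "json"}
-     async with httpx.AsyncClient(timeout=30.0) as client:
-         response = await client.get(ESEARCH_URL, params=params)
-         response.raise_for_status()
-     return response.json().get("esearchresult", {}).get("idlist", [])
- ```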
 
docs/bugs/PHASE_03_CLINICALTRIALS_FILTERING.md DELETED
@@ -1,386 +0,0 @@
1
- # Phase 03: ClinicalTrials.gov Filtering
2
-
3
- **Priority:** P1 - High
4
- **Effort:** 1-2 hours
5
- **Dependencies:** None (can run parallel with Phase 01 & 02)
6
-
7
- ---
8
-
9
- ## Problem Statement
10
-
11
- ClinicalTrials.gov returns ALL matching trials including:
12
- - Withdrawn/Terminated trials (no useful data)
13
- - Observational studies (not drug interventions)
14
- - Phase 1 trials (safety only, no efficacy)
15
-
16
- For drug repurposing, we need interventional studies with efficacy data.
17
-
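- Concretely, the fix adds two request parameters; Step 2 below has the full change. A minimal sketch of the request this plan aims for (filter values are the ones this plan assumes the v2 API accepts):
- 
- ```python
- # Sketch: the two filters this plan adds to the API v2 request.
- params = {
-     "query.term": "long covid treatment",
-     "filter.overallStatus": "COMPLETED|ACTIVE_NOT_RECRUITING|RECRUITING|ENROLLING_BY_INVITATION",
-     "filter.studyType": "INTERVENTIONAL",  # drops observational studies
- }
- ```
- 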
18
- ## Success Criteria
19
-
20
- - [ ] Only interventional studies returned
21
- - [ ] Withdrawn/terminated trials filtered out
22
- - [ ] Phase information included in results
23
- - [ ] All existing tests pass
24
- - [ ] New tests cover filtering
25
-
26
- ---
27
-
28
- ## TDD Implementation Order
29
-
30
- ### Step 1: Write Failing Tests
31
-
32
- **File:** `tests/unit/tools/test_clinicaltrials.py` - Add filter tests:
33
-
34
- ```python
35
- """Unit tests for ClinicalTrials.gov tool."""
36
-
37
- import pytest
38
- from unittest.mock import patch, MagicMock
39
-
40
- from src.tools.clinicaltrials import ClinicalTrialsTool
41
- from src.utils.models import Evidence
42
-
43
-
44
- @pytest.mark.unit
45
- class TestClinicalTrialsTool:
46
- """Tests for ClinicalTrialsTool."""
47
-
48
- @pytest.fixture
49
- def tool(self):
50
- return ClinicalTrialsTool()
51
-
52
- def test_tool_name(self, tool):
53
- assert tool.name == "clinicaltrials"
54
-
55
- @pytest.mark.asyncio
56
- async def test_search_uses_filters(self, tool):
57
- """Test that search applies status and type filters."""
58
- mock_response = MagicMock()
59
- mock_response.json.return_value = {"studies": []}
60
- mock_response.raise_for_status = MagicMock()
61
-
62
- with patch("requests.get", return_value=mock_response) as mock_get:
63
- await tool.search("test query", max_results=5)
64
-
65
- # Verify filters were applied
66
- call_args = mock_get.call_args
67
- params = call_args.kwargs.get("params", {})
68
-
69
- # Should filter for active/completed studies
70
- assert "filter.overallStatus" in params
71
- assert "COMPLETED" in params["filter.overallStatus"]
72
- assert "RECRUITING" in params["filter.overallStatus"]
73
-
74
- # Should filter for interventional studies
75
- assert "filter.studyType" in params
76
- assert "INTERVENTIONAL" in params["filter.studyType"]
77
-
78
- @pytest.mark.asyncio
79
- async def test_search_returns_evidence(self, tool):
80
- """Test that search returns Evidence objects."""
81
- mock_study = {
82
- "protocolSection": {
83
- "identificationModule": {
84
- "nctId": "NCT12345678",
85
- "briefTitle": "Metformin for Long COVID Treatment",
86
- },
87
- "statusModule": {
88
- "overallStatus": "COMPLETED",
89
- "startDateStruct": {"date": "2023-01-01"},
90
- },
91
- "descriptionModule": {
92
- "briefSummary": "A study examining metformin for Long COVID symptoms.",
93
- },
94
- "designModule": {
95
- "phases": ["PHASE2", "PHASE3"],
96
- },
97
- "conditionsModule": {
98
- "conditions": ["Long COVID", "PASC"],
99
- },
100
- "armsInterventionsModule": {
101
- "interventions": [{"name": "Metformin"}],
102
- },
103
- }
104
- }
105
-
106
- mock_response = MagicMock()
107
- mock_response.json.return_value = {"studies": [mock_study]}
108
- mock_response.raise_for_status = MagicMock()
109
-
110
- with patch("requests.get", return_value=mock_response):
111
- results = await tool.search("long covid metformin", max_results=5)
112
-
113
- assert len(results) == 1
114
- assert isinstance(results[0], Evidence)
115
- assert "Metformin" in results[0].citation.title
116
- assert "PHASE2" in results[0].content or "Phase" in results[0].content
117
-
118
- @pytest.mark.asyncio
119
- async def test_search_includes_phase_info(self, tool):
120
- """Test that phase information is included in content."""
121
- mock_study = {
122
- "protocolSection": {
123
- "identificationModule": {
124
- "nctId": "NCT12345678",
125
- "briefTitle": "Test Study",
126
- },
127
- "statusModule": {
128
- "overallStatus": "RECRUITING",
129
- "startDateStruct": {"date": "2024-01-01"},
130
- },
131
- "descriptionModule": {
132
- "briefSummary": "Test summary.",
133
- },
134
- "designModule": {
135
- "phases": ["PHASE3"],
136
- },
137
- "conditionsModule": {"conditions": ["Test"]},
138
- "armsInterventionsModule": {"interventions": []},
139
- }
140
- }
141
-
142
- mock_response = MagicMock()
143
- mock_response.json.return_value = {"studies": [mock_study]}
144
- mock_response.raise_for_status = MagicMock()
145
-
146
- with patch("requests.get", return_value=mock_response):
147
- results = await tool.search("test", max_results=5)
148
-
149
- # Phase should be in content
150
- assert "PHASE3" in results[0].content or "Phase 3" in results[0].content
151
-
152
- @pytest.mark.asyncio
153
- async def test_search_empty_results(self, tool):
154
- """Test handling of empty results."""
155
- mock_response = MagicMock()
156
- mock_response.json.return_value = {"studies": []}
157
- mock_response.raise_for_status = MagicMock()
158
-
159
- with patch("requests.get", return_value=mock_response):
160
- results = await tool.search("nonexistent xyz 12345", max_results=5)
161
- assert results == []
162
-
163
-
164
- @pytest.mark.integration
165
- class TestClinicalTrialsIntegration:
166
- """Integration tests with real API."""
167
-
168
- @pytest.mark.asyncio
169
- async def test_real_api_returns_interventional(self):
170
- """Test that real API returns interventional studies."""
171
- tool = ClinicalTrialsTool()
172
- results = await tool.search("long covid treatment", max_results=3)
173
-
174
- # Should get results
175
- assert len(results) > 0
176
-
177
- # Results should mention interventions or treatments
178
- all_content = " ".join([r.content.lower() for r in results])
179
- has_intervention = (
180
- "intervention" in all_content
181
- or "treatment" in all_content
182
- or "drug" in all_content
183
- or "phase" in all_content
184
- )
185
- assert has_intervention
186
- ```
187
-
188
- ### Step 2: Update ClinicalTrials Tool
189
-
190
- **File:** `src/tools/clinicaltrials.py` - Add filters:
191
-
192
- ```python
193
- """ClinicalTrials.gov search tool using API v2."""
194
-
195
- import asyncio
196
- from typing import Any, ClassVar
197
-
198
- import requests
199
- from tenacity import retry, stop_after_attempt, wait_exponential
200
-
201
- from src.utils.exceptions import SearchError
202
- from src.utils.models import Citation, Evidence
203
-
204
-
205
- class ClinicalTrialsTool:
206
- """Search tool for ClinicalTrials.gov.
207
-
208
- Note: Uses `requests` library instead of `httpx` because ClinicalTrials.gov's
209
- WAF blocks httpx's TLS fingerprint. The `requests` library is not blocked.
210
- See: https://clinicaltrials.gov/data-api/api
211
- """
212
-
213
- BASE_URL = "https://clinicaltrials.gov/api/v2/studies"
214
-
215
- # Fields to retrieve
216
- FIELDS: ClassVar[list[str]] = [
217
- "NCTId",
218
- "BriefTitle",
219
- "Phase",
220
- "OverallStatus",
221
- "Condition",
222
- "InterventionName",
223
- "StartDate",
224
- "BriefSummary",
225
- ]
226
-
227
- # Status filter: Only active/completed studies with potential data
228
- STATUS_FILTER = "COMPLETED|ACTIVE_NOT_RECRUITING|RECRUITING|ENROLLING_BY_INVITATION"
229
-
230
- # Study type filter: Only interventional (drug/treatment studies)
231
- STUDY_TYPE_FILTER = "INTERVENTIONAL"
232
-
233
- @property
234
- def name(self) -> str:
235
- return "clinicaltrials"
236
-
237
- @retry(
238
- stop=stop_after_attempt(3),
239
- wait=wait_exponential(multiplier=1, min=1, max=10),
240
- reraise=True,
241
- )
242
- async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
243
- """Search ClinicalTrials.gov for interventional studies.
244
-
245
- Args:
246
- query: Search query (e.g., "metformin alzheimer")
247
- max_results: Maximum results to return (max 100)
248
-
249
- Returns:
250
- List of Evidence objects from clinical trials
251
- """
252
- params: dict[str, str | int] = {
253
- "query.term": query,
254
- "pageSize": min(max_results, 100),
255
- "fields": "|".join(self.FIELDS),
256
- # FILTERS - Only interventional, active/completed studies
257
- "filter.overallStatus": self.STATUS_FILTER,
258
- "filter.studyType": self.STUDY_TYPE_FILTER,
259
- }
260
-
261
- try:
262
- # Run blocking requests.get in a separate thread for async compatibility
263
- response = await asyncio.to_thread(
264
- requests.get,
265
- self.BASE_URL,
266
- params=params,
267
- headers={"User-Agent": "DeepCritical-Research-Agent/1.0"},
268
- timeout=30,
269
- )
270
- response.raise_for_status()
271
-
272
- data = response.json()
273
- studies = data.get("studies", [])
274
- return [self._study_to_evidence(study) for study in studies[:max_results]]
275
-
276
- except requests.HTTPError as e:
277
- raise SearchError(f"ClinicalTrials.gov API error: {e}") from e
278
- except requests.RequestException as e:
279
- raise SearchError(f"ClinicalTrials.gov request failed: {e}") from e
280
-
281
- def _study_to_evidence(self, study: dict[str, Any]) -> Evidence:
282
- """Convert a clinical trial study to Evidence."""
283
- # Navigate nested structure
284
- protocol = study.get("protocolSection", {})
285
- id_module = protocol.get("identificationModule", {})
286
- status_module = protocol.get("statusModule", {})
287
- desc_module = protocol.get("descriptionModule", {})
288
- design_module = protocol.get("designModule", {})
289
- conditions_module = protocol.get("conditionsModule", {})
290
- arms_module = protocol.get("armsInterventionsModule", {})
291
-
292
- nct_id = id_module.get("nctId", "Unknown")
293
- title = id_module.get("briefTitle", "Untitled Study")
294
- status = status_module.get("overallStatus", "Unknown")
295
- start_date = status_module.get("startDateStruct", {}).get("date", "Unknown")
296
-
297
- # Phases come back as a list (e.g., ["PHASE2", "PHASE3"]); keep them all
298
- phases = design_module.get("phases", [])
299
- phase = "/".join(phases) if phases else "Not Applicable"
300
-
301
- # Get conditions
302
- conditions = conditions_module.get("conditions", [])
303
- conditions_str = ", ".join(conditions[:3]) if conditions else "Unknown"
304
-
305
- # Get interventions
306
- interventions = arms_module.get("interventions", [])
307
- intervention_names = [i.get("name", "") for i in interventions[:3]]
308
- interventions_str = ", ".join(intervention_names) if intervention_names else "Unknown"
309
-
310
- # Get summary
311
- summary = desc_module.get("briefSummary", "No summary available.")
312
-
313
- # Build content with key trial info
314
- content = (
315
- f"{summary[:500]}... "
316
- f"Trial Phase: {phase}. "
317
- f"Status: {status}. "
318
- f"Conditions: {conditions_str}. "
319
- f"Interventions: {interventions_str}."
320
- )
321
-
322
- return Evidence(
323
- content=content[:2000],
324
- citation=Citation(
325
- source="clinicaltrials",
326
- title=title[:500],
327
- url=f"https://clinicaltrials.gov/study/{nct_id}",
328
- date=start_date,
329
- authors=[], # Trials don't have traditional authors
330
- ),
331
- relevance=0.85, # Trials are highly relevant for repurposing
332
- )
333
- ```
334
-
335
- ---
336
-
337
- ## Verification
338
-
339
- ```bash
340
- # Run clinicaltrials tests
341
- uv run pytest tests/unit/tools/test_clinicaltrials.py -v
342
-
343
- # Run integration test (real API)
344
- uv run pytest tests/unit/tools/test_clinicaltrials.py::TestClinicalTrialsIntegration -v
345
-
346
- # Run all tests
347
- uv run pytest tests/unit/ -v
348
-
349
- # Manual verification
350
- uv run python -c "
351
- import asyncio
352
- from src.tools.clinicaltrials import ClinicalTrialsTool
353
-
354
- tool = ClinicalTrialsTool()
355
- results = asyncio.run(tool.search('long covid treatment', 3))
356
-
357
- for r in results:
358
- print(f'Title: {r.citation.title}')
359
- print(f'Content: {r.content[:200]}...')
360
- print()
361
- "
362
- ```
363
-
364
- ---
365
-
366
- ## Files Changed
367
-
368
- | File | Action |
369
- |------|--------|
370
- | `src/tools/clinicaltrials.py` | MODIFY (add filters) |
371
- | `tests/unit/tools/test_clinicaltrials.py` | MODIFY (add filter tests) |
372
-
373
- ---
374
-
375
- ## API Filter Reference
376
-
377
- ClinicalTrials.gov API v2 supports these filters:
378
-
379
- | Parameter | Values | Purpose |
380
- |-----------|--------|---------|
381
- | `filter.overallStatus` | COMPLETED, RECRUITING, etc. | Trial status |
382
- | `filter.studyType` | INTERVENTIONAL, OBSERVATIONAL | Study design |
383
- | `filter.phase` | PHASE1, PHASE2, PHASE3, PHASE4 | Trial phase |
384
- | `filter.geo` | Country codes | Geographic filter |
385
-
386
- See: https://clinicaltrials.gov/data-api/api
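- 
- The tool above uses only the first two filters. If Phase 1 noise ever needs to be excluded server-side, the table suggests a single extra parameter would do it. This is a hypothetical extension, and it assumes `filter.phase` takes pipe-separated values like the other filters:
- 
- ```python
- # Hypothetical extension: keep only efficacy-stage trials.
- params: dict[str, str | int] = {
-     "query.term": "long covid treatment",
-     "filter.overallStatus": "COMPLETED|RECRUITING",
-     "filter.studyType": "INTERVENTIONAL",
-     "filter.phase": "PHASE2|PHASE3|PHASE4",  # assumption: same separator convention
- }
- ```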