Projects
2026
Python · RAG · LLM · Systematic Review · BioNLP · Oncology

HGSOC Liquid Biopsy Expert — RAG System

Domain-specific expert assistant for high-grade serous ovarian cancer liquid biopsy research, grounded in a PROSPERO-registered systematic review.

Overview

This RAG system answers clinical and research questions about liquid biopsy biomarkers in high-grade serous ovarian cancer (HGSOC). Unlike generic biomedical RAG systems, it is grounded in a curated two-tier corpus: the full-text papers from a PROSPERO-registered systematic review (the evidence base) and a structured research wiki covering HGSOC biology, liquid biopsy methodology, and related literature.

Tech & Architecture

  • Corpus curation: Docling-based extraction pipeline for full-text PDFs (tables, figures, supplementary materials); structured Markdown knowledge base (~180 documents across two tiers)
  • Retrieval: hybrid BM25 + dense retrieval (MedCPT / BioLORD adapters) over deterministic JSONL chunk exports; metadata filters expose corpus tier at query time
  • Evaluation: three-arm comparison (long-context agent vs. RAG vs. QLoRA fine-tune) scored on field accuracy, citation accuracy, and hallucination rate against extraction_v2.db (PROSPERO CRD420261405303)
  • Search automation: PICO-to-boolean query generation benchmarked against a frozen 2,927-record human PubMed search (template arm recovered 134 of 158 priority records, about 85%)
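
The hybrid retrieval step with a tier filter can be sketched as follows. This is a minimal illustration, not the project's implementation: the JSONL field names (`chunk_id`, `tier`, `text`), the tier labels (`evidence`, `wiki`), and the use of reciprocal rank fusion to combine the BM25 and dense rankings are all assumptions, and the two input rankings are hard-coded placeholders standing in for real BM25 and MedCPT / BioLORD results.

```python
import json
from collections import defaultdict

# Hypothetical JSONL chunk export: one JSON object per line.
# Field names and tier labels are illustrative assumptions.
CHUNKS_JSONL = """\
{"chunk_id": "sr-001", "tier": "evidence", "text": "ctDNA TP53 mutation detection in plasma"}
{"chunk_id": "sr-002", "tier": "evidence", "text": "CA-125 kinetics during chemotherapy"}
{"chunk_id": "wiki-001", "tier": "wiki", "text": "Overview of liquid biopsy methodology"}
"""


def load_chunks(jsonl_text):
    """Parse a JSONL export into a list of chunk dicts."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def allowed_ids(chunks, tier=None):
    """Return the chunk ids that pass the tier filter (None means no filter)."""
    return {c["chunk_id"] for c in chunks if tier is None or c["tier"] == tier}


def rrf_fuse(rankings, allowed, k=60):
    """Fuse several ranked id lists with reciprocal rank fusion.

    Ids outside `allowed` are dropped before ranks are assigned, so the
    metadata filter is applied ahead of fusion. Ties break on chunk id
    to keep the output deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        kept = [cid for cid in ranking if cid in allowed]
        for rank, cid in enumerate(kept, start=1):
            scores[cid] += 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


chunks = load_chunks(CHUNKS_JSONL)

# Placeholder rankings: in a real pipeline these would come from a BM25
# index and a dense encoder, respectively.
bm25_ranking = ["sr-001", "wiki-001", "sr-002"]
dense_ranking = ["wiki-001", "sr-001", "sr-002"]

fused_all = rrf_fuse([bm25_ranking, dense_ranking], allowed_ids(chunks))
fused_evidence = rrf_fuse(
    [bm25_ranking, dense_ranking], allowed_ids(chunks, tier="evidence")
)
print(fused_all)
print(fused_evidence)
```

Filtering before fusion keeps ranks comparable within the selected tier; filtering after fusion would be an equally reasonable choice if cross-tier rank positions should influence the score.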

Results & Highlights

  • PROSPERO-registered systematic review as the evaluation ground truth, giving the benchmark stronger provenance than LLM-annotated alternatives
  • Full PRISMA pipeline automated: search recall benchmarking (Task 1), title/abstract screening with per-criterion scoring (Task 2), structured data extraction (Task 3)
  • Per-criterion screening labels captured for 42 PI-confirmed full-text decisions, including 9 reclassifications that mark the boundary cases where automated screeners tend to fail
  • Two-tier corpus design decouples SR-quality extraction (benchmark-grounded) from broader domain Q&A (wiki-grounded), enabling both rigorous evaluation and expert-level synthesis
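
The search recall benchmark (Task 1) reduces to set overlap between the records a generated query returns and a frozen reference set. The sketch below assumes records are compared by a stable identifier such as a PMID; the `recall` helper and the toy identifiers are hypothetical, and only the 134/158 figure comes from the benchmark reported above.

```python
def recall(retrieved_ids, reference_ids):
    """Return (recall, missed ids) for a retrieved set against a reference set."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference set is empty")
    hits = reference & set(retrieved_ids)
    return len(hits) / len(reference), sorted(reference - hits)


# Toy identifiers for illustration only; not real records.
priority = ["pmid-a", "pmid-b", "pmid-c", "pmid-d"]
generated = ["pmid-a", "pmid-c", "pmid-d", "pmid-x"]

toy_recall, missed = recall(generated, priority)
print(f"toy recall: {toy_recall:.0%}, missed: {missed}")

# Recall implied by the reported template-arm result (134 of 158 priority records).
reported_recall = 134 / 158
print(f"reported recall: {reported_recall:.1%}")
```

Returning the missed identifiers alongside the score makes it easy to inspect which priority records a query template fails to recover.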