Add DISGENET API integration as a new database tool #288
Open
MoiraClimentGispert wants to merge 2 commits into snap-stanford:main
Integrate DISGENET REST API endpoints as a new data source in Biomni, enabling the agent to query gene-disease associations, variant-disease associations, and related biomedical data directly through the DISGENET API.

Main changes:
- Add query_disgenet functions to biomni/tool/database.py with API-based data retrieval, normalization, and result parsing
- Add DISGENET evidence tool to biomni/tool/literature.py
- Add tool descriptions for all new DISGENET tools
- Add DISGENET_API_KEY preflight check in agent initialization

Bug Fixes & Enhancements:
- biomni/llm.py, line 38: replaced config.llm_model with config.llm (the self attribute defined in config.py)
- biomni/config.py: edited the default configuration so that all settings are centralized through environment variables
- biomni/env_desc.py and env_desc_cm.py: removed all references to DisGeNET.parquet from the codebase
- biomni/agent/a1.py: moved load_dotenv() before the import of default_config to avoid missing environment variables; added self.disgenet_api_available = self._ensure_disgenet_api_key()
- biomni/tool/database.py, line 87: query_llm_for_api now handles different response formats, fixing "Failed to parse LLM response, no attribute .strip"
- Line 67: use string replacement instead of .format() to avoid issues with curly braces in JSON

Required .env variables:
- DISGENET_API_KEY: required for DISGENET API access

If no API key is provided, a warning appears with the option to provide the key or to continue without it.
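To illustrate why string replacement is safer than .format() when a prompt template contains literal JSON, here is a minimal sketch. The template text, the {user_prompt} placeholder, and the fill_prompt helper are hypothetical and are not taken from biomni/tool/database.py; only the general Python behavior is being shown.

```python
# Hypothetical prompt template with literal JSON braces; the real template
# in biomni/tool/database.py may look different.
PROMPT_TEMPLATE = (
    'Return a JSON object such as {"endpoint": "...", "params": {}}.\n'
    "User request: {user_prompt}\n"
)


def fill_prompt(template: str, user_prompt: str) -> str:
    """Substitute only the named placeholder and leave all other braces alone."""
    return template.replace("{user_prompt}", user_prompt)


filled = fill_prompt(PROMPT_TEMPLATE, "genes associated with asthma")

# str.format() treats the JSON braces as replacement fields, so it raises
# instead of returning the filled prompt.
try:
    PROMPT_TEMPLATE.format(user_prompt="genes associated with asthma")
    format_failed = False
except (KeyError, IndexError, ValueError):
    format_failed = True
```

An alternative would be to escape every literal brace in the template as {{ and }}, but that makes JSON examples in the prompt harder to read and easy to break when the template is edited.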
Summary
This PR by MedBioinformatics adds DISGENET as a new data source in Biomni, following the Adding New Data (web API) contribution guidelines.
DISGENET™ (https://disgenet.com/) is an integrated, evidence-scored knowledge layer that connects genes, variants, diseases, traits, and therapeutics into a unified semantic framework designed for computational use, clinical interpretation, and translational research.
What's included
Core tool (new data source via web API):
- biomni/tool/database.py: New query_disgenet_api() function that translates natural-language prompts into DISGENET REST API calls. Includes automatic entity normalization (disease names → UMLS CUI, gene names → NCBI Gene ID), dynamic endpoint selection, and structured result parsing.
- biomni/tool/literature.py: New query_disgenet_evidence() evidence retrieval tool for DISGENET literature-backed association queries.
- biomni/tool/tool_description/database.py: Tool description for query_disgenet_api() following the existing format.
- biomni/tool/tool_description/literature.py: Tool description for query_disgenet_evidence() following the existing format.

Agent integration:
- biomni/agent/a1.py: Added DISGENET_API_KEY preflight check during agent initialization (warns early if the key is missing; optionally prompts in interactive sessions). Moved the load_dotenv call earlier to ensure env vars are available before imports.
- biomni/config.py: Added DISGENET-related configuration options.
- .env.example: Added DISGENET_API_KEY entry.

Key features of the DISGENET tool
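As one illustration, the DISGENET_API_KEY preflight check described above could be sketched roughly as follows. This is not the code from biomni/agent/a1.py: the function name ensure_disgenet_api_key, its interactive and ask parameters, and the message wording are assumptions made for the example.

```python
import os
import warnings
from typing import Callable, Optional


def ensure_disgenet_api_key(
    interactive: bool = False,
    ask: Optional[Callable[[str], str]] = None,
) -> bool:
    """Return True if DISGENET_API_KEY is available, warning otherwise.

    Hypothetical helper: `ask` defaults to input() and is injectable so the
    prompt can be replaced in tests or non-terminal front ends.
    """
    if os.environ.get("DISGENET_API_KEY", "").strip():
        return True

    if interactive:
        prompt_fn = ask or input
        key = prompt_fn(
            "DISGENET_API_KEY is not set. Enter a key, "
            "or press Enter to continue without it: "
        ).strip()
        if key:
            # Make the key visible to any tool that reads the environment later.
            os.environ["DISGENET_API_KEY"] = key
            return True

    warnings.warn(
        "DISGENET_API_KEY is not set; DISGENET tools will be unavailable.",
        stacklevel=2,
    )
    return False
```

Returning a boolean rather than raising lets the agent start without DISGENET access, which matches the PR's stated behavior of allowing the user to continue without a key.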
Test prompt
Requirements if you want to use DISGENET tool
- A DISGENET API key (DISGENET_API_KEY environment variable)

Test plan
- query_disgenet_api() with GDA, VDA, and DDA queries
- … biomni/eval/benchmark_results/)