PyDxAI Agentic Intelligence — System Progress Report (Nov 4, 2025)

Today marks a major milestone in the evolution of PyDxAI, our autonomous medical reasoning system designed to combine large language model (LLM) intelligence with structured medical retrieval, self-reflection, and memory management.

For the first time, every layer of the pipeline—from query sharpening to vector retrieval, agentic web search, and contextual memory saving—worked seamlessly in a complete, closed loop.

🧩 The Core Idea: From Simple Question to Intelligent Response

The user prompt that triggered today’s full agentic flow was:

“The patient comes with cough, fever, and headache for four days. What is the management?”

A simple question on the surface—but it represents exactly the kind of everyday clinical scenario where PyDxAI must interpret vague input, retrieve high-quality references, and deliver a precise, evidence-based answer.

The system begins by sharpening the user query. The “front LLM” (DeepSeek or Mistral backend) normalizes phrasing and ensures context clarity—turning free text into a semantically structured medical question.

This step converts raw free text such as “the patient come with cough, fever, headache” into a standardized diagnostic request suitable for Retrieval-Augmented Generation (RAG).
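
In rough outline, the sharpening step is a single LLM call with a normalization instruction. The sketch below is a minimal illustration only: the sharpen_query function, the prompt wording, and the llm_complete helper are hypothetical stand-ins for PyDxAI’s actual front-LLM interface.

```python
# Minimal sketch of query sharpening. `llm_complete` is a hypothetical helper
# standing in for whatever client wraps the DeepSeek/Mistral backend.

SHARPEN_PROMPT = (
    "Rewrite the following user message as a clear, grammatical clinical "
    "question. Preserve all symptoms, durations, and patient details. "
    "Return only the rewritten question.\n\nUser message: {raw}"
)

def sharpen_query(raw: str, llm_complete) -> str:
    """Normalize free-text input into a structured medical question."""
    return llm_complete(SHARPEN_PROMPT.format(raw=raw)).strip()

# e.g. "the patient come with cough, fever, headache for 4 days"
# -> "A patient presents with cough, fever, and headache for four days.
#     What is the recommended management?"
```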


🔍 Smart Retrieval: Context from Trusted Medical Sources

Once sharpened, PyDxAI’s retriever selector analyzes the query type.
Because this prompt matched the symptom_check intent, the system automatically chose the VectorStoreRetriever module linked to Qdrant, our local vector database at localhost:6333.
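
In outline, the selector maps the detected intent to a retriever and queries Qdrant for the nearest passages. The sketch below uses the qdrant-client library; the collection names, intent table, and embed() helper are illustrative assumptions, not the production configuration.

```python
from qdrant_client import QdrantClient

# Hypothetical intent-to-collection map; names are illustrative.
INTENT_COLLECTIONS = {
    "symptom_check": "medical_references",
    "drug_lookup": "formulary",
}

client = QdrantClient(host="localhost", port=6333)

def retrieve(query: str, intent: str, embed, top_k: int = 3):
    """Embed the sharpened query and fetch the closest passages for its intent."""
    hits = client.search(
        collection_name=INTENT_COLLECTIONS[intent],
        query_vector=embed(query),  # embed() must return a 768-dim vector
        limit=top_k,
    )
    return [((h.payload or {}).get("source"), (h.payload or {}).get("text"), h.score)
            for h in hits]
```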

Within seconds, three authoritative documents were retrieved:

  • Oxford Handbook of Emergency Medicine, 5th Edition (2020)
  • Tintinalli’s Emergency Medicine
  • CURRENT Medical Diagnosis and Treatment (2022)

This confirms that the Qdrant-based vector retrieval pipeline is working end to end: embedding alignment, relevance scoring, and text segmentation are all tuned correctly. Each document returned precise context segments about fever, headache, and respiratory symptoms, forming the evidence backbone for the final reasoning phase.


🧠 Contextual Memory: Teaching the System to Remember

In parallel with document retrieval, the memory subsystem activates. PyDxAI now maintains three distinct layers of recall: session memory, long-term memory, and a condensed cross-session memory.

In today’s run, the system successfully retrieved three memory entries, then automatically condensed them into a 506-character summary. The memory context was inserted into the reasoning prompt to enrich the LLM’s perspective without overwhelming it.
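
One plausible shape for that condensation step is sketched below: recalled entries are joined, summarized down to a character budget, and spliced into the reasoning prompt. The function names and the 600-character budget are assumptions for illustration.

```python
MEMORY_CHAR_BUDGET = 600  # today's run produced a 506-character summary

def condense_memories(entries: list[str], llm_complete) -> str:
    """Compress retrieved memory entries into a short context block."""
    joined = "\n".join(f"- {e}" for e in entries)
    prompt = (
        f"Summarize these prior-session notes in under {MEMORY_CHAR_BUDGET} "
        f"characters, keeping only facts useful to a medical assistant:\n{joined}"
    )
    return llm_complete(prompt)[:MEMORY_CHAR_BUDGET]

def build_prompt(question: str, doc_context: str, memory_summary: str) -> str:
    """Enrich the reasoning prompt with condensed recall, without overwhelming it."""
    return (f"Reference context:\n{doc_context}\n\n"
            f"Relevant memory:\n{memory_summary}\n\n"
            f"Question: {question}")
```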

For example, the retrieved memory contained a reflective note from a prior interaction—illustrating that the model’s recall layer is functioning, even if not yet domain-filtered. Future improvements will allow PyDxAI to distinguish between “medical” and “general” memories, retrieving only those relevant to the task at hand.

This marks an important step toward a true cognitive agent—one that not only recalls data but can contextualize it to improve understanding over time.


⚙️ The Agentic Chain in Action

When the reasoning phase begins, all components interact autonomously (the cycle is sketched in code after this list):

  1. Front LLM refines the user query and detects intent.
  2. RAG Engine (Qdrant) retrieves semantically similar passages.
  3. Memory Manager merges condensed recall and session context.
  4. Main LLM (DeepSeek or Mistral) generates the medical answer.
  5. Post-processor evaluates the response quality.
  6. If the answer is weak, the agentic trigger launches a web search and retries.
  7. Finally, results and reasoning context are stored in both session and global memory tables.
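
Strung together, the cycle looks roughly like the sketch below. Every subsystem call is a stand-in name for the corresponding component, and the 0.6 quality threshold is an assumed value, not the real one.

```python
def answer(user_text: str, s) -> str:
    """One pass through the agentic loop; `s` is any object exposing the
    seven subsystem calls (all names here are illustrative stand-ins)."""
    query, intent = s.sharpen_and_classify(user_text)      # 1. front LLM
    docs = s.retrieve(query, intent)                       # 2. RAG engine (Qdrant)
    memory = s.condense(s.recall(query))                   # 3. memory manager
    response = s.main_llm(query, docs, memory)             # 4. main LLM

    if s.quality_score(response) < 0.6:                    # 5. post-processor check
        docs += s.web_search(query)                        # 6. agentic retry
        response = s.main_llm(query, docs, memory)

    s.save_memory(query, response, scope="session")        # 7. persist results
    s.save_memory(query, response, scope="global")
    return response
```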

The full log from today’s run showed flawless execution of this cycle.
Response generation, embedding comparisons, and data saving all occurred within 3–5 seconds—a solid performance benchmark for an on-premise multi-component AI stack.


💾 The Database Fix: When JSON Speaks Python

Earlier in the day, a small but critical bug appeared when saving memory to PostgreSQL:

❌ Failed to save memory: invalid input syntax for type json
DETAIL: Token "web_search" is invalid.

The problem: Python dictionaries were being inserted directly into JSON columns without serialization.

The fix was straightforward but essential—adding a json.dumps() conversion before insertion. Once implemented, all memory entries, including structured tags like ["web_search"] and summary dictionaries, were stored cleanly.
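
In code, the change amounts to serializing the object before it reaches the driver. Below is a minimal before/after sketch, assuming psycopg2 and illustrative column names on the global_memory table:

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=pydxai")  # connection details are illustrative
cur = conn.cursor()

tags = ["web_search"]
summary = {"chars": 506, "scope": "session"}

# Before (fails): psycopg2 does not serialize Python objects to JSON text,
# so the json column receives input it cannot parse.
# cur.execute("INSERT INTO global_memory (tags, summary) VALUES (%s, %s)",
#             (tags, summary))

# After (works): json.dumps() hands Postgres valid JSON text.
cur.execute(
    "INSERT INTO global_memory (tags, summary) VALUES (%s, %s)",
    (json.dumps(tags), json.dumps(summary)),
)
conn.commit()
```

psycopg2’s Json adapter (psycopg2.extras.Json) achieves the same result and is a common alternative to calling json.dumps() by hand.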

After that, memory saving logs confirmed:

✅ Memory saved id=151  scope=session
✅ Saved to chat_history + global_memory

This repair closed the loop between reasoning output and persistent learning—PyDxAI now records its conversations, summaries, and contextual metadata flawlessly.


📈 Diagnostic Insights from the Logs

Several key insights emerged from the system logs:

  • Embedding consistency — Both query and memory vectors were 768-dimensional, confirming model compatibility.
  • Latency — Each retrieval step completed in under 0.5 seconds.
  • Memory summarization — Context compression effectively reduced noise.
  • Intent detection — Correctly classified the query as “symptom_check,” demonstrating good keyword-to-intent mapping (a minimal mapping sketch follows this list).
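
For context, keyword-to-intent mapping can be as simple as the sketch below; the keyword sets and intent names are illustrative, not PyDxAI’s production classifier.

```python
import re

# Illustrative keyword sets; the real classifier may be richer.
INTENT_KEYWORDS = {
    "symptom_check": {"cough", "fever", "headache", "pain", "rash"},
    "drug_lookup": {"dose", "dosage", "interaction", "contraindication"},
}

def detect_intent(query: str, default: str = "general") -> str:
    """Return the intent whose keyword set best overlaps the query tokens."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best = max(INTENT_KEYWORDS, key=lambda i: len(INTENT_KEYWORDS[i] & words))
    return best if INTENT_KEYWORDS[best] & words else default

# detect_intent("The patient comes with cough, fever, and headache ...")
# -> "symptom_check"
```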

Every one of these signals contributes to the overarching goal: a self-refining, agentic medical assistant capable of understanding, retrieving, reasoning, and learning continuously.


🔮 Next Steps

Although today’s performance was nearly perfect, a few refinements are planned:

  1. Domain filtering:
    Only retrieve memories labeled as “medical,” excluding unrelated text from past sessions.
  2. Relevance thresholds:
    Dynamically limit retrieved documents based on similarity score, improving response clarity. (Items 1 and 2 are sketched in code after this list.)
  3. Structured output:
    For clinical queries, responses will follow a fixed format—
    Assessment → Differential diagnosis → Investigations → Management.
  4. Latency tracking:
    Introduce automatic performance logs to measure response time and GPU utilization per query.
  5. Agentic self-review:
    Future versions will let PyDxAI critique its own responses using a smaller evaluation model (“judge LLM”) and revise them autonomously.
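
For items 1 and 2, one possible shape of the planned filters is sketched below; the domain field, record layout, and 0.75 similarity cutoff are assumptions to be tuned, not decided values.

```python
SIMILARITY_THRESHOLD = 0.75  # assumed cutoff; to be tuned empirically

def filter_memories(memories: list[dict]) -> list[dict]:
    """Planned domain filter: keep only memories labeled as medical."""
    return [m for m in memories if m.get("domain") == "medical"]

def filter_documents(hits: list[dict]) -> list[dict]:
    """Planned relevance threshold: drop passages below the similarity cutoff."""
    return [h for h in hits if h.get("score", 0.0) >= SIMILARITY_THRESHOLD]
```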

🩺 Conclusion

Today’s successful run demonstrates that PyDxAI is no longer a simple RAG chatbot—it’s an emerging agentic system with memory, reasoning, and autonomous control.

It can decide when its own answer is weak, trigger a search, retry with improved context, and persist the result for future learning. Each of these abilities mirrors fundamental cognitive behaviors—reflection, recall, and adaptation.

From a medical perspective, this means the model can handle increasingly complex clinical reasoning with better evidence grounding. From a system design perspective, it shows the power of integrating multiple specialized subsystems—retrievers, memory engines, and LLMs—into one cohesive intelligence loop.

November 4, 2025 thus stands as a turning point in PyDxAI’s journey:
the day when autonomous reasoning, retrieval, and memory truly began to work together, transforming it from a reactive assistant into a proactive medical intelligence system.