Research
Engineering insights, conference talks, and benchmark results.
Paper When the Judge is Wrong: Measuring LLM-as-Judge Reliability Against Graph-Verified Ground Truth in Financial Documents
llm-as-judge evaluation knowledge-graph financial-documents model-risk
Paper FinStructBench: A Benchmark for Structured Information Retrieval from Financial Documents Using Graph-Verifiable Questions
benchmark knowledge-graph financial-documents structured-retrieval regulatory-compliance
Blog The Missing Layer Between RAG and Production
RAG solved retrieval. Guardrails solved safety. What's missing is a governed harness — and regulated industries can't ship without it.
verification RAG compliance LLM regulated-industries