YESciEval is a library for evaluating the quality of synthesized scientific answers using predefined rubrics and LLM-based judge models. It lets you assess answers against rubric criteria with pretrained judges and parses the LLM outputs into structured JSON for detailed analysis.
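To illustrate what parsing a judge's output into structured JSON can look like, here is a minimal standard-library sketch. It does not use the YESciEval API: the helper `parse_judge_output`, the `rating` and `rationale` field names, and the 1 to 5 rating scale are all assumptions made for this example.

```python
import json
import re


def parse_judge_output(raw: str) -> dict:
    """Extract a JSON verdict from a judge model's raw text reply.

    Hypothetical helper, not part of the YESciEval API. It assumes the judge
    was prompted to reply with one JSON object holding an integer "rating"
    (assumed scale: 1 to 5) and a free-text "rationale".
    """
    # Judges often wrap the JSON in prose, so take the span from the
    # first opening brace to the last closing brace.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in judge output")

    parsed = json.loads(match.group(0))
    rating = int(parsed["rating"])
    if not 1 <= rating <= 5:
        raise ValueError(f"rating {rating} is outside the assumed 1-5 scale")

    return {
        "rating": rating,
        "rationale": str(parsed.get("rationale", "")).strip(),
    }


# Example reply in which the model adds prose around the JSON object.
raw_reply = (
    "Here is my assessment:\n"
    '{"rating": 4, "rationale": "Covers the main findings."}'
)
result = parse_judge_output(raw_reply)
print(json.dumps(result, indent=2))
```

Validating the rating range at parse time means a malformed reply is caught early and does not silently distort aggregate scores.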
YESciEval was created by the Scientific Knowledge Organization (SciKnowOrg) group at Technische Informationsbibliothek (TIB). Don’t hesitate to open an issue on the YESciEval repository if something is broken or if you have further questions.
See also
See the Quickstart for a brief introduction to using YESciEval.
If you find this repository helpful, feel free to cite our publication YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering:
@article{d2025yescieval,
  title={YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering},
  author={D'Souza, Jennifer and Giglou, Hamed Babaei and M{\"u}nch, Quentin},
  journal={arXiv preprint arXiv:2505.14279},
  year={2025}
}