Releasing the reliability-checklist framework for holistic language model evaluations