Question Answering on SQuAD2.0 dev
This benchmark evaluates models on the dev set of the SQuAD2.0 dataset.
Step 1: Evaluate models locally
First, use one of the public benchmark libraries to evaluate your model.
sotabench-eval is a framework-agnostic library that implements the SQuAD2.0 benchmark. See the sotabench-eval docs for details.
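To give a sense of what a local run measures, the sketch below computes exact match and token-level F1, the two metrics commonly reported on SQuAD2.0. It is a minimal, standalone illustration modeled on the answer normalization used by the official SQuAD evaluation script, not the sotabench-eval API; the function names and the dict-based input format are assumptions made for this example.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, then strip punctuation, English articles, and extra whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """Return 1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Token-level F1 between a gold answer and a prediction."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    # For unanswerable questions the gold answer is empty: the score is 1.0
    # only if the prediction is empty too.
    if not gold_tokens or not pred_tokens:
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(gold_answers, predictions):
    """Score predictions against gold answers.

    gold_answers: dict mapping question id -> list of acceptable answers
                  (hypothetical format; [""] marks an unanswerable question).
    predictions:  dict mapping question id -> predicted answer string
                  ("" means the model abstained).
    """
    em_total = 0.0
    f1_total = 0.0
    for qid, golds in gold_answers.items():
        pred = predictions.get(qid, "")
        # Take the best score over all acceptable gold answers.
        em_total += max(exact_match(g, pred) for g in golds)
        f1_total += max(f1_score(g, pred) for g in golds)
    n = len(gold_answers)
    return {"exact_match": 100.0 * em_total / n, "f1": 100.0 * f1_total / n}
```

Calling evaluate({"q1": ["the Eiffel Tower"], "q2": [""]}, {"q1": "Eiffel Tower", "q2": ""}) scores both questions as correct, because the article "the" is removed during normalization and the empty prediction matches the unanswerable question.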
Once you can run the benchmark locally, you are ready to connect it to our automatic service.
Step 2: Login and connect your GitHub Repository
Connect your GitHub repository to start benchmarking it automatically. Once connected, we'll re-benchmark your
master branch on every commit, giving your users confidence in the models in your repository and helping you spot bugs.