Language & Reasoning · High School research · Example brief

Open vs Closed Models on a Reasoning Test

The research question

How does a freely downloadable open model compare to a closed commercial model on the same reasoning problems?

Abstract

I gave an open model and a closed model the same set of logic puzzles and scored them. The closed model scored higher, but the open model was closer than expected.

Background

Open models can be downloaded and run by anyone, which matters for AI sovereignty because users do not have to depend on a single company's service. I wanted to measure how large the capability gap is on reasoning.

What I did

I built a set of 20 logic and word problems with known answers and ran both models three times each.
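A minimal sketch of how the scoring step could be done, assuming each answer is graded by exact match after light normalization and that accuracy is averaged over the repeated runs. The function names and the tiny sample data are hypothetical placeholders for illustration, not the problems or results from this project.

```python
from statistics import mean


def normalize(answer: str) -> str:
    """Trim whitespace, lowercase, and drop trailing periods so 'Blue.' matches 'blue'."""
    return answer.strip().lower().rstrip(".")


def score_run(answers: list[str], answer_key: list[str]) -> float:
    """Return the fraction of problems answered correctly in one run."""
    if len(answers) != len(answer_key):
        raise ValueError("each run must answer every problem")
    correct = sum(normalize(a) == normalize(k) for a, k in zip(answers, answer_key))
    return correct / len(answer_key)


def mean_accuracy(runs: list[list[str]], answer_key: list[str]) -> float:
    """Average the per-run accuracy across repeated runs of the same model."""
    return mean(score_run(run, answer_key) for run in runs)


if __name__ == "__main__":
    # Placeholder key and answers (assumption): a real test would use all 20 problems.
    answer_key = ["4", "Blue", "no"]
    model_runs = [
        ["4", "blue.", "yes"],
        ["4", "red", "no"],
        ["4", " blue ", "no"],
    ]
    print(f"mean accuracy over {len(model_runs)} runs: {mean_accuracy(model_runs, answer_key):.2f}")
```

Averaging over three runs matters because a model can give different answers to the same problem on different runs, so a single run could overstate or understate its accuracy. Exact-match grading is the simplest choice; puzzles with free-form answers would need a more forgiving check.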

What I found

The closed model answered more correctly, especially on multi-step problems, but the open model still solved a clear majority.

What's next

I would test whether more careful prompting closes more of the gap.

Takeaway

Closed models still lead on hard reasoning, but open models are capable enough to take seriously.