Informed the author about OpenAI's compatibility test in the gpt-oss GitHub repository for verifying provider implementations
The same open-weight LLM (gpt-oss-120b) achieves dramatically different scores across hosted providers (36.7% to 93.3% on AIME), due to differences in serving frameworks, quantization, and configuration.
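A spread of that size is easy to quantify once per-provider scores are collected. The sketch below is illustrative, not OpenAI's actual compatibility test: the provider names and the mid-range score are hypothetical, while the 36.7% and 93.3% endpoints come from the AIME range cited above.

```python
def score_spread(scores: dict[str, float]) -> tuple[float, float, float]:
    """Return (lowest, highest, spread in percentage points) across providers."""
    lo, hi = min(scores.values()), max(scores.values())
    return lo, hi, round(hi - lo, 1)

# Hypothetical provider names; endpoint scores match the AIME range above.
providers = {
    "provider_a": 93.3,  # near the reference implementation's score
    "provider_b": 61.0,  # illustrative mid-range result
    "provider_c": 36.7,  # lowest observed in the comparison
}

lo, hi, spread = score_spread(providers)
print(f"AIME accuracy ranges from {lo}% to {hi}% (spread: {spread} points)")
```

The same model weights can therefore land anywhere in a 56.6-point band depending on who serves them, which is why verifying a provider's implementation (e.g. with the compatibility test in the gpt-oss GitHub repository) matters before trusting benchmark numbers.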