Author of "oLLM" on GitHub
oLLM enables running large language models with 100k-token context windows on consumer GPUs with 8 GB of VRAM, using fp16/bf16 precision without quantization.
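Fitting a large model into 8 GB without quantization implies streaming weights from storage rather than holding the whole model in VRAM. The sketch below is not oLLM's actual code; it is a minimal, self-contained illustration of the layer-streaming idea, with a toy element-wise multiply standing in for a real transformer layer, so peak memory is one layer instead of the full model.

```python
# Minimal sketch (NOT oLLM's real implementation) of layer streaming:
# keep "weights" on disk and load one layer at a time during a forward
# pass, bounding peak memory at a single layer.
import json
import os
import tempfile

def save_layers(dirpath, layers):
    # Persist each "layer" (here just a list of floats) to its own file,
    # mimicking per-layer weight shards on disk.
    for i, w in enumerate(layers):
        with open(os.path.join(dirpath, f"layer_{i}.json"), "w") as f:
            json.dump(w, f)

def streamed_forward(dirpath, n_layers, x):
    # Load layers one by one; only one layer's weights are resident
    # in memory at any moment.
    for i in range(n_layers):
        with open(os.path.join(dirpath, f"layer_{i}.json")) as f:
            w = json.load(f)
        # Stand-in for a real layer op (e.g. a matmul in fp16/bf16).
        x = [xi * wi for xi, wi in zip(x, w)]
    return x

with tempfile.TemporaryDirectory() as d:
    save_layers(d, [[2.0, 2.0], [0.5, 3.0]])
    print(streamed_forward(d, 2, [1.0, 1.0]))  # → [1.0, 6.0]
```

The trade-off this sketch makes explicit: streaming replaces VRAM pressure with storage bandwidth, which is why fast NVMe matters for this style of offloaded inference.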