#Llm

3 posts with this tag

2026-06-08

Local Gemma was too slow with AIdaemon until I fixed llama.cpp and the prompt size

I wanted AIdaemon on local Gemma 4 26B through llama.cpp, not Ollama. Generation ran at ~45 tok/s on an M4 Pro. Agent turns still felt stuck because prefill on 14k-token prompts took 8 to 9 seconds before the model wrote a single word.

ai, software-development, open-source

2026-03-15

5 min read

AI agent patterns I learned from building AIdaemon

Seven well-known AI agent patterns and how they actually work inside AIdaemon, a self-hosted AI agent daemon I've been building in Rust.

ai, software-development

2025-12-08

3 min read

Why AI chatbots speak Markdown

Ever wondered why ChatGPT, Claude, and other AI models format their responses with **bold text** and [links](url)? It's Markdown, and there's a good reason for it.

ai, web-development
