Quick definition · 2 min AI term
Inference
Inference is the moment an AI model actually runs to produce an answer, as opposed to being trained.
Think of it like
Training is studying for the exam. Inference is sitting the exam: using what was learned to answer right now.
Example
“Inference cost” is what you pay each time the model answers. “Inference speed” is how fast that answer comes back.
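In code, the two terms can be sketched like this. This is a toy illustration only: `LEARNED_ANSWERS`, `run_inference`, `timed_inference`, and the price of 0.002 per call are all hypothetical names and numbers made up for the example, not any provider's real API or pricing.

```python
import time

# Hypothetical stand-in for a trained model: what it "learned" is already fixed.
LEARNED_ANSWERS = {"capital of france": "Paris"}


def run_inference(prompt: str) -> str:
    """One inference call: use what was learned to answer right now."""
    return LEARNED_ANSWERS.get(prompt.lower().strip(), "unknown")


def timed_inference(prompt: str, price_per_call: float):
    """Run one inference and report its speed (seconds) and cost."""
    start = time.perf_counter()
    answer = run_inference(prompt)
    seconds = time.perf_counter() - start  # inference speed for this call
    return answer, seconds, price_per_call  # inference cost for this call


# Assumed price, purely for illustration.
answer, seconds, cost = timed_inference("Capital of France", price_per_call=0.002)
print(f"answer={answer!r} speed={seconds:.6f}s cost={cost}")
```

Nothing is learned while this runs; the model only answers. Each extra call adds to the time spent and to the bill.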
Why it matters
You will see “inference” constantly on pricing pages and in discussions of performance. It simply means the model is running.
Where you’ll see it
OpenAI API, Together AI, your own server