Quick definition · 2 min · AI term

Inference

Inference is the moment an AI model actually runs to produce an answer, as opposed to being trained.

Think of it like

Training is studying for the exam. Inference is sitting the exam: using what was learned to answer right now.

Example

“Inference cost” is what you pay each time the model answers. “Inference speed” is how fast that answer comes back.
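
To make those two phrases concrete, here is a minimal sketch in Python. It assumes pricing is charged per token, as many providers do, and the prices used are made-up numbers, not real rates. The `fake_model` function is a hypothetical stand-in for a real model call, so the timing only shows how speed would be measured.

```python
import time

# Hypothetical prices in dollars per one million tokens (not real provider rates).
PRICE_PER_M_INPUT = 1.00
PRICE_PER_M_OUTPUT = 3.00


def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one model answer under the assumed prices."""
    total = input_tokens * PRICE_PER_M_INPUT + output_tokens * PRICE_PER_M_OUTPUT
    return total / 1_000_000


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; it just upper-cases the prompt."""
    return prompt.upper()


def timed_inference(prompt: str) -> tuple[str, float]:
    """Run the stand-in model once and return (answer, seconds taken)."""
    start = time.perf_counter()
    answer = fake_model(prompt)
    elapsed = time.perf_counter() - start
    return answer, elapsed


cost = inference_cost(input_tokens=500, output_tokens=200)
answer, seconds = timed_inference("what is inference?")
print(f"Inference cost for this answer: ${cost:.6f}")
print(f"Inference speed: answer came back in {seconds:.6f} s")
```

With these assumed prices, a 500-token question and a 200-token answer cost $0.0011. Multiply that by the number of answers per day to estimate a bill.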

Why it matters

You will see “inference” constantly on pricing pages and in discussions of performance. It just means the model is running.

Where you’ll see it

OpenAI API · Together AI · your own server

Related terms