Llama 3.1 8B Instant
A fast, lightweight instruction-tuned language model from Meta optimized for low-latency text generation and chat applications.
Specs
context window 128K tokens
max output 8K tokens
input price $0.05 / 1M tokens
output price $0.08 / 1M tokens
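
As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal Python sketch. The helper name estimate_cost_usd and the constants are hypothetical, introduced only for this example; only the two prices come from the Specs list. Actual billing may round or meter differently.

```python
from decimal import Decimal

# Prices taken from the Specs list, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.05")
OUTPUT_PRICE_PER_M = Decimal("0.08")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed at its own per-million-token rate.
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 1,000-token completion.
    print(estimate_cost_usd(10_000, 1_000))
```

Decimal is used instead of float so that very small per-request amounts add up without binary rounding drift when summed over many requests.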
Capabilities
streaming, code-generation, function-calling, tool-use, json-mode
Dates
released Jul 2024
knowledge cutoff Dec 2023
Resources
homepage llama.meta.com/
API
full doc /v1/models/llama-3.1-8b-instant