# Tinker Python SDK
Tinker is the official Python SDK for the Tinker API, designed for fine-tuning large language models (LLMs). It abstracts away the complexities of distributed GPU training so developers can focus on data and algorithms. As of version 0.18.0 it is actively maintained, with ongoing development and documentation updates.
## Common errors
- **`KeyError: 'TINKER_API_KEY'`**
  - Cause: The `TINKER_API_KEY` environment variable is not set; it is required for authenticating with the Tinker API.
  - Fix: Set `TINKER_API_KEY` before running your application, e.g. `export TINKER_API_KEY="your_api_key_here"` in your shell, or `os.environ['TINKER_API_KEY'] = 'your_api_key_here'` in your Python script (fine for testing, not recommended for production).
- **`TypeError: 'coroutine' object is not awaited`**
  - Cause: An asynchronous Tinker API call (a method ending in `_async`) was made without `await`, so the coroutine object was never executed.
  - Fix: `await` every asynchronous Tinker method, e.g. `await client.some_async_method(...)`.
- **`APIConnectionError` or `APITimeoutError`**
  - Cause: Network issues or problems reaching the Tinker API server: connectivity problems, an incorrect endpoint configuration, or server-side issues.
  - Fix: Check your internet connection and verify the Tinker API endpoint configuration if you set one. If the issue persists, check the Tinker status page or contact support, as it might be a server-side problem. For timeouts, consider increasing the client timeout for a known long-running operation, though very long timeouts may indicate a server-side hang.
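Transient connection and timeout errors can often be absorbed with a bounded retry. A minimal sketch of exponential backoff with jitter; the `APIConnectionError` class here is a local stand-in for the SDK's exception, and `with_retries` is our helper, not a Tinker API:

```python
import random
import time


class APIConnectionError(Exception):
    """Stand-in for the SDK's connection error (assumption for this sketch)."""


def with_retries(call, max_attempts=4, base_delay=0.01):
    """Retry `call` on connection errors, backing off exponentially with jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except APIConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Backoff: 1x, 2x, 4x, ... the base delay, plus a little jitter.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))


# Simulate a flaky endpoint that fails twice, then succeeds.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise APIConnectionError("simulated transient failure")
    return "ok"

result = with_retries(flaky_call)
print(result)  # → ok (after two retried failures)
```

Only retry errors that are plausibly transient; a persistent failure after several backed-off attempts usually points at configuration or a server-side problem.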
## Warnings
- **breaking** The `RenderedMessage` fields `prefix`, `content`, and `suffix` were renamed to `header`, `output`, and `stop_overlap` respectively. The `Renderer` interface also changed from `Protocol` to `ABC`.
- **gotcha** Making sequential API calls instead of using asynchronous patterns is a major performance bottleneck, especially for operations like `sample` or `optim_step`.
- **gotcha** A sampling client created before saving new weights will silently sample from the old, stale weights, leading to unexpected model behavior.
- **gotcha** LoRA fine-tuning typically requires a significantly higher learning rate than full fine-tuning, often around 10x higher.
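The concurrency gotcha above can be demonstrated without the SDK: launching independent async calls with `asyncio.gather` overlaps their latency, while awaiting them one by one pays it in full each time. A minimal sketch in which `fake_sample` is a stand-in coroutine simulating an `_async` API call, not a Tinker method:

```python
import asyncio
import time


async def fake_sample(prompt: str) -> str:
    """Stand-in for an async API call; sleeps to simulate network latency."""
    await asyncio.sleep(0.05)
    return f"completion for {prompt!r}"


async def main():
    prompts = [f"prompt {i}" for i in range(10)]

    # Sequential: total latency is roughly the sum of all individual calls.
    start = time.perf_counter()
    sequential = [await fake_sample(p) for p in prompts]
    sequential_time = time.perf_counter() - start

    # Concurrent: calls overlap, so total latency is roughly one call's worth.
    start = time.perf_counter()
    concurrent = await asyncio.gather(*(fake_sample(p) for p in prompts))
    concurrent_time = time.perf_counter() - start

    assert sequential == list(concurrent)  # same results, very different latency
    return sequential_time, concurrent_time


seq_t, conc_t = asyncio.run(main())
print(f"sequential: {seq_t:.2f}s, concurrent: {conc_t:.2f}s")
```

The same pattern applies to batching real `sample_async` or `optim_step_async` calls: collect the coroutines first, then `gather` them.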
## Install
```shell
pip install tinker
# or, with uv:
uv pip install tinker
```
## Imports
- `ServiceClient`

```python
from tinker import ServiceClient
# or equivalently:
import tinker

client = tinker.ServiceClient()
```
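Because a missing `TINKER_API_KEY` only surfaces when the client is constructed, it can help to fail fast with a clearer message. A small sketch; the `require_api_key` helper is ours, not part of the SDK:

```python
import os


def require_api_key(var: str = "TINKER_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error if unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(
            f"{var} is not set. Export it in your shell, e.g. "
            f'export {var}="your_api_key_here", before creating a ServiceClient.'
        )
    return key


# Example: with the variable set, the helper simply returns its value.
os.environ["TINKER_API_KEY"] = "dummy-key-for-demo"
print(require_api_key())  # → dummy-key-for-demo
```

Call it once at startup, before constructing `ServiceClient`, so a misconfigured environment fails with an actionable message instead of a bare `KeyError`.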
## Quickstart
```python
import asyncio
import os

import tinker
from tinker import types

# The API key must be set in the environment; replace the placeholder for local testing.
os.environ.setdefault('TINKER_API_KEY', 'your_tinker_api_key_here')

# Initialize the Tinker ServiceClient
service_client = tinker.ServiceClient()

# Create a LoRA training client (example for fine-tuning)
training_client = service_client.create_lora_training_client(
    base_model="meta-llama/Llama-3.2-1B",
    rank=32,
)


# Example of an asynchronous operation (replace with actual data and loss_fn)
async def run_optim_step():
    # In a real scenario you would have actual data and define a loss function.
    # For this quickstart, we simulate a minimal Datum and OptimStepRequest.
    dummy_model_input = types.ModelInput(
        text="This is a dummy prompt.",
        tokens=[1, 2, 3],
    )
    dummy_loss_fn_inputs = {"labels": types.TensorData(data_float=[1.0, 2.0])}
    datum = types.Datum(model_input=dummy_model_input, loss_fn_inputs=dummy_loss_fn_inputs)
    optim_request = types.OptimStepRequest(
        datums=[datum],
        # Other required fields would go here; this is a simplified example,
        # refer to the full docs for actual usage.
        loss_fn=types.LossFunction.CROSS_ENTROPY,
        optim_params=types.AdamParams(learning_rate=1e-5),
    )
    optim_future = await training_client.optim_step_async(optim_request)
    # await optim_future.get_result_async()  # block until the step completes
    print("Optim step initiated.")


asyncio.run(run_optim_step())
```