The fastest way to get started with LLM Engine is to use the Python client in this repository to run inference and fine-tuning on Scale's infrastructure. This path does not require you to install anything on your own infrastructure, and Scale's free research preview lets you experiment with open-source LLMs.
To start, install LLM Engine via pip by running pip install scale-llm-engine in your terminal.
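Once the install finishes, one quick way to confirm the package is importable is to look up its module, llmengine (the name used in the import statements later on this page). The following is a minimal sketch using only the standard library. The helper name is_installed is hypothetical and not part of LLM Engine.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "llmengine" is the module name imported by the client examples.
    if is_installed("llmengine"):
        print("llmengine is available")
    else:
        print("llmengine not found; install it with pip first")
```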
Scale API Keys
Next, you need a Scale Spellbook API key.
Retrieving your API Key
Different API Keys for different Scale Products
If you have used Scale's platform for annotation work in the past, note that your Spellbook API key is different from your Scale Annotation API key. Create a Spellbook API key before getting started.
Set your API Key
LLM Engine uses environment variables to access your API key.
Set this API key as the SCALE_API_KEY environment variable by running export SCALE_API_KEY="[Your API key]" in your terminal before you run your Python application. You can also add that line to your .bash_profile so it's automatically set for future sessions.
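Because the client reads the key from the environment, a script can check for it up front and fail with a clear message before making any request. The following is a minimal sketch using only the standard library. The helper name require_scale_api_key is hypothetical and not part of LLM Engine; only the variable name SCALE_API_KEY comes from the text above.

```python
import os
from typing import Mapping


def require_scale_api_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return the value of SCALE_API_KEY, or raise if it is missing or blank."""
    key = environ.get("SCALE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SCALE_API_KEY is not set; export it in your terminal "
            "before running this script."
        )
    return key
```

Calling require_scale_api_key() with no arguments at the top of a script checks the real process environment. Passing a plain dictionary instead makes the helper easy to exercise in tests.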
Alternatively, you can set your API key using either of the following patterns:
With your API key set, you can now send LLM Engine requests using the Python client:
```python
from llmengine import Completion

response = Completion.create(
    model="llama-2-7b",
    prompt="I'm opening a pancake restaurant that specializes in unique pancake shapes, colors, and flavors. List 3 quirky names I could name my restaurant.",
    max_new_tokens=100,
    temperature=0.2,
)

print(response.output.text)
```
```python
import sys

from llmengine import Completion

# Same call as above, but with stream=True the result is an iterator of partial responses.
stream = Completion.create(
    model="llama-2-7b",
    prompt="Give me a 200 word summary on the current economic events in the US.",
    max_new_tokens=1000,
    temperature=0.2,
    stream=True,
)

for response in stream:
    if response.output:
        # Print each chunk as it arrives, without a trailing newline.
        print(response.output.text, end="")
        sys.stdout.flush()
    else:  # an error occurred
        print(response.error)  # print the error message out
        break
```
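The streaming loop above prints chunks as they arrive. If you also want the full text afterwards, the same pattern can be wrapped in a small function that accumulates chunks and stops at the first error. The following is a minimal sketch: collect_stream is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the output, output.text and error attributes used in the example above, so the block runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable, Optional, Tuple


def collect_stream(stream: Iterable) -> Tuple[str, Optional[object]]:
    """Concatenate streamed text; stop at the first chunk that carries no output."""
    parts = []
    for response in stream:
        if response.output:
            parts.append(response.output.text)
        else:
            # Mirror the loop above: report the error and stop consuming.
            return "".join(parts), response.error
    return "".join(parts), None


# Stand-in chunks shaped like the streamed responses (an assumption for illustration).
fake_stream = [
    SimpleNamespace(output=SimpleNamespace(text="Hello, "), error=None),
    SimpleNamespace(output=SimpleNamespace(text="world"), error=None),
]

text, error = collect_stream(fake_stream)
print(text)
```

With a real stream you would pass the object returned by Completion.create(..., stream=True) in place of fake_stream. The function returns the text gathered so far together with the error, if any.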