Batching

Developing AI Systems with the OpenAI API

Francesca Donadoni

Curriculum Manager, DataCamp

What are rate limits?

A person driving a car, stopped by a policeman


How rate limits occur

 

  • Too many requests

    Many speech bubbles in a group representing multiple messages

 

  • Too much text in the request

    A large speech bubble icon with dots on white background representing a long message

Avoiding rate limits

 

  • Retry
    • Short wait between requests

 

  • Batching
    • Processing multiple messages in one request

 

  • Reducing tokens
    • Quantifying and cutting down the number of tokens

Retrying

 

from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential
)

@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def get_response(model, message):
    response = client.chat.completions.create(
        model=model,
        messages=[message],
        response_format={"type": "json_object"}
    )
    return response.choices[0].message.content
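The decorator above delegates the waiting to tenacity; the same capped, randomized exponential backoff can be sketched with only the standard library. Here `flaky_call` is a hypothetical stand-in for the API call, failing twice before succeeding:

```python
import random
import time

def retry_with_backoff(func, max_attempts=6, base=1.0, cap=60.0):
    """Call func, retrying on exception with capped, randomized exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Wait a random time up to min(cap, base * 2**attempt) seconds
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))

calls = {"n": 0}

def flaky_call():
    # Hypothetical stand-in for the API call: fails twice, then succeeds
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

print(retry_with_backoff(flaky_call, base=0.01))  # succeeds on the third attempt
```

Randomizing the wait (jitter) spreads out retries from many clients, which is why `wait_random_exponential` does the same.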

Batching

countries = ["United States", "Ireland", "India"]

message = [
    {
        "role": "system",
        "content": """You are given a series of countries and are asked to return the
        country and capital city. Provide each of the questions with an answer in the
        response as separate content.""",
    }
]


for country in countries:
    message.append({"role": "user", "content": country})

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=message
)

print(response.choices[0].message.content)
United States: Washington D.C.
Ireland: Dublin
India: New Delhi
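For longer input lists, one giant request can itself hit the per-request limits; the inputs can instead be split into fixed-size batches, with one request per batch. A minimal sketch, where `chunk` is a hypothetical helper and not part of the OpenAI library:

```python
def chunk(items, size):
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

countries = ["United States", "Ireland", "India", "Japan", "Brazil"]

for batch in chunk(countries, 2):
    # Each batch would become the user messages of one API request
    print(batch)
```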

Reducing tokens

 

import tiktoken


encoding = tiktoken.encoding_for_model("gpt-4o-mini")
prompt = "Tokens can be full words, or groups of characters commonly grouped together: tokenization."
num_tokens = len(encoding.encode(prompt))
print("Number of tokens in prompt:", num_tokens)
Number of tokens in prompt: 17
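Exact counts require the model's tokenizer, as above. When `tiktoken` is not available, a rough standard-library heuristic (assuming roughly four characters per token for English text, an approximation rather than a rule) can still flag prompts worth trimming:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: English text averages ~4 characters per token.
    # This is an approximation, not the model's real tokenizer.
    return max(1, len(text) // 4)

prompt = ("Tokens can be full words, or groups of characters "
          "commonly grouped together: tokenization.")
print("Estimated tokens:", estimate_tokens(prompt))
```

The estimate will not match `tiktoken` exactly (17 above); it is only useful for spotting prompts that are far over budget.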

Let's practice!
