Working with different model parameters

Introduction to Amazon Bedrock

Nikhil Rangarajan

Data Scientist

Model parameters in Amazon Bedrock

  • Models have parameters to control their behavior

  • temperature: Controls randomness in predictions

  • top_p: Controls diversity of the model's output by sampling only from top-ranked tokens
  • max_tokens: Sets the maximum length of the output

Image: a temperature dial with an adjustment arrow, illustrating how parameters like temperature control randomness in model outputs.

Temperature

  • Response randomness and creativity

  • Low temperature (near 0): More focused, deterministic responses

  • High temperature (near 1): More diverse, creative outputs

  • Most Bedrock models default to 0.7

prompt = "Write a headline for a tech article"
request = {
    "temperature": 0.2,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", 
                         "text": prompt}],
        }
    ],
    ...
}
The range of temperature

  • Temperature = model's 'risk appetite'

  • Low temperature acts like a cautious decision-maker

    • For tasks like summarization or fact-based answers
  • High temperature behaves like a creative thinker willing to take risks

    • For creative tasks like story generation or brainstorming
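
The rule of thumb above can be sketched as a small helper that suggests a temperature by task type. The task names and exact values here are illustrative assumptions, not Bedrock settings:

```python
# Map task categories to temperatures, following the rule of thumb above:
# factual tasks stay near 0, creative tasks go near 1.
# The category names and exact values are illustrative choices.
TASK_TEMPERATURES = {
    "summarization": 0.2,
    "fact_qa": 0.1,
    "story_generation": 0.9,
    "brainstorming": 0.8,
}

def pick_temperature(task: str) -> float:
    """Return a suggested temperature, defaulting to the common 0.7."""
    return TASK_TEMPERATURES.get(task, 0.7)
```

A lookup like this keeps parameter choices in one place instead of scattering literals across request-building code.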

A thermometer representing temperature

Top_p

  • Top_p (nucleus sampling)
    • Helps control output predictability
    • Restricts sampling to the smallest set of tokens whose cumulative probability reaches top_p
    • Range: 0.1 (focused) to 0.9 (diverse)
    • e.g., a top_p of 0.1 means the model samples only from the top tokens covering 10% of the probability mass
prompt = "Explain quantum computing"

# Focused response
request["top_p"] = 0.1

# Diverse response
request["top_p"] = 0.9
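
Putting the fragments together, one way to assemble a complete request body is shown below. This is a minimal sketch assuming the Anthropic messages format used earlier; the `anthropic_version` string is an assumption, and actually sending the request would go through the boto3 `bedrock-runtime` client:

```python
import json  # needed if serializing the body to send it

def build_request(prompt, temperature=0.7, top_p=0.9, max_tokens=500):
    """Assemble a messages-style request body like the snippets above."""
    return {
        # Version string is an assumption; check your model's docs.
        "anthropic_version": "bedrock-2023-05-31",
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [{"type": "text", "text": prompt}],
            }
        ],
    }

# Focused response
request = build_request("Explain quantum computing", top_p=0.1)

# Sending it requires AWS credentials; shown for context only:
# client = boto3.client("bedrock-runtime")
# client.invoke_model(modelId="...", body=json.dumps(request))
```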

Max tokens

  • Max_tokens limits response length:

    • Useful for managing costs, bounding response size, and maintaining performance
    • Typical values: 100-2000

    Image: two speech bubbles, one blank and one filled with text, illustrating how responses can be limited to a maximum token length.

prompt = "Explain quantum computing"

# Focused shorter response
request["top_p"] = 0.1
request["max_tokens"] = 100

# Diverse longer response
request["top_p"] = 0.9
request["max_tokens"] = 500

Parameter selection

  • Content generation: Higher temperature (0.7-0.9)
  • Q&A systems: Lower temperature (0.1-0.3)
  • Documentation: Lower top_p (0.1-0.3)
  • Brainstorming: Higher top_p (0.7-0.9)
  • Chat applications: Moderate max_tokens (150-300)
  • Long-form content: Higher max_tokens (1000+)
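
The selection guidance above can be captured as presets. The values below are midpoints of the ranges on this slide, and the dictionary-merge convention is just one illustrative way to apply them:

```python
# Parameter presets per use case, using midpoints of the ranges above.
PRESETS = {
    "content_generation": {"temperature": 0.8},
    "qa": {"temperature": 0.2},
    "documentation": {"top_p": 0.2},
    "brainstorming": {"top_p": 0.8},
    "chat": {"max_tokens": 200},
    "long_form": {"max_tokens": 1000},
}

def apply_preset(request: dict, use_case: str) -> dict:
    """Return a copy of a request body with the use case's preset merged in."""
    # Preset values override any matching keys already in the request.
    return {**request, **PRESETS[use_case]}
```

For example, `apply_preset({"temperature": 0.7}, "qa")` overrides the default temperature with the lower Q&A setting.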

Image: a control panel with dials and a speedometer-style display, representing the adjustable parameters of language models in Amazon Bedrock.

Let's practice!
