Working with different model parameters

Introduction to Amazon Bedrock

Nikhil Rangarajan

Data Scientist

Model parameters in Amazon Bedrock

  • Models have parameters to control their behavior

  • temperature: Controls randomness in predictions

  • top_p: Controls diversity of the model's output by sampling only from top-ranked tokens
  • max_tokens: Sets the maximum length of the output

Image: a temperature dial with an adjustment arrow, illustrating how parameters like temperature control randomness in model outputs.

Temperature

  • Response randomness and creativity

  • Low temperature (near 0): More focused, deterministic responses

  • High temperature (near 1): More diverse, creative outputs

  • Most Bedrock models default to 0.7

prompt = "Write a headline for a tech article"
request = {
    "temperature": 0.2,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", 
                         "text": prompt}],
        }
    ],
    ...
}
The range of temperature

  • Temperature = model's 'risk appetite'

  • Low temperature acts like a cautious decision-maker

    • For tasks like summarization or fact-based answers
  • High temperature behaves like a creative thinker willing to take risks

    • For creative tasks like story generation or brainstorming
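
The rule of thumb above can be sketched as a small helper that suggests a temperature by task type. The task names and exact values here are illustrative assumptions, not Bedrock settings:

```python
# Map task categories to temperatures, following the rule of thumb above:
# factual tasks stay near 0, creative tasks go near 1.
# The category names and exact values are illustrative choices.
TASK_TEMPERATURES = {
    "summarization": 0.2,
    "fact_qa": 0.1,
    "story_generation": 0.9,
    "brainstorming": 0.8,
}

def pick_temperature(task: str) -> float:
    """Return a suggested temperature, defaulting to the common 0.7."""
    return TASK_TEMPERATURES.get(task, 0.7)
```

A lookup like this keeps parameter choices in one place instead of scattering literals across request-building code.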

A thermometer representing temperature

Top_p

  • Top_p (nucleus sampling)
    • Helps control output predictability
    • Restricts sampling to the smallest set of tokens whose cumulative probability reaches top_p
    • Range: 0.1 (focused) to 0.9 (diverse)
    • e.g., a top_p of 0.1 means the model samples only from the top tokens covering 10% of the probability mass
prompt = "Explain quantum computing"

# Focused response
request["top_p"] = 0.1

# Diverse response
request["top_p"] = 0.9
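
Putting the fragments together, one way to assemble a complete request body is shown below. This is a minimal sketch assuming the Anthropic messages format used earlier; the `anthropic_version` string is an assumption, and actually sending the request would go through the boto3 `bedrock-runtime` client:

```python
import json  # needed if serializing the body to send it

def build_request(prompt, temperature=0.7, top_p=0.9, max_tokens=500):
    """Assemble a messages-style request body like the snippets above."""
    return {
        # Version string is an assumption; check your model's docs.
        "anthropic_version": "bedrock-2023-05-31",
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [{"type": "text", "text": prompt}],
            }
        ],
    }

# Focused response
request = build_request("Explain quantum computing", top_p=0.1)

# Sending it requires AWS credentials; shown for context only:
# client = boto3.client("bedrock-runtime")
# client.invoke_model(modelId="...", body=json.dumps(request))
```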

Max tokens

  • Max_tokens limits response length:

    • Useful for managing costs, bounding response size, and maintaining performance
    • Typical values: 100-2000

    Image: two speech bubbles, one blank and one filled with text, illustrating how responses can be limited to a maximum token length.

prompt = "Explain quantum computing"

# Focused shorter response
request["top_p"] = 0.1
request["max_tokens"] = 100

# Diverse longer response
request["top_p"] = 0.9
request["max_tokens"] = 500

Parameter selection

  • Content generation: Higher temperature (0.7-0.9)
  • Q&A systems: Lower temperature (0.1-0.3)
  • Documentation: Lower top_p (0.1-0.3)
  • Brainstorming: Higher top_p (0.7-0.9)
  • Chat applications: Moderate max_tokens (150-300)
  • Long-form content: Higher max_tokens (1000+)
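
The selection guidance above can be captured as presets. The values below are midpoints of the ranges on this slide, and the dictionary-merge convention is just one illustrative way to apply them:

```python
# Parameter presets per use case, using midpoints of the ranges above.
PRESETS = {
    "content_generation": {"temperature": 0.8},
    "qa": {"temperature": 0.2},
    "documentation": {"top_p": 0.2},
    "brainstorming": {"top_p": 0.8},
    "chat": {"max_tokens": 200},
    "long_form": {"max_tokens": 1000},
}

def apply_preset(request: dict, use_case: str) -> dict:
    """Return a copy of a request body with the use case's preset merged in."""
    # Preset values override any matching keys already in the request.
    return {**request, **PRESETS[use_case]}
```

For example, `apply_preset({"temperature": 0.7}, "qa")` overrides the default temperature with the lower Q&A setting.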

Image: a control panel with dials and a speedometer-style display, representing the adjustable parameters of language models in Amazon Bedrock.

Let's practice!
