Creating customer call transcripts

Multi-Modal Systems with the OpenAI API

James Chapman

Curriculum Manager, DataCamp

Case study introduction

An image of a chatbot

  • AI Engineer at DataCamp
  • Handles voice messages
  • Speech-based customer support chatbot

Customer support team at DataCamp

The full chatbot pipeline:

  • Transcribe audio
  • Detect language
  • Translate to English
  • Generate a response
  • Reply in the original language
  • Moderation

Case study plan

  1. Transcribe the audio into text
  2. Detect the language
  3. Translate into English
  4. Refine the text

Step 1: transcribe audio

from openai import OpenAI

client = OpenAI(api_key="ENTER YOUR KEY HERE")

# Open the mp3 file
audio_file = open("recording.mp3", "rb")

# Create a transcript
response = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file)

# Extract and print the transcript
transcript = response.text
print(transcript)

Transcript in Ukrainian
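The transcription call can be wrapped in a small reusable function. This is a sketch, not part of the course code: the function name and parameters are illustrative, and the context manager ensures the audio file handle is closed after the request.

```python
def transcribe(client, path, model="whisper-1"):
    """Return the transcript text for the audio file at `path`."""
    # Open in binary mode; the context manager closes the handle afterwards
    with open(path, "rb") as audio_file:
        response = client.audio.transcriptions.create(
            model=model, file=audio_file)
    return response.text
```

In practice this would be called as `transcript = transcribe(client, "recording.mp3")`, with `client` being the `OpenAI` instance created earlier.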

Step 2: detect language

response = client.chat.completions.create(
    model="gpt-4o-mini",
    max_completion_tokens=5,
    messages=[{"role": "user",
               "content": f"""Identify the language of the following text and
respond only with the ISO 639-1 language code (e.g., 'en', 'uk', 'fr'):
{transcript}"""}])

# Extract the detected language
language = response.choices[0].message.content
print(language)

uk
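Even when the prompt asks for only a two-letter code, replies can arrive with stray whitespace, quotes, or capitalization. A small helper (not part of the course code; the name and checks are illustrative) makes the downstream translation prompt more robust:

```python
def normalize_language_code(reply):
    """Strip whitespace/quotes and lower-case a model-reported code."""
    code = reply.strip().strip("'\"`.").strip().lower()
    # Expect a two-letter ISO 639-1 code such as 'uk' or 'fr'
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"Unexpected language code: {reply!r}")
    return code
```

With this in place, the extraction step could read `language = normalize_language_code(response.choices[0].message.content)`.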

Step 3: translate to English

response = client.chat.completions.create(
    model="gpt-4o-mini",
    max_completion_tokens=300,
    messages=[
        {"role": "user", "content": f"""Translate this customer transcript
        from country code {language} to English: {transcript}"""}])

# Extract translated text
translated_text = response.choices[0].message.content

print(translated_text)

Translated text - raw


Step 4: refining the text

response = client.chat.completions.create(
    model="gpt-4o-mini",
    max_completion_tokens=300,
    messages=[
    {"role": "user", 
     "content": f"""You are an AI assistant that corrects transcripts by fixing 
     misinterpretations, names, and terminology. Please refine the following
     transcript:\n\n{translated_text}"""}])

# Extract corrected text
corrected_text = response.choices[0].message.content

print(corrected_text)

Corrected text (highlighted)


Recap

  • Transcribed the audio
  • Detected and translated the language
  • Refined the text
  • Called the OpenAI API four times ⭐

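The four API calls above can be composed into a single flow. This is a sketch rather than course code: each step is injected as a callable, so the pipeline's wiring can be exercised without touching the OpenAI API.

```python
def run_pipeline(audio_path, transcribe, detect_language, translate, refine):
    """Chain the four case-study steps over one audio file."""
    transcript = transcribe(audio_path)           # Step 1: audio -> text
    language = detect_language(transcript)        # Step 2: e.g. 'uk'
    translated = translate(transcript, language)  # Step 3: -> English
    return refine(translated)                     # Step 4: fix terminology
```

In the case study, each callable would wrap one of the `client.audio.transcriptions.create` or `client.chat.completions.create` calls shown earlier.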

Time for practice!
