Multimodal systems with the OpenAI API
James Chapman
Curriculum Manager, DataCamp

from openai import OpenAI

# Create the OpenAI client
client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Create a request to the Chat Completions endpoint
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the OpenAI API?"}]
)

# Extract the content from the response
print(response.choices[0].message.content)

The OpenAI API is a cloud-based service provided by OpenAI that allows developers
to integrate advanced AI models into their applications.

Speech-to-text capabilities:
Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, and webm (25 MB limit)
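
The format list and size limit can be checked locally before a file is uploaded. This is a minimal sketch, not part of the OpenAI library: `check_audio_file`, `SUPPORTED_EXTENSIONS`, and `MAX_BYTES` are hypothetical names, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import Path

# Extensions taken from the supported-format list
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file looks OK."""
    file_path = Path(path)
    if not file_path.is_file():
        return [f"{file_path} does not exist or is not a file"]

    problems = []
    # Compare the extension case-insensitively against the supported list
    if file_path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {file_path.suffix or '(none)'}")
    # Compare the size on disk against the upload limit
    if file_path.stat().st_size > MAX_BYTES:
        problems.append("file is larger than the 25 MB limit")
    return problems
```

The check only looks at the file name and size; it does not confirm that the contents are valid audio.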
Use cases:

Example: transcribe meeting_recording.mp3
audio_file = open("meeting_recording.mp3", "rb")
If the file is located in a different directory:
audio_file = open("path/to/file/meeting_recording.mp3", "rb")
audio_file = open("meeting_recording.mp3", "rb")
response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
print(response)
Transcription(text="Welcome everyone to the June product monthly. We'll get started in...)
print(response.text)
Welcome everyone to the June product monthly. We'll get started in just a minute.
Alright, let's get started. Today's agenda will start with a spotlight from Chris
on the new mobile user onboarding flow, then we'll review how we're tracking on
our quarterly targets, and finally, we'll finish with another spotlight from Katie
who will discuss the upcoming branding updates...
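
The open, request, and extract steps shown above can be wrapped in one function. This is a sketch under assumptions: `transcribe_file` is a hypothetical helper name, and `client` is expected to be an `OpenAI` client created as in the earlier example. Using a `with` block is a design choice of this sketch, so the file handle is closed once the request finishes.

```python
def transcribe_file(client, path, model="whisper-1"):
    """Transcribe one audio file and return the transcript text."""
    # The with-block closes the file handle after the request has completed
    with open(path, "rb") as audio_file:
        response = client.audio.transcriptions.create(model=model, file=audio_file)
    # The transcript is stored in the .text attribute of the response
    return response.text
```

Usage would look like `print(transcribe_file(client, "meeting_recording.mp3"))`.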

Transcription workflow:
open() the audio file

# Translate non-English audio into English text
audio_file = open("non_english_audio.m4a", "rb")
response = client.audio.translations.create(model="whisper-1", file=audio_file)
print(response.text)
The search volume for keywords like A I has increased rapidly since the launch of
Cha GTP.
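
The output above contains spelling slips such as "A I" and "Cha GTP". One possible mitigation, sketched here under assumptions, is the optional `prompt` argument that the translations endpoint accepts: a short English text hinting at the expected vocabulary. `translate_file` is a hypothetical helper name, `client` is assumed to be an `OpenAI` client, and whether a given prompt actually corrects the spelling is not guaranteed.

```python
def translate_file(client, path, prompt=None, model="whisper-1"):
    """Translate non-English audio into English text, optionally with a prompt."""
    kwargs = {"model": model}
    if prompt is not None:
        # Optional English text hinting at expected names and spellings
        kwargs["prompt"] = prompt
    # The with-block closes the file handle after the request has completed
    with open(path, "rb") as audio_file:
        response = client.audio.translations.create(file=audio_file, **kwargs)
    return response.text
```

Usage might look like `translate_file(client, "non_english_audio.m4a", prompt="The audio discusses AI and ChatGPT.")`, where the prompt text is only an illustration.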