Introduction to Embeddings with the OpenAI API
Emmanuel Pire
Senior Software Engineer, DataCamp
reference_ids = ['s8170', 's8103']
reference_texts = collection.get(ids=reference_ids)["documents"]
result = collection.query(
    query_texts=reference_texts,
    n_results=3
)
{'ids': [['s8170', 's6939', 's7000'], ['s8103', 's2968', 's3085']],
 'embeddings': None,
 'documents': [['Title: Terrifier (Movie)...',
                'Title: Haunters: The Art of the Scare (Movie)...',
                'Title: Horror Story (Movie)...'],
               ["Title: Strawberry Shortcake: Berry Bitty Adventures (TV Show)...",
                "Title: Shopkins (TV Show)...",
                "Title: Rainbow Ruby (TV Show)..."]],
 'metadatas': [[None, None, None], [None, None, None]],
 'distances': [[0.00, 0.25, 0.26], [0.00, 0.25, 0.28]]}
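The result is a dictionary of parallel nested lists: one inner list per query text, with ids, documents, and distances aligned by position. The first hit for each query has distance 0.00 because it is the reference title matching itself. A minimal sketch of turning that structure into recommendations follows, using the output above as a literal. The recommendations helper is hypothetical (not part of Chroma), and the sketch assumes the inner lists are in the same order as reference_ids.

```python
# Query output copied from above (documents truncated as shown).
result = {
    "ids": [["s8170", "s6939", "s7000"], ["s8103", "s2968", "s3085"]],
    "documents": [
        ["Title: Terrifier (Movie)...",
         "Title: Haunters: The Art of the Scare (Movie)...",
         "Title: Horror Story (Movie)..."],
        ["Title: Strawberry Shortcake: Berry Bitty Adventures (TV Show)...",
         "Title: Shopkins (TV Show)...",
         "Title: Rainbow Ruby (TV Show)..."],
    ],
    "distances": [[0.00, 0.25, 0.26], [0.00, 0.25, 0.28]],
}
reference_ids = ["s8170", "s8103"]


def recommendations(result, reference_ids):
    """Hypothetical helper: map each reference id to its non-self matches."""
    recs = {}
    # Assumption: result lists are ordered like reference_ids.
    for ref_id, ids, docs, dists in zip(
        reference_ids, result["ids"], result["documents"], result["distances"]
    ):
        recs[ref_id] = [
            (doc, dist)
            for hit_id, doc, dist in zip(ids, docs, dists)
            if hit_id != ref_id  # drop the reference title matching itself
        ]
    return recs


for ref_id, hits in recommendations(result, reference_ids).items():
    print(ref_id)
    for doc, dist in hits:
        print(f"  {dist:.2f}  {doc}")
```

Filtering by id rather than by a distance of exactly zero avoids relying on floating-point equality.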
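import csv

ids = []
metadatas = []

# Read one metadata dictionary per title from the CSV file
with open('netflix_titles.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        ids.append(row['show_id'])
        metadatas.append({
            "type": row['type'],
            # Store the year as an integer rather than a string
            "release_year": int(row['release_year'])
        })

# Attach the metadata to the existing items, matched by ID
collection.update(ids=ids, metadatas=metadatas)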
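To see what the loop above produces, here is a self-contained sketch that runs the same parsing over a two-row in-memory CSV. The sample rows and their ids are made up for illustration and are not taken from netflix_titles.csv; only the column names show_id, type, and release_year come from the code above.

```python
import csv
import io

# Made-up sample rows; only the column names match the code above.
sample_csv = io.StringIO(
    "show_id,type,release_year\n"
    "x1,Movie,2021\n"
    "x2,TV Show,2019\n"
)

ids = []
metadatas = []
for row in csv.DictReader(sample_csv):
    ids.append(row["show_id"])
    metadatas.append({
        "type": row["type"],
        # CSV values are read as strings; cast to int for numeric comparisons
        "release_year": int(row["release_year"]),
    })

print(ids)
print(metadatas)
```

The two lists stay aligned by position, which is what lets them be passed together as ids and metadatas.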
result = collection.query(
    query_texts=reference_texts,
    n_results=3,
    where={
        "type": "Movie"
    }
)
where={
    "type": "Movie"
}
is the same as
where={
    "type": {
        "$eq": "Movie"
    }
}
List of operators:
$eq - equal to (string, int, float)
$ne - not equal to (string, int, float)
$gt - greater than (int, float)
$gte - greater than or equal to (int, float)
$lt - less than (int, float)
$lte - less than or equal to (int, float)
where={
    "$and": [
        {"type":
            {"$eq": "Movie"}
        },
        {"release_year":
            {"$gt": 2020}
        }
    ]
}
$or: filter based on at least one condition
Title: A Classic Horror Story (Movie) [...]
===
Title: Nightbooks (Movie) [...]
===
Title: Irul (Movie) [...]
===
Title: Intrusion (Movie) [...]
===
Title: Things Heard & Seen (Movie) [...]
===
Title: A StoryBots Space Adventure (Movie) [...]
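The filter operators listed earlier can be illustrated with a small pure-Python evaluator. This is only a sketch of how the operators behave on a single metadata dictionary, not Chroma's implementation. The matches function, the sample metadata, and the choice that a missing key never matches are all assumptions made for illustration.

```python
import operator

# Comparison operators from the list above, mapped to Python equivalents.
OPERATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata, where):
    """Hypothetical helper: True if metadata satisfies the where filter."""
    for key, condition in where.items():
        if key == "$and":
            # Every sub-filter must hold
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            # At least one sub-filter must hold
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # Shorthand {"type": "Movie"} means {"type": {"$eq": "Movie"}}
            if not isinstance(condition, dict):
                condition = {"$eq": condition}
            # Assumption in this sketch: a missing key never matches
            if key not in metadata:
                return False
            for op, expected in condition.items():
                if not OPERATORS[op](metadata[key], expected):
                    return False
    return True


# Made-up metadata for illustration.
movie_2021 = {"type": "Movie", "release_year": 2021}
show_2019 = {"type": "TV Show", "release_year": 2019}

recent_movie = {"$and": [{"type": {"$eq": "Movie"}},
                         {"release_year": {"$gt": 2020}}]}
movie_or_recent = {"$or": [{"type": "Movie"},
                           {"release_year": {"$gt": 2020}}]}

print(matches(movie_2021, recent_movie))
print(matches(show_2019, recent_movie))
print(matches(show_2019, movie_or_recent))
```

The 2021 movie satisfies both conditions of the $and filter, while the 2019 TV show satisfies neither condition of the $or filter, so it is excluded in both cases.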