Vector Databases for Embeddings with Pinecone
James Chapman
Curriculum Manager, DataCamp
import itertools

def chunks(iterable, batch_size=100):
    """Break an iterable into tuples of at most batch_size items."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index('datacamp-index')
# Upsert the vectors sequentially, one batch per request
for chunk in chunks(vectors):
    index.upsert(vectors=chunk)
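To check what the loop above sends per request, the batching helper can be run on its own without a Pinecone connection. This is a minimal sketch: the 250 placeholder records and their 3-dimensional values are made up for illustration, and a real index would expect values matching its configured dimensionality.

```python
import itertools


def chunks(iterable, batch_size=100):
    """Break an iterable into tuples of at most batch_size items."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))


# Hypothetical data: 250 placeholder records (not real embeddings).
vectors = [{"id": str(i), "values": [0.1, 0.2, 0.3]} for i in range(250)]

# Collect the size of each batch the generator yields.
batch_sizes = [len(batch) for batch in chunks(vectors, batch_size=100)]
print(batch_sizes)  # [100, 100, 50]
```

The last batch is simply whatever remains, so no record is dropped when the total is not a multiple of batch_size.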
Pros:
Cons:
pc = Pinecone(api_key="YOUR_API_KEY", pool_threads=30)
with pc.Index('datacamp-index', pool_threads=30) as index:
    # Send all batches without waiting for each response
    async_results = [index.upsert(vectors=chunk, async_req=True) for chunk in chunks(vectors, batch_size=100)]
    # Wait for every request to finish
    [async_result.get() for async_result in async_results]
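The submit-then-collect pattern above can be tried locally with Python's standard thread pool. This is a sketch only: fake_upsert is a hypothetical stand-in for index.upsert, the records are placeholders, and the pool size of 4 is arbitrary. It assumes the objects returned by async_req=True behave like the results of ThreadPool.apply_async, in that calling .get() blocks until the request has finished.

```python
import itertools
from multiprocessing.pool import ThreadPool


def chunks(iterable, batch_size=100):
    """Break an iterable into tuples of at most batch_size items."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))


def fake_upsert(vectors):
    """Hypothetical stand-in for index.upsert: reports how many records it got."""
    return {"upserted_count": len(vectors)}


# Hypothetical data: 250 placeholder records (not real embeddings).
vectors = [{"id": str(i), "values": [0.1, 0.2, 0.3]} for i in range(250)]

with ThreadPool(processes=4) as pool:
    # Submit every batch immediately; each call returns a pending result.
    async_results = [
        pool.apply_async(fake_upsert, kwds={"vectors": chunk})
        for chunk in chunks(vectors, batch_size=100)
    ]
    # Block until each batch has been processed, inside the pool's context.
    responses = [async_result.get() for async_result in async_results]

total = sum(response["upserted_count"] for response in responses)
print(total)  # 250
```

Collecting the results inside the with block matters here, because leaving the block shuts the pool down before pending work could be retrieved.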