with collection.batch.dynamic() as batch:
    for fact in facts:
        fact_id = uuid.uuid4()
        batch.add_object(
            properties={
                "fact": fact.content,
            },
            vector=fact.vector,
            uuid=fact_id,
        )
        weaviate_uuid_list.append(str(fact_id))
This raises an exception: batch thread died unexpectedly.
Batching already uses async behind the scenes. Adding objects is non-blocking, and it automatically sends multiple concurrent requests. We do not think it makes sense to call this asynchronously.
If you want to do async batching yourself, you can use data.insert_many(). The latest development version (4.7.0-rc.2) also contains an async client that you could try out.
@Dirk thanks for your reply and details.
One aspect is still unclear to me. You said that calling batching from async code is not supported, but in the previous post you commented that batching already uses async behind the scenes. I am confused because I thought batching was not supported in async environments.
Our implementation of batch uses async code to send batches.
So basically, if you call batch.add_object() you add objects to a queue, and then there are background threads that observe that queue and send batches. The internal send_batch function is async to allow for multiple concurrent requests without threading.
But then, why does batching not work from async methods when using FastAPI as the web framework?
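To make the description above more concrete, here is a minimal toy sketch of the same pattern: a non-blocking add_object() that only enqueues, plus a background thread that runs its own event loop and sends batches concurrently. This is not the Weaviate client's actual implementation. TinyBatcher, its parameters, and the sleep that stands in for a network request are all hypothetical and exist only for illustration.

```python
import asyncio
import queue
import threading

_STOP = object()  # sentinel that tells the background thread to finish


class TinyBatcher:
    """Toy model of a queue-backed batcher (hypothetical, not Weaviate code)."""

    def __init__(self, batch_size=3, concurrency=2):
        self._queue = queue.Queue()
        self._batch_size = batch_size
        self._concurrency = concurrency
        self.sent = []  # batches that were "sent", kept for inspection
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add_object(self, obj):
        # Non-blocking for the caller: the object is only put on the queue.
        self._queue.put(obj)

    def close(self):
        # Ask the background thread to flush what is left and wait for it.
        self._queue.put(_STOP)
        self._worker.join()

    def _run(self):
        # The background thread creates and owns its own event loop.
        asyncio.run(self._drain())

    async def _send_batch(self, sem, batch):
        async with sem:  # cap the number of in-flight "requests"
            await asyncio.sleep(0.01)  # stand-in for a network request
            self.sent.append(batch)

    async def _drain(self):
        loop = asyncio.get_running_loop()
        sem = asyncio.Semaphore(self._concurrency)
        tasks, batch = [], []
        while True:
            # Wait for the next queued object without blocking the loop.
            item = await loop.run_in_executor(None, self._queue.get)
            if item is _STOP:
                break
            batch.append(item)
            if len(batch) == self._batch_size:
                tasks.append(asyncio.create_task(self._send_batch(sem, batch)))
                batch = []
        if batch:  # send the final, partially filled batch
            tasks.append(asyncio.create_task(self._send_batch(sem, batch)))
        await asyncio.gather(*tasks)


if __name__ == "__main__":
    batcher = TinyBatcher(batch_size=3)
    for i in range(7):
        batcher.add_object(i)
    batcher.close()
    print(len(batcher.sent), "batches sent")
```

Note that the state in such a design (the queue, the worker thread, its event loop) is shared, which is one reason a batcher like this is better suited to one long-running import than to many short concurrent calls.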
I think (but not 100% sure) because we also create event loops inside of the batching
Regarding "using fastapi as web framework": I am not a FastAPI expert, but our batching algorithm is mainly aimed at long-running tasks (e.g. 1000+ objects) and is not thread-safe.
I would recommend one of the following:
use data.insert()/insert_many() and build your own async wrapper around them
install weaviate-client==4.7.0-rc.2 and test our new async client (please do not use it directly in production). You can then do:
async with weaviate.use_async_with_local() as async_client:
    collection = async_client.collections.get(name)
    await collection.data.insert()/insert_many(....)
give us feedback if anything does not work as expected
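For the first option, one possible shape of such an async wrapper is sketched below. It assumes a collection object from the synchronous client that exposes data.insert_many(), as mentioned above. The helper name insert_many_async is hypothetical, and asyncio.to_thread requires Python 3.9 or newer. The blocking call is moved to a worker thread so the web framework's event loop stays free while the request is in flight.

```python
import asyncio
from typing import Any, Iterable


async def insert_many_async(collection: Any, objects: Iterable[Any]) -> Any:
    """Run the blocking collection.data.insert_many() call in a worker thread.

    Hypothetical helper: `collection` is assumed to be a collection from the
    synchronous client. The return value is whatever insert_many() returns.
    """
    return await asyncio.to_thread(collection.data.insert_many, list(objects))
```

Inside an async request handler this would be used as `result = await insert_many_async(collection, objects)`. Whether it is safe to have several such calls in flight at once on one client is not confirmed in this thread, so it is worth testing under concurrent load before relying on it.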
Hi @Dirk ,
to my surprise, batching is working as expected using FastAPI and async methods. The problem was that I was running PyCharm in debug mode, and in this mode batching, async and event loops are not good friends.
I would be careful here. I think it often comes down to timing, and the debug mode might just have the “wrong” timing. That means anything that changes the timing (a different query, weird user input) could bring the error back.
@tsmith023 - you are more experienced with async, what do you think?
Yes, of course @Dirk, we will keep watching whether it works as expected. Anyway, I think there is a plan to release a Weaviate async client. Do you know when it will be released?