Python `v4` client - feedback megathread!

Hi Everybody,

Our amazing engineering team has delivered a brand-new Python client with a revised API that we think is even more developer-friendly!

The key features are:

  • New streamlined syntax for interacting with Weaviate in a Python-native way

    • No more builder methods
    • No more raw dictionaries
  • Full gRPC support for batch imports and searches

  • Generics syntax for end-to-end type safety of your data objects

  • Pydantic client-side runtime data cleansing logic

  • Handy methods to get up-and-running as fast as possible

  • As easy as:

    • import weaviate
    • client = weaviate.connect_to_local()
    • articles = client.collections.get("Articles")

Check it out by installing it with `pip install --pre "weaviate-client==4.*"` and using it with Weaviate 1.22 or higher.

Yes - it’s receiving a major version bump!

Some barebones docs will be up soon (tomorrow), and we’ll build it up as we go.

Please leave any feedback here so that we can track it more easily in one thread.

Thanks everybody.


When will we also get Async client support?

Hi @Mohit_Singla :wave:. Thanks for asking.

This is something that we would definitely like to tackle, but we can’t provide a timeline at this point. Sorry about that!

Just to share some first impressions: the API is indeed smoother. However, for the schema I prefer the JSON way, as it lets us store the schema elsewhere. We have an internal toolkit used for several projects, and having everything in code is not possible. Right now we create the schema with the legacy API and import with the new one.

Hi @pommedeterresautee !

Oh, that’s an interesting data point. Would it work for your use case if we added a method to fetch the full JSON definition for the collection? (Like an articles.config.get() or articles.config.raw() method, where articles is the collection.)

Or would you also want the collection creation to take JSON inputs? :thinking:
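
As a rough illustration of the workflow described above (schema kept outside the code, then created with the legacy API), a minimal sketch might look like the following. The file name `articles_schema.json`, the helper `load_schema`, and the schema contents are all hypothetical; the commented-out call at the end is the v3 client's `client.schema.create_class`, which accepts a class definition as a dictionary.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical class definition, stored as JSON so it can live outside the code.
SCHEMA_JSON = """
{
  "class": "Articles",
  "vectorizer": "text2vec-openai",
  "properties": [
    {"name": "title", "dataType": ["text"]},
    {"name": "body", "dataType": ["text"]}
  ]
}
"""


def load_schema(path: Path) -> dict:
    """Read a class definition from a JSON file and run a basic sanity check."""
    schema = json.loads(path.read_text(encoding="utf-8"))
    if "class" not in schema:
        raise ValueError(f"{path} has no 'class' key")
    return schema


# Write the example to a temporary file so the sketch is self-contained.
schema_path = Path(tempfile.mkdtemp()) / "articles_schema.json"
schema_path.write_text(SCHEMA_JSON, encoding="utf-8")

schema = load_schema(schema_path)
property_names = [p["name"] for p in schema["properties"]]

# With a running Weaviate instance and the v3 (legacy) client, the dictionary
# could then be passed on unchanged:
#   client.schema.create_class(schema)
```

Objects could then be imported with the v4 client as usual, which matches the split approach mentioned in the post.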

The documentation suggests that this is available to use with WCS, but the client doesn’t appear to have a WCS connection implemented. Any ETA on when this will be available?

Hi @asido - sorry about that.

I’m updating the page to reflect this. WCS clusters are currently not compatible with the v4 client as we are adding gRPC support.

There’s no exact ETA on this currently, but the last I heard was that it will be a few weeks.

Thanks - does the V4 client add any new features or just syntax changes?

For example, batch deleting based on a list of IDs isn’t available in the v3 client. Would the new syntax support new operations or the same ones?

Hey @Adam_Hughes - sorry for the late reply; I’ve been on holiday and I think this one slipped through the cracks.

There are new features for sure - gRPC support is the big one, which should speed up imports and queries significantly.

Re: deletion by multiple IDs - unless I am mistaken, I think that is supported through batch deletes.

I think this works for me w/ the latest v4 client, and a similar syntax should work with the v3 client as well.

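import weaviate.classes as wvc  # assumed import alias for the v4 beta

# `client` is assumed to be an already-connected v4 client
questions = client.collections.get("Question")
response = questions.query.fetch_objects(limit=3)

# Collect the UUIDs of the fetched objects
ids = [o.uuid for o in response.objects]

# Delete every object whose ID is in the list
questions.data.delete_many(
    where=wvc.Filter("id").contains_any(ids)
)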

Cool, thanks - I guess I had assumed deleting by ID would use its own method instead of a Filter/contains_any, but this seems fine too!

Any plans for supporting a "not like" filter?


Hi @axeloh - we are looking at this with:
Improved `NotEqual` operator · Issue #3319 · weaviate/weaviate · GitHub and
[Feature Request] Add "Not" operator in filter to support "Not Like" etc. in Get queries · Issue #3683 · weaviate/weaviate · GitHub

I am not sure exactly when it will make it to a release, but it’s definitely being looked at closely.

The clients will then be updated to follow, of course :).


Hi, I’m looking at the Python v4 client again now that it works with WCS, and the query fetch_objects function returns a _QueryReturn containing _Object objects.

Is there any particular reason these return internal classes? I’m writing a function to map the response objects but can’t have proper typing unless I import _Object from the internal package, which I guess isn’t an intended workflow.

It can be done without typing but that seems a waste when the classes are already there.

Hi @asido - welcome back. And yes, WCS works now with the V4 client (we’re working on making the sandboxes compatible).

Hmm. That’s an interesting point re: typing. You should still get IDE autocompletion for attributes, as shown here, regardless of typing.

But I will check with the devs as to pros/cons of exposing these response classes directly.

Thanks for your input!
JP

Thanks, that’s true but only if you’re using the response object in the same function that ran the query because it knows the return type. That’s not the case in other functions.

As an example, I have multiple functions that run different Weaviate queries. I need to map the list of returned Weaviate objects to domain objects, and I don’t want to do this in every method that runs a query, so I use mapping functions. As it stands, functions like this cannot have proper typing.

Overall though the new client is nice. Definitely a cleaner dev experience.
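
One possible interim workaround for the mapping-function case above is structural typing: describe only the attributes the mapper reads, rather than importing the internal `_Object` class. The sketch below is an assumption-laden illustration, not the client's API: `WeaviateObjectLike`, `Question`, `to_questions`, and `_FakeObject` are hypothetical names, and it assumes the returned objects expose `uuid` and a dictionary-like `properties` attribute.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping, Protocol


class WeaviateObjectLike(Protocol):
    """Structural type: anything exposing `uuid` and `properties` matches."""

    @property
    def uuid(self) -> Any: ...

    @property
    def properties(self) -> Mapping[str, Any]: ...


@dataclass
class Question:
    """Hypothetical domain object."""

    id: str
    text: str


def to_questions(objects: Iterable[WeaviateObjectLike]) -> List[Question]:
    """Map response objects to domain objects without importing internal classes."""
    return [Question(id=str(o.uuid), text=o.properties["question"]) for o in objects]


@dataclass
class _FakeObject:
    """Stand-in for a response object, used only to exercise the mapper."""

    uuid: str
    properties: Mapping[str, Any]


mapped = to_questions(
    [
        _FakeObject(uuid="id-1", properties={"question": "What is gRPC?"}),
        _FakeObject(uuid="id-2", properties={"question": "What is a vector?"}),
    ]
)
```

With a real query, the call would presumably be `to_questions(response.objects)`; a type checker would accept it as long as the returned objects have matching attributes.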

Yeah.

I brought this up with the devs, and it looks like they generally agree with you. I can’t make any promises, but at this point we’re looking at exposing those classes through a separate submodule (so as to not clutter weaviate.classes).

Thanks so much for the feedback! :slight_smile:


Hi @jphwang,

The new Python client looks pretty nice and easy to use! Is there a way to use it with Azure currently? I did not find any way to specify resource_name and deployment_id.

One can only set base_url, which does not seem to be enough:

parameters = {
    "collection_name": "Test6Collection",
    "vectorizer_config": wvc.Configure.Vectorizer.text2vec_openai(
        base_url="https://COMPANYINSTANCE.openai.azure.com/",
        model="ada",
        model_version="002",
    ),
    "generative_config": wvc.Configure.Generative.openai(),
    "properties": [
        wvc.Property(name="property1", data_type=wvc.DataType.TEXT),
        wvc.Property(name="property2", data_type=wvc.DataType.TEXT),
    ],
}


weaviate_client = weaviate.WeaviateClient(
    connection_params=weaviate.ConnectionParams.from_params(
        http_host="0.0.0.0",
        http_port="8080",
        http_secure=False,
        grpc_host="0.0.0.0",
        grpc_port="50051",
        grpc_secure=False,
    ),
    auth_client_secret=weaviate.AuthApiKey(weaviate_secret_key),
    additional_headers={
        "X-Azure-Api-Key": azure_openai_key,
    },
)

Would be great to get your feedback on this!

Sure, a clean fetching API would help. Still, the most important thing is the ability to create a schema from JSON files (alongside the more code-based approach).

Hi @c-lara - sorry about the late reply, I had not seen this earlier.

There is a separate method (text2vec_azure_openai) for this use case - so you should be able to use that. I include a screenshot from the latest 4.4b4 beta below.


Hi All,

I am unable to create a collection with a text2vec_openai vectorizer using v4 and Azure OpenAI. We have provided our Azure credentials as shown in the previous message, but the request is timing out. We noticed that there are back-to-back "/" characters after our base URL in the timeout error, and we are not sure whether this is an issue or whether the web server handles it automatically.

Is anyone else facing this or a similar issue? Any pointers or links to up-to-date documentation would be appreciated. The documentation link in the previous post is not an active webpage.

Best,
Alan
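
On the back-to-back "/" characters: a minimal sketch of how they can arise in general is shown below. Whether Weaviate builds the request URL this way is an assumption, as is the path `/openai/deployments`; `join_url` is a hypothetical helper. The point is only that a base URL ending in "/" concatenated with a path starting with "/" yields "//".

```python
def join_url(base_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base_url = "https://COMPANYINSTANCE.openai.azure.com/"  # trailing slash, as in the thread
path = "/openai/deployments"  # hypothetical path, for illustration only

# Plain concatenation keeps both slashes.
naive = base_url + path

# Normalizing both sides first avoids the doubled slash.
joined = join_url(base_url, path)
```

If the doubled slash does turn out to matter, one thing that may be worth trying is passing base_url without the trailing "/".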