AzureOpenAI embedding with Java client

Hi,
I am trying to vectorize some PDFs with Azure OpenAI and the Weaviate Java client, but it throws an exception about a missing OpenAI key. That seems wrong, since it should be using the Azure endpoint.

Schema:

Map<String, Object> moduleConfig = new HashMap<>() {{
                put("text2vec-openai", new HashMap<>(){{
                    put("resourceName", "canada-aoai");
                    put("deploymentId", "embedding");
                }});
        }};

        Map<String, Object> moduleConfigSkip = new HashMap<>() {{
                put("text2vec-openai", new HashMap<>(){{
                    put("skip", true);
                }});
        }};

        Map<String, Object> moduleConfigNoSkip = new HashMap<>() {{
                put("text2vec-openai", new HashMap<>(){{
                    put("skip", false);
                }});
        }};

        Result<Boolean> run = client.schema().classCreator()
                .withClass(WeaviateClass.builder()
                        .className("CVs")
                        .description("collection of CVs")
                        .vectorizer("text2vec-openai")
                        .moduleConfig(moduleConfig)
                        .properties(List.of(
                                Property.builder().name("title").dataType(List.of("text"))
                                        .moduleConfig(moduleConfigSkip).build(),
                                Property.builder().name("filepath").dataType(List.of("text"))
                                        .moduleConfig(moduleConfigSkip).build(),
                                Property.builder().name("contentVector").dataType(List.of("text"))
                                        .moduleConfig(moduleConfigNoSkip).build()
                        ))
                        .build()).run();

WeaviateClient:

 Map<String, String> headers = new HashMap<>() { {
            put("X-Azure-Api-Key", "<MY_AZURE_OpenAI_Key>");
        } };

        Config config = new Config("http", "localhost:8080", headers);
        WeaviateClient client = new WeaviateClient(config);

Adding PDF objects to Weaviate (Resume is a POJO with title, filepath, and contentVector as fields):

Gson gson = new Gson();
                String json = gson.toJson(resume);
                Map<String, Object> map = gson.fromJson(json, new TypeToken<Map<String, Object>>() {}.getType());

                Result<WeaviateObject> result = client.data().creator().withClassName("CVs").withProperties(map).run();
                // getError() is null when the call succeeds, so check first
                if (result.hasErrors()) {
                    System.out.println(result.getError().getMessages());
                }

I also tried using ObjectsBatcher to add the Resume objects. Then I get an error when I try to query:

  client.graphQL().get().withClassName("CVs")
                .withFields(Field.builder().name("title").build())
                .withNearText(NearTextArgument.builder().concepts(new String[]{"springboot"}).build())
                .withLimit(3)
                .run().getResult();

Maybe I am missing something small and trivial, but I cannot figure it out.
I am using docker-compose locally with the text2vec-openai module enabled.

The error I get: API Key: no api key found neither in request header: X-Openai-Api-Key nor in environment variable under OPENAI_APIKEY

Thanks
Soham

Hi Soham,

You probably need to provide the baseURL like so:

put("baseURL", "https://COMPANYINSTANCE.openai.azure.com/");

Hope this solves your issue!
C.

No, that didn’t help.
I already tried “X-OpenAI-BaseURL”, which according to the docs is the way to set the endpoint.

But to be honest, if that were the cause, then the error message would be wrong.

BTW: modules/text2vec-openai/clients/vectorizer.go on master in the weaviate/weaviate repository (github.com) suggests that something goes wrong when it builds the URL for Azure. I am not sure if I should open an issue, since I don’t see any working examples of Azure with Weaviate.

I see. Setting baseURL solved the problem for me when using the Python client. I got the same error there, so yes, the error message is misleading. But then there seems to be an issue with how it builds the URL. I hope someone can help you with that.

Thanks for your reply.

Do you have your Python client example with Azure OpenAI available somewhere, for example on GitHub? It would be nice to see how you defined the schema and added data; maybe I can find some pointers there.

We discussed it in this thread: "Incorrect API key provided" Error when working with Azure OpenAI - #5 by c-lara

Thanks. No, this doesn’t work for the Java client.
It looks to me like an issue with the SDK, unless anyone else has any ideas?

@antas-marcin can you maybe help with this?

I have the solution.

The problem was how “moduleConfig” was declared.

Correct way:

        Map<String, Object> text2vec = new HashMap<>();
        text2vec.put("baseURL", "https://<resource_name>.openai.azure.com/");
        text2vec.put("resourceName", "<resource_name>");
        text2vec.put("deploymentId", "<deployment_name>");

        Map<String, Object> moduleConfig = new HashMap<>();
        moduleConfig.put("text2vec-openai", text2vec);

Wrong way:

Map<String, Object> moduleConfig = new HashMap<>() {{
                put("text2vec-openai", new HashMap<>(){{
                    put("baseURL", "https://<resource_name>.openai.azure.com/");
                    put("resourceName", "<resource_name>");
                    put("deploymentId", "<deployment_name>");
                }});
        }};

I believe the way this object is later transformed to JSON doesn’t work with the second way of initializing a Map.
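
A small sketch of why the two declarations can behave differently even though they hold the same entries: double-brace initialization creates an anonymous subclass of HashMap, not a plain HashMap. Gson is documented to exclude anonymous classes from serialization; that the Weaviate Java client serializes the schema with Gson is an assumption here, and `DoubleBraceCheck` is just a hypothetical name for the demo.

```java
import java.util.HashMap;
import java.util.Map;

class DoubleBraceCheck {
    // "Wrong way": double-brace initialization, which yields an anonymous HashMap subclass
    static Map<String, Object> doubleBrace() {
        return new HashMap<String, Object>() {{
            put("resourceName", "<resource_name>");
        }};
    }

    // "Correct way": a plain HashMap filled with put()
    static Map<String, Object> plain() {
        Map<String, Object> m = new HashMap<>();
        m.put("resourceName", "<resource_name>");
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> a = doubleBrace();
        Map<String, Object> b = plain();
        System.out.println(a.getClass().isAnonymousClass()); // true
        System.out.println(b.getClass() == HashMap.class);   // true
        System.out.println(a.equals(b));                     // true: same entries, different class
    }
}
```

So the contents are identical, but a serializer that inspects the runtime class can treat the two maps differently.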


Glad that you found the reason. I was about to paste you the same solution.
Yeah, with the second approach the moduleConfig doesn’t get serialized, and unfortunately you end up with a class without those options.
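
If existing code already builds its config with double-brace initialization, one possible workaround is to copy it into plain maps before handing it to the client. This is only a minimal sketch: `PlainMaps` and `toPlainMap` are hypothetical helper names, not part of the Weaviate client, and it only handles nested maps (not lists of maps).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlainMaps {
    // Recursively copies a map (and any nested maps) into plain LinkedHashMap
    // instances, so no anonymous subclasses remain in the structure.
    static Map<String, Object> toPlainMap(Map<?, ?> source) {
        Map<String, Object> copy = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : source.entrySet()) {
            Object value = e.getValue();
            copy.put(String.valueOf(e.getKey()),
                    value instanceof Map ? toPlainMap((Map<?, ?>) value) : value);
        }
        return copy;
    }
}
```

With this, the schema builder call could take `.moduleConfig(PlainMaps.toPlainMap(moduleConfig))`. I have not tested this against a running Weaviate instance, so building the maps with plain `put()` calls as in the accepted solution remains the safer route.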
