V4 Client can't parse my custom url

I have a problem connecting to a custom instance because of the way the v4 client builds the HTTP connection URL.

My url where the server is located looks like this:
https://www.host.com/cloud/weaviate/

The way v4 parses the http url is currently as follows:

    @property
    def _http_url(self) -> str:
        return f"{self._http_scheme}://{self.http.host}:{self.http.port}"

I can’t connect to the server because it creates this url:
https://www.host.com/cloud/weaviate/:443/v1/meta

How should I solve this issue if I can’t get rid of “/cloud/weaviate/” part of the url?
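
To make the failure concrete, here is a minimal sketch that reproduces the same string formatting outside the client. `build_http_url` is a hypothetical stand-in that mirrors the `_http_url` property quoted above; it is not the client's actual code path. The sketch also shows how the standard library's `urlsplit` separates the host from the sub-path, which is the split that would be needed before the port is appended.

```python
from urllib.parse import urlsplit


def build_http_url(scheme: str, host: str, port: int) -> str:
    # Hypothetical stand-in that mirrors the f-string in `_http_url` above.
    return f"{scheme}://{host}:{port}"


# Passing the sub-path as part of the host puts the port after the path.
broken = build_http_url("https", "www.host.com/cloud/weaviate/", 443)
print(broken)  # https://www.host.com/cloud/weaviate/:443

# Splitting the full URL first keeps host and sub-path apart.
parts = urlsplit("https://www.host.com/cloud/weaviate/")
print(parts.hostname)  # www.host.com
print(parts.path)      # /cloud/weaviate/

# Host and port go first, the sub-path is appended afterwards.
fixed = build_http_url(parts.scheme, parts.hostname, 443) + parts.path.rstrip("/")
print(fixed)  # https://www.host.com:443/cloud/weaviate
```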

hi @Radomir_Babek !!

That’s an interesting edge case :grimacing:

For now, and only because of that, I would say that Weaviate doesn’t support being exposed on a subpath.

I also don’t know whether gRPC allows being exposed on a subpath.

I will need to escalate this to our client team, as they will have to help us here.

Hang tight :slight_smile:

Thanks!

Hello!

Could you try this PR? I added an optional path keyword to the connect_to_custom method.

You can install it by either downloading the wheel from here or doing

    pip install git+https://github.com/weaviate/weaviate-python-client.git@2adaae3605a6a6a97bcc0babc36f059e6b76926a

I can’t quickly test it myself and I’m not sure whether gRPC works with custom paths, but please give it a try :slight_smile:

Hello Dirk and Duda,
thank you very much for your support.

There is a small typo in the PR: line 141 of base.py uses grpc.path instead of http.path.

Additionally, there is a validation check that forbids using the same host and port for HTTP and gRPC, and it doesn’t take differing paths into account.
This results in the following error message:
“Value error, http.port and grpc.port must be different if using the same host”.

You are probably right that gRPC will not allow exposure on a subpath. Trying to connect to the gRPC server raises the WeaviateGRPCUnavailableError. Unfortunately, I can’t check right now whether this is a firewall issue.

Not sure what to do right now, anyway, thank you for your answers.

Hi, sorry, I missed those! Could you try again with commit 29d6e6e8670b9f6a92daa4ecf62a9676a123d4f6? Just replace the commit hash in the pip command above.

You should also be able to directly create a weaviate.WeaviateClient(). You’d just need to create the connection_params: Optional[ConnectionParams] yourself.

Hey,

I’m testing this setup and trying to provide as much context as possible, which seems like a lot to dump on this forum. Would a GitHub issue be a better place for all this info, or should I use something like pastebin for the logs below?

I am also hosting an instance with a subpath and an auth header requirement and was having trouble with this config:

    import weaviate

    # get_bearer_token() is my own helper (not shown here)
    token = get_bearer_token()
    host = "subdomain.labs.company.com"
    client = weaviate.connect_to_custom(
        http_host=host,  # Hostname for the HTTP API connection
        http_port=443,
        path="/8b449c1e-bdfd-4d87-b6f5-70b31bd03ae5/llm-tool-weaviate-master-weaviate/v1",
        http_secure=True,
        grpc_host=host + "grpc",
        grpc_port=443,
        grpc_secure=True,
        headers={"Authorization": token},
    )

Created wheel for weaviate-client: filename=weaviate_client-4.9.1.dev4+g29d6e6e8-py3-none-any.whl size=378508 sha256=7200425a349cc28d48a99a65035ac4242f9fa07b811bb609257918d4450396bb
Stored in directory: /home/lochy/.cache/pip/wheels/be/17/ea/f64fb007c3c821e6b3b0d4350b014e7e57e528bb653af3d0fe
Successfully built weaviate-client
Installing collected packages: weaviate-client
Attempting uninstall: weaviate-client
Found existing installation: weaviate-client 4.9.0
Uninstalling weaviate-client-4.9.0:
Successfully uninstalled weaviate-client-4.9.0
Successfully installed weaviate-client-4.9.1.dev4+g29d6e6e8

    Traceback (most recent call last):
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/demo.py", line 26, in <module>
        client = weaviate.connect_to_custom(
    TypeError: connect_to_custom() got an unexpected keyword argument 'path'

Changed to http_path and got this error instead:

    Traceback (most recent call last):
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/demo.py", line 32, in <module>
        client = weaviate.connect_to_custom(
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/helpers.py", line 390, in connect_to_custom
        return __connect(
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/helpers.py", line 416, in __connect
        raise e
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/helpers.py", line 412, in __connect
        client.connect()
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/syncify.py", line 23, in sync_method
        return _EventLoopSingleton.get_instance().run_until_complete(
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/event_loop.py", line 40, in run_until_complete
        return fut.result()
      File "/home/lochy/.asdf/installs/python/3.10.13/lib/python3.10/concurrent/futures/_base.py", line 458, in result
        return self.__get_result()
      File "/home/lochy/.asdf/installs/python/3.10.13/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
        raise self._exception
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/client_base.py", line 153, in connect
        await self._connection.connect(self._skip_init_checks)
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/v4.py", line 146, in connect
        await self._open_connections(self._auth, skip_init_checks)
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/v4.py", line 242, in _open_connections
        self.__make_clients()
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/v4.py", line 227, in __make_clients
        self._client = self.__make_async_client()
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/weaviate/connect/v4.py", line 220, in __make_async_client
        return AsyncClient(
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1389, in __init__
        super().__init__(
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/httpx/_client.py", line 183, in __init__
        self.headers = Headers(headers)
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/httpx/_models.py", line 72, in __init__
        self._list = [
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/httpx/_models.py", line 76, in <listcomp>
        normalize_header_value(v, encoding),
      File "/home/lochy/repos/llm-tools/llm-tool-weaviate/.venv/lib/python3.10/site-packages/httpx/_utils.py", line 53, in normalize_header_value
        return value.encode(encoding or "ascii")
    AttributeError: 'NoneType' object has no attribute 'encode'
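
The last frames of this traceback show httpx calling `.encode()` on a header value that is `None`, so at least one entry in the headers handed to `AsyncClient` has no value. Below is a minimal sketch for checking a headers dict before connecting. `find_empty_headers` is a hypothetical helper, not part of weaviate or httpx, and whether the `None` comes from `get_bearer_token()` or from somewhere inside the client is an assumption that would still need to be confirmed.

```python
from typing import Dict, List, Optional


def find_empty_headers(headers: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of headers whose value is None (hypothetical helper)."""
    return [name for name, value in headers.items() if value is None]


# Example: a token lookup that returned nothing.
token = None
headers = {"Authorization": token, "X-Example": "ok"}

missing = find_empty_headers(headers)
if missing:
    print(f"Headers without a value: {missing}")
```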

Would it be possible to get access to a Weaviate instance with a custom path? It is hard for me to debug this if I can’t try it out, and I don’t have the time to set one up myself. Feel free to send me a DM here or on Slack.

Unfortunately not; it’s a work instance with no external access.
I’m not sure whether the issue is specific to my instance/setup or a general client configuration issue.
I’ve been able to curl my instance successfully, but so far I’m having trouble replicating the REST connection via the Python client.
If there’s any way to debug further or grab other logs, that might be good too.

Bummer

Do you feel comfortable digging into the client with a debugger?
It looks like something is None here that should not be: weaviate/connect/v4.py, line 220, in __make_async_client

I can see it’s building this URL:
'https://subdomain.subdomain.domain.com:443/8b449c1e-bdfd-4d87-b6f5-70b31bd03ae5/llm-tool-weaviate-master-weaviate/v1/'

Could the port there be causing issues?
I can’t set the port to None either.

Could you try to directly initialize weaviate.WeaviateClient(connection_params: Optional[ConnectionParams]) without going through our helpers? Then you should be able to put in any combination. It might also make sense to create your own ConnectionParams class by inheriting from ours and overriding the functions that return the paths.

Sorry that I can’t help more, but without a setup it is hard to figure out what could go wrong
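
The override idea can be sketched without the client installed. `BaseParams` below is a simplified stand-in for the client's connection parameters, reduced to the `_http_url` property quoted at the top of the thread; `SubpathParams` and its `http_path` field are hypothetical names. With the real client, the subclass would inherit from `ConnectionParams` instead, and other URL-building properties may need the same treatment.

```python
from dataclasses import dataclass


@dataclass
class BaseParams:
    # Simplified stand-in, not the real weaviate ConnectionParams.
    http_host: str
    http_port: int
    http_secure: bool = True

    @property
    def _http_url(self) -> str:
        scheme = "https" if self.http_secure else "http"
        return f"{scheme}://{self.http_host}:{self.http_port}"


@dataclass
class SubpathParams(BaseParams):
    # Hypothetical extra field holding the sub-path the server is exposed on.
    http_path: str = ""

    @property
    def _http_url(self) -> str:
        # Reuse the parent URL and append the normalized sub-path.
        base = super()._http_url
        path = self.http_path.strip("/")
        return f"{base}/{path}" if path else base


params = SubpathParams("www.host.com", 443, http_path="/cloud/weaviate/")
print(params._http_url)  # https://www.host.com:443/cloud/weaviate
```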

I’m getting SSL issues on my work machine. Any suggestions on how to ignore them here?
I’ve tried using our work certificates, which haven’t worked so far.

hi @light !

You may be behind an SSL-inspecting HTTP proxy of some sort, like Zscaler or FortiGate.

You can work around this by getting the Root CA and patching the certifi one.

This is an example for Zscaler:

    cat ZscalerRootCA.pem >> $(python -m certifi)

as per this doc

Let me know if this is your scenario.

Thanks
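
If editing the certifi bundle in place is not desirable, a variation on the command above is to write a separate, combined bundle and leave the original untouched. The following is a minimal sketch using only the standard library; `build_ca_bundle` and the file names are hypothetical, and how the resulting file is then handed to the client depends on the setup, so that part is left open.

```python
from pathlib import Path


def build_ca_bundle(base_bundle: Path, extra_ca: Path, out_path: Path) -> Path:
    """Write base_bundle followed by extra_ca to out_path (hypothetical helper)."""
    base = base_bundle.read_text()
    extra = extra_ca.read_text()
    # Make sure the two PEM blocks are separated by a newline.
    if not base.endswith("\n"):
        base += "\n"
    out_path.write_text(base + extra)
    return out_path
```

Assuming certifi is installed, the base bundle path would be `Path(certifi.where())` and the extra CA would be the exported root certificate, for example `Path("ZscalerRootCA.pem")`.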

I think our company uses Palo Alto/GP for packet inspection, which replaces all our certs. That can be annoying to work around, so I often just set verify=false or otherwise ignore TLS/SSL checks.

I also don’t have a gRPC endpoint set up yet. I’m not sure whether a fake one will cause issues, or whether it would let us use REST only.

Hi, sorry for the delayed response. HTTP works just fine. I can access the meta endpoint, so I guess there won’t be other problems on that side.

Unfortunately, after some research I came to the conclusion that gRPC cannot be hosted on a custom path. We as a team will probably have to do some additional setup. It’s up to you whether you want to include the HTTP path option.

EDIT: I should probably include the error message. It seems to fail on DNS resolution; I might look into it a little more.

    Traceback (most recent call last):
      File "X\Python310\site-packages\weaviate\connect\v4.py", line 701, in _ping_grpc
        res: health_pb2.HealthCheckResponse = await self._grpc_channel.unary_unary(
      File "X\Python310\site-packages\grpc\aio\_call.py", line 327, in __await__
        raise _create_rpc_error(
    grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "DNS resolution failed for https://www.Y.com:443/weaviate/grpc: UNAVAILABLE: getaddrinfo: WSA Error (Unable to retrieve error string – 11001)"
        debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"DNS resolution failed for https://www.Y.com:443/weaviate/grpc UNAVAILABLE: getaddrinfo: WSA Error (Unable to retrieve error string – 11001)", grpc_status:14, created_time:"2024-10-29T07:37:51.2977311+00:00"}"
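
The error text shows the resolver being asked to look up the whole string `https://www.Y.com:443/weaviate/grpc` as a name, scheme and path included. Below is a minimal sketch of a pre-flight check that flags such values before they reach the channel. `looks_like_grpc_target` is a hypothetical helper, and the rule it applies (bare `host` or `host:port` only) is a deliberate simplification, not gRPC's full target syntax; it does not handle IPv6 literals, for instance.

```python
def looks_like_grpc_target(value: str) -> bool:
    """Return True for a bare host or host:port, False if a scheme or path is present.

    Hypothetical helper; deliberately stricter than gRPC's own target parsing.
    """
    if "://" in value or "/" in value:
        return False
    host, sep, port = value.partition(":")
    if not host:
        return False
    # Either no port at all, or a purely numeric one.
    return not sep or port.isdigit()


print(looks_like_grpc_target("https://www.Y.com:443/weaviate/grpc"))  # False
print(looks_like_grpc_target("www.Y.com:443"))  # True
```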

Thank you for getting back! I think in this case we will not add an additional path argument for HTTP.

Very much understandable. Thank you for your effort