How to use Weaviate with LM Studio?

Description

I have one of the Weaviate Docker configurations running on my laptop, and I also have LM Studio running locally so I can serve the Llama 3 LLM through an OpenAI-compatible endpoint.

However, unlike the Ollama modules, which take an apiEndpoint parameter, I don’t see how to set a custom local endpoint for the OpenAI module.
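For comparison, with the Ollama modules I can point at my local server roughly like this (a sketch based on the client docs as I read them; the model names are just examples):

import weaviate
from weaviate import classes as wvc

client = weaviate.connect_to_local()

# The Ollama modules expose an api_endpoint parameter directly
client.collections.create(
    name="Question",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="nomic-embed-text",  # example embedding model
    ),
    generative_config=wvc.config.Configure.Generative.ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="llama3",  # example generative model
    ),
)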

What’s the best way to call into an LM Studio LLM from a Weaviate module?

hi @ctindel !!

Welcome to our community :hugs:

I have not played with LM Studio! Just downloaded it here. Awesome project!

You can set a base URL at query time (see the header sketch further down) or define it at collection creation.

Here is the code I crafted, based on our quickstart:

import weaviate
from weaviate import classes as wvc

client = weaviate.connect_to_local()

client.collections.delete("Question")
questions = client.collections.create(
    name="Question",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(
        base_url="http://host.docker.internal:1234"
    ),  # Point the text2vec-openai vectorizer at the local LM Studio endpoint
    generative_config=wvc.config.Configure.Generative.openai(
        base_url="http://host.docker.internal:1234"
    )  # Ensure the `generative-openai` module is used for generative queries
)

# insert data
import requests, json
resp = requests.get('https://raw.githubusercontent.com/weaviate-tutorials/quickstart/main/data/jeopardy_tiny.json')
data = json.loads(resp.text)  # Load data

question_objs = list()
for i, d in enumerate(data):
    question_objs.append({
        "answer": d["Answer"],
        "question": d["Question"],
        "category": d["Category"],
    })

questions = client.collections.get("Question")
questions.data.insert_many(question_objs)

# now query

questions = client.collections.get("Question")

response = questions.query.near_text(
    query="biology",
    limit=2
)
print(response.objects[0].properties)

# now let's generate some content
generate = questions.generate.near_text(limit=2, query="biology", grouped_task="generate a tweet about the questions {question}")
print(generate.generated)

the output:

Here’s a tweet about the questions:

“Did you know? Watson & Crick built a model of DNA in 1953! And, did you know that our liver is responsible for removing excess glucose from the blood and storing it as glycogen? Mind blown! #ScienceFacts #DNA #LiverFunction

Explanation:

The response generates a tweet by combining information from the provided questions. The first question asks about the molecular structure of DNA, which Watson & Crick built in 1953. The second question talks about the liver’s role in removing excess glucose and storing it as glycogen. The response combines these facts into a concise and engaging tweet that includes relevant hashtags to make it discoverable by others interested in science and health-related topics.

Looking at the LM Studio logs, I can see the embedding requests going through, and the generate requests as well.
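By the way, about the “at query time” option: instead of baking the base URL into the collection config, you can send it with the request headers. A hedged sketch (header names taken from the OpenAI module docs, so double-check them for your version):

import weaviate

# Headers are forwarded to the modules on each request, so the base URL is
# picked up at query time instead of from the collection config.
client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": "THIS_WILL_BE_IGNORED",
        "X-OpenAI-BaseURL": "http://host.docker.internal:1234",
    }
)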

Let me know if this helps!

Thanks!

Yes, that is very helpful!

It’s confusing because the docs page for the text2vec-openai module (text2vec-openai | Weaviate - Vector Database) refers to this parameter in camel case, as “baseURL”.

However now I’m getting this error:

Failed to import 1 objects
e.g. Failed to import object with error: WeaviateInsertManyAllFailedError('Every object failed during insertion. Here is the set of all errors: send POST request: Post "http://host.docker.internal:1234/v1/embeddings": context deadline exceeded (Client.Timeout exceeded while awaiting headers)')

And I’m wondering if generating the vector is just taking too long because it’s a large object. Is there an equivalent in text2vec-openai of the text_fields parameter used in multi2vec-bind? Like, without using named vectors, how do I tell text2vec-openai which properties to use for generating the vector?
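What I’m imagining is something along these lines (a rough sketch; I’m guessing at the skip_vectorization flag on Property from the v4 client docs, so it may not be the right knob):

from weaviate import classes as wvc

questions = client.collections.create(
    name="Question",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(
        base_url="http://host.docker.internal:1234"
    ),
    generative_config=wvc.config.Configure.Generative.openai(
        base_url="http://host.docker.internal:1234"
    ),
    properties=[
        # "question" and "answer" should contribute to the vector
        wvc.config.Property(name="question", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="answer", data_type=wvc.config.DataType.TEXT),
        # "category" should be stored but excluded from vectorization
        wvc.config.Property(
            name="category",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True,
        ),
    ],
)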

Ok! We had an opportunity to tackle this in our office hours, and the issue was that the embedding model was not defined in LM Studio.

When using LM Studio, make sure to also load an embedding model. So this should work:

curl http://localhost:1234/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer THIS_WILL_BE_IGNORED" \
  -d '{
    "input": "Your text string goes here",
    "model": "THIS WILL ALSO BE IGNORED"
  }'
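The same sanity check from Python, assuming LM Studio returns the standard OpenAI-style response shape:

import requests

resp = requests.post(
    "http://localhost:1234/v1/embeddings",
    headers={"Authorization": "Bearer THIS_WILL_BE_IGNORED"},
    json={"input": "Your text string goes here", "model": "THIS WILL ALSO BE IGNORED"},
)
# A loaded embedding model should return a vector; print its dimensionality
print(len(resp.json()["data"][0]["embedding"]))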

Thanks!