Backup Issue Weaviate

Hello Team

I am getting this error when I run the Weaviate client.backup.create() method.

The block list may not contain more than 50,000 blocks.
ERROR CODE: BlockListTooLong

I’ve been experimenting with the chunk_size and compressionLevel parameters in the backup configuration to address this issue.
Please find my code below.

result = client.backup.create(
    backup_id="weaviate-backup_20240902600000_3",
    backend="azure",
    wait_for_completion=False,
    config=BackupConfigCreate(chunk_size="256"),
    include_collections=["demotesting"],
)

Given the size of the dataset, I suspect that the error is related to the number of data blocks generated during the backup process. To reduce the total number of blocks, I increased the chunk size, but I still get the same error.
Please let me know how I can fix this issue.
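As a rough sanity check on the block-count theory, the arithmetic can be sketched as below. The only fact taken from the error is the 50,000-block limit per blob. The block sizes used here are hypothetical, since the thread does not say what block size the backup-azure module uploads with, and reading chunk_size as a size in MB is also an assumption.

```python
AZURE_MAX_BLOCKS_PER_BLOB = 50_000  # limit quoted in the BlockListTooLong error


def blocks_needed(blob_size_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks required to upload one blob (ceiling division)."""
    return -(-blob_size_bytes // block_size_bytes)


def min_block_size(blob_size_bytes: int) -> int:
    """Smallest block size (bytes) that keeps one blob within the block limit."""
    return -(-blob_size_bytes // AZURE_MAX_BLOCKS_PER_BLOB)


# Hypothetical numbers: one 256 MiB chunk uploaded in 4 KiB blocks.
chunk_bytes = 256 * 1024 * 1024
block_bytes = 4 * 1024  # assumption, not a documented Weaviate value

print(blocks_needed(chunk_bytes, block_bytes))   # blocks for this chunk
print(min_block_size(chunk_bytes))               # block size needed to fit
```

Under these assumptions, if each chunk is uploaded as a single blob with a fixed block size, a larger chunk needs more blocks per blob rather than fewer, which would be consistent with a bigger chunk_size not making the error go away.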

  • Weaviate Server Version:
  • Deployment Method: Kubernetes
  • Number of Running Nodes: One node
  • Weaviate Version: 1.25.0
  • Backup module: backup-azure
  • Weaviate Python client: 4.6.5

Hi @sanjeev1678 !!

I believe this is a bug :thinking:

Have you tried a different module, for example, backup-s3?

I will create an issue for this once I hear back from our team.

Thanks!

@sanjeev1678 !!

Also, can you try that with the latest version?

We made some changes to the modules that bumped some dependencies and may have fixed this.

Thanks!

Hello @DudaNogueira

In the meantime, I encountered multiple issues with the backup in Azure.

Error message:

“upload stream "weaviate-backup/weaviate-0/doc/chunk-1": Put "\weaviate-backup_20240902600000%2Fweaviate-0%2Fdoc%2Fchunk-1?blockid=5oN5xDQLTMVIhoP8%2F5nf8gAALL0AAAAAAAAAAAAA%3D%3D\u0026comp=block": dial tcp 20.60.128.132:443: connect: connection refused”

Additionally, there’s another issue:

“backup class industryintel descriptor: cannot create new backup, backup ‘weaviate-backup_20240902600000’ is not yet released, this means its contents have not yet been fully copied to its destination, try again later”

I was expecting these to be resolved in newer versions of Weaviate.

Hi, sorry, are those logs from the latest version?

It was not clear whether you have upgraded past version 1.25.0.

Hello @DudaNogueira

Yes, these are the logs from the latest version, after updating to 1.25.0.

Oh, the latest version is 1.26.5, not 1.25 :thinking:

The changes I mentioned landed in one of the latest 1.26.X versions (not sure exactly which, so 1.26.5 is the best option).

I understand, but will the issue with “backup not released yet” be resolved in the latest release?

A recurring issue we’re facing is that whenever a previous backup fails for any reason, this “backup not yet released” error appears for all subsequent backups.

This is a significant problem because once it occurs, the entire backup pipeline stops functioning. It’s strange, and there should be some form of exception-handling mechanism to address and resolve this issue.

The error message we receive is:
“cannot create new backup, backup ‘weaviate-backup_20240902600000’ is not yet released.”
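Until something like that exists on the server side, a pipeline could at least detect this specific failure and find out which backup is blocking it. The sketch below is only an illustration: the helper name blocking_backup_id is hypothetical, and the pattern is based solely on the message text quoted in this thread, so the exact wording and quote characters in a real server response may differ.

```python
import re
from typing import Optional

# Matches: backup '<id>' is not yet released
# Accepts straight or typographic quotes around the ID, or none at all.
_NOT_RELEASED = re.compile(
    r"backup\s+['\"\u2018\u201c]?"
    r"([^'\"\u2019\u201d\s]+)"
    r"['\"\u2019\u201d]?\s+is not yet released"
)


def blocking_backup_id(error_message: str) -> Optional[str]:
    """Return the ID of the unreleased backup named in the error, if any."""
    match = _NOT_RELEASED.search(error_message)
    return match.group(1) if match else None


message = (
    "cannot create new backup, backup "
    "\u2018weaviate-backup_20240902600000\u2019 is not yet released"
)
print(blocking_backup_id(message))
```

The returned ID can then be logged or used to decide whether to wait, cancel the stuck backup, or alert someone, instead of letting every subsequent run fail the same way.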

Hi @sanjeev1678 !

The new 1.26.5 version implements a new cancel backup endpoint.

Previously, you could work around this by restarting the Weaviate server, which resets all backup processes.

This is not yet implemented in the new client version, but you can call this API directly.

According to the OpenAPI spec, you need to make a DELETE call to this endpoint:

DELETE /backups/{backend}/{id}
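Since the client does not wrap this yet, a direct call could look roughly like the sketch below, using only the Python standard library. The base URL http://localhost:8080, the /v1 prefix in front of the spec path, and the absence of authentication are assumptions about the deployment, and the function names are made up for illustration.

```python
import urllib.parse
import urllib.request


def build_cancel_request(base_url: str, backend: str, backup_id: str) -> urllib.request.Request:
    """Build a DELETE request for /v1/backups/{backend}/{id}."""
    path = "/v1/backups/{}/{}".format(
        urllib.parse.quote(backend, safe=""),
        urllib.parse.quote(backup_id, safe=""),
    )
    return urllib.request.Request(base_url.rstrip("/") + path, method="DELETE")


def cancel_backup(base_url: str, backend: str, backup_id: str, timeout: float = 30.0) -> int:
    """Send the cancel request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses,
    so callers should be ready to catch it.
    """
    request = build_cancel_request(base_url, backend, backup_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For the stuck backup in this thread, the call would be cancel_backup("http://localhost:8080", "azure", "weaviate-backup_20240902600000"), with the base URL adjusted to your cluster and an auth header added if your instance requires one.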

Let me know if this helps!

Hi! By the way the legend @jphwang :man_superhero: just updated our docs with that: