Are there benchmarks on the speed of the various filter operators as applied to the different tokenization strategies?
Thank you
Hi @rjalex!
That’s an interesting subject.
I believe that the difference here will be the number of indexed tokens.
For the given example:
With field tokenization you will always get exactly one indexed token, while the other strategy will give you multiple tokens whenever your content contains more than one.
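To make the token-count difference concrete, here is a minimal sketch in Python. It is not the database's actual tokenizer: `tokenize_field` and `tokenize_whitespace` are hypothetical helpers, and plain whitespace splitting is only an assumed stand-in for "the other" strategy.

```python
def tokenize_field(text: str) -> list[str]:
    """Assumed approximation of field tokenization: the whole trimmed value is one token."""
    stripped = text.strip()
    return [stripped] if stripped else []


def tokenize_whitespace(text: str) -> list[str]:
    """Assumed stand-in for a multi-token strategy: split on runs of whitespace."""
    return text.split()


sample = "Hello big world"

field_tokens = tokenize_field(sample)
whitespace_tokens = tokenize_whitespace(sample)

# One indexed token versus one token per word in the content.
print(len(field_tokens), field_tokens)
print(len(whitespace_tokens), whitespace_tokens)
```

The more tokens a value produces, the more entries the index has to store and look up for it, which is where I would expect any speed difference between strategies to come from.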
We have a benchmarking repo:
But I don’t think it covers this case; it focuses on other index benchmarks.