To Do List: Late Chunking for Some Other Embedding Models
3/15/2025: Anyway, an update can be found here: zh-late-chunking: Late Chunking for Chinese
3/10/2025: I decided not to do this project, since there’s a better way to do chunking as specified in this article: Why We Should Not Do Overlap in Chunking (and What to Do Instead).
This article is a little reminder to myself.
I have been working on some projects where I need to chunk long texts and retrieve relevant information from them. Surprisingly, I came across this article: Late Chunking in Long-Context Embedding Models. It introduces a new method to chunk long texts while keeping the context information. The method sounds promising and has two pros:
- It requires far fewer computing resources that llm aided chunking.
- It does as good as or even better than llm aided chunking according to Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models.
However, even though the authors claimed that it works on any embedding models that use avarage pooling technique, it still requires a lot of work to implement it. Plus, there is a language problem, because Chinese and other languages in late chunking differs from English. Therefore, to my knowledge, the only model supports this method is the jina-embedding
series from Jina AI. Which I tried, and it does not work as well as the leading embedding models such as openai/text-embedding-3
and BAAI/bge-m3
.
In the next few months, if I have time, I will try to implement this method for some other embedding models.