Poor retrieval quality while using CSV and XLSX files

When working with CSVs using LangChain’s CSV loader and recursive character text splitter, the retrieval quality is very poor.

A few records from the CSV:

111111,John,Doe,01/2000,Hispanic,M,FT,Fall 2008,2.71,Albuquerque,New Mexico,87112,jdoe@example.com,17.9,FALSE,FALSE,TRUE

111112,Jane,Smith,05/2001,Hispanic,F,TRANSFER,Fall 2006,3.73,New York,New York,10009,jsmith@example.com,18.1,FALSE,FALSE,TRUE

If I ask for John Doe's date of birth, the John Doe record comes back near the bottom of the results (when sorted by certainty).

We tried lowering the certainty threshold and increasing the number of retrieved results, but neither helped. What would be the right way to deal with this issue?
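One thing that may be worth checking is whether each row reaches the embedding step as a single chunk with its column labels attached, since a bare line of comma-separated values gives a question like "date of birth" little to match against. The following is only a minimal sketch using Python's standard library, not LangChain's loader or splitter. The column names in `COLUMNS` and the helper `rows_to_documents` are hypothetical, since the real header row is not shown above, and whether this improves retrieval in this setup is untested.

```python
import csv
import io

# Hypothetical column names: the real header row is not shown in the post,
# so these are guesses for illustration only.
COLUMNS = [
    "student_id", "first_name", "last_name", "date_of_birth", "ethnicity",
    "gender", "enrollment_type", "start_term", "gpa", "city", "state",
    "zip_code", "email", "col_14", "col_15", "col_16", "col_17",
]

# The two sample records from the post.
SAMPLE = """\
111111,John,Doe,01/2000,Hispanic,M,FT,Fall 2008,2.71,Albuquerque,New Mexico,87112,jdoe@example.com,17.9,FALSE,FALSE,TRUE
111112,Jane,Smith,05/2001,Hispanic,F,TRANSFER,Fall 2006,3.73,New York,New York,10009,jsmith@example.com,18.1,FALSE,FALSE,TRUE
"""


def rows_to_documents(csv_text, columns):
    """Return one text block per CSV row, formatted as 'column: value' lines."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=columns)
    return [
        "\n".join(f"{name}: {row[name]}" for name in columns)
        for row in reader
    ]


docs = rows_to_documents(SAMPLE, COLUMNS)
print(docs[0])
```

Each element of `docs` is one complete record, so it could be embedded as-is instead of being passed through a character-based splitter that might cut a row in the middle.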


I’m also facing a similar issue with CSV data. Any help would be appreciated.