For historical reasons (intentional forwarding at the ISP level), I regularly receive the same email in two of my email accounts (call them account A and account B). When I perform a global search, only one of each pair of identical emails is listed.
If I delete the email that shows up, and then repeat the global search, the results now include the other copy of the email that was previously not displayed.
This suggests that both copies have made it into Gloda (the global index), but that the filtering of the results is hiding duplicate emails.
If this is "works as intended", then I am happy with that, but I am curious about two things...
1. When account A and account B contain duplicate emails, it appears somewhat random whether the one in account A or the one in account B is hidden. However, I rebuilt the global index yesterday, and after that, a global search that matched duplicate emails showed all the ones from account A and hid all the ones from account B. Today, after receiving more duplicate emails, the duplicates of today's emails in account B are displayed and those in account A are hidden. I am wondering what determines which copy gets hidden.
2. What is being checked to discover that two emails are identical? Looking at the message source I see that although the message bodies are identical, the headers are wildly different for the most part. Is it perhaps checking the "Message-ID:" line of the header?
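To test the "Message-ID:" guess outside Thunderbird, something like the sketch below could be run against the raw source of both copies. It only groups messages by their Message-ID header, so it shows whether the two copies share that value despite their other headers differing; it says nothing about how Gloda itself decides what to hide. The function name `group_by_message_id` and the two sample messages are made up for illustration.

```python
from collections import defaultdict
from email import message_from_string


def group_by_message_id(raw_messages):
    """Group (label, raw_source) pairs by their Message-ID header.

    Hypothetical helper for illustration; not part of Thunderbird or Gloda.
    """
    groups = defaultdict(list)
    for label, raw in raw_messages:
        msg = message_from_string(raw)
        # Header lookup is case-insensitive; drop whitespace and angle brackets.
        message_id = (msg.get("Message-ID") or "").strip().strip("<>")
        groups[message_id].append(label)
    return dict(groups)


# Made-up samples: same Message-ID and body, different delivery headers.
copy_a = (
    "Received: from isp-a.example\n"
    "Message-ID: <abc123@example.com>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)
copy_b = (
    "Received: from isp-b.example\n"
    "X-Forwarded-To: b@example.com\n"
    "Message-ID: <abc123@example.com>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

groups = group_by_message_id([("account A", copy_a), ("account B", copy_b)])
print(groups)
```

If both copies end up under the same key, they share a Message-ID, which would be consistent with (but not proof of) that header being the basis for hiding duplicates.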