[MlMt] Searching for text of URL links

Chris Newman chris.newman at oracle.com
Fri Mar 15 17:05:40 EDT 2019


The IMAP standard requires implementation of a pure substring search, 
but in practice most search indexing software toolkits only do 
word-based search and don't support efficient substring search (most can 
do reasonably efficient prefix search but not efficient suffix search). 
So, particularly for body searches, you need to search for a string 
that counts as a whole word to whatever indexing software the IMAP 
server you're using relies on. Also, searching for a stop word (e.g., 
'and') may not work either, since not indexing those words reduces 
index size. The IMAP 
server I work on can do either a brute-force IMAP-compliant search or a 
fast word-based indexed body search, and it's up to the server admin 
to choose which to use. Given that many clients do body search by 
default now, most admins of larger sites choose to use the indexed 
word-based search.
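
To make the difference concrete, here is a minimal Python sketch of a 
word-based index next to a brute-force substring scan. Everything in it 
(the stop list, the tokenizer, the function names, the sample messages) 
is made up for illustration and is not how any particular IMAP server 
or indexing toolkit works.

```python
import re

# Illustrative stop list; real toolkits ship their own (assumption).
STOP_WORDS = {"and", "the", "of", "a", "to"}

def tokenize(text):
    """Split text into lowercase word tokens, dropping stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def build_index(messages):
    """Map each indexed word to the set of message ids containing it."""
    index = {}
    for msg_id, body in messages.items():
        for word in tokenize(body):
            index.setdefault(word, set()).add(msg_id)
    return index

def word_search(index, term):
    """Exact whole-word lookup: one dictionary probe."""
    return index.get(term.lower(), set())

def prefix_search(index, prefix):
    """Prefix lookup. A real index would walk a sorted term dictionary;
    a linear scan over the terms is enough for a sketch."""
    prefix = prefix.lower()
    hits = set()
    for word, ids in index.items():
        if word.startswith(prefix):
            hits |= ids
    return hits

def substring_search(messages, needle):
    """Brute-force scan of every body, matching pure substring semantics."""
    needle = needle.lower()
    return {m for m, body in messages.items() if needle in body.lower()}

# Hypothetical sample data.
messages = {
    1: "See https://www.example.com/downloads and let me know",
    2: "Nothing relevant here",
}
index = build_index(messages)

print(word_search(index, "example"))       # {1}: 'example' is a whole word
print(word_search(index, "xampl"))         # set(): not a word in the index
print(substring_search(messages, "xampl")) # {1}: brute force finds it
print(prefix_search(index, "down"))        # {1}: prefix of 'downloads'
print(word_search(index, "and"))           # set(): stop word was never indexed
```

With an index like this, searching for a whole piece of the URL such as 
"example" or "downloads" works, while a fragment from the middle of a 
word does not.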

While it's possible to implement efficient indexed pure substring 
search, that requires a significantly larger search index than 
word-based search technologies, and it's not clear "free" email services 
would be willing to pay for that extra storage when they can just ignore 
the standard and provide word-based search more cheaply (and I don't recall 
being asked to provide such a feature by any customer).
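
As a rough illustration of the size difference, the sketch below 
compares a word index with a trigram (3-character n-gram) index over 
the same text. Trigram indexing is only one way to get indexed 
substring search and is assumed here for the example; the function 
names and sample messages are hypothetical.

```python
import re

def word_terms(text):
    """Distinct lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def trigram_terms(text):
    """Distinct overlapping 3-character substrings of the text."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def build(messages, terms_of):
    """Build term -> set of message ids using the given term extractor."""
    index = {}
    for msg_id, body in messages.items():
        for term in terms_of(body):
            index.setdefault(term, set()).add(msg_id)
    return index

def posting_count(index):
    """Total (term, message) pairs stored: a crude proxy for index size."""
    return sum(len(ids) for ids in index.values())

def trigram_search(index, messages, needle):
    """Substring search: intersect trigram postings, then verify candidates."""
    needle = needle.lower()
    grams = trigram_terms(needle)
    if not grams:
        # Needle shorter than 3 characters: the index cannot help, scan all.
        candidates = set(messages)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Trigram hits are only candidates; confirm the real substring match.
    return {m for m in candidates if needle in messages[m].lower()}

# Hypothetical sample data.
messages = {
    1: "See https://www.example.com/downloads and let me know",
    2: "Nothing relevant here",
}
word_index = build(messages, word_terms)
tri_index = build(messages, trigram_terms)

print(posting_count(word_index), posting_count(tri_index))
print(trigram_search(tri_index, messages, "xampl"))
```

Even on two short messages the trigram index stores several times as 
many postings as the word index, which is the storage cost in question.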

Also, search indexing software is likely to drop any markup. So if the 
URL appears only inside an HTML link (e.g., in the href) rather than in 
the visible text of the message, it may not be indexed (or searchable) 
at all.
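
A small Python sketch of that effect, using the standard library's 
html.parser to keep only the visible text the way an indexer might. 
The class name, helper name, and sample HTML are invented for the 
example, and real indexers may well extract text differently.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect character data only; tags and their attributes are dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(html):
    """Return the text a markup-stripping indexer would see (assumption)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)

# Hypothetical HTML body: the URL lives only in the href attribute.
html_body = '<p>Get it <a href="https://www.example.com/downloads">here</a>.</p>'
text = visible_text(html_body)

print(text)                   # Get it here.
print("example.com" in text)  # False: the URL never reaches the index
```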

		- Chris

On 15 Mar 2019, at 10:33, Ted Lesley wrote:

> I must be missing something, I’m trying to search for the text of 
> links in mail messages and they don’t come up when I search for part 
> of the URL (domain, string in link, etc.)
>
> What am I missing? Thanks for any help.
