[MlMt] Searching for text of URL links
Chris Newman
chris.newman at oracle.com
Fri Mar 15 17:05:40 EDT 2019
The IMAP standard requires implementation of a pure substring search,
but in practice most search indexing software toolkits only do
word-based search and don't support efficient substring search (most can
do reasonably efficient prefix search but not efficient suffix search).
So, particularly for body searches, you need to search for a substring
that counts as a whole word to whatever indexing software the IMAP
server uses. Searching for a stop word (e.g., 'and') may not work
either, since not indexing those words reduces index size. The IMAP
server I work on can either do an IMAP-compliant brute-force search or
a fast word-based indexed body search, and it's up to the server admin
to choose which to use. Given that many clients now do body search by
default, most admins of larger sites choose the indexed word-based
search.
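
To make the difference concrete, here is a minimal sketch of a toy
word-based index next to a brute-force substring scan. It is not how any
particular IMAP server works; the tokenizer, the stop-word list, the
function names and the sample messages are all assumptions made up for
illustration.

```python
import re

# Illustrative stop list (an assumption; real toolkits ship their own).
STOP_WORDS = {"and", "the", "a", "of"}

def tokenize(text):
    # Lowercase, then split on anything that is not a letter or digit.
    return [w for w in re.split(r"[^a-z0-9]+", text.lower()) if w]

def build_index(messages):
    # Map each indexed word to the set of message ids containing it.
    index = {}
    for msg_id, body in messages.items():
        for word in tokenize(body):
            if word in STOP_WORDS:
                continue  # stop words are never indexed
            index.setdefault(word, set()).add(msg_id)
    return index

def indexed_search(index, term):
    # Fast, but only matches whole indexed words.
    return index.get(term.lower(), set())

def substring_search(messages, term):
    # IMAP-style substring match: scans every body, so it is slow at scale.
    needle = term.lower()
    return {i for i, body in messages.items() if needle in body.lower()}

# Hypothetical sample mailbox.
messages = {
    1: "See https://example.com/newsletter/march for details",
    2: "Lunch and coffee tomorrow?",
}
index = build_index(messages)

print(indexed_search(index, "example"))    # whole word: found
print(indexed_search(index, "xampl"))      # partial word: not found
print(substring_search(messages, "xampl")) # brute force: found
print(indexed_search(index, "and"))        # stop word: not found
```

In this toy version, searching for "example" or "newsletter" finds the
message through the index, while a fragment such as "xampl" only matches
with the brute-force scan.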
While it's possible to implement efficient indexed pure substring
search, that requires a significantly larger search index than
word-based search technologies, and it's not clear "free" email services
would be willing to pay for that extra storage when they can just ignore
the standard and provide word-based search more cheaply (and I don't
recall any customer ever asking for such a feature).
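
One common way to index for substring search is to index character
n-grams instead of words. The sketch below uses trigrams only as an
example of such a technique (I'm not saying any given server does this),
and the function names and sample text are assumptions. It shows why the
index grows: the same text yields many more distinct index terms.

```python
def word_terms(text):
    # Distinct whitespace-separated words, lowercased.
    return set(text.lower().split())

def trigram_terms(text):
    # Every distinct run of three consecutive characters, lowercased.
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def may_contain(terms, query):
    # Candidate filter only: if every trigram of the query is indexed,
    # the text *may* contain the query and still needs a real check.
    return trigram_terms(query) <= terms

# Hypothetical message body.
body = "please review the quarterly report before the meeting"

words = word_terms(body)
trigrams = trigram_terms(body)

print(len(words), "word terms vs", len(trigrams), "trigram terms")
print(may_contain(trigrams, "arterl"))  # fragment of "quarterly"
print("arterl" in words)                # not a whole word
```

The trigram set can answer a query for a fragment like "arterl", which
the word set cannot, but it has several times as many terms to store
even for one short sentence.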
Also, search indexing software is likely to drop any markup. So if the
URL appears only as the target of an HTML link (the href) rather than in
the visible text of the message, it may not be indexed (or searchable)
at all.
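
As a rough illustration of markup being dropped, here is a sketch using
Python's standard html.parser to keep only the visible text, which is
roughly what an indexer might see. The TextOnly class, the visible_text
helper and the sample HTML are assumptions for the example, not what any
real indexing toolkit does.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect character data and ignore all tags and attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; attributes such as href never
        # reach this method.
        self.parts.append(data)

def visible_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

# Hypothetical HTML message part.
html = '<p>Read the <a href="https://example.com/offer">latest offer</a> today.</p>'
text = visible_text(html)

print(text)
print("example.com" in text)  # the URL lived only in the href
```

Here the link text "latest offer" survives, but the URL itself is gone
before anything gets indexed.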
- Chris
On 15 Mar 2019, at 10:33, Ted Lesley wrote:
> I must be missing something, I’m trying to search for the text of
> links in mail messages and they don’t come up when I search for part
> of the URL (domain, string in link, etc.)
>
> What am I missing? Thanks for any help.