Fulltext searches more text files (Fixes #7)

This creates an expanded list of text types to check the result of mimetypes.guess_type against. It lets fulltext search in more kinds of text files, including XML, RTF, shell scripts, PostScript, JSON, PHP, Perl, Ruby, SVG, Java, JavaScript, and several others. More can be added easily.
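As a rough illustration of the idea (not the code in this merge request), an allow-list check on top of mimetypes.guess_type could look like the sketch below. The name `TEXT_MIMETYPES`, the function `is_text_mimetype`, and the specific entries are assumptions made for the example; the list in the actual change may differ.

```python
import mimetypes

# Hypothetical allow-list of non-"text/*" types that are still plain text.
# The real list in this change may contain different or additional entries.
TEXT_MIMETYPES = {
    "application/json",
    "application/xml",
    "application/rtf",
    "application/postscript",
    "application/javascript",
    "application/x-sh",
    "image/svg+xml",
}


def is_text_mimetype(path):
    """Return True if the guessed MIME type of `path` looks like text."""
    mime, encoding = mimetypes.guess_type(path)
    if mime is None or encoding is not None:
        # Unknown type, or a compressed file such as notes.txt.gz.
        return False
    return mime.startswith("text/") or mime in TEXT_MIMETYPES
```

Because guess_type only looks at the file name, this check costs no disk reads, which is why it makes sense as the first filter.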

This also adds a secondary binary check that looks for a null byte, in case guess_type is handed a binary file without a file extension. I should be able to update this function later to search within non-UTF-8 files, but I think that belongs in a separate commit with more testing.
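A minimal sketch of a null-byte check, assuming only the first part of the file is sampled. The function name `looks_binary` and the `sample_size` default are hypothetical and not taken from this change.

```python
def looks_binary(path, sample_size=1024):
    """Return True if the first `sample_size` bytes contain a null byte."""
    with open(path, "rb") as handle:
        # Reading a small sample keeps the check cheap on large files.
        return b"\0" in handle.read(sample_size)
```

One known limitation of this heuristic: text encoded as UTF-16 or UTF-32 normally contains null bytes, so such files would be treated as binary.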

It also adds a filter that skips directories the user has excluded in the preferences.
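For illustration only, one common way to skip excluded directories is to prune them during an os.walk traversal. This sketch assumes the exclusions are available as a list of directory paths; the function name `walk_files` and the parameter `excluded_dirs` are hypothetical, and the project may traverse the file system differently.

```python
import os


def walk_files(root, excluded_dirs):
    """Yield file paths under `root`, skipping excluded directories."""
    excluded = {os.path.normpath(d) for d in excluded_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(dirpath, d)) not in excluded
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning before descending means files in excluded directories are never opened, so they add no disk reads.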

In my tests, this change seems to find almost twice as many files in 25% less time (1488s vs 1987s), while also saving a lot of disk reads (7GB vs 34GB) on my 4TB 5900rpm HDD (with around 1.1 million files and folders).

I put the text and binary check functions above the run function. If they should go below it, let me know and I can move them, assuming the rest of the change looks okay. Thanks.

Edited by newhoa
