Re: [Tracker] It doesn't index PHP files
- From: Carlos Garnacho <carlos lanedo com>
- To: awilliam whitemice org
- Cc: Tracker mailing list <tracker-list gnome org>
- Subject: Re: [Tracker] It doesn't index PHP files
- Date: Thu, 04 Oct 2012 14:32:00 +0200
Hey,
On Thu, 2012-10-04 at 07:41 -0400, Adam Tauno Williams wrote:
On Thu, 2012-10-04 at 12:34 +0100, Martyn Russell wrote:
On 04/10/12 12:24, Adam Tauno Williams wrote:
On Thu, 2012-10-04 at 10:54 +0100, Martyn Russell wrote:
On 04/10/12 09:17, Ivan Frade wrote:
   I think Python script contents are indexed because their mimetype is
"text/x-python", which falls back to the "text/*" extractor. PHP files
have the mimetype "application/x-php", and there is no default extractor
for that.
   This can be solved by adding "application/x-php" to the .rule file of
the text extractor (check
/usr/local/share/tracker/extract-rules/90-text-generic.rule and other
rule files in the same folder).
   Note that generic text indexing means that the Python code is treated
as plain text, a bunch of words. You could always write a specialized
extractor that takes into account the semantics of the file. For
example, ignoring __init__.py files or import statements, or maybe
ignoring the code and indexing only function names... it depends on what
you want. The same applies to PHP.
   Writing an extractor module is not difficult with some rudiments of
programming in C and we can help via mailing list or IRC. Patches are
welcome ;)
I should add, you can use:
    tracker-control -m $MIME
or
    tracker-control --reindex-mime-type=$MIME
if you change the rules file, so you don't have to reindex all content again.
awilliam linux-nysu:~> cat /usr/share/tracker/extract-rules/90-text-generic.rule
[ExtractorRule]
ModulePath=/usr/lib64/tracker-0.14/extract-modules/libextract-text.so
MimeTypes=text/*;application/php
awilliam linux-nysu:~> tracker-control --reindex-mime-type="application/php"
Reindexing mime types was successful
I'd like to point out that the mimetype is application/x-php, not
application/php
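
For illustration, here is a minimal sketch of that correction applied to the rule file quoted above. It only edits a scratch copy made with mktemp; the variable name "rule" and the temporary paths are made up for the example, and the installed file would still have to be changed by hand (probably as root).

```shell
# Sketch only: works on a scratch copy, not on the installed rule file.
rule=$(mktemp)
cat > "$rule" <<'EOF'
[ExtractorRule]
ModulePath=/usr/lib64/tracker-0.14/extract-modules/libextract-text.so
MimeTypes=text/*;application/php
EOF

# Rewrite the mistyped entry. Requiring ';' or end of line after it keeps
# sed from touching any longer MIME type that merely starts the same way.
sed -E 's#application/php(;|$)#application/x-php\1#' "$rule" > "$rule.new"
mv "$rule.new" "$rule"

# Show the corrected line.
grep '^MimeTypes=' "$rule"
```

With the same change made in the real rule file, the reindex command from earlier in the thread would then be run with the corrected type, i.e. tracker-control --reindex-mime-type="application/x-php".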
<snip>
Extractor rules loaded
Setting memory limitations: total is 3.9 GB, minimum is 256 MB,
recommended is ~1 GB
  Virtual/Heap set to 2.0 GB (50% of total or MAXLONG)
Guessing mime type as '(null)'
tracker_mimetype_info_get_module: assertion `info != NULL' failed
No modules found to handle metadata extraction
Huh, so... it doesn't recognize the MIME type at all?
This is a bug in the logging call: it's showing the mimetype that's
known before the check, not the one that's figured out after.
  Carlos