Re: [Tracker] Refactoring the filesystem miner
- From: Martyn Russell <martyn lanedo com>
- To: Carlos Garnacho <carlos lanedo com>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] Refactoring the filesystem miner
- Date: Mon, 21 Nov 2011 15:15:05 +0000
On 05/10/11 11:49, Carlos Garnacho wrote:
Hey hey,
Hi Carlos,
Just to tell in the ML, this is now implemented in the miner-fs-refactor
branch, timings for mtime checks have improved quite a lot there,
whereas real indexing performance stays roughly the same, as expected.
Intel i5, SSD disk.
indexing 468 folders, 6609 files
master miner-fs-refactor
First index 16.65s 15.4s
Mtime checks 2.36s 0.99s
Amd athlon, 5400rpm disk
indexing 625 folders, 10959 files
master miner-fs-refactor
First index 178.2s 176.65s
Mtime checks 12.39s 5.35s
Thanks for the figures here.
I have reviewed the branch and here are my comments:
1. Is there something to fix here ?
+ /* FIXME: Should avoid recursing through children,
+ * but we're not allowed to modify the tree during
+ * traversal, nor have a way to skip recursing within
+ */
2. I would add some warning here:
+ /* what are we doing with such file? should happen rarely,
+ * only with files that we've queried, but we decided not
+ * to crawl (i.e. embedded root directories, that would
+ * be processed when that root is being crawled).
+ */
3. We should put all typedefs at the top of the file (or should we) ?
+typedef struct {
+ TrackerFileNotifier *notifier;
+ GNode *cur_parent_node;
+
+ /* Canonical copy from priv->file_system */
+ GFile *cur_parent;
+} DirectoryCrawledData;
4. We are using a lot of sync calls
5. Another fixme:
+ /* FIXME: This supposedly works, but in practice
+ * won't ever happen as the parent directory
+ * wasn't being monitored if being ignored
+ */
6. Another fixme:
+ /* Move only the folder, and
+ * delete all its contents
+ */
+ /* FIXME: it doesn't */
7. The progress isn't meant to do this (i.e. go to 2%), looks like a
regression:
21 Nov 2011, 12:04:43: 85% File System - Processingâ 03s
remaining
21 Nov 2011, 12:04:44: 96% File System - Processingâ
unknown time left
21 Nov 2011, 12:04:45: 2% File System - Processingâ
unknown time left
21 Nov 2011, 12:04:45: â File System - Idle
8. I think some of the miner-fs logging needs changing, there are some
things which are debugging but printed as messages and visa versa:
e.g. these should be debug logs:
Tracker-Message: Flushing SPARQL buffer, reason: Queue handlers WAIT
Tracker-Message: (Sparql buffer) Finished array-update with 50 tasks
e.g. these should be messages (i.e. when they're officially in the db, I
would skip the "processing" versions here but at least show the
"create"/"update"/"delete" messages):
(tracker-miner-fs:6782): Tracker-DEBUG: Creating new item
'file:///home/martyn/Videos/oceans13.m4v'
(tracker-miner-fs:6782): Tracker-DEBUG: Creating new item
'file:///home/martyn/Videos/Inception.m4v'
9. I see a critical here when re-running miner-fs after initial index:
(tracker-miner-fs:6854): GLib-CRITICAL **: g_timer_reset: assertion
`timer != NULL' failed
I have avoided _reset() in the past because I had issues with it and so
use _destroy() instead. I think this was the reason why. We should
really fix it in glib if this is still the case here.
10. I notice we get events for .xsession-errors now, before we
*EXPLICITLY* had code to ignore that file because it just fills logs. We
should fix that up:
(tracker-miner-fs:6854): Tracker-DEBUG: Received monitor event:0
(G_FILE_MONITOR_EVENT_CHANGED) for
file:'file:///home/martyn/.xsession-errors'
11. Currently make distcheck fails:
In file included from ../../../src/libtracker-miner/tracker-monitor.h:29:0,
from ../../../src/libtracker-miner/tracker-monitor.c:26:
../../../src/libtracker-miner/tracker-indexing-tree.h:30:33: fatal
error: tracker-miner-enums.h: No such file or directory
12. I think we need more tests for:
â src/libtracker-miner/tracker-indexing-tree.c
â src/libtracker-miner/tracker-file-notifier.c
13. How come the branch adds:
â tests/libtracker-client/tracker-test
--
Once these are addressed I think we can merge this to master.
Thanks Carlos,
--
Regards,
Martyn
Founder and CEO of Lanedo GmbH.
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]