Re: [Tracker] Merging Turtle branch to Trunk



On 09/12/08 13:19, Ivan Frade wrote:
Hi!

Hi :)

  Thanks Martyn for the review. I'll try to clarify how this
backup/restore metadata works.

Summary:

It looks to me like a lot of work is still needed to keep the code clean
and consistent with the current code base. There are a couple of leaks
which need fixing and a couple of issues which this patch fixes too. The
major concern I have is about when the restore and backup are actually
done. Unless I have misunderstood (which is entirely possible) it seems
they are done at the wrong times, or at least could be done at better times.

OK, the goal here is to save the data set by the user (e.g.
User:Keywords, Audio:Playcount).

How trackerd currently works on reindex:

1) trackerd is started with --force-reindex
2) trackerd starts tracker-db-manager with the FORCE_REINDEX flag to remove
all the DB files
3) trackerd starts crawling and sending the uris to the indexer


How I want to plug in the save/restore functionality:

1) trackerd is started with --force-reindex
2) trackerd starts the tracker-db-manager (without the force reindex
flag because we need to READ the Dbs)
   2.1) trackerd reads the metadata set by the user
   2.2) trackerd saves that metadata to a Turtle file
   2.3) trackerd closes the DB stack
   2.4) trackerd installs a callback to detect the Finished signal from
the indexer
3) trackerd starts tracker-db-manager with the FORCE_REINDEX flag to remove
all the DB files
4) trackerd starts crawling and sending the uris to the indexer
5) when the "Finished" signal comes, call the "_restore_backup" method
in the indexer.
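
To make the ordering in the list above concrete, here is a minimal sketch in Python rather than Tracker's own C. Everything in it is an assumption made for illustration: `FakeIndexer`, `connect_finished` and `force_reindex` are hypothetical names, the database is a plain dict, and the backup stays in memory instead of going to a Turtle file. It only shows that the backup is taken before the wipe and that the restore is triggered by the "Finished" callback.

```python
from typing import Callable, Dict, List, Tuple

Metadata = Dict[str, Dict[str, str]]  # uri -> {property: value}


class FakeIndexer:
    """Hypothetical stand-in for the indexer; fires "Finished" when done."""

    def __init__(self) -> None:
        self._on_finished: List[Callable[[], None]] = []

    def connect_finished(self, callback: Callable[[], None]) -> None:
        self._on_finished.append(callback)

    def index(self, db: Metadata, uris: List[str]) -> None:
        for uri in uris:
            db[uri] = {}      # fresh entry, no user data yet
        for callback in self._on_finished:
            callback()        # emit "Finished"


def force_reindex(db: Metadata, uris_on_disk: List[str],
                  indexer: FakeIndexer) -> Tuple[List[str], Metadata]:
    """Run the proposed sequence; return the step order and the backup."""
    steps: List[str] = []

    # 2.1-2.3: copy the user metadata while the old DBs still exist.
    backup = {uri: dict(props) for uri, props in db.items()}
    steps.append("backup")

    # 2.4: the restore is only triggered by the "Finished" signal.
    # (What the restore does with each entry is left out of this sketch.)
    indexer.connect_finished(lambda: steps.append("restore"))

    # 3: FORCE_REINDEX removes all the DB files.
    db.clear()
    steps.append("wipe")

    # 4: crawl and hand the URIs to the indexer.
    steps.append("crawl")
    indexer.index(db, uris_on_disk)
    return steps, backup


db = {"file:///music/a.mp3": {"User:Keywords": "favourite"}}
order, backup = force_reindex(db, ["file:///music/a.mp3"], FakeIndexer())
```

Because the copy is made before `db.clear()`, the user data survives the wipe, and the restore step cannot run until the indexer has reported that it is done.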


  Why are we restoring the backup AFTER indexing the files?
1) To ignore metadata of files that are no longer in the filesystem
2) To overwrite default values coming from the extractor

  [Also note that we cannot set metadata of files that are not yet in the
DB so we need to crawl first]
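
A small sketch of what restoring after indexing buys us, again in Python with dicts standing in for the real DBs. The function name `restore_backup` and the sample URIs and values are made up for the example; this is not the indexer's actual "_restore_backup" code, only an illustration of the two reasons given above.

```python
from typing import Dict, List

Metadata = Dict[str, Dict[str, str]]  # uri -> {property: value}


def restore_backup(index: Metadata, backup: Metadata) -> List[str]:
    """Merge backed-up user metadata into a freshly built index.

    Returns the URIs that were skipped because the crawl did not find them.
    """
    skipped: List[str] = []
    for uri, props in backup.items():
        if uri not in index:
            # Reason 1: the file is gone, so its old metadata is dropped.
            skipped.append(uri)
            continue
        # Reason 2: the user's values replace the extractor's defaults.
        index[uri].update(props)
    return skipped


# State right after the crawl: only a.mp3 exists, with an extractor default.
index = {"file:///music/a.mp3": {"Audio:PlayCount": "0"}}
backup = {
    "file:///music/a.mp3": {"Audio:PlayCount": "12",
                            "User:Keywords": "favourite"},
    "file:///music/gone.mp3": {"User:Keywords": "old"},
}
skipped = restore_backup(index, backup)
```

Running the restore before the crawl would fail on both counts: there would be no entry to attach the values to, and the extractor would later overwrite whatever had been set.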


  So, we need to save the data BEFORE deleting the DBs and restore it
AFTER the crawling has finished. It is not perfect but any other solution
requires deep changes in the starting sequence (and probably
tracker-db-manager).

OK, that all makes sense.

I was just wondering if we should do the backing up part in the tracker-db-manager, but I am not sure it makes sense to have it somewhere other than where the restore code is.

At the very least, I would put all that DB opening/backing up code block into a separate function so it doesn't cause massive confusion in the main() initialisation stack. I would also put a heavy comment in there explaining what we are doing and what the point is.
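
Roughly the shape I have in mind, sketched in Python for brevity (the real thing would be C in trackerd). `FakeDbStack`, `read_user_metadata` and `backup_user_metadata` are hypothetical names, and the `save` callback stands in for whatever writes the Turtle file; none of this is existing Tracker API.

```python
from typing import Callable, Dict

Metadata = Dict[str, Dict[str, str]]  # uri -> {property: value}


class FakeDbStack:
    """Hypothetical stand-in for the DB stack opened WITHOUT force-reindex."""

    def __init__(self, rows: Metadata) -> None:
        self.rows = rows
        self.closed = False

    def read_user_metadata(self) -> Metadata:
        return {uri: dict(props) for uri, props in self.rows.items()}

    def close(self) -> None:
        self.closed = True


def backup_user_metadata(db: FakeDbStack,
                         save: Callable[[Metadata], None]) -> None:
    """Save user-set metadata before a forced reindex.

    WHY THIS EXISTS: a forced reindex deletes every DB file, which would
    also destroy data the user entered by hand (e.g. User:Keywords).
    So the DBs are opened read-only first, that data is saved, and the
    DB stack is closed again so it can be reopened with FORCE_REINDEX.
    """
    try:
        save(db.read_user_metadata())
    finally:
        # Always close, even if saving failed, so the reindex can go on.
        db.close()


saved: Metadata = {}
db = FakeDbStack({"file:///doc.txt": {"User:Keywords": "work"}})
backup_user_metadata(db, saved.update)
```

With that in place, main() only needs one call before it starts the db-manager with the reindex flag, and the explanation lives next to the code it describes.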

Can you do this?

--
Regards,
Martyn


