On 14/06/2013 20:13, Ivan Frade wrote:
Hmm. Given that Tracker is designed for and well suited to embedded use cases, I don't think the project should disallow compile-time and/or configure-time configurability of behaviour. For example, --disable-journal and --disable-fts are also behaviour changes. We could just as easily add a --disable-collator-column and a --disable-plaintext-extraction, so that system integrators can build a Tracker package that is optimized for storage instead of for performance.

In the longer term I'd even go as far as making it easy to replace the entire ontology. For that, though, I think we should rather bring libtracker-sparql and tracker-store together as libsparql-store, have a semantic-nepomuk-desktop package that installs the ontology, and let libtracker-miner and tracker-miner-fs be packages that depend on semantic-nepomuk-desktop and libsparql-store. tracker-miner-fs would then have a --disable-plaintext-extraction, and libsparql-store a --disable-journal, --disable-fts and --disable-collator-column.

This would effectively mean splitting the project, but I've always felt that in the long term this should happen. It would also allow tracker-miner-fs to focus more on mining and indexing files, and libsparql-store on being an embedded and/or highly efficient and reliable SPARQL endpoint and SPARQL INSERT store.

I also think that libtracker-extract should probably move towards a truly publicly usable libmetadata-extract which exposes buffer-based and stream-based metadata extraction, not just for tracker-extract but for any program that needs it. After that, the tracker-extract binary should be an implementation detail of tracker-miner-fs, although it would use libmetadata-extract just like it uses libtracker-extract now.
The problem I see with the current architecture, with tracker-extract as the service that does metadata extraction, is that it can only work well for file-based metadata extraction, while the world of metadata is massively, insanely massively larger than just the files on your filesystem, if you just open your eyes to see it.
Can it be made to be correct without the collation column? Surely the collation column is created from the same data that the property's own column stores? That means the collation data could be generated on the fly in alloca() buffers (which is of course going to be a lot slower).

Kind regards.
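
P.S. To make the alloca() idea a bit more concrete, here is a minimal sketch. It is not Tracker code: compare_on_the_fly is a made-up name, and the standard C strxfrm() only stands in for whatever collator the store really uses. It also assumes glibc's <alloca.h> and values short enough to fit on the stack.

```c
#include <alloca.h>
#include <string.h>

/* Hypothetical helper, not part of Tracker: compare two values by
 * building their collation keys in stack buffers instead of reading
 * a stored collation column. strxfrm() is a stand-in collator. */
int
compare_on_the_fly (const char *a, const char *b)
{
	/* First pass: ask strxfrm() how large each key has to be. */
	size_t len_a = strxfrm (NULL, a, 0) + 1;
	size_t len_b = strxfrm (NULL, b, 0) + 1;

	/* The keys live on the stack and vanish on return. This is only
	 * safe for short values; real code would cap the size and fall
	 * back to the heap. */
	char *key_a = alloca (len_a);
	char *key_b = alloca (len_b);

	/* Second pass: generate the keys, then compare them bytewise. */
	strxfrm (key_a, a, len_a);
	strxfrm (key_b, b, len_b);

	return strcmp (key_a, key_b);
}
```

A sort comparator could call something like this whenever the collation column is absent. Every comparison then pays for generating two keys, which is where the slowdown would come from.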