Re: [Tracker] Proposed future for Tracker - Smaller modules




On 8/09/2014 17:11, Martyn Russell wrote:

Hi Martyn,


My proposal would be to keep tracker-store, ontology and 
libtracker-sparql together (as one project).

The reason I didn't put libtracker-sparql with tracker-store /
core things is that logically it's quite a different thing for
applications wanting to JUST run SPARQL queries.

Well, no. They depend on tracker-store either way. Since that's the
case, it's better to make tracker-store an implementation detail of
libtracker-sparql.


I suppose it doesn't make all that much difference, you still
require the store and other bits for libtracker-sparql to operate
anyway.

Precisely.

Applications wanting to JUST run SPARQL queries, need tracker-store.

Even for reading with direct-access, as tracker-store is the one that
assembles meta.db in case it's not there yet and needs to be created
from the initial ontology.

I guess you could go as far as to say that you might want to JUST
use DBus and none of the libtracker-sparql support at all, another
reason for it being separate.

The DBus-only non-fd-passing Query APIs on tracker-store are purely
for development purposes. Nobody should really use them in production.
The only meaningful API on the Resources DBus object is GraphUpdated
signal.

The Steroids or FD-passing DBus API can be used, but preferably
applications always use libtracker-sparql and get the benefits of
direct-access mode (process prioritization and no-ipc overhead).

The reason for that is that the actual goal of libtracker-sparql
was to provide what libsqlite is for an RDBMS and SQL, but for RDF
and SPARQL.

Yea, but libsqlite3 and sqlite are separate packages (actually I am
not sure if they're separate projects at all), but having
tracker-store does not mean you have to use libtracker-sparql.


sqlite is a package providing the sqlite shell. That would be
equivalent to having a package called tracker that provides just the
tracker-sparql tool.

All of sqlite is in libsqlite, and you can perfectly well use libsqlite
without the sqlite package, which is how I would like libtracker-sparql
to be too (tracker is an embedded db, as opposed to an oracle or mysql
or postgresql).

Without tracker-store, libtracker-sparql can't work.

It's more the other way round I had in mind, without
libtracker-sparql, tracker-store works.

Actually, libtracker-data (the code of tracker-store) is also the
direct-access implementation of libtracker-sparql.

The tracker-store itself is a bunch of D-Bus service wrappers, written
in Vala, around the same libtracker-data (that is used for
libtracker-sparql-direct's implementation), plus an event loop and
checkpointing, plus some initialization code and DBus daemonization.

That's it, that's what tracker-store is :)

Everything is in libtracker-data, which is used by libtracker-sparql

By splitting the ontologies into a separate project, managed by
the Nepomuk organization or not, we could someday refactor 
libtracker-sparql to support multiple instances of installed
ontologies.

I wouldn't split the ontologies out, the store is no good without
the database and without the schema it's even more useless :)

Sure it is. There are already a few companies that are using Tracker
with a completely different ontology.

The tracker-ivi project comes to mind. Will run on the dashboards of
some nice cars.

Besides, all the ontology validation and handling is in the
libraries the store depends on.

Not correct. All ontology validation rules are in the ontology's own
introspection. The libtracker-data's ontology validation rules are
created out of that ontology introspection, and nothing about it is
hardcoded in it.

Meaning that data/ontologies is completely separate from libtracker-data

You could install a completely different data/ontologies, and
libtracker-data would just happily crunch that and assemble you a
different meta.db with different ontology validation rules.

In fact, several embedded-solution companies do this today with Tracker.
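
To illustrate what "nothing about it is hardcoded" means, here is a
minimal sketch in plain C. It is not libtracker-data's code: the
PropertyDef and OntologyDesc types and the ontology_accepts() function
are hypothetical names made up for this illustration, and a real
ontology carries far more than a range and a maximum cardinality. The
point is only that the checking function never mentions a property
name; swap the table and the same code enforces different rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: what an engine learns about one property by reading
 * the installed ontology files. */
typedef struct {
    const char *name;        /* e.g. "nie:title" */
    const char *range;       /* e.g. "xsd:string" */
    int max_cardinality;     /* 0 means unlimited */
} PropertyDef;

/* Hypothetical: one loaded ontology, i.e. a table of property rules. */
typedef struct {
    const PropertyDef *props;
    size_t n_props;
} OntologyDesc;

/* Generic check driven purely by the table: is it valid to add one more
 * value of type value_type to a property that already has
 * n_existing_values values? */
static bool
ontology_accepts (const OntologyDesc *o, const char *property,
                  const char *value_type, int n_existing_values)
{
    size_t i;

    assert (o != NULL && property != NULL && value_type != NULL);

    for (i = 0; i < o->n_props; i++) {
        const PropertyDef *p = &o->props[i];

        if (strcmp (p->name, property) != 0)
            continue;
        if (strcmp (p->range, value_type) != 0)
            return false;    /* wrong value type for this property */
        if (p->max_cardinality > 0 &&
            n_existing_values >= p->max_cardinality)
            return false;    /* cardinality limit already reached */
        return true;
    }

    return false;            /* property not declared by this ontology */
}
```

Feeding the same function a table for a different ontology (say, one
with a made-up "ivi:speed" property) changes what is accepted without
touching the code, which is the property the paragraphs above rely on.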

Ideal in that scenario would be that tracker-store becomes an
implementation detail of libtracker-sparql (it'll manage the instances
of stores on an on-demand basis).

The only three real reasons for tracker-store to exist are:

- GraphUpdated
- The fact that SQLite isn't MVCC and we need WAL journaling and
  checkpointing done by a separate 'writer' process

Yea, single point of update.

But whether it comes as an implementation detail of libtracker-sparql
or is a central process of a desktop session shouldn't be of concern
to whoever links with and uses libtracker-sparql.

- Providing an ontology

Not sure I follow you here, the reason for the store is not to
provide an ontology - at least libtracker-data does a lot of that
stuff - I would have to double check this.

libtracker-data doesn't provide the ontology, data/ontologies does.
The libtracker-data implementation (which is also used as-is by
libtracker-sparql's direct-access implementation) doesn't care, for
99%, what the ontology's content is.

I think just some base types like rdfs:Resource and xsd:* need to be
there. That's because of the tracker:uri() and tracker:id() functions
and the root-of-them-all 'Resources' table.

The store offers a lot of buffering, queueing and general
"management" of updates (and queries). Let's not forget the DBus
interface.


Yes. And the exact same buffering resides in any process that links
with libtracker-sparql(-direct): you have the same LRU for SQLite
statements.

Just that libtracker-sparql will reroute your write queries to
tracker-store, because of WAL checkpointing and GraphUpdated emitting.

It's also the only reason. If SQLite had triggers and MVCC, Jürg and I
would never have introduced tracker-store in the first place.
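
The rerouting described above can be sketched in a few lines of C.
This is only a model of the decision, not libtracker-sparql's code: the
SparqlOp and Backend enums and the route() function are hypothetical
names, and the direct_available flag is an assumption standing in for
"this process could open meta.db itself".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the two kinds of request a client can make. */
typedef enum { OP_QUERY, OP_UPDATE } SparqlOp;

/* Hypothetical: where a request ends up being executed. */
typedef enum {
    BACKEND_DIRECT,   /* in-process, reading meta.db directly */
    BACKEND_STORE     /* sent over IPC to the single writer process */
} Backend;

static Backend
route (SparqlOp op, bool direct_available)
{
    /* Writes always go to the one writer: it does the WAL checkpointing
     * and is the only place that can emit GraphUpdated. */
    if (op == OP_UPDATE)
        return BACKEND_STORE;

    /* Reads stay in-process when possible (no IPC overhead). */
    return direct_available ? BACKEND_DIRECT : BACKEND_STORE;
}
```

With this model, route (OP_QUERY, true) stays in-process while
route (OP_UPDATE, true) still goes to the store, which is the whole
reason a separate writer process exists.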

When your process links with libtracker-sparql and uses
direct-access mode, it effectively has everything it needs to
deal with meta.db.

Indeed.

Voila :)

So why is tracker-store a separate central desktop-session daemon?

No reason. libtracker-sparql users shouldn't care.

The eventual dependency tree would look like this:

Your program depends on:
 |
 +- nepomuk-ontology
 |
 +- libtracker-sparql
     `- Internally uses libexec/tracker-sparql-store

This API:

https://developer.gnome.org/libtracker-sparql/stable/TrackerSparqlConnection.html




Would look like:

#include <nepomuk/2006-2008/ontology.h>
#include <tracker-sparql.h>

static void
something (void)
{
    GError *error = NULL;
    /* Proposed API: one ontology object per session id */
    Ontology *ontology = nepomuk_ontology_2006_2008_new ("session id");
    TrackerSparqlConnection *con =
        tracker_sparql_connection_get_for_ontology (ontology);
    TrackerSparqlCursor *cursor =
        tracker_sparql_connection_query (con,
            "select ?a { ?a nie:title 'something in nepomuk' }",
            NULL, &error);

    /* Print column 0 of every result row */
    while (tracker_sparql_cursor_next (cursor, NULL, NULL)) {
        g_print ("%s\n", tracker_sparql_cursor_get_string (cursor, 0, NULL));
    }

    g_object_unref (cursor);
    g_object_unref (con);
    g_object_unref (ontology);
}

Interesting idea.

It could also be simpler with a:

TrackerSparqlConnection * tracker_sparql_connection_get_for_ontology
        (const char* ontology_path, const char *session_id)

of course


This would internally deal with starting and stopping a 
libexec/tracker-sparql-store, and no global "tracker-store" would
be needed anymore (the fact that multiple users share the same
"session id", creates the central storage for users of that
"session id").

Storage could go to: 
~/.tracker/sessions/$session-id/nepomuk-2006-2008/meta.db or
something
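
A minimal sketch of how such a per-session path could be put together,
assuming exactly the layout floated above (which does not exist today).
The session_db_path() function and its parameters are hypothetical; the
only real calls are snprintf(), strchr() and strcmp() from the C
standard library. Session ids containing a slash, or equal to "." or
"..", are rejected so that an id can't point outside the sessions
directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write
 *   <home>/.tracker/sessions/<session_id>/<ontology_name>/meta.db
 * into buf. Returns the length written (without the terminating NUL),
 * or -1 if the session id is unusable or the buffer is too small.
 */
static int
session_db_path (char *buf, size_t size, const char *home,
                 const char *session_id, const char *ontology_name)
{
    int n;

    assert (buf != NULL && home != NULL);
    assert (session_id != NULL && ontology_name != NULL);

    /* Refuse ids that are empty or would escape the sessions directory. */
    if (session_id[0] == '\0' ||
        strchr (session_id, '/') != NULL ||
        strcmp (session_id, ".") == 0 ||
        strcmp (session_id, "..") == 0)
        return -1;

    n = snprintf (buf, size, "%s/.tracker/sessions/%s/%s/meta.db",
                  home, session_id, ontology_name);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t) n >= size)
        return -1;

    return n;
}
```

Two processes passing the same session id end up with the same meta.db
path, which is what would make that id the "central storage" for its
users.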

This is quite an addition to what we have now.

Yep, it suddenly allows for as many different ontologies as you like,
and for per-app-domain or per-session storage of the user's metadata.

That also allows for tighter control over the sharing of metadata.

I.e. we could allow access to the metadata of a session_id to
applications a, b and c but not to d, e and f.

E.g. Rhythmbox could decide to share metadata with Banshee, but not with
Evolution Data Server. But EDS could decide to share metadata with
Sylpheed, and also with Rhythmbox.

And they could all have their own ontology, although the vast majority
on a GNOME desktop would obviously use Nepomuk.
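
The per-session sharing idea can be sketched as a small lookup table.
Everything here is hypothetical (the SessionAcl type, the acls table,
the session_allows() function, and the lowercase application names used
as ids); nothing like it exists in Tracker today. It only shows that
sharing would be directional and decided per session: the table mirrors
the Rhythmbox/Banshee/EDS/Sylpheed example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: one session's store and the applications allowed in. */
typedef struct {
    const char *session_id;
    const char *allowed[4];   /* NULL-terminated list of application ids */
} SessionAcl;

static const SessionAcl acls[] = {
    /* Rhythmbox shares with Banshee, but not with EDS. */
    { "rhythmbox", { "rhythmbox", "banshee", NULL } },
    /* EDS shares with Sylpheed and with Rhythmbox. */
    { "eds",       { "eds", "sylpheed", "rhythmbox", NULL } },
};

/* May application app open the metadata stored under session_id? */
static bool
session_allows (const char *session_id, const char *app)
{
    size_t i, j;

    assert (session_id != NULL && app != NULL);

    for (i = 0; i < sizeof acls / sizeof acls[0]; i++) {
        if (strcmp (acls[i].session_id, session_id) != 0)
            continue;
        for (j = 0; j < 4 && acls[i].allowed[j] != NULL; j++) {
            if (strcmp (acls[i].allowed[j], app) == 0)
                return true;
        }
        return false;         /* known session, application not listed */
    }

    return false;             /* unknown sessions are closed to everybody */
}
```

Who would enforce such a table (the library, the per-session store
process, or the bus) is left open here, just as it is in the text.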

I don't think it makes much difference because libtracker-sparql
will always depend on a tracker-store (or whatever alias you use
for grouping components that update the DB - libtracker-data,
libtracker-fts, etc.).

Yes, there's no big technical difference. Just that tracker-store
disappears from sight and as a public API, and only libtracker-sparql
remains as a publicly exposed API.

How does tracker-store get activated? Hopefully with systemd. Otherwise
we just start and stop it ourselves in tracker_sparql_connection_get(),
and/or we use DBus service activation as we do today. Except that we'd
have a service per such session_id, app domain or group. Not sure how
that works with DBus.

I guess the desktop-nepomuk-ontology.deb could install the .service
file to make use of DBus activation. There are many possibilities.

Consider the application that only wants to query and not update
the DB - they don't want to depend on all the crap needed for
updating the DB, just the raw libtracker-sparql part.


Except that they already and always depend on tracker-store. You can't
avoid it (read libtracker-sparql's initialization code: you always and
necessarily depend on it).


Kind regards,

Philip



