Hi Sam,

So, what happens when I read a blog like this and find this:

    Decision 3: Kick RDF in the Nuts

    RDF is a shitty data model. It doesn’t have native support for lists. LISTS for fuck’s sake! The key data structure that’s used by almost every programmer on this planet and RDF starts out by giving developers a big fat middle finger in that area. Blank nodes are an abomination that we need, but they are applied inconsistently in the RDF data model

is the following. My mind thinks: this guy is ranting. I used to be like this guy. I have better things to do. And: oh my god, not another guy who wants to create a standard.

As said, it's fine to add an output format for tracker-extract, but between the processes tracker-extract, tracker-miner-fs and tracker-store there's absolutely no need whatsoever to have JSON. TTL is the format that we focus on, and that we parse without effort given that it's part of SPARQL. JSON is not.

I think that something that converts from our output format to JSON-LD is probably the task of an AngularJS or cgi-bin frontend for some web server. Having this web server contact tracker-extract's IPC instead of tracker-store is something we don't support right now. But that doesn't mean it wouldn't be a good idea to create such a nice tracker-extract API (just note that you will have to become the maintainer of that API).

I think your frontend thingy could convert it to this format: https://www.w3.org/TR/sparql11-results-json/ , or to JSON-LD. However, tracker-extract could return it in TTL and/or using the FD passing technique which is also in use between libtracker-sparql and tracker-store.

And then tomorrow we'll all read another ranter's blog, and instead of JSON-LD we will use that in the frontend thingy. Fine.

Kind regards,

Philip

On Sun, 2016-04-10 at 22:15 +0100, Sam Thursfield wrote:
> Thanks for the quick feedback! You're right that I should have implemented Turtle output. I've done that now, this is the result (as you'd expect):
>
>     <urn:artist:Best%20Coast>
>         nmm:artistName "Best Coast" ;
>         rdf:type nmm:Artist .
>
>     <urn:album:The%20Only%20Place>
>         nmm:albumTitle "The Only Place" ;
>         rdf:type nmm:MusicAlbum ;
>         nmm:albumArtist <urn:artist:Best%20Coast> .
>
>     <urn:album-disc:The%20Only%20Place:Disc1>
>         nmm:setNumber 1 ;
>         nmm:albumDiscAlbum <urn:album:The%20Only%20Place> ;
>         rdf:type nmm:MusicAlbumDisc .
>
>     <file:///home/sam/Downloads/Best%20Coast%20-%20The%20Only%20Place.mp3>
>         nie:comment "Free download from http://www.last.fm/music/Best+Coast and http://MP3.com" ;
>         nmm:trackNumber 1 ;
>         nmm:performer <urn:artist:Best%20Coast> ;
>         nfo:averageBitrate 128000 ;
>         nmm:musicAlbum <urn:album:The%20Only%20Place> ;
>         nfo:channels 2 ;
>         nmm:dlnaProfile "MP3" ;
>         nmm:musicAlbumDisc <urn:album-disc:The%20Only%20Place:Disc1> ;
>         rdf:type nmm:MusicPiece , nfo:Audio ;
>         nfo:duration 164 ;
>         nfo:codec "MPEG" ;
>         nmm:dlnaMime "audio/mpeg" ;
>         nfo:sampleRate 44100 ;
>         nie:title "The Only Place" .
>
> I'm still kinda interested in JSON-LD, because JSON (though not JSON-LD) has such a massive user base already.
>
> Philip, JSON-LD *is* a W3C standard: <https://www.w3.org/TR/json-ld/>. The great thing about standards is there are so many! That said, all the W3C's previous attempts at RDF-in-JSON are quite bad; I think JSON-LD is definitely an improvement. There's a great blog post from the main guy behind the standard called "JSON-LD and Why I Hate the Semantic Web" which I recommend reading :-) <http://manu.sporny.org/2014/json-ld-origins-2/>
>
> Anyway, for my purposes, Turtle output from the extractors is fine (and a big improvement on SPARQL). I'll keep the JSON-LD stuff around in a separate commit.
>
> On Sat, Apr 9, 2016 at 12:49 PM, Carlos Garnacho <carlosg gnome org> wrote:
> > Hey Sam :),
> >
> > > so, inspired by something in the Python RDFLib library, I came up with a TrackerResource class that the extractors can use instead.
> > > This is a work in progress, but I have a branch in git.gnome.org that adds TrackerResource and converts some of the extractors to use it. The TrackerResource class can serialize either to SPARQL update commands or to JSON-LD. The branch also adds the `tracker extract` command from <https://bugzilla.gnome.org/show_bug.cgi?id=751991> so you can try out the extractors easily and specify `-o json` or `-o sparql` as you prefer.
> >
> > Nice! Should it have a turtle serializer too? Do you think this can be possibly used in the tracker store side to serialize contents?
>
> I hadn't thought of that, but it's definitely possible. You could have a `tracker serialize-the-whole-database` command :-)
>
> In terms of backups, part of me thinks we should use an efficient binary format... but then it's hard to trust a backup that is an opaque binary format. If we could serialize to Turtle or JSON-LD then you could tell just by looking whether it was valid or not. We can just gzip it to make it small.
>
> ...
>
> > > Here's an example of auto-generated SPARQL for an MP3 extraction:
> > >
> > > <snip>
> > >
> > > Note there are a lot more DELETE statements than before. I figured that anywhere we want to replace the existing data we need a DELETE statement, and the reason we don't normally do it is because previously it had to be done manually.
> > >
> > > That said, the TrackerResource class does have a way of avoiding this. If you ever call _set_value() for a property then it assumes you want to *overwrite* it, and will generate a DELETE. If you only use _add_value() then it will assume you want to *add* to it, and won't generate a DELETE. The latter case is needed for stuff like nao:hasTag. I may be misunderstanding things here of course, I didn't actually write any of the extractors myself.
> >
> > Sounds good :). It seems to me that the generated SPARQL already ensures some correctness, which is great. The difference between set and add makes sense, given that we have to deal with single- and multi-valued properties.
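
The set/add distinction described above can be sketched in a few lines of Python. This is only an illustration, not the TrackerResource implementation: the `Resource` class, its method names and the exact SPARQL text it emits are assumptions made for the example, and values are passed in already serialized (quoted literals, `<...>` URIs) to keep it short.

```python
class Resource:
    """Hypothetical stand-in for TrackerResource; not the real API."""

    def __init__(self, uri):
        self.uri = uri
        self.values = {}        # property -> list of already-serialized terms
        self.overwrite = set()  # properties assigned through set_value()

    def set_value(self, prop, value):
        # Overwrite semantics: drop earlier values and remember to emit a DELETE.
        self.values[prop] = [value]
        self.overwrite.add(prop)

    def add_value(self, prop, value):
        # Append semantics: keep existing values, no DELETE is generated.
        self.values.setdefault(prop, []).append(value)

    def to_sparql_update(self):
        statements = []
        # One DELETE per overwritten property, so stale values are removed first.
        for prop in sorted(self.overwrite):
            statements.append(
                f"DELETE {{ <{self.uri}> {prop} ?v }} "
                f"WHERE {{ <{self.uri}> {prop} ?v }}"
            )
        triples = " ; ".join(
            f"{prop} {value}"
            for prop, vals in self.values.items()
            for value in vals
        )
        if triples:
            statements.append(f"INSERT DATA {{ <{self.uri}> {triples} }}")
        return " ;\n".join(statements)
```

With this sketch, calling `set_value("nie:title", '"The Only Place"')` and then `add_value("nao:hasTag", "<urn:tag:1>")` (a made-up tag URI) yields one DELETE for nie:title and none for nao:hasTag, followed by a single INSERT DATA holding both values.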
> > The only potentially harmful combination would be doing add_value() on a single-valued property. Is there any way that could raise a warning in tracker-extract, rather than being caught late due to the failed insert?
>
> I don't think that's possible, because libtracker-sparql doesn't have any knowledge of the ontologies. We could move a bunch of code from libtracker-data to libtracker-sparql to make it happen, but I actually think it's a good design to have libtracker-sparql separate from Tracker's own database and Tracker's own ontologies.
>
> Sam
> _______________________________________________
> tracker-list mailing list
> tracker-list gnome org
> https://mail.gnome.org/mailman/listinfo/tracker-list
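
To make the add_value() concern concrete, here is a rough Python sketch of the kind of early check being asked about. It assumes the caller somehow has a table of single-valued properties, which is exactly the ontology knowledge the thread says libtracker-sparql lacks; the `SINGLE_VALUED` set and the `check_add_value` helper are hypothetical names invented for this example.

```python
import warnings

# Hypothetical cardinality table. In practice this would have to be derived
# from the ontology definitions, which the client library does not load.
SINGLE_VALUED = {"nie:title", "nmm:trackNumber"}


def check_add_value(existing_values, prop, single_valued=SINGLE_VALUED):
    """Warn when appending to a single-valued property that already has a value.

    Returns True when the append is safe, False when it would add a second
    value to a property that allows only one.
    """
    if prop in single_valued and existing_values.get(prop):
        warnings.warn(f"{prop} is single-valued but already has a value")
        return False
    return True
```

The check itself is trivial; the design question raised above is where the cardinality table should come from, and whether loading it is worth coupling the client library to Tracker's ontologies.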