Re: [Tracker] Using tracker extractors from other applications
- From: Nikolaus Rath <Nikolaus rath org>
- To: tracker-list gnome org
- Subject: Re: [Tracker] Using tracker extractors from other applications
- Date: Sat, 20 Nov 2010 12:40:23 -0500
Nikolaus Rath <Nikolaus-BTH8mxji4b0 public gmane org> writes:
Now that I know what to look for, I did some Googling as you suggested
and found a Python module for parsing turtle
(http://rdflib.googlecode.com/) so I will probably use that.
In case someone else is interested: the following Python code uses
tracker to extract the plain text content of a file:
import os
import textwrap

import dbus
from rdflib.graph import ConjunctiveGraph
from rdflib.namespace import Namespace
from rdflib.parser import StringInputSource
from rdflib.term import URIRef


def print_plain_text(path):
    # Prefix declarations to prepend to the extractor output before parsing
    prefix = textwrap.dedent('''\
    @prefix nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#> .
    @prefix nfo: <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#> .
    @prefix nco: <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#> .
    @prefix nmo: <http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#> .
    @prefix ncal: <http://www.semanticdesktop.org/ontologies/2007/04/02/ncal#> .
    @prefix nexif: <http://www.semanticdesktop.org/ontologies/2007/05/10/nexif#> .
    @prefix nid3: <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#> .
    ''')
    nie_ns = Namespace(URIRef('http://www.semanticdesktop.org/ontologies/2007/01/19/nie#'))
    url = 'file://' + os.path.abspath(path)

    # Connect to the tracker-extract service on the session bus
    bus = dbus.SessionBus()
    proxy = bus.get_object('org.freedesktop.Tracker1.Extract',
                           '/org/freedesktop/Tracker1/Extract')
    tracker = dbus.Interface(proxy, 'org.freedesktop.Tracker1.Extract')

    # Empty MIME type; the second element of the reply is what gets parsed
    meta = tracker.GetMetadata(url, '')[1].toPython()

    # Use the file URL as the subject of the returned statements, then parse
    graph = ConjunctiveGraph()
    graph.parse(StringInputSource(prefix + '<%s>' % url + meta), format='n3')

    # Collect every nie:plainTextContent value attached to the file
    contents = [x[2].toPython() for x
                in graph.triples((URIRef(url), nie_ns.plainTextContent, None))]
    print('\n'.join(contents))
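One caveat about the URL construction above: concatenating 'file://' with the absolute path gives a string that is not a valid URI when the path contains spaces, '#', '?' or similar characters, and a '>' in the path would also break the '<%s>' wrapper used for parsing. A minimal sketch of a percent-encoding helper follows, assuming Python 3 and a POSIX path layout. The names path_to_file_uri and file_uri_to_path are hypothetical and not part of tracker, dbus or rdflib, and I have not checked how tracker itself treats escaped versus unescaped URIs.

```python
import os
from urllib.parse import quote, unquote, urlsplit


def path_to_file_uri(path):
    """Return a percent-encoded file:// URI for a local path (hypothetical helper)."""
    absolute = os.path.abspath(path)
    # quote() keeps '/' unescaped by default and escapes spaces, '#', '?', '>', ...
    return 'file://' + quote(absolute)


def file_uri_to_path(uri):
    """Inverse of path_to_file_uri, handy for checking the round trip."""
    return unquote(urlsplit(uri).path)


if __name__ == '__main__':
    uri = path_to_file_uri('/tmp/my notes #1.txt')
    print(uri)
    print(file_uri_to_path(uri))
```

If this is used, the same encoded string should go both into the GetMetadata call and into URIRef(url), so that the subject written in front of the statements matches the one queried afterwards.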
    
Best,
   -Nikolaus
-- 
 »Time flies like an arrow, fruit flies like a Banana.«
  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C