"tags" in Nepomuk / Social Semantic Desktop

woodsmoke (Registered Member, Posts: 110, Karma: 0)
The whole KDE implementation, albeit in early stages, of the Nepomuk ....zeitgeist.. the Semantic Social Desktop......has really, at a very deep level, altered and enhanced my workflow, "serendipity of encounter".... hard to say....

But, because of this I have done considerable research, as much as my limited capabilities will allow on the "concept" and "implementation" of the whole thing...

To that end I would ask a question.

One of the "big things" about Nepomuk and the Semantic Social Desktop is that somehow information becomes more "interlinked".... to wit: a song file has a certain characteristic in a file structure and a document has another.

One of the, usually it seems, seldom discussed things about the Semantic Social Desktop, aside from the "desktop tab" or "cashew" is RDF/XML.

If one opens a Calligra document one finds under file/document information the third item down under "General": Rdf.

Exploration of Rdf provides in the first tab: Triples.

For the person who does not know to what this refers, it basically is a way to "identify" the document in a MUCH SIMPLIFIED: "subject, verb, object" manner.
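
To make the "subject, verb, object" idea concrete, here is a minimal sketch of triples held as plain Python tuples. The file paths and the predicate names (hasTag, mentions) are invented for illustration and are not taken from Nepomuk or Calligra; a real RDF store would use full URIs and a proper vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical data: a document and a song that share a tag.
triples = [
    Triple("file:///home/user/letter.odt", "hasTag", "holiday"),
    Triple("file:///home/user/letter.odt", "mentions", "file:///home/user/song.mp3"),
    Triple("file:///home/user/song.mp3", "hasTag", "holiday"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the fields that were given."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything tagged "holiday", whether it is a document or a music file.
tagged = [t.subject for t in match(triples, predicate="hasTag", obj="holiday")]
print(tagged)
```

The point of the sketch is only that a document and a music file become "interlinked" once they appear in statements that share a subject or an object.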

My question for the folks who know much more than the old woodsmoker is that:

I use the search function quite often and there has been advocacy at Kubuntu forums of the Recoll file finder which certainly presents things in a different manner than "Dolphin"...

But I had the rather naive idea, I guess, that the whole RDF/XML thing would allow applying "tags" to documents, in addition to the ultimate idea of all data being searchable, such as the word dog buried in the fifth paragraph of a document.

If one proceeds to the Namespaces tab one sees a variety of items which are not intuitively understandable to the old Woodsmoker.

However, in the Semantic tab there are three more understandable items, Event, Contact, and Location.

One would ASSUME and one knows what one gets with THAT... :) . One would assume that clicking the item might apply it to the document.

If one does click say.... "location" the line turns blue and one is kicked back to the original menu /tabs...but

I cannot seem to "find" anything on the document itself... and also...one does not seem to be able to specify "the" location.

Moving on to Stylesheets one can set the items of the Semantic "objects" in the document, but...again....the old woodsmoker does not seem to be able to perceive just HOW that is done...within the document.

Now yes, if one goes back to the General part, one sees title, subject, keywords, language, comments...etc...

What...the old woodsmoker thought was that possibly at this stage in the evolution of the Semantic/Social desktop that there would be a...

Trivial set of things that could be done to a document such as applying "tags".

Are the "Keywords".........the same as "tags"?

And, the old woodsmoker thought that possibly one could "tag" a ....word...such as in: "The dog jumped onto the wagon". That one could highlight and right click the word "dog" and somehow, possibly "tag" the word.

What one sees along with more mundane things is: Shape properties. WOAH!! OK...cooking with gas here,...

Because if one researches RDF/XML there is a lot of work with "shape" in terms of the triples... but...no...shape herein apparently is how the "shape" has text flow around it etc.

So....the ultimate question is this:

For the "new user" of Calligra, or LibreOffice...

What "practical" and probably MORE important.... "easy" things can a user do within a document, or to the document as a whole which will "interconnect" the document with...other things...such as say...a music file?

And that because...if one looks at properties of say an mp4, one sees the usual extensions and "permissions" but that is all, nothing that the old woodsmoker can see that would be able to provide linkage between the two files..

OR...is the old woodsmoker misunderstanding this whole thing.

Is all of this somehow just to be "automagically" done "behind the scenes" and that the Strigi/Nepomuk/whatever "search" would automagically somehow hook all sorts of stuff together at some time in the future..?

If so, that...it would seem...would require a horrendous amount of computing power....

Maybe the old woodsmoke just doesn't understand. :(

THANKS MUCHLY for reading all of this..

and awaiting possible answers...

woodsmoke
google01103 (Manager, Posts: 6668, Karma: 25)
Moved to the Semantic Desktop forum, as those involved in its development are more apt to follow that forum than a more generic general discussion one.


OpenSuse Leap 42.1 x64, Plasma 5.x

metzman (Registered Member, Posts: 171, Karma: 3)
woodsmoke wrote: maybe the old woodsmoke just doesn't understand. :(


Fear not, you are not alone.

Personal perspective, these views are entirely my own... etc, etc...

This whole "Semantic Thing" I just don't get.

I (now) exclusively use Recoll for indexing/searching, it has _real_ advanced search capability, IMHO, it's "the bee's knees". -- Tags & Keywords, yeah pretty much the same thing. "Meta-data", all that extra "stuff" that you attach to files. -- But it's not what you use, it's how you use it.

Hypothetical scenario... 'boy' meets 'girl'; at 'location', 'date' and 'time'; whilst listening to 'music', and enjoying 'food'... add more as you feel fit.
To record all of this we have (for example) photos, audio, and documents... Now, to "bring it all together", add the meta-data, then, with the clever use of AND, OR, NOT, EQUAL, !EQUAL, etc. we can, at some future date, conduct a search... The result of which is only as good as the data "tagging", and search query used. "Garbage In - Garbage Out"....
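
As a rough sketch of that kind of AND / OR / NOT search over hand-added meta-data: the file names and tags below are made up, and this is not how Recoll or Baloo work internally, just the logic of the query.

```python
# Hypothetical records; the tags stand in for whatever meta-data was added.
items = [
    {"file": "beach.jpg", "tags": {"girl", "beach", "2015"}},
    {"file": "dinner.odt", "tags": {"girl", "food", "2015"}},
    {"file": "song.mp3", "tags": {"music", "beach"}},
    {"file": "taxes.ods", "tags": {"2015", "finance"}},
]


def search(records, all_of=(), any_of=(), none_of=()):
    """AND over all_of, OR over any_of, NOT over none_of."""
    hits = []
    for rec in records:
        tags = rec["tags"]
        if not set(all_of) <= tags:
            continue  # a required tag is missing
        if any_of and not (set(any_of) & tags):
            continue  # none of the alternatives is present
        if set(none_of) & tags:
            continue  # an excluded tag is present
        hits.append(rec["file"])
    return hits


# 2015 AND (beach OR food) AND NOT finance
print(search(items, all_of=["2015"], any_of=["beach", "food"], none_of=["finance"]))
```

Note that song.mp3 is missed simply because nobody tagged it "2015", which is the "Garbage In - Garbage Out" problem in miniature.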

If one knows what one will be searching for in the future, then I guess it's relatively easy to ensure all of the relevant meta-data is added today... I for one have no idea what I'll be searching for in the future.

My own approach is you (no, should be I) can't go far wrong with the tried and tested: Title, Subject, Date, Time... approach. It seldom lets me down when I'm desperately trying to link apparently random items together; even throws up the odd surprise sometimes...

Auto-magic is fine... but it's far better if you know your own spells... ;)
woodsmoke (Registered Member, Posts: 110, Karma: 0)
LOL

thanks for the reply! :)

I've been intrigued by the whole thing about the "triples" and all that.. and I was surprised that "xml" is also involved.

One of the references that is made is to some kind of "metadata" which is like a "grid", implying that there would be some way for the "user" to somehow apply various "tags" like you............the more the better, or maybe "only the facts ma'am only the facts" so...dunno

One thing I noticed is that when one right clicks any file that one can link it to one of the "activities", in my case, the two I use are "desktop" and "writing".

But.. still

Another thing that has appeared in Kubuntu is that when one opens say...home ...in the file structure one sees "Dolphin" at the top but when one does a find there is this at the top:

baloosearch - / - Dolphin

So...Baloo is now somehow integrated into Dolphin.

It "might" be that in an "xml" type file one could "edit" the xml.

I actually reported on this some.... ten years ago when Microsith was pushing the .xml in something like a .docx and everybody in the Linux community was going nutso about not being able to open a .docx file.

In the properties one can view the "xml" quite easily, and extract the "text" and copy it into another type of document, now there is no formatting etc. it is just "plain text".

There is a bunch of other gobbledygook in the properties, and I have no clue about what the gobbledygook means or does, but if one understood it one could edit it, I'm sure.

And...it is probably that it only includes information about the "kind" of file, not any kind of "tags". but..dunno..

But it "seems" that what I posted in the OP that there are some "things" that can be "clicked" there at present, but they do not seem to me to be "editable"...maybe that is by intent, it is not supposed to be editable, or it may be that such an editing feature will be produced later.

But, it is all very intriguing.

One thing that I started doing years ago was adding "descriptor" terms at the top of a document that was more than the title. And one can do that in the present situation in properties...

BUT................one cannot search for them and obtain the document.

An example is that I placed the word "sarcodina" in the "keyword" field in a Calligra document and attempted a "find" and nothing was returned.

Changing the subject from Calligra, LibreOffice, automagically found the term "sarcodina" AND....has a lot more "things" in terms of menu items that can be applied than in Calligra...

So..if one cannot find the term "sarcodina" then...seemingly...it is not now a searchable term, but as an afterward would be a "descriptor" for somebody looking at the document in situ.

HOWEVER Again another interesting tidbit... if one right clicks a document and looks down in the menu there is a menu item for "activities" and one can "link" to my "desktop" or "writing"... nooowww if one goes up to the "view" tab in Dolphin and goes down through to "additional information" then "other", one sees that one can show "link destination".

So...I tried to see a link destination for the document that I linked to "writing" and even though a title in the view box appeared for "Link destination" it did not have any "link" shown in the box for that particular document, it was blank as were all of the other documents.

Also I tried searching for the term "sarcodina" using Recoll and again, the document was not returned. However... when I inserted the term "sarcodina" in a separate document as TEXT, Recoll immediately found it.

So...it is interesting. :)

woodsmoke
metzman (Registered Member, Posts: 171, Karma: 3)
Picking up on a few points...

woodsmoke wrote: One of the references that is made is to some kind of "metadata" which is like a "grid", implying that there would be some way for the "user" to somehow apply various "tags" like you............the more the better, or maybe "only the facts ma'am only the facts" so...dunno

Meta-data overall uses too many "variable standards" :P ... The ability to create/edit meta-data is very dependent upon the data it's being added to and the application adding it.

A few examples: Apologies in advance if I'm teaching you to suck eggs ;)

Image produced by a digital camera... would have "Exif" data added at the point of taking; this is primarily technical data that the camera "knows" about the image. Import that image using photo editing software and a limited amount of extra data can be added to the "Exif" meta-data, together with (generally) a couple more types of meta-data, IPTC and XMP, the latter gaining more widespread use. DigiKam is an excellent tool for managing photos, it has extensive meta-data editing capabilities...

Audio files... most types have the ability to store meta-data, this is all (unless the audio file has been purchased/downloaded) up to the user to add in some way. When using software to perform digital audio extraction from a CD for example, often data about the CD will be looked up using CDDB and meta-data is added at the time of extraction... Artist, Album, Track #, Track Title, Track Artist, Playing Time.... The subsequent audio file can have this data changed/added to by using numerous tools, my own choice is 'Kid3 ID3 Tagger'....

Text/Document files... if it's plain text then you're largely out of luck as far as meta-data goes, 'baloo' is able to add 'tags', 'rating', and 'comment' by using extended file attributes. Most other document files have various methods/formats to add meta-data.
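
For what "extended file attributes" means in practice, a minimal sketch follows. It assumes Linux (Python's os.getxattr / os.setxattr are Linux-only) and a filesystem with user xattrs enabled. The attribute name user.xdg.tags and the comma-separated format are my assumption about what Baloo/Dolphin write, so check your own files before relying on it.

```python
import os

# Assumption: tags live in this attribute as a comma-separated string.
TAGS_ATTR = "user.xdg.tags"


def encode_tags(tags):
    """Turn a collection of tags into the bytes stored in the attribute."""
    return ",".join(sorted(set(tags))).encode("utf-8")


def decode_tags(raw):
    """Turn the stored bytes back into a list of tags."""
    return [t for t in raw.decode("utf-8").split(",") if t]


def read_tags(path):
    """Return the tags on path, or [] if none are set or xattrs are unavailable."""
    if not hasattr(os, "getxattr"):
        return []
    try:
        return decode_tags(os.getxattr(path, TAGS_ATTR))
    except OSError:
        return []


def write_tags(path, tags):
    """Store tags on path; raises OSError if the filesystem refuses xattrs."""
    os.setxattr(path, TAGS_ATTR, encode_tags(tags))
```

Because the tags sit beside the file rather than inside it, they work even for plain text, but they can be lost when the file is copied to a filesystem or archive that does not carry extended attributes.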

So... :-\ How much to add... like yourself... dunno. One could spend hours adding meta-data that is never used; likewise, meta-data that's not there can't be indexed and subsequently found.... been there, done that, we probably both have the t-shirt.

... one can link it to one of the "activities", in my case, the two I use are "desktop" and "writing".

Don't use "Activities" myself... erm... I never quite "got it"... Maybe I'm missing something useful.

So...Baloo is now somehow integrated into Dolphin.

"Integrated" in the sense that you and I would use the term, yes. Baloo is the 'back-end' that scans, indexes, and extracts meta-data from files on your PC. This is then stored in a couple of databases for subsequent searching... To emphasise, *my personal opinion*: 'Baloo' is a "work in progress", with more than its share of bugs and teething problems.

It "might" be that in an "xml" type file one could "edit" the xml. ... snip ... I have no clue about what the gobbledygook means or does, but if one understood it one could edit I'm sure.

I would say "at one's own risk", seriously, not a good idea unless you were attempting to salvage data from a corrupted file. I won't disagree that there is plain text which could be edited ... in an OpenDocument format this would be the "content.xml", which, with care, could be edited. Wikipedia has quite a good overview of the structure of these files: https://en.wikipedia.org/wiki/OpenDocument_technical_specification
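
If the aim is only to look at the text rather than edit it, a read-only sketch like the following is the safer route. It assumes the file is an ordinary OpenDocument zip whose body text sits in text:p elements of content.xml; the helper name odf_paragraphs and the tiny in-memory sample are purely illustrative.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odf_paragraphs(odf_file):
    """Return the text of each text:p paragraph in content.xml (read-only)."""
    with zipfile.ZipFile(odf_file) as z:
        root = ET.fromstring(z.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


# Tiny stand-in for a real .odt so the sketch runs on its own.
SAMPLE = (
    "<office:document-content"
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    f' xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>The dog jumped onto the wagon.</text:p>"
    "</office:text></office:body></office:document-content>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("content.xml", SAMPLE)
buf.seek(0)

print(odf_paragraphs(buf))
```

With a real document one would pass the path to the .odt instead of the in-memory buffer; nothing is written back, so the original cannot be damaged.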

One thing that I started doing years ago was adding "descriptor" terms at the top of a document that was more than the title. And one can do that in the present situation in properties...

Effectively you were adding your own "meta-data"... I've done exactly the same thing myself in the past, and indeed still do with some documents. There are quite strong arguments for doing so in terms of knowing, that as part of the document body, it is always readily read/indexed...

BUT................one cannot search for them and obtain the document.

Using what application for the search? Dolphin/Baloo...

An example is that I placed the word "sarcodina" in the "keyword" ... snip ... Also I tried searching for the term "sarcodina" using Recoll and again, the document was not returned. However... when I inserted the term "sarcodina" in a separate document as TEXT, Recoll immediately found it.

I don't use Calligra myself but I'm certain it uses Open Document Format (ODF) as its main file format. Recoll should certainly have returned it in its result list; my first guess is maybe a setup problem, or the file was not indexed yet. What indexing schedule were you using?
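
One way to narrow it down would be to check whether the keyword actually made it into the saved file. In ODF, document keywords are normally stored as meta:keyword elements in meta.xml inside the zip; the sketch below reads them. The helper name odf_keywords and the in-memory sample are illustrative only, and I have not verified what Calligra itself writes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def odf_keywords(odf_file):
    """Return the meta:keyword values from meta.xml, or [] if there is no meta.xml."""
    with zipfile.ZipFile(odf_file) as z:
        if "meta.xml" not in z.namelist():
            return []
        root = ET.fromstring(z.read("meta.xml"))
    return [k.text or "" for k in root.iter(f"{{{META_NS}}}keyword")]


# Tiny stand-in for a real document with one keyword set.
META_SAMPLE = (
    "<office:document-meta"
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    f' xmlns:meta="{META_NS}">'
    "<office:meta><meta:keyword>sarcodina</meta:keyword></office:meta>"
    "</office:document-meta>"
)
meta_buf = io.BytesIO()
with zipfile.ZipFile(meta_buf, "w") as z:
    z.writestr("meta.xml", META_SAMPLE)
meta_buf.seek(0)

print(odf_keywords(meta_buf))
```

If the keyword shows up here for the real file, it was saved and the question becomes whether the indexer reads that field; if it does not, the keyword never reached the file in the first place.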

Recoll is an extremely versatile and powerful search application, written out of a "real life" need rather than as a "software project". (Disclaimer, I've no connection with it other than as a very satisfied user). "As is" it works well, to customize it to one's own needs takes a while, but, my <insert your own deity> it's well worth the effort. I for one will not be going back to 'Baloo'....

As an aside, if you needed any help with Recoll, I'd be quite happy, within reason, and if able, to offer assistance. The author is extremely helpful also.

So...it is interesting.

Bottom line is we all need to find a solution that "works for me"... if we (as a community) can help each other along the way, that's great.

 