
Common Nepomuk questions and answers

einar (Administrator)
Q. What is the Nepomuk Semantic Desktop, and the Nepomuk File Indexer?

A. The Nepomuk Semantic Desktop is the foundation of all the other modules of the Nepomuk infrastructure. It provides a way to organize, annotate, and build relationships among data (not only file names and content, but also, for example, which applications used a certain file, or how it is tagged). A number of KDE applications and workspaces use this basic infrastructure to deliver features such as email tagging (KMail) or activity setup (Plasma).

The Nepomuk File Indexer, on the other hand, indexes files so that they are added to the main Nepomuk repository automatically, which makes them usable within Nepomuk without adding each file manually. Applications such as Dolphin can then search for files based on content, name, or other metadata (e.g. tags) associated with indexed files. The indexer can also index non-text files, such as PDFs, by reading the metadata contained in these files (author, publication information, etc.). Some KDE components ship additional "analyzers" for more file types. Nepomuk is fully functional without the File Indexer, which is an additional (and optional) component.

Q. How can I disable the semantic desktop?

A. Most of the time, the easiest way is to disable file indexing, which is usually the heaviest of the Nepomuk components in resource usage (although many optimizations that reduce resource usage have been included in both 4.7.3 and Soprano 2.7.3). To do this, uncheck "Enable Nepomuk File Indexer" in the "Desktop Search" section of System Settings. If you want to turn off all semantic features, uncheck "Enable Nepomuk semantic desktop". Note that this will turn off search in Dolphin as well.

Note that with the latter option some programs that use Nepomuk for metadata will offer reduced functionality: for example, KMail will not be able to tag mail, and Plasma activities will not offer additional features such as icons or program data information.

Q. Why do I have nepomukservicestub processes even though I've disabled Nepomuk?

A. It may be a bug. Please file a bug report at http://bugs.kde.org with a complete description of your problem and the steps to trigger it.

Q. Sometimes Nepomuk, Virtuoso, Strigi, or Akonadi consumes too much RAM.

A. Many of these problems have been fixed, particularly in the 4.7.3 release and in Soprano 2.7.3. Please make sure both are installed. If Soprano 2.7.3 is not available, please contact your distribution and ask them to update it, because it contains many major fixes compared to earlier versions. In other cases, however, the developers are unable to reproduce the issues. In those cases, adding examples and test cases to bug reports increases the chances of getting these bugs fixed.

Q. File indexing of PDF/some other file types doesn't work.

A. PDF indexing was fixed in the 4.7.3 release. If you have issues with other file types, file a bug report, preferably attaching a sample file that shows the problem.

Q. The program nepomukservicestub crashes at startup.

A. A large number of crashes have been fixed in the 4.7.3 release of the KDE Workspaces and Applications. If you encounter more, please file bug reports with detailed instructions on how to reproduce the problem, as sometimes the developers are unable to trigger them in their test setups.

Q. The Virtuoso-t process hangs at 100% CPU.

A. Virtuoso-t is a key component of the Nepomuk infrastructure, and on some occasions the commands sent by the other components end up taking too much time (which shows up as 100% CPU usage). Sebastian Trueg (the lead developer of Nepomuk) has fixed most of these problems in 4.7.1 and newer, in particular with the combination of 4.7.3 and Soprano 2.7.3.

Q. Sometimes Nepomuk consumes too much RAM.

A. Many of these problems have been fixed. In other cases, however, the developers are unable to reproduce the issues. In those cases, adding examples and test cases to bug reports increases the chances of getting these bugs fixed.

Q. Nepomuk re-indexes files at startup.

A. This bug was fixed in 4.7.0. Nepomuk now just "scans" for changes at startup, without re-indexing files that have not changed.
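To illustrate the difference between scanning for changes and re-indexing everything, here is a minimal sketch based on file modification times. It is not how Nepomuk is implemented; changed_since is a hypothetical helper, and the stamp file stands in for "the time of the last indexing run".

```shell
#!/bin/sh
# Illustration only: NOT Nepomuk's actual implementation.
# changed_since is a hypothetical helper name.

# List the files under DIR that were modified after STAMP_FILE was last touched.
# usage: changed_since DIR STAMP_FILE
changed_since() {
    if [ -f "$2" ]; then
        # Only files with a modification time newer than the stamp are reported.
        find "$1" -type f -newer "$2"
    else
        # No stamp yet (first run): every file counts as new.
        find "$1" -type f
    fi
}
```

A scan of this kind only reads file timestamps, so it is much cheaper than reading and indexing the content of every file again. After a run you would update the stamp with something like `touch STAMP_FILE`.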

Q. Nepomuk accesses the disk too much on startup.

A. In 4.7 and newer this problem has been lessened thanks to a throttling mechanism implemented in the file indexer.

Q. My Nepomuk database has been corrupted. How do I clean it?

A. In the extreme case that your database is really corrupted and all other attempts have failed, you can delete the $KDEHOME/share/apps/nepomuk directory (where $KDEHOME is usually .kde or .kde4 in your home directory) while Nepomuk is not running. The database will be cleared, but you will also lose existing information such as tags, ratings and comments.
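A more cautious variation on the step above is to move the directory aside instead of deleting it, so the old data can still be inspected or put back. This is only a sketch: nepomuk_reset is a hypothetical helper name, and it assumes Nepomuk has already been stopped.

```shell
#!/bin/sh
# Sketch only: run this while Nepomuk is NOT running.
# nepomuk_reset is a hypothetical helper, not a KDE tool.

# Move the Nepomuk data directory aside and print the new location.
# usage: nepomuk_reset NEPOMUK_DIR
nepomuk_reset() {
    dir="${1%/}"                       # drop a trailing slash, if any
    [ -d "$dir" ] || { echo "nothing to reset at $dir" >&2; return 1; }
    aside="$dir.corrupt.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$aside" && echo "$aside"
}
```

For example: nepomuk_reset "$KDEHOME/share/apps/nepomuk". Once you are sure the old data is no longer needed, the directory that was moved aside can be deleted by hand.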

Q. How do I make sure Nepomuk metadata are preserved when moving files?

A. Use Dolphin or Konqueror to move your files and the metadata will be preserved, unless you move the files to a folder that has been explicitly excluded from indexing. In that case, the metadata will be lost.

Q. How do I back up Nepomuk data (for example doing a reinstall)?

A. The Nepomuk backup/restore system has been broken since the 4.7.x releases and has not been fixed at the time of writing (4.8.0), so you need to back up your Nepomuk database manually. A possible method is as follows:
To back up:
  • Stop Nepomuk using the Control Panel.
  • Locate where the database is stored: echo `kde4-config --localprefix`share/apps/nepomuk/
  • Make a complete copy of the folder "repository" to a safe place.
To restore:
  • Stop Nepomuk using the Control Panel.
  • Locate where the database is stored: echo `kde4-config --localprefix`share/apps/nepomuk/
  • Remove, or rename, the folder "repository".
  • Restore the copy of the "repository" folder from the safe place.
  • Start Nepomuk using the Control Panel.
If you only need to back up and restore your tags, descriptions and ratings, there is a utility called Neposidekick (http://kde-apps.org/content/show.php?content=137233) that performs the task, storing this information in a hidden file in each indexed folder.
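The manual backup and restore steps above can be sketched as a small shell script. This is only a sketch under assumptions: nepomuk_dir, nepomuk_backup and nepomuk_restore are hypothetical helper names, Nepomuk must already be stopped before running them, and restore renames the current "repository" folder instead of removing it.

```shell
#!/bin/sh
# Sketch only: stop Nepomuk before calling nepomuk_backup or nepomuk_restore.
# All three function names are hypothetical helpers, not KDE tools.

# Print the Nepomuk data directory. An optional argument overrides the
# prefix reported by kde4-config (useful for a dry run on a test folder).
nepomuk_dir() {
    prefix="${1:-$(kde4-config --localprefix)}"
    printf '%s/share/apps/nepomuk\n' "${prefix%/}"
}

# Copy the "repository" folder into BACKUP_DIR.
# usage: nepomuk_backup BACKUP_DIR [PREFIX]
nepomuk_backup() {
    src="$(nepomuk_dir "$2")/repository"
    [ -d "$src" ] || { echo "no repository at $src" >&2; return 1; }
    mkdir -p "$1" && cp -Rp "$src" "$1/"
}

# Put the "repository" folder from BACKUP_DIR back in place.
# usage: nepomuk_restore BACKUP_DIR [PREFIX]
nepomuk_restore() {
    dir="$(nepomuk_dir "$2")"
    [ -d "$1/repository" ] || { echo "no backup in $1" >&2; return 1; }
    # Rename the current repository rather than deleting it.
    if [ -d "$dir/repository" ]; then
        mv "$dir/repository" "$dir/repository.old.$$" || return 1
    fi
    mkdir -p "$dir" && cp -Rp "$1/repository" "$dir/"
}
```

For example, nepomuk_backup "$HOME/nepomuk-backup" before a reinstall and nepomuk_restore "$HOME/nepomuk-backup" afterwards, then start Nepomuk again. The backup location is just an example path.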

Thanks go to Sebastian Trueg, lead Nepomuk developer, alin from the #opensuse-kde channel, and Ignacio Serantes from the forums for answering these common questions.


"Violence is the last refuge of the incompetent."
Plasma FAQ maintainer - Plasma programming with Python
karthikp (Registered Member)
An excellent write up! Thanks!

I've largely copied (for now) the content of your post over to Nepomuk's UserBase page, which is a better place for this information.


karthikp, proud to be a member of KDE forums since 2008.
Ignacio Serantes (Registered Member)
I want to point out that $KDEHOME is mentioned in the answers, but this environment variable is not always set. As far as I know, the most reliable way to locate the Nepomuk path is:

`kde4-config --localprefix`share/apps/nepomuk/

because on some systems there are both a .kde and a .kde4 directory.

Some examples:
cd `kde4-config --localprefix`share/apps/nepomuk/
echo `kde4-config --localprefix`share/apps/nepomuk/
ls `kde4-config --localprefix`share/apps/nepomuk/


Ignacio Serantes, proud to be a member of KDE forums since 2008-Nov.
robert76 (Registered Member)
Thank you for the information about Nepomuk. Regarding the scan upon login, I was just wondering why this is at all necessary. Can't Nepomuk just track the changes made to files as the user makes them? Why waste CPU energy etc to run a scan at login? It's not like any changes could have been made to files whilst the user was logged out!
karthikp (Registered Member)
robert76 wrote:Regarding the scan upon login, I was just wondering why this is at all necessary. Can't Nepomuk just track the changes made to files as the user makes them? Why waste CPU energy etc to run a scan at login? It's not like any changes could have been made to files whilst the user was logged out!


Strictly speaking, that's not true. Files can be changed without the user needing to log in through X. Besides, I suppose nepomuk might do an integrity check as well to make sure its indices are okay.


karthikp, proud to be a member of KDE forums since 2008.
robert76 (Registered Member)
karthikp wrote:Strictly speaking, that's not true. Files can be changed without the user needing to log in through X. Besides, I suppose nepomuk might do an integrity check as well to make sure its indices are okay.


Ok, fair enough. But I'm talking about an average guy using his computer, with no folders shared with other user accounts, no changes being made outside X, etc. For such a person, there's no reason for the scan to take place at every login. Unless, of course, it really is necessary to do an integrity check so often? Perhaps it would be good to offer some extra preferences so that Nepomuk (and its Strigi component) can better adapt to different usage patterns?

It's interesting that Spotlight from Apple doesn't seem to do a rescan at every login - I wonder how this system manages it?
karthikp (Registered Member)
robert76 wrote:
karthikp wrote:Strictly speaking, that's not true. Files can be changed without the user needing to log in through X. Besides, I suppose nepomuk might do an integrity check as well to make sure its indices are okay.


Ok, fair enough. But I'm talking about an average guy using his computer, with no folders shared with other user accounts, no changes being made outside X, etc. For such a person, there's no reason for the scan to take place at every login. Unless, of course, it really is necessary to do an integrity check so often? Perhaps it would be good to offer some extra preferences so that Nepomuk (and its Strigi component) can better adapt to different usage patterns?

It's interesting that Spotlight from Apple doesn't seem to do a rescan at every login - I wonder how this system manages it?


The problem comes down to defining an average guy. I consider myself average, and I'm not beyond ssh'ing into my machines and editing files remotely. Hence, to be on the safe side, nepomuk (if enabled) should probably do a quick check when you log in.

I stopped using os x before spotlight made its appearance. I only have a vague idea of what it is. If it merely indexes files by name, that's easy. updatedb and locate do much the same. If it indexes files by their content, it will need to be able to read through the files, like strigi does.

I suspect the answer also has something to do with what nepomuk is capable of. Nepomuk isn't a dumb backend that stores and returns metadata. It's designed with the intent of being an intelligent system that can see connections that may not be immediately apparent. That's the "semantic" part you keep hearing about. Although frankly, I don't see it in action. It stores and retrieves tags and metadata well enough, but I have yet to be wow'ed by what nepomuk does. If anything, I've usually been left puzzling over why it sometimes can't find some files even though I'm searching for them by an exact match...


karthikp, proud to be a member of KDE forums since 2008.
robert76 (Registered Member)
karthikp wrote:The problem comes down to defining an average guy. I consider myself average, and I'm not beyond ssh'ing into my machines and editing files remotely. Hence, to be on the safe side, nepomuk (if enabled) should probably do a quick check when you log in.


In computer terms, you definitely aren't an 'average guy' if you're happy "sshing" into your machines! Your average guy will have absolutely no idea what you're talking about!

karthikp wrote:I stopped using os x before spotlight made its appearance. I only have a vague idea of what it is. If it merely indexes files by name, that's easy. updatedb and locate do much the same. If it indexes files by their content, it will need to be able to read through the files, like strigi does.


I use Spotlight on an old iBook G4 (from 2005) and it appears to be pretty much the same kind of thing as Nepomuk+Strigi. I can add comments to files, which Spotlight will index, and it indexes files by name and content. When you first turn Spotlight on it does a huge indexing of all your files, which does indeed take quite a while and it takes its fair share of CPU - just like Nepomuk/Strigi during the first indexing. The main difference after that is that it only registers further changes to files as you make them - it doesn't bother with this scanning at every login.

I believe that I read somewhere that the Nepomuk/Strigi scan at login is restricted to a certain level of CPU usage, which reduces the impact it has on the system. This seems like a good idea. I was just wondering why such a scan was necessary in the first place. Yes, as you have explained, there are reasons for this when taking into account experienced Linux users (people who know what on earth 'sshing' is!), but it could be good to also offer some settings that better fit the needs of non-expert users.

 