This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Baloo failing to index some filenames

greggb
Registered Member
I've noticed that sometimes I am unable to find files that I know exist. It seems that the filename is sometimes not saved as one of the search terms.

For example:

-rw-r--r-- 1 ...  15078 Nov  5  2014 /home/.../Offloaded Directories/D023 FileList.txt
-rw-rw-r-- 1 ... 102619 Nov  5  2014 /home/.../Offloaded Directories/D023 TrackList.csv
-rw-r--r-- 1 ...  15838 Jan 19 17:20 /home/.../Offloaded Directories/D024 FileList.txt
-rw-rw-r-- 1 ... 109366 Jan 19 17:25 /home/.../Offloaded Directories/D024 TrackList.csv


A search for D023 gets the expected results:
$ baloosearch D023
  15951 /home/.../bkup/_offloaded-directories/D023 FileList.txt
  184541 /home/.../Offloaded Directories/D023 FileList.txt
  15950 /home/.../bkup/_offloaded-directories/D023 TrackList.csv
  184542 /home/.../Offloaded Directories/D023 TrackList.csv

Whereas a search for D024 returns only two files which contain the string "d024", not the D024 files themselves.

However, the D024 files have been indexed:
$ balooshow ~/.../Offloaded\ Directories/D024\ FileList.txt
184543 /home/.../Offloaded Directories/D024 FileList.txt
        Line Count: 263

And a search for text in the files is successful.

I was wondering if this is a known problem and if there is some way around it?

Currently running PCLinuxOS KDE 4.14.7
vHanda
KDE Developer
I know about a similar problem with '_' which I've fixed for the next release. Not sure about this one. Run `balooshow -x <fileUrl>` for it to output all the words that file contains.
greggb
Registered Member
The version of balooshow I have doesn't seem to have that option:

$ balooshow -x ~/.../D024\ FileList.txt
balooshow: Unknown option 'x'.

$ balooshow -v
Qt: 4.8.6
KDE Development Platform: 4.14.7
Baloo Show: 0.1

I did poke around in the database:

$ delve file -t d024
Posting List for term `d024' (termfreq 1, collfreq 2, wdf_max 2): 121648
(Interestingly, baloosearch returns two files for d024 but the Milou widget returns nothing.)

$ balooshow 121648
121648 /media/.../Inbox.msf
Line Count: 69180

I redirected the output of $ delve -r 184543 to a file and compared it with the results for 184541 (D023 FileList.txt), and I think I am closer to the cause of the problem.

The file name terms for 184543 (D024 FileList.txt) are: Ffilelist Ftxt

I think this is because the file was originally named FileList.txt and was then moved and renamed (using Dolphin). However, I should note that the same is true of D023 FileList.txt and earlier files. A file I moved and renamed yesterday also exhibits the same problem.
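For anyone repeating this comparison, here is a minimal sketch of how two saved term dumps could be compared. It assumes each dump (for example the redirected delve -r output) is plain text with whitespace-separated terms, and that file name terms carry an F prefix as seen above. The function names f_terms and only_in_first are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Baloo or Xapian.

# Print the F-prefixed (file name) terms found in a saved term dump.
# Assumption: terms are separated by spaces, tabs, or newlines.
f_terms() {
    tr -s ' \t' '\n' < "$1" | grep '^F' | LC_ALL=C sort -u
}

# Print file name terms present in the first dump but absent from the second.
only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    f_terms "$1" > "$a"
    f_terms "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"   # column 1 only: lines unique to the first file
    rm -f "$a" "$b"
}
```

Calling only_in_first with the dump of a correctly indexed file first and the dump of a problem file second would list the file name terms the problem file lacks.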

Hope this helps.
vHanda
KDE Developer
I apologize, but I cannot provide support for or actually fix anything in the KDE 4 version.

With Plasma 5, lots of code has changed. In fact between Plasma 5.3 and 5.4 (August 2015), we have even replaced Xapian.
greggb
Registered Member
Understood... Fortunately, I am curious about Plasma 5 and have a fully updated install of KaOS. (Plasma 5.3.1 Qt 5.4.1)

The same issue seems to exist in that version too. It is easy to recreate...

Create a text file with some searchable terms:
$ baloosearch twordone

  24 /home/gregg/Documents/TestFile.txt

$ balooshow -x ~/Documents/TestFile.txt
24 /home/gregg/Documents/TestFile.txt
        Line Count: 2

Xapian Internal Info

Words: testfile twordone twordtwo txt

Prefixed Words: DT_M2015-06-05T13:42:28 DT_MD5 DT_MM6 DT_MY2015 Ftestfile Ftxt Mtext/plain Tdocument Ttext Z2

lineCount: 2

Then rename the file:
$ baloosearch twordone

  24 /home/gregg/Documents/TestFile i1.txt

[gregg@KaOS ~]$ balooshow -x ~/Documents/TestFile\ i1.txt
24 /home/gregg/Documents/TestFile i1.txt
        Line Count: 2

Xapian Internal Info

Words: testfile twordone twordtwo txt

Prefixed Words: DT_M2015-06-05T13:42:28 DT_MD5 DT_MM6 DT_MY2015 Ftestfile Ftxt Mtext/plain Tdocument Ttext Z2

lineCount: 2

As you can see, the rename is recognised, but the new name is not reflected in the search terms.

However, if the file is then edited:
$ baloosearch twordthree

  25 /home/gregg/Documents/TestFile i1.txt

[gregg@KaOS ~]$ balooshow -x ~/Documents/TestFile\ i1.txt
25 /home/gregg/Documents/TestFile i1.txt
        Line Count: 4

Xapian Internal Info

Words: i1 testfile twordone twordthree twordtwo txt

Prefixed Words: DT_M2015-06-05T14:02:50 DT_MD5 DT_MM6 DT_MY2015 Fi1 Ftestfile Ftxt Mtext/plain Tdocument Ttext Z2

lineCount: 4

The new file name is added to the search terms.
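The check above can be scripted. This is only a rough sketch: it assumes the balooshow -x output has been saved to a file and uses the "Prefixed Words:" line format shown in this post (later versions print something different). It also assumes file names are lower-cased and split on non-alphanumeric characters to form the F terms, which is inferred from the output above rather than from the Baloo source. All function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for spotting stale file name terms.

# Derive the F-prefixed terms we would expect for a given path.
# Assumption: lower-case the base name, split on non-alphanumerics.
expected_terms() {
    printf '%s\n' "${1##*/}" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -cs '[:alnum:]' '\n' \
        | sed -e '/^$/d' -e 's/^/F/' \
        | LC_ALL=C sort -u
}

# Read saved `balooshow -x` output on stdin and print its F-prefixed terms.
indexed_terms() {
    sed -n 's/^Prefixed Words: //p' \
        | tr ' ' '\n' \
        | grep '^F' \
        | LC_ALL=C sort -u
}

# $1 = file path, $2 = file holding the saved `balooshow -x` output.
# Prints every expected file name term that the index does not contain.
missing_terms() {
    indexed=$(indexed_terms < "$2")
    expected_terms "$1" | while read -r term; do
        printf '%s\n' "$indexed" | grep -qxF -- "$term" || printf '%s\n' "$term"
    done
}
```

Run against the output captured right after the rename, missing_terms for "TestFile i1.txt" should print Fi1. Against the output captured after the edit it should print nothing.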
vHanda
KDE Developer
Right. The filename isn't actually updated in the xapian db.

Fortunately, this is fixed in master :)

vlap:~ $ touch twodrone
vlap:~ $ baloosearch twodrone
/home/vishesh/twodrone
Elapsed: 0.261148 msecs
vlap:~ $ balooshow -x twodrone
114458524796321811 19 26649452 /home/vishesh/twodrone

Internal Info
Terms: Mapplication Moctet Mstream
File Name Terms: Ftwodrone twodrone
XAttr Terms:

vlap:~ $ mv twodrone magikarp
vlap:~ $ baloosearch twodrone
Elapsed: 0.170912 msecs
vlap:~ $ baloosearch magikarp
/home/vishesh/magikarp
Elapsed: 0.248865 msecs
vlap:~ $ balooshow -x magikarp
114458524796321811 19 26649452 /home/vishesh/magikarp

Internal Info
Terms: Mapplication Moctet Mstream
File Name Terms: Fmagikarp magikarp
XAttr Terms:

greggb
Registered Member
So... fixed in the next release :)

Thanks,
Gregg

