This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Don't ask for overwrite if MD5 hashes are equal?

weltio
Hello,
is there a function or plugin that provides this functionality?
If not, is there a how-to somewhere for writing a plugin? (Just to get into it.)
bcooksley
Unfortunately, this is relatively difficult to implement because:
1) Either of the files could be remote, which would mean both would have to be copied to the local system and then hashed. This could be quite data intensive (particularly if the copy was happening within a single remote host and could otherwise be optimised as one operation on that host).

2) Either file could be quite large, and generating a hash requires reading the full content of both files. This would be disk and memory intensive, would likely slow down the copy or move significantly, and would make Dolphin less responsive.
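
To make the cost in point 2 concrete, here is a minimal Python sketch of what such a check would have to do for two local files. It is not how Dolphin or KIO works; the helper names `md5_of_file` and `same_content` and the 1 MiB chunk size are assumptions made for illustration. Reading in chunks keeps memory use flat, but every byte of both files still has to be read from disk, and remote files are not handled at all.

```python
import hashlib
import os

CHUNK_SIZE = 1024 * 1024  # assumed chunk size: read 1 MiB at a time


def md5_of_file(path, chunk_size=CHUNK_SIZE):
    """Return the hex MD5 digest of a local file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source, destination):
    """Hypothetical helper: True if both local files match in size and MD5."""
    if os.path.getsize(source) != os.path.getsize(destination):
        return False  # sizes differ, so neither file needs to be read
    return md5_of_file(source) == md5_of_file(destination)
```

The size comparison is a cheap shortcut for files that differ in length; for two files of equal size the full read of both cannot be avoided with this approach.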


KDE Sysadmin
[Moviuro]
You should file an idea for that, letting users set up options such as "Size above which MD5 sums are no longer calculated" and rules about which files (not) to overwrite without asking (never overwrite pictures unless their name matches *tmp* or *temp*, never ask for files ending in .bak, .old or ~).
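
A rough Python sketch of how such options could be evaluated. Everything here is hypothetical: the setting names, the 100 MiB limit and the `overwrite_decision` function are made up for illustration and do not exist in Dolphin. It also assumes that "never ask" means "overwrite silently", and it leaves out the picture rule for brevity.

```python
import fnmatch
import os

# Hypothetical user settings (assumptions, not real Dolphin options)
MAX_HASH_SIZE = 100 * 1024 * 1024            # above this size, skip hashing
NEVER_ASK_PATTERNS = ["*.bak", "*.old", "*~"]  # overwrite these silently


def overwrite_decision(name, size):
    """Return 'overwrite', 'compare' or 'ask' for a conflicting file."""
    basename = os.path.basename(name)
    # Pattern rules come first: backup-style files are never prompted for
    if any(fnmatch.fnmatchcase(basename, p) for p in NEVER_ASK_PATTERNS):
        return "overwrite"
    # Small enough files get their checksums compared before any prompt
    if size <= MAX_HASH_SIZE:
        return "compare"
    # Everything else falls back to the normal overwrite dialog
    return "ask"
```

For example, `overwrite_decision("notes.txt~", 10)` yields `"overwrite"`, while a 4 GB ISO falls through to `"ask"`.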

Also, for big files, why not pick a random part of the file and compare the MD5 sums of that same part in both files? (E.g. for two 4 GB ISO files, the MD5 is calculated only for bytes 1,000,000 to 7,000,000.)
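
As a sketch of what that could look like for local files, in Python (the function name `md5_of_range` and the chunk size are assumptions for illustration):

```python
import hashlib


def md5_of_range(path, start, end, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of bytes [start, end) of a local file."""
    digest = hashlib.md5()
    remaining = end - start
    with open(path, "rb") as handle:
        handle.seek(start)
        while remaining > 0:
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:
                break  # the file is shorter than `end`
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

With the numbers from the example, the check would be `md5_of_range(a, 1_000_000, 7_000_000) == md5_of_range(b, 1_000_000, 7_000_000)`, which reads about 6 MB per file instead of 4 GB. It says nothing about the bytes outside that range.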


KDE 4.10.1 Archlinux x86_64 on both laptops :)
"Our life is the immortals' death"
scummos
[Moviuro] wrote: You should file an idea for that, letting users set up options such as "Size above which MD5 sums are no longer calculated" and rules about which files (not) to overwrite without asking (never overwrite pictures unless their name matches *tmp* or *temp*, never ask for files ending in .bak, .old or ~).
That might actually work.

[Moviuro] wrote: Also, for big files, why not pick a random part of the file and compare the MD5 sums of that same part in both files? (E.g. for two 4 GB ISO files, the MD5 is calculated only for bytes 1,000,000 to 7,000,000.)

That's a very bad idea, because it compares only parts of the files. It might therefore overwrite files even though they're not equal. With ISOs, for example, one file could be corrupted and have its latter half filled with zeros, and you wouldn't detect that this way.
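
The failure case is easy to reproduce. The following Python sketch uses small stand-in files instead of real ISOs; the file names, sizes and the `md5_of_range` helper are made up for the demonstration. The second file has its latter half zeroed, yet a hash over a range in the first half still matches.

```python
import hashlib
import os
import tempfile


def md5_of_range(path, start, end):
    """Hex MD5 of bytes [start, end) of a small local file."""
    with open(path, "rb") as handle:
        handle.seek(start)
        return hashlib.md5(handle.read(end - start)).hexdigest()


def demo():
    """Return (sampled ranges equal?, whole files equal?) for the two test files."""
    good = bytes(range(256)) * 32           # 8192 bytes of non-zero pattern
    corrupted = good[:4096] + bytes(4096)   # latter half replaced by zeros
    with tempfile.TemporaryDirectory() as tmp_dir:
        good_path = os.path.join(tmp_dir, "good.iso")
        bad_path = os.path.join(tmp_dir, "corrupted.iso")
        for path, data in ((good_path, good), (bad_path, corrupted)):
            with open(path, "wb") as handle:
                handle.write(data)
        # The sampled range lies entirely in the intact first half
        sampled_equal = (md5_of_range(good_path, 1000, 3000)
                         == md5_of_range(bad_path, 1000, 3000))
        full_equal = (md5_of_range(good_path, 0, 8192)
                      == md5_of_range(bad_path, 0, 8192))
    return sampled_equal, full_equal
```

Here `demo()` returns `(True, False)`: the sampled check reports a match although the files differ.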

Another problem: you tie automatic overwriting to comparing hashes, which I'm not sure I like. After all, hashes are not unique. With MD5, pairs of different files with the same checksum can be crafted deliberately, and one would then overwrite the other without question. This is very much a corner case, of course, but it could cause problems.

Plus, you introduce a lot of special cases that way, which might be hard for users to keep track of: filenames matching some pattern, a specific size, a specific MIME type...


I'm working on the KDevelop IDE.

