Hi!
...
Yes, this could be a good way, but then all users pay for MD5
generation on every photo. I think only users who use the Duplicate
feature should spend time on MD5 generation, unless we find other uses
for the MD5 that justify making all users suffer this loading time.
I have to agree with this opinion. This bothered me too. But I think
it would be a good idea to store the MD5 hashes once they are created. I
saw this feature in gthumb, where it was a bit slow. Here we have the
opportunity to store the created hashes in SQL for later use. So
perhaps when you run a duplicate search it would be a good idea to
store the hashes as a side effect. The next search would then only need
to generate hashes for the new images and run an SQL query.
Yes, I think this is the best idea: create the MD5 only when the
duplicate feature is used. Storing the MD5 in the database could also be
a good idea, yes! When you load the photo data from the database, the
MD5 can be loaded too if it exists, so you don't need to recreate it
later.
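To make the "compute once, then reuse" idea concrete, here is a rough sketch. Md5Cache is a hypothetical helper, not existing F-Spot code, and the dictionary only stands in for the MD5 field that would be loaded from the database:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not F-Spot code: computes MD5 hashes on demand and
// remembers them so that each file is hashed at most once.
public class Md5Cache
{
    // Stand-in for the MD5 database field: photo path -> hex digest.
    private readonly IDictionary<string, string> stored;

    // Number of hashes that actually had to be computed from file contents.
    public int ComputedCount { get; private set; }

    public Md5Cache(IDictionary<string, string> stored)
    {
        this.stored = stored;
    }

    public string GetHash(string path)
    {
        string hash;
        if (stored.TryGetValue(path, out hash))
            return hash; // already known, no file I/O needed

        hash = ComputeHash(path);
        stored[path] = hash; // a real version would also write this to the database
        ComputedCount++;
        return hash;
    }

    public static string ComputeHash(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(fs);
            StringBuilder hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
                hex.Append(b.ToString("x2")); // lowercase hex, two digits per byte
            return hex.ToString();
        }
    }
}
```

Note that a stored hash goes stale if the file is edited afterwards; this sketch does not try to detect that.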
And one last thing. With your original idea you can't warn the user
against importing the same image twice.
No, if you lose the MD5 you can't. So you are thinking about showing
the user a dialog when she tries to import photos that are already in
the albums, no? The user can then choose "Don't import any repeated
photos" or "Import all the repeated photos". This could be a nice
feature. We annoy the user with a question, but I think she will like to
be informed about it :)
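The check behind such a dialog could look roughly like this. ImportCheck and FindRepeated are made-up names for illustration, and the sketch assumes the MD5 of each candidate file and of each photo already in the albums is available as a hex string:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical import check, not F-Spot code.
public static class ImportCheck
{
    // candidates: (path, MD5 hex) pairs for the files about to be imported.
    // libraryHashes: MD5 hex strings of the photos already in the albums.
    // Returns the candidate paths whose MD5 is already known, either from
    // the library or from an earlier file in the same import batch.
    public static List<string> FindRepeated(
        IEnumerable<KeyValuePair<string, string>> candidates,
        IEnumerable<string> libraryHashes)
    {
        HashSet<string> seen = new HashSet<string>(libraryHashes);
        List<string> repeated = new List<string>();
        foreach (KeyValuePair<string, string> c in candidates)
        {
            // HashSet.Add returns false when the hash was already present.
            if (!seen.Add(c.Value))
                repeated.Add(c.Key);
        }
        return repeated;
    }
}
```

If the returned list is not empty, the import code would show the dialog and then either skip those paths or import everything.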
So, to settle some points:
1. If the user doesn't use the Duplicate feature, she won't spend any
time creating MD5s. The only extra time will be loading the data from
the MD5 database field. I think this time should be minimal, because the
MD5 field is loaded together with lots of other fields.
2. If the user selects the Duplicate feature then:
2.1 If she has selected a group of photos, the duplicate code will work
on this selection. A "Duplicate" tag will be created if it doesn't exist
and all the duplicate photos will be marked with this Duplicate tag.
The Duplicate tag checkbox will be selected so that the main window
shows only the duplicate photos and the user can work with them.
Probably she will delete one or more of the copies, if they exist. Maybe
we can preselect for the user all the photos except one original per
duplicate group.
2.2 If she doesn't select any photos, we will work with all the photos.
In both 2.1 and 2.2 we may need to show a progress dialog.
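A rough sketch of the grouping and preselection in point 2, assuming the MD5 of every photo in the working set is already known. DuplicateFinder and its methods are hypothetical names, and the tagging and UI parts are left out:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical duplicate grouping, not F-Spot code.
public static class DuplicateFinder
{
    // photos: (path, MD5 hex) pairs for the selection, or for all photos
    // when nothing is selected. Returns only groups with two or more
    // members, in the order their hash was first seen.
    public static List<List<string>> GroupDuplicates(
        IEnumerable<KeyValuePair<string, string>> photos)
    {
        Dictionary<string, List<string>> byHash = new Dictionary<string, List<string>>();
        List<string> hashOrder = new List<string>(); // hashes in first-seen order
        foreach (KeyValuePair<string, string> p in photos)
        {
            List<string> group;
            if (!byHash.TryGetValue(p.Value, out group))
            {
                group = new List<string>();
                byHash[p.Value] = group;
                hashOrder.Add(p.Value);
            }
            group.Add(p.Key);
        }

        List<List<string>> duplicates = new List<List<string>>();
        foreach (string h in hashOrder)
        {
            if (byHash[h].Count > 1)
                duplicates.Add(byHash[h]);
        }
        return duplicates;
    }

    // Keeps the first photo of each group as the "original" and returns
    // every other copy, i.e. the photos to preselect for the user.
    public static List<string> Preselect(List<List<string>> groups)
    {
        List<string> selected = new List<string>();
        foreach (List<string> g in groups)
        {
            for (int i = 1; i < g.Count; i++)
                selected.Add(g[i]);
        }
        return selected;
    }
}
```

Treating the first photo seen as the original is an arbitrary choice made for the sketch. Also, matching on MD5 only finds byte-identical files; a resized or re-saved copy gets a different hash.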
How does it sound?
Cheers
Hubidubi
I think the MD5 for photos could be cached in a hash table. This is what
I do in the current implementation.
Some numbers: computing the MD5 for these photos
acs amigo:~/fotos/airport extreme$ ls -l
total 1360
-rwxr--r-- 1 acs root 364713 2005-01-05 14:26 dsc00045.jpg
-rwxr--r-- 1 acs root 330323 2005-01-05 14:26 dsc00046.jpg
-rwxr--r-- 1 acs root 324022 2005-01-05 14:26 dsc00047.jpg
-rwxr--r-- 1 acs root 344558 2005-01-05 14:27 dsc00048.jpg
and measuring the MD5 computation with DateTime.Now.Ticks (I am sure it
isn't the most accurate way to do it) on my computer (Dell X300 with 256
MB RAM and Pentium(R) M processor 1200MHz):
First time:
MD5 compute: 00:00:00.0769270
MD5 compute: 00:00:00.0290020
MD5 compute: 00:00:00.0200700
MD5 compute: 00:00:00.0204300
Second time:
MD5 compute: 00:00:00.0199370
MD5 compute: 00:00:00.0174230
MD5 compute: 00:00:00.0176300
MD5 compute: 00:00:00.0184470
Third time:
MD5 compute: 00:00:00.0219800
MD5 compute: 00:00:00.0203260
MD5 compute: 00:00:00.0194000
MD5 compute: 00:00:00.0199240
Fourth time:
MD5 compute: 00:00:00.0284410
MD5 compute: 00:00:00.0254680
MD5 compute: 00:00:00.0252140
MD5 compute: 00:00:00.0277300
So even with fairly small photos (1024x768) hashing takes roughly 20 to
30 ms per photo. Taking 30 ms, a collection of 6000 photos costs 180
seconds (3 minutes) if it is all done at once. A really bad first
experience for the user. Currently, these 3 minutes are spread over the
import process, which is itself a bit slow.
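For anyone who wants to repeat the measurement, here is a small sketch that uses System.Diagnostics.Stopwatch instead of DateTime.Now.Ticks and redoes the estimate above. Md5Timing, TimeHash and EstimateTotal are names invented for this example:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

// Hypothetical timing helper, not F-Spot code.
public static class Md5Timing
{
    // Measures how long it takes to read one file and compute its MD5.
    public static TimeSpan TimeHash(string path)
    {
        Stopwatch sw = Stopwatch.StartNew();
        using (FileStream fs = File.OpenRead(path))
        using (MD5 md5 = MD5.Create())
        {
            md5.ComputeHash(fs);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    // Extrapolates a per-photo cost to a whole collection.
    public static TimeSpan EstimateTotal(int photoCount, double msPerPhoto)
    {
        return TimeSpan.FromMilliseconds(photoCount * msPerPhoto);
    }
}
```

EstimateTotal(6000, 30) gives 180 seconds, the 3 minutes mentioned above. My guess is that the slower first value in the first run comes from a cold file cache and runtime start-up, but I have not verified that.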
Cheers
-- Alvaro
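P.S.: To compute the MD5 I use the following code (it needs System.IO,
System.Security.Cryptography and System.Text):

// Hash the file contents; "using" closes the stream when done.
using (FileStream fs = new FileStream(photo.Path, FileMode.Open,
                                      FileAccess.Read)) {
    MD5 md5ServiceProvider = new MD5CryptoServiceProvider();
    byte[] md5 = md5ServiceProvider.ComputeHash(fs);
    // Render the 16-byte digest as a lowercase hex string.
    StringBuilder hash = new StringBuilder();
    for (int pos = 0; pos < md5.Length; pos++) {
        hash.Append(md5[pos].ToString("x2"));
    }
}
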
taken from Mono bugzilla.
_______________________________________________
F-spot-list mailing list
F-spot-list gnome org
http://mail.gnome.org/mailman/listinfo/f-spot-list