Remove duplicate data

Posted: Sat Jan 28, 2012 11:11 pm
by arb
I have just started going through my DVDpedia library to clean up the data I have, and one feature I would dearly LOVE to have is the ability to clean up the data by removing duplicates. I don't mean duplicate DVDs, but rather duplicate entries in the Director, Producer, and Starring fields. Some data from various sources can contain duplicate entries in these fields, especially for TV series and box sets. If there were some way of automatically removing these duplicate entries it would be a godsend!

As an example: I have the Babylon 5 DVDs, and some of them list all the actors for each episode in the Starring field. Now obviously many actors appear in most (if not all) episodes, so their names are listed up to 20 times! Manually tidying these up is a very slow, and not particularly enjoyable, prospect. 8^/

Re: Remove duplicate data

Posted: Tue Jan 31, 2012 9:13 am
by Conor
Thank you for the feedback. That is quite a specialized task, but one that the computer would be excellent at. Since it is something that very few would need and it actually removes information, I have added it as part of the debugging commands in DVDpedia.

Hold down the Option key and click on the main Help menu; this brings up a few extra commands that are only used for debugging and for specialized tasks. You will find "Clean Duplicates in Fields". Run this command and it will go through all the multi-value fields in the program (those with a blue bubble) and remove any duplicates it finds. If you would like to know what it cleaned up, it logs the removed entries to the console log on your computer, which you can view with the Console program in your Applications/Utilities folder.

This new command is in Beta 19 of DVDpedia.
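
For anyone curious what this kind of clean-up amounts to, here is a minimal sketch in Python. It is not DVDpedia's actual code: the function name `remove_duplicates`, the placeholder names, and the choice to ignore case and surrounding spaces when comparing are all assumptions made for illustration. It keeps the first occurrence of each entry in its original order and collects the repeats so they can be logged.

```python
def remove_duplicates(values):
    """Split a multi-value field into (unique, removed).

    Hypothetical helper, not DVDpedia's implementation. The first
    occurrence of each entry is kept in its original order.
    """
    seen = set()
    unique = []
    removed = []
    for name in values:
        cleaned = name.strip()
        # Assumption: entries that differ only in case count as duplicates.
        key = cleaned.casefold()
        if not key:
            continue  # skip empty entries entirely
        if key in seen:
            removed.append(cleaned)
        else:
            seen.add(key)
            unique.append(cleaned)
    return unique, removed


# Placeholder data standing in for a Starring field on a box set.
starring = ["Actor A", "Actor B", "Actor A", "actor b ", "Actor C"]
unique, removed = remove_duplicates(starring)
print("kept:", unique)
print("removed:", removed)
```

A set tracks what has already been seen while a separate list preserves the order, so names stay in the order they first appeared instead of being re-sorted.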

Re: Remove duplicate data

Posted: Thu Feb 02, 2012 7:38 pm
by arb
Awesome! Works great as far as I can see. It cleaned my collection in a few seconds - much faster than I could have possibly done it manually. :D

Re: Remove duplicate data

Posted: Thu Feb 02, 2012 8:52 pm
by arb
Also, the "Validate Links on File" option was particularly valuable - I found a handful of DVDs that had invalid links (because I had reorganised my file server and some locations were changed) and I doubt I could ever have manually verified all the DVDs in my collection.
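
In case it helps anyone doing a similar check outside the program, the idea can be sketched in a few lines of Python. This is only a guess at the general approach, not how DVDpedia does it: `find_invalid_links` is a made-up name, and it assumes each link is a plain file path rather than a URL.

```python
import os


def find_invalid_links(links):
    """Return the linked paths that no longer exist on disk.

    Hypothetical helper; assumes every entry is a local or mounted
    file path, not a URL.
    """
    return [path for path in links if not os.path.exists(path)]


# Example: the current directory exists, the second path should not.
links = [os.getcwd(), os.path.join(os.getcwd(), "no-such-movie-file.m4v")]
for broken in find_invalid_links(links):
    print("invalid link:", broken)
```

Note that a path on a file server that is not currently mounted would also be reported as invalid, so it is worth connecting to the server before running a check like this.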