Invelos Forums->DVD Profiler: Plugins
Tool: Cast/Crew Edit 2
DJ Doena
Quoting StaNDarD:
Quote:
Well, let's take a rough guess. You have 5300590 people. Hundreds of first and middle names, hundreds of surnames. And of course you have roughly 112 BYs. So there are at least (100*100*100*112) 112'000'000 different combinations. So what's the chance of getting two actresses with the same common name and the same BY when you just have 5'300'590 people? But yeah, it happens: <nm0001853> Vanessa Williams (1963) and <nm0004539> Vanessa Williams (1963)!


What I meant was, for example, you have a John Smith with no BY and ID nm1231985, and a John Smith born 1985. Then it would clash.

There's nothing I could do about the two Vanessa Williams because they both have an actual BY. It just happens to be the same year.
Karsten
DVD Collectors Online

StaNDarD
Quoting DJ Doena:
Quote:
What I meant was, for example, you have a John Smith with no BY and ID nm1231985, and a John Smith born 1985. Then it would clash.

There's nothing I could do about the two Vanessa Williams because they both have an actual BY. It just happens to be the same year.

Yes, I understood that. It was just to show that, even if the chances are low, Murphy's law says it will happen.
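As a rough check on the guess made earlier in the thread (5,300,590 people spread over about 112,000,000 name/BY combinations), the expected number of clashing pairs can be estimated with the usual birthday-problem approximation. This is only a sketch: it assumes people are spread uniformly over the combinations, which real names are not, and the function name is made up for illustration.

```python
def expected_colliding_pairs(people: int, combinations: int) -> float:
    """Expected number of pairs that share one name/BY combination.

    Assumes every person lands on a combination uniformly at random,
    so each of the people*(people-1)/2 pairs collides with
    probability 1/combinations.
    """
    return people * (people - 1) / (2 * combinations)


people = 5_300_590                      # figure quoted in the thread
combinations = 100 * 100 * 100 * 112    # the thread's rough guess
pairs = expected_colliding_pairs(people, combinations)
print(f"{combinations:,} combinations, about {pairs:,.0f} expected colliding pairs")
```

Under these assumptions the estimate comes out at well over a hundred thousand pairs, so a clash like the two Vanessa Williams entries is not a freak accident. Because common names are far more frequent than rare ones, the real figure is likely to differ considerably.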

So what about the Roman numerals? Would it be possible to convert them into a unique BY value?
DJ Doena
I'll look into it.
Karsten
DVD Collectors Online

Corma
How about this (but I'm probably the only one who likes it):

Add an option to screw all real BYs and instead put the last four digits of the IMDb ID into everyone's BY field.

I think this would speed things up a lot because CCE2 never has to touch a person's IMDb page. Or maybe once, to grab a headshot if that option is enabled. I'm not sure about common name changes, but I think CCE2 finds them without visiting a person's own page?!

Pro:

1: speed, speed, speed
2: nothing to worry about, just survive the copy & paste marathon
3: maximum number of working crosslinks
4: maybe an easy change to the existing CCE2?

The negative side:

1: no more real BYs in the profiler, but that's a price I'm happy to pay

2: afterwards you must scroll through the complete local cast & crew list to copy headshots from one entry to another. Some headshots will be hard to find because of completely different names in the collection and on IMDb, but the majority are right next to each other in the list

3: If there are people with the same name and the same ID / fake BY (I doubt it), there is still a minimal number of messed-up crosslinks.
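The "last four digits" idea can be sketched in a few lines. This is an illustration only, not how CCE2 works: the function name is hypothetical, and it assumes IMDb name IDs have the form "nm" followed by digits, as in the examples quoted in this thread.

```python
import re


def fake_by_from_imdb_id(imdb_id: str) -> int:
    """Derive a fake birth year from the last four digits of an IMDb name ID.

    Assumes IDs look like 'nm1231985' (the form used in this thread).
    """
    match = re.fullmatch(r"nm(\d{4,})", imdb_id)
    if match is None:
        raise ValueError(f"not an IMDb name ID: {imdb_id!r}")
    return int(match.group(1)[-4:])


# The two Vanessa Williams entries get different fake BYs ...
print(fake_by_from_imdb_id("nm0001853"), fake_by_from_imdb_id("nm0004539"))

# ... but a fake BY can still coincide with someone else's real BY:
no_by_person = ("John Smith", fake_by_from_imdb_id("nm1231985"))
real_by_person = ("John Smith", 1985)
print(no_by_person == real_by_person)
```

The last comparison is the John Smith clash described earlier in the thread. It only arises when real and fake BYs are mixed, which is why the proposal above replaces all real BYs.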
StaNDarD
Quoting Corma:
Quote:
3: If there are people with the same name and the same ID / fake BY (I doubt it), there is still a minimal number of messed-up crosslinks.

As there's still a chance of getting double entries, I'd still prefer to get the Roman numerals. On the other hand, I don't care too much about the real BYs. Correct linking is my ultimate goal.
Corma
Quoting StaNDarD:
Quote:

As there's still a chance of getting double entries,


Yeah, but even so, we could make a list of these people. Look at the other lists here on the Invelos forums that try to keep people separated. I'm sure this would be the shortest one.
StaNDarD
Quoting Corma:
Quote:
Yeah, but even so, we could make a list of these people. Look at the other lists here on the Invelos forums that try to keep people separated. I'm sure this would be the shortest one.

Yeah, but if it's to be done, why not do it right? When there's a way to have real unique IDs, why take another?
Corma
Quoting StaNDarD:
Quote:

Yeah, but if it's to be done, why not do it right? When there's a way to have real unique IDs, why take another?


People with a unique name don't have a Roman numeral, and I don't know what happens if a second person with that name appears on IMDb. At least in theory it might be possible that the older one gets the II and the newer one the I. The digits from the ID should always stay the same, no matter if the real BY or the common name changes - or at least I think so.

But the main reason I now prefer the four digits from the ID is that CCE2 should be able to create the cast & crew tables a lot faster, since there is no need to leech data from all the people's pages.

edit: another example / question:

Let's say Jennifer Smith (III) marries and her common name changes completely. Will Jennifer Smith (IV) automatically become (III)? Or will the next Jennifer Smith added to IMDb become (III)? Or does (III) stay empty in case she divorces and the name changes back?
StaNDarD
Quoting Corma:
Quote:
People with a unique name don't have a Roman numeral, and I don't know what happens if a second person with that name appears on IMDb. At least in theory it might be possible that the older one gets the II and the newer one the I. The digits from the ID should always stay the same, no matter if the real BY or the common name changes - or at least I think so.

That's just what you have now: when a common name changes, you have to change it manually in DVDP. (OK, you would then have to change 2 persons.)

Quoting Corma:
Quote:
But the main reason I now prefer the four digits from the ID is that CCE2 should be able to create the cast & crew tables a lot faster, since there is no need to leech data from all the people's pages.

As I fetch headshots, that's no argument on my side, but I see it's a pro for you.

Quoting Corma:
Quote:
edit: another example / question:

Let's say Jennifer Smith (III) marries and her common name changes completely. Will Jennifer Smith (IV) automatically become (III)? Or will the next Jennifer Smith added to IMDb become (III)? Or does (III) stay empty in case she divorces and the name changes back?

No. It will stay afaik.
Corma
Another idea and a completely different approach:

An improved logfile. Right now it logs only the current session in chronological order. If it could be sorted by name, log all sessions, and log all conflicts, including two persons with no BY, it would be a great help.
StaNDarD
Quoting StaNDarD:
Quote:
So what about the Roman numerals? Would it be possible to convert them into a unique BY value?

This is not as hard as I thought it would be. This works in PHP:
Quote:
function RomanToArabic ( $roman ) {
    $arabic = 0;
    $convert = array('M' => 1000, 'CM' => 900, 'D' => 500, 'CD' => 400, 'C' => 100, 'XC' => 90,
      'L' => 50, 'XL' => 40, 'X' => 10, 'IX' => 9, 'V' => 5, 'IV' => 4, 'I' => 1);
    foreach ( $convert as $numeral => $value ) {
        while ( substr ( $roman, 0, strlen ( $numeral ) ) == $numeral ) {
            $arabic = $arabic + $value;
            $roman = substr ( $roman , strlen ( $numeral ) );
        }
    }
    return $arabic;
}

As I just realized that many non-BY persons got incorrectly connected to persons with a BY that I already had in my DB, I think I'd like to have an option to assign a fake BY to every person without a real one.
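To make the Roman numeral idea concrete, here is a minimal sketch that splits a name such as "Vanessa Williams (II)" into the plain name and a numeric value that could serve as a fake BY. It assumes the numeral appears in parentheses at the end of the name, as in the "Jennifer Smith (III)" example above. The function names are hypothetical, and whether DVD Profiler would accept such small BY values is not covered here.

```python
import re
from typing import Optional, Tuple

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Assumed format: "<name> (<ROMAN>)" with the numeral at the very end.
SUFFIX = re.compile(r"^(?P<name>.*\S)\s*\((?P<roman>[IVXLCDM]+)\)$")


def roman_to_int(roman: str) -> int:
    """Convert a well-formed Roman numeral by scanning from right to left."""
    total = 0
    previous = 0
    for letter in reversed(roman):
        value = ROMAN_VALUES[letter]
        # A smaller letter standing left of a larger one is subtracted (IV, IX, ...).
        total += -value if value < previous else value
        previous = value
    return total


def split_name(imdb_name: str) -> Tuple[str, Optional[int]]:
    """Return (plain name, numeral value), or (name, None) if there is no numeral."""
    cleaned = imdb_name.strip()
    match = SUFFIX.match(cleaned)
    if match is None:
        return cleaned, None
    return match.group("name"), roman_to_int(match.group("roman"))


print(split_name("Vanessa Williams (II)"))
print(split_name("Jennifer Smith (III)"))
print(split_name("John Smith"))
```

Like the PHP function above, this does no input checking on the numeral itself; it trusts the source to supply valid numerals.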
DJ Doena
Currently I don't have the time to work on any solution (busy life right now  ).

But I already had the Roman numeral conversion as a challenge on my plate. My code includes input checking (I probably wouldn't need it for IMDb, but it increased the challenge).

Here's my C# solution to the problem:

Quote:

using System;
using System.Collections.Generic;

namespace RomanNumerals
{
    class Program
    {
        static void Main()
        {
            try
            {
                Dictionary<Char, UInt16> romanNumerals;
                String input;
                List<UInt16> numbers;
                UInt16 highestNumber;
                String highestNumeral;
                UInt16 result;
                UInt16 dCount;
                UInt16 cCount;
                UInt16 lCount;
                UInt16 xCount;
                UInt16 vCount;
                UInt16 iCount;

                romanNumerals = new Dictionary<Char, UInt16>(7);
                romanNumerals.Add('I', 1);
                romanNumerals.Add('V', 5);
                romanNumerals.Add('X', 10);
                romanNumerals.Add('L', 50);
                romanNumerals.Add('C', 100);
                romanNumerals.Add('D', 500);
                romanNumerals.Add('M', 1000);
                Console.WriteLine("Please enter a roman numeral and press <Enter>");
                input = Console.ReadLine().ToUpper();
                numbers = new List<UInt16>(input.Length);
                #region Input Checking
                highestNumber = 1000;
                highestNumeral = "M";
                dCount = 0;
                cCount = 0;
                lCount = 0;
                xCount = 0;
                vCount = 0;
                iCount = 0;
                foreach(Char letter in input)
                {
                    UInt16 number;

                    if(romanNumerals.TryGetValue(letter, out number) == false)
                    {
                        Console.WriteLine("'{0}' is not a Roman numeral", letter);
                        return;
                    }
                    numbers.Add(number);
                }
                for(Int32 i = 0; i < numbers.Count; i++)
                {
                    if(i < numbers.Count - 1)
                    {
                        if(numbers[i] < numbers[i + 1])
                        {
                            UInt16 substraction;

                            if(numbers[i] == 1)
                            {
                                if((numbers[i + 1] != 5) && (numbers[i + 1] != 10))
                                {
                                    Console.WriteLine("You can subtract 'I' only from 'V' or 'X'");
                                    return;
                                }
                            }
                            else if(numbers[i] == 10)
                            {
                                if((numbers[i + 1] != 50) && (numbers[i + 1] != 100))
                                {
                                    Console.WriteLine("You can subtract 'X' only from 'L' or 'C'");
                                    return;
                                }
                            }
                            else if(numbers[i] == 100)
                            {
                                if((numbers[i + 1] != 500) && (numbers[i + 1] != 1000))
                                {
                                    Console.WriteLine("You can subtract 'C' only from 'D' or 'M'");
                                    return;
                                }
                            }
                            else
                            {
                                Console.WriteLine("You cannot subtract '{0}' from '{1}'", input[i], input[i + 1]);
                                return;
                            }
                            substraction = (UInt16)(numbers[i + 1] - numbers[i]);
                            if(substraction > highestNumber)
                            {
                                Console.WriteLine("You cannot have '{0}' following '{1}'", input[i].ToString() + input[i + 1].ToString(), highestNumeral);
                                return;
                            }
                            else
                            {
                                // No equality check here: after a pair such as XC the limit is 9,
                                // so a following IX (also 9) is valid (XCIX = 99). A repeated pair
                                // such as IXIX is already rejected by the '>' check above.
                                highestNumeral = input[i].ToString() + input[i + 1].ToString();
                                highestNumber = (UInt16)(numbers[i] - 1);
                                i++;
                                continue;
                            }
                        }
                    }
                    if(numbers[i] > highestNumber)
                    {
                        Console.WriteLine("You cannot have '{0}' following '{1}'", input[i], highestNumeral);
                        return;
                    }
                    else
                    {
                        Boolean abort;

                        if(DetermineMaxSequentials(input, numbers, i, 500, ref dCount, 1, "one", out abort) == false)
                        {
                            if(DetermineMaxSequentials(input, numbers, i, 100, ref cCount, 3, "three", out abort) == false)
                            {
                                if(DetermineMaxSequentials(input, numbers, i, 50, ref lCount, 1, "one", out abort) == false)
                                {
                                    if(DetermineMaxSequentials(input, numbers, i, 10, ref xCount, 3, "three", out abort) == false)
                                    {
                                        if(DetermineMaxSequentials(input, numbers, i, 5, ref vCount, 1, "one", out abort) == false)
                                        {
                                            DetermineMaxSequentials(input, numbers, i, 1, ref iCount, 3, "three", out abort);
                                        }
                                    }
                                }
                            }
                        }
                        if(abort)
                        {
                            return;
                        }
                        highestNumber = numbers[i];
                        highestNumeral = input[i].ToString();
                    }
                }
                #endregion
                #region Calculation
                result = 0;
                for(Int32 i = 0; i < numbers.Count; i++)
                {
                    if(i == numbers.Count - 1)
                    {
                        result += numbers[i];
                    }
                    else
                    {
                        if(numbers[i] < numbers[i + 1])
                        {
                            UInt16 substraction;

                            substraction = (UInt16)(numbers[i + 1] - numbers[i]);
                            result += substraction;
                            i++;
                        }
                        else
                        {
                            result += numbers[i];
                        }
                    }
                }
                #endregion
                Console.WriteLine("Result: {0}", result);
            }
            catch(Exception ex)
            {
                Console.WriteLine("Exception: {0}", ex.Message);
            }
            finally
            {
                Console.WriteLine("Press <Enter> to exit");
                Console.ReadLine();
            }
        }

        private static Boolean DetermineMaxSequentials(String input, List<UInt16> numbers, Int32 i, UInt16 value, ref UInt16 count, UInt16 maxCount, String numberWord
            , out Boolean abort)
        {
            abort = false;
            if(numbers[i] == value)
            {
                count++;
                if(count > maxCount)
                {
                    Console.WriteLine("You cannot have more than {0} '{1}'", numberWord, input[i]);
                    abort = true;
                }
                return (true);
            }
            return (false);
        }
    }
}


BTW:

If you didn't know: you can compile C# without Visual Studio as long as the .NET Framework is installed.

Copy this code into a file called roman.cs and run the following command on a command line (please adapt paths as necessary).

Quote:

c:\Windows\Microsoft.NET\Framework\v2.0.50727\csc.exe roman.cs
Karsten
DVD Collectors Online

StaNDarD
Ooh, I've never programmed in C# or anything similar. My programming skills are PHP and BASIC (yeah, from my C64 times).

Maybe I'll do it in PHP - I got cast working yesterday* (the challenge hit me), but crew is a little harder, I guess...

*Well, it's working in a simple way, but I need to do some additional things like separating roles with '/' and ignoring "(<language> version)" to get it nearly as perfect as your solution is.
TomGaines
Quoting DJ Doena:
Quote:
But I already had the Roman numeral conversion as a challenge on my plate. My code includes input checking (I probably wouldn't need it for IMDb, but it increased the challenge).

Here's my C# solution to the problem:


Your code looked too complex for me, so I thought I'd check what Google tells me. I took the code from the first hit, made a slight adjustment, because the original had no input checking, and this is the result. Of course the following only has basic input checking, whereas your code also tells the user what exactly is wrong. It is similar to StaNDarD's solution.

Quote:

using System;
using System.Collections.Generic;
using System.Text;

namespace ConsoleApplication1
{
  class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Please enter a roman numeral and press <Enter>");
      string input = Console.ReadLine().ToUpper();
      int ret = ConvertRomanNumtoInt(input);
      if (ret != -1)
        Console.WriteLine("Result: {0}", ret);
      else
        Console.WriteLine("Invalid input!");

      Console.WriteLine("Press <Enter> to exit");
      Console.ReadLine();
    }

    public static int ConvertRomanNumtoInt(string strRomanValue)
    {
      Dictionary<string, int> RomanNumbers = new Dictionary<string, int>
        {
        {"M", 1000},
        {"CM", 900},
        {"D", 500},
        {"CD", 400},
        {"C", 100},
        {"XC", 90},
        {"L", 50},
        {"XL", 40},
        {"X", 10},
        {"IX", 9},
        {"V", 5},
        {"IV", 4},
        {"I", 1}
        };
      int retVal = 0;
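      // Note: this relies on the dictionary enumerating in insertion order
      // (largest numeral first), which Dictionary<TKey, TValue> does not formally guarantee.
      foreach (KeyValuePair<string, int> pair in RomanNumbers)
      {
        // Consume the current numeral from the front of the string as often as it occurs.
        while (strRomanValue.StartsWith(pair.Key, StringComparison.Ordinal))
        {
          retVal += pair.Value;
          strRomanValue = strRomanValue.Substring(pair.Key.Length);
        }
      }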
      if (!strRomanValue.Equals(""))
        return -1;
      return retVal;

    }
  }
}



DJ Doena
The biggest part of my code is input checking.

The code you found accepts IVIV = 8, IXI = 10, IIII = 4 and so forth.
Karsten
DVD Collectors Online

StaNDarD
Quoting DJ Doena:
Quote:
The biggest part of my code is input checking.

The code you found accepts IVIV = 8, IXI = 10, IIII = 4 and so forth.

Collecting that data from IMDb shouldn't yield any invalid numerals.

But if you're trying to code it for general-purpose use, you're totally right.
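The difference between the two approaches discussed above can be shown in a short sketch: a greedy table-driven conversion (the technique used in the PHP and the second C# snippet) next to a variant that first validates the input against a strict pattern. The regular expression is a commonly used one for numerals from 1 to 3999 and is an assumption here, not something taken from either poster's code; the function names are made up for this example.

```python
import re

NUMERALS = [
    ("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
    ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
    ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1),
]

# Strict form: thousands, hundreds, tens, ones, each in canonical notation.
STRICT = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")


def greedy_roman_to_int(roman: str) -> int:
    """Greedy conversion: repeatedly strip the largest matching numeral."""
    total = 0
    for numeral, value in NUMERALS:
        while roman.startswith(numeral):
            total += value
            roman = roman[len(numeral):]
    if roman:
        raise ValueError("leftover characters: " + roman)
    return total


def strict_roman_to_int(roman: str) -> int:
    """Reject anything that is not a canonical numeral before converting."""
    if not roman or STRICT.fullmatch(roman) is None:
        raise ValueError("not a canonical Roman numeral: " + repr(roman))
    return greedy_roman_to_int(roman)


for candidate in ("XCIX", "IVIV", "IXI", "IIII"):
    try:
        strict = strict_roman_to_int(candidate)
    except ValueError:
        strict = "rejected"
    print(candidate, "greedy:", greedy_roman_to_int(candidate), "strict:", strict)
```

For data scraped from IMDb the greedy version is probably enough, as noted above; the strict check only matters when the input comes from an untrusted source such as a user typing at a console.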