Tuthur
Haha CEO
Hello, I've been made aware that World Cup pools were done manually, as there is as far as I know no tool that has been developped for their generation. As a reminder, World Cup pools generation (usually) follows the following rules:
While I am too lazy to prove this problem is NP-complete; it looks like one. That's why I wanted to tackle it using my favorite method to solve NP-complete problems: Mixed Integer Programming (MIP). The next section will detail the MIP model used in this program. If you are uneducated about MIP or just want to read about to use the final program, you can skip this part. The model should be quite straightforward, but feel free to ask questions if you have any.
Now onto the program itself. The only widely accessible (read free and performing) MIP solver to my knowledge is Google's OR-TOOLS. I am not experienced at all with UI, let alone graphical ones, so if you want to use this program you'd have to manually run the Python files in the ZIP file and also download the Google's OR-TOOLS library. The file "templateMaker.py" handles the whole MIP model, you should only touch it if you want to edit the MIP model / toy with it. The important file is "playerRandomization.py" and the one that you need to execute to generate your pools. In order to use it, you have to edit the line 49 to change the parameters of the call of the main function. The first parameter is the name of a text file containing the list of starters per country (example attached) and the second is the number of players per pool. Execute the script and it should print in your console the list of players per pool, as well as the execution time (the later can be commented out).
The example given here lets me generate a bracket that fulfill every constraints for DOUWC (13 teams, 8 starters per team, 4 players per pool) just under one hour with my 4 years old laptop. I was also able to generate ZUWC like (12 teams, 6 starters per team, 4 players per pool) in around 20 seconds.
There is a lot of room for improvement in this script, even beside the lack of a practical UI. The execution time is quite long due to the high amount of variables, despite the vast majority of them being deterministic of only a small set of them (solo(p, t)). The problem also has many symetry axes, while I broke some with forcing some teams on the first pool and also forcing some of the first teams into the first pools, I am convinced more could be found.
Another problem is that before yesterday I had never used OR-TOOLS and there are probaly ways to optimize how the model is described to the solver. My main worry is the lack of an objective function; on the one hand I'm happy because we could easily add one and that would allow for some pseudo seeding in World Cups, on the other hand, I'm concerned whether there wouldn't be more effective ways to find any solution fulfilling the constraints as opposed to use one that is designed to find the best solution based on a specific objective function.
I do also have very little knowledge on how the program would handle limit cases. At this point, I've only tested the script on two feasible cases, but what if there is no solution? Draft World Cup and WCoP have more teams than DOUWC, if there is no solution for these cases, then the algorithm won't return any solution. It is annoying to have wasted time and resource for no result, and there are not so many solutions I can imagine to solve this problem.
The first solution that crossed my mind would consist of commenting out the trio variables, as they are the most numerous and rule 3 isn't nearly as impactful as the other rules. It is quite frustrating because you've no control on how many time the rule 3 is broken.
The second solution is less extreme and would be to use a Lagrangian relaxation, i.e. turn the rule 3 into an objective function. I haven't digged up on it yet, but this seems like the most promising solution to find an effective solution that doesn't break the rule 3 that often.
Also first time working on this kind of personal projects, be kind please.
Edit: see update, performance is much better than described in op
Edit2: not editing the op for history reasons, but some changes were implemented there
- Every team must face at least once every other team.
- Every team can't face more than twice every other team.
- If three teams face together in the same pool, the same three teams can't all face in another pool.
- Every pool is full.
- Two players of the same team can't be in the same pool.
- A starter can't play in two different pools.
While I am too lazy to prove this problem is NP-complete; it looks like one. That's why I wanted to tackle it using my favorite method to solve NP-complete problems: Mixed Integer Programming (MIP). The next section will detail the MIP model used in this program. If you are uneducated about MIP or just want to read about to use the final program, you can skip this part. The model should be quite straightforward, but feel free to ask questions if you have any.
Data:
integer: SIZE = the size of the pool
integer: TEAM = the number of teams
integer: STARTER = the number of starters per pool
integer: POOL = STARTER * TEAM / SIZE
Variables:
For all p integer in [0, POOL-1], for all t integer in [0, TEAM-1], boolean: solo(p, t) = 1 if team of index t plays in pool of index p; 0 otherwise
Intermediate variables (determinist from the above variables):
For all p integer in [0, POOL-1], for all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2, boolean: duo(p, t1, t2) = 1 if teams of index t1 and t2 face each others in pool of index p; 0 otherwise.
Constraints forcing its value: 2 * duo(p, t1, t2) >= solo(p, t1) + solo(p, t2) - 1
For all p integer in [0, POOL-1], for all t1 integer in [0, TEAM-3], for all t2 integer in [1, TEAM-2], , for all t3 integer in [2, TEAM-1], such that t1 < t2 < t3, boolean: trio(p, t1, t2, t3) = 1 if teams of index t1, t2, and t3 face each others in pool of index p; 0 otherwise.
Constraints forcing its value: 3 * trio(p, t1, t2, t3) >= solo(p, t1) + solo(p, t2) + solo(p, t3) - 2
(note: "3 * trio(p, t1, t2, t3) >= duo(p, t1, t2) + duo(p, t1, t2) + duo(p, t2, t3) - 2" could work too)
Constraints:
Rule 1: For all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2,
sum(p in [0, POOL-1])(duo(t1, t2, p)) >= 1
Rule 2: For all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2,
sum(p in [0, POOL-1])(duo(t1, t2, p)) <= 2
Rule 3: For all t1 integer in [0, TEAM-3], for all t2 integer in [1, TEAM-2], for all t3 integer in [2, TEAM-1], such that t1 < t2 < t3,
sum(p in [0, POOL-1])(trio(t1, t2, t3, p)) <= 1
Rule 4 + 5: For all p in [0, POOL-1],
sum(t in [0, TEAM-1])(solo(t, p)) == SIZE
Rule 5 + 6: For all t in [0, TEAM-1],
sum(p in [0, POOL-1])(solo(t, p)) == STARTER
integer: SIZE = the size of the pool
integer: TEAM = the number of teams
integer: STARTER = the number of starters per pool
integer: POOL = STARTER * TEAM / SIZE
Variables:
For all p integer in [0, POOL-1], for all t integer in [0, TEAM-1], boolean: solo(p, t) = 1 if team of index t plays in pool of index p; 0 otherwise
Intermediate variables (determinist from the above variables):
For all p integer in [0, POOL-1], for all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2, boolean: duo(p, t1, t2) = 1 if teams of index t1 and t2 face each others in pool of index p; 0 otherwise.
Constraints forcing its value: 2 * duo(p, t1, t2) >= solo(p, t1) + solo(p, t2) - 1
For all p integer in [0, POOL-1], for all t1 integer in [0, TEAM-3], for all t2 integer in [1, TEAM-2], , for all t3 integer in [2, TEAM-1], such that t1 < t2 < t3, boolean: trio(p, t1, t2, t3) = 1 if teams of index t1, t2, and t3 face each others in pool of index p; 0 otherwise.
Constraints forcing its value: 3 * trio(p, t1, t2, t3) >= solo(p, t1) + solo(p, t2) + solo(p, t3) - 2
(note: "3 * trio(p, t1, t2, t3) >= duo(p, t1, t2) + duo(p, t1, t2) + duo(p, t2, t3) - 2" could work too)
Constraints:
Rule 1: For all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2,
sum(p in [0, POOL-1])(duo(t1, t2, p)) >= 1
Rule 2: For all t1 integer in [0, TEAM-2], for all t2 integer in [1, TEAM-1], such that t1 < t2,
sum(p in [0, POOL-1])(duo(t1, t2, p)) <= 2
Rule 3: For all t1 integer in [0, TEAM-3], for all t2 integer in [1, TEAM-2], for all t3 integer in [2, TEAM-1], such that t1 < t2 < t3,
sum(p in [0, POOL-1])(trio(t1, t2, t3, p)) <= 1
Rule 4 + 5: For all p in [0, POOL-1],
sum(t in [0, TEAM-1])(solo(t, p)) == SIZE
Rule 5 + 6: For all t in [0, TEAM-1],
sum(p in [0, POOL-1])(solo(t, p)) == STARTER
Now onto the program itself. The only widely accessible (read free and performing) MIP solver to my knowledge is Google's OR-TOOLS. I am not experienced at all with UI, let alone graphical ones, so if you want to use this program you'd have to manually run the Python files in the ZIP file and also download the Google's OR-TOOLS library. The file "templateMaker.py" handles the whole MIP model, you should only touch it if you want to edit the MIP model / toy with it. The important file is "playerRandomization.py" and the one that you need to execute to generate your pools. In order to use it, you have to edit the line 49 to change the parameters of the call of the main function. The first parameter is the name of a text file containing the list of starters per country (example attached) and the second is the number of players per pool. Execute the script and it should print in your console the list of players per pool, as well as the execution time (the later can be commented out).
The example given here lets me generate a bracket that fulfill every constraints for DOUWC (13 teams, 8 starters per team, 4 players per pool) just under one hour with my 4 years old laptop. I was also able to generate ZUWC like (12 teams, 6 starters per team, 4 players per pool) in around 20 seconds.
Europe,Canada,France,US East
Europe,France,China,Spain
Europe,US West,Italy,India
Europe,Canada,US West,Latin America
Europe,US East,US Central,Spain
Europe,US Central,Brazil,India
Europe,China,Italy,APAC
Europe,Brazil,APAC,Latin America
Canada,France,Brazil,APAC
Canada,Spain,India,Latin America
Canada,US Central,US West,China
Canada,US Central,Italy,APAC
Canada,Italy,Brazil,Spain
Canada,US East,China,India
France,China,APAC,India
France,US East,US West,Brazil
France,US West,Italy,Latin America
France,US Central,Italy,Spain
US East,China,Italy,Latin America
US East,Italy,Brazil,India
US East,US Central,US West,APAC
US East,Spain,APAC,Latin America
US West,China,Brazil,Spain
US Central,China,Brazil,Latin America
France,US Central,India,Latin America
US West,Spain,APAC,India
Europe,France,China,Spain
Europe,US West,Italy,India
Europe,Canada,US West,Latin America
Europe,US East,US Central,Spain
Europe,US Central,Brazil,India
Europe,China,Italy,APAC
Europe,Brazil,APAC,Latin America
Canada,France,Brazil,APAC
Canada,Spain,India,Latin America
Canada,US Central,US West,China
Canada,US Central,Italy,APAC
Canada,Italy,Brazil,Spain
Canada,US East,China,India
France,China,APAC,India
France,US East,US West,Brazil
France,US West,Italy,Latin America
France,US Central,Italy,Spain
US East,China,Italy,Latin America
US East,Italy,Brazil,India
US East,US Central,US West,APAC
US East,Spain,APAC,Latin America
US West,China,Brazil,Spain
US Central,China,Brazil,Latin America
France,US Central,India,Latin America
US West,Spain,APAC,India
There are probably some typos, sorry.
['Hugo', 'Hys', 'CaoJie', 'sirjelloton']
['JRL', 'Nephtyrix', 'NakanoNino', 'LicenciadoPan']
['FlyingBeagle', 'eragon11145', 'Ferrari450', 'tyo']
['Lily', 'GrandmasCookin', 'z0mOG', 'Zerotti']
['Tenzai', 'robjr', 'Spurrific', 'Yddeon']
['Fran', 'Croven', 'Staraptor', 'InkVGC']
['SMB', 'xqiht', 'entrocefalo', 'Zeal']
['MichaelderBeste2', 'Seraphz', 'Kaif', 'Ann']
['Mishimono', 'Cyndakill-SH', 'Beraldo', 'Exotic64']
['sempra', 'Kresku', 'SHIMA', 'luisin']
['Mizuhime', 'laptops', 'Lemurro', 'latiasatoshi']
['Wille', 'qsns', 'Toxigen', 'Genesis77']
['GenOne', 'waffle2O2O', 'LpZ', 'Talkze']
['RelicanthPrimal', 'zoe', 'MetapodVGCChannel', 'AIRedzone']
['Givrix', 'ChandelureVGC', 'YoBuddy', 'shadekirby321']
['gataen', 'Actuarily', 'fespy', 'havocknight']
['Herv', 'jonas', 'Amaranth', 'AkaruKokuyo']
['ratpacker', 'bage1', 'LoSconosciuto', 'RDL93']
['Arcticblast', 'hybone', 'Fafromani', 'Enzonana.']
['Iceberg77', 'MADARAAAA', 'RiloBR', 'Nido-Rus']
['panda!', 'DawAwesomeDude1', 'KeanuGridingArc', 'Feyy']
['zee', 'Shiritu', 'trace', 'Mashing']
['IwantAtagotositonme', 'tianlengmeihou', 'Ninja', 'TheAndrw']
['MajorBowman', 'Metallica126', 'AshKetchumGamer', 'ohtheguilt']
['John1240', 'Paraplegic', 'Loudwinner', 'GasaiYunoSan']
['EternalSnowman', 'SusejVGC', 'raf', 'DJBreloominati']
['Hugo', 'Hys', 'CaoJie', 'sirjelloton']
['JRL', 'Nephtyrix', 'NakanoNino', 'LicenciadoPan']
['FlyingBeagle', 'eragon11145', 'Ferrari450', 'tyo']
['Lily', 'GrandmasCookin', 'z0mOG', 'Zerotti']
['Tenzai', 'robjr', 'Spurrific', 'Yddeon']
['Fran', 'Croven', 'Staraptor', 'InkVGC']
['SMB', 'xqiht', 'entrocefalo', 'Zeal']
['MichaelderBeste2', 'Seraphz', 'Kaif', 'Ann']
['Mishimono', 'Cyndakill-SH', 'Beraldo', 'Exotic64']
['sempra', 'Kresku', 'SHIMA', 'luisin']
['Mizuhime', 'laptops', 'Lemurro', 'latiasatoshi']
['Wille', 'qsns', 'Toxigen', 'Genesis77']
['GenOne', 'waffle2O2O', 'LpZ', 'Talkze']
['RelicanthPrimal', 'zoe', 'MetapodVGCChannel', 'AIRedzone']
['Givrix', 'ChandelureVGC', 'YoBuddy', 'shadekirby321']
['gataen', 'Actuarily', 'fespy', 'havocknight']
['Herv', 'jonas', 'Amaranth', 'AkaruKokuyo']
['ratpacker', 'bage1', 'LoSconosciuto', 'RDL93']
['Arcticblast', 'hybone', 'Fafromani', 'Enzonana.']
['Iceberg77', 'MADARAAAA', 'RiloBR', 'Nido-Rus']
['panda!', 'DawAwesomeDude1', 'KeanuGridingArc', 'Feyy']
['zee', 'Shiritu', 'trace', 'Mashing']
['IwantAtagotositonme', 'tianlengmeihou', 'Ninja', 'TheAndrw']
['MajorBowman', 'Metallica126', 'AshKetchumGamer', 'ohtheguilt']
['John1240', 'Paraplegic', 'Loudwinner', 'GasaiYunoSan']
['EternalSnowman', 'SusejVGC', 'raf', 'DJBreloominati']
There is a lot of room for improvement in this script, even beside the lack of a practical UI. The execution time is quite long due to the high amount of variables, despite the vast majority of them being deterministic of only a small set of them (solo(p, t)). The problem also has many symetry axes, while I broke some with forcing some teams on the first pool and also forcing some of the first teams into the first pools, I am convinced more could be found.
Another problem is that before yesterday I had never used OR-TOOLS and there are probaly ways to optimize how the model is described to the solver. My main worry is the lack of an objective function; on the one hand I'm happy because we could easily add one and that would allow for some pseudo seeding in World Cups, on the other hand, I'm concerned whether there wouldn't be more effective ways to find any solution fulfilling the constraints as opposed to use one that is designed to find the best solution based on a specific objective function.
I do also have very little knowledge on how the program would handle limit cases. At this point, I've only tested the script on two feasible cases, but what if there is no solution? Draft World Cup and WCoP have more teams than DOUWC, if there is no solution for these cases, then the algorithm won't return any solution. It is annoying to have wasted time and resource for no result, and there are not so many solutions I can imagine to solve this problem.
The first solution that crossed my mind would consist of commenting out the trio variables, as they are the most numerous and rule 3 isn't nearly as impactful as the other rules. It is quite frustrating because you've no control on how many time the rule 3 is broken.
The second solution is less extreme and would be to use a Lagrangian relaxation, i.e. turn the rule 3 into an objective function. I haven't digged up on it yet, but this seems like the most promising solution to find an effective solution that doesn't break the rule 3 that often.
Also first time working on this kind of personal projects, be kind please.
Edit: see update, performance is much better than described in op
Edit2: not editing the op for history reasons, but some changes were implemented there
Attachments
-
3 KB Views: 7
-
3.9 KB Views: 9
-
886 bytes Views: 7
Last edited: