Problem with a Lineup Optimizer due to players having multiple Positions

Question

I have a list of hitters in Python with their name, projection, and positions:

juan-soto 30.3773 ['OF']
kyle-tucker 44.0626 ['OF']
...
yordan-alvarez 32.510200000000005 ['CI', 'OF']
william-contreras 26.7904 ['CI', 'MI']

From this list I want to get the best possible lineup with 1 CI slot, 1 MI slot, 1 OF slot, and 2 FLEX slots (any position). My problem is that I can't just check whether every position is available in a combination, because players can have more than one position. Please help.

What have you tried? I would start by researching "dynamic programming" and "knapsack" and/or "tic-tac-toe".
– JonSG, Commented Mar 25 at 15:15
You need to specify in more detail: can one person be selected for multiple roles? (I suspect not; that wouldn't make any sense).
– Reinderien, Commented Mar 26 at 7:06

Reinderien · Accepted Answer · 2025-03-26 07:44:17Z

As best I can tell, your 'flex constraint' is really better reframed as a generic count for your entire lineup without constraining the positions of the last two players. PuLP is well-suited to this. You should probably also be using Pandas to pre-process player data:

import pandas as pd
import pulp

data = pd.DataFrame({
    'name': [
        'juan-soto', 'kyle-tucker', 'yordan-alvarez', 'william-contreras',
        'jake-meyers', 'yasmani-grandal', 'ernie-clement',
    ],
    'projection': [
        30.3773, 44.0626, 32.5102, 26.7904,
        30, 29, 31,  # fake data for demonstration
    ],
    'position': [
        ['OF'], ['OF'], ['CI', 'OF'], ['CI', 'MI'],
        ['OF'], ['CI'], ['MI'],
    ],
})

# Add onehot encodings for positions
data = pd.concat(
    (
        data,
        pd.get_dummies(data['position'].explode()).groupby(level=0).any(),
    ), axis='columns',
)

# Add decision variables
data['assign'] = pulp.LpVariable.matrix(name='assign', indices=data['name'], cat=pulp.LpBinary)

prob = pulp.LpProblem(name='mlb-lineup', sense=pulp.LpMaximize)
prob.setObjective(pulp.lpDot(data['assign'], data['projection']))

prob.addConstraint(
    name='ci_slots',
    constraint=pulp.lpSum(data.loc[data['CI'], 'assign']) >= 1,
)
prob.addConstraint(
    name='mi_slots',
    constraint=pulp.lpSum(data.loc[data['MI'], 'assign']) >= 1,
)
prob.addConstraint(
    name='of_slots',
    constraint=pulp.lpSum(data.loc[data['OF'], 'assign']) >= 1,
)
prob.addConstraint(
    name='total_slots',  # including flex
    constraint=pulp.lpSum(data['assign']) == 5,
)

assert prob.solve() == pulp.LpStatusOptimal
data['assign'] = data['assign'].apply(pulp.value) > 0.5
print(data)

Printing prob shows the LP formulation in an easy-to-understand format:

mlb-lineup:
MAXIMIZE
31.0*assign_ernie_clement + 30.0*assign_jake_meyers + 30.3773*assign_juan_soto + 44.0626*assign_kyle_tucker + 26.7904*assign_william_contreras + 29.0*assign_yasmani_grandal + 32.5102*assign_yordan_alvarez + 0.0
SUBJECT TO
ci_slots: assign_william_contreras + assign_yasmani_grandal
 + assign_yordan_alvarez >= 1

mi_slots: assign_ernie_clement + assign_william_contreras >= 1

of_slots: assign_jake_meyers + assign_juan_soto + assign_kyle_tucker
 + assign_yordan_alvarez >= 1

total_slots: assign_ernie_clement + assign_jake_meyers + assign_juan_soto
 + assign_kyle_tucker + assign_william_contreras + assign_yasmani_grandal
 + assign_yordan_alvarez = 5

VARIABLES
0 <= assign_ernie_clement <= 1 Integer
0 <= assign_jake_meyers <= 1 Integer
0 <= assign_juan_soto <= 1 Integer
0 <= assign_kyle_tucker <= 1 Integer
0 <= assign_william_contreras <= 1 Integer
0 <= assign_yasmani_grandal <= 1 Integer
0 <= assign_yordan_alvarez <= 1 Integer

Output is:

At line 2 NAME          MODEL
At line 3 ROWS
At line 9 COLUMNS
At line 47 RHS
At line 52 BOUNDS
At line 60 ENDATA
Problem MODEL has 4 rows, 7 columns and 16 elements
...
Result - Optimal solution found

Objective value:                167.95010000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.00
Time (Wallclock seconds):       0.00

Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.00   (Wallclock seconds):       0.00

                name  projection  position     CI     MI     OF  assign
0          juan-soto     30.3773      [OF]  False  False   True    True
1        kyle-tucker     44.0626      [OF]  False  False   True    True
2     yordan-alvarez     32.5102  [CI, OF]   True  False   True    True
3  william-contreras     26.7904  [CI, MI]   True   True  False   False
4        jake-meyers     30.0000      [OF]  False  False   True    True
5    yasmani-grandal     29.0000      [CI]   True  False  False   False
6      ernie-clement     31.0000      [MI]  False   True  False    True
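
One caveat worth checking: count-style constraints such as ci_slots and mi_slots let a single dual-eligible player (for example ['CI', 'MI']) satisfy two counts at once, even though that player can only occupy one slot. The sample data above is not affected, but other player pools might be. The following is a minimal standard-library sketch, not part of the original answer, that checks a chosen five-player lineup after the solve. The helper names meets_counts and has_slot_assignment and the player "outfielder-x" are made up for illustration.

```python
from itertools import permutations

SLOTS = ("CI", "MI", "OF")  # the two FLEX slots accept any player


def meets_counts(lineup):
    """Count-style check: five players, at least one eligible per position."""
    return len(lineup) == 5 and all(
        any(slot in positions for _, positions in lineup) for slot in SLOTS
    )


def has_slot_assignment(lineup):
    """True if three distinct players can fill CI, MI and OF (rest go to FLEX)."""
    if len(lineup) != 5:
        return False
    return any(
        all(slot in positions for slot, (_, positions) in zip(SLOTS, trio))
        for trio in permutations(lineup, 3)
    )


# Lineup selected by the solver in the output above
solved = [
    ("juan-soto", ["OF"]),
    ("kyle-tucker", ["OF"]),
    ("yordan-alvarez", ["CI", "OF"]),
    ("jake-meyers", ["OF"]),
    ("ernie-clement", ["MI"]),
]

# Hypothetical lineup: only one player is eligible for both CI and MI
lineup = [
    ("william-contreras", ["CI", "MI"]),
    ("juan-soto", ["OF"]),
    ("kyle-tucker", ["OF"]),
    ("jake-meyers", ["OF"]),
    ("outfielder-x", ["OF"]),  # made-up player
]

print(has_slot_assignment(solved))  # True
print(meets_counts(lineup))         # True: every count constraint passes
print(has_slot_assignment(lineup))  # False: CI and MI cannot both be filled
```

If the check fails for a real player pool, one option is to model each (player, slot) pair as its own binary variable so that every player fills at most one slot.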

Martin · Accepted Answer · 2025-03-25 16:02:42Z

from itertools import combinations

players = [
    ("juan-soto", 30.3773, ['OF']),
    ("kyle-tucker", 44.0626, ['OF']),
    ("yordan-alvarez", 32.5102, ['CI', 'OF']),
    ("william-contreras", 26.7904, ['CI', 'MI']),
]

ci_players = [p for p in players if 'CI' in p[2]]
mi_players = [p for p in players if 'MI' in p[2]]
of_players = [p for p in players if 'OF' in p[2]]
flex_players = [p for p in players]

def calculate_projection(lineup):
    return sum(player[1] for player in lineup)

best_lineup = None
best_projection = 0

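for ci in ci_players:
    for mi in mi_players:
        for of in of_players:
            # Flex slots may hold any player
            for flex_pair in combinations(flex_players, 2):
                lineup = [ci, mi, of] + list(flex_pair)
                # Skip lineups that use the same player in more than one slot
                # (with only the four sample players, no five-player lineup survives)
                if len({player[0] for player in lineup}) < len(lineup):
                    continue
                total_projection = calculate_projection(lineup)

                if total_projection > best_projection:
                    best_projection = total_projection
                    best_lineup = lineup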

if best_lineup:
    print("Best Lineup:")
    for player in best_lineup:
        print(f"{player[0]} (Projection: {player[1]})")
    print(f"Total Projection: {best_projection}")
else:
    print("No valid lineup found.")

Here you can do this easily using itertools.


1 Comment

Reinderien Mar 26 at 6:46

You can, but you probably shouldn't. This problem can be solved more efficiently in a non-brute-force manner.

Sandipan Dey · Accepted Answer · 2025-06-24 19:25:49Z

The problem can be formulated as a binary integer program (IP) and solved using scipy.optimize.linprog with an integrality constraint:

import numpy as np
from scipy.optimize import linprog

# Players for 5 slots / positions are needed
# the data provided was insufficient, leading to infeasible solution
# adding a few dummy players (use players from real data instead)
# max value of the objective will depend on the dummy player scores here
# so replace them with real data

# data                                           # variables 
# juan-soto 30.3773 ['OF']                       # x1
# kyle-tucker 44.0626 ['OF']                     # x2    
# yordan-alvarez 32.510200000000005 ['CI', 'OF'] # x3
# william-contreras 26.7904 ['CI', 'MI']         # x4
# dummy1 16.7904 ['OF', 'MI']                    # x5
# dummy2 22.7904 ['CI', 'MI']                    # x6
# dummy3 32.7904 ['CI', 'OF', 'MI']              # x7

# IP formulation

# objective
# max(30.3773*x1 + 44.0626*x2 + 32.510200000000005*x3 + 26.7904*x4 + 
#                  16.7904*x5 + 22.7904*x6 + 32.7904*x7)  
# constraints
# 0*x1 + 0*x2 + 1*x3 + 1*x4 + 0*x5 + 1*x6 + 1*x7 >= 1  # CI >= 1
# 0*x1 + 0*x2 + 0*x3 + 1*x4 + 1*x5 + 1*x6 + 1*x7 >= 1  # MI >= 1
# 1*x1 + 1*x2 + 1*x3 + 0*x4 + 1*x5 + 0*x6 + 1*x7 >= 1  # OF >= 1
# x1 + x2 + x3 + x4 + x5 + x6 + x7 = 5                 # POS: number of hitters = number of slots = 5
# x1, x2, x3, x4, x5, x6, x7 in {0,1}                  # binary

# implementation
c = np.array([30.3773, 44.0626, 32.510200000000005, 26.7904, 16.7904, 22.7904, 32.7904])
A = np.array([[0, 0, 1, 1, 0, 1, 1],       # CI
              [0, 0, 0, 1, 1, 1, 1],       # MI
              [1, 1, 1, 0, 1, 0, 1],       # OF
              [1, 1, 1, 1, 1, 1, 1]]       # POS
            )
b = np.array([1, 1, 1, 5])
# linprog minimizes, so the objective is negated; the >= rows are negated into <= form
res = linprog(-c, A_ub=-A[:3, :], b_ub=-b[:3], A_eq=A[3:, :], b_eq=b[3:],
              bounds=[(0, 1)] * 7, integrality=[1] * 7)
print(-res.fun, res.x, res.message)
# 166.5309 [ 1.  1.  1.  1.  0. -0.  1.] Optimization terminated successfully. (HiGHS Status 7: Optimal)

As can be seen, the optimal solution is x1 = x2 = x3 = x4 = x7 = 1, with an optimal projection score of 166.5309; that is, selecting the 1st, 2nd, 3rd, 4th, and 7th hitters maximizes the total projection.
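
For comparison, the same formulation can also be written with scipy.optimize.milp, SciPy's dedicated interface for mixed-integer problems (available in SciPy 1.9 and later). This is a sketch under the same assumptions as the code above, including the dummy players and the count-style position constraints; the names list and the explicit success check are additions for illustration.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# Same data as above; dummy1..dummy3 are placeholder players
names = ["juan-soto", "kyle-tucker", "yordan-alvarez", "william-contreras",
         "dummy1", "dummy2", "dummy3"]
c = np.array([30.3773, 44.0626, 32.5102, 26.7904, 16.7904, 22.7904, 32.7904])
A = np.array([
    [0, 0, 1, 1, 0, 1, 1],  # CI eligibility
    [0, 0, 0, 1, 1, 1, 1],  # MI eligibility
    [1, 1, 1, 0, 1, 0, 1],  # OF eligibility
    [1, 1, 1, 1, 1, 1, 1],  # total number of players
])
lower = np.array([1, 1, 1, 5])
upper = np.array([np.inf, np.inf, np.inf, 5])  # last row is an equality

res = milp(
    c=-c,  # milp minimizes, so negate the projections
    constraints=LinearConstraint(A, lower, upper),
    integrality=np.ones_like(c),  # every variable is an integer
    bounds=Bounds(0, 1),          # together with integrality: binary
)

# Fail loudly instead of reading res.x from an infeasible problem
if not res.success:
    raise RuntimeError(res.message)

chosen = [name for name, x in zip(names, res.x) if x > 0.5]
total = float(c @ np.round(res.x))
print(chosen, round(total, 4))
```

Expressing each row as a two-sided LinearConstraint avoids negating the >= rows by hand, and the success check stops the script before any result is printed if no feasible lineup exists.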

edited Jun 24 at 19:25

answered Mar 25 at 18:17


3 Comments

Reinderien Mar 26 at 6:47

There are a few issues with this, including: linprog has a clunkier interface and milp should be preferred; also, this will silently produce nonsense if the problem is infeasible. You need to be checking the success attribute.

Reinderien Mar 26 at 7:07

Also: I highly doubt that assigning three players on a five-position team is sensible.

Reinderien Mar 26 at 7:14

I don't think that's true. The problem isn't with the input data; it's with your LP formulation - it should be required to select five players.
