I have a list of hitters in Python, each with a name, a projection, and a list of eligible positions:

juan-soto 30.3773 ['OF']
kyle-tucker 44.0626 ['OF']
...
yordan-alvarez 32.510200000000005 ['CI', 'OF']
william-contreras 26.7904 ['CI', 'MI']

From this list I want to build the best possible lineup with 1 CI slot, 1 MI slot, 1 OF slot, and 2 FLEX slots (any position). My problem is that I can't simply check whether every position is covered in a combination, because some players are eligible at more than one position. Please help.

  • What have you tried? I would start by researching "dynamic programming" and "knapsack" and/or "tic-tac-toe". Commented Mar 25 at 15:15
  • You need to specify in more detail: can one person be selected for multiple roles? (I suspect not; that wouldn't make any sense). Commented Mar 26 at 7:06

3 Answers

As best I can tell, your 'flex constraint' is really better reframed as a generic count for your entire lineup without constraining the positions of the last two players. PuLP is well-suited to this. You should probably also be using Pandas to pre-process player data:

import pandas as pd
import pulp

data = pd.DataFrame({
    'name': [
        'juan-soto', 'kyle-tucker', 'yordan-alvarez', 'william-contreras',
        'jake-meyers', 'yasmani-grandal', 'ernie-clement',
    ],
    'projection': [
        30.3773, 44.0626, 32.5102, 26.7904,
        30, 29, 31,  # fake data for demonstration
    ],
    'position': [
        ['OF'], ['OF'], ['CI', 'OF'], ['CI', 'MI'],
        ['OF'], ['CI'], ['MI'],
    ],
})

# Add onehot encodings for positions
data = pd.concat(
    (
        data,
        pd.get_dummies(data['position'].explode()).groupby(level=0).any(),
    ), axis='columns',
)

# Add decision variables
data['assign'] = pulp.LpVariable.matrix(name='assign', indices=data['name'], cat=pulp.LpBinary)

prob = pulp.LpProblem(name='mlb-lineup', sense=pulp.LpMaximize)
prob.setObjective(pulp.lpDot(data['assign'], data['projection']))

prob.addConstraint(
    name='ci_slots',
    constraint=pulp.lpSum(data.loc[data['CI'], 'assign']) >= 1,
)
prob.addConstraint(
    name='mi_slots',
    constraint=pulp.lpSum(data.loc[data['MI'], 'assign']) >= 1,
)
prob.addConstraint(
    name='of_slots',
    constraint=pulp.lpSum(data.loc[data['OF'], 'assign']) >= 1,
)
prob.addConstraint(
    name='total_slots',  # including flex
    constraint=pulp.lpSum(data['assign']) == 5,
)

assert prob.solve() == pulp.LpStatusOptimal
data['assign'] = data['assign'].apply(pulp.value) > 0.5
print(data)

Printing prob shows the LP formulation in an easy-to-understand format:

mlb-lineup:
MAXIMIZE
31.0*assign_ernie_clement + 30.0*assign_jake_meyers + 30.3773*assign_juan_soto + 44.0626*assign_kyle_tucker + 26.7904*assign_william_contreras + 29.0*assign_yasmani_grandal + 32.5102*assign_yordan_alvarez + 0.0
SUBJECT TO
ci_slots: assign_william_contreras + assign_yasmani_grandal
 + assign_yordan_alvarez >= 1

mi_slots: assign_ernie_clement + assign_william_contreras >= 1

of_slots: assign_jake_meyers + assign_juan_soto + assign_kyle_tucker
 + assign_yordan_alvarez >= 1

total_slots: assign_ernie_clement + assign_jake_meyers + assign_juan_soto
 + assign_kyle_tucker + assign_william_contreras + assign_yasmani_grandal
 + assign_yordan_alvarez = 5

VARIABLES
0 <= assign_ernie_clement <= 1 Integer
0 <= assign_jake_meyers <= 1 Integer
0 <= assign_juan_soto <= 1 Integer
0 <= assign_kyle_tucker <= 1 Integer
0 <= assign_william_contreras <= 1 Integer
0 <= assign_yasmani_grandal <= 1 Integer
0 <= assign_yordan_alvarez <= 1 Integer

Output is:

At line 2 NAME          MODEL
At line 3 ROWS
At line 9 COLUMNS
At line 47 RHS
At line 52 BOUNDS
At line 60 ENDATA
Problem MODEL has 4 rows, 7 columns and 16 elements
...
Result - Optimal solution found

Objective value:                167.95010000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.00
Time (Wallclock seconds):       0.00

Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.00   (Wallclock seconds):       0.00

                name  projection  position     CI     MI     OF  assign
0          juan-soto     30.3773      [OF]  False  False   True    True
1        kyle-tucker     44.0626      [OF]  False  False   True    True
2     yordan-alvarez     32.5102  [CI, OF]   True  False   True    True
3  william-contreras     26.7904  [CI, MI]   True   True  False   False
4        jake-meyers     30.0000      [OF]  False  False   True    True
5    yasmani-grandal     29.0000      [CI]   True  False  False   False
6      ernie-clement     31.0000      [MI]  False   True  False    True
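
The solver output says which five players to pick, but not which slot each one fills. Below is a minimal sketch of recovering a slot assignment by trying every ordered choice of CI, MI, and OF player among the selected five. `assign_slots` is a hypothetical helper written for this illustration, not part of PuLP, and the `selected` list is copied by hand from the result above. It also covers an edge case: the three `>= 1` count constraints can all be met with too few distinct players (for example one CI/MI hitter plus four OF-only hitters), in which case no valid assignment exists and the helper returns None.

```python
from itertools import permutations


def assign_slots(selected):
    """Map five chosen players to the CI, MI, OF and two FLEX slots.

    selected: list of (name, positions) tuples with unique names.
    Returns a dict of slot -> name(s), or None if the positional
    slots cannot be filled by three different players.
    """
    for ci, mi, of in permutations(selected, 3):
        if 'CI' in ci[1] and 'MI' in mi[1] and 'OF' in of[1]:
            used = {ci[0], mi[0], of[0]}
            flex = [name for name, _ in selected if name not in used]
            return {'CI': ci[0], 'MI': mi[0], 'OF': of[0], 'FLEX': flex}
    return None


# The five players chosen in the output above
selected = [
    ('juan-soto', ['OF']),
    ('kyle-tucker', ['OF']),
    ('yordan-alvarez', ['CI', 'OF']),
    ('jake-meyers', ['OF']),
    ('ernie-clement', ['MI']),
]
print(assign_slots(selected))

# Made-up edge case: the counts look fine, but one player would need two slots
overlapping = [
    ('a', ['CI', 'MI']),
    ('b', ['OF']), ('c', ['OF']), ('d', ['OF']), ('e', ['OF']),
]
print(assign_slots(overlapping))
```

If that edge case can occur in your data, one option is to add pairwise constraints to the model as well, for example requiring at least two selected players who are eligible at CI or MI.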

from itertools import combinations

players = [
    ("juan-soto", 30.3773, ['OF']),
    ("kyle-tucker", 44.0626, ['OF']),
    ("yordan-alvarez", 32.5102, ['CI', 'OF']),
    ("william-contreras", 26.7904, ['CI', 'MI']),
]

ci_players = [p for p in players if 'CI' in p[2]]
mi_players = [p for p in players if 'MI' in p[2]]
of_players = [p for p in players if 'OF' in p[2]]
flex_players = [p for p in players]

def calculate_projection(lineup):
    return sum(player[1] for player in lineup)

best_lineup = None
best_projection = 0

for ci in ci_players:
    for mi in mi_players:
        for of in of_players:
            for flex_pair in combinations(flex_players, 2):
                lineup = [ci, mi, of, *flex_pair]
                # A player can fill only one slot, so skip lineups with repeats
                if len({player[0] for player in lineup}) < len(lineup):
                    continue
                total_projection = calculate_projection(lineup)

                if total_projection > best_projection:
                    best_projection = total_projection
                    best_lineup = lineup

if best_lineup:
    print("Best Lineup:")
    for player in best_lineup:
        print(f"{player[0]} (Projection: {player[1]})")
    print(f"Total Projection: {best_projection}")
else:
    print("No valid lineup found.")

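You can brute-force this with itertools.combinations by enumerating every choice of CI, MI, and OF player together with every pair of flex players.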

1 Comment

You can, but you probably shouldn't. This problem can be solved more efficiently in a non-brute-force manner.

The problem can be formulated as a binary integer program (IP) and solved using scipy.optimize.linprog with its integrality argument:

import numpy as np
from scipy.optimize import linprog

# Five players are needed to fill the five slots, but the question lists only four,
# so the problem is infeasible with that data alone.
# A few dummy players are added here; replace them with players from the real data.
# The maximum objective value depends on the dummy scores used.

# data                                           # variables 
# juan-soto 30.3773 ['OF']                       # x1
# kyle-tucker 44.0626 ['OF']                     # x2    
# yordan-alvarez 32.510200000000005 ['CI', 'OF'] # x3
# william-contreras 26.7904 ['CI', 'MI']         # x4
# dummy1 16.7904 ['OF', 'MI']                    # x5
# dummy2 22.7904 ['CI', 'MI']                    # x6
# dummy3 32.7904 ['CI', 'OF', 'MI']              # x7

# IP formulation

# objective
# max(30.3773*x1 + 44.0626*x2 + 32.510200000000005*x3 + 26.7904*x4 + 
#                  16.7904*x5 + 22.7904*x6 + 32.7904*x7)  
# constraints
# 0*x1 + 0*x2 + 1*x3 + 1*x4 + 0*x5 + 1*x6 +  1*x7 >= 1  # CI >= 1
# 0*x1 + 0*x2 + 0*x3 + 1*x4 + 1*x5 + 1*x6 +  1*x7  >= 1  # MI >= 1
# 1*x1 + 1*x2 + 1*x3 + 0*x4 + 1*x5 + 0*x6 +  1*x7  >= 1  # OF >= 1
# x1 + x2 + x3 + x4  + x5 + x6 + x7 =  5 # POS: #hitters = # positions= 5
# x1, x2, x3, x4, x5, x6, x7 in {0,1}          # integers

# implementation
c = np.array([30.3773, 44.0626, 32.510200000000005, 26.7904, 16.7904, 22.7904, 32.7904])
A = np.array([[0, 0, 1, 1, 0, 1, 1],       # CI
              [0, 0, 0, 1, 1, 1, 1],       # MI
              [1, 1, 1, 0, 1, 0, 1],       # OF
              [1, 1, 1, 1, 1, 1, 1]]       # POS
            )
b = np.array([1, 1, 1, 5])
# linprog minimizes, so c is negated to maximize; the ">=" rows are negated into "<=" form
res = linprog(
    -c,
    A_ub=-A[:3, :], b_ub=-b[:3],   # CI, MI, OF >= 1
    A_eq=A[3:, :], b_eq=b[3:],     # exactly 5 players
    bounds=[(0, 1)] * 7,
    integrality=[1] * 7,
)
print(-res.fun, res.x, res.message)
# 166.5309 [ 1.  1.  1.  1.  0. -0.  1.] Optimization terminated successfully. (HiGHS Status 7: Optimal)

As can be seen, the optimal solution is x1 = x2 = x3 = x4 = x7 = 1 with a total projection of 166.5309; that is, selecting the 1st, 2nd, 3rd, 4th, and 7th hitters maximizes the total projection.
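
The same model can also be written with scipy.optimize.milp, which takes the constraints as LinearConstraint objects and avoids the manual sign flipping on the inequality rows. The following is a sketch under the same assumptions as above (the last three players are dummies, and the names `eligibility` and `chosen` are introduced only for this example); it needs SciPy 1.9 or later. It checks `res.success` before reading the result, so an infeasible problem raises an error rather than passing unnoticed.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# Projections; the last three entries are the dummy players from above
c = np.array([30.3773, 44.0626, 32.5102, 26.7904, 16.7904, 22.7904, 32.7904])

# One row per position, one column per player (1 = eligible)
eligibility = np.array([
    [0, 0, 1, 1, 0, 1, 1],  # CI
    [0, 0, 0, 1, 1, 1, 1],  # MI
    [1, 1, 1, 0, 1, 0, 1],  # OF
])

constraints = [
    LinearConstraint(eligibility, lb=1, ub=np.inf),  # at least one CI, MI, OF
    LinearConstraint(np.ones((1, 7)), lb=5, ub=5),   # exactly five players
]

# milp minimizes, so negate the projections to maximize
res = milp(
    c=-c,
    integrality=np.ones(7),  # every variable is an integer
    bounds=Bounds(0, 1),     # ... restricted to 0 or 1
    constraints=constraints,
)
if not res.success:
    raise RuntimeError(f"No lineup found: {res.message}")

chosen = np.flatnonzero(res.x > 0.5)  # zero-based indices of selected players
print(-res.fun, chosen)
```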

3 Comments

There are a few issues with this, including: linprog has a clunkier interface and milp should be preferred; also, this will silently produce nonsense if the problem is infeasible. You need to be checking the success attribute.
Also: I highly doubt that assigning three players on a five-position team is sensible.
I don't think that's true. The problem isn't with the input data; it's with your LP formulation - it should be required to select five players.
