Joey Jakob

Maximizing Fantasy Baseball Draft Strategy: Leveraging AI for Starting Pitcher Selection

Updated: Apr 12


The image contains a formula for "Pitcher Score" which assesses baseball pitchers by summing weighted values of their stats: the lower the WHIP and ERA, and the higher the IP and strikeouts, the better the score. Weights W1 through W4 are applied to each stat to reflect their importance in the overall score.
My Pitcher Score formula determined in collaboration with ChatGPT 4

This’ll be my third year playing fantasy baseball, and I wanted to try something different. As in previous years, I began by gathering the materials I need to build a personal cheat sheet: team depth charts, individual player pages and comparisons, stats and projections from trusted sources like Fangraphs, Pitcherlist.com, and fivethirtyeight.com, plus MLB schedules and pre-season team power rankings.


These sources help me build out my strategy. With everything I need at my fingertips, I can set my selection priority, which includes building ranked lists for each player position according to draft order for the 23-player fantasy team.



 The images show a fantasy baseball team named "Pineda Colada," managed by Joey Jakob, which finished second with a record of 110 wins, 64 losses, and 6 ties; in a matchup summary, the team scored a victory with an 8-2-0 record in Round 2.
Screengrabs from my 2022 season wins

But this time around, I thought, why not employ AI to help me write my own formula to determine my draft order? So I decided which positions to draft in order [see next blog post in this series for more], and then selected which stats to prioritize in descending order. First up, my focus is on SP (starting pitchers). 


My league uses the ESPN fantasy app, and I first determined that, at least for the initial draft, I’d only consider 2023 season stats.


 A sports analytics interface menu with options for data time frames; "2023 season" is selected, while "2024 season," "Last 7," "Last 15," "Last 30," "Batter vs Pitcher," and "2024 Projections" are unselected.
Sorting by player's stats views

2024 Projections are just that, projections, which I’ve seen fail to pan out time and again. Sure, you can give these a look, but it’s a good idea to base real decisions on real, historical data. The other views are moot, either because they stay empty until the current season officially begins or, as with Batter vs Pitcher, because they can be captured in individual player stats. Perfect timing, since individual player stats are what I consider next.



Order Considerations of Player Stats 


A screenshot of a fantasy baseball platform showing a list of pitchers with their 2023 season stats highlighted. The stats include innings pitched (IP), hits (H), earned runs (ER), walks (BB), strikeouts (K), ERA, WHIP, and PR15, with the 2023 season filter selected for viewing.
Sorting starting pitchers (SP) 2023 stats in ESPN's fantasy baseball platform

Within the 2023 season stats, to determine who I consider the best starting pitchers (SP), I chose to focus on WHIP, innings pitched (IP), strikeouts (K), and ERA, in descending order of importance. Why is WHIP so important? WHIP, or “walks plus hits per inning pitched,” rolls a handful of important factors into one number:


The image displays the formula for calculating WHIP in baseball, which is equal to the sum of walks (BB) and hits (H) allowed by a pitcher, divided by innings pitched (IP).
Formula to calculate WHIP (walks plus hits over innings pitched)
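As a quick check of the arithmetic, here is a minimal Python sketch of the WHIP calculation. The function name and the sample stat line are illustrative assumptions of mine, not values taken from ESPN.

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits per inning pitched: (BB + H) / IP."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical stat line, for illustration only: 50 BB and 150 H over 200 IP.
example_whip = whip(walks=50, hits=150, innings_pitched=200.0)
print(round(example_whip, 2))  # 1.0
```

A WHIP of 1.0 means the pitcher allows, on average, one baserunner per inning via a walk or a hit.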

Even though WHIP is calculated per inning pitched, a low WHIP (which is what we want) doesn’t inherently mean high IP over a given season. For instance, a pitcher might post a 0.97 WHIP while pitching only 10 innings all season, because they were called up as a relief pitcher, used in 3 games of a series, and then sent back to the farm team once the regular pitching rotation was restored. And so, the second-highest weighted category is the stand-alone stat for IP.
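To make the small-sample problem concrete, here is a short Python sketch with two made-up pitchers. The names and numbers are hypothetical placeholders, not real players or real 2023 stats.

```python
# Hypothetical stat lines, for illustration only (not real players).
pitchers = [
    {"name": "Short-stint call-up", "bb": 3, "h": 6, "ip": 10.0},
    {"name": "Full-season starter", "bb": 45, "h": 155, "ip": 190.0},
]

# WHIP = (BB + H) / IP for each pitcher.
for p in pitchers:
    p["whip"] = (p["bb"] + p["h"]) / p["ip"]

# Sorting on WHIP alone (lower is better) puts the 10-inning pitcher first.
by_whip = sorted(pitchers, key=lambda p: p["whip"])
print([p["name"] for p in by_whip])

# Sorting on IP (higher is better) shows who actually carried a workload.
by_ip = sorted(pitchers, key=lambda p: p["ip"], reverse=True)
print([p["name"] for p in by_ip])
```

The two sorts disagree, which is the reason for weighting IP as its own category rather than relying on WHIP alone.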


Third in weight, I placed strikeouts (K), the higher the better. But you have to be careful here: a high K count doesn’t guarantee a low ERA (or WHIP), because a pitcher who works in the strike zone also gives batters more pitches they can make contact with. So this is where the fourth and final weighted stats category comes in: ERA. ERA, or earned run average, is the number of earned runs a pitcher allows per nine innings pitched.


The image shows a graphic explaining the formula for calculating a baseball pitcher's Earned Run Average (ERA). ERA is computed as the number of earned runs allowed by a pitcher, divided by innings pitched, and then multiplied by nine.
Calculation for ERA (earned run average)

The formula for ERA is (earned runs / innings pitched) x 9. We multiply by 9 because a regulation baseball game has 9 innings (as in, not counting extra innings).
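Here is a minimal Python sketch of the ERA calculation. The function names and sample numbers are my own assumptions for illustration. One caveat worth hedging: box scores conventionally write thirds of an inning as .1 and .2 (so 6.1 means 6 and 1/3 innings), and the small helper below assumes the IP column follows that convention.

```python
def innings_from_notation(ip: str) -> float:
    """Convert box-score innings like "6.1" (6 and 1/3) to a true decimal."""
    whole, _, outs = ip.partition(".")
    return int(whole) + int(outs or 0) / 3


def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned run average: (ER / IP) * 9."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return earned_runs / innings_pitched * 9


# Hypothetical stat line, for illustration only: 60 ER over 180 IP.
print(round(era(60, innings_from_notation("180.0")), 2))  # 3.0
```

An ERA of 3.0 reads as three earned runs allowed for every nine innings pitched.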


Now that I have my own statistical ranking system to determine priority of available SP, I can ask ChatGPT to help me write the formula. I began by asking what it can determine by considering a screenshot of current pitchers and their stats. Here’s what happened in this step of my chat with the AI: 


This is a screenshot of an online fantasy baseball interface showing a list of starting pitchers with their 2023 season pitching statistics, such as innings pitched (IP), hits (H), earned runs (ER), bases on balls/walks (BB), strikeouts (K), saves (SV), and earned run average (ERA), as well as WHIP (walks plus hits per inning pitched). The pitchers, labeled as free agents (FA), include Logan Webb and Gerrit Cole, among others, and the interface also indicates their game status for March 20th. Additionally, the '%ROST' column suggests the percentage of fantasy rosters the players are on, which is near or at 100% for all listed players.
The first task I gave to the AI: describing what is contained within this image of SP stats

ChatGPT told me it determined 5 areas of consideration: 1) Players, 2) Type and Status, 3) Today’s Date, 4) 2023 Season Pitching Statistics, and 5) Research and Rostering. Ultimately, it concluded that the image I provided enables fantasy baseball players to consider athletes’ performance in order to decide their fantasy team roster. 


Next, I told the AI that all I want it to consider is 1) Players and 4) 2023 Season Pitching Statistics, since the other dimensions it pulled out aren’t relevant for this situation. I then asked it to write me a formula to determine best SP, given the statistical importance I outlined: lowest WHIP, highest IP, highest K, and lowest ERA, in descending order*: 


 This image is a text conversation where a user asks for a formula to evaluate starting pitchers based on identified season stats: WHIP, IP, K, and ERA. The response outlines a formula assigning weights to each stat, with a higher weight for the more important stats, suggesting the following notation: WHIP as W, IP as IP, K as strikeouts K, ERA as E, and inversing WHIP and ERA as lower numbers are better. The proposed pitcher score formula is a weighted sum of these stats, with an example giving specific weights to each.
Collaboration with AI begins when I confirm what it can see, but tell it to home in only on specific details

The AI rewrote my parameters for easier equation formulation and assigned a weight to each. So, WHIP becomes W (weight w1), IP stays IP (w2), K stays K (w3), and ERA becomes E (w4)**.
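Written out, the weighted sum described here would look roughly like the following. This is a reconstruction from the description of the screenshots (WHIP and ERA inverted because lower is better), so the exact form in the original chat may differ.

```latex
\[
\text{Pitcher Score} = w_1 \cdot \frac{1}{W} + w_2 \cdot IP + w_3 \cdot K + w_4 \cdot \frac{1}{E}
\]
```

Here W is WHIP, E is ERA, and w1 through w4 are the weights, with w1 the largest and w4 the smallest to match the order of importance.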



Outcome of Formula


I next asked ChatGPT to plug in the player statistics depicted in the screenshot and verify the results of the formula written to determine the best starting pitcher score. It first restated how the formula is written and explained how it would be applied (first image below), and then showed me its computation as written code (second image below).


The image presents a mathematical formula for calculating a pitcher's score using inversely weighted stats, prioritizing WHIP and ERA as more impactful when lower, and IP and strikeouts as positive when higher. Weights are assigned to each stat to create a balanced score that reflects a pitcher's effectiveness.
Formula for Pitcher Score calculation refined

The image shows Python code using the pandas library to import player statistics, calculate a weighted pitcher score based on specified criteria—WHIP, IP, K, ERA—and then sort the pitchers by their calculated scores in descending order to determine the best performers.
Python code used to determine the results for SP formula
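For readers who can't see the screenshot, here is a rough sketch of what a ranking like this could look like. It is not the code ChatGPT produced: it uses only the Python standard library rather than pandas, the weights are placeholder assumptions (the actual values aren't stated in this post), and the three pitchers are made-up stat lines rather than real 2023 numbers.

```python
from dataclasses import dataclass


@dataclass
class Pitcher:
    name: str
    whip: float
    ip: float
    k: int
    era: float


# Assumed weights in descending order of importance (WHIP, IP, K, ERA).
# These are placeholders, not the values used in the original chat.
W1, W2, W3, W4 = 0.4, 0.3, 0.2, 0.1


def pitcher_score(p: Pitcher) -> float:
    """Weighted sum; WHIP and ERA are inverted because lower is better."""
    return W1 * (1 / p.whip) + W2 * p.ip + W3 * p.k + W4 * (1 / p.era)


# Hypothetical stat lines, for illustration only.
pitchers = [
    Pitcher("Pitcher B", whip=1.10, ip=180.0, k=190, era=3.40),
    Pitcher("Pitcher C", whip=0.95, ip=60.0, k=70, era=2.50),
    Pitcher("Pitcher A", whip=1.00, ip=200.0, k=220, era=2.80),
]

# Highest score first.
ranked = sorted(pitchers, key=pitcher_score, reverse=True)
for rank, p in enumerate(ranked, start=1):
    print(rank, p.name, round(pitcher_score(p), 2))
```

One thing this sketch makes visible: 1/WHIP and 1/ERA come out around 1 or below, while IP and K run into the hundreds, so with raw values the score is driven mostly by IP and K unless the weights compensate. Rescaling each stat to a common range before weighting would be one way to handle that.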

Finally, this produces a list of SP ranked from best to worst according to the parameters I set. We’re left with a brand new Pitcher Score from my formula and calculations. Here’s the list of the top 12 MLB starting pitchers from the formula that AI and I wrote together:


The image shows the results of a Python script execution, presenting a table of baseball players ranked by a computed Pitcher Score that considers weighted statistics such as WHIP, IP, K, and ERA. The top three starting pitchers, as per the calculated scores, are Gerrit Cole, Zac Gallen, and Pablo Lopez, suggesting Gerrit Cole is the best starting pitcher of the listed based on the given formula.
The ranked results for calculating best SP

My next article will backtrack a little (before moving on to how I select closing pitchers and batters), ultimately showing how I decided that my first two draft picks are starting pitchers. That post will also include which positions I intend to draft in which order, until the draft of 23 players is complete, taking MLB’s standard stats*** into consideration. Stay tuned ⚾️ 🤓



Footnotes

*Each of these figures exists MLB-wide; none is a formula I’ve determined or calculated myself, whether it’s a simple tally of specific markers, like strikeouts and innings pitched, or a more complex computation like WHIP or ERA.


**This was my third attempt at deciding the proper formula for use. The first one simply took each of these in turn, weighted from most important, 1, to least important, 5: 2023 season, IP, K, ERA, H, and ER. I abandoned this avenue when I turned to WHIP, which takes a handful of these into consideration, rendering individual consideration of IP, walks (BB), and H moot. The second attempt grouped the stats into two main elements: those where highest is best, IP and K, and those where lowest is best, ERA and WHIP. I wrote this out as: (IP + K) + (ERA x WHIP). But this still felt unnecessarily clunky, if not outright off, because of a discrepancy between multiplication and addition of variables. Thus the third option became more suitable and I ran with it.
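One way to see why the second attempt might feel off is to plug in two made-up pitchers who differ only in ERA and WHIP. This is a sketch with hypothetical numbers, and the function name is my own.

```python
def second_attempt_score(ip: float, k: int, era: float, whip: float) -> float:
    """The abandoned second formula: (IP + K) + (ERA x WHIP)."""
    return (ip + k) + (era * whip)


# Hypothetical stat lines with the same IP and K, for illustration only.
better = second_attempt_score(ip=180.0, k=200, era=2.50, whip=1.00)
worse = second_attempt_score(ip=180.0, k=200, era=5.00, whip=1.50)

print(better, worse)  # 382.5 387.5
```

Because ERA x WHIP is added rather than inverted, the pitcher with the worse ERA and WHIP ends up with the higher score, which runs against the goal of rewarding low values for both.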


***There are many different types of stats that can be considered in baseball. Here is a list of MLB’s "standard stats," while a more comprehensive, advanced list of stats is here.
