Introducing NHL Team Ratings
Initial NHL ratings for the 2025-26 season and details about how the rating system works.
The puck drops on the NHL season late this afternoon with a tripleheader, and that’s followed by a few more games tomorrow. That means it’s time to post my initial NHL ratings.
I’ve written a lot about team ratings and posted a few articles each week with actual updated ratings for college football and the NFL. I’ve discussed a bit of the math behind the ratings, too. In this article, I want to talk more about some of the other technical details of the ratings, and then introduce my rating systems for the NHL and NBA. I’ll probably post game predictions on a weekly basis, but I’m not planning to update the ratings or make predictions more frequently for now.
In the past, I’ve generated team ratings for MLB, NFL, college basketball, and college football teams. Each was a self-contained system that acquired the data specific to that sport or league, calculated the ratings, and then generated tables with the output. The most complex piece is the rating algorithm itself, and this approach meant I needed to modify each sport’s Python code every time I wanted to tweak the algorithm a bit. In practice, this resulted in subtly different rating systems that shared only a generally common approach. Each sport or league had its own unique code, which made maintenance difficult and adding more sports even harder. I wanted to stop the madness, and there had to be a better way. I’m going to describe that better way in the rest of this article.
The Rating Process
I have different data sources depending on the sport I’m rating. The college football results are a CSV file that I acquire from collegefootballdata.com, and data formats like CSV are simple to work with. College basketball data is scraped from the NCAA website. Results from professional leagues are currently scraped from Sports Reference websites and other sources. NFL preseason and regular season results come from Pro Football Reference. The NHL data I use is a combination of Hockey Reference data and schedule data from the NHL stats API. For now, NBA data comes from Basketball Reference, but I’d like to add preseason games, and that will require pulling data from the NBA stats API. The API queries tend to return JSON files, which are easy to work with, but scraping data from Sports Reference and similar websites requires parsing HTML, which is a bit more complex.
The NHL ratings would have worked without NHL preseason games, but it would have made early season predictions less accurate. One obvious issue is that the Florida Panthers have been hit hard by injuries, and their preseason games provide useful information about the strength of the team right now. We won’t know just how much the injuries matter until some regular season games have been played, but incorporating preseason games should still improve the skill of the predictions. Without the preseason games, the Panthers were #2 in the ratings. But they’ve dropped four spots due to the preseason.
There’s not a common data format across different sports, and even the various Sports Reference websites have subtly different formats for their data tables. In each case, I have a conversion tool specific to that sport that scrapes data from the website or reads the CSV file, parses the data, and outputs it as a JSON file. I have specific keys for team names, the dates of games, whether it’s a neutral-site game, the scores for both teams, and other relevant data. There are a few sport-specific keys, but there are also keys that I expect to be present for every sport. These JSON files are the inputs to the actual rating system.
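As an illustration, a single game record in the common format might look like the following. The key names here are invented for the example; my converters use their own set of keys, but the idea is the same for every sport:

```python
import json

# A hypothetical game record in the common format; the actual key names
# differ, but every sport's converter emits the same required keys
game = {
    'date': '2024-10-08',
    'home_team': 'Florida Panthers',
    'away_team': 'Boston Bruins',
    'home_score': 3,
    'away_score': 2,
    'neutral_site': False,
    'preseason': False,
}

# The converter writes a list of these records as a JSON file, which the
# rating system then reads back as a list of dictionaries
json_text = json.dumps([game], indent = 2)
games = json.loads(json_text)
```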
The rating system is a separate piece of Python code, and I use the same code for every sport. It reads the JSON file with schedule data and game results, runs the rating system, and stores the outputs in another JSON file. Although there’s a single piece of code, that doesn’t mean there can’t be sport- or league-specific adjustments to the ratings.
For example, the probability of a tie in modern college football is zero, while the probability of a tie under current NFL rules is about 0.38% for a typical game. I just set this as a command line parameter when I run the rating system. I may need to add more sport-specific tweaks for the long NHL, NBA, and MLB seasons; in particular, older games during the season should perhaps be weighted a bit less than the most recent games. For MLB, I might want to apply a park factor to the number of runs scored. These adjustments can be added and controlled with additional command line parameters, and I just won’t turn them on for other sports.
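The idea can be sketched with argparse; the flag names below are illustrative, not necessarily the ones my code uses:

```python
import argparse

# Hypothetical flags for sport-specific adjustments; the real flag names
# and defaults in my rating system may differ
parser = argparse.ArgumentParser(description = 'Generic team rating system')
parser.add_argument('--tie-probability', type = float, default = 0.0,
                    help = 'Probability of a tie in a typical game (e.g. 0.0038 for the NFL)')
parser.add_argument('--recency-weighting', action = 'store_true',
                    help = 'Weight older games less during long seasons')
parser.add_argument('--park-factors', action = 'store_true',
                    help = 'Apply park factors to runs scored (MLB)')

# For college football, everything stays at its default; for the NFL:
args = parser.parse_args(['--tie-probability', '0.0038'])
```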
My approach to ratings involves repeating the rating process many times, each time with slightly different random adjustments to the ratings as they’re being calculated, and each producing a slightly different set of final ratings for the teams. The processing for one set of ratings doesn’t depend on the processing for any of the other rating attempts. They’re completely independent of each other, and this type of algorithm is described as embarrassingly parallel. My old Xeon system has four cores, and there’s no reason to let three of them sit idle while one core does all the work. Instead, each of the four cores can generate a set of ratings simultaneously, and the different cores don’t even need to share data with each other. This makes it quite simple to run the rating system in parallel, and I do exactly that. The result is that the computer’s fan runs a lot louder, I use quite a bit more power while the ratings are running (the readout on my UPS confirms an increase of over 100 watts), and I get the ratings in slightly more than a quarter of the time a serial run would take.
The rating system produces a JSON file with the results and a lot of data about past and future games. This output JSON file is then read by a postprocessing script that produces the tables for my articles. I also have another tool that’s specific to the NFL, which reads the rating JSON file, simulates the season a very large number of times, and produces tables to summarize the results.
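My NFL simulator is more involved, but the core idea of simulating a season many times can be sketched in a few lines. Everything below is illustrative: made-up teams, a made-up schedule, and a logistic win probability with an arbitrary scale:

```python
import random

# Hypothetical ratings and a tiny round-robin schedule, just for illustration
ratings = {'A': 0.8, 'B': 0.2, 'C': -0.5}
schedule = [('A', 'B'), ('B', 'C'), ('A', 'C')] * 6

def win_prob(rating_diff, scale = 1.0):
    # Logistic win probability from a rating difference; scale is a free
    # parameter, and home advantage is omitted to keep the sketch short
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))

def simulate_seasons(n_sims = 10000, seed = 0):
    rng = random.Random(seed)
    total_wins = {team: 0 for team in ratings}
    for _ in range(n_sims):
        for home, away in schedule:
            if rng.random() < win_prob(ratings[home] - ratings[away]):
                total_wins[home] += 1
            else:
                total_wins[away] += 1
    # Average wins per simulated season for each team
    return {team: wins / n_sims for team, wins in total_wins.items()}

average_wins = simulate_seasons()
```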
Scraping Data
There’s not a single way to scrape data from every website, and it instead needs to be customized for different sites. This usually requires some experimentation because documentation about data sources and API endpoints tends to be limited. This has also become more difficult in recent months because many sites have a high volume of traffic from poorly designed bots that are trying to scrape entire websites and build large collections of data for training large language models. Many of these sites rightly don’t want their performance degraded because of poorly behaved bots, and they don’t want their content used to train generative AI. The same measures that block some of this bot traffic sometimes also block the scripts I use to acquire data.
I use the Python requests module to load the pages and API endpoints that have data, though I might switch to a different module such as curl_cffi in the future to avoid being blocked by some of the anti-bot measures. Either way, the best practice is to avoid scraping large amounts of data at once and to use a function like Python’s time.sleep to wait a few seconds between requests. I’d prefer not to make the problem of flooding websites any worse than it already is. However, I’d also like to retry downloads after a temporary failure to make the process robust. Here’s the code I use to request data:
# Attempt to download a page in a robust manner
def retrieve_page(request_url, request_delay_time = 5, max_requests = 10):
    downloaded = False
    end_requests = False
    request_count = 0
    while end_requests == False:
        server_response = requests.get(request_url)
        request_count = request_count + 1
        # If the page is downloaded successfully
        if 200 <= server_response.status_code <= 299:
            end_requests = True
            downloaded = True
        # The maximum number of requests has been reached
        elif request_count >= max_requests:
            end_requests = True
        # There's an error, so wait and retry
        else:
            time.sleep(request_delay_time)
    if not downloaded:
        warnings.warn('Error downloading %s, code %d' % (request_url, server_response.status_code))
        return None
    return server_response
If I’m requesting a JSON file, this can easily be converted into a dictionary in Python, which is simple to work with. Here’s an example I use to retrieve data from the NHL API, which returns a week of games at a time and requires multiple requests to span the entire preseason:
def get_nhlapi_schedule(preseason_start, season_start, request_delay = 5):
    schedule_rows = []
    current_date = preseason_start
    while current_date < season_start:
        date_string = current_date.strftime('%Y-%m-%d')
        nhlapi_url = 'https://api-web.nhle.com/v1/schedule/' + date_string
        nhlapi_page = retrieve_page(nhlapi_url)
        if nhlapi_page is None:
            warnings.warn('Cannot retrieve API data for date ' + date_string)
            # Advance a week so a failed request can't stall the loop
            current_date = current_date + datetime.timedelta(days = 7)
        else:
            try:
                nhlapi_data = json.loads(nhlapi_page.text)
                for cur_day in nhlapi_data['gameWeek']:
                    for cur_game in cur_day['games']:
                        if cur_game['gameType'] == 1:
                            schedule_rows.append(cur_game)
                current_date = datetime.datetime.strptime(nhlapi_data['nextStartDate'].strip(), '%Y-%m-%d').date()
            except Exception:
                warnings.warn('Error parsing game data for date beginning ' + date_string)
                nhlapi_data = None
                # Fall back to advancing a week when nextStartDate isn't available
                current_date = current_date + datetime.timedelta(days = 7)
        time.sleep(request_delay)
    return schedule_rows
This code checks the gameType value in the dictionary to verify that each game is a preseason game, and it keeps requesting the next week of games until the entire preseason has been downloaded.
There’s also code for loading and parsing a CSV file; I use the pop function to remove the header line, which says what the columns are, from the parsed data. Here’s an example from the college football preprocessing code:
# Read the list of games from the CSV file
input_handle = open(input_file, newline = '', encoding = 'utf-8-sig')
reader = csv.reader(input_handle)
input_data = list(reader)
input_handle.close()
input_data_header = input_data.pop(0)
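Once the header row has been popped off, its index method gives column positions by name, so the rest of the code doesn’t depend on column order. A self-contained sketch, with invented column names (the real CSV uses its own headers):

```python
import csv
import io

# Stand-in for the real CSV file contents; the actual column names differ
csv_text = 'home_team,away_team,home_points,away_points\nMissouri,Kansas,35,21\n'
reader = csv.reader(io.StringIO(csv_text))
input_data = list(reader)
input_data_header = input_data.pop(0)

# Look up columns by name instead of hard-coding positions
home_column = input_data_header.index('home_team')
points_column = input_data_header.index('home_points')
first_game = input_data[0]
```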
The best tool I know of for parsing HTML data is the Beautiful Soup module, usually imported as bs4. For data from Sports Reference, using this can be a bit more complex because there are sometimes lots of tables, and some of the tables are actually loaded as comments but are made visible in your browser after the page loads. To get around this, when I’m searching for tables in a page, I also search through any comments in the page for tables:
# Get a list of parsed Sports Reference tables from HTML text
def get_parsed_sref_tables(htmltext, delete_headers = True):
    # Parse the HTML text, replacing non-breaking spaces with regular spaces
    soup = bs4.BeautifulSoup(htmltext.replace('\xa0', ' '), features = 'html.parser')
    # Get a list of tables that match the required classes
    table_list = soup.find_all('table', class_ = ['sortable', 'stats_table'])
    if table_list is None:
        table_list = []
    # Scan through the comments and convert comments that contain HTML to actual HTML
    for comment_match in soup.find_all(string = lambda string: isinstance(string, bs4.Comment)):
        parent = comment_match.parent
        reject_comment = False
        while parent:
            if parent.name in ['script', 'style']:
                reject_comment = True
            parent = parent.parent
        if not reject_comment:
            comment_match_soup = bs4.BeautifulSoup(str(comment_match), 'html.parser')
            if len(comment_match_soup.find_all()) > 0:
                table_list_result = comment_match_soup.find_all('table', class_ = ['sortable', 'stats_table'])
                if table_list_result is not None:
                    table_list.extend(table_list_result)
    # Loop through each table in the list and delete all extra headers
    if delete_headers:
        if table_list is not None:
            for this_table in table_list:
                header_list = this_table.find_all('tr', class_ = ['thead'])
                if header_list is not None:
                    for cur_header in header_list:
                        cur_header.extract()
                header_list = this_table.find_all('tr', class_ = ['over_header'])
                if header_list is not None:
                    for cur_header in header_list:
                        cur_header.extract()
    # Return the list of tables
    return table_list
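To see why the comment scan matters, here’s a standalone miniature of the same trick: a table hidden inside an HTML comment is invisible to a normal search, but re-parsing the comment text recovers it:

```python
import bs4

# A table wrapped in an HTML comment, like some Sports Reference tables
html = '<div><!-- <table class="stats_table"><tr><td>X</td></tr></table> --></div>'
soup = bs4.BeautifulSoup(html, 'html.parser')

# A plain search finds nothing because the table only exists inside a comment
visible_tables = soup.find_all('table')

# Searching the comment strings and re-parsing them recovers the hidden table
hidden_tables = []
for comment in soup.find_all(string = lambda s: isinstance(s, bs4.Comment)):
    comment_soup = bs4.BeautifulSoup(str(comment), 'html.parser')
    hidden_tables.extend(comment_soup.find_all('table'))
```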
There are several tools for parsing tables, but I prefer to convert tables into Pandas dataframes, then go through the dataframe to find the data I want. Pandas can also extract the URLs from links when it does this conversion, and that can be helpful in many ways. For example, teams often have three-letter identifiers such as NYY for the New York Yankees, DET for Detroit teams, or STL for teams from St. Louis. These identifiers are convenient, but they’re often not in any of the table columns. However, Sports Reference tables often include the name of the team, and the name is also a link to a page about the team. In most cases, the link contains the three-letter identifier, and recent versions of Pandas can extract those links so they can be parsed to recover the identifiers. Parsing an HTML table into a Pandas dataframe is exactly what pd.read_html does, and I use the extract_links option to get the link URLs. Instead of each cell in the table corresponding to one entry in the dataframe, each cell is returned as a tuple: the first element of the tuple has the text of the cell and the second is the link URL. Here’s the code for parsing Hockey Reference schedule tables, though this needs to be adapted for other sports:
# Parse a Sports Reference table as if it's a schedule
def parse_sref_schedule_table(this_table, this_season = None, season_new_column = 'Season'):
    # If we can get the table ID, then retrieve it
    if this_table.has_attr('id'):
        this_table_id = this_table['id'].strip()
    else:
        this_table_id = None
    # If we can retrieve the table ID, then we should parse it
    if this_table_id is not None:
        this_table_data = pd.read_html(io.StringIO(str(this_table)), extract_links = 'body')[0]
        # We need to iterate through the dataframe columns and rows to extract player names and team IDs, remove tuples, and put an asterisk prior to teams and players that aren't real
        for column_name in list(set(this_table_data.columns).intersection(['Visitor', 'Home'])):
            this_table_data[column_name + '.Name'] = pd.Series(dtype = 'string')
        nrows = len(this_table_data)
        ncols = len(this_table_data.columns)
        for i in range(0, nrows, 1):
            for j in range(0, ncols, 1):
                cdata = this_table_data.iat[i, j]
                # Test if the data in the cell is iterable and not a string
                if (isinstance(cdata, collections.abc.Iterable)) and (not isinstance(cdata, str)):
                    cdata0 = cdata[0]
                    cdata1 = cdata[1]
                    isiter = True
                else:
                    cdata0 = cdata
                    cdata1 = cdata
                    isiter = False
                # If it's a column for teams, try to extract the team ID from the URL
                if ['Visitor', 'Home'].count(this_table_data.columns[j]) > 0:
                    if isiter:
                        if cdata1 is not None:
                            try:
                                this_table_data.iat[i, j] = cdata1.split('/')[-2].split('.')[0]
                            except Exception:
                                this_table_data.iat[i, j] = '*' + str(cdata0)
                            this_table_data.loc[this_table_data.index[i], this_table_data.columns[j] + '.Name'] = str(cdata0)
                        else:
                            this_table_data.iat[i, j] = '*' + str(cdata0)
                            this_table_data.loc[this_table_data.index[i], this_table_data.columns[j] + '.Name'] = str(cdata0)
                    else:
                        this_table_data.iat[i, j] = '*' + str(cdata0)
                        this_table_data.loc[this_table_data.index[i], this_table_data.columns[j] + '.Name'] = str(cdata0)
                # Otherwise, just remove the tuple
                else:
                    this_table_data.iat[i, j] = str(cdata0)
        # Next, go through and clear unwanted rows (blank rows, league averages/totals, etc.)
        drop_rows = []
        if list(this_table_data.columns).count('Home') > 0:
            home_column = list(this_table_data.columns).index('Home')
        else:
            home_column = None
        if list(this_table_data.columns).count('Visitor') > 0:
            visitor_column = list(this_table_data.columns).index('Visitor')
        else:
            visitor_column = None
        for i in range(0, nrows, 1):
            row_index = list(this_table_data.index).index(i)
            del_row = False
            if this_season is not None:
                this_table_data.at[row_index, season_new_column] = this_season
            if (home_column is None) or (visitor_column is None) or (len(str(this_table_data.iat[i, home_column]).strip()) == 0) or (len(str(this_table_data.iat[i, visitor_column]).strip()) == 0):
                del_row = True
            # Not a schedule table, or a blank row in a schedule table, then delete
            if del_row:
                drop_rows.append(row_index)
        # Set the season column type to integer
        if this_season is not None:
            if nrows > 0:
                this_table_data[season_new_column] = this_table_data[season_new_column].astype(int)
        # Drop rows that should be dropped
        if len(drop_rows) > 0:
            this_table_data = this_table_data.drop(index = drop_rows)
    # Set the table data to None if there's no identifier
    else:
        this_table_data = None
    return this_table_id, this_table_data
There are some other details that have to be considered, too. For example, the three letter team identifiers don’t always remain the same from one season to the next, especially if a team relocates to a new city. If I’m using the games from a previous season at the start of the next season but some of the team identifiers have changed, the rating system would not recognize that it’s still the same team. In many cases, there’s a separate three letter identifier for the franchise. An example is that the Anaheim Ducks (currently ANA) were previously known as the Mighty Ducks of Anaheim, and the previous name had the team identifier MDA. But there’s a single franchise identifier, which is ANA.
Pro Football Reference seems unique in that it doesn’t change the three-letter team identifiers when teams relocate. For example, both the St. Louis Rams and the Los Angeles Rams are just RAM. But that’s not true for other sports, where it’s necessary to use the franchise identifiers instead of the team identifiers. And if I’m combining data from multiple sources like Hockey Reference and the NHL stats API, I need to make sure I can match up the teams between the two data sets. My point is that scraping data involves a lot of small details, and it requires care when acquiring the data and careful testing to make sure everything works as expected.
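A minimal way to handle identifier churn is a lookup table from historical team identifiers to a single franchise identifier, falling back to the team identifier itself when no remap is needed. Only the MDA entry below comes from the Ducks example above; any other remapped pairs would be filled in the same way:

```python
# Map historical team identifiers to a single franchise identifier
# (only the MDA -> ANA pair is from the article; extend as needed)
FRANCHISE_ID = {
    'MDA': 'ANA',  # Mighty Ducks of Anaheim -> Anaheim Ducks franchise
}

def franchise_id(team_id):
    # Fall back to the team identifier itself when no remap is needed
    return FRANCHISE_ID.get(team_id, team_id)
```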
There’s no one-size-fits-all approach to scraping data. It’s site-dependent, and it often takes some time and experimentation to get the acquisition working properly. Just before the start of the NFL season, a tweak to the tables on Pro Football Reference required an update to my code, so this does happen from time to time. My code for the entire rating system is available on Github, and I’ll update the NBA scraping tools once I add some code to use the nba_api software and scrape preseason results from the NBA’s stats API.
NHL and NBA Ratings
I probably won’t post weekly articles with NHL and NBA ratings and game predictions, though I’m not ruling it out. Instead, I’d like to have a landing page with the data that gets updated on a weekly basis, and I already have a landing page for NHL data. At some point, when I have the resources to automate more of the processing, I’d like to move to nightly updates. But I don’t want to commit to manually updating Substack pages on a daily basis right now.
Here are the NHL ratings at the start of the season, including preseason games:
Predictive Ratings
Home advantage: 0.361 goals
Mean score: 3.036 goals
Rank Move Rating Change Team Offense Defense
1 +2 0.846 +0.081 Tampa Bay Lightning 0.518 0.336
2 -1 0.779 -0.220 Winnipeg Jets 0.285 0.488
3 +2 0.768 +0.112 Washington Capitals 0.397 0.377
4 +2 0.758 +0.107 Colorado Avalanche 0.432 0.332
5 +2 0.691 +0.067 Dallas Stars 0.350 0.346
6 -4 0.673 -0.114 Florida Panthers 0.313 0.353
7 +3 0.528 +0.027 Los Angeles Kings 0.028 0.511
8 0.485 -0.099 Toronto Maple Leafs 0.309 0.178
9 -5 0.416 -0.260 Vegas Golden Knights 0.093 0.322
10 +1 0.392 -0.015 Edmonton Oilers 0.259 0.128
11 -2 0.389 -0.143 Carolina Hurricanes 0.230 0.167
12 0.251 -0.035 St. Louis Blues 0.117 0.141
13 0.108 -0.041 New Jersey Devils -0.137 0.244
14 0.040 -0.033 Ottawa Senators -0.084 0.118
15 +2 -0.008 +0.029 Minnesota Wild -0.179 0.177
16 -1 -0.055 -0.122 New York Rangers 0.091 -0.145
17 +7 -0.087 +0.215 Vancouver Canucks -0.026 -0.056
18 -2 -0.119 -0.108 Columbus Blue Jackets 0.094 -0.200
19 +1 -0.150 +0.090 Montreal Canadiens -0.015 -0.148
20 +3 -0.210 +0.081 Detroit Red Wings -0.162 -0.054
21 -0.236 +0.025 Buffalo Sabres 0.184 -0.417
22 -0.274 +0.008 Seattle Kraken -0.093 -0.187
23 -5 -0.290 -0.149 Utah Mammoth -0.208 -0.083
24 +2 -0.318 +0.205 Anaheim Ducks -0.250 -0.074
25 -6 -0.325 -0.158 Calgary Flames -0.372 0.049
26 +3 -0.514 +0.134 Pittsburgh Penguins -0.098 -0.406
27 +1 -0.564 +0.072 Boston Bruins -0.300 -0.260
28 +2 -0.587 +0.179 Nashville Predators -0.421 -0.167
29 -4 -0.589 -0.142 New York Islanders -0.332 -0.253
30 -3 -0.663 -0.053 Philadelphia Flyers -0.136 -0.518
31 -0.967 +0.048 Chicago Blackhawks -0.407 -0.557
32 -1.205 +0.128 San Jose Sharks -0.477 -0.735
Changes in rank and rating are the difference between ratings with and without preseason games. For the current ratings, preseason games are weighted at 100%, and last season’s games are weighted at 50%. As the season goes on, I’ll slowly phase out the impact of the preseason and last season’s games, just like I’ve done in college football and the NFL. I don’t plan to run any alternative approaches to weighting games in the NHL and NBA, though, with the possible exception of reducing the weight of early season games later in the season. Here is the schedule strength table, showing both the expected losing percentage, which weights tougher schedules more heavily (SOS/Future), and the average of opponent ratings with adjustments for home ice advantage (OppRtg/Future):
Schedule Strength for an Average Team
Home advantage: 0.361 goals
Mean score: 3.036 goals
Rank Team SOS Future OppRtg Future
1 Tampa Bay Lightning --- .495 (30) --- -0.031 (30)
2 Winnipeg Jets --- .497 (25) --- -0.024 (25)
3 Washington Capitals --- .494 (32) --- -0.042 (32)
4 Colorado Avalanche --- .497 (26) --- -0.024 (26)
5 Dallas Stars --- .499 (21) --- -0.011 (21)
6 Florida Panthers --- .499 (18) --- -0.007 (18)
7 Los Angeles Kings --- .496 (29) --- -0.030 (29)
8 Toronto Maple Leafs --- .498 (23) --- -0.015 (23)
9 Vegas Golden Knights --- .496 (28) --- -0.030 (28)
10 Edmonton Oilers --- .495 (31) --- -0.036 (31)
11 Carolina Hurricanes --- .496 (27) --- -0.028 (27)
12 St. Louis Blues --- .497 (24) --- -0.018 (24)
13 New Jersey Devils --- .500 (17) --- -0.003 (17)
14 Ottawa Senators --- .501 (16) --- 0.003 (16)
15 Minnesota Wild --- .501 (15) --- 0.005 (15)
16 New York Rangers --- .499 (20) --- -0.008 (19)
17 Vancouver Canucks --- .499 (22) --- -0.011 (22)
18 Columbus Blue Jackets --- .501 (13) --- 0.008 (13)
19 Montreal Canadiens --- .502 (11) --- 0.014 (11)
20 Detroit Red Wings --- .504 (3) --- 0.027 (3)
21 Buffalo Sabres --- .504 (4) --- 0.026 (4)
22 Seattle Kraken --- .503 (8) --- 0.018 (8)
23 Utah Mammoth --- .502 (12) --- 0.009 (12)
24 Anaheim Ducks --- .499 (19) --- -0.009 (20)
25 Calgary Flames --- .501 (14) --- 0.006 (14)
26 Pittsburgh Penguins --- .503 (9) --- 0.016 (9)
27 Boston Bruins --- .503 (7) --- 0.021 (7)
28 Nashville Predators --- .503 (10) --- 0.015 (10)
29 New York Islanders --- .504 (6) --- 0.024 (6)
30 Philadelphia Flyers --- .504 (5) --- 0.026 (5)
31 Chicago Blackhawks --- .505 (2) --- 0.033 (2)
32 San Jose Sharks --- .506 (1) --- 0.041 (1)
Some of the difference from top to bottom is probably due to the schedules not being exactly balanced, but a lot of it is also that very good and very bad teams don’t play themselves.
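The game-weighting scheme described above (preseason games at full weight, last season’s games at half weight, both phased out as the season progresses) can be sketched like this. The 30-day preseason window, the 60-day linear phase-out, and the dates are invented numbers for the example, not the values my system uses:

```python
import datetime

def game_weight(game_date, season_start, as_of,
                phase_out_days = 60, preseason_window_days = 30,
                preseason_weight = 1.0, last_season_weight = 0.5):
    # Linearly fade pre-season-start games to zero weight over the first
    # phase_out_days of the regular season (all numbers are illustrative)
    days_into_season = (as_of - season_start).days
    fade = max(0.0, 1.0 - days_into_season / phase_out_days)
    if game_date >= season_start:
        return 1.0  # regular season games always get full weight
    elif game_date >= season_start - datetime.timedelta(days = preseason_window_days):
        return preseason_weight * fade  # preseason games
    else:
        return last_season_weight * fade  # last season's games

# Illustrative dates: a preseason game evaluated on opening night
start = datetime.date(2025, 10, 7)
opening_weight = game_weight(datetime.date(2025, 9, 25), start, as_of = start)
```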
Here are the preliminary NBA ratings, which include games from the 2024-25 season but not any 2025-26 preseason games:
Predictive Ratings
Home advantage: 1.83 points
Mean score: 113.48 points
Rank Rating Team Offense Defense
1 12.33 Oklahoma City Thunder 5.03 7.33
2 9.52 Cleveland Cavaliers 8.46 1.06
3 8.44 Boston Celtics 1.94 6.49
4 5.00 Minnesota Timberwolves 0.22 4.77
5 4.78 Los Angeles Clippers -1.39 6.15
6 4.68 Houston Rockets 0.21 4.47
7 4.15 Memphis Grizzlies 7.59 -3.44
8 3.71 New York Knicks 1.28 2.41
9 3.70 Denver Nuggets 6.42 -2.73
10 3.61 Golden State Warriors -0.31 3.92
11 3.45 Indiana Pacers 3.96 -0.51
12 2.47 Milwaukee Bucks 1.57 0.90
13 1.69 Detroit Pistons 1.56 0.15
14 1.38 Los Angeles Lakers -0.52 1.91
15 0.47 Sacramento Kings 2.06 -1.55
16 -0.35 Miami Heat -3.66 3.32
17 -0.56 Dallas Mavericks 0.79 -1.33
18 -0.65 Orlando Magic -8.72 8.05
19 -1.90 Atlanta Hawks 4.64 -6.55
20 -1.95 Chicago Bulls 4.13 -6.04
21 -2.57 Portland Trail Blazers -2.73 0.17
22 -2.65 Phoenix Suns 0.28 -2.93
23 -2.72 San Antonio Spurs 0.49 -3.19
24 -4.15 Toronto Raptors -2.34 -1.79
25 -6.63 Philadelphia 76ers -3.90 -2.70
26 -6.99 Brooklyn Nets -8.45 1.48
27 -8.43 New Orleans Pelicans -3.44 -4.99
28 -8.50 Utah Jazz -1.29 -7.19
29 -9.03 Charlotte Hornets -8.46 -0.56
30 -12.34 Washington Wizards -5.34 -6.99
Upcoming Articles
The NBA season doesn’t start as early as the NHL season, so I have a bit of time to update the NBA scraping code. There are a few other details I’d like to change in the NBA code to make it a bit easier to work with, but I’ll get that done before the season begins. And there will be an article with preseason NBA ratings once I have everything working.
My big project for the remainder of the year will be developing a new system to project MLB player stats in future seasons. I’d planned to post a couple of baseball articles about that already, but I got distracted with updates to the rating system. My approach in the past has been to estimate a player’s raw skills and extrapolate how those skills change as a player ages. Instead of directly projecting stats like ERA, WHIP, OBP, and SLG, I prefer to look at things like exit velocity, launch angle, and contact rates. But a player’s launch angle has some impact on the resulting exit velocity, so just looking at the average exit velocity, or even the percentage of hard-hit balls (exit velocity >= 95 mph), doesn’t always tell the full story about a player’s ability to hit the ball hard. The article will be about developing better tools to measure a player’s skill at hitting the ball hard and consistently making good contact. It will probably be posted during the weekend, or at least after I post the college football and NFL ratings and game predictions.
You can access my code for the rating system on Github, including all the data scraping tools, the NFL season simulator, the rating system, my postprocessor for generating tables, and the tool I used to graph how teams are connected. I named the system Tiger because I’ve occasionally read discussions on Reddit about whether computer ratings are biased, and the name is my attempt at trolling about any suggestion that I’ve biased the computer ratings in favor of my alma mater, the University of Missouri, and the Tigers. I do plan to better document how to use these tools and to simplify the command line options they require. I’ll also post my tools for verifying the performance of my predictions once those are ready. I also want to test and verify that my code for scraping college basketball data hasn’t broken since last season before I add that to the repository.
There are quite a few systems used to rate teams, and you can find a very large number of college football ratings on a page maintained by Kenneth Massey. My system is somewhat unusual in that it’s completely open source, whereas a lot of other rating systems are proprietary. It takes a lot of time and effort to develop this software, test it, and post these articles. As you may have inferred from this article, I am not supportive of generative AI, and I never use it in the creation of my work. That might mean my writing is a bit more awkward and less polished, and I do recycle text from my previous articles, but I’d prefer that over doing something that feels like cheating. Lewis Hamilton found out on the last couple of laps of the Singapore Grand Prix that cutting corners gets you penalized in racing, and I don’t support cutting corners in creating content, either. If you’d like to help this project continue, and I really hope you will if you’re able, please consider visiting my about page to see how you can contribute financially. Even if you can’t contribute financially, please consider helping me by subscribing, posting comments, and sharing my articles on social media.
Thanks for reading!