
My Projects This Year

  • bsample301
  • Jul 9
  • 7 min read

I like to keep this site mostly about my work with hockey, but I thought this would be a great space to go into further detail about what I have accomplished with MoeAnalytics this season.


Baseball RPI & SOS


RPI (Rating Percentage Index) is a great way to rank teams in sports. It is mainly used in college sports like basketball and baseball.


Earlier this year I stumbled upon the website warrennolan.com, which has rankings for many different college sports. The part I was most interested in was its college baseball rankings, along with how it gets schedules and scores. A big part of the site is its baseball RPI rankings, and that inspired me to see if I could build the same thing for high school baseball in Ohio.


Another reason I wanted to do this is the way high school sports in Ohio work. Of the big 4 sports (football, basketball, hockey, baseball), baseball is the only one that does not use a formula to determine its postseason tournament. Football uses a formula developed by the OHSAA called the Football Computer Rankings. Joe Eitel also tracks those rankings live on his website.


Basketball has started to use the RPI developed by the OHSAA in coordination with MaxPreps. And finally, just this past season, hockey started using the computerized rankings from MaxPreps instead of the traditional coaches’ vote.


Baseball has not implemented a system like this. It still relies on the traditional coaches’ vote to determine seeds and matchups, which for the most part has worked well.


But while coaches’ rankings and votes are a great way to see how teams stack up against each other, I wanted to see what the data says.


I ran into multiple problems while trying to conduct this experiment.


Problem #1: How do I even do this?

I have very little coding experience, so my only option was to do this in Excel or Google Sheets. That is not a bad thing, as I have a lot of experience with spreadsheets.


If I was going to do this, I wanted to find a way for the data to update automatically, for two main reasons.


The first reason is that pretty much the only place to get all the data I need is MaxPreps. MaxPreps is an amazing site with a lot of information, but I have learned from previous experience that it is very difficult to get data out of. You can copy and paste a team’s schedule from the site, but it pastes in a very wonky format, and it is a pain to go through every schedule and clean it up.


The second reason is that, even if I were able to copy and paste the schedules in a nice format, copying and pasting 100+ schedules into a spreadsheet takes a lot of time.


But one day, while looking some things up, I found my answer: the importhtml formula.


This formula was incredibly helpful. The imported data updated automatically after the page I linked it to was edited, and the schedules came in a very clean format that was easy to work with. This formula is only available in Google Sheets, so Sheets was what I would use for this project.


This made it very easy to get the schedules for all the teams I needed. With the number of importhtml formulas I had, the sheet did take some time to load them all, but it was still a lot easier than gathering everything myself.
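For anyone who would rather script this step than use Sheets, here is a minimal Python sketch of the same idea: pull the Nth table out of a page as rows of text. importhtml takes a URL, a query of "table" or "list", and an index that starts at 1; the sketch copies only the 1-based index. The class name, function name, and sample HTML are made up for illustration, it does not handle nested tables, and a real schedule page will look different.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_table(html, index):
    """Return the index-th table (1-based, like importhtml) from an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Made-up sample page standing in for a team schedule.
SAMPLE_PAGE = """
<table>
  <tr><th>Date</th><th>Opponent</th><th>Result</th></tr>
  <tr><td>4/1</td><td>Team B</td><td>W 5-2</td></tr>
  <tr><td>4/3</td><td>Team C</td><td>L 1-4</td></tr>
</table>
"""

schedule = import_table(SAMPLE_PAGE, 1)
for row in schedule:
    print(row)
```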


Problem #2: What teams do I even use?

For this project, I knew I wanted to start with only the RPI of D1 teams. If I want to expand later on, I can add more divisions, but D1 was the place to start.


This past season the OHSAA restructured baseball to have 7 divisions instead of 4. This made D1 much smaller, leaving it with only 65 teams, which was a nice size for the sheet.


However, D1 teams play many games against non-D1 opponents, especially in Northern Ohio, where there are fewer D1 teams.


My solution was to include only D1 and D2 teams in my formula. A main reason for this was that there were only 64 teams in D2, but 125 in D3. Including D3 teams would make it more accurate, but I don’t know if the sheet would be able to handle that. So I kept it to D1 and D2 to start with, until I find a better way to set it up.
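The D1/D2 cutoff amounts to a filter on the list of games. Here is a minimal Python sketch, assuming each game is stored as a (winner, loser) pair and using a hypothetical DIVISION lookup. The team names are placeholders, and dropping a game when either side is outside D1/D2 is an assumption for illustration, not necessarily how the sheet handles it.

```python
# Hypothetical division lookup; a real one would cover all 65 D1 and 64 D2 teams.
DIVISION = {"Team A": 1, "Team B": 1, "Team C": 2, "Team D": 3}
COUNTED_DIVISIONS = {1, 2}  # D1 and D2 only


def counted_games(games):
    """Keep games where both the winner and the loser are in a counted division."""
    return [
        (winner, loser)
        for winner, loser in games
        if DIVISION.get(winner) in COUNTED_DIVISIONS
        and DIVISION.get(loser) in COUNTED_DIVISIONS
    ]


games = [("Team A", "Team B"), ("Team C", "Team A"), ("Team A", "Team D")]
kept = counted_games(games)  # the game against the D3 team is dropped
print(kept)
```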


All that was left was to keep up with it during the season and get the results.


For RPI, I just used the basic formula: 0.25 * (Team’s Winning Percentage) + 0.50 * (Opponents’ Winning Percentage) + 0.25 * (Opponents’ Opponents’ Winning Percentage)


A lot of people average the opponents’ individual Winning Percentages, but I combined all of the opponents’ records and used their total Winning Percentage.
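To make the difference between averaging and combining concrete, here is a minimal Python sketch of the basic formula. It assumes games are stored as (winner, loser) pairs, counts an opponent once for every game played against them, and does not remove head-to-head games from the opponents’ records. All function names are hypothetical, and this is not the actual Sheets setup.

```python
from collections import defaultdict


def build_records(games):
    """Tally wins, losses, and the list of opponents faced for every team."""
    wins, losses, opponents = defaultdict(int), defaultdict(int), defaultdict(list)
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)
    return wins, losses, opponents


def combined_pct(teams, wins, losses):
    """Total wins divided by total games across all listed teams (pooled)."""
    total_wins = sum(wins[t] for t in teams)
    total_games = sum(wins[t] + losses[t] for t in teams)
    return total_wins / total_games if total_games else 0.0


def averaged_pct(teams, wins, losses):
    """Average of each listed team's own winning percentage, for comparison."""
    pcts = [combined_pct([t], wins, losses) for t in teams]
    return sum(pcts) / len(pcts) if pcts else 0.0


def rpi(team, wins, losses, opponents):
    """0.25 * WP + 0.50 * OWP + 0.25 * OOWP, with OWP and OOWP pooled."""
    wp = combined_pct([team], wins, losses)
    opps = opponents[team]
    owp = combined_pct(opps, wins, losses)
    opp_opps = [oo for o in opps for oo in opponents[o]]
    oowp = combined_pct(opp_opps, wins, losses)
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Made-up results: (winner, loser)
games = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
wins, losses, opponents = build_records(games)

print(rpi("A", wins, losses, opponents))
print(combined_pct(opponents["A"], wins, losses))
print(averaged_pct(opponents["A"], wins, losses))
```

In this toy data, Team A’s opponents are a combined 2-3 (.400), while the average of their individual percentages is .333, which is why the two methods can rank teams differently.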


I also calculated the SOS (Strength of Schedule) of the D1 teams, like I did in a project last year (Tweet 1, Tweet 2, Tweet 3). For SOS, I just used the Opponents’ Winning Percentage, nothing fancy.


Here are my RPI results:

1 JACKSON

2 ST IGNATIUS

3 ARCHBISHOP MOELLER

4 GLENOAK

5 PERRYSBURG

6 OLENTANGY

7 FINDLAY

8 OLENTANGY ORANGE

9 OLENTANGY BERLIN

10 OLENTANGY LIBERTY

11 MEDINA

12 McKINLEY

13 MENTOR

14 ELDER

15 MARYSVILLE

16 LITTLE MIAMI

17 LAKOTA WEST

18 HILLIARD DARBY

19 CENTERVILLE

20 ST EDWARD

21 BEAVERCREEK

22 BRUNSWICK

23 THOMAS WORTHINGTON

24 MASON

25 LEBANON

26 ST XAVIER

27 LANCASTER

28 GROVE CITY

29 SPRINGBORO

30 STRONGSVILLE

31 LAKOTA EAST

32 PRINCETON

33 WESTERN HILLS

34 UPPER ARLINGTON

35 HILLIARD DAVIDSON

36 BEREA-MIDPARK

37 PICKERINGTON CENTRAL

38 NEWARK

39 MILFORD

40 JOHN MARSHALL

41 WEST CLERMONT

42 DUBLIN JEROME

43 PICKERINGTON NORTH

44 FAIRMONT

45 HILLIARD BRADLEY

46 OAK HILLS

47 WHITMER

48 HAYES

49 HAMILTON

50 SYCAMORE

51 WESTLAND

52 DUBLIN COFFMAN

53 WALNUT HILLS

54 COLERAIN

55 ELYRIA

56 LINCOLN

57 SPRINGFIELD

58 WAYNE

59 REYNOLDSBURG

60 CLEVELAND HEIGHTS

61 FAIRFIELD

62 CENTRAL CROSSING

63 MIDDLETOWN

64 LORAIN

65 GROVEPORT MADISON


Here are my SOS Results (ranked by hardest schedule first):

1 ST IGNATIUS

2 WHITMER

3 PERRYSBURG

4 McKINLEY

5 MARYSVILLE

6 DUBLIN JEROME

7 ST EDWARD

8 ELDER

9 CENTERVILLE

10 HILLIARD DARBY

11 ST XAVIER

12 WALNUT HILLS

13 ARCHBISHOP MOELLER

14 OLENTANGY

15 BEREA-MIDPARK

16 FINDLAY

17 GLENOAK

18 JACKSON

19 WESTERN HILLS

20 ELYRIA

21 CLEVELAND HEIGHTS

22 THOMAS WORTHINGTON

23 OLENTANGY BERLIN

24 DUBLIN COFFMAN

25 UPPER ARLINGTON

26 OLENTANGY ORANGE

27 FAIRFIELD

28 MIDDLETOWN

29 STRONGSVILLE

30 FAIRMONT

31 HILLIARD DAVIDSON

32 WEST CLERMONT

33 OLENTANGY LIBERTY

34 LAKOTA EAST

35 BRUNSWICK

36 SYCAMORE

37 SPRINGFIELD

38 PICKERINGTON CENTRAL

39 LAKOTA WEST

40 MILFORD

41 WAYNE

42 HILLIARD BRADLEY

43 CENTRAL CROSSING

44 LEBANON

45 PICKERINGTON NORTH

46 LITTLE MIAMI

47 NEWARK

48 GROVE CITY

49 JOHN MARSHALL

50 MENTOR

51 MEDINA

52 REYNOLDSBURG

53 BEAVERCREEK

54 HAYES

55 COLERAIN

56 MASON

56 PRINCETON

58 GROVEPORT MADISON

59 LINCOLN

60 SPRINGBORO

61 HAMILTON

62 LANCASTER

63 WESTLAND

64 OAK HILLS

65 LORAIN
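The SOS list above has Mason and Princeton both at 56 with the next team at 58, which matches standard competition ranking: tied teams share a rank and the following rank is skipped. Here is a minimal Python sketch of that ranking, assuming SOS values keyed by team name. The numbers are made up and the function name is hypothetical.

```python
def competition_rank(scores):
    """Rank names by score, highest first; ties share a rank and the next rank is skipped."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranked = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and score == ranked[-1][2]:
            rank = ranked[-1][0]  # same score as the previous team: share its rank
        else:
            rank = position       # otherwise the rank is the position in the list
        ranked.append((rank, name, score))
    return ranked


# Made-up SOS values (opponents' winning percentage)
sos = {"Team A": 0.61, "Team B": 0.55, "Team C": 0.55, "Team D": 0.48}
ranking = competition_rank(sos)
for rank, name, score in ranking:
    print(rank, name, score)
```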


I also ranked the conferences of the D1 teams by RPI:

1 Federal

2 Independent

3 NLL BUCK

4 GCL

5 OCC CARD

6 GCC

7 OCC CEN

8 ECC

9 CMAC

10 GWOC

11 SAL

12 GMC

13 SWC

14 OCC OH

15 OCC BUCK

16 OCC CAP

17 Lake Erie


Doing the RPI project allowed me to look at how individual teams are doing as well:

[Image: individual team breakdown]

Park Factor 


Park Factor was the 2nd project I did this season. I actually did it before the season started, back in February.


Park Factor is a way to see how hitter- or pitcher-friendly a park is. As we all know, not all parks are created equal.


Here is the formula for Park Factor:

[Image: Park Factor formula]

100 is neutral, below 100 is a pitcher-friendly park, and above 100 is a hitter-friendly park.
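For reference, one common way to compute a park factor on this 100-is-neutral scale compares runs per game at home with runs per game on the road. The sketch below assumes that definition, which may differ from the formula used for these rankings. The function name, parameter names, and numbers are made up for illustration.

```python
def park_factor(home_rs, home_ra, home_games, road_rs, road_ra, road_games):
    """100 * (runs scored + allowed per home game) / (runs scored + allowed per road game)."""
    home_rate = (home_rs + home_ra) / home_games
    road_rate = (road_rs + road_ra) / road_games
    return 100 * home_rate / road_rate


# Made-up totals: 11.0 total runs per game at home vs 9.0 on the road
hitter_park = park_factor(60, 50, 10, 45, 45, 10)

# Same scoring rate at home and on the road gives a neutral park
neutral_park = park_factor(50, 50, 10, 50, 50, 10)

print(round(hitter_park, 1), neutral_park)
```

Because each side is a per-game rate, a team does not need the same number of home and road games for the comparison to work.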


I collected data for 40 teams and their fields around the Cincinnati and Dayton area.


I collected data for 3 seasons, which at the time meant 2022-2024. Here are my results:

1 Anderson Anderson High School

2 Princeton Princeton Baseball Fields

3 Badin Joyce Park

4 Milford Milford High School

5 Lakota East Baseball Field

6 Wayne Wayne High School

7 St. Xavier Baseball Stadium

8 Fenwick Fenwick High School

9 Winton Woods Winton Woods High School

10 Centerville Booster Park

11 West Clermont West Clermont High School

12 Oak Hills Oak Hills High School

13 Carroll Carroll High School

14 Fairmont Fairmont Park

15 Sycamore Sycamore High School

16 McNicholas Paradise Athletic Complex

17 Lakota West Firebird Field

18 Chaminade Julienne Howell Field

19 Miamisburg Toadvine Field

20 Alter Nischwitz Stadium

21 Springboro Lundt Baseball Field

22 Colerain Colerain High School

23 Lebanon Lebanon Junior High School

24 Kings Kings High School

25 Moeller Kremchek Stadium

26 Hamilton Hamilton High School

27 LaSalle Lancer Baseball Field

28 CHCA Robert Gardner Baseball Stadium

29 Turpin Turpin High School

30 Northmont Northmont High School

31 Walnut Hills Reds Urban Youth Academy

32 Middletown Lefferson Park

33 Mason Mason Middle School

34 Fairfield Joe Nuxhall Field

35 Beavercreek Mark Stewart Field

36 Loveland Dave Evans Field

37 Vandalia Butler Vandalia Butler High School

38 Little Miami Little Miami High School

39 Springfield Springfield High School

40 Elder Panther Athletic Complex


Chaminade Julienne (CJ), Miamisburg, and Alter are the closest to 100 in my data, so the parks ranked above them are hitter friendly, and the parks ranked below them are pitcher friendly.

 
 
 



© 2021 by Braeden Sample. Proudly created with Wix.com
