MMA weight class. To the limit and not beyond?

I like MMA. I’m no expert, nor do I watch every single fight. Actually, I don’t know many of the fighters, but I like the adrenaline of a good fight. I even got dragged onto the Ronda Rousey fan train about 2 years ago, when she was at her peak, way before her WWE “fights”.

I remember watching Rousey at a pre-fight weigh-in. She looked extremely lean, almost malnourished, and I assumed she was dieting to stay in her weight class. It was when I saw her at the fight a few days later that it hit me (pun intended). She looked fluffy, heavier, with no visible muscle definition: a completely different body, just a few days after the weigh-in! So I did some research to pin down the question.

Ronda Rousey at the weigh-in and during a fight

The more they weigh, the more power behind each punch or kick, so fighters try to be as heavy as they can while staying within the limit of their class. I wondered whether the numbers would show that. Once I had the question, it was time to get the data!

Lucky me, a Reddit user had collected MMA fighter data from SherDog.com and tabulated them in a very convenient format. The data are old (2016), but still valid for answering my question. They can be downloaded from here.

I got the observation, I got the question, I got the data. Let’s have the answer!

First, I load the libraries I will use.

library (gsheet)
library (ggplot2)
library (data.table)

Data processing

Although I could have downloaded the .csv file from the Google Sheets page, I opted to load the data straight into R using gsheet.

fighters <- as.data.table ( gsheet2tbl (url = "https://docs.google.com/spreadsheets/d/1z3QX0uWXv-XHX2Nfuj6zZHrfEeXI3A9CKWkrGaBzB8s/edit#gid=0"))

Check that the data has been correctly downloaded.

summary (fighters)
##      url                 fid             name               nick          
##  Length:1561        Min.   :     4   Length:1561        Length:1561       
##  Class :character   1st Qu.:  2920   Class :character   Class :character  
##  Mode  :character   Median : 15806   Mode  :character   Mode  :character  
##                     Mean   : 26761                                        
##                     3rd Qu.: 41586                                        
##                     Max.   :172941                                        
##                                                                           
##   birth_date            height          weight      association       
##  Length:1561        Min.   :61.00   Min.   :105.0   Length:1561       
##  Class :character   1st Qu.:69.00   1st Qu.:146.0   Class :character  
##  Mode  :character   Median :71.00   Median :170.0   Mode  :character  
##                     Mean   :70.63   Mean   :175.4                     
##                     3rd Qu.:73.00   3rd Qu.:190.0                     
##                     Max.   :84.00   Max.   :600.0                     
##                     NA's   :55      NA's   :28                        
##     class             locality           country         
##  Length:1561        Length:1561        Length:1561       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
## 

Let’s see the number of fighters and their weight range for each class.

fighters [, .(min = min (weight, na.rm = T), max = max (weight, na.rm = T), count = .N), by = class]
##                 class min max count
##  1:     Featherweight 139 146   150
##  2: Light Heavyweight 189 206   184
##  3:      Bantamweight 134 136   133
##  4:         Flyweight 123 126    59
##  5:       Heavyweight 208 265   187
##  6:      Welterweight 158 171   293
##  7:       Lightweight 148 156   243
##  8:      Middleweight 173 186   218
##  9:       Strawweight 110 116    39
## 10: Super Heavyweight 270 600    19
## 11:               N/A 137 207    35
## 12:        Atomweight 105 105     1

It looks like the weight variable is not perfect: the range is too wide in some classes and there are a few NAs. Let’s fix it.

I will drop the extreme classes, Super Heavyweight and Atomweight, since they are either too spread out or have just one fighter, respectively. I will also remove every row with an NA in weight or class, as well as the fighters whose class is listed as N/A.

# Keep only rows from the regular classes that have a known weight and class.
fighters <- fighters [!class %in% c ("Super Heavyweight", "Atomweight", "N/A") &
                          !is.na (weight) &
                          !is.na (class)]

Re-checking that both classes and NAs are gone for good.

fighters [, .(min = min (weight, na.rm = T), max = max (weight, na.rm = T), count = .N), by = class]
##                class min max count
## 1:     Featherweight 139 146   150
## 2: Light Heavyweight 189 206   184
## 3:      Bantamweight 134 136   133
## 4:         Flyweight 123 126    59
## 5:       Heavyweight 208 265   187
## 6:      Welterweight 158 171   293
## 7:       Lightweight 148 156   243
## 8:      Middleweight 173 186   218
## 9:       Strawweight 110 116    39

I’ll add the limit for each class according to Wikipedia. Here I have two options: either I create a new data frame (or data table) and merge it with the main data table (i.e. fighters) by the variable class, or I write a series of if-else clauses. I’d rather merge both tables.

classes <- data.table (class = c("Strawweight", "Flyweight", "Bantamweight", "Featherweight",
                                 "Lightweight", "Welterweight", "Middleweight", "Light Heavyweight",
                                 "Heavyweight"),
                       limit = c(115, 125, 135, 145, 155, 170, 185, 205, 265))
# Add the limit column by joining on class.
fighters <- merge (fighters, classes, by = "class")
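
For comparison, the if-else route can also be written as a named-vector lookup, which gives the same limit column without a join. This is only a sketch on a made-up three-row table: the `toy` object and its values are assumptions for illustration, not part of the dataset, and the limits are the ones listed above.

```r
# Limits per class as a named numeric vector (same values as in `classes`).
limits <- c ("Strawweight" = 115, "Flyweight" = 125, "Bantamweight" = 135,
             "Featherweight" = 145, "Lightweight" = 155, "Welterweight" = 170,
             "Middleweight" = 185, "Light Heavyweight" = 205, "Heavyweight" = 265)

# Hypothetical toy data, only to show the lookup.
toy <- data.frame (name = c ("A", "B", "C"),
                   class = c ("Flyweight", "Light Heavyweight", "Welterweight"),
                   weight = c (126, 205, 169),
                   stringsAsFactors = FALSE)

# Indexing the named vector by class name returns the matching limit for each row.
toy$limit <- unname (limits [toy$class])
toy$overLimit <- toy$weight > toy$limit
print (toy)
```

The lookup relies on `class` being a character column. If it has already been converted to a factor, it would need `as.character ()` first, because a factor indexes a vector by its integer codes rather than by name.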

Now, I will factorize and order the variable class.

fighters$class <- factor(fighters$class,
                         levels = c("Strawweight", "Flyweight", "Bantamweight", "Featherweight",
                                    "Lightweight", "Welterweight", "Middleweight", "Light Heavyweight",
                                    "Heavyweight"),
                         ordered= T)

Plotting time

Now that I have a data table with the weight of each fighter and the class he/she belongs to, let’s plot it.

ggplot (data = fighters,
        aes (x = weight, fill = class)) +
    theme_classic () +
    geom_bar () +
    #geom_density (linetype = "dotted", alpha = 0.1) +
    geom_vline (aes (xintercept = limit), color = "red", alpha = 0.3, lwd = 0.2) +
    labs (title = "Number of MMA fighters per weight and limit for each class",
          x = "Weight (lbs)", y = "Count") +
    theme (plot.title = element_text (hjust = 0.5),
           legend.position = "bottom") +
    guides (fill = guide_legend (title = "Class")) +
    # Scale arguments are evaluated outside the data (only aes () looks inside it),
    # so the column has to be referenced as fighters$limit.
    scale_x_continuous (breaks = unique (fighters$limit)) +
    scale_fill_brewer(palette="Paired")

After inspecting the plot, the fact that some fighters appear over the limit of their class caught my eye. Let’s see what’s going on.

I’ll create a summary table with the number of fighters in each class, how many of them are over the weight limit, and the percentage they represent.

fighters [, .(fighters = .N, countOverLimit = sum (weight > limit), percentOverLimit = round ( ( (sum (weight > limit)*100) / .N), 2)), by = class]
##                class fighters countOverLimit percentOverLimit
## 1:      Bantamweight      133             14            10.53
## 2:     Featherweight      150              8             5.33
## 3:         Flyweight       59              2             3.39
## 4:       Heavyweight      187              0             0.00
## 5: Light Heavyweight      184              4             2.17
## 6:       Lightweight      243             14             5.76
## 7:      Middleweight      218             10             4.59
## 8:       Strawweight       39              7            17.95
## 9:      Welterweight      293             12             4.10

While some classes have no fighters over the weight limit (i.e. Heavyweight), in others almost 18% are over it (i.e. Strawweight). In every affected class, the maximum weight in the earlier min/max table is exactly one pound over the limit, so this could be due to rounding up during data collection, or the data may reflect the fighters’ regular weight rather than their weigh-in weight, but that’s something to look into.
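
One way to look into it would be to measure how far above the limit those fighters are. The following is a minimal sketch on a small made-up table (`toy` and its numbers are assumptions for illustration, not the real data); on the real data the same expressions would be run on `fighters`. If every excess turned out to be a single pound, rounding would be a more plausible explanation than fighters genuinely competing above their class.

```r
library (data.table)

# Hypothetical toy data with the same columns used in the post.
toy <- data.table (class  = c ("Strawweight", "Strawweight", "Flyweight", "Flyweight", "Welterweight"),
                   weight = c (116, 115, 126, 124, 171),
                   limit  = c (115, 115, 125, 125, 170))

# Pounds above the class limit (zero or negative means within the limit).
toy [, excess := weight - limit]

# For over-limit fighters only: how many there are and the largest excess per class.
overLimit <- toy [excess > 0, .(count = .N, maxExcess = max (excess)), by = class]
print (overLimit)
```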

Conclusion

My observation was on the right track. Most fighters try to weigh as much as they can within the limits of their weight class. Once they make weight at the weigh-in, they can gain a few pounds before the fight and, legally, be over the limit.
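
To put a number on "as much as they can", one could compute, per class, the share of fighters listed within a pound of the limit. This is a rough sketch on invented data: the `toy` table and the one-pound threshold `nearLbs` are assumptions chosen for illustration, and nothing here is a result from the real dataset.

```r
library (data.table)

# Hypothetical toy data with the same columns used in the post.
toy <- data.table (class  = c ("Lightweight", "Lightweight", "Lightweight", "Lightweight",
                               "Middleweight", "Middleweight"),
                   weight = c (155, 155, 154, 150, 185, 180),
                   limit  = c (155, 155, 155, 155, 185, 185))

nearLbs <- 1  # assumed threshold: at most one pound away from the limit, above or below

# Share of fighters per class whose listed weight is within `nearLbs` of the limit.
nearLimit <- toy [, .(fighters = .N,
                      percentNearLimit = round (100 * mean (abs (weight - limit) <= nearLbs), 2)),
                  by = class]
print (nearLimit)
```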

Extra

Just out of curiosity, let’s use the height to compute the body mass index (BMI) of each fighter and classify it according to the categories from the Wikipedia article.

fighters [, bmi := ((weight / (height^2)) * 703)]
# Classify BMI into the Wikipedia categories as an ordered factor;
# right = FALSE makes each interval closed on the left, e.g. [18.5, 25).
fighters [, bmiCat := cut (bmi,
                           breaks = c (-Inf, 15, 16, 18.5, 25, 30, 35, 40, 45, 50, 60, Inf),
                           labels = c ("Very severely underweight", "Severely underweight",
                                       "Underweight", "Normal (healthy weight)",
                                       "Overweight", "Moderately obese", "Severely obese",
                                       "Very severely obese", "Morbidly Obese",
                                       "Super Obese", "Hyper Obese"),
                           right = FALSE,
                           ordered_result = TRUE)]
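
As a quick sanity check of the BMI formula used above, the factor 703 converts pounds and inches to the metric definition of BMI (kilograms divided by metres squared). The numbers below describe a hypothetical fighter of 155 lbs and 70 inches, not a row from the data.

```r
weightLbs <- 155   # hypothetical fighter, not from the dataset
heightIn  <- 70

# Imperial formula, as used for the `bmi` column.
bmiImperial <- (weightLbs / heightIn^2) * 703

# Metric definition: kg / m^2, using 1 lb = 0.45359237 kg and 1 in = 0.0254 m.
weightKg  <- weightLbs * 0.45359237
heightM   <- heightIn * 0.0254
bmiMetric <- weightKg / heightM^2

print (round (c (imperial = bmiImperial, metric = bmiMetric), 2))
```

Both values round to 22.24, which falls in the "Normal (healthy weight)" band (18.5 to 25).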

Plot the BMI for each weight category.

ggplot (data = fighters [!is.na (fighters$height)],
        aes (x = class, y = bmi, color = bmiCat)) +
    theme_classic () +
    geom_jitter (size = 2.5, alpha = 0.5) +
    labs (title = "MMA fighters Body Mass Index by Weight Class",
          x = "MMA Class", y = "BMI") +
    theme (plot.title = element_text (hjust = 0.5),
           axis.text.x = element_text (hjust = 1, angle = 45)) +
    guides (color = guide_legend (title = "BMI Category", reverse = T))

Almost no fighter is underweight, and those who are sit just below the normal range. There is a clear tendency for the heavier classes to be overweight and even obese. After all, more pounds, more power.