I like MMA. I’m no expert, nor do I watch every single fight. Actually, I don’t know many of the fighters, but I like the adrenaline of a good fight. I even got dragged onto the Ronda Rousey fan train about 2 years ago, when she was at her peak, way before her WWE “fights”.
I remember watching Rousey during the pre-fight weigh-in. She was extremely fit, almost malnourished, but I thought she was on a diet, trying to stay in her weight class. It was when I saw her a few days later at the fight that it hit me (pun intended). She looked fluffy, fatter, no muscles, a completely different body. Just a few days after the weigh-in! So I did my research to find the right question.
The more they weigh, the more power behind each punch or kick, so fighters try to be as heavy as they can while staying within the limit of their class. So I wondered whether the numbers would show that. Once I got the question, get the data!
Lucky me, one Reddit user collected MMA fighter data from SherDog.com and tabulated them into a very convenient format. The data are old (2016), but still valid to answer my question. They can be downloaded from here.
I got the observation, I got the question, I got the data. Let’s have the answer!
First, I load the libraries I will use.
library (gsheet)
library (ggplot2)
library (data.table)
Data processing
Even though I could download the .csv file from the Google Sheet page, I opted to load the data straight into R using gsheet.
fighters <- as.data.table ( gsheet2tbl (url = "https://docs.google.com/spreadsheets/d/1z3QX0uWXv-XHX2Nfuj6zZHrfEeXI3A9CKWkrGaBzB8s/edit#gid=0"))
Check that the data has been correctly downloaded.
summary (fighters)
##      url                 fid             name               nick
##  Length:1561        Min.   :     4   Length:1561        Length:1561
##  Class :character   1st Qu.:  2920   Class :character   Class :character
##  Mode  :character   Median : 15806   Mode  :character   Mode  :character
##                     Mean   : 26761
##                     3rd Qu.: 41586
##                     Max.   :172941
##
##   birth_date            height          weight      association
##  Length:1561        Min.   :61.00   Min.   :105.0   Length:1561
##  Class :character   1st Qu.:69.00   1st Qu.:146.0   Class :character
##  Mode  :character   Median :71.00   Median :170.0   Mode  :character
##                     Mean   :70.63   Mean   :175.4
##                     3rd Qu.:73.00   3rd Qu.:190.0
##                     Max.   :84.00   Max.   :600.0
##                     NA's   :55      NA's   :28
##     class             locality           country
##  Length:1561        Length:1561        Length:1561
##  Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character
##
##
##
##
Let’s see the number of fighters and the weight range for each class.
fighters [, .(min = min (weight, na.rm = T), max = max (weight, na.rm = T), count = .N), by = class]
## class min max count
## 1: Featherweight 139 146 150
## 2: Light Heavyweight 189 206 184
## 3: Bantamweight 134 136 133
## 4: Flyweight 123 126 59
## 5: Heavyweight 208 265 187
## 6: Welterweight 158 171 293
## 7: Lightweight 148 156 243
## 8: Middleweight 173 186 218
## 9: Strawweight 110 116 39
## 10: Super Heavyweight 270 600 19
## 11: N/A 137 207 35
## 12: Atomweight 105 105 1
It looks like the weight variable is not perfect: the range is too wide and there are a few NAs. Let’s fix it.
I will drop the extreme classes (Super Heavyweight and Atomweight), since they are either too spread out or have just one fighter, respectively. I will also remove every NA in weight and class, including the fighters whose class is recorded as the string "N/A".
# Keep only fighters with a known weight and a standard weight class
fighters <- fighters [!class %in% c("Super Heavyweight", "Atomweight", "N/A") &
!is.na (weight) &
!is.na (class)]
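A small aside on why the filter needs both a string comparison and is.na: the class "N/A" in the sheet is an ordinary string, so is.na does not catch it. The sketch below uses a made-up `toy` table (not rows from the SherDog data) to show the difference.

```r
library (data.table)

# Hypothetical toy data, invented for illustration only
toy <- data.table (class = c("Flyweight", "N/A", NA, "Atomweight"),
                   weight = c(125, 150, 170, NA))

# Only the real missing value is flagged; the string "N/A" is not
is.na (toy$class)  # FALSE FALSE TRUE FALSE

# Same kind of filter as above: drop unwanted classes and missing values
kept <- toy [!class %in% c("Atomweight", "N/A") &
             !is.na (class) &
             !is.na (weight)]
```

Only the Flyweight row survives in `kept`.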
Re-checking that both classes and NAs are gone for good.
fighters [, .(min = min (weight, na.rm = T), max = max (weight, na.rm = T), count = .N), by = class]
## class min max count
## 1: Featherweight 139 146 150
## 2: Light Heavyweight 189 206 184
## 3: Bantamweight 134 136 133
## 4: Flyweight 123 126 59
## 5: Heavyweight 208 265 187
## 6: Welterweight 158 171 293
## 7: Lightweight 148 156 243
## 8: Middleweight 173 186 218
## 9: Strawweight 110 116 39
I’ll add the limits for each class according to Wikipedia. Here I have two options: either I create a new data frame (or data table) and merge it with the main data table (i.e., fighters) by the variable class, or I write a series of if else clauses. I’d rather merge both data frames.
classes <- data.table (class = c("Strawweight", "Flyweight", "Bantamweight", "Featherweight",
"Lightweight", "Welterweight", "Middleweight", "Light Heavyweight",
"Heavyweight"),
limit = c(115, 125, 135, 145, 155, 170, 185, 205, 265))
fighters <- merge (fighters, classes, by = "class") # `by.all` is not a merge argument; join on the shared column
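To make the two options concrete, here is a minimal sketch on a made-up `toy` table. The fighter names, the `limits` lookup vector and the `merged` object are hypothetical and only there for illustration. Note that merge does an inner join by default, so any fighter whose class is missing from the lookup table would silently be dropped.

```r
library (data.table)

# Hypothetical toy data, invented for illustration only
toy <- data.table (name = c("A", "B", "C"),
                   class = c("Flyweight", "Heavyweight", "Flyweight"),
                   weight = c(125, 240, 126))
classes <- data.table (class = c("Flyweight", "Heavyweight"),
                       limit = c(125, 265))

# Option 1: join the limits onto the fighters by the shared column
merged <- merge (toy, classes, by = "class")

# Option 2: a named vector used as a lookup table, instead of nested if else
limits <- c(Flyweight = 125, Heavyweight = 265)
toy [, limit := unname (limits [class])]
```

Both routes attach the same limit to every fighter; the merge result comes back sorted by class, while the lookup keeps the original row order.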
Now, I will factorize and order the variable class.
fighters$class <- factor(fighters$class,
levels = c("Strawweight", "Flyweight", "Bantamweight", "Featherweight",
"Lightweight", "Welterweight", "Middleweight", "Light Heavyweight",
"Heavyweight"),
ordered= T)
Plotting time
Now that I have a data table with the weight of each fighter and the class he/she belongs to, let’s plot it.
ggplot (data = fighters,
aes (x = weight, fill = class)) +
theme_classic () +
geom_bar () +
#geom_density (linetype = "dotted", alpha = 0.1) +
geom_vline (aes (xintercept = limit), color = "red", alpha = 0.3, lwd = 0.2) +
labs (title = "Number of MMA fighters per weight and limit for each class",
x = "Weight (lbs)", y = "Count") +
theme (plot.title = element_text (hjust = 0.5),
legend.position = "bottom") +
guides (fill = guide_legend (title = "Class")) +
scale_x_continuous (breaks = unique (fighters$limit)) + # `breaks` is not an aesthetic, so it is not looked up in `data`; the column has to be referenced explicitly.
scale_fill_brewer(palette="Paired")
After inspecting the plot, the fact that some fighters appear over the limit of their class caught my eye. Let’s see what’s going on.
I’ll create a summary table with the number of fighters in each class, how many of them are over the weight limit, and the percentage they represent.
fighters [, .(fighters = .N, countOverLimit = sum (weight > limit), percentOverLimit = round ( ( (sum (weight > limit)*100) / .N), 2)), by = class]
## class fighters countOverLimit percentOverLimit
## 1: Bantamweight 133 14 10.53
## 2: Featherweight 150 8 5.33
## 3: Flyweight 59 2 3.39
## 4: Heavyweight 187 0 0.00
## 5: Light Heavyweight 184 4 2.17
## 6: Lightweight 243 14 5.76
## 7: Middleweight 218 10 4.59
## 8: Strawweight 39 7 17.95
## 9: Welterweight 293 12 4.10
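While some classes have no fighters over the weight limit (e.g., Heavyweight), in others almost 18% are over it (e.g., Strawweight). In every class with fighters over the limit, the heaviest recorded weight is exactly one pound above it (compare the max column of the earlier table with the limits). This could be due to rounding up during data collection, or the sheet may record the regular weight of the fighters rather than their weigh-in weight, but that’s something to look into.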
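One way to start looking into it would be to tabulate by how many pounds the fighters exceed their limit. This is only a sketch on a made-up `toy` table (the rows and the `excess` and `poundsOver` names are hypothetical); with the real data the same expression could be run on fighters.

```r
library (data.table)

# Hypothetical toy data, invented for illustration only
toy <- data.table (class = c("Strawweight", "Strawweight", "Strawweight",
                             "Flyweight", "Flyweight"),
                   weight = c(115, 116, 116, 125, 126),
                   limit = c(115, 115, 115, 125, 125))

# For fighters over the limit, count them by class and by pounds over
excess <- toy [weight > limit,
               .(fighters = .N),
               by = .(class, poundsOver = weight - limit)]
```

If nearly all the excess sits at exactly one pound, a rounding effect would look more plausible than fighters genuinely competing above their class.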
Conclusion
My observation was on the right track. Most fighters try to weigh as much as they can within the limits of their weight class. Once they have made weight at the weigh-in, they can gain a few pounds before the fight and, legally, be over the limit.
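To put a number on "most", one could compute the share of fighters recorded close to their class limit. The sketch below runs on a made-up `toy` table, and the one-pound tolerance is my own assumption, not something taken from the data.

```r
library (data.table)

# Hypothetical toy data, invented for illustration only
toy <- data.table (class = rep (c("Flyweight", "Welterweight"), each = 4),
                   weight = c(125, 125, 126, 123, 170, 171, 165, 158),
                   limit = rep (c(125, 170), each = 4))

# Share of fighters within one pound of their class limit (either side)
nearLimit <- toy [, .(fighters = .N,
                      percentNearLimit = round (mean (abs (weight - limit) <= 1) * 100, 2)),
                  by = class]
```

In this toy table, 75% of the Flyweights and 50% of the Welterweights sit within a pound of the limit.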
Extra
Just out of curiosity, let’s use the height to compute the body mass index (BMI) of each fighter and classify it according to the categories from the Wikipedia article.
fighters [, bmi := ((weight / (height^2)) * 703)]
# Bin BMI into ordered categories; intervals are closed on the left, e.g. [18.5, 25)
fighters [, bmiCat := cut (bmi,
breaks = c(-Inf, 15, 16, 18.5, 25, 30, 35, 40, 45, 50, 60, Inf),
labels = c("Very severely underweight", "Severely underweight",
"Underweight", "Normal (healthy weight)",
"Overweight", "Moderately obese", "Severely obese",
"Very severely obese", "Morbidly Obese",
"Super Obese", "Hyper Obese"),
right = FALSE,
ordered_result = TRUE)]
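As a quick sanity check of the formula, here is the calculation by hand for a single fighter. The 703 factor converts pounds per square inch into kilograms per square metre (0.45359237 / 0.0254^2 is about 703.07). The height and weight are the medians from the summary above, used only as an example.

```r
# Median height and weight from the summary, used only as an example
weight_lb <- 170
height_in <- 71

# BMI in imperial units: pounds over inches squared, times 703
bmi <- weight_lb / height_in^2 * 703
round (bmi, 1)  # 23.7
```

A value of about 23.7 falls in the "Normal (healthy weight)" band, between 18.5 and 25.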
Plot the BMI for each weight category.
ggplot (data = fighters [!is.na (fighters$height)],
aes (x = class, y = bmi, color = bmiCat)) +
theme_classic () +
geom_jitter (size = 2.5, alpha = 0.5) +
labs (title = "MMA fighters Body Mass Index by Weight Class",
x = "MMA Class", y = "BMI") +
theme (plot.title = element_text (hjust = 0.5),
axis.text.x = element_text (hjust = 1, angle = 45)) +
guides (color = guide_legend (title = "BMI Category", reverse = T))
Almost no fighter is underweight, and those who are sit just below the normal range. There is a clear tendency for the heavier classes to be overweight and even obese. After all, more pounds, more power.