The Library is Closed

With the finale aired and a new Superstar crowned, I take some time to see how the algorithms perform overall, comparing predictions across all eight seasons. If you missed my earlier blog posts, see the first to understand what’s going on in the rest of this post.

The finale aired a couple weeks ago and we got our new Drag Race Superstar: Bob the Drag Queen!

The finale brought many great moments, including Carol Channing raving over Bob’s Snatch Game performance, Nancy Grace forcefully claiming Acid Betty was robbed in that same challenge, and Margaret Cho wishing Kim Chi luck. Speaking of Kim Chi, each of the top three performed a choreographed lip-sync to a song written specifically for them by Lucian Piane, and Kim Chi slayed despite not being able to dance.



Also, Violet Chachki reminded us why she deserved to win last season with her high fash-on!


Final Predictions

So how did the algorithms do overall? The table below contains the final predictions. Overall, not too bad. Kim Chi and Bob were both predicted to come out on top, with only one algorithm each predicting second instead of first. The best-performing algorithm was the Gaussian Naive Bayes, with a rank score of 0.932!


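Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Bob the Drag Queen | 1 | 8 | 1 | 1 | 1 | 2 | 1
Kim Chi | 2 | 8 | 1 | 1 | 2 | 1 | 1
Naomi Smalls | 3 | 8 | 1 | 3 | 7 | 6 | 3
Chi Chi DeVayne | 4 | 8 | 5 | 4 | 3 | 4 | 5
Derrick Barry | 5 | 8 | 4 | 4 | 4 | 3 | 4
Thorgy Thor | 6 | 8 | 7 | 4 | 6 | 7 | 9
Robbie Turner | 7 | 8 | 5 | 4 | 4 | 5 | 5
Acid Betty | 8 | 8 | 8 | 9 | 9 | 8 | 9
Naysha Lopez | 9 | 8 | 8 | 9 | 8 | 10 | 7
Cynthia Lee Fontaine | 10 | 8 | 8 | 11 | 12 | 11 | 11
Dax ExclamationPoint | 11 | 8 | 12 | 12 | 10 | 12 | 11
Laila McQueen | 11 | 8 | 11 | 8 | 10 | 9 | 7
Rank Score | | | 0.932 | 0.898 | 0.868 | 0.898 | 0.857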

Predicting Other Seasons

Throughout the season I had been training the algorithms on data from seasons 1-6, testing on season 7, then predicting season 8. But I was curious how each algorithm would do at predicting all the other seasons. So I wrote a script that steps through each algorithm and season, trains the algorithm on all the seasons other than the one it is predicting, and uses the result to predict the held-out season. So for season 1, all the algorithms have been trained on seasons 2-8 and then asked to predict season 1. I generated the following figure to plot the rank scores for each season for each algorithm.
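The hold-one-season-out loop described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the script from the post: the row layout, the helper names (`leave_one_season_out`, `fit_baseline`, `predict_by_wins`, `exact_match_rate`), the toy data, and the placeholder metric are all assumptions made for the example. The real rank score is defined in an earlier post, and the real models are the five algorithms discussed here.

```python
from collections import defaultdict


def leave_one_season_out(rows, fit, predict, score):
    """For each season, train on every other season and score the held-out one.

    rows: list of dicts with at least a "season" key (hypothetical layout).
    fit(train_rows) -> model
    predict(model, test_rows) -> list of predicted places
    score(test_rows, predictions) -> float
    """
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row)

    results = {}
    for held_out in sorted(by_season):
        # Training set: every season except the one being predicted.
        train = [r for s, group in by_season.items() if s != held_out for r in group]
        test = by_season[held_out]
        model = fit(train)
        results[held_out] = score(test, predict(model, test))
    return results


# Toy stand-ins, purely illustrative (not the post's data or models).
toy_rows = [
    {"season": 1, "queen": "A", "wins": 3, "place": 1},
    {"season": 1, "queen": "B", "wins": 1, "place": 2},
    {"season": 2, "queen": "C", "wins": 0, "place": 2},
    {"season": 2, "queen": "D", "wins": 2, "place": 1},
    {"season": 3, "queen": "E", "wins": 1, "place": 1},
    {"season": 3, "queen": "F", "wins": 2, "place": 2},
]


def fit_baseline(train_rows):
    # A real model would learn from train_rows; this baseline only records their count.
    return {"n_train": len(train_rows)}


def predict_by_wins(model, test_rows):
    # Rank the held-out queens by challenge wins, most wins first.
    order = sorted(range(len(test_rows)), key=lambda i: -test_rows[i]["wins"])
    places = [0] * len(test_rows)
    for rank, i in enumerate(order, start=1):
        places[i] = rank
    return places


def exact_match_rate(test_rows, predictions):
    # Placeholder metric standing in for the rank score from the earlier post.
    hits = sum(r["place"] == p for r, p in zip(test_rows, predictions))
    return hits / len(test_rows)


scores = leave_one_season_out(toy_rows, fit_baseline, predict_by_wins, exact_match_rate)
print(scores)
```

Passing `fit`, `predict`, and `score` in as functions keeps the season loop identical for every algorithm, so comparing five models only means calling the same function five times.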


There are a few interesting things going on here. Season 4 was especially tough to predict for Gaussian Naive Bayes and for the Random Forest Classifier. Season 7 was also a challenge for these algorithms. Support Vector Machines and Random Forest Regressor are both consistently good at predicting seasons, and Neural Network gets good past season 3. So what did each algorithm predict would happen in each season? Let’s find out.

Season 1

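Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
BeBe Zahara Benet | 1 | 1 | 1 | 1 | 1 | 2 | 1
Nina Flowers | 2 | 1 | 3 | 3 | 4 | 4 | 2
Rebecca Glasscock | 3 | 1 | 4 | 4 | 3 | 3 | 4
Tammie Brown | 8 | 1 | 6 | 9 | 8 | 8 | 8
Victoria "Porkchop" Parker | 9 | 1 | 9 | 8 | 9 | 9 | 9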

The algorithms all seemed to agree that BeBe Zahara Benet would come out on top, but they also predicted that Ongina would join her there. Ongina’s elimination in season 1 was perhaps one of the hardest for RuPaul in all eight seasons: she had to excuse herself from the set before she could decide who to send home.


Season 2

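Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Tyra Sanchez | 1 | 2 | 1 | 1 | 2 | 1 | 1
Pandora Boxx | 5 | 2 | 3 | 4 | 1 | 8 | 5
Jessica Wild | 6 | 2 | 2 | 7 | 2 | 4 | 2
Sahara Davenport | 7 | 2 | 3 | 2 | 7 | 7 | 7
Morgan McMichaels | 8 | 2 | 3 | 11 | 9 | 6 | 8
Nicole Paige Brooks | 11 | 2 | 12 | 11 | 12 | 12 | 12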

For season 2, the algorithms were pretty much in agreement about who would win. No other queen comes close to Tyra Sanchez’s average ranking.


Season 3

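Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Manila Luzon | 2 | 3 | 1 | 1 | 1 | 1 | 1
Alexis Mateo | 3 | 3 | 4 | 5 | 3 | 3 | 3
Yara Sofia | 4 | 3 | 4 | 2 | 4 | 5 | 5
Carmen Carrera | 5 | 3 | 6 | 7 | 6 | 7 | 7
Delta Work | 7 | 3 | 8 | 9 | 7 | 10 | 8
Stacy Layne Matthews | 8 | 3 | 8 | 2 | 7 | 6 | 8
India Ferrah | 10 | 3 | 11 | 10 | 13 | 13 | 10
Mimi Imfurst | 11 | 3 | 8 | 10 | 9 | 8 | 11
Venus D-Lite | 13 | 3 | 11 | 10 | 11 | 11 | 11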

What’s really interesting in season 3 is that all five algorithms agreed that Manila Luzon would be taking home the crown, but she placed second after Raja. By the end of the season, Manila’s and Raja’s win/high/low/lipsync profiles were identical, so the three algorithms not placing Raja first must be dinging her on her age (Raja was 36 during the season, while Manila was 28). I’d also like to point out that a favorite among my friends, Stacy Layne Matthews, was predicted to come in second by the Neural Network.


Season 4

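Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Sharon Needles | 1 | 4 | 1 | 1 | 2 | 1 | 1
Chad Michaels | 2 | 4 | 1 | 2 | 1 | 2 | 3
Phi Phi O'Hara | 3 | 4 | 1 | 4 | 2 | 3 | 2
Latrice Royale | 4 | 4 | 9 | 4 | 4 | 4 | 3
Kenya Michaels | 5 | 4 | 4 | 3 | 13 | 9 | 5
DiDa Ritz | 6 | 4 | 12 | 8 | 8 | 10 | 5
Jiggly Caliente | 8 | 4 | 10 | 4 | 5 | 8 | 5
Madame LaQueer | 10 | 4 | 11 | 8 | 8 | 6 | 9
The Princess | 11 | 4 | 5 | 10 | 8 | 11 | 13
Lashauwn Beyond | 12 | 4 | 5 | 12 | 8 | 12 | 9
Alisa Summers | 13 | 4 | 12 | 12 | 8 | 13 | 9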

The algorithms in season 4 had a consensus that Sharon Needles would be taking home the crown, with Chad Michaels a close second (Chad won the first season of Drag Race All Stars shortly after season 4). Interestingly, even though Willam was eliminated at seventh place for violating the rules (her husband was making unauthorized conjugal visits to the hotel they were sequestered in), and so presumably would have made it further in the season, the algorithms predicted she would place approximately where she actually placed.


Season 5

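Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Jinkx Monsoon | 1 | 5 | 1 | 1 | 1 | 3 | 1
Roxxxy Andrews | 3 | 5 | 1 | 5 | 2 | 2 | 1
Coco Montrese | 5 | 5 | 8 | 7 | 8 | 8 | 5
Alyssa Edwards | 6 | 5 | 7 | 3 | 7 | 7 | 6
Ivy Winters | 7 | 5 | 1 | 5 | 2 | 5 | 3
Jade Jolie | 8 | 5 | 14 | 9 | 8 | 10 | 8
Lineysha Sparx | 9 | 5 | 5 | 8 | 6 | 6 | 10
Honey Mahogany | 10 | 5 | 10 | 14 | 14 | 13 | 11
Vivienne Pinay | 11 | 5 | 10 | 9 | 8 | 9 | 8
Monica Beverly Hillz | 12 | 5 | 10 | 11 | 12 | 11 | 11
Serena ChaCha | 13 | 5 | 10 | 13 | 12 | 12 | 11
Penny Tration | 14 | 5 | 9 | 11 | 8 | 14 | 11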

Season 5 had the algorithms placing the top three in the right order, on average. What seems to have dragged the rank scores down for this season is how far Honey Mahogany made it – she was predicted to have gone home first by most of the algorithms. Coco Montrese seems to have made it further than predicted, possibly because the drama between her and Alyssa Edwards made such good TV. (Also, Coco’s best moments were when she lip-synced.)


Season 6

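Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Bianca Del Rio | 1 | 6 | 1 | 1 | 1 | 1 | 1
Adore Delano | 2 | 6 | 1 | 1 | 3 | 2 | 3
Courtney Act | 3 | 6 | 1 | 3 | 1 | 3 | 1
Darienne Lake | 4 | 6 | 7 | 4 | 4 | 5 | 4
Joslyn Fox | 6 | 6 | 7 | 4 | 5 | 6 | 7
Trinity K. Bonet | 7 | 6 | 9 | 7 | 5 | 8 | 8
Laganja Estranja | 8 | 6 | 5 | 9 | 8 | 7 | 9
Gia Gunn | 10 | 6 | 12 | 9 | 14 | 13 | 9
April Carrion | 11 | 6 | 5 | 9 | 9 | 10 | 9
Magnolia Crawford | 13 | 6 | 12 | 13 | 11 | 14 | 12
Kelly Mantle | 14 | 6 | 14 | 14 | 11 | 11 | 12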

Season 6 was easy to predict. It was pretty obvious from the first time Bianca Del Rio walked into the workroom that she would be taking home the crown. Each episode after merely confirmed the inevitable, and the algorithms agree. All five predicted Bianca to come in first.


Season 7

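Queen | Place | Season | Alg. 1 | Alg. 2 | Alg. 3 | Alg. 4 | Alg. 5
Violet Chachki | 1 | 7 | 1 | 1 | 3 | 1 | 1
Ginger Minj | 2 | 7 | 8 | 1 | 1 | 2 | 2
Kennedy Davenport | 4 | 7 | 1 | 4 | 3 | 5 | 2
Trixie Mattel | 6 | 7 | 8 | 7 | 8 | 7 | 6
Miss Fame | 7 | 7 | 6 | 8 | 6 | 10 | 5
Jaidynn Diore Fierce | 8 | 7 | 8 | 6 | 8 | 9 | 6
Kandy Ho | 10 | 7 | 6 | 10 | 11 | 8 | 10
Mrs. Kasha Davis | 11 | 7 | 13 | 13 | 8 | 13 | 10
Jasmine Masters | 12 | 7 | 11 | 10 | 13 | 11 | 10
Sasha Belle | 13 | 7 | 11 | 10 | 11 | 12 | 10
Tempest DuJour | 14 | 7 | 13 | 14 | 13 | 14 | 10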

The algorithms had a difficult time deciding who would be in the top three in season 7. Max was predicted to be there by three algorithms, despite going home relatively early in the season. Four algorithms predicted Katya making it to the top three, while only two predicted Pearl getting so high. This actually tracks pretty well with what people were expecting early on in the season – Pearl was going to go home early, Max would make it pretty far, and Katya was going to make it to the top.



Overall, Support Vector Machine has the highest average rank score across all eight seasons, followed by Neural Network, Random Forest Regressor, and Random Forest Classifier, with Gaussian Naive Bayes coming in last. Gaussian Naive Bayes’s strong showing in season 8 was largely a fluke compared to her performance in the rest of the seasons.

This was a fun exercise for me. I learned more about how each of the machine learning algorithms works, I practiced using Python for data analysis, and I got to use a whole bunch of gifs of drag queens. You can find the code I used for both the weekly predictions and the predictions in this blog post on my GitHub.