A Working Data Scientist's Miscellaneous Blog

A blog that a working data scientist updates in his spare time. I plan to write about miscellaneous topics, from my hobby of LoL to things I've been studying.

"Bli2kun Project" Start!! Predict the Win Team from Picked Champions Using ML or DL methods.【LoL】

Hey guys, how are you enjoying Worlds 2020?
I enjoy watching the WCS games myself, of course. I only started playing LoL this spring, after the Corona outbreak, and a lot of the pro games are exciting for me. I try to share my impressions and original content on this blog.

By the way, have you ever heard of the "Bliz-kun" model that appears on the LJL game broadcasts? It's a system for predicting win rates after the draft and in the middle of a game. I haven't seen anything like this outside of the LJL and LPL. (I watched games in the LJL, LCK, LPL, and LEC regions this summer.) I like the system as an element of spectating, and I'm technically interested in it because it's close to my own field.

As you know, no such win-rate prediction system has appeared at the WCS. So I'm going to try to mimic this system myself, which is the main purpose of this article. I decided to call this challenge the "Bli2kun Project".

In this article, I'm going to write about the ideas I tried and the interesting results I got, rather than the program code and other details.

 

 Overview of "Bli2kun Project" Version 1

Collecting Data

For training the machine learning models, I prepared data for about 1,200 SoloQ matches from KR Challenger using the "Riot Developer API" [link] and split the data 80% for training and 20% for testing. The games are limited to patch 10.19, the patch used at the WCS.

When I first pulled the data through the API, the number of samples was much larger, but it shrank after formatting the data and handling the missing values.
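As a rough illustration, here is a minimal sketch of the filtering and splitting step, assuming the matches pulled from the Riot Developer API have already been flattened into a pandas DataFrame with hypothetical columns gameVersion, blue_champs, red_champs, and blue_win (the actual columns and preprocessing in my pipeline differ in detail).

```python
# Minimal sketch of the filtering / splitting step, not the exact pipeline.
# Assumes `matches` is a pandas DataFrame built from Riot Developer API responses,
# with hypothetical columns: gameVersion, blue_champs, red_champs, blue_win.
import pandas as pd
from sklearn.model_selection import train_test_split

matches = pd.read_pickle("kr_challenger_matches.pkl")  # hypothetical dump of the API results

# Keep only games played on patch 10.19 (gameVersion looks like "10.19.xxx.xxxx").
patch_1019 = matches[matches["gameVersion"].str.startswith("10.19")]

# Drop games with missing picks or results.
patch_1019 = patch_1019.dropna(subset=["blue_champs", "red_champs", "blue_win"])

# 80% of the games for training, 20% for testing.
train_df, test_df = train_test_split(patch_1019, test_size=0.2, random_state=42)
print(len(train_df), len(test_df))
```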

Machine Learning Model which I Used

I used Random Forest[link] and LightGBM[link]. Both of them are based on Decision Trees[Wiki]. RF uses the Bagging[Wiki] technique and LGBM uses Boosting[Wiki].

One of the reasons I chose these models is that they expose feature importances. I guess it should be possible to infer the meta champions of each patch from those importances.
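For reference, here is a minimal training sketch with scikit-learn's RandomForestClassifier and LightGBM's LGBMClassifier; X_train and y_train stand for the one-hot encoded picks described in the next section and the win labels, and the hyperparameters are illustrative rather than the exact ones I used.

```python
# Minimal training sketch; X_train = one-hot encoded picks (see next section),
# y_train = 1 if the Blue team won, else 0. Hyperparameters are illustrative.
from sklearn.ensemble import RandomForestClassifier
from lightgbm import LGBMClassifier

rf = RandomForestClassifier(n_estimators=500, random_state=42)  # bagging of decision trees
rf.fit(X_train, y_train)

lgbm = LGBMClassifier(n_estimators=500, random_state=42)        # boosting of decision trees
lgbm.fit(X_train, y_train)

# Both models expose per-feature importances, which can be mapped back to champions.
print(rf.feature_importances_.shape, lgbm.feature_importances_.shape)
```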

Input Shape for the Model: One-hot Encoding

The picks of the 5 champions chosen by the Blue team and the 5 chosen by the Red team are combined into the model's input.

First, we create a zero vector of 151 dimensions (i.e. the number of champions as of patch 10.19). Next, we set the entries corresponding to the 5 picked champions from 0 to 1. In technical terms, this representation is called "one-hot encoding".

 

f:id:YNicki:20201031150119p:plain

This figure illustrates the conversion of the picked champions into the model's input by one-hot encoding.
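A minimal sketch of this encoding, assuming a champion-name-to-index mapping (which can be built from Riot's Data Dragon champion list for the patch) and assuming the Blue and Red vectors are simply concatenated into one input; the helper names here are my own.

```python
import numpy as np

def build_champ_index(champion_names):
    """Map each champion name (151 champions on patch 10.19) to a fixed index."""
    return {name: i for i, name in enumerate(sorted(champion_names))}

def encode_team(picks, champ_to_idx):
    """One-hot encode a team's 5 picks into a 151-dimensional vector."""
    vec = np.zeros(len(champ_to_idx), dtype=np.float32)
    for champ in picks:
        vec[champ_to_idx[champ]] = 1.0
    return vec

def encode_match(blue_picks, red_picks, champ_to_idx):
    """Concatenate the Blue and Red vectors into one model input (2 x 151 dims)."""
    return np.concatenate([encode_team(blue_picks, champ_to_idx),
                           encode_team(red_picks, champ_to_idx)])

# e.g. X_train = np.stack([encode_match(b, r, champ_to_idx)
#                          for b, r in zip(train_df["blue_champs"], train_df["red_champs"])])
```

Concatenating the two 151-dimensional vectors is just one way to combine the Blue and Red sides; the sketch assumes that layout.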

Problem Setting

This is a binary classification problem: either the Blue team wins or the Red team wins.
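In code, the target is just a 0/1 label per game; continuing the earlier sketch (blue_win is my placeholder column name):

```python
# Binary target: 1 if the Blue team won the game, 0 if the Red team won.
y_train = train_df["blue_win"].astype(int).to_numpy()
y_test = test_df["blue_win"].astype(int).to_numpy()
```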

Motivation of Version 1

I'm sure many of you are uncomfortable with this challenge.
This is because LoL is a very complex game, and even if the same 10 players made the same pick, the result would not be the same. Trying to predict the winner of such a game based only on the picked champions is ridiculous.
However, let me remind you that the purpose of this project is not to build a model that accurately predicts the odds of winning a game, but to make it an enjoyable part of the game-viewing experience. In other words, no matter how accurately a model predicts the winning team, if its output says that one team will win 100% of the time, you will lose interest in watching the game. *1 Rather, the ideal is a model whose output keeps the two win rates close to each other (so that it feels like roughly 50/50), even if it is somewhat inaccurate.
For the above reasons, I think it's worth a try, even if the model isn't expected to train well.

Future Works

  • Loss of information: the encoding fails to represent the role (lane) of each picked champion.
  • A structure that can represent relationships between champions (synergy with allies, matchups against lane opponents).
  • The sparsity problem caused by one-hot encoding.
  • Search for other information useful for this task. 

 

 

Test & Trials

On the test data, the accuracy of Random Forest is 0.46 and that of LightGBM is 0.48. Even completely random prediction would give about 0.5.

Looking at this, I honestly don't think it's useful, but on the other hand, the fact that it's not exactly 0.5 means the prediction isn't purely random; it may be based on some rule that Bli2kun learned. I'll pick some actual WCS group-stage matches and try it out.
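For reference, the accuracy numbers above are the usual comparison of predicted and true labels on the held-out 20%, along these lines (continuing the earlier sketches):

```python
from sklearn.metrics import accuracy_score

# Evaluate both models on the held-out 20% test split.
rf_acc = accuracy_score(y_test, rf.predict(X_test))
lgbm_acc = accuracy_score(y_test, lgbm.predict(X_test))
print(f"Random Forest: {rf_acc:.2f}, LightGBM: {lgbm_acc:.2f}")
```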

Trials Using the Matches in WCS Group Stage

The team on the left is the Blue team and the team on the right is the Red team. I chose these three matches myself, so this alone can't evaluate the model. Still, Random Forest's behavior looks somewhat usable, while LightGBM seems difficult to use.

Also, I think it would help to look at this section's win-rate predictions again after checking the "champion importance" the models learned, which is presented in the next section. If you have the time, please read this section again after reading the next one.
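For the percentages below, each WCS draft is encoded the same way as the training data and passed to predict_proba. Here is a sketch for the first match, reusing the encode_match helper from earlier (champion names must match the spelling used in the index, e.g. "Gangplank" rather than "GP"; the printed numbers are whatever the trained model outputs, not hard-coded values):

```python
blue = ["Gangplank", "Graves", "Akali", "Ashe", "Leona"]  # Suning
red = ["Sion", "Lillia", "Syndra", "Senna", "Sett"]       # G2

x = encode_match(blue, red, champ_to_idx).reshape(1, -1)

# predict_proba's column order follows model.classes_, so look up the column for label 1 (Blue win).
blue_col = list(rf.classes_).index(1)
blue_win_rate = rf.predict_proba(x)[0, blue_col]
print(f"Predicted win rate - Blue (Suning): {blue_win_rate:.0%}, Red (G2): {1 - blue_win_rate:.0%}")
```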

Group Stage Day 5 Tiebreaker: Suning vs G2 (Suning win)
  • Suning: GP, Graves, Akali, Ashe, Leona
  • G2: Sion, Lillia, Syndra, Senna, Sett
  • Random Forest: 60% vs 40%
  • LightGBM: 94% vs 6%
  • Both predictions are spot on, but Suning's predicted win rate is too high. Not good.
Group Stage Day 6: JDG vs DWG (JDG win)
  • JDG: Camille, Lillia, TF, MF, Bard
  • DWG: Sett, Graves, Galio, Kalista, Alistar
  • Random Forest: 56% vs 44%
  • LightGBM: 70% vs 30%
  • The fact that JDG took the meta top and support picks seems to be valued.
Group Stage Day 8: DRX vs TES (TES win)
  • DRX: Ornn, Kindred, Galio, Caitlyn, Bard
  • TES: Vlad, Graves, Orianna, Ashe, Lulu
  • Random Forest: 50.2% vs 49.8%
  • LightGBM: 85% vs 15%
  • I was personally interested in this draft: TES had Vlad, the strongest late-game scaler, and picked the meta Graves & Ashe, while DRX's composition looks good with its tank and engage champions. Caitlyn doesn't look weak either, even though she has fallen out of the meta, and the scaling of Ornn and Kindred isn't bad. Both models predicted a DRX win and missed. LightGBM's miss is especially big, which makes it difficult to apply.

 

Appendix: Champion Importance Which Bli2kun Learned

f:id:YNicki:20201015025558p:plain

Top 20 feature importances of Random Forest.

f:id:YNicki:20201015025638p:plain

Top 20 feature importances of LightGBM.
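Charts like the two above can be produced from the models' feature_importances_; here is a minimal sketch for the Random Forest, assuming the concatenated Blue/Red encoding from earlier so that each feature maps back to a champion on one side.

```python
import numpy as np
import matplotlib.pyplot as plt

# Feature names mirror the concatenated encoding: Blue-side champions first, then Red-side.
names = sorted(champ_to_idx, key=champ_to_idx.get)
feature_names = [f"Blue_{n}" for n in names] + [f"Red_{n}" for n in names]

top = np.argsort(rf.feature_importances_)[::-1][:20]  # indices of the 20 largest importances
plt.barh([feature_names[i] for i in top][::-1], rf.feature_importances_[top][::-1])
plt.title("Top 20 feature importances (Random Forest)")
plt.tight_layout()
plt.show()
```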

I was surprised to see Xin Zhao show up in the Random Forest rankings. This is because of the buffs he received in the previous patch, 10.18.

It's also striking to see Ezreal in the rankings; I personally thought he had been out of the meta for a while. In fact, I've seen Ezreal picked at the WCS, so I guess he's coming back as an ADC with a built-in blink.

 

Finally...

 

This article is an English translation of a Japanese article. 
In the Japanese version, articles explaining more advanced models have already been published. I plan to translate them into English in the future.

*1: This part could perhaps be handled with some ingenuity outside the model itself.