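A Working Data Scientist's Miscellaneous Blog

This is a blog that a working data scientist updates in his spare time. I plan to write about a mix of topics, such as my hobby LoL and things I have been studying.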

Simple Idea Made Prediction Model Understand the Difference in Time-Dependent Features of Champions 【LoL】【ML】

My last article was somewhat abstract: it described the concept behind the "Bli2kun Project".
This time, I'm reporting on a simple idea that yielded some interesting results. I hope you will enjoy reading it more than the last one.

 

 

First of all, there are no changes to the machine learning model used, so I will not explain it in this article. You can find links to the previous articles below.

 

 

Challenge: How to Get the Model to Learn the Different Timing of Different Champions' Power Spikes

As you know, each LoL champion has different characteristics, such as different timing for power spikes and whether they are strong early on or scaling to the late game.

Let's take a look at Figure1, which shows the top 20 most important champions learned by the previous model (Bli2kun ver1). For example, Lucian is one of the strongest champions in the early game, while Camille is a strong champion who scales into the late game (as of Patch 10.19). It would not be surprising to see each of them ranked higher in the phase of the game where they are strongest.

On the other hand, Graves, the #1 ranked champion in importance, is currently a meta jungle champion. He plays well into almost any matchup, and if he farms properly, he can keep pressure on the lanes for a long time. His relatively large role on the team in any situation is likely why the model rates him so highly.


Figure1. This figure shows the top 20 most important champions that the previous model learned.

In my opinion, these characteristics of LoL should be taken into account when building a model that predicts the winning team. In this article, I address this issue.

 

Approach: Simple Idea Solved This Problem!

I took game duration into account for the model in a simple way.

I divided the training data into three types (early game, middle game, and late game) according to the length of the game. Using the three datasets, I trained three models with the same mechanism (Random Forest) to predict whether the blue or red team will win. 
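As a rough illustration of this split, here is a minimal Python sketch. The record layout (`duration_sec`, `blue_win`), the function name `split_by_duration`, and the toy records are assumptions made up for this example, not the actual Bli2kun code. The 25- and 35-minute cut-offs passed in are the ones this article settles on.

```python
def split_by_duration(matches, early_end_sec, late_start_sec):
    """Group match records into early / middle / late buckets by game length.

    `matches` is an iterable of dicts with a hypothetical `duration_sec` key.
    Games shorter than `early_end_sec` go to "early", games lasting at least
    `late_start_sec` go to "late", and everything in between is "middle".
    """
    if early_end_sec >= late_start_sec:
        raise ValueError("early_end_sec must be smaller than late_start_sec")
    buckets = {"early": [], "middle": [], "late": []}
    for match in matches:
        duration = match["duration_sec"]
        if duration < early_end_sec:
            buckets["early"].append(match)
        elif duration < late_start_sec:
            buckets["middle"].append(match)
        else:
            buckets["late"].append(match)
    return buckets


# Toy records, made up for this example.
toy_matches = [
    {"duration_sec": 1200, "blue_win": True},
    {"duration_sec": 1700, "blue_win": False},
    {"duration_sec": 2400, "blue_win": True},
    {"duration_sec": 1499, "blue_win": False},
]
toy_buckets = split_by_duration(
    toy_matches, early_end_sec=25 * 60, late_start_sec=35 * 60
)
```

Each bucket would then be used to train its own Random Forest (for example with scikit-learn's `RandomForestClassifier`), so that no model sees games from another phase.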

 

How should the games be split by length?

First, I need to decide the game-time thresholds used to split the training data. I also need to check whether the Blue and Red teams differ in the game times at which they are more likely to win.

In this work, I used Challenger-tier match data from Patch 10.19 in three regions: KR, EU, and NA. Figure2 below shows the distribution of the game duration.


Figure2. This figure shows the distribution of game duration in the KR/EU/NA regions for Challenger and Patch 10.19. The horizontal axis is in seconds.

No notable differences were found between the three regions, so I treat them together here. There is a peak around 25 minutes (about 1500 sec), and the distribution has a heavier tail on the right side (corresponding to the late game).


Figure3. This figure is a box-and-whisker plot (box plot) of the game duration of the matches won by the Blue and Red teams, respectively.

Figure3 shows no notable difference between the Blue and Red teams in terms of game duration.
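For readers who want to run the same kind of comparison on their own data, the numbers behind a box plot can be computed with Python's standard library. This is only a sketch: the `blue_win` and `duration_sec` keys, the function names, and the toy records are assumptions for illustration, not the real dataset.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot is drawn from."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def duration_summary_by_winner(matches):
    """Summarize game duration separately for Blue wins and Red wins."""
    durations = {"blue": [], "red": []}
    for match in matches:
        side = "blue" if match["blue_win"] else "red"
        durations[side].append(match["duration_sec"])
    return {side: five_number_summary(vals) for side, vals in durations.items()}


# Toy records, made up for this example (five wins per side).
toy_matches = (
    [{"blue_win": True, "duration_sec": d} for d in (1200, 1500, 1800, 2100, 2400)]
    + [{"blue_win": False, "duration_sec": d} for d in (1300, 1500, 1700, 1900, 2500)]
)
toy_summary = duration_summary_by_winner(toy_matches)
```

If the two summaries are close to each other, as the boxes in Figure3 are, there is little reason to choose different thresholds for each side.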

I computed the tertiles (the two points that split the data into three equal-sized groups) and found that they are 1497 sec (about 25 minutes) and 1758 sec (just under 30 minutes). The gap between the first and second tertiles is this narrow because of the shape of the duration distribution, which peaks around 25 minutes. Personally, I find it odd to call anything over 30 minutes the late game, so I moved the second cut-off to 35 minutes. Even 35 minutes may feel too early (to me, a long game is one that goes past 40 or 45 minutes), but I decided not to go beyond 35 minutes because I needed to keep a certain number of training samples.
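The trade-off between a natural late-game boundary and the number of samples can be sketched as follows. The durations here are made-up toy values and the function names are hypothetical. The point is only to show how moving the second cut-off later shrinks the late-game bucket.

```python
import statistics


def tertile_cut_points(durations_sec):
    """Return the two cut points that split durations into three equal-sized groups."""
    lower, upper = statistics.quantiles(durations_sec, n=3, method="inclusive")
    return lower, upper


def count_per_phase(durations_sec, early_end_sec, late_start_sec):
    """Count how many games fall into each phase for the given cut-offs."""
    early = sum(1 for d in durations_sec if d < early_end_sec)
    late = sum(1 for d in durations_sec if d >= late_start_sec)
    middle = len(durations_sec) - early - late
    return {"early": early, "middle": middle, "late": late}


# Made-up durations in seconds, for illustration only.
toy_durations = [900, 1200, 1400, 1500, 1600, 1700, 1800, 2000, 2200, 2600]

toy_lower, toy_upper = tertile_cut_points(toy_durations)
counts_tertile = count_per_phase(toy_durations, toy_lower, toy_upper)
counts_manual = count_per_phase(toy_durations, 25 * 60, 35 * 60)
```

On this toy data, the tertile cut keeps the three buckets roughly balanced, while the 25/35-minute cut leaves fewer late-game samples. That is the cost accepted above in exchange for a more natural definition of the late game.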

In the next section, I'll discuss the results. Has Bli2 improved in the end!?

 

Results: Model Captures the Difference in the Characteristics of Champions

Result of the model learning early game (~25min) data


Figure4. This figure shows the top25 most important champions up to 25 minutes. 

As I expected at the beginning of this article, Lucian is now at the top of the list. The large gap in Lucian's importance between the Blue and Red teams is probably due to LoL's BanPick (pick/ban) system: the Blue team always has the first pick, so the current OP champion is almost always taken with that pick. Lucian is therefore likely picked more often on the Blue team, which would explain the difference in importance.

Shen, ranked third, is also an interesting champion. With his ultimate, he can join fights in other lanes, which gives him an impact on the entire map after level 6. It is true that Shen does not scale well into the late game, but there's no denying that he is strong in the early and middle game. He is also unique in that he can be flexed between top and jungle (and support).

Syndra at #21 is also noteworthy. I've heard that her win rate in SoloQ has been low following repeated nerfs, but she is still picked on the pro scene when a team wants to control the lane through the early and mid game. This makes her more important here than in the later time ranges.

 

Result of the model learning middle game (25min~35min) data


Figure5. This figure shows the top25 most important champions for 25~35 minutes. 

Lucian's importance is lower than before. Meanwhile, the current meta junglers such as Lillia, Graves, and Nidalee have increased in importance. Up to 25 minutes, strong lane champions are probably more important because they can beat their lane opponents and help their jungler, but in the middle of the game, controlling objectives (Dragon, Herald, and Baron) is considered more important for winning. This is likely why the importance of junglers has increased.

The surprise was that Blue team Sett came in at #1 in importance. I was under the impression that Sett had been heavily nerfed before Patch 10.19 and had lost some of his influence on the game. But as you can see from the group stage of Worlds 2020, he is still flexed between mid, top, and support. I guess Sett is still useful in BanPick at this point.

 

Result of the model learning late game (35min~) data


Figure6. This figure shows the top25 most important champions after 35 minutes. 

Champions that scale well, such as GP (Gangplank), Orianna, Cassiopeia, and Akali, appear in the ranking.

I didn't expect Camille, whom I mentioned at the beginning of this article as an example of a scaling champion, to move down the rankings. Does this mean that Camille does not need that long a game to win?

Also, Vladimir, who is touted as the strongest late-game champion, and Senna, Kindred, and Sion, who scale infinitely, are not ranked. Surviving the early game with a late-game team composition may be a bit harder in SoloQ than on the pro scene.

 

To summarize, it seems safe to say that I succeeded in getting the prediction model to reflect how the characteristics of champions differ with the game time.

 

Finally...

I will continue to post English articles about the "Bli2kun Project" and announce updates on Twitter, so please follow my Twitter account and come back to read the next article!

twitter.com