Data on players in Pro Ladder is released on playgwent.com, but it is limited.
You get the rank, the score, the country of origin and the number of matches played. With some fairly basic data
analysis tricks there must be more we can do with these data! Using Python we’ll scrape data for the top
Gwent players on Pro Ladder
and calculate additional statistics about the current season, the popularity of the game in different countries, players’
efficiency, players’ national rank, …
A Jupyter notebook with all code can be found on GitHub, which you can explore
through Binder without installing
anything. For those who want to check their own national rank or ladder efficiency index (and don’t care about the
code), download links for the full tables discussed here (in Excel format) are available here:
- Player Statistics : Player data for each season, with ladder efficiency and national rank added.
- Player Summaries : Summary data for each player that made an appearance in pro rank. Includes number of appearances on
the leaderboards, min and max MMR, best rank, best national rank, …
- Seasonal summary : Number of games played each season in Masters 2, minimum and maximum MMR as well as top 500, 200 and 64 cutoffs.
- National Statistics : Data per country, number of pro players per million inhabitants, …
Update 04/08/2021: Downloads are now linked to the Gwent Pro Rank Data page which will be updated more frequently.
Update 12/03/2021: All files were updated and now contain Season of the Wolf and Season of Love from Masters 3 (2021).
Update 03/09/2020: Credit where credit is due! After putting this blog post up I found two articles by Lerio2 that
predate mine where he did the same analysis to check the popularity and rank countries (based on teams of 4 players).
Though I did my analysis independently, he had the idea several months earlier and deserves full credit for that!
You can read his articles, called Nations of Gwent, here and here.
Getting the Data
Python has two powerful packages to scrape data from the web: the requests library to download data and BeautifulSoup
to parse the HTML that comes back and extract information. The tabular data from playgwent.com
is pretty straightforward to parse. There is the rank, the player’s handle, the number of matches played and their score
(which is called the Matchmaking Rating or MMR).
Furthermore, there is a flag icon indicating the country the player is
from. These icons have a class that contains the two letter code, which follows the official ISO 3166 international
standard.
class="flag-icon flag-icon-pl">
The two-letter code can easily be extracted from the HTML tag, while converting it to a human-readable name can be
done in a few lines of code using the Python library pycountry. As shown in the
stub below, you can provide it with a two-letter code (pl in the example below) and it will return all other names, including the
common name (Poland here). So after scraping the data, the pycountry library was used to get proper names for all
countries.
import pycountry
pycountry.countries.get(alpha_2='pl')
# Output:
# Country(alpha_2='PL', alpha_3='POL', name='Poland', numeric='616', official_name='Republic of Poland')
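Extracting the two-letter code from the flag icon's class could look roughly like the sketch below. It uses only the standard library; the helper name `extract_country_code` and the regular expression are illustrative assumptions, not the code from the notebook.

```python
import re
from typing import Optional

# Hypothetical helper (not from the original notebook): pull the two-letter
# ISO 3166 code out of a class attribute such as "flag-icon flag-icon-pl".
FLAG_CLASS = re.compile(r"\bflag-icon-([a-z]{2})\b")


def extract_country_code(class_attr: str) -> Optional[str]:
    """Return the upper-cased country code, or None if no flag class is present."""
    match = FLAG_CLASS.search(class_attr)
    return match.group(1).upper() if match else None


code = extract_country_code("flag-icon flag-icon-pl")
print(code)  # PL
```

The resulting code can then be handed to `pycountry.countries.get(alpha_2=...)` as in the stub above.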
While reading the data we’ll also keep track of which players were in the top 500 the season before (note that this
does require all seasons to be loaded and in order). So we end up with a table (called full_df in the code) that looks
like this:
rank | name | country | matches | mmr | season | previous_top500 |
---|---|---|---|---|---|---|
1 | kolemoen | Germany | 431 | 10484 | M2_01 Wolf 2020 | no |
2 | kams134 | Poland | 923 | 10477 | M2_01 Wolf 2020 | no |
3 | TailBot | Poland | 538 | 10472 | M2_01 Wolf 2020 | no |
4 | Pajabol | Poland | 820 | 10471 | M2_01 Wolf 2020 | no |
5 | Adzikov | Poland | 1105 | 10442 | M2_01 Wolf 2020 | no |
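One way the previous_top500 column could be filled in is sketched below on toy data. The player names, season labels and variable names are made up for illustration, and the loop assumes the seasons are processed in chronological order; the notebook may do this differently.

```python
import pandas as pd

# Toy data (hypothetical players), two seasons listed in chronological order.
toy_df = pd.DataFrame({
    "name":   ["a", "b", "c", "a", "c", "d"],
    "rank":   [1, 2, 600, 3, 1, 2],
    "season": ["S1", "S1", "S1", "S2", "S2", "S2"],
})

seasons_in_order = ["S1", "S2"]  # assumption: already sorted chronologically
top500_last_season = set()       # empty for the very first season
parts = []

for season in seasons_in_order:
    part = toy_df[toy_df["season"] == season].copy()
    # Flag players that were in the top 500 of the season processed just before.
    was_top500 = part["name"].isin(top500_last_season)
    part["previous_top500"] = was_top500.map({True: "yes", False: "no"})
    parts.append(part)
    # Remember this season's top 500 for the next iteration.
    top500_last_season = set(part.loc[part["rank"] <= 500, "name"])

result = pd.concat(parts, ignore_index=True)
print(result)
```

Only player a is flagged in S2 here: player c finished S1 at rank 600 and player d did not appear in S1 at all.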
Adding National Rank and Efficiency Statistics
The rank on playgwent.com is the global rank. Adding a national rank can be done in a single line of code: the
groupby
function in combination with the rank
function does exactly what we want here.
full_df['national_rank'] = full_df.groupby(['country','season'])["mmr"].rank(method="first", ascending=False)
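To see what this line does, here is a small self-contained sketch on toy data. The MMR values are borrowed from the table above, but the frame itself is an illustrative stand-in for full_df.

```python
import pandas as pd

# Toy stand-in for full_df with one season and two countries.
toy_df = pd.DataFrame({
    "name":    ["kolemoen", "kams134", "TailBot", "Pajabol"],
    "country": ["Germany", "Poland", "Poland", "Poland"],
    "season":  ["S1", "S1", "S1", "S1"],
    "mmr":     [10484, 10477, 10472, 10471],
})

# Rank players by MMR within each (country, season) group; highest MMR gets rank 1.
toy_df["national_rank"] = (
    toy_df.groupby(["country", "season"])["mmr"]
    .rank(method="first", ascending=False)
)
print(toy_df)
```

The ranks come back as floats (1.0, 2.0, ...), which is why the national_rank column in the tables shows a decimal point.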
In Gwent you need to play at least 25 games with four out of six factions.
This will give you a base score, MMR, of 9600. Winning a game increases the MMR, depending on the current rank of your
opponent (usually about 7 points are gained) and losing costs you MMR points. The highest reached MMR per faction is
summed up to get the final score. So with a higher win-rate, better scores can be obtained with fewer games. To find out
which players are more efficient at climbing (and arguably better at the game than others at the same MMR), we take
the MMR, subtract the base value (9600) and divide by the number of matches. However,
increasing the MMR becomes progressively more difficult because players face better opponents as they climb the
ladder. Lerio2 from Team Legacy therefore proposed dividing by the square root of the number of matches instead. Their metric, the
Ladder Efficiency Index or LEI, is
calculated here as well.
full_df['efficiency'] = ((full_df['mmr']-9600))/full_df['matches']
full_df['lei'] = ((full_df['mmr']-9600))/np.sqrt(full_df['matches'])
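As a quick sanity check, both formulas can be worked through by hand for a single player. The sketch below uses the MMR and match count of the first row of the table (kolemoen, 10484 MMR in 431 matches); the variable names are only illustrative.

```python
import math

BASE_MMR = 9600  # base score mentioned in the text

mmr = 10484
matches = 431

gained = mmr - BASE_MMR               # 884 points above the base score
efficiency = gained / matches         # points gained per match
lei = gained / math.sqrt(matches)     # Ladder Efficiency Index

print(round(efficiency, 4), round(lei, 2))  # 2.051 42.58
```

Compare kams134 with almost the same MMR (10477) but 923 matches: the linear efficiency drops by more than half (0.95), while the LEI drops less steeply (28.87) because of the square root.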
Now our full dataframe has two additional columns: one with the simple linear efficiency and one with Team Legacy’s
Ladder Efficiency Index.
rank | name | country | matches | mmr | season | previous_top500 | national_rank | efficiency | lei |
---|---|---|---|---|---|---|---|---|---|
1 | kolemoen | Germany | 431 | 10484 | M2_01 Wolf 2020 | no | 1.0 | 2.051044 | 42.580782 |
2 | kams134 | Poland | 923 | 10477 | M2_01 Wolf 2020 | no | 1.0 | 0.950163 | 28.866807 |
3 | TailBot | Poland | 538 | 10472 | M2_01 Wolf 2020 | no | 2.0 | 1.620818 | 37.594590 |
4 | Pajabol | Poland | 820 | 10471 | M2_01 Wolf 2020 | no | 3.0 | 1.062195 | 30.416639 |
5 | Adzikov | Poland | 1105 | 10442 | M2_01 Wolf 2020 | no | 4.0 | 0.761991 | 25.329753 |
You can download the full table here.
Season Summary
About every month there is a new season in Gwent. Using the groupby
function we can very quickly create a summary of
how many games were played by the pro-ranked players (do note that only the 2860 best players are listed on the website).
We’ll also add the cutoff values for rank 500, 200 and 64, as these are important thresholds for competitive players.
Here the aggregate function agg
is used in combination with NamedAgg to calculate all statistics in one go.
per_season_df = full_df.groupby(['season']).agg(
    min_mmr = pd.NamedAgg('mmr', 'min'),
    max_mmr = pd.NamedAgg('mmr', 'max'),
    num_matches = pd.NamedAgg('matches', 'sum')
).reset_index()
top500_cutoffs = full_df[full_df['rank'] == 500][['season', 'mmr']].rename(columns={'mmr': 'top500_cutoff'})
top200_cutoffs = full_df[full_df['rank'] == 200][['season', 'mmr']].rename(columns={'mmr': 'top200_cutoff'})
top64_cutoffs = full_df[full_df['rank'] == 64][['season', 'mmr']].rename(columns={'mmr': 'top64_cutoff'})
per_season_df = pd.merge(per_season_df, top500_cutoffs, on='season')
per_season_df = pd.merge(per_season_df, top200_cutoffs, on='season')
per_season_df = pd.merge(per_season_df, top64_cutoffs, on='season')
per_season_df
The full output is shown below:
season | min_mmr | max_mmr | num_matches | top500_cutoff | top200_cutoff | top64_cutoff |
---|---|---|---|---|---|---|
M2_01 Wolf 2020 | 2407 | 10484 | 699496 | 9749 | 9872 | 10061 |
M2_02 Love 2020 | 7776 | 10537 | 769358 | 9832 | 9952 | 10117 |
M2_03 Bear 2020 | 9427 | 10669 | 862678 | 9867 | 9995 | 10204 |
M2_04 Elf 2020 | 9666 | 10751 | 1004830 | 9952 | 10087 | 10293 |
M2_05 Viper 2020 | 9635 | 10622 | 859640 | 9910 | 10028 | 10255 |
M2_06 Magic 2020 | 9624 | 10597 | 793401 | 9896 | 10002 | 10191 |
M2_07 Griffin 2020 | 9698 | 10667 | 996742 | 9978 | 10100 | 10289 |
M2_08 Draconid 2020 | 9666 | 10546 | 838212 | 9946 | 10061 | 10246 |
The number of matches played by the top players is an indication of how many people are playing the game, as more
active players would require more games to be played to climb the pro ladder. You can see that the popularity peaked during
the Season of the Elves. During this season some new leader abilities
were also introduced, so the fresh content could also have led players to return to the game. A similar increase in matches can be
seen in the Season of the Griffin with the release of new cards through the Master Mirror expansion. So it seems that
new content is a good incentive for players to play more, and sparks fiercer competition.
You can download the full table here.
Where is Gwent Being Played
Using the groupby
function in combination with agg
we can very quickly count how many pro players there are
per country. We can then combine this with the population size of each country (a somewhat up-to-date list can be found
here). By dividing the number
of players in pro-ladder by the number of inhabitants (in millions) we get the number of pro players per million inhabitants.
season | country | total_matches | num_players | pro_players_per_million | matches_per_player |
---|---|---|---|---|---|
M2_08 Draconid 2020 | Poland | 72225 | 267 | 7.047129 | 270.505618 |
M2_08 Draconid 2020 | Estonia | 1726 | 7 | 5.280436 | 246.571429 |
M2_08 Draconid 2020 | Russian Federation | 195905 | 673 | 4.613626 | 291.092125 |
M2_08 Draconid 2020 | Belarus | 10260 | 39 | 4.125931 | 263.076923 |
M2_08 Draconid 2020 | Ukraine | 52333 | 162 | 3.682351 | 323.043210 |
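The per-country calculation can be sketched as follows. The player rows and the population figures are toy values chosen for easy arithmetic, and the names `population_df` and `population_millions` are assumptions for this example, not the ones used in the notebook.

```python
import pandas as pd

# Toy player data for a single season (hypothetical values).
toy_df = pd.DataFrame({
    "season":  ["S1"] * 5,
    "country": ["A", "A", "A", "B", "B"],
    "name":    ["p1", "p2", "p3", "p4", "p5"],
    "matches": [100, 200, 300, 150, 250],
})

# Toy population table, in millions of inhabitants (made-up numbers).
population_df = pd.DataFrame({
    "country": ["A", "B"],
    "population_millions": [1.5, 4.0],
})

# Count players and sum matches per season and country.
per_country_df = toy_df.groupby(["season", "country"]).agg(
    total_matches=pd.NamedAgg(column="matches", aggfunc="sum"),
    num_players=pd.NamedAgg(column="name", aggfunc="count"),
).reset_index()

# Attach the population and derive the two ratios.
per_country_df = per_country_df.merge(population_df, on="country", how="left")
per_country_df["pro_players_per_million"] = (
    per_country_df["num_players"] / per_country_df["population_millions"]
)
per_country_df["matches_per_player"] = (
    per_country_df["total_matches"] / per_country_df["num_players"]
)
print(per_country_df)
```

A left merge keeps countries that are missing from the population table; their ratio simply becomes NaN, which makes gaps in the population list easy to spot.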
The top 5 consists entirely of Eastern European countries, which is no surprise as the company that created
Gwent is based in Poland and The Witcher lore is based on Slavic myths and legends. Iceland, Finland,
Hong Kong, Malta and Croatia complete the top 10. These are all relatively small countries, so a single player
making it up to Pro Rank boosts them up in the ranking.
You can download the full table, which includes data for all seasons and countries, here.
Which Country has the best Gwent Team
Now we know where the most pro players are per capita, but what if countries were able to send a team of three
e-athletes to a world championship? Which countries would do best with their team of three pro players? To this end,
all countries with three or more players were selected, and the top 3 players for each of those countries picked.
Next, the average MMR and total MMR for those players were calculated, as well as the efficiency to climb and the rank
for each country. The code is up on GitHub for those interested, but here too it is a simple matter of filtering and
grouping data using built-in pandas functions.
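A minimal sketch of that filtering and grouping is given below on toy data. It is not the notebook's code; in particular, the team efficiency and LEI formulas (mean MMR above 9600, divided by the mean number of matches or its square root) are an assumption that happens to reproduce the published values, and all variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Toy data: country A has 4 players, B has 3, C only 1 (hypothetical values).
toy_df = pd.DataFrame({
    "season":  ["S1"] * 8,
    "country": ["A"] * 4 + ["B"] * 3 + ["C"],
    "mmr":     [10400, 10300, 10200, 10100, 10500, 10000, 9900, 10600],
    "matches": [400, 300, 200, 100, 500, 250, 250, 300],
})

# Keep only countries with at least three players in a season.
group_size = toy_df.groupby(["season", "country"])["mmr"].transform("size")
eligible = toy_df[group_size >= 3]

# Pick the three highest-rated players per country and season.
top3 = (
    eligible.sort_values("mmr", ascending=False)
    .groupby(["season", "country"])
    .head(3)
)

# Aggregate each team of three.
teams = top3.groupby(["season", "country"]).agg(
    mean_mmr=pd.NamedAgg(column="mmr", aggfunc="mean"),
    total_mmr=pd.NamedAgg(column="mmr", aggfunc="sum"),
    mean_matches_per_player=pd.NamedAgg(column="matches", aggfunc="mean"),
    total_matches=pd.NamedAgg(column="matches", aggfunc="sum"),
).reset_index()

# Rank the teams per season by total MMR (assumed ranking criterion).
teams["nation_rank"] = (
    teams.groupby("season")["total_mmr"]
    .rank(method="first", ascending=False)
    .astype(int)
)

# Assumed team-level efficiency metrics, based on the team averages.
teams["efficiency"] = (teams["mean_mmr"] - 9600) / teams["mean_matches_per_player"]
teams["lei"] = (teams["mean_mmr"] - 9600) / np.sqrt(teams["mean_matches_per_player"])
print(teams)
```

Country C is dropped even though it has the single best player, because it cannot field a full team of three.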
The results for Season of the Draconid are shown below. It seems that China has the best team of three this season,
followed by Russia and Poland.
season | country | mean_mmr | total_mmr | mean_matches_per_player | total_matches | nation_rank | efficiency | lei |
---|---|---|---|---|---|---|---|---|
M2_08 Draconid 2020 | China | 10489.333333 | 31468 | 409 | 1227 | 1 | 2.174409 | 43.974703 |
M2_08 Draconid 2020 | Russian Federation | 10479.666667 | 31439 | 636 | 1908 | 2 | 1.383124 | 34.881052 |
M2_08 Draconid 2020 | Poland | 10439.333333 | 31318 | 657 | 1971 | 3 | 1.277524 | 32.745512 |
Player Summaries
For players that made it up to Pro Rank during multiple seasons we’ll quickly generate a summary. Again, the groupby and
agg functions are used to group things and get the summary statistics. We’ll count the number of
appearances on pro ladder and calculate the min, mean and max MMR, the average and total number of matches, as well as
the best global and national ranks.
This can give you a quick impression of all the data available on a player. Here you can see the output for my own account (handle sepro).
name | country | appearances | min_mmr | mean_mmr | max_mmr | mean_matches | num_matches | best_rank | best_national_rank |
---|---|---|---|---|---|---|---|---|---|
sepro | Belgium | 3 | 9746 | 9782 | 9820 | 243 | 728 | 1138 | 2.0 |
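A summary like this could be produced along the following lines. The sketch uses toy data with hypothetical players; the column names follow the tables above, but the exact aggregation in the notebook may differ.

```python
import pandas as pd

# Toy stand-in for full_df: player x appears in three seasons, player y in one.
toy_df = pd.DataFrame({
    "name":          ["x", "x", "x", "y"],
    "country":       ["Belgium", "Belgium", "Belgium", "Poland"],
    "season":        ["S1", "S2", "S3", "S1"],
    "mmr":           [9700, 9800, 9900, 10000],
    "matches":       [200, 250, 300, 400],
    "rank":          [1500, 1200, 1000, 300],
    "national_rank": [3.0, 2.0, 2.0, 10.0],
})

# One row per player; a lower rank is better, so "best" means the minimum.
player_summary_df = toy_df.groupby(["name", "country"]).agg(
    appearances=pd.NamedAgg(column="season", aggfunc="count"),
    min_mmr=pd.NamedAgg(column="mmr", aggfunc="min"),
    mean_mmr=pd.NamedAgg(column="mmr", aggfunc="mean"),
    max_mmr=pd.NamedAgg(column="mmr", aggfunc="max"),
    mean_matches=pd.NamedAgg(column="matches", aggfunc="mean"),
    num_matches=pd.NamedAgg(column="matches", aggfunc="sum"),
    best_rank=pd.NamedAgg(column="rank", aggfunc="min"),
    best_national_rank=pd.NamedAgg(column="national_rank", aggfunc="min"),
).reset_index()
print(player_summary_df)
```

Grouping on both name and country keeps the country column in the output; it assumes a player's country does not change between seasons.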
You can download the full table here to find your
own or your favorite player’s stats.
Conclusion
Initially, I set out to get players’ national ranks. When you are from a small country, just making it to
Pro Rank will likely give you bragging rights about being in the top 3 of your country. Though with some fairly basic
data science you can very quickly get a lot more details on various aspects of the game.
This type of project I would really recommend for people starting out with programming. Find a topic you like and write
some code to get some data about it, do some analysis and generate a few plots.