MTGO Ratings

#1 May 18, 2014
Autumnowl23
Wizard Mentor
- Location: fukof
- Join Date: 4/11/2011
- Posts: 614
Two questions.

A) Is there an explanation anywhere of how ratings are calculated? It'd be useful to know how many 3-0s I'd need to reach a given rating, how a 2-1 match win differs from a 2-0, and so on.

B) How would you classify Limited ratings in terms of how good a player someone is? You know, 1600 is this, 1700 is this, pro players are usually this, etc. I'm curious to see how good I am as a player compared to other people. You know, as a bit of a confidence booster.

Thanks!

#2 May 18, 2014
delfam
Archmage Overlord
- Join Date: 10/28/2013
- Posts: 1,452
The very best Limited players are usually around 1900 or above.

#3 May 18, 2014
TheLizard
Resident Planeswalker
- Join Date: 4/11/2011
- Posts: 4,196
The number of rating points at stake in any match is given by the formula 16/(1+10^((Winner's Rating - Loser's Rating)/400)). [EDIT: For Daily Events and other scheduled tournaments, replace 16 with 24.] The winner gains that many points and the loser loses the same number. The number of game wins has nothing to do with it, so a 2-1 counts the same as a 2-0.

Players start at 1600.

Example: If your rating is 1700 and you play against another 1700-rated player, you will steal 8 points from your opponent if you win, and your opponent will steal 8 points from you if he wins. If your rating is 1700 and you play against an 1800-rated player, you will steal about 10.24 points from your opponent if you win, and your opponent will steal about 5.76 points from you if he wins.
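
For anyone who wants to check the example numbers, here is a minimal sketch in Python. It is not MTGO's actual code: the function name and signature are made up for illustration, and the formula is written in the orientation that reproduces the worked example (an underdog gains more for a win than a favorite does). The K of 16 (24 for Daily Events and other scheduled tournaments) is the value given in this post.

```python
def points_at_stake(winner_rating: float, loser_rating: float, k: float = 16.0) -> float:
    """Points the winner takes from the loser under the Elo-style formula.

    k=16 is the value quoted for ordinary matches; the post says to use 24
    for Daily Events and other scheduled tournaments. The name and signature
    are illustrative assumptions, not part of any MTGO API.
    """
    return k / (1 + 10 ** ((winner_rating - loser_rating) / 400))


if __name__ == "__main__":
    print(round(points_at_stake(1700, 1700), 2))  # equal ratings
    print(round(points_at_stake(1700, 1800), 2))  # lower-rated player wins
    print(round(points_at_stake(1800, 1700), 2))  # higher-rated player wins
```

Whatever the two ratings are, the points for the two possible outcomes always add up to K, which is why the system is zero-sum.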

A general interpretation of the ratings: 1600 = not so hot or new, 1700 = above average, 1800 = good, 1900 = great (or more likely, you've been lucky recently).

Last edited by TheLizard: May 18, 2014

#4 May 18, 2014
FTW1987
Archmage Overlord
- Join Date: 2/2/2014
- Posts: 2,627
I believe they use the Elo system, with a K-value specified for each event (e.g. 16, 24, etc.). You can look it up on Wikipedia if you want. It's what they use in The Social Network to rate hotness of girls. It requires some rudimentary understanding of math, but it's not difficult to figure out. It's based on a logistic model. Basically, it compares the result the ratings predict for your match with who actually wins, and adjusts your rating based on the difference. The size of the adjustment (i.e. how much your rating goes up or down) depends on the K-value. So if you want to increase your Limited rating most efficiently, go play a bunch of Sealed and premier events with a higher K-value. Draft queues have the lowest.
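
To make the "expected vs actual" idea concrete, here is a rough sketch of a standard Elo update in Python. The function names are my own, and which K applies to which MTGO event is an assumption beyond the 16 and 24 mentioned in this thread, so K is left as a parameter.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Elo's logistic estimate of the chance that `rating` beats `opponent_rating`."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def updated_rating(rating: float, opponent_rating: float, won: bool, k: float) -> float:
    """New rating after one match: old rating plus K times (actual - expected)."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    # Same upset win (1700 beats 1800) at the two K-values quoted in the thread.
    for k in (16, 24):
        gain = updated_rating(1700, 1800, True, k) - 1700
        print(f"K={k}: +{gain:.2f}")
```

A higher K scales every gain and every loss by the same factor, so it speeds up rating movement in both directions.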

This system was scrapped in favor of Planeswalker points for paper Magic, since Planeswalker points reward people for playing a lot of events (i.e. spending more money and supporting the card shops and the commercial aspect of the game), whereas ratings rewarded people for winning at strategic times and then just backing out and sitting on their high ratings. There are many criticisms of the Elo system. Planeswalker points are certainly a better business strategy.

EDIT: Ninja'd.
The above equation explains it, except it doesn't account for variable K-values, which are a thing on MTGO. If you 3-0 Swisses all day, you're probably stealing only a marginal number of points off lower rated players. The lower K-value hurts too. Win some premier Sealed.
EDIT2: Now it does. Double ninja'd FT 0-2 drop.

Last edited by FTW1987: May 18, 2014

#5 May 18, 2014
cricketHunter
Ascended Mage
- Join Date: 9/17/2009
- Posts: 186
So I've kept track of my rating change over the last 90 Swiss matches. Based on the formulas above I've calculated each opponent's rating. Then, assuming player ratings are normally distributed, I can calculate the expected percentile for different ratings/win percentages. My average Swiss opponent has a rating of 1634.94 with a standard deviation of 95.78. If all that is true, that means:

The average player has a rating of 1634.94 and has an expected win percentage of 50% (duh).

The top 1/3 (67th percentile) of players have a rating of 1,677.08 and have an expected win percentage of 56.03%.

The top 1/4 (75th percentile) of players have a rating of 1,699.55 and have an expected win percentage of 59.19%.

The top 1/5 (80th percentile) of players have a rating of 1,715.55 and have an expected win percentage of 61.40%.

The top 1/10 (90th percentile) of players have a rating of 1,757.69 and have an expected win percentage of 66.97%.

The top 1/100 (99th percentile) of players have a rating of 1,857.77 and have an expected win percentage of 78.29%

Finally the top 1/1000 (99.9th percentile) of players have a rating of 1,930.94 and have an expected win percentage of 84.60%.
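
The list above can be reproduced with a few lines of Python. This is only a sketch under the same normal-distribution assumption: the mean and standard deviation are the figures from this post, while the function names are made up, and the win chance is the Elo prediction against an average-rated opponent.

```python
from statistics import NormalDist

# Sample mean and standard deviation reported in the post (90 Swiss opponents).
MEAN_RATING = 1634.94
STD_DEV = 95.78

ratings = NormalDist(mu=MEAN_RATING, sigma=STD_DEV)


def rating_at_percentile(p: float) -> float:
    """Rating below which a fraction p of players fall, if ratings are normal."""
    return ratings.inv_cdf(p)


def win_chance_vs_average(rating: float) -> float:
    """Elo-predicted chance of beating an opponent rated MEAN_RATING."""
    return 1 / (1 + 10 ** ((MEAN_RATING - rating) / 400))


if __name__ == "__main__":
    for p in (0.75, 0.80, 0.90, 0.99, 0.999):
        r = rating_at_percentile(p)
        print(f"{p:.1%}: rating {r:.2f}, win chance {win_chance_vs_average(r):.2%}")
```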

These numbers could be wrong for a LOT of reasons - biased samples (because of when I play and who I get paired against given my record), noise in the measurement (especially the standard deviation, which is more sensitive to noise), or my assumption of a normal distribution (which would change the relationship between my estimated standard deviation and percentile).

That said, this correlates with a lot of anecdotal data. The top rating in the MTGO hall of champions (which doesn't exist anymore) was around 2000 - an aberrantly high number that would be consistent with the average rating of the top players being around 1930. (In 800 simulated matches, a player with an average rating of 1930.94 would see a low of about 1862 and a high of about 2028.)
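
The setup of that simulation isn't described here, so the following is only one plausible way to run it, with every modeling choice an assumption: opponents drawn from the normal distribution above, K = 16, and a real win chance equal to the Elo prediction for the player's true rating. It should not be expected to reproduce the 1862 and 2028 figures exactly.

```python
import random


def simulate_rating_range(true_rating, opp_mean, opp_sd, matches=800, k=16.0, seed=0):
    """Track the displayed rating of a fixed-skill player over many matches.

    Assumptions (not from the post): opponents are drawn from a normal
    distribution, their displayed ratings equal their true skill, and the
    player's real win chance is the Elo prediction for `true_rating`.
    Returns the lowest and highest displayed rating seen.
    """
    rng = random.Random(seed)
    displayed = true_rating
    low = high = displayed
    for _ in range(matches):
        opp = rng.gauss(opp_mean, opp_sd)
        # Real chance of winning depends on true skill, not the displayed number.
        p_win = 1 / (1 + 10 ** ((opp - true_rating) / 400))
        # Expected score as the rating system sees it, from displayed ratings.
        expected = 1 / (1 + 10 ** ((opp - displayed) / 400))
        actual = 1.0 if rng.random() < p_win else 0.0
        displayed += k * (actual - expected)
        low, high = min(low, displayed), max(high, displayed)
    return low, high


if __name__ == "__main__":
    low, high = simulate_rating_range(1930.94, 1634.94, 95.78)
    print(f"low {low:.0f}, high {high:.0f}")
```

Changing `seed` gives a different run, which is a quick way to see how much the high-water mark of a single account depends on luck.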

#6 May 18, 2014
divisionbyzorro
Wizard Mentor
- Location: The Beach
- Join Date: 8/17/2012
- Posts: 565
That seems reasonable; that would put me right in the 85th percentile of players, which seems to line up well with my experiences and self-evaluation (reasonably better than average, but still working hard at trying to get to the upper echelon).


It's not your job to win games of Magic where you're mana screwed.
It's your job to win every game of Magic where you're not.
#7 May 19, 2014
TheEternalVortex
Ascended Mage
- Join Date: 11/29/2012
- Posts: 244
Yeah that's about consistent with my experience. I was in the 1860s for a while last year and I won around 80% of my matches. I haven't been playing Theros block that much so I'm a lot worse now :).

#9 May 19, 2014
Sene
drifts like worried fire
- Location: Asker, Norway
- Join Date: 5/22/2005
- Posts: 24,183
Quote from pizzap
Quote from cricketHunter
So I've kept track of my rating change over the last 90 Swiss matches. Based on the formulas above I've calculated each opponent's rating. Then, assuming player ratings are normally distributed, I can calculate the expected percentile for different ratings/win percentages. My average Swiss opponent has a rating of 1634.94 with a standard deviation of 95.78.

I've done the same. Usually I used this web form to calculate my opponent's rating.

I think the average of slightly over 1600 is correct. You would expect an average of exactly 1600, because everybody starts at 1600 and no points are added to the system. However, there are probably people who start, play one or two drafts and lose, and then decide drafting is not for them and leave. Their ratings end up below 1600, and the points they lost stay with the players who do play on a regular basis, making the average among active players a bit above 1600.

Yeah, this. Someone who's losing a lot is probably more likely to quit than someone who's winning a decent amount. So even though there should theoretically be a 1400-rated player for each 1800-rated player, I find it more likely that there are multiple 1500-something rated players for each 1800-rated player.

cricketHunter's numbers make sense to me.

Edit: if you want to keep track of ratings like that, I suggest using a spreadsheet instead - it's just much faster.

#11 May 19, 2014
Sene
drifts like worried fire
- Location: Asker, Norway
- Join Date: 5/22/2005
- Posts: 24,183
They are? Aww, too bad.

Though I question whether the new client is ready in July (seems extremely unlikely), but I suppose that's a different topic altogether.

#14 May 19, 2014
Phyrre56
Grumpy Old Man
- Join Date: 1/5/2005
- Posts: 6,864
I would hate to see ratings go away because they're one of the few ways to actually track your progress in MTGO. Sure, I could track it all myself, but that takes quite a bit of work that a computer could be doing for me. Realistically, we should be swimming in stats since MTG is such a quantitative game -- numbers of cards, life totals, casting costs, power, toughness, turns -- there are numbers everywhere! Yet we track so few of them. As a statistician who looks to numbers to infer truths, this makes me sad.

The only argument I've heard in favor of eliminating rating as a purely observational measure (i.e. not even used as a qualifying factor) is that there have been reports of bullying over ratings. Even if you keep your rating private, someone you play against can calculate your rating after the match by checking how their rating changed. I've never experienced this myself, perhaps because I was never truly a new and inexperienced player on MTGO -- I worked through my awkward phase with cardboard -- but I guess it happens? It always felt like a straw man argument to me, or at least that the solution should be cracking down on bullying, not removing an interesting metric from the face of the earth. Bullies will bully no matter what; why should the rest of us have to give something up for them?
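
Working out an opponent's rating from your own rating change is just the points formula solved backwards. A hedged sketch in Python: it assumes the Elo-style formula quoted earlier in the thread and that you know the K for the event (16 by default here); the function name and signature are made up. If the client rounds the ratings it shows, the result is only an estimate.

```python
import math


def opponent_rating(my_rating_before: float, my_change: float, k: float = 16.0) -> float:
    """Estimate an opponent's pre-match rating from your own rating change.

    `my_change` is positive after a win and negative after a loss.
    """
    points = abs(my_change)
    if not 0 < points < k:
        raise ValueError("rating change must be strictly between 0 and K in size")
    # Winner gains K / (1 + 10**((winner - loser) / 400)); solve for the gap.
    gap = 400 * math.log10(k / points - 1)  # winner's rating minus loser's rating
    if my_change > 0:
        return my_rating_before - gap  # I won, so I am the winner
    return my_rating_before + gap  # I lost, so the opponent is the winner


if __name__ == "__main__":
    print(round(opponent_rating(1700, 8.0)))    # gained 8: evenly matched
    print(round(opponent_rating(1700, 10.24)))  # gained more than 8: opponent rated higher
```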

#16 May 19, 2014
cricketHunter
Ascended Mage
- Join Date: 9/17/2009
- Posts: 186
Quote from Phyrre56
The only argument I've heard in favor of eliminating rating as a purely observational measure (i.e. not even used as a qualifying factor) is that there have been reports of bullying over ratings. Even if you keep your rating private, someone you play against can calculate your rating after the match by checking how their rating changed. I've never experienced this myself, perhaps because I was never truly a new and inexperienced player on MTGO -- I worked through my awkward phase with cardboard -- but I guess it happens? It always felt like a straw man argument to me, or at least that the solution should be cracking down on bullying, not removing an interesting metric from the face of the earth. Bullies will bully no matter what; why should the rest of us have to give something up for them?

Right, I don't like the mentality that "it must be the number, not the bully that's the problem here." That said, it was probably a pragmatic step by WotC to make it private. However, I too would be sad to see Elo go permanently. I suspect the reason for eliminating Elo is more about addressing the other problems with Elo: it can "punish" you for playing and losing, which in turn can lead to sitting on high ratings, or, worse for WotC, a low rating can discourage players completely. I thought that Planeswalker Points were good at solving THESE problems, at the expense of us Spikes who like to measure our progress, not just our games played.

#18 May 19, 2014
spairy
Archmage Overlord
- Join Date: 3/12/2012
- Posts: 1,421
Your average, experienced drafter who plays often is probably somewhere in the 1700-1750 range. Anything 1750+ on average is very good, 1800+ is elite, and an 1850 average is about the highest sustainable average. The 1850-average players can get to a 1900 rating when they get on a hot streak, but I don't think this is sustainable long term for anyone. Players used to "retire" accounts when they hit 1900, as it was like a badge or something to get an account there.

The 2000 account rating is just an anomaly, basically reserved for when a player with an already high Limited rating goes undefeated through a high K-value event like a Limited MOCS. Many of the high-end accounts in the hall of champions got there through cheating/collusion, and most of the rest were the aforementioned hot streaks, after which the players just retired the account to let the rating stay at that level.

As for making the ratings not public, this was just pure laziness on the part of Wizards. They couldn't be bothered with a decent system for bad behavior, so they just canned something that players enjoyed. I highly doubt that removing the ratings even had any effect on behavior anyway: I doubt jerks stopped being jerks just because they couldn't see ratings. To those idiots, anyone who beats them is a "sack" or a "donk". Raging after a loss is not rational behavior, so I highly doubt getting rid of ratings as "ammo" did much good at all. Instead it just took away a feature from the overwhelming majority of players who are capable of decent online behavior.

#19 May 20, 2014
FTW1987
Archmage Overlord
- Join Date: 2/2/2014
- Posts: 2,627
Thanks for recording all that data! I question some of the assumptions though..

1) Only a sample size of 90 opponents, which is fine for computing averages, but not so good at getting 99th percentiles and such

2) Only accounts for Swiss. Their ratings are influenced by a combination of Sealed, 8-4 and Swiss. They may have a higher win rate in Swiss with rating brought down by performance in other events. In fact, using the same rating across different events messes up the whole rating system. If a player is better at Sealed than Swiss, that's not reflected in the expectations implied by the ratings and the number of points exchanged doesn't reflect the true Swiss skill difference of the players.

3) I suspect player ratings are not Normally distributed. That would imply symmetry. But there isn't one 1400 player for every 1800 player, since lower rated players play less often and/or quit and leave the system before they can lose that many points (otherwise their credit cards must be crying), whereas higher rated players would keep playing and have the chance to reach greater extremes. Because there's an economic incentive to quit if you lose and play if you win, there's no way the distribution is symmetrical.

4) Not only are ratings skewed, but they probably have higher kurtosis than a Normal curve. i.e. there is probably a glut of people with extremely high ratings, either because they just play a lot more pro Magic and are much better players, or because when they get high ratings they stop playing and sit on them where variance would normally cause them to lose and drop in rating if they kept playing.
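
Points 3 and 4 are things you can actually measure on a list of ratings. Here is a small Python sketch of the usual moment-based definitions; the function name is my own, and the example ratings are made up purely to show the call (they are NOT data from this thread).

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis (both are 0 for a Normal curve)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


if __name__ == "__main__":
    # Made-up ratings for illustration only.
    example = [1550, 1580, 1600, 1610, 1620, 1640, 1660, 1700, 1780, 1900]
    skew, kurt = skewness_and_excess_kurtosis(example)
    print(f"skewness {skew:.2f}, excess kurtosis {kurt:.2f}")
```

Positive skewness means a longer tail on the high-rating side; positive excess kurtosis means heavier tails than a Normal curve with the same variance. With only 90 observations both estimates are very noisy.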

What does the actual distribution of ratings you recorded look like? Might help.

#20 May 20, 2014
cricketHunter
Ascended Mage
- Join Date: 9/17/2009
- Posts: 186
@FTW1987:

I'll add these to the list of things that could make my numbers wrong

Specific thoughts:

1) True. Interestingly, if I chop off random chunks of my data the 99th percentile number doesn't change much (it fluctuates in the 1830-1860 range). That implies that while the number could be completely wrong, it's at least pretty numerically stable. I'm pretty sure the 99.9th percentile point is much less stable (I don't tend to look at it so I'm not sure).
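
For anyone who wants to try the same stability check, a rough Python sketch follows. It is one way to "chop off random chunks", not necessarily how it was done here: each trial keeps a random fraction of the observations, fits a normal curve, and reads off the percentile. The function name and defaults are assumptions, and the demo data is synthetic because the recorded ratings are not posted in the thread.

```python
import random
from statistics import NormalDist, fmean, stdev


def percentile_estimates(ratings, p=0.99, keep=0.8, trials=200, seed=0):
    """Re-estimate a normal-fit percentile on random subsets of the data.

    A narrow spread of results suggests the estimate is numerically stable,
    which is not the same as it being correct.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(p)  # standard-normal quantile for p
    size = max(2, int(len(ratings) * keep))
    results = []
    for _ in range(trials):
        subset = rng.sample(ratings, size)
        results.append(fmean(subset) + z * stdev(subset))
    return results


if __name__ == "__main__":
    # Synthetic stand-in for the 90 recorded opponents (not real data).
    data_rng = random.Random(1)
    fake = [data_rng.gauss(1634.94, 95.78) for _ in range(90)]
    est = percentile_estimates(fake)
    print(f"99th percentile estimates range from {min(est):.0f} to {max(est):.0f}")
```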

2) True. I do suspect that Swiss and 8-4 ratings should be directly comparable, given enough player movement between queues. This is analogous to real world DCI ratings across LGS's and geographical regions. While you could have skewed ratings there seemed to be enough rating points moved through the system via big events like prereleases and GPs. I think the situation in MTGO is even more fluid and thus I'm not worried about the 8-4/Swiss divide. I agree with the point about Sealed. If you are measuring two different skill levels your Elo will always be a weighted mix of the two true averages.

3 & 4) True. Chess, for instance, found its distribution of player skills on chess skill tests to be non-normal (basically a distribution that looked like a normal distribution but with heavier weight in each tail). That would change the relationship between percentile and distance from the mean. The same goes for taking a normal distribution and chopping off the left tail. In practice, unless the distribution is far from normal, you shouldn't be introducing too much error.

I've attached a histogram of my actual observed distribution.

[Attachment: chart_2 (histogram of the observed rating distribution)]
#21 May 20, 2014
divisionbyzorro
Wizard Mentor
- Location: The Beach
- Join Date: 8/17/2012
- Posts: 565
I think it's reasonable to assume that there would be a spike of players around 1850. The weighting inherent in the ELO system creates a "rubber band" effect that prevents players of high skill from running off the end of the range and becoming outliers, so instead they bunch up in the 1850-1900 range.


#22 May 20, 2014
fnord
🏆🏆🏆
- Join Date: 9/13/2006
- Posts: 5,912
Quote from Phyrre56 »
I would hate to see ratings go away because they're one of the few ways to actually track your progress in MTGO. Sure, I could track it all myself, but that takes quite a bit of work that a computer could be doing for me. Realistically, we should be swimming in stats since MTG is such a quantitative game -- numbers of cards, life totals, casting costs, power, toughness, turns -- there are numbers everywhere! Yet we track so few of them. As a statistician who looks to numbers to infer truths, this makes me sad.

This would be the case if rating were an accurate measure of skill/progress. But it's not. The underlying math is based on chess and is invalid when applied to Magic. If you're a statistician you should know what happens if you do statistics based on invalid assumptions. So it's good that they're getting rid of rating. Your rating doesn't mean what 99% of people seem to think it means.

Rating is useful for one thing and one thing only. If you're in a match on the current (V3) client and your opponent seems to be taking a while, check your rating. If it's above 0, everything's fine. If it doesn't show up, you've been disconnected from the server and need to restart the client.


Practice for Khans of Tarkir Limited:
Draft: (#1) (#2) (#3) (#4) (#5)
#23 May 20, 2014
divisionbyzorro
Wizard Mentor
- Location: The Beach
- Join Date: 8/17/2012
- Posts: 565
Quote from fnord »
This would be the case if rating were an accurate measure of skill/progress. But it's not. The underlying math is based on chess and is invalid when applied to Magic. If you're a statistician you should know what happens if you do statistics based on invalid assumptions. So it's good that they're getting rid of rating. Your rating doesn't mean what 99% of people seem to think it means.

More specifically, your rating at any given moment is not indicative of any sort of progress/skill level. The real issue is that there is no way to see your rating over time. If I could look at my profile and see a graph of my rating and how it has changed over the past two years, I think it would do a good job of showing my growth as a player. But the specific number that it currently sits at isn't all that helpful.


#24 May 20, 2014
fnord
🏆🏆🏆
- Join Date: 9/13/2006
- Posts: 5,912
Quote from divisionbyzorro »

More specifically, your rating at any given moment is not indicative of any sort of progress/skill level. The real issue is that there is no way to see your rating over time. If I could look at my profile and see a graph of my rating and how it has changed over the past two years, I think it would do a good job of showing my growth as a player. But the specific number that it currently sits at isn't all that helpful.

Even long-term trends have a problem, because aside from having separate Limited/Constructed ratings there aren't different ratings for different formats.

So if my Constructed rating is lower than it was 2 years ago, maybe I've gotten worse...or maybe I just play more Momir Basic now than I used to.


#25 May 20, 2014
divisionbyzorro
Wizard Mentor
- Location: The Beach
- Join Date: 8/17/2012
- Posts: 565
Quote from fnord »
Even long-term trends have a problem, because aside from having separate Limited/Constructed ratings there aren't different ratings for different formats.

So if my Constructed rating is lower than it was 2 years ago, maybe I've gotten worse...or maybe I just play more Momir Basic now than I used to.

That's a fair point. A while back I managed to tank my limited rating all the way back down to just a little over 1600 by playing sealed (I am just awful at sealed, apparently) for a few weeks; there is some skill overlap between draft and sealed, but not a lot.


#26 May 21, 2014
cricketHunter
Ascended Mage
- Join Date: 9/17/2009
- Posts: 186
Quote from fnord »
This would be the case if rating were an accurate measure of skill/progress. But it's not. The underlying math is based on chess and is invalid when applied to Magic. If you're a statistician you should know what happens if you do statistics based on invalid assumptions. So it's good that they're getting rid of rating. Your rating doesn't mean what 99% of people seem to think it means.

I don't understand your objection. The underlying math is based on a distribution of skill levels for two player games. What part of that exactly is invalid in Magic?

#27 May 21, 2014
Hardened
I like my beans with ketchup.
- Location: Maine
- Join Date: 2/18/2011
- Posts: 2,465
Probably that Magic has more variables than chess, but we've discussed already that a chart of rating progress would be better than a thin-slice number.

Also, it thrills me to no end to see Guybrush Threepwood discussing complex statistical analysis.


My helpy helpdesk of helpfulness.
My Decks:
EDH: Sygg, River Cutthroat , Road to Scion
Grimgrin, Corpseborn
Modern: Polytokes
IRL: Progenitus Polymorph , Goblins

Just a friendly reminder that I will drive this car off a bridge

bethematch.org. Save a life.
#28 May 21, 2014
Phyrre56
Grumpy Old Man
- Join Date: 1/5/2005
- Posts: 6,864
The problem with using ELO in Magic is that your chess pieces never randomly roll over and die on your first turn, like your deck does when you get screwed by the random element of shuffling. Your opponent also never sits down with more powerful chess pieces than you.

It's not perfect but at least it's something. I'd rather have ELO than nothing. I know it has limitations but even if it's misleading to most people, it's interesting to me.

#30 May 21, 2014
Sene
drifts like worried fire
- Location: Asker, Norway
- Join Date: 5/22/2005
- Posts: 24,183
Quote from Phyrre56 »
The problem with using ELO in Magic is that your chess pieces never randomly roll over and die on your first turn, like your deck does when you get screwed by the random element of shuffling. Your opponent also never sits down with more powerful chess pieces than you.

It's not perfect but at least it's something. I'd rather have ELO than nothing. I know it has limitations but even if it's misleading to most people, it's interesting to me.

That's where I'm at.

I'm already keeping track of my own progress in other ways than ratings, but I'll still miss it.

#31 May 21, 2014
fnord
🏆🏆🏆
- Join Date: 9/13/2006
- Posts: 5,912
Quote from cricketHunter »

I don't understand your objection. The underlying math is based on a distribution of skill levels for two player games. What part of that exactly is invalid in Magic?

The core of ELO is a mathematical equation which predicts the results of a match based on the ratings of the participants. If both players are 1700, for example, it will predict that (ignoring draws) each player has a 50% win chance. If the ratings are correct, if the players play each other a large number of times their ratings will remain approximately the same.

The problem arises when you have a gap in rating between the players, and it increases as the gap gets larger (say an 1800 plays a 1600). The ELO algorithm makes a prediction which ignores the effects of variance, and therefore the wrong number of points is added/subtracted when a player wins/loses. So even when a player's skill is unchanged, their rating will vary wildly based on RNG as well as things which should not affect it (such as the type of event they play).
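
One way to make this objection concrete is to compare what the formula predicts with a win rate you believe is realistic. The Python sketch below does not settle whether the objection is right; it only shows what follows if the real win rate differs from the prediction. The function names are made up, and the 70% figure in the demo is purely hypothetical.

```python
def elo_win_probability(rating: float, opponent_rating: float) -> float:
    """Win probability the Elo formula assigns to `rating` against `opponent_rating`."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def expected_change_per_match(rating, opponent_rating, true_win_rate, k=16.0):
    """Average rating change per match if the real win rate is `true_win_rate`.

    Follows from p * K * (1 - E) + (1 - p) * K * (0 - E) = K * (p - E).
    """
    return k * (true_win_rate - elo_win_probability(rating, opponent_rating))


if __name__ == "__main__":
    predicted = elo_win_probability(1800, 1600)
    print(f"Elo predicts the 1800 player wins {predicted:.1%} of the time")
    # Hypothetical: suppose variance caps the real win rate at 70%.
    drift = expected_change_per_match(1800, 1600, 0.70)
    print(f"average change per match at a 70% real win rate: {drift:+.2f}")
```

If the real win rate matches the prediction, the average change is zero and the rating only wobbles around its level; if it doesn't, the rating drifts until the two agree.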

Quote from Phyrre56 »

It's not perfect but at least it's something. I'd rather have ELO than nothing. I know it has limitations but even if it's misleading to most people, it's interesting to me.

Personally I use the ticket system to track my progress. If the value of my collection in tickets has increased during my play session, I've made progress.


#32 May 21, 2014
pierrebai
Almost Famous
- Location: Montreal, Quebec
- Join Date: 11/25/2005
- Posts: 3,846
Quote from fnord »

Personally I use the ticket system to track my progress. If the value of my collection in tickets has increased during my play session, I've made progress.

That adds the randomness of the value of the cards you open to the randomness of play. So you're speculating that the two cancel each other out due to karma. I've noticed that when I've not logged on for a long time or just bought a good amount of tix, my first picks tend to be high-value cards. I knew I was on to something!
