The Usage Stats Problem: Using 1760 Stats for Tiering

Jukain--these are not suspect reqs we're talking about. The purpose of raising the cutoff is to remove the contribution from non-competitive players. To that end, going 8-0 is nothing to sneer at.

Also: please stop referring to W-L as if it means something. PS matches players together based on rating, and in an ideal world, everyone would have a 1:1 win-loss rate. There are plenty of shitty players with 2.5:1 W-L records--it's much more a function of when (which relates to who) you battle than of how good you are.

Edit:
graph2.png


Note the player with the win rate of 85% and GXE of 40.

Zoom:
graph-zoom.png

Yes, there's some degree of correlation after all, but there's so much variance that "win rate" is about the most useless metric I can think of.
 
Double-post for this:
gxe vs. n.png


Eventually I'll replot this and the above graph as heatmaps, but for now it looks like there is indeed no real correlation between skill and battles played.
 
Just curious - what is the significance of the 1695 number? Or did you just choose to give information on the number in the middle of 1630 and 1760, which happens to be 1695?

Either way, 1695 sounds a lot better than both 1630 and 1760. It's not filled as much by average players, and it doesn't give too much power to any one elite person. The latter isn't much of a problem in OU, but it will surely be a much more significant problem in lower tiers.
 
DTC, 1695 = 1500 + 1.5 × 130, that is, one-and-a-half standard deviations above the mean.
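
For reference, here is a minimal sketch of that arithmetic for the three cutoffs that have come up in this thread. The mean of 1500 and the standard deviation of 130 are the figures quoted above; the function name `cutoff` is just an illustrative label, not anything from the actual stats scripts.

```python
# Figures taken from the post above: ratings start at 1500 and one
# "standard deviation" is 130 points.
MEAN_RATING = 1500
RATING_SD = 130


def cutoff(num_sds: float) -> float:
    """Rating that sits num_sds standard deviations above the mean."""
    return MEAN_RATING + num_sds * RATING_SD


# 1, 1.5 and 2 standard deviations give 1630, 1695 and 1760.
for k in (1.0, 1.5, 2.0):
    print(f"{k} SD above the mean -> {cutoff(k):.0f}")
```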

Keep in mind--if you like the theoretical exclusivity of 1695 (the avg. weight of 0.14), you get that FOR THE ACTUAL 1630. Now that I'm awake and able to think clearly, the deviation of theoretical from actual is 100% consistent with previous observations, namely, that a large fraction (possibly the majority?) of alts only play 1-2 games and thus don't get their ratings too far off initial (1500).
 
Zarel: out of ~380k players on the OU ladder,

  • ~54k (14%) have had 1 battle (or fewer*)
  • ~92k (24%) have had 2 or fewer
  • ~221k (58%) have had 10 or fewer
  • Only 24% have had at least 30 battles, which is IIRC what I told Aldaron was the minimum before Glicko converged.
  • Only 16% have had at least 50 battles
*not sure how players with 0 battles got on the ladder but there are 299 of them...
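
As a quick sanity check, the first three percentages can be reproduced from the rounded counts. This is only a sketch: the inputs are the approximate "~" figures from the list, so the results are only good to the nearest percent, and the helper name `share` is made up for illustration.

```python
# Rounded figures from the list above ("~380k", "~54k", and so on).
TOTAL_PLAYERS = 380_000

# Maps "N or fewer battles" to the approximate number of players.
players_with_at_most = {1: 54_000, 2: 92_000, 10: 221_000}


def share(count: int, total: int = TOTAL_PLAYERS) -> int:
    """Percentage of the ladder, rounded to the nearest whole percent."""
    return round(100 * count / total)


for battles, count in players_with_at_most.items():
    print(f"{battles} or fewer battles: {share(count)}% of the ladder")
```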
 

Jukain

From what I'm seeing, 1760 is probably fine in OU, but it's a concern in lower tiers. Well, looking at the UU ladder, most of the top 500 have a 1760 Glicko. 1760 should be perfectly fine overall, so long as most of the higher-ranked players have it, it isn't too hard to achieve, and it's right where we want the cutoff to be.

EDIT: The LC comment was silly of me, sorry about that! The point still stands for everything else.
 
While I disagree strongly that 1760 is fine for LC (again, the goal here is to remove noncompetitive players, not to get "1337" stats), LC also does not do usage-based tiering.
 
Since it appears we've come to an official decision on this, will an announcement be posted somewhere on the forums at some point? I find it very odd that there have been announcements on Facebook and Twitter but not on the main site (aside from a single post in the UU subforum that will only be seen by a small fraction of the userbase), especially given the fact that over 20 Pokemon have changed tiers as a result.
 
Magnemite--we're still working out the specifics. The short of it is: 1760 stats will be used this month for determining the OU-UU cutoff. Other tiers will be evaluated as needed. How I'll be reporting the stats in these threads moving forward hasn't yet been decided.
 
Who's in charge of the Facebook page these days anyway? I think I PM'd RBG this morning. Probably not the right thing to do, but I was only half-awake at the time.
 
It's a collaborative effort: me, darkie, tennisace and a few others.


And we can post an update to that post today clarifying (talk to darkie, he's the one who posted it.)
 
I've been trying to summon the mental energy to start a thread describing the new policy that you could then link to, but I've gotten maybe eight hours of sleep over the past two nights, so I'm not running on all cylinders.

The skeleton is:
  • To reiterate, whatever cutoff we use is not "hard." Our weighting system weights players based on the likelihood that they are "better" than a player with the rating specified by the cutoff. I'll throw in some sample calculations of "this is what your weight would be if your rating were R±RD."
  • Tiers are threatlists. They're the Pokemon you need to prepare for to succeed in a tier.
  • Who are our tiers for? The average (read: casual) player or the competitive Pokemon player? Obviously the latter.
  • The idea is this: you're new to a metagame, but you're a competitive player--you want to win and do well--and you're not interested in using gimmicky teams like mono-normal or Metronome. So then you really don't need to worry about all the players who are doing the gimmicks or are just doing it for fun. These days, 1500--the "average" player--is not competitive. Unfortunately, there's no completely objective way to assess a new baseline. Right now, we're just doing it by "feel" and the cutoff level that "feels right" for OU is 1760, which is two "standard deviations" above the mean.
  • Moving forward, we will be able to assess this using what I'm calling "candles of known brightness," basically a series of indicators we can look at to objectively assess at which points gimmicks and non-competitive teams "fall away." In Little Cup, an example would be the use of Leftovers, Sitrus Berry and Assault Vest, which never have any competitive use in the metagame. These need to be things that are completely unambiguous. Donphan being OU doesn't count. Donphan carrying Giga Impact? Yes.
  • The policy will be this: moving forward, whenever it's time for a usage-based tier update, I will generate a variety of moveset statistics and "candle of known brightness" measurements for each metagame at a variety of cutoff levels. The tiering council will then decide which cutoff best reflects the state of the "competitive" metagame, and we will use that cutoff for computing the tier update. I will *not* tell the council in advance what tier movements will result from each choice. I can also veto the council's decision if their reasoning isn't sound or is contrary to the premises above (for instance, if the primary rationale is "I want the cutoff to be at this level so that X drops"). The council can override my veto with a supermajority vote, at which point I will grump loudly and accede to their wishes.
  • Haven't decided what to do about the monthly stats. Possibly I'll stop posting them altogether in favor of just one thread where I announce that the stats are up in the web folder. Maybe we clean it up and turn it into a fully-functional site, but in any case, Zarel, is there any way to get it onto port 80 rather than 8080? Failing that, chaos, maybe we can set up a section of smogon.com to host this stuff: http://www.smogon.com/stats ?
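
To make the "soft cutoff" idea from the first bullet concrete, here is a rough sketch of what such a weight could look like. It assumes a player's true skill is normally distributed around their rating R with standard deviation RD, and takes the weight to be the probability that the skill exceeds the cutoff. That normal-CDF form is an assumption made for illustration, and `weight` is a hypothetical name; the real stats scripts may compute it differently.

```python
import math


def weight(rating: float, rd: float, cutoff: float) -> float:
    """Probability that a player's skill exceeds the cutoff.

    Assumption (not confirmed by the thread): skill ~ Normal(rating, rd),
    so the weight is the normal CDF evaluated at (rating - cutoff) / rd.
    """
    if rd <= 0:
        raise ValueError("rd must be positive")
    z = (rating - cutoff) / (rd * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z))


CUTOFF = 1760  # the OU cutoff discussed above

# Illustrative (rating, RD) pairs: a brand-new alt at 1500 with RD 130,
# an established player sitting exactly on the cutoff, and one well above it.
for rating, rd in [(1500, 130), (1760, 50), (1900, 50)]:
    print(f"R={rating}, RD={rd}: weight = {weight(rating, rd, CUTOFF):.3f}")
```

Under this assumption a player sitting exactly on the cutoff gets a weight of 0.5 regardless of RD, and nobody's weight is ever exactly 0 or 1, which is what makes the cutoff "soft."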
 

a fairy

Antar, regarding your last point: one or two other users and I have been working on getting stats onsite, at least for the (currently) 7 official tiers (ou, uu, ubers, doubles, lc, and soon to come ru/nu, but I may be missing something).

Currently I think we have all of the XY (prebank when applicable) stats waiting to be uploaded; I'm just finding it hard to find the time to do that.
 
