
SDI stats


Viluin


Viluin, are you checking the SDI status of nations on the chosen AA as well as their opponents? It seems to me that the stats page lists every nuclear attack an alliance is involved in, both the ones its members launch and the ones they receive. If you haven't been, this might explain why an alliance like Darkfall is off even though all of its members have SDIs: some of their opponents might not.

Also, it really does matter that you filter out all the nations that don't have SDIs; I've played around with the numbers a little bit and a small downward shift in the hit rate can easily make the results non-significant for some of these alliances.

It also might be the case that thwarted nuclear attacks don't always show up on the reports page for some reason. I think it would be far more likely for there to be a glitch in the code causing thwarted attacks to be under-reported than for something as simple as P(hit)=0.4 to be screwed up.

Edited by Bakunin's Dream

[quote name='Bakunin's Dream' timestamp='1296933084' post='2620753']
Viluin are you checking the SDI status of nations on the chosen AA as well as their opponents? It seems to me like the stats page lists every nuclear attack that an alliance is involved with, not just the ones that they launch or they receive. If you haven't been, this might explain why an alliance like Darkfall is off even though they all have SDIs since some of their opponents might not.[/quote]

I know all of Darkfall's opponents in this war by heart, and I know they all had an SDI. If you look through Darkfall's nuclear reports you'll see that every nation involved has thwarted at least one nuke, with one exception (that guy only got nuked once) but he also has an SDI.

I check all nuclear attacks alliances are involved with. I sort the nuclear reports page by "Defend nation", which includes nations on that AA as well as their opponents. I view all nuclear reports, so the list is a mixture of thwarted attacks and direct hits. Defending nations with no thwarted attacks stand out, and I inspect their nation to see if they have an SDI. This way I can eliminate all non-SDI attacks from the equation.

[quote]It also might be the case that thwarted nuclear attacks don't always show up on the reports page for some reason. I think it would be far more likely for there to be a glitch in the code causing thwarted attacks to be under-reported than for something as simple as P(hit)=0.4 to be screwed up.
[/quote]

The reports seem to be accurate for my nation and all of my opponents, so I can't really comment on this. I usually attack at update too, which is the logical time of the day for such glitches to occur (if they exist).

Edited by Viluin

[quote name='Quinoa Rex' timestamp='1296930160' post='2620691']
I did look at the stats, and they might prove something if your methods were statistically valid, which they aren't. In order to have a properly random sample, you would have to pull data regardless of alliances from nations that hold an SDI.
[/quote]
His methodology actually does seem statistically valid. Going back to your earlier example, flipping a penny 1,000 times is unlikely to yield exactly 500-500, but you can calculate the likelihood of any particular deviation. For instance, you are more likely to get a number of heads between 400 and 600 than you are to get a number of heads between 100 and 200.
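To put rough numbers on the penny example, here is a minimal sketch that sums exact binomial probabilities for 1,000 fair flips. The class and method names (`CoinRange`, `probBetween`) are made up for illustration, and the log-space sum is just one convenient way to avoid overflow, not anything taken from the game.

```java
// Sketch: exact binomial probabilities for a fair coin, computed in log space.
final class CoinRange {
    // log of C(n, k), built from a running sum of logs to avoid overflow
    static double logChoose(int n, int k) {
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    // P(lo <= heads <= hi) for n independent flips with heads probability p
    static double probBetween(int n, double p, int lo, int hi) {
        double total = 0.0;
        for (int k = lo; k <= hi; k++) {
            double logTerm = logChoose(n, k) + k * Math.log(p) + (n - k) * Math.log(1 - p);
            total += Math.exp(logTerm);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("P(400..600 heads) = " + probBetween(1000, 0.5, 400, 600));
        System.out.println("P(100..200 heads) = " + probBetween(1000, 0.5, 100, 200));
    }
}
```

The first probability comes out essentially equal to 1 and the second is vanishingly small, which is the point of the comparison: the size of a deviation tells you how surprised to be.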

This also applies whether you flip the same penny 1,000 times or take 1,000 pennies and flip each of them once. You can also have ten jars of pennies and only flip the pennies from two of the jars, and it won't affect the results, because there is no difference between pennies from one jar to the next. This all applies to sampling in aggregate versus sampling individual nations, and to sampling by alliance. There are only two ways the sampling method could change the yield. One is if the RNG weren't random and instead carried a history for each nation, which actually still shouldn't affect an aggregate outcome. The other is if sampling from specific AAs rather than randomly across all of them made a difference, which would mean that the AA you are on affects the effectiveness of your SDI.

The factors that need to be accounted for are as follows:
Did everyone on the sampled AA have an SDI?
Did the opponents of everyone on the sampled AA have an SDI?

Or put more basically:
Did everyone sampled have an SDI?


If he accurately controlled for this and removed all non-SDI carrying nations from the sample, then his results are entirely valid and it means one of the consequences that Bakunin laid out is the culprit.

Edited by Delta1212

OK, I aggregated all of Viluin's data (N=6593) and the total hit rate is 49.52% with a 95% confidence interval of (48.316%, 50.729%). So it looks like the SDIs are working 50% of the time instead of 60% of the time. The fact that it works out to almost exactly 50% strongly suggests to me that the issue really is that P(hit)=0.5 instead of 0.4, rather than there being an error in the data collection process (we wouldn't expect it to work out to a round number like 50% if it were just a case of bad data).
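For anyone who wants to check an interval like this, here is a minimal sketch using the normal-approximation (Wald) formula. It is an assumption that this is the formula behind the numbers above, and the hit count of 3265 is also an assumption, back-calculated as 49.52% of 6593 since the raw count isn't given. The names `HitRateInterval` and `waldInterval` are illustrative only.

```java
// Sketch: Wald (normal-approximation) confidence interval for a proportion.
final class HitRateInterval {
    // Returns {lower, upper} for the true hit probability, given hits out of n attacks.
    static double[] waldInterval(long hits, long n, double z) {
        double pHat = (double) hits / n;                   // observed hit rate
        double se = Math.sqrt(pHat * (1.0 - pHat) / n);    // standard error of the rate
        return new double[] { pHat - z * se, pHat + z * se };
    }

    public static void main(String[] args) {
        long n = 6593;       // sample size from the post
        long hits = 3265;    // ASSUMED: 49.52% of 6593, rounded
        double[] ci = waldInterval(hits, n, 1.96);          // 1.96 is the usual 95% multiplier
        System.out.printf("95%% CI: (%.3f%%, %.3f%%)%n", 100 * ci[0], 100 * ci[1]);
    }
}
```

With those assumed inputs the interval comes out within rounding of the one quoted, and 40% sits far below its lower end.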

As far as I can tell, the bottom line here is that SDIs really work 50% of the time, not 60%.

Kind of funny how it took a giant war to give us the data set we needed to figure this out.

Edited by Bakunin's Dream

[quote name='Bakunin's Dream' timestamp='1296942866' post='2620978']
OK, I aggregated all of Viluin's data (N=6593) and the total hit rate is 49.52% with a 95% confidence interval of (48.316%, 50.729%). So it looks like the SDIs are working 50% of the time instead of 60% of the time. The fact that it works out to almost exactly 50% strongly suggests to me that the issue really is that P(hit)=0.5 instead of 0.4, rather than there being an error in the data collection process (we wouldn't expect it to work out to a round number like 50% if it were just a case of bad data).

As far as I can tell, the bottom line here is that SDIs really work 50% of the time, not 60%.

Kind of funny how it took a giant war to give us the data set we needed to figure this out.
[/quote]
Have you brought this to Admin's attention? It would be a very quick check to see if what you are saying is correct.

It would be a slower check to see if something else was polluting the data (like underreported thwarts, or some other factor affecting the base effectiveness of an SDI (which would be P(hit)=0.4)). Either way, I am confident enough in your methods (Viluin and Bakunin) to say an investigation into this is warranted.

Edited by iMatt

[quote name='iMatt' timestamp='1296956642' post='2621202']
Have you brought this to Admin's attention? It would be a very quick check to see if what you are saying is correct.

It would be a slower check to see if something else was polluting the data (like underreported thwarts, or some other factor affecting the base effectiveness of an SDI (which would be P(hit)=0.4)). Either way, I am confident enough in your methods (Viluin and Bakunin) to say an investigation into this is warranted.
[/quote]

I PMed admin about this thread.


  • 2 weeks later...

I think I know what's going on here. While SDI blocks are supposed to be completely random, the incoming attacks are not. Since a nation can only receive one nuclear attack a day, attackers will always be forced to stop after a successful hit. This means the number of nations that have their last nuke be a hit every day will probably be above 40%, and if attackers are aggressive and have large stockpiles it will be MUCH higher than 40%. Even if the SDI has exactly 60% success the attackers will keep attacking until their daily nuke string ends with a "hit".

Example: Let's look at some potential outcomes for 2 sequential nukes:
(Hit, Miss) (Miss, Hit)

If SDI odds were instead 50%, these two situations would be equally likely. However, the attacker in the first situation can only send one nuke that day, while the second attacker will likely attempt the second attack if they still have a stockpile. Therefore the actual outcome looks like this:
(Hit) (Miss, Hit)

So even if the SDI blocks exactly 50% of the time, the observed block rate goes down to 33% (one miss out of three launches). I don't know what the actual percentage would look like for 60/40, but there should be a shift that favors hits.
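One way to check this enumeration is to weight each outcome by its probability instead of counting outcomes equally, and to include the case where every launch is thwarted. The sketch below does that exactly (no random numbers), assuming launches are independent and that the attacker stops at the first hit or when a stockpile of `maxNukes` runs out. `StopOnHit`, `expectedCounts` and `maxNukes` are hypothetical names for illustration.

```java
// Sketch: exact expected hits and launches per attacker-day under "stop on first hit".
final class StopOnHit {
    // Returns {expected hits, expected launches} for one day.
    static double[] expectedCounts(double pHit, int maxNukes) {
        double expectedHits = 0.0;
        double expectedLaunches = 0.0;
        double allThwartedSoFar = 1.0; // chance that every earlier launch today was thwarted
        for (int launch = 1; launch <= maxNukes; launch++) {
            expectedLaunches += allThwartedSoFar;      // this launch only happens if nothing has hit yet
            expectedHits += allThwartedSoFar * pHit;   // and it lands with probability pHit
            allThwartedSoFar *= (1.0 - pHit);
        }
        return new double[] { expectedHits, expectedLaunches };
    }

    // Long-run share of launches that are hits, pooled over many attacker-days.
    static double aggregateHitRate(double pHit, int maxNukes) {
        double[] counts = expectedCounts(pHit, maxNukes);
        return counts[0] / counts[1];
    }

    public static void main(String[] args) {
        System.out.println("p=0.5, 2 nukes: " + aggregateHitRate(0.5, 2));
        System.out.println("p=0.4, 5 nukes: " + aggregateHitRate(0.4, 5));
    }
}
```

For p = 0.5 and two nukes the three outcomes are (Hit) at 50%, (Miss, Hit) at 25% and (Miss, Miss) at 25%, which gives 0.75 expected hits over 1.5 expected launches. Under these assumptions the pooled hit rate stays at p whatever the stockpile size, so stopping on a hit does not by itself shift the aggregate.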

EDIT: Just realized I was a week late on this thread but I hope this helps.

Edited by asawyer

[quote name='Viluin' timestamp='1296932582' post='2620732']
Alpha Omega:

Total number of nuclear attacks: 564

# of direct hits: 295
# of direct hits not involving an SDI: 57
# of direct hits involving an SDI: 238
[b]# of thwarted nukes (expected): 357
# of thwarted nukes (real): 269[/b]


Global Order of Darkness:

Total number of nuclear attacks: 1170

# of direct hits: 663
# of direct hits not involving an SDI: 196
# of direct hits involving an SDI: 467
[b]# of thwarted nukes (expected): 701
# of thwarted nukes (real): 507[/b]

I have yet to find an alliance that hasn't thwarted significantly less than 60%. They all seem to be in the 45-55% range. Maybe I'll do a big one like NpO later.

EDIT:

Proper Umbrella stats:

Total number of nuclear attacks: 1274

# of direct hits: 676
# of direct hits not involving an SDI: 52
# of direct hits involving an SDI: 624
[b]# of thwarted nukes (expected): 936
# of thwarted nukes (real): 598[/b]

NOTE: While searching through thousands of nuclear reports of all these alliances, I encountered a handful of nations that had never thwarted a nuke, but were deleted, so I couldn't check their wonders. I've always included those nations in the non-SDI category, to be on the conservative side. So, if anything, I've recorded too many (1%) non-SDI hits in all these samples.
[/quote]
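The "expected" figures in the quoted stats appear to be derived from the SDI hits alone: if 40% of attacks on SDI nations land, then each hit should come with 60/40 = 1.5 thwarts, so 238 hits imply about 357 thwarts. That reading is an inference from the numbers, not something stated in the post. A minimal sketch of the arithmetic, with illustrative names (`ExpectedThwarts`, `expectedThwarted`, `observedBlockRate`):

```java
// Sketch: expected thwarts implied by a hit count, and the observed block rate.
final class ExpectedThwarts {
    // Thwarts expected alongside sdiHits hits if the SDI blocks blockPercent% of attacks.
    // Percentages are passed as whole numbers so the 60/40 ratio is computed exactly.
    static long expectedThwarted(long sdiHits, int blockPercent) {
        return Math.round(sdiHits * (double) blockPercent / (100 - blockPercent));
    }

    // Share of attacks on SDI nations that were actually thwarted.
    static double observedBlockRate(long thwarted, long sdiHits) {
        return (double) thwarted / (thwarted + sdiHits);
    }

    public static void main(String[] args) {
        String[] names = { "Alpha Omega", "Global Order of Darkness", "Umbrella" };
        long[] sdiHits = { 238, 467, 624 };    // direct hits involving an SDI (from the post)
        long[] thwarted = { 269, 507, 598 };   // thwarted nukes (real) (from the post)
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%s: expected %d thwarted, observed %d (%.1f%% blocked)%n",
                    names[i], expectedThwarted(sdiHits[i], 60), thwarted[i],
                    100 * observedBlockRate(thwarted[i], sdiHits[i]));
        }
    }
}
```

Computed this way, the three alliances block roughly 53%, 52% and 49% of attacks on SDI nations, consistent with the 45-55% range mentioned in the post.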

The problem with aggregating data like this is that the SDI's performance is not aggregate. Each nuke is a separate check, independent of the previous.

It's the old coin toss question. Take 50 coin tosses: if the first 49 come up heads, what are the odds my 50th toss will be heads?


[quote name='TypoNinja' timestamp='1297798004' post='2634587']
The problem with aggregating data like this is that the SDI's performance is not aggregate. Each nuke is a separate check, independent of the previous.

It's the old coin toss question. Take 50 coin tosses: if the first 49 come up heads, what are the odds my 50th toss will be heads?
[/quote]

It's not. The sample size is simply too large for that. Although it is a coin toss, the sample size means that such a 20% difference would be nearly impossible by chance.


[s]Just realized I said something backwards in my previous post. If nations have infinite stockpiles the odds still approach 60/40. It's when nations have a limited number of nukes to launch that you see a shift closer to 50/50.[/s]

Now I'm really confused because I checked and the odds still go to 60/40 regardless. I still think the discrepancies have something to do with the 1 nuke a day limit but I can't figure out the math.

Edited by asawyer

[quote name='asawyer' timestamp='1297801767' post='2634629']
[s]Just realized I said something backwards in my previous post. If nations have infinite stockpiles the odds still approach 60/40. It's when nations have a limited number of nukes to launch that you see a shift closer to 50/50.[/s]

Now I'm really confused because I checked and the odds still go to 60/40 regardless. I still think the discrepancies have something to do with the 1 nuke a day limit but I can't figure out the math.
[/quote]
If nations have a limited number of nukes, there's a higher chance of a difference in probability: basic statistics. But it's just as likely to swing to 50-50 as it is to 70-30 in that case. It has nothing to do with the 1 nuke a day limit.


[quote name='NOMNOMNOMNOMNOM' timestamp='1297799100' post='2634597']
It's not. The sample size is simply too large for that. Although it is a coin toss, the sample size means that such a 20% difference would be nearly impossible by chance.
[/quote]

Nope. It's still only 50%. It's a common misconception. While the odds of getting 50 heads in a row are very low, the odds of any individual coin toss are always 50/50. So once I have 49 heads, the odds of me hitting 50 are still 50%.

When attempting to compile into an aggregate form data on probabilities that are independent of each other the aggregate is only guaranteed to match expectations when plotted along an infinite progression. Anything finite can vary. That's why we see streaks in practice.


[quote name='NOMNOMNOMNOMNOM' timestamp='1297803200' post='2634636']
If nations have a limited number of nukes, there's a higher chance of a difference in probability: basic statistics. But it's just as likely to swing to 50-50 as it is to 70-30 in that case. It has nothing to do with the 1 nuke a day limit.
[/quote]

No, the fact that you will always stop on a hit means that your sample data will always be slightly biased towards nukes slipping through.

If I need to fire three nukes on average to get a hit, I'm still going to stop on that hit, so if it's one miss then one hit, I stop there. That shows me 50/50 odds because I'm not going to let fly another miss after I already hit to keep the data statistically consistent.

Edit: Sorry I seem to have managed to quote my self instead of editing my original post, forgive my Double post :P

Edited by TypoNinja

[quote name='TypoNinja' timestamp='1297798004' post='2634587']
The problem with aggregating data like this is that the SDI's performance is not aggregate. Each nuke is a separate check, independent of the previous.

It's the old coin toss question. Take 50 coin tosses: if the first 49 come up heads, what are the odds my 50th toss will be heads?
[/quote]
Not quite.

For the coin toss question, the odds that the 50th toss will be heads is 50% (as it's independent of the previous flips). However, is it impossible for the first 49 flips to come up heads with the 50th also being heads? No, the odds are just highly against it happening and there is little expectation such a result would come about merely by chance (ie, you would suspect a trick coin).
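The two numbers in this argument can be written out directly. This is only a sketch of the arithmetic for a fair coin; `CoinStreak` and its methods are illustrative names.

```java
// Sketch: a long streak is very unlikely, yet the next flip is still 50/50.
final class CoinStreak {
    // Probability that every one of `flips` fair tosses comes up heads.
    static double probAllHeads(int flips) {
        return Math.pow(0.5, flips);
    }

    // P(toss number `flips` is heads | the previous flips-1 tosses were all heads).
    static double probNextHeadsGivenStreak(int flips) {
        return probAllHeads(flips) / probAllHeads(flips - 1);
    }

    public static void main(String[] args) {
        System.out.println("P(50 heads in a row)        = " + probAllHeads(50));
        System.out.println("P(50th heads | 49 heads)    = " + probNextHeadsGivenStreak(50));
    }
}
```

The first value is on the order of 10^-15, while the conditional probability is exactly 0.5. Both statements are true at once, which is why a 50-head streak makes you doubt the coin rather than the arithmetic.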

The same is true for the SDI data.

For the SDI, it's expected that the probability that it will block a nuke is 60%. Experimentally, however, the expected value does not fall within the confidence interval calculated by Bakunin's Dream, which calls into question the initial hypothesis that the SDI has a 60% chance to block the nuke. (i.e., H0: P(SDI Block) = 0.6, HA: P(SDI Block) =/= 0.6)
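A minimal sketch of that test as a one-sample z-test on the block proportion. The block count of 3328 is an assumption: it is 6593 minus an assumed 3265 hits, back-calculated from the reported 49.52% hit rate rather than taken from raw data. `BlockRateTest` and `zScore` are illustrative names.

```java
// Sketch: z-statistic for H0: P(SDI Block) = p0 against a two-sided alternative.
final class BlockRateTest {
    static double zScore(long blocks, long n, double p0) {
        double pHat = (double) blocks / n;                    // observed block rate
        double seUnderNull = Math.sqrt(p0 * (1.0 - p0) / n);  // standard error if H0 were true
        return (pHat - p0) / seUnderNull;
    }

    public static void main(String[] args) {
        long n = 6593;        // sample size from Bakunin's Dream's post
        long blocks = 3328;   // ASSUMED: 6593 total minus roughly 3265 hits
        System.out.println("z = " + zScore(blocks, n, 0.6));
    }
}
```

With these assumed counts the statistic lands more than 15 standard errors below zero, far outside the usual cutoff of about 1.96 for a 5% two-sided test, so chance alone is not a plausible explanation if the data are clean.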

You can make the argument that the sample size isn't large enough, but considering it's looking at 7k nukes, it's large enough for it to be statistically significant. You could also make the argument that the data is tainted with nations that don't have an SDI, but it doesn't seem like it based on Viluin's explanation of how he collected the data.

In addition, Bakunin's Dream makes an interesting point. Iirc, in the past, there were some complaints of nuclear attacks not showing up in the list (ie, someone nukes somebody else and it never shows up in the nuclear new report), so it's possible that some nuclear attacks were never reported (especially considering how many nukes do go flying around at war times). This would under-report the number of successful hits, I think, which could also explain the apparently lowered block rate.


[quote name='Iceknave' timestamp='1297804058' post='2634647']
Not quite.

For the coin toss question, the odds that the 50th toss will be heads is 50% (as it's independent of the previous flips). However, is it impossible for the first 49 flips to come up heads with the 50th also being heads? No, the odds are just highly against it happening and there is little expectation such a result would come about merely by chance (ie, you would suspect a trick coin).

The same is true for the SDI data.

For the SDI, it's expected that the probability that it will block a nuke is 60%. Experimentally, however, the expected value does not fall within the confidence interval calculated by Bakunin's Dream, which calls into question the initial hypothesis that the SDI has a 60% chance to block the nuke. (i.e., H0: P(SDI Block) = 0.6, HA: P(SDI Block) =/= 0.6)

You can make the argument that the sample size isn't large enough, but considering it's looking at 7k nukes, it's large enough for it to be statistically significant. You could also make the argument that the data is tainted with nations that don't have an SDI, but it doesn't seem like it based on Viluin's explanation of how he collected the data.

In addition, Bakunin's Dream makes an interesting point. Iirc, in the past, there were some complaints of nuclear attacks not showing up in the list (ie, someone nukes somebody else and it never shows up in the nuclear new report), so it's possible that some nuclear attacks were never reported (especially considering how many nukes do go flying around at war times). This would under-report the number of successful hits, I think, which could also explain the apparently lowered block rate.
[/quote]

By the way, some people haven't actually taken statistics; don't assume they have. But the rules say this: the larger the sample size, the closer the actual value will be to the expected value. With a sample size of 7000, the actual value should be very close to the expected value, but it isn't.

If some direct hits didn't show up on the list, that would explain the difference.


[quote name='NOMNOMNOMNOMNOM' timestamp='1297807252' post='2634694']
By the way, some people haven't actually taken statistics; don't assume they have. But the rules say this: the larger the sample size, the closer the actual value will be to the expected value. With a sample size of 7000, the actual value should be very close to the expected value, but it isn't.

If some direct hits didn't show up on the list, that would explain the difference.
[/quote]
Thanks for adding that clarification to my explanation. Sometimes I'm not as clear as I should be when discussing things like this, :(.


[quote name='Iceknave' timestamp='1297807693' post='2634714']
Thanks for adding that clarification to my explanation. Sometimes I'm not as clear as I should be when discussing things like this, :(.
[/quote]

I'm not very good at it either. I'm bad at explaining things I'm good at. But I suck at stats. So I can explain.


[quote name='TypoNinja' timestamp='1297803775' post='2634642']
No, the fact that you will always stop on a hit means that your sample data will always be slightly biased towards nukes slipping through.

If I need to fire three nukes on average to get a hit, I'm still going to stop on that hit, so if it's one miss then one hit, I stop there. That shows me 50/50 odds because I'm not going to let fly another miss after I already hit to keep the data statistically consistent.

Edit: Sorry I seem to have managed to quote my self instead of editing my original post, forgive my Double post :P
[/quote]

I don't think it actually works like that, but I'm not sure because it's a bit hard to grasp. Sure, you might thwart one and get hit, so that's 50/50, but there are plenty of people that'll thwart two (or more) and then get hit as well. In the end it should still even out to around 60/40 if the sample size is large. Besides, you don't stop nuking after a hit, you continue the next day. So it's still a constant stream of nuclear attacks happening worldwide.

Edited by Viluin

[quote name='Viluin' timestamp='1297815391' post='2634888']
I don't think it actually works like that, but I'm not sure because it's a bit hard to grasp. Sure, you might thwart one and get hit, so that's 50/50, but there are plenty of people that'll thwart two (or more) and then get hit as well. In the end it should still even out to around 60/40 if the sample size is large. Besides, you don't stop nuking after a hit, you continue the next day. So it's still a constant stream of nuclear attacks happening worldwide.
[/quote]

It's been a long time since I've done statistics, so I may be getting this wrong, but I think the major issue is that each individual SDI check/nuking is a separate event. That is, there is 60/40 for each event, but the events cannot affect each other. It means that the aggregate data does not have to resemble any individual odds. It's likely to, but is not required to.

If we were to plot SDI performance overall on a bell curve I think our midpoint would still be 60/40 (or close enough to not be statistically significant) but outliers will ruin any straight up averaging, because we know the standard deviation is absolutely huge. My SDI is running at around 2 for twenty-something, but my last target absorbed at least 5 a day before we got one through him and the target before that didn't block a single one all week.

These streaks should average out over a long enough timeline, but since we have no way of determining where in a streak any one SDI might be, any aggregate data will be polluted based on timing of the sample.

[quote]besides, you don't stop nuking after a hit, you continue the next day. So it's still a constant stream of nuclear attacks happening worldwide.[/quote]

Not quite. We aren't measuring how many hits per X nukes you get, though; we're measuring how many nukes it takes to get a hit. The difference is subtle but will bias the test results towards a hit, because if we expect there to be 4 hits in any sample of 10, but we get the 4th hit before the 10th nuke, we stop counting misses.

Edited by TypoNinja

[quote name='TypoNinja' timestamp='1297817582' post='2634924']
It's been a long time since I've done statistics, so I may be getting this wrong, but I think the major issue is that each individual SDI check/nuking is a separate event. That is, there is 60/40 for each event, but the events cannot affect each other. It means that the aggregate data does not have to resemble any individual odds. It's likely to, but is not required to.

If we were to plot SDI performance overall on a bell curve I think our midpoint would still be 60/40 (or close enough to not be statistically significant) but outliers will ruin any straight up averaging, because we know the standard deviation is absolutely huge. My SDI is running at around 2 for twenty-something, but my last target absorbed at least 5 a day before we got one through him and the target before that didn't block a single one all week.

These streaks should average out over a long enough timeline, but since we have no way of determining where in a streak any one SDI might be, any aggregate data will be polluted based on timing of the sample.



Not quite. We aren't measuring how many hits per X nukes you get, though; we're measuring how many nukes it takes to get a hit. The difference is subtle but will bias the test results towards a hit, because if we expect there to be 4 hits in any sample of 10, but we get the 4th hit before the 10th nuke, we stop counting misses.
[/quote]

My head hurts too much to think about this, so instead I am going to write a computer simulation with 100 nations that are getting nuked every day for 7 days. This can be simulated by simply stopping the nuclear attacks after 7 direct hits. My guess is the hit rate might be a bit more than 40% but still not close to 50/50.

Brb.

Edited by Viluin

[quote name='Viluin' timestamp='1297823133' post='2635045']
My head hurts too much to think about this, so instead I am going to write a computer simulation with 100 nations that are getting nuked every day for 7 days. This can be simulated by simply stopping the nuclear attacks after 7 direct hits. My guess is the hit rate might be a bit more than 40% but still not close to 50/50.

Brb.
[/quote]

Hey that'd be cool.


[quote name='TypoNinja' timestamp='1297817582' post='2634924']
Not quite. We aren't measuring how many hits per X nukes you get, though; we're measuring how many nukes it takes to get a hit. The difference is subtle but will bias the test results towards a hit, because if we expect there to be 4 hits in any sample of 10, but we get the 4th hit before the 10th nuke, we stop counting misses.
[/quote]
Ah, I think I understand what you're trying to say now.

The problem that TypoNinja is stating is that the events we're looking at are NOT independent (ie, generally, people don't care how many nukes they fire as long as they get a hit).

That is, the firing of a second (or later) nuke depends on whether the previous nuke hit or not. If it hit, no further nukes are fired. If it didn't hit, another nuke is fired (unless said nation has run out of nukes to fire, in which case they end on a miss).

Without this independence, it weakens the evidence given in support of the faulty SDI. Nevertheless, I think with a large enough sample size, the lack of independence wouldn't be as big of an issue.


[IMG]http://img513.imageshack.us/img513/6620/sdi.png[/IMG]

If the time is shortened to just 1 day instead of 7, but for 1000 nations, we get (ran 5 times):

Total hits: 1000
Total thwarted: 1517

Total hits: 1000
Total thwarted: 1554

Total hits: 1000
Total thwarted: 1568

Total hits: 1000
Total thwarted: 1493

Total hits: 1000
Total thwarted: 1433


The fact that you stop nuking for the day after a hit doesn't seem to matter. Although CN's RNG seems to be way more streaky than this one, lol.

[code]import java.util.Random;

public class SDI
{
    // Simulation size. Set NATIONS = 1000 and DAYS = 1 for the shortened run.
    static final int NATIONS = 100;
    static final int DAYS = 7;

    public static void main(String[] args)
    {
        System.out.println("SDI test. " + NATIONS + " nations get nuked for " + DAYS + " days \n\nResults: \n");
        Random rand = new Random();
        // total number of direct hits and thwarted nukes of all nations combined
        int totalhits = 0;
        int totalthwarted = 0;
        // a loop for every nation
        for (int nation = 1; nation <= NATIONS; nation++)
        {
            // total hits and thwarted nukes for this nation
            int hits = 0;
            int thwarted = 0;
            // a loop for every day
            for (int day = 1; day <= DAYS; day++)
            {
                // keep launching until one nuke lands, then move on to the next day
                boolean nuked = false;
                while (!nuked)
                {
                    // nextInt(10) is uniform on 0..9, so "< 4" is a 40% hit chance
                    if (rand.nextInt(10) < 4)
                    {
                        nuked = true;
                        hits++;
                        totalhits++;
                    }
                    else
                    {
                        thwarted++;
                        totalthwarted++;
                    }
                }
            }

            System.out.println("Nation " + nation + ": " + hits + " hits, " + thwarted + " thwarted.");
        }
        System.out.println("\nTotal hits: " + totalhits);
        System.out.println("Total thwarted: " + totalthwarted);
    }
}[/code]

Edited by Viluin

Yeah, sorry Viluin, I was wrong yesterday. The fact that people stop nuking after a hit shouldn't matter (the same reason you can't break a casino by having everyone play until they win). I think it's important to recognize that attacks are NOT independent events, so this may still skew the results. Not sure how though.

I checked Polar/Viridia SDIs for the last 2 days and they actually came out around 63% blocking (73% for Polar and 53% for Viridia :( ). I'm beginning to wonder if admin isn't using a random number generator, but instead some sort of preset distribution function. We've seen odd streaks in ground attack successes as well that don't seem to match the expected binomial distribution.
