What's the probability that in the next ten years, a nuclear bomb will be detonated on only the western half of the United States? We don't know, so 50-50 is our best guess.

What's the probability that in the next ten years, a nuclear bomb will be detonated on only the eastern half of the United States? We don't know, so 50-50 is our best guess.

What's the probability that no nuclear bomb will be detonated in the United States? We don't know, so 50-50 is our best guess. And, of course, the probability of two bombs, one in the eastern half and one in the western, is also 50-50.

So now we have four possible cases, all mutually exclusive, one of which must occur. And our best guess for the probability of each one is 50-50.

Similarly, our guess that at least one bomb will be detonated is also 50-50. So we have two events, each with a 50-50 chance. And the probability that both will occur is 50-50, the probability that neither will occur is 50-50, and the probability that one but not the other will occur is also 50-50.

Surely we can do better than these obviously nonsensical alleged "best guess"es.
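The four-case argument above can be checked with a few lines of arithmetic. This is only a minimal sketch: the case labels are shorthand for the scenarios described, and the uniform alternative is just one coherent assignment used for contrast, not a claim about the real odds.

```python
from fractions import Fraction

# The four mutually exclusive, exhaustive cases from the comment above.
cases = ["west only", "east only", "neither", "both"]

# The "we don't know, so 50-50" rule applied to each case separately.
naive = {case: Fraction(1, 2) for case in cases}
naive_total = sum(naive.values())

# Any coherent assignment over exhaustive, exclusive cases must sum to 1.
# The uniform split is one such assignment (itself only an assumption).
uniform = {case: Fraction(1, len(cases)) for case in cases}
uniform_total = sum(uniform.values())

print(naive_total)    # 2
print(uniform_total)  # 1
```

A total of 2 means the "50-50 for everything" rule cannot be a probability distribution at all, whatever the true numbers are.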

]]>@Porlock: Evens and odds were on the table because besides having one-to-one correspondence, they also are equiprobable. That is, in the limit, as an interval of integers gets larger without bound, the fraction of evens and the fraction of odds both equal 1/2. That is, of course, not true of squares and non-squares.
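The limiting-fraction point can be illustrated numerically. The sketch below (helper names are made up for illustration) counts evens and perfect squares among 1..n for growing n; it is a finite check, not a proof of the limit.

```python
from math import isqrt

def fraction_in_range(predicate, n):
    """Fraction of the integers 1..n that satisfy predicate."""
    return sum(1 for k in range(1, n + 1) if predicate(k)) / n

def is_even(k):
    return k % 2 == 0

def is_square(k):
    r = isqrt(k)
    return r * r == k

# The even fraction stays at one half; the square fraction keeps shrinking.
for n in (100, 10_000, 1_000_000):
    print(n, fraction_in_range(is_even, n), fraction_in_range(is_square, n))
```

Both sets can be paired off one-to-one with the integers, yet only the evens keep a fixed share of every long interval.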

]]>I'm not sure why it is that the statement is surprising. Zero and one both behave the same way - being squares of themselves. Two and all numbers beyond have corresponding squares.

Is it the behavior of zero and one that gives people trouble imagining this? It's also not difficult to imagine if one is geometrically minded: an actual square of any arbitrary unit in whole numbers can be constructed.

Zero and one are certainly naughty numbers, though. Paired up (for instance, in identity matrices in linear algebra) they wreak havoc. The proof of the PNT is concerned with the area between both numbers (the "critical strip") in which all of the nontrivial zeroes of the Riemann zeta function are proven to reside. They're also the only numbers one needs in order to represent any other number (the binary numbering system being the simplest).

As the "dynamic duo" of mathematics, I guess one should rather expect them to possess super powers. ;)

]]>Again, think: one to one correspondence. The proof is not difficult.

This has been called Galileo's Paradox, a pretty fair eponym, since it appears in Two New Sciences (1638) and does not seem to have been stated with the same clarity by any earlier author.

]]>You have failed to prove, of course, that the chances that the LHC can destroy the world are 0 when it is turned off. It may be that an unpowered LHC contains more dangers than a powered LHC.

:-)

In general, of course, people have forgotten to consider the post-black-hole-creation scenario: How do you comprehend "better", "worse" or even "destroy" in an environment where you cross a black hole boundary?

If a black hole event does occur, maybe heaven and earth as we know it will be destroyed and be instantly replaced with a better one.

At another level, what if we're crossing black hole boundaries all the time, but just don't know it?

--recherche

@Charles: I think it's clear that Charles Murray does understand that about averages. But I also think that IQ is a pretty crude way to assess people's intelligence. In my opinion, it measures the dot product between the test taker's mind and the test maker's mind, and little else. (I understand it's somewhat well correlated with future salary, though.)

]]>This is from the Wall Street Journal:

http://www.opinionjournal.com/extra/?id=110009531

"Today's simple truth: Half of all children are below average in intelligence. "

The interesting thing is that that is how averages ACTUALLY work. I think we know in which half Charles Murray belongs.

The 'less stable genetically' assertion seems intuitively correct, given the speculation that the Y chromosome is naught but a mangled chromosome. There are certainly a number of sex-linked traits, and a lot of them are unfavorable for males.

The second (that we're more likely to kill ourselves doing stupid shit) seems less so. I admit a certain bias toward more concrete fields of science (biology) than the weaker social sciences (psychology) which, perhaps, taints my expectations on this matter. It just doesn't seem terribly plausible that the margin of "doing stupid shit" would be so statistically high, given evolutionary theory. Clearly some females are interested in that trait, generally at ages prior to childbearing (the flocks of girls who surround street racers come to mind), but it obviously wanes as responsibility and that intrinsic desire for security asserts itself. There's probably some bias in that data along those lines, but I'm not comfortable in an assertion that as high as 1/20 males die doing stupid things....

As for the first bit, I confess an attraction to that assumption. I'm currently reading "Prime Obsession" and see analytical concepts swimming about in my head. My proclivity is to take extra care to not see the entire world (though I certainly see the world a different way now, perhaps irrevocably) through whichever lens I've just picked up, so I'm perhaps overly leery of looking for the relevance of power functions.

The elegance of notions such as π(x) ~ li(x) strikes me more, I think, because I've only recently taught myself calculus. Aside from calculating areas, I'm fascinated to see the integral pop up elsewhere (I expected it to, and am impatient to discover its application elsewhere). It's my current distraction from the nuisance that the practice of taking integrals itself has proven to be. ;)

]]>Seriously, if people estimate things they do not understand at 50:50, that would explain why most have trouble understanding the risks of rare events. Useful information, and not a little amusing.

Side note: This guy being a science teacher sounds fitting for the backwards, 3rd world country he is from...oh, wait.

]]>This podcast explains more (I can't find a better link that is more... readable).

]]>You say "I'd tend toward a Gaussian distribution over other guesses on probability such as 50/50, seeing as how often that distribution appears in real life scenarios."

But I would bet a small sum that power laws are much more common. Which is why this conversation is happening on Bruce's blog, not yours or mine.

As for male/female gender split, at conception the ratio is more like male 55% (we are less stable genetically than females, apparently) and at birth around male 51% (we are also more likely to kill ourselves doing stupid shit than females).

]]>Mr. Schneier, let me remind you that you do not have a Ph.D. degree.

This made me smile. This is akin to saying "people don't fall down manholes if you put covers on them." It's certainly true, but it's not exactly the point. ;)

]]>Now a couple of flips that all come up heads does not (statistically) say much about the behaviour of this coin. But a trillion flips in a row that all came up heads (or even a few hundred!) would seem to be STRONG evidence that the coin is NOT unbiased.
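To put rough numbers on this, here is a minimal sketch of how fast the all-heads probability shrinks for a fair coin. The function name and the default of one half are illustrative assumptions; independence of flips is assumed throughout.

```python
from fractions import Fraction

def prob_all_heads(n_flips, p_heads=Fraction(1, 2)):
    """Probability that n_flips independent flips all come up heads."""
    return p_heads ** n_flips

# A couple of heads is unremarkable; a hundred in a row is not.
for n in (2, 10, 100):
    print(n, float(prob_all_heads(n)))
```

Two heads in a row happens a quarter of the time with a fair coin, so it says little. A hundred in a row has a probability below 10^-30 under the fair-coin assumption, which is why a long run counts as strong evidence of bias.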

]]>Of course, there are some competent, dedicated high-school math teachers. But the probability of finding one in your local high school is ... well, let's say it's less than 50%.

>Better still, yyyy-mm-dd, as per ISO 8601

Yes, this looks like the way to resolve the problem once and for all. I should just get into the habit of entering dates into applications as yyyy-mm-dd. I haven't encountered an application that gets that wrong.

@Brian Feir

>Which is why people who want to be precise don't use dd/mm/yy or mm/dd/yy

The point I was trying to make was that this is often the default display format for an application and it's not always possible to override it.

For example I entered the date 2008-12-1 into a Google Docs spreadsheet and it helpfully translated it to 12/1/2008 because I had defaulted to a US locale. I switched to a UK locale and it did the right thing and displayed it as 1/12/2008 but you can't tell at a glance what the date is without knowing that.

]]>Secondly, anybody putting me and a friend in that situation is an ass, and I'd have no trouble telling them so, too, because anyone who'd do that, I wouldn't trust them if they said they'd let me go if we did guess the same number.

Lastly, I don't think estimation by extortion is a good motivational example. People don't tend to think rationally when their life is in danger (so maybe I'm kidding myself when I say I'd tell this guy he's an ass). So no, I don't think this is "classical game theory," or at least I don't think that classical game theory has much to say about this game. Psychology, though, maybe.

@Ward Denker: You're quite right that we don't generally have a complete lack of data. Such scenarios are usually artificial.

The problem in the original story is that there is some data (or foreknowledge, or whatever you may call it), but the science teacher steadfastly refuses to use it. Or is not properly equipped to use it. Or both. At any rate, he lazily ignores whatever other information is available, and assumes that two options divide 50/50 by default. There is no default, except in people's imagination.

Gaussian distributions typically come about because of something in nature approaching the central limit theorem. So again, if I *know* that there's a sum of lots of iid variables, then I guess Gaussian, sure. But absent that, I have no reason at all to expect Gaussian. I have no reason to expect anything at all. It's not nihilist not to expect anything, if you really don't know anything. (But since you will practically never run into a situation of which you know literally nothing, this means it's important to be aware of exactly what you do know.)
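The central-limit point can be seen in a small simulation. This is a sketch under stated assumptions: the choice of 30 uniform draws per sum, the sample count, and the seed are all arbitrary, and the "within one standard deviation" share is only a rough fingerprint of a Gaussian shape.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sum_of_uniforms(n_terms):
    """Sum of n_terms independent draws from Uniform(0, 1)."""
    return sum(random.random() for _ in range(n_terms))

samples = [sum_of_uniforms(30) for _ in range(20_000)]
mean = statistics.fmean(samples)
stdev = statistics.pstdev(samples)

# For a Gaussian, about 68% of values fall within one standard deviation.
within_one_sd = sum(abs(x - mean) <= stdev for x in samples) / len(samples)
print(round(mean, 2), round(stdev, 2), round(within_one_sd, 2))
```

The individual draws are flat, not bell-shaped; the Gaussian look comes only from adding many independent ones, which is exactly the knowledge you need before guessing Gaussian.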

By the way, there are exactly as many evens as odds. Arithmetic with infinite cardinalities doesn't work the same way as with integers. There's a one-to-one mapping between evens and odds whether you include zero or not. (If you include zero, use f(x) = x+1. If you don't, use f(x) = x+1 if x > 0, and x-1 if x

]]>Better still, yyyy-mm-dd, as per ISO 8601 ( http://en.wikipedia.org/wiki/ISO_8601 http://www.iso.org/iso/support/faqs/faqs_widely_used_standards/widely_used_standards_other/date_and_time_format.htm )

]]>The "weird prison cell thing" is a classic game theory situation, the point of the lesson being that humans [whose backgrounds are similar] may be able to guess what each other would answer to certain kinds of problems.

Like Pat says, the game is not so much to match reality as to match the other person's guess.

]]>> In summary, I think it's perfectly reasonable to expect that

> the majority of people who one may encounter are relatively

> untrained in mathematics (and have forgotten much of what

> they might have once understood). It's also probably

> reasonable to expect that they'll have numerous cultural

> biases which shape the way they think.

I'm not sure if I'm misreading you, or you misread me, or we're both agreeing with each other, because that was sort of my point :)

Ah, I see... I mis-read the scenario that Nannite gave, my bad. Her example was:

"You and a friend are taken into custody and put into two separate jail cells where you cannot communicate with each other. You are told some unknown experiment is taking place and that there are two possible outcomes, A and B.

You are then forced to guess the probability of A occurring. Your friend is in the same situation and you must both guess the same value otherwise you will be killed."

In this case, you're right, what you are guessing is what your friend is guessing, not what the actual probability of A occurring *is*.

Well, in this case, what you're doing has nothing to do with statistics and probability distributions and everything to do with knowing what the other person would guess. This doesn't tell us anything interesting about probability or the "reasonableness" of picking "50-50" as a default guess for an actual probability distribution, though. Maybe that's why I mis-read the example :)

]]>Which is why people who want to be precise don't use dd/mm/yy or mm/dd/yy... they use yyyy/mm/dd. There's no month/day swap possibility in that order, and it has two other advantages: it makes yyyy/mm/dd hh:mm:ss a monotonic progression from largest to smallest units, and a trivial ASCII sort implementation will sort the dates in chronological order.
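The sorting claim is easy to check. The sketch below uses three arbitrary sample dates and Python's default string sort as a stand-in for "a trivial ASCII sort"; the variable names are illustrative.

```python
from datetime import datetime

dates = [
    datetime(2009, 2, 1),
    datetime(2008, 12, 1),
    datetime(2009, 1, 2),
]

iso_strings = [d.strftime("%Y/%m/%d") for d in dates]
us_strings = [d.strftime("%m/%d/%Y") for d in dates]

# Plain string sort of year-first text matches chronological order...
iso_sorted = sorted(iso_strings)
# ...but the same sort applied to month-first text does not.
us_sorted = sorted(us_strings)

print(iso_sorted)
print(us_sorted)
```

This works only because the fields are zero-padded and ordered from largest unit to smallest, so character-by-character comparison agrees with date comparison.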

]]>I never use the word "billion" myself because of that niggling feeling of impreciseness as soon as you utter it.

You might as well just revert to an exponential notation and be done with it - at least it's unambiguous. The SI approach has some appeal too, but I'm in a country (Australia) which was sensible enough to go metric a long time ago so it seems pretty natural.

The same problem occurs with dates of course. There's no way of telling whether 1/2/09 refers to Jan 2 or Feb 1 because of the same transatlantic confusion so that notation might as well be abandoned as well.
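The ambiguity is easy to reproduce: the same text parses cleanly under both conventions and yields two different days. A minimal sketch using Python's standard `datetime.strptime` (the variable names are illustrative):

```python
from datetime import datetime

text = "1/2/09"

as_us = datetime.strptime(text, "%m/%d/%y")  # month first
as_uk = datetime.strptime(text, "%d/%m/%y")  # day first

print(as_us.date())  # 2009-01-02
print(as_uk.date())  # 2009-02-01
```

Neither parse raises an error, so nothing in the string itself tells you which reading was intended.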

Lack of data doesn't imply lack of foreknowledge. Personally, I'd tend toward a Gaussian distribution over other guesses on probability such as 50/50, seeing as how often that distribution appears in real life scenarios.

However, most people have at least knowledge of elementary arithmetic, in which there are a lot of basic assertions for a number of common events. Those of us who frequent Bruce's blog are probably a standard deviation (perhaps several) off of that norm. ;)

How many integers in the infinite series are even versus odd? 50/50. (Technically there is exactly one more even number than all of the odds, which is zero.) Negative vs. positive? If you were asked only to guess whether the number I was thinking of was even or odd, you'd have an equal chance. It's generally safe to assume that, should anyone ask you to pick a number between two arbitrary natural numbers, they're not going to expect you to be a shit and throw π or e out as your guess, after all.

How about a child being born male or female? 50% How about the probability that it's night or day out at any given time?

Coin flips are known to everyone as are games of rock-paper-scissors to decide who begins a match (the latter being somewhat deterministic, as individuals tend to have biases).

People often think in terms of 'right' and 'wrong' or 'good' and 'evil', etc. These characterizations lead us to make a lot of assumptions, like that of a 50/50 split.

I think those ideas are pretty well ingrained in many cultures, such as the yin and yang of Chinese culture, for example.

In summary, I think it's perfectly reasonable to expect that the majority of people who one may encounter are relatively untrained in mathematics (and have forgotten much of what they might have once understood). It's also probably reasonable to expect that they'll have numerous cultural biases which shape the way they think.

]]>At the moment? Zero.

It's been turned off.

Hey, we Amurricans at least know what we mean when we use the word. Well, anyway, the 20% or so who do know don't have to explain to each other which value they're using.

Incidentally, in 1969 I heard Tony Benn use billion in the American sense, 10^9, in a speech in Commons. I infer the sense from the fact that he was talking about North Sea oil reserves, and was not measuring in milli-barrels or anything. And it wasn't just leftie-usage back then: the OED finds a 10^12 trillion in the Telegraph from 1971.

]]>> With no other information you would guess 50%.

> It is the only logical equilibrium.

No, there *is* no logical equilibrium. You have *no* data.

With the scenario you are given, you don't even know what the significant figures are.

Without anything resembling real data, you're just as well off saying, "How many significant figures?" and then rolling percentile dice and taking that as your "guess".

You actually have a much better chance of picking the correct answer by complete random chance (again, assuming no data), because humans are wired to pick non-random numbers. I would guess that a huge number of people would guess 50-50, then 60-40, and then probably 7, 11, 17, 12, and other metaphysically laden numbers would be next most likely choices for people to select... but none of them is quantifiable as being "more likely" to be the correct answer than some random percentage.

]]>You, sir, win the Internet!

]]>What... is your name?

OBAMA: It is 'Obama', President of the United States.

BRIDGEKEEPER: What... is your quest?

OBAMA: To fight the Global Recession.

BRIDGEKEEPER: How... many millions are in a trillion?

OBAMA: What do you mean? An American or European trillion?

BRIDGEKEEPER: Huh? I-- I don't know that. *Auuuuuuuugh!*

BIDEN: How do you know so much about trillions?

OBAMA: Well, you have to know these things when you're a President, you know.

]]>> I would never pretend that The Daily Show is good news, but sadly it is the *best* news currently available on television.

Which is why I stopped getting my news from TV years ago. I get my news now from two sources: NPR and via a specific simple routine of gathering information about a story from the internet that I've developed.

I like NPR because they have nothing but their voice to communicate with you. No flashy graphics or videos. No body language. Just words (and inflections/tones woven in between the words). It is far less entertaining than TV news, but that's the beauty of it. They aren't there to entertain, they are there to report, and it is some of the best reporting left in the US.

As for the routine I use to get information on a story on the internet, here it is:

1) Go to CNN (its order here is random) -- this is more or less what the left is saying about the matter

2) Go to Fox and find the same story -- this is what the right is saying

3) Go to an online guide about how to detect bias in reporting (there's lots of them out there)

4) Realize that both CNN and Fox have sub-par reporting

5) Go to news.google.com and search for that story

6) Read several articles and blog comments on that story

7) If still not satisfied, search Wikipedia on related topics (e.g., medicine, science, politics, cultures, religions, etc.)... bonus points if you end up reading peer-reviewed papers on the topic (citations in Wikipedia are wonderful)

This takes more time, but you'd be surprised how much more time you have to do this when you're not watching all the commercials and pointless/repetitive reporting on TV.

]]>We (were) talking about dollars which are counted in powers of ten. Anyways, some people prefer kibi-, mebi-, gibi-, etc. as the binary alternatives to the SI prefixes.
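For anyone who wants to see the size of the gap between the two families of prefixes, here is a minimal sketch. The dictionary names are illustrative; the values are the standard powers of ten for the SI prefixes and powers of two for kibi-, mebi-, and gibi-.

```python
# Decimal (SI) prefixes versus the binary prefixes mentioned above.
SI = {"kilo": 10**3, "mega": 10**6, "giga": 10**9}
IEC = {"kibi": 2**10, "mebi": 2**20, "gibi": 2**30}

# Pair each decimal prefix with its binary counterpart and show the ratio.
for (si_name, si_val), (iec_name, iec_val) in zip(SI.items(), IEC.items()):
    print(si_name, si_val, iec_name, iec_val, round(iec_val / si_val, 3))
```

The mismatch is about 2.4% at the kilo level and grows with each step, which is why the distinction starts to matter at larger sizes.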

]]> - Thousand

- Mono-illion (M'illion)

- Bi-illion (B'illion)

- Tri-illion (Tr'illion)

- Quad-illion (Quadrillion)

etc etc etc. In Britain, a thousand thousand thousands is called a Million (i.e. a thousand millions), and I think a thousand thousands is also called a million-- hence why POSIX.1 establishes a million as a "Thousand Thousands" in some definitions of time functions.

]]>According to the Shorter Oxford, 6th edition:

Trillion - Originally (especially in the UK), a million million million (10^18). Now usually (originally US), a million million (10^12; cf. billion)

Billion - 1. A million million, 10^12. Cf. trillion. (Now only in British popular use.)

2. A thousand million, 10^9.
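The two sets of definitions quoted above can be written down as powers of ten. This is only a sketch: the labels `SHORT_SCALE` and `LONG_SCALE` and the helper names are mine, not the dictionary's, and the exponents are the ones given in the entries.

```python
# Powers of ten for each name under the two conventions quoted above.
SHORT_SCALE = {"million": 6, "billion": 9, "trillion": 12}   # "now usual" sense
LONG_SCALE = {"million": 6, "billion": 12, "trillion": 18}   # original UK sense

def value(name, scale):
    """Return the integer value of a number name under a given scale."""
    return 10 ** scale[name]

def millions_in(name, scale):
    """How many millions make up one `name` under the given scale."""
    return value(name, scale) // value("million", scale)

print(millions_in("trillion", SHORT_SCALE))  # 1000000
print(millions_in("trillion", LONG_SCALE))   # 1000000000000
```

So "how many millions are in a trillion?" has two answers that differ by a factor of a million, depending on which convention the speaker has in mind.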

The units of measurement used in computer science are NOT SI units. They all derive from the binary system, so be aware that kilo = 1024, not 1000. A kilobyte of RAM/disk is 1024 bytes, a megabyte is 1024 * 1024 bytes, and so on.

]]>If we want useful probabilities, we need to work from observed facts and accepted science, rather than arbitrarily declaring all imaginable possibilities equally likely.

I'll confess I don't understand the "high entropy" thing, but if entropy is what we want, probably we really could use hot air.

As to people in the UK grasping the difference between 'point in time' and 'predictive cumulative' probabilities, or the value a trillion represents, I'd hazard a guess at

]]>Give us time.