Jono's Tall Tale
Lies, damn lies and statistics. Hello, my name is Jono Tuke and I am a statistician, and I am bloody good at all three.
ACEMS AI
Jono Tuke
I feel that I have an unfair advantage over my esteemed colleagues. For those who do not know academic speak - 'esteemed colleagues' is the equivalent of 'liar, liar, pants on fire'. And with all due respect to them - again, the equivalent of 'you, me, car park, now' - they are out of their depth. When it comes to making shit up I am a professional. I lie all the time, including this very sentence.
But tonight, for a change, I would like to tell the truth, nothing but the truth, a 'you can't handle the truth' big secret.
Come closer. Closer.
I know where you live.
But how - I rightly hear you ask - because you tweet, blog, chat, and take pictures of your breakfast. Yes, we are using social media to track you.
Now I can see a load of you quickly opening your Twitter account and switching off the GPS tracking with a hashtag - #gpsoff, #jonocantfindme, #upyoursjono - and feeling smug.
But it is too late. The how of the process is not the GPS location tracking but the hashtags and the words you use: these give me your location.
How? Let me explain. I will use Twitter to illustrate.
First, we are happy that if you GPS locate while you tweet, then I know where you are? Fair. But what about when you do not GPS locate? How do I know then?
Let us assume that this person here on the front row has let the cat out of the bag with a quick tweet of
Help I am stuck in the Rob Roy with a shouty Mancunian skinhead talking about statistics #bestnightever.
And the GPS locates it.
Now imagine that I know that this person here is near to the first person. Well it is easy to locate the second person.
But wait - how do I know that this person is close?
Well, you are thinking too literally about the concept of distance. Remember I am a statistician. If you prick me do I not bleed? Actually no, and that is getting off the point. No, underneath it all a statistician is just a mathematician, albeit a mathematician with a fluid sense of ethics, but still a mathematician, and we mathematicians have an abstract concept of distance.
So, what do we mean by a distance?
First, a distance takes two objects in space and gives us a number such that large values mean they are far apart and small values mean they are close.
Second, if the distance is zero, then the objects are at the same point. Third, the distance from one object to another is the same as the reverse.
And finally, we have the triangle inequality - consider three points, then the distance between any two will always be less than or equal to the sum of the distances from each of those two points to the third.
Notice that I have not discussed metres, inches, millimetres. Not even stated that I need your bog-standard space. I could have any space and thank you very much I will.
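For the sceptics in the audience, those four rules can be checked mechanically on a handful of points. What follows is only a sketch: the helper name `is_metric` and the two toy distances are made up for illustration, and the zero rule is checked in both directions (zero distance exactly when the two points are the same), which is slightly stricter than the wording above.

```python
from itertools import product

def is_metric(points, dist, tol=1e-12):
    """Brute-force check of the distance rules on a finite set of points.

    `is_metric` is a hypothetical helper for illustration only.
    """
    for a, b in product(points, repeat=2):
        d = dist(a, b)
        if d < -tol:                       # a distance is never negative
            return False
        if (d <= tol) != (a == b):         # zero exactly when same point
            return False
        if abs(d - dist(b, a)) > tol:      # same there as back
            return False
    for a, b, c in product(points, repeat=3):
        # triangle inequality: going via c is never a shortcut
        if dist(a, b) > dist(a, c) + dist(c, b) + tol:
            return False
    return True

points = [0, 1, 2, 5]
absolute = lambda a, b: abs(a - b)       # an ordinary distance on numbers
squared = lambda a, b: (a - b) ** 2      # fails the triangle inequality

ok_absolute = is_metric(points, absolute)
ok_squared = is_metric(points, squared)
```

The squared difference fails because going from 0 to 2 directly costs 4, while stopping off at 1 costs only 1 + 1.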
How about a linguistic space? The language we use can be used as a distance.
The hashtags you used tonight #talltales #statsiscool #nowayjonowouldlietous identify you as close to your fellow spectators. Not just the hashtags, but also the words in the rest of the tweet.
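One way to put a number on 'close in linguistic space' is the Jaccard distance between the sets of words and hashtags in two tweets. This is a sketch of the idea under that assumption, not necessarily the distance used in the actual work; the function names and example tweets are invented for illustration.

```python
import re

def tokens(tweet):
    """Lower-case words and hashtags in a tweet, as a set."""
    return set(re.findall(r"#?\w+", tweet.lower()))

def jaccard_distance(tweet_a, tweet_b):
    """1 minus (shared tokens / all tokens); 0 = same vocabulary, 1 = nothing shared."""
    a, b = tokens(tweet_a), tokens(tweet_b)
    union = a | b
    if not union:                 # two empty tweets: treat as identical
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Hypothetical tweets, for illustration only
you = "Great night #talltales #statsiscool"
neighbour = "Loving #talltales tonight #statsiscool"
stranger = "Having a barbie with a couple of snags"

d_close = jaccard_distance(you, neighbour)
d_far = jaccard_distance(you, stranger)
```

Here `d_close` is smaller than `d_far`, because the first two tweets share hashtags and the third shares nothing with them.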
Statements like:
Having a barbie - a nice couple of stubbies in the eski and a couple of snags make for a great arvo.
Me: Having a BBQ - a nice couple of pints in the cooler box and a couple of sausages resulted in a great afternoon.
Every 'go the Crows' or 'come on Macc Town' locates you.
So how do we do it? I find who you are close to in linguistic space and from that infer that you are close to them in geographic space. And finally, if they give me their GPS point, I have your GPS point.
While we are talking about language, I can also use it to get your location. Just because you switch off GPS does not mean that there is no location information.
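As a toy version of that recipe: take the tweets that do carry a GPS point, find the one nearest to yours in linguistic space, and borrow its coordinates. Everything here is an assumption for illustration: the Jaccard word-set distance, the function `infer_location`, the example tweets, and the rough coordinates for Adelaide and Macclesfield.

```python
import re

def tokens(tweet):
    """Lower-case words and hashtags in a tweet, as a set."""
    return set(re.findall(r"#?\w+", tweet.lower()))

def jaccard_distance(tweet_a, tweet_b):
    """Linguistic distance: 1 minus the share of tokens in common."""
    a, b = tokens(tweet_a), tokens(tweet_b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def infer_location(tweet, located):
    """Return the GPS point of the located tweet nearest in linguistic space.

    `located` is a list of (tweet_text, (latitude, longitude)) pairs.
    """
    _, gps = min(located, key=lambda item: jaccard_distance(tweet, item[0]))
    return gps

# Hypothetical GPS-tagged tweets with approximate coordinates
located = [
    ("go the crows great arvo at the footy", (-34.93, 138.60)),     # Adelaide
    ("come on macc town cold night at the match", (53.26, -2.13)),  # Macclesfield
]

# A tweet with GPS switched off
guess = infer_location("having a barbie this arvo then go the crows", located)
```

With these made-up tweets, `guess` comes out as the Adelaide point, since 'arvo' and 'go the crows' overlap with the first tweet far more than with the second.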
You take photos, tell me you are at the Rob Roy, tell me it is raining. All of these help me locate you.
But why, I hear you ask - why do you want to know where we live? Come on, seriously, why would I not?
Oh, you are serious - you really want to know.
Imagine a hypothetical scenario where a country decides to have an election and someone suggests that we should build a moat with crocodiles to keep 'Johnny Foreigner' out, or another country decides to have a referendum to see if it should leave the human race - because it noticed that some of the humans looked a bit funny and spoke a bit funny. In both cases, we can feel safe in the knowledge that this crazy thing will not happen, as everyone on social media says it will not happen. And then they bloody well happen.
As a statistician, making predictions that are wrong is annoying. So we want to locate Twitter users. That way we can compare what they say online with what people in similar areas say via the election, and then we can adjust one opinion to estimate the other.
I can finally use your twitterings for prediction, not just to make me very, very angry.
So, I can locate you not on where you said something but on what you said. And the take home message is this – If you did not pay for it, then you are the product.
Is Jono telling the truth?
No, it is a lie because while we can use this method to GPS locate people through their tweets, people mainly tweet when they are away from home or out for the night. We wanted to know where tweeters lived, but instead we got where they partied.