The Dumb Data Science X vs Y Wars


Yep, it’s a rant… Announcing up front that a blog post is a rant has been almost a tradition since before blogs were cool, back when web forums were the thing. If someone in some obscure web forum started a post with <rant>, “Rant Mode On” or some derivative, he was signaling “I’m not a troll, but I’m really upset!”

It is a mixed signal, somewhere between “keep reading because I’m going to get nasty and you’ll like it” and “you can skip this if you’re the kind of reader who needs an apology whenever a web user gets crazy”.

Either you’ve got the idea by now, or you’ve known what I’m talking about for two whole paragraphs and you’re itching for the juicy stuff. So, here goes…

Rant Mode On!

Retention and Churn

This post was written 10 months ago… yep, right after Retention 101. Since then it has been in and out of the publishing queue. I’ve been picking up things to improve in it, but it doesn’t make sense to keep holding it back… and it really took too long! I wanted to improve it beyond this, but it’s better to simply publish it, and follow up if I make up my mind about what that magical improvement is, than to leave it lingering in the Drafts section any longer.

This post is about ways of measuring retention, how each of them relates to true churn, and which one should be used.

The Retention 101 post was an overall intro. I gave the formula generally used to calculate retention and mentioned that there are other ways of calculating it. This post is about those additional formulas, namely rolling retention and rolling window retention, and also about churn.

Each retention formula has strengths and weaknesses. Some are better suited to reporting, others to modelling, and each has a different relationship with churn. Let’s start!

Databases and tables for game analytics


It has been some time, so let’s recap how we got here. First I gave an overview of what a game analytics stack can be. Then I moved to the planning stage, pointing out the steps from zero to data science. After that, a couple of posts in this category covered basic events: first how to think about and define them, then the structure of the data created from those events. The last couple of posts were about user state, what it is and how we can use it.

I think it is abundantly clear that there is method in the madness! Today I’ll write about the databases and tables needed for basic reporting. I’ll cover not only the definition of the fields but also different structures and technical considerations.