Who wants to be a Data Scientist at Miniclip?!


A small interruption to the blog's holiday break to let everyone know that Miniclip is hiring… well… we’ve been hiring for quite a while, so I guess it’s not a big surprise. The reason I’m posting is that we’re looking for a Data Scientist to work with yours truly. For dramatic effect, please pretend that you didn’t read the title and are really surprised!

You can see the job description (and apply!) by following this link. However, I want to give you a bit more information than just the job title and job description.

At Miniclip anyone in the Data Science and Data Engineering teams can be involved in data science projects. Analysts, engineers, scientists and team leads have different operational responsibilities but our major strength is that we work as a multidisciplinary team.

And what data science projects are those, you ask? As I see it, Data Scientists build data products. Data products are automated or interactive data-centric applications that would not be possible using traditional systems. What on earth does this mean? How about some real examples at Miniclip:

  • UA LTV: An interactive data product that allows business users to analyse predicted LTV across all possible cohort combinations. Business users can also export data to create reports. This export includes the LTV predictions mentioned above as well as retention rate predictions. This data product was coded in R with Shiny and interacts with S3, Redshift and EC2 instances in AWS. Cute statistic: although the models are very simple, there are 724 of them in the application. At peak it runs more than 200,000 predictions in under 4 minutes. Not too shabby for an interactive application.
  • Fraud Detection: An automated predictive data product coded in R running on an EC2 instance. Although it is a rather simple script, the beauty is that the algorithm was coded in-house. Redshift and MySQL are also used.
  • The Super Hyper Secret Mega Project That I Will Not Name: I know… I know… if I’m not going to explain it, why say it? Because of the history and the tech. This is the project I’m currently working on and it is the end game of almost a year of prototypes, investigation and analysis. From random forests and SVMs to association rule mining, from Python to R, from local data sets to terabytes of data.
  • User Stats: An interactive data product that I list here because it is NOT predictive. In a nutshell, no dashboard tool could create exactly what our Customer Support team needed… so we did it ourselves! This application queries and builds visualisations across billions of rows of data.

We have many cool projects to build and a lot of things to learn together. If building machine learning models, writing code, building applications and playing games is your thing and you are not afraid of a lot of data, click here and see you soon!


The Dumb Data Science X vs Y Wars


Yep it’s a rant… announcing ahead that a blog post is a rant is almost a tradition since before blogs were cool and web forums were the thing. If someone in some obscure web forum started a post with <rant> or “Rant Mode On” and other derivatives, he was signaling “I’m not a troll but I’m really upset!”

It is a message with mixed signals, somewhere between “keep reading because I’m going to get nasty and you’ll like it” and “you can just skip this if you need apologies if and when a web user gets crazy”.

Either you’ve got the idea by now or you’ve known what I’m talking about for two whole paragraphs and you’re itching for the juicy stuff. So, here goes…

Rant Mode On!

Learning data science with John Oliver


You know this guy, right? In case you don’t, he is John Oliver, an English comedian with a perspective on the modern world that can only be matched by his distinctive voice!

I saw the video below some time ago. In it, John Oliver presents in his usual style what is wrong with how science is used and presented. I won’t discuss the large number of pet peeves I have with what I see in mainstream media or shared on Facebook regarding science or the lack of it. It would be out of context, too long and, to be intellectually honest, incredibly boring, especially after John’s tremendous piece.

Instead, I want to invite you to watch the video, in case you haven’t, and I’ll tell you why I believe his views are also important in the context of data science.

You might be a data redneck…


I like comedy a lot and stand-up in particular. Some years ago I saw a video of The Blue Collar Comedy Tour. While I am not a fan and was only mildly entertained, there was a piece of it by Jeff Foxworthy that, I learned later, is sort of his stand-up business card. That piece is widely known as “You might be a redneck”.

To Jeff, the definition of redneck is, and I quote, “The glorious absence of sophistication”. Let’s save this bit for later…

The reason why I’m writing this post is because in this day and age every knowledge worker claims to be data driven(*)… and many aren’t. This is a very touchy subject. The reason is simple. If everyone around me says they are data driven, it is very hard for me to admit that I’m not. It is even harder to say “I don’t know” when everyone seems to know.

Trust me on this, most don’t know! It is ok to not know. It is the prerequisite to start learning. The problem is that with so many people “knowing” there is a vast widespread glorious absence of data sophistication… See what I did there? 😉

Why questions are better than data


Fred enters the room. The walls are covered by whiteboards scribbled with words that he reads in his LinkedIn feed and hand-made scatterplots and line charts in red and black, green and blue. This is the lair of the data analysts and data engineers. He is very proud that his company has a data science team. He has been reading a lot of nice stuff about data science and big data and he brags about its impact to his friends.

“Hey Gabriel! I want to ask you something.” Gabriel takes his eyes off his monitor and smiles back at Fred: “Hey, what’s up? What do you need?” Fred requests: “If you have the time, can you send me an Excel file with… let’s say… the last 6 months of in-app sales?”


Mobile Game Analytics: Miniclip’s Story



As I wrote some time ago, Microsoft organised Game Dev Camp 2015, an event for the Portuguese game development community. This post and next week’s post are about that, but to write them I wanted to see all 30 talks.

This post is simply my talk there. Kinda egocentric, I know, but next week I’ll post my favourite talks from that event and will exclude myself from the list.

In this talk I tell the full story of implementing analytics at Miniclip, from the early stages to the current state, but the takeaway points are the mistakes we made, which I hope everyone can learn from. So without further ado, and since WordPress is giving me a hard time with videos, here’s the link to the video of the talk.

Microsoft Game Dev Camp 2015


Some years ago, I was one of the moderators of GameDevPT, a Portuguese gamedev forum where new blood, veterans, wannabes and professionals helped and networked in the ways that were only possible through the accessibility of the pre-Facebook forums. We made a lunch reservation for 18 people the last time I helped to arrange a gamedev event in Portugal. One of them was brought by his mother, who sat at a separate table. This was many years ago… 9 maybe…

Since then I’m afraid I lost contact with the Portuguese gamedev community. The reasons for that are irrelevant but all of them are my own responsibility.

A couple of months ago during a rambling analytics phone call with the great guys from Bica Studios, the sentence “you should talk in the next Game Dev Camp” was heard. The point was that this Game Dev Camp was about taking the next step in Portugal and analytics is a big part of taking the next step in many industries, gaming included.

A couple of weeks went by and I was involved in a conversation at Miniclip about this particular event. Miniclipers that had something to give to the gamedev community stepped up to give talks.

Some years ago we were 18 people at a lunch. Here’s what I saw in 2015.


This was the crowd for the keynote. I checked many photos and none really shows how packed that room was, and many people were left outside. There were more than 400 people attending, 35 speakers and multiple tracks with simultaneous talks. There was a showroom with a lot of fine games. All under the umbrella of Microsoft, with the support of the event partners Miniclip, B5, Bica Studios, Nerd Monkeys, Raindance Studios, Lisboa Games Week and Emergency Agency.

But the guy in the middle of the tornado was Miguel Vicente, whom I have to personally thank for all the hard work in making this a reality. Here he is thanking the partners during the keynote!


So… what happened there?

Quite a lot! Did I mention that once we were 18? 400+ is what I call a pretty good step forward for a country of 10 million people. To me the most relevant thing was that many of the people I met a decade ago, those that introduced me to gamedev, are still here and they were speaking with people from Unity, Gameloft, Miniclip and Microsoft, but also that hundreds more joined and are building an industry.

What we had was a mix of gamedev veterans plus people from all walks of gamedev, plus the ones that joined larger companies in the industry, plus their networks sharing knowledge.

And free waffles… never forget the free waffles!

I’ll post links to the talks I found most relevant since everything was recorded… ain’t that neat, huh?!

My role


I was a speaker at this event. A proud one, I might add. I do apologise for the extremely ugly man in the photo, but to this day no camera is amazing enough to make me look any better. My talk was about how Miniclip went from having no analytics to a company-wide data science team. The good things, the bad things and the challenges, so that anyone interested is aware of them.

I believe that my role goes a bit further than this talk. This event was a bit emotional for me and I left it with a feeling of responsibility to the community. I work at a successful company doing something that is rare and valuable. I have a fantastic team and a great department with whom I learn something new almost every day. If we were in England or Germany we would still be great, but I doubt we could make a big difference in the local gaming industry hubs. But we are in Portugal and I feel I personally owe it to the community.

See you all around and until next year!

Planning Game Analytics from 0 to data science


Game analytics can be very simple or go wide, far and deep. The trick is to define what it is that you will want within a given timeframe. The length of the timeframe depends on how sophisticated and complex your objectives are.

This post will go through the role that sophistication and complexity take in defining both your objectives and the analytics stack to support them.