A small interruption to the blog's holiday break to let everyone know that Miniclip is hiring… well… we’ve been hiring for quite a while, so I guess it’s not a big surprise. I’m posting because we’re looking for a Data Scientist to work with yours truly. For dramatic effect, let's pretend that you didn’t read the title and are really surprised!
You can see the job description (and apply!) by following this link. However, I want to give you a bit more information than just the job title and description.
At Miniclip, anyone in the Data Science and Data Engineering teams can be involved in data science projects. Analysts, engineers, scientists and team leads have different operational responsibilities, but our major strength is that we work as a multidisciplinary team.
And what data science projects are those, you ask? As I see it, Data Scientists build data products. Data products are automated or interactive data-centric applications that would not be possible using traditional systems. What on earth does this mean? Here are some real examples at Miniclip:
- UA LTV: An interactive data product that allows business users to analyse predicted LTV (lifetime value) across all possible cohort combinations. Business users can also export data to create reports. The export includes the LTV predictions as well as retention rate predictions. This data product was coded in R with Shiny and interacts with S3, Redshift and EC2 instances in AWS. Cute statistic: although the models are very simple, there are 724 of them in the application. At peak it runs more than 200,000 predictions in under 4 minutes. Not too shabby for an interactive application.
- Fraud Detection: An automated predictive data product coded in R and running on an EC2 instance. Although it is a rather simple script, the beauty is that the algorithm was coded in-house. It also uses Redshift and MySQL.
- The Super Hyper Secret Mega Project That I Will Not Name: I know… I know… if I’m not going to explain it, why mention it? Because of the history and the tech. This is the project I’m currently working on, and it is the endgame of almost a year of prototypes, investigation and analysis. From random forests and SVMs to association rule mining, from Python to R, from local data sets to terabytes of data.
- User Stats: An interactive data product that I list here because it is NOT predictive. In a nutshell, no dashboard tool could create exactly what our Customer Support team needed… so we did it ourselves! This application queries and builds visualisations across billions of rows of data.
We have many cool projects to build and a lot of things to learn together. If building machine learning models, writing code, building applications and playing games are your thing, and you are not afraid of a lot of data, click here and see you soon!