Transfers and remaining games

Over the last few months I have been building Fantasy Football data flows in Azure, which lets me automate the data collection. This has enabled me to start looking at which data is important. Lately I've been concentrating on transfers, so here is a Power BI dashboard which shows almost real-time data (data captured every 15 minutes).

In this dashboard I have the ability to look at various pieces of data:

  • Transfer data. This allows me to see how heavily a player is being transferred in ahead of the coming game week.
  • Goals and assists for those players. I can select the game weeks, so I can pick what I believe is the right number of weeks to look back over; 3 to 6 appears to be the 'good zone'.
  • Remaining games details.


The actual Transfer section starts off at the team level, and then you drill down to players.

I love the capabilities of Power BI to drill down like this, it really gives you some options.



The whole dashboard becomes the selection options. If I select Teams or Positions, only the data which meets that criteria will be shown; if I select something on the charts, the rest stays there but is dimmed. Here I have selected Harry Kane, as he wasn't high up the charts for goals/assists.

The final part allows me to see the fixtures. Now this could be an interesting selection to highlight. Over the years I have observed that the first 6 and last 6 games of the season throw up the biggest surprises. Take this season: West Brom have beaten Man United and drawn with Liverpool in the last few weeks, and they are in the relegation zone. Spurs have West Brom next, so I suspect that the players who are bringing in Harry Kane expect a big return, but there could be a surprise.

I still have a lot of data to blend to get me to where I want to be, but I'm building up some good history, and this at least gives me a good place to start.

Having the historical data allows me to change calculations and then apply the new settings to see what happens.

Roll on next season 🙂




Real Time Data Collecting – ver2

In the last post I explained how I collect the data using Azure by means of WebJobs, Event Hubs, etc. There is a simpler version: Python.

I'm still using Azure, but this time I have a VM which runs a series of Python scripts. These scripts are run at different times based on the requirements. The architecture diagram looks like this.

Basically, I use Windows scheduler to run different Python scripts at various times.

  • 5-minute interval – These are used for up-to-date transfers of Fantasy Premier League players. The scripts extract the information from the JSON files and insert it into a SQL db. My on-prem SQL db runs a stored procedure at a predefined time which moves the data across and then deletes the old records from the cloud. This keeps the db in Azure at a decent size.
  • Weekly interval – These are used to get the data based on points, minutes played, cards received, etc. The new data is pulled across to an on-prem db in the same way as the transfer data.
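The 5-minute transfer script might look something like the sketch below. This is only illustrative: the endpoint is the public FPL one, but the field names and the `insert_rows` database step are assumptions, not the author's actual code.

```python
# Sketch of a 5-minute transfer script (illustrative; field names and the
# database step are assumptions, not the author's actual implementation).
import json
import urllib.request

FPL_URL = "https://fantasy.premierleague.com/api/bootstrap-static/"

def extract_transfers(payload):
    """Pull just the transfer-related fields out of the full JSON payload."""
    return [
        (p["id"], p["web_name"], p["transfers_in_event"], p["transfers_out_event"])
        for p in payload["elements"]
    ]

def run_once():
    # Download the full JSON, keep only the nodes we care about,
    # then hand the rows to the database layer.
    with urllib.request.urlopen(FPL_URL) as resp:
        payload = json.load(resp)
    rows = extract_transfers(payload)
    # insert_rows(rows)  # hypothetical helper, e.g. pyodbc executemany()
    return rows
```

A scheduled task then just runs `run_once()` every 5 minutes.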

The Python scripts work in the following way.

  • A main script is run via the scheduler; this calls the specific Python files that read the data. The DB connections are held in another Python file. This way I have one connection file which the others import.
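The shared-connection pattern above could be sketched like this. The module name, server details, and credentials are all made up for illustration; only the idea of one importable connection file comes from the post.

```python
# db_connections.py -- one place for every connection string, which the worker
# scripts import (all names and values here are hypothetical).

AZURE_SQL = {
    "server": "myserver.database.windows.net",
    "database": "fpl",
    "driver": "{ODBC Driver 17 for SQL Server}",
}

def conn_string(cfg, user, password):
    """Build a pyodbc-style connection string from a config dict."""
    return (
        f"DRIVER={cfg['driver']};SERVER={cfg['server']};"
        f"DATABASE={cfg['database']};UID={user};PWD={password}"
    )
```

Each worker script then does `from db_connections import AZURE_SQL, conn_string`, so changing a server name only happens in one file.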

We are at the tail end of the season, so all these scripts will be used in anger next season. Mind you, I have been using them already and have found some good insights.


My team for wk34, which was a double game week: I used some of this data to pick my players.

  • Captain – Eriksen: 2 games and he has been doing well recently. So far he has scored 1 goal, so I should expect a decent return.
  • GK – Schmeichel: 2 games, so I took out the WBA keeper for him. My other choice was Pickford; I would never have thought WBA would keep a clean sheet against Man Utd. Still 1 game to go for Kasper, so I might get something. His points were slightly swayed by his penalty save.

My other players were my stalwarts, who have been in there for a long time. Mind you, the City players didn't return much; now they are champions, maybe it's time for a change.

It’s an interesting time of the season for a few reasons.

  • The Premiership race is over, so it's the next three places that are up for grabs; Spurs, Liverpool, Man Utd and Chelsea players have it all to play for.
  • Relegation is still not decided, so teams will be fighting for their Premiership survival, and expect unusual results, like Man Utd getting beaten at home by WBA.

I need to look at the fixtures and the stats together next. Next season this will be an automated process. I'm hoping to click a button and have it give me a narrative on the coming weeks 🙂

Loads more to do really; the thing is, the more data I look at, the more data I want. It's like going round in circles 🙂


Real time data collecting

This is the first part in my series on collecting data.

I thought I would go back to the beginning and post about how I collect the data, as the reports are nothing without data. Most of the charts are based on data which is updated once a week, things like goals or minutes played, so that data capture is relatively simple. Some of the data, like transfers, gets updated throughout the week, and I wanted to see what insights I could get from that. When I say transfers, I'm referring to the way people move players in and out of their teams every week.

Collecting real-time data is something I have experience with, as I designed a similar process for work, so I use the same building blocks. The overall design looks like this, but let's break it down.

The APIs I use are available to anyone, so it's just a case of knowing how to use them.
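Before deciding which nodes to extract, it helps to see what the payload actually contains. Here's a small sketch that summarises the top-level nodes of a JSON response; the endpoint shown is the public FPL one, but any JSON API works the same way, and `fetch_summary` is just an illustrative helper.

```python
# Inspect what a JSON payload contains before deciding which nodes to keep
# (the endpoint is the public FPL one; the helper names are illustrative).
import json
import urllib.request

def node_summary(payload):
    """Return each top-level node with its serialized size in characters."""
    return {key: len(json.dumps(value)) for key, value in payload.items()}

def fetch_summary(url="https://fantasy.premierleague.com/api/bootstrap-static/"):
    with urllib.request.urlopen(url) as resp:
        return node_summary(json.load(resp))
```

Running `fetch_summary()` makes it obvious which nodes are huge and worth splitting off, and which are small enough to stream on.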

A WebJob runs continuously and gets the data. The data is too big to be sent to an Event Hub, so I do some work on it first: I extract the nodes I want and send those to an Event Hub, but I also send the entire JSON file to a Storage account, so I can use that data whenever I need it.
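The "extract the nodes, archive the rest" step can be sketched as below. The node names and the commented-out send/upload helpers are assumptions (the author's WebJob is C#, not Python); the motivation is real, though, since Event Hubs caps message sizes (256 KB on the basic tier).

```python
# Sketch of splitting a large payload: a trimmed message for the Event Hub,
# the full file for Storage (node names and helpers are hypothetical).
import json

def trim_payload(payload, wanted=("elements",)):
    """Keep only the nodes destined for the Event Hub."""
    return {key: payload[key] for key in wanted if key in payload}

def route(payload):
    small = json.dumps(trim_payload(payload))
    full = json.dumps(payload)
    # send_to_event_hub(small)      # e.g. via the azure-eventhub SDK
    # upload_to_blob_storage(full)  # full file kept for later use
    return small, full
```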

The data that is sent to the Event Hub is consumed by Stream Analytics and ends up in an Azure SQL database. Within the stream I can modify the data or add additional data if I wish.

In my design I also want to merge this data with other data sources, so I use Azure Data Factory to copy the data from the cloud to a SQL db on a local machine, which is used as a data warehouse and contains all the other football data. I have various processes running which collect different types of data.

At this point I have real football data and fantasy football together so I can start to blend it together to see what insights I can get.

This is an ongoing project which I have been working on as a side project for many years, in which time I have changed and modified all the elements. Originally I used to write stuff by hand, but now I use C#; I removed the ETLs which were based on SSIS and now use Azure or even Python. Can't wait till the support for Python in Function Apps is better 🙂

So that was a brief overview but it gives you the main parts and how I collect from that data source.


Next part will be based on the other data sources and how we blend data before visualizing it.