Transfers and remaining games

Over the last few months I have been building Fantasy Football data flows in Azure, which lets me automate the data collection. This has enabled me to start looking at which data is important. I’ve been concentrating on Transfers lately, so here is a Power BI dashboard which shows almost real-time data (captured every 15 minutes).

In this dashboard I have the ability to look at various pieces of data:

  • Transfer data. This allows me to see how big a player is expected to be in the coming game week.
  • Goals and assists for those players. I can select the game weeks, so I can choose how many weeks to go back over; 3 to 6 appears to be the ‘good zone’.
  • Remaining games details.

 

The Transfer section starts off at the team level, then you drill down to players.

I love Power BI’s ability to drill down like this. It really gives you some options.

 

 

The whole dashboard becomes the selection options. If I select Teams or Positions, only the data which meets those criteria will be shown. If I select something on the charts, the rest stays there but is faded out. Here I have selected Harry Kane, as he wasn’t high up the charts for scoring/assists.

The final part allows me to see the fixtures. Now this could be an interesting selection to highlight. Over the years I have observed that the first 6 and last 6 games of the season throw up the biggest surprises. Take this season: West Brom have beaten Man United and drawn with Liverpool in the last few weeks, and they are in the relegation zone. Spurs have West Brom, so I suspect that the people bringing in Harry Kane expect a big return, but could there be a surprise?

I still have a lot of data to blend to get me to where I want to be, but I’m building up some good history, and this at least gives me a good place to start.

Having the historical data allows me to change calculations and then apply the new settings to see what happens.

Roll on next season 🙂

 

 

 

Real Time Data Collecting – ver2

In the last post I explained how I collect the data using Azure by means of WebJobs, Event Hubs etc. There is a simpler version: Python.

I’m still using Azure, but this time I have a VM which holds a series of Python scripts. These scripts are run at different times based on the requirements. The architecture diagram looks like this.

Basically, I use Windows Task Scheduler to run different Python scripts at various times.

  • 5-minute interval – These are used for up-to-date transfers of Fantasy Premier League players. The scripts extract the information from the JSON files and insert it into a SQL db. My on-prem SQL db runs a stored procedure at a predefined time which moves the data across, then deletes the old records from the cloud. This enables me to keep the db in Azure at a decent size.
  • Weekly interval – These are used to get the data based on points, minutes played, cards received etc. The new data is pulled across to an on-prem db in the same way as the transfer data.
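The 5-minute flow described above can be sketched roughly as follows. This is only an illustration under assumptions: `sqlite3` stands in for both the Azure SQL db and the on-prem db, the table layout is invented, and the JSON field names (`elements`, `web_name`, `transfers_in_event`, `transfers_out_event`) are assumed rather than taken from my actual scripts.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical sample payload; the real field names depend on the API being read.
SAMPLE_JSON = """
{"elements": [
  {"id": 1, "web_name": "Player A", "transfers_in_event": 1200, "transfers_out_event": 300},
  {"id": 2, "web_name": "Player B", "transfers_in_event": 800, "transfers_out_event": 950}
]}
"""


def create_table(conn):
    """Create the (assumed) transfers table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transfers (
               player_id INTEGER, name TEXT,
               transfers_in INTEGER, transfers_out INTEGER,
               captured_at TEXT)"""
    )


def extract_rows(raw_json, captured_at):
    """Pull only the transfer fields out of the JSON document."""
    data = json.loads(raw_json)
    return [
        (p["id"], p["web_name"], p["transfers_in_event"],
         p["transfers_out_event"], captured_at)
        for p in data["elements"]
    ]


def load_snapshot(conn, raw_json, captured_at=None):
    """Insert one snapshot into the cloud db; returns the number of rows."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    rows = extract_rows(raw_json, captured_at)
    conn.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def move_and_purge(cloud, onprem, cutoff):
    """Copy rows captured before `cutoff` on-prem, then delete them from the cloud."""
    rows = cloud.execute(
        "SELECT * FROM transfers WHERE captured_at < ?", (cutoff,)
    ).fetchall()
    onprem.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?, ?)", rows)
    onprem.commit()
    cloud.execute("DELETE FROM transfers WHERE captured_at < ?", (cutoff,))
    cloud.commit()
    return len(rows)
```

In the real setup the move-and-delete step is a stored procedure on the on-prem side; the Python function above just mirrors the idea so the whole flow can be read in one place.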

The Python scripts work in the following way.

  • A main script is run via the schedule, and this calls the specific Python files that read the data. The DB connections are held in another Python file. This way I have one connection file which the others import.
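As a rough sketch of that layout, here is the same idea squashed into a single file. The file names in the comments, the job names and the `job_log` table are all hypothetical, and `sqlite3` stands in for the real SQL db.

```python
import sqlite3


# --- db_connection.py (hypothetical name): the one place connection details live ---
def get_connection(path=":memory:"):
    """Return a database connection; sqlite3 stands in for the real SQL db."""
    return sqlite3.connect(path)


# --- transfers_job.py / weekly_stats_job.py (hypothetical names) ---
def _log_job(conn, name):
    """Record that a job ran; a placeholder for the real read-and-insert work."""
    conn.execute("CREATE TABLE IF NOT EXISTS job_log (job TEXT)")
    conn.execute("INSERT INTO job_log VALUES (?)", (name,))
    conn.commit()
    return name


def run_transfers_job(conn):
    return _log_job(conn, "transfers")


def run_weekly_stats_job(conn):
    return _log_job(conn, "weekly_stats")


# --- main.py: the script the scheduler calls, e.g. with "transfers" or "weekly" ---
JOBS = {
    "transfers": [run_transfers_job],
    "weekly": [run_weekly_stats_job],
}


def main(schedule, conn=None):
    """Run every job registered for the given schedule, sharing one connection."""
    if conn is None:
        conn = get_connection()
    return [job(conn) for job in JOBS[schedule]]
```

Keeping the connection in one module means a change of server or credentials is made once, and every job picks it up.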

We are at the tail end of the season, so all these scripts will be used in anger next season. I have been using them already, mind you, and I have found some good insights.

 

For my wk34 team, which is a double game week, I used some of this data to pick my players.

  • Captain – Eriksen: 2 games and he has been doing well recently. So far he has scored 1 goal, so I should expect a decent return.
  • GK – Schmeichel: 2 games, so I took out the WBA keeper for him. My other choice was Pickford, but against Man Utd I would never have thought they would keep a clean sheet. Still 1 game to go for Kasper, so I might get something. His points were slightly swayed by his penalty save.

My other players were my stalwarts, which have been in there for a long time. Mind you, the City players didn’t return much; now they are champions, maybe it’s time for a change.

It’s an interesting time of the season for a few reasons.

  • The Premiership race is over, so it’s the next three places that are up for grabs. Spurs, Liverpool, Man Utd and Chelsea players have it all to play for.
  • Relegation is still not decided, so teams will be fighting for their Premiership survival. Expect unusual results, like Man Utd getting beaten at home by WBA.

I need to look at the fixtures and the stats together next. Next season this will be an automated process. I’m hoping to click on a button and it will give me a narrative on the coming weeks 🙂

Loads more to do really. The thing is, the more data I look at, the more data I want. It’s like going round in circles 🙂

 

Real time data collecting

This is the first part in my series on collecting data.

I thought I would go back to the beginning and post about how I collect the data, as the reports are nothing without data. Most of the charts are based on data which is updated once a week, things like goals or minutes played, so the data capture is relatively simple. Some of the data, like transfers, gets updated throughout the week, so I wanted to see what insights I could get from that. When I say transfers, I’m referring to the way people move players in and out of their teams every week.

Collecting real-time data is something I have experience with, as I designed a process for work, so I use the same building blocks. The overall design looks like this, but let’s break it down.

The APIs I use are available to anyone, so it’s just a case of knowing how to use them.

A WebJob runs continuously and gets the data. The data is too big to be sent to an Event Hub, so I do some work on it: I extract the nodes I want and send those to an Event Hub, but I also send the entire JSON file to a Storage account. The reason being I can use that data when I need it.
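The "extract the nodes I want" step might look something like this in Python. It is a sketch only: the WebJob itself is not shown in this post, the field names are assumed, and `send_event` and `store_blob` are hypothetical callables standing in for whatever Event Hub and Storage clients are actually used.

```python
import json

# Assumed field names; the real list depends on the API and on what the reports need.
WANTED_KEYS = ("id", "web_name", "transfers_in_event", "transfers_out_event")


def slim_payload(raw_json, wanted=WANTED_KEYS):
    """Keep only the wanted nodes for each player and return compact JSON."""
    data = json.loads(raw_json)
    slim = [
        {key: player.get(key) for key in wanted}
        for player in data.get("elements", [])
    ]
    return json.dumps(slim, separators=(",", ":"))


def process(raw_json, send_event, store_blob):
    """Archive the full document, then send the slimmed-down version on.

    `send_event` and `store_blob` are placeholders supplied by the caller.
    Returns (size of event body, size of raw document) in characters.
    """
    store_blob(raw_json)
    event_body = slim_payload(raw_json)
    send_event(event_body)
    return len(event_body), len(raw_json)
```

Passing the two senders in as arguments keeps the trimming logic testable without touching Azure at all.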

The data that is sent to the Event Hub is consumed by the Stream Analytics and ends up in an Azure SQL database. Within the Stream I can modify the data or add additional data if I so wish.

In my design I also want to merge this data with other data sources, so I use Azure Data Factory to copy the data from the cloud to a SQL db on a local machine, which is used as a Data Warehouse and contains all the other football data. I have various processes running which collect different types of data.

At this point I have real football data and fantasy football together so I can start to blend it together to see what insights I can get.

This is an ongoing side project which I have been working on for many years, in which time I have changed and modified all the elements. Originally I used VB.net to write stuff but now I use C#. I removed the ETLs which were based on SSIS and now I use Azure or even Python. Can’t wait till the support for Python in Function Apps is better 🙂

So that was a brief overview but it gives you the main parts and how I collect from that data source.

 

Next part will be based on the other data sources and how we blend data before visualizing it.

 

Latest dashboards

Been playing again with Power BI. The latest dashboards show various stats to assist with my player picking:

Points & Average points by teams and positions over a given time period.

Points & Average points by players over a given time period.

Points & Average points by players outside the top 6 teams over a given time period.

Goals scored & Assists by teams and positions over a given time period.

Goals scored & Assists by players outside the top 6 teams over a given time period.

If I drill down on the assists to show players, you can see Benteke had 4 assists prior to last week’s games. Who’d have thought that 🙂

 

Goalkeepers stats.

The purpose of these dashboards is to show what is happening and to give you the ability to ‘drill down’ from teams to players. I need to start looking at the next game week and think about game week 31, as that one has a reduced number of games.

 

More APIs for Fantasy data.

After a bit of digging I found some useful APIs to get data, so I created some more Python scripts to capture data and now I’m in a position where I’m looking at that data and considering redesigning the entire data model. The reason for this is some of the data I was creating will be more accurate with the new APIs.

The new data allows me to summarise the player data by week. I have made a few more dashboards to show the data so I can see what I can do. Here they are.

This gives me a summary based on the last 6 weeks. I can see things like minutes played and goals. Here I have selected Midfield.

If I select Hazard from the treemap I get his stats. This will enable me to see how his points are broken down.

And the big question is normally Kane or Aguero for my forward.

Based on the last 6 weeks Sergio wins 🙂 But what happens if we look at the last 3 weeks? Well, Firmino comes into the mix.

The good thing is we are getting to a point where the data is enabling us to make a decision. With footy, emotions play a major part, so having decisions based on data rather than emotions may make a difference.

What happens if we want to include price? We can add Price and Price range to the dataset. This will give us the ability to look at players within a certain budget. Having the ability to select the weeks you want to look at will also show how they are progressing or not!

Adding in a price range, we can select our budget. I want to look at the Midfield players in the £9 million to £11 million range. As expected, the big hitters like Salah and KDB are there. In the last three weeks they have amassed 9 goals and 7 assists.
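Outside Power BI, the same kind of selection (position, price range, last N weeks) could be reproduced with a few lines of Python. This is a minimal sketch with made-up rows and an assumed row layout, not the dataset behind the dashboard.

```python
from collections import defaultdict

# Made-up rows for illustration only:
# (player, position, price_in_millions, game_week, goals, assists)
ROWS = [
    ("Player A", "MID", 10.5, 28, 1, 0),
    ("Player A", "MID", 10.5, 29, 0, 2),
    ("Player A", "MID", 10.5, 30, 2, 1),
    ("Player B", "MID", 6.0, 29, 1, 0),
    ("Player B", "MID", 6.0, 30, 0, 1),
    ("Player C", "FWD", 10.0, 30, 3, 0),
]


def summarise(rows, position, min_price, max_price, last_n_weeks):
    """Total goals and assists per player for one position and price band.

    Only the most recent `last_n_weeks` game weeks found in `rows` are counted.
    The result is sorted by goals plus assists, highest first.
    """
    latest_week = max(row[3] for row in rows)
    first_week = latest_week - last_n_weeks + 1
    totals = defaultdict(lambda: {"goals": 0, "assists": 0})
    for player, pos, price, week, goals, assists in rows:
        if pos == position and min_price <= price <= max_price and week >= first_week:
            totals[player]["goals"] += goals
            totals[player]["assists"] += assists
    return sorted(
        totals.items(),
        key=lambda item: item[1]["goals"] + item[1]["assists"],
        reverse=True,
    )
```

For example, `summarise(ROWS, "MID", 9.0, 11.0, 3)` looks at midfielders between £9 million and £11 million over the last three game weeks in the sample rows.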

 

If I select KDB I can see that he has scored once and has 4 assists. The bottom right-hand chart shows me he has had a price increase as well.

But if my budget was less, in the £5 million to £7 million range, it would look like this.

The top two are Prowse and Ramsey, so let’s look at them. 1 goal and 2 assists for Prowse.

3 goals for Ramsey in a single game, and his game time is less.

So in this situation I would probably pick Prowse, and he is cheaper.

In summary, using Power BI to visualise the data will hopefully give me more of a competitive edge going forward 🙂 Remember: spend time on getting the data right, then the Power BI part is easier.

Next I need to use Python to provide some stats prior to displaying in Power BI.

 

Home Team dashboard

I’ve spent more time playing on dashboards, it appears the more time I spend the more options I want to include :). I decided to go with a Home Team dashboard.

You pick a League and a season and it gives you the home info.

If you select a Team you get the details.

If we look at the games in detail we can see that home draws are going to cost Liverpool.

You can look at the data across multiple seasons as well. Here I selected 3 seasons and it gave me this data.

I’ve built quite a few dashboards now. I need to build the away dashboard next, then I’m onto the predictive analysis stuff. I will then start to use R/Python to really push the data. The current predictions are not accurate enough yet, so I’m busy building in some other variables.

It’s amazing what you can add in, and luckily with some of the variables I can retest all the previous results to see what effect they have. Some of the variables I can’t use as I only have certain data points, but going forward that will grow.

Working on this side project has made me realise the importance of not just collecting data but actually using it.

 

 

Working on the Prediction dashboard

A new dashboard has been added. This time it is based on real football and it includes Predictions 🙂

This one allows you to select a home team and an away team, and you can see stats based on that fixture. In the example I selected ‘Manchester City’ and ‘Newcastle’. Looking at the stats, it’s going to be a home win.

 

It has also been predicted as a home win, at 67.74%. The predictions are still being tested as I have changed the algorithm a lot this season.

The stats in the tiles are based on historical fixtures between the two teams. Currently I’m using all the data, but I will modify this once I see how relevant all the data is.

If we change the teams to Arsenal/Palace we can see at one point this would have been a guaranteed Home win. But recent form shows a different story. The Prediction is 60.78% for a home win but I think it might be a draw.

If we use the players dashboard we can see it might be closer.

After this week’s games I will be doing some more analysis before I update all the data.

Watch out for another post coming out very soon.

Power BI and Fantasy footy data

The joys of data and hopefully some useful insights 🙂 Here is my latest dashboard on the Fantasy Footy Data.

3 week diff

This chart shows the data when you look at it over a three-week period. The purpose is to spot in-form players, and looking at three weeks tells you more than looking at a single week. It shows the data in the following way:

  • Data by teams
  • Data by players
  • Data by position

It can be filtered by team or cost band.

 

Goals & Assists

These charts show goals and assists by players, in the same format as the last chart. Interestingly, you can see that only two teams have not had a defender score, and Chelsea have 11 goals from defenders.

Quadrant

This chart is showing goals and assists per minute played. The size of the bubble is total points. Bottom left quadrant is the one to be in.

This can be filtered by position.
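The exact formulas behind the axes are not given here. One reading, assumed in this sketch because it would make bottom left the best place to be, is minutes per goal on one axis and minutes per assist on the other, with total points as the bubble size. The function names and the choice of which measure goes on which axis are my own for illustration.

```python
def minutes_per(count, minutes):
    """Minutes played per goal (or per assist); None when the player has none."""
    return minutes / count if count else None


def quadrant_point(goals, assists, minutes, total_points):
    """Build one bubble: x and y are minutes per goal and per assist (lower is better)."""
    return {
        "x": minutes_per(goals, minutes),
        "y": minutes_per(assists, minutes),
        "size": total_points,
    }
```

A player with 10 goals and 5 assists in 900 minutes would sit at a goal every 90 minutes and an assist every 180 minutes. Players without a goal or an assist get `None` and would need to be left off the chart or handled separately.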

If we zoom in we can see the stats for Mo Salah.

Drilldown

The drill-down capabilities show us the records which make up the charts. If we right-click on Chelsea and DEF we can see who has scored.

 

Drill down records

The records are showing us that Alonso has scored 6.

 

The purpose of these charts is to help me select which players to bring into my team.

I will be posting once a week from now on, as we are in a position where the data is flowing and we are seeing some good insights.

 

More analysis on players using Power BI – up to and including week 9

So in my quest to see if I can use stats to improve my Fantasy league team I have come up with a few more dashboards.

Team and Players

The first dashboard shows me the overall points for the teams, then the weekly points for the teams and then the players. Ignore the up/down as that is based on limited data, it will become more important as the weeks go by.

By selecting the team it will filter all the charts so you can see what each player got.

As well as the dashboards based on overall points etc., I wanted to see if I could create a ‘weighting’ which could be applied so you can see if a player is worth buying, hence the Bargain column. It will be interesting to see the stats after I have uploaded this week’s data.

 

 

I will be updating the data and posting the latest stats later.

 

Power BI and fantasy league data

Back to playing with Power BI. I’m using the data from my Fantasy league team and seeing what insights I can get. This is the first post of many on this subject. Future posts will have more details in them.

 

The idea is to see the data from a details and KPI perspective.

KPIs are based on what I would like my team to score, so Captain points are 15 and other players just below 5. I came up with these numbers based on:

  • Team needs to get 60 points
  • Captain needs 15 points
  • 45 points shared between the other 10 players.
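The arithmetic behind "just below 5" is simply (60 − 15) / 10 = 4.5 points per non-captain player. A small sketch of how those targets could be checked against actual scores follows; the function names and the met/missed labels are my own for illustration.

```python
TEAM_TARGET = 60      # points the whole team should score in a week
CAPTAIN_TARGET = 15   # points expected from the captain
OTHER_PLAYERS = 10    # the rest of the starting eleven


def other_player_target(team=TEAM_TARGET, captain=CAPTAIN_TARGET, others=OTHER_PLAYERS):
    """Points each non-captain player needs for the team to hit its target."""
    return (team - captain) / others


def kpi_status(points, is_captain=False):
    """Compare a player's weekly points with the relevant KPI."""
    target = CAPTAIN_TARGET if is_captain else other_player_target()
    return "met" if points >= target else "missed"
```

So a captain on 15 points meets the KPI, while an outfield player on 4 points falls just short of the 4.5 target.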

In the first set of charts the waterfall chart has drill down capabilities.

Top level, which shows points by week.

First level down shows points by players

Bottom level shows points by bench and first 11. ‘Y’ is bench.

 

I’m hoping this data will help me get higher in the leagues.