I’m a big football fan, and have been for years, so one of my first ideas for my own data projects was something football-related. Where better to start than goals? So my first question was: which stadium sees the most goals? One use for this would be deciding which team to go and watch if all you care about is a good match, with the number of goals as one measure of a good match. Not a perfect science, but still interesting to see.
So, I started by getting the data. I googled the home table (as it stood at the time) for the Premier League this season (2015/16) and pasted it into Excel. Working out the total goals scored at a club’s stadium was easy from there: I simply added the goals scored to the goals conceded. Of course, with the season not finished yet, not all of the teams had played the same number of home games, so I divided the total goals by the number of games to get the average per game.
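For anyone who would rather script this step than do it in a spreadsheet, here is a minimal sketch of the same sum in Python. The function name and the example row are my own and purely illustrative; they are not copied from the real table.

```python
def goals_per_game(goals_for: int, goals_against: int, games_played: int) -> float:
    """Average total goals (both teams) per home game at a stadium."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return (goals_for + goals_against) / games_played


# Made-up row for illustration: 29 scored, 26 conceded, 15 home games.
example = goals_per_game(29, 26, 15)
print(round(example, 2))  # prints 3.67
```

Dividing by games played is what makes clubs comparable mid-season, since a club with more home games would otherwise look more goal-heavy than it really is.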
You could say this was all the data I needed, but I wanted to use it with Fusion Tables to visualize it on a map. I felt points on a map wouldn’t work as well, so I decided to do it by county instead. A real-life use for this may be if you were travelling to England and wanted to go to a good match: which county would be best? Firstly, I researched which English county each of the clubs was in and added that to Excel. Of course this created its own problem, as some counties have more than one club. Rather than use the figure for the club with the highest average goals, which would be one approach, I instead got the average per county by adding up all the club averages from the same county and dividing by the number of clubs in that county.
To be clear, this isn’t me working all this out in my head or on a calculator. Excel has a number of functions to do this for you. Some are basic, like simple maths with symbols, SUM and AVERAGE. To get the average per county, I used the slightly more complicated “SUMIF” and “COUNTIF”. An example of this was =SUMIF($P$2:$P$21,P2,$L$2:$L$21)/COUNTIF($P$2:$P$21,P2). Here column P holds each club’s county and column L holds its average goals per game, so the formula adds up the averages of every club in the same county as row 2 and divides by the number of clubs in that county. It mightn’t make sense without context, but it does the job. Sadly, although Fusion Tables can do some formulas and functions, it’s only a small number compared to Excel, so I worked out all my data in Excel (or Google Spreadsheets) and uploaded it as a fusion table once I was done.
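If the formula is hard to picture, the same idea can be sketched in Python. This is just my rough equivalent of the SUMIF/COUNTIF approach, not what the spreadsheet actually runs; the function and variable names are invented for the example, and the averages are the handful of club figures quoted in this post rather than the full table.

```python
from collections import defaultdict

# (club, county, average goals per home game), using figures quoted in this post
clubs = [
    ("Everton", "Merseyside", 3.67),
    ("Liverpool", "Merseyside", 2.77),
    ("Bournemouth", "Dorset", 2.87),
    ("Manchester City", "Greater Manchester", 3.73),
    ("Manchester Utd", "Greater Manchester", 1.86),
    ("Watford", "Hertfordshire", 1.67),
]


def county_averages(rows):
    """Mean of the club averages in each county."""
    totals = defaultdict(float)  # plays the role of SUMIF
    counts = defaultdict(int)    # plays the role of COUNTIF
    for _club, county, average in rows:
        totals[county] += average
        counts[county] += 1
    return {county: totals[county] / counts[county] for county in totals}


averages = county_averages(clubs)
for county, average in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{county}: {average:.2f}")
```

Note that this is an average of averages, the same as in the spreadsheet, so each club counts equally regardless of how many home games it has played.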
Finally, we mustn’t forget the reason I got all this data, which was to make a map, although the data itself is interesting too. I searched online for a KML file with the details of the English counties and their borders. Like with the Ireland map, I downloaded a copy and uploaded it to my Fusion Tables. I then merged the two tables together, using the county name as the common column to join on. The important thing I took from the KML table was the geometry column, which holds the county borders.
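The merge itself is just a join on the county name. Here is a small Python sketch of the idea, under a few assumptions of mine: the function name is hypothetical, the geometry strings are placeholders rather than real KML, and I am assuming only counties that appear in both tables are kept, which may not be exactly how Fusion Tables handles unmatched rows.

```python
def merge_on_county(stats, borders):
    """Join county averages to county borders on the county name.

    Only counties present in both tables are kept (an assumption of this sketch).
    """
    merged = []
    for county, average in stats.items():
        if county in borders:
            merged.append(
                {"county": county, "avg_goals": average, "geometry": borders[county]}
            )
    return merged


# County averages worked out from the figures in this post
stats = {"Merseyside": 3.22, "Dorset": 2.87}

# Placeholder geometry; a real KML file would hold the border coordinates
borders = {
    "Merseyside": "<Polygon>...</Polygon>",
    "Dorset": "<Polygon>...</Polygon>",
    "Cornwall": "<Polygon>...</Polygon>",  # no club in the stats table, so it drops out
}

merged = merge_on_county(stats, borders)
for row in merged:
    print(row["county"], row["avg_goals"])
```

The join only works if the county names are spelled identically in both tables, so it is worth checking for mismatches before merging.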
With the merge complete, I had my map. I just needed to customize it a bit. Basically, I tinkered with the ranges until I was happy with them and picked the bucket fill colours that looked best. This is the result:
So it looks like Merseyside is the clear winner, followed by Dorset in second. Merseyside contains two Premier League clubs, Liverpool and Everton. Everton is the driving force here with a 3.67 average, with Liverpool only at 2.77 per game. Dorset’s Bournemouth just edges into the second-highest range with 2.87, although that’s more than Liverpool. None of these is the highest stadium average, however. That belongs to Manchester City with a slightly higher 3.73. They are let down at county level by their United rivals and their low 1.86 per game. The lowest, for both club and county, belongs to Watford of Hertfordshire, with a stadium and county average of 1.67 goals per game. No other county and only one other stadium failed to break the two goals per game mark. So the conclusion based on this is to go to Merseyside, with Everton being the preferred option within it. But what if you only cared about the home team?
If for some reason you care about the home team scoring rather than the away team, maybe for the atmosphere and cheering, the results are a little different. The steps to get this were the same. The only difference is that I used the goals for, i.e. the goals scored by the home team, rather than the total scored in the matches. These were the results:
As you can see, we have a new winner in Greater Manchester, with Merseyside relegated to joint second and Dorset even further back. Greater Manchester, with its two clubs, Manchester City and Manchester Utd, comes out on top with a 1.98 average. Again, this is fuelled by City with a ridiculous 2.6 goals per home game. No other team even breaks two per game. Utd again hold them back with only 1.36 per game.
Everton does its best with a league second-highest 1.933 per game, but it wasn’t to be, especially with Liverpool even lower. Merseyside just about nick second with a 1.69 county average to their rivals’ 1.67. It’s Watford at the bottom again with a 0.87 average. Every other county matches or breaks the one goal per game mark, although bottom-of-the-league Aston Villa are lower still at stadium level with 0.73.
So the conclusion so far is: go to Merseyside to see goals, and to Manchester if you care about the home team scoring specifically. In my next blog (this one is getting a bit long) I’ll analyse whether scoring goals at home affects a team’s position in the league. Stay tuned!