Solar energy installations
About this lesson
This lesson uses data about solar energy installations to investigate data analysis. The dataset shows how many solar systems were installed, in each postcode, from 2001 to 2016. It is a useful way to understand how to explore and characterise datasets as well as to explore the use of data in the media. This lesson was devised by Linda McIver, Australian Data Science Education Institute.
Year band: 7-8
Curriculum Links
Links with Digital Technologies Curriculum Area.
Year | Strand |
---|---|
7-8 | Acquire, store and validate data from a range of sources using software, including spreadsheets and databases (AC9TDI8P01). Analyse and visualise data using a range of software, including spreadsheets and databases, to draw conclusions and make predictions by identifying trends. |
Links with Science Curriculum Area.
Year | Strand |
---|---|
7-8 | Science Understanding: Some of Earth’s resources are renewable, including water that cycles through the environment, but others are non-renewable (ACSSU116). Science Inquiry Skills: Construct and use a range of representations, including graphs, keys and models to represent and analyse patterns or relationships in data using digital technologies as appropriate (ACSIS129). Science Inquiry Skills: Communicate ideas, findings and evidence based solutions to problems using scientific language, and representations, using digital technologies as appropriate (ACSIS133). |
Links with Mathematics Curriculum Area.
Year | Strand |
---|---|
7-8 | Statistics and Probability: Identify and investigate issues involving numerical data collected from primary and secondary sources (ACMSP169). Statistics and Probability: Construct and compare a range of data displays including stem-and-leaf plots and dot plots (ACMSP170). |
Learning intentions
- Analyse data about solar energy installations.
- Save, store and access datasets and use data cleaning techniques.
- Sort and filter datasets to make sense of data to answer particular inquiry questions.
- Visualise data using online mapping software or as a chart such as a histogram or column graph.
Suggested steps
Learning hook
Solar installations seem to be popular in domestic housing. Why do people install solar systems, and is there an upward trend in installation across Australia?
View the video Renewable Energy 101: National Geographic (3 minutes) and discuss how renewable energy can address climate change.
Alternatively, view the video How Do Solar Panels Work? (5 minutes) and discuss the impact of solar panels on household CO2 emissions.
Provide students with a dataset found on Postcode data for small-scale installations. Note: there is a range of datasets on this web page to accommodate students with differing levels of spreadsheeting skills.
In particular, use this dataset provided in csv format:
Girls in Focus:
Environmental issues are important to many girls, as they are often interested in careers that ‘do good,’ and this lesson shows how a career in data analytics can lead to solving some of the world’s big problems.
Girls may be further motivated by being introduced to female role models in data science. They may not realise the diverse applications of data science, and how they can connect their personal passions and interests to maths. Consider printing these posters from AMSI Careers for your classroom, or share the profiles of these women blazing a trail in data science.
Step 1: Dataset download and data cleaning
Download the dataset and open the csv file in a spreadsheet.
Look at the postcodes.
In most spreadsheets this data will show postcodes with one, three and four digits, although postcodes in Australia all have four digits. What has happened to these postcodes?
This is an example of a spreadsheet being programmed to modify the data (hiding things the programmer believes the user does not need to know about – in this case, leading 0s). Mathematically speaking, there’s no difference between 0, 00, 000, and 0000. They all just mean 0. So spreadsheets (and other software) tend to remove the leading 0s, which means postcode 0 is actually 0000, 200 is actually 0200 and so on.
This is an important part of data cleaning. Sometimes you have to convert the data back to its original form to fix errors that spreadsheets and other software introduce in an attempt to be ‘helpful’. Format cells and select ‘custom’ from the drop-down menu and type in 0000 to allow for a 4-digit postcode.
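The same repair can be made outside a spreadsheet. The following is a minimal Python sketch, assuming the postcode has been read in as a whole number or as a short string; the function name `restore_postcode` is made up for this example and is not part of the dataset or any library.

```python
def restore_postcode(value):
    """Return a postcode as a four-character string, restoring leading zeros.

    Assumes `value` is a whole number or a string of digits (hypothetical helper).
    """
    text = str(value).strip()
    # zfill pads the left side with zeros until the text is 4 characters long
    return text.zfill(4)


for raw in [0, 200, 4670]:
    print(raw, "->", restore_postcode(raw))
# 0 -> 0000
# 200 -> 0200
# 4670 -> 4670
```

Storing the postcode as text rather than as a number is what stops the leading zeros from disappearing again.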
Girls in Focus:
Jumping straight into the data analysis might be confronting for some students. Many girls lack a strong self-concept in maths and are limited by a fixed mindset about their ability. Scaffold and create context by first asking students to use Google Earth or Apple Maps satellite view to explore their local area to see how many solar panel installations they can identify. As a class, discuss the limitations of this type of data gathering (for example, image date, identifying suburb boundaries, image resolution) and then introduce the data set.
Step 2: Is this dataset useful?
Now let’s look at the first two columns in the dataset. The first is historical installations from 2001 to 2016. The dataset doesn’t seem to have any data for installations prior to 2001, but that’s not because there were no installations. What might be the reason?
It turns out it’s because 2001 was when the Government introduced the mandatory renewable energy target and began tracking renewable energy.
Students may ask, ‘How many solar panels are actually operating currently?’
We can’t exactly tell that from this dataset. This data tracks installations. It doesn’t track people getting rid of their solar panels, or the panels ceasing to work. Installations are a reasonable, but not perfect, measure of how much solar we have.
This question can springboard a useful conversation about the data we want, versus the data we have, and how many data studies work with flawed or missing data, simply because it’s all that is available.
Class discussion: How might we find out how much solar is actually operating right now in Australia? What organisations might have that information?
Girls in Focus:
In class discussions, pay attention to which students are contributing. If you notice that there is a gender imbalance in contributions, consider using a strategy such as think-pair-share to encourage more girls to have a voice in discussions.
Step 3: Sorting the data
Look at the first column in the dataset. Having it sorted by postcode is logical, but not terribly interesting. Let’s look at the top 20 postcodes. To do that, we can sort the entire table by the second column (how many installations happened between 2001 and 2016), in descending order. In other words, put the largest values at the top.
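For classes working in Python rather than a spreadsheet, the sort can be sketched as follows. This is an illustration only: the short column names and the helper `top_postcodes` are assumptions made for the example, and the four rows are copied from the dataset so that the result can be checked by eye.

```python
# A few rows from the dataset, deliberately out of order.
# The keys "postcode" and "installations" are shortened, hypothetical column names.
rows = [
    {"postcode": "4350", "installations": 7766},
    {"postcode": "4670", "installations": 10598},
    {"postcode": "6210", "installations": 10101},
    {"postcode": "4655", "installations": 9346},
]


def top_postcodes(rows, column, n=20):
    """Sort rows by `column`, largest first, and keep the first n rows."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]


for row in top_postcodes(rows, "installations", n=3):
    print(row["postcode"], row["installations"])
# 4670 10598
# 6210 10101
# 4655 9346
```

Sorting whole rows, rather than a single column, keeps each postcode attached to its own installation count.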
Table 1: Solar Installations sorted by Previous Years column, largest to smallest
Small Unit Installation Postcode | Previous Years (2001-2016) - Installations Quantity | Previous Years (2001-2016) - SGU Rated Output In kW |
---|---|---|
4670 | 10598 | 34,298.31 |
6210 | 10101 | 25,929.69 |
4655 | 9346 | 27,713.17 |
4551 | 8611 | 23,956.86 |
4350 | 7766 | 25,187.09 |
6065 | 7174 | 22,991.62 |
4211 | 6990 | 23,588.38 |
4305 | 6773 | 19,862.10 |
4740 | 6443 | 26,016.07 |
4207 | 6399 | 20,911.23 |
6155 | 6341 | 18,627.15 |
4570 | 6248 | 20,379.99 |
3029 | 6203 | 18,688.43 |
3977 | 6146 | 18,643.95 |
6164 | 5986 | 17,725.29 |
4556 | 5899 | 17,801.28 |
4306 | 5806 | 19,665.36 |
6112 | 5785 | 17,822.82 |
4510 | 5740 | 18,658.14 |
A quick glance shows us that the majority of the top 20 postcodes start with a 4, indicating they’re in Queensland. (To check postcodes refer to a postcode site.)
The top postcode, 4670, covers 53 regions, including Bundaberg. There’s a surprisingly large gap between the top postcodes and the bottom of the top 20, which is interesting. Most of the postcodes in this list that aren’t in Queensland are in Western Australia, except for 3029, which is west of Melbourne, around Hoppers Crossing, and 3977, which is south-east of Melbourne, in the Cranbourne area.
There’s a rich conversation to be had around why these suburbs have so much more solar than other places in Victoria. Toorak, for example, a famously wealthy suburb, comes in at 1701 on the list. Areas with a lot of new housing are more likely to have solar, as it gets put in when the house is built as a way to increase the energy rating of the house.
This is a topic worth exploring! You don’t have to know all the answers, as it’s an opportunity for the students to research and explore, and come up with their own theories for why it might be the case.
Figure 1: Data visualised in Google MyMaps: 20 postcodes with highest number of installations
Step 4: Data analysis
Let’s look at column 4: solar installations in January 2017. How different are the top 20 if you sort the whole table by this column?
Table 2: Solar Installations sorted by January Installations
Small Unit Installation Postcode | Previous Years (2001-2016) - Installations Quantity | Previous Years (2001-2016) - SGU Rated Output In kW | Jan 2018 - Installations Quantity |
---|---|---|---|
6065 | 7,174 | 22,991.619 | 99 |
3977 | 6,146 | 18,643.949 | 82 |
6210 | 10,101 | 25,929.687 | 77 |
6164 | 5,986 | 17,725.288 | 72 |
4211 | 6,990 | 23,588.384 | 71 |
4670 | 10,598 | 34,298.313 | 67 |
6112 | 5,785 | 17,822.82 | 65 |
6000 | 220 | 1,187.91 | 59 |
4655 | 9,346 | 27,713.172 | 55 |
6155 | 6,341 | 18,627.154 | 55 |
4510 | 5,740 | 18,658.14 | 52 |
4300 | 5,563 | 17,935.325 | 49 |
6069 | 4,296 | 13,534.08 | 46 |
4560 | 4,544 | 13,567.41 | 45 |
2259 | 4,215 | 12,031.366 | 45 |
6031 | 1,671 | 5,367.85 | 44 |
4680 | 5,472 | 19,982.779 | 42 |
6169 | 4,546 | 11,636.551 | 42 |
6107 | 3,725 | 10,233.827 | 42 |
Now WA scores better, and most of the remaining postcodes are still in Queensland, except for one Victorian postcode (the Cranbourne area again) and, this time, one NSW postcode.
Why do WA and Queensland do so well on both historic and recent measures? This is an opportunity to explore the politics and have your students find out what incentives there are to install solar in those states. Could it be due to solar feed-in tariffs, government incentives, or home energy rating requirements?
Girls in Focus:
Students could work in small groups to explore why some suburbs have so much more solar than others. Research has shown that girls respond well to cooperative learning opportunities. Students may wish to compare local postcodes, where they can more easily connect the postcode with a physical location, and will have some initial ideas about socioeconomic and social factors.
Step 5: Data calculations
Calculate the average solar installations per postcode per state. You can do this by sorting the data by postcode, and manually copying and pasting each state into a separate sheet, or you can write a Python script to sort the data into a separate file for each state using the first digit of the postcode.
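One possible shape for such a Python script is sketched below. It is not the program used to produce the lesson’s example output. The first-digit mapping is a simplified assumption (it places the ACT with NSW and labels any other first digit as "Other"), the function names are made up for this example, and the five sample rows are only there to show the mechanics, so the printed averages are not real state averages.

```python
from collections import defaultdict

# Simplified, assumed mapping from the first digit of a postcode to a state.
STATE_BY_FIRST_DIGIT = {
    "0": "NT", "2": "NSW/ACT", "3": "VIC", "4": "QLD",
    "5": "SA", "6": "WA", "7": "TAS",
}


def state_for(postcode):
    """Look up the state from the first digit of the four-digit postcode."""
    first_digit = str(postcode).zfill(4)[0]
    return STATE_BY_FIRST_DIGIT.get(first_digit, "Other")


def average_by_state(rows):
    """rows is a list of (postcode, installations) pairs; returns {state: average}."""
    totals = defaultdict(lambda: [0, 0])  # state -> [sum of installations, postcode count]
    for postcode, installations in rows:
        entry = totals[state_for(postcode)]
        entry[0] += installations
        entry[1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Five sample rows only; a real run would use every postcode in the dataset.
sample = [("4670", 10598), ("4655", 9346), ("6210", 10101), ("3029", 6203), ("3977", 6146)]
for state, average in sorted(average_by_state(sample).items()):
    print(state, average)
# QLD 9972.0
# VIC 6174.5
# WA 10101.0
```

Students who have not yet restored the leading zeros will still get the right state, because `state_for` pads the postcode to four digits before reading the first one.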
The media misreported this data by using a naive analysis technique – just sorting the data. It is an easy mistake to make. Despite WA and Queensland dominating the top 20 postcodes list, when you calculate the average for each state, you get a quite different picture.
For examples of the media ‘analysis’ go to:
‘Queensland is leading Australia’s rooftop solar boom with eight of the country’s top 10 postcodes for installations in the Sunshine State, according to the new Clean Energy Australia 2018 report.’
The Sydney Morning Herald, June 16, 2018
Compare the media analysis with the averages by state.
Check that students’ results match the example below. (Note: in the example, the ACT has been included in the NSW postcode data, as it makes the postcode handling easier.)
Figure 2: Python program output: Average solar installations by state
Interestingly, this shows that the state that dominates the top 20 doesn’t perform as well when you average over all of its postcodes, so there is another rich conversation to be had about different ways of ranking data outcomes, and how you can characterise data in accurate but misleading ways.
The media reported this data saying that Queensland and WA were the best for solar. However, although they had the highest-ranking postcodes, as states they ranked very low.
Step 6: Data – interrelated
Students can continue to explore the different columns. Alternatively, as a further challenge look at how the columns are related. For example, are postcodes with a lot of historical solar installations also likely to have a lot of recent ones? You can do that roughly by eye, simply by looking at whether the top twenty, when sorted by those two columns, is similar or very different, or you can use the correlation function to find out whether the columns are correlated.
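Spreadsheets provide a built-in correlation function. For students who want to see what that function does, here is a minimal Python sketch of the Pearson correlation coefficient. The function name `pearson` is made up for this example, and the five pairs of values are rows taken from the table sorted by January installations. A list that has already been sorted by one column is not a fair sample, so the real calculation should use every postcode in the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two columns vary together, and how much each varies on its own
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Historical installations and January installations for five postcodes
historical = [6146, 10101, 5986, 6990, 10598]
january = [82, 77, 72, 71, 67]
print(round(pearson(historical, january), 2))
```

A result close to 1 means postcodes with many historical installations also tend to have many recent ones, a result close to -1 means the opposite, and a result near 0 suggests no straight-line relationship.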
Step 7: Visualise and aggregate data
Students can visualise their data. Students first consider the aim of their visualisation.
Use question prompts such as:
- What is it you want to highlight about the data?
- For example, do you want to highlight January Installations, the states that do best with solar installations in total, or the change over time?
- Which column or summary statistic shows that point best?
- What is the best graph type for that information?
- This is a useful tutorial on graph types.
- How much data do you want to display?
- Trying to graph all of the information will make your graph cluttered and your graph elements hard to read.
- Can you aggregate the data in some way? For example, group by state instead of by postcode, or show the number of postcodes above 20,000, the number between 15,000 and 20,000, and the number between 10,000 and 15,000. Counting how many values fall into each range like this produces a histogram.
- How would the data display on an online map?
- The data could be plotted on an online map such as Google MyMaps.
- Data could be collated in tabs on a csv spreadsheet for each state and territory and uploaded. The data can be selected and deselected to visualise data.
- In order to effectively upload data using online mapping software, the location needs to be identified in some way, often using latitude and longitude. In our dataset we have postcode data. However, for that data to be plotted more accurately, adding the suburb information helps. (Note: this can be time-consuming for a large dataset.)
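The aggregation prompt above, counting how many postcodes fall into each range, can be sketched in Python as follows. The helper name `bin_counts` and the bin width of 1000 are assumptions chosen for this example, and the eight values are installation counts copied from the top of the sorted dataset.

```python
from collections import Counter


def bin_counts(values, width):
    """Count how many values fall into each bin of the given width.

    Each bin is labelled by its lower edge, e.g. 9000 covers 9000 to 9999.
    """
    counts = Counter((value // width) * width for value in values)
    return dict(sorted(counts.items()))


installations = [10598, 10101, 9346, 8611, 7766, 7174, 6990, 6773]
for start, count in bin_counts(installations, 1000).items():
    # One '#' per postcode gives a quick text histogram
    print(f"{start}-{start + 999}: {'#' * count}")
```

Changing the bin width is a useful experiment: very wide bins hide detail, while very narrow bins leave most bins holding a single postcode.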
Girls in Focus:
Students could work in small groups to design an infographic to communicate relevant data to a specific audience. Girls value context-based learning and opportunities to be creative. By posing a challenge, such as convincing the government to provide more incentives for home owners to install solar, students can be encouraged to select appropriate data and find interesting, persuasive ways to visualise that data and communicate trends.
It is important, however, that girls are not just given the task of ‘making the infographic pretty’, but are actively involved in analysing and visualising the data. One way to ensure this is to require that each member of the group contributes a different data visualisation to the infographic. The group will need to decide on a style and consistent conventions, and all members will need to adopt that approach.
Table 3: Dataset with suburb name and state/territory added
Small Unit Installation Postcode | Suburb | Previous Years (2001-2016) - Installations Quantity | Previous Years (2001-2016) - SGU Rated Output In kW |
---|---|---|---|
4670 | RUBYANNA, QLD | 10598 | 34298.313 |
6210 | COODANUP, WA | 10101 | 25929.687 |
4655 | WALLIEBUM, QLD | 9346 | 27713.172 |
4551 | SHELLY BEACH, QLD | 8611 | 23956.862 |
4350 | CRANLEY, QLD | 7766 | 25187.085 |
6065 | ASHBY, WA | 7174 | 22991.619 |
4211 | BINNA BURRA, QLD | 6990 | 23588.384 |
4305 | BASIN POCKET, QLD | 6773 | 19862.104 |
4740 | WEST MACKAY, QLD | 6443 | 26016.072 |
4207 | EDENS LANDING, QLD | 6399 | 20911.231 |
6155 | CANNING VALE SOUTH, WA | 6341 | 18627.154 |
4570 | CHATSWORTH, QLD | 6248 | 20379.991 |
3029 | HOPPERS CROSSING, VIC | 6203 | 18688.427 |
3977 | CRANBOURNE, VIC | 6146 | 18643.949 |
6164 | BANJUP, WA | 5986 | 17725.288 |
4556 | BUDERIM, QLD | 5899 | 17801.277 |
4306 | WIVENHOE POCKET, QLD | 5806 | 19665.363 |
6112 | MOUNT NASURA, WA | 5785 | 17822.82 |
4510 | CABOOLTURE, QLD | 5740 | 18658.14 |
Another technique would be to colour a map by number of solar installations, say, bright red for more than 9000 and becoming paler for each drop of 1000. This is called a choropleth map. Colouring every postcode individually would be rather time-consuming given that there are 2795 postcodes listed, so this is an opportunity to consider aggregating your data and colouring by state. It’s a great example of not needing complex technical skills to explore a dataset. You can use National Map to do this.
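The colour rule described above can also be written as a small function. This is a sketch under stated assumptions rather than a feature of National Map or any mapping tool: the function name `shade_for`, the cap of nine paler bands, and the particular hex colours are all choices made for this example.

```python
def shade_for(installations):
    """Return a hex colour: bright red above 9000, paler for each drop of 1000.

    Band 0 is more than 9000, band 1 is 8001 to 9000, and so on, capped at band 9.
    """
    if installations > 9000:
        band = 0
    else:
        band = min(9, (9000 - installations) // 1000 + 1)
    # Raising the green and blue channels makes the red look paler
    green_blue = band * 25
    return f"#ff{green_blue:02x}{green_blue:02x}"


for count in [10598, 8611, 0]:
    print(count, shade_for(count))
# 10598 #ff0000
# 8611 #ff1919
# 0 #ffe1e1
```

The same function works for state totals or averages if the thresholds are adjusted to suit the range of the aggregated values.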
Resources
- Australian Government Clean Energy Regulator: Postcode data for small-scale installations
- Solar PV Maps and Tools
Understand the Australian solar PV market with live generation data, historical maps and animations, and tools to explore rooftop PV potential and per-postcode market data.
- National Map
National Map is an online map-based tool to allow easy access to spatial data from Australian government agencies. Mapping data can be added, but needs to include latitude and longitude data.
- Find a postcode
This postcode finder from Australia Post is a quick and easy way to search and check postcodes for all suburbs and locations around Australia.
- Postcodes in Australia
Provides postcode ranges for Australian states and territories.
- Graphing tutorial
Useful tutorial on graph types
- Solar Energy Installations: Assessment checklist: MS Word
- Solar Energy Installations: Assessment checklist: PDF