Machine learning algorithm locates nearly all US solar panels

Stanford researchers have located almost all US solar panels by applying a machine learning algorithm to one billion satellite images. They identified the GPS locations of the solar panels as well as their sizes. By using publicly-available data, they identified factors that promote solar energy use. They also identified factors that discourage the use of solar energy.

The researchers wrote about their work in the journal Joule (citation below).

Solar energy involves capturing the Sun’s energy and converting it into electricity. The term ‘solar power’ has the same meaning.

Knowing which Americans have installed solar panels on their roofs, and why, would be extremely useful for managing the country’s evolving electricity system. It would also help us understand what barriers there are to greater use of renewable energy sources.

Until now, however, we have only had estimates of how many people use renewable energy.

Renewable energy is energy that originates from an everlasting source. In other words, on a human timescale, that source never runs out. Wind energy, solar energy, and geothermal energy, for example, are renewable sources of energy. We never run out of wind, energy from the Sun, or the Earth’s internal heat.

Stanford scientists’ machine learning algorithm detected 1.47 million solar rooftop installations in the USA.

Scientists used a machine learning algorithm

To get more accurate numbers, researchers analyzed over one billion high-resolution satellite images. They used a machine learning algorithm that detected almost all the solar power installations in the contiguous forty-eight states.

According to their analysis, there are 1.47 million installations. This figure is significantly higher than two widely-recognized estimates.

The research team also integrated US Census plus other data with their solar catalog. This allowed them to identify factors that led to solar power adoption.

Study author Ram Rajagopal said:

“We can use recent advances in machine learning to know where all these assets are, which has been a huge question, and generate insights about where the grid is going and how we can help get it to a more beneficial place.”

Rajagopal is an Associate Professor of Civil and Environmental Engineering at Stanford University. Prof. Rajagopal and Arun Majumdar, a Professor of Mechanical Engineering at Stanford, supervised the project.

Who opts to install solar panels?

The authors say their data could be useful to regulators and utilities. It could also be useful to solar panel marketers and other stakeholders.

If electricity providers know how many solar panels there are in an area, they can more easily balance supply and demand. Balancing electricity supply and demand is the key to a reliable grid.

Their data highlights not only the drivers of solar deployment but also the barriers. They found that, for example, household income is an important factor, but only to a point.

Income helps determine whether people install solar panels if the household earns $150,000 or less per year. Above that level, income quickly stops playing much of a role in households’ decisions.

On the other hand, medium- and low-income households are less likely to install solar panels even in areas where doing so would be profitable over the long-term.

Could installers of solar panels exploit unmet demand?

In sunny areas with relatively high electricity prices, for example, electricity bill savings would be more than the monthly cost of the equipment. However, the barrier for medium- and low-income households is meeting the upfront cost, the authors believe.

This could mean that there is unmet demand which solar installers could exploit if they developed new financial models.
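The trade-off between monthly savings and upfront cost can be illustrated with a short Python sketch. All figures below are hypothetical and chosen only for illustration; they do not come from the study.

```python
def monthly_net_saving(bill_saving, equipment_payment):
    """Monthly gain once the system is installed and financed."""
    return bill_saving - equipment_payment


def months_to_recover(upfront_cost, bill_saving):
    """Months of bill savings needed to pay back a cash purchase."""
    return upfront_cost / bill_saving


# Hypothetical household in a sunny area with high electricity prices.
bill_saving = 120        # dollars saved on the electricity bill per month
equipment_payment = 95   # dollars per month if the equipment is financed
upfront_cost = 15_000    # dollars if the system is bought outright

net = monthly_net_saving(bill_saving, equipment_payment)
months = months_to_recover(upfront_cost, bill_saving)
print(f"Net saving when financed: ${net} per month")
print(f"Payback when bought outright: {months:.0f} months")
```

In this made-up case the household comes out ahead every month if it can spread the cost, yet buying outright ties up money for more than ten years, which is the kind of barrier the authors describe.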

The researchers used publicly available data for US Census tracts to overlay socioeconomic factors. Each tract covers about 1,700 households on average, which is roughly 4% of a typical county or half the size of a ZIP code.
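A minimal Python sketch of this kind of overlay might count detected installations per tract and express them per 1,000 households. The tract names, detections, and household counts are invented for illustration, and the function is a hypothetical helper rather than the researchers’ code.

```python
from collections import Counter

# Hypothetical detections: (tract identifier, panel area in square meters).
detections = [
    ("tract_A", 22.5),
    ("tract_A", 31.0),
    ("tract_B", 18.0),
    ("tract_A", 27.5),
]

# Hypothetical household counts per tract (census-style data).
households = {"tract_A": 1700, "tract_B": 1500, "tract_C": 1900}


def installs_per_1000_households(detections, households):
    """Return solar installations per 1,000 households for every tract."""
    counts = Counter(tract for tract, _area in detections)
    # Counter returns 0 for tracts with no detections.
    return {
        tract: 1000 * counts[tract] / n_households
        for tract, n_households in households.items()
    }


density = installs_per_1000_households(detections, households)
for tract, value in density.items():
    print(f"{tract}: {value:.2f} installations per 1,000 households")
```

Once adoption is expressed per tract in this way, it can be compared against tract-level variables such as income.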

The team also discovered that solar panel penetration takes off when it reaches a certain level in a neighborhood. This is not surprising. However, this does not happen in areas with high levels of income inequality.

The researchers also discovered an important threshold of how much sunlight a specific area needs to trigger adoption.

Prof. Majumdar said:

“We found some insights, but it’s just the tip of the iceberg of what we think other researchers, utilities, solar developers and policymakers can further uncover. We are making this public so that others find solar deployment patterns, and build economic and behavioral models.”

Training DeepSolar to find the solar panels

The researchers trained DeepSolar, their machine learning algorithm, to find solar panels by providing it with approximately 370,000 images. Each image covered about 100 feet x 100 feet and was labelled as either having or not having a solar panel in it.

DeepSolar then learned to identify features related to solar panels. Size, texture, and color, for example, were some of the features.

Co-author Jiafan Yu, a doctoral candidate in electrical engineering, said:

“We don’t actually tell the machine which visual feature is important. All of these need to be learned by the machine.”

Yu built the system with co-author Zhecheng Wang, a doctoral candidate in civil and environmental engineering.

Eventually, the machine learning program could correctly identify an image as containing solar panels 93% of the time. It missed approximately 10% of images that did have solar panels.
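The two figures above can be read as what machine learning practitioners usually call precision (the share of flagged images that really contain panels) and recall (the share of images with panels that the model finds). The Python sketch below shows how these metrics are computed, assuming that reading. The counts are hypothetical, chosen only to roughly reproduce the reported percentages, and are not taken from the study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Compute precision and recall from confusion-matrix counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical test set containing 1,000 images with solar panels.
true_pos = 900   # images with panels that the model flagged
false_neg = 100  # images with panels that the model missed
false_pos = 68   # images without panels that the model flagged anyway

precision, recall = precision_recall(true_pos, false_pos, false_neg)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

A model can score well on one metric and poorly on the other, which is why both are reported.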

On both scores, however, DeepSolar is more accurate than previous models, the authors said.

DeepSolar analyzed one billion images

The researchers then got DeepSolar to analyze one billion satellite images to find the ones with solar panels present. Existing technology would have taken years to complete that task. DeepSolar, with some novel efficiencies, completed the job in one month.
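To give a sense of scale, the following Python sketch works out the average processing rate implied by one billion images in one month. It assumes a 30-day month and continuous operation, neither of which is stated in the article.

```python
SECONDS_PER_DAY = 24 * 60 * 60

total_images = 1_000_000_000
days = 30  # assumption: "one month" taken as 30 days of nonstop processing

# Average number of images that must be classified every second.
images_per_second = total_images / (days * SECONDS_PER_DAY)
print(f"About {images_per_second:.0f} images per second on average")
```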

The researchers’ database now has information not only on residential solar panels, but also those on the roofs of companies as well as the solar power plants of many large utility companies.

The authors made DeepSolar skip the least densely populated areas. Buildings in sparsely populated areas are less likely to have solar panels. The ones that do are unlikely to be attached to the grid.

The researchers estimated that 5% of the solar panels on homes and businesses are in these sparsely populated areas.

Wang said:

“Advances in machine learning technology have been amazing. But off-the-shelf systems often need to be adapted to the specific project and that requires expertise in the project’s topic. Jiafan and I both focus on using the technology to enable renewable energy.”

The team plans to expand DeepSolar’s database to include solar panels in rural areas. They also want to add other countries for which high-resolution satellite images are available. They would also like to be able to calculate the angle and orientation of the solar panels. Angle and orientation make it possible to estimate power generation.


“DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States,” Jiafan Yu, Zhecheng Wang, Arun Majumdar, Ram Rajagopal. Joule, Volume 2, Issue 12, pp. 2605-2617, December 19, 2018. DOI: