With an impressive background in consulting for government clients in emergency management, defence, and intelligence, Heidi brings a wealth of knowledge and experience to the table. Currently working as a Machine Learning Engineer at Pachama, a groundbreaking startup dedicated to leveraging technology to combat climate change, Heidi’s insights shed light on the exciting possibilities of satellite imaging.
Heidi, what motivated you to work in the field of satellite imaging?
My background is in mathematics. I was always passionate about the intersection of mathematics and geography and looking at how we can bring mathematical methods to bear on geographic questions. That naturally led to an interest in satellite imagery.
If we have no constructed maps or survey maps, but we do have real-time (or near real-time) information about the surface of the planet… I realised that I could take a lot of the same mathematical tools that I had learned and apply them to satellite imagery to extract these geographic insights. So, my interest was always in applying mathematics to these geography-related questions.
What’s most interesting to you about the field of satellite imaging?
The surface of the earth is always changing. The location of cars is always changing. The buildings that people are putting up and taking down are always changing. What excites me most about this field is having the opportunity to have an unprecedented view. It’s almost as if you’re an astronaut at the International Space Station looking at the surface of the earth to understand how things are changing – and then to trace those changes all the way down to their impact at the individual level and at the societal level, and to really understand how these systems are connected.
Can you explain the differences between satellite image data and the data that we use in our daily lives?
So often, when people think about computer vision or processing imagery, they see examples of models that can detect the difference between a cat and a dog. And what we’re doing with satellite imaging is in some ways similar, but in some ways very different because the imagery itself is very different.
The type of imagery that comes from your phone has three channels, RGB: red, green, and blue. Some satellite imagery has those channels as well, but may have additional channels, too. I’d say there are three differences between regular imagery and satellite imagery. One is the resolution. The second is the band, that is, the wavelengths captured by the sensors – and the third is the spatial information or metadata that comes with the image.
In satellite imagery, the resolution is limited by the sensor that you’re working with. With very high resolution imagery, each pixel might correspond to 20 centimetres on the ground, whereas with something that’s very low resolution, like the Landsat satellites, it might correspond to 15 metres. Either way, the resolution has physical meaning in the real world. The second is the spectral band.
Take traditional imagery…if you just take a picture with your phone, it will have three bands – red, green, and blue. With satellite imaging, some satellites have additional bands. So, they’ll have near infrared bands or pan (panchromatic) bands that provide additional information that can be used to detect things that humans can’t see, which again, from a data processing perspective, is a far more interesting question. We don’t just want to train algorithms to see what humans see.
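To illustrate how an extra band can reveal something the eye cannot, here is a minimal sketch of one widely used calculation, the Normalised Difference Vegetation Index (NDVI), which combines the red and near infrared bands. Heidi does not name NDVI in the interview, so treat it as an illustrative assumption; the function name and the tiny reflectance grids are hypothetical.

```python
def ndvi(red, nir):
    """Return per-pixel NDVI, (NIR - red) / (NIR + red), for two 2D band grids."""
    if len(red) != len(nir):
        raise ValueError("bands must have the same number of rows")
    result = []
    for red_row, nir_row in zip(red, nir):
        if len(red_row) != len(nir_row):
            raise ValueError("bands must have the same number of columns")
        row = []
        for r, n in zip(red_row, nir_row):
            total = n + r
            # Guard against division by zero for pixels with no signal.
            row.append((n - r) / total if total else 0.0)
        result.append(row)
    return result


# Hypothetical reflectance values for a 2 x 2 pixel patch (not real data).
red_band = [[0.10, 0.30], [0.05, 0.25]]
nir_band = [[0.50, 0.30], [0.45, 0.20]]

index = ndvi(red_band, nir_band)
for row in index:
    print([round(value, 2) for value in row])
```

Healthy vegetation reflects strongly in the near infrared and weakly in red, so values close to 1 tend to indicate dense vegetation, while values near or below zero tend to indicate bare ground, water, or built surfaces.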
And then the last difference is the spatial information and the metadata. When you take an image from a satellite, it will contain information about where on earth that is, and the altitude, the angle, and the time of day, all of which provides additional metadata that can be used to build advanced models about what’s happening within that image.
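The link between pixel size and location metadata can be sketched in a few lines. This is a simplified illustration, assuming a north-up image with square pixels and projected coordinates in metres; the `GeoTransform` class, its field names, and the origin coordinates are hypothetical and not taken from any particular library or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoTransform:
    """Hypothetical metadata for a north-up image with square pixels."""
    origin_x: float    # easting of the image's top-left corner, in metres
    origin_y: float    # northing of the image's top-left corner, in metres
    pixel_size: float  # ground distance covered by one pixel, in metres

    def pixel_to_ground(self, col, row):
        """Map a pixel (column, row) to ground coordinates in metres."""
        x = self.origin_x + col * self.pixel_size
        # Rows count downward from the top of the image, so northing decreases.
        y = self.origin_y - row * self.pixel_size
        return x, y

    def footprint_km(self, width_px, height_px):
        """Return the ground width and height covered by the image, in km."""
        return (width_px * self.pixel_size / 1000.0,
                height_px * self.pixel_size / 1000.0)


# Assumed values: 20 cm per pixel (very high resolution), 15 m per pixel (coarse).
high_res = GeoTransform(origin_x=500000.0, origin_y=4100000.0, pixel_size=0.2)
coarse = GeoTransform(origin_x=500000.0, origin_y=4100000.0, pixel_size=15.0)

print(high_res.footprint_km(1000, 1000))
print(coarse.footprint_km(1000, 1000))
print(coarse.pixel_to_ground(100, 200))
```

Under these assumptions, the same 1,000 by 1,000 pixel image covers roughly 200 metres by 200 metres at 20 centimetres per pixel, and 15 kilometres by 15 kilometres at 15 metres per pixel, which is why the resolution has physical meaning on the ground.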
How is the data acquired, and which types of data are you actually using for your analysis?
There are a variety of different satellites out there that have different resolutions. And in addition, there are a number of other platforms besides just satellites. So, with regards to satellites, there are commercially available sources, the likes of Planet Labs and DigitalGlobe. There are also government sources that are publicly available.
For example, Landsat data is very coarse resolution data that’s great for land cover and vegetation and disease. And then there are also government sources that are not publicly available, and in addition to the satellite imagery sources, there are other sources from lower altitude platforms. In particular, one area of interest right now in terms of research and development is something called HAPS, which is High Altitude Platform Systems. These are systems that operate at the altitude of a weather balloon, so lower than a satellite but higher than a plane.
These are also systems that can persist in the atmosphere for a significant amount of time, on the order of hours, days, and weeks, but not years. Their advantage over a satellite is that they can be beneath some of the clouds, so you can receive different types of imagery. Imagine you have a similar sensor on a weather balloon rather than on a satellite. You’re going to get higher resolution data, and you’re also going to be able to avoid some of the atmospheric influence from clouds and other things. There’s a variety of sensors available in this space, and that’s not to mention the traditional imagery sources from aircraft or imagery sources from drones.
What limitations and challenges are there with Satellite Imaging data? How do you overcome these?
There’s certainly no scarcity of challenges in this domain. I will point out one issue that you mentioned, and that I’ve mentioned previously: the weather.
So, you can imagine there are a lot of objects of interest, particularly around object detection in the national security domain. A lot of these objects of interest aren’t found in perfectly sunny places with nice weather, and in particular, trying to find objects in snowy conditions, in winter conditions, and in low light conditions presents very serious challenges, both from an object detection standpoint and from an imagery sourcing standpoint. If you have outdated imagery, it’s going to be very difficult to find things.
Another challenge that we face – and I think this is a challenge that’s quite common to a lot of people working across data science as a discipline – is data labelling.
If we’re training an algorithm or building a detection algorithm, we need a training data set that contains appropriately labelled instances of whatever it is we’re trying to detect. Now, in some cases, for example, we have commercial applications that count the number of cars in a parking lot. There, it’s not difficult to obtain and label a significant corpus of information to allow these algorithms to be successful. But with, for instance, rare classes of aircraft in the winter, it’s very difficult to obtain the base data that’s needed to train up these models.
What developments are happening in your field that you are most excited about?
Read Heidi’s answer in our full article, over in our magazine here. LINK: https://issuu.com/datasciencetalent/docs/the_data_scientist_mag_issue_2_digital_copy_for_is/s/19459891