DrivenData: Hakuna Ma-data: Identify Wildlife on the Serengeti with AI for Earth
Leverage millions of images of animals on the Serengeti to build a classifier that distinguishes between gazelles, lions, and more!
In this competition, participants will predict the presence and species of wildlife in new camera trap data from the Snapshot Serengeti project, which boasts over 6 million images.
Camera traps are motion-triggered systems for passively collecting animal behavior data with minimal disturbance to the animals’ natural tendencies. They are an invaluable tool in conservation research, but the sheer amount of data they generate presents a huge barrier to using them effectively. This is where AI can help!
There are two immediate challenges where efforts like this competition are needed:
- Camera traps can’t automatically label the animals they observe, creating an immense (and sometimes prohibitive) burden on humans to determine where and what wildlife are present.
- The automated animal tagging models that do exist don’t generalize well across time and locations, which severely limits their usefulness with new data.
To address these challenges, we’re challenging data scientists, researchers, and developers from around the world to build the best algorithms for wildlife detection.
The competition is designed with a few objectives in mind:
- Innovation: Participants use state-of-the-art approaches in computer vision and AI and get live feedback on how well their solutions perform.
- Generalization: This competition is designed to reward the best generalizable solutions. The private test data used to determine the winners will come entirely from the latest, unreleased season of data from the Snapshot Serengeti project (season 11). For more information on the competition timeline and evaluation, see the problem description.
- Execution: Models are trained locally and submitted to execute inference in the cloud – read on!
- Openness: All prize-winning models are released under an open source license for anyone to use and learn from.
This is a brand-new kind of DrivenData challenge! Previous models trained on camera trap images have often failed to generalize well. In this competition, we want to reward the models that generalize best to new images, so you won’t interact directly with the test set. Rather than submitting predicted labels for a test set you already hold, you’ll package up everything needed to run inference and send it to us. We’ll execute that code on Azure in a Docker container that has access to the test set images. By leveraging Microsoft Azure’s cloud computing platform and Docker containers, we’re moving our competition infrastructure one step closer to translating participants’ innovation into impact.
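To make the code-execution idea concrete, here is a minimal sketch of what a packaged inference entry point could look like. It is an illustration only, not the official submission format: the label list, the `image` column name, the CSV layout, and the function names `predict_image` and `run_inference` are all assumptions, and the placeholder model simply returns equal probabilities where a trained classifier would go. See the problem description for the actual requirements.

```python
import csv
from pathlib import Path
from typing import Dict

# Hypothetical label set; the real competition defines its own classes.
SPECIES = ["empty", "gazelle", "lion"]


def predict_image(image_path: Path) -> Dict[str, float]:
    """Placeholder for a trained model: returns a uniform probability per label."""
    p = 1.0 / len(SPECIES)
    return {name: p for name in SPECIES}


def run_inference(image_dir: Path, output_csv: Path) -> int:
    """Score every JPEG under image_dir and write one CSV row per image.

    Returns the number of images scored. The column layout is an assumption
    made for illustration.
    """
    image_paths = sorted(
        p for p in image_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    )
    with output_csv.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["image", *SPECIES])
        writer.writeheader()
        for path in image_paths:
            # Store paths relative to the image root so the output is portable.
            row = {"image": path.relative_to(image_dir).as_posix()}
            row.update(predict_image(path))
            writer.writerow(row)
    return len(image_paths)
```

Inside the container, a call such as `run_inference(Path("/data"), Path("submission.csv"))` would walk the mounted images and produce the predictions file. Both paths are placeholders. The point of the design is that the code only ever sees the test images at execution time, never on your own machine.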
We can’t wait to run what you come up with!