The Dawn of AI-Powered Geolocation
GeoSpy Astonishes Open Source Intelligence Community on Christmas Day
Digital Digging wrote today about a brand new artificial intelligence tool for geolocating photos, GeoSpy. Despite its Christmas Day launch, GeoSpy is rapidly gaining popularity among the open-source intelligence (OSINT) community. The initial release, version 0.1, is showing promising results, especially in urban environments.
GeoSpy is a product of Graylark, a Boston-based company established by three brothers, all triplets. One of them, Daniel Heinen, worked with AI in the defense industry on so-called Unmanned Autonomous Systems. I talked to him tonight via e-mail and DM (see full interview below).
He is almost apologetic about the tool. “What we released is a very basic version of what we have in mind. At first, we didn’t want to release it, but in the end, we wanted the community to help drive the direction. It has a lot of bugs and things we need to add. I never thought it would get the attention it did.”
Well, he got my attention. His tool passed the Santo Domingo test, and that’s impressive in my book. I love to figure out how #ai can help professionals find answers faster; see #ChatGPT : Unlocking Visual Search. In that article, one of my students at the Walter Cronkite Institute in Phoenix asks the classic geolocation question: "Where am I?"
I needed just three queries to ChatGPT to discover that the image was from the Dominican Republic, and I was impressed by the speed of its accurate response. Pinpointing the city, Santo Domingo, took a bit longer.
Welcome to the next generation of AI-driven geolocation tools. Unlike traditional methods, GeoSpy doesn't require interrogation. Simply provide a photo, and it will attempt to determine the location. It will sometimes fail miserably (hey, it's version 0.1), but it is able to impress too, as you can see below. Country and city were predicted correctly; the location was 2 km off.
That’s amazingly good for a first public geolocation tool of this kind. The concept is based on what is mentioned in this paper and became news a few weeks ago.
Earlier, it beat human experts in geoguessing:
The technique worked, but there wasn’t a public tool available yet. Daniel Heinen: “I would like to say that we stand on the shoulders of giants. We couldn’t do it without the AI community and all the work that has been done by everyone already. We are applying their ideas, suggestions, and techniques to cybersecurity applications.”
So how does it work? The program uses, amongst other techniques, CLIP. OpenAI, the group behind ChatGPT, introduced it. It's a brainy tool that learns to understand pictures by reading the descriptions that come with them. This way, it gets better at recognizing what's in images. The cool part is, you don't have to train it specifically for each task; just tell it what to look for using words, and it can handle a variety of challenges without extra work. It's a big deal because it makes things simpler and works better in real-world scenarios than older methods. OpenAI did this by teaching it with lots and lots of internet pictures and the words that come with them. This approach helps it to understand and classify all sorts of images, whether they're pictures of dogs, sketches, or even different styles of photos. It's the subtle art of not knowing all there is, but just using what is out there, and it's making computers better at understanding the world visually.
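To make "just tell it what to look for using words" a bit more concrete, here is a minimal sketch of the zero-shot idea: an image and a handful of text labels are each turned into a vector, and the label whose vector points in the most similar direction wins. This is not GeoSpy's code and it does not call the real CLIP model. The three-number embeddings, the label texts, the temperature value and the function names are all made up for illustration; a real model would produce vectors with hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores, temperature=0.1):
    """Turn similarity scores into probabilities that sum to 1.

    The temperature is an illustrative assumption, not a CLIP setting.
    """
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_embedding, label_embeddings):
    """Score every text label against the image; no task-specific training."""
    labels = list(label_embeddings)
    sims = [cosine(image_embedding, label_embeddings[label]) for label in labels]
    return dict(zip(labels, softmax(sims)))

# Hypothetical toy embeddings standing in for the output of an image/text encoder.
LABEL_EMBEDDINGS = {
    "a photo taken in the Dominican Republic": [0.9, 0.1, 0.2],
    "a photo taken in the Netherlands": [0.1, 0.9, 0.3],
    "a photo taken in the United States": [0.2, 0.3, 0.9],
}
IMAGE_EMBEDDING = [0.8, 0.2, 0.1]

ranking = zero_shot_classify(IMAGE_EMBEDDING, LABEL_EMBEDDINGS)
best_label = max(ranking, key=ranking.get)
print(best_label)
```

Changing the task only means changing the label texts, which is why the same model can be pointed at countries, cities or building styles without retraining.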
This was a challenge via Twitter; GeoSpy managed to find the correct country.
The system utilizes a combination of models: CLIP for understanding images in the context of natural language, OCR (Optical Character Recognition) for extracting text from images, and LLMs (Large Language Models) for understanding and generating text.
In upcoming versions, GeoSpy will use knowledge graphs, a significant step forward. These graphs are not merely databases but are structured to make complex connections between concepts, places, and people. This allows the system to make educated guesses about the location of a photograph. For instance, if the system recognizes a Massachusetts license plate through OCR, it doesn't simply conclude the photo is from Massachusetts. Instead, it considers a range of possibilities based on statistical data, such as the average travel distance from home, to suggest other potential locations.
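The license plate example can be sketched as a small probability calculation. The version below is only an illustration of the reasoning, not Graylark's method: it assumes that the chance of seeing a car drops off exponentially with distance from home, and it uses rough, made-up distances from Boston as a stand-in for "home" for a Massachusetts plate. The function name, the candidate list and the 150 km trip scale are all hypothetical.

```python
import math

def plate_location_prior(distances_km, typical_trip_km=150.0):
    """Spread probability over candidate places based on distance from home.

    Assumes exponential decay with distance; the decay scale is an
    illustrative guess, not a measured travel statistic.
    """
    weights = {
        place: math.exp(-distance / typical_trip_km)
        for place, distance in distances_km.items()
    }
    total = sum(weights.values())
    return {place: weight / total for place, weight in weights.items()}

# Rough, illustrative distances in km from Boston (hypothetical values).
DISTANCES = {
    "Boston, MA": 0.0,
    "Providence, RI": 65.0,
    "New York, NY": 300.0,
    "Miami, FL": 2000.0,
}

prior = plate_location_prior(DISTANCES)
for place, probability in sorted(prior.items(), key=lambda item: -item[1]):
    print(f"{place}: {probability:.3f}")
```

The point is that the plate shifts the odds rather than settling the answer: nearby places outside Massachusetts keep a real share of the probability, and other clues from the image can then move it further.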
For the current verdict about these tools, see X.
The knowledge-graph approach represents a shift from the traditional vector databases previously used with LLMs, offering a more nuanced approach to data correlation. The creator of GeoSpy emphasizes that any effective system must combine traditional AI and machine learning techniques with the sophisticated capabilities of Large Language Models. The goal of this hybrid model is to achieve a more accurate, context-aware tool for photo geolocation, promising a new level of efficiency and precision in digital imaging and analysis. I can’t wait.
Interview with Daniel Heinen (Graylark) about GeoSpy
“We must defend ourselves against AI cyber weapons”
Henk van Ess: Daniel, with your extensive background in AI for unmanned autonomous systems, especially in defense, could you share the AI techniques used in developing GeoSpy?
Daniel Heinen: Absolutely.