Dilara Albayrak

Building a Semantic Geospatial Agent: A Hybrid .NET & Python Approach

This project explores the technical challenges of bridging .NET 8 and Python microservices, implementing vector search, and visualising complex topology using the OS Open Rivers dataset.

C#
Python
PostGIS
GeoJSON
ETL
Semantic Search
RESTful API Design
UI architecture

Project Overview

The Semantic Geospatial Agent is an initial prototype of a full-stack search engine, built to address data discovery challenges within the OS Open Rivers topology. The solution uses PostGIS for coordinate transformation and efficient bounding-box visualisation of the complete 190,000-link network. Alongside this, it integrates a Python-based Transformer model to enable semantic querying on a randomly sampled subset of the data, validating the feasibility of vector search for hydrological features without the overhead of full-scale indexing in a development environment.
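To make the sampled semantic query concrete, here is a minimal sketch of a brute-force in-memory vector search over a random sample. It is an illustration under stated assumptions, not the project's actual code: the function names (`sample_records`, `brute_force_search`), the toy three-dimensional vectors, and the river names are all hypothetical, and in the real pipeline the vectors would come from the Transformer model.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def sample_records(records, sample_size, seed=0):
    """Draw a reproducible random sample (hypothetical helper)."""
    if sample_size >= len(records):
        return list(records)
    return random.Random(seed).sample(records, sample_size)


def brute_force_search(query_vec, index, top_k=3):
    """Score every (name, vector) pair against the query: O(N) per search."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Toy vectors standing in for real model embeddings (assumption).
TOY_INDEX = [
    ("River Thames", [1.0, 0.0, 0.0]),
    ("River Severn", [0.9, 0.1, 0.0]),
    ("Grand Union Canal", [0.0, 1.0, 0.0]),
]

if __name__ == "__main__":
    for name, score in brute_force_search([1.0, 0.0, 0.0], TOY_INDEX, top_k=2):
        print(f"{name}: {score:.3f}")
```

Because every query touches every stored vector, the cost grows with the size of the index, which is why sampling keeps the prototype responsive.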

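  • Scalability: The current in-memory vector search compares the query against every stored vector (O(N) complexity). It is fast for 200k rows, but query time will grow linearly with the dataset. Implementing HNSW (Hierarchical Navigable Small World) indexing via pgvector would reduce search complexity to approximately O(log N), in exchange for approximate rather than exact nearest-neighbour results.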
  • Sampling: To keep the O(N) search responsive on a local development machine, the current semantic index is intentionally restricted to a representative sample of 1,000 records during the PoC phase.
  • Containerization: The dependency on a running local Python instance makes deployment fragile. Wrapping both the API and the Python service in Docker containers (via docker-compose) is a logical next step.
  • Search Robustness: Currently, the system relies purely on semantic similarity. A hybrid search that combines semantic similarity with keyword matching would offer the best of both worlds, catching exact name matches that the semantic model might sometimes overlook.
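One possible shape for the hybrid search idea in the last point is a weighted blend of a lexical score and a semantic score. The sketch below is an assumption about how this could be done, not a description of the project: `lexical_score`, `hybrid_rank`, the weight `alpha`, and the example scores are all hypothetical, and the semantic scores are taken as precomputed values in the range 0 to 1.

```python
def lexical_score(query, name):
    """Return 1.0 for an exact (case-insensitive) name match,
    otherwise the fraction of query tokens that appear in the name."""
    q = query.strip().lower()
    n = name.strip().lower()
    if not q:
        return 0.0
    if q == n:
        return 1.0
    q_tokens = set(q.split())
    n_tokens = set(n.split())
    return len(q_tokens & n_tokens) / len(q_tokens)


def hybrid_rank(query, semantic_scores, alpha=0.5):
    """Blend semantic and lexical scores.

    semantic_scores: dict mapping feature name -> semantic similarity.
    alpha: weight of the semantic score; (1 - alpha) goes to the lexical one.
    """
    combined = []
    for name, semantic in semantic_scores.items():
        score = alpha * semantic + (1.0 - alpha) * lexical_score(query, name)
        combined.append((name, score))
    combined.sort(key=lambda pair: pair[1], reverse=True)
    return combined


# Hypothetical semantic scores in which the exact name is under-ranked.
EXAMPLE_SCORES = {"River Wye": 0.40, "River Severn": 0.70, "Afon Gwy": 0.65}

if __name__ == "__main__":
    for name, score in hybrid_rank("River Wye", EXAMPLE_SCORES):
        print(f"{name}: {score:.3f}")
```

With `alpha=1.0` the ranking falls back to pure semantic similarity, so the weight can be tuned without changing the rest of the pipeline.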