This fine-tuned LLM geocodes complex location references. It accompanies Coordinates from Context: Using LLMs to Ground Complex Location References (Masis & O'Connor, EACL 2026), where it is referred to as "Geoparser-augmented FT Qwen 14B".

Model description

The base model is a quantized Qwen3-14B model (unsloth/Qwen3-14B-unsloth-bnb-4bit), fine-tuned for geocoding, i.e., linking a location reference to an actual geographic location. The model was trained using parameter-efficient fine-tuning via low-rank adaptation (LoRA). It implements our 'Geoparser-augmented' approach: a separate geoparsing tool augments the input with the center coordinates of each mentioned location, and the fine-tuned model then uses both the original location reference and those coordinates to generate the described location's bounding box. For more details, please see the accompanying paper.
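To make the input/output contract concrete, here is an invented illustration (the exact prompt and output formats are given in the appendices of the paper; the place names and numbers below are made up for this sketch):

Input:  "the valley between Amherst and Northampton", augmented with geoparsed
        center coordinates Amherst = (42.37, -72.52), Northampton = (42.33, -72.64)
Output: a bounding box for the described region, e.g. (42.30, -72.68, 42.40, -72.48)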

Training data

The model is trained on 13k examples from the training subset of the GeoCoDe dataset. For each example, the input is a complex location reference together with the center coordinates of each mentioned location, and the output is the corresponding bounding box of the described location (the same input/output structure illustrated above).

Limitations

Due to data limitations, this model has been trained and evaluated for our task only in Mainstream American English.

Usage (unsloth)

The following code snippet illustrates how to use the model. For the system prompt we used and for example prompts, please see the appendices in the accompanying paper.

from unsloth import FastLanguageModel
import torch

model_name = "tmasis/geocoding-complex-location-references"

# Load model and tokenizer from the Hugging Face Hub
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# Prepare model input; fill in the system and user prompts
# (see the appendices of the accompanying paper for the exact prompts)
system_prompt = "..."  # placeholder
prompt = "..."         # placeholder: location reference plus geoparsed coordinates
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

# Conduct text generation
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024,
    temperature=0.7, top_p=0.8, top_k=20)
response = tokenizer.batch_decode(outputs)[0]
print(response)
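The response contains the predicted bounding box as generated text. As a minimal post-processing sketch (assuming the bounding box appears as four decimal coordinates in the output; the actual output format is described in the paper's appendices), you could extract it with a regular expression:

import re

# Hypothetical post-processing: take the last four decimal numbers in the
# generated text as the bounding box. Adjust to the actual output format
# described in the paper.
numbers = re.findall(r"-?\d+\.\d+", response)
if len(numbers) >= 4:
    bbox = tuple(float(x) for x in numbers[-4:])
    print("Predicted bounding box:", bbox)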

Usage (Hugging Face transformers)

Alternatively, you can use the Hugging Face transformers library.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tmasis/geocoding-complex-location-references"

# Load model and tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare model input; fill in the system and user prompts
# (see the appendices of the accompanying paper for the exact prompts)
system_prompt = "..."  # placeholder
prompt = "..."         # placeholder: location reference plus geoparsed coordinates
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

# Conduct text generation
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024,
    temperature=0.7, top_p=0.8, top_k=20)
response = tokenizer.batch_decode(outputs)[0]
print(response)
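Note that batch_decode returns the full sequence, prompt included. To keep only the newly generated text, you can slice off the input tokens before decoding (a standard transformers pattern, not specific to this model):

# Decode only the newly generated tokens, dropping the echoed prompt
generated_ids = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated_ids, skip_special_tokens=True)
print(response)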