Skip to content

Config & Settings

Configuration is split into two files: settings.py for all tunable parameters, and prompts.py for all LLM prompt strings.

This separation means you can tune retrieval behaviour (changing RRF_K, RETRIEVAL_N) or swap LLM providers (changing GCP_GEMINI_MODEL) purely through environment variables — no code changes needed.


Settings

All parameters are read from environment variables at startup (via python-dotenv). Defaults are shown below. Override any value in your .env file.

Key environment variables
GCP_PROJECT_ID=top-arc-65ca
GCP_LOCATION_ID=australia-southeast1
GCP_EMBED_MODEL=text-embedding-005
GCP_GEMINI_MODEL=gemini-2.5-flash
GCLOUD_PATH=/path/to/gcloud
HTTPS_PROXY=cloudproxy.auiag.corp:8080  # omit if not behind a proxy
DB_DSN=dbname=anzsic_db
RRF_K=60
RETRIEVAL_N=20
TOP_K=5
EMBED_DIM=768

settings

config/settings.py ────────────────────────────────────────────────────────────────────────────── Single source of truth for all tunable parameters.

All values can be overridden via environment variables or a .env file placed at the project root. The frozen dataclass ensures settings are never mutated at runtime.

To swap providers, change the relevant env var — no code edits required: GCP_EMBED_MODEL swaps the embedding model; GCP_GEMINI_MODEL swaps the LLM model; DB_DSN swaps the database.

Settings dataclass

Immutable application settings loaded from environment variables.

Source code in prod/config/settings.py
@dataclass(frozen=True)
class Settings:
    """Immutable application settings loaded from environment variables.

    Every field is resolved via a ``default_factory`` lambda, so the
    environment (including any .env file loaded beforehand) is read at
    construction time rather than at import time.  ``frozen=True``
    guarantees the settings object is never mutated at runtime.

    NOTE(review): ``_env`` / ``_env_int`` / ``_env_float`` / ``_env_path``
    are module-level helpers defined outside this excerpt — presumably
    thin ``os.environ`` lookups that fall back to the given default;
    confirm their coercion/empty-string behaviour against the source.
    """

    # ── Provider selection ──────────────────────────────────────────────────
    # Set EMBED_PROVIDER=openai or LLM_PROVIDER=openai to switch from Vertex AI.
    # Valid values: "vertex" | "openai"
    embed_provider: str = field(
        default_factory=lambda: _env("EMBED_PROVIDER", "vertex")
    )
    llm_provider: str = field(
        default_factory=lambda: _env("LLM_PROVIDER", "vertex")
    )

    # ── OpenAI ─────────────────────────────────────────────────────────────
    # Empty default: only required when a provider above is set to "openai".
    openai_api_key: str = field(
        default_factory=lambda: _env("OPENAI_API_KEY", "")
    )
    # text-embedding-3-small: 1536-dim natively; set EMBED_DIM to match.
    # text-embedding-3-large: 3072-dim natively or reduced via dimensions param.
    openai_embed_model: str = field(
        default_factory=lambda: _env("OPENAI_EMBED_MODEL", "text-embedding-3-small")
    )
    openai_llm_model: str = field(
        default_factory=lambda: _env("OPENAI_LLM_MODEL", "gpt-4o")
    )

    # ── GCP / Vertex AI ────────────────────────────────────────────────────
    gcp_project_id: str = field(
        default_factory=lambda: _env("GCP_PROJECT_ID", "top-arc-65ca")
    )
    gcp_location_id: str = field(
        default_factory=lambda: _env("GCP_LOCATION_ID", "australia-southeast1")
    )
    gcp_embed_model: str = field(
        default_factory=lambda: _env("GCP_EMBED_MODEL", "text-embedding-005")
    )
    gcp_gemini_model: str = field(
        default_factory=lambda: _env("GCP_GEMINI_MODEL", "gemini-2.5-flash")
    )
    # NOTE(review): the default is a developer-machine path — always
    # override GCLOUD_PATH in .env on any other environment.
    gcloud_path: str = field(
        default_factory=lambda: _env(
            "GCLOUD_PATH",
            "/Users/s748779/gemini_local/google-cloud-sdk/bin/gcloud",
        )
    )

    # ── Network ────────────────────────────────────────────────────────────
    # Defaults to the corporate proxy; override HTTPS_PROXY when running
    # outside the corporate network (see the docs above for guidance).
    https_proxy: str = field(
        default_factory=lambda: _env("HTTPS_PROXY", "cloudproxy.auiag.corp:8080")
    )

    # ── Database ───────────────────────────────────────────────────────────
    # psycopg-style DSN string (keyword=value pairs).
    db_dsn: str = field(
        default_factory=lambda: _env("DB_DSN", "dbname=anzsic_db")
    )

    # ── Retrieval pipeline ─────────────────────────────────────────────────
    # rrf_k: the Reciprocal Rank Fusion smoothing constant (name suggests the
    # standard RRF "k"; 60 is the conventional value — confirm in the ranker).
    rrf_k: int = field(
        default_factory=lambda: _env_int("RRF_K", 60)
    )
    # retrieval_n: candidates retrieved before re-ranking; top_k: results
    # ultimately requested from the LLM re-ranker.
    retrieval_n: int = field(
        default_factory=lambda: _env_int("RETRIEVAL_N", 20)
    )
    top_k: int = field(
        default_factory=lambda: _env_int("TOP_K", 5)
    )
    # Must match the embedding model's output dimensionality
    # (see the OpenAI model note above).
    embed_dim: int = field(
        default_factory=lambda: _env_int("EMBED_DIM", 768)
    )
    embed_batch_size: int = field(
        default_factory=lambda: _env_int("EMBED_BATCH_SIZE", 50)
    )

    # ── Data paths ─────────────────────────────────────────────────────────
    # Default resolves three levels up from this file (the project root)
    # to anzsic_master.csv.
    master_csv_path: Path = field(
        default_factory=lambda: _env_path(
            "MASTER_CSV_PATH",
            Path(__file__).parent.parent.parent / "anzsic_master.csv",
        )
    )

    # ── HTTP timeouts (seconds) ────────────────────────────────────────────
    embed_timeout: int = field(default_factory=lambda: _env_int("EMBED_TIMEOUT", 30))
    llm_timeout: int   = field(default_factory=lambda: _env_int("LLM_TIMEOUT", 90))
    embed_retries: int = field(default_factory=lambda: _env_int("EMBED_RETRIES", 3))

    # ── GENI LLM (internal IAG platform) ──────────────────────────────────────
    # Only required when LLM_PROVIDER=geni.
    geni_base_url: str = field(
        default_factory=lambda: _env(
            "GENI_BASE_URL",
            "https://eag-lumi-backend-532613543802.australia-southeast1.run.app",
        )
    )
    geni_domain: str = field(
        default_factory=lambda: _env("GENI_DOMAIN", "eag")
    )
    geni_bot_version_id: str = field(
        default_factory=lambda: _env("GENI_BOT_VERSION_ID", "")
    )
    # Polling parameters (seconds) for GENI's async response endpoint.
    geni_poll_timeout: float = field(
        default_factory=lambda: _env_float("GENI_POLL_TIMEOUT", 60.0)
    )
    geni_poll_interval: float = field(
        default_factory=lambda: _env_float("GENI_POLL_INTERVAL", 1.0)
    )
    # Pre-upload the ANZSIC CSV once (via POST /api/files) and paste the UUID
    # here to skip the upload step on every cold start.  Leave empty to let
    # the adapter upload automatically on first use.
    geni_csv_file_id: str = field(
        default_factory=lambda: _env("GENI_CSV_FILE_ID", "")
    )
    # Set GENI_DISABLE_CSV_UPLOAD=true to skip the file-upload attempt entirely
    # and always use the inline-text fallback.  Use this when the CSV is known
    # to exceed GENI's token limit so the upload attempt (which can take ~25s
    # to time out / be rejected) is never made.
    # Accepted truthy spellings (case-insensitive): "1", "true", "yes".
    geni_disable_csv_upload: bool = field(
        default_factory=lambda: _env("GENI_DISABLE_CSV_UPLOAD", "").lower()
        in ("1", "true", "yes")
    )

get_settings cached

get_settings() -> Settings

Returns a cached singleton Settings instance.

Use this everywhere instead of instantiating Settings() directly — it guarantees a single object is shared across the entire process.

Source code in prod/config/settings.py
@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Return the process-wide ``Settings`` singleton.

    Prefer this accessor over constructing ``Settings()`` directly: the
    ``lru_cache`` wrapper ensures every caller in the process shares one
    and the same immutable instance.
    """
    return Settings()

Prompts

All LLM prompt strings live in one place. To tune how Gemini re-ranks candidates, edit RERANK_SYSTEM_BASE. To change the output schema, edit RERANK_USER_TEMPLATE and update the corresponding Pydantic models.

prompts

config/prompts.py ────────────────────────────────────────────────────────────────────────────── All LLM prompt strings in one place.

Why centralise prompts?
• Easy to diff and review prompt changes in version control
• Swap or tune a prompt without touching service logic
• Single source for prompt versioning / A-B testing

To change the re-ranking prompt: edit RERANK_SYSTEM_BASE below. To support a different output schema: change RERANK_OUTPUT_SCHEMA.

build_system_prompt

build_system_prompt(include_reference: bool, csv_reference: str) -> str

Assembles the Gemini system prompt, optionally appending the full CSV.

Parameters:

Name Type Description Default
include_reference bool

When True, appends all 5,236 ANZSIC codes as a fallback lookup table for low-confidence queries.

required
csv_reference str

Pre-loaded CSV reference string (code: desc lines).

required

Returns:

Type Description
str

Complete system prompt string ready to send to the LLM.

Source code in prod/config/prompts.py
def build_system_prompt(include_reference: bool, csv_reference: str) -> str:
    """Assembles the Gemini system prompt, optionally appending the full CSV.

    Args:
        include_reference: When True, appends the full ANZSIC code list as a
                           fallback lookup table for low-confidence queries.
        csv_reference:     Pre-loaded CSV reference string (code: desc lines).

    Returns:
        Complete system prompt string ready to send to the LLM.
    """
    # Only attach the reference table when it was requested AND actually
    # loaded; otherwise the base prompt goes out unchanged.
    if include_reference and csv_reference:
        rule = "─" * 77
        section_header = CSV_REFERENCE_HEADER.format(divider=rule)
        return RERANK_SYSTEM_BASE + section_header + csv_reference

    return RERANK_SYSTEM_BASE

build_candidate_block

build_candidate_block(candidates: list[dict]) -> str

Renders the numbered candidate list for the LLM user message.

Parameters:

Name Type Description Default
candidates list[dict]

List of candidate dicts (from Candidate.model_dump()).

required

Returns:

Type Description
str

Formatted multi-line string.

Source code in prod/config/prompts.py
def build_candidate_block(candidates: list[dict]) -> str:
    """Renders the numbered candidate list for the LLM user message.

    Args:
        candidates: Candidate dicts (as produced by Candidate.model_dump()).

    Returns:
        Formatted multi-line string, one rendered block per candidate.
    """

    def _render(position: int, cand: dict) -> str:
        # Missing keys degrade to empty strings rather than raising.
        text = CANDIDATE_BLOCK_TEMPLATE.format(
            idx=position,
            anzsic_code=cand.get("anzsic_code", ""),
            anzsic_desc=cand.get("anzsic_desc", ""),
            class_desc=cand.get("class_desc", ""),
            group_desc=cand.get("group_desc", ""),
            subdivision_desc=cand.get("subdivision_desc", ""),
            division_desc=cand.get("division_desc", ""),
        )
        exclusions = cand.get("class_exclusions")
        if exclusions:
            # Exclusion line is appended only when present and non-empty.
            text += CANDIDATE_EXCLUSION_LINE.format(exclusions=exclusions)
        return text

    return "\n".join(
        _render(position, cand) for position, cand in enumerate(candidates, 1)
    )

build_user_message

build_user_message(query: str, candidates: list[dict], top_k: int) -> str

Assembles the user-turn message for the LLM.

Parameters:

Name Type Description Default
query str

Raw input description from the user.

required
candidates list[dict]

List of candidate dicts.

required
top_k int

Number of results to request from the LLM.

required

Returns:

Type Description
str

Formatted user message string.

Source code in prod/config/prompts.py
def build_user_message(query: str, candidates: list[dict], top_k: int) -> str:
    """Assembles the user-turn message for the LLM.

    Args:
        query:      Raw input description from the user.
        candidates: List of candidate dicts.
        top_k:      Number of results to request from the LLM.

    Returns:
        Formatted user message string.
    """
    rendered_candidates = build_candidate_block(candidates)
    return RERANK_USER_TEMPLATE.format(
        query=query,
        n_candidates=len(candidates),
        candidate_block=rendered_candidates,
        top_k=top_k,
    )