This is the first post in the Sky Finance series. I'll use this series to document the project's evolution — architecture decisions, lessons learned, and interesting technical details along the way.
A month ago I published a POC post: SkyFinance POC: I Built a Personal Stock Alert System with Local LLMs and Cloud AI.
That version validated a core idea: use a local 7B model for per-ticker news filtering and classification (free), and reserve frontier cloud models only for cross-holding portfolio synthesis (cheap). The whole system cost around $0.50/month in API fees while running twice every trading day across all the tickers I follow.
Once the POC held up, I wanted to take it further:
That's the starting point for Sky Finance.
One sentence: a local-first financial intelligence platform that monitors US and Japanese equities, cleans market data with a local LLM, and generates RAG-powered strategy analysis with multi-provider model support.
Each trading day, the system automatically:
- Ingests price data from yfinance and news from Google RSS

Everything runs on a Celery + Redis task queue. A web dashboard gives real-time visibility into pipeline status, strategy results, and evaluation scores.
INGESTION → PIPELINE → STRATEGY ENGINE → NOTIFICATIONS
Ingestion — the data entry point. yfinance handles price data synchronously, one task per ticker, to stay within Yahoo Finance rate limits. News is fetched concurrently via httpx.AsyncClient + asyncio.gather across multiple Google RSS feeds (L1-EN, L2-EN, L1-JA, L2-JA). A single feed failure returns an empty list and doesn't block the others.
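The failure-isolation pattern for news feeds can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `fetch_feed_safe`, `fetch_all`, and the injected `fetch` coroutine are hypothetical names, and in the real system `fetch` would presumably wrap an `httpx.AsyncClient` request plus RSS parsing.

```python
import asyncio
from typing import Awaitable, Callable

# Feed labels come from the post; everything else here is illustrative.
FEED_NAMES = ["L1-EN", "L2-EN", "L1-JA", "L2-JA"]

# Hypothetical fetcher type: takes a feed name, returns a list of headlines.
Fetcher = Callable[[str], Awaitable[list[str]]]


async def fetch_feed_safe(name: str, fetch: Fetcher) -> list[str]:
    """Fetch one feed; any failure degrades to an empty list."""
    try:
        return await fetch(name)
    except Exception:
        # Swallow the error so one bad feed cannot fail the whole batch.
        return []


async def fetch_all(names: list[str], fetch: Fetcher) -> dict[str, list[str]]:
    """Fetch every feed concurrently and key the results by feed name."""
    results = await asyncio.gather(*(fetch_feed_safe(n, fetch) for n in names))
    return dict(zip(names, results))
```

Catching the exception inside each coroutine keeps the `asyncio.gather` call simple. Passing `return_exceptions=True` to `gather` and filtering afterwards would be an alternative design.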
Pipeline — data processing. Three sequential steps per record: cleaner.py → llm_summariser.py → embedder.py. Each record is an independent Celery task with its own retry policy.
Strategy Engine — RAG + LLM analysis. The core of the system; covered in detail below.
Notifications — Slack delivery. A daily morning digest plus per-ticker price move alerts.
Every stage runs as Celery tasks. Beat enqueues task names into Redis on a cron schedule; workers consume and execute them.
| Beat Job | Schedule (UTC) | What It Does |
|---|---|---|
| ingest-us-stocks | 23:00 Mon–Fri | Pull US equities after NYSE close |
| ingest-japan-stocks | 07:30 Mon–Fri | Pull Japan equities after TSE close |
| ingest-news | :00 every hour | Fetch news for all tickers |
| run-pipeline | :30 every hour | Clean + LLM summarize + embed |
| run-strategies | 09:00 Mon–Fri | RAG retrieval + AI analysis |
| send-digest | 09:05 Mon–Fri | Send Slack morning digest |
Beat never executes business logic; it only drops task names into Redis. Workers are the only processes that run the actual task code. With task_acks_late = True, a task is acknowledged only after it finishes, so an in-flight task can be redelivered if its worker goes down.
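For reference, the schedule table maps onto a Celery settings module roughly like this. The entry names and times are taken from the table; the dotted task paths (`tasks.ingest_us_stocks` and so on) are hypothetical placeholders, and the real project may organize its configuration differently.

```python
from celery.schedules import crontab

# Celery settings module (lowercase setting names).
timezone = "UTC"
task_acks_late = True  # acknowledge a message only after the task finishes

# Entry names and times follow the schedule table; task paths are assumed.
beat_schedule = {
    "ingest-us-stocks": {
        "task": "tasks.ingest_us_stocks",
        "schedule": crontab(minute=0, hour=23, day_of_week="mon-fri"),
    },
    "ingest-japan-stocks": {
        "task": "tasks.ingest_japan_stocks",
        "schedule": crontab(minute=30, hour=7, day_of_week="mon-fri"),
    },
    # Omitting `hour` means "every hour" at the given minute.
    "ingest-news": {"task": "tasks.ingest_news", "schedule": crontab(minute=0)},
    "run-pipeline": {"task": "tasks.run_pipeline", "schedule": crontab(minute=30)},
    "run-strategies": {
        "task": "tasks.run_strategies",
        "schedule": crontab(minute=0, hour=9, day_of_week="mon-fri"),
    },
    "send-digest": {
        "task": "tasks.send_digest",
        "schedule": crontab(minute=5, hour=9, day_of_week="mon-fri"),
    },
}
```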
The pipeline is the cost control layer of the whole architecture.
Each raw record (news article or price data) passes through:
- Summarization by the local LLM (qwen2.5:3b-instruct) for structured output:
  - summary: short summary of the content
  - sentiment: + / = / -
  - key_facts: list of key facts
  - topics: topic tags
  - relevance_score: how relevant the content is to the ticker
- Embedding with nomic-embed-text, stored in pgvector

This entire stage runs locally, with zero API cost. Cloud models are only called during the Strategy Engine step when generating the final analysis reports.
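Small local models occasionally return malformed JSON, so it helps to validate the structured output before it reaches the embedder. The sketch below is an assumption about how that could look: `RecordSummary` and `parse_summary` are hypothetical names, and treating `relevance_score` as a float is a guess, since the post does not state its type or range.

```python
import json
from dataclasses import dataclass

VALID_SENTIMENTS = {"+", "=", "-"}
REQUIRED_FIELDS = {"summary", "sentiment", "key_facts", "topics", "relevance_score"}


@dataclass
class RecordSummary:
    summary: str
    sentiment: str
    key_facts: list[str]
    topics: list[str]
    relevance_score: float  # assumed numeric; the real range is not documented


def parse_summary(raw_json: str) -> RecordSummary:
    """Parse one model response, raising ValueError on malformed output."""
    data = json.loads(raw_json)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["sentiment"] not in VALID_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {data['sentiment']!r}")
    return RecordSummary(
        summary=str(data["summary"]),
        sentiment=data["sentiment"],
        key_facts=[str(f) for f in data["key_facts"]],
        topics=[str(t) for t in data["topics"]],
        relevance_score=float(data["relevance_score"]),
    )
```

A `ValueError` here is a natural trigger for the per-task retry policy, since a second attempt with the same prompt often succeeds.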
The naive RAG approach: embed a query, retrieve top-k most similar chunks, stuff them into a prompt.
For stock analysis, this creates a real problem: during a bull run, a ticker's news corpus might be 90% positive. A plain top-k retrieval will fill the context window with bullish articles and bury the few negative risk signals that often matter most.
Sky Finance solves this with sentiment-bucketed retrieval:
- top_k is independently configurable per strategy

This guarantees that negative signals always make it into the context, regardless of how many positive articles exist.
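To make the difference concrete, here is an in-memory sketch of plain versus bucketed selection. It assumes the three buckets are the `+`, `=`, and `-` sentiment labels from the pipeline and that each bucket gets its own `k`. `Chunk`, `plain_top_k`, and `bucketed_top_k` are illustrative names, and the real system would presumably do this filtering in SQL against pgvector rather than in Python.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    sentiment: str  # "+", "=", or "-"
    score: float    # similarity to the query, higher is better


def plain_top_k(chunks: list[Chunk], k: int) -> list[Chunk]:
    """Naive retrieval: the k highest-scoring chunks, whatever their sentiment."""
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:k]


def bucketed_top_k(chunks: list[Chunk], k_per_bucket: dict[str, int]) -> list[Chunk]:
    """Take the top k from each sentiment bucket independently."""
    picked: list[Chunk] = []
    for sentiment, k in k_per_bucket.items():
        bucket = [c for c in chunks if c.sentiment == sentiment]
        picked.extend(plain_top_k(bucket, k))
    return picked
```

With nine positive chunks scoring around 0.9 and one negative chunk scoring 0.5, `plain_top_k(corpus, 3)` returns only positive chunks, while `bucketed_top_k(corpus, {"+": 2, "=": 1, "-": 1})` still surfaces the negative one.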
Within each sentiment bucket, two retrieval legs run in parallel:
- ts_rank_cd over a GIN-indexed tsvector column (keyword precision)

The two ranked lists are fused with Reciprocal Rank Fusion:
RRF(d) = 1/(60 + rank_vector(d)) + 1/(60 + rank_bm25(d))
A document that appears in both lists tends to score higher than one that ranks well in only one. Because RRF uses ranks rather than raw scores, no score normalization is required. Each strategy can fall back to pure vector retrieval by setting retrieval_mode = "vector".
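The fusion step is small enough to sketch in a few lines. This version assumes ranks start at 1 and that a document missing from one list simply contributes nothing for that leg. The formula does not specify either detail, and `rrf_fuse` is an illustrative name.

```python
def rrf_fuse(
    vector_ranked: list[str],
    keyword_ranked: list[str],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse two ranked lists of document IDs with Reciprocal Rank Fusion.

    Returns (doc_id, score) pairs sorted by fused score, best first.
    """
    scores: dict[str, float] = {}
    for ranked in (vector_ranked, keyword_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for the documents it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For example, fusing `["a", "b", "c"]` with `["b", "d", "a"]` puts `b` first (1/62 + 1/61), then `a` (1/61 + 1/63), with the single-list documents `d` and `c` trailing.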
Strategies reference a tier name, not a specific model ID. Swapping the underlying model is a one-line edit in config/settings.toml — no code changes needed.
| Tier | Provider | Default Model | Structured Output | Use Case |
|---|---|---|---|---|
| local | Ollama | qwen2.5:14b-instruct | format: <schema> | Free, no API key |
| nano | OpenAI | gpt-5.4-nano | response_format: json_schema | Low cost |
| advanced | OpenAI | gpt-5 | response_format: json_schema | High quality, deep analysis |
| claude | Anthropic | claude-sonnet-4-6 | tool_use + prompt caching | Cost-efficient for long contexts |
Each provider exposes structured output differently. The abstraction layer handles the differences so strategy code stays provider-agnostic.
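One way such an abstraction layer could look is a function that turns a tier name and a JSON schema into provider-specific request arguments. The tier table is copied from the post; `TIERS`, `structured_output_kwargs`, `request_kwargs`, and the tool name `report` are hypothetical, and the dictionaries only mirror the general request shapes of each provider's API without calling any SDK.

```python
# Tier -> (provider, default model), as listed in the tier table.
TIERS = {
    "local": ("ollama", "qwen2.5:14b-instruct"),
    "nano": ("openai", "gpt-5.4-nano"),
    "advanced": ("openai", "gpt-5"),
    "claude": ("anthropic", "claude-sonnet-4-6"),
}


def structured_output_kwargs(provider: str, schema: dict, name: str = "report") -> dict:
    """Return the provider-specific arguments that request schema-shaped output."""
    if provider == "ollama":
        # Ollama takes the JSON schema directly in `format`.
        return {"format": schema}
    if provider == "openai":
        return {
            "response_format": {
                "type": "json_schema",
                "json_schema": {"name": name, "schema": schema},
            }
        }
    if provider == "anthropic":
        # Structured output via a single forced tool call.
        return {
            "tools": [
                {
                    "name": name,
                    "description": "Return the analysis in the required structure.",
                    "input_schema": schema,
                }
            ],
            "tool_choice": {"type": "tool", "name": name},
        }
    raise ValueError(f"unknown provider: {provider!r}")


def request_kwargs(tier: str, schema: dict) -> dict:
    """Resolve a tier name to a model plus structured-output arguments."""
    provider, model = TIERS[tier]
    return {"model": model, **structured_output_kwargs(provider, schema)}
```

Strategy code would then only ever call something like `request_kwargs("claude", schema)`, and swapping a model means editing one entry in the tier mapping.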
Prompt caching: the claude tier marks its system prompt with cache_control: ephemeral, which cuts the cost of the cached input tokens by ~90% on repeated calls. OpenAI automatically caches prompt prefixes of 1,024+ tokens and bills the cached portion at a 50% discount. Every call records input_tokens, output_tokens, cached_tokens, and cost_usd into strategy_results.metadata, visible in the dashboard.
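The recorded `cost_usd` can be derived from the three token counts. The sketch below assumes `cached_tokens` is a subset of `input_tokens`, and it ignores any surcharge for writing to the cache. The function name, parameters, and the prices in the example are all hypothetical, not real provider rates.

```python
def call_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    cached_discount: float,
) -> float:
    """Estimate the cost of one call in USD.

    cached_tokens is assumed to be included in input_tokens;
    cached_discount is the fraction taken off cached input (0.9 = 90% off).
    """
    uncached = input_tokens - cached_tokens
    billable_input = uncached + cached_tokens * (1.0 - cached_discount)
    input_cost = billable_input * input_price_per_mtok / 1_000_000
    output_cost = output_tokens * output_price_per_mtok / 1_000_000
    return input_cost + output_cost
```

With made-up prices of $1.00 per million input tokens and $2.00 per million output tokens, a call with 10,000 input tokens (8,000 of them cached at a 90% discount) and 1,000 output tokens costs about $0.0048, versus $0.012 with no caching.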
Intuition isn't enough to know whether a RAG strategy actually works. Sky Finance includes a built-in evaluation module.
The approach: for the same strategy, generate two reports — one using sentiment-bucketed retrieval, one using plain top-k retrieval. A judge LLM scores each report on three dimensions (0–10 each):
The output is a per-ticker score comparison, a Δ value, and an overall win rate (% of tickers where bucketed retrieval beats plain retrieval).
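The aggregation itself is simple arithmetic. This sketch assumes each report has already been reduced to one overall score (for instance the mean of the three dimension scores) and that a tie does not count as a win. `summarize_eval` and the placeholder tickers in the example are illustrative.

```python
from statistics import mean


def summarize_eval(scores: dict[str, tuple[float, float]]) -> dict:
    """Summarize judge scores.

    scores maps ticker -> (bucketed_score, plain_score).
    """
    if not scores:
        raise ValueError("no tickers to evaluate")
    # Delta per ticker: positive means bucketed retrieval scored higher.
    deltas = {ticker: b - p for ticker, (b, p) in scores.items()}
    wins = sum(1 for d in deltas.values() if d > 0)
    return {
        "deltas": deltas,
        "mean_delta": mean(deltas.values()),
        "win_rate": wins / len(deltas),
    }
```

For `{"AAA": (8.0, 6.0), "BBB": (7.0, 7.5), "CCC": (9.0, 7.0)}` the deltas are 2.0, -0.5, and 2.0, so bucketed retrieval wins on two of three tickers.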
Run via the sky-eval CLI:

    # Evaluate all tickers in strategy 1
    uv run sky-eval --strategy-id 1

    # Run entirely local, no API keys needed
    uv run sky-eval --strategy-id 1 --model-tier local --judge-model qwen2.5:14b-instruct

    # Use a stronger judge for higher-stakes evaluation
    uv run sky-eval --strategy-id 1 --judge-model claude-opus-4-7

| Layer | Technology |
|---|---|
| Task queue | Celery + Redis |
| Database | PostgreSQL + pgvector (Docker) |
| Local LLM | Ollama (qwen2.5:3b / 14b, nomic-embed-text) |
| Cloud LLM | OpenAI API / Anthropic Claude API |
| Web dashboard | FastAPI + Jinja2 + HTMX |
| Data sources | yfinance + Google RSS |
| Schema management | Alembic |
| Notifications | Slack Webhook |
| Process management | honcho (dev) / Docker Compose (prod) |
| Python tooling | mise + uv |
Sky Finance is actively evolving. Planned posts:
Code is here: github.com/peifengstudio/sky-finance
Sky Finance Series #1 | May 2026