JFS

Methodology

How we find signals, score them, and turn them into articles.

The Content Pipeline

  1. Collect

    RSS feeds + HTML scraping from 16 Japanese government and industry sources. robots.txt respected. Rate-limited.

  2. Extract

    Body text extracted via trafilatura (HTML) or pypdf (PDFs). Min. 200 chars to proceed.

  3. Classify

    Keyword matching against 7 categories and 16 investment themes. LLM assist for low-confidence items.

  4. Score

    7-axis scoring (0–5 each, max 35). Only items scoring ≥15 proceed.

  5. Generate

    Claude API with journalist-voice prompt. Deterministic humanize pass applied automatically.

  6. Review

    Human editor reviews draft. [VERIFY] markers block publish until resolved.

  7. Publish

    Manual publish command moves draft to published/. Translations generated for all 4 locales.
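The gates in the first three stages can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the production code: the function and class names, the keyword table, and the low-confidence cutoff are all hypothetical, and only the 200-character minimum comes from the pipeline description above.

```python
import time
from urllib.robotparser import RobotFileParser

MIN_BODY_CHARS = 200  # from the Extract step


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RateLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last = None

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def long_enough(body_text: str) -> bool:
    """Assumption: length is measured after trimming whitespace."""
    return len(body_text.strip()) >= MIN_BODY_CHARS


# Hypothetical keyword table; the real 7 categories and 16 themes are not listed here.
CATEGORY_KEYWORDS = {
    "energy": ("hydrogen", "offshore wind", "nuclear"),
    "semiconductors": ("semiconductor", "chip", "wafer"),
}
LOW_CONFIDENCE_HITS = 2  # hypothetical cutoff: fewer hits means "ask the LLM"


def classify(text: str):
    """Return (best_category_or_None, needs_llm_assist)."""
    lowered = text.lower()
    hits = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    needs_llm = hits[best] < LOW_CONFIDENCE_HITS
    return (best if hits[best] > 0 else None), needs_llm
```

An item would move on to scoring only if its URL passes `robots_allows`, its extracted body passes `long_enough`, and `classify` either returns a confident category or hands the item to the LLM-assist path.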

Signal Scoring

Only signals scoring 15 or above out of 35 proceed to the article generation stage.

Investment Relevance: Does this signal have a direct connection to an investment or business decision?
Foreign Utility: Is this information specifically useful to a non-Japanese reader?
Novelty: Is this fresh, not yet covered by mainstream financial press?
Market Scale: How large is the market or policy budget involved?
Niche Depth: Does this cover a Japan-specific angle unavailable elsewhere?
SEO Potential: Is there demonstrated search demand for this topic?
Shareability: Does this contain a number, ranking, or angle that spreads?
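The arithmetic behind the threshold is simple enough to sketch. The snippet below assumes each axis is stored as an integer from 0 to 5 under a snake_case key; the key names and function names are illustrative, while the seven axes, the 0–5 range, and the cutoff of 15 come from the description above.

```python
AXES = (
    "investment_relevance",
    "foreign_utility",
    "novelty",
    "market_scale",
    "niche_depth",
    "seo_potential",
    "shareability",
)
MAX_PER_AXIS = 5
PASS_THRESHOLD = 15  # out of a possible 7 * 5 = 35


def total_score(scores: dict[str, int]) -> int:
    """Sum the seven axis scores after validating names and ranges."""
    if set(scores) != set(AXES):
        raise ValueError("scores must cover exactly the seven axes")
    for axis, value in scores.items():
        if not 0 <= value <= MAX_PER_AXIS:
            raise ValueError(f"{axis} must be between 0 and {MAX_PER_AXIS}")
    return sum(scores.values())


def qualifies(scores: dict[str, int]) -> bool:
    """True when the signal may proceed to article generation."""
    return total_score(scores) >= PASS_THRESHOLD
```

For example, a signal scored 4, 3, 2, 3, 2, 1, 1 across the seven axes totals 16 and proceeds, while a signal scoring 2 on every axis totals 14 and is dropped.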

Article Generation

Qualifying signals are drafted using a custom Claude API prompt designed around a journalist's voice rather than consulting language. The prompt explicitly bans the patterns that make AI writing detectable: uniform sentence length, filler transitions, vague quantifiers, and marketing copy. A deterministic post-processing pass then replaces any remaining AI-typical phrases using a curated list of 80+ substitutions.
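A deterministic substitution pass of this kind might look like the sketch below. The three entries are made-up stand-ins, since the actual list of 80+ substitutions is not published here, and the function names are hypothetical. The only logic beyond a plain find-and-replace is that a match at the start of a sentence keeps its capital letter.

```python
import re

# Hypothetical entries; the real curated list is not shown in this document.
SUBSTITUTIONS = [
    ("in order to", "to"),
    ("utilize", "use"),
    ("a wide range of", "many"),
]

# Compile once: whole-phrase, case-insensitive matches.
_COMPILED = [
    (re.compile(rf"\b{re.escape(phrase)}\b", re.IGNORECASE), replacement)
    for phrase, replacement in SUBSTITUTIONS
]


def _keep_initial_case(replacement: str):
    """Build a callback that capitalizes the replacement when the match was capitalized."""
    def substitute(match: re.Match) -> str:
        if match.group(0)[0].isupper():
            return replacement[0].upper() + replacement[1:]
        return replacement
    return substitute


def humanize(text: str) -> str:
    """Apply every substitution in list order; same input always gives same output."""
    for pattern, replacement in _COMPILED:
        text = pattern.sub(_keep_initial_case(replacement), text)
    return text
```

Because the pass uses no randomness and no model call, running it twice on the same draft yields the same text, which keeps diffs between draft revisions meaningful.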

Human Review Gate

Every draft is saved to a staging directory and reviewed by a human editor before publication. Drafts containing unverified factual claims (marked [VERIFY] by the generation system) are blocked from publishing until the editor resolves them. No article is ever auto-published.
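The publish gate can be sketched as a check that runs inside the manual publish command. The function names and the use of `ValueError` below are assumptions made for illustration; the `[VERIFY]` marker and the move into a `published/` directory come from the description above.

```python
import shutil
from pathlib import Path

VERIFY_MARKER = "[VERIFY]"


def unresolved_markers(text: str) -> list[int]:
    """Return 1-based line numbers that still contain the marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if VERIFY_MARKER in line
    ]


def publish(draft_path: Path, published_dir: Path) -> Path:
    """Move a reviewed draft into published_dir, refusing if markers remain."""
    text = draft_path.read_text(encoding="utf-8")
    lines = unresolved_markers(text)
    if lines:
        raise ValueError(
            f"{draft_path.name}: unresolved {VERIFY_MARKER} on lines {lines}"
        )
    published_dir.mkdir(parents=True, exist_ok=True)
    target = published_dir / draft_path.name
    shutil.move(str(draft_path), str(target))
    return target
```

Since `publish` is only ever invoked by an editor and fails loudly on any remaining marker, a draft with an unverified claim stays in staging until someone resolves it.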

Translations

English articles are the authoritative source. Translations to Hindi, French, and Simplified Chinese are generated via the Claude API with explicit instructions to preserve company names, yen figures, and proper nouns unchanged. Translation quality is spot-checked for each language.
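One way to support a spot check is to confirm mechanically that protected terms survived translation. The helper below is a hypothetical aid, not a description of the actual review process: it assumes yen figures are written with a leading ¥ sign and that the reviewer supplies the list of company names to look for.

```python
import re

# Assumption: yen amounts appear as "¥" followed by digits, e.g. ¥1,500 or ¥2.5.
YEN_FIGURE = re.compile(r"¥\d+(?:,\d{3})*(?:\.\d+)?")


def missing_protected_terms(source: str, translation: str, names: list[str]) -> list[str]:
    """List names and yen figures present in the source but absent from the translation."""
    protected = [name for name in names if name in source]
    protected += YEN_FIGURE.findall(source)

    missing = []
    seen = set()
    for term in protected:
        if term not in translation and term not in seen:
            missing.append(term)
            seen.add(term)
    return missing
```

An empty result does not prove the translation is good; it only shows that the names and figures were carried over verbatim, which leaves the human reviewer free to focus on tone and accuracy.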