How to Build a Google News Headline Agent

May 29, 2026

A Google News headline agent is a workflow that collects recent headlines, normalizes them, ranks them against a defined user intent, and returns a cited briefing. It should not pretend to be a news organization. It should not invent headlines. It should not scrape Google result pages directly.

For AI teams, the engineering problem is control. You need predictable source handling, freshness checks, citations, deduplication, ranking rules, prompt versioning, evals, and traces. The LLM should help structure and explain the result, not act as the source of truth.

What the agent should do

Define the agent as a narrow production workflow:

  • Accept a topic, region, language, time window, and optional source preferences.
  • Fetch headline candidates from approved news data sources.
  • Normalize each item into a strict schema.
  • Remove duplicates and near-duplicates.
  • Rank stories using explicit rules.
  • Return headlines with source names, URLs, timestamps, and short summaries.
  • Refuse to answer when it cannot verify source data.

The output should read like a sourced headline briefing. It should not read like original reporting unless your system actually performs reporting and verification outside the LLM.

Use approved data access instead of scraping Google pages

Do not build this by scraping Google News HTML pages or Google search result pages. Direct scraping can violate terms, break without warning, trigger blocking, and produce brittle parsers. It also creates audit problems when a customer asks where a headline came from.

Use a source layer that you can monitor and replace. Common options include:

  • Google News RSS feeds, when they fit your use case and usage constraints.
  • Licensed news APIs such as NewsAPI, Event Registry, or commercial data providers.
  • Publisher RSS feeds for sources you explicitly support.
  • Internal content feeds, if your organization tracks approved sources.
  • Search APIs with clear terms, rate limits, and attribution rules.

Keep the provider separate from the rest of the agent. Your ranking, summarization, and output logic should not depend on one feed format.
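One way to keep that separation is a small adapter interface, sketched below. The names `RawItem`, `SourceAdapter`, `StaticAdapter`, and `collect` are hypothetical, and the canned adapter stands in for a real feed or API client so the example runs without network access.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class RawItem:
    """Provider-neutral shape handed to the normalizer (hypothetical)."""
    title: str
    url: str
    source_name: str
    published_at: Optional[str]  # ISO 8601 string, or None if the provider omits it
    provider: str


class SourceAdapter(Protocol):
    """Anything that can fetch headline candidates for a topic."""
    def fetch(self, topic: str, max_results: int) -> list[RawItem]: ...


class StaticAdapter:
    """Stand-in adapter serving canned items; a real one would call a feed or API."""

    def __init__(self, provider: str, items: list[dict]):
        self.provider = provider
        self._items = items

    def fetch(self, topic: str, max_results: int) -> list[RawItem]:
        # Naive substring match, only to make the sketch self-contained.
        matches = [i for i in self._items if topic.lower() in i["title"].lower()]
        return [
            RawItem(i["title"], i["url"], i["source"], i.get("published"), self.provider)
            for i in matches[:max_results]
        ]


def collect(adapters: list[SourceAdapter], topic: str, max_results: int) -> list[RawItem]:
    """Query every adapter; downstream code never sees provider-specific fields."""
    out: list[RawItem] = []
    for adapter in adapters:
        out.extend(adapter.fetch(topic, max_results))
    return out
```

Swapping a provider then means writing one new adapter class. Ranking, summarization, and output code keep consuming `RawItem` unchanged.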

Reference architecture

A reliable headline agent usually has six components:

  1. Request parser: Converts a user request into structured parameters such as topic, market, language, and recency.
  2. Source adapter: Fetches candidate headlines from approved feeds or APIs.
  3. Normalizer: Converts each source item into a standard record.
  4. Deduper: Groups repeated coverage of the same story.
  5. Ranker: Scores items with explicit rules, then optionally asks an LLM to explain the ordering.
  6. Briefing generator: Produces a cited response in a fixed output format.

If you split these steps across multiple agents, keep the boundaries strict. For example, one agent can collect candidates, another can rank, and another can write the final briefing. Use AI agent orchestration patterns when you need clear handoffs, retries, and traceable state.

Define the input contract

Start with a structured request. Avoid passing a raw user message straight into the news retrieval layer.

{
  "topic": "OpenAI enterprise contracts",
  "region": "US",
  "language": "en",
  "time_window_hours": 24,
  "max_results": 10,
  "preferred_sources": ["Reuters", "AP", "The Verge"],
  "excluded_sources": [],
  "briefing_style": "developer-focused"
}

This contract gives your system a testable surface. You can run the same request through staging and production, compare output, and catch regressions when a prompt or model changes.
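A minimal sketch of enforcing this contract in code follows. The `HeadlineRequest` class, the `parse_request` function, the defaults, and the upper bounds are all assumptions chosen for illustration; tune them to your product.

```python
from __future__ import annotations

from dataclasses import dataclass, fields

MAX_WINDOW_HOURS = 24 * 30  # assumed upper bound, not from any provider
MAX_RESULTS_CAP = 50        # assumed upper bound


@dataclass(frozen=True)
class HeadlineRequest:
    topic: str
    region: str = "US"
    language: str = "en"
    time_window_hours: int = 24
    max_results: int = 10
    preferred_sources: tuple = ()
    excluded_sources: tuple = ()
    briefing_style: str = "developer-focused"


def _bounded_int(payload: dict, key: str, default: int, upper: int) -> int:
    value = payload.get(key, default)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= upper:
        raise ValueError(f"{key} must be an integer between 1 and {upper}")
    return value


def parse_request(payload: dict) -> HeadlineRequest:
    """Validate a request payload and reject anything outside the contract."""
    known = {f.name for f in fields(HeadlineRequest)}
    unknown = sorted(set(payload) - known)
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    topic = str(payload.get("topic") or "").strip()
    if not topic:
        raise ValueError("topic is required")
    return HeadlineRequest(
        topic=topic,
        region=str(payload.get("region", "US")).upper(),
        language=str(payload.get("language", "en")).lower(),
        time_window_hours=_bounded_int(payload, "time_window_hours", 24, MAX_WINDOW_HOURS),
        max_results=_bounded_int(payload, "max_results", 10, MAX_RESULTS_CAP),
        preferred_sources=tuple(payload.get("preferred_sources", ())),
        excluded_sources=tuple(payload.get("excluded_sources", ())),
        briefing_style=str(payload.get("briefing_style", "developer-focused")),
    )
```

Rejecting unknown fields is a deliberate choice here: a typo such as `time_window` fails loudly instead of silently falling back to a default.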

Normalize every headline candidate

Every fetched item should be converted into a schema before the LLM sees it. Do not let the model guess missing metadata.

{
  "id": "hash-of-url-or-provider-id",
  "title": "Anthropic releases new Claude model for enterprise users",
  "source_name": "Example News",
  "source_url": "https://example.com/article",
  "published_at": "2026-05-29T10:15:00Z",
  "retrieved_at": "2026-05-29T10:30:00Z",
  "snippet": "Short provider-supplied excerpt if available",
  "topic": "AI",
  "language": "en",
  "provider": "news-api-name"
}

Reject records that lack a title, source URL, source name, or retrieval timestamp. If the published timestamp is missing, mark it as unknown and lower its freshness score instead of inventing a date.
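These rejection rules can be sketched as a small normalizer. The function names and the rejection-reason strings are hypothetical, and treating a timestamp without a timezone as unknown is an assumption of this sketch, not a rule from any provider.

```python
from __future__ import annotations

import hashlib
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: object) -> Optional[datetime]:
    """Return a UTC datetime, or None when the value is missing or unusable."""
    if not isinstance(value, str) or not value:
        return None
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    # Assumption: a time without an offset cannot be placed reliably, so it is unknown.
    return parsed.astimezone(timezone.utc) if parsed.tzinfo else None


def normalize(raw: dict, retrieved_at: Optional[datetime], provider: str):
    """Return (record, None) on success or (None, reason) on rejection."""
    if retrieved_at is None:
        return None, "missing retrieved_at"
    for key in ("title", "source_url", "source_name"):
        if not str(raw.get(key) or "").strip():
            return None, f"missing {key}"
    url = str(raw["source_url"]).strip()
    record = {
        "id": hashlib.sha256(url.encode("utf-8")).hexdigest()[:16],
        "title": str(raw["title"]).strip(),
        "source_name": str(raw["source_name"]).strip(),
        "source_url": url,
        "published_at": parse_timestamp(raw.get("published_at")),  # None means unknown
        "retrieved_at": retrieved_at,
        "snippet": str(raw.get("snippet") or "").strip(),
        "language": raw.get("language"),
        "provider": provider,
    }
    return record, None
```

Returning the rejection reason alongside the record makes it easy to log why an item was dropped.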

Deduplicate before ranking

News feeds often contain repeated coverage of the same event. If your agent returns ten versions of the same acquisition rumor, users will lose trust quickly.

Use a two-pass deduping approach:

  1. Exact matching: Normalize URLs, remove tracking parameters, and hash canonical URLs.
  2. Near-duplicate matching: Compare titles with embeddings or token similarity. Group items that share the same entities, event, and date.

For each story cluster, keep the strongest canonical item and retain alternate sources as supporting citations. This lets the final output say that multiple outlets reported the same event without flooding the response.
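The two passes can be sketched with the standard library alone. This version uses Jaccard token overlap on titles as the similarity measure. The 0.6 threshold, the tracking-parameter list, and the first-seen-is-canonical rule are assumptions for illustration; a production deduper would likely also compare entities and dates, as described above.

```python
from __future__ import annotations

import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)       # assumed list of tracking parameters
TRACKING_KEYS = {"fbclid", "gclid"}


def canonical_url(url: str) -> str:
    """Lowercase the host, drop tracking parameters, fragment, and trailing slash."""
    parts = urlsplit(url)
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.lower().startswith(TRACKING_PREFIXES) and k.lower() not in TRACKING_KEYS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), ""))


def title_tokens(title: str) -> set:
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(records: list, threshold: float = 0.6) -> list:
    """Greedy clustering: exact URL match first, then title similarity."""
    clusters: list = []
    for rec in records:
        url = canonical_url(rec["source_url"])
        tokens = title_tokens(rec["title"])
        for c in clusters:
            if url in c["urls"] or jaccard(tokens, c["tokens"]) >= threshold:
                c["urls"].add(url)
                c["alternates"].append(rec)  # kept as a supporting citation
                break
        else:
            # Simplification: the first item seen becomes the canonical one.
            clusters.append({"canonical": rec, "tokens": tokens, "urls": {url}, "alternates": []})
    return clusters
```

Each cluster keeps its alternates, so the briefing can cite supporting outlets without listing the story more than once.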

Rank with rules, not model opinion alone

Do not ask an LLM to decide which headlines are “most important” without constraints. The model may over-rank familiar companies, dramatic wording, or sources that appeared often in its pretraining data.

Use a scoring function first. A practical starting point:

  • Freshness: 0 to 40 points based on age within the requested time window.
  • Source quality: 0 to 25 points based on your configured source tiers.
  • Topic match: 0 to 20 points using keyword, entity, or embedding similarity.
  • Coverage breadth: 0 to 10 points when multiple independent sources report the same story.
  • User preference match: 0 to 5 points for preferred sources, regions, or categories.

Then pass the top candidates to the LLM and ask it to produce a concise explanation. The model can explain the ranking, but your code should own the ranking policy.
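A scoring function with these weights might look like the sketch below. The source tier table, the linear freshness decay, the keyword-overlap topic match, and the 5-points-per-extra-source coverage rule are all assumptions. Only the point ranges come from the list above.

```python
from __future__ import annotations

import re
from datetime import datetime
from typing import Optional

SOURCE_TIER_POINTS = {"Reuters": 25, "AP": 25, "The Verge": 15}  # hypothetical tiers
DEFAULT_SOURCE_POINTS = 5


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def freshness_points(published_at: Optional[datetime], now: datetime, window_hours: int) -> float:
    """0 to 40 points, decaying linearly across the requested window."""
    if published_at is None:
        return 0.0  # unknown publish time earns no freshness credit
    age_hours = max(0.0, (now - published_at).total_seconds() / 3600)
    if age_hours > window_hours:
        return 0.0
    return 40.0 * (1 - age_hours / window_hours)


def topic_points(topic: str, title: str, snippet: str = "") -> float:
    """0 to 20 points: share of topic keywords found in the title or snippet."""
    wanted = _tokens(topic)
    if not wanted:
        return 0.0
    found = _tokens(f"{title} {snippet}")
    return 20.0 * len(wanted & found) / len(wanted)


def score(record: dict, topic: str, preferred_sources, now: datetime,
          window_hours: int, independent_sources: int = 1) -> dict:
    """Return each component score plus the total, so traces can show both."""
    parts = {
        "freshness": freshness_points(record.get("published_at"), now, window_hours),
        "source_quality": float(SOURCE_TIER_POINTS.get(record["source_name"], DEFAULT_SOURCE_POINTS)),
        "topic_match": topic_points(topic, record["title"], record.get("snippet", "")),
        "coverage": float(min(10, 5 * max(0, independent_sources - 1))),
        "preference": 5.0 if record["source_name"] in preferred_sources else 0.0,
    }
    parts["total"] = sum(parts.values())
    return parts
```

Returning the components rather than a single number lets you log scores per component and gives the LLM concrete facts to explain.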

Handle freshness as a first-class requirement

A headline agent that ignores freshness is dangerous. A six-month-old article can look relevant if the topic matches the query. Always include both published_at and retrieved_at in the context sent to the model.

Add hard freshness rules:

  • Reject items outside the requested time window unless the user asks for historical context.
  • Show “published time unknown” when the source does not provide a timestamp.
  • Prefer data retrieved in the current run over cached data.
  • Expire cached headline candidates after a short period, such as 15 minutes for breaking news or 6 hours for slower industry topics.

Also include the retrieval time in your final response. This makes stale results easier to diagnose.
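The hard rules above can be enforced in code before any record reaches the model. In this sketch the function names, the `published_label` field, and the cache categories are hypothetical. The TTL values mirror the examples given in the list.

```python
from __future__ import annotations

from datetime import datetime, timedelta

# TTLs taken from the examples above; the category names are assumptions.
CACHE_TTL = {
    "breaking": timedelta(minutes=15),
    "industry": timedelta(hours=6),
}


def apply_freshness(records: list, now: datetime, window_hours: int,
                    include_historical: bool = False):
    """Split records into (kept, rejected) using the requested time window."""
    cutoff = now - timedelta(hours=window_hours)
    kept, rejected = [], []
    for rec in records:
        published = rec.get("published_at")
        if published is None:
            # Keep the record but label it, rather than inventing a date.
            kept.append({**rec, "published_label": "published time unknown"})
        elif published >= cutoff or include_historical:
            kept.append({**rec, "published_label": published.isoformat()})
        else:
            rejected.append((rec, "outside requested time window"))
    return kept, rejected


def cache_is_fresh(cached_at: datetime, now: datetime, kind: str) -> bool:
    """True when a cached candidate set is still within its TTL."""
    return now - cached_at <= CACHE_TTL[kind]
```

Enforcing the window here keeps stale items away from the prompt entirely, so the freshness rule does not depend on the model following instructions.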

Keep the LLM inside a narrow role

The LLM should not create new headlines. It should select, group, summarize, and format records that already exist in your candidate set.

Use direct prompt rules such as:

  • Use only the provided headline records.
  • Do not create, rewrite, or embellish source titles unless the output field is explicitly labeled as a summary.
  • Every item must include a source URL.
  • If no reliable candidates are available, return an empty result with a reason.
  • Do not present summaries as original reporting.

If you use Gemini for the briefing or classification step, keep the same tool and output contracts. PromptLayer supports teams working with Google Gemini while preserving prompt history, metadata, and evaluation workflows.

Example system prompt

You are a headline briefing agent for an LLM application.

You will receive a list of verified headline records. Use only those records.
Do not invent headlines, source names, URLs, timestamps, quotes, or facts.
Do not present any summary as original reporting.
If a record lacks a source_url, exclude it.
If a record is outside the requested time window, exclude it.
Return the result in the requested JSON schema.
Each returned item must include title, source_name, source_url, published_at, retrieved_at, and summary.
The summary must be based only on the title and snippet fields.
If there are no valid records, return an empty items array and explain why in status_reason.

Example output schema

{
  "status": "ok",
  "status_reason": "Found 1 valid headline record from approved sources.",
  "retrieved_at": "2026-05-29T10:30:00Z",
  "items": [
    {
      "title": "Example source headline",
      "source_name": "Example News",
      "source_url": "https://example.com/story",
      "published_at": "2026-05-29T09:45:00Z",
      "retrieved_at": "2026-05-29T10:30:00Z",
      "summary": "One-sentence summary based only on the provided title and snippet.",
      "rank_reason": "Fresh, from a preferred source, and closely matches the requested topic."
    }
  ]
}

Use JSON mode or structured outputs where possible. Validate the model response before returning it to the user. If validation fails, retry once with the validation error, then fail closed.
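The validate, retry once, then fail closed flow can be sketched as follows. `call_model` is a hypothetical callable that takes a prompt string and returns the model's raw text. It is not the API of any particular SDK, and the error messages are illustrative.

```python
from __future__ import annotations

import json
from typing import Callable

REQUIRED_ITEM_FIELDS = (
    "title", "source_name", "source_url", "published_at", "retrieved_at", "summary",
)


def validate_briefing(text: str, candidates: list):
    """Return (data, []) when valid, or (None, errors) otherwise."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict) or not isinstance(data.get("items"), list):
        return None, ["response must be an object with an items array"]
    known = {(c["title"], c["source_url"]) for c in candidates}
    errors = []
    for i, item in enumerate(data["items"]):
        if not isinstance(item, dict):
            errors.append(f"items[{i}] is not an object")
            continue
        missing = [f for f in REQUIRED_ITEM_FIELDS if f not in item]
        if missing:
            errors.append(f"items[{i}] is missing {missing}")
            continue
        key = (item["title"], item["source_url"])
        # Title and URL must match a fetched candidate exactly: no invented headlines.
        if not all(isinstance(v, str) for v in key) or key not in known:
            errors.append(f"items[{i}] does not match any provided record")
    return (data if not errors else None), errors


def generate_briefing(call_model: Callable[[str], str], prompt: str, candidates: list) -> dict:
    """Try once, retry once with the validation errors, then fail closed."""
    feedback = ""
    for _ in range(2):
        data, errors = validate_briefing(call_model(prompt + feedback), candidates)
        if data is not None:
            return data
        feedback = "\n\nYour previous response failed validation:\n" + "\n".join(errors)
    return {"status": "error", "status_reason": "Model output failed validation twice.", "items": []}
```

Failing closed here means the user gets an explicit empty result instead of a partially valid briefing.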

Add citations to every user-facing item

Every returned headline should include a clickable source link. If your product UI hides URLs behind cards, still store the full URL in the response object and trace. Citations are part of the contract, not optional decoration.

Good citation behavior:

  • Show the publisher name next to the headline.
  • Link to the original article or source record.
  • Display the published time when available.
  • Display retrieval time for the whole briefing.
  • Keep summaries clearly separate from source titles.

Bad citation behavior:

  • Returning “according to reports” without links.
  • Combining multiple articles into one unsourced claim.
  • Using the LLM’s memory as a source.
  • Hiding missing URLs to make the response look complete.

Build evals before you ship

Create a small test set of realistic requests before releasing the agent. Include common and failure-case prompts:

  • “Top AI regulation headlines in the EU from the last 24 hours.”
  • “Latest news about NVIDIA earnings, only from the last 6 hours.”
  • “Give me headlines about a fake company name.”
  • “Summarize Google News on OpenAI and include sources.”
  • “What are today’s headlines?” with no region or category.

Score each run with checks such as:

  • Citation coverage: 100% of returned items include source URLs.
  • Freshness accuracy: 0 items outside the requested time window.
  • Schema validity: 100% valid JSON responses.
  • No invented headlines: Every returned title matches a fetched candidate.
  • Deduplication quality: Repeated stories are grouped or removed.
  • Refusal behavior: The agent returns a clear empty result when no valid records exist.

Run these evals whenever you change the prompt, model, source provider, ranking weights, or output schema.
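Several of these checks are plain assertions over the response and the fetched candidates, so they can run without a model in the loop. The sketch below covers citation coverage, freshness, invented titles, and refusal behavior. The function names and report keys are hypothetical, `now` is assumed to be timezone-aware, and a timestamp without an offset is assumed to be UTC.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Optional


def _parse(ts: object) -> Optional[datetime]:
    if not isinstance(ts, str) or not ts:
        return None
    try:
        parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    except ValueError:
        return None
    # Assumption: timestamps without an offset are UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def score_run(response: dict, candidates: list, now: datetime, window_hours: int) -> dict:
    """Compute eval metrics for one run; `now` must be timezone-aware."""
    items = response.get("items", [])
    fetched_titles = {c["title"] for c in candidates}
    cutoff = now - timedelta(hours=window_hours)
    cited = sum(1 for it in items if it.get("source_url"))
    stale = 0
    for it in items:
        published = _parse(it.get("published_at"))
        if published is not None and published < cutoff:
            stale += 1
    return {
        "citation_coverage": cited / len(items) if items else 1.0,
        "stale_items": stale,
        "invented_titles": [it.get("title") for it in items if it.get("title") not in fetched_titles],
        # An empty result is acceptable only when it explains itself.
        "refusal_ok": bool(items) or bool(response.get("status_reason")),
    }


def passed(report: dict) -> bool:
    return (
        report["citation_coverage"] == 1.0
        and report["stale_items"] == 0
        and not report["invented_titles"]
        and report["refusal_ok"]
    )
```

Deduplication quality and schema validity need their own checks. This sketch only covers the metrics that compare a response against its candidate set.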

Trace the whole workflow

A headline agent can fail in several places. The source API can return stale records. The deduper can group unrelated stories. The ranker can overweight source quality. The model can omit a citation. Without traces, these issues turn into vague complaints from users.

Log these fields for every run:

  • User request and parsed request parameters.
  • Source provider, query, response status, and latency.
  • Raw candidate count and normalized candidate count.
  • Rejected records with rejection reasons.
  • Deduped clusters.
  • Ranking scores by component.
  • Prompt version, model, temperature, and response schema.
  • Final response and validation result.

If you use multiple agents for retrieval, ranking, and briefing, treat each handoff as an observable event. Larger multi-agent systems need strong contracts between agents so one weak step does not corrupt the final answer.
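As a rough illustration of the logging list above, one run can be captured as a single structured record with one event per step. `RunTrace` and its field names are hypothetical, and this sketch only serializes to JSON. A tracing platform or log pipeline would take the place of the final `to_json` call.

```python
from __future__ import annotations

import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class RunTrace:
    """One record per agent run; each pipeline step appends an event."""
    request: dict
    prompt_version: str
    model: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    events: list = field(default_factory=list)

    def record(self, step: str, **details) -> None:
        # Store the step name, a wall-clock timestamp, and any step-specific fields.
        self.events.append({"step": step, "at": time.time(), **details})

    def to_json(self) -> str:
        # default=str keeps datetimes and other non-JSON values loggable.
        return json.dumps(asdict(self), default=str)


# Example usage with made-up values.
trace = RunTrace(request={"topic": "AI regulation"}, prompt_version="v3", model="example-model")
trace.record("fetch", provider="news-api-name", status=200, latency_ms=120)
trace.record("normalize", raw_count=40, kept=31, rejected=[{"reason": "missing source_url"}])
trace.record("rank", scores=[{"id": "abc", "freshness": 30.0, "total": 90.0}])
line = trace.to_json()
```

Writing one JSON line per run makes it simple to filter by `run_id` when a user reports a bad briefing.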

Common implementation mistakes

Scraping Google pages directly

This creates legal, reliability, and maintenance risk. Use approved APIs, RSS feeds, licensed datasets, or publisher feeds.

Letting the LLM invent headlines

Never ask the model to “write today’s headlines” without source records. Pass verified candidates and require every returned title to match a source record exactly.

Presenting summaries as reporting

A model-generated summary is a derived artifact. Label it as a summary and cite the underlying article.

Omitting citations

A headline without a source URL should fail validation. Do not let the UI or model hide that gap.

Ranking by model preference

Use explicit scoring rules. Let the LLM explain or format the result, not decide importance from vague instructions.

Ignoring freshness

News data decays quickly. Enforce time windows in code and pass timestamps into the prompt.

When to use more than one agent

A single workflow is enough for many headline products. Add more agents only when the separation gives you better control.

A practical split might look like this:

  • Retrieval agent: Queries approved sources and returns normalized records.
  • Verification agent: Checks required fields, timestamps, and source allowlists.
  • Ranking agent: Applies ranking rules and returns ordered candidates.
  • Briefing agent: Writes the final cited response.

If agents communicate with each other, define strict input and output contracts. The patterns behind agent-to-agent workflows apply here: each step should know what it can trust, what it must verify, and when it should fail.

Production checklist

  • Use approved source access instead of scraping Google result pages.
  • Normalize all candidate headlines into a strict schema.
  • Reject records without source URLs.
  • Track published time and retrieval time.
  • Deduplicate exact and near-duplicate stories.
  • Rank with explicit scoring rules.
  • Constrain the LLM to provided records.
  • Validate structured output before returning it.
  • Run evals for citations, freshness, schema validity, and invented headlines.
  • Trace every source call, prompt, model response, and validation result.

Final recommendation

Build your Google News headline agent as a sourced workflow, not as a general-purpose news writer. Your system should retrieve real candidate records, enforce freshness, cite every item, and keep the LLM inside a controlled summarization and formatting role.

This approach gives you a headline agent that can be tested, monitored, and improved without relying on model memory or fragile page scraping.


PromptLayer helps AI teams manage prompts, trace agent workflows, run evals, and compare model behavior as they ship LLM applications. If you are building a headline agent or any production AI workflow, create a PromptLayer account at https://dashboard.promptlayer.com/create-account.
