
News data is extremely large / raw -> causing 6–7k token usage even with top-5 articles #291

@kaushik-yadav

Description

Hi, I noticed that the news data block is extremely large. Even when selecting only the top 5 news articles, the input to the LLM ends up being 6–7k tokens, mainly because the full raw article text is passed through.

Other data sources (fundamentals, technicals) remain compact; only news is causing heavy token usage.

Problem

  • Very high token cost per run
  • Slower inference
  • News text contains many irrelevant sections (ads, disclaimers, long paragraphs)

Suggestion

It might help to:

  • Use only the headline + first paragraph, or
  • Add a built-in summarization step, or
  • Add a max_chars/max_tokens limit per article.

I’m currently using a workaround that keeps just the title + first paragraph, which reduces token usage significantly while preserving signal quality.
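For reference, here is a minimal sketch of that workaround. The field names (`title`, `content`) and the `max_chars` cap are assumptions about the article dict shape, not the project’s actual schema:

```python
def compact_article(article: dict, max_chars: int = 500) -> str:
    """Reduce a raw article to title + first paragraph, capped at max_chars.

    Assumes the article dict exposes 'title' and 'content' keys; adjust
    to whatever schema the news fetcher actually returns.
    """
    title = (article.get("title") or "").strip()
    body = (article.get("content") or "").strip()
    # Take only the first paragraph (text up to the first blank line).
    first_para = body.split("\n\n", 1)[0]
    compact = f"{title}\n{first_para}" if title else first_para
    return compact[:max_chars]
```

Applied before the prompt is assembled, this keeps each article to a predictable size instead of passing the full raw text through.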

Thanks!
