Transparent, Reproducible Analysis

Haberler.cloud uses a multi-stage NLP pipeline with 11 specialized analyzers to evaluate news articles. Our methodology is designed to be transparent, objective, and continuously improving.

Analysis Pipeline Overview

When an article enters our system, it passes through five stages:

1. Content Extraction

We extract the article's text, title, author, and publication date. To stay within fair use, we store only a 500-character snippet of the text.
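A minimal sketch of the snippet rule follows. The `ExtractedArticle` record and the `make_snippet` helper are hypothetical names chosen for illustration, and collapsing whitespace before truncating is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional

SNIPPET_LIMIT = 500  # maximum characters stored, per the rule described above


@dataclass
class ExtractedArticle:
    """Hypothetical record holding the fields named in this step."""
    title: str
    author: Optional[str]
    published: Optional[str]  # ISO 8601 date string, when the page exposes one
    snippet: str


def make_snippet(text: str, limit: int = SNIPPET_LIMIT) -> str:
    """Collapse runs of whitespace, then keep at most `limit` characters."""
    collapsed = " ".join(text.split())
    return collapsed[:limit]


article = ExtractedArticle(
    title="Example headline",
    author=None,
    published="2024-01-15",
    snippet=make_snippet("word " * 400),
)
print(len(article.snippet))
```

Only the truncated snippet is kept on the record; the full text would be passed to later stages in memory and then discarded.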

2. Text Preprocessing

The text is tokenized, normalized, and prepared for analysis. We detect the language (Turkish or English) and load appropriate NLP models.
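As a rough illustration of normalization and tokenization, the sketch below uses only the Python standard library. The function names are hypothetical, the language code is passed in rather than detected, and the production pipeline may well rely on spaCy tokenizers instead. The Turkish branch is there because Python's default lowercasing does not follow Turkish casing rules for dotted and dotless i.

```python
import re
import unicodedata

TOKEN_RE = re.compile(r"\w+")


def lowercase(text: str, lang: str) -> str:
    """Lowercase text, handling Turkish dotted/dotless i explicitly."""
    if lang == "tr":
        # str.lower() maps "I" to "i" and "İ" to "i" plus a combining dot;
        # Turkish expects "I" -> "ı" and "İ" -> "i".
        text = text.replace("I", "ı").replace("İ", "i")
    return text.lower()


def preprocess(text: str, lang: str) -> list[str]:
    """Normalize to Unicode NFC, lowercase, and split into word tokens."""
    normalized = unicodedata.normalize("NFC", text)
    return TOKEN_RE.findall(lowercase(normalized, lang))


print(preprocess("IŞIK İstanbul'da yandı.", "tr"))
print(preprocess("Lights came on in Istanbul.", "en"))
```

Splitting on `\w+` separates Turkish suffixes that follow an apostrophe (as in "İstanbul'da"), which a language-aware tokenizer might treat differently.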

3. Multi-Analyzer Processing

The content passes through 11 specialized analyzers, each examining different aspects of the article. These run in parallel for efficiency.
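The parallel fan-out could look something like the sketch below. The text above says only that the analyzers run in parallel, not how, so the use of a thread pool is an assumption, and the two toy analyzers and the `ANALYZERS` registry are hypothetical stand-ins for the 11 real modules.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Analyzer = Callable[[str], float]


def word_count(text: str) -> float:
    """Toy analyzer: number of whitespace-separated words."""
    return float(len(text.split()))


def exclamation_ratio(text: str) -> float:
    """Toy analyzer: share of characters that are exclamation marks."""
    return text.count("!") / max(len(text), 1)


# Hypothetical registry; the real pipeline has 11 analyzers.
ANALYZERS: Dict[str, Analyzer] = {
    "word_count": word_count,
    "exclamation_ratio": exclamation_ratio,
}


def run_analyzers(text: str, analyzers: Dict[str, Analyzer]) -> Dict[str, float]:
    """Submit every analyzer to a thread pool and collect results by name."""
    with ThreadPoolExecutor(max_workers=max(len(analyzers), 1)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in analyzers.items()}
        return {name: future.result() for name, future in futures.items()}


results = run_analyzers("Shocking news! Read now!", ANALYZERS)
print(results)
```

Threads suit analyzers that wait on I/O or spend their time inside C extensions. CPU-bound pure-Python analyzers would more likely need `ProcessPoolExecutor`, which has the same `submit` interface.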

4. Score Aggregation

Individual analyzer outputs are combined using weighted aggregation to produce final credibility and quality scores.
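One common form of weighted aggregation is a weighted mean, sketched below. The analyzer names, score values, and weights are all hypothetical; the actual weights and the exact formula are not given in this section.

```python
def aggregate(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean over the analyzers that have both a score and a weight."""
    used = {name: w for name, w in weights.items() if name in scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted scores to aggregate")
    return sum(scores[name] * w for name, w in used.items()) / total_weight


# Hypothetical analyzer outputs, each scaled to [0, 1] so that higher is better.
scores = {"credibility": 0.8, "propaganda": 0.5, "clickbait": 1.0}
# Hypothetical weights, for illustration only.
weights = {"credibility": 0.5, "propaganda": 0.25, "clickbait": 0.25}

print(aggregate(scores, weights))
```

Dividing by the total weight keeps the result in the same range as the inputs even when an analyzer produces no score for a given article.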

5. Version Tracking

We hash the content and compare against previous versions to detect stealth edits or deletions over time.
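A minimal sketch of hash-based change detection, assuming SHA-256 and whitespace normalization before hashing (the section does not name the hash function or the canonicalization used). The `content_hash` and `detect_change` helpers are hypothetical names.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """SHA-256 of whitespace-normalized text, so reflowed text is not flagged."""
    canonical = " ".join(text.split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_change(previous_hash: Optional[str], current_text: Optional[str]) -> str:
    """Classify a re-crawl as 'new', 'unchanged', 'edited', or 'deleted'."""
    if current_text is None:
        return "deleted"
    if previous_hash is None:
        return "new"
    return "unchanged" if content_hash(current_text) == previous_hash else "edited"


stored = content_hash("The minister resigned on Monday.")
print(detect_change(stored, "The minister  resigned on Monday."))
print(detect_change(stored, "The minister stepped aside on Monday."))
print(detect_change(stored, None))
```

Storing only the hash is enough to detect that an edit happened; showing what changed would require keeping more than a digest.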

The 11 Analyzers

Each analyzer is a specialized module that examines specific aspects of the article:

😊 Sentiment Analyzer

Uses TextBlob and VADER lexicons to determine the emotional tone of the article.

📖 Readability Analyzer

Calculates Flesch Reading Ease score and grade level.
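For reference, the sketch below computes the standard Flesch Reading Ease and Flesch-Kincaid grade level formulas. The coefficients are the published English-language ones; the vowel-run syllable counter is a crude heuristic assumed here for illustration, and how Turkish text is scored is not stated in this section.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of vowels, with a minimum of one."""
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid grade level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade


ease, grade = readability("The cat sat on the mat. It was a good day.")
print(round(ease, 1), round(grade, 1))
```

Higher Reading Ease means easier text. Very simple input can push the ease score above 100 and the grade level below zero, so a display layer would typically clamp both.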

βš–οΈ Bias Analyzer

Detects political bias using keyword patterns and linguistic markers.

✅ Credibility Analyzer

Detects citations, named sources, and other credibility indicators.

📒 Propaganda Analyzer

Identifies propaganda techniques such as loaded language, name-calling, and fear-mongering.

❓ Misinformation Analyzer

Assesses misinformation risk by analyzing claim patterns and source attribution.

πŸ” Fallacy Detector

Identifies logical fallacies such as ad hominem attacks, strawman arguments, and false dichotomies.

📚 Educational Value Analyzer

Evaluates the article's educational value by checking whether it supplies context and explains complex concepts.

🎣 Clickbait Detector

Analyzes headlines and content for clickbait patterns including curiosity gaps and exaggeration.
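Pattern-based detection of this kind can be sketched with a few regular expressions. The pattern names and the phrases below are hypothetical examples in English only; a real rule set would be larger and would need Turkish equivalents.

```python
import re

# Hypothetical pattern list, for illustration only.
CLICKBAIT_PATTERNS = {
    "curiosity_gap": re.compile(
        r"\b(you won't believe|what happened next|this is why)\b", re.IGNORECASE
    ),
    "exaggeration": re.compile(
        r"\b(shocking|unbelievable|mind-blowing|insane)\b", re.IGNORECASE
    ),
    "listicle": re.compile(r"^\d+\s+(things|reasons|ways)\b", re.IGNORECASE),
    # Case-sensitive on purpose: repeated "!" or a word of 4+ capital letters.
    "shouting": re.compile(r"!{2,}|\b[A-Z]{4,}\b"),
}


def clickbait_signals(headline: str) -> list[str]:
    """Return the names of all patterns that match the headline."""
    return [
        name for name, pattern in CLICKBAIT_PATTERNS.items() if pattern.search(headline)
    ]


print(clickbait_signals("You won't believe what this SHOCKING report found!!"))
print(clickbait_signals("Central bank holds interest rates steady"))
```

Returning the matched pattern names, rather than a bare score, makes it possible to show readers why a headline was flagged.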

👀 NER Analyzer

Uses spaCy NER to extract persons, organizations, and locations mentioned.

🏷️ Topic Extractor

Extracts keywords and identifies main topics using TF-IDF weighting.
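To show what TF-IDF ranking does, here is a small pure-Python sketch using term frequency times log inverse document frequency. The function name and the three-document corpus are hypothetical; in practice a library implementation such as scikit-learn's `TfidfVectorizer` would be the more likely choice, though this section does not say which one is used.

```python
import math
from collections import Counter


def tfidf_keywords(docs: list[list[str]], doc_index: int, top_k: int = 3) -> list[str]:
    """Rank the terms of one tokenized document by TF-IDF against the corpus."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[doc_index])
    length = len(docs[doc_index])
    scores = {
        term: (count / length) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Sort by score descending, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked][:top_k]


corpus = [
    "the central bank raised interest rates".split(),
    "the football team won the cup".split(),
    "the bank reported record profit".split(),
]
print(tfidf_keywords(corpus, 0))
```

A word such as "the" appears in every document, so its inverse document frequency is zero and it drops to the bottom of the ranking without needing a stop-word list.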

Technologies Used

Our analysis pipeline is built with industry-standard NLP and machine learning technologies:

  • Python 3.11+
  • BERT Transformers
  • spaCy 3.7
  • TextBlob
  • VADER Sentiment
  • scikit-learn
  • FastAPI
  • PostgreSQL

Limitations and Caveats

Important: Our analysis has limitations that users should understand.

  • Not Fact-Checking: We analyze writing patterns and indicators, not factual accuracy.
  • Algorithmic Bias: Our models may have inherent biases from training data.
  • Language Limitations: Currently optimized for Turkish and English.
  • Context Blindness: Algorithms may miss context, satire, or nuance.

Feedback Welcome: If you notice analysis errors or have suggestions, please contact us.