Trend Research Automation User Manual

This manual explains how the trend-research-automation system works end-to-end: data collection, storage, normalization, scoring, dashboards, exports, reports, competitor analysis, configuration, maintenance, and future extension.

Project: trend-research-automation
Primary stack: Python, SQLite, pandas, pydantic, Typer, Streamlit, Jinja2
Primary use case: Fashion and homewear trend discovery for organic and paid testing
Default timezone: Africa/Cairo

1. System Purpose

The system is designed to help a pajamas, lingerie, homewear, loungewear, and nightwear brand decide what to create and what to test. It combines trend demand, competitor advertising signals, internal fit heuristics, and future-ready placeholders for owned performance data.

The output is not a generic trend feed. It is a decision-support system intended to answer concrete questions: what to test organically, what justifies paid testing, and what belongs on the watchlist.

2. Architecture

Core Layers

  • config/: editable YAML business configuration
  • src/providers/: external source adapters
  • src/pipelines/: executable workflows
  • src/services/: scoring, classification, reporting, alerts, config loading
  • src/models/: schemas and SQLite management
  • dashboards/: Streamlit app plus dashboard export CSV
  • reports/: rendered HTML reports and alert files

Execution Modes

  • Mock: deterministic test data
  • Manual: CSV imports
  • Live: Google Trends live collection
  • Apify: competitor Meta Ad Library scraping

3. Data Flow

  1. Providers fetch or import raw source data.
  2. Raw rows are stored in source-specific SQLite tables.
  3. Normalization maps different sources into one standard trend-signal shape.
  4. Scoring calculates source scores, fit score, historical fit, and competitor proxy score.
  5. Thresholds convert final scores into an operational decision.
  6. Exports, reports, alerts, and dashboard views are generated from the scored data.
The system is intentionally modular: if one provider fails, the rest of the pipeline remains usable with the available sources.
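Step 3 of the flow can be illustrated with a minimal normalization sketch. The field names here (`term`, `interest`, `date`, and the `TrendSignal` attributes) are assumptions for illustration; the real unified shape lives in src/models/.

```python
from dataclasses import dataclass

@dataclass
class TrendSignal:
    # Unified cross-source trend-signal shape (illustrative field names).
    source: str
    topic: str
    signal_strength: float  # normalized to a 0-100 scale
    observed_at: str

def normalize_google_row(row: dict) -> TrendSignal:
    # Google Trends interest is already on a 0-100 scale, so it maps directly.
    return TrendSignal(
        source="google_trends",
        topic=row["term"],
        signal_strength=float(row["interest"]),
        observed_at=row["date"],
    )

signal = normalize_google_row({"term": "satin pajamas", "interest": 64, "date": "2024-05-01"})
```

Each provider gets its own adapter like this, so adding a source never changes the downstream scoring code.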

4. Data Providers

Google Trends

Google Trends is the strongest direct demand signal in the current setup. The project uses Egypt-focused tracked terms and collects interest data with locale-aware handling.

Stored in: google_trends_raw

Meta Competitor Monitoring

Competitor ads are collected through Apify using the Facebook Ads Library scraper actor. Exact Facebook page URLs are preferred where available because they produce better results than broad keyword search.

Stored in: meta_competitor_raw

Competitor scrape quality depends on the actor, the page URL quality, timeouts, and what Meta exposes publicly. Not all brands will return the same richness of data.

TikTok

TikTok is currently supported through mock and manual workflows. The architecture is ready for a live integration later, but a stable live collector is not yet part of the current system.

Owned Social Performance

Internal Instagram, Facebook, and TikTok performance data can be imported manually. This is the future source for true historical performance weighting, including CTR, orders, revenue, and conversion-derived fit.

5. Database Model

The project uses SQLite. The main operational tables are:

  • tracked_terms: Keyword inventory used by live and manual collection.
  • google_trends_raw: Raw Google Trends rows.
  • tiktok_trends_raw: Raw TikTok trend rows.
  • meta_competitor_raw: Raw competitor ad rows from Meta/Apify.
  • social_performance_raw: Owned content/ad performance inputs.
  • normalized_trends: Unified cross-source trend signal table.
  • trend_scores: Final scored trend decisions.
  • weekly_shortlist: Action-oriented shortlist for testing.
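A minimal sketch of how two of these tables might be created and queried with the standard sqlite3 module. The column names are illustrative assumptions; the authoritative schema lives in src/models/.

```python
import sqlite3

# In-memory database for illustration; the project persists to a file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS tracked_terms (
    term   TEXT PRIMARY KEY,
    source TEXT NOT NULL          -- e.g. 'google_trends'
);
CREATE TABLE IF NOT EXISTS trend_scores (
    trend_topic TEXT PRIMARY KEY,
    final_score REAL NOT NULL,
    decision    TEXT NOT NULL     -- paid_test / organic_test / watchlist / ignore
);
""")
conn.execute("INSERT INTO tracked_terms VALUES (?, ?)", ("satin pajamas", "google_trends"))
count = conn.execute("SELECT COUNT(*) FROM tracked_terms").fetchone()[0]
```

Parameterized inserts (the `?` placeholders) keep imported keyword data from breaking the SQL.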

6. Scoring Logic

Final Formula

final_score =
(google_score * 0.30) +
(tiktok_score * 0.25) +
(competitor_score * 0.20) +
(fit_score * 0.15) +
(historical_conversion_fit_score * 0.10)

Decision Thresholds

  • paid_test: final_score >= 75. Strong enough to justify paid testing.
  • organic_test: final_score >= 55 and < 75. Strong enough for content testing.
  • watchlist: final_score >= 35 and < 55. Worth monitoring, not a top priority.
  • ignore: final_score < 35. Low priority right now.
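The formula and thresholds above can be sketched directly in Python. This is an illustrative restatement of the documented weights and cutoffs, not the project's actual scoring module; each component score is assumed to be on a 0-100 scale.

```python
# Weights from the final-score formula above.
WEIGHTS = {
    "google_score": 0.30,
    "tiktok_score": 0.25,
    "competitor_score": 0.20,
    "fit_score": 0.15,
    "historical_conversion_fit_score": 0.10,
}

def final_score(scores: dict) -> float:
    # Weighted sum of the five component scores.
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())

def decide(score: float) -> str:
    # Thresholds mirror the decision table above.
    if score >= 75:
        return "paid_test"
    if score >= 55:
        return "organic_test"
    if score >= 35:
        return "watchlist"
    return "ignore"

score = final_score({
    "google_score": 80, "tiktok_score": 70, "competitor_score": 60,
    "fit_score": 90, "historical_conversion_fit_score": 50,
})
# 80*0.30 + 70*0.25 + 60*0.20 + 90*0.15 + 50*0.10 = 72.0 -> organic_test
```

Because the weights sum to 1.0, the final score stays on the same 0-100 scale as its inputs.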

Fit Score

Fit score is a heuristic measure of how relevant a topic is to the brand. It rewards direct product terms such as pajamas, lingerie, homewear, satin, cotton, and Egyptian Arabic equivalents.
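A toy version of the fit heuristic, using the product terms named above. The base value, per-hit bonus, and cap are invented for illustration; the real keyword lists and weighting live in the business configuration.

```python
# Illustrative product-term list; the real inventory lives in config/.
PRODUCT_TERMS = {"pajamas", "lingerie", "homewear", "loungewear", "nightwear", "satin", "cotton"}

def fit_score(topic: str) -> float:
    # Count direct product-term hits in the topic.
    hits = len(set(topic.lower().split()) & PRODUCT_TERMS)
    # Assumed weighting: low base plus 25 points per hit, capped at 100.
    return min(100.0, 20.0 + 25.0 * hits)
```

So "satin pajamas" scores well above an off-brand topic like "garden tools", which stays at the base value.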

Historical Conversion Fit Score

If owned performance data exists, the system can infer fit from revenue, orders, and CTR. If not, it defaults to a neutral score of 50.
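The neutral-default behavior can be sketched as follows. The blend of CTR and orders here is an assumption for illustration; only the inputs (revenue, orders, CTR) and the neutral default of 50 come from this manual.

```python
from typing import Optional

def historical_conversion_fit(perf: Optional[dict]) -> float:
    # Without owned performance data, fall back to the neutral default.
    if not perf:
        return 50.0
    # Illustrative blend; the real weighting would combine revenue, orders, and CTR.
    ctr_part = min(perf.get("ctr", 0.0) * 1000, 100.0)   # e.g. 0.05 CTR -> 50
    orders_part = min(perf.get("orders", 0) * 2.0, 100.0)
    return (ctr_part + orders_part) / 2
```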

Competitor Score

Competitor score is the strongest of:

Competitor Effectiveness Proxy

Because competitor conversion rate is not available from Ad Library scraping, the system uses a transparent proxy. It rewards:

The system does not claim competitor conversion rate. The proxy is a directional signal only.

Keyword Matching Rules

Competitor relevance is intentionally constrained:

7. Dashboard Guide

Overview

Executive summary of the current system state: top score, top trend, current shortlist size, the top curated trend table, and basic charts.

Trend Scores

Detailed trend ranking view with the full planning-ready fields: scores, decision, platform recommendation, hook, format, paid-vs-organic recommendation, and shortlist rationale.

Competitors

Raw competitor ad rows. This is where scraped ad text, offer text, and media type can be inspected directly.

Keyword Support

Audit layer for Google keyword support. It shows, for each Google keyword, whether any competitor ad supports it, and whether that reinforcement is an exact keyword match or a broader category-based match.

Raw Signals

Data quality and debugging section. Use it to inspect what was ingested from each source and how it was normalized.

Weekly Shortlist

Action layer for execution teams. Shows why a trend matters now and what to make first.

Exports

Download center for CSV and HTML outputs.

Important Dashboard Columns

  • trend_topic: Trend being ranked.
  • final_score: Weighted final priority score.
  • decision: Operational recommendation: paid, organic, watchlist, ignore.
  • platform_priority: Recommended primary platform.
  • recommended_hook: Suggested opening angle.
  • recommended_format: Recommended creative format.
  • why_now: Human-readable rationale for acting now.
  • content_hook: Shortlist-ready execution angle.

8. Pipelines and Commands

Main commands:

python -m src.main ingest-google-trends
python -m src.main ingest-meta-competitors
python -m src.main normalize-signals
python -m src.main score-trends
python -m src.main build-weekly-shortlist
python -m src.main export-dashboard
python -m src.main generate-daily-report
python -m src.main generate-weekly-report
python -m src.main run-daily-pipeline
python -m src.main run-weekly-pipeline
python -m src.main run-dashboard
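Since the stack uses Typer, the commands above are presumably registered on a Typer app in src/main.py. A minimal sketch of that pattern, with placeholder bodies (the echoed messages and docstrings are assumptions):

```python
import typer

app = typer.Typer(help="Trend research automation CLI (illustrative sketch).")

@app.command("score-trends")
def score_trends() -> None:
    """Recompute final scores and decisions from normalized signals."""
    typer.echo("scoring trends...")

@app.command("run-daily-pipeline")
def run_daily_pipeline() -> None:
    """Ingest, normalize, score, export, report, and write alerts."""
    typer.echo("running daily pipeline...")

if __name__ == "__main__":
    app()
```

Registering each pipeline step as its own command is what lets operators run a single stage (for example, re-exporting the dashboard) without rerunning the whole pipeline.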

Daily Pipeline

  1. Ingest enabled sources
  2. Normalize signals
  3. Score trends
  4. Export dashboard data
  5. Generate daily report
  6. Write alert output

Weekly Pipeline

  1. Refresh scoring
  2. Build weekly shortlist
  3. Export dashboard data
  4. Generate weekly report
  5. Write weekly alert output

9. Configuration Files

Main editable files:

10. Daily Operations

Recommended Daily Process

  1. Run the daily pipeline.
  2. Open the dashboard.
  3. Check Overview and Trend Scores.
  4. Check Keyword Support to see whether competitor reinforcement is exact or category-based.
  5. Review Weekly Shortlist for action-ready ideas.
  6. Export CSVs if reporting or BI tools need the latest output.

Recommended Weekly Process

  1. Run the weekly pipeline.
  2. Review shortlist quality and rationale.
  3. Decide what becomes organic content, what becomes paid testing, and what remains on watchlist.

11. Maintenance and Troubleshooting

Common Issues

  • Competitor scrape returns weak data. Likely cause: generic keyword search instead of an exact page URL. Action: add or improve page_url in competitors.yaml.
  • No competitor data. Likely cause: actor timeout or empty page results. Action: retry the scrape, reduce scope, or improve competitor source URLs.
  • Scores look inflated. Likely cause: matching rules too broad. Action: inspect the Keyword Support tab and tighten taxonomy rules.
  • Dashboard looks outdated. Likely cause: exports not regenerated after scoring. Action: run score-trends, build-weekly-shortlist, and export-dashboard.
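For the first issue above, a competitors.yaml entry with an exact page URL might look like the sketch below. Only the filename and the page_url key appear in this manual; the surrounding structure and other keys are assumptions.

```yaml
# Illustrative competitors.yaml entry (structure assumed; only page_url is documented).
competitors:
  - name: example-brand
    page_url: https://www.facebook.com/example-brand   # exact page URL beats keyword search
    enabled: true
```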

Security Practices

12. Future-Proofing Guidance

The system has been structured so it can evolve without a full rewrite. Recommended next upgrades:

The most important future-proofing rule is to keep the system honest. If a signal is proxy-based, label it as a proxy. If a provider is weak, expose that weakness rather than hiding it inside a score.

Appendix: Key File Paths

This manual is designed to be maintained with the codebase. When major provider, scoring, or dashboard behavior changes, update this file at the same time so the operational model stays aligned with the implementation.