diff --git a/README.md b/README.md
index 4e10e15..83db6a8 100644
--- a/README.md
+++ b/README.md
@@ -1,138 +1,231 @@
-# Zpravobot Digest Bot
+# Zprávobot AI Digest
 
-Automatic daily digest of Mastodon posts using Claude AI.
+Automatic daily digest system for Mastodon bots, using Claude AI.
 
-## Structure
+## 🎯 What it does
 
-```
-/app/data/zpravobot-digest/
-├── export-daily.sh       # Export posts from DB to CSV
-├── digest-bot.py         # Main script (Claude + Mastodon)
-├── run-digest.sh         # Wrapper with config
-├── config.env.example    # Configuration template
-└── README.md
-```
+Three times a day, the system:
+1. Loads yesterday's posts from the CSV export
+2. Automatically categorizes them by topic (🌍 Politics, 🏒 Sports, 🎬 Culture...)
+3. Analyzes them with Claude AI
+4. Publishes a 2-toot thread to Mastodon
 
-## Installation
+### Three bots with different styles
 
-### 1. Clone the repo
+| Bot | Time | Style | Purpose |
+|-----|------|-------|---------|
+| @zpravobot | 7:30 | Neutral | Morning news overview |
+| @pozitivni | 12:00 | Positive | Midday motivation |
+| @sarkasticky | 19:00 | Sarcastic | Evening commentary |
 
+## 📋 Requirements
+
+- Ruby 3.0+
+- mastodon-api gem
+- PostgreSQL with Mastodon data
+- Claude API key
+- 3 Mastodon bot tokens
+
+## 🚀 Installation (Cloudron)
+
+### 1. Prepare the environment
+
+In the Mastodon terminal (Cloudron):
 ```bash
 cd /app/data
-git clone https://gitea.tvoje-domena.cz/user/zpravobot-digest.git
+git clone https://gitea.vhsky.cz/user/zpravobot-digest.git
 cd zpravobot-digest
 ```
 
-### 2. Configuration
+### 2. Install the Ruby gem
+```bash
+export GEM_HOME=$HOME/.gem
+export PATH=$GEM_HOME/bin:$PATH
+gem install mastodon-api --user-install
+```
 
+Verify the installation:
+```bash
+ruby -e "require 'mastodon'; puts 'OK'"
+```
+
+### 3. Configuration
 ```bash
 cp config.env.example config.env
 nano config.env
 ```
 
-Fill in:
-
-- `ANTHROPIC_API_KEY` - Claude API token
-- `TOKEN_ZPRAVOBOT` - Mastodon token for @zpravobot
-- `TOKEN_POZITIVNI` - Mastodon token for @pozitivni
-- `TOKEN_SARKASTICKY` - Mastodon token for @sarkasticky
-
-### 3. Permissions
-
+Fill in the tokens:
 ```bash
-chmod +x export-daily.sh run-digest.sh digest-bot.py
+export ANTHROPIC_API_KEY="sk-ant-api03-..."
+export ZPRAVOBOT_TOKEN="token-here"
+export POZITIVNI_TOKEN="token-here"
+export SARKASTICKY_TOKEN="token-here"
+```
+
+**How to create Mastodon tokens:**
+1. Log in as the bot account
+2. Settings → Development → New application
+3. Scopes: `read:statuses` + `write:statuses`
+4. Copy "Your access token"
+
+### 4. Executable permissions
+```bash
+chmod +x export-daily.sh publish_digest.rb run-digest.sh
 chmod 600 config.env
 ```
 
-### 4. Test
+## 🧪 Testing
 
+### Dry-run (no publishing)
 ```bash
-./export-daily.sh          # Export CSV
-./run-digest.sh zpravobot  # Test the digest
+source config.env
+./run-digest.sh zpravobot --dry-run
+./run-digest.sh pozitivni --dry-run
+./run-digest.sh sarkasticky --dry-run
 ```
 
-## Usage
-
-### Manual run
-
+### Live test (actual publishing)
 ```bash
-./run-digest.sh zpravobot    # Neutral digest
-./run-digest.sh pozitivni    # Positive digest
-./run-digest.sh sarkasticky  # Sarcastic digest
+./run-digest.sh zpravobot
 ```
 
-### Automation (Cloudron Cron)
+Check on Mastodon that the thread was published.
+## ⏰ Automation (Cron)
+
+In the Cloudron UI → Mastodon app → Cron tab:
 ```
 0 6 * * * /app/data/zpravobot-digest/export-daily.sh
-0 7 * * * /app/data/zpravobot-digest/run-digest.sh zpravobot
+30 7 * * * /app/data/zpravobot-digest/run-digest.sh zpravobot
 0 12 * * * /app/data/zpravobot-digest/run-digest.sh pozitivni
-0 18 * * * /app/data/zpravobot-digest/run-digest.sh sarkasticky
+0 19 * * * /app/data/zpravobot-digest/run-digest.sh sarkasticky
 ```
 
-## Output
-
-### Export CSV
-
-- **Location:** `/app/data/posts-latest.csv`
-- **Format:** `id,created_at,text,uri,url,account_id`
-- **Range:** Last 48 hours
-- **Archive:** `/app/data/archive/posts-YYYY-MM-DD.csv` (7 days)
-
-### Digest
-
-- 2-toot thread (summary + links)
-- Published to the corresponding bot account
-- Style follows the bot's personality
-
-## How it works
-
-1. **Export (6:00):** SQL → CSV export from PostgreSQL
-2. **Digest (7:00/12:00/18:00):**
-   - Loads the CSV
-   - Sends the data to the Claude API
-   - Claude analyzes the topics
-   - Publishes a 2-toot thread to Mastodon
-
-## File structure
+**Schedule:**
+- 6:00 - Export posts from the database
+- 7:30 - Neutral digest (@zpravobot)
+- 12:00 - Positive news (@pozitivni)
+- 19:00 - Sarcastic commentary (@sarkasticky)
 
+## 📁 File structure
 ```
 /app/data/
-├── zpravobot-digest/     # Git repo
-│   ├── export-daily.sh
-│   ├── digest-bot.py
-│   ├── run-digest.sh
-│   ├── config.env        # Gitignored!
-│   └── README.md
-├── posts-latest.csv      # Daily export
-├── archive/              # 7-day history
-│   └── posts-YYYY-MM-DD.csv
+├── zpravobot-digest/
+│   ├── export-daily.sh     # CSV export from PostgreSQL
+│   ├── publish_digest.rb   # Main Ruby script
+│   ├── run-digest.sh       # Wrapper (loads config)
+│   ├── config.env          # Tokens (gitignored!)
+│   └── config.env.example  # Template
+├── posts-latest.csv        # Daily export (2 days of posts)
+├── archive/
+│   └── posts-YYYY-MM-DD.csv  # 7-day history
 └── logs/
-    └── export.log
+    └── export.log          # Export logs
 ```
 
-## Requirements
+## 🔧 Manual usage
 
-- Python 3.x (included in Cloudron Mastodon)
-- Mastodon instance (zpravobot.news)
-- Claude API access
-- 3× Mastodon bot accounts with tokens
+### Publish a digest
+```bash
+source config.env
+./run-digest.sh zpravobot    # Neutral
+./run-digest.sh pozitivni    # Positive
+./run-digest.sh sarkasticky  # Sarcastic
+```
 
-## Security
+### Use a specific date
+```bash
+./run-digest.sh zpravobot --date=2026-01-05 --dry-run
+```
 
-- ⚠️ `config.env` contains sensitive tokens → chmod 600
-- ⚠️ Do not commit `config.env` to Git (it is in .gitignore)
-- ✅ DB access only for the export script
-- ✅ The digest script reads only the CSV (no DB access)
+### Export CSV
+```bash
+./export-daily.sh
+```
 
-## TODO
+## 🎨 Features
 
-- [ ] Prompt optimization for Claude
-- [ ] Error handling in digest-bot.py
-- [ ] Notifications on failure
-- [ ] Web dashboard for statistics
+- ✅ **Automatic topic categorization** (Politics, Sports, Culture...)
+- ✅ **Claude AI analysis** with a fallback when the API fails
+- ✅ **Style filtering** - the positive bot filters out negative news
+- ✅ **2-toot threads** - summary + links
+- ✅ **URL extraction** from posts
+- ✅ **Error handling** and logging
+- ✅ **Dry-run mode** for testing
 
-## Author
+## 📊 Monitoring
 
-A colleague + Claude
+### Check today's runs
+```bash
+# In the export log
+tail -50 /app/data/logs/export.log
+# Verify the CSV
+ls -lh /app/data/posts-latest.csv
+wc -l /app/data/posts-latest.csv
+```
+
+### Check publications
+
+Visit:
+- https://zpravobot.news/@zpravobot
+- https://zpravobot.news/@pozitivni
+- https://zpravobot.news/@sarkasticky
+
+## 🐛 Troubleshooting
+
+### "CSV file not found"
+```bash
+# Verify that the export ran
+ls -la /app/data/posts-latest.csv
+
+# Run manually
+./export-daily.sh
+```
+
+### "Missing token"
+```bash
+# Verify the environment
+source config.env
+echo $ZPRAVOBOT_TOKEN
+```
+
+### "The access token is invalid"
+
+The token has expired or is invalid. Generate a new one in Mastodon → Settings → Development.
+
+### Ruby gem error
+```bash
+# Reinstall the gem
+export GEM_HOME=$HOME/.gem
+export PATH=$GEM_HOME/bin:$PATH
+gem install mastodon-api --user-install
+```
+
+## 💰 Costs
+
+- **Claude API**: ~$3/month (3 requests/day)
+- **Infrastructure**: $0 (runs on the Mastodon server)
+
+## 🔒 Security
+
+- ✅ No direct DB access (uses the CSV export)
+- ✅ Tokens in `config.env` (gitignored)
+- ✅ Read-only access to data
+- ✅ Minimal permissions
+
+## 📝 License
+
+Open source - created for the Zprávobot.news community.
+
+## 🙏 Credits
+
+- **Zprávobot.news** - Czech/Slovak Mastodon news service
+- **Anthropic Claude** - AI analysis
+- **Mastodon** - Decentralized social network
+
+---
+
+**Version:** 1.0.0
+**Updated:** January 2026
diff --git a/config.env.example b/config.env.example
index e688a3e..08ba24b 100644
--- a/config.env.example
+++ b/config.env.example
@@ -1,14 +1,13 @@
-export ANTHROPIC_API_KEY="sk-ant-xxx..."
-export TOKEN_ZPRAVOBOT="your-token-here"
-export TOKEN_POZITIVNI="your-token-here"
-export TOKEN_SARKASTICKY="your-token-here"
-```
+cat > config.env.example << 'EOF'
+# Ruby gem setup
+export GEM_HOME=$HOME/.gem
+export PATH=$GEM_HOME/bin:$PATH
 
-## .gitignore
-```
-config.env
-*.csv
-logs/
-archive/
-__pycache__/
-*.pyc
+# API Keys
+export ANTHROPIC_API_KEY="your-claude-api-key-here"
+
+# Mastodon Bot Tokens
+export ZPRAVOBOT_TOKEN="your-zpravobot-token-here"
+export POZITIVNI_TOKEN="your-pozitivni-token-here"
+export SARKASTICKY_TOKEN="your-sarkasticky-token-here"
+EOF
diff --git a/digest-bot.py b/digest-bot.py
deleted file mode 100644
index e0848d2..0000000
--- a/digest-bot.py
+++ /dev/null
@@ -1,115 +0,0 @@
-#!/usr/bin/env python3
-import csv
-import requests
-import os
-import sys
-import json
-
-# Config
-CSV_PATH = '/app/data/posts-latest.csv'
-CLAUDE_API = 'https://api.anthropic.com/v1/messages'
-CLAUDE_KEY = os.getenv('ANTHROPIC_API_KEY')
-MASTODON_URL = 'https://zpravobot.news'
-
-# Bot selection
-bot_name = sys.argv[1] if len(sys.argv) > 1 else 'zpravobot'
-TOKENS = {
-    'zpravobot': os.getenv('TOKEN_ZPRAVOBOT'),
-    'pozitivni': os.getenv('TOKEN_POZITIVNI'),
-    'sarkasticky': os.getenv('TOKEN_SARKASTICKY')
-}
-
-if bot_name not in TOKENS:
-    print(f"❌ Unknown bot: {bot_name}")
-    sys.exit(1)
-
-TOKEN = TOKENS[bot_name]
-
-if not TOKEN or not CLAUDE_KEY:
-    print(f"❌ Missing env variables")
-    sys.exit(1)
-
-# 1. Load posts
-try:
-    with open(CSV_PATH, 'r', encoding='utf-8') as f:
-        posts = list(csv.DictReader(f))
-    print(f"📊 Loaded {len(posts)} posts for @{bot_name}")
-except Exception as e:
-    print(f"❌ Error loading CSV: {e}")
-    sys.exit(1)
-
-# 2. Prepare data for Claude
-posts_sample = posts[:500]  # Limit to 500 posts
-posts_json = json.dumps(posts_sample, ensure_ascii=False)
-
-# 3. Claude API
-print("🤖 Calling Claude API...")
-try:
-    response = requests.post(
-        CLAUDE_API,
-        headers={
-            'x-api-key': CLAUDE_KEY,
-            'anthropic-version': '2023-06-01',
-            'content-type': 'application/json'
-        },
-        json={
-            'model': 'claude-sonnet-4-20250514',
-            'max_tokens': 4000,
-            'messages': [{
-                'role': 'user',
-                'content': f'Vytvoř denní digest pro bot @{bot_name}. Data: {posts_json[:10000]}'
-            }]
-        },
-        timeout=60
-    )
-
-    if response.status_code != 200:
-        print(f"❌ Claude API error: {response.text}")
-        sys.exit(1)
-
-    digest = response.json()['content'][0]['text']
-    print(f"✅ Claude response: {len(digest)} chars")
-except Exception as e:
-    print(f"❌ Claude API exception: {e}")
-    sys.exit(1)
-
-# 4. Split to 2 toots (500 chars limit)
-toot1 = digest[:500]
-toot2 = digest[500:1000] if len(digest) > 500 else None
-
-# 5. Publish toot 1
-print("📤 Publishing toot 1...")
-try:
-    r1 = requests.post(
-        f'{MASTODON_URL}/api/v1/statuses',
-        headers={'Authorization': f'Bearer {TOKEN}'},
-        json={'status': toot1}
-    )
-
-    if r1.status_code not in [200, 201]:
-        print(f"❌ Mastodon error: {r1.text}")
-        sys.exit(1)
-
-    toot1_id = r1.json()['id']
-    print(f"✅ Toot 1 published: {toot1_id}")
-except Exception as e:
-    print(f"❌ Mastodon exception: {e}")
-    sys.exit(1)
-
-# 6. Publish toot 2 (if exists)
-if toot2:
-    print("📤 Publishing toot 2...")
-    try:
-        r2 = requests.post(
-            f'{MASTODON_URL}/api/v1/statuses',
-            headers={'Authorization': f'Bearer {TOKEN}'},
-            json={
-                'status': toot2,
-                'in_reply_to_id': toot1_id
-            }
-        )
-        print(f"✅ Toot 2 published (thread)")
-    except Exception as e:
-        print(f"⚠️ Toot 2 failed: {e}")
-
-print(f"✅ Done! Published to @{bot_name}")
diff --git a/export-daily.sh b/export-daily.sh
index 1e9b697..3b6c2d6 100644
--- a/export-daily.sh
+++ b/export-daily.sh
@@ -1,10 +1,11 @@
+cat >export-daily.sh <<'EOF'
 #!/bin/bash
 DATE=$(date +%Y-%m-%d)
 LOG="/app/data/logs/export.log"
 
 mkdir -p /app/data/logs /app/data/archive
 
-echo "[$(date)] Starting export..." >>"$LOG"
+echo "[$(date)] Starting export..." >> "$LOG"
 
 PGPASSWORD=${CLOUDRON_POSTGRESQL_PASSWORD} psql \
   -h ${CLOUDRON_POSTGRESQL_HOST} \
@@ -18,10 +19,13 @@ PGPASSWORD=${CLOUDRON_POSTGRESQL_PASSWORD} psql \
     AND deleted_at IS NULL
     AND created_at > NOW() - INTERVAL '2 days'
     ORDER BY created_at DESC
-  ) TO STDOUT WITH CSV HEADER" >/app/data/posts-latest.csv
+  ) TO STDOUT WITH CSV HEADER" > /app/data/posts-latest.csv
 
 cp /app/data/posts-latest.csv "/app/data/archive/posts-$DATE.csv"
 find /app/data/archive -name "posts-*.csv" -mtime +7 -delete
 
-LINES=$(wc -l </app/data/posts-latest.csv)
-echo "[$(date)] Exported $LINES posts" >>"$LOG"
+LINES=$(wc -l < /app/data/posts-latest.csv)
+echo "[$(date)] Exported $LINES posts" >> "$LOG"
+EOF
+
+chmod +x export-daily.sh
diff --git a/publish_digest.rb b/publish_digest.rb
new file mode 100644
index 0000000..d0bfabc
--- /dev/null
+++ b/publish_digest.rb
@@ -0,0 +1,542 @@
+#!/usr/bin/env ruby
+# -*- coding: utf-8 -*-
+#
+# Zprávobot.news - AI Daily Digest Publisher
+# Version: 1.0.1 (Cloudron - Direct HTTP)
+#
+# Generates and publishes daily digest posts to Mastodon bots:
+# - @zpravobot (7:30) - neutral overview
+# - @pozitivni (12:00) - positive news
+# - @sarkasticky (19:00) - sarcastic commentary
+
+require 'csv'
+require 'json'
+require 'time'
+require 'net/http'
+require 'uri'
+require 'optparse'
+
+# ==========================================
+# CONFIGURATION
+# ==========================================
+
+MASTODON_URL = 'https://zpravobot.news'
+CSV_PATH = '/app/data/posts-latest.csv'
+ANTHROPIC_API_URL = 'https://api.anthropic.com/v1/messages'
+
+BOTS = {
+  'zpravobot' => {
+    token: ENV['ZPRAVOBOT_TOKEN'],
+    style: 'neutral',
+    time_slot: 'morning',
+    hashtags: '#zpravobot #trendydne'
+  },
+  'pozitivni' => {
+    token: ENV['POZITIVNI_TOKEN'],
+    style: 'positive',
+    time_slot: 'noon',
+    hashtags: '#dobréZprávy #zpravobot'
+  },
+  'sarkasticky' => {
+    token: ENV['SARKASTICKY_TOKEN'],
+    style: 'sarcastic',
+    time_slot: 'evening',
+    hashtags: '#realita #zpravobot'
+  }
+}
+
+# ==========================================
+# COMMAND LINE PARSING
+# ==========================================
+
+options = {}
+OptionParser.new do |opts|
+  opts.banner = "Usage: publish_digest.rb [options]"
+
+  opts.on("--bot BOT", String, "Bot name (zpravobot, pozitivni, sarkasticky)") do |b|
+    options[:bot] = b
+  end
+
+  opts.on("--dry-run", "Test mode - don't actually publish") do
+    options[:dry_run] = true
+  end
+
+  opts.on("--date DATE", String, "Process specific date (YYYY-MM-DD)") do |d|
+    options[:date] = d
+  end
+
+  opts.on("-h", "--help", "Show this help") do
+    puts opts
+    exit
+  end
+end.parse!
+
+bot_name = options[:bot]
+
+unless bot_name && BOTS.key?(bot_name)
+  puts "❌ ERROR: Invalid bot name. Use: zpravobot, pozitivni, or sarkasticky"
+  exit 1
+end
+
+config = BOTS[bot_name]
+
+# Validate environment
+unless config[:token]
+  puts "❌ ERROR: Missing token for @#{bot_name}"
+  puts " Set environment variable: #{bot_name.upcase}_TOKEN"
+  exit 1
+end
+
+unless ENV['ANTHROPIC_API_KEY']
+  puts "❌ ERROR: Missing ANTHROPIC_API_KEY"
+  exit 1
+end
+
+# ==========================================
+# UTILITIES
+# ==========================================
+
+def log(message)
+  timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
+  puts "[#{timestamp}] #{message}"
+end
+
+def extract_url(text)
+  text[/https?:\/\/[^\s<>"]+/]
+end
+
+# ==========================================
+# DATA LOADING
+# ==========================================
+
+def load_posts_from_csv(date = nil)
+  target_date = date || (Time.now - 86400).strftime('%Y-%m-%d')
+
+  unless File.exist?(CSV_PATH)
+    log "❌ CSV file not found: #{CSV_PATH}"
+    exit 1
+  end
+
+  posts = []
+
+  CSV.foreach(CSV_PATH, headers: true, encoding: 'utf-8') do |row|
+    begin
+      created = Time.parse(row['created_at'])
+
+      if created.strftime('%Y-%m-%d') == target_date
+        posts << {
+          'text' => row['text'],
+          'url' => row['url'] || '',
+          'created_at' => row['created_at']
+        }
+      end
+    rescue => e
+      # Skip problematic rows
+      next
+    end
+  end
+
+  log "📊 Loaded #{posts.size} posts from #{target_date}"
+
+  if posts.empty?
+    log "⚠️ No posts found for #{target_date}"
+    exit 1
+  end
+
+  posts
+end
+
+# ==========================================
+# TOPIC EXTRACTION
+# ==========================================
+
+def extract_topics(posts)
+  topics = Hash.new { |h, k| h[k] = [] }
+
+  posts.each do |post|
+    text = post['text'].downcase
+
+    # Add URL to post if not present
+    post['extracted_url'] = extract_url(post['text']) || post['url']
+
+    # Categorize by topic
+    if text.match?(/trump|venezuela|maduro|grónsko|greenland|usa|bílý dům/)
+      topics['🌍 Zahraniční politika'] << post
+    elsif text.match?(/hokej|extraliga|nhl|ms u20/)
+      topics['🏒 Hokej'] << post
+    elsif text.match?(/fotbal|chelsea|liga|gól|penalty/)
+      topics['⚽ Fotbal'] << post
+    elsif text.match?(/film|seriál|stranger things|hudba|koncert|festival|netflix/)
+      topics['🎬 Kultura'] << post
+    elsif text.match?(/počasí|teplota|mráz|sníh|déšť/)
+      topics['❄️ Počasí'] << post
+    elsif text.match?(/politika|parlament|vláda|ministr/)
+      topics['🏛️ Politika'] << post
+    elsif text.match?(/ekonomika|koruna|inflace|mzdy|ceny/)
+      topics['💼 Ekonomika'] << post
+    end
+  end
+
+  # Sort by post count
+  topics = topics.sort_by { |_, posts| -posts.size }.to_h
+
+  log "🔍 Found #{topics.size} topics:"
+  topics.each { |topic, posts| log " #{topic}: #{posts.size} posts" }
+
+  topics
+end
+
+# ==========================================
+# CONTENT FILTERING BY STYLE
+# ==========================================
+
+def filter_topics_by_style(topics, style)
+  case style
+  when 'neutral'
+    topics
+
+  when 'positive'
+    positive_topics = {}
+
+    topics.each do |topic, posts|
+      next if topic.include?('Politika') || topic.include?('Zahraniční')
+
+      positive_posts = posts.select do |post|
+        text = post['text'].downcase
+        has_positive = text.match?(/úspěch|vítěz|rekord|festival|koncert|ocenění|talent/)
+        no_negative = !text.match?(/nehoda|smrt|tragédie|havárie|konflikt|krize/)
+        has_positive && no_negative
+      end
+
+      positive_topics[topic] = positive_posts unless positive_posts.empty?
+    end
+
+    log "💚 Filtered to #{positive_topics.size} positive topics"
+    positive_topics
+
+  when 'sarcastic'
+    sarcastic_topics = {}
+
+    topics.each do |topic, posts|
+      if topic.include?('Zahraniční') || topic.include?('Politika')
+        sarcastic_topics[topic] = posts
+      end
+    end
+
+    if sarcastic_topics.size < 3
+      topics.each do |topic, posts|
+        break if sarcastic_topics.size >= 5
+        sarcastic_topics[topic] = posts unless sarcastic_topics.key?(topic)
+      end
+    end
+
+    log "😏 Selected #{sarcastic_topics.size} topics for sarcasm"
+    sarcastic_topics
+
+  else
+    topics
+  end
+end
+
+# ==========================================
+# CLAUDE API ANALYSIS
+# ==========================================
+
+def analyze_with_claude(posts, topics)
+  log "🤖 Analyzing with Claude API..."
+
+  topic_summary = topics.map { |topic, posts| "#{topic}: #{posts.size}" }.join(', ')
+  sample_texts = posts[0..49].map { |p| p['text'][0..150] }
+
+  prompt = <<~PROMPT
+    Analyzuj #{posts.size} českých/slovenských zpráv z Mastodon instance Zprávobot.news.
+
+    Témata: #{topic_summary}
+
+    Ukázka textů:
+    #{sample_texts[0..9].join("\n---\n")}
+
+    Vrať POUZE JSON (žádný markdown):
+    {
+      "main_topics": ["téma1", "téma2", "téma3"],
+      "sentiment": "neutral|positive|negative",
+      "notable_events": ["událost1", "událost2"]
+    }
+  PROMPT
+
+  uri = URI(ANTHROPIC_API_URL)
+  request = Net::HTTP::Post.new(uri)
+  request['anthropic-version'] = '2023-06-01'
+  request['content-type'] = 'application/json'
+  request['x-api-key'] = ENV['ANTHROPIC_API_KEY']
+
+  request.body = {
+    model: 'claude-sonnet-4-20250514',
+    max_tokens: 1000,
+    messages: [
+      { role: 'user', content: prompt }
+    ]
+  }.to_json
+
+  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
+    http.request(request)
+  end
+
+  if response.code != '200'
+    log "⚠️ Claude API error: #{response.code}"
+    return default_analysis(topics)
+  end
+
+  data = JSON.parse(response.body)
+  text = data['content'][0]['text']
+
+  analysis = JSON.parse(text.gsub(/```json|```/, '').strip)
+  log "✅ Claude analysis complete"
+  analysis
+
+rescue => e
+  log "⚠️ Claude API error: #{e.message}"
+  default_analysis(topics)
+end
+
+def default_analysis(topics)
+  {
+    'main_topics' => topics.keys[0..2],
+    'sentiment' => 'neutral',
+    'notable_events' => []
+  }
+end
+
+# ==========================================
+# TOOT GENERATION
+# ==========================================
+
+def generate_summary_toot(posts_count, topics, style, hashtags)
+  date = (Time.now - 86400).strftime('%d.%m.%Y')
+
+  topic_lines = topics.keys[0..4].map do |topic|
+    count = topics[topic].size
+    "#{topic} (#{count}#{style == 'sarcastic' ? '×' : ' postů'})"
+  end
+
+  case style
+  when 'neutral'
+    summary = <<~TOOT
+      📊 TRENDY DNE (#{date})
+
+      Zpracováno #{posts_count} postů:
+
+      #{topic_lines.join("\n")}
+
+      #{hashtags}
+
+      👇 Odkazy na vybrané články
+    TOOT
+
+  when 'positive'
+    summary = <<~TOOT
+      ☀️ DOBRÉ ZPRÁVY DNE (#{date})
+
+      Z dnešních #{posts_count} zpráv vybrané momenty:
+
+      #{topic_lines[0..3].join("\n")}
+
+      #{hashtags}
+
+      👇 Inspirace na čtení
+    TOOT
+
+  when 'sarcastic'
+    summary = <<~TOOT
+      😏 DNEŠNÍ REALITA (#{date})
+
+      #{posts_count} postů = co se stalo?
+
+      #{topic_lines[0..3].join("\n")}
+
+      #{hashtags}
+
+      👇 Důkazy zmaru
+    TOOT
+  end
+
+  if summary.length > 500
+    summary = summary[0..496] + "..."
+  end
+
+  summary.strip
+end
+
+def generate_links_toot(topics, style)
+  links = []
+  max_topics = 5
+  max_links_per_topic = 2
+
+  topics.keys[0...max_topics].each do |topic|
+    posts = topics[topic]
+    links << "\n#{topic}:"
+
+    selected = []
+    selected << posts[0] if posts[0]
+    selected << posts[posts.size / 2] if posts.size > 1
+
+    selected[0...max_links_per_topic].each do |post|
+      title = post['text'].split("\n")[0][0..50].strip
+      title = title.gsub(/\s+/, ' ')
+
+      url = post['extracted_url']
+      next unless url && !url.empty?
+
+      short_url = url.gsub(/https?:\/\//, '')
+      short_url = short_url[0..37] + '...' if short_url.length > 40
+
+      links << "• #{title}..."
+      links << " 🔗 #{short_url}"
+    end
+  end
+
+  case style
+  when 'neutral'
+    header = "📌 VYBRANÉ ČLÁNKY DNE:"
+    footer = "\n#články #zprávy"
+
+  when 'positive'
+    header = "💚 POZITIVNÍ PŘÍBĚHY DNE:"
+    footer = "\n💙 Máte skvělý den!\n#inspirace"
+
+  when 'sarcastic'
+    header = "🤡 \"BREAKING NEWS\" DNE:"
+    footer = "\n🙃 Zítra: repeat\n#sarkasmus"
+  end
+
+  toot = header + links.join("\n") + footer
+
+  if toot.length > 500
+    truncated_links = links[0..(links.size * 2 / 3)]
+    toot = header + truncated_links.join("\n") + footer
+
+    if toot.length > 500
+      toot = toot[0..496] + "..."
+    end
+  end
+
+  toot.strip
+end
+
+# ==========================================
+# MASTODON PUBLISHING (DIRECT HTTP)
+# ==========================================
+
+def publish_thread(bot_name, summary_toot, links_toot, dry_run: false)
+  config = BOTS[bot_name]
+
+  log "📤 Publishing thread for @#{bot_name}..."
+
+  if dry_run
+    log "🧪 DRY RUN MODE - Not actually publishing"
+    log "\n--- TOOT 1/2 (#{summary_toot.length} chars) ---"
+    log summary_toot
+    log "\n--- TOOT 2/2 (#{links_toot.length} chars) ---"
+    log links_toot
+    log "\n✅ Dry run complete"
+    return [nil, nil]
+  end
+
+  # Publish toot 1
+  uri = URI("#{MASTODON_URL}/api/v1/statuses")
+  request = Net::HTTP::Post.new(uri)
+  request['Authorization'] = "Bearer #{config[:token]}"
+  request['Content-Type'] = 'application/json'
+  request.body = { status: summary_toot, visibility: 'public' }.to_json
+
+  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
+    http.request(request)
+  end
+
+  unless response.code == '200'
+    log "❌ ERROR: #{response.body}"
+    exit 1
+  end
+
+  toot1_data = JSON.parse(response.body)
+  toot1_url = toot1_data['url']
+  toot1_id = toot1_data['id']
+  log "✅ Toot 1/2 published: #{toot1_url}"
+
+  # Publish toot 2 as reply
+  request2 = Net::HTTP::Post.new(uri)
+  request2['Authorization'] = "Bearer #{config[:token]}"
+  request2['Content-Type'] = 'application/json'
+  request2.body = {
+    status: links_toot,
+    in_reply_to_id: toot1_id,
+    visibility: 'public'
+  }.to_json
+
+  response2 = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
+    http.request(request2)
+  end
+
+  log "✅ Toot 2/2 published (thread)"
+
+  [toot1_data, JSON.parse(response2.body)]
+
+rescue => e
+  log "❌ ERROR publishing thread: #{e.message}"
+  exit 1
+end
+
+# ==========================================
+# MAIN EXECUTION
+# ==========================================
+
+def main(bot_name, options = {})
+  log "🚀 Starting Daily Digest for @#{bot_name}"
+  log "=" * 60
+
+  config = BOTS[bot_name]
+  posts = load_posts_from_csv(options[:date])
+
+  log "\n🔍 Extracting topics..."
+  all_topics = extract_topics(posts)
+
+  topics = filter_topics_by_style(all_topics, config[:style])
+
+  if topics.empty?
+    log "⚠️ No suitable topics found for style: #{config[:style]}"
+    exit 1
+  end
+
+  log "\n🤖 Analyzing with Claude..."
+  analysis = analyze_with_claude(posts, topics)
+
+  log "\n📝 Generating content..."
+  summary = generate_summary_toot(posts.size, topics, config[:style], config[:hashtags])
+  links = generate_links_toot(topics, config[:style])
+
+  log " Summary: #{summary.length} chars"
+  log " Links: #{links.length} chars"
+
+  log "\n📤 Publishing to Mastodon..."
+  toot1, toot2 = publish_thread(bot_name, summary, links, dry_run: options[:dry_run])
+
+  log "\n" + "=" * 60
+  log "✅ Digest complete for @#{bot_name}"
+
+  unless options[:dry_run]
+    log "🔗 Thread: #{toot1['url']}" if toot1
+  end
+end
+
+# Run main
+begin
+  main(bot_name, options)
+rescue Interrupt
+  log "\n⚠️ Interrupted by user"
+  exit 130
+rescue => e
+  log "❌ FATAL ERROR: #{e.message}"
+  log " #{e.backtrace[0..4].join("\n ")}"
+  exit 1
+end
diff --git a/run-digest.sh b/run-digest.sh
index 08f32ba..b8693e1 100644
--- a/run-digest.sh
+++ b/run-digest.sh
@@ -1,3 +1,7 @@
+cat >run-digest.sh <<'EOF'
 #!/bin/bash
-source /app/data/zpravobot-digest/config.env
-python3 /app/data/zpravobot-digest/digest-bot.py "$@"
+source /app/data/config.env
+ruby /app/data/publish_digest.rb --bot="$1" "${@:2}"
+EOF
+
+chmod +x run-digest.sh
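
Note on the 500-character cap: in `publish_digest.rb`, both `generate_summary_toot` and `generate_links_toot` shorten an over-long toot by slicing to index 496 and appending `...`. The following is only a minimal sketch of how that cap might be factored into one helper. `TOOT_LIMIT` and `truncate_toot` are hypothetical names that do not appear in the patch, and the sketch counts Ruby string characters, which may differ from how the Mastodon server counts URLs and emoji.

```ruby
# Hypothetical helper, not part of the patch: one place for the length cap
# that generate_summary_toot and generate_links_toot each apply inline.
TOOT_LIMIT = 500

# Returns the text unchanged when it fits. Otherwise keeps the first
# (limit - 3) characters and appends "...", so the result is exactly
# `limit` characters long (same outcome as text[0..496] + "..." for 500).
def truncate_toot(text, limit = TOOT_LIMIT)
  return text if text.length <= limit

  text[0, limit - 3] + '...'
end
```

With such a helper, the two inline `if ... length > 500` blocks would reduce to a single call each, for example `summary = truncate_toot(summary)`.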