Vincelele AI FOMO Skills: A Deep Dive Tutorial
Introduction
In the rapidly evolving landscape of artificial intelligence, staying abreast of the latest developments can feel overwhelming. The vincelele/ai-fomo-skills repository on GitHub offers a compelling solution: personal superalignment skills designed to transform AI information overload into structured, reusable knowledge. This tutorial provides an in-depth exploration of this Python-based project, covering its core principles, practical implementation, and best practices for integration into your personal AI workflow.
What you will learn:
- Understanding the philosophy behind AI FOMO skills and superalignment.
- Setting up and configuring the ai-fomo-skills environment.
- Core components of the skill system and how they work together.
- Building and customizing your own knowledge extraction skills.
- Advanced patterns for multi-source aggregation and filtering.
- Integration with existing AI tools and automation platforms.
- Best practices and common pitfalls to avoid.
Understanding the Philosophy: Superalignment and Information FOMO
The AI Information Overload Problem
The term "FOMO" in the project name stands for Fear Of Missing Out, a common anxiety in the AI community. With new models, papers, tools, and techniques emerging daily, developers and researchers often struggle to filter signal from noise.
The ai-fomo-skills project approaches this problem through the lens of superalignment, a term from AI safety research for keeping highly capable AI systems aligned with human values and intentions. The project applies the idea at a personal scale: the AI tools you use should surface the information most relevant to your specific goals and interests.
Core Principles
The repository is built on several key principles:
- Skill-Based Architecture: Knowledge extraction is modularized into discrete, reusable skills that can be composed and customized.
- Signal Extraction: Raw information is processed into actionable signals — concise, meaningful summaries tailored to specific domains.
- Digest Generation: Multiple signals are aggregated into periodic digests, providing a high-level overview of relevant developments.
- Personalization: The system is designed to be highly adaptable to individual preferences, expertise levels, and focus areas.
How It Differs from Traditional Approaches
Traditional information management often relies on:
- Manual curation such as bookmarking and note-taking.
- Simple keyword-based alerts like Google Alerts and RSS feeds.
- Generic AI summarization tools.
The ai-fomo-skills approach differs by:
- Using structured skill definitions that encapsulate domain-specific knowledge.
- Applying multi-stage processing pipelines for extraction, filtering, and synthesis.
- Enabling composable workflows that can be shared and extended by the community.
Project Architecture and Components
Repository Structure
The repository is organized into several key directories and files:
```text
ai-fomo-skills/
├── README.md
├── skills/
│   ├── __init__.py
│   ├── base_skill.py
│   ├── extraction/
│   │   ├── paper_extractor.py
│   │   ├── repo_monitor.py
│   │   └── social_listener.py
│   ├── filtering/
│   │   ├── relevance_filter.py
│   │   └── novelty_detector.py
│   └── synthesis/
│       ├── digest_generator.py
│       └── signal_aggregator.py
├── config/
│   ├── default.yaml
│   └── profiles/
│       ├── researcher.yaml
│       └── developer.yaml
├── examples/
│   ├── basic_usage.py
│   └── advanced_workflow.py
├── tests/
├── requirements.txt
├── setup.py
└── pyproject.toml
```
Key Components
1. Base Skill Framework (skills/base_skill.py)
The foundation of the system is the BaseSkill abstract class, which defines the interface for all skills:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SkillResult:
    """Standardized result container for skill outputs."""
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=datetime.now)
    confidence: float = 1.0
    source: str = ""


class BaseSkill(ABC):
    """Abstract base class for all FOMO skills.

    Each skill encapsulates a specific capability for processing
    AI-related information. Skills are designed to be composable
    and reusable across different workflows.
    """

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self._initialized = False

    @abstractmethod
    def execute(self, input_data: Any) -> SkillResult:
        """Execute the skill's primary function.

        Args:
            input_data: Input data to process. Can be text,
                structured data, or a combination.

        Returns:
            SkillResult containing the processed output.
        """
        pass

    @abstractmethod
    def describe(self) -> str:
        """Return a human-readable description of the skill."""
        pass

    def validate(self) -> bool:
        """Validate skill configuration and dependencies."""
        return True

    def initialize(self) -> None:
        """Perform any one-time setup operations."""
        self.validate()
        self._initialized = True
```
2. Extraction Skills
These skills are responsible for gathering raw information from various sources:
- paper_extractor.py: Extracts key information from AI research papers (arXiv, conference proceedings).
- repo_monitor.py: Monitors GitHub repositories for new releases, issues, and discussions.
- social_listener.py: Aggregates discussions from platforms like Twitter/X, Hacker News, and Reddit.
3. Filtering Skills
These skills process raw extractions to identify relevant and novel information:
- relevance_filter.py: Uses semantic similarity and keyword matching to score relevance.
- novelty_detector.py: Identifies truly new information by comparing against historical data.
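To make the relevance-scoring idea concrete, here is a minimal sketch that scores a piece of text against a list of interests. It is not the repository's RelevanceFilter: the helper names `tokenize` and `keyword_relevance` and the scoring rule are assumptions made for illustration, and plain token overlap stands in for the semantic similarity the project describes.

```python
import re
from typing import List


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_relevance(text: str, interests: List[str]) -> float:
    """Hypothetical scorer: fraction of interests whose words all appear in the text."""
    if not interests:
        return 0.0
    tokens = tokenize(text)
    matched = sum(1 for interest in interests if tokenize(interest) <= tokens)
    return matched / len(interests)


interests = ["large language models", "AI agents"]
score = keyword_relevance("New benchmark for large language models released", interests)
print(score)
```

A score like this could be stored in a result's confidence field and compared against a threshold, which is how the workflow later in this tutorial uses the filter's output.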
4. Synthesis Skills
These skills combine filtered signals into digestible outputs:
- signal_aggregator.py: Merges signals from multiple sources into a unified view.
- digest_generator.py: Creates formatted digests (Markdown, HTML, plain text) for consumption.
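The synthesis step can be pictured as "rank, group, render". The sketch below shows one way to do that for Markdown output. The `Signal` dataclass and `render_markdown_digest` function are hypothetical stand-ins, not the repository's SignalAggregator or DigestGenerator, whose internals are not shown in this tutorial.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Signal:
    """Hypothetical stand-in for a filtered signal."""
    title: str
    summary: str
    source: str
    confidence: float


def render_markdown_digest(signals: List[Signal], max_items: int = 20) -> str:
    """Keep the highest-confidence signals, group them by source, render Markdown."""
    top = sorted(signals, key=lambda s: s.confidence, reverse=True)[:max_items]
    by_source: Dict[str, List[Signal]] = {}
    for signal in top:
        by_source.setdefault(signal.source, []).append(signal)

    lines = ["# AI Digest", ""]
    for source in sorted(by_source):
        lines.append(f"## {source}")
        for signal in by_source[source]:
            lines.append(f"- **{signal.title}**: {signal.summary}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


digest = render_markdown_digest([
    Signal("Ollama release", "New model support.", "github", 0.9),
    Signal("RAG survey", "Overview of retrieval methods.", "arxiv", 0.8),
    Signal("Minor typo fix", "Docs only.", "github", 0.2),
], max_items=2)
print(digest)
```

Capping by `max_items` before grouping means a low-confidence item never displaces a stronger one from another source.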
Installation and Setup
Prerequisites
- Python 3.9 or higher
- pip package manager
- Git (for cloning the repository)
- An LLM API key (OpenAI, Anthropic, or a local model via Ollama)
Step-by-Step Installation
Step 1: Clone the Repository
```bash
# Clone the repository
git clone https://github.com/vincelele/ai-fomo-skills.git

# Navigate to the project directory
cd ai-fomo-skills
```
Step 2: Create a Virtual Environment
```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On macOS/Linux:
source venv/bin/activate

# On Windows:
# venv\Scripts\activate
```
Step 3: Install Dependencies
```bash
# Install the package in development mode
pip install -e ".[dev]"

# Or install from requirements.txt
pip install -r requirements.txt
```
Step 4: Configure Your Environment
Create a .env file in the project root:
```bash
# .env file

# LLM API Configuration
OPENAI_API_KEY=your_openai_api_key_here

# Or use Anthropic
# ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Optional: Local LLM via Ollama
# OLLAMA_BASE_URL=http://localhost:11434
# OLLAMA_MODEL=llama3

# GitHub API (for repo monitoring)
GITHUB_TOKEN=your_github_personal_access_token

# Database path for storing historical data
DATA_DIR=./data
```
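Note that `os.getenv`, which the workflow scripts below rely on, only sees variables already present in the process environment. This tutorial does not show whether the project loads `.env` on its own, so if it does not, you would need to export the variables in your shell or load the file yourself. The following is a minimal standard-library sketch of the second option. `parse_env_lines` and `load_env_file` are hypothetical helper names, and a dedicated library such as python-dotenv handles quoting and other edge cases more thoroughly.

```python
import os
from pathlib import Path
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> None:
    """Copy values from the file into os.environ without overriding existing ones."""
    env_path = Path(path)
    if not env_path.exists():
        return
    parsed = parse_env_lines(env_path.read_text(encoding="utf-8").splitlines())
    for key, value in parsed.items():
        os.environ.setdefault(key, value)


sample = ["# GitHub API", "GITHUB_TOKEN=abc123", "", "DATA_DIR=./data"]
print(parse_env_lines(sample))
```

Calling `load_env_file()` once at the top of a workflow script, before any `os.getenv` lookup, would be enough for the examples in this tutorial.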
Step 5: Verify Installation
```python
# verify_installation.py
from skills.base_skill import BaseSkill, SkillResult

# Try importing the main modules
from skills.extraction import PaperExtractor, RepoMonitor
from skills.filtering import RelevanceFilter
from skills.synthesis import DigestGenerator

print("All modules imported successfully!")
print("Installation verified.")
```
Run the verification script:
```bash
python verify_installation.py
```
Hands-On: Building Your First Knowledge Extraction Workflow
Let us build a complete workflow that monitors AI repositories and generates a weekly digest.
Step 1: Define Your Profile
Create a custom profile configuration file:
```yaml
# config/profiles/my_profile.yaml
profile:
  name: "My AI FOMO Profile"
  description: "Custom profile for tracking AI developments"

interests:
  - "large language models"
  - "retrieval-augmented generation"
  - "AI agents"
  - "multimodal models"
  - "AI safety and alignment"

repositories:
  - name: "langchain"
    url: "https://github.com/langchain-ai/langchain"
    frequency: "daily"
  - name: "llama_index"
    url: "https://github.com/run-llama/llama_index"
    frequency: "daily"
  - name: "ollama"
    url: "https://github.com/ollama/ollama"
    frequency: "weekly"

sources:
  arxiv:
    enabled: true
    categories:
      - "cs.AI"
      - "cs.CL"
      - "cs.LG"
    keywords:
      - "LLM"
      - "RAG"
      - "agent"
      - "multimodal"

  hacker_news:
    enabled: true
    min_score: 50
    keywords:
      - "AI"
      - "GPT"
      - "LLM"
      - "machine learning"

digest:
  frequency: "weekly"
  format: "markdown"
  output_dir: "./output/digests"
  max_items: 20
  include_summaries: true
  include_sentiment: true
```
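Because a typo in a profile only surfaces once the workflow is running, it can help to check the loaded dictionary up front. The sketch below is one possible check. `validate_profile` is a hypothetical helper, not part of the repository, and the accepted format names other than "markdown" are assumptions based on the formats mentioned earlier.

```python
from typing import Any, Dict, List

# "markdown" appears in the profile above; "html" and "text" are assumed names.
KNOWN_FORMATS = ("markdown", "html", "text")


def validate_profile(profile: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the profile looks usable."""
    problems = []
    if not profile.get("profile", {}).get("name"):
        problems.append("profile.name is missing")
    if not profile.get("interests"):
        problems.append("interests is empty")
    for index, repo in enumerate(profile.get("repositories", [])):
        for key in ("name", "url"):
            if not repo.get(key):
                problems.append(f"repositories[{index}].{key} is missing")
    fmt = profile.get("digest", {}).get("format", "markdown")
    if fmt not in KNOWN_FORMATS:
        problems.append(f"digest.format '{fmt}' is not recognized")
    return problems


example = {
    "profile": {"name": "My AI FOMO Profile"},
    "interests": ["AI agents"],
    "repositories": [{"name": "ollama"}],  # url deliberately left out
    "digest": {"format": "markdown"},
}
print(validate_profile(example))
```

Returning a list rather than raising on the first problem lets you report every issue in one pass.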
Step 2: Create the Workflow Script
```python
# workflows/weekly_digest.py
import os
import yaml
from datetime import datetime, timedelta
from pathlib import Path

from skills.extraction import RepoMonitor, PaperExtractor
from skills.filtering import RelevanceFilter, NoveltyDetector
from skills.synthesis import DigestGenerator, SignalAggregator


def load_profile(profile_path: str) -> dict:
    """Load and parse a profile configuration file."""
    with open(profile_path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)


def extract_signals(profile: dict) -> list:
    """Extract raw signals from configured sources.

    This function uses extraction skills to gather information
    from repositories, papers, and social platforms.
    """
    all_signals = []

    # Extract from GitHub repositories
    repo_monitor = RepoMonitor(
        config={
            "github_token": os.getenv("GITHUB_TOKEN"),
            "repositories": profile.get("repositories", []),
        }
    )
    repo_monitor.initialize()

    for repo in profile.get("repositories", []):
        print(f"Extracting signals from {repo['name']}...")
        result = repo_monitor.execute({
            "repo_url": repo["url"],
            "since": (datetime.now() - timedelta(days=7)).isoformat(),
        })
        all_signals.append(result)

    # Extract from arXiv papers
    if profile.get("sources", {}).get("arxiv", {}).get("enabled"):
        paper_extractor = PaperExtractor(
            config=profile["sources"]["arxiv"]
        )
        paper_extractor.initialize()

        papers_result = paper_extractor.execute({
            "categories": profile["sources"]["arxiv"]["categories"],
            "keywords": profile["sources"]["arxiv"]["keywords"],
            "max_results": 50,
        })
        all_signals.append(papers_result)

    return all_signals


def filter_signals(signals: list, profile: dict) -> list:
    """Apply relevance and novelty filters to raw signals."""
    # Apply relevance filter
    relevance_filter = RelevanceFilter(
        config={
            "interests": profile.get("interests", []),
        }
    )
    relevance_filter.initialize()

    filtered = []
    for signal in signals:
        result = relevance_filter.execute(signal)
        if result.confidence >= 0.6:  # Relevance threshold
            filtered.append(result)

    # Apply novelty detector
    novelty_detector = NoveltyDetector(
        config={
            "data_dir": os.getenv("DATA_DIR", "./data"),
        }
    )
    novelty_detector.initialize()

    novel_signals = []
    for signal in filtered:
        result = novelty_detector.execute(signal)
        if result.metadata.get("is_novel", True):
            novel_signals.append(result)

    return novel_signals


def generate_digest(signals: list, profile: dict) -> str:
    """Generate a formatted digest from filtered signals."""
    # Aggregate signals first
    aggregator = SignalAggregator()
    aggregated = aggregator.execute(signals)

    # Generate the digest
    generator = DigestGenerator(
        config={
            "format": profile.get("digest", {}).get("format", "markdown"),
            "max_items": profile.get("digest", {}).get("max_items", 20),
            "include_summaries": profile.get("digest", {}).get(
                "include_summaries", True
            ),
        }
    )
    generator.initialize()

    digest_result = generator.execute(aggregated)
    return digest_result.content


def main():
    """Main workflow execution function."""
    print("=" * 50)
    print("AI FOMO Skills - Weekly Digest Generator")
    print("=" * 50)

    # Load profile
    profile_path = "config/profiles/my_profile.yaml"
    profile = load_profile(profile_path)
    print(f"Loaded profile: {profile['profile']['name']}")

    # Step 1: Extract signals
    print("\n[Step 1/3] Extracting signals...")
    raw_signals = extract_signals(profile)
    print(f"Extracted {len(raw_signals)} raw signal groups")

    # Step 2: Filter signals
    print("\n[Step 2/3] Filtering signals...")
    filtered_signals = filter_signals(raw_signals, profile)
    print(f"Filtered to {len(filtered_signals)} relevant, novel signals")

    # Step 3: Generate digest
    print("\n[Step 3/3] Generating digest...")
    digest = generate_digest(filtered_signals, profile)

    # Save digest
    output_dir = Path(
        profile.get("digest", {}).get("output_dir", "./output")
    )
    output_dir.mkdir(parents=True, exist_ok=True)

    timestamp = datetime.now().strftime("%Y-%m-%d")
    output_file = output_dir / f"digest_{timestamp}.md"

    # Write as UTF-8 so non-ASCII characters in summaries survive on any platform
    with open(output_file, "w", encoding="utf-8") as f:
        f.write(digest)

    print(f"\nDigest saved to: {output_file}")
    print(f"Total characters: {len(digest)}")
    print("\nDone!")


if __name__ == "__main__":
    main()
```
Step 3: Run the Workflow
```bash
python workflows/weekly_digest.py
```
Advanced Usage: Custom Skill Development
Creating a Custom Extraction Skill
One of the most powerful aspects of ai-fomo-skills is the ability to create custom skills. Let us build a skill that monitors specific AI newsletters:
```python
# skills/extraction/newsletter_monitor.py
from typing import Any, Dict, List, Optional
from datetime import datetime

from skills.base_skill import BaseSkill, SkillResult


class NewsletterMonitor(BaseSkill):
    """Monitor AI newsletters for relevant content.

    This skill subscribes to and processes AI newsletters,
    extracting key insights and developments.
    """

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        super().__init__(config)
        self.newsletters = self.config.get("newsletters", [])
        self.llm_client = None

    def describe(self) -> str:
        return (
            "Monitors AI newsletters for relevant content, "
            "extracting key insights and developments."
        )

    def validate(self) -> bool:
        if not self.newsletters:
            print("Warning: No newsletters configured.")
        return True

    def initialize(self) -> None:
        """Set up the LLM client for processing newsletter content."""
        from skills.utils import get_llm_client
        self.llm_client = get_llm_client()
        self._initialized = True

    def execute(self, input_data: Any) -> SkillResult:
        """Process newsletter content and extract signals.

        Args:
            input_data: Dictionary containing:
                - content: Raw newsletter text content
                - source: Newsletter name/source
                - date: Publication date

        Returns:
            SkillResult with extracted insights.
        """
        if not self._initialized:
            self.initialize()

        content = input_data.get("content", "")
        source = input_data.get("source", "unknown")
        date = input_data.get("date", datetime.now().isoformat())

        # Use LLM to extract key insights
        prompt = (
            f"Analyze the following AI newsletter content and extract:\n"
            f"1. Key announcements or developments\n"
            f"2. New tools, models, or frameworks mentioned\n"
            f"3. Notable research findings\n"
            f"4. Potential impact assessment (high/medium/low)\n\n"
            f"Newsletter source: {source}\n"
            f"Date: {date}\n\n"
            f"Content:\n{content[:4000]}\n\n"
            f"Provide your analysis in structured JSON format."
        )

        response = self.llm_client.generate(prompt)

        return SkillResult(
            content=response,
            metadata={
                "source": source,
                "date": date,
                "content_length": len(content),
            },
            confidence=0.85,
            source=f"newsletter:{source}",
        )

    def batch_process(self, newsletters: List[Dict]) -> List[SkillResult]:
        """Process multiple newsletters in batch."""
        results = []
        for newsletter in newsletters:
            result = self.execute(newsletter)
            results.append(result)
        return results
```
Registering Your Custom Skill
To make your custom skill available in workflows, register it using the central registry:
```python
# skills/registry.py
from typing import Dict, Type
from skills.base_skill import BaseSkill


class SkillRegistry:
    """Central registry for all available skills."""

    _skills: Dict[str, Type[BaseSkill]] = {}

    @classmethod
    def register(cls, name: str, skill_class: Type[BaseSkill]) -> None:
        """Register a skill with a unique name."""
        if name in cls._skills:
            print(f"Warning: Overwriting existing skill '{name}'")
        cls._skills[name] = skill_class

    @classmethod
    def get(cls, name: str) -> Type[BaseSkill]:
        """Retrieve a skill class by name."""
        if name not in cls._skills:
            raise KeyError(f"Skill '{name}' not found in registry")
        return cls._skills[name]

    @classmethod
    def list_skills(cls) -> Dict[str, str]:
        """List all registered skills with their descriptions."""
        return {
            name: skill_class().describe()
            for name, skill_class in cls._skills.items()
        }


# Register built-in skills
from skills.extraction import PaperExtractor, RepoMonitor
from skills.filtering import RelevanceFilter
from skills.synthesis import DigestGenerator

SkillRegistry.register("paper_extractor", PaperExtractor)
SkillRegistry.register("repo_monitor", RepoMonitor)
SkillRegistry.register("relevance_filter", RelevanceFilter)
SkillRegistry.register("digest_generator", DigestGenerator)
```
Now register your custom skill:
```python
# Register the custom newsletter monitor
from skills.extraction.newsletter_monitor import NewsletterMonitor
from skills.registry import SkillRegistry

SkillRegistry.register("newsletter_monitor", NewsletterMonitor)
```
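Once skills are registered by name, a workflow can build its pipeline from names stored in configuration rather than from hard-coded imports. The sketch below illustrates that lookup pattern in isolation. It uses a simplified stand-in (`REGISTRY`, `EchoSkill`, `build_skill`, all hypothetical) so that it runs without the repository installed.

```python
from typing import Any, Callable, Dict, Optional

# Simplified stand-in for a skill registry: maps names to skill classes.
REGISTRY: Dict[str, Callable[..., Any]] = {}


class EchoSkill:
    """Hypothetical toy skill used only to demonstrate lookup by name."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}

    def execute(self, input_data: str) -> str:
        prefix = self.config.get("prefix", "")
        return f"{prefix}{input_data}"


def build_skill(name: str, config: Optional[Dict[str, Any]] = None) -> Any:
    """Look up a registered skill class by name and instantiate it with its config."""
    if name not in REGISTRY:
        raise KeyError(f"Skill '{name}' not found in registry")
    return REGISTRY[name](config)


REGISTRY["echo"] = EchoSkill

# Build a skill from a name and a config dict, as a config-driven workflow might.
skill = build_skill("echo", {"prefix": "signal: "})
print(skill.execute("new release"))
```

The same idea would apply to the registry above: fetch the class by name, then pass it the relevant section of your profile as its config.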
Integration with External Tools
Using with Ollama for Local LLM Processing
If you prefer to run LLMs locally for privacy or cost reasons, the project supports Ollama:
```python
# config/llm_config.py
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    """Configuration for LLM interactions."""
    provider: str = "openai"  # openai, anthropic, ollama
    model: str = "gpt-4"
    temperature: float = 0.3
    max_tokens: int = 2000

    # Ollama-specific settings
    ollama_base_url: str = "http://localhost:11434"
    ollama_model: str = "llama3"


def get_llm_client(config: Optional[LLMConfig] = None):
    """Factory function to create appropriate LLM client."""
    config = config or LLMConfig()

    if config.provider == "ollama":
        from skills.clients.ollama_client import OllamaClient
        return OllamaClient(
            base_url=config.ollama_base_url,
            model=config.ollama_model,
        )
    elif config.provider == "anthropic":
        from skills.clients.anthropic_client import AnthropicClient
        return AnthropicClient(model=config.model)
    else:
        from skills.clients.openai_client import OpenAIClient
        return OpenAIClient(model=config.model)
```
Automation with Scheduled Tasks
Set up automated digest generation using cron:
```bash
# Add to crontab (run every Monday at 9 AM)
0 9 * * 1 cd /path/to/ai-fomo-skills && /path/to/venv/bin/python workflows/weekly_digest.py >> /path/to/logs/fomo_digest.log 2>&1
```
Or run a long-lived scheduler script built on the third-party schedule package:
```python
# scheduler.py
import schedule
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def run_weekly_digest():
    """Execute the weekly digest workflow."""
    logger.info("Starting scheduled weekly digest generation...")
    try:
        # Import and run the main workflow
        from workflows.weekly_digest import main
        main()
        logger.info("Weekly digest generated successfully.")
    except Exception as e:
        logger.error(f"Error generating digest: {e}")


# Schedule for every Monday at 9:00 AM
schedule.every().monday.at("09:00").do(run_weekly_digest)

if __name__ == "__main__":
    logger.info("Scheduler started. Waiting for next scheduled run...")
    while True:
        schedule.run_pending()
        time.sleep(60)
```
Comparison: AI FOMO Skills vs. Alternative Approaches
Compared to Traditional RSS Aggregators
AI FOMO Skills advantages:
- Semantic understanding of content relevance rather than keyword matching only.
- Multi-source integration beyond RSS (GitHub API, social platforms).
- AI-powered summarization and insight extraction.
- Novelty detection to avoid redundant information.
Traditional RSS advantages:
- Lower latency with near real-time updates.
- No LLM API costs involved.
- Simpler setup for basic use cases.
Compared to Generic AI Summarization Tools
AI FOMO Skills advantages:
- Domain-specific skill design tailored for AI topics.
- Composable and extensible architecture.
- Structured output with confidence scores.
- Historical tracking and novelty detection built in.
Generic tools advantages:
- Broader applicability across domains.
- Often have polished user interfaces.
- Lower barrier to entry for non-technical users.
Compared to Manual Curation
AI FOMO Skills advantages:
- Scalability: can process far more sources than manual review allows.
- Consistency: applies the same criteria uniformly.
- Time efficiency: automated processing saves hours of manual work.
Manual curation advantages:
- Higher quality judgment for nuanced or subjective topics.
- No risk of AI hallucination or misclassification.
- Better understanding of context and community dynamics.
Best Practices
1. Start Small, Scale Gradually
Begin with a focused set of interests and sources:
```yaml
# Start with this minimal config
interests:
  - "large language models"  # Just one or two topics

repositories:
  - name: "ollama"
    url: "https://github.com/ollama/ollama"
    frequency: "daily"  # Start with one repository

sources:
  arxiv:
    enabled: true
    categories: ["cs.AI"]  # Start with one category
    keywords: ["LLM"]
```
2. Tune Relevance Thresholds
The relevance threshold significantly impacts output quality:
```python
# Too low: you will get too much noise
RELEVANCE_THRESHOLD = 0.3  # Not recommended

# Too high: you will miss important signals
RELEVANCE_THRESHOLD = 0.95  # Too restrictive

# Recommended starting point
RELEVANCE_THRESHOLD = 0.6  # Good balance

# For fast-moving fields, lower slightly
RELEVANCE_THRESHOLD = 0.5
```
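One practical way to choose a value is to replay a batch of already-scored signals against several candidate thresholds and see how many survive each. The sketch below does that with made-up scores. The numbers and the `kept_per_threshold` helper are illustrative assumptions, not output from the project.

```python
from typing import Dict, List

# Hypothetical relevance scores for six candidate signals.
scores: List[float] = [0.25, 0.4, 0.55, 0.62, 0.8, 0.97]


def kept_per_threshold(scores: List[float], thresholds: List[float]) -> Dict[float, int]:
    """Count how many signals would survive each candidate threshold."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


counts = kept_per_threshold(scores, [0.3, 0.5, 0.6, 0.95])
for threshold, kept in counts.items():
    print(f"threshold={threshold:.2f} keeps {kept} of {len(scores)}")
```

Running the same comparison on a week of your own scored signals shows directly how much each setting would trim.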
3. Implement Proper Error Handling
```python
# skills/utils/error_handling.py
import logging
import time
from functools import wraps
from typing import Callable

logger = logging.getLogger(__name__)


def resilient_execution(max_retries: int = 3, delay: float = 1.0):
    """Decorator for resilient skill execution with retries."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_exception = None

            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_exception = e
                    logger.warning(
                        f"Attempt {attempt + 1}/{max_retries} failed: {e}"
                    )
                    # Wait longer after each failure (linear backoff)
                    if attempt < max_retries - 1:
                        time.sleep(delay * (attempt + 1))

            logger.error(f"All {max_retries} attempts failed.")
            raise last_exception

        return wrapper
    return decorator
```
4. Manage API Costs
```python
# skills/utils/cost_tracker.py
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CostTracker:
    """Track LLM API usage and costs."""
    total_tokens: int = 0
    total_requests: int = 0
    estimated_cost: float = 0.0
    history: list = field(default_factory=list)

    # Cost per 1K tokens (adjust based on your plan)
    COST_PER_1K_TOKENS = 0.03  # GPT-4 pricing

    def record(self, tokens: int) -> None:
        """Record a single API call."""
        self.total_tokens += tokens
        self.total_requests += 1
        self.estimated_cost = (
            (self.total_tokens / 1000) * self.COST_PER_1K_TOKENS
        )
        self.history.append({
            "timestamp": datetime.now().isoformat(),
            "tokens": tokens,
            "running_cost": self.estimated_cost,
        })

    def report(self) -> str:
        """Generate a cost report."""
        return (
            f"Total requests: {self.total_requests}\n"
            f"Total tokens: {self.total_tokens:,}\n"
            f"Estimated cost: ${self.estimated_cost:.2f}"
        )


# Usage
cost_tracker = CostTracker()

# After each LLM call:
# cost_tracker.record(response.usage.total_tokens)

# Periodically:
# print(cost_tracker.report())
```
Common Pitfalls and Solutions
Pitfall 1: Overloading with Sources
Problem: Adding too many sources leads to noise and high API costs.
Solution: Curate sources carefully and use aggressive filtering:
```python
# Limit sources per digest cycle
MAX_SOURCES_PER_RUN = 10
MAX_ITEMS_PER_SOURCE = 5
```
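How these limits are enforced is not shown here, so the following is only a sketch of one way to apply them before any LLM call is made. The `cap_items` helper and the per-item `score` field are assumptions for illustration.

```python
from typing import Dict, List

MAX_SOURCES_PER_RUN = 10
MAX_ITEMS_PER_SOURCE = 5


def cap_items(items_by_source: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Keep at most MAX_SOURCES_PER_RUN sources and the top-scoring items from each."""
    capped = {}
    for source in list(items_by_source)[:MAX_SOURCES_PER_RUN]:
        ranked = sorted(
            items_by_source[source], key=lambda item: item["score"], reverse=True
        )
        capped[source] = ranked[:MAX_ITEMS_PER_SOURCE]
    return capped


# Hypothetical input: 12 sources with 8 scored items each.
raw = {
    f"source_{i}": [{"id": j, "score": j / 10} for j in range(8)]
    for i in range(12)
}
capped = cap_items(raw)
print(len(capped), sum(len(items) for items in capped.values()))
```

Capping before summarization keeps the number of LLM requests per run bounded by the product of the two limits.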
Pitfall 2: Ignoring Historical Context
Problem: Without novelty detection, you will see redundant information across digests.
Solution: Always include novelty detection in your workflow:
```python
# Always use novelty detection
from skills.filtering import NoveltyDetector

detector = NoveltyDetector(config={"data_dir": "./data"})
detector.initialize()
```
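To show what "comparing against historical data" can mean in practice, here is a minimal sketch of exact-duplicate detection using content fingerprints persisted between runs. It is not the repository's NoveltyDetector: the `SeenStore` class and its JSON file format are hypothetical, and hashing only catches repeats of the same text, not paraphrases.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Set


class SeenStore:
    """Hypothetical helper: remembers content fingerprints between runs in a JSON file."""

    def __init__(self, path: Path):
        self.path = path
        self.seen: Set[str] = set()
        if path.exists():
            self.seen = set(json.loads(path.read_text(encoding="utf-8")))

    @staticmethod
    def fingerprint(text: str) -> str:
        """Hash case- and whitespace-normalized text so trivial edits do not look novel."""
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def is_novel(self, text: str) -> bool:
        """Return True the first time a text is seen, then record it."""
        digest = self.fingerprint(text)
        if digest in self.seen:
            return False
        self.seen.add(digest)
        return True

    def save(self) -> None:
        """Persist the fingerprints so the next run can load them."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(sorted(self.seen)), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    store = SeenStore(Path(tmp) / "seen.json")
    first = store.is_novel("Example model v2 released")
    repeat = store.is_novel("example  model v2   RELEASED")
    store.save()
    # A fresh store simulates the next digest run loading the saved history.
    reloaded = SeenStore(Path(tmp) / "seen.json")
    after_reload = reloaded.is_novel("Example model v2 released")

print(first, repeat, after_reload)
```

Catching reworded repeats would need a similarity measure, such as embeddings, on top of this exact-match baseline.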
Pitfall 3: Not Customizing for Your Role
Problem: Generic configurations do not align with your specific needs.
Solution: Create role-specific profiles:
```yaml
# For researchers
interests:
  - "novel architectures"
  - "training techniques"
  - "evaluation benchmarks"
  - "theoretical foundations"

---
# For developers (a separate profile)
interests:
  - "production deployment"
  - "API design"
  - "prompt engineering"
  - "tool integration"
```
Real-World Use Cases
Use Case 1: AI Research Team Lead
A research team lead uses ai-fomo-skills to:
- Monitor key subfields including alignment, scaling laws, and multimodal AI.
- Generate weekly team briefings with the most relevant papers.
- Track competitor publications and open-source releases.
- Maintain a knowledge base of important developments.
Use Case 2: AI Startup CTO
A CTO uses the system to:
- Track emerging tools and frameworks relevant to their tech stack.
- Monitor GitHub repositories for breaking changes and new features.
- Generate investor-ready summaries of AI landscape trends.
- Identify potential acquisition targets or partnership opportunities.
Use Case 3: AI Educator
An educator uses the project to:
- Curate up-to-date examples for courses and tutorials.
- Identify trending topics for curriculum updates.
- Generate reading lists for students based on current developments.
- Track pedagogical innovations in AI education.
Conclusion
The vincelele/ai-fomo-skills repository offers a thoughtful, modular approach to managing AI information overload. Its skill-based architecture, combined with the power of LLMs for semantic understanding, provides a significant advantage over traditional information management approaches.
Key takeaways:
- The skill-based architecture makes the system highly modular and extensible.
- Multi-stage processing (extraction, filtering, synthesis) ensures high-quality outputs.
- Personalization through profile configurations allows tailoring to specific needs and roles.
- The project is well-suited for developers, researchers, and technical leaders who need to stay current without being overwhelmed.
Recommendations:
- Start with the default configuration and a focused set of interests.
- Gradually add sources and tune relevance thresholds based on output quality.
- Implement cost tracking to manage API expenses.
- Contribute custom skills back to the community to grow the ecosystem.
Further Reading and Resources:
- Repository: github.com/vincelele/ai-fomo-skills
- Related concepts: Superalignment research at OpenAI, Personal Knowledge Management (PKM) systems, AI-augmented information retrieval.
- Complementary tools: Ollama for local LLM inference, Obsidian for personal knowledge management, feedparser for RSS processing.
By leveraging ai-fomo-skills, you can transform the anxiety of AI information overload into a structured, actionable knowledge pipeline that keeps you informed without consuming all your time. The project exemplifies how AI itself can be used to manage the complexity of the AI ecosystem — a recursive solution to a recursive problem.