LLM & AI

Large Language Models and AI agents.

4725 skills
llm-ai
1

livekit-prompt-builder

Guide for creating effective prompts and instructions for LiveKit voice agents. Use when building conversational AI agents with the LiveKit Agents framework, including (1) Creating new voice agent prompts from scratch, (2) Improving existing agent instructions, (3) Optimizing prompts for text-to-speech output, (4) Integrating tool/function calling capabilities, (5) Building multi-agent systems with handoffs, (6) Ensuring voice-friendly formatting and brevity for natural conversations, (7) Iteratively improving prompts based on testing and feedback, (8) Building industry-specific agents (debt collection, healthcare, banking, customer service, front desk).

Okeysir198
data-ai
llm-ai
1

copywriting

Conversion copywriting formulas, headline templates, email copy patterns, landing page structures, CTA optimization, and writing style extraction. Activate for writing high-converting copy, crafting headlines, email campaigns, landing pages, or applying custom writing styles from assets/writing-styles/ directory.

hotriluan
data-ai
llm-ai
1

editorial-image-generator

Creates sophisticated HBR-style editorial illustrations for any content using AI understanding and visual analysis. Use when creating conceptual illustrations, analyzing generated images, or compositing logos. Works with any brand configuration. AI-native approach: Claude reasons about content rather than using rigid templates.

tallyfy
data-ai
llm-ai
1

command-design

Design and implement intuitive, user-friendly bot command interfaces with comprehensive help systems.

forever19735
data-ai
llm-ai
1

social-media-caption-writer

Creates engaging social media captions optimized for each platform (Instagram, LinkedIn, Twitter/X, Facebook, TikTok). Use when preparing social content, building content calendars, or repurposing content across platforms. Includes hashtag suggestions and CTA variations.

fracabu
data-ai
llm-ai
1

copywriting

Conversion copywriting formulas, headline templates, email copy patterns, landing page structures, CTA optimization, and writing style extraction. Activate for writing high-converting copy, crafting headlines, email campaigns, landing pages, or applying custom writing styles from assets/writing-styles/ directory.

hotriluan
data-ai
llm-ai
1

nanobanana

Generate photorealistic images with perfect text rendering using Nano Banana Pro (Gemini 3 Pro Image). Automatically enhances prompts for optimal results with this reasoning-based image model. Use when users request image generation, logos, infographics, posters, diagrams, or any visual content.

johnpsasser
data-ai
llm-ai
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting content from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

zircote
data-ai
llm-ai
1

video-generation

AI video generation using Google Veo or OpenAI Sora. Use when user wants to generate, create, or make videos from text prompts.

ThrownLemon
data-ai
llm-ai
1

video-to-article

Use this skill when the user wants to convert a lecture, presentation, or talk video into text formats (transcript, outline, or article). Trigger when user mentions processing video recordings, creating transcripts from lectures, or generating articles from recorded presentations.

lttr
data-ai
llm-ai
1

ken-style-social-media

Generate authentic social media content in Ken's distinctive multilingual style - mixing Cantonese, English, and Traditional Chinese for personal branding, product promotion, and educational content. Perfect for creating viral posts that combine vulnerability, authority, and conversion optimization.

kuse-ai
data-ai
llm-ai
1

ask-gemini

This skill should be used when the user asks to "ask Gemini", "get Gemini's opinion", "have Gemini review", "improve writing style", "make less AI-sounding", "get feedback on article", "review this draft", or needs a second opinion on content, writing, code, or design. Supports text questions and up to 10 images.

b-open-io
data-ai
llm-ai
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting content from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.

nb150301
data-ai
llm-ai
1

imagegen

AI image generation using Google Gemini and OpenAI GPT-Image. Generate, edit, iterate, and create assets.

ThrownLemon
data-ai
llm-ai
1

fal-ai

Generate images, videos, and speech using fal.ai API. Use when asked to:
- Create/generate images from text prompts (Flux, Flux Kontext, Nano Banana Pro, Recraft)
- Generate videos from text prompts (Veo 3, Kling v2.6, Hunyuan, LTX, Minimax, Wan)
- Animate images into videos (Kling v2.6, Kling, Luma, Runway)
- Convert text to speech or clone voices (F5-TTS, Kokoro)
Trigger phrases: "generate image", "create video", "text-to-video", "animate this image", "make a video of", "voice cloning", "text-to-speech", "fal.ai", "veo", "kling"

nunomen
data-ai
llm-ai
1

gemini-audio

Guide for implementing Google Gemini API audio capabilities - analyze audio with transcription, summarization, and understanding (up to 9.5 hours), plus generate speech with controllable TTS. Use when processing audio files, creating transcripts, analyzing speech/music/sounds, or generating natural speech from text.

AIA-11-HN-MIB
data-ai
llm-ai
1

video-director

AI video generation using Google Veo 3.1 and Imagen. Use when the user wants to create videos, advertisements, promotional content, or any video generation task. Automatically activates for requests like "create a video", "make an advertisement", "generate a promotional video".

rohittp0
data-ai
llm-ai
1

gemini-imagegen

Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

unclecode
data-ai
llm-ai
1

podcast-generator

Converts blog posts into podcast audio using the Kokoro TTS engine. Use this when asked to generate podcasts, create audio, or run TTS for blog posts.

cmwen
data-ai
llm-ai
1

ai-multimodal

Analyze images/audio/video with Gemini API (better vision than Claude). Generate images (Imagen 4), videos (Veo 3). Use for vision analysis, transcription, OCR, design extraction, multimodal AI.

hotriluan
data-ai
llm-ai
1

tts-mcp-server

This skill provides comprehensive guidance for using TTS-MCP-SERVER with ElevenLabs eleven_v3 model for sultry, seductive, ominous, and emotionally expressive voice output. Use this skill when generating voice announcements, applying audio tags for dark/sexy/mischievous emotional expression, crafting "naughty professional accountability partner" scripts, or troubleshooting TTS integration. Activates on speak_text calls, voice output requests, ElevenLabs TTS operations, audio tag usage, or voice announcements during automated workflows.

mamba-mental
data-ai
llm-ai
1

nano-banana

AI image generation using Nano Banana PRO (Gemini 3 Pro Image) and Nano Banana (Gemini 2.5 Flash Image). Use this skill when: (1) Generating images from text prompts, (2) Editing existing images, (3) Creating professional visual assets like infographics, logos, product shots, stickers, (4) Working with character consistency across multiple images, (5) Creating images with accurate text rendering, (6) Any task requiring AI-generated visuals. Triggers on: 'generate image', 'create image', 'make a picture', 'design a logo', 'create infographic', 'AI image', 'nano banana', or any image generation request.

horuz-ai
data-ai
llm-ai
1

catskill-writer

Write Catskill Crew newsletter content in Michael's voice. Use when writing HAPPENINGS, REPORT, BULLETIN sections or assembling a complete newsletter edition.

aniketpanjwani
data-ai
llm-ai
1

gemini-imagegen

This skill should be used when generating and editing images using the Gemini API (Nano Banana Pro). It applies when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

nunomen
data-ai