category focus

LLM & AI

Large Language Models and AI agents.

4725 skills

llm-ai

wavecap-evaluate

Evaluate WaveCap audio analysis and transcription accuracy. Use when the user wants to run regression tests, compare transcriptions against ground truth, calculate WER/CER metrics, or assess overall system quality.

TobiasWooldridge

data-ai

open

llm-ai

learning-glossary-management

Create multilingual glossaries for educational content, maintain terminology consistency across translations, build translation memory databases, and define preferred terms by domain and region. Use when managing translation terminology. Activates on "glossary", "terminology management", or "translation memory".

pauljbernard

data-ai

open

llm-ai

stt-integration

ElevenLabs Speech-to-Text transcription workflows with Scribe v1 supporting 99 languages, speaker diarization, and Vercel AI SDK integration. Use when implementing audio transcription, building STT features, integrating speech-to-text, setting up Vercel AI SDK with ElevenLabs, or when user mentions transcription, STT, Scribe v1, audio-to-text, speaker diarization, or multi-language transcription.

vanman2024

data-ai

open

llm-ai

intelligent-text-chunking

Split large texts into meaningful, AI-optimized chunks while preserving semantic coherence and document structure. Use when processing large documents for AI training, RAG systems, or when you need to break down content while maintaining context and relationships.

findinfinitelabs

data-ai

open

llm-ai

nlp-engineer

Expert in Natural Language Processing, designing systems for text classification, NER, translation, and LLM integration using Hugging Face, spaCy, and LangChain. Use when building NLP pipelines, text analysis, or LLM-powered features. Triggers include "NLP", "text classification", "NER", "named entity", "sentiment analysis", "spaCy", "Hugging Face", "transformers".

404kidwiz

data-ai

open

llm-ai

medical-ocr

Medical document OCR processing for extracting structured clinical data from medical images (prescriptions, lab results, clinical notes). Uses Google Cloud Vision for text extraction and medical NLP for entity recognition. Deploy when processing healthcare documents, extracting patient data, or converting medical images to structured formats.

Fadil369

data-ai

open

llm-ai

court-record-transcriber

Development skill for CaseMark's Court Recording Transcriber - an AI-powered application for transcribing court recordings with speaker identification, synchronized playback, search, and legal document exports. Built with Next.js 16, PostgreSQL, Drizzle ORM, wavesurfer.js, and Case.dev APIs. Use this skill when: (1) Working on or extending the court-record-transcriber codebase, (2) Integrating with Case.dev transcription APIs, (3) Working with audio playback/waveforms, (4) Building transcript export features, or (5) Adding speaker identification logic.

CaseMark

data-ai

open

llm-ai

stenography-mastery

Complete stenography guide for court reporting and legal transcription. Use when building steno projects from basic to professional level, learning court reporting workflows, optimizing legal dictionaries, setting up Plover for courtroom use, creating specialized legal briefs, or developing speed for legal documentation.

hafiznaveedchuhan-ctrl

data-ai

open

llm-ai

transcribe-walk

Process walk recordings into usable research material. Transcription workflow, insight extraction, integration with pipeline. Use after WALK stage recording.

pentaxis93

data-ai

open

llm-ai

media-understand

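Use AI to understand and analyze multimedia content (images, video, audio). Use when the user wants to understand an image (理解图片), analyze a video (分析视频), convert audio to text (音频转文字), ask questions about a video (视频问答), understand media, transcribe audio, describe an image, or asks what is in a given video, image, or audio file.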

InfQuest

data-ai

open

llm-ai

embed

Generate text embeddings using Qwen3 models via HuggingFace TEI. Use this skill to embed texts, configure the embedding service, or batch process documents. Invoke with /embed.

wellcomecollection

data-ai

open

llm-ai

deodorizing-ai-text

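A skill that detects and removes the unnatural feel of AI-generated Japanese (the "AI smell", AI臭) and "deodorizes" the text into natural, human-sounding prose. Activates on requests such as "この文章を脱臭して" (deodorize this text), "AI臭を消して" (remove the AI smell), "人間らしい文章にして" (make it read like a human wrote it), "自然な日本語に直して" (fix it into natural Japanese), or "翻訳調を修正" (fix the translated-sounding tone). Supports both prevention through prompt engineering and treatment through post-editing.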

inakam

data-ai

open

llm-ai

qt-translation-assistant

Use when user requests translating Qt project localization files (TS files), automating translation workflows, or setting up multilingual support for Qt applications. This skill uses parallel processing with ThreadPoolExecutor to translate TS (Translation Source) files efficiently.

re2zero

data-ai

open

llm-ai

rag-personalizer

Transform textbook content based on the 10-dimension user profile to provide personalized learning experiences. Agent: AIEngineer

FAIQahm

data-ai

open

llm-ai

wenyan-mode

This skill is ALWAYS ACTIVE once installed. Automatically applies classical Chinese (文言文) writing style to all responses. Uses concise, elegant expressions while keeping technical terms intact. No trigger phrase needed - activates on every response.

VdustR

data-ai

open

llm-ai

wavecap-llm

Configure WaveCap LLM-based transcription correction. Use when the user wants to enable/disable LLM correction, change models, tune prompts, or optimize correction quality on Apple Silicon.

TobiasWooldridge

data-ai

open

llm-ai

rag-ingestion-v1

Document ingestion pipeline - docs to chunks to metadata for RAG

atiati82

data-ai

open

llm-ai

gpu-media-processor

Process audio, video, and media on cloud GPUs. Transcribe with Whisper, clone voices, generate videos, upscale images, and run batch media processing. All results sync back to your Mac.

gpu-cli

data-ai

open

llm-ai

wavecap-hallucination

Configure WaveCap hallucination detection and prevention. Use when Whisper outputs gibberish, repeated phrases, or phantom text on silent audio.

TobiasWooldridge

data-ai

open

llm-ai

gen-sora-video

Generate a Sora video from a text prompt via an Azure OpenAI endpoint, then download the resulting .mp4 locally. Use when the user asks to generate a Sora video/video.mp4 from a prompt or wants the generated video saved to disk.

jacwu

data-ai

open

llm-ai

gemini-video-understanding

Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).

levanminhduc

data-ai

open

llm-ai

comfyui

Generates images and videos using ComfyUI node-based workflows. Use when creating AI-generated assets, text-to-image, text-to-video, image-to-video, running Stable Diffusion, Flux, HunyuanVideo, or when user mentions "comfy," "ComfyUI," "generate image," "generate video," "AI art," "diffusion model," or needs visual content for courses/projects.

webmasterarbez

data-ai

open

llm-ai

unsloth-stt

Fine-tuning Speech-to-Text models like Whisper using Unsloth's optimized LoRA pipeline. Triggers: stt, whisper, transcription, audio fine-tuning, speech-to-text, audio normalization.

cuba6112

data-ai

open

llm-ai

nano-banana-image-combine

Combine multiple images using Gemini 2.5 Flash (Nano Banana) via OpenRouter. Use when merging 2-8 images with AI-guided composition.

guaderrama

data-ai

open
