category focus

LLM & AI

Large Language Models and AI agents.

4725 skills
llm-ai
0

wavecap-evaluate

Evaluate WaveCap audio analysis and transcription accuracy. Use when the user wants to run regression tests, compare transcriptions against ground truth, calculate WER/CER metrics, or assess overall system quality.

TobiasWooldridge
data-ai
llm-ai
0

learning-glossary-management

Create multilingual glossaries for educational content, maintain terminology consistency across translations, build translation memory databases, and define preferred terms by domain and region. Use when managing translation terminology. Activates on "glossary", "terminology management", or "translation memory".

pauljbernard
data-ai
llm-ai
0

stt-integration

ElevenLabs Speech-to-Text transcription workflows with Scribe v1 supporting 99 languages, speaker diarization, and Vercel AI SDK integration. Use when implementing audio transcription, building STT features, integrating speech-to-text, setting up Vercel AI SDK with ElevenLabs, or when user mentions transcription, STT, Scribe v1, audio-to-text, speaker diarization, or multi-language transcription.

vanman2024
data-ai
llm-ai
0

intelligent-text-chunking

Split large texts into meaningful, AI-optimized chunks while preserving semantic coherence and document structure. Use when processing large documents for AI training, RAG systems, or when you need to break down content while maintaining context and relationships.

findinfinitelabs
data-ai
llm-ai
0

nlp-engineer

Expert in Natural Language Processing, designing systems for text classification, NER, translation, and LLM integration using Hugging Face, spaCy, and LangChain. Use when building NLP pipelines, text analysis, or LLM-powered features. Triggers include "NLP", "text classification", "NER", "named entity", "sentiment analysis", "spaCy", "Hugging Face", "transformers".

404kidwiz
data-ai
llm-ai
0

medical-ocr

Medical document OCR processing for extracting structured clinical data from medical images (prescriptions, lab results, clinical notes). Uses Google Cloud Vision for text extraction and medical NLP for entity recognition. Deploy when processing healthcare documents, extracting patient data, or converting medical images to structured formats.

Fadil369
data-ai
llm-ai
0

court-record-transcriber

Development skill for CaseMark's Court Recording Transcriber - an AI-powered application for transcribing court recordings with speaker identification, synchronized playback, search, and legal document exports. Built with Next.js 16, PostgreSQL, Drizzle ORM, wavesurfer.js, and Case.dev APIs. Use this skill when: (1) Working on or extending the court-record-transcriber codebase, (2) Integrating with Case.dev transcription APIs, (3) Working with audio playback/waveforms, (4) Building transcript export features, or (5) Adding speaker identification logic.

CaseMark
data-ai
llm-ai
0

stenography-mastery

Complete stenography guide for court reporting and legal transcription. Use when building steno projects from basic to professional level, learning court reporting workflows, optimizing legal dictionaries, setting up Plover for courtroom use, creating specialized legal briefs, or developing speed for legal documentation.

hafiznaveedchuhan-ctrl
data-ai
llm-ai
0

transcribe-walk

Process walk recordings into usable research material. Covers the transcription workflow, insight extraction, and integration with the pipeline. Use after a WALK stage recording.

pentaxis93
data-ai
llm-ai
0

media-understand

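Use AI to understand and analyze multimedia content (images, video, audio). Use when the user wants to understand an image (理解图片), analyze a video (分析视频), transcribe audio to text (音频转文字), ask questions about a video (视频问答), understand media, analyze video, transcribe audio, describe an image, or asks what is in a video, image, or audio file.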

InfQuest
data-ai
llm-ai
0

embed

Generate text embeddings using Qwen3 models via HuggingFace TEI. Use this skill to embed texts, configure the embedding service, or batch process documents. Invoke with /embed.

wellcomecollection
data-ai
llm-ai
0

deodorizing-ai-text

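A skill that detects and removes the unnatural feel of AI-generated Japanese (the "AI smell", AI臭) and deodorizes it into natural, human-sounding prose. Activates on requests such as "deodorize this text" (この文章を脱臭して), "remove the AI smell" (AI臭を消して), "make it read like a human wrote it" (人間らしい文章にして), "fix it into natural Japanese" (自然な日本語に直して), or "fix the translationese" (翻訳調を修正). Supports both prevention through prompt engineering and treatment through post-editing.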

inakam
data-ai
llm-ai
0

qt-translation-assistant

Use when user requests translating Qt project localization files (TS files), automating translation workflows, or setting up multilingual support for Qt applications. This skill uses parallel processing with ThreadPoolExecutor to translate TS (Translation Source) files efficiently.

re2zero
data-ai
llm-ai
0

rag-personalizer

Transform textbook content based on the 10-dimension user profile to provide personalized learning experiences. Agent: AIEngineer

FAIQahm
data-ai
llm-ai
0

wenyan-mode

This skill is ALWAYS ACTIVE once installed. Automatically applies classical Chinese (文言文) writing style to all responses. Uses concise, elegant expressions while keeping technical terms intact. No trigger phrase needed - activates on every response.

VdustR
data-ai
llm-ai
0

wavecap-llm

Configure WaveCap LLM-based transcription correction. Use when the user wants to enable/disable LLM correction, change models, tune prompts, or optimize correction quality on Apple Silicon.

TobiasWooldridge
data-ai
llm-ai
0

rag-ingestion-v1

Document ingestion pipeline - docs to chunks to metadata for RAG

atiati82
data-ai
llm-ai
0

gpu-media-processor

Process audio, video, and media on cloud GPUs. Transcribe with Whisper, clone voices, generate videos, upscale images, and run batch media processing. All results sync back to your Mac.

gpu-cli
data-ai
llm-ai
0

wavecap-hallucination

Configure WaveCap hallucination detection and prevention. Use when Whisper outputs gibberish, repeated phrases, or phantom text on silent audio.

TobiasWooldridge
data-ai
llm-ai
0

gen-sora-video

Generate a Sora video from a text prompt via an Azure OpenAI endpoint, then download the resulting .mp4 locally. Use when the user asks to generate a Sora video/video.mp4 from a prompt or wants the generated video saved to disk.

jacwu
data-ai
llm-ai
0

gemini-video-understanding

Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).

levanminhduc
data-ai
llm-ai
0

comfyui

Generates images and videos using ComfyUI node-based workflows. Use when creating AI-generated assets, text-to-image, text-to-video, image-to-video, running Stable Diffusion, Flux, HunyuanVideo, or when user mentions "comfy," "ComfyUI," "generate image," "generate video," "AI art," "diffusion model," or needs visual content for courses/projects.

webmasterarbez
data-ai
llm-ai
0

unsloth-stt

Fine-tuning Speech-to-Text models like Whisper using Unsloth's optimized LoRA pipeline. Triggers: stt, whisper, transcription, audio fine-tuning, speech-to-text, audio normalization.

cuba6112
data-ai
llm-ai
0

nano-banana-image-combine

Combine multiple images using Gemini 2.5 Flash (Nano Banana) via OpenRouter. Use when merging 2-8 images with AI-guided composition.

guaderrama
data-ai