screenshot AI Agent Skills

Browse 568 skills related to screenshot

playwright-cli

83.3k
microsoft

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

95 days ago

webapp-testing

81.1k
anthropics

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

95 days ago

remote-browser

79.7k
browser-use

Controls a cloud browser from a sandboxed remote machine. Use when the agent is running in a sandbox (no GUI) and needs to navigate websites, interact with web pages, fill forms, take screenshots, or expose local dev servers via tunnels.

91 days ago

browser-use

79.4k
browser-use

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, or extract information from web pages.

95 days ago

image-enhancer

39.6k
ComposioHQ

Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.

95 days ago

dev-browser

36.1k
code-yeongyu

Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

95 days ago

agent-browser

36.1k
code-yeongyu

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

95 days ago

webapp-testing

23.2k
github

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

95 days ago

chrome-devtools

23.2k
github

Expert-level browser automation, debugging, and performance analysis using Chrome DevTools MCP. Use for interacting with web pages, capturing screenshots, analyzing network traffic, and profiling performance.

95 days ago

screenshot-feature-extractor

21.8k
davila7

Analyze product screenshots to extract feature lists and generate development task checklists. Use when: (1) Analyzing competitor product screenshots for feature extraction, (2) Generating PRD/task lists from UI designs, (3) Batch analyzing multiple app screens, (4) Conducting competitive analysis from visual references.

95 days ago

firecrawl-scraper

21.8k
davila7

Deep web scraping, screenshots, PDF parsing, and website crawling using the Firecrawl API.

95 days ago

playwright-skill

21.8k
davila7

Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing.

95 days ago

omarchy

20.8k
basecamp

REQUIRED for ANY changes to Linux desktop, window manager, or system config. Use when editing ~/.config/hypr/, ~/.config/waybar/, ~/.config/walker/, ~/.config/alacritty/, ~/.config/kitty/, ~/.config/ghostty/, ~/.config/mako/, or ~/.config/omarchy/. Triggers: Hyprland, window rules, animations, keybindings, monitors, gaps, borders, blur, opacity, waybar, walker, terminal config, themes, wallpaper, night light, idle, lock screen, screenshots, layer rules, workspace settings, display config, or any omarchy-* commands.

95 days ago

screenshots

18.0k
sickn33

Generate marketing screenshots of your app using Playwright. Use when the user wants to create screenshots for Product Hunt, social media, landing pages, or documentation.

95 days ago

ui-visual-validator

18.0k
sickn33

Rigorous visual validation expert specializing in UI testing, design system compliance, and accessibility verification. Masters screenshot analysis, visual regression testing, and component validation. Use PROACTIVELY to verify UI modifications have achieved their intended goals through comprehensive visual analysis.

95 days ago

agent-browser

17.1k
vercel-labs

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.

95 days ago

obsidian-cli

11.9k
kepano

Interact with Obsidian vaults using the Obsidian CLI to read, create, search, and manage notes, tasks, properties, and more. Also supports plugin and theme development with commands to reload plugins, run JavaScript, capture errors, take screenshots, and inspect the DOM. Use when the user asks to interact with their Obsidian vault, manage notes, search vault content, perform vault operations from the command line, or develop and debug Obsidian plugins and themes.

95 days ago

playwright

10.4k
openai

Use when the task requires automating a real browser from the terminal (navigation, form filling, snapshots, screenshots, data extraction, UI-flow debugging) via `playwright-cli` or the bundled wrapper script.

95 days ago

develop-web-game

10.4k
openai

Use when Codex is building or iterating on a web game (HTML/JS) and needs a reliable development + testing loop: implement small changes, run a Playwright-based test script with short input bursts and intentional pauses, inspect screenshots/text, and review console errors with render_game_to_text.

95 days ago

screenshot

10.4k
openai

Use when the user explicitly asks for a desktop or system screenshot (full screen, specific app or window, or a pixel region), or when tool-specific capture capabilities are unavailable and an OS-level capture is needed.

95 days ago

figma-implement-design

10.4k
openai

Translate Figma nodes into production-ready code with 1:1 visual fidelity using the Figma MCP workflow (design context, screenshots, assets, and project-convention translation). Trigger when the user provides Figma URLs or node IDs, or asks to implement designs or components that must match Figma specs. Requires a working Figma MCP server connection.

95 days ago

Figma

10.4k
openai

Use the Figma MCP server to fetch design context, screenshots, variables, and assets from Figma, and to translate Figma nodes into production code. Trigger when a task involves Figma URLs, node IDs, design-to-code implementation, or Figma MCP setup and troubleshooting.

95 days ago

agent-browser

9.7k
EveryInc

Browser automation using Vercel's agent-browser CLI. Use when you need to interact with web pages, fill forms, take screenshots, or scrape data. An alternative to Playwright MCP — uses Bash commands with ref-based element selection. Triggers: browse website, fill form, click button, take screenshot, scrape page, web automation.

95 days ago

Cloudflare Browser

9.4k
cloudflare

Control headless Chrome via Cloudflare Browser Rendering CDP WebSocket. Use for screenshots, page navigation, scraping, and video capture when browser automation is needed in a Cloudflare Workers environment. Requires CDP_SECRET env var and cdpUrl configured in browser.profiles.

95 days ago

dev-browser

6.1k
MemTensor

Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

95 days ago

website-preview

5.6k
missing-semester

Take screenshots of and interact with the rendered Jekyll website.

95 days ago

webapp-testing

2.5k
davepoon

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

95 days ago

image-enhancer

2.5k
davepoon

Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.

95 days ago

browser

2.4k
cexll

This skill should be used for browser automation tasks using Chrome DevTools Protocol (CDP). Triggers when users need to launch Chrome with remote debugging, navigate pages, execute JavaScript in browser context, capture screenshots, or interactively select DOM elements. No MCP server required.

95 days ago

tui-validate

2.0k
mikeyobrien

Validates Terminal User Interface (TUI) output using freeze for screenshot capture and LLM-as-judge for semantic validation. Supports both visual (PNG/SVG) and text-based validation modes.

95 days ago

github-issue-creator

1.8k
openclaw

Convert raw notes, error logs, voice dictation, or screenshots into crisp GitHub-flavored markdown issue reports. Use when the user pastes bug info, error messages, or informal descriptions and wants a structured GitHub issue. Supports images/GIFs for visual evidence.

95 days ago

chrome-devtools

1.8k
mrgoonie

Browser automation, debugging, and performance analysis using Puppeteer CLI scripts. Use for automating browsers, taking screenshots, analyzing performance, monitoring network traffic, web scraping, form automation, and JavaScript debugging.

95 days ago

aesthetic

1.8k
mrgoonie

Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images with ai-multimodal, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Includes workflows for capturing and analyzing inspiration screenshots with chrome-devtools and ai-multimodal, iterative design image generation until aesthetic standards are met, and comprehensive design system guidance covering BEAUTIFUL (aesthetic principles), RIGHT (functionality/accessibility), SATISFYING (micro-interactions), and PEAK (storytelling) stages. Integrates with chrome-devtools, ai-multimodal, media-processing, ui-styling, and web-frameworks skills.

95 days ago

ai-multimodal

1.8k
mrgoonie

Process and generate multimedia content using Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

95 days ago

clawshot

1.8k
openclaw

Instagram for AI agents. Build your following, grow your influence. Share screenshots, get likes & comments, engage with @mentions. Be a creator, not just a coder.

95 days ago

linux-gui-control

1.8k
openclaw

Control the Linux desktop GUI using xdotool, wmctrl, and dogtail. Use when you need to interact with non-browser applications, simulate mouse/keyboard input, manage windows, or inspect the UI hierarchy of applications on X11/GNOME. Supports: (1) Clicking/typing in apps, (2) Resizing/moving windows, (3) Extracting text-based UI trees from apps (A11y), (4) Taking screenshots for visual analysis.

95 days ago

clawbrowser

1.8k
openclaw

Use when the agent needs to drive a browser through the Microsoft Playwright CLI (`playwright-cli`) for navigation, form interactions, screenshots, recordings, data extraction, session management, or debugging without loading a full MCP browser. It trains the agent on the CLI commands, snapshots, and session/config habits that make Playwright CLI reliable for scripted browsing.

95 days ago

pokemon-red

1.8k
openclaw

Play Pokemon Red autonomously via PyBoy emulator. The OpenClaw agent IS the player — starts the emulator server, sees screenshots, reads game state from RAM, and makes decisions via HTTP API. Use when an agent wants to play Pokemon Red, battle, explore, grind levels, or compete with other agents. Requires Python 3.10+, pyboy, and a legally obtained Pokemon Red ROM.

95 days ago

computer-use

1.8k
openclaw

Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag, etc). Unlike OpenClaw's browser tool, operates at the X11 level so websites cannot detect automation. Includes VNC for live viewing.

95 days ago

browser-use

1.8k
openclaw

Controls a cloud browser from a sandboxed remote machine. Use when the agent is running in a sandbox (no GUI) and needs to navigate websites, interact with web pages, fill forms, take screenshots, or expose local dev servers via tunnels.

95 days ago

sticker-analyzer

1.8k
openclaw

Analyze images in media/stickers using Vision API to identify and filter meme/sticker content vs screenshots or documents.

vision, image-analysis, stickers (+1 more)
95 days ago

xcodebuildmcp

1.8k
openclaw

Use when the user needs Xcode build/test/run workflows, simulator or device control, UI automation, screenshots/video, logs, or LLDB debugging through XcodeBuildMCP tools. Includes discovery of projects/schemes, session defaults, and common simulator/device workflows.

95 days ago

style-guide-generator

1.8k
openclaw

Generate comprehensive website style guides and design systems from URLs, screenshots, and existing documentation. Use this skill when users ask to create a style guide, design system documentation, brand guidelines document, or design specification from a website, app, or existing materials. This skill produces professional PDF outputs following industry-standard style guide structure.

95 days ago

playwright-cli

1.8k
openclaw

Browser automation via Playwright CLI. Open pages, interact with elements, take screenshots, and more. Ideal for coding agents and automated testing workflows.

95 days ago

webapp-testing

1.7k
snyk

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

95 days ago

chrome-devtools

1.6k
tech-leads-club

Expert-level browser automation, debugging, and performance analysis using Chrome DevTools MCP. Use for interacting with web pages, capturing screenshots, analyzing network traffic, and profiling performance.

95 days ago

figma-implement-design

1.6k
tech-leads-club

Translate Figma nodes into production-ready code with 1:1 visual fidelity using the Figma MCP workflow (design context, screenshots, assets, and project-convention translation). Trigger when the user provides Figma URLs or node IDs, or asks to implement designs or components that must match Figma specs. Requires a working Figma MCP server connection.

95 days ago

frontend-blueprint

1.6k
tech-leads-club

AI frontend specialist and design consultant that guides users through a structured discovery process before generating any code. Collects visual references, design tokens, typography, icons, layout preferences, and brand guidelines to ensure the final output matches the user's vision with high fidelity. Use when the user asks to build, design, create, or improve any frontend interface — websites, landing pages, dashboards, components, apps, emails, forms, modals, or any UI element. Also triggers on "build me a UI", "design a page", "create a component", "improve this layout", "make this look better", "frontend", "interface", "redesign", or when the user provides mockups, screenshots, or design references. Do NOT use for backend logic, API design, database schemas, or non-visual code tasks.

95 days ago
© 2026 AgentCC
Clawiskill