skills/sumelabs/clawra/clawra-selfie

clawra-selfie

SKILL.md

Clawra Selfie

Edit a fixed reference image using xAI's Grok Imagine model and distribute it across messaging platforms (WhatsApp, Telegram, Discord, Slack, etc.) via OpenClaw.

Reference Image

The skill uses a fixed reference image hosted on jsDelivr CDN:

https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png

When to Use

  • User says "send a pic", "send me a pic", "send a photo", "send a selfie"
  • User says "send a pic of you...", "send a selfie of you..."
  • User asks "what are you doing?", "how are you doing?", "where are you?"
  • User describes a context: "send a pic wearing...", "send a pic at..."
  • User wants Clawra to appear in a specific outfit, location, or situation

Quick Reference

Required Environment Variables

FAL_KEY=your_fal_api_key          # Get from https://fal.ai/dashboard/keys
OPENCLAW_GATEWAY_TOKEN=your_token  # From: openclaw doctor --generate-gateway-token

Workflow

  1. Get user prompt for how to edit the image
  2. Edit image via fal.ai Grok Imagine Edit API with fixed reference
  3. Extract image URL from response
  4. Send to OpenClaw with target channel(s)

Step-by-Step Instructions

Step 1: Collect User Input

Ask the user for:

  • User context: What should the person in the image be doing/wearing/where?
  • Mode (optional): mirror or direct selfie style
  • Target channel(s): Where should it be sent? (e.g., #general, @username, channel ID)
  • Platform (optional): Which platform? (discord, telegram, whatsapp, slack)

Prompt Modes

Mode 1: Mirror Selfie (default)

Best for: outfit showcases, full-body shots, fashion content

make a pic of this person, but [user's context]. the person is taking a mirror selfie

Example: "wearing a santa hat" →

make a pic of this person, but wearing a santa hat. the person is taking a mirror selfie

Mode 2: Direct Selfie

Best for: close-up portraits, location shots, emotional expressions

a close-up selfie taken by herself at [user's context], direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Example: "a cozy cafe with warm lighting" →

a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Mode Selection Logic

Keywords in Request Auto-Select Mode
outfit, wearing, clothes, dress, suit, fashion mirror
cafe, restaurant, beach, park, city, location direct
close-up, portrait, face, eyes, smile direct
full-body, mirror, reflection mirror

Step 2: Edit Image with Grok Imagine

Use the fal.ai API to edit the reference image:

REFERENCE_IMAGE="https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png"

# Mode 1: Mirror Selfie
PROMPT="make a pic of this person, but <USER_CONTEXT>. the person is taking a mirror selfie"

# Mode 2: Direct Selfie
PROMPT="a close-up selfie taken by herself at <USER_CONTEXT>, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"

# Build JSON payload with jq (handles escaping properly)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

curl -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD"

Response Format:

{
  "images": [
    {
      "url": "https://v3b.fal.media/files/...",
      "content_type": "image/jpeg",
      "width": 1024,
      "height": 1024
    }
  ],
  "revised_prompt": "Enhanced prompt text..."
}

Step 3: Send Image via OpenClaw

Use the OpenClaw messaging API to send the edited image:

openclaw message send \
  --action send \
  --channel "<TARGET_CHANNEL>" \
  --message "<CAPTION_TEXT>" \
  --media "<IMAGE_URL>"

Alternative: Direct API call

curl -X POST "http://localhost:18789/message" \
  -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "send",
    "channel": "<TARGET_CHANNEL>",
    "message": "<CAPTION_TEXT>",
    "media": "<IMAGE_URL>"
  }'

Complete Script Example

#!/bin/bash
# grok-imagine-edit-send.sh

# Check required environment variables
if [ -z "$FAL_KEY" ]; then
  echo "Error: FAL_KEY environment variable not set"
  exit 1
fi

# Fixed reference image
REFERENCE_IMAGE="https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png"

USER_CONTEXT="$1"
CHANNEL="$2"
MODE="${3:-auto}"  # mirror, direct, or auto
CAPTION="${4:-Edited with Grok Imagine}"

if [ -z "$USER_CONTEXT" ] || [ -z "$CHANNEL" ]; then
  echo "Usage: $0 <user_context> <channel> [mode] [caption]"
  echo "Modes: mirror, direct, auto (default)"
  echo "Example: $0 'wearing a cowboy hat' '#general' mirror"
  echo "Example: $0 'a cozy cafe' '#general' direct"
  exit 1
fi

# Auto-detect mode based on keywords
if [ "$MODE" == "auto" ]; then
  if echo "$USER_CONTEXT" | grep -qiE "outfit|wearing|clothes|dress|suit|fashion|full-body|mirror"; then
    MODE="mirror"
  elif echo "$USER_CONTEXT" | grep -qiE "cafe|restaurant|beach|park|city|close-up|portrait|face|eyes|smile"; then
    MODE="direct"
  else
    MODE="mirror"  # default
  fi
  echo "Auto-detected mode: $MODE"
fi

# Construct the prompt based on mode
if [ "$MODE" == "direct" ]; then
  EDIT_PROMPT="a close-up selfie taken by herself at $USER_CONTEXT, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"
else
  EDIT_PROMPT="make a pic of this person, but $USER_CONTEXT. the person is taking a mirror selfie"
fi

echo "Mode: $MODE"
echo "Editing reference image with prompt: $EDIT_PROMPT"

# Edit image (using jq for proper JSON escaping)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$EDIT_PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

RESPONSE=$(curl -s -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD")

# Extract image URL
IMAGE_URL=$(echo "$RESPONSE" | jq -r '.images[0].url')

if [ "$IMAGE_URL" == "null" ] || [ -z "$IMAGE_URL" ]; then
  echo "Error: Failed to edit image"
  echo "Response: $RESPONSE"
  exit 1
fi

echo "Image edited: $IMAGE_URL"
echo "Sending to channel: $CHANNEL"

# Send via OpenClaw
openclaw message send \
  --action send \
  --channel "$CHANNEL" \
  --message "$CAPTION" \
  --media "$IMAGE_URL"

echo "Done!"

Node.js/TypeScript Implementation

import { fal } from "@fal-ai/client";
import { exec } from "child_process";
import { promisify } from "util";

const execAsync = promisify(exec);

const REFERENCE_IMAGE = "https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png";

interface GrokImagineResult {
  images: Array<{
    url: string;
    content_type: string;
    width: number;
    height: number;
  }>;
  revised_prompt?: string;
}

type SelfieMode = "mirror" | "direct" | "auto";

function detectMode(userContext: string): "mirror" | "direct" {
  const mirrorKeywords = /outfit|wearing|clothes|dress|suit|fashion|full-body|mirror/i;
  const directKeywords = /cafe|restaurant|beach|park|city|close-up|portrait|face|eyes|smile/i;

  if (directKeywords.test(userContext)) return "direct";
  if (mirrorKeywords.test(userContext)) return "mirror";
  return "mirror"; // default
}

function buildPrompt(userContext: string, mode: "mirror" | "direct"): string {
  if (mode === "direct") {
    return `a close-up selfie taken by herself at ${userContext}, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible`;
  }
  return `make a pic of this person, but ${userContext}. the person is taking a mirror selfie`;
}

async function editAndSend(
  userContext: string,
  channel: string,
  mode: SelfieMode = "auto",
  caption?: string
): Promise<string> {
  // Configure fal.ai client
  fal.config({
    credentials: process.env.FAL_KEY!
  });

  // Determine mode
  const actualMode = mode === "auto" ? detectMode(userContext) : mode;
  console.log(`Mode: ${actualMode}`);

  // Construct the prompt
  const editPrompt = buildPrompt(userContext, actualMode);

  // Edit reference image with Grok Imagine
  console.log(`Editing image: "${editPrompt}"`);

  const result = await fal.subscribe("xai/grok-imagine-image/edit", {
    input: {
      image_url: REFERENCE_IMAGE,
      prompt: editPrompt,
      num_images: 1,
      output_format: "jpeg"
    }
  }) as { data: GrokImagineResult };

  const imageUrl = result.data.images[0].url;
  console.log(`Edited image URL: ${imageUrl}`);

  // Send via OpenClaw
  const messageCaption = caption || `Edited with Grok Imagine`;

  await execAsync(
    `openclaw message send --action send --channel "${channel}" --message "${messageCaption}" --media "${imageUrl}"`
  );

  console.log(`Sent to ${channel}`);
  return imageUrl;
}

// Usage Examples

// Mirror mode (auto-detected from "wearing")
editAndSend(
  "wearing a cyberpunk outfit with neon lights",
  "#art-gallery",
  "auto",
  "Check out this AI-edited art!"
);
// → Mode: mirror
// → Prompt: "make a pic of this person, but wearing a cyberpunk outfit with neon lights. the person is taking a mirror selfie"

// Direct mode (auto-detected from "cafe")
editAndSend(
  "a cozy cafe with warm lighting",
  "#photography",
  "auto"
);
// → Mode: direct
// → Prompt: "a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact..."

// Explicit mode override
editAndSend("casual street style", "#fashion", "direct");

Supported Platforms

OpenClaw supports sending to:

Platform Channel Format Example
Discord #channel-name or channel ID #general, 123456789
Telegram @username or chat ID @mychannel, -100123456
WhatsApp Phone number (JID format) 1234567890@s.whatsapp.net
Slack #channel-name #random
Signal Phone number +1234567890
MS Teams Channel reference (varies)

Grok Imagine Edit Parameters

Parameter Type Default Description
image_url string required URL of image to edit (fixed in this skill)
prompt string required Edit instruction
num_images 1-4 1 Number of images to generate
output_format enum "jpeg" jpeg, png, webp

Setup Requirements

1. Install fal.ai client (for Node.js usage)

npm install @fal-ai/client

2. Install OpenClaw CLI

npm install -g openclaw

3. Configure OpenClaw Gateway

openclaw config set gateway.mode=local
openclaw doctor --generate-gateway-token

4. Start OpenClaw Gateway

openclaw gateway start

Error Handling

  • FAL_KEY missing: Ensure the API key is set in environment
  • Image edit failed: Check prompt content and API quota
  • OpenClaw send failed: Verify gateway is running and channel exists
  • Rate limits: fal.ai has rate limits; implement retry logic if needed

Tips

  1. Mirror mode context examples (outfit focus):

    • "wearing a santa hat"
    • "in a business suit"
    • "wearing a summer dress"
    • "in streetwear fashion"
  2. Direct mode context examples (location/portrait focus):

    • "a cozy cafe with warm lighting"
    • "a sunny beach at sunset"
    • "a busy city street at night"
    • "a peaceful park in autumn"
  3. Mode selection: Let auto-detect work, or explicitly specify for control

  4. Batch sending: Edit once, send to multiple channels

  5. Scheduling: Combine with OpenClaw scheduler for automated posts

Weekly Installs
75
Repository
sumelabs/clawra
GitHub Stars
2.1K
First Seen
Feb 10, 2026
Installed on
opencode68
codex67
gemini-cli67
cursor67
github-copilot62
kimi-cli61