
Build a UTM dictionary your marketing analytics can actually group by

A how-to for any brand that wants UTM-tagged links to aggregate cleanly. Output: a Claude skill (or copy-paste prompt) plus a Google Sheet log that, together, force every campaign URL to follow the same dictionary.

Written by Juan Garzon



The problem this solves

Most teams type utm_campaign values freehand. One marketer writes spring-sale-emma-10off, another writes Emma_Spring2026_10pct, a third writes EmmaCollabApril. All three describe the same campaign, none of them group together in your analytics tool, and any attempt to aggregate by tier, persona, product, or discount turns into manual cleanup.

The fix is a tiny piece of structure: every utm_campaign value follows a strict, prefixed key-value format that your analytics can slice with simple contains() queries. The skill (or prompt) below enforces that format so your team never has to think about it.


The convention

utm_medium and utm_source

Use utm_medium for the channel category (e.g., influencer, paid_social, email, affiliate, organic_social). Use utm_source for the specific platform or sender (e.g., instagram, tiktok, mailchimp, partner_xyz). Keep both lowercase, no spaces.

utm_campaign — the structured part

Format: key1-value1_key2-value2_key3-value3_..._keyN-valueN

  • Each field is a key-value pair joined by a hyphen

  • Fields are separated by underscores

  • All values are lowercase and alphanumeric (no spaces, dots, or special characters)

  • Field order is fixed (so the same campaign always serializes the same way)

Worked example for influencer marketing: prod-bedding_tier-micro_name-emma_persona-mom_disc-10pct

Worked example for paid social: prod-skincare_audience-lookalike_creative-video_offer-freegift_geo-nl

Worked example for email: flow-welcome_step-3_segment-firsttime_offer-15pct

The dimensions change by channel, but the format doesn't.
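As a rough sketch of how a campaign string could be serialized, the snippet below joins key-value pairs in a fixed order. The field list and the `build_campaign` helper are illustrative assumptions, not part of any tool; swap in your own channel's fields.

```python
# Hypothetical field order for an influencer channel; replace with your own.
FIELD_ORDER = ["prod", "tier", "name", "persona", "disc"]


def build_campaign(fields: dict[str, str]) -> str:
    """Serialize fields as key-value pairs in the fixed FIELD_ORDER."""
    missing = [key for key in FIELD_ORDER if key not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Hyphen joins key and value; underscore separates fields.
    return "_".join(f"{key}-{fields[key]}" for key in FIELD_ORDER)


# Input order does not matter; output order is always FIELD_ORDER.
campaign = build_campaign({
    "name": "emma", "disc": "10pct", "prod": "bedding",
    "tier": "micro", "persona": "mom",
})
print(campaign)  # prod-bedding_tier-micro_name-emma_persona-mom_disc-10pct
```

Because the order lives in one list, two marketers describing the same campaign end up with the same string regardless of how they phrase the brief.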


Why prefixed key-value pairs

Because your analytics will use contains() to slice the data later. Without prefixes, the mom persona would have to be filtered with contains('mom'), which would also match an influencer named momlife. With prefixes, contains('persona-mom') can only match a persona field, never a name, product, or audience. The prefix is the namespace guard.
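A small illustration of the collision, using Python's substring test as a stand-in for your analytics tool's contains(). The two campaign strings are made-up examples for an influencer named momlife whose persona is fashion.

```python
# The same hypothetical campaign, written without and with prefixes.
unprefixed = "bedding_micro_momlife_fashion_10pct"
prefixed = "prod-bedding_tier-micro_name-momlife_persona-fashion_disc-10pct"

# Without prefixes, a "mom persona" filter has nothing to anchor on,
# so it also matches the influencer name.
false_match = "mom" in unprefixed

# With prefixes, the filter only matches inside the persona field.
guarded = "persona-mom" in prefixed

print(false_match)  # True  (wrong: this is a fashion campaign)
print(guarded)      # False (correct)
```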

Three rules to avoid silent collisions

  1. Always include the prefix. prod-bedding, never just bedding. Without the prefix, contains-queries collide with other dimensions.

  2. Add a unit suffix to numeric values. Use disc-10pct not disc-10. Without a suffix, contains('disc-10') falsely matches disc-100. The suffix can be pct, eur, usd, days, etc. — whatever fits your dimension.

  3. For multi-value fields (bundles), repeat the prefix. A campaign promoting both bedding and towels becomes prod-bedding-prod-towels, not prod-bedding-towels. The repeated prefix is what lets contains('prod-towels') still match the bundle. Sort the values alphabetically by name so the same combination always serializes identically.
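Rules 2 and 3 can be checked with a few lines of code. This is a minimal sketch: `serialize_multi` is a hypothetical helper, and the substring test again stands in for contains().

```python
def serialize_multi(prefix: str, values: list[str]) -> str:
    """Repeat the prefix for every value and sort values alphabetically."""
    return "-".join(f"{prefix}-{value}" for value in sorted(values))


# Rule 3: the bundle serializes the same way regardless of input order.
bundle = serialize_multi("prod", ["towels", "bedding"])
print(bundle)                    # prod-bedding-prod-towels
print("prod-towels" in bundle)   # True
print("prod-bedding" in bundle)  # True

# Rule 2: without a unit suffix the 10% filter also catches 100%.
print("disc-10" in "disc-100")        # True  (collision)
print("disc-10pct" in "disc-100pct")  # False (suffix prevents it)
```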


Step 1 — define your dictionary

Sit down with whoever owns marketing analytics and answer one question for each channel: "What dimensions do I want to group traffic by, six months from now?"

The answers become your dictionary. For each dimension, write down the complete list of allowed values. If a value isn't on the list, your team should not be able to use it without an explicit dictionary update.

Template:

| field   | description                          | allowed values                |
|---------|--------------------------------------|-------------------------------|
| prod    | Product line or category             | (your product taxonomy)       |
| tier    | Influencer tier or audience size     | nano, micro, macro            |
| name    | Influencer handle / partner ID       | lowercase, alphanumeric, no @ |
| persona | Content persona / audience archetype | (your persona buckets)        |
| disc    | Discount associated with link        | none, <n>pct, <n>eur          |
| geo     | Geography (if relevant)              | ISO country codes             |

Three guidelines as you build the dictionary:

  • Keep values short and unambiguous. Single words preferred. If you need multiple words, concatenate (firsttimebuyer, not first-time-buyer).

  • Prefer single-token values over multi-token ones. nontoxic beats non-toxic because the hyphen in non-toxic would collide with the hyphen that separates a key from its value.

  • Don't rename values once they're in the wild. It retroactively breaks historical aggregations. Add new values; never repurpose old ones.
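One way to picture the dictionary is as a lookup of allowed values plus a pattern for open-ended fields. The sketch below is an assumption about how you might encode it; the values shown (bedding, towels, mom, fashion) are placeholders rather than a recommended taxonomy.

```python
import re

# Hypothetical closed lists: a value is valid only if it appears here.
DICTIONARY = {
    "prod": {"bedding", "towels"},
    "tier": {"nano", "micro", "macro"},
    "persona": {"mom", "fashion"},
}

# Open-ended fields are validated by shape instead of by list.
PATTERNS = {
    "name": re.compile(r"[a-z0-9]+"),               # lowercase, alphanumeric, no @
    "disc": re.compile(r"none|[0-9]+(?:pct|eur)"),  # none, <n>pct, <n>eur
}


def validate(field: str, value: str) -> bool:
    """Return True only if the value is allowed for the given field."""
    if field in DICTIONARY:
        return value in DICTIONARY[field]
    if field in PATTERNS:
        return PATTERNS[field].fullmatch(value) is not None
    return False  # unknown fields are rejected, not guessed


print(validate("tier", "micro"))   # True
print(validate("tier", "Micro"))   # False: wrong casing
print(validate("disc", "10"))      # False: missing unit suffix
print(validate("name", "@emma"))   # False: contains @
```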


Step 2 — choose your delivery mechanism

You have two options. Most teams ship both.

Option A: Claude skill (for teams using Cowork or Claude Code). A skill auto-activates whenever someone asks Claude to build a UTM. The skill enforces the dictionary, asks for missing fields, and outputs a clean URL plus a TSV row for the log sheet.

Option B: Copy-paste prompt (for teams using Claude.ai, ChatGPT, or other chat UIs). Same logic, packaged as a single prompt the user pastes at the start of a conversation.

Templates for both are below. Fill in your dictionary, replace every angle-bracket placeholder (<brand>, <channel>, <medium>, and so on) with your brand's specifics, and you're done.


Step 3 — fill in the SKILL.md template

Save this as SKILL.md:

---
name: <brand>-utm-builder
description: "Build clean, dictionary-enforced UTM tracking links for <brand>'s <channel> marketing campaigns. Triggers on any request involving UTM, tracking link, campaign link, or <channel> link for <brand>. Produces a single URL plus a TSV row ready to paste into the UTM log sheet."
---

You are <brand>'s UTM builder. Your job is to produce UTM-tagged URLs that are perfectly consistent so <brand>'s marketing analytics can group, filter, and aggregate traffic using simple `contains()` queries.

## Why strict conventions matter

Every UTM you output will be aggregated with `contains()` filters (e.g., `contains('tier-nano')`, `contains('persona-mom')`). Deviating from the dictionary — even a typo or wrong casing — breaks the grouping. Enforce the dictionary without exception. If a user supplies a value that is not in the dictionary, ask them to pick from the allowed values rather than guessing.

## The dictionary (source of truth)

**utm_medium** — always `<medium>`. No exceptions.

**utm_source** — the platform, lowercase, single word. Allowed values:
- `<platform-1>`
- `<platform-2>`
- `<platform-3>`
- ...

If the user names a platform not in this list, ask before inventing a new one.

**utm_campaign** — prefixed key-value pairs joined by underscore. Fixed field order:

`<field1>-<value>_<field2>-<value>_..._<fieldN>-<value>`

### <field1> — <description>
Allowed values: `<value-a>`, `<value-b>`, `<value-c>`.
For multi-value cases, repeat the `<field1>-` prefix and sort alphabetically: `<field1>-a-<field1>-b`. The repeated prefix is required so contains-queries match both singletons and bundles.

### <field2> — <description>
Allowed values: `<value-a>`, `<value-b>`, `<value-c>`.

### ... (one section per dictionary field)

### disc — discount associated with the link
- No discount: `disc-none`
- Percentage: strip the `%` and append `pct`. 10% → `disc-10pct`, 100% → `disc-100pct`.
- Fixed-currency: `disc-5eur`, `disc-10usd`.

The unit suffix prevents `contains('disc-10')` from falsely matching 100% rows.

## Workflow

### 1. Gather inputs
Ask the user for any missing fields. If they give a free-text brief, parse what you can directly.

### 2. Validate against the dictionary
Every value must match. If something is outside it, stop and ask — never invent.

### 3. Build the URL
`{destination}?utm_medium=<medium>&utm_source={platform}&utm_campaign={fields...}`
Use `?` for the first parameter. If the destination URL already contains `?`, use `&` for all UTM params. Don't URL-encode the campaign string — hyphens and underscores are URL-safe.

### 4. Output for the user
Return three things in this order:
- **a)** the final URL in a code block
- **b)** a summary table of each parameter
- **c)** a TSV row for the log sheet, in this column order:
`date \t destination_url \t platform \t <field1> \t <field2> \t ... \t <fieldN> \t utm_medium \t utm_source \t utm_campaign \t full_url \t notes`

Use today's date in YYYY-MM-DD. Tell the user: "Paste this row into the next empty row of your <brand> UTM log sheet."

## Anti-patterns — do not do these
- Don't add new dictionary values on the fly. If it's not in the list, ask.
- Don't use uppercase letters anywhere in the values.
- Don't use spaces, dots, or special characters inside values.
- Don't drop the prefix. Always `prod-bedding`, never bare `bedding`.
- Don't rearrange field order.
- Don't URL-encode the campaign string.
- Don't use `utm_term` or `utm_content` unless they're part of <brand>'s schema.
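
The following is not part of SKILL.md. It is a minimal sketch of the URL-assembly rule from workflow step 3, useful for spot-checking what the skill returns. `build_url` is a hypothetical helper, example.com is a placeholder domain, and the sketch assumes the destination has no `#` fragment.

```python
def build_url(destination: str, medium: str, source: str, campaign: str) -> str:
    """Append UTM parameters, using & if the destination already has a query."""
    separator = "&" if "?" in destination else "?"
    # Values are not URL-encoded: the convention only allows lowercase
    # alphanumerics, hyphens, and underscores, which are URL-safe.
    return (
        f"{destination}{separator}utm_medium={medium}"
        f"&utm_source={source}&utm_campaign={campaign}"
    )


plain = build_url("https://example.com/shop", "influencer", "instagram",
                  "prod-bedding_disc-10pct")
with_query = build_url("https://example.com/shop?ref=1", "influencer",
                       "instagram", "prod-bedding_disc-10pct")
print(plain)
print(with_query)
```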

Step 4 — fill in the prompt template

For users without skills enabled, save this as <brand>-utm-prompt.md and have them paste it at the start of a conversation.

You are <brand>'s UTM builder. Your only job is to produce UTM-tagged URLs for <channel> marketing campaigns, following the strict dictionary below. <brand>'s marketing analytics aggregates traffic using `contains()` queries — any deviation from the dictionary breaks the grouping.

## The dictionary

**utm_medium** — always `<medium>`.

**utm_source** — the platform in lowercase. Allowed: `<platform-1>`, `<platform-2>`, ... If the user names another platform, ask before inventing.

**utm_campaign** — prefixed key-value pairs joined by underscore, in this fixed order:
`<field1>-<value>_<field2>-<value>_..._<fieldN>-<value>`
- `<field1>` — one of: `<...>`. For bundles, repeat the prefix and sort alphabetically.
- `<field2>` — one of: `<...>`.
- ...
- `disc` — `none`, or `<n>pct` for percentages, or `<n>eur` for fixed amounts. The suffix prevents false matches.

## Rules
1. Every value must match the dictionary. If something is outside it, stop and ask.
2. All values lowercase. No spaces, dots, or special characters.
3. Always include the prefix.
4. Fixed field order.
5. Don't URL-encode the campaign string.

## Output
Return three things: the final URL in a code block, a summary table, and a TSV row for the log sheet (columns: date, destination_url, platform, <field1>, <field2>, ..., utm_medium, utm_source, utm_campaign, full_url, notes).

Acknowledge this prompt and ask what campaign to build first.

Step 5 — set up the log sheet

Create a CSV with these headers and import it into Google Sheets:

date,destination_url,platform,<field1>,<field2>,...,<fieldN>,utm_medium,utm_source,utm_campaign,full_url,notes

Then:

  1. Google Sheets → File → Import → select your CSV

  2. Set "Import location" to "Replace spreadsheet" and "Separator type" to "Detect automatically"

  3. Freeze the header row (View → Freeze → 1 row)

  4. Share with your marketing team as editor

Every UTM the builder produces comes with a tab-separated row. Pasting it into the next empty row of the sheet auto-splits across columns.
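
For reference, a tab-separated row is just the cell values joined by tab characters in header order. The sketch below assumes a shortened, hypothetical column list and a made-up record; use the headers from your own sheet.

```python
from datetime import date

# Hypothetical column order; it must match the sheet's header row exactly.
COLUMNS = ["date", "destination_url", "platform", "prod", "tier",
           "utm_medium", "utm_source", "utm_campaign", "full_url", "notes"]


def tsv_row(record: dict[str, str]) -> str:
    """Join cells with tabs in COLUMNS order; missing cells stay empty."""
    cells = [record.get(column, "") for column in COLUMNS]
    for cell in cells:
        if "\t" in cell or "\n" in cell:
            raise ValueError("cells must not contain tabs or newlines")
    return "\t".join(cells)


row = tsv_row({
    "date": date.today().isoformat(),  # YYYY-MM-DD
    "destination_url": "https://example.com/shop",
    "platform": "instagram",
    "prod": "bedding",
    "tier": "micro",
    "utm_medium": "influencer",
    "utm_source": "instagram",
    "utm_campaign": "prod-bedding_tier-micro",
    "full_url": "https://example.com/shop?utm_medium=influencer"
                "&utm_source=instagram&utm_campaign=prod-bedding_tier-micro",
})
print(row)
```

A stray tab or line break inside a cell would shift every later column on paste, which is why the helper refuses such values.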


Step 6 — test before rolling out

Generate 5–10 sample UTMs covering edge cases before handing the skill to your team:

  • A clean, fully-specified brief (happy path)

  • A bundle / multi-value field (does the repeated prefix work?)

  • A 100% discount alongside a 10% discount (does the pct suffix prevent collision?)

  • An influencer / partner name that contains a substring of one of your persona values (does the prefix protect against it?)

  • An ambiguous brief missing one or more fields (does the skill push back instead of guessing?)

For each sample, run a few contains() queries against the campaign string to confirm grouping works the way you expect.
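
A quick way to run those checks is to collect the generated campaign strings and assert which ones each query should match. The strings and names below (lena, sam, momlife) are invented test cases, and the substring test stands in for contains().

```python
# Hypothetical outputs from the builder, one per edge case.
samples = {
    "happy": "prod-bedding_tier-micro_name-emma_persona-mom_disc-10pct",
    "bundle": "prod-bedding-prod-towels_tier-nano_name-lena_persona-fashion_disc-none",
    "full_discount": "prod-towels_tier-macro_name-sam_persona-mom_disc-100pct",
    "tricky_name": "prod-bedding_tier-nano_name-momlife_persona-fashion_disc-10pct",
}


def matching(query: str) -> set[str]:
    """Return the labels of all samples whose campaign contains the query."""
    return {label for label, campaign in samples.items() if query in campaign}


# Bundles are found through the repeated prefix.
assert matching("prod-towels") == {"bundle", "full_discount"}
# The pct suffix keeps 10% and 100% apart.
assert matching("disc-10pct") == {"happy", "tricky_name"}
# The influencer named momlife does not leak into the mom persona.
assert matching("persona-mom") == {"happy", "full_discount"}
print("all grouping checks passed")
```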


Step 7 — install and ship

For Cowork / Claude Code teams:

  1. Create the folder ~/.claude/skills/<brand>-utm-builder/

  2. Drop SKILL.md inside

  3. Restart Cowork or Claude Code

  4. Test by asking Claude to build a UTM for a sample campaign

For Claude.ai / ChatGPT teams:

  1. Open <brand>-utm-prompt.md

  2. Copy the whole thing into a new conversation

  3. Start describing campaigns


Querying the data later

Once you have campaigns flowing through the analytics tool, every dimension is one contains() query away:

  • All nano-tier traffic on Instagram: utm_source = "instagram" AND utm_campaign contains "tier-nano"

  • All bedding campaigns (including bundles): utm_campaign contains "prod-bedding"

  • All mom-persona campaigns: utm_campaign contains "persona-mom"

  • All 10% discount campaigns: utm_campaign contains "disc-10pct"

  • Specific partner across channels: utm_campaign contains "name-emma"

Combine with AND/OR for slices: contains("tier-nano") AND contains("persona-fashion") returns nano fashion creators only.
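
If you export the data and slice it yourself, the same logic is a filter over rows. This is a sketch over invented rows; the `contains_all` helper and the row layout are assumptions, and your analytics tool's own query syntax will differ.

```python
# Hypothetical exported rows; real exports will carry more columns.
rows = [
    {"utm_source": "instagram",
     "utm_campaign": "prod-bedding_tier-nano_name-lena_persona-fashion_disc-none"},
    {"utm_source": "tiktok",
     "utm_campaign": "prod-bedding_tier-nano_name-kai_persona-mom_disc-10pct"},
    {"utm_source": "instagram",
     "utm_campaign": "prod-towels_tier-micro_name-emma_persona-fashion_disc-10pct"},
]


def contains_all(campaign: str, *tokens: str) -> bool:
    """AND-combine several contains() conditions."""
    return all(token in campaign for token in tokens)


nano_instagram = [
    row for row in rows
    if row["utm_source"] == "instagram"
    and contains_all(row["utm_campaign"], "tier-nano")
]
nano_fashion = [
    row for row in rows
    if contains_all(row["utm_campaign"], "tier-nano", "persona-fashion")
]
print(len(nano_instagram), len(nano_fashion))  # 1 1
```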


Extending the dictionary

When marketing wants a new product category, persona, or partner type:

  1. Add the value to the SKILL.md dictionary section

  2. Add it to the prompt copy

  3. Communicate the change to anyone using the log sheet

  4. Avoid renaming existing values — it retroactively breaks historical aggregations

Keep a short changelog at the bottom of the SKILL.md file noting when each new value was added. It helps later when you're investigating "why don't I see X data before this date".


FAQ

Can I use utm_term or utm_content instead of stuffing everything into utm_campaign?
You can, but you don't gain much. Not every analytics tool, ad network, or redirect preserves utm_term and utm_content, while utm_campaign is universally respected. Stuffing all dimensions into utm_campaign keeps the schema portable and the queries simple.

The campaign strings are getting long. Is that a problem?
Browsers and analytics tools generally handle URLs of around 2,000 characters without complaint, so a 100-character utm_campaign is fine. If readability bothers you, remember that humans don't read these; your analytics tool does.

What if a marketer ignores the skill and builds UTMs manually?
Two safeguards: (1) require all team UTMs to go through the log sheet — un-logged campaigns won't appear in reports; (2) add a downstream validation in your analytics that flags utm_campaign values not matching the expected format and surfaces them in a "needs cleaning" view.
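
The downstream check in safeguard (2) could look something like the sketch below. It only checks the shape of the string (field count, field order, repeated prefixes, lowercase alphanumeric values), not whether each value is in the dictionary. The field order and the `needs_cleaning` name are assumptions for illustration.

```python
import re

# Hypothetical fixed field order; use your own channel's fields.
FIELD_ORDER = ["prod", "tier", "name", "persona", "disc"]
VALUE_RE = re.compile(r"[a-z0-9]+")


def needs_cleaning(campaign: str) -> bool:
    """Return True if the campaign string breaks the expected format."""
    fields = campaign.split("_")
    if len(fields) != len(FIELD_ORDER):
        return True
    for expected_key, field in zip(FIELD_ORDER, fields):
        tokens = field.split("-")
        if len(tokens) % 2 != 0:
            return True  # a key without a value, or a hyphen inside a value
        keys, values = tokens[0::2], tokens[1::2]
        if any(key != expected_key for key in keys):
            return True  # wrong field order, or bundle without repeated prefix
        if not all(VALUE_RE.fullmatch(value) for value in values):
            return True  # uppercase, empty, or special characters
    return False


print(needs_cleaning("prod-bedding_tier-micro_name-emma_persona-mom_disc-10pct"))  # False
print(needs_cleaning("Emma_Spring2026_10pct"))                                     # True
print(needs_cleaning("spring-sale-emma-10off"))                                    # True
```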

How often should I expect to update the dictionary?
Every couple of months for product categories and persona buckets, more often for partner names. Tier and discount formats should rarely change.

What about non-influencer channels — does this still apply?
Yes. Same convention, different dimensions. For paid social you might use audience, creative, placement. For email you might use flow, step, segment. Pick the dimensions you want to group by, define the dictionary, and the same skill template works.


Appendix — minimal CSV template

date,destination_url,platform,prod,tier,influencer_handle,influencer_display_name,persona,discount,utm_medium,utm_source,utm_campaign,full_url,notes

Customize the dimension columns to match your dictionary. Keep the utm_* and full_url columns as-is.
