suppress-ghost-contacts
Suppress Ghost Contacts (Delivered, Never Opened)
Purpose
Ghost contacts have received marketing emails but have never opened a single one. They are the largest threat to email deliverability because ISPs like Gmail and Microsoft track engagement at the sender level. Consistently sending to people who never open signals that the sender is producing unwanted email, causing inbox placement to deteriorate even for engaged contacts.
Prerequisites
- A HubSpot private app access token with
crm.objects.contacts.readandcrm.lists.read/crm.lists.writescopes - Python 3.10+ with
uvfor package management - A
.envfile containingHUBSPOT_ACCESS_TOKEN - Super Admin or Marketing Hub Admin permissions for the manual UI suppression step
Key Constraint
hs_marketable_status is read-only via the API. Suppression must happen in the HubSpot UI.
CRITICAL: "Never Opened" Is Null, Not Zero
HubSpot stores "never opened" as a null/absent property, not as the number 0. You MUST use the NOT_HAS_PROPERTY operator to find contacts who have never opened an email. Using EQ 0 will return zero results because the property is not set at all for these contacts.
# CORRECT - finds contacts who have never opened
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"}
# WRONG - returns nothing because the property does not exist on these contacts
{"propertyName": "hs_email_open", "operator": "EQ", "value": "0"}
The same applies to hs_email_bounce -- "never bounced" is also null.
Execution Pattern
This skill follows a 4-stage execution pattern: Plan -> Before State -> Execute -> After State.
Stage 1: Plan
Before writing any code, confirm with the user:
- Graduated approach: Recommend suppressing only contacts above your delivery threshold (typically 5-15, adjust based on your email cadence) first. Contacts below that threshold may not have had enough chances to engage.
- Overlap with previous processes: Some ghost contacts may already be non-marketing from hard-bounce or unsubscribe suppression. The Before State will measure this overlap.
- Open tracking caveat: Some email clients block tracking pixels. However, at the scale of thousands of contacts with zero opens across multiple sends, the overwhelming majority are genuinely unengaged.
- Apple Mail Privacy Protection: Introduced in iOS 15 / macOS Monterey, it pre-loads tracking pixels, which can create false-positive opens. Contacts who do NOT show opens despite this feature are almost certainly truly unengaged.
Stage 2: Before State
Discover all ghost contacts, break down by delivery volume, and generate an audit CSV.
"""
Before State: Count and audit ghost contacts.
Definition: emails delivered > 0, emails opened = null, emails bounced = null.
"""
import os
import csv
import time
import requests
from dotenv import load_dotenv
load_dotenv()
TOKEN = os.environ["HUBSPOT_ACCESS_TOKEN"]
BASE = "https://api.hubapi.com"
headers = {
"Authorization": f"Bearer {TOKEN}",
"Content-Type": "application/json",
}
url = f"{BASE}/crm/v3/objects/contacts/search"
# Ghost contact filter definition
GHOST_FILTERS = [
{"propertyName": "hs_email_delivered", "operator": "GT", "value": "0"},
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"},
{"propertyName": "hs_email_bounce", "operator": "NOT_HAS_PROPERTY"},
]
# --- Step 1: Total ghost contacts ---
resp = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": GHOST_FILTERS}],
"limit": 1,
})
resp.raise_for_status()
total_ghosts = resp.json().get("total", 0)
print(f"Total ghost contacts: {total_ghosts}")
# --- Step 2: How many are still marketing? ---
resp2 = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": GHOST_FILTERS + [
{"propertyName": "hs_marketable_status", "operator": "EQ", "value": "true"},
]}],
"limit": 1,
})
resp2.raise_for_status()
still_marketing = resp2.json().get("total", 0)
already_non_marketing = total_ghosts - still_marketing
print(f"Still marketing: {still_marketing}")
print(f"Already non-marketing (from prior processes): {already_non_marketing}")
# --- Step 3: Breakdown by delivery volume ---
print("\nGhost contacts by delivery volume:")
brackets = [
("1-10 emails", "0", "10"),
("11-25 emails", "10", "25"),
("26-50 emails", "25", "50"),
]
for label, gt_val, lte_val in brackets:
resp_b = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": [
{"propertyName": "hs_email_delivered", "operator": "GT", "value": gt_val},
{"propertyName": "hs_email_delivered", "operator": "LTE", "value": lte_val},
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"},
{"propertyName": "hs_email_bounce", "operator": "NOT_HAS_PROPERTY"},
]}],
"limit": 1,
})
if resp_b.status_code == 200:
print(f" {label}: {resp_b.json().get('total', 0)}")
time.sleep(0.1)
# 50+ emails
resp_50 = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": [
{"propertyName": "hs_email_delivered", "operator": "GT", "value": "50"},
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"},
{"propertyName": "hs_email_bounce", "operator": "NOT_HAS_PROPERTY"},
]}],
"limit": 1,
})
if resp_50.status_code == 200:
print(f" 50+ emails: {resp_50.json().get('total', 0)}")
# Worst offenders count (above your delivery threshold)
WORST_OFFENDER_THRESHOLD = 15 # Adjust based on your email cadence (typically 5-15)
resp_worst = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": [
{"propertyName": "hs_email_delivered", "operator": "GT", "value": str(WORST_OFFENDER_THRESHOLD - 1)},
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"},
{"propertyName": "hs_email_bounce", "operator": "NOT_HAS_PROPERTY"},
]}],
"limit": 1,
})
resp_worst.raise_for_status()
worst_offenders = resp_worst.json().get("total", 0)
print(f"\n{WORST_OFFENDER_THRESHOLD}+ delivered (recommended for immediate suppression): {worst_offenders}")
Step 4: Export full CSV using segmented queries
The Search API caps at 10K results. For ghost contacts (often >10K), segment by delivery volume brackets to bypass the limit.
# --- Step 4: Full CSV export using segmented queries ---
PROPS = [
"email", "firstname", "lastname", "hs_email_delivered",
"hs_email_open", "hs_email_bounce", "hs_marketable_status",
"lifecyclestage", "createdate",
]
# Each segment must be under 10K for full pagination
SEGMENTS = [
("1-5 delivered", "0", "5"),
("6-10 delivered", "5", "10"),
("11-20 delivered", "10", "20"),
("21-35 delivered", "20", "35"),
("36-50 delivered", "35", "50"),
("51+ delivered", "50", None),
]
all_contacts = []
for label, gt_val, lte_val in SEGMENTS:
seg_filters = [
{"propertyName": "hs_email_delivered", "operator": "GT", "value": gt_val},
{"propertyName": "hs_email_open", "operator": "NOT_HAS_PROPERTY"},
{"propertyName": "hs_email_bounce", "operator": "NOT_HAS_PROPERTY"},
]
if lte_val:
seg_filters.append(
{"propertyName": "hs_email_delivered", "operator": "LTE", "value": lte_val}
)
after = None
seg_count = 0
while True:
payload = {
"filterGroups": [{"filters": seg_filters}],
"properties": PROPS,
"limit": 100,
}
if after:
payload["after"] = after
resp = requests.post(url, headers=headers, json=payload)
if resp.status_code != 200:
break
data = resp.json()
for contact in data.get("results", []):
props = contact.get("properties", {})
all_contacts.append({
"id": contact["id"],
"email": props.get("email", ""),
"firstname": props.get("firstname", ""),
"lastname": props.get("lastname", ""),
"emails_delivered": props.get("hs_email_delivered", ""),
"emails_opened": props.get("hs_email_open", ""),
"emails_bounced": props.get("hs_email_bounce", ""),
"marketable_status": props.get("hs_marketable_status", ""),
"lifecycle_stage": props.get("lifecyclestage", ""),
"createdate": props.get("createdate", ""),
})
seg_count += 1
paging = data.get("paging", {})
after = paging.get("next", {}).get("after")
if not after:
break
time.sleep(0.12)
print(f" {label}: {seg_count} contacts")
os.makedirs("data/audit-logs", exist_ok=True)
csv_path = "data/audit-logs/ghost-contacts.csv"
with open(csv_path, "w", newline="") as f:
writer = csv.DictWriter(f, fieldnames=[
"id", "email", "firstname", "lastname", "emails_delivered",
"emails_opened", "emails_bounced", "marketable_status",
"lifecycle_stage", "createdate",
])
writer.writeheader()
writer.writerows(all_contacts)
print(f"\nAudit CSV saved: {csv_path} ({len(all_contacts)} records)")
Stage 3: Execute
Step 3a: Create HubSpot active lists via API
Create two lists: a main suppression list and a worst-offender review list.
"""
Execute (API part): Create HubSpot active lists.
"""
# Main ghost list
list1_payload = {
"name": "CLEANUP: Ghost Contacts - Never Opened",
"objectTypeId": "0-1",
"processingType": "DYNAMIC",
"filterBranch": {
"filterBranchType": "OR",
"filterBranches": [
{
"filterBranchType": "AND",
"filterBranches": [],
"filters": [
{
"filterType": "PROPERTY",
"property": "hs_email_delivered",
"operation": {
"operationType": "NUMBER",
"operator": "IS_GREATER_THAN",
"value": 0,
},
},
{
"filterType": "PROPERTY",
"property": "hs_email_open",
"operation": {
"operationType": "ALL_PROPERTY",
"operator": "IS_UNKNOWN",
},
},
{
"filterType": "PROPERTY",
"property": "hs_email_bounce",
"operation": {
"operationType": "ALL_PROPERTY",
"operator": "IS_UNKNOWN",
},
},
],
}
],
"filters": [],
},
}
resp1 = requests.post(f"{BASE}/crm/v3/lists", headers=headers, json=list1_payload)
if resp1.status_code in (200, 201):
lid1 = resp1.json().get("listId") or resp1.json().get("list", {}).get("listId")
print(f"Main list created! ID: {lid1}")
elif resp1.status_code == 409:
print("Main list already exists.")
# Worst-offender sub-list (above your delivery threshold)
list2_payload = {
"name": "REVIEW: Ghost Contacts - High Delivery No Opens",
"objectTypeId": "0-1",
"processingType": "DYNAMIC",
"filterBranch": {
"filterBranchType": "OR",
"filterBranches": [
{
"filterBranchType": "AND",
"filterBranches": [],
"filters": [
{
"filterType": "PROPERTY",
"property": "hs_email_delivered",
"operation": {
"operationType": "NUMBER",
"operator": "IS_GREATER_THAN",
"value": 14, # Adjust to match your delivery threshold minus 1
},
},
{
"filterType": "PROPERTY",
"property": "hs_email_open",
"operation": {
"operationType": "ALL_PROPERTY",
"operator": "IS_UNKNOWN",
},
},
{
"filterType": "PROPERTY",
"property": "hs_email_bounce",
"operation": {
"operationType": "ALL_PROPERTY",
"operator": "IS_UNKNOWN",
},
},
],
}
],
"filters": [],
},
}
resp2 = requests.post(f"{BASE}/crm/v3/lists", headers=headers, json=list2_payload)
if resp2.status_code in (200, 201):
lid2 = resp2.json().get("listId") or resp2.json().get("list", {}).get("listId")
print(f"Review list created! ID: {lid2}")
elif resp2.status_code == 409:
print("Review list already exists.")
Step 3b: Suppress contacts in HubSpot UI
Instruct the user:
- Open the list "CLEANUP: Ghost Contacts - Never Opened" in HubSpot
- Click the checkbox in the table header row
- Click "Select all N contacts in this list"
- Click More > Set marketing contact status
- Select Set as non-marketing contact
- Click Confirm
Graduated approach recommendation: If the user prefers a conservative approach, suppress only the "REVIEW: Ghost Contacts - High Delivery No Opens" list first. Monitor contacts below your delivery threshold separately -- they may engage with future emails.
Step 3c: Keep both lists active permanently
- The main list captures new ghost contacts over time as emails are sent
- The review list grows as contacts accumulate more delivered emails with no engagement
- Run suppression monthly; review for deletion quarterly
Stage 4: After State
Re-run the Before State queries and compare.
"""
After State: Verify ghost contacts have been suppressed.
"""
# Re-check still-marketing count
resp = requests.post(url, headers=headers, json={
"filterGroups": [{"filters": GHOST_FILTERS + [
{"propertyName": "hs_marketable_status", "operator": "EQ", "value": "true"},
]}],
"limit": 1,
})
resp.raise_for_status()
remaining = resp.json().get("total", 0)
if remaining == 0:
print("SUCCESS: All ghost contacts are now non-marketing.")
else:
print(f"WARNING: {remaining} ghost contacts are still marketing.")
Also check email performance: After 1-2 email sends post-suppression, open rates should improve noticeably because thousands of guaranteed-zero-open contacts have been removed from the send pool.
Safety Mechanisms
| Mechanism | Detail |
|---|---|
| CSV audit trail | Full export with delivery counts, lifecycle stage, and marketing status before any action. |
| Graduated suppression | Recommend starting with contacts above your delivery threshold (typically 5-15). Monitor those below it separately. |
| Overlap detection | Before State measures how many are already non-marketing from prior processes. |
| Two-tier list system | Main list for all ghosts, review list for worst offenders. |
| Non-destructive | Suppression, not deletion. CRM records are preserved. |
| Confirmation prompt | Present all findings to the user before proceeding. |
Technical Gotchas
-
CRITICAL:
NOT_HAS_PROPERTY, notEQ 0. HubSpot stores "never opened" as a null/absent property. UsingEQ 0returns nothing. This is the most common mistake with this process. -
Search API pagination limit is 10K. Ghost contacts often exceed 10K. Use segmented queries by delivery volume brackets (1-5, 6-10, 11-20, etc.) to export the complete set. Choose segment boundaries so each segment stays under 10K.
-
hs_email_deliveredis the correct property for delivery count. Do not confuse withhs_email_sent(sent but not necessarily delivered) ornum_unique_conversion_events. -
hs_email_opencounts total opens, not unique opens. But for ghost contacts, both are null because no open ever occurred. -
List API filter for "is unknown" uses
operationType: "ALL_PROPERTY"withoperator: "IS_UNKNOWN". This is different from the Search API'sNOT_HAS_PROPERTY. -
hs_marketable_statusis read-only via API. Same constraint as all suppression skills. Manual UI action or workflow-flag workaround required. -
Overlap with hard-bounce and unsubscribe processes: Some ghost contacts may have already been suppressed. The Before State overlap detection prevents double-counting the billing impact.
Package Setup
uv init hubspot-cleanup
cd hubspot-cleanup
uv add requests python-dotenv
Create a .env file:
HUBSPOT_ACCESS_TOKEN=pat-na1-xxxxxxxx