spend-analysis

Installation
SKILL.md

Spend Analysis

You are an expert in procurement spend analysis and analytics. Your goal is to help organizations understand their spending patterns, identify savings opportunities, improve compliance, and enable data-driven procurement decisions through comprehensive spend visibility and analysis.

Initial Assessment

Before conducting spend analysis, understand:

  1. Analysis Objectives

    • What's the primary goal? (cost savings, compliance, consolidation)
    • Key questions to answer?
    • Stakeholders and their needs?
    • Expected outcomes?
  2. Data Availability

    • Data sources? (ERP, P2P system, AP, credit cards)
    • Data quality issues?
    • Time period covered?
    • Level of detail available?
  3. Current State

    • Existing spend visibility?
    • Known issues or opportunities?
    • Previous analysis efforts?
    • Category management maturity?
  4. Scope & Resources

    • Total addressable spend?
    • Categories to prioritize?
    • Tools and systems available?
    • Timeline for analysis?

Spend Analysis Framework

The Spend Cube Model

Three Dimensions:

  1. What - Categories, commodities, items
  2. Who - Suppliers, vendors, merchants
  3. Where - Business units, locations, cost centers

Analysis Types:

  • Slice by category (see spend by supplier within category)
  • Slice by supplier (see spend by category per supplier)
  • Slice by business unit (see spend patterns by location)
  • Drill-down (aggregate to detail)
  • Roll-up (detail to aggregate)

Data Collection & Preparation

Data Sources

Primary Sources:

  • ERP systems (SAP, Oracle, etc.)
  • Procure-to-Pay (P2P) platforms
  • Accounts Payable (AP) systems
  • Purchase order data
  • Invoice/payment data

Secondary Sources:

  • Credit card transactions
  • Expense reports
  • Contracts and agreements
  • Supplier master data
  • Catalogs and price lists

Data Extraction

import pandas as pd
import numpy as np

def extract_spend_data(data_sources):
    """
    Extract and consolidate spend data from multiple sources

    data_sources: dict with {source_name: file_path or dataframe}
    """

    all_data = []

    for source, data in data_sources.items():
        if isinstance(data, str):
            # Load from file
            if data.endswith('.csv'):
                df = pd.read_csv(data)
            elif data.endswith('.xlsx'):
                df = pd.read_excel(data)
        else:
            df = data.copy()

        # Add source column
        df['data_source'] = source

        # Standardize column names
        column_mapping = {
            'vendor': 'supplier_name',
            'vendor_name': 'supplier_name',
            'supplier': 'supplier_name',
            'amount': 'spend_amount',
            'total': 'spend_amount',
            'date': 'transaction_date',
            'invoice_date': 'transaction_date',
            'payment_date': 'transaction_date',
        }

        df = df.rename(columns={
            k: v for k, v in column_mapping.items() if k in df.columns
        })

        all_data.append(df)

    # Concatenate all sources
    consolidated = pd.concat(all_data, ignore_index=True, sort=False)

    return consolidated


def clean_spend_data(df):
    """
    Clean and standardize spend data

    Returns: cleaned DataFrame
    """

    df = df.copy()

    # Remove duplicates
    initial_rows = len(df)
    df = df.drop_duplicates(subset=['supplier_name', 'transaction_date', 'spend_amount'],
                           keep='first')
    duplicates_removed = initial_rows - len(df)

    # Standardize supplier names
    df['supplier_name'] = df['supplier_name'].str.strip().str.upper()
    df['supplier_name'] = df['supplier_name'].str.replace(r'\s+', ' ', regex=True)

    # Handle common variations
    df['supplier_name'] = df['supplier_name'].replace({
        r'.*\bINC\.?$': 'INC',
        r'.*\bLLC\.?$': 'LLC',
        r'.*\bCORP\.?$': 'CORP',
        r'.*\bLTD\.?$': 'LTD',
    }, regex=True)

    # Ensure numeric spend
    df['spend_amount'] = pd.to_numeric(df['spend_amount'], errors='coerce')

    # Remove negative amounts (credits handled separately)
    df = df[df['spend_amount'] > 0]

    # Convert dates
    df['transaction_date'] = pd.to_datetime(df['transaction_date'], errors='coerce')

    # Extract year and month
    df['year'] = df['transaction_date'].dt.year
    df['month'] = df['transaction_date'].dt.month
    df['quarter'] = df['transaction_date'].dt.quarter

    # Remove rows with missing critical fields
    df = df.dropna(subset=['supplier_name', 'spend_amount', 'transaction_date'])

    print(f"Data Cleaning Summary:")
    print(f"  Duplicates removed: {duplicates_removed:,}")
    print(f"  Final records: {len(df):,}")
    print(f"  Total spend: ${df['spend_amount'].sum():,.2f}")

    return df

Data Enrichment

def enrich_spend_data(df, supplier_mapping=None, category_mapping=None):
    """
    Enrich spend data with classifications

    supplier_mapping: dict or DataFrame with supplier standardization
    category_mapping: dict or DataFrame with category assignments
    """

    df = df.copy()

    # Supplier normalization
    if supplier_mapping is not None:
        if isinstance(supplier_mapping, dict):
            df['supplier_normalized'] = df['supplier_name'].map(supplier_mapping)
        else:
            df = df.merge(
                supplier_mapping[['supplier_name', 'supplier_normalized']],
                on='supplier_name',
                how='left'
            )

        # Use normalized name if available, otherwise original
        df['supplier_normalized'] = df['supplier_normalized'].fillna(df['supplier_name'])
    else:
        df['supplier_normalized'] = df['supplier_name']

    # Category classification
    if category_mapping is not None:
        if isinstance(category_mapping, dict):
            df['category'] = df['supplier_normalized'].map(category_mapping)
        else:
            df = df.merge(
                category_mapping[['supplier_normalized', 'category']],
                on='supplier_normalized',
                how='left'
            )

        # Flag unclassified spend
        df['category'] = df['category'].fillna('Unclassified')

    # Additional enrichment
    # Spend tier based on amount
    df['spend_tier'] = pd.cut(
        df['spend_amount'],
        bins=[0, 1000, 10000, 100000, float('inf')],
        labels=['<$1K', '$1K-$10K', '$10K-$100K', '>$100K']
    )

    return df

Spend Classification

Category Taxonomy

Common Category Hierarchy:

Level 1: Major Category
  Level 2: Sub-Category
    Level 3: Commodity
      Level 4: Item/SKU

Examples:
- Direct Materials
  - Raw Materials
    - Steel
      - Cold-rolled steel
      - Hot-rolled steel
  - Components
    - Electronics
    - Mechanical parts

- Indirect Materials
  - MRO (Maintenance, Repair, Operations)
    - Tools
    - Safety equipment
  - Facilities
    - Utilities
    - Cleaning supplies

- Services
  - Professional Services
    - Consulting
    - Legal
  - IT Services
    - Software licenses
    - Managed services

Automated Classification

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

class SpendClassifier:
    """Machine learning-based spend classification"""

    def __init__(self):
        self.model = Pipeline([
            ('tfidf', TfidfVectorizer(max_features=500, ngram_range=(1, 2))),
            ('classifier', MultinomialNB())
        ])
        self.trained = False

    def train(self, training_data):
        """
        Train classifier on labeled data

        training_data: DataFrame with 'description' and 'category' columns
        """

        X = training_data['description']
        y = training_data['category']

        self.model.fit(X, y)
        self.trained = True

        # Calculate accuracy
        train_accuracy = self.model.score(X, y)
        print(f"Training accuracy: {train_accuracy:.2%}")

    def predict(self, descriptions):
        """Predict categories for new descriptions"""

        if not self.trained:
            raise ValueError("Model must be trained first")

        predictions = self.model.predict(descriptions)
        probabilities = self.model.predict_proba(descriptions)

        return pd.DataFrame({
            'description': descriptions,
            'predicted_category': predictions,
            'confidence': probabilities.max(axis=1)
        })

    def classify_spend_data(self, df, description_column='description',
                           confidence_threshold=0.7):
        """
        Classify entire spend dataset

        Returns: DataFrame with predicted categories
        """

        if not self.trained:
            raise ValueError("Model must be trained first")

        predictions = self.predict(df[description_column])

        # Add to original dataframe
        result = df.copy()
        result['predicted_category'] = predictions['predicted_category']
        result['classification_confidence'] = predictions['confidence']

        # Flag low confidence
        result['needs_review'] = result['classification_confidence'] < confidence_threshold

        return result


# Example usage
# Assume we have some labeled training data
training_data = pd.DataFrame({
    'description': [
        'office supplies paper pens',
        'laptop computer IT hardware',
        'consulting services strategic',
        'steel raw material manufacturing',
        'janitorial cleaning service',
        'software license subscription',
    ],
    'category': [
        'Office Supplies',
        'IT Hardware',
        'Professional Services',
        'Raw Materials',
        'Facilities',
        'IT Software',
    ]
})

classifier = SpendClassifier()
classifier.train(training_data)

# Classify new spend
new_spend = pd.DataFrame({
    'description': [
        'desktop computers monitors',
        'office furniture desks chairs',
        'legal advisory services'
    ]
})

predictions = classifier.predict(new_spend['description'])
print(predictions)

Spend Analysis Techniques

Pareto Analysis (80/20 Rule)

def pareto_analysis(df, group_by='supplier_normalized', spend_column='spend_amount'):
    """
    Perform Pareto analysis to identify top contributors

    Returns: DataFrame with cumulative spend percentages
    """

    # Aggregate spend
    spend_summary = df.groupby(group_by)[spend_column].sum().reset_index()
    spend_summary = spend_summary.sort_values(spend_column, ascending=False)

    # Calculate percentages
    total_spend = spend_summary[spend_column].sum()
    spend_summary['spend_pct'] = spend_summary[spend_column] / total_spend * 100
    spend_summary['cumulative_pct'] = spend_summary['spend_pct'].cumsum()

    # Classify into A, B, C
    spend_summary['abc_class'] = 'C'
    spend_summary.loc[spend_summary['cumulative_pct'] <= 80, 'abc_class'] = 'A'
    spend_summary.loc[
        (spend_summary['cumulative_pct'] > 80) &
        (spend_summary['cumulative_pct'] <= 95),
        'abc_class'
    ] = 'B'

    # Add rank
    spend_summary['rank'] = range(1, len(spend_summary) + 1)

    return spend_summary


# Example with visualization
import matplotlib.pyplot as plt

def plot_pareto(df, group_by='supplier_normalized', top_n=20):
    """Create Pareto chart"""

    pareto_df = pareto_analysis(df, group_by=group_by)

    # Take top N
    plot_data = pareto_df.head(top_n)

    fig, ax1 = plt.subplots(figsize=(14, 6))

    # Bar chart
    x = range(len(plot_data))
    ax1.bar(x, plot_data['spend_pct'], color='steelblue', alpha=0.7)
    ax1.set_xlabel(group_by.replace('_', ' ').title())
    ax1.set_ylabel('% of Total Spend', color='steelblue')
    ax1.tick_params(axis='y', labelcolor='steelblue')
    ax1.set_xticks(x)
    ax1.set_xticklabels(plot_data[group_by], rotation=45, ha='right')

    # Cumulative line
    ax2 = ax1.twinx()
    ax2.plot(x, plot_data['cumulative_pct'], color='red', marker='o', linewidth=2)
    ax2.set_ylabel('Cumulative %', color='red')
    ax2.tick_params(axis='y', labelcolor='red')
    ax2.set_ylim(0, 105)
    ax2.axhline(y=80, color='gray', linestyle='--', alpha=0.5)
    ax2.text(len(plot_data)-1, 82, '80%', color='gray')

    plt.title(f'Pareto Analysis: Top {top_n} {group_by.replace("_", " ").title()}')
    plt.tight_layout()

    return fig, pareto_df


# Usage
# fig, pareto_summary = plot_pareto(spend_df, group_by='supplier_normalized')
# plt.show()

Supplier Consolidation Analysis

def supplier_consolidation_opportunity(df, category='category'):
    """
    Identify opportunities for supplier consolidation

    Returns: DataFrame with consolidation opportunities by category
    """

    consolidation_opportunities = []

    for cat in df[category].unique():
        cat_data = df[df[category] == cat]

        num_suppliers = cat_data['supplier_normalized'].nunique()
        total_spend = cat_data['spend_amount'].sum()
        num_transactions = len(cat_data)

        # Get top supplier
        top_supplier = cat_data.groupby('supplier_normalized')['spend_amount'].sum().idxmax()
        top_supplier_spend = cat_data[cat_data['supplier_normalized'] == top_supplier]['spend_amount'].sum()
        top_supplier_pct = top_supplier_spend / total_spend * 100

        # Tail spend (suppliers with <2% of category spend)
        supplier_spend = cat_data.groupby('supplier_normalized')['spend_amount'].sum()
        tail_suppliers = (supplier_spend / total_spend < 0.02).sum()
        tail_spend = supplier_spend[supplier_spend / total_spend < 0.02].sum()

        # Opportunity score (more suppliers + more tail spend = higher opportunity)
        opportunity_score = (
            (num_suppliers / 10) * 40 +  # Supplier count (normalized)
            (tail_spend / total_spend) * 60  # Tail spend %
        )

        consolidation_opportunities.append({
            'category': cat,
            'total_spend': total_spend,
            'num_suppliers': num_suppliers,
            'top_supplier': top_supplier,
            'top_supplier_pct': top_supplier_pct,
            'tail_suppliers': tail_suppliers,
            'tail_spend': tail_spend,
            'tail_spend_pct': tail_spend / total_spend * 100,
            'opportunity_score': min(100, opportunity_score)
        })

    result = pd.DataFrame(consolidation_opportunities)
    result = result.sort_values('opportunity_score', ascending=False)

    return result


# Example
# consolidation_opps = supplier_consolidation_opportunity(spend_df)
# print(consolidation_opps.head(10))

Maverick Spend Detection

Maverick Spend = Off-contract or non-preferred supplier purchases

def detect_maverick_spend(df, preferred_suppliers, contracts):
    """
    Identify maverick (off-contract) spend

    preferred_suppliers: list of approved suppliers
    contracts: DataFrame with contract details
    """

    df = df.copy()

    # Flag non-preferred suppliers
    df['is_preferred_supplier'] = df['supplier_normalized'].isin(preferred_suppliers)

    # Match to contracts
    if contracts is not None:
        df = df.merge(
            contracts[['supplier_normalized', 'category', 'contract_id']],
            on=['supplier_normalized', 'category'],
            how='left'
        )
        df['has_contract'] = df['contract_id'].notna()
    else:
        df['has_contract'] = False

    # Classify spend
    df['spend_type'] = 'Maverick'
    df.loc[df['is_preferred_supplier'] & df['has_contract'], 'spend_type'] = 'On-Contract'
    df.loc[df['is_preferred_supplier'] & ~df['has_contract'], 'spend_type'] = 'Preferred (No Contract)'

    # Summarize
    maverick_summary = df.groupby('spend_type')['spend_amount'].agg([
        ('total_spend', 'sum'),
        ('num_transactions', 'count')
    ]).reset_index()

    total = df['spend_amount'].sum()
    maverick_summary['pct_of_total'] = maverick_summary['total_spend'] / total * 100

    return df, maverick_summary


# Example
preferred_suppliers = ['SUPPLIER A INC', 'SUPPLIER B LLC', 'SUPPLIER C CORP']

contracts = pd.DataFrame({
    'supplier_normalized': ['SUPPLIER A INC', 'SUPPLIER B LLC'],
    'category': ['Office Supplies', 'IT Hardware'],
    'contract_id': ['CNT-001', 'CNT-002']
})

spend_with_flags, maverick_summary = detect_maverick_spend(
    spend_df, preferred_suppliers, contracts
)

print("\nMaverick Spend Summary:")
print(maverick_summary)

Price Variance Analysis

def price_variance_analysis(df, item_column='item_description',
                           price_column='unit_price'):
    """
    Analyze price variance for same items across transactions

    Identifies opportunities for price standardization
    """

    # Group by item
    price_stats = df.groupby(item_column)[price_column].agg([
        ('min_price', 'min'),
        ('max_price', 'max'),
        ('avg_price', 'mean'),
        ('std_price', 'std'),
        ('num_purchases', 'count')
    ]).reset_index()

    # Calculate variance
    price_stats['price_variance'] = price_stats['max_price'] - price_stats['min_price']
    price_stats['variance_pct'] = (
        price_stats['price_variance'] / price_stats['avg_price'] * 100
    )

    # Calculate potential savings
    # If all purchases were at min price
    item_spend = df.groupby(item_column)['spend_amount'].sum()
    price_stats = price_stats.merge(
        item_spend.rename('total_spend'),
        left_on=item_column,
        right_index=True
    )

    price_stats['potential_savings'] = (
        price_stats['total_spend'] *
        (1 - price_stats['min_price'] / price_stats['avg_price'])
    )

    # Sort by savings opportunity
    price_stats = price_stats.sort_values('potential_savings', ascending=False)

    # Filter significant variances
    price_stats = price_stats[
        (price_stats['variance_pct'] > 10) &
        (price_stats['num_purchases'] >= 3)
    ]

    return price_stats

Savings Opportunity Identification

Opportunity Categories

1. Price Reduction

  • Competitive bidding
  • Volume consolidation
  • Contract negotiation
  • Market pricing benchmarks

2. Demand Management

  • Specification changes
  • Standardization
  • Usage reduction
  • Substitution

3. Process Improvement

  • Automation (P2P, e-sourcing)
  • Maverick spend elimination
  • Payment terms optimization
  • Invoice accuracy

4. Supplier Optimization

  • Supplier consolidation
  • Strategic partnerships
  • Supplier development
  • Dual sourcing

Savings Calculator

class SavingsOpportunityCalculator:
    """Calculate and prioritize savings opportunities"""

    def __init__(self, current_spend):
        self.current_spend = current_spend
        self.opportunities = []

    def add_price_reduction(self, category, baseline_spend,
                           current_price, target_price, confidence=0.8):
        """Add price reduction opportunity"""

        savings_pct = (current_price - target_price) / current_price
        annual_savings = baseline_spend * savings_pct
        risk_adjusted_savings = annual_savings * confidence

        self.opportunities.append({
            'category': category,
            'type': 'Price Reduction',
            'baseline_spend': baseline_spend,
            'savings_pct': savings_pct * 100,
            'gross_savings': annual_savings,
            'confidence': confidence * 100,
            'risk_adjusted_savings': risk_adjusted_savings,
            'implementation_effort': 'Medium',
            'timeline_months': 3
        })

    def add_consolidation(self, category, baseline_spend,
                         current_suppliers, target_suppliers,
                         expected_discount=0.05, confidence=0.7):
        """Add supplier consolidation opportunity"""

        annual_savings = baseline_spend * expected_discount
        risk_adjusted_savings = annual_savings * confidence

        self.opportunities.append({
            'category': category,
            'type': 'Supplier Consolidation',
            'baseline_spend': baseline_spend,
            'savings_pct': expected_discount * 100,
            'gross_savings': annual_savings,
            'confidence': confidence * 100,
            'risk_adjusted_savings': risk_adjusted_savings,
            'implementation_effort': 'High',
            'timeline_months': 6,
            'details': f'Reduce from {current_suppliers} to {target_suppliers} suppliers'
        })

    def add_demand_reduction(self, category, baseline_spend,
                            reduction_pct, confidence=0.6):
        """Add demand/usage reduction opportunity"""

        annual_savings = baseline_spend * reduction_pct
        risk_adjusted_savings = annual_savings * confidence

        self.opportunities.append({
            'category': category,
            'type': 'Demand Reduction',
            'baseline_spend': baseline_spend,
            'savings_pct': reduction_pct * 100,
            'gross_savings': annual_savings,
            'confidence': confidence * 100,
            'risk_adjusted_savings': risk_adjusted_savings,
            'implementation_effort': 'High',
            'timeline_months': 12
        })

    def add_process_improvement(self, category, baseline_spend,
                               process_cost_reduction, confidence=0.9):
        """Add process improvement opportunity"""

        annual_savings = process_cost_reduction
        risk_adjusted_savings = annual_savings * confidence

        self.opportunities.append({
            'category': category,
            'type': 'Process Improvement',
            'baseline_spend': baseline_spend,
            'savings_pct': (annual_savings / baseline_spend) * 100,
            'gross_savings': annual_savings,
            'confidence': confidence * 100,
            'risk_adjusted_savings': risk_adjusted_savings,
            'implementation_effort': 'Medium',
            'timeline_months': 6
        })

    def get_summary(self):
        """Get prioritized savings opportunities"""

        if not self.opportunities:
            return None

        df = pd.DataFrame(self.opportunities)
        df = df.sort_values('risk_adjusted_savings', ascending=False)

        # Add cumulative
        df['cumulative_savings'] = df['risk_adjusted_savings'].cumsum()

        total_gross = df['gross_savings'].sum()
        total_risk_adjusted = df['risk_adjusted_savings'].sum()

        summary = {
            'total_opportunities': len(df),
            'total_gross_savings': total_gross,
            'total_risk_adjusted_savings': total_risk_adjusted,
            'savings_pct_of_spend': total_risk_adjusted / self.current_spend * 100,
            'opportunities': df
        }

        return summary


# Example usage
calculator = SavingsOpportunityCalculator(current_spend=10000000)

calculator.add_price_reduction(
    category='Office Supplies',
    baseline_spend=500000,
    current_price=10.0,
    target_price=9.0,
    confidence=0.85
)

calculator.add_consolidation(
    category='IT Hardware',
    baseline_spend=1200000,
    current_suppliers=8,
    target_suppliers=3,
    expected_discount=0.08,
    confidence=0.75
)

calculator.add_demand_reduction(
    category='Travel',
    baseline_spend=800000,
    reduction_pct=0.15,
    confidence=0.60
)

calculator.add_process_improvement(
    category='All Categories',
    baseline_spend=10000000,
    process_cost_reduction=100000,
    confidence=0.90
)

summary = calculator.get_summary()

print(f"Total Risk-Adjusted Savings: ${summary['total_risk_adjusted_savings']:,.0f}")
print(f"Savings % of Spend: {summary['savings_pct_of_spend']:.1f}%")
print(f"\nTop Opportunities:")
print(summary['opportunities'][['category', 'type', 'risk_adjusted_savings']].head())

Tools & Libraries

Python Libraries

Data Analysis:

  • pandas: Data manipulation and analysis
  • numpy: Numerical computations
  • scipy: Statistical analysis

Machine Learning:

  • scikit-learn: Classification and clustering
  • fuzzywuzzy: Fuzzy string matching for supplier names
  • spacy, nltk: Natural language processing

Visualization:

  • matplotlib, seaborn: Charts and plots
  • plotly: Interactive dashboards
  • dash: Web-based analytics apps

Commercial Software

Spend Analytics Platforms:

  • Coupa Spend Analysis: Comprehensive spend visibility
  • SAP Ariba Spend Visibility: Real-time spend insights
  • Jaggaer Spend Analytics: AI-powered spend analysis
  • SpendHQ: Spend analytics and intelligence
  • Zycus Spend Analysis: Classification and opportunity identification
  • GEP SMART: Source-to-pay with analytics
  • Ivalua: Spend analysis and strategic sourcing

Business Intelligence:

  • Tableau, Power BI: Data visualization
  • Qlik Sense: Associative analytics
  • ThoughtSpot: Search-driven analytics

Common Challenges & Solutions

Challenge: Poor Data Quality

Problem:

  • Inconsistent supplier names
  • Missing category classifications
  • Incomplete transaction data
  • Multiple systems and formats

Solutions:

  • Data cleansing and normalization
  • Fuzzy matching for supplier names
  • Third-party data enrichment (D&B, etc.)
  • Master data management (MDM)
  • Automated classification (ML)
  • Establish data governance

Challenge: Data Fragmentation

Problem:

  • Spend across multiple systems
  • Decentralized purchasing
  • Shadow IT spending
  • Credit card transactions

Solutions:

  • Centralized data warehouse
  • API integrations from all systems
  • Include all spend sources (P-cards, AP, etc.)
  • Regular data extracts and loads
  • Spend visibility platform

Challenge: Classification Accuracy

Problem:

  • Ambiguous descriptions
  • Multi-category items
  • New suppliers/items
  • Subjective judgment

Solutions:

  • Standardized taxonomy
  • Machine learning classification
  • Human review for low confidence
  • Continuous model improvement
  • Supplier self-classification
  • Regular taxonomy updates

Challenge: Stakeholder Buy-In

Problem:

  • Business units resist centralization
  • "My spending is different"
  • Change management resistance

Solutions:

  • Show category-specific insights
  • Quantify savings opportunities
  • Involve stakeholders in analysis
  • Start with quick wins
  • Transparency and collaboration
  • Executive sponsorship

Challenge: Tracking Realized Savings

Problem:

  • Hard to prove actual savings
  • Attribution questions
  • Baseline shifts
  • One-time vs. recurring

Solutions:

  • Define baseline clearly
  • Track price changes over time
  • Separate one-time from recurring
  • Avoid double-counting
  • Regular savings validation
  • Third-party audits

Output Format

Spend Analysis Report

Executive Summary:

  • Total addressable spend
  • Key findings and insights
  • Top savings opportunities
  • Recommended actions

Spend Overview:

Metric Value % of Total
Total Spend $12.5M 100%
Managed Spend (under contract) $8.2M 66%
Unmanaged Spend $4.3M 34%
Number of Suppliers 1,247 -
Number of Transactions 45,320 -

Spend by Category:

Category Spend % of Total Suppliers Avg Transaction Opportunity
IT Hardware $2.1M 17% 15 $12,350 High
Professional Services $1.8M 14% 45 $8,200 Medium
Office Supplies $1.2M 10% 120 $450 High
Raw Materials $3.4M 27% 25 $28,500 Low
MRO $0.9M 7% 200 $380 High
Other $3.1M 25% 842 $1,100 Medium

Pareto Analysis:

Top 20 Suppliers (1.6% of supplier base):
  - Account for 68% of total spend ($8.5M)
  - Average spend per supplier: $425K

Next 80 Suppliers (6.4%):
  - Account for 22% of spend ($2.8M)
  - Average spend: $35K

Tail 1,147 Suppliers (92%):
  - Account for 10% of spend ($1.2M)
  - Average spend: $1,050
  - Consolidation opportunity: Reduce to ~400 suppliers

Supplier Concentration:

Risk Level # Suppliers Spend % of Total
Critical (>$500K) 8 $5.2M 42%
High ($100K-$500K) 28 $3.8M 30%
Medium ($10K-$100K) 186 $2.3M 18%
Low (<$10K) 1,025 $1.2M 10%

Savings Opportunities:

Opportunity Category Baseline Spend Potential Savings Confidence Priority
Consolidate tail suppliers Office Supplies $1.2M $120K (10%) 80% High
Competitive bid IT Hardware $2.1M $168K (8%) 85% High
Standardize specs MRO $0.9M $72K (8%) 70% Medium
Contract compliance Prof Services $1.8M $90K (5%) 90% High
Price benchmarking Raw Materials $3.4M $102K (3%) 75% Medium
Total - $9.4M $552K (5.9%) - -

Compliance & Risk:

  • Maverick Spend: $1.8M (14% of total)

    • Opportunities: Enforce contract compliance, expand preferred supplier program
  • Single-Source Risk: 12 suppliers (Critical dependencies)

    • Recommendations: Qualify backup suppliers, dual sourcing strategy
  • Price Variance: $245K potential savings

    • Items with >20% price variance across purchases

Recommended Actions:

  1. Immediate (0-3 months)

    • Launch RFP for IT Hardware category
    • Implement contract compliance program
    • Consolidate office supplies to 3 suppliers
  2. Near-term (3-6 months)

    • Classify all unclassified spend
    • Develop category strategies for top 5 categories
    • Establish spend analytics dashboard
  3. Long-term (6-12 months)

    • Roll out e-procurement platform
    • Implement supplier scorecards
    • Quarterly spend reviews with business units

Questions to Ask

If you need more context:

  1. What's the total addressable spend volume?
  2. What data sources are available? (ERP, P2P, AP)
  3. What's the analysis time period?
  4. What are the primary objectives? (savings, compliance, risk)
  5. Any known data quality issues?
  6. Is there an existing category taxonomy?
  7. Current spend visibility and tools?
  8. Key stakeholders and their priorities?
  9. Any specific categories to focus on?
  10. Timeline and resources for the analysis?

Related Skills

  • strategic-sourcing: For executing category strategies
  • supplier-selection: For evaluating new suppliers
  • procurement-optimization: For optimal order allocation
  • contract-management: For contract compliance analysis
  • supplier-risk-management: For supplier concentration risk
  • supply-chain-analytics: For broader supply chain metrics
Related skills

More from kishorkukreja/awesome-supply-chain

Installs
17
GitHub Stars
15
First Seen
Mar 16, 2026