# Time Series Analyst
This skill provides guidance for analyzing temporal data, identifying patterns, and building forecasting models.
## Core Competencies
- Decomposition: Trend, seasonality, residual analysis
- Statistical Methods: ARIMA, SARIMA, Exponential Smoothing
- ML Methods: Prophet, LSTM, Transformer-based models
- Anomaly Detection: Statistical and ML approaches
## Time Series Fundamentals

### Data Characteristics
Before analysis, assess:
| Property | Question | Impact |
|---|---|---|
| Stationarity | Is mean/variance constant? | Method selection |
| Seasonality | Are there repeating patterns? | Model components |
| Trend | Is there long-term direction? | Differencing needs |
| Frequency | What's the sampling rate? | Aggregation choices |
| Missing values | Are there gaps? | Imputation needs |
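These checks are quick to script. A minimal sketch with pandas (the hourly index, gap, and variable names here are illustrative, not part of any real dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series with a gap, to illustrate the checks above
idx = pd.date_range("2024-01-01", periods=100, freq="h")
series = pd.Series(np.random.default_rng(0).normal(size=100), index=idx)
series.iloc[10:13] = np.nan  # simulate missing values

freq = pd.infer_freq(series.index)        # sampling rate ("h" for hourly)
n_missing = series.isna().sum()           # gaps needing imputation
rolling_mean = series.rolling(24).mean()  # eyeball drift in the mean

print(freq, n_missing)
```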
### Stationarity Tests

```python
from statsmodels.tsa.stattools import adfuller, kpss

# Augmented Dickey-Fuller (null hypothesis: non-stationary)
adf_result = adfuller(series)
print(f"ADF Statistic: {adf_result[0]:.4f}")
print(f"p-value: {adf_result[1]:.4f}")
# p < 0.05 suggests stationarity

# KPSS (null hypothesis: stationary)
kpss_result = kpss(series, regression='c')
print(f"KPSS Statistic: {kpss_result[0]:.4f}")
print(f"p-value: {kpss_result[1]:.4f}")
# p > 0.05 suggests stationarity
```

Run both: ADF and KPSS have opposite null hypotheses, so agreement between them is stronger evidence than either test alone.
### Making Series Stationary

```python
import numpy as np
from scipy.stats import boxcox

# Differencing to remove trend
diff_1 = series.diff().dropna()

# Seasonal differencing (period=12 for monthly data with yearly seasonality)
seasonal_diff = series.diff(periods=12).dropna()

# Log transform to stabilize variance (requires strictly positive values)
log_series = np.log(series)

# Box-Cox to estimate an optimal power transform (also requires positive values)
transformed, lambda_param = boxcox(series)
```
## Time Series Decomposition

### Classical Decomposition

```
Additive:        Original = Trend + Seasonal + Residual
Multiplicative:  Original = Trend × Seasonal × Residual
```

```python
from statsmodels.tsa.seasonal import seasonal_decompose, STL

# Classical decomposition
decomposition = seasonal_decompose(
    series,
    model='additive',  # or 'multiplicative'
    period=12
)

# STL (more robust to outliers)
stl = STL(series, period=12, robust=True)
result = stl.fit()

# Access components
trend = result.trend
seasonal = result.seasonal
residual = result.resid
```
### Visualization Pattern

```
┌────────────────────────────────────────┐
│ Original Series                        │ ← Raw data
├────────────────────────────────────────┤
│ Trend Component                        │ ← Long-term direction
├────────────────────────────────────────┤
│ Seasonal Component                     │ ← Repeating patterns
├────────────────────────────────────────┤
│ Residual Component                     │ ← Random noise
└────────────────────────────────────────┘
```
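This four-panel layout maps directly to a matplotlib figure. A self-contained sketch, using synthetic components as stand-ins for `result.trend`, `result.seasonal`, and `result.resid`:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, for scripts/CI
import matplotlib.pyplot as plt

# Synthetic stand-ins for the STL components
t = np.arange(144)
trend = 0.1 * t
seasonal = 2 * np.sin(2 * np.pi * t / 12)
residual = np.random.default_rng(0).normal(0, 0.3, len(t))
original = trend + seasonal + residual

fig, axes = plt.subplots(4, 1, sharex=True, figsize=(8, 8))
panels = [("Original", original), ("Trend", trend),
          ("Seasonal", seasonal), ("Residual", residual)]
for ax, (name, comp) in zip(axes, panels):
    ax.plot(t, comp)
    ax.set_ylabel(name)
fig.savefig("decomposition.png")
```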
## Statistical Forecasting Methods

### ARIMA Model Selection

```
ARIMA(p, d, q):
- p: autoregressive order (read from the PACF cutoff)
- d: differencing order (from stationarity tests)
- q: moving-average order (read from the ACF cutoff)
```

```python
from statsmodels.tsa.arima.model import ARIMA
from pmdarima import auto_arima

# Automatic order selection
auto_model = auto_arima(
    series,
    start_p=0, max_p=5,
    start_q=0, max_q=5,
    d=None,              # auto-detect differencing order
    seasonal=False,
    information_criterion='aic',
    trace=True
)
print(auto_model.summary())

# Manual ARIMA
model = ARIMA(series, order=(2, 1, 2))
fitted = model.fit()
forecast = fitted.forecast(steps=30)
```
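Reading p and q off correlograms is usually done with statsmodels' `plot_acf`/`plot_pacf`, but the ACF itself is simple to compute. A pure-NumPy sketch (the helper name `acf_np` is made up here):

```python
import numpy as np

def acf_np(x, nlags=20):
    """Sample autocorrelation at lags 0..nlags (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var
                     for k in range(nlags + 1)])

# White noise: autocorrelations beyond lag 0 should be near zero
rng = np.random.default_rng(42)
rho = acf_np(rng.normal(size=2000), nlags=5)
```

For an MA(q) process the ACF cuts off after lag q; for an AR(p) process the PACF cuts off after lag p.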
### SARIMA for Seasonal Data

```
SARIMA(p, d, q)(P, D, Q, m):
- lowercase: non-seasonal components
- uppercase: seasonal components
- m: seasonal period
```

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(
    series,
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1, 12),  # monthly data, yearly seasonality
    enforce_stationarity=False
)
fitted = model.fit()

# Forecast with 95% confidence intervals
forecast = fitted.get_forecast(steps=24)
mean = forecast.predicted_mean
ci = forecast.conf_int(alpha=0.05)
```
### Exponential Smoothing

```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Holt-Winters (trend + seasonality)
model = ExponentialSmoothing(
    series,
    trend='add',       # or 'mul', None
    seasonal='add',    # or 'mul', None
    seasonal_periods=12
)
fitted = model.fit()
forecast = fitted.forecast(24)
```
## ML-Based Forecasting

### Facebook Prophet

```python
import pandas as pd
from prophet import Prophet

# Prepare data (Prophet requires 'ds' and 'y' columns)
df = pd.DataFrame({'ds': dates, 'y': values})

# Basic model
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)

# Add custom seasonality
model.add_seasonality(
    name='monthly',
    period=30.5,
    fourier_order=5
)

# Add regressors; these columns must exist in df and in the future frame
model.add_regressor('holiday_flag')
model.add_regressor('promotion')

model.fit(df)

# Forecast
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)
```
### LSTM for Sequences

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

# Prepare overlapping input/target sequences
def create_sequences(data, seq_length):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_data, seq_length=60)
X = X.reshape((X.shape[0], X.shape[1], 1))

# Chronological train/test split (never shuffle time series)
split = int(len(X) * 0.8)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Build model
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(60, 1)),
    Dropout(0.2),
    LSTM(50, return_sequences=False),
    Dropout(0.2),
    Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.1)
```
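The snippet above assumes `scaled_data` is already normalized. Scaling and inverse-scaling typically look like this (a sketch using scikit-learn's `MinMaxScaler`; `values` and `preds_scaled` are hypothetical stand-ins):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.arange(100, 200, dtype=float).reshape(-1, 1)  # hypothetical raw series

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(values)  # feed this to create_sequences

# After predicting in scaled space, map back to the original units
preds_scaled = scaled_data[-5:]             # stand-in for model.predict(...)
preds = scaler.inverse_transform(preds_scaled)
```

Forgetting the inverse transform is a common source of forecasts that look "off by a scale factor".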
## Anomaly Detection

### Statistical Methods

```python
# Z-score method (assumes roughly normal data)
def zscore_anomalies(series, threshold=3):
    mean, std = series.mean(), series.std()
    z_scores = (series - mean) / std
    return z_scores.abs() > threshold

# IQR method (robust to outliers)
def iqr_anomalies(series, multiplier=1.5):
    Q1, Q3 = series.quantile(0.25), series.quantile(0.75)
    IQR = Q3 - Q1
    lower = Q1 - multiplier * IQR
    upper = Q3 + multiplier * IQR
    return (series < lower) | (series > upper)

# Rolling statistics (adapts to local level and spread)
def rolling_anomalies(series, window=30, threshold=2):
    rolling_mean = series.rolling(window).mean()
    rolling_std = series.rolling(window).std()
    lower = rolling_mean - threshold * rolling_std
    upper = rolling_mean + threshold * rolling_std
    return (series < lower) | (series > upper)
```
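As a sanity check, the z-score detector should flag an injected spike while leaving ordinary noise alone. A self-contained repeat of the helper on synthetic data:

```python
import numpy as np
import pandas as pd

def zscore_anomalies(series, threshold=3):
    mean, std = series.mean(), series.std()
    return (series - mean).abs() / std > threshold

rng = np.random.default_rng(0)
s = pd.Series(rng.normal(0, 1, 500))
s.iloc[100] = 15.0  # injected spike

flags = zscore_anomalies(s)
```

Note that a large spike inflates the global mean and std, which can mask smaller anomalies; the rolling variant above is less vulnerable to this.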
### Isolation Forest

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Feature engineering (assumes series has a DatetimeIndex)
features = pd.DataFrame({
    'value': series,
    'hour': series.index.hour,
    'dayofweek': series.index.dayofweek,
    'rolling_mean': series.rolling(24).mean(),
    'rolling_std': series.rolling(24).std()
}).dropna()

model = IsolationForest(contamination=0.01, random_state=42)
anomalies = model.fit_predict(features)
# -1 = anomaly, 1 = normal
```
## Model Evaluation

### Metrics

| Metric | Formula | Use Case |
|---|---|---|
| MAE | mean(\|actual − pred\|) | Interpretable error in original units |
| RMSE | sqrt(mean((actual − pred)²)) | Penalizes large errors |
| MAPE | mean(\|actual − pred\| / \|actual\|) × 100 | Percentage error; undefined when actual = 0 |
| SMAPE | mean(2\|actual − pred\| / (\|actual\| + \|pred\|)) × 100 | Symmetric, bounded; less distorted by near-zero actuals |
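Each metric is a few lines of NumPy. A sketch with a small worked example (the `actual`/`pred` arrays are hypothetical):

```python
import numpy as np

def mae(actual, pred):
    return np.mean(np.abs(actual - pred))

def rmse(actual, pred):
    return np.sqrt(np.mean((actual - pred) ** 2))

def mape(actual, pred):
    return np.mean(np.abs(actual - pred) / np.abs(actual)) * 100

def smape(actual, pred):
    return np.mean(2 * np.abs(actual - pred)
                   / (np.abs(actual) + np.abs(pred))) * 100

actual = np.array([100.0, 200.0, 300.0])
pred = np.array([110.0, 190.0, 330.0])
# mae ≈ 16.67, rmse ≈ 19.15, mape ≈ 8.33
```

MAE and RMSE share the series' units, so report them alongside the series' scale; the percentage metrics are unit-free but break down near zero.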
### Cross-Validation for Time Series

```python
from sklearn.model_selection import TimeSeriesSplit

# Unlike shuffled k-fold, each test fold comes strictly after its training data
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(series):
    train, test = series.iloc[train_idx], series.iloc[test_idx]
    # Fit on train, evaluate on test
```
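Putting the split to work with a naive last-value baseline (self-contained sketch; the synthetic series and the baseline forecast are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit

# Synthetic monthly-style series: seasonality plus a mild trend
series = pd.Series(np.sin(np.arange(120) * 2 * np.pi / 12)
                   + np.arange(120) * 0.05)

tscv = TimeSeriesSplit(n_splits=5)
fold_mae = []
for train_idx, test_idx in tscv.split(series):
    train, test = series.iloc[train_idx], series.iloc[test_idx]
    # Naive baseline: forecast the last observed training value
    pred = np.full(len(test), train.iloc[-1])
    fold_mae.append(np.mean(np.abs(test.values - pred)))
```

Any real model should beat this naive baseline on the same folds; if it doesn't, the model is not learning useful structure.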
### Backtesting Pattern

```
├──────────────────────────────────────────────────────▶ Time
│
│  ┌─────────────────┬─────┐
│  │      Train      │ Test│   Fold 1
│  └─────────────────┴─────┘
│  ┌───────────────────────┬─────┐
│  │         Train         │ Test│   Fold 2
│  └───────────────────────┴─────┘
│  ┌─────────────────────────────┬─────┐
│  │            Train            │ Test│   Fold 3
│  └─────────────────────────────┴─────┘
```
## References

- `references/arima-guide.md`: ARIMA model selection and diagnostics
- `references/prophet-tuning.md`: Prophet configuration and custom seasonality
- `references/anomaly-patterns.md`: Anomaly detection techniques and thresholds