stock-correlation
Analyze stock correlations to find related companies and trading pairs. Use when the user asks about correlated stocks, related companies, sector peers, trading pairs, or how two or more stocks move together. Triggers: "what correlates with NVDA", "find stocks related to AMD", "correlation between AAPL and MSFT", "what moves with", "sector peers", "pair trading", "correlated stocks", "when NVDA drops what else drops", "stocks that move together", "beta to", "relative performance", "supply chain partners", "correlation matrix", "co-movement", "related tickers", "sympathy plays", "semiconductor peers", "hedging pair", "realized correlation", "rolling correlation", or any request about stocks that move in tandem or inversely. Also triggers for well-known pairs like AMD/NVDA, GOOGL/AVGO, LITE/COHR. If only one ticker is provided, infer the user wants correlated peers.
Category: Finance Analysis
Stock Correlation Analysis Skill
Finds and analyzes correlated stocks using historical price data from Yahoo Finance via yfinance. Routes to specialized sub-skills based on user intent.
Important: This is for research and educational purposes only. Not financial advice. yfinance is not affiliated with Yahoo, Inc.
Step 1: Ensure Dependencies Are Available
Current environment status:
!`python3 -c "import yfinance, pandas, numpy; print(f'yfinance={yfinance.__version__} pandas={pandas.__version__} numpy={numpy.__version__}')" 2>/dev/null || echo "DEPS_MISSING"`

If DEPS_MISSING, install required packages before running any code:

    import subprocess, sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "yfinance", "pandas", "numpy"])

If all dependencies are already installed, skip the install step and proceed directly.
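
The "skip if already installed" check can also be done from Python instead of the shell. The following is a minimal sketch; `missing_packages` and `REQUIRED` are hypothetical names introduced here for illustration, not part of the skill.

```python
import importlib.util

# Packages the skill imports (taken from the install command above).
REQUIRED = ("yfinance", "pandas", "numpy")

def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` means the install step can be skipped; otherwise only the returned names need to be passed to pip.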
Step 2: Route to the Correct Sub-Skill
Classify the user's request and jump to the matching sub-skill section below.
| User Request | Route To | Examples |
|---|---|---|
| Single ticker, wants to find related stocks | Sub-Skill A: Co-movement Discovery | "what correlates with NVDA", "find stocks related to AMD", "sympathy plays for TSLA" |
| Two or more specific tickers, wants relationship details | Sub-Skill B: Return Correlation | "correlation between AMD and NVDA", "how do LITE and COHR move together", "compare AAPL vs MSFT" |
| Group of tickers, wants structure/grouping | Sub-Skill C: Sector Clustering | "correlation matrix for FAANG", "cluster these semiconductor stocks", "sector peers for AMD" |
| Wants time-varying or conditional correlation | Sub-Skill D: Realized Correlation | "rolling correlation AMD NVDA", "when NVDA drops what else drops", "how has correlation changed" |
If ambiguous, default to Sub-Skill A (Co-movement Discovery) for single tickers, or Sub-Skill B (Return Correlation) for two tickers.
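
The routing table and the ambiguity rule can be sketched as a small function. This is only an illustration: the keyword patterns are assumptions drawn from the example phrases in the table, and `route_request` is a hypothetical helper, not something the skill defines.

```python
import re

# Assumed keyword patterns, based on the example requests in the routing table.
TIME_VARYING = r"rolling|realized|over time|changed|when .*\b(drops|falls|rises)\b"
STRUCTURE = r"matrix|cluster|sector peers"

def route_request(text, tickers):
    """Return the sub-skill letter (A-D) for a request and its extracted tickers."""
    lowered = text.lower()
    if re.search(TIME_VARYING, lowered):
        return "D"  # time-varying or conditional correlation
    if re.search(STRUCTURE, lowered):
        return "C"  # structure or grouping of a set of tickers
    # No strong keyword: fall back on the ticker count.
    return "A" if len(set(tickers)) <= 1 else "B"
```

For example, `route_request("what correlates with NVDA", ["NVDA"])` falls through to the single-ticker default and returns `"A"`.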
Defaults for all sub-skills
| Parameter | Default |
|---|---|
| Lookback period | 1y (1 year) |
| Data interval | 1d (daily) |
| Correlation method | Pearson |
| Minimum correlation threshold | 0.60 |
| Number of results | Top 10 |
| Return type | Daily log returns |
| Rolling window | 60 trading days |
Sub-Skill A: Co-movement Discovery
Goal: Given a single ticker, find stocks that move with it.
A1: Build the peer universe
You need 15-30 candidates. Do not use hardcoded ticker lists — build the universe dynamically at runtime. See references/sector_universes.md for the full implementation. The approach:
- Use yf.screen() + yf.EquityQuery to find stocks in the same industry as the target.
- Use the target's longBusinessSummary to pick 1-2 related industries and screen those too (e.g., a semiconductor company → also screen semiconductor equipment).

A2: Compute correlations

    import yfinance as yf
    import pandas as pd
    import numpy as np

    def discover_comovement(target_ticker, peer_tickers, period="1y"):
        all_tickers = [target_ticker] + [t for t in peer_tickers if t != target_ticker]
        data = yf.download(all_tickers, period=period, auto_adjust=True, progress=False)

        # Extract close prices; yf.download returns MultiIndex (Price, Ticker) columns.
        # Keep only tickers with at least max(60, half the rows) valid closes.
        closes = data["Close"].dropna(axis=1, thresh=max(60, len(data) // 2))
        if target_ticker not in closes.columns:
            raise ValueError(f"Not enough price history for {target_ticker}")

        # Log returns
        returns = np.log(closes / closes.shift(1)).dropna()
        corr_series = returns.corr()[target_ticker].drop(target_ticker, errors="ignore")

        # Rank by absolute correlation; the signed value is what gets reported
        ranked = corr_series.abs().sort_values(ascending=False)
        result = pd.DataFrame({
            "Ticker": ranked.index,
            "Correlation": [round(corr_series[t], 4) for t in ranked.index],
        })
        return result, returns

A3: Present results
Show a ranked table with company names and sectors (fetch the name via yf.Ticker(t).info.get("shortName") and the sector via yf.Ticker(t).info.get("sector")):
| Rank | Ticker | Company | Correlation | Why linked |
|---|---|---|---|---|
| 1 | AMD | Advanced Micro Devices | 0.82 | Same industry — GPU/CPU |
| 2 | AVGO | Broadcom | 0.78 | AI infrastructure peer |
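
The A2 function returns every peer, so the default threshold (0.60) and result count (top 10) still need to be applied before the table is shown. A minimal sketch follows, assuming the threshold is meant to apply to the absolute correlation; `apply_defaults` is a hypothetical helper, and the tickers and values in the example are made up.

```python
import pandas as pd

def apply_defaults(result, min_abs_corr=0.60, top_n=10):
    """Filter a Ticker/Correlation table by |corr| and keep the top N rows."""
    strong = result[result["Correlation"].abs() >= min_abs_corr]
    # Sort by absolute value so strong negative links are not pushed to the bottom.
    order = strong["Correlation"].abs().sort_values(ascending=False).index
    ranked = strong.loc[order].head(top_n).reset_index(drop=True)
    ranked.insert(0, "Rank", list(range(1, len(ranked) + 1)))
    return ranked

# Made-up values, for illustration only.
example = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "Correlation": [0.82, -0.71, 0.45, 0.78],
})
table = apply_defaults(example)
```

Here `CCC` is dropped for falling below the threshold, while `BBB` is kept because its inverse relationship is still strong.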
Include:
Sub-Skill B: Return Correlation
Goal: Deep-dive into the relationship between two (or a few) specific tickers.
B1: Download and compute

    import yfinance as yf
    import pandas as pd
    import numpy as np

    def return_correlation(ticker_a, ticker_b, period="1y"):
        data = yf.download([ticker_a, ticker_b], period=period, auto_adjust=True, progress=False)
        closes = data["Close"][[ticker_a, ticker_b]].dropna()
        returns = np.log(closes / closes.shift(1)).dropna()

        corr = returns[ticker_a].corr(returns[ticker_b])

        # Beta: how much does B move per unit move of A
        cov_matrix = returns.cov()
        beta = cov_matrix.loc[ticker_b, ticker_a] / cov_matrix.loc[ticker_a, ticker_a]

        # R-squared
        r_squared = corr ** 2

        # Rolling 60-day correlation for stability
        rolling_corr = returns[ticker_a].rolling(60).corr(returns[ticker_b])

        # Spread (log price ratio) for mean-reversion
        spread = np.log(closes[ticker_a] / closes[ticker_b])
        spread_z = (spread - spread.mean()) / spread.std()

        return {
            "correlation": round(corr, 4),
            "beta": round(beta, 4),
            "r_squared": round(r_squared, 4),
            "rolling_corr_mean": round(rolling_corr.mean(), 4),
            "rolling_corr_std": round(rolling_corr.std(), 4),
            "rolling_corr_min": round(rolling_corr.min(), 4),
            "rolling_corr_max": round(rolling_corr.max(), 4),
            "spread_z_current": round(spread_z.iloc[-1], 4),
            "observations": len(returns),
        }

B2: Present results
Show a summary card:
| Metric | Value |
|---|---|
| Pearson Correlation | 0.82 |
| Beta (B vs A) | 1.15 |
| R-squared | 0.67 |
| Rolling Corr (60d avg) | 0.80 |
| Rolling Corr Range | [0.55, 0.94] |
| Rolling Corr Std Dev | 0.08 |
| Spread Z-Score (current) | +1.2 |
| Observations | 250 |
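
The three headline metrics in the card are tied together: beta is the regression slope of B's returns on A's, it equals the correlation scaled by the ratio of the two volatilities, and R-squared is the squared correlation. The sketch below checks this on synthetic returns (randomly generated, not market data); the column names `A` and `B` are placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic daily log returns: B = 1.2 * A + noise (illustrative only).
a = rng.normal(0.0, 0.02, 250)
b = 1.2 * a + rng.normal(0.0, 0.01, 250)
returns = pd.DataFrame({"A": a, "B": b})

corr = returns["A"].corr(returns["B"])
cov = returns.cov()
beta = cov.loc["B", "A"] / cov.loc["A", "A"]

# Two equivalent routes to the same beta.
slope = np.polyfit(returns["A"], returns["B"], 1)[0]             # OLS slope of B on A
beta_from_corr = corr * returns["B"].std() / returns["A"].std()  # corr * (sigma_B / sigma_A)
r_squared = corr ** 2
```

A high correlation with a beta far from 1 therefore means the two stocks move together but with different magnitudes, which matters when sizing a hedge.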
Interpretation guide:
Sub-Skill C: Sector Clustering
Goal: Given a group of tickers, show the full correlation structure and identify clusters.
C1: Build the correlation matrix

    import yfinance as yf
    import pandas as pd
    import numpy as np

    def sector_clustering(tickers, period="1y"):
        data = yf.download(tickers, period=period, auto_adjust=True, progress=False)

        # yf.download returns MultiIndex (Price, Ticker) columns
        closes = data["Close"].dropna(axis=1, thresh=max(60, len(data) // 2))
        returns = np.log(closes / closes.shift(1)).dropna()
        corr_matrix = returns.corr()

        # Hierarchical clustering order
        from scipy.cluster.hierarchy import linkage, leaves_list
        from scipy.spatial.distance import squareform

        # Build the distance matrix as a writable NumPy array so the diagonal can be zeroed
        dist = 1 - corr_matrix.abs().to_numpy(copy=True)
        np.fill_diagonal(dist, 0)
        condensed = squareform(dist, checks=False)
        # Average linkage: Ward assumes Euclidean distances, which 1 - |corr| is not
        linkage_matrix = linkage(condensed, method="average")
        order = leaves_list(linkage_matrix)
        ordered_tickers = [corr_matrix.columns[i] for i in order]

        # Reorder matrix
        clustered = corr_matrix.loc[ordered_tickers, ordered_tickers]
        return clustered, returns

Note: if scipy is not available, fall back to sorting by average correlation instead of hierarchical clustering.
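
The scipy-free fallback mentioned in the note could look like the following. This is a sketch under the assumption that "average correlation" means each ticker's mean correlation with the other tickers, excluding itself; `order_by_average_correlation` is a hypothetical name, and the matrix values are made up.

```python
import numpy as np
import pandas as pd

def order_by_average_correlation(corr_matrix):
    """Reorder a correlation matrix by each ticker's mean correlation with the rest."""
    n = len(corr_matrix)
    if n < 2:
        return corr_matrix.copy()
    # Subtract the diagonal (self-correlation) before averaging over the other n - 1 tickers.
    avg = (corr_matrix.sum(axis=1) - np.diag(corr_matrix.to_numpy())) / (n - 1)
    ordered = avg.sort_values(ascending=False).index
    return corr_matrix.loc[ordered, ordered]

# Made-up matrix for three placeholder tickers.
names = ["X", "Y", "Z"]
example = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.3],
     [0.2, 0.3, 1.0]],
    index=names, columns=names,
)
reordered = order_by_average_correlation(example)
```

This puts the most broadly connected tickers first but does not group them, so blocks in the matrix will be less visible than with hierarchical ordering.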
C2: Present results
- Cluster 1: [NVDA, AMD, AVGO] — avg intra-correlation 0.82
- Cluster 2: [AAPL, MSFT] — avg intra-correlation 0.75
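
The "avg intra-correlation" figure for a cluster can be computed as the mean of the pairwise correlations among its members. A minimal sketch, assuming cluster membership has already been decided; `avg_intra_correlation` is a hypothetical helper and the matrix values are made up.

```python
from itertools import combinations

import pandas as pd

def avg_intra_correlation(corr_matrix, members):
    """Mean pairwise correlation among the tickers in one cluster."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return float("nan")  # a single-ticker cluster has no pairs
    return sum(corr_matrix.loc[a, b] for a, b in pairs) / len(pairs)

# Made-up matrix for three placeholder tickers.
names = ["P", "Q", "R"]
example = pd.DataFrame(
    [[1.0, 0.9, 0.8],
     [0.9, 1.0, 0.7],
     [0.8, 0.7, 1.0]],
    index=names, columns=names,
)
cluster_avg = avg_intra_correlation(example, ["P", "Q", "R"])
```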
Sub-Skill D: Realized Correlation
Goal: Show how correlation changes over time and under different market conditions.
D1: Rolling correlation

    import yfinance as yf
    import pandas as pd
    import numpy as np

    def realized_correlation(ticker_a, ticker_b, period="2y", windows=(20, 60, 120)):
        data = yf.download([ticker_a, ticker_b], period=period, auto_adjust=True, progress=False)
        closes = data["Close"][[ticker_a, ticker_b]].dropna()
        returns = np.log(closes / closes.shift(1)).dropna()

        # One rolling-correlation series per window length, keyed like "60d"
        rolling = {}
        for w in windows:
            rolling[f"{w}d"] = returns[ticker_a].rolling(w).corr(returns[ticker_b])
        return rolling, returns

D2: Regime-conditional correlation

    import pandas as pd

    def regime_correlation(returns, ticker_a, ticker_b, condition_ticker=None):
        """Compare correlation across up/down/volatile regimes."""
        if condition_ticker is None:
            condition_ticker = ticker_a
        ret = returns[condition_ticker]

        # Boolean masks over the return index, one per regime
        regimes = {
            "All Days": pd.Series(True, index=returns.index),
            "Up Days (target > 0)": ret > 0,
            "Down Days (target < 0)": ret < 0,
            "High Vol (top 25%)": ret.abs() > ret.abs().quantile(0.75),
            "Low Vol (bottom 25%)": ret.abs() < ret.abs().quantile(0.25),
            "Large Drawdown (< -2%)": ret < -0.02,
        }

        results = {}
        for name, mask in regimes.items():
            subset = returns[mask]
            # Skip regimes with fewer than 20 observations
            if len(subset) >= 20:
                results[name] = {
                    "correlation": round(subset[ticker_a].corr(subset[ticker_b]), 4),
                    "days": int(mask.sum()),
                }
        return results

D3: Present results
| Window | Current | Mean | Min | Max | Std |
|---|---|---|---|---|---|
| 20-day | 0.88 | 0.76 | 0.32 | 0.95 | 0.12 |
| 60-day | 0.82 | 0.78 | 0.55 | 0.92 | 0.08 |
| 120-day | 0.80 | 0.79 | 0.68 | 0.88 | 0.05 |

| Regime | Correlation | Days |
|---|---|---|
| All Days | 0.82 | 250 |
| Up Days | 0.75 | 132 |
| Down Days | 0.87 | 118 |
| High Vol (top 25%) | 0.90 | 63 |
| Large Drawdown (< -2%) | 0.93 | 28 |
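
The per-window table can be assembled directly from the `rolling` dictionary that D1 returns. The sketch below assumes "Current" means the most recent non-missing value of each series; `summarize_rolling` is a hypothetical helper, and the input is synthetic random data rather than market returns.

```python
import numpy as np
import pandas as pd

def summarize_rolling(rolling):
    """Build one summary row per rolling-correlation series."""
    rows = []
    for label, series in rolling.items():
        valid = series.dropna()  # the first (window - 1) values are NaN
        if valid.empty:
            continue
        rows.append({
            "Window": label,
            "Current": valid.iloc[-1],
            "Mean": valid.mean(),
            "Min": valid.min(),
            "Max": valid.max(),
            "Std": valid.std(),
        })
    return pd.DataFrame(rows).round(2)

# Synthetic returns for two placeholder series (illustrative only).
rng = np.random.default_rng(1)
a = pd.Series(rng.normal(0.0, 0.02, 300))
b = 0.8 * a + pd.Series(rng.normal(0.0, 0.01, 300))
rolling = {f"{w}d": a.rolling(w).corr(b) for w in (20, 60, 120)}
table = summarize_rolling(rolling)
```

Shorter windows typically show a wider Min to Max range, which is why the table reports all three side by side.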
Step 3: Respond to the User
After running the appropriate sub-skill, present results clearly:
Always include
Always caveat
Practical applications (mention when relevant)
Important: Never recommend specific trades. Present data and let the user draw conclusions.
Reference Files
- references/sector_universes.md: Dynamic peer universe construction using the yfinance Screener API

Read the reference file when you need to build a peer universe for a given ticker.