xlsx
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
Requirements for Outputs
All Excel files
Zero Formula Errors
Preserve Existing Templates (when updating templates)
Financial models
Color Coding Standards
Unless otherwise stated by the user or existing template
Industry-Standard Color Conventions
Number Formatting Standards
Required Format Rules
Formula Construction Rules
Assumptions Placement
Formula Error Prevention
Documentation Requirements for Hardcodes
- "Source: Company 10-K, FY2024, Page 45, Revenue Note, [SEC EDGAR URL]"
- "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"
- "Source: Bloomberg Terminal, 8/15/2025, AAPL US Equity"
- "Source: FactSet, 8/20/2025, Consensus Estimates Screen"
XLSX creation, editing, and analysis
Overview
A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks.
Important Requirements
LibreOffice Required for Formula Recalculation: You can assume LibreOffice is installed for recalculating formula values using the recalc.py script. The script automatically configures LibreOffice on first run
Reading and analyzing data
Data analysis with pandas
For data analysis, visualization, and basic operations, use pandas which provides powerful data manipulation capabilities:
import pandas as pdRead Excel
df = pd.read_excel('file.xlsx') # Default: first sheet
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dictAnalyze
df.head() # Preview data
df.info() # Column info
df.describe() # StatisticsWrite Excel
df.to_excel('output.xlsx', index=False)Excel File Workflows
CRITICAL: Use Formulas, Not Hardcoded Values
Always use Excel formulas instead of calculating values in Python and hardcoding them. This ensures the spreadsheet remains dynamic and updateable.
❌ WRONG - Hardcoding Calculated Values
# Bad: Calculating in Python and hardcoding result
total = df['Sales'].sum()
sheet['B10'] = total # Hardcodes 5000Bad: Computing growth rate in Python
growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue']
sheet['C5'] = growth # Hardcodes 0.15Bad: Python calculation for average
avg = sum(values) / len(values)
sheet['D20'] = avg # Hardcodes 42.5✅ CORRECT - Using Excel Formulas
# Good: Let Excel calculate the sum
sheet['B10'] = '=SUM(B2:B9)'Good: Growth rate as Excel formula
sheet['C5'] = '=(C4-C2)/C2'Good: Average using Excel function
sheet['D20'] = '=AVERAGE(D2:D19)'This applies to ALL calculations - totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.
Common Workflow
python recalc.py output.xlsx- The script returns JSON with error details
- If
status is errors_found, check error_summary for specific error types and locations- Fix the identified errors and recalculate again
- Common errors to fix:
-
#REF!: Invalid cell references-
#DIV/0!: Division by zero-
#VALUE!: Wrong data type in formula-
#NAME?: Unrecognized formula nameCreating new Excel files
# Using openpyxl for formulas and formatting
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignmentwb = Workbook()
sheet = wb.active
Add data
sheet['A1'] = 'Hello'
sheet['B1'] = 'World'
sheet.append(['Row', 'of', 'data'])Add formula
sheet['B2'] = '=SUM(A1:A10)'Formatting
sheet['A1'].font = Font(bold=True, color='FF0000')
sheet['A1'].fill = PatternFill('solid', start_color='FFFF00')
sheet['A1'].alignment = Alignment(horizontal='center')Column width
sheet.column_dimensions['A'].width = 20wb.save('output.xlsx')
Editing existing Excel files
# Using openpyxl to preserve formulas and formatting
from openpyxl import load_workbookLoad existing file
wb = load_workbook('existing.xlsx')
sheet = wb.active # or wb['SheetName'] for specific sheetWorking with multiple sheets
for sheet_name in wb.sheetnames:
sheet = wb[sheet_name]
print(f"Sheet: {sheet_name}")Modify cells
sheet['A1'] = 'New Value'
sheet.insert_rows(2) # Insert row at position 2
sheet.delete_cols(3) # Delete column 3Add new sheet
new_sheet = wb.create_sheet('NewSheet')
new_sheet['A1'] = 'Data'wb.save('modified.xlsx')
Recalculating formulas
Excel files created or modified by openpyxl contain formulas as strings but not calculated values. Use the provided recalc.py script to recalculate formulas:
python recalc.py <excel_file> [timeout_seconds]Example:
python recalc.py output.xlsx 30The script:
Formula Verification Checklist
Quick checks to ensure formulas work correctly:
Essential Verification
Common Pitfalls
pd.notna()/ in formulas (#DIV/0!)Formula Testing Strategy
Interpreting recalc.py Output
The script returns JSON with error details:
{
"status": "success", // or "errors_found"
"total_errors": 0, // Total error count
"total_formulas": 42, // Number of formulas in file
"error_summary": { // Only present if errors found
"#REF!": {
"count": 2,
"locations": ["Sheet1!B5", "Sheet1!C10"]
}
}
}Best Practices
Library Selection
Working with openpyxl
data_only=True to read calculated values: load_workbook('file.xlsx', data_only=True)data_only=True and saved, formulas are replaced with values and permanently lostread_only=True for reading or write_only=True for writingWorking with pandas
pd.read_excel('file.xlsx', dtype={'id': str})pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])pd.read_excel('file.xlsx', parse_dates=['date_column'])Code Style Guidelines
IMPORTANT: When generating Python code for Excel operations:
For Excel files themselves: