# Modernize Scientific Computing Stack

Skill name: `modernize-scientific-stack`
This skill provides guidance for migrating legacy Python 2 scientific computing code to modern Python 3 with contemporary libraries and best practices.
## When to Use This Skill
Apply this skill when:
- Migrating Python 2 scientific scripts to Python 3
- Updating legacy data processing code using outdated patterns
- Modernizing scripts that use deprecated file handling, string encoding, or numerical libraries
- Converting scripts from csv module to pandas for data analysis
- Replacing os.path with pathlib for path manipulation
## Approach

### Phase 1: Complete Code Discovery
Before making any changes, ensure complete understanding of the existing codebase:
1. **Read all source files completely** - If a file read is truncated, request the full content before proceeding. Never assume file contents based on partial reads.

2. **Identify all dependencies** - Check for (a dependency-scanning sketch follows this list):
   - Import statements (standard library and third-party)
   - Configuration files (JSON, YAML, INI)
   - Data files (CSV, Excel, pickle)
   - Environment requirements

3. **Map the data flow** - Understand:
   - Input file formats and encodings
   - Data transformations applied
   - Output format requirements
   - Any intermediate files or caches
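For the dependency check, a minimal sketch like the one below can inventory imports across a legacy tree. It deliberately uses a regex rather than `ast.parse`, since the latter would reject Python 2-only syntax such as `print "text"`; the `legacy_project` directory name is an illustrative assumption.

```python
import re
from pathlib import Path

# Matches the first module name on "import x" and "from x import y" lines.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][\w.]*)", re.MULTILINE)

def list_imports(source_dir: str) -> set:
    """Collect top-level module names imported anywhere under source_dir.

    Regex-based so it also works on Python 2 sources; it only grabs the
    first module per line, which is good enough for a first inventory.
    """
    modules = set()
    for path in Path(source_dir).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        modules.update(m.split(".")[0] for m in IMPORT_RE.findall(text))
    return modules

print(sorted(list_imports("legacy_project")))
```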
### Phase 2: Identify Migration Requirements
Common Python 2 to Python 3 migration patterns in scientific code:
| Legacy Pattern | Modern Replacement |
|---|---|
| `print "text"` | `print("text")` |
| `unicode()` / `str()` | `str()` with explicit encoding |
| `open(file)` | `open(file, encoding='utf-8')` |
| `os.path.join()` | `pathlib.Path()` |
| `csv` module | `pandas.read_csv()` |
| `for key in dict.keys()` | `for key in dict` |
| `dict.has_key(x)` | `x in dict` |
| Manual file iteration | Context managers (`with` statements) |
| `xrange()` | `range()` |
| Integer division `/` | Explicit `//` or float division |
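As a quick illustration, here is a before/after sketch applying several rows of this table; the file and column names are invented for the example, not part of any particular codebase.

```python
from pathlib import Path

import pandas as pd

# Legacy: data_file = os.path.join(base_dir, "readings.csv")
data_file = Path("data") / "readings.csv"      # pathlib replaces os.path.join

# Legacy: reader = csv.reader(open(data_file))
df = pd.read_csv(data_file, encoding="utf-8")  # pandas replaces the csv module

# Legacy: if row.has_key("temp"): ...
if "temp" in df.columns:                       # `in` replaces has_key
    # Legacy: mean = total / count  (silently floor-divided ints in Python 2)
    mean_temp = df["temp"].sum() / len(df)     # true division in Python 3
    print(f"Mean temperature: {mean_temp:.2f}")  # print is a function
```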
### Phase 3: Implementation Strategy

1. **Create the modernized script** with these priorities (a skeleton sketch follows this list):
   - UTF-8 encoding for all file operations
   - `pathlib.Path` for all file path manipulations
   - pandas for CSV/data processing
   - Type hints where beneficial
   - Context managers for resource handling

2. **Handle configuration files** - Check for file existence before reading:

   ```python
   import json
   from pathlib import Path

   config_path = Path("config.json")
   if config_path.exists():
       config = json.loads(config_path.read_text(encoding='utf-8'))
   ```

3. **Create requirements.txt** - Include all dependencies with version constraints.
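The skeleton below is one way to embody those priorities in a modernized script. It is a sketch only: every file name, column name, and function in it is an illustrative assumption.

```python
"""Modernized data-processing skeleton (illustrative names throughout)."""
from pathlib import Path

import pandas as pd

DATA_DIR = Path("data")            # pathlib for all path handling
OUTPUT_PATH = Path("summary.csv")

def load_readings(path: Path) -> pd.DataFrame:
    """Read a CSV with explicit encoding; pandas replaces the csv module."""
    return pd.read_csv(path, encoding="utf-8")

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per station; 'station' and 'value' are assumed column names."""
    return df.groupby("station")["value"].agg(["mean", "min", "max"]).reset_index()

def main() -> None:
    df = load_readings(DATA_DIR / "readings.csv")
    summary = summarize(df)
    summary.to_csv(OUTPUT_PATH, index=False, encoding="utf-8")

if __name__ == "__main__":
    main()
```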
### Phase 4: Verification Protocol

**Critical: Always verify file operations.**

After writing any file, read it back to confirm (a write-then-verify sketch follows this list):
- The complete content was written (not truncated)
- The syntax is valid
- All imports are present
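One way to encode this protocol is a small write-then-verify helper. This is a minimal sketch assuming the written file is Python source; the `py_compile` check only applies to `.py` files.

```python
import py_compile
from pathlib import Path

def write_and_verify(path: Path, content: str) -> None:
    """Write a file, then read it back and syntax-check it before trusting it."""
    path.write_text(content, encoding="utf-8")

    # Confirm the complete content landed on disk (no truncation).
    round_trip = path.read_text(encoding="utf-8")
    if round_trip != content:
        raise OSError(f"{path} was truncated or altered on write")

    # Confirm the file is valid Python; raises py_compile.PyCompileError otherwise.
    py_compile.compile(str(path), doraise=True)
```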
**Testing sequence:**

1. **Syntax validation** - Run a Python syntax check:

   ```bash
   python -m py_compile script.py
   ```

2. **Import verification** - Test that all imports resolve:

   ```bash
   python -c "from script import *"
   ```

3. **Functional test** - Run the script and compare its output to expected results (see the sketch after this list).

4. **Output validation** - Verify the output format matches requirements exactly.
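As a sketch of steps 3 and 4 combined, the snippet below runs the modernized script and asserts its captured output against a saved reference; `script.py` and `expected_output.txt` are placeholder names.

```python
import subprocess
import sys
from pathlib import Path

# Run the modernized script with the current interpreter and capture stdout.
result = subprocess.run(
    [sys.executable, "script.py"],
    capture_output=True,
    text=True,
    check=True,  # raise if the script exits with a non-zero status
)

expected = Path("expected_output.txt").read_text(encoding="utf-8")
assert result.stdout == expected, "Output differs from the expected results"
```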
## Common Pitfalls to Avoid

- **Truncated file content** - Never proceed with partial file reads. If a response shows `... [truncated]` or incomplete content, request the full file before continuing.

- **Unverified writes** - After using a write operation, always read the file back to confirm the complete content was written correctly.

- **Encoding issues** - Always specify `encoding='utf-8'` explicitly in file operations. Legacy scripts often have implicit ASCII assumptions.

- **Path string concatenation** - Replace all `os.path.join()` calls and string concatenation for paths with `pathlib.Path` operations.

- **Missing edge case handling** (see the defensive-loading sketch after this list):
  - Empty data files or datasets
  - Missing required files
  - Invalid data types in CSV columns
  - Stations/entities with no matching data

- **Environment setup repetition** - When setting up environments (venv, PATH), verify the setup persists rather than repeating it in each command.
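The following defensive-loading sketch covers those edge cases for a CSV input; the `value` column name is an illustrative assumption.

```python
from pathlib import Path

import pandas as pd

def load_station_data(path: Path) -> pd.DataFrame:
    """Defensively load a CSV, handling the edge cases listed above."""
    if not path.exists():                 # missing required file
        raise FileNotFoundError(f"Required input file not found: {path}")

    try:
        df = pd.read_csv(path, encoding="utf-8")
    except pd.errors.EmptyDataError:      # completely empty file
        return pd.DataFrame()

    if df.empty:                          # header only, no data rows
        return df

    # Coerce invalid entries to NaN instead of crashing, then drop them.
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.dropna(subset=["value"])
```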
## Verification Checklist
Before marking the task complete, confirm:
- All source files were read completely (no truncation)
- Written files were verified by reading back
- All Python 2 patterns have been converted
- File encodings are explicitly specified
- pathlib is used for all path operations
- pandas is used for data processing (where appropriate)
- requirements.txt includes all dependencies
- Script runs without errors
- Output matches expected format exactly
- Edge cases are handled (empty data, missing files)
## Output Validation
When the task specifies an expected output format, verify the output matches exactly:
- Run the modernized script
- Capture the output
- Compare against expected format character-by-character if needed (see the diff sketch below)
- Pay attention to:
- Decimal precision in numerical output
- Whitespace and formatting
- Order of output items
- Units and labels
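When a character-by-character comparison is needed, `difflib` from the standard library makes precision, whitespace, and ordering differences visible; the file names here are placeholders.

```python
import difflib
from pathlib import Path

actual = Path("actual_output.txt").read_text(encoding="utf-8")
expected = Path("expected_output.txt").read_text(encoding="utf-8")

# unified_diff shows exactly which lines differ and how.
diff = difflib.unified_diff(
    expected.splitlines(keepends=True),
    actual.splitlines(keepends=True),
    fromfile="expected",
    tofile="actual",
)
print("".join(diff) or "Outputs match exactly")
```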