# SDK MPI Support
The OptixLog SDK automatically detects MPI environments and provides seamless support for parallel simulations.
## Overview
The SDK automatically detects MPI environments and handles master/worker process coordination. Only the master process (rank 0) logs to OptixLog, while worker processes are initialized but skip logging automatically.
## Supported MPI Implementations

The SDK supports multiple MPI implementations:

- **OpenMPI** - detected via the `OMPI_COMM_WORLD_RANK` environment variable
- **Intel MPI** - detected via the `PMI_RANK` environment variable
- **Microsoft MPI** - detected via the `MPI_LOCALRANKID` environment variable
- **mpi4py** - detected via `mpi4py.MPI.COMM_WORLD`
- **Meep MPI** - detected via Meep's `am_master()` function
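The environment-variable checks above can be sketched as a small helper. This is an illustration only, not the SDK's actual internals; `detect_mpi_rank` is a hypothetical name:

```python
import os

def detect_mpi_rank():
    """Return the MPI rank from the first recognized launcher variable, or None."""
    for var in ("OMPI_COMM_WORLD_RANK",  # OpenMPI
                "PMI_RANK",              # Intel MPI
                "MPI_LOCALRANKID"):      # Microsoft MPI
        value = os.environ.get(var)
        if value is not None:
            return int(value)
    return None  # no MPI launcher environment detected
```

A `None` result means the process was not started by a recognized `mpirun`/`mpiexec` launcher.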
## Automatic Detection
MPI detection happens automatically when you initialize the client:
```python
import optixlog

# MPI is detected automatically
with optixlog.run("parallel_simulation") as client:
    # Check MPI status
    info = client.get_mpi_info()
    print(f"Master: {info['is_master']}, Rank: {info['rank']}, Size: {info['size']}")
```

## Master/Worker Process Handling
### Master Process (Rank 0)
The master process:
- Initializes the API connection
- Creates the run
- Logs all metrics, images, and files
- Receives return values from logging methods
### Worker Processes (Rank > 0)
Worker processes:
- Are initialized but skip API connection
- Do not log anything (prevents duplicates)
- Can still access MPI information
- Return `None` from logging methods
Example:
```python
with optixlog.run("parallel_sim") as client:
    if client.is_master:
        print("I'm the master process")
        client.log(step=0, message="From master")
    else:
        print(f"I'm worker process {client.rank}")
        # client.log() returns None on workers
```

## Running with MPI
### Basic Usage
```bash
# Run with 4 processes
mpirun -n 4 python simulation.py
```

In your code:
```python
import optixlog

with optixlog.run("parallel_simulation") as client:
    for step in range(100):
        result = simulate_step(step)
        # Only master logs
        if client.is_master:
            client.log(step=step, result=result)
```

### With Meep
Meep simulations work seamlessly:
```python
import optixlog
import meep as mp

# Run with: mpirun -n 4 python meep_sim.py
with optixlog.run("meep_parallel") as client:
    sim = mp.Simulation(...)
    for step in range(100):
        sim.run(until=1)
        # Only master logs
        if client.is_master:
            field = sim.get_array(...)
            client.log_array_as_image(f"field_{step}", field)
```

## Synchronization Methods
### Barrier
Synchronize all processes at a specific point:
```python
with optixlog.run("sync_demo") as client:
    # All processes do work
    do_parallel_work()
    # Wait for all to finish
    client.barrier()
    # All processes continue together
    if client.is_master:
        client.log(step=0, message="All processes synchronized")
```

### Broadcast Run ID
Share the run_id from master to all workers:
```python
with optixlog.run("broadcast_demo") as client:
    # Master broadcasts run_id
    if client.is_master:
        client.broadcast_run_id()
        print(f"Run ID: {client.run_id}")
    # Workers receive run_id
    if not client.is_master:
        client.broadcast_run_id()
        print(f"Worker {client.rank} received run_id: {client.run_id}")
```

## MPI Information
### Get MPI Info
```python
info = client.get_mpi_info()
# Returns: {
#     "is_master": bool,
#     "rank": int,
#     "size": int,
#     "has_mpi": bool
# }
```

### Module-Level Function
```python
import optixlog

info = optixlog.get_mpi_info()
is_master = optixlog.is_master_process()
```

## Detection Priority
The SDK checks MPI environments in this order:
1. **Environment variables** (most reliable)
   - `OMPI_COMM_WORLD_RANK` (OpenMPI)
   - `PMI_RANK` (Intel MPI)
   - `MPI_LOCALRANKID` (Microsoft MPI)
2. **mpi4py library**
   - `mpi4py.MPI.COMM_WORLD.Get_rank()`
3. **Meep detection**
   - `meep.am_master()`
4. **Fallback**
   - Single process (no MPI)
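The priority order can be pictured as a fall-through check. This is a rough sketch of the detection logic, not the SDK's real implementation; `detect_rank` is a hypothetical name:

```python
import os

def detect_rank():
    """Return (rank, has_mpi) following the detection priority order."""
    # 1. Environment variables (most reliable)
    for var in ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "MPI_LOCALRANKID"):
        if var in os.environ:
            return int(os.environ[var]), True
    # 2. mpi4py library, if importable
    try:
        from mpi4py import MPI
        return MPI.COMM_WORLD.Get_rank(), True
    except ImportError:
        pass
    # 3. Meep detection, if importable
    try:
        import meep
        return (0 if meep.am_master() else 1), True
    except ImportError:
        pass
    # 4. Fallback: single process (no MPI)
    return 0, False
```

Because the environment variables are checked first, a rank set by the launcher wins even when mpi4py or Meep is installed.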
## Best Practices
### Always Check `is_master` for Logging
```python
with optixlog.run("parallel") as client:
    for step in range(100):
        result = compute(step)
        # Only master logs
        if client.is_master:
            client.log(step=step, result=result)
```

### Use Barrier for Synchronization
```python
# All processes do work
do_work()
# Synchronize
client.barrier()
# Master logs summary
if client.is_master:
    client.log(step=0, summary=compute_summary())
```

### Avoid Logging in Loops on Workers
```python
# Good: check before logging
if client.is_master:
    client.log(step=step, value=value)

# Bad: calling on all processes (wasteful)
client.log(step=step, value=value)  # Returns None on workers, but still wasteful
```

## Troubleshooting
### Issue: All Processes Logging
**Symptom:** Duplicate logs from all processes

**Solution:** Ensure you're checking `is_master`:
```python
if client.is_master:
    client.log(step=step, value=value)
```

### Issue: MPI Not Detected
**Symptom:** `is_master` is always `True`, `rank` is always `0`

**Solutions:**

1. Check the environment variables:

   ```bash
   echo $OMPI_COMM_WORLD_RANK  # OpenMPI
   echo $PMI_RANK              # Intel MPI
   ```

2. Install mpi4py:

   ```bash
   pip install mpi4py
   ```

3. Verify the MPI installation:

   ```bash
   mpirun --version
   ```
### Issue: Worker Processes Hanging
**Symptom:** Workers wait indefinitely

**Solution:** Ensure all processes call `barrier()`:

```python
# All processes must call this
client.barrier()
```

### Issue: Run ID Not Available on Workers
**Symptom:** Workers can't access `run_id`

**Solution:** Use broadcast:

```python
if client.is_master:
    client.broadcast_run_id()
else:
    client.broadcast_run_id()  # Receives run_id
    print(f"Run ID: {client.run_id}")
```

## Examples
### Simple Parallel Loop
```python
import optixlog
import numpy as np

with optixlog.run("parallel_loop") as client:
    # Distribute work across ranks
    my_range = range(client.rank, 100, client.size)
    results = []
    for i in my_range:
        results.append(compute(i))
    # Synchronize
    client.barrier()
    # Master collects and logs
    if client.is_master:
        all_results = gather_results(results)  # Your gather function
        client.log(step=0, total=len(all_results))
```

### Meep Parallel Simulation
```python
import optixlog
import meep as mp

# Run: mpirun -n 4 python meep_sim.py
with optixlog.run("meep_parallel",
                  config={"processes": 4}) as client:
    sim = mp.Simulation(
        cell_size=mp.Vector3(10, 10, 0),
        resolution=30,
        # ... other parameters
    )
    for step in range(100):
        sim.run(until=1)
        # Only master logs, every 10 steps
        if client.is_master and step % 10 == 0:
            field = sim.get_array(
                center=mp.Vector3(),
                size=mp.Vector3(10, 10, 0),
                component=mp.Ez
            )
            client.log_array_as_image(f"field_{step}", field, cmap='RdBu')
```