MPI Support
Automatic MPI detection and coordination for parallel simulations.
Overview
The SDK automatically:
- Detects MPI environments
- Identifies master (rank 0) and worker processes
- Only logs from master to avoid duplicates
- Returns `None` from logging methods on workers
No code changes required — just run with mpirun.
Supported Implementations
| Implementation | Detection Method |
|---|---|
| OpenMPI | `OMPI_COMM_WORLD_RANK` env var |
| Intel MPI | `PMI_RANK` env var |
| MS-MPI | `MPI_LOCALRANKID` env var |
| mpi4py | `MPI.COMM_WORLD.Get_rank()` |
| Meep | `meep.am_master()` |
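For illustration, here is a rough sketch of env-var-based rank detection using only the variables listed in the table above. This is an assumption-laden example, not the SDK's internal code:

```python
import os

# Rank-bearing environment variables from the table above
_RANK_ENV_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "MPI_LOCALRANKID")

def detect_mpi_rank():
    """Best-effort rank detection: env vars first, then mpi4py, else rank 0."""
    for var in _RANK_ENV_VARS:
        if var in os.environ:
            return int(os.environ[var])
    try:
        from mpi4py import MPI
        return MPI.COMM_WORLD.Get_rank()
    except ImportError:
        return 0  # No MPI detected: behave as a single (master) process

rank = detect_mpi_rank()
print(f"rank={rank}, is_master={rank == 0}")
```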
Basic Usage

```bash
mpirun -n 4 python simulation.py
```

```python
from optixlog import Optixlog

# Initialize client - MPI detected automatically
client = Optixlog(api_key="your_api_key")

# MPI info available on client
print(f"Rank: {client.rank}, Size: {client.size}")
print(f"Is master: {client.is_master}")

# Create project and run
project = client.project(name="ParallelSimulations")
run = project.run(name="parallel_sim", config={"processes": client.size})

# Only master logs - workers return None
for step in range(100):
    result = compute(step)
    run.log(step=step, result=result)  # No-op on workers
```

Master vs Worker
Master Process (rank 0)
- Creates API connection
- Creates the run
- Logs all data
- Gets return values
Worker Processes (rank > 0)
- Skip API connection
- Skip logging (returns `None`)
- Can access MPI info
Example:

```python
from optixlog import Optixlog

client = Optixlog(api_key="your_api_key")

if client.is_master:
    print("I'm the master")
else:
    print(f"I'm worker {client.rank}")

# Create run (only master actually creates it)
project = client.project(name="MyProject")
run = project.run(name="experiment")
```

MPI Information
```python
from optixlog import Optixlog

client = Optixlog(api_key="your_api_key")

# Properties on client
print(client.is_master)  # True/False
print(client.rank)       # 0, 1, 2, ...
print(client.size)       # Total processes

# Method for full info
info = client.get_mpi_info()
# {"is_master": True, "rank": 0, "size": 4, "has_mpi": True}
```

Synchronization
Barrier

Wait for all processes:

```python
from optixlog import Optixlog

client = Optixlog(api_key="your_api_key")
project = client.project(name="SyncDemo")
run = project.run(name="barrier_example")

# All processes do work
do_parallel_work()

# Wait for all
client.barrier()

# Master logs summary
if client.is_master:
    run.log(step=0, message="All done")
```
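Note that `barrier()` is a collective call: every process must reach it, otherwise the ranks that did will wait forever (see Workers Hanging under Troubleshooting).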
Broadcast Run ID

Share `run_id` with workers:

```python
from optixlog import Optixlog

client = Optixlog(api_key="your_api_key")
project = client.project(name="BroadcastDemo")
run = project.run(name="broadcast_example")

if client.is_master:
    client.broadcast_run_id()
else:
    client.broadcast_run_id()  # Receives run_id
    print(f"Worker got: {run.run_id}")
```
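`broadcast_run_id()` works the same way: every rank calls it, the master sends the `run_id`, and each worker receives it, which is why both branches in the example make the call.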
Meep Integration

```python
from optixlog import Optixlog
import meep as mp

# Run: mpirun -n 4 python meep_sim.py
client = Optixlog(api_key="your_api_key")
project = client.project(name="MeepSimulations")
run = project.run(
    name="meep_parallel",
    config={
        "processes": client.size,
        "resolution": 30
    }
)

sim = mp.Simulation(
    cell_size=mp.Vector3(10, 5),
    resolution=30,
    boundary_layers=[mp.PML(1.0)]
)

for step in range(100):
    sim.run(until=1)

    # Only master logs
    if client.is_master and step % 10 == 0:
        field = sim.get_array(
            center=mp.Vector3(),
            size=mp.Vector3(10, 5),
            component=mp.Ez
        )
        run.log_array_as_image(f"field_{step}", field, cmap='RdBu')
```

Distributed Work
```python
from optixlog import Optixlog

client = Optixlog(api_key="your_api_key")
project = client.project(name="Distributed")
run = project.run(name="distributed_work")

# Distribute iterations across processes
my_range = range(client.rank, 1000, client.size)

results = []
for i in my_range:
    results.append(expensive_compute(i))

# Synchronize all processes
client.barrier()

# Master collects and logs
if client.is_master:
    all_results = gather(results)  # Your gather function
    run.log(step=0, total=sum(all_results))
```
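The `gather(results)` call above is a placeholder for your own collection step. If mpi4py is available, a standard MPI gather is one way to implement it; note that `comm.gather()` must be called on every rank, with only the root receiving the combined data. A sketch, not part of the Optixlog API:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD

# Every rank participates; only root (rank 0) receives the list of per-rank results
gathered = comm.gather(results, root=0)

if client.is_master:
    all_results = [r for chunk in gathered for r in chunk]  # Flatten per-rank lists
    run.log(step=0, total=sum(all_results))
```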
Troubleshooting

MPI Not Detected

Symptoms: `is_master` always True, `rank` always 0

Check:

```bash
echo $OMPI_COMM_WORLD_RANK  # Should show rank
mpirun --version            # Should work
```

Fix:

```bash
pip install mpi4py
```
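Once installed, a quick way to confirm that ranks are visible under mpirun is to query mpi4py directly:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"rank {comm.Get_rank()} of {comm.Get_size()}")
```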
All Processes Logging

Fix: Check `is_master`:

```python
if client.is_master:
    run.log(step=step, value=value)
```

Workers Hanging
Cause: Unbalanced barrier calls

Fix: All processes must call:

```python
client.barrier()  # Every process must call this
```
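For illustration, the hang usually comes from a barrier that only some ranks reach:

```python
# Anti-pattern: only rank 0 reaches the barrier, so it waits for the others forever
if client.is_master:
    client.barrier()
```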
run_id Not Available on Workers

Fix: Use broadcast:

```python
if client.is_master:
    client.broadcast_run_id()
else:
    client.broadcast_run_id()

# Now run_id is available on workers
```

Detection Priority
The SDK checks in order:

1. Environment variables (fastest)
2. mpi4py library
3. Meep's `am_master()`
4. Fallback to single process
Best Practices

- Always check `is_master` before logging
- Use `barrier()` before collective operations
- Broadcast `run_id` if workers need it
- Don't log in tight loops on all processes
- Install mpi4py for reliable detection
Complete Example
```python
from optixlog import Optixlog
import numpy as np

# Initialize - MPI auto-detected
client = Optixlog(api_key="your_api_key")
print(f"Process {client.rank}/{client.size} starting...")

# Create project and run
project = client.project(name="ParallelCompute")
run = project.run(
    name="distributed_simulation",
    config={
        "total_processes": client.size,
        "iterations": 1000
    }
)

# Distribute work
iterations_per_process = 1000 // client.size
my_start = client.rank * iterations_per_process
my_end = my_start + iterations_per_process

local_results = []
for i in range(my_start, my_end):
    result = expensive_computation(i)
    local_results.append(result)

    # Log progress (only master actually logs)
    if i % 100 == 0:
        run.log(step=i, partial_result=result)

# Sync before gathering
client.barrier()

# Master gathers and logs final results
if client.is_master:
    # Gather from all processes (using your MPI gather)
    all_results = mpi_gather(local_results)

    run.log(
        step=1000,
        total_sum=sum(all_results),
        mean=np.mean(all_results),
        std=np.std(all_results)
    )
    print("✓ Simulation complete!")
```
Next Steps

- API Reference — Complete method documentation
- Advanced Usage — Power user features
- Examples — Real-world use cases