Simulating FRB Populations with astropath

This notebook demonstrates how to use the astropath.simulations.generate_frbs module to generate synthetic FRB populations with realistic properties for different surveys (CHIME, DSA, ASKAP).

The generated FRBs have the following properties:

DM: Extragalactic dispersion measure (pc/cm³)
z: Redshift
M_r: Host galaxy absolute r-band magnitude
m_r: Host galaxy apparent r-band magnitude

[1]:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Set up plotting style
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 12

[2]:

from astropath.simulations import generate_frbs, SURVEY_GRIDS

/home/xavier/Projects/FRB/FRB/frb/halos/hmf.py:51: UserWarning: hmf_emulator not imported.  Hope you are not intending to use the hmf.py module..
  warnings.warn("hmf_emulator not imported.  Hope you are not intending to use the hmf.py module..")

Available Surveys

The module supports several FRB surveys, each with its own P(z,DM) grid that captures the selection effects and sensitivity of that instrument.

[3]:

print("Available surveys:")
for survey in SURVEY_GRIDS.keys():
    print(f"  - {survey}")

Available surveys:
  - CHIME
  - DSA
  - ASKAP
  - CRAFT
  - CRAFT_ICS_1300
  - CRAFT_ICS_892
  - CRAFT_ICS_1632
  - Parkes
  - FAST

Basic Usage: Generate FRBs for a Single Survey

The simplest usage generates FRBs by sampling from the survey-specific P(z,DM) grid.

[4]:

# Generate 1000 CHIME FRBs with a fixed seed for reproducibility
df_chime = generate_frbs(1000, 'CHIME', seed=42)

print(f"Generated {len(df_chime)} CHIME FRBs")
df_chime.head(10)

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Generated 1000 CHIME FRBs

[4]:

	DMeg	z	M_r	m_r
0	445.0	0.25	-21.681575	18.889973
1	1175.0	1.38	-20.278551	24.743318
2	710.0	0.63	-18.475358	24.456045
3	585.0	0.45	-19.330649	22.723243
4	215.0	0.11	-18.918085	19.690382
5	215.0	0.11	-19.709010	18.899457
6	225.0	0.05	-19.538448	17.269064
7	850.0	0.94	-18.647786	25.346529
8	1225.0	0.45	-21.433325	20.620566
9	930.0	0.59	-20.513442	22.245471

[5]:

# Summary statistics
df_chime.describe()

[5]:

	DMeg	z	M_r	m_r
count	1000.000000	1000.000000	1000.000000	1000.000000
mean	599.490000	0.472650	-20.223130	20.900058
std	476.539431	0.435738	1.556904	3.054551
min	30.000000	0.010000	-23.336626	10.759040
25%	280.000000	0.160000	-21.465300	18.925289
50%	470.000000	0.350000	-20.384238	21.014704
75%	770.000000	0.650000	-19.179283	23.037714
max	4455.000000	3.760000	-15.189843	30.037022

Comparing Different Surveys

Different surveys have different selection functions, leading to different redshift and DM distributions. Let’s compare CHIME, DSA, and ASKAP.

[6]:

# Generate FRBs for three major surveys
n_frbs = 5000
seed = 42

df_chime = generate_frbs(n_frbs, 'CHIME', seed=seed)
df_dsa = generate_frbs(n_frbs, 'DSA', seed=seed)
df_askap = generate_frbs(n_frbs, 'ASKAP', seed=seed)

surveys = {
    'CHIME': df_chime,
    'DSA': df_dsa,
    'ASKAP': df_askap
}

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/DSA_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CRAFT_class_I_and_II_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes

[7]:

# Compare summary statistics
print("Survey Comparison Summary")
print("=" * 60)
for name, df in surveys.items():
    print(f"\n{name}:")
    print(f"  DM:  median = {df['DMeg'].median():.0f}, range = [{df['DMeg'].min():.0f}, {df['DMeg'].max():.0f}] pc/cm³")
    print(f"  z:   median = {df['z'].median():.3f}, range = [{df['z'].min():.3f}, {df['z'].max():.3f}]")
    print(f"  m_r: median = {df['m_r'].median():.1f}, range = [{df['m_r'].min():.1f}, {df['m_r'].max():.1f}]")

Survey Comparison Summary
============================================================

CHIME:
  DM:  median = 485, range = [10, 5850] pc/cm³
  z:   median = 0.350, range = [0.010, 3.760]
  m_r: median = 21.0, range = [10.2, 29.8]

DSA:
  DM:  median = 520, range = [10, 6040] pc/cm³
  z:   median = 0.400, range = [0.010, 4.400]
  m_r: median = 21.3, range = [10.2, 30.3]

ASKAP:
  DM:  median = 265, range = [10, 6055] pc/cm³
  z:   median = 0.130, range = [0.010, 1.060]
  m_r: median = 18.5, range = [10.1, 26.8]

Redshift Distributions

[8]:

fig, axes = plt.subplots(1, 3, figsize=(14, 4))
colors = ['#1f77b4', '#ff7f0e', '#2ca02c']

for ax, (name, df), color in zip(axes, surveys.items(), colors):
    ax.hist(df['z'], bins=40, alpha=0.7, color=color, density=True, edgecolor='white')
    ax.axvline(df['z'].median(), color='red', linestyle='--', linewidth=2, label=f'median = {df["z"].median():.2f}')
    ax.set_xlabel('Redshift z')
    ax.set_ylabel('Density')
    ax.set_title(f'{name}')
    ax.legend()
    ax.set_xlim(0, 3)

plt.suptitle('Redshift Distributions by Survey', fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

../_images/nb_Simulate_Generate_FRBs_12_0.png

DM Distributions

[9]:

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

for ax, (name, df), color in zip(axes, surveys.items(), colors):
    ax.hist(df['DMeg'], bins=40, alpha=0.7, color=color, density=True, edgecolor='white')
    ax.axvline(df['DMeg'].median(), color='red', linestyle='--', linewidth=2, label=f'median = {df["DMeg"].median():.0f}')
    ax.set_xlabel('DM_EG (pc/cm³)')
    ax.set_ylabel('Density')
    ax.set_title(f'{name}')
    ax.legend()
    ax.set_xlim(0, 2500)

plt.suptitle('DM Distributions by Survey', fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

../_images/nb_Simulate_Generate_FRBs_14_0.png

DM vs Redshift

[10]:

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

for ax, (name, df), color in zip(axes, surveys.items(), colors):
    ax.scatter(df['z'], df['DMeg'], alpha=0.3, s=5, color=color)
    ax.set_xlabel('Redshift z')
    ax.set_ylabel('DM (pc/cm³)')
    ax.set_title(f'{name}')
    ax.set_xlim(0, 3)
    ax.set_ylim(0, 2500)

plt.suptitle('DM vs Redshift by Survey', fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

../_images/nb_Simulate_Generate_FRBs_16_0.png

Apparent Magnitude Distributions

The apparent magnitude of the host galaxy depends on both the absolute magnitude (M_r) and the redshift.

[11]:

fig, ax = plt.subplots(figsize=(10, 6))

bins = np.linspace(12, 30, 40)
for (name, df), color in zip(surveys.items(), colors):
    ax.hist(df['m_r'], bins=bins, alpha=0.5, color=color, label=name, density=True)

# Add typical survey depth limits
ax.axvline(23.5, color='gray', linestyle=':', linewidth=2, label='DECaLS depth (~23.5)')
ax.axvline(26, color='gray', linestyle='--', linewidth=2, label='HSC depth (~26)')

ax.set_xlabel('Apparent r-band magnitude (m_r)')
ax.set_ylabel('Density')
ax.set_title('Host Galaxy Apparent Magnitude Distribution')
ax.legend()
ax.set_xlim(12, 30)
plt.tight_layout()
plt.show()

../_images/nb_Simulate_Generate_FRBs_18_0.png

Using an Observed DM Catalog

If you have observed DM values from a specific sample, you can use them to constrain the DM distribution via kernel density estimation. This is useful when you want to simulate FRBs that match a specific observed population.

[12]:

# Example: DMs from DSA-110 observations (Connor+24)
dsa_observed_dms = np.array([
    561.50, 186.30, 335.40, 371.00, 415.90, 429.50, 586.80, 312.40,
    213.90, 597.00, 572.70, 1019.50, 576.40, 275.10, 387.80, 405.30,
    1347.10, 359.50, 319.70, 508.30, 662.30, 552.10, 789.50, 571.30,
    406.90, 1204.10, 412.00, 306.00, 547.80, 591.50, 395.10, 356.50,
    441.30, 445.20, 1371.70, 317.30, 501.50, 453.20, 723.60
])

print(f"Observed DSA DMs: N={len(dsa_observed_dms)}, median={np.median(dsa_observed_dms):.0f} pc/cm³")

Observed DSA DMs: N=39, median=445 pc/cm³

[13]:

# Generate FRBs using the observed DM distribution
df_dsa_from_catalog = generate_frbs(5000, 'DSA', dm_catalog=dsa_observed_dms, seed=42)

# Compare with default sampling
df_dsa_default = generate_frbs(5000, 'DSA', seed=42)

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/DSA_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/DSA_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes

[14]:

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# DM comparison
ax = axes[0]
bins = np.linspace(0, 2000, 40)
ax.hist(df_dsa_default['DMeg'], bins=bins, alpha=0.5, label='Default (from grid)', density=True)
ax.hist(df_dsa_from_catalog['DMeg'], bins=bins, alpha=0.5, label='From observed catalog', density=True)
ax.hist(dsa_observed_dms, bins=bins, alpha=0.7, histtype='step', linewidth=2,
        color='red', label='Observed DMs', density=True)
ax.set_xlabel('DM_EG (pc/cm³)')
ax.set_ylabel('Density')
ax.set_title('DM_EG Distribution Comparison')
ax.legend()

# Redshift comparison
ax = axes[1]
bins = np.linspace(0, 2.5, 40)
ax.hist(df_dsa_default['z'], bins=bins, alpha=0.5, label='Default (from grid)', density=True)
ax.hist(df_dsa_from_catalog['z'], bins=bins, alpha=0.5, label='From observed catalog', density=True)
ax.set_xlabel('Redshift z')
ax.set_ylabel('Density')
ax.set_title('Redshift Distribution Comparison')
ax.legend()

plt.suptitle('Effect of Using Observed DM Catalog', fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

../_images/nb_Simulate_Generate_FRBs_22_0.png

Reproducibility

Using the seed parameter ensures reproducible results.

[15]:

# Generate two sets with the same seed
df1 = generate_frbs(100, 'CHIME', seed=12345)
df2 = generate_frbs(100, 'CHIME', seed=12345)

# Verify they are identical
print("Are the two DataFrames identical?")
print(f"  DM:  {np.allclose(df1['DMeg'], df2['DMeg'])}")
print(f"  z:   {np.allclose(df1['z'], df2['z'])}")
print(f"  M_r: {np.allclose(df1['M_r'], df2['M_r'])}")
print(f"  m_r: {np.allclose(df1['m_r'], df2['m_r'])}")

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Are the two DataFrames identical?
  DM:  True
  z:   True
  M_r: True
  m_r: True

Using a Custom Cosmology

By default, the module uses Planck18 cosmology. You can specify a different cosmology for the distance modulus calculations.

[16]:

from astropy.cosmology import WMAP9, Planck18

# Compare apparent magnitudes with different cosmologies
df_planck = generate_frbs(1000, 'CHIME', cosmo=Planck18, seed=42)
df_wmap = generate_frbs(1000, 'CHIME', cosmo=WMAP9, seed=42)

# The DM, z, and M_r will be the same, but m_r will differ slightly
print("Comparison of apparent magnitudes with different cosmologies:")
print(f"  Planck18 median m_r: {df_planck['m_r'].median():.3f}")
print(f"  WMAP9 median m_r:    {df_wmap['m_r'].median():.3f}")
print(f"  Difference:          {(df_planck['m_r'].median() - df_wmap['m_r'].median()):.4f} mag")

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Comparison of apparent magnitudes with different cosmologies:
  Planck18 median m_r: 21.015
  WMAP9 median m_r:    20.980
  Difference:          0.0346 mag

Calculating Detection Fractions

One common use case is to estimate what fraction of FRB hosts would be detectable given a survey magnitude limit.

[17]:

# Survey depth limits
mag_limits = {
    'Pan-STARRS': 23.2,
    'DECaLS': 23.5,
    'HSC-Wide': 26.0,
    'HSC-Deep': 27.0,
}

print("Fraction of FRB hosts detectable by survey depth:")
print("=" * 65)
print(f"{'Survey':<15} {'Limit':<8} {'CHIME':<12} {'DSA':<12} {'ASKAP':<12}")
print("-" * 65)

for survey_name, mag_limit in mag_limits.items():
    fracs = []
    for df in [df_chime, df_dsa, df_askap]:
        frac = (df['m_r'] < mag_limit).mean() * 100
        fracs.append(f"{frac:.1f}%")
    print(f"{survey_name:<15} {mag_limit:<8.1f} {fracs[0]:<12} {fracs[1]:<12} {fracs[2]:<12}")

Fraction of FRB hosts detectable by survey depth:
=================================================================
Survey          Limit    CHIME        DSA          ASKAP
-----------------------------------------------------------------
Pan-STARRS      23.2     78.0%        73.5%        96.6%
DECaLS          23.5     81.5%        76.9%        97.6%
HSC-Wide        26.0     96.7%        95.3%        99.9%
HSC-Deep        27.0     98.8%        98.1%        100.0%

Saving Generated FRBs

You can save the generated FRB population to various formats for later use.

[18]:

# Generate a sample population
df_sample = generate_frbs(10000, 'CHIME', seed=42)

# Save to different formats
# df_sample.to_csv('simulated_chime_frbs.csv', index=False)
# df_sample.to_parquet('simulated_chime_frbs.parquet', index=False)

print("Example of saving to CSV:")
print("  df_sample.to_csv('simulated_chime_frbs.csv', index=False)")
print("\nExample of saving to Parquet:")
print("  df_sample.to_parquet('simulated_chime_frbs.parquet', index=False)")

Loading P(DM,z) grid from /home/xavier/Projects/FRB/FRB/frb/data/DM/CHIME_pzdm.npz
Sampling DM values
Sampling redshifts
Sampling host galaxy absolute magnitudes
Using Lz values to sample host galaxy absolute magnitudes
Example of saving to CSV:
  df_sample.to_csv('simulated_chime_frbs.csv', index=False)

Example of saving to Parquet:
  df_sample.to_parquet('simulated_chime_frbs.parquet', index=False)

Summary

The generate_frbs() function provides a simple interface for generating realistic FRB populations:

Basic usage: generate_frbs(n_frbs, survey) - samples from survey-specific P(z,DM) grids
Custom DM distribution: Use dm_catalog to sample DMs from observed values
Reproducibility: Use seed for reproducible results
Custom cosmology: Use cosmo to specify a different cosmology

The generated populations can be used for:

Testing PATH analysis pipelines
Estimating host galaxy detection fractions
Studying selection effects across surveys
Planning observations

[ ]: