Collision

This program addresses the computation of 2-2, 2-3, and 3-2 collision terms in different systems:

where the 2-2 collision term is:

collision

2-3 collision term is:

Collision

3-2 collision term is:

Collison

To understand the background and principles of the application of the program you can refer to the following article

Towards a full solution of the relativistic Boltzmann equation for quark-gluon matter on GPUs

Advantages of the program

  • Solving high-dimensional integrals by Monte Carlo methods;
  • Conservation of particle number by “symmetric sampling” method;
  • Reducing computation time with CUDA_Atomic calculations;
  • Parallelizing programs with CUDA.

Structure of the program

Since the program structures in the three collision processes are basically the same, only the 2-2 collision process is described in detail, while the differences in the 2-3, 3-2 collision program are labeled.

  • Init
    • collision_type_for_all_species()
  • Main
    • Collision_term()
      • collision_term_kernel()
      • collision_term_at_specific_point()
      • accumulate_collision_term()
  • other_calculation
    • Amplitude_square()
    • H_fun()
    • C1_fun()
    • C2_fun()
    • threeD_to_oneD()

The running logic of the whole program is shown below:

flowchart

collision_term_kernel()

This function performs the parallelization of the program and contains two main functions:

1、Corresponding the block indexes of the parallel computation to the momentum space and the position space;
2、Computing the momentum and energy of the particles.
  • The program corresponds the vector, positional indexes to the ‘thread’, and the 6 dimensions become 1 dimension;
i_grid = cuda.grid(1)
        
if i_grid < total_grid:
    
    # convert one-d index into six-d
    ipz = i_grid%npz
    ipz_rest = i_grid//npz
    ipy = ipz_rest%npy
    ipy_rest = ipz_rest//npy
    ipx = ipy_rest%npx
    ipx_rest = ipy_rest//npx
    iz = ipx_rest%nz
    iz_rest = ipx_rest//nz
    iy = iz_rest%ny
    iy_rest = iz_rest//ny
    ix = iy_rest%nx
  • Calculate the momentum and energy of different particles:

enenrgy

for p_type in range(num_of_particle_types):

    px = ((ipx+0.5)*dpx[p_type] - half_px[p_type])/(npx**i_level)
    py = ((ipy+0.5)*dpy[p_type] - half_py[p_type])/(npy**i_level)
    pz = ((ipz+0.5)*dpz[p_type] - half_pz[p_type])/(npz**i_level)

    mp_squared = masses[p_type]**2
    p0 = math.sqrt(mp_squared*c**2+px**2+py**2+pz**2)

collision_term_at_specific_point()

This function consists of two main functions:

1、Calculate int_vloume;
2、Calculate k1z1, k1z2 using Monte Carlo method;
  • Calculate int_volume:

    2-2 Collision

int_volume

if collision_type[i_collision] < 1000:
            
        flavor0 = flavor[i_collision,0]
        flavor1 = flavor[i_collision,1]
        flavor2 = flavor[i_collision,2]

        int_volume = 2**5*half_px[flavor0]*half_py[flavor0]*half_px[flavor2]*half_py[flavor2]\
        *half_pz[flavor2]/(num_samples*4)

        m1 = masses[flavor0]
        m2 = masses[flavor1]
        m3 = masses[flavor2]

2-3 Collision

int_volume

if collision_type[i_collision] < 1000:

    int_volume = 2**8*half_px[flavor[i_collision,0]]*half_py[flavor[i_collision,0]]\
    *half_px[flavor[i_collision,2]]*half_py[flavor[i_collision,2]]*half_pz[flavor[i_collision,2]]\
    *half_px[flavor[i_collision,3]]*half_py[flavor[i_collision,3]]*half_pz[flavor[i_collision,3]]\
    /(num_samples*5)

3-2 Collision

int_volume

if collision_type[i_collision] < 1000:

    int_volume = 2**8*half_px[flavor[i_collision,3]]*half_py[flavor[i_collision,3]]\
    *half_px[flavor[i_collision,2]]*half_py[flavor[i_collision,2]]*half_pz[flavor[i_collision,2]]\
    *half_px[flavor[i_collision,0]]*half_py[flavor[i_collision,0]]*half_pz[flavor[i_collision,0]]\
    /(num_samples*5)
  • Calculation of k1z1, k1z2 using Monte Carlo for the purpose of calculating Ja

    2-2 Collision:

 for i_sample in range(num_samples):
                
    k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor0]
    k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor0]
    k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor2]
    k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor2]
    k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor2]
    
    k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)

    C1 = C1_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)
    C2 = C2_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)
    H = H_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)

    if H>1e-13 and abs(C2)>1e-13:
        k1z1, k1z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2

**2-3 Collision: **

for i_sample in range(num_samples):

    # masses of the particles
    m1 = masses[flavor[i_collision,0]]
    m2 = masses[flavor[i_collision,1]]
    m3 = masses[flavor[i_collision,2]]
    m4 = masses[flavor[i_collision,3]]
    mp = masses[flavor[i_collision,4]]

    # randomly generate k1x, k1y, k3x, k3y, k3z
    # the momentum values need to be sampled according to the specific particle types
    k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,0]]
    k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,0]]
    k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,2]]
    k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,2]]
    k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,2]]
    k4x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,3]]
    k4y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,3]]
    k4z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,3]]

    k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)
    k40 = math.sqrt(m4**2*c**2+k4x**2+k4y**2+k4z**2)

    # solving for k1z via energy conservation
    # k1z has two values
    C1 = C1_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)
    C2 = C2_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)
    H =  H_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)

    # take the value if H>1e-13 and abs(C2)>1e-13
    if H>1e-13 and abs(C2)>1e-13:
        k1z1, k1z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2

**3-2 Collision: **

for i_sample in range(num_samples):

    # masses of the particles
    m1 = masses[flavor[i_collision,0]]
    m2 = masses[flavor[i_collision,1]]
    m3 = masses[flavor[i_collision,2]]
    m4 = masses[flavor[i_collision,3]]
    mp = masses[flavor[i_collision,4]]

    # randomly generate k4x, k4y, k3x, k3y, k3z and k1x, k1y, k1z
    # the momentum values need to be sampled according to the specific particle types
    k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,0]]
    k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,0]]
    k1z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,0]]
    k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,2]]
    k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,2]]
    k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,2]]
    k4x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,3]]
    k4y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,3]]

    k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)
    k10 = math.sqrt(m1**2*c**2+k1x**2+k1y**2+k1z**2)

    # solving for k1z via energy conservation
    # k1z has two values
    C1 = C1_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)
    C2 = C2_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)
    H =  H_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)

    # take the value if H>1e-13 and abs(C2)>1e-13
    if H>1e-13 and abs(C2)>1e-13:
        k4z1, k4z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2

accumulate_collision_term()

This function mainly solves for the final parameters and contains four functions:

1、Calculation Ja、momentum_product;
2、Calculation f0、f1、f2、fp;
3、Calculation amplitude_square;
4、Calculation collision term.
  • Calculation Ja and momentum_product

​ Here, we choose to express k2 as k3+p-k1, which leads to a more stable integration. All computed values of k2 undergo a validation process to ensure they fall within the phase space defined by the particle. The specific calculation formula is depicted in the diagram below:

​ **2-2 Collision: **

Ja

Ja

momentum_productmomentum_product

if k1z > -half_pz[flavor0] and k1z < half_pz[flavor0]:
        
        # obtain k2x, k2y, k2z via momentum conservation
        k2x = k3x + px - k1x
        k2y = k3y + py - k1y
        k2z = k3z + pz - k1z
        
        # k2x, k2y, k2z must also be in the momentum box range of particle 1
        if (k2x > -half_px[flavor1] and k2x < half_px[flavor1] and 
            k2y > -half_py[flavor1] and k2y < half_py[flavor1] and 
            k2z > -half_pz[flavor1] and k2z < half_pz[flavor1]):
            
            # energy for particle 1 and 2
            k10, k20 = math.sqrt(m1_squared*c**2+k1x**2+k1y**2+k1z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)

            # energy must be conserved
            if abs(k10+k20-k30-p0) < 10**(-18):
                # jacobin term
                Ja = abs(k1z/k10 - (-k1z+k3z+pz)/k20)
                # momentum product
                momentum_product = k10*k20*k30*p0

​ **2-3 Collision: **

Ja

Ja

momentum_productmomentum_product

if k1z > -half_pz[flavor[i_collision,0]] and k1z < half_pz[flavor[i_collision,0]]:
        
        # obtain k2x, k2y, k2z via momentum conservation
        k2x = k3x + k4x + px - k1x
        k2y = k3y + k4y + py - k1y
        k2z = k3z + k4z + pz - k1z

        # k2x, k2y, k2z must also be in the momentum box range of particle 1
        if (k2x > -half_px[flavor[i_collision,1]] and k2x < half_px[flavor[i_collision,1]] and 
            k2y > -half_py[flavor[i_collision,1]] and k2y < half_py[flavor[i_collision,1]] and 
            k2z > -half_pz[flavor[i_collision,1]] and k2z < half_pz[flavor[i_collision,1]]):

            # energy for particle 1 and 2
            k10, k20 = math.sqrt(m1_squared*c**2+k1x**2+k1y**2+k1z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)

            # energy must be conserved
            if abs(k10+k20-k30-k40-p0) < 10**(-18):
                # jacobin term
                Ja = abs(k1z/k10 - (-k1z+k3z+k4z+pz)/k20)
                # momentum product
                momentum_product = k10*k20*k30*k40*p0

​ **3-2 Collision: **

Ja

Ja

momentum_productmomentum_product

if k4z > -half_pz[flavor[i_collision,3]] and k4z < half_pz[flavor[i_collision,3]]:
        
        # obtain k2x, k2y, k2z via momentum conservation
        k2x = k4x + px - k1x - k3x
        k2y = k4y + py - k1y - k3y
        k2z = k4z + pz - k1z - k3z
        
        # k2x, k2y, k2z must also be in the momentum box range of particle 1
        if (k2x > -half_px[flavor[i_collision,1]] and k2x < half_px[flavor[i_collision,1]] and 
            k2y > -half_py[flavor[i_collision,1]] and k2y < half_py[flavor[i_collision,1]] and 
            k2z > -half_pz[flavor[i_collision,1]] and k2z < half_pz[flavor[i_collision,1]]):
            
            # energy for particle 4 and 2
            k40, k20 = math.sqrt(m4_squared*c**2+k4x**2+k4y**2+k4z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)
            
            # energy must be conserved
            if abs(k10+k20+k30-k40-p0) < 10**(-18):
                # jacobin term
                Ja = abs(-k4z/k40 + (-k1z-k3z+k4z+pz)/k20)
                # momentum product
                momentum_product = k10*k20*k30*k40*p0
  • Calculate f and F

​ The first step is to find the “location” of the particle according to the index.

  if Ja>1e-5 and momentum_product > 10**(-19):
  
     # for particle 0
    ik1x, ik1y, ik1z = int((k1x + half_px[flavor0])//dpx[flavor0]), \
                       int((k1y + half_py[flavor0])//dpy[flavor0]), \
                       int((k1z + half_pz[flavor0])//dpz[flavor0])
    # convert grid index in one dimension
    i_phase_grid0 = threeD_to_oneD(ix,iy,iz,ik1x,ik1y,ik1z, \
                                   nx, ny, nz, npx, npy, npz)
    # for particle 1
    ik2x, ik2y, ik2z = int((k2x + half_px[flavor1])//dpx[flavor1]), \
                       int((k2y + half_py[flavor1])//dpy[flavor1]), \
                       int((k2z + half_pz[flavor1])//dpz[flavor1])
    # convert grid index in one dimension
    i_phase_grid1 = threeD_to_oneD(ix,iy,iz,ik2x,ik2y,ik2z, \
                                   nx, ny, nz, npx, npy, npz)


    # for particle 2
    ik3x, ik3y, ik3z = int((k3x + half_px[flavor2])//dpx[flavor2]), \
                       int((k3y + half_py[flavor2])//dpy[flavor2]), \
                       int((k3z + half_pz[flavor2])//dpz[flavor2])
    # convert grid index in one dimension
    i_phase_grid2 = threeD_to_oneD(ix,iy,iz,ik3x,ik3y,ik3z, \
                                   nx, ny, nz, npx, npy, npz)
  

​ Secondly, according to the index to find the corresponding particles of f and F and the difference fandF

​ **2-2 Collision: **

distrubtion

f = cuda.local.array(shape=4, dtype=float64)
f[0] = f_x_p_t0level[flavor0, i_phase_grid0]
f[1] = f_x_p_t0level[flavor1, i_phase_grid1]
f[2] = f_x_p_t0level[flavor2, i_phase_grid2]
f[3] = f_x_p_tilevel[p_type, i_grid]

# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=4, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(4):
    i_flavor = flavor[i_collision,i_particle]
    if particle_type[i_flavor] == 1:
        tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    elif particle_type[i_flavor] == 2:
        tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    else:
        tf[i_particle] = 1
        
distribution_terms = f[0]*f[1]*tf[2]*tf[3]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]] - tf[0]*tf[1]*f[2]*f[3]*amplitude_square/degeneracy[flavor[i_collision,2]]/degeneracy[flavor[i_collision,3]]

​ **2-3 Collision: **

distrubtion

f = cuda.local.array(shape=5, dtype=float64)
f[0] = f_x_p_t0level[flavor[i_collision,0], i_phase_grid0]
f[1] = f_x_p_t0level[flavor[i_collision,1], i_phase_grid1]
f[2] = f_x_p_t0level[flavor[i_collision,2], i_phase_grid2]
f[3] = f_x_p_t0level[flavor[i_collision,3], i_phase_grid3]
f[4] = f_x_p_tilevel[p_type, i_grid]

# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=5, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(5):
    i_flavor = flavor[i_collision,i_particle]
    if particle_type[i_flavor] == 1:
        tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    elif particle_type[i_flavor] == 2:
        tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    else:
        tf[i_particle] = 1
        
distribution_terms = (1/(2*math.pi*hbar)**3)*f[0]*f[1]*tf[2]*tf[3]*tf[4]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]] - tf[0]*tf[1]*f[2]*f[3]*f[4]*amplitude_square/degeneracy[flavor[i_collision,2]]/degeneracy[flavor[i_collision,3]]/degeneracy[flavor[i_collision,4]]

​ **3-2 Collision: **

distrubtion

f = cuda.local.array(shape=5, dtype=float64)
f[0] = f_x_p_t0level[flavor[i_collision,0], i_phase_grid0]
f[1] = f_x_p_t0level[flavor[i_collision,1], i_phase_grid1]
f[2] = f_x_p_t0level[flavor[i_collision,2], i_phase_grid2]
f[3] = f_x_p_t0level[flavor[i_collision,3], i_phase_grid3]
f[4] = f_x_p_tilevel[p_type, i_grid]

# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=5, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(5):
    i_flavor = flavor[i_collision,i_particle]
    if particle_type[i_flavor] == 1:
        tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    elif particle_type[i_flavor] == 2:
        tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
    else:
        tf[i_particle] = 1
        
distribution_terms = f[0]*f[1]*f[2]*tf[3]*tf[4]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]]/degeneracy[flavor[i_collision,2]] - (1/(2*math.pi*hbar)**3)*tf[0]*tf[1]*tf[2]*f[3]*f[4]*amplitude_square/degeneracy[flavor[i_collision,3]]/degeneracy[flavor[i_collision,4]]
  • Calculate amplitude_square;

amplitude_square

For a specific and detailed description, please refer to amplitude_square, where the function is thoroughly explained.

  • Calculate collision term

​ **2-2 Collision: **

collision1

The “symmetric sampling” method is used to ensure the conservation of particle number, the details of which can be found in the paper published by Prof. Jun-Jie Zhang. Towards a full solution of the relativistic Boltzmann equation for quark-gluon matter on GPUs

if flavor0==flavor1:
    symmetry_factor = 0.5
else:
    symmetry_factor = 1.0

    # accumulate collision kernel
    # some factors are compensated later
    result = distribution_terms*symmetry_factor\
    *hbar**2*c/Ja*int_volume/(momentum_product*64*math.pi**2)

    cuda.atomic.add(collision_term0level, \
                    (flavor0, i_phase_grid0), -result/dp[flavor0]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor1, i_phase_grid1), -result/dp[flavor1]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor2, i_phase_grid2), result/dp[flavor2]*dp[p_type])
    cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)

​ **2-3 Collision: **

collision1

if flavor[i_collision,0]==flavor[i_collision,1]:
    symmetry_factor = 0.5
else:
    symmetry_factor = 1.0

    # accumulate collision kernel
    # some factors are compensated later
    result = distribution_terms*symmetry_factor*\
    hbar**3*c*int_volume/(momentum_product*128*math.pi**2)/Ja

    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,0], i_phase_grid0), -result/dp[flavor[i_collision,0]]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,1], i_phase_grid1), -result/dp[flavor[i_collision,1]]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,2], i_phase_grid2), result/dp[flavor[i_collision,2]]*dp[p_type])   
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,3], i_phase_grid3), result/dp[flavor[i_collision,3]]*dp[p_type])
    cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)

​ **3-2 Collision: **

collision1

if flavor[i_collision,0]==flavor[i_collision,1]==flavor[i_collision,2]:
    symmetry_factor = 1/3
elif flavor[i_collision,0]==flavor[i_collision,1] or \
flavor[i_collision,1]==flavor[i_collision,2] or \
flavor[i_collision,0]==flavor[i_collision,2]:
    symmetry_factor = 0.5
else:
    symmetry_factor = 1.0

    # accumulate collision kernel
    # some factors are compensated later
    result = distribution_terms*symmetry_factor*\
    hbar**3*c*int_volume/(momentum_product*128*math.pi**2)/Ja

    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,0], i_phase_grid0), -result/dp[flavor[i_collision,0]]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,1], i_phase_grid1), -result/dp[flavor[i_collision,1]]*dp[p_type])
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,2], i_phase_grid2), -result/dp[flavor[i_collision,2]]*dp[p_type])   
    cuda.atomic.add(collision_term0level, \
                    (flavor[i_collision,3], i_phase_grid3), result/dp[flavor[i_collision,3]]*dp[p_type])
    cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)

results matching ""

    No results matching ""