pp > ttbttb crashes at high number of grid points

Bug #1986923 reported by Vishakha Lingadahally
Affects            Status      Importance  Assigned to       Milestone
MadGraph5_aMC@NLO  Incomplete  Undecided   Rikkert Frederix

Bug Description

When I try to compute the cross-section for pp > ttbttb at NLO with a high number of grid points (1000000 points and 5 iterations), I get the following error while refining the results (MG5 3.3.2):

Exception: program /phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/SubProcesses/P0_gg_tttxtx/ajob1 5 all 0 1 launch ends with non zero status: 127. Stop all computation

I need such a high number of grid points to get stable statistics; there seems to be no way around that. I encounter no such error with fewer grid points, say around 100000, but then I don't obtain reasonably stable plots. Here's the log report:

#************************************************************
#* MadGraph5_aMC@NLO *
#* *
#* * * *
#* * * * * *
#* * * * * 5 * * * * *
#* * * * * *
#* * * *
#* *
#* *
#* VERSION 5.3.3.2 20xx-xx-xx *
#* *
#* The MadGraph5_aMC@NLO Development Team - Find us at *
#* https://server06.fynu.ucl.ac.be/projects/madgraph *
#* and *
#* http://amcatnlo.cern.ch *
#* *
#************************************************************
#* *
#* Command File for aMCatNLO *
#* *
#* run as ./bin/aMCatNLO.py filename *
#* *
#************************************************************
launch
Traceback (most recent call last):
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/extended_cmd.py", line 1544, in onecmd
    return self.onecmd_orig(line, **opt)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/extended_cmd.py", line 1493, in onecmd_orig
    return func(arg, **opt)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/amcatnlo_run_interface.py", line 1784, in do_launch
    evt_file = self.run(mode, options)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/amcatnlo_run_interface.py", line 1938, in run
    self.run_all_jobs(jobs_to_run,integration_step)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/amcatnlo_run_interface.py", line 2247, in run_all_jobs
    self.wait_for_complete(run_type)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/amcatnlo_run_interface.py", line 4864, in wait_for_complete
    self.cluster.wait(self.me_dir, update_status)
  File "/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/bin/internal/cluster.py", line 829, in wait
    raise Exception(self.fail_msg)
Exception: program /phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/SubProcesses/P0_gg_tttxtx/ajob1 5 all 0 1 launch ends with non zero status: 127. Stop all computation
Value of current Options:
              text_editor : None
      notification_center : True
       cluster_local_path : None
    cluster_status_update : (600, 30)
               hepmc_path : None
          pythia-pgs_path : None
              thepeg_path : None
        madanalysis5_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/HEPTools/madanalysis5
                 run_mode : 2
        cluster_temp_path : None
            cluster_queue : None
         madanalysis_path : None
                   lhapdf : /afs/desy.de/user/v/vishakha/local/bin/lhapdf-config
            f2py_compiler : None
                    ninja : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/HEPTools/lib
   automatic_html_opening : False
       cluster_retry_wait : 300
      exrootanalysis_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/ExRootAnalysis
                  timeout : 60
                  nb_core : 128
        f2py_compiler_py2 : None
        f2py_compiler_py3 : None
         fortran_compiler : None
                  collier : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/HEPTools/lib
             pythia8_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/HEPTools/pythia8
                hwpp_path : None
                    golem : None
                  td_path : None
             delphes_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/Delphes
              auto_update : 7
             cluster_type : condor
               eps_viewer : None
              web_browser : None
             cluster_size : 100
           cluster_memory : None
             stdout_level : None
               lhapdf_py3 : None
               lhapdf_py2 : /afs/desy.de/user/v/vishakha/local/bin/lhapdf-config
             cluster_time : None
mg5amc_py8_interface_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2/HEPTools/MG5aMC_PY8_interface
         cluster_nb_retry : 1
                 mg5_path : /afs/desy.de/user/v/vishakha/MG5_aMC_v3_3_2
             syscalc_path : None
             cpp_compiler : None
#************************************************************
#* MadGraph5_aMC@NLO *
#* *
#* * * *
#* * * * * *
#* * * * * 5 * * * * *
#* * * * * *
#* * * *
#* *
#* *
#* VERSION 3.3.2 2022-03-18 *
#* *
#* The MadGraph5_aMC@NLO Development Team - Find us at *
#* https://server06.fynu.ucl.ac.be/projects/madgraph *
#* *
#************************************************************
#* *
#* Command File for MadGraph5_aMC@NLO *
#* *
#* run as ./bin/mg5_aMC filename *
#* *
#************************************************************
set group_subprocesses Auto
set ignore_six_quark_processes False
set max_t_for_channel 99
set loop_optimized_output True
set low_mem_multicore_nlo_generation False
set default_unset_couplings 99
set include_lepton_initiated_processes False
set zerowidth_tchannel True
set nlo_mixed_expansion True
set loop_color_flows False
set gauge unitary
set complex_mass_scheme False
set max_npoint_for_channel 0
import model sm
define p = g u c d s u~ c~ d~ s~
define j = g u c d s u~ c~ d~ s~
define l+ = e+ mu+
define l- = e- mu-
define vl = ve vm vt
define vl~ = ve~ vm~ vt~
import model loop_sm
generate p p > t t~ t t~ [QCD]
output four_top_nlo
######################################################################
## PARAM_CARD AUTOMATICALY GENERATED BY MG5 FOLLOWING UFO MODEL ####
######################################################################
## ##
## Width set on Auto will be computed following the information ##
## present in the decay.py files of the model. ##
## See arXiv:1402.1178 for more details. ##
## ##
######################################################################

###################################
## INFORMATION FOR MASS
###################################
Block mass
    5 4.700000e+00 # MB
    6 1.732000e+02 # MT
   15 1.777000e+00 # MTA
   23 9.118800e+01 # MZ
   25 1.250000e+02 # MH
## Dependent parameters, given by model restrictions.
## Those values should be edited following the
## analytical expression. MG5 ignores those values
## but they are important for interfacing the output of MG5
## to external program such as Pythia.
  1 0.000000e+00 # d : 0.0
  2 0.000000e+00 # u : 0.0
  3 0.000000e+00 # s : 0.0
  4 0.000000e+00 # c : 0.0
  11 0.000000e+00 # e- : 0.0
  12 0.000000e+00 # ve : 0.0
  13 0.000000e+00 # mu- : 0.0
  14 0.000000e+00 # vm : 0.0
  16 0.000000e+00 # vt : 0.0
  21 0.000000e+00 # g : 0.0
  22 0.000000e+00 # a : 0.0
  24 8.041900e+01 # w+ : cmath.sqrt(MZ__exp__2/2. + cmath.sqrt(MZ__exp__4/4. - (aEW*cmath.pi*MZ__exp__2)/(Gf*sqrt__2)))

###################################
## INFORMATION FOR SMINPUTS
###################################
Block sminputs
    1 1.325070e+02 # aEWM1
    2 1.166390e-05 # Gf
    3 1.180000e-01 # aS (Note that Parameter not used if you use a PDF set)

###################################
## INFORMATION FOR YUKAWA
###################################
Block yukawa
    5 4.700000e+00 # ymb
    6 1.732000e+02 # ymt
   15 1.777000e+00 # ymtau

###################################
## INFORMATION FOR DECAY
###################################
DECAY 6 0.0e+00 # WT
DECAY 23 0.0e+00 # WZ
DECAY 24 2.047600e+00 # WW
DECAY 25 6.382339e-03 # WH
## Dependent parameters, given by model restrictions.
## Those values should be edited following the
## analytical expression. MG5 ignores those values
## but they are important for interfacing the output of MG5
## to external program such as Pythia.
DECAY 1 0.000000e+00 # d : 0.0
DECAY 2 0.000000e+00 # u : 0.0
DECAY 3 0.000000e+00 # s : 0.0
DECAY 4 0.000000e+00 # c : 0.0
DECAY 5 0.000000e+00 # b : 0.0
DECAY 11 0.000000e+00 # e- : 0.0
DECAY 12 0.000000e+00 # ve : 0.0
DECAY 13 0.000000e+00 # mu- : 0.0
DECAY 14 0.000000e+00 # vm : 0.0
DECAY 15 0.000000e+00 # ta- : 0.0
DECAY 16 0.000000e+00 # vt : 0.0
DECAY 21 0.000000e+00 # g : 0.0
DECAY 22 0.000000e+00 # a : 0.0
#===========================================================
# QUANTUM NUMBERS OF NEW STATE(S) (NON SM PDG CODE)
#===========================================================

Block QNUMBERS 82 # gh
        1 0 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 8 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
#***********************************************************************
# MadGraph5_aMC@NLO *
# *
# run_card.dat aMC@NLO *
# *
# This file is used to set the parameters of the run. *
# *
# Some notation/conventions: *
# *
# Lines starting with a hash (#) are info or comments *
# *
# mind the format: value = variable ! comment *
# *
# Some of the values of variables can be list. These can either be *
# comma or space separated. *
# *
# To display additional parameter, you can use the command: *
# update to_full *
#***********************************************************************
#
#*******************
# Running parameters
#*******************
#
#***********************************************************************
# Tag name for the run (one word) *
#***********************************************************************
  tag_1 = run_tag ! name of the run
#***********************************************************************
# Number of LHE events (and their normalization) and the required *
# (relative) accuracy on the Xsec. *
# These values are ignored for fixed order runs *
#***********************************************************************
 10000 = nevents ! Number of unweighted events requested
 -1.0 = req_acc ! Required accuracy (-1=auto determined from nevents)
 -1 = nevt_job! Max number of events per job in event generation.
                 ! (-1= no split).
#***********************************************************************
# Output format
#***********************************************************************
  -1.0 = time_of_flight ! threshold (in mm) below which the invariant livetime is not written (-1 means not written)
  average = event_norm ! average/sum/bias. Normalization of the weight in the LHEF
#***********************************************************************
# Number of points per itegration channel (ignored for aMC@NLO runs) *
#***********************************************************************
 -1 = req_acc_FO ! Required accuracy (-1=ignored, and use the
                           ! number of points and iter. below)
# These numbers are ignored except if req_acc_FO is equal to -1
 1000000 = npoints_FO_grid ! number of points to setup grids
 5 = niters_FO_grid ! number of iter. to setup grids
 1000000 = npoints_FO ! number of points to compute Xsec
 5 = niters_FO ! number of iter. to compute Xsec
#***********************************************************************
# Random number seed *
#***********************************************************************
 0 = iseed ! rnd seed (0=assigned automatically=default))
#***********************************************************************
# Collider type and energy *
#***********************************************************************
 1 = lpp1 ! beam 1 type (0 = no PDF)
 1 = lpp2 ! beam 2 type (0 = no PDF)
 7000.0 = ebeam1 ! beam 1 energy in GeV
 7000.0 = ebeam2 ! beam 2 energy in GeV
#***********************************************************************
# PDF choice: this automatically fixes also alpha_s(MZ) and its evol. *
#***********************************************************************
 lhapdf = pdlabel ! PDF set
 21100 = lhaid ! If pdlabel=lhapdf, this is the lhapdf number. Only
              ! numbers for central PDF sets are allowed. Can be a list;
              ! PDF sets beyond the first are included via reweighting.
#***********************************************************************
# Include the NLO Monte Carlo subtr. terms for the following parton *
# shower (HERWIG6 | HERWIGPP | PYTHIA6Q | PYTHIA6PT | PYTHIA8) *
# WARNING: PYTHIA6PT works only for processes without FSR!!!! *
#***********************************************************************
  HERWIG6 = parton_shower
  1.0 = shower_scale_factor ! multiply default shower starting
                                  ! scale by this factor
#***********************************************************************
# Renormalization and factorization scales *
# (Default functional form for the non-fixed scales is the sum of *
# the transverse masses divided by two of all final state particles *
# and partons. This can be changed in SubProcesses/set_scales.f or via *
# dynamical_scale_choice option) *
#***********************************************************************
 .true. = fixed_ren_scale ! if .true. use fixed ren scale
 .true. = fixed_fac_scale ! if .true. use fixed fac scale
 346.4 = muR_ref_fixed ! fixed ren reference scale
 346.4 = muF_ref_fixed ! fixed fact reference scale
 -1 = dynamical_scale_choice ! Choose one (or more) of the predefined
           ! dynamical choices. Can be a list; scale choices beyond the
           ! first are included via reweighting
 1.0 = muR_over_ref ! ratio of current muR over reference muR
 1.0 = muF_over_ref ! ratio of current muF over reference muF
#***********************************************************************
# Reweight variables for scale dependence and PDF uncertainty *
#***********************************************************************
 1.0, 2.0, 0.5 = rw_rscale ! muR factors to be included by reweighting
 1.0, 2.0, 0.5 = rw_fscale ! muF factors to be included by reweighting
 True = reweight_scale ! Reweight to get scale variation using the
            ! rw_rscale and rw_fscale factors. Should be a list of
            ! booleans of equal length to dynamical_scale_choice to
            ! specify for which choice to include scale dependence.
 False = reweight_PDF ! Reweight to get PDF uncertainty. Should be a
            ! list booleans of equal length to lhaid to specify for
            ! which PDF set to include the uncertainties.
#***********************************************************************
# Store reweight information in the LHE file for off-line model- *
# parameter reweighting at NLO+PS accuracy *
#***********************************************************************
 False = store_rwgt_info ! Store info for reweighting in LHE file
#***********************************************************************
# ickkw parameter: *
# 0: No merging *
# 3: FxFx Merging - WARNING! Applies merging only at the hard-event *
# level. After showering an MLM-type merging should be applied as *
# well. See http://amcatnlo.cern.ch/FxFx_merging.htm for details. *
# 4: UNLOPS merging (with pythia8 only). No interface from within *
# MG5_aMC available, but available in Pythia8. *
# -1: NNLL+NLO jet-veto computation. See arxiv:1412.8408 [hep-ph]. *
#***********************************************************************
 0 = ickkw
#***********************************************************************
#
#***********************************************************************
# BW cutoff (M+/-bwcutoff*Gamma). Determines which resonances are *
# written in the LHE event file *
#***********************************************************************
 0.0 = bwcutoff
#***********************************************************************
# Cuts on the jets. Jet clustering is performed by FastJet. *
# - If gamma_is_j, photons are also clustered with jets. *
# Otherwise, they will be treated as tagged particles and photon *
# isolation will be applied. Note that photons in the real emission *
# will always be clustered with QCD partons. *
# - When matching to a parton shower, these generation cuts should be *
# considerably softer than the analysis cuts. *
# - More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  0.0 = jetalgo ! FastJet jet algorithm (1=kT, 0=C/A, -1=anti-kT)
  0.0 = jetradius ! The radius parameter for the jet algorithm
  0.0 = ptj ! Min jet transverse momentum
  0.0 = etaj ! Max jet abs(pseudo-rap) (a value .lt.0 means no cut)
 False = gamma_is_j! Wether to cluster photons as jets or not
#***********************************************************************
# Cuts on the charged leptons (e+, e-, mu+, mu-, tau+ and tau-) *
# More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  0.0 = ptl ! Min lepton transverse momentum
  0.0 = etal ! Max lepton abs(pseudo-rap) (a value .lt.0 means no cut)
  0.0 = drll ! Min distance between opposite sign lepton pairs
  0.0 = drll_sf ! Min distance between opp. sign same-flavor lepton pairs
  0.0 = mll ! Min inv. mass of all opposite sign lepton pairs
  0.0 = mll_sf ! Min inv. mass of all opp. sign same-flavor lepton pairs
#***********************************************************************
# Fermion-photon recombination parameters *
# If Rphreco=0, no recombination is performed *
#***********************************************************************
 0.0 = Rphreco ! Minimum fermion-photon distance for recombination
 0.0 = etaphreco ! Maximum abs(pseudo-rap) for photons to be recombined (a value .lt.0 means no cut)
 False = lepphreco ! Recombine photons and leptons together
 False = quarkphreco ! Recombine photons and quarks together
#***********************************************************************
# Photon-isolation cuts, according to hep-ph/9801442 *
# Not applied if gamma_is_j *
# When ptgmin=0, all the other parameters are ignored *
# More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  0 = ptgmin ! Min photon transverse momentum
  0.0 = etagamma ! Max photon abs(pseudo-rap)
  0.0 = R0gamma ! Radius of isolation code
  1.0 = xn ! n parameter of eq.(3.4) in hep-ph/9801442
  1.0 = epsgamma ! epsilon_gamma parameter of eq.(3.4) in hep-ph/9801442
 True = isoEM ! isolate photons from EM energy (photons and leptons)
#***********************************************************************
# Cuts associated to MASSIVE particles identified by their PDG codes. *
# All cuts are applied to both particles and anti-particles, so use *
# POSITIVE PDG CODES only. Example of the syntax is {6 : 100} or *
# {6:100, 25:200} for multiple particles *
#***********************************************************************
  {} = pt_min_pdg ! Min pT for a massive particle
  {} = pt_max_pdg ! Max pT for a massive particle
  {} = mxx_min_pdg ! inv. mass for any pair of (anti)particles
#***********************************************************************
# Use PineAPPL to generate PDF-independent fast-interpolation grid *
# (https://zenodo.org/record/3992765#.X2EWy5MzbVo) *
#***********************************************************************
 False = pineappl ! PineAPPL switch
#***********************************************************************

Is this an inherent problem of MG5 3.3.2? Could you kindly let me know if there is a way to solve it?

Thank you!

Vishakha

Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

Error 127 is a standard bash error for "executable not found" (or library not found).
That is quite surprising if you successfully generate with a lower number of points.

I am assigning this to Rikkert, who knows that part of the code better than I do, to see whether some executable might only be called for a large number of points (maybe a link to the gzip library?).
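
A minimal sketch of what exit status 127 looks like in practice, assuming a POSIX `sh` is available on the machine. The helper `shell_exit_status` and the command name `no_such_executable_for_mg5_demo` are hypothetical and only serve to trigger the "not found" case; nothing here is taken from the MG5_aMC code.

```python
import subprocess


def shell_exit_status(command: str) -> int:
    """Run `command` through sh and return its exit status."""
    completed = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


# Hypothetical command name that is assumed not to exist on PATH.
missing_status = shell_exit_status("no_such_executable_for_mg5_demo")
# `true` is a standard utility that always succeeds.
ok_status = shell_exit_status("true")
print("missing command:", missing_status)
print("working command:", ok_status)
```

If the same status only shows up on some cluster nodes, comparing PATH and the library search path between a working and a failing node may be a reasonable first check.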

Changed in mg5amcnlo:
assignee: nobody → Rikkert Frederix (frederix)
Vishakha Lingadahally (vishakha-ls) wrote :

Alright, thank you very much! I look forward to hearing further from you.

Rikkert Frederix (frederix) wrote :

Dear Vishakha,

It looks like one of the jobs in the /phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/SubProcesses/P0_gg_tttxtx/all_G* directories crashed. Do the log.txt files in these directories show a crash or a reason why one of these jobs stopped prematurely?
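
One possible way to go through those log.txt files without opening each one by hand is sketched below. The function name `find_suspicious_logs` and the list of crash markers are assumptions made for illustration; they are not an official list of MG5_aMC error messages, so adjust them to whatever the logs actually contain.

```python
import glob
import os

# Assumed crash markers (illustrative guesses, not an official MG5_aMC list).
CRASH_MARKERS = (
    "segmentation fault",
    "killed",
    "command not found",
    "error while loading shared libraries",
    "backtrace",
)


def find_suspicious_logs(subprocess_dir, pattern="all_G*", log_name="log.txt"):
    """Return {log_path: [lines]} for job logs that are missing or contain a marker."""
    findings = {}
    for job_dir in sorted(glob.glob(os.path.join(subprocess_dir, pattern))):
        if not os.path.isdir(job_dir):
            continue
        log_path = os.path.join(job_dir, log_name)
        if not os.path.isfile(log_path):
            # A job that died before writing its log is worth a look too.
            findings[log_path] = ["<log file missing>"]
            continue
        with open(log_path, errors="replace") as handle:
            hits = [
                line.rstrip("\n")
                for line in handle
                if any(marker in line.lower() for marker in CRASH_MARKERS)
            ]
        if hits:
            findings[log_path] = hits
    return findings
```

Calling `find_suspicious_logs("/phenod/data/vishakha/MG5_aMC_v3_3_2/four_top_nlo/SubProcesses/P0_gg_tttxtx")` and printing the result would list only the job directories that need a closer look.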

Two side remarks:
- I notice that you are using a four-flavour scheme for this process. This is far from optimal, and you should switch to a five-flavour scheme for increased accuracy.
- I see that you prefer setting the number of grid points and iterations explicitly instead of letting MG5_aMC determine them from a small req_acc_FO. Setting the number of points explicitly, as you do, is typically much more CPU intensive: the same number of points is used for every integration channel irrespective of that channel's contribution to the overall result, so a lot of CPU time is wasted on channels whose contribution is small. Using req_acc_FO instead typically results in a better distribution of the points among the integration channels. In fact, the best method is probably to first run with req_acc_FO = 0.003. If the results are not accurate enough, reduce this number, e.g. by dividing it by three (which will result in roughly a factor 10 longer running time), and do a second run with the 'bin/calculate_xsec --nocompile --only_generation' command (or similar if using the 'launch' command), which skips compilation and starts from the existing grids to improve their accuracy.
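
The "roughly a factor 10" estimate can be reproduced with a back-of-the-envelope calculation, sketched here under the assumption that the statistical error of a Monte Carlo integration scales as 1/sqrt(N), so the number of points (and hence the running time) scales as 1/req_acc_FO^2. The function `relative_runtime` is a hypothetical helper, not part of MG5_aMC, and real runs with adaptive grids will not follow this scaling exactly.

```python
def relative_runtime(old_req_acc: float, new_req_acc: float) -> float:
    """Estimated run-time ratio when tightening the requested accuracy.

    Assumes error ~ 1/sqrt(N), i.e. run time ~ 1/accuracy**2.
    """
    if old_req_acc <= 0 or new_req_acc <= 0:
        raise ValueError("requested accuracies must be positive")
    return (old_req_acc / new_req_acc) ** 2


# Going from req_acc_FO = 0.003 to 0.001 (division by three).
print(relative_runtime(0.003, 0.001))
```

Dividing the requested accuracy by three therefore costs about nine times the running time under this assumption, in line with the factor quoted above.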

Vishakha Lingadahally (vishakha-ls) wrote :

Thank you very much for the reply! I will get back to you once I have checked the log files and tried varying the accuracy instead of the number of grid points.

Warm regards,
Vishakha

Changed in mg5amcnlo:
status: New → Incomplete