Hi @DonggeLiu @jonathanmetzman
Lately I've been running a lot of local experiments on fuzzbench and noticed that, after I added the --runners-cpus flag, reports were sometimes incomplete due to a race condition.
This is my config:
# The number of trials of a fuzzer-benchmark pair.
trials: 5
# The amount of time in seconds that each trial is run for.
# 1 day = 24 * 60 * 60 = 86400
max_total_time: 3600
# The location of the docker registry.
# FIXME: Support custom docker registry.
# See https://github.com/google/fuzzbench/issues/777
docker_registry: gcr.io/fuzzbench
# The local experiment folder that will store most of the experiment data.
# Please use an absolute path.
experiment_filestore: /home/zuka/hexhive/data/local-runs/experiment-data
# The local report folder where HTML reports and summary data will be stored.
# Please use an absolute path.
report_filestore: /home/zuka/hexhive/data/local-runs/report-data
# Flag that indicates this is a local experiment.
local_experiment: true
I start the experiment with this command:
PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config experiment-config.yaml \
--benchmarks curl_curl_fuzzer_http freetype2_ftfuzzer bloaty_fuzz_target jsoncpp_jsoncpp_fuzzer libxml2_xml sqlite3_ossfuzz vorbis_decode_fuzzer \
--experiment-name libafl-1h-with-seeds \
--fuzzers libafl_default libafl_random libafl_weighted libafl_valprof libafl_covaccount \
--concurrent-builds 15 --runners-cpus 15 --measurers-cpus 1
Besides restricting the number of usable CPUs, --runners-cpus also adds CPU pinning to the docker command. Most of the time I get only the first cycle of trials in the report (for example, if I run with --runners-cpus 16, I get only 16 trials). For the other trials there are fuzzer logs and corpus archives, but no coverage archives.
The reason is that measurer_main_process ends before the next cycle of trials is started. I see "Finished measure loop." in the logs after the first cycle, and the loop is never restarted.
After some more debugging, I found the issue in this piece of code inside measure_manager_loop:
while not scheduler.all_trials_ended(experiment):
    continue_inner_loop = measure_manager_inner_loop(
        experiment, max_cycle, request_queue, response_queue,
        queued_snapshots)
    if not continue_inner_loop:
        break
    time.sleep(MEASUREMENT_LOOP_WAIT)
After the first cycle ends, measure_manager_inner_loop returns False because there are no unmeasured snapshots in the database yet, so the loop breaks out before the next cycle of trials has produced anything to measure.
I don't really understand the need for this break, so to fix the issue for my runs I removed the break logic from the measurer loop and let it run until scheduler.all_trials_ended returns true. If you think this is an acceptable solution, I can create a PR.
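To make the race easier to see, here is a toy simulation of the loop with and without the break. It is only a sketch and not FuzzBench code: run_measurer, the timeline list, and the break_when_idle parameter are made-up stand-ins, and I'm assuming the inner loop returns False exactly when it finds nothing to measure.

```python
# Toy model of the measurer manager loop. All names are hypothetical
# stand-ins for illustration; none of this is FuzzBench code.


def run_measurer(snapshots_per_tick, break_when_idle):
    """Simulate the manager loop and return how many snapshots got measured.

    snapshots_per_tick[i] is the number of new unmeasured snapshots that
    appear at tick i. Exhausting the list stands in for
    scheduler.all_trials_ended(experiment) becoming true.
    """
    measured = 0
    for new_snapshots in snapshots_per_tick:
        # Stand-in for measure_manager_inner_loop: assumed to return False
        # when there was nothing to measure on this pass.
        continue_inner_loop = new_snapshots > 0
        measured += new_snapshots
        if break_when_idle and not continue_inner_loop:
            # Current behaviour: one idle pass ends measuring for good.
            break
    return measured


# The first cycle produces snapshots, then there is a gap while the next
# cycle of trials starts up, then the second cycle produces snapshots.
timeline = [3, 2, 0, 0, 4, 1]

with_break = run_measurer(timeline, break_when_idle=True)
without_break = run_measurer(timeline, break_when_idle=False)
print(with_break, without_break)
```

With the break, everything after the idle gap is never measured, which matches what I see in the reports: corpus archives exist for the later trials but coverage archives do not.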