properly counting gzip crawl file list (#437)
edisonguo authored Jun 9, 2020
1 parent a96c393 commit e7b2b60
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion crawl/crawl_pipeline.sh
@@ -47,7 +47,7 @@ fi

crawl_file="$data_dir/${job_id}_gdal.tsv.gz"

-total_files=$(cat $file_list | wc -l)
+total_files=$(zcat $file_list | wc -l)
batch_size=$(echo "batch=${total_files}/${conc_limit}; if(batch<1){batch=1;}; if(batch>10){batch=10;}; batch;"|bc)

echo "INFO: file list to crawl: $file_list"
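
Note on the change: the file list is gzip-compressed, so `cat ... | wc -l` counts newline bytes in the compressed stream rather than the listed files, which in turn skews `batch_size`. The following is a minimal sketch of that difference, not the pipeline's actual code. The temporary file, the three sample paths, and `conc_limit=2` are assumptions made up for illustration, and the shell arithmetic is a stand-in for the `bc` expression with the same integer division and clamp to [1, 10].

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real crawl file list: three paths, gzip-compressed.
tmp_dir=$(mktemp -d)
file_list="$tmp_dir/file_list.gz"
printf '%s\n' /data/a.nc /data/b.nc /data/c.nc | gzip -c > "$file_list"

# Old behaviour: counts newline bytes in the compressed stream, which has
# no fixed relation to the number of listed files.
raw_count=$(wc -l < "$file_list")

# New behaviour: decompress first, then count lines.
# `gzip -dc` performs the same operation as `zcat` on a .gz file.
total_files=$(gzip -dc "$file_list" | wc -l)

# Stand-in for the bc expression: integer division, clamped to [1, 10].
conc_limit=2   # assumed value for illustration
batch_size=$(( total_files / conc_limit ))
if (( batch_size < 1 )); then batch_size=1; fi
if (( batch_size > 10 )); then batch_size=10; fi

echo "newlines in compressed stream: $raw_count"
echo "total_files=$total_files batch_size=$batch_size"

rm -rf "$tmp_dir"
```

The compressed-stream count varies with the data, so it is printed only for comparison and should not be relied on.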