Commit 226ad88

Faster tender scraping
1 parent 3fde307 commit 226ad88

1 file changed: +6 -4 lines changed
budgetkey_data_pipelines/pipelines/procurement/tenders/pipeline-spec.yaml

Lines changed: 6 additions & 4 deletions
@@ -21,9 +21,9 @@ scraper-exemptions:
         input_resource: mr-gov-il-search-exemption-messages
         tender_type: exemptions
     - run: throttle
-    # parameters:
-    #   sleep-seconds: 0.5
-    #   log-interval-seconds: 2
+      parameters:
+        sleep-seconds: 0.25
+    #   log-interval-seconds: 0
     # go over each publisher and get urls for all exemptions related to this publisher
     # also adds tender_type column - which is needed as all tenders are on same table
     - run: add_publisher_urls_resource
@@ -115,11 +115,13 @@ scraper:
     - run: check_existing
       parameters:
         db-table: procurement_tenders
-    - run: throttle
     # download the data from pages which don't exist in DB (is_new == False)
     - run: download_pages_data
       runner: tzabar
     # parse the data for each exemption
+    - run: throttle
+      parameters:
+        sleep-seconds: 0.25
     - run: parse_page_data
       parameters:
         output_resource: tenders
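
For readers unfamiliar with the `throttle` step, here is a minimal sketch of what a per-row throttle with a `sleep-seconds` parameter might do. It is not the project's actual processor, whose source is not part of this commit. The function name `throttle`, the `sleep_seconds` argument, and the injectable `sleep` callable are all hypothetical, and the sketch assumes the step pauses once before passing each row downstream.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttle(
    rows: Iterable[T],
    sleep_seconds: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield rows unchanged, pausing before each one.

    Hypothetical illustration only: `sleep_seconds` mirrors the
    `sleep-seconds` parameter in the spec; `sleep` is injectable so the
    delay can be replaced in tests.
    """
    for row in rows:
        sleep(sleep_seconds)  # assumed behaviour: one pause per row
        yield row


if __name__ == "__main__":
    # Three fake rows; with the default delay this waits about 0.75 s in total.
    for row in throttle([{"id": 1}, {"id": 2}, {"id": 3}]):
        print(row)
```

Under this assumption the added delay grows linearly with the row count: at 0.25 seconds per row, 1,000 rows add roughly 250 seconds of waiting, so both the per-row value and the position of the step in the pipeline affect total scraping time.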
