Bilal Catic | fa712ce97d | implement RENT option for Rental; implement force crawl option | 2019-10-30 15:53:11 +01:00 |
Bilal Catic | 3abbed183e | implement RENT and REQUEST option for OLX; implement force crawl option | 2019-10-30 15:03:59 +01:00 |
Bilal Catic | 97d93a3f37 | add force crawl ENV option for OLX | 2019-10-30 15:02:54 +01:00 |
Bilal Catic | 1e36cb8423 | add ALL category option for Rental agency | 2019-10-28 09:24:08 +01:00 |
Bilal Catic | 2c2fcd648f | remove scrapeAd logging | 2019-10-28 09:23:51 +01:00 |
Bilal Catic | 5b6886f52b | add ALL categories option for Aktido agency | 2019-10-28 09:20:03 +01:00 |
Bilal Catic | f899c96dc6 | add crawler and crawler config for Aktido agency | 2019-10-28 09:14:45 +01:00 |
Bilal Catic | 747ebb88e5 | add debugging log switch for crawler process | 2019-10-25 11:08:52 +02:00 |
Bilal Catic | 7e3b0bfcd5 | implement crawler for Prostor agency | 2019-10-25 10:54:08 +02:00 |
Bilal Catic | 6fc4218e39 | add config files for Prostor agency | 2019-10-24 17:11:12 +02:00 |
Bilal Catic | 935ae60ae1 | move specific crawler config to the separated files | 2019-10-24 16:57:23 +02:00 |
Bilal Catic | 2064d40985 | stop "rental" crawler if there are no new real estates on the page | 2019-10-24 11:26:11 +02:00 |
Bilal Catic | a6336b7d27 | implement crawler for "rental" agency | 2019-10-24 11:26:11 +02:00 |
Bilal Catic | ec798fe94c | add crawler config and include specific crawler for "rental" agency | 2019-10-24 11:26:11 +02:00 |
Bilal Catic | 88e7cac420 | allow olx crawler to recognize other OLX categories | 2019-10-14 09:27:32 +02:00 |
Bilal Catic | 9fc5072632 | include other olx real estate categories in enums and configs | 2019-10-14 09:27:32 +02:00 |
Bilal Catic | 0818fcecd2 | remove crawler and saver logging | 2019-10-10 00:59:12 +02:00 |
Bilal Catic | 5e8e13a984 | fix enums | 2019-09-30 14:27:01 +02:00 |
Bilal Catic | 9c0104a57c | refactor crawler - adapt to use new ENUM objects | 2019-09-30 10:27:12 +02:00 |
Bilal Catic | e3e47345bc | load AWS config through app config; fix ENV path | 2019-09-30 09:44:19 +02:00 |
Bilal Catic | 2e92f961ff | start crawler loop when server is started | 2019-09-26 17:30:06 +02:00 |
Bilal Catic | 3d203df988 | remove comment from delay between indexing pages | 2019-09-25 10:00:42 +00:00 |
Bilal Catic | c9a959f8be | stop crawling when existing, not renewed ad is found | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | b3fcc6ba9a | return new and existing real estates when saving results | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | f93d0e738f | add delay between pages config variable | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | 90bc57edb6 | stop crawling when existing, non-renewed ad is found | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | 06d35fcb4b | move ignored usernames config to crawler specific config | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | 63eb64b0f6 | parse and save published and renewed dates | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | 3140fdf0c0 | use function generator to index pages; crawl in parallel | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | c4f6c6e1c3 | construct crawling url before indexing single page | 2019-09-25 08:55:00 +02:00 |
Bilal Catic | 3d46c82d3d | create new crawler and Postgres saver | 2019-09-18 15:32:48 +02:00 |
Bilal Catic | 76a989fa37 | replace old crawler, without specific crawler and saver implementation | 2019-09-16 15:59:53 +02:00 |